Friday, October 17, 2008

Penalized Likelihood for Trees

The parsimony versus likelihood debate seems to have cooled off considerably in recent years. Still, we don't know how complex a model is appropriate for building trees from DNA sequence data. One place this shows up is the question of how many separate partitions should be used for a particular data set. Basically, we can create models with widely varying numbers of parameters, and it is not always obvious when to stop adding parameters to avoid overparameterization.

One solution to this problem is a method called penalized likelihood. In PL, parameters are allowed to vary in the model, but there is a penalty for dramatic changes in parameter values. For example, one might want to vary the rate of evolution across a phylogenetic tree, but give a penalty when parameter values change a lot between adjacent branches. This is the basic idea of PL in the r8s program.
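To make the rate-smoothing example concrete, here is a minimal sketch of a penalized objective in Python. It is not the r8s code. It assumes a Poisson model for the number of substitutions on each branch and a squared-difference penalty between each branch and its parent branch, and the names `parent`, `durations`, `counts`, and `smoothing` are made up for illustration.

```python
import math


def poisson_log_likelihood(rates, durations, counts):
    """Log-likelihood of substitution counts, assuming each branch's count
    is Poisson with mean rate * duration (an assumption of this sketch)."""
    total = 0.0
    for r, t, x in zip(rates, durations, counts):
        mean = r * t
        total += x * math.log(mean) - mean - math.lgamma(x + 1)
    return total


def roughness_penalty(rates, parent):
    """Sum of squared rate differences between each branch and its parent.
    parent[i] is the index of branch i's parent branch, or None at the root."""
    return sum(
        (rates[i] - rates[p]) ** 2
        for i, p in enumerate(parent)
        if p is not None
    )


def penalized_log_likelihood(rates, durations, counts, parent, smoothing):
    """Log-likelihood minus smoothing * penalty. A smoothing value of 0 lets
    every branch have its own rate; a large value pushes toward one rate."""
    fit = poisson_log_likelihood(rates, durations, counts)
    return fit - smoothing * roughness_penalty(rates, parent)


# Hypothetical four-branch tree: branch 0 is at the root, branches 1 and 2
# descend from branch 0, and branch 3 descends from branch 1.
parent = [None, 0, 0, 1]
durations = [1.0, 1.0, 2.0, 1.0]
counts = [10, 12, 18, 30]

# Two candidate rate assignments: one free rate per branch, and a single
# clock-like rate shared by all branches.
free_rates = [x / t for x, t in zip(counts, durations)]
clock_rates = [sum(counts) / sum(durations)] * len(counts)

for smoothing in (0.0, 1.0):
    free = penalized_log_likelihood(free_rates, durations, counts, parent, smoothing)
    clock = penalized_log_likelihood(clock_rates, durations, counts, parent, smoothing)
    print(smoothing, "free rates preferred" if free > clock else "clock preferred")
```

With no penalty the free-rate assignment fits better, since it has more parameters to work with; once the penalty is switched on, the large jump in rate on branch 3 costs more than it gains and the clock-like assignment wins. In practice you would optimize the rates for a given smoothing value rather than compare two fixed guesses.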

Kim and Sanderson have now implemented this approach as a way to build phylogenetic trees. This could be very useful; as the authors mention, various other phylogenetic reconstruction methods can be viewed as special cases of penalized likelihood, which can thus cover ground in the spaces between existing methods. This seems promising to me, and I'm excited to try it on my own data.

The picture, brought to my attention by Brian O'Meara, is from Sanderson's lab page. I think I need that for my own lab.
