Thursday, November 13, 2008

A Tangled Bank of Genes - and Methods

The identification of orthologous genes is a fundamental step in many fields of biology. For systematists, sequencing and analyzing orthologs is critical for recovering the correct phylogenetic tree. Molecular biologists hoping to understand protein function and evolution similarly need to be looking at orthologs. The trouble is that gene duplication and differentiation are rampant in genomes, and many, many genes exist as members of gene families. Start looking at more than one species, and speciation, with its own genetic processes and consequences, makes things messier still. In the most recent Trends in Genetics, Kuzniar et al. present an ambitious review of what I can only hope are all of the currently available methods for identifying orthologous genes. No fewer than 25 different methods, algorithms, and/or programs (which collectively created a jumble that made me feel a lot like I'd fallen into a bowl of Alphabits) are listed along with their pros and cons. These programs are grouped into tree-based methods, graph-based methods, and hybrid methods that use a little of both. The authors give a few simple examples of how different methods can produce different conclusions, and also show how hybrid proteins can really mess with the various algorithms. Fortunately, they do a nice job of creating a decision tree at the end of the paper, based on just three simple questions, to help one choose which ortholog detection method is right for the job at hand. Still, reading the paper made me wonder why so, so many of these programs are out there - why hasn't natural selection weeded a few of them out?

Kuzniar, A., R.C.H.J. van Ham, S. Pongor, and J.A.M. Leunissen. 2008. The quest for orthologs: finding the corresponding genes across genomes. Trends in Genetics 24:539-551.

1 comment:

Glor said...

Thanks for pointing this out, Susan. I've been fidgeting over my inevitable foray into this field as I begin working more and more with the anole genome...