In a recent paper in BMC Biology, Bob Thomson and Brad Shaffer at the University of California, Davis quantify progress toward resolving the vertebrate tree of life. Using a phyloinformatic pipeline and GenBank data from a large sample of vertebrate diversity (100 clades, encompassing about 12,000 species), the authors ask a simple question: "How many nodes in the vertebrate tree do we have some information about?" The brief answer is about a quarter, though this information is highly skewed. Avian and mammalian clades are on average better resolved than the other major vertebrate lineages, and marine clades are on average very poorly resolved. In addition to estimating current 'resolution', Thomson and Shaffer analyze how this resolution has accumulated through time. The superexponential growth of sequences in GenBank is now well known, but there is little understanding of how this accumulation of data correlates with the accumulation of phylogenetic information. Their analyses indicate that phylogenetic information is accumulating polynomially and that, if current rates continue, we might understand a large majority of the vertebrate tree within a few decades.
Bob has made their data available via a Google motion chart, which allows for easy exploration of the study's results.