Monday, June 30, 2008

Foreign Dispatches: Reporting from the ASP Meeting in Texas

For my inaugural post, I'm relaying some info from the 83rd meeting of the American Society of Parasitologists, held this year in Arlington, Texas. Taxonomy and systematics have always been a large component of our society, though this year there seemed to be slightly fewer talks in these categories. Following the posts from the Evolution meetings, I have to say that parasitologists are still a little behind the curve. Most systematic studies being done were using the typical genes - 18S, COI - and doing concatenated analyses with parsimony plus one other method - sometimes Bayesian, sometimes ML. But we should be cut a little slack, I think. Parasites are notoriously difficult to do molecular biology on - often there's not much material to work with so you have limited ability to troubleshoot primers or methods. And sure, there are some genomes available for the parasites that infect humans, but often there's just the one and there is a lot of divergence in some of these groups. Nonetheless, progress is clearly happening and there are some cool things being done. Pete Olson, from the Natural History Museum in London, presented some really cool work exploring the expression of hox genes in tapeworms, animals that have a very different kind of segmentation. Janine Caira (UConn) and Kirsten Jensen (Kansas) and company once again dazzled us with how poorly the parasite fauna of the world is known, by showing how their field work in Borneo has revealed not only dozens of new species of tapeworms, but sometimes new species of hosts as well. Jessie Light (Florida) made us scratch with her work on the taxonomy and phylogeny of human lice, and Mark Siddall (AMNH) hinted that his recent work on ESTs in leeches and their insight into paralogy may eventually explain some of the really bizarre findings of Dunn et al. in their recent animal phylogeny. Stay tuned.

Sunday, June 29, 2008

A Boys' Club No More

It's my pleasure to welcome a new author to this blog -- one whose work might actually be directly beneficial to human society. Dr. Susan Perkins is an expert on the systematics and taxonomy of Plasmodium, the nasty little parasite that causes malaria. She's based at the American Museum of Natural History where she's currently Assistant Curator in the Division of Invertebrate Zoology and a member of the Sackler Institute for Comparative Genomics. In addition to her expertise in invertebrate systematics, parasite biology, and comparative genomics, Susan brings much potential for insight into the fascinating minds of those rare creatures known as the capital 'C' Cladists.

Software Review: BEST version 2.0

As Glor hinted in a recent post, "species tree" analyses are pushing the phylogenetics world towards a paradigm shift. One of the methods currently available to researchers is Liang Liu's computer program BEST (Bayesian Estimation of Species Trees). Version 2.0 is available at the BEST website as both Windows and Mac OS X executables. BEST estimates the posterior distribution of species trees from multilocus, multiple-allele DNA sequence data while attempting to account for the persistent pattern of deep coalescence of alleles. Deep coalescence is one of the mechanisms that can result in a mismatch between gene trees and the species tree. Details of the method are provided in a paper by Liu and Pearl published in Systematic Biology (subscription required for PDF download).

The BEST website provides example files for analyses in which a single allele is sampled for each species, as well as for analyses in which multiple alleles are sampled from each species. Both analyses assume that species are reciprocally monophyletic. Given that BEST is a modification of MrBayes, the data formats are very similar except that BEST includes priors for theta and mu. Also, the tree topologies, branch lengths, and mu are unlinked across the sampled loci. If a haploid locus (mitochondrial or chloroplast DNA) is sampled, BEST allows the user to define the ploidy of the locus (default setting is diploid).

If a multiple locus dataset is run in BEST, a sham file is needed to summarize the trees after the burnin (the familiar "sumt" command from MrBayes). Liang Liu has posted an example of this type of file on the BEST website.

The trees are summarized using a burnin value that discards all trees and parameter values sampled prior to convergence. As in MrBayes, summarizing the trees produces a consensus tree file, where the consensus percentages for clades are interpreted as the Bayesian posterior probability. Progress of the BEST run and assessment of convergence can be monitored using the computer program Tracer.
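To make the burnin-and-consensus step concrete, here is a minimal Python sketch of how clade support can be tallied from a set of sampled trees. It is not how BEST or MrBayes implements "sumt"; the function name clade_support, the toy four-taxon samples, and the burnin of 1 are all hypothetical, and each tree is simplified to the set of clades it contains.

```python
from collections import Counter

def clade_support(sampled_trees, burnin):
    """Fraction of post-burnin samples that contain each clade.

    Each sampled tree is represented (as a simplifying assumption) by the
    set of clades it contains, and each clade by a frozenset of taxon names.
    """
    kept = sampled_trees[burnin:]  # discard samples drawn before convergence
    if not kept:
        raise ValueError("burnin discards every sampled tree")
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}

# Hypothetical toy sample of rooted four-taxon trees (taxa A, B, C, D).
AB, CD, AC, ABC = (frozenset(s) for s in ("AB", "CD", "AC", "ABC"))
t1 = {AB, CD}    # ((A,B),(C,D))
t2 = {AB, ABC}   # (((A,B),C),D)
t3 = {AC, ABC}   # (((A,C),B),D)
samples = [t3, t1, t1, t2, t3]

support = clade_support(samples, burnin=1)
for clade, freq in sorted(support.items(), key=lambda kv: -kv[1]):
    print("".join(sorted(clade)), freq)
```

The frequency attached to each clade is what gets read as its posterior probability on the consensus tree, which is why discarding too few pre-convergence samples can distort the support values.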

My laboratory group has been experimenting with BEST for the past few months, and we are generating some interesting and exciting results. The prior on theta appears to be the one issue/nuisance that we have run across in our explorations using BEST. A fairly wide prior is given in the example files. We are beginning to run BEST with narrower, and more realistic, priors for theta. So far the results are promising.

Overall, I have found BEST straightforward to implement with my multilocus phylogenetic data. Familiarity with MrBayes will certainly help new users of BEST. Also, Liang Liu has been very helpful and encouraging to users, and has implemented suggestions into the example files on the BEST website. My entire lab group is excited to be exploring the frontier of phylogenetics, with the hope that we are making the most reasonable inferences regarding species relationships that are afforded by our hard-earned data.

One More Reason to Post Free PDFs of Your Pubs

My effort to use this weekend to catch up on the latest research in journals like Molecular Ecology, Ethology, and Biological Journal of the Linnean Society has been greeted with endless frustration. The incompetent blood-suckers at Wiley-Blackwell have decided to take all of their journals off-line for two days. Un-fucking-believable. Seriously people, let's take science back: get out there and post all your publications as free PDFs. Maybe that way people will be able to actually read them.

Saturday, June 28, 2008

Phylogenetics Grant from the Discovery Institute?

You'd think, given all their problems explaining the history of the horse species we know existed, that opponents of evolution would be loath to add another species to the clade. Not so! The scholars at Answers in Genesis want you to understand - with no uncertainty - that unicorns are real. Now it's our job to complete the taxon sampling required to solve this interesting phylogenetic puzzle. They concede that the unicorn may also have been a relative of the cows, so best to sample broadly. (Image cribbed from Weinstock et al. 2005. Evolution, systematics, and phylogeography of Pleistocene horses in the New World: a molecular perspective. PLoS Biology 3:1373-1379)

Gecko Porn

Well folks, our initiative to incorporate porn in the blog has already paid off with at least three visits from perverted Google users. This week I offer a stunning photo of gecko porn discovered on Flickr by my lab's gecko guru, Daniel Scantlebury. Dan tells me these are Ptyodactylus ragazzii from the family Phyllodactylidae. Now that Verne Troyer has expanded the web of celebrity sex tape scandals, I can only hope that the GEICO gecko will not feel tempted to produce one as well.

Friday, June 27, 2008

Highlights from Evolution 2008: Part II

There were a few notable advances in comparative methods. Emma Goldberg and our own Boris Igic illustrated how several recent rejections of Dollo's law may have resulted from a failure to consider differential rates of species diversification and several other violations of the Mk2 model (see our related previous post on Maddison et al.'s work and the BiSSE method). 

If lineages with the character state A are more likely to speciate, or less likely to go extinct, than lineages with character state B, there will be a tendency to infer an ancestor with state A. Emma and Boris illustrate the practical implications of this sort of differential diversification by showing how it may have resulted in the incorrect inference of re-evolution of wings in stick insects (Whiting et al. 2003; in Nature).  If there is any truth to the widespread belief that vagility is inversely associated with speciation rate, it seems logical to suggest that the non-winged lineages have undergone more species diversification and biased the conclusions of standard methods toward the reconstruction of a non-winged ancestor and repeated re-evolution of wings. 
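The bias described above can be seen in a toy simulation. The Python sketch below is not BiSSE and not Emma and Boris's analysis; it is a bare-bones birth-only process with no extinction, and the function names and the rates (lam_a, lam_b, q) are hypothetical values chosen for illustration. Every tree starts from a root in the slowly speciating state B, with symmetric transitions between states.

```python
import random

def simulate_tip_states(lam_a, lam_b, q, n_tips, rng):
    """Grow a clade from a single state-B root until it has n_tips lineages.

    lam_a, lam_b: speciation rates of lineages in states A and B
    q: symmetric transition rate between the two states
    Only lineage counts are tracked, which is all this illustration needs.
    """
    n_a, n_b = 0, 1  # the root lineage is in state B
    while n_a + n_b < n_tips:
        # Event rates: A speciates, A -> B, B speciates, B -> A
        rates = [n_a * lam_a, n_a * q, n_b * lam_b, n_b * q]
        event = rng.choices(range(4), weights=rates)[0]
        if event == 0:
            n_a += 1
        elif event == 1:
            n_a, n_b = n_a - 1, n_b + 1
        elif event == 2:
            n_b += 1
        else:
            n_a, n_b = n_a + 1, n_b - 1
    return n_a, n_b

def mean_fraction_a(lam_a, lam_b, q, n_tips, n_reps, seed=0):
    """Average fraction of tips in state A across replicate clades."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        n_a, _n_b = simulate_tip_states(lam_a, lam_b, q, n_tips, rng)
        total += n_a / n_tips
    return total / n_reps

# Hypothetical rates: state A speciates five times faster than state B.
frac_a = mean_fraction_a(lam_a=1.0, lam_b=0.2, q=0.1, n_tips=200, n_reps=100)
print("mean fraction of tips in state A:", round(frac_a, 3))
```

Even though every root is in state B and the transition rates are equal in both directions, the tips end up dominated by state A. A method that assumes equal diversification and reads the ancestor from tip frequencies would tend to reconstruct A at the root.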

For those of you who missed the talk, Boris was sporting a mustache to symbolize his allegiance with Dollo.

Thursday, June 26, 2008

TeX and LaTeX: Dork It Up!

So, you think you're a science dork, eh? In my book, you don't get to really wear the dork crown until you write all of your papers in LaTeX. LaTeX is a language for creating documents with TeX typesetting; I don't really know what that means, but I do know it makes beautifully formatted PDF documents. There's also a good free Mac implementation of LaTeX called TeXShop.

Here's the caveat: it does take a bit of effort to learn. It has a bit more in common with writing computer programs than it does with MS Word, for example. If you've ever edited HTML code, it's sort of like that. But the effort is time well spent. Here are the main things I like about using LaTeX.

1. Good bibliography management with BibDesk, and automatic citations and bibliography generation. I like this system better than EndNote because it's free, and it doesn't crash or do unspeakable secret things to your document.

2. Ever tried to get a figure in the right place in Word? This process makes me want to stick flaming skewers in my eye. With LaTeX, you put a reference to a figure in the document, and the program figures out a logical place to put it.

3. Equations. Word's equation editor has gotten better, and LaTeX requires some learning of syntax, but once you get it, it works beautifully. All of my math geek friends use LaTeX all the time.

4. Readability. LaTeX documents are easier to read than Word documents.

5. Integration with R through Sweave. You can even make documents where figures and results are generated on the fly from your data when the file is processed - so if your data changes, the paper is updated automatically.

6. Reign over other dorks. Being good at LaTeX is the computer equivalent of wearing a Tron costume and speaking Klingon (the warrior's tongue). AT THE SAME TIME.
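For anyone who has never seen LaTeX source, here is a minimal sketch of a document that uses a citation, a numbered equation, and a floating figure. The citation key whiting2003, the bibliography file refs.bib, and the labels are hypothetical, and the placeholder image example-image assumes the mwe package is installed with your TeX distribution.

```latex
\documentclass{article}
\usepackage{graphicx} % provides \includegraphics

\begin{document}

% \cite pulls the entry with this (hypothetical) key from refs.bib
Wings may have re-evolved in stick insects \cite{whiting2003}.
The sample mean is defined in Equation~\ref{eq:mean}, and the
placeholder image appears in Figure~\ref{fig:demo}.

\begin{equation}
  \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
  \label{eq:mean}
\end{equation}

% [htbp] lets LaTeX choose where the figure floats: here, top, bottom, or page
\begin{figure}[htbp]
  \centering
  \includegraphics[width=0.6\textwidth]{example-image} % placeholder from mwe
  \caption{A placeholder figure that LaTeX positions automatically.}
  \label{fig:demo}
\end{figure}

\bibliographystyle{plain}
\bibliography{refs} % hypothetical refs.bib, e.g. one managed with BibDesk

\end{document}
```

Equation and figure numbers are filled in from the labels, so moving or adding a figure renumbers every reference without any manual editing.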

Wednesday, June 25, 2008

Highlights from Evolution 2008: Part I

First some highlights from talks on phylogenetic reconstruction in general:

1. The species trees have arrived.

2. 5 to 20 'independent' loci analyzed via partitioned analyses in MrBayes or RAxML are the standard for high-end phylogenetic analyses (of non-model groups). Most people were concatenating, but that may be changing (see point 1).

3. Branch lengths are a growing concern in Bayesian phylogenetic analyses. Both Joseph W. Brown (Michigan) & Jeremy M. Brown (UT Austin) showed that our prior assumptions about branch lengths require revision. Joseph Brown gave a nice example of how shifting to more appropriate (i.e., better fitting) assumptions about branch lengths can turn bad trees to good in the case of paleognathous birds.

Tuesday, June 24, 2008

A New Paradigm? Species Trees From Gene Trees

Lots of talk about a 'new paradigm' today at a symposium on generating species trees from gene trees at the Evolution meetings in Minnesota. If most of the speakers in this symposium have their way, the days of generating individual gene trees or trees from concatenated datasets will soon be in the past. Talk of such an advance dates back more than a decade, but things have moved very quickly over the past two years. This seems due in large part to the emergence of the software package BEST by Liang Liu. I've received mixed reports about BEST's accessibility and limitations. As one might expect from a new package, it's a bit buggy and may be a bit less user-friendly than you'd like, but it is working for most people. Liang Liu said during his presentation that version 2.0, posted on June 18, 2008, is considerably better than the previous iteration, so you'd be well-served to upgrade if you've already been messing with it. More generally, the existing methods for generating species trees from gene trees make some potentially significant assumptions (e.g., the absence of horizontal gene transfer) whose violations are not well understood. In any case, you'd better move BEST to the top of the list of programs you need to learn...