dechronization: April 2009

Wednesday, April 29, 2009

Dechronization Turns 1

This is going to be a self-congratulatory day here at Dechronization because I can't allow it to pass without marking our first birthday. Over the past year we've had 156 posts and more than 50,000 hits. Although we've been through some peaks and valleys, we ended strong thanks to Brian Moore's engaging discussion of Bayesian inference, Luke Harmon's popular series of interviews with leading figures in systematic biology (1, 2), and Susan Perkins' forays into the evolution of disease-causing organisms (3, 4). We've also seen our number of contributors grow to nine, most recently with the addition of two of the brightest young minds in systematics: Liam Revell and Dan Rabosky. We have big plans for the coming year, including the continuation of our interview series (John Huelsenbeck and Rob DeSalle are up next), inclusion of more guest posts, and a renewed dedication to highlighting important new articles and books in our field. Feel free to use this opportunity to comment on the blog and on where you'd like to see it go in the coming weeks and months. Thanks to everyone who's been reading and commenting at Dechronization!

The Importance of Understanding Evolution for Public Health

While lots of people are scrambling to make [good] phylogenetic trees of the new swine flu sequences in the context of other flu viruses, another paper came out in this week's PLoS Biology that presents a really powerful argument for incorporating evolution into public health. Andrew Read, of Penn State University, and others both at Penn State and at the Open University in the U.K. just published the results of their work on late-life acting (LLA) insecticides, arguing that if you understand a little bit about natural selection, you just might be able to profoundly stack the deck in the global battle against malaria. Mosquitoes as a whole suffer a high mortality rate - according to Read et al., the figure is around 10% per day or 20-40% per gonotrophic, or egg-laying, cycle. In order to transmit malaria, a mosquito must take one blood meal from an infected person and then survive long enough to need a second one, with the interval between these events being long enough for the parasite to develop to an infective stage. Because development to the infective stage typically takes 10-14 days in malaria-endemic regions, very few infected mosquitoes live long enough to actually vector the disease. The paper argues that conventional insecticides such as DDT and pyrethroids, which are "early-acting" insecticides that kill 80% of the mosquitoes they come into contact with, exert tremendous selection pressure on mosquitoes to evolve resistance, because they are robbing a large proportion of the whole population of all of their fitness. Conversely, LLA insecticides, which kill mosquitoes after their first gonotrophic cycle, impose far weaker selection, and thus resistance is slower to evolve. Crunching some numbers allowed Read et al. to show that an insecticide that killed mosquitoes after 2 or more gonotrophic cycles could reduce the number of malaria-infectious bites by 99.2%.
Even the insecticides that took longer to kill mosquitoes still showed drastic reductions: the 4-cycle killers showed a 94.2% drop in deadly bloodmeals. Because evolution of resistance itself bears fitness costs to mosquitoes, LLA could allow for some insecticides to be "evolution proof" - i.e. the time it would take for mosquitoes to evolve resistance to these sprays would be so long that the insecticides would remain effective essentially indefinitely.
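To put rough numbers on the survival argument, here is a minimal back-of-the-envelope sketch in Python. It is not the model used by Read et al.; it simply assumes a constant 10% daily mortality (the figure quoted above) and asks what fraction of mosquitoes would survive the 10-14 days the parasite needs to develop. The function name and the constant-mortality assumption are mine.

```python
def fraction_surviving(daily_mortality, days):
    """Fraction of mosquitoes still alive after `days`,
    assuming the same independent mortality risk every day."""
    if not 0.0 <= daily_mortality <= 1.0:
        raise ValueError("daily_mortality must be between 0 and 1")
    if days < 0:
        raise ValueError("days must be non-negative")
    return (1.0 - daily_mortality) ** days


if __name__ == "__main__":
    # 10% per day is the figure quoted from Read et al.;
    # 10 and 14 days bracket the parasite development time.
    for days in (10, 14):
        surviving = fraction_surviving(0.10, days)
        print(f"{days} days: {surviving:.1%} survive")
```

Under these assumptions roughly 35% of mosquitoes survive 10 days and roughly 23% survive 14 days. That window only starts once a mosquito has picked up the parasite, and the mosquito must then live on to its next blood meal, so the fraction that actually transmits is smaller still.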

Hopefully this study and others like it (e.g. Wargo et al., 2007) will be carefully read by those making public health decisions - because thinking long-term - in an evolutionary sense, not just a medical sense - is absolutely critical. And if you're not a public health official, read these studies anyway - they make great examples for teaching the importance of evolution to everyday life.

In Good Company

Congrats to fellow Dechroner, Luke Harmon, who along with several colleagues has a nice paper in this week's Nature. In this article, they describe their research on how the diversification of species via an adaptive radiation strongly affects the ecosystem itself, through a series of studies of sticklebacks. There's a nice News & Views on their article as well. Good work, Luke (et al.)!

Monday, April 27, 2009

Time to Start Thinking about Next Year's Meetings...

Although the 2009 meeting for the Society of Systematic Biologists (SSB) has yet to happen, it's already time to think about symposium topics for the 2010 meeting in Portland, Oregon. I've been told the SSB is particularly interested in receiving proposals from the type of people who are reading this blog :) Because proposals will be evaluated at this summer's meeting, you'll have to submit them to Kelly Zamudio (SSB's program chairperson) by June 12th. More details and contact information can be found at the SSB's web page.

Sunday, April 26, 2009

Swine Flu: Info-epidemiology

The recent outbreak of swine flu in Mexico, which has now spread to other places (including my fair city), has spawned a plethora of websites and Google maps to try to help people track its spread. Ben Parr from Mashable has made a nice summary of some of the key ones, but my favorite so far has been Rod Page's Timemap, which allows users to see the spread through time on a map of the world (also see Page's post on this effort at iPhylo). Having been involved myself in a project from 2005 to 2007 that sought to merge viral genomics and GIS, I find it really cool to see that some of the databases and information sources have finally come together in ways that are actually allowing for rapid dissemination of these types of data. Hopefully this virus can be contained very quickly - and hopefully this new field of info-epidemiology will help with that.

Saturday, April 25, 2009

Evolutionary Psychology: Deadweight or Paydirt?

According to Jerry Coyne [1], in science's pecking order, evolutionary psychology is a deadweight, dragging evolutionary biology closer to phrenology than physics. Certainly, that is not all wrong. Outlandish popularized and scholarly accounts of the causes of emotional and moral trait evolution (including pathologies) are sometimes nearly baseless and generally lack any pretense of rigor. Loose banter about 'theories' associated with evolution--even in name alone--is not exactly what we need. But it is inevitable because "evolutionary psychology satisfies our hunger for a comprehensive explanation of human existence [...] Freud is no longer the preferred behavioral paradigm. Now Darwin is ascendant. Blame your genes, not your mother" [1]. As it turns out, some of the blame may fall on your genes, mother, and father.

I was in for a small surprise when, in preparation for my evolution class, which will be taught to a mostly pre-med audience, I read some fabulously interesting papers by Bernie Crespi and Chris Badcock [2,3; and popular accounts in NYT, Science]. Briefly, taking a cue from the early work on sexual conflict by W. D. Hamilton, and expanded by D. Haig, they contend that asymmetric expression of maternally and paternally imprinted genes may be responsible for a wide spectrum of seemingly unrelated mental illnesses.

Given that intragenomic conflict can drive the evolution of paternal and maternal imprinting, and imprinting can affect the development of the parts of the brain involved in social interactions in an opposite manner, Crespi and Badcock argue that balanced expression of those two components results in 'normal' cognitive and social development. Alternatively, a wide imbalance can have a strong negative outcome. If the mother's genetic self-interest wins, this can lead to hypermentalism (e.g. paranoid schizophrenia; pathologically conspiracy-prone with delusions of grandeur, ambivalence). Conversely, male imprinting can lead to hypomentalism (autism spectrum; poor inference of intention, inability to deceive, deficit in personal agency, single-mindedness). The key prediction of the theory is that autism and schizophrenia occupy ends of a 'social brain' [4] spectrum. This is significant because of the puzzling and hopelessly contradictory medical evidence. Autism and schizophrenia do not obey simple Mendelian inheritance, and this has paralyzed the search for clinical treatments.

Although definitive evidence is still lacking, and some individuals can show signs of both spectrum disorders, the imprinted social brain theory is now supported by the frequent genomic co-localization of the two end-spectrum disorders, the distribution of copy number variants, predicted correlations with other mental conditions, as well as anatomical and epidemiological data (but see critiques in [3] and elsewhere). At the very least this contention is testable and, if it holds up, it may yield outstanding clinical payoffs. It seems that the deadweight could have paydirt potential, after all.

Notes
[1] Coyne, J.A. 2000. “The fairy tales of evolutionary psychology.” Review of A Natural History of Rape: Biological Bases of Sexual Coercion, by Randy Thornhill & Craig T. Palmer, MIT Press, 2000. The New Republic, March 4, 2000.
[2] Badcock C. and B. Crespi. 2008. Battle of the sexes may set the brain. Nature 454:1054-1055. (Photo credit: J. Robinson)
[3] Crespi, B. and C. Badcock. 2008. Psychosis and autism as diametrical disorders of the social brain. [with commentary] Behavioral and Brain Sciences 31:241-320.
[4] Dunbar, R.I.M. 1998. The Social Brain Hypothesis. Evolutionary Anthropology 6:178-190. (Dechro peeps: we should totally get and re-analyze this data).

Friday, April 24, 2009

New Magazine for Herp Lovers

The International Reptile Conservation Foundation just relaunched its journal with the new name Reptiles & Amphibians: Conservation and Natural History (it was previously known as Iguana). It's a shame that there isn't an on-line version because the first number of this new magazine is fantastic: it's a full-color format featuring eye-popping photos and interesting articles. The photo on the back cover of a Resplendent Quetzal (Pharomachrus mocinno) eating an alligator lizard (Abronia sp.) alone is worth the $25 subscription fee (the crappy iPhone capture seen here does no justice to this photo by José Yee). Articles appearing in the first issue include:

Battle of the Sexes: Asexuality versus Sexuality by Jesse L. Grismer
The Herpetofauna of Guana Island: An Annotated Checklist and Travelogue by Gad Perry and Robert Powell
Arboreal Alligator Lizards in the Genus Abronia: Emeralds of the Cloud Forests of Guatemala by Daniel Ariano-Sánchez and Lester Melendez
Beyond 2008 "Year of the Frog": The Challenges Facing Amphibians and the Amphibian Ark by Ron Gagliardo
One Species that Will be Saved: The Grand Cayman Blue Iguana by Fred Burton
Madagascar Travelogue by Seth Rudman (Glor Lab undergraduate!)

Basal and Derived Taxa

Recently, I’ve been plumbing a bit of the macroecological literature and have been somewhat baffled by the usage of ‘basal’ and ‘derived’ in reference to extant species. These terms are frequently used in reference to the spatial distribution of phylogenetic diversity: does species richness within regions consist primarily of members of basal or derived clades? I am a big fan of much of this work, and I think that the patterns of phylogenetic diversity through space can tell us much about the feasibility of niche conservatism-type models for diversity gradients. However, I have a hard time wrapping my head around precisely what basal and derived mean in this context and think there is a real need for terminological clarification here.

As an example: one macroecological metric is the “root distance”: basically, the number of nodes separating a species from the root of a phylogenetic tree. Several studies have looked at mean root distances among species within regions, classifying species as basal (few nodes between root and tip) or derived (lots of nodes between root and tip). Under this classification scheme, there are very interesting differences in species richness between basal and derived taxa.
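For concreteness, here is a minimal Python sketch of how a root distance and a regional mean root distance could be computed. The tree, the species names, and the two "regions" are all made up for illustration, and the counting convention (every ancestral node up to and including the root) is an assumption; published studies may count nodes slightly differently.

```python
from statistics import mean

# Hypothetical toy tree (((A,B),C),D), stored as child -> parent.
PARENT = {
    "A": "n2", "B": "n2",
    "n2": "n1", "C": "n1",
    "n1": "root", "D": "root",
}


def root_distance(tip, parent):
    """Number of ancestral nodes between a tip and the root,
    counting the root itself."""
    count = 0
    node = tip
    while node in parent:  # the root has no parent entry
        node = parent[node]
        count += 1
    return count


def mean_root_distance(species, parent):
    """Mean root distance over the species found in one region."""
    return mean(root_distance(s, parent) for s in species)


if __name__ == "__main__":
    # Two made-up regional species lists.
    print(mean_root_distance(["A", "B"], PARENT))
    print(mean_root_distance(["C", "D"], PARENT))
```

In this toy case the region holding A and B scores higher simply because those tips sit in the part of the tree with more branching events, which is the sense in which the metric tracks diversification rather than anything intrinsically "derived" about the species.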

I have a hard time getting over my initial visceral reaction to the use of ‘basal’ versus ‘derived’ in this context (see previous discussion on the “coffee shop phylogenetics” series). While I think these studies are on to something, my take on root-node distances is that they are a metric of diversification rate or total diversification. Regions with more “derived” species thus contain more species from clades that have undergone substantial diversification (and hence, have greater root-tip nodal distances). But I think a focus on basal and derived taxa is confusing and this literature could benefit from eliminating the use of these terms in association with extant taxa (see, for example, Crisp and Cook on this subject).

Thursday, April 23, 2009

Dechronization Interviews Joe Felsenstein

This week, I've conducted an interview over email with Joe Felsenstein. Dr. Felsenstein requires no introduction, really. If you're doing something in phylogenetics or comparative methods, chances are, Joe thought of how to do it 20 years ago.

Most of the questions below are from me (LH) but a couple come from Dan Rabosky (DR). Many thanks to Joe for participating.

LH: What are the most exciting recent developments in systematics / comparative methods?

JF: The availability of genome-scale information is certainly one. The arrival of a generation of young researchers who are comfortable with statistical and computational approaches is another. But the most important development is reflected in recent work on coalescent trees of gene copies within trees of species. What this does is tie together between-species molecular evolution and within-species population genetics. Those two lines of work have been developing almost independently since the 1960s. But now, with population samples of sequences at multiple loci in multiple related species, they are coming back together. This is not another Modern Synthesis, but it is a major event that needs a name. How about the "Family Reunion"? Long-estranged relatives who have not been in touch are getting together.

LH: Take us back to the beginnings, back when you were working on phylogenetic and comparative methods for your PhD thesis. Where did you derive your inspiration? Did you anticipate the impact that this work would have on the field?

JF: I did not anticipate it at all. My original thesis project with Dick Lewontin was a rather grandiose theoretical population genetics macroevolution model -- my idea, not his. It didn't work out and I didn't have any useful results. Meanwhile Lynn Throckmorton and Jack Hubby, whose labs were nearby, needed someone to write a clustering program for protein electrophoresis band data that they had in multiple Drosophila species. I volunteered and was fascinated by the algorithms. I went on to write parsimony programs for the Camin-Sokal, Dollo, and polymorphism parsimony criteria, and then to work on how to infer trees by likelihood using Anthony Edwards and Luca Cavalli-Sforza's Brownian motion approximation to gene frequency drift. Dick finally suggested that I write this up for my thesis, which I did in 1967 (the degree was officially 1968). Through the 1970s I maintained a sideline of work on trees while mostly working in theoretical population genetics. It was really not until about 1978 that I began to see that this was becoming more important, and that it fit in with my interest in evolution beyond the species boundary. So I shifted my work toward trees and dropped out of theoretical population genetics.

DR: A lot of what we do in comparative methods is based on Brownian motion, or models for which BM is a special case (e.g., OU). As you (Felsenstein) have written, "Brownian motion is a poor model, and so is Ornstein-Uhlenbeck, but just as democracy is the worst method of organizing a society 'except for all the others', so these two models are all we've really got that is tractable. Critics will be admitted to the event, but only if they carry with them another tractable model."

And for discrete traits, we use Markovian models that assume (generally) homogeneous rates through time and among lineages. Undoubtedly, the math for this could get out of hand, but at some point I think we'll have to do something to explore (among other things) more realistic constraint surfaces etc.

Given this, what do you view as "the frontier" for models of continuous and discrete character evolution? New mathematics? Approximate Bayesian approaches that rely on simulation to deal with analytically intractable scenarios?

JF: Hard to see what. I think one framework will be models in which a population "chases" an adaptive peak which is moving. But we need to have some model for how the peak moves, and aside from having a mechanistic and ecological model of the function of the character this is not forthcoming. Nor is it easy to see how adaptive peaks in sister species become different from each other. We're also going to find that the amount of information available to tell different schemes of selection pressure apart will be small. We are going to have to be able to characterize what we can and can't know given the data. Just adding new mathematical tools or lots of simulation will not resolve these dilemmas.

DR: What do you think about the unification of modern (neontological) comparative biology with paleontology? There seems to be a lot of room for progress in this area. Do you have any suggestions for future directions?

JF: Oh thank you thank you thank you for giving me an opportunity to mount the soapbox and hold forth on one of my favorite topics. I've been working on this. See my paper in 2002:

Felsenstein, J. 2002. Quantitative characters, phylogenies, and morphometrics. pp. 27-44 in Morphology, Shape, and Phylogenetics, edited by N. MacLeod. Systematics Association Special Volume Series 64. Taylor and Francis, London.

and watch my Julian Huxley Lecture to the Systematics Association in London in 2008 which is available as a video also with a PDF of my slides.

Basically we can infer the tree of present-day species from molecular data, and then use it for morphological characters (or other measurable continuous or discrete characters) with a Brownian or OU model, to infer phylogenetic covariances of changes of characters. Then we can use these together with the fossil morphology to help place the fossils. (One could also use all this together in a giant likelihood or Bayesian inference but the gain in doing so will be very small as the morphology will add little to the inference of the tree, I think). One can also use bootstrap samples of trees in this, or samples from Bayesian posteriors.

There is lots to be done here and I am rushing to do it, and working with Fred Bookstein on the morphometric angles to this too. I wonder whether statistical frameworks such as this, together with within species quantitative treatment, will not be important in untangling the paleoanthropological mess caused by nonquantitative approaches to hominoid fossils.

LH: What do you think about the current trend in phylogenetics (and, lately, comparative biology) towards Bayesian approaches?

JF: I am a curmudgeon on this, in that Bayesian approaches do not feel right to me. So I have been resisting them. Bayesians were unhappy with the treatment of Bayesian Inference in my book, in that I did not give them four chapters, the last of which ended by declaring victory. I think we're all Bayesians when we come to cross the street, balancing evidence of approaching cars against our priors. But that's where one of the criticisms of Bayesianism comes in -- do we all have the same priors? Is there necessarily a single prior that you can use that will be broadly acceptable to your readership? If not, then maybe the reader of the paper should instead be given the likelihood curve so they can apply their own prior to it. For phylogenies, priors giving equal probability to all topologies (or to all labeled histories) would be noncontroversial. But the part of the prior that puts distributions on branch lengths could be wildly controversial. There is also the issue of whether some things, such as whether the sun will rise tomorrow morning, really should have a prior.

People should be Bayesians if that fits with their philosophy of doing science. But not just because a Bayesian program happens to run faster than a non-Bayesian one. They should also realize that we will continue to have both Bayesians and non-Bayesians. Biologists sometimes think that this controversy emerged in their field and will be settled there -- that one more really good argument and everyone will become a Bayesian. They might not be aware that Bayesian arguments have been around since 1764. There is no new decisive argument that's going to arise in our field.

The issue to contemplate is the priors, not the details of MCMC techniques. We have not yet seen a case where an important conclusion depends strongly on what prior you assume. Perhaps we never will, but if a case like that arises, and causes trouble for Bayesian approaches, people should not be too surprised.

LH: Your work has inspired a generation of comparative biologists. Any advice for those of us just starting out on our careers?

JF: I have too many opinions on that for this forum. I guess I would urge people to take a long view and to realize that it takes time for methods to be developed, published and used, and to prepare themselves for the new forms of data that are coming. When I submitted my 1985 comparative methods paper, the referees were dubious about it because it required phylogenies, whereas they felt that only classifications were going to be available! A year or two earlier and it might not have been accepted for publication. I would also urge people to become familiar not only with phylogeny methods and statistical techniques, but also with the theoretical side of evolutionary biology. We're entering a period when there is going to be a merger (or Reunion) of between-species phylogenetic inference and within-species population genetics. I'm worried that we are graduating too many people who know what Subtree Pruning and Regrafting is, but who have no idea what Wahlund's Law is, or how mutational load arguments work. Theoretical population genetics is in danger of becoming a lost art, just when it is most needed. Comparative biologists should learn it -- and teach it.

Wednesday, April 22, 2009

From the Mathematical Biosciences Institute of the Ohio State University

Guest Post from Dr. John W. Wenzel:
As a point of information, earlier statements posted by other writers on this blog may have led to mistaken impressions. The 2005 workshops run by the Mathematical Biosciences Institute of the Ohio State University were entirely organized and funded by the MBI (see http://www.mbi.osu.edu/). From the web site, we read the relevant events as follows:

September 7-9, 12-13, 2005
Tutorial on Tree Reconstruction and Coalescence Theory
September 26-30, 2005
Workshop 1: Phylogeography and Phylogenetics
November 14-18, 2005
Workshop 2: Aspects of Self-Organization in Evolution
December 1-2, 2005
Current Topics Workshop: The Problems of Phylogenetic Analysis of Large Datasets

The three workshops on phylogenetics were arranged under the direction of Dennis Pearl of our Department of Statistics. Dennis had help from people he chose for each workshop. In the September 26-30 workshop on phylogenetics, contributors were, in order (again from the web site), Elizabeth Allman, Mike Steel, Flavia F. Jesus, Ligia Mateiu, Michael Hickerson, Jeff Pan, Amy Russell, Liang Liu, Bryan C. Carstens, Yoko Satta, Craig Moritz, Antonis Rokas, Marc Suchard, Tandy Warnow, Laura Salter Kubatko, Susan Holmes, Scott Edwards, Noah Rosenberg, Mark Beaumont, Lacey Knowles, Stuart Baird, Peter Beerli, Chuck Cannon, and Robert Griffiths. In the December workshop that attracted so much attention on this blog, speakers were, in order, Walter Fitch, Diego Pol, Dan Janies, Usman Roshan, Pablo Goloboff, James Farris, Bernard Moret, Andres Varon, Ward Wheeler, Gonzalo Giribet, Alexandros Stamatakis, and Bret Larget. The substantial funding and organization that OSU has put forward is aimed at producing the highest caliber program. We are proud of our accomplishments, we continue to lead by example with subsequent and current workshops, and we invite others to emulate our efforts.

Cladistics Post Deleted

A recent Dechronization post related to OSU's Workshop in Phylogenetics has been deleted. This post was intended as an inside joke for the small readership of this blog, but was inappropriate and unprofessional in this context and for a public forum of this nature. If you were affected by this post and would prefer to have some of the content reposted in this forum, we would be happy to accommodate you.

Saturday, April 18, 2009

New Program for Studies of Environmental Niche Evolution

Late last year, Dan Warren, Michael Turelli, and I wrote a paper about niche evolution in which we developed new metrics and statistical tests for comparative studies of environmental niche models (ENMs). Our basic metrics permit quantification of similarity between ENM projections generated for two or more populations. These metrics may either be explored in a phylogenetic context, or used in association with pseudoreplicated datasets to test two null hypotheses at opposite ends of the niche similarity continuum: (1) ENMs are identical, and (2) ENMs are no more similar than expected by chance. Although our methods could work with several types of niche modeling algorithms, they are best suited to output generated by the maximum entropy method implemented in the program Maxent. We have now written a program of our own called ENMTools that interacts extensively with Maxent to implement the analyses discussed in our paper (more accurately, Dan Warren wrote a program and Michael and I beta tested it). Dan has done a masterful job with this Perl application, which presents a simple GUI on any platform capable of running ActivePerl (including Linux, Mac OS X, and Windows). In addition to performing the methods we've already introduced, new functionality is being added constantly (although some of the coolest stuff is purposely left unexplained so we can publish the methods before they start getting used by others). Dan has set up a website and a blog to keep people informed about the latest developments, and hopes you appreciate his retro-internet stylings.
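To give a flavor of what quantifying similarity between two ENM projections can look like, here is a minimal Python sketch. It is not ENMTools code and does not reproduce that program's interface. As an assumption for illustration, it uses Schoener's D, a widely used overlap statistic, applied to two lists of suitability scores over the same grid cells; the function names and the example scores are made up.

```python
def normalize(scores):
    """Rescale non-negative suitability scores so they sum to 1."""
    if any(s < 0 for s in scores):
        raise ValueError("suitability scores must be non-negative")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must contain at least one positive value")
    return [s / total for s in scores]


def schoeners_d(scores_x, scores_y):
    """Schoener's D between two projections over the same grid cells:
    1 means identical projections, 0 means no overlap at all."""
    if len(scores_x) != len(scores_y):
        raise ValueError("projections must cover the same grid cells")
    px, py = normalize(scores_x), normalize(scores_y)
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(px, py))


if __name__ == "__main__":
    # Made-up suitability scores for two populations over five grid cells.
    pop_x = [0.9, 0.7, 0.2, 0.1, 0.0]
    pop_y = [0.1, 0.3, 0.6, 0.8, 0.4]
    print(round(schoeners_d(pop_x, pop_y), 3))
```

A statistic like this is only half of the story: the tests described above come from comparing the observed value against the values obtained from pseudoreplicated datasets.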

Friday, April 17, 2009

Evolution 2009: Last Day for Early Registration and Presentation Submission

Just a reminder - today is the last day to submit presentations and register at the early rate for the joint Evolution meetings in Idaho this summer. A bunch of us from Dechronization will be there. We'll be cooperating with conference organizers to expand blog coverage of the meetings and partying in the Dechronization suite. According to Luke Harmon, our suite is located just across the street from an alcohol and tobacco vendor and gun shop, so we should be well equipped.

Worse Than Alligators in the Sewers

Writing in the latest New Yorker, Burkhard Bilger suggests that Florida is like Club Med for exotic tropical species, "an exclusive seaside getaway, far from the fang and claw of the usual tropical crowd." Exotic species have been checking in for decades, but the potential gravity of the problem didn't enter public consciousness until a group of Everglades tourists captured footage of an exotic Burmese python's epic battle with an alligator in 2003. Shortly thereafter, the hero of Bilger's piece - Everglades biologist Skip Snow - started to discover hatchling pythons, prompting state wildlife managers to quickly switch from telling him "no problem at all" when he raised concerns about Everglades pythons to telling him "you might as well give up". Although it may not yet be time to give up, I was surprised by the overly simplistic strategies supported by some professional wildlife managers (e.g., "It's time to stop studying these things and start killing them"). I would hope that if we've learned one thing about invasive species it is that simple brute force extermination does not work, particularly in an area as large as the Everglades and when the strategy being employed is as simple as the intentional road-killing or "rapid-acceleration removal method" practiced by some Florida biologists. One source of scientific information with management significance comes in the form of recent climate envelope and niche modeling studies conducted by Rodda et al. (1) and Pyron et al. (2). Although Rodda et al.'s climate envelope models suggest probable expansion of Burmese Pythons throughout the southern United States, Pyron et al.'s niche modeling analyses suggest a much smaller potential range, and that concern about such expansion should not be at the top of the list of challenges facing managers of python populations. (Is this the first time niche modeling studies have received a nod on the pages of the New Yorker?)
Of course, a range of phylogenetic and phylogeographic studies are also providing insight on the history, biology, and management of invasive species.

Although most Floridians seem more likely to be concerned about the safety of their house pets than the loss of an ecosystem, Bilger does a beautiful job driving home the profound significance of the latter when he suggests that some invasive species can "...change the way we see a place. A parrot in Miami is like a McDonald's in Kathmandu: a sign that you are everywhere and nowhere at once."

Thursday, April 16, 2009

Coffee Shop Phylogenetics #2: What's More Prehistoric, the Robin or the Blue Jay?

I just showed up for a latte at my favorite coffee shop and my buddy Ian was there waiting with a question: "What's more prehistoric, the Robin or the Blue Jay"? He was thinking it was the Blue Jay due to overall physical appearance and their prehistoric squawking calls. It's impossible to answer this question in a manner that's going to satisfy the serious phylogeneticist, but because I like to think of myself as a phylogeneticist of the people I'm going to take a stab at this one. To avoid troublesome inference about which species is more primitive or more advanced, we should focus simply on which extant species has been around for longer. If we look at Hackett et al.'s recent phylogenomic analysis of birds, we find that the Robin's genus (Turdus) is included, but the closest thing to a blue jay is the con-familial crow (Corvus). Let's approach this question from the family level by contrasting the phylogenetic position of the crows and jays (Corvidae) with that of the thrushes (Turdidae). It's clear that the Corvidae branched off from a clade including the Turdidae and a range of other families relatively deep in the Oscine radiation. This pattern certainly fails to reject Ian's hypothesis, but it's unclear that anything shy of a comprehensive species-level, time-calibrated phylogeny would be able to do more. Any ornithologists or paleontologists care to weigh in on this important topic?

Wednesday, April 15, 2009

Why I love the American Museum....

I love the American Museum of Natural History. I was in NYC this past weekend and spent some time on the fourth floor of the AMNH. For those of you who haven’t visited, this floor hosts what may be the most awe-inspiring and beautiful collection of mineralized bone ever displayed. I could spend hours just wandering through the Hall of Vertebrate Origins. If there is anything, anywhere, that better illustrates the shockingly bizarre diversity of vertebrate body plans through time, I haven’t seen it. I really like the fact that the AMNH still believes that bone and stone are preferable to the interactive “discovery center” exhibits that dominate the majority of natural history museums these days. When I go to a museum, I want to see disarticulated ichthyosaurs that speak of rotting flesh on the bottom of a Kansan ocean. Give me wrinkled duck-bill mummies, blocks of dead fish from Eocene swamps, or tangled Coelophysis skeletons from Ghost Ranch. Call me a purist, but I don’t like my fossils soiled by dinomation and artistic reconstruction. When I was an aspiring young paleontologist, I found inspiration in the fossils themselves, and I find it a bit sad that so many museums have moved away from this in favor of the sound and fury of faux dinosaurs. I can't be the only one who feels this way...(?)

As an aside, I also think the AMNH has done a fabulous job of grounding this paleodiversity in a phylogenetic framework. Trees are everywhere. My suspicion is that most museum visitors take away very little from this, but I found it to be wonderful.

Dechronization Interviews Jack Sullivan, Editor-in-Chief of Systematic Biology

I have decided to conduct a series of interviews of prominent evolutionary biologists who work with trees, and post them on this blog. For the first of these, I interviewed Jack Sullivan, Editor-in-Chief of Systematic Biology and a professor in my department at the University of Idaho (photo at left, in his natural habitat). I asked Jack a few questions about the field of systematics and some related issues. It is probably worth noting that I didn’t have a tape recorder or anything like that with me, so Jack’s answers are paraphrased. Thanks to Jack for being my guinea pig; if there are errors below they are probably mine.

Question: What are the most exciting recent developments in systematics?
I think there are three. First, there are second-order statistical analyses that can now be applied across a sample of trees from a Bayesian posterior distribution. These include biogeography, comparative methods, and macroevolutionary tests. We used to have to rely on a single tree for our analyses; now we can do the same analyses accounting for phylogenetic uncertainty by sampling from the posterior distribution of trees. Second, the explicit accommodation of incongruence in analyses of multilocus data through the use of the coalescent. I think it will be really cool when we can use these approaches to differentiate incongruence caused by coalescent stochasticity from that caused by nonvertical transmission such as horizontal gene transfer or hybridization. Third, the development of phylogenomics. I remember a symposium debate at the Evolution meetings when I was a graduate student in the early 1990s. The debate was about total evidence approaches versus other methods. During the debate, someone raised the question, “If we could sequence every single nucleotide in the genome, would we then get the best possible estimate of the phylogeny?” I think that emerging datasets demonstrate that the answer to this question might be, “not necessarily.”

Question: What is the role of Editor-in-Chief of prominent journals?
It really depends on how heavy-handed you want to be. In our journal, Systematic Biology, content is really meant to be driven by the Society for Systematic Biology (SSB). Because of this, I have tried to be less heavy-handed in the journal’s direction. The direction of the journal should be driven by members of the society as reflected by submissions. There are some topics on which I wish we had fewer submissions (for example, phylocode and DNA barcoding). When papers are submitted and go through review with positive results, I am very reluctant to reject them based on the subject matter.

Question: So you view the editor's role as more of a service to the society rather than an opportunity to shape the field?
Both. The editor can shape the field by insisting on maintaining the highly rigorous standards for data analysis that Systematic Biology is known for, especially for empirical papers. Particular things that I require as EIC might differ from my predecessors.

Question: What is the difference between a good and a bad review of a paper?
The primary characteristic of an excellent review is that the reviewer has assumed the role of silent partner - this comes from Dick Olmstead when he was the editor. Reviewers do this because it has been done for them at the journal. We have an incredibly valuable tradition of rigorous yet constructive feedback in reviews.

Question: Do you have any advice for the next generation of systematists?
As early as possible, find your niche that differentiates you from all of your peers that are doing great work. You cannot just do “comparative biology of (fill in the blank)” or “molecular phylogeography of (fill in the blank).” Probably the easiest way to think about this is to imagine yourself on an airplane next to an intelligent layperson. Convey to them what is important about what you do in a manner that is unique. This is critical for the job search - it is a rare situation when the audience [of a job talk] is just phylogeneticists or even evolutionary biologists.

Question: OK now I’m going to ask you about two controversial groups. What is your take on the cladists?
The view that statistics are anathema to systematics is dead. All the vitality in the discipline is in statistical approaches.

Question: And how do you think we should respond to the creationists?
Fighting court battles requires very different tactics than changing public opinion. To affect public opinion, there are two things we can do:
1. Publicly deconstruct the false dichotomy between macroevolution and microevolution.
2. Engage in a strong public outreach campaign over the importance of evolution in day to day life.

Snakes on a Plane

Everyone who's seen Rain Man knows that Qantas is the safest airline in the world (but see Qantas fatal accidents). Today's news out of Australia, however, suggests that they may also be the first airline to ground a flight because of a harmless python. The Age is reporting that a flight from Melbourne to Sydney was cancelled after four juvenile Stimson's pythons (Antaresia stimsoni) went missing during a previous flight from Alice Springs to Melbourne. I hope they cancelled this flight because of concern about the snakes gumming up the plane's mechanics, because Stimson's pythons are among the most docile snakes on the planet. As someone who's had ~50% success tracking down escaped snakes, I wish Qantas the best of luck in their efforts to track down the little beasts. My advice: If you look at something and say to yourself "I bet a snake couldn't get into that", you're wrong.

Thursday, April 9, 2009

When We Fail MrBayes…

A recent Dechronization post highlighted the unsuccessful attempts at Bayesian estimation of a large-scale bird phylogeny based on a multi-locus data set by Hackett et al. The apparent failure of MrBayes in this particular case (and under similarly challenging inference scenarios associated with large and/or complex data sets, e.g., Soltis et al., 2007; Moore et al., 2008) appears to raise serious concerns regarding our ability to estimate large-scale phylogeny using Bayesian methods.

However, it is important to carefully consider precisely what such studies have actually demonstrated: that Bayesian estimation of phylogeny appears to be intractable for certain data sets using default settings implemented in a particular program, MrBayes. Unfortunately, these anecdotal observations have led some researchers to a nested series of increasingly dubious and unsubstantiated conclusions: first, that it is impossible to reliably estimate phylogeny for this particular data set under any settings implemented in MrBayes; second, and more generally, that it is impossible to do so not only using MrBayes but using any Bayesian method; and finally, following this false premise to its ultimate conclusion, that it is impossible to reliably estimate phylogeny not only for this particular data set but for any large-scale data set using Bayesian methods.

Although Bayesian estimation of phylogeny appears to succeed for the vast majority of empirical problems, there remain inference problems for which Bayesian estimation is apt to be intransigent, which may be usefully divided into three categories: (1) inference scenarios in which reliable Bayesian (or any other) estimation is likely to be problematic (e.g., whole-genome alignments for extremely large numbers of species); (2) inference scenarios in which rigorous application of existing Bayesian methods is apt to fail; and (3) inference scenarios in which imprudent application of existing Bayesian methods using default settings is apt to fail. I believe that the vast majority of reportedly “impossible” Bayesian phylogeny estimation problems fall within the latter two categories.

Existing implementations, such as MrBayes, approximate the joint posterior probability density of phylogeny and model parameters using some form of MCMC sampling (typically based on the Metropolis-Hastings algorithm). These methods quietly specify a means of updating the value of each parameter (the proposal mechanisms), the probability of invoking each proposal mechanism (the proposal probability), and the magnitude of the proposed change issued by each proposal mechanism (the tuning parameters). Proposal mechanism design is an art form (there are no hard rules that ensure valid and efficient MCMC sampling for all problems). For this reason, for many (non-phylogenetic) Bayesian inference methods, it is the responsibility of the investigator to explore a range of proposal probabilities and tuning parameterizations that deliver acceptable MCMC performance.
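To make the role of a tuning parameter concrete, here is a minimal sketch of a random-walk Metropolis sampler (the special case of Metropolis-Hastings with a symmetric proposal). Everything in it is an assumption chosen for illustration: the target is a toy standard normal density rather than a phylogenetic posterior, and the function names and the `window` parameter are hypothetical, not taken from MrBayes.

```python
import math
import random


def log_target(x):
    """Unnormalized log density of a toy target (standard normal)."""
    return -0.5 * x * x


def random_walk_metropolis(n_steps, window, seed=1):
    """Sample the toy target with a symmetric sliding-window proposal.

    `window` is the tuning parameter: the half-width of the uniform
    proposal. Returns the sampled values and the acceptance rate.
    """
    rng = random.Random(seed)
    x = 0.0
    samples = []
    n_accepted = 0
    for _ in range(n_steps):
        proposal = x + rng.uniform(-window, window)
        log_ratio = log_target(proposal) - log_target(x)
        # Symmetric proposal, so the Hastings ratio is 1 and the
        # acceptance probability is min(1, target ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
            n_accepted += 1
        samples.append(x)
    return samples, n_accepted / n_steps


# Compare a timid, a moderate, and an overly bold proposal window.
for window in (0.1, 2.5, 50.0):
    _, rate = random_walk_metropolis(5000, window)
    print(f"window={window:5.1f}  acceptance rate={rate:.2f}")
```

A very small window accepts nearly every move but crawls across the target, while a very large window proposes mostly rejected jumps; neither mixes well, which is why a single default setting cannot be expected to suit every data set.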

Accordingly, most researchers familiar with Bayesian inference would consider it extremely naïve to expect that any specific MCMC sampling design would perform well for all (or even most) empirical data sets, especially in the very difficult case of phylogeny estimation. Nevertheless, the default settings of existing Bayesian phylogeny estimation programs are so successful that we are actually “shocked, shocked to find that MrBayes does not solve all of our problems!!” Without going into detail (as doing so would constitute an entirely separate post), the analyses detailed in the supporting material of the Hackett et al. study read like a recipe for failure, and I would venture that the putative 'impossibility' of obtaining a reliable estimate with MrBayes in this case falls squarely under the third inference scenario defined above.

What does this mean for our phylogenetic community? First, I would argue that researchers interested in Bayesian estimation of phylogeny need to become much, much more sophisticated about diagnosing MCMC performance, carefully assessing convergence (ensuring that the chain has reached the stationary distribution, which is the joint posterior probability density of interest), mixing (assessing movement of the chain over the stationary distribution in proportion to the posterior probability of the parameter space), and sampling intensity (assessing adequacy of the number of independent samples used to approximate the posterior probability). Second, I believe that developers of Bayesian methods need to encourage and facilitate more vigorous and nuanced exploration of MCMC performance among users of these methods.
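As one example of what diagnosing convergence can look like in practice, the sketch below computes the potential scale reduction factor of Gelman and Rubin for a set of independent chains. It is only an illustration under stated assumptions: the traces are simulated numbers standing in for post-burn-in log-likelihood samples, the function name `psrf` is hypothetical, and a real analysis would also examine mixing, effective sample sizes, and agreement among the sampled trees.

```python
import random
import statistics


def psrf(chains):
    """Potential scale reduction factor for equal-length chains.

    Values near 1 are consistent with the chains sampling the same
    distribution; values well above 1 indicate that they disagree.
    """
    n = len(chains[0])
    means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(0)

# Hypothetical traces, invented for illustration only.
# Two runs exploring the same plateau:
agree = [[-1000 + rng.gauss(0, 1) for _ in range(1000)] for _ in range(2)]
# Two runs stuck on different plateaus:
stuck = [
    [-1000 + rng.gauss(0, 1) for _ in range(1000)],
    [-995 + rng.gauss(0, 1) for _ in range(1000)],
]

print("same plateau:      ", round(psrf(agree), 3))
print("different plateaus:", round(psrf(stuck), 3))
```

Note that each chain in the second example would look perfectly stationary if inspected on its own; only the comparison among independent runs reveals the problem.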

Researchers unwilling to develop the requisite knowledge to properly diagnose and troubleshoot MCMC performance should seriously consider alternative strategies, including collaborating with researchers who possess these skills or, of course, pursuing alternative inference methods, including ‘fast’ ML approaches. However, it seems that most researchers are equally unclear about the potential deficiencies of the latter methods. Along these lines, and in the spirit of the anecdotal account that inspired this post, I note that I have encountered many data sets for which multiple independent searches using fast ML methods (implemented in GARLI and RAxML) rendered a series of estimates with significantly different MLEs, whereas convergence to a significantly higher mean marginal log likelihood using MrBayes appeared to be unproblematic. Indeed, Hackett et al. note that 80–90% of their fast ML searches converged to solutions with significantly different MLEs!! Moreover, the best of their fast ML searches--apparently based on a partitioned analysis using RAxML--resulted in a phylogeny with a log likelihood of -866,017.07, which is ~5,000–6,500 log likelihood units worse than the ‘unreliable’ plateaus in the time series plots of the marginal log likelihoods estimated with MrBayes!! Clearly, there are no easy solutions to these hard problems...

Tuesday, April 7, 2009

DNA and pinned specimen

In a paper from last week's PLoS One (also highlighted in today's Science Times), Thomsen et al. describe a method that may be used to extract DNA from insect specimens collected as far back as 188 years ago. Remarkably, this method also avoids destruction of the pinned insect (see photo - this beetle is post-extraction). Twenty of twenty museum specimens examined yielded good, if short (~200 bp), sequences from mitochondrial genes. There was also some limited success on insects that were even older from non-frozen conditions. One big caveat is that all of their tests were done on beetles, which obviously are some of the most durable of insects, but it was nonetheless exciting. Although they didn't test this, it sounds like their DNA might also be useful for other short fragments, such as microsatellites - opening up doors for tracking lots of interesting population biology of insects, including pests and maybe even vectors. Suddenly those cabinets and cabinets of pinned insects I'm surrounded by seem all the more interesting!

Monday, April 6, 2009

Complexity in Crustaceans: A Driven Trend in Organismal Design

One of my favorite papers from last year was the analysis by Adamowicz et al. (PNAS) of an apparent trend towards increasing complexity within the Crustacea. Evolutionary biologists have long been fascinated by trends, and many have at least some familiarity with Cope’s Rule, the proposed trend towards increasing body size within lineages. Adamowicz and colleagues looked at complexity within multiple lineages of crustaceans, from the Cambrian to the present. They found that many indices of complexity, including number and disparity of limb types as well as disparity of limb form, generally showed parallel increases through time.

The very issue of whether complexity is a trend has been controversial (e.g., McShea 1996). Moreover, some might complain that this smacks too much of orthogenesis or progressionism for their tastes. And what do we mean by complexity, anyway? These issues have been and will continue to be debated in the literature. But one of the neatest things about the Adamowicz paper is that they provide a possible mechanism for a trend in complexity. They found that newly originated higher taxa had greater limb differentiation than their contemporaries, and that taxa going extinct had a lower degree of limb differentiation. Moreover, limb complexity turns out to be one of the strongest predictors of species richness in extant crustacean clades. Together, these suggest the possibility that the trend in complexity might be driven in part by differential speciation and extinction of lineages based on complexity. What if lineages with higher complexity diversified at greater rates than lineages with reduced complexity? Over time, traits associated with complexity might increase simply because of this connection to diversification. This research thus raises some intriguing levels of selection issues, because there is – in principle – no reason why complexity could only be favored by selection at the individual level.

How might limb complexity fuel the diversification process? The authors speculate that increased limb complexity might increase ‘evolvability’ (the meaning of which is even more fun to discuss than ‘complexity’!) and possibly promote niche specialization. They also note that new limb types might amplify the intensity of sexual selection, possibly serving indirectly to enable that supposed ‘engine of speciation.’ Anyway, don’t expect this paper to end with a case-closed feeling – after all, questions like these are on par with the biggest unresolved issues in biology. But there are lots of things to think about here!

New Insight on Size Free Morphometrics

An interesting discussion on how to remove size from phylogenetic comparative analyses of morphological data is playing out on the R-sig-phylo list-serv (a forum about the use and development of phylogenetic and comparative methods within the R platform). This topic has been contentious for some time and the ongoing discussion should be of interest to anyone looking to analyze morphological data in a phylogenetic context. Several heavy hitters (Ted Garland & Joe Felsenstein) and Dechronization bloggers (Dan Rabosky & Liam Revell) have already weighed in with insightful remarks.

Coffee Shop Phylogenetics #1: Is the Guinea Pig a Rodent?

Confusion is the typical reaction when you tell somebody you're a phylogeneticist. Sometimes though, we phylogeneticists chance upon someone who's been waiting weeks, months, or even years to consult someone from our profession. For me this tends to happen when I strike up conversations with strangers while waiting in line at my local coffee shop. My most recent experience with coffee shop phylogenetics involved a guinea pig lover with a pressing question about where her beloved pets fall in the tree of life. She was particularly eager to get my thoughts on a rumor circulating among fellow aficionados that the guinea pig is not a rodent (apparently, some guinea pig fans would like to distance themselves from the less-reputable mouse and rat lovers). I laughed and said "Of course, the guinea pig is a rodent. How else would you explain all of their stunning similarities, like the presence of constantly growing upper incisors?" When I tried to track down the source of her information, however, I was stunned to learn of the guinea pig battles that raged among phylogeneticists in the early to mid-1990s.

Things kicked off in 1991 with a parsimony analysis of amino acid sequence data published in Nature by Graur et al. suggesting that mouse-like rodents (myomorphs) were more closely related to primates than they were to guinea pigs (hystricomorphs). Hasegawa et al. responded immediately, showing that monophyly of rodents (myomorphs + hystricomorphs) was supported by maximum likelihood-based analyses and suggesting that the unusual myomorphs + primates inference was due to parsimony's inability to deal with unequal evolutionary rates. In a '92 response, Graur's group stood their ground, arguing that Hasegawa et al.'s results were an anomaly resulting from maximum likelihood analyses of highly divergent, "nonconservative" proteins. Graur continued to discuss the distinctness of guinea pigs and lobbied to have the Hystricomorpha recognized as a distinct order representing "one of the most ancient branches in eutherian evolutionary history." In a '93 PNAS paper, Martignetti and Brosius used the presence of a neural specific small cytoplasmic RNA (BC1 RNA) in guinea pigs and other rodents - but not in other mammals - to argue for inclusion of guinea pigs with rodents. Additional phylogenetic analyses of DNA sequence data by Hasegawa's group in '94 and Frye and Hedges in '95 further supported the guinea-pigs-as-rodents hypothesis. By '96, even Graur had changed his tune and was considering the Hystricognathi a suborder of Rodentia (my knowledge of this history is obviously incomplete and he may have addressed this point more directly elsewhere).

Just when the dust had settled, things blew up again with the publication of a Nature paper by D'Erchia titled simply "The Guinea Pig is Not a Rodent". Although this paper rejected rodent monophyly, it suggested a rather different tree than that of Graur et al. (1991). This time, the New York Times even got involved. Of course, Hasegawa's group rallied once again to dismiss the guinea-pig-is-not-a-rodent argument, arguing that, at the very least, there simply wasn't enough support to overturn the traditional classification (a point that was reinforced by similar conclusions from Philippe). Where do things stand now? Suffice it to say that nearly everything published over the last 10 years has strongly supported inclusion of guinea pigs in a monophyletic Rodentia (e.g., Prasad et al.'s recent phylogenomic analysis of mammals). In any case, the guinea pig wars represent an interesting historical anecdote and a powerful example of the symptoms that can result when systematists are engaged in intense debate over the value of different types of data (morphological versus molecular) and different types of phylogenetic methods (parsimony versus maximum likelihood).

Sorry guinea pig lovers: you're living with a rodent whether you like it or not.

Friday, April 3, 2009

When MrBayes Fails...

Last summer, Hackett et al. published a widely-read study of phylogenetic relationships among major bird lineages based on 19 independent loci sampled from 169 species (see also Tom Near's previous post). Their study confirmed some patterns suggested by previous phylogenetic studies (e.g., ratites + tinamous as sister to remaining bird species) while also recovering some novel patterns (e.g., passerines sister to parrots [albeit with low support]). One of the more interesting results from their analyses, however, was relegated to the on-line supplement. In this supplement, we learn that all eight of the 10 million generation partitioned analyses they ran in MrBayes apparently failed to reach stationarity (see figure; note that the first 2 million generations are inexplicably trimmed from each analysis as 'burn-in'). Unpartitioned analyses fared even worse, resulting in immediate crashes "regardless of the memory capacity of the computers used."

Among the partitioned analyses, some continued to shift to new areas of the likelihood surface until relatively late in the analysis. Perhaps even more troubling though was the fact that analyses that did appear to reach a stable plateau sampled significantly different likelihood scores (e.g., lnL of -861,000 vs. -859,500). Is this problem unavoidable in analyses of large datasets?

The most obvious solution would be to simply run the analyses for more than 10 million generations. I've certainly had analyses that required more than 10 million generations to reach stationarity. Perhaps this wasn't done because it took two months on a supercomputer to run the 10 million generation analyses (anybody know if Hackett et al. or others have implemented longer runs since their paper was published?). Another possible solution to their problems is to modify the parameters of the MC3 analyses implemented by MrBayes (recall that the MrBayes default is to run two independent MC3 analyses with one cold chain and three heated chains). Hackett et al. explored this possibility by running six analyses with one heated chain and one cold chain (B1-B6) and two analyses with six heated chains and one cold chain (A1-A2). The analyses run with multiple heated chains performed significantly better than those with a single heated chain, perhaps due to the fact that multiple chains are incrementally heated by MrBayes (meaning that the fourth of six heated chains has a flatter likelihood surface than the first). Hackett et al. do not discuss the temperatures used for the heated chains in their analyses, but their results suggest that running multiple heated chains in a single analysis is superior to repeatedly running analyses with only one heated chain.
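For readers who want to see what incremental heating amounts to, here is a small sketch. It assumes the scheme described in the MrBayes manual as I understand it, in which chain i (with i = 0 for the cold chain) samples the posterior raised to the power 1 / (1 + i × T). The temperature T = 0.2 below is purely an assumption for illustration, since Hackett et al. do not report the values they used, and `chain_heats` is a hypothetical helper rather than part of any program.

```python
def chain_heats(n_heated, temp):
    """Return the power applied to the posterior for each chain.

    Index 0 is the cold chain (power 1.0); higher indices are
    progressively hotter chains with flatter surfaces.
    """
    return [1.0 / (1.0 + temp * i) for i in range(n_heated + 1)]


TEMP = 0.2  # assumed temperature; not reported by Hackett et al.

# One heated chain (B1-B6), the three-heated-chain default,
# and six heated chains (A1-A2).
for n_heated in (1, 3, 6):
    heats = chain_heats(n_heated, TEMP)
    print(n_heated, "heated:", [round(h, 3) for h in heats])
```

Under this scheme, adding heated chains does more than add copies of the same chain: each additional chain explores a progressively flatter version of the surface, which may help explain why the runs with six heated chains fared better than those with one.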

In any case, Hackett et al.'s ultimate solution was to discard all of their Bayesian analyses and rely instead on parsimony and the fast maximum likelihood methods implemented by GARLI and RAxML. Is this shift away from Bayesian inference in favor of fast maximum likelihood searches for computational reasons a sign of things to come (or has this shift already occurred)? Are the fast maximum likelihood methods ready for prime time, or do people remain uncomfortable with the shortcuts they use to achieve their apparent computational efficiency?