Wednesday, April 29, 2009
Hopefully this study and others like it (e.g., Wargo et al., 2007) will be carefully read by those making public health decisions, because thinking long-term, in an evolutionary sense and not just a medical sense, is absolutely critical. And if you're not a public health official, read these studies anyway: they make great examples for teaching the importance of evolution to everyday life.
Monday, April 27, 2009
Sunday, April 26, 2009
Saturday, April 25, 2009
I was in for a small surprise when, in preparation for my evolution class, which will be taught to a mostly pre-med audience, I read some fabulously interesting papers by Bernie Crespi and Chris Badcock [2,3; and popular accounts in NYT, Science]. Briefly, taking a cue from the early work on sexual conflict by W. D. Hamilton, and expanded by D. Haig, they contend that asymmetric expression of maternally and paternally imprinted genes may be responsible for a wide spectrum of seemingly unrelated mental illnesses.
Given that intragenomic conflict can drive the evolution of paternal and maternal imprinting, and imprinting can affect the development of the parts of the brain involved in social interactions in an opposite manner, Crespi and Badcock argue that balanced expression of those two components results in 'normal' cognitive and social development. Alternatively, a wide imbalance can have a strong negative outcome. If the mother's genetic self-interest wins, this can lead to hypermentalism (e.g. paranoid schizophrenia; pathologically conspiracy-prone with delusions of grandeur, ambivalence). Conversely, male imprinting can lead to hypomentalism (autism spectrum; poor inference of intention, inability to deceive, deficit in personal agency, single-mindedness). The key prediction of the theory is that autism and schizophrenia occupy ends of a 'social brain' spectrum. This is significant because of the puzzling and hopelessly contradictory medical evidence. Autism and schizophrenia do not obey simple Mendelian inheritance, and this paralyzed the search for clinical treatments.
Although definitive evidence is still lacking, and some individuals can show signs of both spectrum disorders, the imprinted social brain theory is now supported by the frequent genomic co-localization of the two end-spectrum disorders, the distribution of copy number variants, predicted correlations with other mental conditions, as well as anatomical and epidemiological data (but see critiques in  and elsewhere). At the very least this contention is testable and, if it holds up, it may show outstanding clinical payoffs. It seems that the deadweight could have paydirt potential, after all.
[1] Coyne, J.A. 2000. “The fairy tales of evolutionary psychology.” Review of A Natural History of Rape: Biological Bases of Sexual Coercion, by Randy Thornhill & Craig T. Palmer, MIT Press, 2000. The New Republic, March 4, 2000.
[2] Badcock C. and B. Crespi. 2008. Battle of the sexes may set the brain. Nature 454:1054-1055. (Photo credit: J. Robinson)
[3] Crespi, B. and C. Badcock. 2008. Psychosis and autism as diametrical disorders of the social brain. [with commentary] Behavioral and Brain Sciences 31:241-320.
[4] Dunbar, R.I.M. 1998. The Social Brain Hypothesis. Evolutionary Anthropology 6:178-190. (Dechro peeps: we should totally get and re-analyze this data).
Friday, April 24, 2009
Battle of the Sexes: Asexuality versus Sexuality by Jesse L. Grismer
The Herpetofauna of Guana Island: An Annotated Checklist and Travelogue by Gad Perry and Robert Powell
Arboreal Alligator Lizards in the Genus Abronia: Emeralds of the Cloud Forests of Guatemala by Daniel Ariano-Sánchez and Lester Melendez
Beyond 2008 "Year of the Frog": The Challenges Facing Amphibians and the Amphibian Ark by Ron Gagliardo
One Species that Will be Saved: The Grand Cayman Blue Iguana by Fred Burton
Madagascar Travelogue by Seth Rudman (Glor Lab undergraduate!)
As an example: one macroecological metric is the “root distance”: basically, the number of nodes separating a species from the root of a phylogenetic tree. Several studies have looked at mean root distances among species within regions, classifying species as basal (few nodes between root and tip) or derived (lots of nodes between root and tip). Under this classification scheme, there are very interesting differences in species richness between basal and derived taxa.
I have a hard time getting over my initial visceral reaction to the use of ‘basal’ versus ‘derived’ in this context (see previous discussion on the “coffee shop phylogenetics” series). While I think these studies are on to something, my take on root-node distances is that they are a metric of diversification rate or total diversification. Regions with more “derived” species thus contain more species from clades that have undergone substantial diversification (and hence, have greater root-tip nodal distances). But I think a focus on basal and derived taxa is confusing and this literature could benefit from eliminating the use of these terms in association with extant taxa (see, for example, Crisp and Cook on this subject).
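To make the metric concrete, here is a minimal sketch of a root-distance calculation. The tree, the node names, and the convention of counting the root itself as one of the nodes are all assumptions for illustration; published studies may count nodes slightly differently.

```python
# Hypothetical ladder-like tree stored as a child -> parent map.
# In Newick terms: (A,(B,(C,(D,E))));
PARENT = {
    "A": "root", "n1": "root",
    "B": "n1", "n2": "n1",
    "C": "n2", "n3": "n2",
    "D": "n3", "E": "n3",
}

def root_distance(tip, parent=PARENT):
    """Count the nodes passed on the way from a tip to the root (root included)."""
    count = 0
    node = tip
    while node in parent:
        node = parent[node]
        count += 1
    return count

TIPS = ["A", "B", "C", "D", "E"]
ROOT_DISTANCES = {tip: root_distance(tip) for tip in TIPS}

for tip in TIPS:
    print(tip, ROOT_DISTANCES[tip])
```

In this toy tree, A sits one node from the root while D and E sit four nodes away. All five are extant tips of the same age; the larger counts simply reflect that D and E belong to the side of each split that went on to split again, which is the point about diversification made above.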
Thursday, April 23, 2009
Most of the questions below are from me (LH) but a couple come from Dan Rabosky (DR). Many thanks to Joe for participating.
JF: The availability of genome-scale information is certainly one. The arrival of a generation of young researchers who are comfortable with statistical and computational approaches is another. But the most important development is reflected in recent work on coalescent trees of gene copies within trees of species. What this does is tie together between-species molecular evolution and within-species population genetics. Those two lines of work have been developing almost independently since the 1960s. But now, with population samples of sequences at multiple loci in multiple related species, they are coming back together. This is not another Modern Synthesis, but it is a major event that needs a name. How about the "Family Reunion"? Long-estranged relatives who have not been in touch are getting together.
LH: Take us back to the beginnings, back when you were working on phylogenetic and comparative methods for your PhD thesis. Where did you derive your inspiration? Did you anticipate the impact that this work would have on the
JF: I did not anticipate it at all. My original thesis project with Dick Lewontin was a rather grandiose theoretical population genetics macroevolution model -- my idea, not his. It didn't work out and I didn't have any useful results. Meanwhile Lynn Throckmorton and Jack Hubby, whose labs were nearby, needed someone to write a clustering program for protein electrophoresis band data that they had in multiple Drosophila species. I volunteered and was fascinated by the algorithms. I went on to write parsimony programs for the Camin-Sokal, Dollo, and polymorphism parsimony criteria, and then to work on how to infer trees by likelihood using Anthony Edwards and Luca Cavalli-Sforza's Brownian motion approximation to gene frequency drift. Dick finally suggested that I write this up for my thesis, which I did in 1967 (the degree was officially 1968). Through the 1970s I maintained a sideline of work on trees while mostly working in theoretical population genetics. It was really not until about 1978 that I began to see that this was becoming more important, and that it fit in with my interest in evolution beyond the species boundary. So I shifted my work toward trees and dropped out of theoretical population genetics.
DR: A lot of what we do in comparative methods is based on Brownian motion, or models for which BM is a special case (e.g., OU). As you (Felsenstein) have written, "Brownian motion is a poor model, and so is Ornstein-Uhlenbeck, but just as democracy is the worst method of organizing a society 'except for all the others', so these two models are all we've really got that is tractable. Critics will be admitted to the event, but only if they carry with them another tractable model."
And for discrete traits, we use Markovian models that assume (generally) homogeneous rates through time and among lineages. Undoubtedly, the math for this could get out of hand, but at some point I think we'll have to do something to explore (among other things) more realistic constraint surfaces etc.
Given this, what do you view as "the frontier" for models of continuous and discrete character evolution? New mathematics? Approximate Bayesian approaches that rely on simulation to deal with analytically intractable scenarios?
JF: Hard to see what. I think one framework will be models in which a population "chases" an adaptive peak which is moving. But we need to have some model for how the peak moves, and aside from having a mechanistic and ecological model of the function of the character this is not forthcoming. Nor is it easy to see how adaptive peaks in sister species become different from each other. We're also going to find that the amount of information available to tell different schemes of selection pressure apart will be small. We are going to have to be able to characterize what we can and can't know given the data. Just adding new mathematical tools or lots of simulation will not resolve these dilemmas.
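For readers who have not worked with the two models named in this exchange, the following is a rough simulation sketch, not anything taken from the interview. The function names, parameter values, and the simple Euler-style time stepping are assumptions chosen for illustration. Setting the pull strength `alpha` to zero turns the Ornstein-Uhlenbeck walk into plain Brownian motion, which is the sense in which BM is a special case of OU.

```python
import random

def simulate_bm(x0, sigma, steps, dt=1.0, rng=None):
    """Brownian motion: each step adds an independent normal change."""
    rng = rng or random.Random(0)
    x = x0
    path = [x]
    for _ in range(steps):
        x += sigma * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

def simulate_ou(x0, sigma, alpha, theta, steps, dt=1.0, rng=None):
    """Ornstein-Uhlenbeck: Brownian noise plus a pull of strength alpha toward theta."""
    rng = rng or random.Random(0)
    x = x0
    path = [x]
    for _ in range(steps):
        x += alpha * (theta - x) * dt + sigma * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

# With alpha = 0 the optimum theta has no effect and the walk is pure BM.
BM_PATH = simulate_bm(0.0, 1.0, 50, rng=random.Random(1))
OU_NO_PULL = simulate_ou(0.0, 1.0, 0.0, 5.0, 50, rng=random.Random(1))

# With a strong pull and little noise, the trait settles near the optimum.
OU_PULLED = simulate_ou(0.0, 0.1, 0.5, 5.0, 200, rng=random.Random(2))
print(round(OU_PULLED[-1], 2))
```

Note that `theta` is fixed here. The moving adaptive peak that JF describes would need its own model for how `theta` changes through time, and that is exactly the part he says is not forthcoming.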
DR: What do you think about the unification of modern (neontological) comparative biology with paleontology? There seems to be a lot of room for progress in this area. Do you have any suggestions for future directions?
JF: Oh thank you thank you thank you for giving me an opportunity to mount the soapbox and hold forth on one of my favorite topics. I've been working on this. See my paper in 2002:
Felsenstein, J. 2002. Quantitative characters, phylogenies, and morphometrics. pp. 27-44 in Morphology, Shape, and Phylogenetics, edited by N. MacLeod. Systematics Association Special Volume Series 64. Taylor and Francis, London.
and watch my Julian Huxley Lecture to the Systematics Association in London in 2008 which is available as a video also with a PDF of my slides.
Basically we can infer the tree of present-day species from molecular data, and then use it for morphological characters (or other measurable continuous or discrete characters) with a Brownian or OU model, to infer phylogenetic covariances of changes of characters. Then we can use these together with the fossil morphology to help place the fossils. (One could also use all this together in a giant likelihood or Bayesian inference but the gain in doing so will be very small as the morphology will add little to the inference of the tree, I think). One can also use bootstrap samples of trees in this, or samples from Bayesian posteriors.
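One ingredient of this kind of analysis can be sketched briefly: under a Brownian model on a fixed tree, the expected covariance between two species for a single character is the rate times the branch length they share between the root and their most recent common ancestor. The tree, branch lengths, and function names below are hypothetical, and this is only the among-species piece for one character, not the full multi-character machinery JF describes.

```python
# Hypothetical tree ((A:1,B:1):2,C:3) stored as child -> (parent, branch length).
EDGES = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}

def path_to_root(node, edges):
    """Return the (child, length) edges walked from a node up to the root."""
    path = []
    while node in edges:
        parent, length = edges[node]
        path.append((node, length))
        node = parent
    return path

def bm_covariance(tips, edges, sigma2=1.0):
    """Covariance between tips under BM: sigma2 times shared root-to-tip branch length."""
    cov = {}
    for a in tips:
        edges_a = dict(path_to_root(a, edges))
        for b in tips:
            shared = sum(length for child, length in path_to_root(b, edges)
                         if child in edges_a)
            cov[(a, b)] = sigma2 * shared
    return cov

COV = bm_covariance(["A", "B", "C"], EDGES)
for a in "ABC":
    print([COV[(a, b)] for b in "ABC"])
```

In this example A and B share the two units of branch length below their common ancestor, so they covary, while C shares nothing with either and is independent of both.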
There is lots to be done here and I am rushing to do it, and working with Fred Bookstein on the morphometric angles to this too. I wonder whether statistical frameworks such as this, together with within species quantitative treatment, will not be important in untangling the paleoanthropological mess caused by nonquantitative approaches to hominoid fossils.
LH: What do you think about the current trend in phylogenetics (and, lately, comparative biology) towards Bayesian approaches?
JF: I am a curmudgeon on this, in that Bayesian approaches do not feel right to me. So I have been resisting them. Bayesians were unhappy with the treatment of Bayesian Inference in my book, in that I did not give them four chapters, the last of which ended by declaring victory. I think we're all Bayesians when we come to cross the street, balancing evidence of approaching cars against our priors. But that's where one of the criticisms of Bayesianism comes in -- do we all have the same priors? Is there necessarily a single prior that you can use that will be broadly acceptable to your readership? If not, then maybe the reader of the paper should instead be given the likelihood curve so they can apply their own prior to it. For phylogenies, priors giving equal probability to all topologies (or to all labeled histories) would be noncontroversial. But the part of the prior that puts distributions on branch lengths could be wildly controversial. There is also the issue of whether some things, such as whether the sun will rise tomorrow morning, really should have a prior.
People should be Bayesians if that fits with their philosophy of doing science. But not just because a Bayesian program happens to run faster than a non-Bayesian one. They should also realize that we will continue to have both Bayesians and non-Bayesians. Biologists sometimes think that this controversy emerged in their field and will be settled there -- that one more really good argument and everyone will become a Bayesian. They might not be aware that Bayesian arguments have been around since 1764. There is no new decisive argument that's going to arise in our field.
The issue to contemplate is the priors, not the details of MCMC techniques. We have not yet seen a case where an important conclusion depends strongly on what prior you assume. Perhaps we never will, but if a case like that arises, and causes trouble for Bayesian approaches, people should not be too surprised.
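The suggestion that a reader be handed the likelihood curve and left to apply a prior of their own can be illustrated with a toy calculation. Everything here is made up for the purpose: the binomial data (3 successes in 10 trials), the grid, and the two priors. It is a sketch of the idea rather than anything used in phylogenetics software.

```python
def likelihood_curve(successes, trials, grid):
    """Binomial likelihood (up to a constant) evaluated at each grid value of p."""
    return [p ** successes * (1 - p) ** (trials - successes) for p in grid]

def posterior(likelihood, prior):
    """Combine a likelihood curve with a prior on the same grid and normalize."""
    unnormalized = [l * q for l, q in zip(likelihood, prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def grid_mean(grid, weights):
    """Mean of p under normalized grid weights."""
    return sum(p * w for p, w in zip(grid, weights))

GRID = [(i + 0.5) / 200 for i in range(200)]   # midpoints on (0, 1)
LIKELIHOOD = likelihood_curve(3, 10, GRID)     # the curve the paper would report

FLAT_PRIOR = [1.0 for _ in GRID]               # one reader: no preference
SKEPTICAL_PRIOR = [(1 - p) ** 4 for p in GRID] # another reader: expects small p

MEAN_FLAT = grid_mean(GRID, posterior(LIKELIHOOD, FLAT_PRIOR))
MEAN_SKEPTICAL = grid_mean(GRID, posterior(LIKELIHOOD, SKEPTICAL_PRIOR))
print(round(MEAN_FLAT, 3), round(MEAN_SKEPTICAL, 3))
```

The same likelihood curve gives a posterior mean near one third under the flat prior and near one quarter under the skeptical one. With ten observations the prior matters; with thousands it would matter much less, which may be why, as JF notes, we have not yet seen an important conclusion hinge on it.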
LH: Your work has inspired a generation of comparative biologists. Any advice for those of us just starting out on our careers?
JF: I have too many opinions on that for this forum. I guess I would urge people to take a long view and to realize that it takes time for methods to be developed, published and used, and to prepare themselves for the new forms of data that are coming. When I submitted my 1985 comparative methods paper, the referees were dubious about it because it required phylogenies, whereas they felt that only classifications were going to be available! A year or two earlier and it might not have been accepted for publication. I would also urge people to become familiar not only with phylogeny methods and statistical techniques, but also with the theoretical side of evolutionary biology. We're entering a period when there is going to be a merger (or Reunion) of between-species phylogenetic inference and within-species population genetics. I'm worried that we are graduating too many people who know what Subtree Pruning and Regrafting is, but who have no idea what Wahlund's Law is, or how mutational load arguments work. Theoretical population genetics is in danger of becoming a lost art, just when it is most needed. Comparative biologists should learn it -- and teach it.
Wednesday, April 22, 2009
As a point of information, earlier statements posted by other writers on this blog may have led to mistaken impressions. The 2005 workshops run by the Mathematical Biosciences Institute of the Ohio State University were entirely organized and funded by the MBI (see http://www.mbi.osu.edu/). From the web site, we read the relevant events as follows:
September 7-9, 12-13, 2005
Tutorial on Tree Reconstruction and Coalescence Theory
September 26-30, 2005
Workshop 1: Phylogeography and Phylogenetics
November 14-18, 2005
Workshop 2: Aspects of Self-Organization in Evolution
December 1-2, 2005
Current Topics Workshop: The Problems of Phylogenetic Analysis of Large Datasets
The three workshops on phylogenetics were arranged under the direction of Dennis Pearl of our Department of Statistics. Dennis had help from people he chose for each workshop. In the September 26-30 workshop on phylogenetics, contributors were, in order (again from the web site), Elizabeth Allman, Mike Steel, Flavia F. Jesus, Ligia Mateiu, Michael Hickerson, Jeff Pan, Amy Russell, Liang Liu, Bryan C. Carstens, Yoko Satta, Craig Moritz, Antonis Rokas, Marc Suchard, Tandy Warnow, Laura Salter Kubatko, Susan Holmes, Scott Edwards, Noah Rosenberg, Mark Beaumont, Lacey Knowles, Stuart Baird, Peter Beerli, Chuck Cannon, and Robert Griffiths. In the December workshop that attracted so much attention on this blog, speakers were, in order: Walter Fitch, Diego Pol, Dan Janies, Usman Roshan, Pablo Goloboff, James Farris, Bernard Moret, Andres Varon, Ward Wheeler, Gonzalo Giribet, Alexandros Stamatakis, and Bret Larget. The substantial funding and organization that OSU has put forward is aimed at producing the highest caliber program. We are proud of our accomplishments, we continue to lead by example with subsequent and current workshops, and we invite others to emulate our efforts.
Saturday, April 18, 2009
Friday, April 17, 2009
Although most Floridians seem more likely to be concerned about the safety of their house pets than the loss of an ecosystem, Bulger does a beautiful job driving home the profound significance of the latter when he suggests that some invasive species can "...change the way we see a place. A parrot in Miami is like a McDonald's in Kathmandu: a sign that you are everywhere and nowhere at once."
Thursday, April 16, 2009
Wednesday, April 15, 2009
As an aside, I also think the AMNH has done a fabulous job of grounding this paleodiversity in a phylogenetic framework. Trees are everywhere. My suspicion is that most museum visitors take away very little from this, but I found it to be wonderful.
Question: What are the most exciting recent developments in systematics?
I think there are three. First, there are second-order statistical analyses that can now be applied across a sample of trees from a Bayesian posterior distribution. These include biogeography, comparative methods, and macroevolutionary tests. We used to have to rely on a single tree for our analyses; now we can do the same analyses accounting for phylogenetic uncertainty by sampling from the posterior distribution of trees. Second, the explicit accommodation of incongruence in analyses of multilocus data through the use of the coalescent. I think it will be really cool when we can use these approaches to differentiate incongruence caused by coalescent stochasticity from that caused by nonvertical transmission such as horizontal gene transfer or hybridization. Third, the development of phylogenomics. I remember a symposium debate at the Evolution meetings when I was a graduate student in the early 1990s. The debate was about total evidence approaches versus other methods. During the debate, someone raised the question of, “If we could sequence every single nucleotide in the genome, would we then get the best possible estimate of the phylogeny?” I think that emerging datasets demonstrate that the answer to this question might be, “not necessarily.”
Question: What is the role of Editor-in-Chief of prominent journals?
It really depends on how heavy-handed you want to be. In our journal, Systematic Biology, content is really meant to be driven by the Society for Systematic Biology (SSB). Because of this, I have tried to be less heavy-handed in the journal’s direction. The direction of the journal should be driven by members of the society as reflected by submissions. There are some topics on which I wish we had fewer submissions (for example, PhyloCode and DNA barcoding). When papers are submitted and go through review with positive results, I am very reluctant to reject them based on the subject matter.
Question: So you view the editor's role as more of a service to the society rather than an opportunity to shape the field?
Both. The editor can shape the field by insisting on maintaining the highly rigorous standards for data analysis that Systematic Biology is known for, especially for empirical papers. Particular things that I require as EIC might differ from my predecessors.
Question: What is the difference between a good and a bad review of a paper?
The primary characteristic of an excellent review is that the reviewer has assumed the role of silent partner - this comes from Dick Olmstead when he was the editor. Reviewers do this because it has been done for them at the journal. We have an incredibly valuable tradition of rigorous yet constructive feedback in reviews.
Question: Do you have any advice for the next generation of systematists?
As early as possible, find your niche that differentiates you from all of your peers that are doing great work. You cannot just do “comparative biology of (fill in the blank)” or “molecular phylogeography of (fill in the blank).” Probably the easiest way to think about this is to imagine yourself on an airplane next to an intelligent layperson. Convey to them what is important about what you do in a manner that is unique. This is critical for the job search - it is a rare situation when the audience [of a job talk] is just phylogeneticists or even evolutionary biologists.
Question: OK now I’m going to ask you about two controversial groups. What is your take on the cladists?
The view that statistics are anathema to systematics is dead. All the vitality in the discipline is in statistical approaches.
Question: And how do you think we should respond to the creationists?
Fighting court battles requires very different tactics than changing public opinion. To affect public opinion, there are two things we can do:
1. Publicly deconstruct the false dichotomy between macroevolution and microevolution.
2. Engage in a strong public outreach campaign over the importance of evolution in day to day life.
Thursday, April 9, 2009
Existing implementations, such as MrBayes, approximate the joint posterior probability density of phylogeny and model parameters using some form of MCMC sampling (typically based on the Metropolis-Hastings algorithm). These methods quietly specify a means of updating the value of each parameter (the proposal mechanisms), the probability of invoking each proposal mechanism (the proposal probability), and the magnitude of the proposed change issued by each proposal mechanism (the tuning parameters). Proposal mechanism design is an art form (there are no hard rules that ensure valid and efficient MCMC sampling for all problems). For this reason, for many (non-phylogenetic) Bayesian inference methods, it is the responsibility of the investigator to explore a range of proposal probabilities and tuning parameterizations that deliver acceptable MCMC performance.
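The roles of the proposal mechanism and its tuning parameter are easiest to see in a one-parameter toy sampler. The sketch below is not MrBayes code: the target (a standard normal standing in for a posterior), the sliding-window proposal, and the window widths are all assumptions for illustration.

```python
import math
import random

def log_target(x):
    """Log density, up to a constant, of a standard normal (a stand-in posterior)."""
    return -0.5 * x * x

def metropolis(n_steps, window, seed=0, x0=0.0):
    """Metropolis sampler with a symmetric sliding-window proposal.

    `window` is the tuning parameter: the half-width of the uniform proposal.
    Returns the sampled values and the fraction of proposals accepted.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    accepted = 0
    for _ in range(n_steps):
        proposal = x + rng.uniform(-window, window)
        log_ratio = log_target(proposal) - log_target(x)
        # Symmetric proposal, so the Hastings ratio reduces to the target ratio.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps

_, RATE_TIMID = metropolis(5000, window=0.1)
_, RATE_BOLD = metropolis(5000, window=20.0)
SAMPLES_MODERATE, RATE_MODERATE = metropolis(20000, window=2.0)
print(RATE_TIMID, RATE_MODERATE, RATE_BOLD)
```

A tiny window accepts nearly every move but barely travels, and a huge window is rejected most of the time, so both explore the posterior poorly. The useful range lies somewhere between and depends on the target, which is why a single default cannot be expected to suit every data set.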
Accordingly, most researchers familiar with Bayesian inference would consider it extremely naïve to expect that any specific MCMC sampling design would perform well for all (or even most) empirical data sets, especially in the very difficult case of phylogeny estimation. Nevertheless, the default settings of existing Bayesian phylogeny estimation programs are so successful that we are actually “shocked, shocked to find that MrBayes does not solve all of our problems!!”. Without going into detail (as doing so would constitute an entirely separate post), the analyses detailed in the supporting material of the Hackett et al. study read like a recipe for failure, and I would venture that the putative 'impossibility' of obtaining a reliable estimate with MrBayes in this case falls squarely under the third inference scenario defined above.
What does this mean for our phylogenetic community? First, I would argue that researchers interested in Bayesian estimation of phylogeny need to become much, much more sophisticated about diagnosing MCMC performance, carefully assessing convergence (ensuring that the chain has reached the stationary distribution, which is the joint posterior probability density of interest), mixing (assessing movement of the chain over the stationary distribution in proportion to the posterior probability of the parameter space), and sampling intensity (assessing adequacy of the number of independent samples used to approximate the posterior probability). Second, I believe that developers of Bayesian methods need to encourage and facilitate more vigorous and nuanced exploration of MCMC performance among users of these methods.
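As a rough illustration of what such diagnostics compute, here is a sketch of two common summaries for a single continuous parameter: a Gelman-Rubin style potential scale reduction factor comparing independent chains (a convergence check), and an autocorrelation-based effective sample size (a check on mixing and sampling intensity). The function names, the simple truncation rule for the autocorrelation sum, and the simulated chains are assumptions for illustration; real analyses would apply established implementations to actual MCMC output.

```python
import random

def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

def psrf(chains):
    """Potential scale reduction factor for equal-length chains (near 1 when they agree)."""
    n = len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    within = sum(sample_variance(c) for c in chains) / len(chains)
    between = n * sample_variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

def effective_sample_size(xs, max_lag=200):
    """n / (1 + 2 * sum of autocorrelations), summing until the first non-positive lag."""
    n = len(xs)
    mean = sum(xs) / n
    c0 = sum((x - mean) ** 2 for x in xs) / n
    rho_sum = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        c = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag)) / n
        rho = c / c0
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)

def iid_chain(n, mean, seed):
    """Idealized chain: independent draws around `mean`."""
    rng = random.Random(seed)
    return [rng.gauss(mean, 1.0) for _ in range(n)]

def sticky_chain(n, phi, seed):
    """Poorly mixing chain: each value is strongly correlated with the previous one."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

PSRF_AGREE = psrf([iid_chain(2000, 0.0, 1), iid_chain(2000, 0.0, 2)])
PSRF_DISAGREE = psrf([iid_chain(2000, 0.0, 1), iid_chain(2000, 5.0, 2)])
ESS_IID = effective_sample_size(iid_chain(5000, 0.0, 3))
ESS_STICKY = effective_sample_size(sticky_chain(5000, 0.9, 3))
print(PSRF_AGREE, PSRF_DISAGREE, ESS_IID, ESS_STICKY)
```

Two chains sitting on different plateaus give a factor well above 1 even if each looks stable on its own, and a sticky chain of 5,000 samples is worth only a few hundred independent draws. Neither number says anything about tree topology, which needs its own diagnostics.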
Tuesday, April 7, 2009
Monday, April 6, 2009
The very issue of whether complexity is a trend has been controversial (e.g., McShea 1996). Moreover, some might complain that this smacks too much of orthogenesis or progressionism for their tastes. And what do we mean by complexity, anyway? These issues have been and will continue to be debated in the literature. But one of the neatest things about the Adamowicz paper is that they provide a possible mechanism for a trend in complexity. They found that newly originated higher taxa had greater limb differentiation than their contemporaries, and that taxa going extinct had a lower degree of limb differentiation. Moreover, limb complexity turns out to be one of the strongest predictors of species richness in extant crustacean clades. Together, these suggest the possibility that the trend in complexity might be driven in part by differential speciation and extinction of lineages based on complexity. What if lineages with higher complexity diversified at greater rates than lineages with reduced complexity? Over time, traits associated with complexity might increase simply because of this connection to diversification. This research thus raises some intriguing levels of selection issues, because there is – in principle – no reason why complexity could only be favored by selection at the individual level.
How might limb complexity fuel the diversification process? The authors speculate that increased limb complexity might increase ‘evolvability’ (the meaning of which is even more fun to discuss than ‘complexity’!) and possibly promote niche specialization. They also note that new limb types might amplify the intensity of sexual selection, possibly serving indirectly to enable that supposed ‘engine of speciation.’ Anyway, don’t expect this paper to end with a case-closed feeling – after all, questions like these are on par with the biggest unresolved issues in biology. But there are lots of things to think about here!
Things kicked off in 1991 with a parsimony analysis of amino acid sequence data published in Nature by Graur et al. suggesting that mouse-like rodents (myomorphs) were more closely related to primates than they were to guinea pigs (hystricomorphs). Hasegawa et al. responded immediately, showing that monophyly of rodents (myomorphs + hystricomorphs) was supported by maximum likelihood-based analyses and suggesting that the unusual myomorphs + primates inference was due to parsimony's inability to deal with unequal evolutionary rates. In a '92 response, Graur's group stood their ground, arguing that Hasegawa et al.'s results were an anomaly resulting from maximum likelihood analyses of highly divergent, "nonconservative" proteins. Graur continued to discuss the distinctness of guinea pigs and lobbied to have the Hystricomorpha recognized as a distinct order representing "one of the most ancient branches in eutherian evolutionary history." In a '93 PNAS paper, Martignetti and Brosius used the presence of a neural specific small cytoplasmic RNA (BC1 RNA) in guinea pigs and other rodents - but not in other mammals - to argue for inclusion of guinea pigs with rodents. Additional phylogenetic analyses of DNA sequence data by Hasegawa's group in '94 and Frye and Hedges in '95 further supported the guinea pigs as rodents hypothesis. By '96, even Graur had changed his tune and was considering the Hystricognathi a suborder of Rodentia (my knowledge of this history is obviously incomplete and he may have addressed this point more directly elsewhere).
Just when the dust had settled, things blew up again with the publication of a Nature paper by D'Erchia titled simply "The Guinea Pig is Not a Rodent". Although this paper rejected rodent monophyly, it suggested a rather different tree than that of Graur et al. (1991). This time, the New York Times even got involved. Of course, Hasegawa's group rallied once again to dismiss the guinea pig is not a rodent argument, arguing that, at the very least, there simply wasn't enough support to overturn the traditional classification (a point that was reinforced by similar conclusions from Philippe). Where do things stand now? Suffice it to say that nearly everything published over the last 10 years has strongly supported inclusion of guinea pigs in a monophyletic Rodentia (e.g., Prasad et al.'s recent phylogenomic analysis of mammals). In any case, the guinea pig wars represent an interesting historical anecdote and a powerful example of the symptoms that can result when systematists are engaged in intense debate over the value of different types of data (morphological versus molecular) and different types of phylogenetic methods (parsimony versus maximum likelihood).
Sorry guinea pig lovers: you're living with a rodent whether you like it or not.
Friday, April 3, 2009
Among the partitioned analyses, some continued to shift to new areas of the likelihood surface until relatively late in the analysis. Perhaps even more troubling though was the fact that analyses that did appear to reach a stable plateau sampled significantly different likelihood scores (e.g., -lnL 861,000 vs. -lnL 859,500). Is this problem unavoidable in analyses of large datasets?
The most obvious solution would be to simply run the analyses for more than 10 million generations. I've certainly had analyses that required more than 10 million generations to reach stationarity. Perhaps this wasn't done because it took two months on a supercomputer to run the 10 million generation analyses (anybody know if Hackett et al. or others have implemented longer runs since their paper was published?). Another possible solution to their problems is to modify the parameters of the MC3 analyses implemented by MrBayes (recall that the MrBayes default is to run two independent MC3 analyses with one cold chain and three heated chains). Hackett et al. explored this possibility by running six analyses with one heated chain and one cold chain (B1-B6) and two analyses with six heated chains and one cold chain (A1-A2). The analyses run with multiple heated chains performed significantly better than those with a single heated chain, perhaps due to the fact that multiple chains are incrementally heated by MrBayes (meaning that the fourth of six heated chains has a flatter likelihood surface than the first). Hackett et al. do not discuss the temperatures used for the heated chains in their analyses, but their results suggest that running multiple heated chains in a single analysis is superior to repeatedly running analyses with only one heated chain.
In any case, Hackett et al.'s ultimate solution was to discard all of their Bayesian analyses and rely instead on parsimony and the fast maximum likelihood methods implemented by GARLI and RAxML. Is this shift away from Bayesian inference in favor of fast maximum likelihood searches for computational reasons a sign of things to come (or has this shift already occurred)? Are the fast maximum likelihood methods ready for prime time, or do people remain uncomfortable with the shortcuts they use to achieve their apparent computational efficiency?
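To show what incremental heating means in numbers, here is a small sketch. It assumes the commonly documented scheme in which chain i (with i = 0 for the cold chain) raises the posterior to the power 1 / (1 + T * i), and it assumes a temperature parameter T of 0.2. Both are assumptions on my part; as noted above, Hackett et al. do not report the temperatures they used.

```python
def chain_heats(n_heated, temp=0.2):
    """Powers applied to the posterior: index 0 is the cold chain, then each heated chain."""
    return [1.0 / (1.0 + temp * i) for i in range(n_heated + 1)]

def heated_log_posterior(log_posterior, beta):
    """Raising a density to the power beta multiplies its log by beta, flattening it."""
    return beta * log_posterior

HEATS_ONE = chain_heats(1)      # one cold + one heated chain (the B1-B6 design)
HEATS_DEFAULT = chain_heats(3)  # one cold + three heated chains
HEATS_SIX = chain_heats(6)      # one cold + six heated chains (the A1-A2 design)

for label, heats in [("one", HEATS_ONE), ("three", HEATS_DEFAULT), ("six", HEATS_SIX)]:
    print(label, [round(h, 3) for h in heats])
```

Under these assumptions a single heated chain sits at a power of about 0.83, only slightly flatter than the cold chain, whereas the hottest of six sits near 0.45. That is one way to read the better performance of the multi-chain runs: they include chains that can cross valleys a lone, mildly heated chain cannot.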