Most of the questions below are from me (LH) but a couple come from Dan Rabosky (DR). Many thanks to Joe for participating.
LH: What are the most exciting recent developments in systematics / comparative methods?
JF: The availability of genome-scale information is certainly one. The arrival of a generation of young researchers who are comfortable with statistical and computational approaches is another. But the most important development is reflected in recent work on coalescent trees of gene copies within trees of species. What this does is tie together between-species molecular evolution and within-species population genetics. Those two lines of work have been developing almost independently since the 1960s. But now, with population samples of sequences at multiple loci in multiple related species, they are coming back together. This is not another Modern Synthesis, but it is a major event that needs a name. How about the "Family Reunion"? Long-estranged relatives who have not been in touch are getting together.
LH: Take us back to the beginnings, back when you were working on phylogenetic and comparative methods for your PhD thesis. Where did you derive your inspiration? Did you anticipate the impact that this work would have on the
JF: I did not anticipate it at all. My original thesis project with Dick Lewontin was a rather grandiose theoretical population genetics macroevolution model -- my idea, not his. It didn't work out and I didn't have any useful results. Meanwhile Lynn Throckmorton and Jack Hubby, whose labs were nearby, needed someone to write a clustering program for protein electrophoresis band data that they had in multiple Drosophila species. I volunteered and was fascinated by the algorithms. I went on to write parsimony programs for the Camin-Sokal, Dollo, and polymorphism parsimony criteria, and then to work on how to infer trees by likelihood using Anthony Edwards and Luca Cavalli-Sforza's Brownian motion approximation to gene frequency drift. Dick finally suggested that I write this up for my thesis, which I did in 1967 (the degree was officially 1968). Through the 1970s I maintained a sideline of work on trees while mostly working in theoretical population genetics. It was really not until about 1978 that I began to see that this was becoming more important, and that it fit in with my interest in evolution beyond the species boundary. So I shifted my work toward trees and dropped out of theoretical population genetics.
DR: A lot of what we do in comparative methods is based on Brownian motion (BM), or on models for which BM is a special case (e.g., Ornstein-Uhlenbeck, OU). As you (Felsenstein) have written, "Brownian motion is a poor model, and so is Ornstein-Uhlenbeck, but just as democracy is the worst method of organizing a society 'except for all the others', so these two models are all we've really got that is tractable. Critics will be admitted to the event, but only if they carry with them another tractable model."
And for discrete traits, we use Markovian models that assume (generally) homogeneous rates through time and among lineages. Undoubtedly, the math for this could get out of hand, but at some point I think we'll have to do something to explore (among other things) more realistic constraint surfaces etc.
Given this, what do you view as "the frontier" for models of continuous and discrete character evolution? New mathematics? Approximate Bayesian approaches that rely on simulation to deal with analytically intractable scenarios?
JF: Hard to see what. I think one framework will be models in which a population "chases" an adaptive peak which is moving. But we need to have some model for how the peak moves, and aside from having a mechanistic and ecological model of the function of the character this is not forthcoming. Nor is it easy to see how adaptive peaks in sister species become different from each other. We're also going to find that the amount of information available to tell different schemes of selection pressure apart will be small. We are going to have to be able to characterize what we can and can't know given the data. Just adding new mathematical tools or lots of simulation will not resolve these dilemmas.
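As a rough illustration of the "chasing a moving peak" framework mentioned above, here is a minimal simulation sketch. It assumes an OU-style pull of the trait toward the current peak and, purely for illustration, lets the peak itself wander by Brownian motion. That peak model is an assumption of this sketch, not something proposed in the interview, and all names and parameters (`simulate_chase`, `alpha`, `sigma`, `peak_sigma`) are hypothetical.

```python
import math
import random


def simulate_chase(steps, alpha, sigma, peak_sigma,
                   dt=1.0, x0=0.0, theta0=0.0, seed=0):
    """Euler-style simulation of a trait x chasing a moving optimum theta.

    alpha      -- strength of the pull toward the peak (alpha = 0 gives plain BM)
    sigma      -- standard deviation of random change in the trait per unit time
    peak_sigma -- standard deviation of peak movement (ASSUMED Brownian here)
    """
    rng = random.Random(seed)
    x, theta = x0, theta0
    trait, peak = [x], [theta]
    for _ in range(steps):
        # The trait is pulled toward where the peak currently is, plus noise.
        x += alpha * (theta - x) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # The peak then moves; this Brownian step is the illustrative assumption.
        theta += peak_sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        trait.append(x)
        peak.append(theta)
    return trait, peak


if __name__ == "__main__":
    trait, peak = simulate_chase(steps=200, alpha=0.2, sigma=0.1, peak_sigma=0.3)
    print("final trait:", trait[-1], "final peak:", peak[-1])
```

Setting `alpha=0` removes the pull and leaves ordinary Brownian motion of the trait, which is one way to see BM as the special case of OU that Dan's question refers to.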
DR: What do you think about the unification of modern (neontological) comparative biology with paleontology? There seems to be a lot of room for progress in this area. Do you have any suggestions for future directions?
JF: Oh thank you thank you thank you for giving me an opportunity to mount the soapbox and hold forth on one of my favorite topics. I've been working on this. See my paper in 2002:
Felsenstein, J. 2002. Quantitative characters, phylogenies, and morphometrics. pp. 27-44 in Morphology, Shape, and Phylogenetics, edited by N. MacLeod. Systematics Association Special Volume Series 64. Taylor and Francis, London.
and watch my Julian Huxley Lecture to the Systematics Association in London in 2008, which is available as a video along with a PDF of my slides.
Basically we can infer the tree of present-day species from molecular data, and then use it for morphological characters (or other measurable continuous or discrete characters) with a Brownian or OU model, to infer phylogenetic covariances of changes of characters. Then we can use these together with the fossil morphology to help place the fossils. (One could also use all this together in a giant likelihood or Bayesian inference but the gain in doing so will be very small as the morphology will add little to the inference of the tree, I think). One can also use bootstrap samples of trees in this, or samples from Bayesian posteriors.
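To make the Brownian part of this concrete, here is a small sketch of the covariance that a Brownian motion model implies among species for a single character: the rate times the branch length the two species share on the path from the root. This is only that building block, not the among-character covariance estimation or fossil placement described above. The tree, edge labels, and function name are hypothetical.

```python
def bm_covariance(paths, sigma2=1.0):
    """Among-species covariances for one character under Brownian motion.

    paths  -- dict mapping each tip name to its root-to-tip path, given as a
              list of (edge_id, branch_length) pairs (hypothetical encoding)
    sigma2 -- Brownian rate (variance accumulated per unit branch length)
    """
    tips = sorted(paths)
    cov = {}
    for a in tips:
        for b in tips:
            shared = 0.0
            # Walk both paths from the root and sum the edges they share.
            for (edge_a, length), (edge_b, _) in zip(paths[a], paths[b]):
                if edge_a != edge_b:
                    break
                shared += length
            cov[(a, b)] = sigma2 * shared
    return tips, cov


# Hypothetical tree ((A:1,B:1):2,C:3), written as root-to-tip paths.
example_paths = {
    "A": [("AB", 2.0), ("A", 1.0)],
    "B": [("AB", 2.0), ("B", 1.0)],
    "C": [("C", 3.0)],
}

if __name__ == "__main__":
    tips, cov = bm_covariance(example_paths)
    for a in tips:
        print(a, [cov[(a, b)] for b in tips])
```

In this toy tree, A and B share two units of history and so covary, while C shares none with either and is independent of both.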
There is lots to be done here and I am rushing to do it, and working with Fred Bookstein on the morphometric angles to this too. I wonder whether statistical frameworks such as this, together with within-species quantitative treatment, will not be important in untangling the paleoanthropological mess caused by nonquantitative approaches to hominoid fossils.
LH: What do you think about the current trend in phylogenetics (and, lately, comparative biology) towards Bayesian approaches?
JF: I am a curmudgeon on this, in that Bayesian approaches do not feel right to me. So I have been resisting them. Bayesians were unhappy with the treatment of Bayesian Inference in my book, in that I did not give them four chapters, the last of which ended by declaring victory. I think we're all Bayesians when we come to cross the street, balancing evidence of approaching cars against our priors. But that's where one of the criticisms of Bayesianism comes in -- do we all have the same priors? Is there necessarily a single prior that you can use that will be broadly acceptable to your readership? If not, then maybe the reader of the paper should instead be given the likelihood curve so they can apply their own prior to it. For phylogenies, priors giving equal probability to all topologies (or to all labeled histories) would be noncontroversial. But the part of the prior that puts distributions on branch lengths could be wildly controversial. There is also the issue of whether some things, such as whether the sun will rise tomorrow morning, really should have a prior.
People should be Bayesians if that fits with their philosophy of doing science. But not just because a Bayesian program happens to run faster than a non-Bayesian one. They should also realize that we will continue to have both Bayesians and non-Bayesians. Biologists sometimes think that this controversy emerged in their field and will be settled there -- that one more really good argument and everyone will become a Bayesian. They might not be aware that Bayesian arguments have been around since 1764. There is no new decisive argument that's going to arise in our field.
The issue to contemplate is the priors, not the details of MCMC techniques. We have not yet seen a case where an important conclusion depends strongly on what prior you assume. Perhaps we never will, but if a case like that arises, and causes trouble for Bayesian approaches, people should not be too surprised.
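The suggestion of giving readers the likelihood curve so they can apply their own prior can be sketched on a grid. The example below is a toy with made-up binomial data and two hypothetical priors. It has nothing to do with phylogenies specifically and is meant only to show the same likelihood curve leading to different posteriors.

```python
def posterior_on_grid(likelihood, prior):
    """Multiply a likelihood curve by a prior on the same grid and normalize."""
    weights = [lik * pri for lik, pri in zip(likelihood, prior)]
    total = sum(weights)
    return [w / total for w in weights]


def grid_mode(grid, posterior):
    """Grid value at which the posterior is largest."""
    best = max(range(len(grid)), key=lambda i: posterior[i])
    return grid[best]


# Grid of candidate values for a probability p.
grid = [i / 100 for i in range(1, 100)]

# Hypothetical data: 7 successes in 10 trials (binomial likelihood, up to a constant).
likelihood = [p ** 7 * (1 - p) ** 3 for p in grid]

# Two readers with different priors (both are illustrative assumptions).
flat_prior = [1.0 for _ in grid]
skeptical_prior = [(1 - p) ** 4 for p in grid]  # favors small p

post_flat = posterior_on_grid(likelihood, flat_prior)
post_skeptical = posterior_on_grid(likelihood, skeptical_prior)

if __name__ == "__main__":
    print("mode under flat prior:", grid_mode(grid, post_flat))
    print("mode under skeptical prior:", grid_mode(grid, post_skeptical))
```

With so little data the two readers end up with noticeably different posterior modes. With much more data the likelihood would dominate and the priors would matter less.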
LH: Your work has inspired a generation of comparative biologists. Any advice for those of us just starting out on our careers?
JF: I have too many opinions on that for this forum. I guess I would urge people to take a long view and to realize that it takes time for methods to be developed, published and used, and to prepare themselves for the new forms of data that are coming. When I submitted my 1985 comparative methods paper, the referees were dubious about it because it required phylogenies, whereas they felt that only classifications were going to be available! A year or two earlier and it might not have been accepted for publication. I would also urge people to become familiar not only with phylogeny methods and statistical techniques, but also with the theoretical side of evolutionary biology. We're entering a period when there is going to be a merger (or Reunion) of between-species phylogenetic inference and within-species population genetics. I'm worried that we are graduating too many people who know what Subtree Pruning and Regrafting is, but who have no idea what Wahlund's Law is, or how mutational load arguments work. Theoretical population genetics is in danger of becoming a lost art, just when it is most needed. Comparative biologists should learn it -- and teach it.