Friday, March 26, 2010

Testing for Nonlinear Selection

Nonlinear natural selection, particularly stabilizing selection, is often presumed to be widespread in nature. However, it is seldom found in practice. For instance, the now famous Kingsolver et al. (2001) review found that only 16% of estimated nonlinear selection coefficients on single traits (estimates of stabilizing or disruptive selection) were significant; and furthermore that correlational selection was estimated in fewer than 10% of studies. Nonlinear selection is very important in the study of evolution, however, because of its relevance to many very interesting questions, such as the evolution of genetic correlations between characters and the evolution of evolvability (e.g., Arnold et al. 2008).

An increasingly popular approach in recent years has been to first estimate the γ-matrix, which contains the coefficients of stabilizing and disruptive selection on its diagonal and the coefficients of correlational selection in off-diagonal positions, and then to diagonalize γ by solving MγM'=Λ for matrices containing the orthonormal eigenvectors (M) and eigenvalues (Λ) of γ. The widely perceived advantage of this approach is one of increased power: diagonalization identifies (in its first and/or last ranked eigenvectors) the dimensions of strongest nonlinear selection; and, furthermore, it allows for more modest multiple test correction, since the number of coefficients to be tested scales linearly with the number of traits in our analysis (rather than as the square). True to form, some studies (e.g., Blows et al. 2003) have found significant nonlinear selection on the canonical axes where none was found on the original traits.
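To make the rotation concrete, here is a minimal sketch for the simplest possible case of two traits, where the eigen-decomposition of γ can be written in closed form. The function names and the example coefficients are mine and purely illustrative (they are not from any of the cited papers). The example γ has no stabilizing or disruptive selection on either original trait, only correlational selection, yet the canonical axes show curvature of +0.2 and -0.2. The second helper simply counts coefficients to show how the number of tests grows with the number of traits.

```python
import math

def canonical_axes_2x2(g11, g12, g22):
    """Closed-form eigen-decomposition of a symmetric 2x2 gamma matrix.

    Returns (eigenvalues, eigenvectors), largest eigenvalue first.
    Eigenvectors are returned as rows, matching M in M gamma M' = Lambda.
    """
    # Angle of the axis along which the quadratic form is maximized
    theta = 0.5 * math.atan2(2.0 * g12, g11 - g22)
    c, s = math.cos(theta), math.sin(theta)
    m1 = (c, s)    # first canonical axis
    m2 = (-s, c)   # second axis, orthogonal to the first
    # Curvature along each axis: m gamma m'
    lam1 = g11 * c * c + 2.0 * g12 * c * s + g22 * s * s
    lam2 = g11 * s * s - 2.0 * g12 * c * s + g22 * c * c
    return (lam1, lam2), (m1, m2)

def n_tests(n_traits):
    """Coefficients to test: canonical axes vs. the full gamma matrix."""
    canonical = n_traits                       # one eigenvalue per axis
    full = n_traits * (n_traits + 1) // 2      # diagonal + off-diagonal terms
    return canonical, full

# Hypothetical gamma: pure correlational selection, zero diagonal
eigenvalues, axes = canonical_axes_2x2(0.0, 0.2, 0.0)
print(eigenvalues, axes)
print(n_tests(10))
```

With more than two traits one would use a general symmetric eigen-solver instead of the closed form, but the logic is the same.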

However, a recent paper by Richard Reynolds and colleagues (2010) has revealed that some of this increased power may be illusory. In particular, the standard double-regression approach for hypothesis testing of the canonical nonlinear coefficients has type I error that goes to 1.0 (i.e., a true null hypothesis is almost always rejected) under pretty realistic conditions. The lower panel of the figure above, copied from Reynolds et al. (2010), shows the type I error for hypothesis tests on the canonical axes for a nonlinear selection analysis of 10 traits. In this study no selection was simulated! The authors also prove analytically that the expected eigenvalues of the estimated γ-matrix for data without nonlinear selection only go to zero as the number of samples used to estimate γ goes to infinity (obviously sample sizes in empirical studies are usually finite. . . unless, of course, you take a really, really long field season).
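The intuition can be checked with a small simulation. The sketch below is my own simplification, not the authors' code: it uses only two uncorrelated, standardized traits, and it estimates γ with the moment estimator cov(w, z_i z_j), which is appropriate for normally distributed traits, rather than the quadratic regression used in practice. Fitness is drawn independently of the traits, so the true γ is a matrix of zeros. All names and parameter values are assumptions for illustration.

```python
import math
import random

def gamma_hat_2traits(z1, z2, w):
    """Moment estimate of gamma for two standardized, uncorrelated traits."""
    n = len(w)
    wbar = sum(w) / n

    def cov_with_w(x):
        xbar = sum(x) / n
        return sum((xi - xbar) * (wi - wbar) for xi, wi in zip(x, w)) / (n - 1)

    g11 = cov_with_w([a * a for a in z1])
    g22 = cov_with_w([b * b for b in z2])
    g12 = cov_with_w([a * b for a, b in zip(z1, z2)])
    return g11, g12, g22

def eigenvalues_2x2(g11, g12, g22):
    """Eigenvalues of a symmetric 2x2 matrix, largest first."""
    mid = 0.5 * (g11 + g22)
    half = math.hypot(0.5 * (g11 - g22), g12)
    return mid + half, mid - half

def mean_null_eigenvalues(n, reps, rng):
    """Average ranked eigenvalues of gamma-hat when there is no selection."""
    total_hi = total_lo = 0.0
    for _ in range(reps):
        z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Fitness is pure noise: independent of both traits (assumed mean 1, sd 0.5)
        w = [rng.gauss(1.0, 0.5) for _ in range(n)]
        hi, lo = eigenvalues_2x2(*gamma_hat_2traits(z1, z2, w))
        total_hi += hi
        total_lo += lo
    return total_hi / reps, total_lo / reps

if __name__ == "__main__":
    rng = random.Random(2010)
    for n in (50, 200, 800):
        print(n, mean_null_eigenvalues(n, 200, rng))
```

Even though every element of the estimated γ has an expectation of zero, the largest eigenvalue is on average positive and the smallest on average negative, and both shrink toward zero only as the sample size grows. With 10 traits, as in the paper, the spread is expected to be considerably larger than in this two-trait toy.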

The implications of this result are quite significant. In particular, it means that some recently published examples of significant nonlinear selection on canonical trait axes could be type I errors. However, the authors also provide a solution. They find that type I errors contract to their nominal levels when a permutation-based hypothesis testing approach is used. (In a self-serving addendum, I'd also like to note that I independently devised and applied the exact simulation test recommended by the authors in a recently published paper - detailed here in a supplement - even though I must admit I was not at all aware of this problem at the time!)
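For readers who want to see the shape of a permutation test on the canonical axes, here is a rough sketch. It is not the procedure from Reynolds et al. (2010) or from my own paper, just one plausible version under simplifying assumptions: two traits, γ estimated by the moment estimator cov(w, z_i z_j) instead of quadratic regression, and fitness values shuffled whole (which also removes any linear selection; a real analysis might need to handle that differently). Function names and the simulated data in the usage note are hypothetical.

```python
import math
import random

def centered(x):
    """Subtract the mean from every element of x."""
    m = sum(x) / len(x)
    return [v - m for v in x]

def ranked_eigenvalues(q11, q12, q22, w):
    """Eigenvalues (largest first) of the moment estimate of gamma.

    q11, q12, q22 are centered trait products, so their sums of products
    with w divided by (n - 1) are the sample covariances cov(w, z_i z_j).
    """
    n = len(w)
    g11 = sum(a * b for a, b in zip(q11, w)) / (n - 1)
    g12 = sum(a * b for a, b in zip(q12, w)) / (n - 1)
    g22 = sum(a * b for a, b in zip(q22, w)) / (n - 1)
    mid = 0.5 * (g11 + g22)
    half = math.hypot(0.5 * (g11 - g22), g12)
    return mid + half, mid - half

def permutation_test(z1, z2, w, n_perm, rng):
    """One-sided permutation p-values for the top and bottom eigenvalues."""
    z1, z2 = centered(z1), centered(z2)
    # Trait products depend only on the traits, so compute them once
    q11 = centered([a * a for a in z1])
    q12 = centered([a * b for a, b in zip(z1, z2)])
    q22 = centered([b * b for b in z2])
    obs_hi, obs_lo = ranked_eigenvalues(q11, q12, q22, w)
    w_perm = list(w)
    count_hi = count_lo = 0
    for _ in range(n_perm):
        rng.shuffle(w_perm)  # break any association between fitness and traits
        hi, lo = ranked_eigenvalues(q11, q12, q22, w_perm)
        count_hi += hi >= obs_hi   # null at least as disruptive as observed
        count_lo += lo <= obs_lo   # null at least as stabilizing as observed
    # Add-one correction so a p-value is never exactly zero
    return (count_hi + 1) / (n_perm + 1), (count_lo + 1) / (n_perm + 1)
```

The key point is that each observed eigenvalue is compared with the null distribution of the eigenvalue of the same rank, so the upward bias of the largest eigenvalue (and downward bias of the smallest) is built into the null. Calling `permutation_test(z1, z2, w, 999, random.Random(1))` on your own trait and relative fitness lists returns the two p-values.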

I think this paper also reflects the fact that methods are never static, and that when new ones are devised they must be tested thoroughly - and furthermore that these tests should be conducted with both empirical and simulated data. The rise of canonical rotation in the analysis of nonlinear selection had previously not been accompanied by this level of scrutiny. Reynolds et al. (2010) provide not only a definitive critique, but also a suitable way forward.

Friday, March 12, 2010

On the Improbability of One-tailed Hypothesis Tests

One-tailed hypothesis tests have fairly wide popularity in ecology and evolution. For instance, an article by Lombardi & Hurlbert (2009) reported that 13% and 24% of "articles with data susceptible to one-tailed tests" used such tests in two recent journal years. Another similar review by Ruxton & Neuhäuser (2010) found that 5% of all articles published in 2008 in the journal Ecology used at least one one-tailed test, although they didn't examine "susceptibility" (i.e., many articles not using a one-tailed test might not have had data appropriate to such a test).

One-tailed hypothesis tests are popular in large part because they provide increased power to reject the null hypothesis if it is false. The lower panel of the figure, right, shows the expected mean absolute value of t for a real (but small) mean difference between populations A and B, for various equal sample sizes of A and B. What it reveals is that the sample size required to reject a two-tailed (rather than a one-tailed) null on average is about 50% larger, which could be expensive and time consuming if data are difficult to obtain.
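A back-of-the-envelope version of this comparison is sketched below. It is not the calculation behind the figure: it swaps the t distribution for a large-sample normal approximation, and the effect size of 0.2 standard deviations is an arbitrary value I chose for illustration. Under those assumptions, the expected test statistic for two equal groups of size n is d * sqrt(n / 2), and we solve for the n at which it just reaches the critical value.

```python
from statistics import NormalDist

def n_per_group(effect_size, alpha, tails):
    """Per-group sample size at which the expected z just reaches significance.

    Normal approximation: expected z = effect_size * sqrt(n / 2) for two
    equal groups, where effect_size is the mean difference in sd units.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / tails)
    return 2.0 * (z_crit / effect_size) ** 2

d = 0.2  # hypothetical small standardized difference between A and B
n_one = n_per_group(d, 0.05, tails=1)
n_two = n_per_group(d, 0.05, tails=2)
print(round(n_one), round(n_two), round(n_two / n_one, 2))
```

The ratio of the two sample sizes does not depend on the effect size; under the normal approximation it is the squared ratio of the critical values, a little over 1.4. That is in the same neighborhood as the figure quoted above, and the exact number will differ somewhat when the t distribution is used at small sample sizes.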

However, there have been repeated articles questioning the general appropriateness of one-tailed tests. For instance, Lombardi & Hurlbert (2009) conclude that "all uses of one-tailed tests in the journals surveyed seemed invalid." Ruxton & Neuhäuser (2010) were a little more generous, but they concluded that in 17 papers using a one-tailed test, only one had appropriate justification to do so.

The problem arises from an apparently widespread belief among ecologists and evolutionary biologists that any a priori hypothesis regarding the direction of the outcome in our statistical test is sufficient grounds to justify a one-tailed null hypothesis. This is not true, but Lombardi & Hurlbert (2009) conclude that the reason for this misperception is fairly well founded, documenting bad or confusing advice regarding the application of one-tailed hypothesis tests in 40 of 52 popular statistical texts (Lombardi & Hurlbert 2009, Supplement).

In fact, a one-tailed hypothesis test is only appropriate if a large effect in the opposite direction of our a priori prediction is exactly as interesting and will result in the same action as a small, non-significant result in the predicted direction. Both articles point out some very restrictive circumstances in which this might be true. (For instance, in the example of an FDA test on a new headache drug - no positive effect and a large negative effect on the pain of test subjects will result in the same action: no approval for the drug.) However, in ecology and evolution it is quite hard to imagine circumstances in which a large, significant result in the opposite direction of that predicted by theory could easily be ignored.

Of course, there are many statistical tests (lots of them common among evolutionary biologists) to which the concept of "tailedness" doesn't really apply. For instance, we are not usually interested in whether our data fit our a priori model better than expected in a goodness-of-fit test (although perhaps we should be).

For statistical tests in which the concept of tailedness does apply, one-tailed tests are generally ill-advised. Thus, their use should require substantial justification. Ruxton & Neuhäuser (2010) give two very simple grounds on which they feel a one-tailed test need be justified. First, an author using a one-tailed test should clearly explain why the result in a particular direction is expected, and why it is fundamentally more interesting than a result in the opposite direction. Second, and importantly, the author should also explain why a large result in the unexpected direction should be treated no differently from a non-significant result in the expected direction (Ruxton & Neuhäuser [2010]). These conditions may be rare (or, in fact, nonexistent: Lombardi & Hurlbert [2009]) in our field.

Wednesday, March 10, 2010

Resolving the Vertebrate Tree

In a recent paper from BMC Biology, Bob Thomson and Brad Shaffer at the University of California - Davis quantify progress toward resolving the vertebrate tree of life. Using a phyloinformatic pipeline and GenBank data from a large sample of vertebrate diversity (100 clades, encompassing about 12,000 species), the authors ask the simple question: "How many nodes in the vertebrate tree do we have some information about?" The brief answer is about a quarter, though this information is highly skewed. Avian and mammalian clades are on average better resolved than the other major vertebrate lineages, and marine clades are on average very poorly resolved. In addition to estimating current 'resolution', Thomson and Shaffer analyze the accumulation of this resolution through time. The superexponential growth curve of sequences in GenBank is now well-known. However, there is little understanding of how this accumulation of data correlates with accumulation of phylogenetic information. These analyses indicate that information is accumulating polynomially and, if current rates continue, we might understand a large majority of the vertebrate tree within a few decades.

Bob has made their data available via a Google motion chart, which allows for easy exploration of the study's results (embedded below):

Slingjaw Wrasse!

Peter Wainwright is lecturing this morning at the Bodega Bay Applied Phylogenetics workshop on morphological diversification. He just showed his lab's famous video of a slingjaw wrasse (Epibulus insidiator). Best feeding video ever. Peter's lecture will be up on the Bodega Wiki in an hour or so. Samantha Price is going to follow Peter with an awesome new tutorial on investigating character evolution with the program Brownie.

Tuesday, March 9, 2010

The Price of Parenthood

Any parent will tell you that reproduction is costly. There are rising health care expenses, child care costs for working parents, expensive sports or extracurricular activities, and, eventually, college enrollment and tuition. From an evolutionary perspective, the only relevant costs of reproduction are those that depress survivorship and as a consequence decrease the future opportunity for subsequent reproductive output (and, in fact, such costs have been found in humans).

A recent study in the pages of 'Evolution' has demonstrated a very high toll of reproduction, indeed. By stymieing reproduction in female Brown Anoles (Anolis sagrei, pictured right) through surgical removal of the ovaries, Bob Cox and Ryan Calsbeek at Dartmouth College have found that female interannual survival increases nearly threefold (relative to females manipulated only with a control "sham" surgery; solid bars, right). In addition to the survival advantage of non-reproduction, ovariectomized females also exhibited higher growth than control females.

Although the result is consistent with abundant life-history theory predicting a trade-off between reproduction and survival, the proximate mechanism of increased growth and survival of non-reproductive adult female anoles remains unclear. In performance trials, females whose egg burden had been surgically relieved improved dramatically in both stamina and sprint speed, suggesting that ovariectomized females might be better equipped to avoid predatory attack. However, in results presented at this year's Society for Integrative and Comparative Biology meeting (and discussed in a previous blog post), Bob found that experimental manipulation of predation regime had little effect on the survival probability of sham and ovariectomized females. Perhaps ovariectomized lizards are simply better able to allocate sparse resources to fat reserves, and thus exhibit improved survival during food scarcity. Furthermore, Cox and Calsbeek acknowledge that ovariectomy removes not only the physical burden of reproductive investment, but also the source of steroid hormones - which could also affect growth and survival in lizards.

No doubt these important questions regarding proximate causes for the relationship between reproduction and survival in female anoles will be the subject of future studies.

Monday, March 8, 2010

Bodega Phylogenetics 2010 is Underway

The annual Bodega Bay workshop in applied phylogenetics kicked off last weekend. Participants have already heard Mike Sanderson's take on the State of the (Phylogenetics) Union, learned about Bayesian phylogenetic inference from John Huelsenbeck and Jeremy Brown, and run tutorials on the use of programs BEST, RAxML, R, and BEAST. Don't worry if you couldn't be here in person - lecture material and tutorials are being posted at the Bodega Phylogenetics Wiki! The next few days will feature lectures on comparative methods, morphological evolution, phylogenomics, diversification rates, and community phylogenetics (see the complete schedule). Photo captions: John Huelsenbeck introducing students to programming, Peter Wainwright organizing group projects, students learning about maximum likelihood with 10-sided dice.

Tuesday, March 2, 2010

Speak now or (forever?) hold your peace

Got an idea for a new program at NSF? Think you know of a way that data and other information can be better shared? Have you concocted a plan for better linking the public, the government and scientists? For the next 17 days, you're invited to post these opinions and any other feedback at OpenNSF. Even if you don't think you have any original ideas, there's a mechanism to vote on whether you like or dislike others' suggestions and leave comments.