One-tailed hypothesis tests have fairly wide popularity in ecology and evolution. For instance, an article by Lombardi & Hurlbert (2009) reported that 13% and 24% of "articles with data susceptible to one-tailed tests" used such tests in two recent journal years. Another similar review by Ruxton & Neuhäuser (2010) found that 5% of all articles published in 2008 in the journal Ecology used at least one one-tailed test, although they didn't examine "susceptibility" (i.e., many articles not using a one-tailed test might not have had data appropriate to such a test).
One-tailed hypothesis tests are popular in large part because they provide increased power to reject the null hypothesis if it is false. The lower panel of the figure, right, shows the expected mean absolute value of t for a real (but small) mean difference between populations A and B, for various equal sample sizes of A and B. What it reveals is that the sample required to reject a two-tailed (rather than a one-tailed) null on average is about 50% larger, which could be expensive and time consuming if data are difficult to obtain.
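The size of this sample-size penalty can be roughed out with a few lines of code. The sketch below is not the calculation behind the figure: it assumes a normal (z) approximation to the two-sample t-test, a hypothetical standardized effect size `d = 0.2`, and alpha = 0.05, and the function name `n_per_group` is made up for illustration. Setting `power=0.5` corresponds loosely to rejecting the null "on average".

```python
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.5, tails=2):
    """Approximate per-group sample size for a two-sample comparison.

    Assumes equal group sizes, equal variances, and the normal
    approximation n = 2 * ((z_alpha + z_power) / d) ** 2, where d is
    the standardized mean difference (a hypothetical value here).
    """
    std = NormalDist()
    z_alpha = std.inv_cdf(1 - alpha / tails)  # critical value for the test
    z_power = std.inv_cdf(power)              # 0 when power is 0.5
    return 2 * ((z_alpha + z_power) / d) ** 2


d = 0.2  # assumed small effect, in standard deviation units
n_one = n_per_group(d, tails=1)
n_two = n_per_group(d, tails=2)
ratio = n_two / n_one
ratio_80 = n_per_group(d, power=0.8, tails=2) / n_per_group(d, power=0.8, tails=1)

print(f"one-tailed n per group: {n_one:.0f}")
print(f"two-tailed n per group: {n_two:.0f}")
print(f"ratio at 50% power: {ratio:.2f}, at 80% power: {ratio_80:.2f}")
```

Under these assumptions the two-tailed test needs roughly 40% more observations per group at 50% power, and the gap narrows as the target power rises. The exact figure depends on the approximation used, but the penalty is of the same order as the one read off the figure.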
However, there have been repeated articles questioning the general appropriateness of one-tailed tests. For instance, Lombardi & Hurlbert (2009) conclude that "all uses of one-tailed tests in the journals surveyed seemed invalid." Ruxton & Neuhäuser (2010) were a little more generous, but they concluded that in 17 papers using a one-tailed test, only one had appropriate justification to do so.
The problem arises from an apparently widespread belief among ecologists and evolutionary biologists that any a priori hypothesis regarding the direction of the outcome in our statistical test is sufficient grounds to justify a one-tailed null hypothesis. This is not true, but Lombardi & Hurlbert (2009) conclude that the reason for this misperception is fairly well founded, documenting bad or confusing advice regarding the application of one-tailed hypothesis tests in 40 of 52 popular statistical texts (Lombardi & Hurlbert 2009, Supplement).
In fact, a one-tailed hypothesis test is only appropriate if a large effect in the opposite direction of our a priori prediction is exactly as interesting as, and will result in the same action as, a small, non-significant result in the predicted direction. Both articles point out some very restrictive circumstances in which this might be true. (For instance, in an FDA test of a new headache drug, no positive effect and a large negative effect on the pain of test subjects will result in the same action: no approval for the drug.) However, in ecology and evolution it is quite hard to imagine circumstances in which a large, significant result in the opposite direction of that predicted by theory could easily be ignored.
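To see what "treated the same" means in practice, here is a minimal sketch comparing p-values for a weak result in the predicted direction and a strong result in the opposite direction. It assumes a z statistic with a standard normal null distribution and a prediction of a positive effect; the statistic values and the helper name `p_values` are invented for illustration.

```python
from statistics import NormalDist


def p_values(z_stat):
    """Return (one-tailed, two-tailed) p-values for a z statistic.

    The one-tailed test assumes the a priori prediction is a
    positive effect, so only the upper tail counts as evidence.
    """
    std = NormalDist()
    p_one = 1 - std.cdf(z_stat)
    p_two = 2 * (1 - std.cdf(abs(z_stat)))
    return p_one, p_two


# Hypothetical outcomes: weak effect as predicted, strong effect reversed.
weak_predicted = p_values(0.5)
strong_opposite = p_values(-3.0)

for label, (p_one, p_two) in (("z = 0.5", weak_predicted),
                              ("z = -3.0", strong_opposite)):
    print(f"{label}: one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
```

In the one-tailed test both outcomes are simply "not significant", and the strong reversed effect actually earns the larger p-value. The two-tailed test flags the reversed effect as highly significant, which is the result most of us would want to know about.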
Of course, there are many statistical tests (lots of them common among evolutionary biologists) to which the concept of "tailedness" doesn't really apply. For instance, we are not usually interested in whether our data fit our a priori model better than expected in a goodness-of-fit test (although perhaps we should be).
For statistical tests in which the concept of tailedness does apply, one-tailed tests are generally ill-advised. Thus, their use should require substantial justification. Ruxton & Neuhäuser (2010) give two very simple grounds on which they feel a one-tailed test needs to be justified. First, an author using a one-tailed test should clearly explain why the result in a particular direction is expected, and why it is fundamentally more interesting than a result in the opposite direction. Second, and importantly, the author should also explain why a large result in the unexpected direction should be treated no differently from a non-significant result in the expected direction (Ruxton & Neuhäuser 2010). These conditions may be rare (or, in fact, nonexistent: Lombardi & Hurlbert 2009) in our field.