Friday, November 20, 2009

New Applications of Population Genetic Methods to Studies of Language Evolution

A recent study in PLoS Biology nicely illustrates how studies of language evolution and geographic variation are taking advantage of modern population genetic and phylogenetic methods. Although the use of phylogenetic methods in quantitative comparative linguistics isn't new, this study is, to my knowledge, the first to use the Bayesian clustering algorithms that are all the rage in population genetics. The study uses these methods to investigate the extraordinarily diverse languages of Sahul (an ancient continent formed from present-day Australia, New Guinea and surrounding islands). Previous attempts to infer the history of these languages have been complicated by two problems that will be familiar to any phylogeneticist. The first problem involves identification of homologies or, in the jargon of a quantitative comparative linguist, "phonological and semantic drift [which] make it impossible to identify lexical cognate characters." The second stems from admixture, which may result from the fact that many Sahul languages have been in "long term and intensive contact." By applying the program Structure to 160 "abstract structural features" quantified for 121 Sahul languages, Reesink et al. are able to recover 10 "ancestral language populations." Many of the clusters recovered by Structure correspond with previously diagnosed language groupings, and the overall patterns of hierarchical clustering suggest plausible historical scenarios. The authors also suggest that, in spite of ample opportunity for interchange, many languages show "negligible amounts of admixture."
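For readers who haven't met this family of methods, here is a toy sketch of the idea behind an admixture analysis: each language gets a vector of cluster proportions, and each cluster gets its own frequency for every binary feature. To be clear about the assumptions, this is not the Structure program and not the authors' pipeline. Structure fits a Bayesian model by MCMC, whereas this sketch fits a simplified admixture likelihood by EM (maximum likelihood), treats every feature as a single presence/absence value, and runs on a tiny invented data set. The function name `fit_admixture`, its parameters, and the `toy` matrix are all hypothetical.

```python
import math
import random


def fit_admixture(data, k, n_iter=200, seed=0):
    """Fit a simple admixture model to binary features with EM.

    data[i][j] is 1 if language i has feature j, else 0.
    Returns (q, p, trace):
      q[i][c]  share of language i attributed to cluster c
      p[c][j]  frequency of feature j in cluster c
      trace    log-likelihood at the start of each iteration
    """
    rng = random.Random(seed)
    n, m = len(data), len(data[0])
    eps = 1e-6  # keep frequencies strictly inside (0, 1)
    q = [[1.0 / k] * k for _ in range(n)]
    p = [[rng.uniform(0.25, 0.75) for _ in range(m)] for _ in range(k)]
    trace = []
    for _ in range(n_iter):
        q_sum = [[0.0] * k for _ in range(n)]
        num = [[0.0] * m for _ in range(k)]
        den = [[0.0] * m for _ in range(k)]
        loglik = 0.0
        for i in range(n):
            for j in range(m):
                x = data[i][j]
                # E-step: weight of each cluster as the source of this value
                w = [q[i][c] * (p[c][j] if x else 1.0 - p[c][j])
                     for c in range(k)]
                total = sum(w)
                loglik += math.log(total)
                for c in range(k):
                    r = w[c] / total
                    q_sum[i][c] += r
                    num[c][j] += r * x
                    den[c][j] += r
        trace.append(loglik)
        # M-step: re-estimate proportions and cluster feature frequencies
        q = [[q_sum[i][c] / m for c in range(k)] for i in range(n)]
        for c in range(k):
            for j in range(m):
                if den[c][j] > 0.0:
                    p[c][j] = min(1.0 - eps, max(eps, num[c][j] / den[c][j]))
    return q, p, trace


# Invented toy data: 6 "languages" scored for 8 binary structural features.
toy = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 0, 1, 1],
]

q, p, trace = fit_admixture(toy, k=2)
for i, row in enumerate(q):
    print("language", i, [round(v, 2) for v in row])
```

Each printed row is one language's estimated cluster proportions. A row dominated by a single cluster corresponds to what the paper calls negligible admixture, while a more even split suggests mixed ancestry under this simplified model. Like any EM fit, the result can depend on the starting values, so a real analysis would compare several seeds and several values of `k`.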

Reesink, G., Singer, R., & Dunn, M. (2009). Explaining the Linguistic Diversity of Sahul Using Population Models. PLoS Biology, 7 (11). DOI: 10.1371/journal.pbio.1000241

7 comments:

Roberto Keller said...

Although the use of phylogenetic methods in quantitative comparative linguistics isn't new, this new study is, to my knowledge, the first to use the Bayesian clustering algorithms

cough... doi:10.1038/nature02029 ... cough... DOI: 10.1126/science.1166858 ... cough...

Glor said...

Sorry for the somewhat sloppy language, Roberto, but when I said "Bayesian clustering algorithms that are all the rage in population genetics" I didn't intend to suggest tree-building methods like those used in the papers you linked. I was trying to speak specifically of the tree-free clustering algorithms implemented in programs like Structure. That said, I wouldn't be surprised if Structure has also been used previously to investigate language evolution; I just couldn't find any such studies in quick Web of Science and Google searches.

Susan Perkins said...

And indeed, the authors of this paper don't cite any others...

Poletarac said...

I've been dabbling with applying phylogenetic methods to language evolution problems, and at least half-heartedly following the literature. I'd be very surprised if this isn't the first study to use Structure-like population assignment and admixture analyses. (Seems like a fairly clever and innovative use, by the way.)

Unknown said...

Be sure to look at the supplemental material from Gray et al. for the description of their (non-tree-free) construction methods.

DOI: 10.1126/science.1166858

Glor said...

@frabotta
Did you mean to say non-tree-free or tree-free? The supplemental material of the study you cited only discusses tree-based methods, but perhaps you can further enlighten me if I'm missing something!

Unknown said...

@Glor

Rich, the Gray et al. paper DID use tree-based methods (non-tree-free) vs. what you attempted to highlight: "the tree-free clustering algorithms implemented in programs like Structure."

Sorry for the confusion with my double negative.