Inconsistency of phylogenetic estimates from concatenated data under coalescence.
Authors
Abstract
Although multiple gene sequences are becoming increasingly available for molecular phylogenetic inference, the analysis of such data has largely relied on inference methods designed for single genes. One of the common approaches to analyzing data from multiple genes is concatenation of the individual gene data to form a single supergene to which traditional phylogenetic inference procedures - e.g., maximum parsimony (MP) or maximum likelihood (ML) - are applied. Recent empirical studies have demonstrated that concatenation of sequences from multiple genes prior to phylogenetic analysis often results in inference of a single, well-supported phylogeny. Theoretical work, however, has shown that the coalescent can produce substantial variation in single-gene histories. Using simulation, we combine these ideas to examine the performance of the concatenation approach under conditions in which the coalescent produces a high level of discord among individual gene trees and show that it leads to statistically inconsistent estimation in this setting. Furthermore, use of the bootstrap to measure support for the inferred phylogeny can result in moderate to strong support for an incorrect tree under these conditions. These results highlight the importance of incorporating variation in gene histories into multilocus phylogenetics.
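The gene-tree discord mentioned in the abstract can be illustrated with the simplest textbook case, which is not the simulation design used in the study. The sketch below assumes a three-taxon species tree ((A,B),C) whose internal branch has length t in coalescent units; under the standard multispecies coalescent, a gene tree matches the species tree with probability 1 - (2/3)e^(-t). The function names, taxon labels, and parameter values are hypothetical, chosen only for illustration. Note that this shows discord among gene trees only; the inconsistency of concatenation reported above concerns harder settings and is not reproduced here.

```python
import math
import random

TOPOLOGIES = ["((A,B),C)", "((A,C),B)", "((B,C),A)"]


def concordance_probability(t):
    """Probability that a gene tree matches the species tree ((A,B),C)
    when the internal branch has length t (coalescent units)."""
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def simulate_gene_tree(t, rng):
    """Draw one gene-tree topology under the three-taxon coalescent."""
    # The A and B lineages coalesce within the internal branch if their
    # exponential (rate 1) waiting time is shorter than the branch.
    if rng.expovariate(1.0) < t:
        return TOPOLOGIES[0]
    # Otherwise all three lineages reach the ancestral population,
    # where each of the three pairs is equally likely to coalesce first.
    return rng.choice(TOPOLOGIES)


def simulate_frequencies(t, n_genes, seed=0):
    """Relative frequency of each topology across n_genes simulated loci."""
    rng = random.Random(seed)
    counts = {topology: 0 for topology in TOPOLOGIES}
    for _ in range(n_genes):
        counts[simulate_gene_tree(t, rng)] += 1
    return {topology: c / n_genes for topology, c in counts.items()}


if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 3.0):
        freqs = simulate_frequencies(t, 20000)
        print(t, round(concordance_probability(t), 3), freqs)
```

With short internal branches the concordant topology approaches a frequency of one third, so most loci disagree with the species tree; this is the kind of variation in single-gene histories that the abstract refers to.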
Similar resources
Bears in a Forest of Gene Trees: Phylogenetic Inference Is Complicated by Incomplete Lineage Sorting and Gene Flow
Ursine bears are a mammalian subfamily that comprises six morphologically and ecologically distinct extant species. Previous phylogenetic analyses of concatenated nuclear genes could not resolve all relationships among bears, and appeared to conflict with the mitochondrial phylogeny. Evolutionary processes such as incomplete lineage sorting and introgression can cause gene tree discordance and ...
Using Genes as Characters and a Parsimony Analysis to Explore the Phylogenetic Position of Turtles
The phylogenetic position of turtles within the vertebrate tree of life remains controversial. Conflicting conclusions from different studies are likely a consequence of systematic error in the tree construction process, rather than random error from small amounts of data. Using genomic data, we evaluate the phylogenetic position of turtles with both conventional concatenated data analysis and ...
Concordance analysis in mitogenomic phylogenetics.
Here I advocate the utility of Bayesian concordance analysis as a mechanism for exploring the magnitude and source of phylogenetic signal in concatenated mitogenomic phylogenetic studies. While typically applied to the study of independently evolving gene trees, Bayesian concordance analysis can also be applied to linked, but individually analyzed, gene regions using a prior probability that re...
Impact of the Partitioning Scheme on Divergence Times Inferred from Mammalian Genomic Data Sets
Data partitioning has long been regarded as an important parameter for phylogenetic inference. The division of heterogeneous multigene data sets into partitions with similar substitution patterns is known to increase the performance of probabilistic phylogenetic methods. However, the effect of the partitioning scheme on divergence time estimates has generally been ignored. To investigate the im...
Species tree discordance traces to phylogeographic clade boundaries in North American fence lizards (Sceloporus).
I investigated the impacts of phylogeographic sampling decisions on species tree estimation in the Sceloporus undulatus species group, a recent radiation of small, insectivorous lizards connected by parapatric and peripatric distribution across North America, using a variety of species tree inference methods (Bayesian estimation of species trees, Bayesian untangling of concordance knots, and mi...
Journal title:
- Systematic Biology
Volume 56, Issue 1
Pages -
Publication date: 2007