Maximum likelihood estimation of phylogenetic tree and substitution rates via generalized neighbor-joining and the EM algorithm
Authors
Abstract
A central task in the study of molecular sequence data from present-day species is the reconstruction of their ancestral relationships. The most established approach to tree reconstruction is the maximum likelihood (ML) method. In this method, evolution is described as a discrete-state, continuous-time Markov process on a phylogenetic tree. The substitution rate matrix, which determines the Markov process, can be estimated using the expectation-maximization (EM) algorithm. Unfortunately, an exhaustive search for the ML phylogenetic tree is computationally prohibitive for large data sets. In such situations, the neighbor-joining (NJ) method is frequently used because of its computational speed. The NJ method reconstructs trees by recursively clustering neighboring sequences, based on pairwise comparisons between the sequences. The NJ method can be generalized so that reconstruction is based on comparisons of subtrees rather than pairwise distances. In this paper, we present an algorithm for simultaneous substitution rate estimation and phylogenetic tree reconstruction. The algorithm iterates between the EM algorithm for estimating substitution rates and the generalized NJ method for tree reconstruction. Preliminary results of the approach are encouraging.
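The recursive clustering step of the NJ method can be illustrated with a minimal sketch of the classical, pairwise-distance neighbor-joining procedure. This is not the generalized subtree-based variant or the EM coupling described in the abstract, whose details are not given here. The function name `neighbor_joining`, the internal node labels `U0`, `U1`, ..., and the example distance matrix are assumptions made for illustration only.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Classical neighbor joining on pairwise distances (illustrative sketch).

    names: list of taxon labels.
    dist:  dict mapping a pair (a, b) to the distance between a and b.
    Returns a list of edges (node, neighbor, branch_length).
    """
    # Store distances under unordered keys so lookups work in either direction.
    d = {frozenset(pair): value for pair, value in dist.items()}
    nodes = list(names)
    edges = []
    next_id = 0  # counter for hypothetical internal node labels U0, U1, ...

    while len(nodes) > 2:
        n = len(nodes)
        # Row sums of the current distance matrix.
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a) for a in nodes}
        # Choose the pair minimizing Q(i, j) = (n - 2) * d(i, j) - r_i - r_j.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        u = f"U{next_id}"
        next_id += 1
        # Branch lengths from i and j to the new internal node u.
        length_i = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        edges.append((i, u, length_i))
        edges.append((j, u, dij - length_i))
        # Distances from u to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [u]

    # Connect the last two nodes directly.
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Illustrative (assumed) additive distance matrix on five taxa.
example_names = ["a", "b", "c", "d", "e"]
example_dist = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}

for node, neighbor, length in neighbor_joining(example_names, example_dist):
    print(f"{node} -- {neighbor}: {length}")
```

Each pass joins the pair with the smallest Q value, replaces it with one internal node, and shrinks the problem by one, which is the source of the method's speed relative to an exhaustive ML tree search.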
Related resources
A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates.
Using simulated data, we compared five methods of phylogenetic tree estimation: parsimony, compatibility, maximum likelihood, Fitch-Margoliash, and neighbor joining. For each combination of substitution rates and sequence length, 100 data sets were generated for each of 50 trees, for a total of 5,000 replications per condition. Accuracy was measured by two measures of the distance between the t...
Phylogenetic analyses of amino acid variation in the serpin proteins.
Phylogenetic analyses of 110 serpin protein sequences revealed clades consistent with independent phylogenetic analyses based on exon-intron structure and diagnostic amino acid sites. Trees were estimated by maximum likelihood, neighbor joining, and partial split decomposition using both the BLOSUM 62 and Jones-Taylor-Thornton substitution matrices. Neighbor-joining trees gave results closest t...
NJML+P: A Hybrid Algorithm of the Maximum Likelihood and Neighbor-Joining Methods Using Parallel Computing
The NJML method [2, 3] is a hybrid of two well-known methods for reconstructing molecular phylogenetic trees: the neighbor-joining (NJ) method [4] and the maximum likelihood (ML) method [1]. The NJML method is considerably efficient in both reliability and speed compared with other existing ML-based methods. By giving appropriate parameters, the NJML method gradually approaches t...
(Short paper) Phylogenetic analysis and molecular evolution of leptin
In the current study, the phylogeny and molecular evolution of mammalian Leptin were investigated. Data were obtained by searching the genome database and then aligned; all examined mammals contained only a single copy of Leptin. The nucleotide substitution rate of the sequences and the molecular evolution of Leptin were calculated by maximum likelihood and neighbor-joinin...
The robustness of two phylogenetic methods: four-taxon simulations reveal a slight superiority of maximum likelihood over neighbor joining.
The robustness (sensitivity to violation of assumptions) of the maximum-likelihood and neighbor-joining methods was examined using simulation. Maximum likelihood and neighbor joining were implemented with Jukes-Cantor, Kimura, and gamma models of DNA substitution. Simulations were performed in which the assumptions of the methods were violated to varying degrees on three model four-taxon trees....
Journal title:
Volume, issue
Pages -
Publication date: 2005