Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling
Authors
Abstract
MOTIVATION: Phylogenetic profiling methods can achieve good accuracy in predicting protein-protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly available, identifying the most informative RT is becoming increasingly difficult. Previous studies on the selection of RT have provided guidelines for manual taxon selection and for eliminating closely related taxa. However, no general strategy for automatic selection of RT is currently available.
RESULTS: We present three novel methods for automating the selection of RT, using machine learning based on known protein-protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting phylogenetic profiles often require very different RT sets to support high prediction accuracy.
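To illustrate why the choice of RT matters, here is a minimal sketch of generic phylogenetic profiling. It is not the Tree-Based Search method or any other method from the paper. It assumes binary presence/absence profiles and Jaccard similarity as the co-occurrence score; the taxon names, protein names, profile values, and function names are all hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical toy data: 1 means a homolog of the protein is present in that taxon.
TAXA: List[str] = ["t1", "t2", "t3", "t4", "t5", "t6"]
PROFILES: Dict[str, List[int]] = {
    "protA": [1, 1, 0, 0, 1, 1],
    "protB": [1, 1, 0, 0, 0, 1],
    "protC": [0, 1, 1, 0, 1, 0],
}


def restrict(profile: Sequence[int], taxa: Sequence[str],
             reference_taxa: Sequence[str]) -> List[int]:
    """Keep only the profile entries that belong to the chosen reference taxa."""
    keep = set(reference_taxa)
    return [bit for taxon, bit in zip(taxa, profile) if taxon in keep]


def jaccard(p: Sequence[int], q: Sequence[int]) -> float:
    """Jaccard similarity of two binary profiles (0.0 if both are all zeros)."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def profile_similarity(x: str, y: str, reference_taxa: Sequence[str]) -> float:
    """Score a protein pair using only the selected reference taxa."""
    px = restrict(PROFILES[x], TAXA, reference_taxa)
    py = restrict(PROFILES[y], TAXA, reference_taxa)
    return jaccard(px, py)


if __name__ == "__main__":
    subset = ["t1", "t2", "t3", "t4"]  # an arbitrary, hypothetical RT subset
    for pair in [("protA", "protB"), ("protA", "protC")]:
        print(pair,
              "all taxa:", round(profile_similarity(*pair, TAXA), 3),
              "subset:", round(profile_similarity(*pair, subset), 3))
```

In this toy example the score for the same protein pair changes when the RT set changes, which is the effect that makes RT selection worth automating. A real pipeline would derive the profiles from homology searches against sequenced genomes and would evaluate candidate RT sets against known interaction networks.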
Similar resources
Protein profiling for phylogenetic relationship in snakehead species
Protein banding pattern of eight snakeheads – Channa species viz., Channa striatus, Channa marulius, Channa punctatus, Channa diplogramme, Channa bleheri, Channa gachua, Channa stewartii and Channa aurantimaculata collected from different regions of India were used to study the phylogenetic relationship among them. The banding pattern from muscle protein indicated a unique profile for each spec...
Effect of Reference Genome Selection on the Performance of Computational Methods for Genome-Wide Protein-Protein Interaction Prediction
BACKGROUND Recent progress in computational methods for predicting physical and functional protein-protein interactions has provided new insights into the complexity of biological processes. Most of these methods assume that functionally interacting proteins are likely to have a shared evolutionary history. This history can be traced out for the protein pairs of a query genome by correlating di...
Refined phylogenetic profiles method for predicting protein-protein interactions
MOTIVATION The increasing availability of complete genome sequences provides an excellent opportunity for the further development of tools for functional studies in proteomics. Several experimental approaches and in silico algorithms have been developed to cluster proteins into networks of biological significance that may provide new biological insights, especially into understanding the functions...
Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks
Background: Predicting protein localization, that is, where proteins reside in cells and in organelles such as mitochondria, is among the most important problems in bioinformatics. In this study, several machine learning algorithms are applied to predict intracellular protein locations. These algorithms use the features extracted from pro...
Journal title:
- Bioinformatics
Volume 28, Issue 6
Pages: -
Publication date: 2012