Annotation inconsistencies beyond sequence similarity-based function prediction – phylogeny and genome structure

Authors

  • Vasilis J. Promponas
  • Ioannis Iliopoulos
  • Christos A. Ouzounis
Abstract

The function annotation process in computational biology has increasingly shifted from the traditional characterization of individual biochemical roles of protein molecules to the system-wide detection of entire metabolic pathways and genomic structures. The so-called genome-aware methods broaden misannotation inconsistencies in genome sequences beyond protein function assignments, encompassing phylogenetic anomalies and artifactual genomic regions. We outline three categories of error propagation in databases by providing striking examples, at various levels of appreciation by the community from traditional to emerging, thus raising awareness for future solutions.

Related resources

G-protein coupled receptor subfamily identification using phylogenetic comparison of gene and species trees

Most approaches to prediction of protein function from primary structure are based on similarity between the query sequence and sequences of known function. This approach, however, disregards the occurrence of gene duplication (paralogy) or convergent evolution of the genes. The analysis of correlated proteins that share a common domain, taking into consideration the evolutionary history of gen...

ESG: extended similarity group method for automated protein function prediction

MOTIVATION: The importance of accurate automatic protein function prediction is ever increasing in the face of a large number of newly sequenced genomes and proteomics data that are awaiting biological interpretation. Conventional methods have focused on high sequence similarity-based annotation transfer, which relies on the concept of homology. However, many cases have been reported that simple tran...

An improved hypergeometric probability method for identification of functionally linked proteins using phylogenetic profiles

Predicting functions of proteins and alternatively spliced isoforms encoded in a genome is one of the important applications of bioinformatics in the post-genome era. Due to the practical limitation of experimental characterization of all proteins encoded in a genome using biochemical studies, bioinformatics methods provide powerful tools for function annotation and prediction. These methods al...

Family Classification and Integrative Analysis for Protein Functional Annotation

The high-throughput genome projects have resulted in a rapid accumulation of predicted protein sequences; however, experimentally verified information on protein function lags far behind. The common approach to inferring function of uncharacterized proteins based on sequence similarity to annotated proteins in sequence databases often results in over-identification, underidentification, or even...

Molecular phylogeny of three desert truffles from Iran based on ribosomal genome

The ITS region, including the 5.8S gene of rDNA, of three desert truffle species was amplified using ITS4 and ITS1 primers. The ITS sequences were compared to those of other related authentic sequences obtained from GenBank. Among 12 specimens studied, seven isolates corresponded to Terfezia claveryi reported by other authors. Iranian T. claveryi specimens had an average similarity of 99.4% (ran...

Volume 10

Publication date: 2015