Mutual Information Content of Homologous DNA Sequences

Authors

  • Luciana Pessoa
  • Helena Cristina da Gama Leitão
  • Jorge Stolfi

Abstract

The information needed to reproduce and maintain an organism is encoded in nucleic acid molecules. A deeper understanding of how this information is stored in these biosequences can lead to more efficient methods for comparing genomic sequences. In the present study, we analyzed the amount of information contained in a DNA sequence that is useful for identifying sequences homologous to it. To do so, we used signal processing techniques, especially spectral analysis and information theory.
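
As a rough illustration of the information-theoretic side of this kind of analysis, the sketch below estimates the empirical mutual information between two aligned, equal-length DNA strings. It is not the authors' method, which the abstract does not detail. The function name `mutual_information` and the simple plug-in frequency estimator are assumptions made for this example only.

```python
from collections import Counter
from math import log2


def mutual_information(seq_a: str, seq_b: str) -> float:
    """Plug-in estimate of mutual information, in bits per position,
    between two aligned sequences of equal length (hypothetical helper)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")

    n = len(seq_a)
    joint = Counter(zip(seq_a, seq_b))  # counts of aligned symbol pairs
    count_a = Counter(seq_a)            # marginal counts in the first sequence
    count_b = Counter(seq_b)            # marginal counts in the second sequence

    mi = 0.0
    for (a, b), c in joint.items():
        # Term p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


if __name__ == "__main__":
    # A sequence compared with itself, and two sequences whose aligned
    # symbols are statistically independent.
    print(mutual_information("ACGT" * 4, "ACGT" * 4))
    print(mutual_information("AACC", "ACAC"))
```

With a uniform four-letter sequence compared against itself, the estimate equals the per-symbol entropy of 2 bits. For the second pair, where every combination of symbols occurs equally often, it is 0. On short sequences this estimator is biased upward, so real use would need a correction or a significance test.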

Similar resources

Clustering of a Number of Genes Affecting in Milk Production using Information Theory and Mutual Information

Information theory is a branch of mathematics. Information theory is used in genetic and bioinformatics analyses and can be used for many analyses related to the biological structures and sequences. Bio-computational grouping of genes facilitates genetic analysis, sequencing and structural-based analyses. In this study, after retrieving gene and exon DNA sequences affecting milk yield in dairy ...

Sequence Comparisons via Algorithmic Mutual Information

One of the main problems in DNA and protein sequence comparisons is to decide whether observed similarity of two sequences should be explained by their relatedness or by mere presence of some shared internal structure, e.g., shared internal tandem repeats. The standard methods that are based on statistics or classical information theory can be used to discover either internal structure or mutua...

Information decomposition of symbolic sequences

We developed a non-parametric method of Information Decomposition (ID) of a content of any symbolical sequence. The method is based on the calculation of Shannon mutual information between analyzed and artificial symbolical sequences, and allows the revealing of latent periodicity in any symbolical sequence. We show the stability of the ID method in the case of a large number of random letter c...

InterMap3D: predicting and visualizing co-evolving protein residues

SUMMARY InterMap3D predicts co-evolving protein residues and plots them on the 3D protein structure. Starting with a single protein sequence, InterMap3D automatically finds a set of homologous sequences, generates an alignment and fetches the most similar 3D structure from the Protein Data Bank (PDB). It can also accept a user-generated alignment. Based on the alignment, co-evolving residues ar...

Transcription Factor Binding Sites: Position-Specific and Position-Dependent Modeling

Transcription of a gene is cued by the binding of a protein to a binding site. Transcription factors bind to specific binding sites (TFBSs), of which there may be many for a single transcription factor (1). However, these TFBSs often exhibit a considerable amount of variability, as the sequences consist of similar nucleotides rather than complete replicas of a TFBS (1). The ability to model the...

Journal:
  • Genetics and molecular research : GMR

Volume 4, Issue 3

Pages: -

Publication date: 2004