Mapping the Space of Genomic Signatures
Authors
Abstract
We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. 
This method also correctly identifies the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal, the Denisovan, and the chimp), and finds that the sequence most different from it in this dataset belongs to a cucumber.
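The pipeline described in the abstract (a CGR image per sequence, a pairwise image distance, then MDS) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `cgr_counts`, `pairwise_distances`, and `classical_mds` are hypothetical, the corner assignment for A, C, G, T is one common convention assumed here, and k = 3 is used for brevity instead of the k = 9 mentioned above. A plain Euclidean distance between normalized CGR grids stands in for DSSIM, and classical (Torgerson) MDS is assumed because the abstract does not say which MDS variant is used.

```python
import numpy as np

def cgr_counts(seq, k=3):
    """Count k-mers on a 2^k x 2^k grid laid out by chaos-game rules."""
    # Assumed corner convention: A, C, G, T at the four corners of the unit square.
    corner = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}
    n = 2 ** k
    grid = np.zeros((n, n))
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(b not in corner for b in kmer):
            continue  # skip k-mers containing ambiguous bases such as N
        x = y = 0
        # The most recent base selects the coarsest quadrant, so read backwards.
        for b in reversed(kmer):
            bx, by = corner[b]
            x = 2 * x + bx
            y = 2 * y + by
        grid[y, x] += 1
    return grid

def pairwise_distances(grids):
    """Euclidean distance between normalized grids (a stand-in, not DSSIM)."""
    flat = np.array([g.ravel() / max(g.sum(), 1) for g in grids])
    diff = flat[:, None, :] - flat[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(d, dims=2):
    """Embed a distance matrix in `dims` dimensions via classical MDS."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)
    order = np.argsort(vals)[::-1][:dims]  # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0, None))

# Toy sequences (made up for illustration, not real mtDNA).
seqs = ["ACGTACGTACGGTTAACC", "ACGTACGTACGGTTAACG", "TTTTTTTTTTTTTTTTTT"]
grids = [cgr_counts(s, k=3) for s in seqs]
dist = pairwise_distances(grids)
coords = classical_mds(dist)  # one 2-D map point per sequence
```

In the resulting `coords`, each row plays the role of one point on a Molecular Distance Map. Replacing `pairwise_distances` with a windowed SSIM-based measure would bring the sketch closer to the DSSIM distance named in the abstract.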
Similar resources
A genome-wide scan to detect signatures of recent selection in Australian Merino sheep
Domestication and selection are processes that conserve the pattern of genetic diversities between and within populations. Identification of genomic regions that are targets of selection for phenotypic traits is one of the main aims of research in animal genetics. An approach for identifying divergently selected regions of the genome is to compare FST values among loci to estimate the genetic v...
Visualisation and exploration of high-dimensional data using a “force directed placement” method: application to the analysis of genomic signatures
Abstract. Visualization of high-dimensional data is generally achieved by projection in a low dimensional space (usually 2 to 3 dimensions). Visualization is designed to facilitate the understanding of data sets by preserving some “essential” information. We have designed a non-linear multi-dimensional-scaling (MDS) tool relying on the force directed placement (FDP) algorithm to help dynamically...
Iterative Process for an α- Nonexpansive Mapping and a Mapping Satisfying Condition(C) in a Convex Metric Space
We construct one-step iterative process for an α- nonexpansive mapping and a mapping satisfying condition (C) in the framework of a convex metric space. We study △-convergence and strong convergence of the iterative process to the common fixed point of the mappings. Our results are new and are valid in hyperbolic spaces, CAT(0) spaces, Banach spaces and Hilbert spaces, simultaneously.
Detection of Genetic Differences between Holstein and Iranian North-West Indigenous Hybrid Cattles using Genomic Data
Extended Abstract. Introduction and Objective: Selection to increase the frequency of new mutations useful only in some subpopulations leaves markers at the genome level. Most of these regions are related to genes and QTLs controlling significant economic traits. Material and Methods: In order to detect genetic differences between Iranian northwestern crossbred and Holstein cattle breed,...
Double Sequence Iterations for Strongly Contractive Mapping in Modular Space
In this paper, we consider double sequence iteration processes for strongly $\rho$-contractive mapping in modular space. It is proved that these sequences converge strongly to a fixed point of the strongly $\rho$-contractive mapping.
Fixed point theorem for non-self mappings and its applications in the modular space
In this paper, based on [A. Razani, V. Rakočević and Z. Goodarzi, Nonself mappings in modular spaces and common fixed point theorems, Cent. Eur. J. Math. 2 (2010) 357-366.] a fixed point theorem for non-self contraction mapping $T$ in the modular space $X_\rho$ is presented. Moreover, we study a new version of Krasnoselskii's fixed point theorem for $S+T$, where $T$ is a cont...
Journal title:
Volume 10, Issue
Pages -
Publication date 2015