Simplifying the mosaic description of DNA sequences.
Abstract
By using the Jensen-Shannon divergence, genomic DNA can be divided into compositionally distinct domains through a standard recursive segmentation procedure. Each domain, while significantly different from its neighbors, may, however, share compositional similarity with one or more distant (non-neighboring) domains. We thus obtain a coarse-grained description of the given DNA string in terms of a smaller set of distinct domain labels. This yields a minimal domain description of a given DNA sequence, significantly reducing its organizational complexity. This procedure gives a new means of evaluating genomic complexity as one examines organisms ranging from bacteria to human. The mosaic organization of DNA sequences could have originated from the insertion of fragments of one genome (the parasite) inside another (the host), and we present numerical experiments that are suggestive of this scenario.
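The procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the greedy labeling rule, and the stopping rule (cut or separate only when length times divergence reaches a hypothetical `threshold` of 10.0 bits) are assumptions made for illustration, since the abstract does not state the significance criterion that was used.

```python
import math
from collections import Counter

def entropy(counts, n):
    """Shannon entropy in bits of a composition given as symbol counts."""
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c > 0)

def weighted_js(c1, c2):
    """Size-weighted Jensen-Shannon divergence (bits) between two compositions."""
    n1, n2 = sum(c1.values()), sum(c2.values())
    n = n1 + n2
    js = (entropy(c1 + c2, n)
          - (n1 / n) * entropy(c1, n1)
          - (n2 / n) * entropy(c2, n2))
    return n * js

def best_split(seq):
    """Return (position, score) of the cut that maximizes the weighted JS divergence."""
    total = Counter(seq)
    left = Counter()
    best_pos, best_score = None, 0.0
    for i in range(1, len(seq)):
        left[seq[i - 1]] += 1
        right = total - left  # Counter subtraction keeps only positive counts
        score = weighted_js(left, right)
        if score > best_score:
            best_pos, best_score = i, score
    return best_pos, best_score

def segment(seq, offset=0, threshold=10.0):
    """Recursively cut seq into domains, returned as (start, end) pairs.

    `threshold` is a hypothetical stopping rule, not the paper's criterion.
    """
    pos, score = best_split(seq)
    if pos is None or score < threshold:
        return [(offset, offset + len(seq))]
    return (segment(seq[:pos], offset, threshold)
            + segment(seq[pos:], offset + pos, threshold))

def label_domains(seq, domains, threshold=10.0):
    """Greedy labeling (an assumption): a domain reuses an earlier label when it is
    not significantly different from the first domain that received that label."""
    representatives = []  # composition of the first domain seen for each label
    labels = []
    for start, end in domains:
        comp = Counter(seq[start:end])
        for k, rep in enumerate(representatives):
            if weighted_js(comp, rep) < threshold:
                labels.append(k)
                break
        else:
            representatives.append(comp)
            labels.append(len(representatives) - 1)
    return labels
```

On the toy string `"A" * 50 + "G" * 50 + "A" * 50`, `segment` returns three domains and `label_domains` assigns them the labels `[0, 1, 0]`, so the first and third domains share a label even though they are not neighbors. The greedy rule keeps the sketch short; a clustering step over all domain pairs would be another reasonable choice.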
Similar resources
Association of Tomato Leaf Curl New Delhi Virus, Betasatellite, and Alphasatellite with Mosaic Disease of Spine Gourd (Momordica dioica Roxb. Willd) in India
Background: Spine gourd (Momordica dioica Roxb. Willd) is an important cucurbitaceous crop grown across the world for vegetable and medicinal purposes. Diseases caused by DNA viruses are becoming limiting factors for spine gourd production, reducing its potential yield. For the commercial cultivation of the spine gourd, propagation material used by most o...
Complete nucleotide sequence and host range of South African cassava mosaic virus: further evidence for recombination amongst begomoviruses.
Complete nucleotide sequences of the DNA-A (2800 nt) and DNA-B (2760 nt) components of a novel cassava-infecting begomovirus, South African cassava mosaic virus (SACMV), were determined and compared with various New World and Old World begomoviruses. SACMV is most closely related to East African cassava mosaic virus (EACMV) in both its DNA-A (85% with EACMV-MH and -MK) and -B (90% with EACMV-UG...
MOSAIC: segmenting multiple aligned DNA sequences
MOSAIC is a set of tools for the segmentation of multiple aligned DNA sequences into homogeneous zones. The segmentation is based on the distribution of mutational events along the alignment. As an example, the analysis of one repeated sequence belonging to the subtelomeric regions of the yeast genome is presented. Availability: Free access from ftp://ftp.biomath.jussieu.fr/pub/pape...
Cauliflower mosaic virus 35S promoter-controlled DNA copies of cowpea mosaic virus RNAs are infectious on plants.
Clones have been constructed that contain full-length cDNA copies of cowpea mosaic virus RNA1 and RNA2, downstream of the cauliflower mosaic virus 35S promoter. The clones, when linearized downstream of the viral sequences, give rise to cowpea mosaic virus-like symptoms when inoculated onto cowpea plants. Viral RNA and virions can be detected in the inoculated plants, demonstrating that the clo...
Quantification of DNA patchiness using long-range correlation measures.
We introduce and develop new techniques to quantify DNA patchiness, and to quantify characteristics of its mosaic structure. These techniques, which involve calculating two functions, alpha(l) and beta(l), measure correlations at length scale l and detect distinct characteristic patch sizes embedded in scale-invariant patch size distributions. Using these new methods, we address a number of iss...
Journal title:
- Physical review. E, Statistical, nonlinear, and soft matter physics
Volume 66, Issue 3 Pt 1
Pages -
Publication date 2002