De novo detection of copy number variation by co-assembly

Authors

  • Jurgen F. Nijkamp
  • Marcel van den Broek
  • Jan-Maarten A. Geertman
  • Marcel J. T. Reinders
  • Jean-Marc Daran
  • Dick de Ridder

Abstract

MOTIVATION: Comparing genomes of individual organisms using next-generation sequencing data has, until now, mostly been performed using a reference genome. This is challenging when the reference is distant, and it introduces bias towards the exact sequence present in the reference. Recent improvements in both sequencing read length and the efficiency of assembly algorithms have brought direct comparison of individual genomes by de novo assembly, rather than through a reference genome, within reach.

RESULTS: Here, we develop and test an algorithm, named Magnolya, that uses a Poisson mixture model for copy number estimation of contigs assembled from sequencing data. We combine this with co-assembly to allow de novo detection of copy number variation (CNV) between two individual genomes, without mapping reads to a reference genome. In co-assembly, multiple sequencing samples are combined, generating a single contig graph with different traversal counts for the nodes and edges between the samples. In the resulting 'coloured' graph, the contigs have integer copy numbers; this negates the need to segment genomic regions based on depth of coverage, as required for mapping-based detection methods. Magnolya is then used to assign integer copy numbers to contigs, after which CNV probabilities are easily inferred. The copy number estimator and CNV detector perform well on simulated data. Application of the algorithms to hybrid yeast genomes showed allotriploid content of different origins in the wine yeast Y12, and extensive CNV in aneuploid brewing yeast genomes. Integer CNV was also accurately detected in a short-term laboratory-evolved yeast strain.
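
The idea of assigning integer copy numbers to contigs from read counts, and then inferring a CNV probability between two samples, can be illustrated with a small sketch. This is not the Magnolya implementation: the function names, the uniform prior, the `max_copy` cap, and the assumption that the per-copy read rate is already known (a full mixture model would estimate it, for example by EM) are all hypothetical choices made for illustration. Copy number 0 is left out for simplicity.

```python
import math


def poisson_logpmf(k, mu):
    """Log of the Poisson probability P(K = k) for mean mu > 0."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def copy_number_posterior(read_count, contig_length, rate_per_copy, max_copy=8):
    """Posterior over integer copy numbers 1..max_copy under a uniform prior.

    rate_per_copy is the expected number of reads per base pair for a
    single-copy contig; it is assumed known here (hypothetical simplification).
    """
    copies = range(1, max_copy + 1)
    log_liks = [
        poisson_logpmf(read_count, c * rate_per_copy * contig_length)
        for c in copies
    ]
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_liks)
    weights = [math.exp(ll - top) for ll in log_liks]
    total = sum(weights)
    return {c: w / total for c, w in zip(copies, weights)}


def cnv_probability(post_a, post_b):
    """Probability that two samples differ in copy number for one contig,
    treating the two posteriors as independent (an assumption)."""
    same = sum(post_a[c] * post_b[c] for c in post_a)
    return 1.0 - same


# Example: one 1000 bp contig seen in two co-assembled samples.
post_a = copy_number_posterior(read_count=200, contig_length=1000, rate_per_copy=0.1)
post_b = copy_number_posterior(read_count=100, contig_length=1000, rate_per_copy=0.1)
best_a = max(post_a, key=post_a.get)
best_b = max(post_b, key=post_b.get)
p_cnv = cnv_probability(post_a, post_b)
```

Because each contig in the co-assembly graph receives its own posterior, no segmentation of a reference into coverage windows is needed in this sketch; the contig itself is the unit of comparison.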


Similar resources

Clustering of Short Read Sequences for de novo Transcriptome Assembly

Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...


De novo rates and selection of large copy number variation

De novo rates and selection of large copy number variation Andy Itsara, Hao Wu, Joshua D. Smith, Deborah A. Nickerson, Isabelle Romieu, Stephanie J. London, and Evan E. Eichler Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA; National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Hum...


A stochastic inference of de novo CNV detection and association test in multiplex schizophrenia families

The copy number variation (CNV) is a type of genetic variation in the genome. It is measured based on signal intensity measures and can be assessed repeatedly to reduce the uncertainty in PCR-based typing. Studies have shown that CNVs may lead to phenotypic variation and modification of disease expression. Various challenges exist, however, in the exploration of CNV-disease association. Here we...


Detection of de novo copy number alterations in case-parent trios using the R package MinimumDistance

For the analysis of case-parent trio genotyping arrays, copy number variants (CNV) appearing in the offspring that differ from the parental copy numbers are often of interest (de novo CNV). This package defines a statistic, referred to as the minimum distance, for identifying de novo copy number alterations in the offspring. We smooth the minimum distance using the circular binary segmentation ...


GENES AND SCHIZOPHRENIA De Novo Mutation in Schizophrenia

Several studies in the last 5 years have shown that newly arising (de novo) mutations contribute to the genetics of schizophrenia (SZ). This will replenish genetic variants removed by natural selection and could, in part, explain why SZ prevalence has remained stable in the general population despite low fecundity. The strongest evidence to date for the association between SZ and de novo mutati...



Journal title:
  • Bioinformatics

Volume 28, Issue 24

Pages: -

Publication date: 2012