Estimating individual admixture proportions from next generation sequencing data.

نویسندگان

  • Line Skotte
  • Thorfinn Sand Korneliussen
  • Anders Albrechtsen
چکیده

Inference of population structure and individual ancestry is important both for population genetics and for association studies. With next generation sequencing technologies it is possible to obtain genetic data for all accessible genetic variations in the genome. Existing methods for admixture analysis rely on known genotypes. However, individual genotypes cannot be inferred from low-depth sequencing data without introducing errors. This article presents a new method for inferring an individual's ancestry that takes the uncertainty introduced in next generation sequencing data into account. This is achieved by working directly with genotype likelihoods that contain all relevant information of the unobserved genotypes. Using simulations as well as publicly available sequencing data, we demonstrate that the presented method has great accuracy even for very low-depth data. At the same time, we demonstrate that applying existing methods to genotypes called from the same data can introduce severe biases. The presented method is implemented in the NGSadmix software available at http://www.popgen.dk/software.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

fastNGSadmix: admixture proportions and principal component analysis of a single NGS sample

Motivation Estimation of admixture proportions and principal component analysis (PCA) are fundamental tools in populations genetics. However, applying these methods to low- or mid-depth sequencing data without taking genotype uncertainty into account can introduce biases. Results Here we present fastNGSadmix, a tool to fast and reliably estimate admixture proportions and perform PCA from next...

متن کامل

A Hidden Markov Model Approach for Simultaneously Estimating Local Ancestry and Admixture Time Using Next Generation Sequence Data in Samples of Arbitrary Ploidy

Admixture-the mixing of genomes from divergent populations-is increasingly appreciated as a central process in evolution. To characterize and quantify patterns of admixture across the genome, a number of methods have been developed for local ancestry inference. However, existing approaches have a number of shortcomings. First, all local ancestry inference methods require some prior assumption a...

متن کامل

I-37: Establishing High Resolution Genomic Profiles of Single Cells Using Microarray and Next-Generation Sequencing Technologies

The nature and pace of genome mutation is largely unknown. Standard methods to investigate DNA-mutation rely on arraying or sequencing DNA from a population of cells, hence the genetic composition of individual cells is lost and de novo mutation in cell(s) is concealed within the bulk signal. We developed methods based on (SNP-) arraying and next-generation sequencing of single-cell whole-genom...

متن کامل

ContEst: estimating cross-contamination of human samples in next-generation sequencing data

SUMMARY Here, we present ContEst, a tool for estimating the level of cross-individual contamination in next-generation sequencing data. We demonstrate the accuracy of ContEst across a range of contamination levels, sources and read depths using sequencing data mixed in silico at known concentrations. We applied our tool to published cancer sequencing datasets and report their estimated contamin...

متن کامل

Strategies and Clinical Applications of Next Generation Sequencing

Abstract DNA sequencing is one of the great valuable techniques in molecular biology, which can be used to detect the sequence of nucleotides in a DNA fragment. The high-throughput se­quencing known as Next Generation Sequencing (NGS) revolutionized genomic research and molecular biology; therefore, the whole human genome can be sequenced with a low cost in several days. NGS technology is simi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Genetics

دوره 195 3  شماره 

صفحات  -

تاریخ انتشار 2013