Hierarchical Analysis of Multi-mapping RNA-Seq Reads Improves the Accuracy of Allele-specific Expression
Abstract
Allele-specific expression (ASE) refers to the differential abundance of the allelic copies of a transcript. Direct RNA sequencing (RNA-Seq) can provide quantitative estimates of ASE for genes with transcribed polymorphisms. However, estimating ASE is challenging due to ambiguities in read alignment. Current approaches do not account for the hierarchy of multiple read alignments to genes, isoforms, and alleles. We have developed EMASE (Expectation-Maximization for Allele Specific Expression), an integrated approach to estimate total gene expression, ASE, and isoform usage based on hierarchical allocation of multi-mapping reads. In simulations, EMASE outperforms standard ASE estimation methods. We apply EMASE to RNA-Seq data from F1 hybrid mice, where we observe widespread ASE associated with cis-acting polymorphisms and a small number of parent-of-origin effects at known imprinted genes. The EMASE software is freely available under the GNU license at https://github.com/churchill-lab/emase and can be adapted to other sequencing applications.
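To make the idea of allocating multi-mapping reads through a hierarchy more concrete, the following is a minimal sketch of an expectation-maximization loop over two levels only (gene, then allele within gene). It is an illustration under stated assumptions, not the EMASE implementation: the function name, the input format (each read given as a set of (gene, allele) pairs it aligns to), and the toy data are all hypothetical, and isoforms, transcript lengths, and sequencing error are ignored.

```python
def em_allele_allocation(reads, n_genes, n_alleles=2, n_iter=200):
    """Toy two-level EM (hypothetical helper, not the EMASE API).

    reads: list of sets of (gene, allele) index pairs a read aligns to.
    Returns (theta, phi, counts), where theta[g] is the share of reads
    from gene g, phi[g][a] is the share of gene g's reads from allele a,
    and counts[g][a] is the expected read count.
    """
    if not reads:
        raise ValueError("at least one read is required")
    theta = [1.0 / n_genes] * n_genes
    phi = [[1.0 / n_alleles] * n_alleles for _ in range(n_genes)]
    counts = [[0.0] * n_alleles for _ in range(n_genes)]
    for _ in range(n_iter):
        counts = [[0.0] * n_alleles for _ in range(n_genes)]
        # E-step: split each read across its alignments in proportion
        # to the current gene share times the allele share within gene.
        for alignments in reads:
            weights = {(g, a): theta[g] * phi[g][a] for g, a in alignments}
            total = sum(weights.values())
            if total == 0.0:
                continue  # read carries no usable weight; skip it
            for (g, a), w in weights.items():
                counts[g][a] += w / total
        # M-step: re-estimate gene shares, then allele shares per gene.
        n_assigned = sum(sum(row) for row in counts)
        for g in range(n_genes):
            gene_count = sum(counts[g])
            theta[g] = gene_count / n_assigned
            if gene_count > 0:
                phi[g] = [c / gene_count for c in counts[g]]
    return theta, phi, counts


# Toy data (made up): two genes, two alleles each.
toy_reads = (
    [{(0, 0)}] * 6              # unique to gene 0, allele 0
    + [{(0, 1)}] * 2            # unique to gene 0, allele 1
    + [{(0, 0), (0, 1)}] * 4    # gene 0, allele unknown
    + [{(1, 0)}] * 3            # unique to gene 1, allele 0
    + [{(1, 1)}] * 3            # unique to gene 1, allele 1
    + [{(0, 0), (1, 0)}] * 2    # ambiguous between the two genes
)
theta, phi, counts = em_allele_allocation(toy_reads, n_genes=2)
print("gene shares:", theta)
print("allele shares within gene:", phi)
```

In this sketch, reads that cannot distinguish the two alleles of gene 0 are divided according to the allelic ratio suggested by the uniquely mapping reads, rather than being discarded or split evenly.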
Similar resources
RNA-Seq Alignment to Individualized Genomes Improves Transcript Abundance Estimates in Multiparent Populations
Massively parallel RNA sequencing (RNA-seq) has yielded a wealth of new insights into transcriptional regulation. A first step in the analysis of RNA-seq data is the alignment of short sequence reads to a common reference genome or transcriptome. Genetic variants that distinguish individual genomes from the reference sequence can cause reads to be misaligned, resulting in biased estimates of tr...
asSeq: A set of tools for the study of allele-specific RNA-seq data
RNA-seq has become one of the major solutions for genome-wide inquiry of transcriptome variation. Allele-specific expression (ASE), which can be measured by RNA-seq but not by traditional microarrays, provides a new perspective on transcriptome variation. Allelic imbalance of gene expression may be due to a cis-acting genetic variant or parent-of-origin regulation. Currently this R package asSeq on...
Estimates of allele-specific expression in Drosophila with a single genome sequence and RNA-seq data
Motivation: Genetic variation in cis-regulatory elements is an important cause of variation in gene expression. Cis-regulatory variation can be detected by using high-throughput RNA sequencing (RNA-seq) to identify differences in the expression of the two alleles of a gene. This requires that reads from the two alleles are equally likely to map to a reference genome(s), and that single-nucleotid...
Gene Expression Profile Analysis during Mouse Tooth Development
Introduction: Complex molecular pathways are involved in the development of different tissues such as teeth. Differential gene expression patterns during tooth development generate different tooth types. Tooth development results from interactions between the oral epithelium and underlying ectomesenchymal cells of neural crest origin. Tooth development is regulated by different signaling networks. In thi...
Accurate Estimation of Expression Levels of Homologous Genes in RNA-seq Experiments
Abstract: Next-generation high-throughput sequencing (NGS) is poised to replace array-based technologies as the experiment of choice for measuring RNA expression levels. Several groups have demonstrated the power of this new approach (RNA-seq), making significant and novel contributions and simultaneously proposing methodologies for the analysis of RNA-seq data. In a typical experiment, millions...
Publication date: 2017