Model-P: a basecalling method for resequencing microarrays of diploid samples

Authors

  • Yiping Zhan
  • David Kulp
Abstract

MOTIVATION: Basecalling is a critical step of the analysis of DNA resequencing microarray data for single nucleotide polymorphism discovery and genotyping. For microarrays hybridized with DNA derived from diploid organisms, basecalling with high accuracy at high call rates is a challenging task. Current methods sometimes do not produce satisfactory results.

RESULTS: We explored using physical models based on the sequences of the probe and the target to predict feature intensities in resequencing microarrays. Based on these intensity-predicting models, a new basecalling method (Model-P), which takes into consideration the expected feature intensities for different potential genotypes, was developed. Model-P is shown to have better performance at high call rates compared with ABACUS, the current state-of-the-art method, on a test dataset and on relatively AT-rich regions.

AVAILABILITY: Model-P is available upon request.
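The general idea of calling a diploid genotype by comparing observed feature intensities with the intensities expected under each candidate genotype can be illustrated with a minimal sketch. This is not the Model-P implementation, which the abstract does not specify: the function names, the toy intensity model (each allele contributes half the signal to its matching probe), and the `signal`, `background`, and `min_margin` values are all assumptions made for illustration. A real method would replace `expected_intensities` with a physical model based on the probe and target sequences.

```python
BASES = "ACGT"
# The ten unordered diploid genotypes at one position: AA, AC, ..., TT.
GENOTYPES = [a + b for i, a in enumerate(BASES) for b in BASES[i:]]


def expected_intensities(genotype, signal=1.0, background=0.1):
    """Hypothetical stand-in for an intensity-predicting model.

    Assumes each allele adds half of `signal` to the probe that matches
    it, on top of a constant `background`.
    """
    return {b: background + signal * genotype.count(b) / 2 for b in BASES}


def call_genotype(observed, min_margin=0.05):
    """Return the best-fitting genotype, or "N" (no call).

    `observed` maps each base to the intensity of its probe. Fit is the
    sum of squared differences between observed and expected values.
    A call is made only if the best genotype beats the runner-up by at
    least `min_margin`, so a larger margin lowers the call rate.
    """
    scores = []
    for genotype in GENOTYPES:
        expected = expected_intensities(genotype)
        sse = sum((observed[b] - expected[b]) ** 2 for b in BASES)
        scores.append((sse, genotype))
    scores.sort()
    (best_sse, best), (second_sse, _) = scores[0], scores[1]
    if second_sse - best_sse < min_margin:
        return "N"
    return best


if __name__ == "__main__":
    # Roughly equal A and G signal suggests a heterozygous site.
    print(call_genotype({"A": 0.62, "C": 0.09, "G": 0.58, "T": 0.11}))
```

The `min_margin` threshold in this sketch makes the trade-off mentioned in the abstract concrete: a stricter threshold yields fewer but more confident calls, while a looser one raises the call rate at the risk of more errors.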

Similar resources

A model of base-call resolution on broad-spectrum pathogen detection resequencing DNA microarrays

Oligonucleotide microarrays offer the potential to efficiently test for multiple organisms, an excellent feature for surveillance applications. Among these, resequencing microarrays are of particular interest, as they possess additional unique capabilities to track pathogens' genetic variations and perform detailed discrimination of closely related organisms. However, this potential can only be...

ResqMi - a Versatile Algorithm and Software for Resequencing Microarrays

Resequencing microarrays are a common tool for fast monitoring of individual genetic variations. Applications include diagnosis of genetic and infectious diseases and SNP prediction. Base calling is the crucial step in the analysis of resequencing data. All current base calling algorithms produce ambiguous calls on parts of the sequence. Therefore, proper data handling, editing and visualizatio...

A two-stage stochastic rule-based model to determine pre-assembly buffer content

This study considers instant decision-making needs of the automobile manufacturers for resequencing vehicles before final assembly (FA). We propose a rule-based two-stage stochastic model to determine the number of spare vehicles that should be kept in the pre-assembly buffer to restore the altered sequence due to paint defects and upstream department constraints. First stage of the model decide...

Identifying Influenza Viruses with Resequencing Microarrays

Identification of genetic variations of influenza viruses is essential for epidemic and pandemic outbreak surveillance and determination of vaccine strain selection. In this study, we combined a random amplification strategy with high-density resequencing microarray technology to demonstrate simultaneous detection and sequence-based typing of 25 geographically distributed human influenza virus ...

Automated identification of multiple micro-organisms from resequencing DNA microarrays

There is an increasing recognition that detailed nucleic acid sequence information will be useful and even required in the diagnosis, treatment and surveillance of many significant pathogens. Because generating detailed information about pathogens leads to significantly larger amounts of data, it is necessary to develop automated analysis methods to reduce analysis time and to standardize ident...

Journal:
  • Bioinformatics

Volume 21 Suppl 2  Issue 

Pages  -

Publication date 2005