Cleaning genotype data.

Author

  • K W Broman
Abstract

The identification of genes contributing to variation in complex phenotypes requires genetic data of high fidelity. Thus, the identification of pedigree and genotyping errors is a crucial prerequisite to the analysis of data from a genome scan for disease genes. The problem has been given little attention in most gene hunting papers; the focus has often been on eliminating Mendelian inconsistencies so that the analysis may proceed, rather than on achieving the best possible data. Though a number of computer programs are available to assist in the identification of genotyping and pedigree errors, the process is still not completely automated. While the Collaborative Study on the Genetics of Alcoholism (COGA) data set for GAW11 is completely compatible with Mendel's rules, there are still some errors present. We inspected the COGA data for the presence of additional errors, and identified five possible pedigree errors.
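
As an illustration of the kind of Mendelian consistency check the abstract refers to, here is a minimal sketch for a single father-mother-child trio. It is not the author's method: the function names, the genotype encoding (a pair of allele labels) and the marker names are all assumptions made for this example, and missing genotypes are not handled.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Return True if the child's genotype can be formed from one paternal
    and one maternal allele.

    Each genotype is a pair of allele labels, e.g. (1, 2); order within a
    pair is ignored. Hypothetical helper, for illustration only.
    """
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((paternal, maternal))) == child_sorted
        for paternal, maternal in product(father, mother)
    )


def find_inconsistent_markers(trio_genotypes):
    """Return the markers at which a trio violates Mendel's rules.

    trio_genotypes maps a marker name to (father, mother, child) genotypes.
    """
    return [
        marker
        for marker, (father, mother, child) in trio_genotypes.items()
        if not is_mendelian_consistent(father, mother, child)
    ]


# Made-up genotypes at two made-up markers.
example = {
    "marker_A": ((1, 2), (3, 4), (1, 3)),  # child could inherit 1 and 3
    "marker_B": ((1, 1), (2, 2), (1, 1)),  # child has no maternal allele
}
inconsistent = find_inconsistent_markers(example)
```

As the abstract notes for the COGA data, passing a check of this kind does not mean the data are error-free; it only rules out errors that produce a visible Mendelian inconsistency.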

Similar resources

Rules for resolving Mendelian inconsistencies in nuclear pedigrees typed for two-allele markers

Gene-mapping studies regularly rely on examination for Mendelian transmission of marker alleles in a pedigree as a way of screening for genotyping errors and mutations. For analysis of family data sets, it is usually necessary to resolve or remove the genotyping errors prior to consideration. At the Center for Inherited Disease Research (CIDR), to deal with their large-scale data flow, they ...

reGenotyper: Detecting mislabeled samples in genetic data

In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recover the "ideal" genotype and identify "best-matched" labels for mislabeled samples. On average, we...

Phenotypes and genotypes of Campylobacter strains isolated after cleaning and disinfection in poultry slaughterhouses.

Campylobacter is responsible for human bacterial enteritis and poultry meat is recognised as a primary source of infection. In slaughterhouses, cleaning and disinfection procedures are performed daily, and it has been suggested that disinfectant molecules might select for antibiotic resistant strains if shared targets or combined resistance mechanisms were involved. The aim of the study was to ...

Research and Realization of the Extensible Data Cleaning Framework

This paper proposes an extensible data cleaning framework based on the key technology of data cleaning; the framework includes an open rules library and an algorithms library. The paper describes the model principle and working process of the framework, and the validity of the framework is verified by experiment. When the dat...

A Unified Framework and Sequential Data Cleaning Approach for a Data Warehouse

Data cleaning is the process of identifying and removing errors in the data warehouse. It is very important in the data mining process. Most organizations need quality data, and the quality of the data in the data warehouse needs to be improved before the mining process. The framework available for data cleaning offers the fundamental services for data cleaning s...

A Framework for Data Cleaning in Data Warehouses

It is a persistent challenge to achieve a high quality of data in data warehouses, and data cleaning is a crucial task in meeting it. To deal with this challenge, a set of methods and tools has been developed. However, there are still at least two questions that need to be answered: How to improve the efficiency while performing data cleaning? How to improve the degree of automation when perfor...

Journal title:
  • Genetic epidemiology

Volume 17 Suppl 1  Issue 

Pages  -

Publication date 1999