Optimal methods for using posterior probabilities in association testing.
Abstract
OBJECTIVE: The use of haplotypes to impute the genotypes of unmeasured single nucleotide variants continues to rise in popularity. Simulation results suggest that the use of the dosage as a one-dimensional summary statistic of imputation posterior probabilities may be optimal both in terms of statistical power and computational efficiency; however, little theoretical understanding is available to explain and unify these simulation results. In our analysis, we provide a theoretical foundation for the use of the dosage as a one-dimensional summary statistic of genotype posterior probabilities from any technology.
METHODS: We analytically evaluate the dosage, the mode, and the more general set of all one-dimensional summary statistics of two-dimensional (three posterior probabilities that must sum to 1) genotype posterior probability vectors.
RESULTS: We prove that the dosage is an optimal one-dimensional summary statistic under a typical linear disease model and is robust to violations of this model. Simulation results confirm our theoretical findings.
CONCLUSIONS: Our analysis provides a strong theoretical basis for the use of the dosage as a one-dimensional summary statistic of genotype posterior probability vectors in related tests of genetic association across a wide variety of genetic disease models.
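The abstract names the dosage and the mode without defining them. As a minimal sketch, assuming the conventional definitions (dosage as the expected count of the coded allele, mode as the most probable genotype), the two summaries of a posterior vector (p0, p1, p2) could be computed as follows. The function names and the tolerance are illustrative assumptions and do not come from the paper.

```python
import math
from typing import Sequence


def _check_posterior(posterior: Sequence[float], tol: float = 1e-6) -> None:
    """Validate a genotype posterior vector (p0, p1, p2)."""
    if len(posterior) != 3:
        raise ValueError("expected three posterior probabilities")
    if any(p < 0.0 for p in posterior):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(posterior), 1.0, abs_tol=tol):
        raise ValueError("probabilities must sum to 1")


def dosage(posterior: Sequence[float]) -> float:
    """Expected number of coded alleles: 0*p0 + 1*p1 + 2*p2."""
    _check_posterior(posterior)
    _, p1, p2 = posterior
    return p1 + 2.0 * p2


def mode(posterior: Sequence[float]) -> int:
    """Most probable genotype (0, 1 or 2); ties go to the lowest genotype."""
    _check_posterior(posterior)
    return max(range(3), key=lambda g: posterior[g])


if __name__ == "__main__":
    example = (0.1, 0.3, 0.6)  # hypothetical posterior for one individual
    print("dosage:", dosage(example))
    print("mode:", mode(example))
```

Because the three probabilities sum to 1, the vector has only two free dimensions. The dosage collapses it to a continuous value in [0, 2], whereas the mode discards the uncertainty and keeps only a hard genotype call.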
Similar resources
Association Studies with Imputed Variants Using Expectation-Maximization Likelihood-Ratio Tests
Genotype imputation has become standard practice in modern genetic studies. As sequencing-based reference panels continue to grow, increasingly more markers are being well or better imputed but at the same time, even more markers with relatively low minor allele frequency are being imputed with low imputation quality. Here, we propose new methods that incorporate imputation uncertainty for down...
Quantifying evidence for candidate gene polymorphisms: Bayesian analysis combining sequence-specific and quantitative trait loci colocation information.
We calculate posterior probabilities for candidate genes as a function of genomic location. Posterior probabilities for quantitative trait loci (QTL) presence in a small interval are calculated using a Bayesian model-selection approach based on the Bayesian information criterion (BIC) and used to combine QTL colocation information with sequence-specific evidence, e.g., from differential express...
Objective Testing Procedures in Linear Models: Calibration of the p-values
An optimal Bayesian decision procedure for testing hypothesis in normal linear models based on intrinsic model posterior probabilities is considered. It is proven that these posterior probabilities are simple functions of the classical F-statistic, thus the evaluation of the procedure can be carried out analytically through the frequentist analysis of the posterior probability of the null. An a...
Max-Min Posterior Pseudo-Probabilities Estimation of Posterior Class Probabilities to Maximize Class Separability
The estimation of the posterior class probabilities is desirable for optimal decision, decision confidence measure, and accurate performance evaluation of a classifier. In this paper, we address the problem of estimating posterior class probabilities by learning from samples for producing optimal classifiers. We introduce a posterior pseudo-probability function based on Bayes’ formula to transf...
Learning Optimal Threshold on Resampling Data to Deal with Class Imbalance
Class imbalance is one of the challenging problems for machine learning algorithms. When learning from highly imbalanced data, most classifiers are overwhelmed by the majority class examples, thus, their performance usually degrades. Many papers have been introduced to tackle this problem including methods for pre-processing, internal classifier processing, and post-processing – which mainly re...
Journal title:
- Human Heredity
Volume 75, Issue 1
Pages -
Publication date: 2013