imputation

Biases and Variances of Survey Estimators Based on Nearest Neighbor Imputation

2007

Jiahua Chen Jun Shao

Jiahua Chen (University of Waterloo), Jun Shao (University of Wisconsin-Madison). Abstract: Nearest neighbor imputation is one of the hot deck methods used to compensate for nonresponse in sample surveys. Although it has a long history of application, theoretical properties of the nearest neighbor imputation method are unknown prior to the current paper. We show that unde...
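As a rough illustration of the hot deck idea named in this abstract, the sketch below fills each missing response with the value from the respondent whose auxiliary variable is closest. It is a minimal sketch under stated assumptions, not the estimator analysed in the paper: the function name `nn_impute`, the single numeric auxiliary variable `x`, the use of `None` for nonresponse, and the toy data are all hypothetical.

```python
def nn_impute(x, y):
    """Fill each missing y (None) with the y of the respondent whose x is closest.

    x: auxiliary values observed for every unit (assumed numeric).
    y: responses, with None marking nonresponse (an assumed encoding).
    """
    # Donors are the units that actually responded.
    donors = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    if not donors:
        raise ValueError("no respondents available to donate values")

    completed = []
    for xi, yi in zip(x, y):
        if yi is None:
            # min() keeps the first donor on ties, an arbitrary choice here.
            _, yi = min(donors, key=lambda d: abs(d[0] - xi))
        completed.append(yi)
    return completed


# Hypothetical toy data: units 2 and 4 did not respond.
x = [1.0, 2.2, 3.0, 4.5, 10.0]
y = [10.0, None, 30.0, None, 100.0]
completed = nn_impute(x, y)
sample_mean = sum(completed) / len(completed)
```

In this toy data both nonrespondents borrow from the unit with `x = 3.0`, so one donor value appears three times in the completed data. Reusing donors like this is one reason the bias and variance of estimators built on imputed data need separate study.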

A Robust Missing Value Imputation Method MifImpute For Incomplete Molecular Descriptor Data And Comparative Analysis With Other Missing Value Imputation Methods

Journal: CoRR 2013

Doreswamy Chanabasayya M. Vastrad

Missing data imputation is an important research topic in data mining. Large-scale molecular descriptor data may contain missing values (MVs). However, some methods for downstream analyses, including some prediction tools, require a complete descriptor data matrix. We propose and evaluate an iterative imputation method MiFoImpute based on a random forest. By averaging over many unpruned regres...

Influence of pattern of missing data on performance of imputation methods: an example from national data on drug injection in prisons

Journal: International Journal of Health Policy and Management 2013

Saiedeh Haji-Maghsoudi Ali-Akbar Haghdoost Azam Rastegari Mohammad Reza Baneshi

Background: Policy makers need models to be able to detect groups at high risk of HIV infection. Incomplete records and dirty data are frequently seen in national data sets. Presence of missing data challenges the practice of model development. Several studies suggested that performance of imputation methods is acceptable when missing rate is moderate. One of the issues which was of less concern...

Multiple imputation of covariates by fully conditional specification: Accommodating the substantive model

2015

Jonathan W Bartlett Shaun R Seaman Ian R White James R Carpenter Michael G Kenward

Missing covariate data commonly occur in epidemiological and clinical research, and are often dealt with using multiple imputation. Imputation of partially observed covariates is complicated if the substantive model is non-linear (e.g. Cox proportional hazards model), or contains non-linear (e.g. squared) or interaction terms, and standard software implementations of multiple imputation may imp...

Survival analysis using auxiliary variables via non-parametric multiple imputation.

Journal: Statistics in Medicine 2006

Chiu-Hsieh Hsu Jeremy M G Taylor Susan Murray Daniel Commenges

We develop an approach, based on multiple imputation, that estimates the marginal survival distribution in survival analysis using auxiliary variables to recover information for censored observations. To conduct the imputation, we use two working survival models to define a nearest neighbour imputing risk set. One model is for the event times and the other for the censoring times. Based on the ...

Systematic assessment of imputation performance using the 1000 Genomes reference panels

Journal: Briefings in Bioinformatics 2015

Qian Liu Elizabeth T. Cirulli Yujun Han Song Yao Song Liu Qianqian Zhu

Genotype imputation has been widely adopted in the post-genome-wide association studies (GWAS) era. Owing to its ability to accurately predict the genotypes of untyped variants, imputation greatly boosts variant density, allowing fine-mapping studies of GWAS loci and large-scale meta-analysis across different genotyping arrays. By leveraging genotype data from 90 whole-genome deeply sequenced in...

Multiple Imputation of Missing Composite Outcomes in Longitudinal Data

2016

Aidan G. O’Keeffe Daniel M. Farewell Brian D. M. Tom Vernon T. Farewell

In longitudinal randomised trials and observational studies within a medical context, a composite outcome, which is a function of several individual patient-specific outcomes, may be felt to best represent the outcome of interest. As in other contexts, missing data on patient outcome, due to patient drop-out or for other reasons, may pose a problem. Multiple imputation is a widely used method for...

Dealing with missing values in large-scale studies: microarray data imputation and beyond

Journal: Briefings in Bioinformatics 2010

Tero Aittokallio

High-throughput biotechnologies, such as gene expression microarrays or mass-spectrometry-based proteomic assays, suffer from frequent missing values due to various experimental reasons. Since the missing data points can hinder downstream analyses, there exists a wide variety of ways in which to deal with missing values in large-scale data sets. Nowadays, it has become routine to estimate (or i...

Simple imputation methods were inadequate for missing not at random (MNAR) quality of life data

Journal: Health and Quality of Life Outcomes 2008

Shona Fielding Peter M Fayers Alison McDonald Gladys McPherson Marion K Campbell

OBJECTIVE: QoL data were routinely collected in a randomised controlled trial (RCT), which employed a reminder system, retrieving about 50% of data originally missing. The objective was to use this unique feature to evaluate possible missingness mechanisms and to assess the accuracy of simple imputation methods. METHODS: Those patients responding after reminder were regarded as providing missin...

Comparison of methods of handling missing data in individual patient data meta-analyses: an empirical example on antibiotics in children with acute otitis media.

Journal: American Journal of Epidemiology 2008

Laura Koopman Geert J M G van der Heijden Diederick E Grobbee Maroeska M Rovers

What is the influence of various methods of handling missing data (complete case analyses, single imputation within and over trials, and multiple imputations within and over trials) on the subgroup effects of individual patient data meta-analyses? An empirical data set was used to compare these five methods regarding the subgroup results. Logistic regression analyses were used to determine inte...
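To make the contrast between these approaches concrete, the sketch below shows complete case analysis next to single imputation "within" and "over" trials. It is only an illustration under stated assumptions: plain mean imputation stands in for whatever single-imputation model the authors used, and the trial labels, values, and function names are hypothetical.

```python
from statistics import mean

# Hypothetical pooled records: (trial label, covariate value or None if missing).
records = [
    ("trial_a", 1.0), ("trial_a", 3.0), ("trial_a", None),
    ("trial_b", 10.0), ("trial_b", None), ("trial_b", 14.0),
]


def complete_cases(rows):
    """Complete case analysis: drop every record with a missing value."""
    return [(t, v) for t, v in rows if v is not None]


def impute_over_trials(rows):
    """Single (mean) imputation over trials: one fill value from the pooled data."""
    pooled = mean(v for _, v in rows if v is not None)
    return [(t, pooled if v is None else v) for t, v in rows]


def impute_within_trials(rows):
    """Single (mean) imputation within trials: one fill value per trial.

    Assumes every trial has at least one observed value.
    """
    observed = {}
    for t, v in rows:
        if v is not None:
            observed.setdefault(t, []).append(v)
    trial_means = {t: mean(vs) for t, vs in observed.items()}
    return [(t, trial_means[t] if v is None else v) for t, v in rows]


cc = complete_cases(records)
over = impute_over_trials(records)
within = impute_within_trials(records)
```

With these toy values the missing entry in `trial_a` becomes 7.0 when imputed over trials but 2.0 when imputed within its own trial. That gap shows how the choice can shift trial-level and subgroup estimates when trials differ systematically.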
