Multiple imputation and other resampling schemes for imputing missing observations
Authors
Abstract
The problem of imputing missing observations under the linear regression model is considered. It is assumed that observations are missing at random and that all observations on the auxiliary (independent) variables are available. Estimates of the regression parameters based on singly and multiply imputed values are given, together with jackknife and bootstrap estimates of the variance of the singly imputed estimator of the regression parameters. These variance estimators are shown to be consistent. The asymptotic distributions of the imputed estimators are also given and are used to obtain interval estimates of the parameters of interest. These interval estimates are then compared with the interval estimates obtained from multiple imputation. It is shown that singly imputed estimators perform at least as well as multiply imputed estimators. A new nonparametric multiply imputed estimator is proposed and shown to perform as well as a multiply imputed estimator under normality. The singly imputed estimator, however, still remains at least as good as a multiply imputed estimator. © 2009 Elsevier Inc. All rights reserved.
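As a rough illustration of the setting described in the abstract, the following Python sketch imputes missing responses from a regression fitted to the observed cases, refits on the completed data, and estimates the variance of the resulting coefficients with a bootstrap that repeats the imputation inside every resample. It is only a sketch under stated assumptions and not the paper's estimators: the function names (`ols`, `singly_imputed_beta`, `bootstrap_variance`), the simulated data, the use of deterministic fitted-value imputation, and the pairs bootstrap are all choices made here for illustration.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def singly_imputed_beta(X, y, observed):
    """Fit on observed rows, fill missing y with fitted values, refit on all rows.

    Assumption for this sketch: imputation is deterministic (fitted values only).
    """
    beta_obs = ols(X[observed], y[observed])
    y_filled = np.where(observed, y, X @ beta_obs)
    return ols(X, y_filled)


def bootstrap_variance(X, y, observed, n_boot=500, seed=0):
    """Pairs bootstrap: resample rows, then redo the imputation in each replicate."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[b] = singly_imputed_beta(X[idx], y[idx], observed[idx])
    return draws.var(axis=0, ddof=1)


# Simulated example (hypothetical data): y = 1 + 2x + noise, about 30% of y missing.
rng = np.random.default_rng(42)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])       # covariates are fully observed
y = 1.0 + 2.0 * x + rng.normal(size=n)
observed = rng.random(n) < 0.7             # completely at random, a special case of MAR
y_nan = np.where(observed, y, np.nan)      # missing responses stored as NaN

beta_hat = singly_imputed_beta(X, y_nan, observed)
var_hat = bootstrap_variance(X, y_nan, observed)
print("coefficients:", beta_hat)
print("bootstrap variances:", var_hat)
```

With fitted-value imputation and fully observed covariates, the refitted coefficients coincide with the fit on the observed cases alone, so the point of the exercise is the variance estimate: treating the imputed values as real observations would understate it, which is why the imputation step is repeated inside each bootstrap replicate.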
Similar resources
Multiple imputation for IPD meta‐analysis: allowing for heterogeneity and studies with missing covariates
Recently, multiple imputation has been proposed as a tool for individual patient data meta-analysis with sporadically missing observations, and it has been suggested that within-study imputation is usually preferable. However, such within study imputation cannot handle variables that are completely missing within studies. Further, if some of the contributing studies are relatively small, it may...
Multiple imputation for interval censored data with auxiliary variables.
We propose a non-parametric multiple imputation scheme, NPMLE imputation, for the analysis of interval censored survival data. Features of the method are that it converts interval-censored data problems to complete data or right censored data problems to which many standard approaches can be used, and that measures of uncertainty are easily obtained. In addition to the event time of primary int...
Simple imputation methods versus direct likelihood analysis for missing item scores in multilevel educational data.
Missing data, such as item responses in multilevel data, are ubiquitous in educational research settings. Researchers in the item response theory (IRT) context have shown that ignoring such missing data can create problems in the estimation of the IRT model parameters. Consequently, several imputation methods for dealing with missing item data have been proposed and shown to be effective when a...
Selection of Variables that Influence Drug Injection in Prison: Comparison of Methods with Multiple Imputed Data Sets
Background: Prisoners, compared to the general population, are at greater risk of infection. Drug injection is the main route of HIV transmission, in particular in Iran. What would be of interest is to determine variables that govern drug injection among prisoners. However, one of the issues that challenge model building is incomplete national data sets. In this paper, we addressed the process ...
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Imputing missing data in multivariate time series is a challenging topic and needs to be carefully considered before learning from or predicting time series. Much research has been done on the use of diffe...
Journal title:
- J. Multivariate Analysis
Volume 100, Issue
Pages -
Publication date 2009