Master's candidate in Biostatistics Ella Chrenka will present:
“Performance of Imputation in Data with Planned Missingness: Comparison of Single and Multiple Imputation Methods”
Plan B Adviser: Ashley Petersen
Abstract: This study evaluates how well single and multiple imputation methods perform when used to impute 80% of the values of a predictor of interest. Motivated by ongoing research by Shannon Cigan, we simulated data sets containing fully observed predictors (age, gender, and race), urinary cadmium with 80% of its values missing, and the outcome of interest, the development of lung cancer. The simulations varied the structure of association between the predictors and the outcome as well as the type of planned missingness. The accuracy of the estimated effect of urinary cadmium on the odds of developing lung cancer was assessed, using measures of bias, coverage, and precision, for data sets completed by both single and multiple imputation methods. Results showed that single imputation led to biased estimates, especially when confounding was present and not adjusted for. Estimates from data sets completed by single imputation also underestimated the variance. Multiple imputation produced unbiased results when the outcome was included as a predictor in the imputation process. Variance estimates were efficient for the effects of age, gender, and race on the outcome, but were inflated for the coefficient describing urinary cadmium’s effect on the odds of developing lung cancer. Overall, multiple imputation outperformed single imputation for all variable structures. Including the outcome in the imputation model was shown to be vital for the accuracy of multiple imputation.
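For readers unfamiliar with the variance issue mentioned in the abstract, the following is a minimal illustrative sketch, not the study's actual simulation. It assumes mean imputation as the single imputation method (the abstract does not say which method was used), and the sample size, distribution, and variable names are hypothetical. It shows only that filling 80% of a variable with one constant shrinks its apparent variance.

```python
import random
import statistics

random.seed(0)

# Hypothetical "urinary cadmium" values; the normal distribution and
# its parameters are assumptions made purely for illustration.
n = 1000
cadmium = [random.gauss(1.0, 0.5) for _ in range(n)]

# Planned missingness: keep a random 20% of values, treat the other 80% as missing.
observed_idx = set(random.sample(range(n), k=n // 5))
observed = [cadmium[i] for i in sorted(observed_idx)]

# Single (mean) imputation: every missing value is replaced by the observed mean.
obs_mean = statistics.fmean(observed)
completed = [cadmium[i] if i in observed_idx else obs_mean for i in range(n)]

# Sample variance before and after imputation.
var_observed = statistics.variance(observed)
var_completed = statistics.variance(completed)
print(f"variance of observed values:    {var_observed:.3f}")
print(f"variance after mean imputation: {var_completed:.3f}")
```

Because the imputed values add no spread, the completed-data variance here is roughly one fifth of the observed-data variance. Multiple imputation addresses this by drawing several plausible values for each missing entry and combining the resulting estimates.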
Refreshments will be served prior to the presentation.