Hierarchical imputation of systematically and sporadically missing data: An approximate Bayesian approach using chained equations

Shahab Jolani*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

6 Citations (Web of Science)

Abstract

In health and medical sciences, multiple imputation (MI) is becoming a popular way to obtain valid inferences in the presence of missing data. However, MI of clustered data, such as multicenter studies and individual participant data meta-analyses, requires advanced imputation routines that preserve the hierarchical structure of the data. In clustered data, a specific challenge is the presence of systematically missing data, where a variable is completely missing in some clusters, and sporadically missing data, where it is partly missing in some clusters. Unfortunately, little is known about how to perform MI when both types of missing data occur simultaneously. We develop a new class of hierarchical imputation approaches based on chained equations methodology that simultaneously imputes systematically and sporadically missing data while allowing for arbitrary patterns of missingness among them. Here, we use a random effect imputation model and adopt a simplification over fully Bayesian techniques such as the Gibbs sampler to directly obtain draws of parameters within each step of the chained equations. We justify through theoretical arguments and extensive simulation studies that the proposed imputation methodology has good statistical properties in terms of bias and coverage rates of parameter estimates. An illustration is given in a case study with eight individual participant datasets.
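To make the two missingness types defined in the abstract concrete, the following is a minimal sketch that labels each cluster for a single variable. It is not the paper's imputation method; the function name `classify_missingness`, the toy study labels, and the `bmi` values are hypothetical and used only for illustration.

```python
from collections import defaultdict


def classify_missingness(clusters, values):
    """Label each cluster as 'complete', 'sporadic', or 'systematic' for one variable.

    clusters: cluster identifier for each observation
    values:   value for each observation, with None marking a missing entry
    """
    counts = defaultdict(lambda: [0, 0])  # cluster -> [n_missing, n_total]
    for cluster, value in zip(clusters, values):
        counts[cluster][1] += 1
        if value is None:
            counts[cluster][0] += 1

    labels = {}
    for cluster, (n_missing, n_total) in counts.items():
        if n_missing == 0:
            labels[cluster] = "complete"    # variable fully observed in this cluster
        elif n_missing == n_total:
            labels[cluster] = "systematic"  # variable entirely missing in this cluster
        else:
            labels[cluster] = "sporadic"    # variable partly missing in this cluster
    return labels


# Toy data (hypothetical): three studies, one variable.
study = ["A", "A", "A", "B", "B", "C", "C", "C"]
bmi = [22.1, None, 27.4, None, None, 24.0, 25.3, 30.2]
labels = classify_missingness(study, bmi)
print(labels)
```

In this toy example, study A would be flagged as sporadic and study B as systematic, which is the mixed situation the abstract says the proposed approach is designed to handle.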
Original language: English
Pages (from-to): 333-351
Number of pages: 19
Journal: Biometrical Journal
Volume: 60
Issue number: 2
Publication status: Published - 1 Mar 2018

Keywords

  • conditional imputation
  • multilevel imputation
  • multiple imputation by chained equations (MICE)
  • sequential regression imputation
  • INDIVIDUAL PARTICIPANT DATA
  • CLUSTER RANDOMIZED-TRIALS
  • MULTIPLE IMPUTATION
  • DATA METAANALYSIS
  • STRATEGIES
  • MODELS
