Incomplete Multilevel Data: Problems and solutions

J. Hox*, S. van Buuren, Shahab Jolani

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Chapter > Academic


Incomplete data are common in empirical research. The default solutions in software packages are simplistic: usually listwise deletion, in which a case with a missing value on any variable is removed from the analysis entirely. In multilevel data, missing values at the group level can be a serious problem. For example, when a teacher has a missing value on a single variable, listwise deletion removes the teacher together with the entire corresponding class. Listwise deletion is clearly very inefficient. More importantly, any deletion scheme assumes that the remaining cases are representative of the entire original sample, that is, that the data are missing completely at random. This is a very strong assumption that is unlikely to hold in real-world data. Modern solutions to incomplete data are full information maximum likelihood (FIML) estimation, which includes the incomplete cases in the estimation, and multiple imputation (MI). The problem with FIML is that most available multilevel analysis software does not offer it. The problem with MI is that the imputations must themselves be generated with a multilevel procedure. This presentation discusses missingness mechanisms, introduces the FIML and MI approaches, and shows how these can be used with currently available software.
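
The teacher example in the abstract can be illustrated with a minimal sketch. The data below are invented for illustration only: the variable names (class_id, pupil_score, teacher_exp) and all values are assumptions, not taken from the chapter. The sketch stores one row per pupil with the teacher-level variable repeated on each row, applies plain listwise deletion, and reports which classes vanish from the analysis.

```python
# Hypothetical toy data: one row per pupil; the teacher-level variable
# (teacher_exp) is repeated on every pupil row of that teacher's class.
pupils = [
    {"class_id": 1, "pupil_score": 6.1, "teacher_exp": 12},
    {"class_id": 1, "pupil_score": 5.4, "teacher_exp": 12},
    {"class_id": 2, "pupil_score": 7.0, "teacher_exp": None},  # teacher value missing
    {"class_id": 2, "pupil_score": 6.3, "teacher_exp": None},
    {"class_id": 2, "pupil_score": 5.9, "teacher_exp": None},
    {"class_id": 3, "pupil_score": None, "teacher_exp": 4},    # pupil value missing
    {"class_id": 3, "pupil_score": 6.8, "teacher_exp": 4},
]


def listwise_delete(rows):
    """Keep only rows in which no variable is missing."""
    return [r for r in rows if all(v is not None for v in r.values())]


complete = listwise_delete(pupils)
classes_before = {r["class_id"] for r in pupils}
classes_after = {r["class_id"] for r in complete}

print(len(pupils), "pupils before,", len(complete), "after")
print("classes lost entirely:", sorted(classes_before - classes_after))
```

In this toy case a single missing teacher value removes all three pupils of class 2, whereas a missing pupil value in class 3 removes only that one pupil.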
Original language: English
Title of host publication: Advances in multilevel modeling for educational research: addressing practical issues found in real-world applications
Editors: J.R. Harring, L.M. Stapleton, S.N. Beretvas
Place of Publication: Charlotte, NC
Publisher: Information Age Publishing Inc.
ISBN (Print): 978-1681233284
Publication status: Published - 2015

Publication series

Series: CILVR Series on Latent Variable Methodology
