Handling missing values in the analysis of between-hospital differences in ordinal and dichotomous outcomes: a simulation study

Reinier C A van Linschoten; Marzyeh Amini; Nikki van Leeuwen; Frank Eijkenaar; Sanne J den Hartog; Paul J Nederkoorn; Jeannette Hofmeijer; Bart J Emmer; Alida A Postma; Wim van Zwam; Bob Roozenbeek; Diederik Dippel; Hester F Lingsma; MR CLEAN Registry Investigators

doi:10.1136/bmjqs-2023-016387

Corresponding author: Reinier C A van Linschoten

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Missing data are frequently encountered in registries that are used to compare performance across hospitals. The most appropriate method for handling missing data when analysing differences in outcomes between hospitals with a generalised linear mixed model is unclear. We aimed to compare methods for handling missing data when comparing hospitals on ordinal and dichotomous outcomes. We performed a simulation study using data from the Multicentre Randomised Controlled Trial of Endovascular Treatment for Acute Ischaemic Stroke in the Netherlands (MR CLEAN) Registry, a prospective cohort study in 17 hospitals performing endovascular therapy for ischaemic stroke in the Netherlands. The investigated methods for handling missing data, in both case-mix adjustment variables and outcomes, were complete case analysis, single imputation, multiple imputation, single imputation with deletion of imputed outcomes and multiple imputation with deletion of imputed outcomes. Data were generated as missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR) in three scenarios: (1) 10% missing data in case-mix and outcome; (2) 40% missing data in case-mix and outcome; and (3) 40% missing data in case-mix and outcome with a varying degree of missing data among hospitals. Bias and reliability of the methods were compared on the mean squared error (MSE, a summary measure combining bias and reliability) relative to the hospital effect estimates from the complete reference data set. For both the ordinal outcome (ie, the modified Rankin Scale) and a common dichotomised version thereof, all methods of handling missing data were biased, likely due to shrinkage of the random effects. The MSE of all methods was on average lowest under MCAR and with fewer missing data, and highest with more missing data and under MNAR. The 'multiple imputation, then deletion' method had the lowest MSE for both outcomes under all simulated patterns of missing data. Thus, when estimating hospital effects on ordinal and dichotomous outcomes in the presence of missing data, the least biased and most reliable method to handle these missing data is 'multiple imputation, then deletion'.
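
The abstract does not give the simulation code, so the following is only a minimal illustrative sketch of two ideas it relies on: how the three missingness mechanisms differ, and how an MSE against a reference estimate splits into squared bias plus variance (read here as the "reliability" part). All names (`make_missing`, `mse_parts`, the toy `mrs` and `age` variables), the logistic missingness model and its slope of 1 are assumptions made for illustration, not the authors' implementation.

```python
import math
import random
from statistics import mean, pstdev


def standardise(values):
    """Centre and scale a list of numbers to mean 0 and SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def make_missing(outcome, covariate, mechanism, rate, rng):
    """Return a copy of `outcome` with some entries replaced by None.

    MCAR: missingness probability is the same for every record.
    MAR:  probability depends on an observed covariate.
    MNAR: probability depends on the (possibly unobserved) outcome itself.
    `rate` is the missingness probability for an average record, so the
    realised fraction is only approximately `rate` under MAR and MNAR.
    """
    if mechanism == "MCAR":
        driver = [0.0] * len(outcome)
    elif mechanism == "MAR":
        driver = standardise(covariate)
    elif mechanism == "MNAR":
        driver = standardise(outcome)
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    intercept = math.log(rate / (1 - rate))  # logit of the target rate
    result = []
    for y, d in zip(outcome, driver):
        p_missing = 1 / (1 + math.exp(-(intercept + d)))  # assumed slope of 1
        result.append(None if rng.random() < p_missing else y)
    return result


def mse_parts(estimates, reference):
    """MSE of repeated estimates of one hospital effect against a reference.

    Returns (mse, squared bias, variance); mse equals the sum of the other two.
    """
    bias = mean(estimates) - reference
    variance = pstdev(estimates) ** 2
    mse = mean((e - reference) ** 2 for e in estimates)
    return mse, bias ** 2, variance


# Toy data: a 0-6 ordinal score and an age-like covariate (both synthetic).
rng = random.Random(2023)
mrs = [rng.randint(0, 6) for _ in range(5000)]
age = [rng.gauss(70, 12) for _ in range(5000)]
for mechanism in ("MCAR", "MAR", "MNAR"):
    observed = [y for y in make_missing(mrs, age, mechanism, 0.4, rng) if y is not None]
    print(mechanism, round(1 - len(observed) / len(mrs), 2), round(mean(observed), 2))
```

In this toy setup only MNAR shifts the mean of the observed outcomes systematically, because records with higher scores are more likely to be dropped. That is one intuition for why the abstract reports the highest MSE under MNAR.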

Original language	English
Pages (from-to)	742-749
Number of pages	8
Journal	BMJ Quality and Safety
Volume	32
Issue number	12
Early online date	21 Sept 2023
DOIs	https://doi.org/10.1136/bmjqs-2023-016387
Publication status	Published - 1 Dec 2023

Keywords

Healthcare quality improvement
Performance measures
Quality improvement
Quality improvement methodologies
Simulation

Cite this

van Linschoten, R. C. A., Amini, M., van Leeuwen, N., Eijkenaar, F., den Hartog, S. J., Nederkoorn, P. J., Hofmeijer, J., Emmer, B. J., Postma, A. A., van Zwam, W., Roozenbeek, B., Dippel, D., Lingsma, H. F., & MR CLEAN Registry Investigators (2023). Handling missing values in the analysis of between-hospital differences in ordinal and dichotomous outcomes: a simulation study. BMJ Quality and Safety, 32(12), 742-749. https://doi.org/10.1136/bmjqs-2023-016387

@article{40fd59577c254479958e925eb60301b0,

title = "Handling missing values in the analysis of between-hospital differences in ordinal and dichotomous outcomes: a simulation study",

keywords = "Healthcare quality improvement, Performance measures, Quality improvement, Quality improvement methodologies, Simulation",

author = "{van Linschoten}, {Reinier C A} and Marzyeh Amini and {van Leeuwen}, Nikki and Frank Eijkenaar and {den Hartog}, {Sanne J} and Nederkoorn, {Paul J} and Jeannette Hofmeijer and Emmer, {Bart J} and Postma, {Alida A} and {van Zwam}, Wim and Bob Roozenbeek and Diederik Dippel and Lingsma, {Hester F} and {MR CLEAN Registry Investigators}",

year = "2023",

month = dec,

day = "1",

doi = "10.1136/bmjqs-2023-016387",

language = "English",

volume = "32",

pages = "742--749",

journal = "BMJ Quality and Safety",

publisher = "BMJ Publishing Group",

number = "12",

}

van Linschoten, RCA, Amini, M, van Leeuwen, N, Eijkenaar, F, den Hartog, SJ, Nederkoorn, PJ, Hofmeijer, J, Emmer, BJ, Postma, AA, van Zwam, W, Roozenbeek, B, Dippel, D, Lingsma, HF & MR CLEAN Registry Investigators 2023, 'Handling missing values in the analysis of between-hospital differences in ordinal and dichotomous outcomes: a simulation study', BMJ Quality and Safety, vol. 32, no. 12, pp. 742-749. https://doi.org/10.1136/bmjqs-2023-016387

TY - JOUR

T1 - Handling missing values in the analysis of between-hospital differences in ordinal and dichotomous outcomes

T2 - a simulation study

AU - van Linschoten, Reinier C A

AU - Amini, Marzyeh

AU - van Leeuwen, Nikki

AU - Eijkenaar, Frank

AU - den Hartog, Sanne J

AU - Nederkoorn, Paul J

AU - Hofmeijer, Jeannette

AU - Emmer, Bart J

AU - Postma, Alida A

AU - van Zwam, Wim

AU - Roozenbeek, Bob

AU - Dippel, Diederik

AU - Lingsma, Hester F

AU - MR CLEAN Registry Investigators

PY - 2023/12/1

Y1 - 2023/12/1

KW - Healthcare quality improvement

KW - Performance measures

KW - Quality improvement

KW - Quality improvement methodologies

KW - Simulation

U2 - 10.1136/bmjqs-2023-016387

DO - 10.1136/bmjqs-2023-016387

M3 - Article

VL - 32

SP - 742

EP - 749

JO - BMJ Quality and Safety

JF - BMJ Quality and Safety

IS - 12

ER -