Developing risk models for multicenter data using standard logistic regression produced suboptimal predictions: A simulation study

Nora Falconieri; Ben Van Calster; Dirk Timmerman; Laure Wynants

doi:10.1002/bimj.201900075

Corresponding author: Laure Wynants

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Although multicenter data are common, many prediction model studies ignore this during model development. The objective of this study is to evaluate the predictive performance of regression methods for developing clinical risk prediction models using multicenter data, and provide guidelines for practice. We compared the predictive performance of standard logistic regression, generalized estimating equations, random intercept logistic regression, and fixed effects logistic regression. First, we presented a case study on the diagnosis of ovarian cancer. Subsequently, a simulation study investigated the performance of the different models as a function of the amount of clustering, development sample size, distribution of center-specific intercepts, the presence of a center-predictor interaction, and the presence of a dependency between center effects and predictors. The results showed that when sample sizes were sufficiently large, conditional models yielded calibrated predictions, whereas marginal models yielded miscalibrated predictions. Small sample sizes led to overfitting and unreliable predictions. This miscalibration was worse with more heavily clustered data. Calibration of random intercept logistic regression was better than that of standard logistic regression even when center-specific intercepts were not normally distributed, a center-predictor interaction was present, center effects and predictors were dependent, or when the model was applied in a new center. Therefore, to make reliable predictions in a specific center, we recommend random intercept logistic regression.

Original language	English
Pages (from-to)	932-944
Number of pages	13
Journal	Biometrical Journal
Volume	62
Issue number	4
DOIs	https://doi.org/10.1002/bimj.201900075
Publication status	Published - Jul 2020

Keywords

calibration
discrimination
multicenter
random effects
risk prediction model
OVARIAN-CANCER
VALIDATION
TRIALS

Access to Document

10.1002/bimj.201900075 (Licence: CC BY-NC)

Cite this

@article{48c051f2e9924f36a6f65f53fe9aac09,

title = "Developing risk models for multicenter data using standard logistic regression produced suboptimal predictions: A simulation study",

abstract = "Although multicenter data are common, many prediction model studies ignore this during model development. The objective of this study is to evaluate the predictive performance of regression methods for developing clinical risk prediction models using multicenter data, and provide guidelines for practice. We compared the predictive performance of standard logistic regression, generalized estimating equations, random intercept logistic regression, and fixed effects logistic regression. First, we presented a case study on the diagnosis of ovarian cancer. Subsequently, a simulation study investigated the performance of the different models as a function of the amount of clustering, development sample size, distribution of center-specific intercepts, the presence of a center-predictor interaction, and the presence of a dependency between center effects and predictors. The results showed that when sample sizes were sufficiently large, conditional models yielded calibrated predictions, whereas marginal models yielded miscalibrated predictions. Small sample sizes led to overfitting and unreliable predictions. This miscalibration was worse with more heavily clustered data. Calibration of random intercept logistic regression was better than that of standard logistic regression even when center-specific intercepts were not normally distributed, a center-predictor interaction was present, center effects and predictors were dependent, or when the model was applied in a new center. Therefore, to make reliable predictions in a specific center, we recommend random intercept logistic regression.",

keywords = "calibration, discrimination, multicenter, random effects, risk prediction model, OVARIAN-CANCER, VALIDATION, TRIALS",

author = "Falconieri, Nora and {Van Calster}, Ben and Timmerman, Dirk and Wynants, Laure",

note = "Publisher Copyright: {\textcopyright} 2020 The Authors. Biometrical Journal published by WILEY-VCH Verlag GmbH \& Co. KGaA, Weinheim.",

year = "2020",

month = jul,

doi = "10.1002/bimj.201900075",

language = "English",

volume = "62",

pages = "932--944",

journal = "Biometrical Journal",

issn = "0323-3847",

publisher = "Wiley",

number = "4",

}

TY - JOUR

T1 - Developing risk models for multicenter data using standard logistic regression produced suboptimal predictions

T2 - A simulation study

AU - Falconieri, Nora

AU - Van Calster, Ben

AU - Timmerman, Dirk

AU - Wynants, Laure

PY - 2020/7

Y1 - 2020/7

N2 - Although multicenter data are common, many prediction model studies ignore this during model development. The objective of this study is to evaluate the predictive performance of regression methods for developing clinical risk prediction models using multicenter data, and provide guidelines for practice. We compared the predictive performance of standard logistic regression, generalized estimating equations, random intercept logistic regression, and fixed effects logistic regression. First, we presented a case study on the diagnosis of ovarian cancer. Subsequently, a simulation study investigated the performance of the different models as a function of the amount of clustering, development sample size, distribution of center-specific intercepts, the presence of a center-predictor interaction, and the presence of a dependency between center effects and predictors. The results showed that when sample sizes were sufficiently large, conditional models yielded calibrated predictions, whereas marginal models yielded miscalibrated predictions. Small sample sizes led to overfitting and unreliable predictions. This miscalibration was worse with more heavily clustered data. Calibration of random intercept logistic regression was better than that of standard logistic regression even when center-specific intercepts were not normally distributed, a center-predictor interaction was present, center effects and predictors were dependent, or when the model was applied in a new center. Therefore, to make reliable predictions in a specific center, we recommend random intercept logistic regression.

KW - calibration

KW - discrimination

KW - multicenter

KW - random effects

KW - risk prediction model

KW - OVARIAN-CANCER

KW - VALIDATION

KW - TRIALS

U2 - 10.1002/bimj.201900075

DO - 10.1002/bimj.201900075

M3 - Article

C2 - 31957077

SN - 0323-3847

VL - 62

SP - 932

EP - 944

JO - Biometrical Journal

JF - Biometrical Journal

IS - 4

ER -