Predicting mortality of individual patients with COVID-19: a multicentre Dutch cohort

M.C. Ottenhoff*, L.A. Ramos, W. Potters, M.L.F. Janssen, D. Hubers, S. Hu, E.A. Fridgeirsson, D. Pina-Fuentes, R. Thomas, I.C.C. van der Horst, C. Herff, P. Kubben, P.W.G. Elbers, H.A. Marquering, M. Welling, S. Simsek, M.D. de Kruif, T. Dormans, L.M. Fleuren, M. Schinkel, P.G. Noordzij, J.P. van den Bergh, C.E. Wyers, D.T.B. Buis, W.J. Wiersinga, E.H.C. van den Hout, A.C. Reidinga, D. Rusch, K.C.E. Sigaloff, R.A. Douma, L. de Haan, N.C.G. van den Oever, R.J.M.W. Rennenberg, G.A. van Wingen, M.J.H. Aries, M. Beudel, Dutch COVID-PREDICT research group

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Objective: Develop and validate models that predict mortality of patients diagnosed with COVID-19 admitted to the hospital.

Design: Retrospective cohort study.

Setting: A multicentre cohort across 10 Dutch hospitals including patients from 27 February to 8 June 2020.

Participants: SARS-CoV-2 positive patients (age ≥18 years) admitted to the hospital.

Main outcome measures: 21-day all-cause mortality, with model performance evaluated by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value and negative predictive value. The predictive value of age was explored by comparison with age-based rules used in practice and by excluding age from the analysis.

Results: 2273 patients were included, of whom 516 had died or been discharged to palliative care within 21 days after admission. Five feature sets, including premorbid, clinical presentation, and laboratory and radiology values, were derived from 80 features. Additionally, an analysis of variance (ANOVA)-based data-driven feature selection selected the 10 features with the highest F values: age, number of home medications, urea nitrogen, lactate dehydrogenase, albumin, oxygen saturation (%), whether oxygen saturation was measured on room air, whether oxygen saturation was measured on oxygen therapy, blood gas pH and history of chronic cardiac disease. Using these 10 selected features, a linear logistic regression and a non-linear tree-based gradient boosting algorithm fitted the data with an AUC of 0.81 (95% CI 0.77 to 0.85) and 0.82 (95% CI 0.79 to 0.85), respectively. Both models outperformed age-based decision rules used in practice (AUC of 0.69, 95% CI 0.65 to 0.74, for age >70). Furthermore, performance remained stable when age was excluded as a predictor (AUC of 0.78, 95% CI 0.75 to 0.81).

Conclusion: Both models showed good performance and had better test characteristics than age-based decision rules, using 10 admission features readily available in Dutch hospitals. The models hold promise to aid decision-making during a hospital bed shortage.
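The outcome measures named in the abstract (AUC, sensitivity, specificity, positive and negative predictive value) can be illustrated with a minimal Python sketch of how an age-based rule such as "age >70" might be scored. This is not the authors' code. The ten patients below are invented toy values rather than cohort data, and the function names `auc` and `rule_metrics` are hypothetical.

```python
from typing import Dict, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    This is the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def rule_metrics(predicted: Sequence[int], labels: Sequence[int]) -> Dict[str, float]:
    """Test characteristics of a binary decision rule against observed outcomes."""
    pairs = list(zip(predicted, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Toy data (assumption, NOT from the cohort): age at admission and
# 21-day outcome (1 = died or discharged to palliative care).
ages = [45, 52, 61, 68, 72, 75, 79, 83, 88, 91]
died = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

# Age-based decision rule: flag every patient older than 70.
predicted = [1 if age > 70 else 0 for age in ages]

metrics = rule_metrics(predicted, died)
rule_auc = auc(predicted, died)  # AUC of the dichotomised rule
age_auc = auc(ages, died)        # AUC of age used as a continuous score

print(metrics)
print(rule_auc, age_auc)
```

For a dichotomised rule, the AUC reduces to the mean of sensitivity and specificity, which is one reason a single age cut-off can score lower than a model that outputs a continuous risk.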
Original language: English
Article number: e047347
Number of pages: 13
Journal: BMJ Open
Issue number: 7
Publication status: Published - 2021


  • COVID-19
  • public health
  • risk management
