Developing and Validating a Survival Prediction Model for NSCLC Patients Through Distributed Learning Across 3 Countries

Arthur Jochems; Timo M. Deist; Issam El Naqa; Marc Kessler; Chuck Mayo; Jackson Reeves; Shruti Jolly; Martha Matuszak; Randall Ten Haken; Johan van Soest; Cary Oberije; Corinne Faivre-Finn; Gareth Price; Dirk de Ruysscher; Philippe Lambin; Andre Dekker

doi:10.1016/j.ijrobp.2017.04.021

Arthur Jochems^*, Timo M. Deist, Issam El Naqa, Marc Kessler, Chuck Mayo, Jackson Reeves, Shruti Jolly, Martha Matuszak, Randall Ten Haken, Johan van Soest, Cary Oberije, Corinne Faivre-Finn, Gareth Price, Dirk de Ruysscher, Philippe Lambin, Andre Dekker

^*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Purpose: Tools for survival prediction for non-small cell lung cancer (NSCLC) patients treated with chemoradiation or radiation therapy are of limited quality. In this work, we developed a predictive model of survival at 2 years. The model is based on a large volume of historical patient data and serves as a proof of concept to demonstrate the distributed learning approach.

Methods and Materials: Clinical data from 698 lung cancer patients, treated with curative intent with chemoradiation or radiation therapy alone, were collected and stored at 2 different cancer institutes (559 patients at Maastro clinic [Netherlands] and 139 at the University of Michigan [United States]). The model was further validated on 196 patients originating from The Christie (United Kingdom). A Bayesian network model was adapted for distributed learning (the animation can be viewed at https://www.youtube.com/watch?v=ZDJFOxpwqEA). Two-year posttreatment survival was chosen as the endpoint. The Maastro clinic cohort data are publicly available at https://www.cancerdata.org/publication/developing-and-validating-survival-prediction-model-nsclc-patients-through-distributed, and the developed models can be found at www.predictcancer.org.
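
Illustrative sketch (not the authors' implementation): one common way to learn the parameters of a discrete Bayesian network without pooling patient records is to have each institute compute local counts and share only those aggregates. The abstract does not specify the algorithm used in the study, so everything below is an assumption made for illustration: the function names, the variable names `t_category` and `alive_2y`, the toy records, and the Laplace smoothing constant `alpha`.

```python
from collections import Counter

def local_counts(records, child, parents):
    """Run inside one institute: count (parent values, child value) pairs.
    Only these aggregate counts would leave the site, never patient rows."""
    counts = Counter()
    for r in records:
        key = (tuple(r[p] for p in parents), r[child])
        counts[key] += 1
    return counts

def merge_counts(site_counts):
    """Run at a central coordinator: add up the counts from all sites."""
    total = Counter()
    for c in site_counts:
        total.update(c)
    return total

def estimate_cpt(total, child_values, alpha=1.0):
    """Laplace-smoothed estimate of P(child | parents) from merged counts."""
    parent_configs = {pc for (pc, _) in total}
    cpt = {}
    for pc in parent_configs:
        denom = sum(total[(pc, v)] + alpha for v in child_values)
        cpt[pc] = {v: (total[(pc, v)] + alpha) / denom for v in child_values}
    return cpt

# Hypothetical toy records standing in for two institutes (not study data).
site_a = [
    {"t_category": "T1", "alive_2y": 1},
    {"t_category": "T1", "alive_2y": 0},
    {"t_category": "T4", "alive_2y": 0},
]
site_b = [
    {"t_category": "T1", "alive_2y": 1},
    {"t_category": "T4", "alive_2y": 0},
]

pooled = merge_counts([
    local_counts(site_a, "alive_2y", ["t_category"]),
    local_counts(site_b, "alive_2y", ["t_category"]),
])
cpt = estimate_cpt(pooled, child_values=[0, 1])
print(cpt[("T1",)])
```

Because counts are additive, the merged table is identical to the one that would be obtained by counting over the concatenated records, which is why this kind of parameter learning can be carried out without moving patient-level data.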

Results: Variables included in the final model were T and N category, age, performance status, and total tumor dose. The model has an area under the curve (AUC) of 0.66 on the external validation set and an AUC of 0.62 on a 5-fold cross validation. A model based on the T and N category performed with an AUC of 0.47 on the validation set, significantly worse than our model (P

Conclusions: Distributed learning from federated databases allows learning of predictive models on data originating from multiple institutions while avoiding many of the data-sharing barriers. We believe that distributed learning is the future of sharing data in health care. (C) 2017 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
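
Illustrative sketch (an assumption, not code from the study): the AUC values quoted in the Results can be read as the probability that a randomly chosen 2-year survivor receives a higher predicted survival probability than a randomly chosen non-survivor, with ties counted as one half. The function name `auc` and the toy scores and labels below are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.
    scores: predicted probability of the positive class (e.g. 2-year survival).
    labels: 1 for positive outcome, 0 for negative outcome."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # correctly ordered pair
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))

# Hypothetical toy predictions (not study data).
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking of patients, and values closer to 1 indicate better discrimination.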

Original language	English
Pages (from-to)	344-352
Number of pages	9
Journal	International Journal of Radiation Oncology Biology Physics
Volume	99
Issue number	2
DOIs	https://doi.org/10.1016/j.ijrobp.2017.04.021
Publication status	Published - 1 Oct 2017

Keywords

CELL-LUNG-CANCER
RECURSIVE PARTITIONING ANALYSIS
ONCOLOGY GROUP RTOG
GROSS TUMOR VOLUME
RADIATION-THERAPY
EXTERNAL VALIDATION
PROGNOSTIC-FACTORS
2-YEAR SURVIVAL
DOSE-ESCALATION
HEALTH-CARE

Access to Document

10.1016/j.ijrobp.2017.04.021
Licence: CC BY-NC-ND

Cite this

Jochems, A., Deist, T. M., El Naqa, I., Kessler, M., Mayo, C., Reeves, J., Jolly, S., Matuszak, M., Ten Haken, R., van Soest, J., Oberije, C., Faivre-Finn, C., Price, G., de Ruysscher, D., Lambin, P., & Dekker, A. (2017). Developing and Validating a Survival Prediction Model for NSCLC Patients Through Distributed Learning Across 3 Countries. International Journal of Radiation Oncology Biology Physics, 99(2), 344-352. https://doi.org/10.1016/j.ijrobp.2017.04.021

@article{0b10630d8f634c298fc00f8933150895,

title = "Developing and Validating a Survival Prediction Model for NSCLC Patients Through Distributed Learning Across 3 Countries",

keywords = "CELL-LUNG-CANCER, RECURSIVE PARTITIONING ANALYSIS, ONCOLOGY GROUP RTOG, GROSS TUMOR VOLUME, RADIATION-THERAPY, EXTERNAL VALIDATION, PROGNOSTIC-FACTORS, 2-YEAR SURVIVAL, DOSE-ESCALATION, HEALTH-CARE",

author = "Arthur Jochems and Deist, {Timo M.} and {El Naqa}, Issam and Marc Kessler and Chuck Mayo and Jackson Reeves and Shruti Jolly and Martha Matuszak and {Ten Haken}, Randall and {van Soest}, Johan and Cary Oberije and Corinne Faivre-Finn and Gareth Price and {de Ruysscher}, Dirk and Philippe Lambin and Andre Dekker",

year = "2017",

month = oct,

day = "1",

doi = "10.1016/j.ijrobp.2017.04.021",

language = "English",

volume = "99",

pages = "344--352",

journal = "International Journal of Radiation Oncology Biology Physics",

issn = "0360-3016",

publisher = "Elsevier Science",

number = "2",

}

Jochems, A, Deist, TM, El Naqa, I, Kessler, M, Mayo, C, Reeves, J, Jolly, S, Matuszak, M, Ten Haken, R, van Soest, J, Oberije, C, Faivre-Finn, C, Price, G, de Ruysscher, D, Lambin, P & Dekker, A 2017, 'Developing and Validating a Survival Prediction Model for NSCLC Patients Through Distributed Learning Across 3 Countries', International Journal of Radiation Oncology Biology Physics, vol. 99, no. 2, pp. 344-352. https://doi.org/10.1016/j.ijrobp.2017.04.021

TY - JOUR

T1 - Developing and Validating a Survival Prediction Model for NSCLC Patients Through Distributed Learning Across 3 Countries

AU - Jochems, Arthur

AU - Deist, Timo M.

AU - El Naqa, Issam

AU - Kessler, Marc

AU - Mayo, Chuck

AU - Reeves, Jackson

AU - Jolly, Shruti

AU - Matuszak, Martha

AU - Ten Haken, Randall

AU - van Soest, Johan

AU - Oberije, Cary

AU - Faivre-Finn, Corinne

AU - Price, Gareth

AU - de Ruysscher, Dirk

AU - Lambin, Philippe

AU - Dekker, Andre

PY - 2017/10/1

Y1 - 2017/10/1

KW - CELL-LUNG-CANCER

KW - RECURSIVE PARTITIONING ANALYSIS

KW - ONCOLOGY GROUP RTOG

KW - GROSS TUMOR VOLUME

KW - RADIATION-THERAPY

KW - EXTERNAL VALIDATION

KW - PROGNOSTIC-FACTORS

KW - 2-YEAR SURVIVAL

KW - DOSE-ESCALATION

KW - HEALTH-CARE

U2 - 10.1016/j.ijrobp.2017.04.021

DO - 10.1016/j.ijrobp.2017.04.021

M3 - Article

C2 - 28871984

SN - 0360-3016

VL - 99

SP - 344

EP - 352

JO - International Journal of Radiation Oncology Biology Physics

JF - International Journal of Radiation Oncology Biology Physics

IS - 2

ER -