Predicting Lung Cancer Survival Using Probabilistic Reclassification of TNM Editions With a Bayesian Network

Melle S. Sieswerda; Inigo Bermejo; Gijs Geleijnse; Mieke J. Aarts; Valery E. P. P. Lemmens; Dirk De Ruysscher; Andre L. A. J. Dekker; Xander A. A. M. Verbeek

doi:10.1200/CCI.19.00136

Corresponding author: Melle S. Sieswerda

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

PURPOSE The TNM classification system is used for prognosis, treatment, and research. Regular updates potentially break backward compatibility. Reclassification is not always possible, is labor intensive, or requires additional data. We developed a Bayesian network (BN) for reclassifying the 5th, 6th, and 7th editions of the TNM and predicting survival for non-small-cell lung cancer (NSCLC) without training data with known classifications in multiple editions.

METHODS Data were obtained from the Netherlands Cancer Registry (n = 146,084). A BN was designed with nodes for TNM edition and survival, and a group of nodes was designed for all TNM editions, with a group for edition 7 only. Before learning conditional probabilities, priors for relations between the groups were manually specified after analysis of changes between editions. For performance evaluation only, part of the 7th edition test data were manually reclassified. Performance was evaluated using sensitivity, specificity, and accuracy. Two-year survival was evaluated with the receiver operating characteristic area under the curve (AUC), and model calibration was visualized.

RESULTS Manual reclassification of 7th to 6th edition stage group as ground truth for testing was impossible in 5.6% of the patients. Predicting 6th edition stage grouping using 7th edition data and vice versa resulted in average accuracies, sensitivities, and specificities between 0.85 and 0.99. The AUC for 2-year survival was 0.81.
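
As an illustration of how averaged accuracy, sensitivity, and specificity can be derived for a multi-class stage-group prediction, the following is a minimal Python sketch. It assumes one-vs-rest metrics per stage group with an unweighted (macro) average, which the abstract does not specify; the function names, stage labels, and counts are hypothetical and not taken from the study.

```python
def one_vs_rest_metrics(confusion, labels):
    """Per-class (sensitivity, specificity, accuracy) from a square confusion matrix.

    confusion[i][j] is the number of cases whose true class is labels[i]
    and whose predicted class is labels[j]. Assumes every class occurs at
    least once as a true class and at least once as a non-class case.
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                 # true class k, predicted otherwise
        fp = sum(row[k] for row in confusion) - tp  # predicted k, true class otherwise
        tn = total - tp - fn - fp
        metrics[label] = (tp / (tp + fn), tn / (tn + fp), (tp + tn) / total)
    return metrics


def macro_average(metrics):
    """Unweighted mean of each metric over all classes (an assumed averaging scheme)."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics.values()) / n for i in range(3))


# Hypothetical counts for three illustrative stage groups (not study data).
labels = ["I", "II", "III"]
confusion = [
    [8, 2, 0],
    [1, 8, 1],
    [0, 2, 8],
]

per_class = one_vs_rest_metrics(confusion, labels)
avg_sensitivity, avg_specificity, avg_accuracy = macro_average(per_class)
```

With a different averaging scheme (for example, weighting classes by prevalence) the averaged values would differ, so the scheme matters when comparing reported ranges.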

CONCLUSION We have successfully created a BN for reclassifying TNM stage grouping across TNM editions and predicting survival in NSCLC without knowing the true TNM classification in various editions in the training set. We suggest binary prediction of survival is less relevant than predicted probability and model calibration. For research, probabilities can be used for weighted reclassification. © 2020 by American Society of Clinical Oncology.
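
The weighted reclassification mentioned in the conclusion could look like the following minimal Python sketch. It assumes that the BN yields, for each patient, a posterior probability distribution over stage groups in the target edition; the function names, stage labels, and probabilities are hypothetical and not taken from the study.

```python
from collections import defaultdict


def weighted_stage_counts(posteriors):
    """Sum per-patient posterior probabilities into fractional counts per stage group."""
    counts = defaultdict(float)
    for distribution in posteriors:
        for stage, probability in distribution.items():
            counts[stage] += probability
    return dict(counts)


def hard_stage_counts(posteriors):
    """Assign each patient to the single most probable stage group, for comparison."""
    counts = defaultdict(int)
    for distribution in posteriors:
        counts[max(distribution, key=distribution.get)] += 1
    return dict(counts)


# Hypothetical posteriors for three patients (illustrative values only).
posteriors = [
    {"IB": 0.7, "IIA": 0.3},
    {"IB": 0.4, "IIA": 0.6},
    {"IIIA": 1.0},
]

weighted = weighted_stage_counts(posteriors)
hard = hard_stage_counts(posteriors)
```

The weighted counts keep the uncertainty of each reclassification, whereas the hard assignment discards it; both totals still add up to the number of patients.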

Original language	English
Pages (from-to)	436-443
Number of pages	8
Journal	JCO Clinical Cancer Informatics
Volume	4
DOIs	https://doi.org/10.1200/CCI.19.00136
Publication status	Published - 11 May 2020

Keywords

SYSTEM

Access to Document

10.1200/CCI.19.00136 (Licence: CC BY)

Cite this

@article{b94679eed9dc4559aed08d6829777776,

title = "Predicting Lung Cancer Survival Using Probabilistic Reclassification of TNM Editions With a Bayesian Network",

abstract = "PURPOSE The TNM classification system is used for prognosis, treatment, and research. Regular updates potentially break backward compatibility. Reclassification is not always possible, is labor intensive, or requires additional data. We developed a Bayesian network (BN) for reclassifying the 5th, 6th, and 7th editions of the TNM and predicting survival for non-small-cell lung cancer (NSCLC) without training data with known classifications in multiple editions. METHODS Data were obtained from the Netherlands Cancer Registry (n = 146,084). A BN was designed with nodes for TNM edition and survival, and a group of nodes was designed for all TNM editions, with a group for edition 7 only. Before learning conditional probabilities, priors for relations between the groups were manually specified after analysis of changes between editions. For performance evaluation only, part of the 7th edition test data were manually reclassified. Performance was evaluated using sensitivity, specificity, and accuracy. Two-year survival was evaluated with the receiver operating characteristic area under the curve (AUC), and model calibration was visualized. RESULTS Manual reclassification of 7th to 6th edition stage group as ground truth for testing was impossible in 5.6% of the patients. Predicting 6th edition stage grouping using 7th edition data and vice versa resulted in average accuracies, sensitivities, and specificities between 0.85 and 0.99. The AUC for 2-year survival was 0.81. CONCLUSION We have successfully created a BN for reclassifying TNM stage grouping across TNM editions and predicting survival in NSCLC without knowing the true TNM classification in various editions in the training set. We suggest binary prediction of survival is less relevant than predicted probability and model calibration. For research, probabilities can be used for weighted reclassification. (C) 2020 by American Society of Clinical Oncology.",

keywords = "SYSTEM",

author = "Sieswerda, {Melle S.} and Bermejo, Inigo and Geleijnse, Gijs and Aarts, {Mieke J.} and Lemmens, {Valery E. P. P.} and {De Ruysscher}, Dirk and Dekker, {Andre L. A. J.} and Verbeek, {Xander A. A. M.}",

note = "Publisher Copyright: {\textcopyright} 2020 by American Society of Clinical Oncology",

year = "2020",

month = may,

day = "11",

doi = "10.1200/CCI.19.00136",

language = "English",

volume = "4",

pages = "436--443",

journal = "JCO Clinical Cancer Informatics",

issn = "2473-4276",

publisher = "American Society of Clinical Oncology",

}

TY - JOUR

T1 - Predicting Lung Cancer Survival Using Probabilistic Reclassification of TNM Editions With a Bayesian Network

AU - Sieswerda, Melle S.

AU - Bermejo, Inigo

AU - Geleijnse, Gijs

AU - Aarts, Mieke J.

AU - Lemmens, Valery E. P. P.

AU - De Ruysscher, Dirk

AU - Dekker, Andre L. A. J.

AU - Verbeek, Xander A. A. M.

PY - 2020/5/11

Y1 - 2020/5/11

N2 - PURPOSE The TNM classification system is used for prognosis, treatment, and research. Regular updates potentially break backward compatibility. Reclassification is not always possible, is labor intensive, or requires additional data. We developed a Bayesian network (BN) for reclassifying the 5th, 6th, and 7th editions of the TNM and predicting survival for non-small-cell lung cancer (NSCLC) without training data with known classifications in multiple editions. METHODS Data were obtained from the Netherlands Cancer Registry (n = 146,084). A BN was designed with nodes for TNM edition and survival, and a group of nodes was designed for all TNM editions, with a group for edition 7 only. Before learning conditional probabilities, priors for relations between the groups were manually specified after analysis of changes between editions. For performance evaluation only, part of the 7th edition test data were manually reclassified. Performance was evaluated using sensitivity, specificity, and accuracy. Two-year survival was evaluated with the receiver operating characteristic area under the curve (AUC), and model calibration was visualized. RESULTS Manual reclassification of 7th to 6th edition stage group as ground truth for testing was impossible in 5.6% of the patients. Predicting 6th edition stage grouping using 7th edition data and vice versa resulted in average accuracies, sensitivities, and specificities between 0.85 and 0.99. The AUC for 2-year survival was 0.81. CONCLUSION We have successfully created a BN for reclassifying TNM stage grouping across TNM editions and predicting survival in NSCLC without knowing the true TNM classification in various editions in the training set. We suggest binary prediction of survival is less relevant than predicted probability and model calibration. For research, probabilities can be used for weighted reclassification. (C) 2020 by American Society of Clinical Oncology.

AB - PURPOSE The TNM classification system is used for prognosis, treatment, and research. Regular updates potentially break backward compatibility. Reclassification is not always possible, is labor intensive, or requires additional data. We developed a Bayesian network (BN) for reclassifying the 5th, 6th, and 7th editions of the TNM and predicting survival for non-small-cell lung cancer (NSCLC) without training data with known classifications in multiple editions. METHODS Data were obtained from the Netherlands Cancer Registry (n = 146,084). A BN was designed with nodes for TNM edition and survival, and a group of nodes was designed for all TNM editions, with a group for edition 7 only. Before learning conditional probabilities, priors for relations between the groups were manually specified after analysis of changes between editions. For performance evaluation only, part of the 7th edition test data were manually reclassified. Performance was evaluated using sensitivity, specificity, and accuracy. Two-year survival was evaluated with the receiver operating characteristic area under the curve (AUC), and model calibration was visualized. RESULTS Manual reclassification of 7th to 6th edition stage group as ground truth for testing was impossible in 5.6% of the patients. Predicting 6th edition stage grouping using 7th edition data and vice versa resulted in average accuracies, sensitivities, and specificities between 0.85 and 0.99. The AUC for 2-year survival was 0.81. CONCLUSION We have successfully created a BN for reclassifying TNM stage grouping across TNM editions and predicting survival in NSCLC without knowing the true TNM classification in various editions in the training set. We suggest binary prediction of survival is less relevant than predicted probability and model calibration. For research, probabilities can be used for weighted reclassification. (C) 2020 by American Society of Clinical Oncology.

KW - SYSTEM

U2 - 10.1200/CCI.19.00136

DO - 10.1200/CCI.19.00136

M3 - Article

C2 - 32392098

SN - 2473-4276

VL - 4

SP - 436

EP - 443

JO - JCO Clinical Cancer Informatics

JF - JCO Clinical Cancer Informatics

ER -