Deep-learning system to improve the quality and efficiency of volumetric heart segmentation for breast cancer

Roman Zeleznik; Jakob Weiss; Jana Taron; Christian Guthier; Danielle S. Bitterman; Cindy Hancox; Benjamin H. Kann; Daniel W. Kim; Rinaa S. Punglia; Jeremy Bredfeldt; Borek Foldyna; Parastou Eslami; Michael T. Lu; Udo Hoffmann; Raymond Mak; Hugo J. W. L. Aerts

doi:10.1038/s41746-021-00416-5

Corresponding author: Hugo J. W. L. Aerts

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Although artificial intelligence algorithms are often developed and applied for narrow tasks, their implementation in other medical settings could help to improve patient care. Here we assess whether a deep-learning system for volumetric heart segmentation on computed tomography (CT) scans developed in cardiovascular radiology can optimize treatment planning in radiation oncology. The system was trained using multi-center data (n = 858) with manual heart segmentations provided by cardiovascular radiologists. Validation of the system was performed in an independent real-world dataset of 5677 breast cancer patients treated with radiation therapy at the Dana-Farber/Brigham and Women's Cancer Center between 2008 and 2018. In a subset of 20 patients, the performance of the system was compared to eight radiation oncology experts by assessing segmentation time, agreement between experts, and accuracy with and without deep-learning assistance. To compare the performance to segmentations used in the clinic, concordance and failures (defined as Dice < 0.85) of the system were evaluated in the entire dataset. The system was successfully applied without retraining. With deep-learning assistance, segmentation time significantly decreased (4.0 min [IQR 3.1-5.0] vs. 2.0 min [IQR 1.3-3.5]; p < 0.001), and agreement increased (Dice 0.95 [IQR = 0.02] vs. 0.97 [IQR = 0.02]; p < 0.001). Expert accuracy was similar with and without deep-learning assistance (Dice 0.92 [IQR = 0.02] vs. 0.92 [IQR = 0.02]; p = 0.48), and not significantly different from deep-learning-only segmentations (Dice 0.92 [IQR = 0.02]; p >= 0.1). In comparison to real-world data, the system showed high concordance (Dice 0.89 [IQR = 0.06]) across 5677 patients and a significantly lower failure rate (p < 0.001). These results suggest that deep-learning algorithms can successfully be applied across medical specialties and improve clinical care beyond the original field of interest.
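
The abstract reports agreement as Dice scores and counts a segmentation as a failure when Dice < 0.85. As a minimal sketch of how such a score and failure flag could be computed, the Python below treats each segmentation as a set of voxel coordinates. This is an illustration only, not the authors' implementation: the names dice_coefficient, is_failure and box, the toy cube-shaped masks, and the convention of returning 1.0 for two empty masks are all assumptions introduced here. Only the 0.85 threshold comes from the abstract.

```python
from typing import Set, Tuple

Voxel = Tuple[int, int, int]

# Failure threshold taken from the abstract (failure defined as Dice < 0.85).
FAILURE_THRESHOLD = 0.85


def dice_coefficient(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Return the Dice score 2*|A & B| / (|A| + |B|) for two voxel sets.

    Assumption: two empty masks are treated as perfect agreement (1.0).
    """
    total = len(a) + len(b)
    if total == 0:
        return 1.0
    return 2.0 * len(a & b) / total


def is_failure(a: Set[Voxel], b: Set[Voxel],
               threshold: float = FAILURE_THRESHOLD) -> bool:
    """Flag a segmentation pair as a failure when Dice falls below the threshold."""
    return dice_coefficient(a, b) < threshold


def box(x0: int, x1: int, y0: int, y1: int, z0: int, z1: int) -> Set[Voxel]:
    """Hypothetical helper: build a toy cuboid mask as a set of voxel coordinates."""
    return {(x, y, z)
            for x in range(x0, x1)
            for y in range(y0, y1)
            for z in range(z0, z1)}


# Toy data, not from the study: a 10x10x10 reference mask and a copy
# shifted by one voxel along x, so the two masks share 900 voxels.
reference = box(0, 10, 0, 10, 0, 10)
candidate = box(1, 11, 0, 10, 0, 10)

score = dice_coefficient(reference, candidate)  # 2 * 900 / 2000
print(f"Dice = {score:.2f}, failure = {is_failure(reference, candidate)}")
```

In this toy case the one-voxel shift gives a Dice score of 0.90, which is above the 0.85 threshold; a two-voxel shift of the same cube would give 0.80 and be flagged as a failure.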

Original language	English
Article number	43
Number of pages	7
Journal	npj Digital Medicine
Volume	4
Issue number	1
DOIs	https://doi.org/10.1038/s41746-021-00416-5
Publication status	Published - 5 Mar 2021

Keywords

ARTIFICIAL-INTELLIGENCE

Access to Document

10.1038/s41746-021-00416-5 (Licence: CC BY)

Cite this

Zeleznik, R., Weiss, J., Taron, J., Guthier, C., Bitterman, D. S., Hancox, C., Kann, B. H., Kim, D. W., Punglia, R. S., Bredfeldt, J., Foldyna, B., Eslami, P., Lu, M. T., Hoffmann, U., Mak, R., & Aerts, H. J. W. L. (2021). Deep-learning system to improve the quality and efficiency of volumetric heart segmentation for breast cancer. npj Digital Medicine, 4(1), Article 43. https://doi.org/10.1038/s41746-021-00416-5

@article{91dcae3b44804766880a9280f8c3265e,

title = "Deep-learning system to improve the quality and efficiency of volumetric heart segmentation for breast cancer",

abstract = "Although artificial intelligence algorithms are often developed and applied for narrow tasks, their implementation in other medical settings could help to improve patient care. Here we assess whether a deep-learning system for volumetric heart segmentation on computed tomography (CT) scans developed in cardiovascular radiology can optimize treatment planning in radiation oncology. The system was trained using multi-center data (n = 858) with manual heart segmentations provided by cardiovascular radiologists. Validation of the system was performed in an independent real-world dataset of 5677 breast cancer patients treated with radiation therapy at the Dana-Farber/Brigham and Women's Cancer Center between 2008-2018. In a subset of 20 patients, the performance of the system was compared to eight radiation oncology experts by assessing segmentation time, agreement between experts, and accuracy with and without deep-learning assistance. To compare the performance to segmentations used in the clinic, concordance and failures (defined as Dice < 0.85) of the system were evaluated in the entire dataset. The system was successfully applied without retraining. With deep-learning assistance, segmentation time significantly decreased (4.0 min [IQR 3.1-5.0] vs. 2.0 min [IQR 1.3-3.5]; p < 0.001), and agreement increased (Dice 0.95 [IQR = 0.02]; vs. 0.97 [IQR = 0.02], p < 0.001). Expert accuracy was similar with and without deep-learning assistance (Dice 0.92 [IQR = 0.02] vs. 0.92 [IQR = 0.02]; p = 0.48), and not significantly different from deep-learning-only segmentations (Dice 0.92 [IQR = 0.02]; p >= 0.1). In comparison to real-world data, the system showed high concordance (Dice 0.89 [IQR = 0.06]) across 5677 patients and a significantly lower failure rate (p < 0.001). These results suggest that deep-learning algorithms can successfully be applied across medical specialties and improve clinical care beyond the original field of interest.",

keywords = "ARTIFICIAL-INTELLIGENCE",

author = "Roman Zeleznik and Jakob Weiss and Jana Taron and Christian Guthier and Bitterman, {Danielle S.} and Cindy Hancox and Kann, {Benjamin H.} and Kim, {Daniel W.} and Punglia, {Rinaa S.} and Jeremy Bredfeldt and Borek Foldyna and Parastou Eslami and Lu, {Michael T.} and Udo Hoffmann and Raymond Mak and Aerts, {Hugo J. W. L.}",

note = "Publisher Copyright: {\textcopyright} 2021, The Author(s).",

year = "2021",

month = mar,

day = "5",

doi = "10.1038/s41746-021-00416-5",

language = "English",

volume = "4",

journal = "npj Digital Medicine",

issn = "2398-6352",

publisher = "Springer Nature",

number = "1",

}

Zeleznik, R, Weiss, J, Taron, J, Guthier, C, Bitterman, DS, Hancox, C, Kann, BH, Kim, DW, Punglia, RS, Bredfeldt, J, Foldyna, B, Eslami, P, Lu, MT, Hoffmann, U, Mak, R & Aerts, HJWL 2021, 'Deep-learning system to improve the quality and efficiency of volumetric heart segmentation for breast cancer', npj Digital Medicine, vol. 4, no. 1, 43. https://doi.org/10.1038/s41746-021-00416-5

TY - JOUR

T1 - Deep-learning system to improve the quality and efficiency of volumetric heart segmentation for breast cancer

AU - Zeleznik, Roman

AU - Weiss, Jakob

AU - Taron, Jana

AU - Guthier, Christian

AU - Bitterman, Danielle S.

AU - Hancox, Cindy

AU - Kann, Benjamin H.

AU - Kim, Daniel W.

AU - Punglia, Rinaa S.

AU - Bredfeldt, Jeremy

AU - Foldyna, Borek

AU - Eslami, Parastou

AU - Lu, Michael T.

AU - Hoffmann, Udo

AU - Mak, Raymond

AU - Aerts, Hugo J. W. L.

PY - 2021/3/5

Y1 - 2021/3/5

N2 - Although artificial intelligence algorithms are often developed and applied for narrow tasks, their implementation in other medical settings could help to improve patient care. Here we assess whether a deep-learning system for volumetric heart segmentation on computed tomography (CT) scans developed in cardiovascular radiology can optimize treatment planning in radiation oncology. The system was trained using multi-center data (n = 858) with manual heart segmentations provided by cardiovascular radiologists. Validation of the system was performed in an independent real-world dataset of 5677 breast cancer patients treated with radiation therapy at the Dana-Farber/Brigham and Women's Cancer Center between 2008-2018. In a subset of 20 patients, the performance of the system was compared to eight radiation oncology experts by assessing segmentation time, agreement between experts, and accuracy with and without deep-learning assistance. To compare the performance to segmentations used in the clinic, concordance and failures (defined as Dice < 0.85) of the system were evaluated in the entire dataset. The system was successfully applied without retraining. With deep-learning assistance, segmentation time significantly decreased (4.0 min [IQR 3.1-5.0] vs. 2.0 min [IQR 1.3-3.5]; p < 0.001), and agreement increased (Dice 0.95 [IQR = 0.02]; vs. 0.97 [IQR = 0.02], p < 0.001). Expert accuracy was similar with and without deep-learning assistance (Dice 0.92 [IQR = 0.02] vs. 0.92 [IQR = 0.02]; p = 0.48), and not significantly different from deep-learning-only segmentations (Dice 0.92 [IQR = 0.02]; p >= 0.1). In comparison to real-world data, the system showed high concordance (Dice 0.89 [IQR = 0.06]) across 5677 patients and a significantly lower failure rate (p < 0.001). These results suggest that deep-learning algorithms can successfully be applied across medical specialties and improve clinical care beyond the original field of interest.

AB - Although artificial intelligence algorithms are often developed and applied for narrow tasks, their implementation in other medical settings could help to improve patient care. Here we assess whether a deep-learning system for volumetric heart segmentation on computed tomography (CT) scans developed in cardiovascular radiology can optimize treatment planning in radiation oncology. The system was trained using multi-center data (n = 858) with manual heart segmentations provided by cardiovascular radiologists. Validation of the system was performed in an independent real-world dataset of 5677 breast cancer patients treated with radiation therapy at the Dana-Farber/Brigham and Women's Cancer Center between 2008-2018. In a subset of 20 patients, the performance of the system was compared to eight radiation oncology experts by assessing segmentation time, agreement between experts, and accuracy with and without deep-learning assistance. To compare the performance to segmentations used in the clinic, concordance and failures (defined as Dice < 0.85) of the system were evaluated in the entire dataset. The system was successfully applied without retraining. With deep-learning assistance, segmentation time significantly decreased (4.0 min [IQR 3.1-5.0] vs. 2.0 min [IQR 1.3-3.5]; p < 0.001), and agreement increased (Dice 0.95 [IQR = 0.02]; vs. 0.97 [IQR = 0.02], p < 0.001). Expert accuracy was similar with and without deep-learning assistance (Dice 0.92 [IQR = 0.02] vs. 0.92 [IQR = 0.02]; p = 0.48), and not significantly different from deep-learning-only segmentations (Dice 0.92 [IQR = 0.02]; p >= 0.1). In comparison to real-world data, the system showed high concordance (Dice 0.89 [IQR = 0.06]) across 5677 patients and a significantly lower failure rate (p < 0.001). These results suggest that deep-learning algorithms can successfully be applied across medical specialties and improve clinical care beyond the original field of interest.

KW - ARTIFICIAL-INTELLIGENCE

U2 - 10.1038/s41746-021-00416-5

DO - 10.1038/s41746-021-00416-5

M3 - Article

C2 - 33674717

SN - 2398-6352

VL - 4

JO - npj Digital Medicine

JF - npj Digital Medicine

IS - 1

M1 - 43

ER -