Federated learning enables big data for rare cancer boundary detection

Sarthak Pati; Ujjwal Baid; Brandon Edwards; Micah Sheller; G. Anthony Reina; Spyridon Bakas; Shih-Han Wang; Patrick Foley; Alexey Gruzdev; Deepthi Karkada; Christos Davatzikos; Chiharu Sako; Satyam Ghodasara; Michel Bilello; Suyash Mohan; Philipp Vollmuth; Gianluca Brugnara; Chandrakanth J. Preetha; Felix Sahm; Klaus Maier-Hein; Maximilian Zenk; Martin Bendszus; Wolfgang Wick; Evan Calabrese; Jeffrey Rudie; Javier Villanueva-Meyer; Soonmee Cha; Madhura Ingalhalikar; Manali Jadhav; Umang Pandey; Jitender Saini; John Garrett; Matthew Larson; Robert Jeraj; Stuart Currie; Russell Frood; Kavi Fatania; Raymond Y. Huang; Ken Chang; Carmen Balana; Jaume Capellades; Josep Puig; Johannes Trenkler; Josef Pichler; Georg Necker; Andreas Haunschmidt; Stephan Meckel; Gaurav Shukla; Spencer Liem; Gregory S. Alexander; Regina G. H. Beets-Tan; Et al.

doi:10.1038/s41467-022-33407-5

Sarthak Pati, Ujjwal Baid, Brandon Edwards, Micah Sheller, G. Anthony Reina, Spyridon Bakas^*, Shih-Han Wang, Patrick Foley, Alexey Gruzdev, Deepthi Karkada, Christos Davatzikos, Chiharu Sako, Satyam Ghodasara, Michel Bilello, Suyash Mohan, Philipp Vollmuth, Gianluca Brugnara, Chandrakanth J. Preetha, Felix Sahm, Klaus Maier-Hein, Maximilian Zenk, Martin Bendszus, Wolfgang Wick, Evan Calabrese, Jeffrey Rudie, Javier Villanueva-Meyer, Soonmee Cha, Madhura Ingalhalikar, Manali Jadhav, Umang Pandey, Jitender Saini, John Garrett, Matthew Larson, Robert Jeraj, Stuart Currie, Russell Frood, Kavi Fatania, Raymond Y. Huang, Ken Chang, Carmen Balana, Jaume Capellades, Josep Puig, Johannes Trenkler, Josef Pichler, Georg Necker, Andreas Haunschmidt, Stephan Meckel, Gaurav Shukla, Spencer Liem, Gregory S. Alexander, Regina G. H. Beets-Tan, Et al.

^*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability is concerning. This is currently addressed by sharing multi-site data, but such centralization is challenging/infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to-date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the FL effectiveness at such scale and task-complexity as a paradigm shift for multisite collaborations, alleviating the need for data-sharing.
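
As a rough illustration of what "only sharing numerical model updates" can mean, the sketch below combines per-site model weights into a single consensus model by a case-count-weighted average, in the style of federated averaging. The abstract does not state which aggregation rule the study used, so the averaging rule, the function name `federated_average`, the layer name `conv1`, and all numbers are assumptions chosen for illustration only.

```python
from typing import Dict, List, Tuple

# A model is represented here as a mapping from layer name to a flat list of weights.
Weights = Dict[str, List[float]]


def federated_average(site_updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine per-site weights into one consensus model.

    Each entry is (weights, n_local_cases). Only these numbers leave a site;
    the underlying patient scans never do. Sites with more cases contribute
    proportionally more to the average (an assumed weighting rule).
    """
    if not site_updates:
        raise ValueError("at least one site update is required")

    total_cases = sum(n_cases for _, n_cases in site_updates)
    if total_cases <= 0:
        raise ValueError("total case count must be positive")

    # Start from zeros shaped like the first site's weights.
    first_weights, _ = site_updates[0]
    consensus: Weights = {
        name: [0.0] * len(values) for name, values in first_weights.items()
    }

    # Accumulate each site's weights scaled by its share of all cases.
    for weights, n_cases in site_updates:
        share = n_cases / total_cases
        for name, values in weights.items():
            for i, value in enumerate(values):
                consensus[name][i] += share * value

    return consensus


# Hypothetical example: two sites with different amounts of local data.
site_a = ({"conv1": [0.2, 0.4]}, 100)
site_b = ({"conv1": [0.6, 0.8]}, 300)
consensus = federated_average([site_a, site_b])
```

In this toy example the second site holds three times as many cases, so the consensus weights sit closer to its values than to the first site's.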

Original language	English
Article number	7346
Number of pages	17
Journal	Nature Communications
Volume	13
Issue number	1
DOIs	https://doi.org/10.1038/s41467-022-33407-5
Publication status	Published - 5 Dec 2022

Keywords

CENTRAL-NERVOUS-SYSTEM
BRAIN
PERFORMANCE
ATLAS
MRI
SEGMENTATION
SURVIVAL
CLASSIFICATION
BEVACIZUMAB
VALIDATION

Access to Document

10.1038/s41467-022-33407-5
Licence: CC BY

1 Erratum / corrigendum / retractions

Author Correction: Federated learning enables big data for rare cancer boundary detection (Nature Communications,
Beets-Tan, R. & Et al., 26 Jan 2023, In: Nature Communications. 14, 1, 2 p., 436.
Research output: Contribution to journal › Erratum / corrigendum / retractions › Academic

Open Access

Cite this

Pati, S., Baid, U., Edwards, B., Sheller, M., Reina, G. A., Bakas, S., Wang, S.-H., Foley, P., Gruzdev, A., Karkada, D., Davatzikos, C., Sako, C., Ghodasara, S., Bilello, M., Mohan, S., Vollmuth, P., Brugnara, G., Preetha, C. J., Sahm, F., ... Et al. (2022). Federated learning enables big data for rare cancer boundary detection. Nature Communications, 13(1), Article 7346. https://doi.org/10.1038/s41467-022-33407-5

@article{bf12e5099d2b4e55980a0d3f0dcfa90a,

title = "Federated learning enables big data for rare cancer boundary detection",

abstract = "Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability is concerning. This is currently addressed by sharing multi-site data, but such centralization is challenging/infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to-date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the FL effectiveness at such scale and task-complexity as a paradigm shift for multisite collaborations, alleviating the need for data-sharing.",

keywords = "CENTRAL-NERVOUS-SYSTEM, BRAIN, PERFORMANCE, ATLAS, MRI, SEGMENTATION, SURVIVAL, CLASSIFICATION, BEVACIZUMAB, VALIDATION",

author = "Sarthak Pati and Ujjwal Baid and Brandon Edwards and Micah Sheller and Reina, {G. Anthony} and Spyridon Bakas and Shih-Han Wang and Patrick Foley and Alexey Gruzdev and Deepthi Karkada and Christos Davatzikos and Chiharu Sako and Satyam Ghodasara and Michel Bilello and Suyash Mohan and Philipp Vollmuth and Gianluca Brugnara and Preetha, {Chandrakanth J.} and Felix Sahm and Klaus Maier-Hein and Maximilian Zenk and Martin Bendszus and Wolfgang Wick and Evan Calabrese and Jeffrey Rudie and Javier Villanueva-Meyer and Soonmee Cha and Madhura Ingalhalikar and Manali Jadhav and Umang Pandey and Jitender Saini and John Garrett and Matthew Larson and Robert Jeraj and Stuart Currie and Russell Frood and Kavi Fatania and Huang, {Raymond Y.} and Ken Chang and Carmen Balana and Jaume Capellades and Josep Puig and Johannes Trenkler and Josef Pichler and Georg Necker and Andreas Haunschmidt and Stephan Meckel and Gaurav Shukla and Spencer Liem and Alexander, {Gregory S.} and Beets-Tan, {Regina G. H.} and {Et al.}",

year = "2022",

month = dec,

day = "5",

doi = "10.1038/s41467-022-33407-5",

language = "English",

volume = "13",

journal = "Nature Communications",

issn = "2041-1723",

publisher = "Nature Publishing Group",

number = "1",

}

Pati, S, Baid, U, Edwards, B, Sheller, M, Reina, GA, Bakas, S, Wang, S-H, Foley, P, Gruzdev, A, Karkada, D, Davatzikos, C, Sako, C, Ghodasara, S, Bilello, M, Mohan, S, Vollmuth, P, Brugnara, G, Preetha, CJ, Sahm, F, Maier-Hein, K, Zenk, M, Bendszus, M, Wick, W, Calabrese, E, Rudie, J, Villanueva-Meyer, J, Cha, S, Ingalhalikar, M, Jadhav, M, Pandey, U, Saini, J, Garrett, J, Larson, M, Jeraj, R, Currie, S, Frood, R, Fatania, K, Huang, RY, Chang, K, Balana, C, Capellades, J, Puig, J, Trenkler, J, Pichler, J, Necker, G, Haunschmidt, A, Meckel, S, Shukla, G, Liem, S, Alexander, GS, Beets-Tan, RGH & Et al. 2022, 'Federated learning enables big data for rare cancer boundary detection', Nature Communications, vol. 13, no. 1, 7346. https://doi.org/10.1038/s41467-022-33407-5

TY - JOUR

T1 - Federated learning enables big data for rare cancer boundary detection

AU - Pati, Sarthak

AU - Baid, Ujjwal

AU - Edwards, Brandon

AU - Sheller, Micah

AU - Reina, G. Anthony

AU - Bakas, Spyridon

AU - Wang, Shih-Han

AU - Foley, Patrick

AU - Gruzdev, Alexey

AU - Karkada, Deepthi

AU - Davatzikos, Christos

AU - Sako, Chiharu

AU - Ghodasara, Satyam

AU - Bilello, Michel

AU - Mohan, Suyash

AU - Vollmuth, Philipp

AU - Brugnara, Gianluca

AU - Preetha, Chandrakanth J.

AU - Sahm, Felix

AU - Maier-Hein, Klaus

AU - Zenk, Maximilian

AU - Bendszus, Martin

AU - Wick, Wolfgang

AU - Calabrese, Evan

AU - Rudie, Jeffrey

AU - Villanueva-Meyer, Javier

AU - Cha, Soonmee

AU - Ingalhalikar, Madhura

AU - Jadhav, Manali

AU - Pandey, Umang

AU - Saini, Jitender

AU - Garrett, John

AU - Larson, Matthew

AU - Jeraj, Robert

AU - Currie, Stuart

AU - Frood, Russell

AU - Fatania, Kavi

AU - Huang, Raymond Y.

AU - Chang, Ken

AU - Balana, Carmen

AU - Capellades, Jaume

AU - Puig, Josep

AU - Trenkler, Johannes

AU - Pichler, Josef

AU - Necker, Georg

AU - Haunschmidt, Andreas

AU - Meckel, Stephan

AU - Shukla, Gaurav

AU - Liem, Spencer

AU - Alexander, Gregory S.

AU - Beets-Tan, Regina G. H.

AU - Et al.

PY - 2022/12/5

Y1 - 2022/12/5

N2 - Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability is concerning. This is currently addressed by sharing multi-site data, but such centralization is challenging/infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to-date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the FL effectiveness at such scale and task-complexity as a paradigm shift for multisite collaborations, alleviating the need for data-sharing.

AB - Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability is concerning. This is currently addressed by sharing multi-site data, but such centralization is challenging/infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to-date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the FL effectiveness at such scale and task-complexity as a paradigm shift for multisite collaborations, alleviating the need for data-sharing.

KW - CENTRAL-NERVOUS-SYSTEM

KW - BRAIN

KW - PERFORMANCE

KW - ATLAS

KW - MRI

KW - SEGMENTATION

KW - SURVIVAL

KW - CLASSIFICATION

KW - BEVACIZUMAB

KW - VALIDATION

U2 - 10.1038/s41467-022-33407-5

DO - 10.1038/s41467-022-33407-5

M3 - Article

C2 - 36470898

SN - 2041-1723

VL - 13

JO - Nature Communications

JF - Nature Communications

IS - 1

M1 - 7346

ER -
