TY - JOUR
T1 - The FeatureCloud Platform for Federated Learning in Biomedicine
T2 - Unified Approach
AU - Matschinske, Julian
AU - Spaeth, Julian
AU - Bakhtiari, Mohammad
AU - Probul, Niklas
AU - Majdabadi, Mohammad Mahdi Kazemi
AU - Nasirigerdeh, Reza
AU - Torkzadehmahani, Reihaneh
AU - Hartebrodt, Anne
AU - Orban, Balazs-Attila
AU - Fejer, Sandor-Jozsef
AU - Zolotareva, Olga
AU - Das, Supratim
AU - Baumbach, Linda
AU - Pauling, Josch K.
AU - Tomasevic, Olivera
AU - Bihari, Bela
AU - Bloice, Marcus
AU - Donner, Nina C.
AU - Fdhila, Walid
AU - Frisch, Tobias
AU - Hauschild, Anne-Christin
AU - Heider, Dominik
AU - Holzinger, Andreas
AU - Hoetzendorfer, Walter
AU - Hospes, Jan
AU - Kacprowski, Tim
AU - Kastelitz, Markus
AU - List, Markus
AU - Mayer, Rudolf
AU - Moga, Monika
AU - Mueller, Heimo
AU - Pustozerova, Anastasia
AU - Roettger, Richard
AU - Saak, Christina C.
AU - Saranti, Anna
AU - Schmidt, Herald H. H. W.
AU - Tschohl, Christof
AU - Wenke, Nina K.
AU - Baumbach, Jan
PY - 2023/7/12
Y1 - 2023/7/12
N2 - Background: Machine learning and artificial intelligence have shown promising results in many areas and are driven by the increasing amount of available data. However, these data are often distributed across different institutions and cannot be easily shared owing to strict privacy regulations. Federated learning (FL) allows the training of distributed machine learning models without sharing sensitive data. However, its implementation is time-consuming and requires advanced programming skills and complex technical infrastructures. Objective: Various tools and frameworks have been developed to simplify the development of FL algorithms and provide the necessary technical infrastructure. Although there are many high-quality frameworks, most focus only on a single application case or method. To our knowledge, there are no generic frameworks, meaning that the existing solutions are restricted to a particular type of algorithm or application field. Furthermore, most of these frameworks provide an application programming interface that requires programming knowledge. There is no collection of ready-to-use FL algorithms that are extendable and allow users (eg, researchers) without programming knowledge to apply FL. A central FL platform for both FL algorithm developers and users does not exist. This study aimed to address this gap and make FL available to everyone by developing FeatureCloud, an all-in-one platform for FL in biomedicine and beyond. Methods: The FeatureCloud platform consists of 3 main components: a global frontend, a global backend, and a local controller. Our platform uses Docker to separate the locally acting components of the platform from the sensitive data systems. We evaluated our platform using 4 different algorithms on 5 data sets for both accuracy and runtime. Results: FeatureCloud removes the complexity of distributed systems for developers and end users by providing a comprehensive platform for executing multi-institutional FL analyses and implementing FL algorithms. Through its integrated artificial intelligence store, federated algorithms can easily be published and reused by the community. To protect sensitive raw data, FeatureCloud supports privacy-enhancing technologies that secure the shared local models and ensures high standards of data privacy to comply with the strict General Data Protection Regulation. Our evaluation shows that applications developed in FeatureCloud can produce results highly similar to those of centralized approaches and scale well for an increasing number of participating sites. Conclusions: FeatureCloud provides a ready-to-use platform that integrates the development and execution of FL algorithms while reducing the complexity to a minimum and removing the hurdles of federated infrastructure. Thus, we believe that it has the potential to greatly increase the accessibility of privacy-preserving and distributed data analyses in biomedicine and beyond.
KW - privacy-preserving machine learning
KW - federated learning
KW - interactive platform
KW - artificial intelligence
KW - AI store
KW - privacy-enhancing technologies
KW - additive secret sharing
KW - ARTIFICIAL-INTELLIGENCE
U2 - 10.2196/42621
DO - 10.2196/42621
M3 - Article
C2 - 37436815
SN - 1439-4456
VL - 25
JO - Journal of Medical Internet Research
JF - Journal of Medical Internet Research
IS - 1
M1 - e42621
ER -