Distributed learning to protect privacy in multi-centric clinical studies

Andrea Damiani, Mauro Vallati, Roberto Gatta*, Nicola Dinapoli, Arthur Jochems, Timo Deist, Johan van Soest, Andre Dekker, Vincenzo Valentini

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic

Abstract

Research in medicine must cope with the growing amount of patient data made available by modern technologies. These data can support statistical studies and help identify causal relations. Because the data are spread across hospitals, exploiting them requires efficient merging techniques as well as policies for handling such sensitive information. In this paper we introduce and empirically test a distributed learning approach for training Support Vector Machines (SVMs) that overcomes the problems posed by privacy requirements and geographically dispersed data. The proposed technique trains the model without sharing any patient-related information, thereby preserving privacy and removing the need to develop merging tools. We tested this approach on a large dataset and report results in terms of convergence and performance; we also discuss the features of an IT architecture designed to support distributed learning computations.
Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 65-75
Number of pages: 11
ISBN (Print): 9783319195506
DOIs
Publication status: Published - 2015

Publication series

Series: Lecture Notes in Computer Science
Volume: 9105

Keywords

  • Distributed learning
  • Multi-centric clinical studies
  • Patient privacy preserving
