Federated learning: an introduction

Anastasia Shteyn, Konrad Kollnig, Calum Inverarity

Research output: Book/Report > Report > Professional

Abstract

After decades of increased data collection, sharing and use, which have driven the emergence and development of new industries, public sentiment has been shifting towards demands for greater data privacy. At the same time, concerns around data privacy, commercial sensitivity and security have contributed to a reluctance to share data that might otherwise deliver significant social, economic and environmental benefits. Privacy enhancing technologies (PETs) offer a potential means to facilitate greater sharing of sensitive data and to protect individuals’ dignity, autonomy and fundamental rights, including data protection and privacy. Federated learning is one such technology that is approaching relative maturity, in terms of both awareness and practical application. It can be used to train machine learning (ML) models in a distributed manner, whilst keeping raw sensitive data safe in its original locations. This report is the culmination of research undertaken by the Open Data Institute (ODI) between April 2022 and January 2023, supported by the Rockefeller Foundation. In this report, we provide a comprehensive account of federated learning. We cover its primary distinguishing characteristics and the promise that it holds both for commercial use cases (within a single company or for collaboration across multiple companies) and for organisations interested in using federated learning for public, charitable and educational purposes. Though federated learning has shown considerable promise in areas such as health research, finance, Industrial Internet of Things (IIoT) and consumer applications, there remain relatively few examples of end-to-end implementations to date, although many pilot projects and initiatives are ongoing. Through our research, we found that privacy and confidentiality are not the most compelling benefits of this technology when it is deployed in isolation (that is, without additional privacy measures). Instead, the primary drivers for federated learning adoption are often scalability, improved resource utilisation and model performance improvements. Later in the report, we consider four key dimensions that may help determine the complexity and rigour of federated learning governance: the number of organisations involved, the level of trust between them, the design of the federated architecture, and data sensitivity. The final section is dedicated to practical guidance for organisations, in the form of a summary of proposed steps for approaching, experimenting with and deploying federated learning.
Original language: English
Place of Publication: London
Publisher: Open Data Institute
Number of pages: 31
Publication status: Published - 25 Jan 2023
