Background and purpose: Predicting outcomes is challenging in rare cancers. Single-institutional datasets are often small and multi-institutional data sharing is complex. Distributed learning allows machine learning models to use data from multiple institutions without exchanging individual patient-level data. We demonstrate this technique in a proof-of-concept study of anal cancer patients treated with chemoradiotherapy across multiple European countries.
Materials and methods: atomCAT is a three-centre collaboration between Leeds Cancer Centre (UK), MAASTRO Clinic (The Netherlands) and Oslo University Hospital (Norway). We trained and validated a Cox proportional hazards regression model in a distributed fashion using data from 281 patients treated with radical, conformal chemoradiotherapy for anal cancer in three institutions. Our primary endpoint was overall survival. We selected disease stage, sex, age, primary tumour size, and planned radiotherapy dose (in EQD2) a priori as predictor variables.
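As an illustration of how a Cox model can be fitted without pooling patient-level data, the following minimal sketch has each centre compute the gradient and Hessian of its own partial log-likelihood, while a coordinator sums these aggregates and takes Newton steps. It assumes a centre-stratified partial likelihood (each centre keeps its own baseline hazard), which is a simplification and not necessarily the algorithm used in atomCAT. The function names `local_cox_stats` and `fit_distributed` are hypothetical.

```python
import numpy as np


def local_cox_stats(X, time, event, beta):
    """Run inside one centre: gradient and Hessian of that centre's
    Cox partial log-likelihood (Breslow handling of tied event times).
    Only these aggregates leave the centre, never patient rows."""
    X = np.asarray(X, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    p = X.shape[1]
    grad = np.zeros(p)
    hess = np.zeros((p, p))
    w = np.exp(X @ beta)  # relative hazard of each patient
    for i in np.flatnonzero(event):
        at_risk = time >= time[i]  # risk set at this event time
        wr, Xr = w[at_risk], X[at_risk]
        s0 = wr.sum()
        mean = (wr @ Xr) / s0  # hazard-weighted covariate mean
        second = (Xr * wr[:, None]).T @ Xr / s0
        grad += X[i] - mean
        hess -= second - np.outer(mean, mean)
    return grad, hess


def fit_distributed(sites, n_features, n_iter=25, tol=1e-8):
    """Run at the coordinator: sum per-centre aggregates and apply
    Newton-Raphson updates to the shared coefficient vector.
    `sites` stands in for remote calls; here it is a list of
    (X, time, event) tuples, one per centre (an assumption)."""
    beta = np.zeros(n_features)
    for _ in range(n_iter):
        stats = [local_cox_stats(X, t, e, beta) for X, t, e in sites]
        grad = sum(g for g, _ in stats)
        hess = sum(h for _, h in stats)
        step = np.linalg.solve(hess, grad)
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

In a real deployment the list comprehension over `sites` would be replaced by requests to each institution, and `np.exp(beta)` would give the hazard ratio per unit of each predictor.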
Results: The Cox regression model trained across all three centres found worse overall survival for high-risk disease stage (HR = 2.02), male sex (HR = 3.06), older age (HR = 1.33 per 10 years), larger primary tumour volume (HR = 1.05 per 10 cm³) and lower radiotherapy dose (HR = 1.20 per 5 Gy). A mean concordance index of 0.72 was achieved during validation, with limited variation between centres (Leeds = 0.72, MAASTRO = 0.74, Oslo = 0.70). The global model performed well for risk stratification for two out of three centres.
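To make the reported quantities concrete, the sketch below shows one common way to compute a concordance index (Harrell's C) and to rescale a hazard ratio quoted per fixed increment. It is an illustrative assumption that the study's concordance index follows this definition; the helper names `concordance_index` and `hr_for_difference` are hypothetical, and pairs with tied survival times are simply skipped.

```python
def concordance_index(time, event, risk):
    """Harrell's C: among comparable pairs, the fraction in which the
    patient with the shorter observed survival has the higher predicted
    risk. A pair is comparable when the earlier time is an observed
    event. Tied risk scores count as half concordant."""
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored patient cannot be the earlier member
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def hr_for_difference(hr, per_unit, difference):
    """Rescale a hazard ratio reported per `per_unit` of a continuous
    predictor to another difference, assuming the log-linear effect
    of a Cox model."""
    return hr ** (difference / per_unit)
```

For example, `hr_for_difference(1.33, 10, 20)` rescales the reported age effect to a 20-year difference, which equals 1.33 squared under the log-linear assumption. A concordance index of 0.5 corresponds to random ranking and 1.0 to perfect ranking.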
Conclusions: Using distributed learning, we accessed and analysed one of the largest available multi-institutional cohorts of anal cancer patients treated with modern radiotherapy techniques. This demonstrates the value of distributed learning in outcome modelling for rare cancers. © 2021 Elsevier B.V. All rights reserved.
- Anal cancer
- Squamous cell carcinoma
- Distributed learning
- Outcome modelling
- Overall survival