Development and validation of deep learning classifiers to detect Epstein-Barr virus and microsatellite instability status in gastric cancer: a retrospective multicentre cohort study

H.S. Muti, L.R. Heij, G. Keller, M. Kohlruss, R. Langer, B. Dislich, J.H. Cheong, Y.W. Kim, H. Kim, M.C. Kook, D. Cunningham, W.H. Allum, R.E. Langley, M.G. Nankivell, P. Quirke, J.D. Hayden, N.P. West, A.J. Irvine, T. Yoshikawa, T. Oshima, R. Huss, B. Grosser, F. Roviello, A. d'Ignazio, A. Quaas, H. Alakus, X.X. Tan, A.T. Pearson, T. Luedde, M.P. Ebert, D. Jager, C. Trautwein, N.T. Gaisa, H.I. Grabsch, J.N. Kather*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

22 Citations (Web of Science)


Background: Response to immunotherapy in gastric cancer is associated with microsatellite instability (or mismatch repair deficiency) and Epstein-Barr virus (EBV) positivity. We therefore aimed to develop and validate deep learning-based classifiers to detect microsatellite instability and EBV status from routine histology slides.

Methods: In this retrospective, multicentre study, we collected tissue samples from ten cohorts of patients with gastric cancer from seven countries (South Korea, Switzerland, Japan, Italy, Germany, the UK, and the USA). We trained a deep learning-based classifier to detect microsatellite instability and EBV positivity from digitised, haematoxylin and eosin-stained resection slides without annotating tumour-containing regions. The performance of the classifier was assessed by within-cohort cross-validation in all ten cohorts and by external validation, for which we split the cohorts into a five-cohort training dataset and a five-cohort test dataset. We measured the area under the receiver operating characteristic curve (AUROC) for detection of microsatellite instability and EBV status. Microsatellite instability and EBV status were determined to be detectable if the lower bound of the 95% CI for the AUROC was above 0·5.

Findings: Across the ten cohorts, our analysis included 2823 patients with known microsatellite instability status and 2685 patients with known EBV status. In the within-cohort cross-validation, the deep learning-based classifier could detect microsatellite instability status in nine of ten cohorts, with AUROCs ranging from 0·597 (95% CI 0·522-0·737) to 0·836 (0·795-0·880), and EBV status in five of eight cohorts, with AUROCs ranging from 0·819 (0·752-0·841) to 0·897 (0·513-0·966). Training a classifier on the pooled training dataset and testing it on the five remaining cohorts resulted in high classification performance, with AUROCs ranging from 0·723 (95% CI 0·676-0·794) to 0·863 (0·747-0·969) for detection of microsatellite instability and from 0·672 (0·403-0·989) to 0·859 (0·823-0·919) for detection of EBV status.

Interpretation: Classifiers became increasingly robust when trained on pooled cohorts. After prospective validation, this deep learning-based tissue classification system could be used as an inexpensive predictive biomarker for immunotherapy in gastric cancer.

Funding: German Cancer Aid and German Federal Ministry of Health.

Copyright © 2021 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.
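The detectability criterion stated in the abstract (lower bound of the 95% CI for the AUROC above 0·5) can be illustrated with a minimal sketch. The abstract does not say how the confidence intervals were computed; the percentile bootstrap over patients used here is an assumption, and the function names (auroc, bootstrap_ci, is_detectable) are hypothetical and do not come from the study's code.

```python
import numpy as np


def auroc(labels, scores):
    """AUROC via the Mann-Whitney statistic; tied scores count as half."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative case")
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (assumed method, see lead-in)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    if labels.all() or not labels.any():
        raise ValueError("need at least one positive and one negative case")
    rng = np.random.default_rng(seed)
    n = labels.size
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        if labels[idx].all() or not labels[idx].any():
            continue  # AUROC is undefined when a resample has one class only
        stats.append(auroc(labels[idx], scores[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


def is_detectable(labels, scores, **kwargs):
    """Apply the abstract's criterion: lower 95% CI bound strictly above 0.5."""
    lower, _ = bootstrap_ci(labels, scores, **kwargs)
    return lower > 0.5
```

Here, labels would hold the ground-truth status per patient (1 for microsatellite-unstable or EBV-positive) and scores the classifier's patient-level output. A wide interval such as the 0·403-0·989 reported for one EBV test cohort fails this criterion even though its point estimate is well above 0·5.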
Original language: English
Pages (from-to): E654-E664
Number of pages: 11
Journal: The Lancet Digital Health
Issue number: 10
Publication status: Published - 1 Oct 2021


