Transformer-based biomarker prediction from colorectal cancer histology: A large-scale multicentric study

Sophia J Wagner, Daniel Reisenbüchler, Nicholas P West, Jan Moritz Niehues, Jiefu Zhu, Sebastian Foersch, Gregory Patrick Veldhuizen, Philip Quirke, Heike I Grabsch, Piet A van den Brandt, Gordon G A Hutchins, Susan D Richman, Tanwei Yuan, Rupert Langer, Josien C A Jenniskens, Kelly Offermans, Wolfram Mueller, Richard Gray, Stephen B Gruber, Joel K Greenson, Gad Rennert, Joseph D Bonner, Daniel Schmolze, Jitendra Jonnagaddala, Nicholas J Hawkins, Robyn L Ward, Dion Morton, Matthew Seymour, Laura Magill, Marta Nowak, Jennifer Hay, Viktor H Koelzer, David N Church, Christian Matek, Carol Geppert, Chaolong Peng, Cheng Zhi, Xiaoming Ouyang, Jacqueline A James, Maurice B Loughrey, Manuel Salto-Tellez, Hermann Brenner, Michael Hoffmeister, Daniel Truhn, Julia A Schnabel, Melanie Boxberg, Tingying Peng*, Jakob Nikolas Kather*, TransSCOT consortium

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Deep learning (DL) can accelerate the prediction of prognostic biomarkers from routine pathology slides in colorectal cancer (CRC). However, current approaches rely on convolutional neural networks (CNNs) and have mostly been validated on small patient cohorts. Here, we develop a new transformer-based pipeline for end-to-end biomarker prediction from pathology slides by combining a pre-trained transformer encoder with a transformer network for patch aggregation. Our transformer-based approach substantially improves the performance, generalizability, data efficiency, and interpretability as compared with current state-of-the-art algorithms. After training and evaluating on a large multicenter cohort of over 13,000 patients from 16 colorectal cancer cohorts, we achieve a sensitivity of 0.99 with a negative predictive value of over 0.99 for prediction of microsatellite instability (MSI) on surgical resection specimens. We demonstrate that resection specimen-only training reaches clinical-grade performance on endoscopic biopsy tissue, solving a long-standing diagnostic problem.
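For readers less familiar with the two metrics quoted in the abstract, the following is a minimal sketch of how sensitivity and negative predictive value (NPV) are computed from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def sensitivity_and_npv(tp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (sensitivity, NPV) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of truly positive (e.g. MSI) cases detected.
    NPV         = TN / (TN + FN): fraction of negative predictions that are correct.
    """
    if tp + fn == 0 or tn + fn == 0:
        raise ValueError("metrics are undefined when a denominator is zero")
    return tp / (tp + fn), tn / (tn + fn)


# Hypothetical counts for illustration only (not data from the paper).
sens, npv = sensitivity_and_npv(tp=99, fn=1, tn=500)
print(f"sensitivity={sens:.3f}, NPV={npv:.3f}")
```

Both metrics penalize false negatives, so a classifier tuned for high sensitivity and high NPV is suited to ruling out a biomarker; specificity is not constrained by either number.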
Original language: English
Pages (from-to): 1650-1661.e4
Number of pages: 17
Journal: Cancer Cell
Volume: 41
Issue number: 9
DOIs
Publication status: Published - 11 Sept 2023

Keywords

  • artificial intelligence
  • biomarker
  • colorectal cancer
  • deep learning
  • microsatellite instability
  • multiple instance learning
  • transformer
