An international multi-institutional validation study of the algorithm for prostate cancer detection and Gleason grading

Yuri Tolkach*, Vlado Ovtcharov, Alexey Pryalukhin, Marie-Lisa Eich, Nadine Therese Gaisa, Martin Braun, Abdukhamid Radzhabov, Alexander Quaas, Peter Hammerer, Ansgar Dellmann, Wolfgang Hulla, Michael C Haffner, Henning Reis, Ibrahim Fahoum, Iryna Samarska, Artem Borbat, Hoa Pham, Axel Heidenreich, Sebastian Klein, George Netto, Peter Caie, Reinhard Buettner

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Pathologic examination of prostate biopsies is time-consuming due to the large number of slides per case. In this retrospective study, we validate a deep learning-based classifier for prostate cancer (PCA) detection and Gleason grading (AI tool) in biopsy samples. Five external cohorts of patients with multifocal prostate biopsy were analyzed from high-volume pathology institutes. A total of 5922 H&E sections representing 7473 biopsy cores from 423 patient cases (digitized using three scanners) were assessed concerning tumor detection. Two tumor-bearing datasets (core n = 227 and 159) were graded by an international group of pathologists including expert urologic pathologists (n = 11) to validate the Gleason grading classifier. The sensitivity, specificity, and negative predictive value (NPV) for the detection of tumor-bearing biopsies were in the ranges of 0.971-1.000, 0.875-0.976, and 0.988-1.000, respectively, across the different test cohorts. In several biopsy slides, the AI tool correctly detected tumor tissue that was initially missed by pathologists. Most false positive misclassifications represented lesions suspicious for carcinoma or cancer mimickers. The quadratically weighted kappa levels for Gleason grading agreement for single pathologists were 0.62-0.80 (0.77 for AI tool) and 0.64-0.76 (0.72 for AI tool) for the two grading datasets, respectively. In cases where consensus for grading was reached among pathologists, kappa levels for AI tool were 0.903 and 0.855. The PCA detection classifier showed high accuracy for PCA detection in biopsy cases during external validation, independent of the institute and scanner used. Levels of agreement for Gleason grading were high and indistinguishable between experienced genitourinary pathologists and the AI tool.
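
For readers unfamiliar with the metrics reported in the abstract, the following is a minimal sketch of how sensitivity, specificity, NPV, and the quadratically weighted kappa are conventionally defined. It is not the study's code. The function names, the 0/1 label encoding, and the encoding of grades as integers 0 to k-1 are assumptions made purely for illustration.

```python
def detection_metrics(y_true, y_pred):
    """Sensitivity, specificity and NPV for binary labels.

    Assumed encoding (hypothetical): 1 = tumor-bearing, 0 = benign.
    Assumes both classes occur and at least one negative prediction exists,
    so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # share of tumor-bearing items detected
        "specificity": tn / (tn + fp),  # share of benign items called benign
        "npv": tn / (tn + fn),          # share of negative calls that are correct
    }


def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with quadratic disagreement weights.

    Ratings are assumed to be integers in 0..n_categories-1 (for example,
    five grade groups encoded as 0..4; this encoding is an assumption).
    """
    k = n_categories
    n = len(rater_a)

    # Observed confusion matrix between the two raters.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    # Marginal rating counts for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # penalty grows with distance
            expected = hist_a[i] * hist_b[j] / n  # count expected by chance
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    return 1.0 - weighted_observed / weighted_expected
```

The quadratic weights penalize a disagreement by two grade levels four times as heavily as a disagreement by one, which is why this variant is commonly preferred for ordinal scales such as grading.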
Original language: English
Article number: 77
Number of pages: 9
Journal: npj Precision Oncology
Volume: 7
Issue number: 1
Publication status: Published - 15 Aug 2023
