Enrichment of lung cancer computed tomography collections with AI-derived annotations

Deepa Krishnaswamy*, Dennis Bontempi, Vamsi Krishna Thiriveedhi, Davide Punzo, David Clunie, Christopher P. Bridge, Hugo J.W.L. Aerts, Ron Kikinis, Andrey Fedorov

*Corresponding author for this work

Research output: Contribution to journal, Article, Academic, peer-review

Abstract

Public imaging datasets are critical for the development and evaluation of automated tools in cancer imaging. Unfortunately, many do not include annotations or image-derived features, complicating downstream analysis. Artificial intelligence-based annotation tools have been shown to achieve acceptable performance and can be used to automatically annotate large datasets. As part of the effort to enrich public data available within NCI Imaging Data Commons (IDC), here we introduce AI-generated annotations for two collections containing computed tomography images of the chest: NSCLC-Radiomics and a subset of the National Lung Screening Trial. Using publicly available AI algorithms, we derived volumetric annotations of thoracic organs-at-risk, their corresponding radiomics features, and slice-level annotations of anatomical landmarks and regions. The resulting annotations are publicly available within IDC, where the DICOM format is used to harmonize the data and adhere to the FAIR (Findable, Accessible, Interoperable, Reusable) data principles. The annotations are accompanied by cloud-enabled notebooks demonstrating their use. This study reinforces the need for large, publicly accessible curated datasets and demonstrates how AI can aid in cancer imaging.
Original language: English
Article number: 25
Number of pages: 15
Journal: Scientific Data
Volume: 11
Issue number: 1
Publication status: Published - 2024
