External validation and transfer learning of convolutional neural networks for computed tomography dental artifact classification

Mattea L. Welch*, Chris McIntosh, Alberto Traverso, Leonard Wee, Tom G. Purdie, Andre Dekker, Benjamin Haibe-Kains, David A. Jaffray

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Abstract

Quality assurance of data prior to use in automated pipelines and image analysis would help safeguard against biases and incorrect interpretation of results. Automating quality assurance steps would further improve the robustness and efficiency of these methods, motivating widespread adoption. Previous work by our group demonstrated the ability of convolutional neural networks (CNNs) to efficiently classify head and neck (H&N) computed tomography (CT) images for the presence of dental artifacts (DA), which obscure visualization of structures and degrade the accuracy of Hounsfield units. In this work we demonstrate the generalizability of our previous methodology by validating CNNs on six external datasets, and we assess the potential benefits of transfer learning with fine-tuning on CNN performance. A total of 2112 H&N CT images from seven institutions were scored as DA positive or negative. Of these, 1538 images from a single institution were used to train three CNNs with resampling grid sizes of 64³, 128³ and 256³. The remaining six external datasets were used in five-fold cross-validation with a data split of 20% training/fine-tuning and 80% validation. The three pre-trained models were each validated using the five folds of the six external datasets. The pre-trained models also underwent transfer learning with fine-tuning using the 20% training/fine-tuning data and were validated using the corresponding validation datasets. The highest micro-averaged AUC for our pre-trained models across all external datasets occurred with a resampling grid of 256³ (AUC = 0.91 ± 0.01). Transfer learning with fine-tuning improved generalizability when utilizing a resampling grid of 256³, to a micro-averaged AUC of 0.92 ± 0.01. Despite these promising results, transfer learning did not improve AUC when utilizing small resampling grids or small datasets. Our work demonstrates the potential of our previously developed automated quality assurance methods to generalize to external datasets. Additionally, we showed that transfer learning with fine-tuning using small portions of external datasets can be used to fine-tune models for improved performance when large variations in images are present.
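The evaluation scheme in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The function names (inverted_five_fold, auc, micro_averaged_auc) are hypothetical. Reading "micro-averaged AUC" as one AUC computed over predictions pooled from all datasets is an assumption that the abstract does not confirm. The split is "inverted" relative to ordinary cross-validation: each fold, about 20% of a dataset, is used once for fine-tuning, and the remaining 80% is used for validation.

```python
import random


def inverted_five_fold(n_items, n_folds=5, seed=0):
    """Yield (fine_tune_idx, validation_idx) pairs.

    Each fold (about 1/n_folds of the data) is used once for fine-tuning;
    all remaining items form the validation set for that round.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        fine_tune = sorted(folds[k])
        validation = sorted(
            i for j, fold in enumerate(folds) if j != k for i in fold
        )
        yield fine_tune, validation


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def micro_averaged_auc(per_dataset):
    """Assumed definition: pool (labels, scores) from every dataset,
    then compute a single AUC over the pooled predictions."""
    labels = [y for ys, _ in per_dataset for y in ys]
    scores = [s for _, ss in per_dataset for s in ss]
    return auc(labels, scores)
```

Under this pooled definition, a model can rank images perfectly within each dataset and still score below 1.0 overall if its score scale shifts between institutions, which is one reason pooling is a demanding test of generalizability.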

Original language: English
Article number: 035017
Number of pages: 11
Journal: Physics in Medicine and Biology
Volume: 65
Issue number: 3
Publication status: Published - Feb 2020

Keywords

  • CANCER
  • HEAD
  • RADIATION-THERAPY
  • RADIOTHERAPY
  • REDUCTION
  • SEGMENTATION
  • computed tomography
  • deep learning
  • dental artifacts
  • external validation
  • quality assurance
  • RADIOMICS
