External validation and transfer learning of convolutional neural networks for computed tomography dental artifact classification

Mattea L. Welch; Chris McIntosh; Alberto Traverso; Leonard Wee; Tom G. Purdie; Andre Dekker; Benjamin Haibe-Kains; David A. Jaffray

doi:10.1088/1361-6560/ab63ba

Mattea L. Welch^*, Chris McIntosh, Alberto Traverso, Leonard Wee, Tom G. Purdie, Andre Dekker, Benjamin Haibe-Kains, David A. Jaffray

^*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Quality assurance of data prior to use in automated pipelines and image analysis would assist in safeguarding against biases and incorrect interpretation of results. Automation of quality assurance steps would further improve robustness and efficiency of these methods, motivating widespread adoption of techniques. Previous work by our group demonstrated the ability of convolutional neural networks (CNN) to efficiently classify head and neck (H&N) computed-tomography (CT) images for the presence of dental artifacts (DA) that obscure visualization of structures and the accuracy of Hounsfield units. In this work we demonstrate the generalizability of our previous methodology by validating CNNs on six external datasets, and the potential benefits of transfer learning with fine-tuning on CNN performance. 2112 H&N CT images from seven institutions were scored as DA positive or negative. 1538 images from a single institution were used to train three CNNs with resampling grid sizes of 64³, 128³ and 256³. The remaining six external datasets were used in five-fold cross-validation with a data split of 20% training/fine-tuning and 80% validation. The three pre-trained models were each validated using the five folds of the six external datasets. The pre-trained models also underwent transfer learning with fine-tuning using the 20% training/fine-tuning data, and were validated using the corresponding validation datasets. The highest micro-averaged AUC for our pre-trained models across all external datasets occurred with a resampling grid of 256³ (AUC = 0.91 +/- 0.01). Transfer learning with fine-tuning improved generalizability when utilizing a resampling grid of 256³ to a micro-averaged AUC of 0.92 +/- 0.01. Despite these promising results, transfer learning did not improve AUC when utilizing small resampling grids or small datasets. Our work demonstrates the potential of our previously developed automated quality assurance methods to generalize to external datasets. Additionally, we showed that transfer learning with fine-tuning using small portions of external datasets can be used to fine-tune models for improved performance when large variations in images are present.
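
The evaluation protocol described in the abstract (five folds with 20% of each external dataset used for fine-tuning and 80% for validation, scored by a micro-averaged AUC) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names are hypothetical, and the sketch assumes that "micro-averaged" means pooling the predictions from all datasets before computing a single AUC, which the abstract does not spell out.

```python
import random


def inverted_five_fold(n_items, seed=0):
    """Yield (fine_tune, validation) index lists for five folds.

    Assumption: each fold uses one fifth of the data (~20%) for
    fine-tuning and the remaining four fifths (~80%) for validation.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::5] for k in range(5)]
    for k in range(5):
        fine_tune = sorted(folds[k])
        validation = sorted(
            i for j, fold in enumerate(folds) if j != k for i in fold
        )
        yield fine_tune, validation


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def micro_averaged_auc(per_dataset):
    """Pool (labels, scores) pairs from every dataset, then compute one AUC.

    Assumption: this pooling is what "micro-averaged" refers to here.
    """
    labels = [y for ys, _ in per_dataset for y in ys]
    scores = [s for _, ss in per_dataset for s in ss]
    return auc(labels, scores)
```

Pooling is stricter than averaging per-dataset AUCs, because a positive image from one institution must also outscore negative images from the others. For example, two toy datasets `([0, 1], [0.2, 0.3])` and `([0, 1], [0.6, 0.9])` each have a per-dataset AUC of 1.0, but their pooled AUC is 0.75.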

Original language	English
Article number	035017
Number of pages	11
Journal	Physics in Medicine and Biology
Volume	65
Issue number	3
DOIs	https://doi.org/10.1088/1361-6560/ab63ba
Publication status	Published - Feb 2020

Keywords

CANCER
HEAD
RADIATION-THERAPY
RADIOTHERAPY
REDUCTION
SEGMENTATION
computed tomography
deep learning
dental artifacts
external validation
quality assurance
RADIOMICS

Access to Document

10.1088/1361-6560/ab63ba

Full text: Final published version, 1.52 MB. Licence: Taverne

Cite this

@article{38c4b37d73904aa185aa64ff6fea22eb,

title = "External validation and transfer learning of convolutional neural networks for computed tomography dental artifact classification",

abstract = "Quality assurance of data prior to use in automated pipelines and image analysis would assist in safeguarding against biases and incorrect interpretation of results. Automation of quality assurance steps would further improve robustness and efficiency of these methods, motivating widespread adoption of techniques. Previous work by our group demonstrated the ability of convolutional neural networks (CNN) to efficiently classify head and neck (H\&N) computed-tomography (CT) images for the presence of dental artifacts (DA) that obscure visualization of structures and the accuracy of Hounsfield units. In this work we demonstrate the generalizability of our previous methodology by validating CNNs on six external datasets, and the potential benefits of transfer learning with fine-tuning on CNN performance. 2112 H\&N CT images from seven institutions were scored as DA positive or negative. 1538 images from a single institution were used to train three CNNs with resampling grid sizes of 64$^3$, 128$^3$ and 256$^3$. The remaining six external datasets were used in five-fold cross-validation with a data split of 20\% training/fine-tuning and 80\% validation. The three pre-trained models were each validated using the five folds of the six external datasets. The pre-trained models also underwent transfer learning with fine-tuning using the 20\% training/fine-tuning data, and were validated using the corresponding validation datasets. The highest micro-averaged AUC for our pre-trained models across all external datasets occurred with a resampling grid of 256$^3$ (AUC = 0.91 +/- 0.01). Transfer learning with fine-tuning improved generalizability when utilizing a resampling grid of 256$^3$ to a micro-averaged AUC of 0.92 +/- 0.01. Despite these promising results, transfer learning did not improve AUC when utilizing small resampling grids or small datasets. Our work demonstrates the potential of our previously developed automated quality assurance methods to generalize to external datasets. Additionally, we showed that transfer learning with fine-tuning using small portions of external datasets can be used to fine-tune models for improved performance when large variations in images are present.",

keywords = "CANCER, HEAD, RADIATION-THERAPY, RADIOTHERAPY, REDUCTION, SEGMENTATION, computed tomography, deep learning, dental artifacts, external validation, quality assurance, RADIOMICS",

author = "Welch, {Mattea L.} and Chris McIntosh and Alberto Traverso and Leonard Wee and Purdie, {Tom G.} and Andre Dekker and Benjamin Haibe-Kains and Jaffray, {David A.}",

year = "2020",

month = feb,

doi = "10.1088/1361-6560/ab63ba",

language = "English",

volume = "65",

journal = "Physics in Medicine and Biology",

issn = "0031-9155",

publisher = "IOP Publishing Ltd.",

number = "3",

}

TY - JOUR

T1 - External validation and transfer learning of convolutional neural networks for computed tomography dental artifact classification

AU - Welch, Mattea L.

AU - McIntosh, Chris

AU - Traverso, Alberto

AU - Wee, Leonard

AU - Purdie, Tom G.

AU - Dekker, Andre

AU - Haibe-Kains, Benjamin

AU - Jaffray, David A.

PY - 2020/2

Y1 - 2020/2

N2 - Quality assurance of data prior to use in automated pipelines and image analysis would assist in safeguarding against biases and incorrect interpretation of results. Automation of quality assurance steps would further improve robustness and efficiency of these methods, motivating widespread adoption of techniques. Previous work by our group demonstrated the ability of convolutional neural networks (CNN) to efficiently classify head and neck (H&N) computed-tomography (CT) images for the presence of dental artifacts (DA) that obscure visualization of structures and the accuracy of Hounsfield units. In this work we demonstrate the generalizability of our previous methodology by validating CNNs on six external datasets, and the potential benefits of transfer learning with fine-tuning on CNN performance. 2112 H&N CT images from seven institutions were scored as DA positive or negative. 1538 images from a single institution were used to train three CNNs with resampling grid sizes of 64³, 128³ and 256³. The remaining six external datasets were used in five-fold cross-validation with a data split of 20% training/fine-tuning and 80% validation. The three pre-trained models were each validated using the five folds of the six external datasets. The pre-trained models also underwent transfer learning with fine-tuning using the 20% training/fine-tuning data, and were validated using the corresponding validation datasets. The highest micro-averaged AUC for our pre-trained models across all external datasets occurred with a resampling grid of 256³ (AUC = 0.91 +/- 0.01). Transfer learning with fine-tuning improved generalizability when utilizing a resampling grid of 256³ to a micro-averaged AUC of 0.92 +/- 0.01. Despite these promising results, transfer learning did not improve AUC when utilizing small resampling grids or small datasets. Our work demonstrates the potential of our previously developed automated quality assurance methods to generalize to external datasets. Additionally, we showed that transfer learning with fine-tuning using small portions of external datasets can be used to fine-tune models for improved performance when large variations in images are present.

AB - Quality assurance of data prior to use in automated pipelines and image analysis would assist in safeguarding against biases and incorrect interpretation of results. Automation of quality assurance steps would further improve robustness and efficiency of these methods, motivating widespread adoption of techniques. Previous work by our group demonstrated the ability of convolutional neural networks (CNN) to efficiently classify head and neck (H&N) computed-tomography (CT) images for the presence of dental artifacts (DA) that obscure visualization of structures and the accuracy of Hounsfield units. In this work we demonstrate the generalizability of our previous methodology by validating CNNs on six external datasets, and the potential benefits of transfer learning with fine-tuning on CNN performance. 2112 H&N CT images from seven institutions were scored as DA positive or negative. 1538 images from a single institution were used to train three CNNs with resampling grid sizes of 64³, 128³ and 256³. The remaining six external datasets were used in five-fold cross-validation with a data split of 20% training/fine-tuning and 80% validation. The three pre-trained models were each validated using the five folds of the six external datasets. The pre-trained models also underwent transfer learning with fine-tuning using the 20% training/fine-tuning data, and were validated using the corresponding validation datasets. The highest micro-averaged AUC for our pre-trained models across all external datasets occurred with a resampling grid of 256³ (AUC = 0.91 +/- 0.01). Transfer learning with fine-tuning improved generalizability when utilizing a resampling grid of 256³ to a micro-averaged AUC of 0.92 +/- 0.01. Despite these promising results, transfer learning did not improve AUC when utilizing small resampling grids or small datasets. Our work demonstrates the potential of our previously developed automated quality assurance methods to generalize to external datasets. Additionally, we showed that transfer learning with fine-tuning using small portions of external datasets can be used to fine-tune models for improved performance when large variations in images are present.

KW - CANCER

KW - HEAD

KW - RADIATION-THERAPY

KW - RADIOTHERAPY

KW - REDUCTION

KW - SEGMENTATION

KW - computed tomography

KW - deep learning

KW - dental artifacts

KW - external validation

KW - quality assurance

KW - RADIOMICS

U2 - 10.1088/1361-6560/ab63ba

DO - 10.1088/1361-6560/ab63ba

M3 - Article

C2 - 31851961

SN - 0031-9155

VL - 65

JO - Physics in Medicine and Biology

JF - Physics in Medicine and Biology

IS - 3

M1 - 035017

ER -