A Quantitative Comparison of Epistemic Uncertainty Maps Applied to Multi-Class Segmentation

Robin Camarasa; Daniel Bos; Jeroen Hendrikse; Paul J. Nederkoorn; Eline Kooi; Aad van der Lugt; M. de Bruijne

doi:10.59275/j.melba.2021-ec49

Research output: Contribution to journal › Article › Academic

Abstract

Uncertainty assessment has gained rapid interest in medical image analysis. A popular technique to compute epistemic uncertainty is the Monte-Carlo (MC) dropout technique. From a network with MC dropout and a single input, multiple outputs can be sampled. Various methods can be used to obtain epistemic uncertainty maps from those multiple outputs. In the case of multi-class segmentation, the number of methods is even larger as epistemic uncertainty can be computed voxelwise per class or voxelwise per image. This paper highlights a systematic approach to define and quantitatively compare those methods in two different contexts: class-specific epistemic uncertainty maps (one value per image, voxel and class) and combined epistemic uncertainty maps (one value per image and voxel). We applied this quantitative analysis to a multi-class segmentation of the carotid artery lumen and vessel wall, on a multi-center, multi-scanner, multi-sequence dataset of Magnetic Resonance (MR) images. We validated our analysis over 144 sets of hyperparameters of a model. Our main analysis considers the relationship between the order of the voxels sorted according to their epistemic uncertainty values and the misclassification of the prediction. Under this consideration, the comparison of combined uncertainty maps reveals that the multi-class entropy and the multi-class mutual information statistically outperform the other combined uncertainty maps under study (the averaged entropy, the averaged variance, the similarity Bhattacharyya coefficient and the similarity Kullback-Leibler divergence). In a class-specific scenario, the one-versus-all entropy statistically outperforms the class-wise entropy, the class-wise variance and the one-versus-all mutual information. The class-wise entropy statistically outperforms the other class-specific uncertainty maps in terms of calibration. We made a Python package available to reproduce our analysis on different data and tasks.
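
As an illustration of two of the combined maps named in the abstract, the sketch below computes a multi-class entropy map and a multi-class mutual information map from a stack of MC dropout outputs. It assumes the usual textbook definitions (entropy of the mean prediction, and that entropy minus the mean per-sample entropy), which may differ in detail from the paper's. The function names, array layout, and synthetic data are hypothetical and are not taken from the authors' Python package.

```python
import numpy as np


def multiclass_entropy(samples, eps=1e-12):
    """Entropy of the mean predictive distribution, one value per voxel.

    samples: array of shape (T, C, ...) holding T sampled softmax outputs
    over C classes (assumed layout, not the authors' API).
    """
    mean_probs = samples.mean(axis=0)  # (C, ...)
    return -(mean_probs * np.log(mean_probs + eps)).sum(axis=0)


def multiclass_mutual_information(samples, eps=1e-12):
    """Entropy of the mean prediction minus the mean per-sample entropy."""
    per_sample_entropy = -(samples * np.log(samples + eps)).sum(axis=1)  # (T, ...)
    return multiclass_entropy(samples, eps) - per_sample_entropy.mean(axis=0)


# Synthetic stand-in for MC dropout outputs: 20 samples, 3 classes, 4x4 image.
rng = np.random.default_rng(0)
logits = rng.normal(size=(20, 3, 4, 4))
samples = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

entropy_map = multiclass_entropy(samples)
mi_map = multiclass_mutual_information(samples)
print(entropy_map.shape, mi_map.shape)
```

Both maps hold one value per voxel, matching the "combined" setting described above. Under these definitions the mutual information is zero when all sampled outputs agree and grows as the samples disagree.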

Original language	English
Pages (from-to)	1-39
Number of pages	39
Journal	The Journal of Machine Learning for Biomedical Imaging
Volume	013
DOIs	https://doi.org/10.59275/j.melba.2021-ec49
Publication status	Published - Sept 2021

Access to Document

10.59275/j.melba.2021-ec49

https://www.melba-journal.org/pdf/2021:013.pdf
Licence: CC BY

Cite this

@article{5a94d70a44e5439cae6a050328ac2d6c,

title = "A Quantitative Comparison of Epistemic Uncertainty Maps Applied to Multi-Class Segmentation",

abstract = "Uncertainty assessment has gained rapid interest in medical image analysis. A popular technique to compute epistemic uncertainty is the Monte-Carlo (MC) dropout technique. From a network with MC dropout and a single input, multiple outputs can be sampled. Various methods can be used to obtain epistemic uncertainty maps from those multiple outputs. In the case of multi-class segmentation, the number of methods is even larger as epistemic uncertainty can be computed voxelwise per class or voxelwise per image. This paper highlights a systematic approach to define and quantitatively compare those methods in two different contexts: class-specific epistemic uncertainty maps (one value per image, voxel and class) and combined epistemic uncertainty maps (one value per image and voxel). We applied this quantitative analysis to a multi-class segmentation of the carotid artery lumen and vessel wall, on a multi-center, multi-scanner, multi-sequence dataset of Magnetic Resonance (MR) images. We validated our analysis over 144 sets of hyperparameters of a model. Our main analysis considers the relationship between the order of the voxels sorted according to their epistemic uncertainty values and the misclassification of the prediction. Under this consideration, the comparison of combined uncertainty maps reveals that the multi-class entropy and the multi-class mutual information statistically outperform the other combined uncertainty maps under study (the averaged entropy, the averaged variance, the similarity Bhattacharyya coefficient and the similarity Kullback-Leibler divergence). In a class-specific scenario, the one-versus-all entropy statistically outperforms the class-wise entropy, the class-wise variance and the one-versus-all mutual information. The class-wise entropy statistically outperforms the other class-specific uncertainty maps in terms of calibration. We made a Python package available to reproduce our analysis on different data and tasks.",

author = "Robin Camarasa and Daniel Bos and Jeroen Hendrikse and Nederkoorn, {Paul J.} and Eline Kooi and {van der Lugt}, Aad and {de Bruijne}, M.",

year = "2021",

month = sep,

doi = "10.59275/j.melba.2021-ec49",

language = "English",

volume = "013",

pages = "1--39",

journal = "The Journal of Machine Learning for Biomedical Imaging",

issn = "2766-905X",

publisher = "Melba editors",

}

TY - JOUR

T1 - A Quantitative Comparison of Epistemic Uncertainty Maps Applied to Multi-Class Segmentation

AU - Camarasa, Robin

AU - Bos, Daniel

AU - Hendrikse, Jeroen

AU - Nederkoorn, Paul J.

AU - Kooi, Eline

AU - van der Lugt, Aad

AU - de Bruijne, M.

PY - 2021/9

Y1 - 2021/9

N2 - Uncertainty assessment has gained rapid interest in medical image analysis. A popular technique to compute epistemic uncertainty is the Monte-Carlo (MC) dropout technique. From a network with MC dropout and a single input, multiple outputs can be sampled. Various methods can be used to obtain epistemic uncertainty maps from those multiple outputs. In the case of multi-class segmentation, the number of methods is even larger as epistemic uncertainty can be computed voxelwise per class or voxelwise per image. This paper highlights a systematic approach to define and quantitatively compare those methods in two different contexts: class-specific epistemic uncertainty maps (one value per image, voxel and class) and combined epistemic uncertainty maps (one value per image and voxel). We applied this quantitative analysis to a multi-class segmentation of the carotid artery lumen and vessel wall, on a multi-center, multi-scanner, multi-sequence dataset of Magnetic Resonance (MR) images. We validated our analysis over 144 sets of hyperparameters of a model. Our main analysis considers the relationship between the order of the voxels sorted according to their epistemic uncertainty values and the misclassification of the prediction. Under this consideration, the comparison of combined uncertainty maps reveals that the multi-class entropy and the multi-class mutual information statistically outperform the other combined uncertainty maps under study (the averaged entropy, the averaged variance, the similarity Bhattacharyya coefficient and the similarity Kullback-Leibler divergence). In a class-specific scenario, the one-versus-all entropy statistically outperforms the class-wise entropy, the class-wise variance and the one-versus-all mutual information. The class-wise entropy statistically outperforms the other class-specific uncertainty maps in terms of calibration. We made a Python package available to reproduce our analysis on different data and tasks.

U2 - 10.59275/j.melba.2021-ec49

DO - 10.59275/j.melba.2021-ec49

M3 - Article

SN - 2766-905X

VL - 013

SP - 1

EP - 39

JO - The Journal of Machine Learning for Biomedical Imaging

JF - The Journal of Machine Learning for Biomedical Imaging

ER -