Reinvestigating the performance of artificial intelligence classification algorithms on COVID-19 X-Ray and CT images

Rui Cao, Yanan Liu, Xin Wen, Caiqing Liao, Xin Wang, Yuan Gao, Tao Tan*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

There are concerns that artificial intelligence (AI) algorithms may create underdiagnosis bias by mislabeling patients with certain attributes (e.g., female and young) as healthy. Addressing this bias is crucial given the urgent need for AI diagnostics in the face of rapidly spreading infectious diseases like COVID-19. We find that prevalent AI diagnostic models underdiagnose specific patient populations, and that the underdiagnosis rate is higher in some intersectional populations (for example, females aged 20–40 years). Additionally, we find that training AI models on heterogeneous datasets (positive and negative samples drawn from different datasets) may lead to poor model generalization. The model's classification performance varies significantly across test sets, with accuracy on the better-performing test sets over 40% higher than on the poorly performing ones. In conclusion, we developed an AI bias analysis pipeline to help researchers recognize and address biases that affect medical equality and ethics.
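To make the notion of a subgroup underdiagnosis rate concrete, the following is a minimal sketch rather than the authors' pipeline. It assumes that the underdiagnosis rate of a group is the fraction of its truly positive patients that a model predicts as healthy (a false negative rate); the abstract does not give the exact definition. The function name `underdiagnosis_rates`, the record fields (`sex`, `age_band`, `label`, `pred`), and the toy records are all hypothetical and are not data from the study.

```python
from collections import defaultdict


def underdiagnosis_rates(records, group_keys):
    """Per-group underdiagnosis rate (assumed here to be the false negative rate).

    records: iterable of dicts with a true 'label' (1 = diseased) and a
             predicted 'pred' (0 = predicted healthy), plus attribute fields.
    group_keys: attribute names that define a group; passing more than one
                name yields intersectional groups.
    """
    positives = defaultdict(int)  # truly diseased patients per group
    missed = defaultdict(int)     # diseased patients predicted as healthy
    for r in records:
        if r["label"] != 1:
            continue  # only truly positive patients can be underdiagnosed
        group = tuple(r[k] for k in group_keys)
        positives[group] += 1
        if r["pred"] == 0:
            missed[group] += 1
    return {g: missed[g] / positives[g] for g in positives}


# Hypothetical toy records, for illustration only.
records = [
    {"sex": "F", "age_band": "20-40", "label": 1, "pred": 0},
    {"sex": "F", "age_band": "20-40", "label": 1, "pred": 1},
    {"sex": "F", "age_band": "40-60", "label": 1, "pred": 1},
    {"sex": "M", "age_band": "20-40", "label": 1, "pred": 1},
    {"sex": "M", "age_band": "20-40", "label": 1, "pred": 1},
    {"sex": "M", "age_band": "40-60", "label": 1, "pred": 0},
    {"sex": "M", "age_band": "40-60", "label": 0, "pred": 0},
    {"sex": "F", "age_band": "20-40", "label": 0, "pred": 1},
]

by_sex = underdiagnosis_rates(records, ["sex"])
by_sex_age = underdiagnosis_rates(records, ["sex", "age_band"])
```

In this toy data both sexes have the same marginal rate, while the intersectional breakdown by sex and age band differs between groups. That is why an analysis of this kind would look at combinations of attributes as well as single attributes.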
Original language: English
Article number: 109712
Journal: iScience
Volume: 27
Issue number: 5
Publication status: Published - 17 May 2024

Keywords

  • Artificial intelligence applications
  • Health informatics
  • Microbiology
