TY - JOUR
T1 - Applications of different machine learning approaches in prediction of breast cancer diagnosis delay
AU - Dehdar, S.
AU - Salimifard, K.
AU - Mohammadi, R.
AU - Marzban, M.
AU - Saadatmand, S.
AU - Fararouei, M.
AU - Dianati-Nasab, M.
N1 - Funding Information:
We would like to thank all the participants and wish them all happiness.
Publisher Copyright:
Copyright © 2023 Dehdar, Salimifard, Mohammadi, Marzban, Saadatmand, Fararouei and Dianati-Nasab.
PY - 2023/2/16
Y1 - 2023/2/16
N2 - Background: The increasing rate of breast cancer (BC) incidence and mortality in Iran has turned this disease into a challenge. A delay in diagnosis leads to more advanced stages of BC and a lower chance of survival, which makes this cancer even more fatal. Objectives: The present study was aimed at identifying the predicting factors for delayed BC diagnosis in women in Iran. Methods: In this study, four machine learning methods, including extreme gradient boosting (XGBoost), random forest (RF), neural networks (NNs), and logistic regression (LR), were applied to analyze the data of 630 women with confirmed BC. Also, different statistical methods, including chi-square, p-value, sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUC), were utilized in different steps of the survey. Results: Thirty percent of patients had a delayed BC diagnosis. Of all the patients with delayed diagnoses, 88.5% were married, 72.1% had an urban residency, and 84.8% had health insurance. The top three important factors in the RF model were urban residency (12.04), breast disease history (11.58), and other comorbidities (10.72). In the XGBoost, urban residency (17.54), having other comorbidities (17.14), and age at first childbirth (>30) (13.13) were the top factors; in the LR model, having other comorbidities (49.41), older age at first childbirth (82.57), and being nulliparous (44.19) were the top factors. Finally, in the NN, it was found that being married (50.05), having a marriage age above 30 (18.03), and having other breast disease history (15.83) were the main predicting factors for a delayed BC diagnosis. Conclusion: Machine learning techniques suggest that women with an urban residency who got married or had their first child at an age older than 30 and those without children are at a higher risk of diagnosis delay. It is necessary to educate them about BC risk factors, symptoms, and self-breast examination to shorten the delay in diagnosis.
KW - PATIENT DELAY
KW - REASONS
KW - SELECTION
KW - SOCIODEMOGRAPHIC FACTORS
KW - STAGE
KW - STATISTICS
KW - WOMEN
KW - breast cancer (BC)
KW - delay
KW - extreme gradient boosting
KW - logistic regression
KW - machine learning
KW - neural networks (NN)
KW - random forest (RF)
DO - 10.3389/fonc.2023.1103369
M3 - Article
C2 - 36874113
SN - 2234-943X
VL - 13
JO - Frontiers in Oncology
JF - Frontiers in Oncology
M1 - 1103369
ER -