OBJECTIVE: Asthma is the most frequent chronic airway illness in preschool children and is difficult to diagnose due to the disease's heterogeneity. This study aimed to investigate different machine learning models and suggested the most effective one to classify two forms of asthma in preschool children (predominantly allergic asthma and non-allergic asthma) using a minimum number of features.
METHODS: After pre-processing, 127 patients (70 with non-allergic asthma and 57 with predominantly allergic asthma) were chosen for final analysis from the Frankfurt dataset, which had asthma-related information on 205 patients. The Random Forest algorithm and Chi-square were used to select the key features from a total of 63 features. Six machine learning models: random forest, extreme gradient boosting, support vector machines, adaptive boosting, extra tree classifier, and logistic regression were then trained and tested using 10-fold stratified cross-validation.
RESULTS: Among all features, age, weight, C-reactive protein, eosinophilic granulocytes, oxygen saturation, pre-medication inhaled corticosteroid + long-acting beta2-agonist (PM-ICS + LABA), PM-other (other pre-medication), H-Pulmicort/celestamine (Pulmicort/celestamine during hospitalization), and H-azithromycin (azithromycin during hospitalization) were found to be highly important. The support vector machine approach with a linear kernel was able to diffrentiate between predominantly allergic asthma and non-allergic asthma with higher accuracy (77.8%), precision (0.81), with a true positive rate of 0.73 and a true negative rate of 0.81, a F1 score of 0.81, and a ROC-AUC score of 0.79. Logistic regression was found to be the second-best classifier with an overall accuracy of 76.2%.
CONCLUSION: Predominantly allergic and non-allergic asthma can be classified using machine learning approaches based on nine features.
|Number of pages||9|
|Journal||Journal of Asthma|
|Early online date||7 Apr 2022|
|Publication status||Published - 4 Mar 2023|
- Frankfurt dataset
- Machine learning
- Non-allergic asthma
- Predominantly allergic asthma