TY - JOUR
T1 - The smarty4covid dataset and knowledge base as a framework for interpretable physiological audio data analysis
AU - Zarkogianni, Konstantia
AU - Dervakos, Edmund
AU - Filandrianos, George
AU - Ganitidis, Theofanis
AU - Gkatzou, Vasiliki
AU - Sakagianni, Aikaterini
AU - Raghavendra, Raghu
AU - Nikias, C. L. Max
AU - Stamou, Giorgos
AU - Nikita, Konstantina S.
N1 - Funding Information:
This research was funded by the Hellenic Foundation for Research and Innovation-H.F.R.I within the framework of the H.F.R.I Science & Society “Interventions to address the economic and social consequences of the COVID-19 pandemic” call. Grant number: 05020.
Publisher Copyright:
© 2023, The Author(s).
PY - 2023/12/1
Y1 - 2023/12/1
AB - Harnessing Artificial Intelligence (AI) and m-health to detect new biomarkers indicative of the onset and progression of respiratory abnormalities and conditions has attracted considerable scientific and research interest, especially during the COVID-19 pandemic. The smarty4covid dataset contains audio signals of cough (4,676), regular breathing (4,665), deep breathing (4,695) and voice (4,291), recorded on mobile devices through a crowd-sourcing approach. Other self-reported information (e.g. COVID-19 virus tests) is also included, providing a comprehensive dataset for the development of COVID-19 risk detection models. The smarty4covid dataset is released as a Web Ontology Language (OWL) knowledge base, enabling data consolidation from other relevant datasets, complex queries and reasoning. It has been used to develop models able to: (i) extract clinically informative respiratory indicators from regular breathing records, and (ii) identify cough, breath and voice segments in crowd-sourced audio recordings. A new framework that uses the smarty4covid OWL knowledge base to generate counterfactual explanations for opaque AI-based COVID-19 risk detection models is proposed and validated.
U2 - 10.1038/s41597-023-02646-6
DO - 10.1038/s41597-023-02646-6
M3 - Article
SN - 2052-4463
VL - 10
JO - Scientific Data
JF - Scientific Data
IS - 1
M1 - 770
ER -