Abstract
There is an abundance of biomedical data present on the Web. However, this data is not re-usable because it is insufficiently described using rich metadata. The recently published FAIR principles specify desirable criteria that metadata and their corresponding datasets need to be Findable, Accessible, Interoperable, and Reusable. However, currently the biomedical metadata quality is poor which makes data reuse extremely difficult. To tackle this problem, we propose the use of topic modeling, specifically non-negative matrix factorization (NMF), as a first step towards dimensionality reduction when dealing with large amounts of data. In this position paper, as a use case, we apply NMF to the BioSamples metadata and present preliminary results.
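
To make the idea of NMF-based topic modeling concrete, the following is a minimal sketch, not the authors' implementation. It assumes a bag-of-words representation and the classic multiplicative-update solver; the abstract does not say which representation or solver the paper uses. The `records` list is hypothetical toy text, not real BioSamples metadata, and the helper name `nmf` and its parameters are illustrative.

```python
import numpy as np

# Hypothetical toy metadata records (NOT real BioSamples entries).
records = [
    "homo sapiens blood sample",
    "homo sapiens blood plasma",
    "mus musculus liver tissue",
    "mus musculus liver sample",
]

# Build a bag-of-words count matrix V (records x vocabulary terms).
vocab = sorted({word for rec in records for word in rec.split()})
index = {word: j for j, word in enumerate(vocab)}
V = np.zeros((len(records), len(vocab)))
for i, rec in enumerate(records):
    for word in rec.split():
        V[i, index[word]] += 1.0


def nmf(V, n_topics, n_iter=500, eps=1e-9, seed=0):
    """Approximate V by W @ H with W, H >= 0 (multiplicative updates).

    Assumption: Frobenius-norm objective; the solver choice is illustrative.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_topics)) + eps  # record-to-topic weights
    H = rng.random((n_topics, V.shape[1])) + eps  # topic-to-term weights
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Reduce the 9-term vocabulary to 2 latent topics.
W, H = nmf(V, n_topics=2)

# Show the three highest-weighted terms of each topic.
for k, row in enumerate(H):
    top_terms = [vocab[j] for j in np.argsort(row)[::-1][:3]]
    print(f"topic {k}: {top_terms}")
```

Each row of `W` gives a record's weight on each topic, which is the lower-dimensional representation the abstract refers to; each row of `H` gives a topic's weight on each vocabulary term.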
Original language | English |
---|---|
Title of host publication | Quality Assessment of Biomedical Metadata Using Topic Modeling. |
Publication status | Published - 2018 |
Event | Semantic Web solutions for large-scale biomedical data analytics - Crete, Greece. Duration: 3 Jun 2018 → … |
Workshop
Workshop | Semantic Web solutions for large-scale biomedical data analytics |
---|---|
Abbreviated title | SeWeBMeDA 2018 |
Country/Territory | Greece |
City | Crete |
Period | 3/06/18 → … |