Chemical substance resources on the Web are often made accessible to researchers through public APIs (Application Programming Interfaces). A significant problem arises when extracting and integrating data from such APIs: provenance information is frequently missing. Even when provenance is stated, it rarely follows any prescribed template or terminology. This places a burden on data producers and makes it difficult for API developers to extract and analyse this information automatically. Downstream, these problems hinder efforts to automatically determine the veracity and quality of extracted data, which is critical for demonstrating the integrity of associated research findings. In this paper, we propose a model for capturing the provenance of assertions about chemical substances by systematically analysing three sources: (i) Nanopublications, (ii) Wikidata and (iii) selected Minimal Information Standards (MISTS) for reporting biomedical studies. We analyse the provenance terms used in these sources, along with their frequency of use, and synthesize our findings into a preliminary model for capturing provenance.
|Title of host publication||Semantic Web Applications and Tools for Health Care and Life Sciences|
|Editors||Christopher J. O. Baker, Andra Waagmeester, Andrea Splendiani, Oya Deniz Beyan, M. Scott Marshall|
|Place of Publication||Antwerp, Belgium|
|Publication status||Published - 2018|
|Event||Semantic Web Applications and Tools for Health Care and Life Sciences (conference number 11), Antwerp, Belgium, 3 Dec 2018 → 6 Dec 2018|
|Conference||Semantic Web Applications and Tools for Health Care and Life Sciences|
|Period||3/12/18 → 6/12/18|
Moodley, K., Zaveri, A., Wu, C., & Dumontier, M. (2018). A model for capturing provenance of assertions about chemical substances. In C. J. O. Baker, A. Waagmeester, A. Splendiani, O. D. Beyan, & M. S. Marshall (Eds.), Semantic Web Applications and Tools for Health Care and Life Sciences (Vol. 2275). CEUR-WS.org.