Abstract
Semantic web technologies offer a promising mechanism for the representation and integration of thousands of biomedical databases. Many of these databases provide cross-references to other data sources, but these cross-references are generally incomplete and error-prone. In this paper, we conduct an empirical link analysis of life science linked data obtained from the Bio2RDF project. Three different link graphs for datasets, entities and terms are characterized using degree distribution, connectivity, and clustering metrics, and their correlation is measured as well. Furthermore, we analyze the symmetry and transitivity of entity links to build a benchmark and preliminarily evaluate several entity matching methods. Our findings indicate that the life science data network can help identify hidden links, can be used to validate links, and may offer a mechanism to integrate a wider set of resources for biomedical knowledge discovery.
Original language | English |
---|---|
Pages (from-to) | 446-462 |
Number of pages | 17 |
Journal | Lecture Notes in Computer Science |
Volume | 9367 |
Publication status | Published - 2015 |
Externally published | Yes |