Quality assessment for Linked Data: A Survey

Amrapali Zaveri*, Anisa Rula, Andrea Maurino, Ricardo Pietrobon, Jens Lehmann, Soeren Auer

*Corresponding author for this work

Research output: Contribution to journal > Article > Academic > peer-review

Abstract

The development and standardization of Semantic Web technologies has resulted in an unprecedented volume of data being published on the Web as Linked Data (LD). However, we observe widely varying data quality ranging from extensively curated datasets to crowdsourced and extracted data of relatively low quality. In this article, we present the results of a systematic review of approaches for assessing the quality of LD. We gather existing approaches and analyze them qualitatively. In particular, we unify and formalize commonly used terminologies across papers related to data quality and provide a comprehensive list of 18 quality dimensions and 69 metrics. Additionally, we qualitatively analyze the 30 core approaches and 12 tools using a set of attributes. The aim of this article is to provide researchers and data curators a comprehensive understanding of existing work, thereby encouraging further experimentation and development of new approaches focused towards data quality, specifically for LD.
Original language: English
Pages (from-to): 63-93
Journal: Semantic Web
Volume: 7
Issue number: 1
Publication status: Published - 2016
Externally published: Yes

Keywords

  • Data quality
  • Linked Data
  • assessment
  • survey
