Linked Open Data (LOD) comprises an unprecedented volume of structured datasets on the Web. However, these datasets vary in quality, ranging from extensively curated datasets to crowdsourced and even extracted data of relatively low quality. We present a methodology for assessing the quality of linked data resources, which comprises a manual and a semi-automatic process. In this paper we focus on the manual process, whose first phase detects common quality problems and represents them in a quality problem taxonomy. The second phase evaluates a large number of individual resources against this taxonomy via crowdsourcing. This process is implemented by the tool TripleCheckMate, in which a user assesses an individual resource and evaluates each fact for correctness. This paper describes the methodology, the quality problem taxonomy, and the tool's system architecture, user perspective, and extensibility.
|Title of host publication|Proceedings of the 4th Conference on Knowledge Engineering and Semantic Web|
|Publication status|Published - 2013|
- group_aksw MOLE sys:relevantFor:infai sys:relevantFor:bis sys:relevantFor:geoknow topic_QualityAnalysis auer topic_QualityAnalysis lehmann kontokostas zaveri 2013 dataquality