DBpediaDQ: User-driven quality evaluation of DBpedia


Given the large number of datasets and use cases in the Linked Open Data (LOD) cloud, data quality is an important concern. The DBpedia Data Quality Curation project aims to evaluate the quality of the resources in DBpedia.

Homepage

DBpedia is an important dataset in Linked Data: it is linked to and from numerous other datasets, and it is relied upon as a source of useful information. However, because its content is extracted from crowd-sourced material, DBpedia has inherent quality problems, such as incorrectly extracted values and datatype errors.

Not all of these data quality problems can be detected automatically. We therefore aim to crowd-source the quality assessment of the dataset. To this end, we developed a tool with which a user can evaluate a random resource by analyzing each of its triples individually and storing the results. The tool is available at http://nl.dbpedia.org:8080/TripleCheckMate/.

If you have any questions or comments, please do not hesitate to contact us at dbpedia-data-quality@googlegroups.com.

Results

  • Results : http://goo.gl/lIKK7
  • Total no. of users : 58
  • Total no. of distinct resources evaluated : 521
  • Total no. of resources evaluated : 792
  • Total no. of distinct resources without problems : 86
  • Total no. of distinct resources with problems : 435
  • Total no. of distinct incorrect triples : 2928
  • Total no. of distinct incorrect triples in the dbprop namespace : 1745
  • Total no. of inter-evaluations : 268
  • No. of resources with evaluators having different opinions : 89
  • Resource-based inter-rater agreement (Cohen’s Kappa) : 0.34
  • Triple-based inter-rater agreement (Cohen’s Kappa) : 0.38
  • No. of triples evaluated for correctness : 700
  • No. of triples evaluated to be correct : 567
  • No. of triples evaluated incorrectly : 133
  • % of triples correctly evaluated : 81
  • Average no. of problems per resource : 5.69
  • Average no. of problems per resource in the dbprop namespace : 3.45
  • Average no. of triples per resource : 47.19
  • % of triples affected : 11.93
  • % of triples affected in the dbprop namespace : 7.11
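The agreement and percentage figures above can be illustrated with a short sketch. The code below shows how Cohen's Kappa is commonly computed for two evaluators and how the reported share of correctly evaluated triples follows from the counts in the list. The function name and the example labels are hypothetical and are not taken from the campaign data; the exact procedure used by the project may differ.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items on which both raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, derived from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical judgements by two evaluators on the same six triples.
first = ["incorrect", "correct", "correct", "incorrect", "correct", "correct"]
second = ["incorrect", "correct", "incorrect", "incorrect", "correct", "incorrect"]
kappa = cohens_kappa(first, second)

# Share of correctly evaluated triples, using the counts reported above.
correct, evaluated = 567, 700
share_correct = round(100 * correct / evaluated)
print(f"kappa = {kappa:.2f}, correctly evaluated = {share_correct}%")
```

Kappa corrects the raw agreement rate for the agreement expected by chance, so values such as the 0.34 and 0.38 reported above indicate agreement that is better than chance but far from complete.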


News

AKSW successful at #ISWC2014 ( 2014-10-28T17:03:42+01:00 by Ricardo Usbeck)


Dear followers, nine members of AKSW participated in the 13th International Semantic Web Conference (ISWC) in Riva del Garda, Italy.

New DBpedia Overview Article ( 2013-06-24T18:05:58+02:00 by Dr. Jens Lehmann)


We are pleased to announce that a new overview article for DBpedia is available. The article covers several aspects of the DBpedia community project:

  • The DBpedia extraction framework.
  • The mappings wiki as the central structure for maintaining the community-curated DBpedia ontology.
  • Statistics on the multilingual support in DBpedia.
  • DBpedia live synchronisation with Wikipedia.

Crowd-sourcing the evaluation of Linked Open Data with TripleCheckMate ( 2013-01-28T15:11:27+01:00 by Dimitris Kontokostas)


On November 16th, 2012, the DBpedia Data Quality group started a campaign to assess the quality of DBpedia. To get the most out of this effort, we developed TripleCheckMate, a tool designed for crowd-sourcing the evaluation of Linked Data.

DBpedia Data Quality Evaluation Campaign ( 2012-11-16T12:54:21+01:00 by Amrapali Zaveri)


DBpedia is an important dataset in Linked Data: it is linked to and from numerous other datasets, and it is relied upon as a source of useful information.

Open Innovation Conference, 28th-30th November 2012, Leipzig ( 2012-11-01T15:10:11+01:00 by Sandra Prätor)


The first Open Innovation Conference, organized by the city of Leipzig, brings together experts from the three fields of the triple helix (university, industry, and government) to exchange experience with open innovation in the creative industries.