Tapioca

Tapioca is a search engine for finding topically similar Linked Data (RDF) datasets.


The Web of Data is growing continuously, both in the size and in the number of published datasets. Porting these datasets to five-star Linked Data, however, requires data publishers to link each new dataset to the Linked Data sets that are already available. Given the size and growth of the Linked Data Cloud, the current, mostly manual approach to detecting relevant datasets for linking is obsolete.

We present Tapioca, a search engine for linked datasets that automatically provides data publishers with similar existing datasets. Our search engine uses a novel approach for determining the topical similarity of datasets. This approach applies probabilistic topic modelling to the metadata of the datasets, and to nothing else, to determine which datasets are related.
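To make the idea of topical similarity concrete, the following minimal sketch compares datasets that have already been reduced to topic distributions. It is an illustration under stated assumptions, not Tapioca's implementation: the function names, the three-topic vectors and the dataset names are hypothetical, and using cosine similarity for the ranking is an assumption. The Jensen-Shannon divergence is included because the supplementary material for the first experiment mentions a Tapioca variant that uses it.

```python
import math


def cosine_similarity(p, q):
    """Cosine of the angle between two topic vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0.0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence in [0, 1] (base-2 logarithm); 0 = identical."""
    m = [(a + b) / 2.0 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def rank_by_similarity(query, candidates):
    """Return (name, score) pairs, most similar candidate first."""
    scored = [(name, cosine_similarity(query, dist))
              for name, dist in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical topic distributions over three topics (assumed to come from
# a topic model trained on dataset metadata; the values are made up).
query = [0.7, 0.2, 0.1]
candidates = {
    "dataset_a": [0.6, 0.3, 0.1],
    "dataset_b": [0.1, 0.1, 0.8],
}

ranking = rank_by_similarity(query, candidates)
for name, score in ranking:
    print(name, round(score, 3))
```

With a divergence instead of a similarity, the ranking would be sorted in ascending order, since smaller values mean more similar distributions.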

The source code can be found on GitHub. The software is provided under a dual license. For non-commercial purposes, the terms of the LGPL 3.0 license apply. For commercial purposes, please contact us.

For our publication "Detecting Similar Linked Datasets Using Topic Modelling", we provide the following additional material:

  • For the first experiment, you can find the gold standard as well as the detailed F1 scores of Tapioca and of a second version of Tapioca that uses the Jensen-Shannon divergence in this folder.
  • For the second experiment, you can find the detailed values of P(w|T) and of the A measure in this folder.
  • For the third experiment, you can find the detailed values of P(w|T) and of the A measure as well as the F1 scores of our approach in this folder.

