RDFSlice: Large-scale RDF Dataset Slicing

In recent years, an increasing amount of structured data has been published on the Web as Linked Open Data (LOD). Despite recent advances, consuming and using Linked Open Data within an organization is still a substantial challenge. Many LOD datasets are quite large and, despite progress in RDF data management, loading and querying them in a triple store is extremely time-consuming and resource-demanding. To overcome this consumption obstacle, we propose RDF dataset slicing, a process inspired by the classical Extract-Transform-Load (ETL) paradigm.

RDFSlice focuses on the selection and extraction steps. It defines a fragment of SPARQL, dubbed SliceSPARQL, which enables the selection of well-defined slices of datasets that fulfill typical information needs. SliceSPARQL supports graph patterns in which each connected subgraph pattern involves at most one variable or IRI in its join conditions. This restriction guarantees that the query can be processed efficiently against a sequential dataset dump stream. As a result, dataset slices can be generated an order of magnitude faster than with the conventional approach of loading the whole dataset into a triple store and retrieving the slice by executing the query against the triple store's SPARQL endpoint.
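The idea of answering a restricted pattern from a sequential dump, without a triple store, can be illustrated with a minimal Python sketch. This is not RDFSlice's actual implementation or the SliceSPARQL grammar: the function names (`parse_line`, `matches`, `candidate_subjects`, `slice_dump`), the simplified N-Triples parsing, the `example.org` data, and the two-pass strategy are all assumptions made for illustration. The sketch handles a star-shaped pattern whose only join condition is a single subject variable, which is the kind of pattern the restriction above permits.

```python
import re
from typing import Iterable, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]
# A pattern term is either a concrete N-Triples term or None (a wildcard).
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]

# Simplified N-Triples line: subject, predicate, then the object up to the final " ."
# (an assumption for this sketch; a real dump needs a full N-Triples parser).
_LINE = re.compile(r'^(\S+)\s+(\S+)\s+(.+?)\s*\.\s*$')

RDF_TYPE = '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>'


def parse_line(line: str) -> Optional[Triple]:
    """Parse one dump line; return None for blank lines, comments, or garbage."""
    line = line.strip()
    if not line or line.startswith('#'):
        return None
    m = _LINE.match(line)
    return (m.group(1), m.group(2), m.group(3)) if m else None


def matches(triple: Triple, pattern: Pattern) -> bool:
    """True if every non-wildcard term of the pattern equals the triple's term."""
    return all(p is None or p == t for t, p in zip(triple, pattern))


def candidate_subjects(lines: Iterable[str], selector: Pattern) -> Set[str]:
    """First sequential pass: collect subjects of triples matching the selector."""
    subjects = set()
    for line in lines:
        triple = parse_line(line)
        if triple and matches(triple, selector):
            subjects.add(triple[0])
    return subjects


def slice_dump(lines: Iterable[str], subjects: Set[str],
               pattern: Pattern) -> Iterator[Triple]:
    """Second sequential pass: emit matching triples about the collected subjects."""
    for line in lines:
        triple = parse_line(line)
        if triple and triple[0] in subjects and matches(triple, pattern):
            yield triple


# Hypothetical four-line dump; in practice the lines would be read from a file.
DUMP = [
    f'<http://example.org/alice> {RDF_TYPE} <http://example.org/Person> .',
    '<http://example.org/alice> <http://example.org/name> "Alice" .',
    f'<http://example.org/leipzig> {RDF_TYPE} <http://example.org/City> .',
    '<http://example.org/leipzig> <http://example.org/name> "Leipzig" .',
]

# Slice: every triple about resources typed as Person (join on the subject only).
people = candidate_subjects(DUMP, (None, RDF_TYPE, '<http://example.org/Person>'))
person_slice = list(slice_dump(DUMP, people, (None, None, None)))
```

Because each pass reads the dump strictly line by line, memory use in this sketch depends on the number of matching subjects rather than on the size of the dump. Joins over more than one variable would require holding intermediate bindings for several terms at once, which is what the single-join-term restriction avoids.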

News

LDK conference @ University of Leipzig (2019-03-22T09:21:41+01:00 by Julia Holze)

With the advent of digital technologies, an ever-increasing amount of language data is now available across various application areas and industry sectors, thus making language data more and more valuable.

13th DBpedia community meeting in Leipzig (2019-02-22T12:22:07+01:00 by Julia Holze)

We are happy to invite you to join the 13th edition of the DBpedia Community Meeting, which will be held in Leipzig.

SANSA 0.5 (Semantic Analytics Stack) Released (2018-12-13T09:25:34+01:00 by Prof. Dr. Jens Lehmann)

We are happy to announce SANSA 0.5 – the fifth release of the Scalable Semantic Analytics Stack.

AKSW at web.br in São Paulo (2018-10-22T09:37:49+02:00 by Natanael Arndt)

From October 1st until 6th, a delegation from AKSW Group, Leipzig University of Applied Sciences (HTWK), eccenca GmbH, and Max Planck Institute for Human Cognitive and Brain Sciences went to São Paulo, Brazil to meet people from the Web Technologies …

AskNow 0.1 Released (2018-09-13T15:35:04+02:00 by Prof. Dr. Jens Lehmann)

Dear all, we are very happy to announce AskNow 0.1 – the initial release of Question Answering Components and Tools over RDF Knowledge Graphs. Website: http://asknow.sda.tech/ Demo: http://asknowdemo.sda.tech GitHub: https://github.