RDFSlice: Large-scale RDF Dataset Slicing

In recent years, an increasing amount of structured data has been published on the Web as Linked Open Data (LOD). Despite recent advances, consuming and using Linked Open Data within an organization is still a substantial challenge. Many LOD datasets are quite large and, despite progress in RDF data management, loading and querying them within a triple store is extremely time-consuming and resource-demanding. To overcome this consumption obstacle, we propose RDF dataset slicing, a process inspired by the classical Extract-Transform-Load (ETL) paradigm.

RDFSlice focuses on selection and extraction. It devises a fragment of SPARQL dubbed SliceSPARQL, which enables the selection of well-defined slices of datasets that fulfill typical information needs. SliceSPARQL supports graph patterns in which each connected subgraph pattern involves at most one variable or IRI in its join conditions. This restriction guarantees that the query can be processed efficiently against a sequential dataset dump stream. As a result, dataset slices can be generated an order of magnitude faster than with the conventional approach of loading the whole dataset into a triple store and retrieving the slice by executing the query against the triple store's SPARQL endpoint.
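
The idea of slicing a dump stream with a single join variable can be illustrated with a minimal Python sketch. It is not RDFSlice's implementation or API: the function names, the simplified N-Triples regular expression, and the two-pass strategy are assumptions made for illustration only. The sketch handles a pattern of the form `?s <predicate> <object> . ?s ?p ?o`, where the only join condition is the subject variable `?s`.

```python
import re
from typing import Callable, Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]

# Simplified N-Triples line: subject, predicate, object, final dot.
# Assumption: one triple per line; escapes and edge cases are not validated.
_NT_LINE = re.compile(r'^\s*(\S+)\s+(\S+)\s+(.+?)\s*\.\s*$')


def parse_ntriples(lines: Iterable[str]) -> Iterator[Triple]:
    """Yield (subject, predicate, object) terms from N-Triples lines."""
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue  # skip blank lines and comments
        match = _NT_LINE.match(line)
        if match:
            yield match.group(1), match.group(2), match.group(3)


def slice_by_subject(open_dump: Callable[[], Iterable[str]],
                     predicate: str, obj: str) -> Iterator[Triple]:
    """Hypothetical two-pass slicer joining only on the subject.

    Pass 1 streams the dump and collects subjects that match
    `?s <predicate> <obj>`. Pass 2 streams it again and emits every
    triple whose subject was collected. Only the candidate subjects
    are kept in memory, never the dataset itself.
    """
    candidates = {s for s, p, o in parse_ntriples(open_dump())
                  if p == predicate and o == obj}
    for triple in parse_ntriples(open_dump()):
        if triple[0] in candidates:
            yield triple


if __name__ == "__main__":
    rdf_type = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
    sample = [
        f"<http://example.org/Leipzig> {rdf_type} <http://example.org/City> .",
        '<http://example.org/Leipzig> <http://example.org/name> "Leipzig" .',
        f"<http://example.org/Elbe> {rdf_type} <http://example.org/River> .",
    ]
    for t in slice_by_subject(lambda: sample, rdf_type,
                              "<http://example.org/City>"):
        print(" ".join(t), ".")
```

Because the join involves a single variable, each pass is a linear scan over the dump, which is what makes stream processing feasible without a triple store. In practice `open_dump` would reopen a (possibly compressed) dump file rather than return an in-memory list.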

News

AKSW Colloquium, 07.07.2017, Two paper presentations concerning Link Discovery and Knowledge Base Reasoning (2017-07-06T21:24:36+02:00 by Daniel Obraczka)

At the AKSW Colloquium on Friday, 7 July, at 10:40 AM, there will be two paper presentations concerning genetic algorithms for learning linkage rules and differentiable learning of logical rules for knowledge base reasoning.

SANSA 0.2 (Semantic Analytics Stack) Released (2017-06-13T18:18:28+02:00 by Prof. Dr. Jens Lehmann)

The AKSW and Smart Data Analytics groups are happy to announce SANSA 0.2 – the second release of the Scalable Semantic Analytics Stack.

AKSW at ESWC 2017 (2017-06-12T10:53:35+02:00 by Christopher Schulz)

Hello Community! ESWC 2017 has just ended, and we give a short report on how the conference went, especially with regard to the AKSW group. Our members Dr. Muhammad Saleem, Dr. Mohamed Ahmed Sherif, Claus Stadler, Michael Röder, Prof. Dr. …

Four papers accepted at WI 2017 (2017-06-10T15:01:31+02:00 by Christopher Schulz)

Hello Community! We proudly announce that the International Conference on Web Intelligence (WI) accepted four papers by our group. WI takes place in Leipzig from the 23rd to the 26th of August.

AKSW Colloquium, 29.05.2017, Addressing open Machine Translation problems with Linked Data. (2017-05-26T13:51:11+02:00 by Diego Moussallem)

At the AKSW Colloquium on Monday, 29 May 2017, at 3 PM, Diego Moussallem will present two papers related to his topic. The first paper is “Using BabelNet to Improve OOV Coverage in SMT” by Du et al.