DEER: RDF Data Extraction and Enrichment Framework

Over the last years, the Linked Data principles have been used across academia and industry to publish and consume structured data. Thanks to the fourth Linked Data principle (include links to other URIs so that more things can be discovered), many of the RDF datasets used within these applications contain implicit and explicit references to more data. For example, music datasets such as Jamendo include references to the locations of record labels, places where artists were born or have been, etc. Datasets such as Drugbank contain references to drugs from DBpedia, where verbal descriptions of the drugs and their usage are explicitly available. The goal of the mapping component, dubbed DEER, is to retrieve this information, make it explicit and integrate it into data sources according to the specifications of the user. To this end, DEER relies on a simple yet powerful pipeline system that consists of two main components: enrichment functions and enrichment operators.


Enrichment functions and operators

Enrichment functions implement functionality for processing the content of a dataset (e.g., applying named entity recognition to a particular property). Thus, they take a dataset as input and return a dataset as output. Enrichment operators work at a higher level of granularity and combine datasets. Thus, they take sets of datasets as input and return sets of datasets.
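The distinction between the two component types can be sketched in a few lines of Python. This is only an illustration of the signatures described above, not DEER's actual API: the type aliases, the `add_label_length` function, the `merge` operator and the `ex:` property names are all hypothetical, and a dataset is simplified to a set of (subject, predicate, object) string triples.

```python
from typing import Callable, FrozenSet, List, Tuple

# Simplifying assumption: a dataset is an immutable set of string triples.
Triple = Tuple[str, str, str]
Dataset = FrozenSet[Triple]

# An enrichment function maps one dataset to one dataset.
EnrichmentFunction = Callable[[Dataset], Dataset]
# An enrichment operator maps a list of datasets to a list of datasets.
EnrichmentOperator = Callable[[List[Dataset]], List[Dataset]]


def add_label_length(dataset: Dataset) -> Dataset:
    """Hypothetical enrichment function: for every rdfs:label triple,
    add a triple stating the length of that label."""
    extra = {
        (s, "ex:labelLength", str(len(o)))
        for (s, p, o) in dataset
        if p == "rdfs:label"
    }
    return dataset | frozenset(extra)


def merge(datasets: List[Dataset]) -> List[Dataset]:
    """Hypothetical enrichment operator: combine all input datasets
    into a single output dataset."""
    merged: Dataset = frozenset().union(*datasets)
    return [merged]


# A tiny pipeline: enrich one dataset, then merge it with a second one.
labels: Dataset = frozenset({("ex:leipzig", "rdfs:label", "Leipzig")})
facts: Dataset = frozenset({("ex:leipzig", "ex:country", "ex:Germany")})
result = merge([add_label_length(labels), facts])
```

Here the function works on the content of a single dataset, while the operator only decides how whole datasets are combined, which mirrors the difference in granularity described above.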

RDF specification paradigm

In the current version of DEER we introduce our new RDF-based specification paradigm. The main idea behind this new paradigm is to enable the efficient processing and execution of specifications. To this end, we decided to use RDF as the language for specifications, for three reasons. First, it allows for creating specification repositories which can be queried easily with the aim of retrieving accurate specifications for the use cases at hand. Second, extending the specification vocabulary does not require a change to the underlying language, due to the intrinsic extensibility of ontologies. Third, we can easily check a specification for correctness by using a reasoner, as the specification ontology allows for specifying the restrictions that specifications must abide by.
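To make the idea of a queryable, checkable specification more concrete, the following Python sketch stores a specification as plain triples, looks up its steps by type, and checks one restriction. Everything in it is an assumption made for illustration: the `spec:` vocabulary is invented and is not DEER's specification ontology, and the hand-written `violations` check merely stands in for what a reasoner would derive from the ontology's restrictions.

```python
from collections import Counter
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical specification using an invented "spec:" vocabulary.
# step2 deliberately lacks an output so that the check below finds it.
SPEC: Set[Triple] = {
    ("spec:step1", "rdf:type", "spec:EnrichmentFunction"),
    ("spec:step1", "spec:hasInput", "spec:datasetA"),
    ("spec:step1", "spec:hasOutput", "spec:datasetB"),
    ("spec:step2", "rdf:type", "spec:EnrichmentFunction"),
    ("spec:step2", "spec:hasInput", "spec:datasetB"),
}


def subjects_of_type(triples: Set[Triple], rdf_type: str) -> List[str]:
    """Query the specification for all resources of a given type."""
    return sorted(s for (s, p, o) in triples if p == "rdf:type" and o == rdf_type)


def violations(triples: Set[Triple]) -> List[str]:
    """Return enrichment functions that do not have exactly one input
    and exactly one output (a stand-in for a reasoner-based check)."""
    inputs = Counter(s for (s, p, _) in triples if p == "spec:hasInput")
    outputs = Counter(s for (s, p, _) in triples if p == "spec:hasOutput")
    return [
        f
        for f in subjects_of_type(triples, "spec:EnrichmentFunction")
        if inputs[f] != 1 or outputs[f] != 1
    ]


functions = subjects_of_type(SPEC, "spec:EnrichmentFunction")
invalid = violations(SPEC)
```

In a real setting the triples would live in an RDF store and the lookup would be a SPARQL query; the sketch only shows why a triple-based specification is easy to query and to validate.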

