CSVImport: Representing multi-dimensional statistical data as RDF using the RDF Data Cube Vocabulary

This project represents multi-dimensional statistical data as RDF using the RDF Data Cube vocabulary. Spreadsheets are imported through an OntoWiki plugin.

Statistical data on the web is often published as Excel sheets. Although such sheets are easy for humans to read, they cannot be queried efficiently, and they are difficult to integrate with other datasets, which may be in different formats. Our approach is to convert the data into a single data model, RDF. In these datasets, however, a single statistical value is described along several dimensions, so a simple row-based transformation is not possible. We therefore use the RDF Data Cube vocabulary for the conversion, as it is designed specifically to represent multi-dimensional statistical data in RDF. Transforming CSV to RDF in a fully automated way is not feasible because some dimensions may be encoded only in the heading or label of a sheet. We therefore introduce a semi-automated approach, implemented as an OntoWiki plugin.
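The idea of emitting one observation per data cell, rather than one resource per row, can be sketched as follows. This is a minimal illustration and not the plugin's actual implementation: the `http://example.org/stats/` namespace, the sample sheet with made-up values, and the function `sheet_to_observations` are all assumptions introduced here. Only `qb:Observation` and `qb:dataSet` come from the RDF Data Cube vocabulary. The arguments naming the dimensions and the measure stand in for the information a user would supply by hand in the semi-automated step.

```python
import csv
import io

QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"
EX = "http://example.org/stats/"  # hypothetical namespace, for illustration only

# Hypothetical sheet with made-up values: rows are regions, columns are years.
# The measure ("population") appears in no cell; a user has to supply it.
SHEET = """region,2010,2011
A,100,110
B,200,210
"""


def sheet_to_observations(text, row_dim, col_dim, measure, dataset):
    """Return N-Triples lines, one qb:Observation per data cell."""
    rows = [r for r in csv.reader(io.StringIO(text)) if r]  # skip blank lines
    header, body = rows[0], rows[1:]
    triples = []
    for r, row in enumerate(body):
        for c in range(1, len(row)):
            obs = f"<{EX}obs/r{r}c{c}>"
            # Labels are written as plain literals without escaping, which is
            # only safe for simple values like the ones in this sample sheet.
            triples += [
                f"{obs} <{RDF_TYPE}> <{QB}Observation> .",
                f"{obs} <{QB}dataSet> <{EX}{dataset}> .",
                f'{obs} <{EX}{row_dim}> "{row[0]}" .',
                f'{obs} <{EX}{col_dim}> "{header[c]}" .',
                f'{obs} <{EX}{measure}> "{row[c]}"^^<{XSD}integer> .',
            ]
    return triples


if __name__ == "__main__":
    for line in sheet_to_observations(SHEET, "region", "year", "population", "dataset1"):
        print(line)
```

Each cell becomes its own observation carrying both the row dimension and the column dimension, which is what a purely row-based mapping cannot express. A complete Data Cube would also declare a data structure definition with dimension and measure properties; that part is omitted here.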

News

SANSA 0.7.1 (Semantic Analytics Stack) Released (2020-01-17T09:52:41+01:00 by Prof. Dr. Jens Lehmann)

We are happy to announce SANSA 0.7.1, the seventh release of the Scalable Semantic Analytics Stack. SANSA uses distributed computing via Apache Spark and Flink to provide scalable machine learning, inference and querying capabilities for large knowledge graphs.

More Complete Resultset Retrieval from Large Heterogeneous RDF Sources (2019-12-05T15:46:09+01:00 by Andre Valdestilhas)

Over recent years, the Web of Data has grown significantly. Various interfaces such as LOD Stats, LOD Laundromat and SPARQL endpoints provide access to hundreds of thousands of RDF datasets, representing billions of facts.

DL-Learner 1.4 (Supervised Structured Machine Learning Framework) Released (2019-09-24T22:41:46+02:00 by Simon Bin)

Dear all, the Smart Data Analytics group [1] and the E.T.-db-MOLE sub-group located at the InfAI Leipzig [2] are happy to announce DL-Learner 1.4. DL-Learner is a framework containing algorithms for supervised machine learning in RDF and OWL.

DBpedia Day @ SEMANTiCS 2019 (2019-08-01T10:35:05+02:00 by Sandra Bartsch)

We are happy to announce that SEMANTiCS 2019 will host the 14th DBpedia Community Meeting on the last day of the conference, September 12, 2019.

LDK conference @ University of Leipzig (2019-03-22T09:21:41+01:00 by Julia Holze)

With the advent of digital technologies, an ever-increasing amount of language data is now available across various application areas and industry sectors, making language data more and more valuable.