HAWK: Hybrid Question Answering over Linked Data

HAWK advances the OKBQA vision of hybrid question answering over Linked Data and full-text information. It is benchmarked on the hybrid task (task 3) of QALD-4.


Introduction

Recent advances in question answering (QA) over Linked Data provide end users with increasingly sophisticated tools for querying Linked Data by expressing their information need in natural language. This opens the wealth of structured data available on the Semantic Web to non-experts as well. However, a lot of information is still available only in textual form, both on the Document Web and in the form of labels and abstracts in Linked Data sources. Therefore, a considerable number of questions can only be answered by hybrid question answering approaches, which find and combine information stored in both structured and textual data sources.

Architecture

The HAWK Architecture

We present HAWK, to the best of our knowledge the first full-fledged hybrid QA framework for entity search over Linked Data and textual data.

Given an input query, HAWK implements an 8-step pipeline, which comprises 1) part-of-speech tagging, 2) detecting entities in the query, 3) dependency parsing and 4) applying linguistic pruning heuristics for an in-depth analysis of the natural language input. The result of these first four steps is a predicate-argument graph annotated with resources from the Linked Data Web. HAWK then 5) assigns semantic meaning to nodes and 6) generates basic triple patterns for each component of the input query with respect to a multitude of features. This deductive linking of triples results in a set of SPARQL queries containing text operators as well as triple patterns. In order to reduce operational costs, 7) HAWK discards queries using several rules, e.g., by discarding query graphs that are not connected. Finally, 8) the remaining queries are ranked using extensible feature vectors and cosine similarity.
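Steps 7 and 8 can be illustrated with a small Python sketch. It is not HAWK's actual implementation: the function names, the representation of a candidate query as a list of (subject, predicate, object) patterns, the feature values and the reference vector are all assumptions made for this example. It only shows the two ideas named above, dropping candidates whose query graph is not connected and ordering the rest by cosine similarity of their feature vectors.

```python
import math
from collections import defaultdict


def is_connected(triples):
    """Return True if the triple patterns form a single connected graph.

    Subjects and objects are treated as nodes, each pattern as an
    undirected edge (a simplifying assumption for this sketch).
    """
    if not triples:
        return False
    adjacency = defaultdict(set)
    for subject, _predicate, obj in triples:
        adjacency[subject].add(obj)
        adjacency[obj].add(subject)
    start = next(iter(adjacency))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node] - seen:
            seen.add(neighbour)
            stack.append(neighbour)
    return len(seen) == len(adjacency)


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_queries(candidates, reference):
    """Discard disconnected candidates, then sort by similarity to `reference`.

    `candidates` is a list of (name, triples, feature_vector) tuples;
    `reference` is a hypothetical target feature vector.
    """
    kept = [c for c in candidates if is_connected(c[1])]
    return sorted(
        kept,
        key=lambda c: cosine_similarity(c[2], reference),
        reverse=True,
    )


# Hypothetical candidates; the predicates and feature values are illustrative only.
candidates = [
    ("q1", [("?x", "dbo:author", "?y"), ("?y", "text:query", "'Nobel'")], [1.0, 0.0, 1.0]),
    ("q2", [("?x", "dbo:author", "?y"), ("?z", "rdf:type", "dbo:Book")], [1.0, 1.0, 1.0]),
    ("q3", [("?x", "dbo:author", "?y"), ("?y", "rdf:type", "dbo:Writer")], [1.0, 1.0, 1.0]),
]
ranked = rank_queries(candidates, reference=[1.0, 1.0, 1.0])
print([name for name, _triples, _features in ranked])
```

In this sketch `q2` is dropped because its two patterns share no node, and `q3` is ranked ahead of `q1` because its feature vector points in the same direction as the reference vector. How the features and the reference vector are obtained is left open here.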

Supplementary material concerning the evaluation and implementation of HAWK can be found here

