GERBIL: General Entity Annotation Benchmark Framework
GERBIL is a general entity annotation system.
GERBIL is a general Linked Data benchmarking system (originally built on the BAT-Framework to benchmark entity annotation systems). GERBIL offers an easy-to-use web-based platform for the agile comparison of annotators using multiple datasets and uniform measuring approaches. To add a tool to GERBIL, the end user only has to provide the URL of a REST interface to the tool that abides by a given specification. The GERBIL platform then integrates the tool and benchmarks it against user-specified datasets automatically.
- If you want to know more, please have a look at our paper about GERBIL in the Semantic Web Journal.
- We have also used GERBIL to benchmark question answering systems.
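To make the "URL to a REST interface" idea above more concrete, here is a minimal sketch of a web service that a benchmarking platform could call. It assumes, since the specification is not given in this text, that the platform sends the document in the body of an HTTP POST and expects the annotated document in the response body. The names `annotate`, `AnnotatorHandler`, and `make_server` are hypothetical, and the stub simply returns its input unchanged.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def annotate(document: str) -> str:
    """Hypothetical placeholder: a real tool would add its annotations here."""
    return document


class AnnotatorHandler(BaseHTTPRequestHandler):
    """Answers POST requests with the (stub-)annotated request body."""

    def do_POST(self):
        # Read the document sent by the caller.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")

        result = annotate(body).encode("utf-8")

        # Assumption: echo the caller's content type back. The actual
        # specification may require a particular content type instead.
        content_type = self.headers.get("Content-Type", "text/plain")
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(result)))
        self.end_headers()
        self.wfile.write(result)

    def log_message(self, format, *args):
        # Keep the example quiet; a real service would log requests.
        pass


def make_server(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Create the server; port 0 lets the OS pick a free port."""
    return HTTPServer((host, port), AnnotatorHandler)
```

Calling `make_server(port=8080).serve_forever()` would expose the stub at `http://127.0.0.1:8080/`. For a real benchmark run the service would need to be reachable by the platform and follow its actual interface specification.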
Annotator | BAT-Framework | GERBIL 1.0.0 | GERBIL 1.2.5 | Experiment |
---|---|---|---|---|
Wikipedia Miner | ✔ | ✔ | (✔) | A2KB |
Illinois Wikifier | ✔ | (✔) | ✔ | A2KB |
Spotlight | ✔ | ✔ | ✔ | A2KB |
AIDA | ✔ | ✔ | ✔ | A2KB |
TagMe 2 | ✔ | ✔ | ✔ | A2KB |
NERD-ML | | ✔ | ✔ | A2KB |
KEA | | ✔ | ✔ | A2KB |
WAT | | ✔ | ✔ | A2KB |
Dexter | | ✔ | ✔ | A2KB |
AGDISTIS | (✔) | ✔ | ✔ | D2KB |
Babelfy | | ✔ | ✔ | A2KB |
FOX | | | ✔ | OKE Task 1 |
FRED | | | ✔ | OKE Task 1 |
FREME | | | ✔ | OKE Task 1 |
entityclassifier.eu | | | ✔ | A2KB |
CETUS | | | ✔ | OKE Task 2 |
xLisa | | | ✔ | A2KB |
DoSer | | | ✔ | D2KB |
PBOH | | | ✔ | D2KB |
NERFGUN | | | ✔ | D2KB |
NIF-based Annotator | | ✔ | ✔ | any |
Annotator | A2KB, C2KB, Entity Recognition | D2KB | Entity Typing | OKE Task 1 | OKE Task 2 |
---|---|---|---|---|---|
AIDA | ✔ | (✔) | | | |
AGDISTIS | | ✔ | | | |
Babelfy | ✔ | ✔ | | | |
CETUS | | | | | ✔ |
CETUS (FOX) | | | | | ✔ |
Dexter | ✔ | (✔) | | | |
entityclassifier.eu | ✔ | (✔) | | | |
FRED | ✔ | (✔) | (✔) | ✔ | |
FREME e-Entity | ✔ | ✔ | ✔ | ✔ | |
FOX | ✔ | (✔) | (✔) | ✔ | |
KEA | ✔ | ✔ | | | |
NERD-ML | ✔ | (✔) | | | |
Spotlight | ✔ | ✔ | ✔ | ✔ | |
TagMe 2 | ✔ | (✔) | | | |
WAT | ✔ | ✔ | | | |
xLisa | ✔ | (✔) | | | |
PBoH | | ✔ | | | |
NERFGUN | | ✔ | | | |
DoSER | | ✔ | | | |
The following table lists the datasets that are currently available and the experiment types they support.
Dataset | A2KB, C2KB, D2KB, Entity Recognition | Entity Typing | OKE Task 1 | OKE Task 2 |
---|---|---|---|---|
AIDA/CoNLL-Complete | ✔ | |||
AIDA/CoNLL-Test A | ✔ | |||
AIDA/CoNLL-Test B | ✔ | |||
AIDA/CoNLL-Training | ✔ | |||
AQUAINT | ✔ | |||
DBpediaSpotlight | ✔ | |||
Derczynski | ✔ | |||
IITB | ✔ | |||
KORE50 | ✔ | |||
MSNBC | ✔ | |||
Microposts 2013-Test | ✔ | |||
Microposts 2013-Train | ✔ | |||
Microposts 2014-Test | ✔ | |||
Microposts 2014-Train | ✔ | |||
Microposts 2015-Test | ✔ | |||
Microposts 2015-Train | ✔ | |||
Microposts 2015-Dev | ✔ | |||
Microposts 2016-Test | ✔ | |||
Microposts 2016-Train | ✔ | |||
Microposts 2016-Dev | ✔ | |||
N3-RSS-500 | ✔ | |||
N3-Reuters-128 | ✔ | |||
OKE 2015 Task 1 evaluation dataset | ✔ | ✔ | ✔ | |
OKE 2015 Task 1 example set | ✔ | ✔ | ✔ | |
OKE 2015 Task 1 gold standard sample | ✔ | ✔ | ✔ | |
OKE 2015 Task 2 evaluation dataset | | | | ✔ |
OKE 2015 Task 2 example set | | | | ✔ |
OKE 2015 Task 2 gold standard sample | | | | ✔ |
Senseval 2 | ✔ | |||
Senseval 3 | ✔ | |||
UMBC | ✔ | |||
WSDM 2012 | ✔ |
Long term stability
The idea of GERBIL emerged in September 2014, when a couple of articles released at the same time each claimed to be state of the art. These approaches were not easily comparable because of their heterogeneous set-ups, datasets, and evaluation metrics. We therefore decided to build GERBIL by extending the BAT-Framework, lowering the barrier for people who are not able to write source code.
GERBIL is now more than 3 years old and has hosted more than 50,000 experiments. It is currently hosted at the research and development unit of the University Leipzig Computation Center and at Paderborn University, both of which keep daily backups to ensure long-term citability.
The survey data from our paper can be found at GERBIL's GitHub repository.
Contributors
- Ciro Baron (University Leipzig, Germany)
- Andreas Both (R&D, Unister GmbH, Germany)
- Martin Brümmer (University Leipzig, Germany)
- Diego Ceccarelli (University of Pisa, Italy)
- Marco Cornolti (University of Pisa, Italy)
- Didier Cherix (R&D, Unister GmbH, Germany)
- Bernd Eickmann (R&D, Unister GmbH, Germany)
- Paolo Ferragina (University of Pisa, Italy)
- Christiane Lemke (R&D, Unister GmbH, Germany)
- Andrea Moro (Sapienza University of Rome, Italy)
- Roberto Navigli (Sapienza University of Rome, Italy)
- Francesco Piccinno (University of Pisa, Italy)
- Giuseppe Rizzo (EURECOM, France)
- Harald Sack (HPI Potsdam, Germany)
- René Speck (Institute for Applied Informatics, Germany)
- Raphaël Troncy (EURECOM, France)
- Jörg Waitelonis (HPI Potsdam, Germany)
- Lars Wesemann (R&D, Unister GmbH, Germany)
Project Team
- Lixi Conrads
- Kunal Jha
- Prof. Dr. Axel-C. Ngonga Ngomo
- Michael Röder (Principal Contact / Maintainer)
- Dr. Ricardo Usbeck