Palmetto: Palmetto is a quality measuring tool for topics

Palmetto is a quality measuring tool for topics based on coherence calculations.


With topic modeling, topics can be extracted from a collection of documents automatically and without supervision. A disadvantage of topic modeling is that, in most cases, the resulting topics have to be evaluated manually by humans. Palmetto is a tool that tries to help researchers by offering different coherence calculations for a topic's top words. These coherences are based on word co-occurrences in the English Wikipedia and have been shown to correlate with human ratings.
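To make the idea of a co-occurrence-based coherence concrete, the following is a minimal sketch rather than Palmetto's actual implementation. It assumes one common choice of measure, the average normalized pointwise mutual information (NPMI) over all pairs of top words, and it counts co-occurrences per document in a tiny made-up corpus instead of the English Wikipedia. The function name `npmi_coherence`, the toy documents, and the handling of edge cases are all illustrative assumptions.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents):
    """Average NPMI over all pairs of top words.

    Probabilities are estimated as the fraction of documents that
    contain a word (or both words of a pair). This is an illustrative
    assumption, not necessarily how Palmetto estimates them.
    """
    if len(top_words) < 2:
        raise ValueError("need at least two top words")
    doc_sets = [set(doc.lower().split()) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents containing all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            # The words never co-occur: use the NPMI lower bound.
            scores.append(-1.0)
        elif p12 == 1:
            # Both words occur in every document, so NPMI is 0/0.
            # Scoring this as 0.0 is a convention chosen for this sketch.
            scores.append(0.0)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Hypothetical toy corpus standing in for a real reference corpus.
documents = [
    "the cat chased the mouse",
    "a cat and a dog played",
    "the dog barked at the cat",
    "stocks fell as the market closed",
    "the market rallied and stocks rose",
    "investors watched the stock market",
]

coherent_topic = ["cat", "dog", "mouse"]
mixed_topic = ["cat", "market", "stocks"]

print(npmi_coherence(coherent_topic, documents))
print(npmi_coherence(mixed_topic, documents))
```

In this toy setting, the topic whose top words tend to appear in the same documents receives the higher score, which is the intuition behind using such coherences as a stand-in for human judgments of topic quality.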

The source code is dual licensed and can be found on GitHub. For larger experiments, the program can be downloaded or the web service can be used. More on how Palmetto can be used can be found on the project's wiki page.

A Dutch index for Palmetto has been created by van der Zwaan, Marx and Kamps. Thus, Palmetto can be used for Dutch as well. The index can be downloaded here.

Researchers who want to try out different coherences themselves may be interested to know that Palmetto can be used as a Java library and already contains more than 200,000 coherences that were evaluated for the publication Exploring the Space of Topic Coherences.

The topics and human ratings used in this publication, as well as the Movie and RTL-Wiki corpora, can be found here. Since we did not create all of the datasets ourselves, please cite the creators/providers of the datasets where appropriate. You can find the references to their publications in our paper, in the section that describes the datasets.


News

DBpedia Day @ SEMANTiCS 2022 (2022-08-08T11:24:02+02:00 by Julia Holze)

We are happy to announce that we are partnering again with the SEMANTiCS Conference, which will host this year’s DBpedia Day on September 13, 2022.

DBpedia Knowledge Engineering PhD Symposium (2022-05-02T16:59:37+02:00 by Julia Holze)

Dear all, we are excited to invite you to the 1st DBpedia Knowledge Engineering PhD Symposium, organized on July 6th, 2022 in Leipzig, Germany.

Tutorial @ Knowledge Graph Conference 2022 (2022-04-25T12:24:06+02:00 by Julia Holze)

On May 2, 2022, we will organize a tutorial 2.0 at the Knowledge Graph Conference (KGC) 2022.

International Workshop on Data-driven Resilience Research 2022 (2022-04-21T14:43:27+02:00 by Julia Holze)

In the face of continuously changing contextual conditions and ubiquitous disruptive crisis events, the concept of resilience refers to some of the most urgent, challenging, and interesting issues in today's society.

DBpedia @ Google Summer of Code Program 2022 (2022-03-23T14:26:48+01:00 by Julia Holze)

DBpedia, one of InfAI’s community projects, will be part of the 11th Google Summer of Code (GSoC) program. The GSoC program aims to bring students from all over the globe into open source software development.