GHO: Publishing and Interlinking the Global Health Observatory Dataset
The improvement of public health is one of the main indicators of societal progress. Statistical data for monitoring public health is highly relevant to a number of sectors, such as research (e.g. in the life sciences or economics), policy making, health care, the pharmaceutical industry, and insurance. Such data is now available even on a global scale, for example in the Global Health Observatory (GHO) of the United Nations' World Health Organization (WHO). GHO comprises more than 50 different datasets and covers all 198 WHO member countries. It is updated as more recent or revised data becomes available or when the methodology changes. However, this data is only accessible via complex spreadsheets. As a result, queries over the 50 different datasets, as well as combinations with other datasets, are very tedious and require a significant amount of manual work. By making the data available as RDF, we lower the barrier for data re-use and integration.
The converted GHO data is available at http://gho.aksw.org/. Converting the entire GHO data yielded an RDF dataset of almost 8 million triples. The following example shows a single statistical item, the death value of 1098, from the GHO dataset, represented using the Data Cube vocabulary:
gho:o1 a qb:Observation ;
    gho:Disease "All Causes" .
gho:Country a qb:DimensionProperty .
gho:Disease a qb:DimensionProperty .
This short presentation describes the process of converting the CSV files to RDF using SCOVO (Statistical Core Vocabulary) in OntoWiki. SCOVO is a predecessor of the Data Cube Vocabulary, and the conversion process is similar for both.
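To make the CSV-to-RDF step more concrete, the following is a minimal sketch in Python of how rows of a spreadsheet export could be turned into Data Cube observations serialized as Turtle. It is not the OntoWiki-based conversion described above. The `gho:` namespace URI, the measure property `gho:deaths`, the column names `Country`, `Disease`, and `Deaths`, and the country `Exampleland` are all hypothetical placeholders chosen for illustration. Only the `qb:` namespace and the classes `qb:Observation`, `qb:DimensionProperty`, and `qb:MeasureProperty` come from the Data Cube vocabulary. A complete data cube would also link each observation to a `qb:DataSet`, which is omitted here for brevity.

```python
import csv
import io
import re

QB = "http://purl.org/linked-data/cube#"   # Data Cube vocabulary namespace
GHO = "http://example.org/gho/"            # placeholder, NOT the real GHO namespace


def local_name(cell):
    """Turn a CSV cell such as 'All Causes' into a local name like 'AllCauses'."""
    words = re.findall(r"[A-Za-z0-9]+", cell)
    return "".join(word[:1].upper() + word[1:] for word in words)


def csv_to_turtle(csv_text):
    """Convert CSV text with Country, Disease, Deaths columns to Turtle.

    The column names and the gho:deaths measure are assumptions made
    for this sketch; the actual GHO spreadsheets may be laid out differently.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = [
        f"@prefix qb: <{QB}> .",
        f"@prefix gho: <{GHO}> .",
        "",
        # Declare the two dimensions and the (hypothetical) measure once.
        "gho:Country a qb:DimensionProperty .",
        "gho:Disease a qb:DimensionProperty .",
        "gho:deaths a qb:MeasureProperty .",
        "",
    ]
    # Each CSV row becomes one qb:Observation with two dimension values
    # and one integer measure value.
    for index, row in enumerate(reader, start=1):
        lines.append(f"gho:o{index} a qb:Observation ;")
        lines.append(f"    gho:Country gho:{local_name(row['Country'])} ;")
        lines.append(f"    gho:Disease gho:{local_name(row['Disease'])} ;")
        lines.append(f"    gho:deaths {int(row['Deaths'])} .")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "Country,Disease,Deaths\nExampleland,All Causes,1098\n"
    print(csv_to_turtle(sample))
```

Generating Turtle as plain strings keeps the sketch free of third-party dependencies. For real data, an RDF library that handles escaping, datatypes, and URI validation would be the safer choice.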