LIMES implements novel time-efficient approaches for link discovery in metric spaces. Our approaches use different approximation techniques to compute estimates of the similarity between instances. These estimates are then used to filter out a large number of instance pairs that do not satisfy the mapping conditions. In this way, LIMES can reduce the number of comparisons needed during the mapping process by several orders of magnitude. The approaches implemented in LIMES include the original LIMES algorithm for edit distances, HR3, HYPPO, and ORCHID. Additionally, LIMES supports HELIOS, the first planning technique for link discovery, which minimizes the overall execution time of a link specification without any loss of completeness. Moreover, LIMES implements supervised and unsupervised machine-learning algorithms for finding accurate link specifications. The algorithms implemented here include the supervised, active and unsupervised versions of EAGLE and WOMBAT.
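To illustrate how a cheap distance estimate can prune comparisons in a metric space, the following minimal sketch uses the triangle inequality with a single reference point. It is not the LIMES implementation: the function names, the single-exemplar setup and the sample data are assumptions made for illustration only.

```python
from itertools import product
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (a metric) via the standard dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def link_with_filter(source: List[str], target: List[str],
                     exemplar: str, max_distance: int
                     ) -> Tuple[List[Tuple[str, str]], int]:
    """Return (links, number of exact comparisons actually performed).

    Hypothetical helper: distances to one exemplar are precomputed, and
    the triangle inequality d(s, t) >= |d(s, e) - d(t, e)| gives a lower
    bound that rules out pairs without computing d(s, t).
    """
    ds: Dict[str, int] = {s: edit_distance(s, exemplar) for s in source}
    dt: Dict[str, int] = {t: edit_distance(t, exemplar) for t in target}
    links, comparisons = [], 0
    for s, t in product(source, target):
        if abs(ds[s] - dt[t]) > max_distance:
            continue  # lower bound already exceeds the threshold
        comparisons += 1
        if edit_distance(s, t) <= max_distance:
            links.append((s, t))
    return links, comparisons
```

Because the estimate is a true lower bound, the filter never discards a pair that satisfies the distance threshold, so the result is the same as that of an exhaustive comparison.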
The LIMES framework consists of eight main modules, each of which can be extended to accommodate new or improved functionality. The central module of LIMES is the controller module, which coordinates the matching process. The matching process is carried out as follows: First, the controller calls the configuration module, which reads the configuration file and extracts all the information necessary to carry out the comparison of instances, including the URLs of the SPARQL endpoints of the knowledge bases S (source) and T (target), the restrictions on the instances to map (e.g., their type), the expression of the metric to be used, and the threshold to be used.
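The information extracted by the configuration module can be pictured as a small record. The sketch below is only an illustration: the class and field names, the example URLs, the metric string and the assumption that the threshold lies in [0, 1] are hypothetical and do not reflect the actual LIMES configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeBaseConfig:
    endpoint: str      # URL of the SPARQL endpoint
    restriction: str   # restriction on the instances to map, e.g. their type


@dataclass(frozen=True)
class LinkingConfig:
    source: KnowledgeBaseConfig   # knowledge base S
    target: KnowledgeBaseConfig   # knowledge base T
    metric_expression: str        # expression of the metric to be used
    threshold: float              # threshold to be used

    def __post_init__(self) -> None:
        # Assumption for this sketch: thresholds are similarities in [0, 1].
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must lie in [0, 1]")


# Placeholder values, not a real endpoint or a verified metric expression.
config = LinkingConfig(
    source=KnowledgeBaseConfig("http://example.org/source/sparql",
                               "?x rdf:type ex:City"),
    target=KnowledgeBaseConfig("http://example.org/target/sparql",
                               "?y rdf:type ex:City"),
    metric_expression="levenshtein(x.label, y.label)",
    threshold=0.9,
)
```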
Provided that the configuration file is valid w.r.t. the LIMES Specification Language (LSL), the query module is called. This module uses the configuration for the source and target knowledge bases to retrieve, from their SPARQL endpoints, the instances and properties that adhere to the restrictions specified in the configuration file. The query module writes its output into a file by invoking the cache module. Once all instances have been stored in the cache, the controller chooses between performing Link Discovery (LD) and Machine Learning. For Link Discovery, LIMES rewrites, plans and executes the Link Specification (LS) included in the configuration file by calling the rewriter, planner and engine modules, respectively. The main goal of LD is to identify the set of links (the mapping) that satisfy the conditions imposed by the input LS. For Machine Learning, LIMES calls the machine learning algorithm specified in the configuration file to identify an appropriate LS to link S and T, and then executes that LS. For both tasks, the mapping is stored in the output file chosen by the user in the configuration file, as either an RDF or an XML file.
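The choice the controller makes between the two tasks can be sketched as follows. This is a deliberately simplified illustration, not the LIMES API: a link specification is modelled as a plain predicate over instance pairs, the rewriter, planner and engine are collapsed into one naive `execute` function, and all names are hypothetical.

```python
from typing import Callable, List, Optional, Set, Tuple

Mapping = Set[Tuple[str, str]]
# Assumption: a link specification is modelled as a predicate on a pair.
LinkSpec = Callable[[str, str], bool]
# Assumption: a learner derives a link specification from the instances.
Learner = Callable[[List[str], List[str]], LinkSpec]


def execute(spec: LinkSpec, source: List[str], target: List[str]) -> Mapping:
    """Stand-in for rewriting, planning and executing a specification:
    here the spec is simply evaluated on every pair of instances."""
    return {(s, t) for s in source for t in target if spec(s, t)}


def run(source: List[str], target: List[str],
        spec: Optional[LinkSpec] = None,
        learner: Optional[Learner] = None) -> Mapping:
    """Link Discovery if a spec is given; otherwise learn one first."""
    if spec is None:
        if learner is None:
            raise ValueError("either a link specification or a learner is required")
        spec = learner(source, target)   # Machine Learning: find an LS
    return execute(spec, source, target)  # both tasks end by executing the LS
```

In both branches the result is a mapping, which mirrors the description above: machine learning only adds the step of finding a specification before it is executed.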
The algorithms implemented in LIMES were published in several papers. Below are links to evaluation results.
- An Evaluation of Point Set Distance Measures for Link Discovery
- Unsupervised Link Discovery Through Knowledge Base Repair
- RADON - Rapid Discovery of Topological Relation
- AEGLE - An Efficient Approach for the Generation of Allen Relations
- Download the LIMES package (includes a user manual) and run it locally on your server
- You can either execute LIMES using the graphical interface or run LIMES via the command line as a Java executable package.