Property Alignment

From Knoesis wiki
Revision as of 23:03, 6 June 2013 by Kalpa

Property Alignment on Linked Datasets

Property alignment in Linked Open Data (LOD), or in linked datasets in general, is a non-trivial task because of complex data representations. Concept (class) and instance-level alignment have been investigated in the recent past, but property alignment has not yet received much attention. We therefore propose an approach that can handle complex data representations and also achieve a higher ratio of correct matches. Our approach builds on a fundamental building block of interlinked datasets such as LOD: Entity Co-Reference (ECR) links. We match property extensions to derive a measure that approximates owl:equivalentProperty. We use ECR links to find equivalent instances for a particular property extension, and then accumulate the number of matching extensions to decide whether a property pair between two datasets matches.


Approach

In this initial experiment, we explored property extension matching using the owl:sameAs and skos:exactMatch interlinking relationships as ECR links. Later, we will explore less restrictive links such as skos:closeMatch, as well as links such as rdfs:seeAlso that certain datasets use for their own requirements, and check the performance.
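As a rough illustration of how ECR links could be collected before matching, the sketch below builds a lookup table from triples whose predicate is owl:sameAs or skos:exactMatch. The function name build_ecr_index and the tuple representation of triples are assumptions made for this example and are not taken from our implementation. Links are treated as symmetric, and no transitive closure is computed.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

# Predicates treated as ECR links in the initial experiment.
ECR_PREDICATES = {OWL_SAME_AS, SKOS_EXACT_MATCH}


def build_ecr_index(triples, ecr_predicates=ECR_PREDICATES):
    """Map each resource to the set of resources it is linked to by an ECR link.

    `triples` is an iterable of (subject, predicate, object) string tuples
    (a hypothetical in-memory representation chosen for this sketch).
    """
    index = defaultdict(set)
    for s, p, o in triples:
        if p in ecr_predicates:
            # Record the link in both directions (symmetric treatment).
            index[s].add(o)
            index[o].add(s)
    return dict(index)
```

Exploring a less restrictive link type later would, under this sketch, only require adding its IRI to the ecr_predicates argument.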


Fig. 1. Matching mechanism of the extension based approach

Figure 1 shows how the matching process works in our extension-based algorithm. Each property pair is matched independently of the others, by extension, by analyzing each instance associated with that property (in the extension slots). The algorithm processes subject instances from the starting dataset: it extracts the triples of each subject instance and finds the corresponding subject instance in the second dataset by traversing an ECR link. The object values for the property pair are then matched, again using ECR links. The final result of this matching process is illustrated by the example in Figure 2. To decide the final matching pairs, we keep track of two statistical measures, MatchCount and Co-appearanceCount, which are illustrated in Figure 2 and described in the paper (to appear in I-SEMANTICS 2013). These measures help to reduce incorrect mappings such as "birth_place" and "place_of_birth".

Fig. 2. Matching example for the extension based approach
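The matching process described above can be sketched roughly as follows. This is a minimal illustration, not our implementation: the function name, the tuple representation of triples, and the dictionary form of the ECR links are assumptions. In particular, the way the two counters are incremented is only one plausible reading of MatchCount and Co-appearanceCount (the exact definitions are given in the paper). Here, Co-appearanceCount counts how often two properties occur on ECR-linked subjects, and MatchCount counts how often their object values also correspond.

```python
from collections import defaultdict


def match_property_extensions(triples_1, triples_2, ecr):
    """Count extension matches for every property pair across two datasets.

    triples_1, triples_2: iterables of (subject, property, object) tuples.
    ecr: dict mapping a resource in dataset 1 to the set of co-referent
         resources in dataset 2 (hypothetical representation of ECR links).
    Returns (match_count, co_appearance_count), both keyed by (p1, p2).
    """
    # Index the second dataset by subject for direct lookup.
    by_subject_2 = defaultdict(list)
    for s, p, o in triples_2:
        by_subject_2[s].append((p, o))

    match_count = defaultdict(int)
    co_appearance_count = defaultdict(int)

    for s1, p1, o1 in triples_1:
        # Traverse ECR links from the subject into the second dataset.
        for s2 in ecr.get(s1, ()):
            for p2, o2 in by_subject_2.get(s2, ()):
                # Both properties appear on co-referent subjects.
                co_appearance_count[(p1, p2)] += 1
                # Objects match if identical or connected by an ECR link.
                if o1 == o2 or o2 in ecr.get(o1, ()):
                    match_count[(p1, p2)] += 1

    return dict(match_count), dict(co_appearance_count)
```

A property pair with a high MatchCount relative to its Co-appearanceCount would be a candidate for equivalence under this reading. How the two measures are combined and thresholded is left out here, since that is specified in the paper rather than on this page.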

Experiment and Datasets

Analysis