Want to keep learning?

This content is taken from the University of Southampton's online course, Introduction to Linked Data and the Semantic Web. Join the course to learn more.

Reasoning over linked data

Reasoning enhances the information contained in a dataset by including results obtained by inference from the triples already present.

As a simple example, suppose that the dataset includes the following triples, shown here in Turtle (assume the usual prefixes):

dbpedia:The_Beatles a mo:MusicGroup .
mo:MusicGroup rdfs:subClassOf mo:MusicArtist .

Recall that the predicate a in the first triple is Turtle shorthand for rdf:type, denoting class membership. Now, suppose that we submit the following SELECT query designed to retrieve all triples in which the Beatles occur as subject:

PREFIX dbpedia: <http://dbpedia.org/resource/>

WHERE { dbpedia:The_Beatles ?predicate ?object }

Assuming there are no other relevant triples, the response to this query under a regime with no entailment will be the following single-row table:

?predicate ?object
a mo:MusicGroup

An obvious inference has been missed here, since if every music group is a music artist (as asserted by the second triple), the Beatles will also be a music artist. (It might sound odd to call a group an artist, but this is how the information is encoded in the Music Ontology.) If we execute the query on an engine that implements the RDFS entailment regime (a set of rules governing inference based on RDFS resources like rdfs:subClassOf), the output table will be enriched by a second variable binding inferred with the aid of the second triple.

?predicate ?object
a mo:MusicGroup
a mo:MusicArtist

This is a very simple case, but in general the formulation of workable entailment regimes is a complex task, still under investigation and discussion. One hard problem is what to do if the dataset is inconsistent, since a well-known result of logic states that from a contradiction, anything can be inferred. (This follows from the definition of material implication, for which the truth-conditions state that if the antecedent is false, the implication holds whatever the consequent. Hence a contradiction, being false by definition, implies any statement whatever.)

Another problem is that some queries might return an infinite set of solutions. For instance, there is a convention in RDF that allows for a family of predicates rdf:_1, rdf:_2, etc. ending in any positive integer; such predicates are used when constructing lists. This means that if all possible inferences are performed, a query of the form ?x rdf:type rdf:Property (intuitively, return all properties) would yield an infinite set.

For practical purposes we can ignore such cases, but they illustrate that that query engines may not always return all the entailments that one might expect.

In the next step we look first at typical entailments that arise from RDFS.

This work is a derivative of ‘Using Linked Data Effectively’ by The Open University (2014) and licensed under CC by 4.0 International Licence adapted and used by the University of Southampton. http://www.euclid-project.eu/

Share this article:

This article is from the free online course:

Introduction to Linked Data and the Semantic Web

University of Southampton

Get a taste of this course

Find out what this course is like by previewing some of the course steps before you join: