This content is taken from the University of Southampton's online course, Introduction to Linked Data and the Semantic Web.

Missing information

In the examples we have seen so far, a solution is returned only if variable bindings can be found for every pattern listed after WHERE.

This means that if we retrieve several facts about an album (for example), the album will only be included in the output if all these facts are present in the dataset: if just one is missing, the album is dropped from the results, together with the facts that were found.

SPARQL deals with this problem by allowing any graph pattern in the list to be preceded by the keyword OPTIONAL. This means that when computing variable bindings, the query engine should accept incomplete bindings provided that the unbound variables occur only in optional patterns.
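
As a minimal sketch of a plain optional pattern, the query below lists every artist and album, adding a place of death only where one is recorded. It reuses the foaf:made and dbont:deathPlace properties from the example in this step; the use of SELECT and the choice of output variables are assumptions made for illustration.

```sparql
PREFIX dbont: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?artist ?album ?place_of_death
WHERE { ?artist foaf:made ?album .
    # Matched where possible; otherwise ?place_of_death stays unbound
    OPTIONAL { ?artist dbont:deathPlace ?place_of_death }
}
```

An artist with no dbont:deathPlace statement should still appear in the results, with the ?place_of_death column left empty.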

In the following query, an optional pattern is combined with a filter to select only those variable bindings in which a particular variable is not bound.

PREFIX dbont: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>

CONSTRUCT { ?album dc:creator ?artist . }
WHERE { ?artist foaf:made ?album .
    # Bind ?place_of_death only if a death place is recorded
    OPTIONAL { ?artist dbont:deathPlace ?place_of_death }
    # Keep only solutions in which no death place was found
    FILTER (!BOUND(?place_of_death))
}

The variable in question records an artist’s place of death, and it is assumed that if this information is missing from the dataset, the artist is still alive. More generally, if a variable in the CONSTRUCT template is left unbound because it occurs only in an OPTIONAL pattern that did not match, the triples containing that variable are not generated.
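
To illustrate the last point, here is a sketch of a variant in which the CONSTRUCT template also uses the optional variable. The second template triple is an assumption added for illustration and is not part of the course's query; the filter is dropped so that both living and deceased artists are returned.

```sparql
PREFIX dbont: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>

CONSTRUCT {
    ?album dc:creator ?artist .
    # Generated only for solutions in which ?place_of_death is bound
    ?artist dbont:deathPlace ?place_of_death .
}
WHERE { ?artist foaf:made ?album .
    OPTIONAL { ?artist dbont:deathPlace ?place_of_death }
}
```

For an artist with no recorded death place, only the dc:creator triple should appear in the output graph; the death-place triple is omitted rather than causing an error.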

As a result, “creator” relationships are constructed only for artists who are alive (or more precisely, artists for whom there is no death place recorded in the dataset).

Note that in the filter expression ‘!’ denotes negation, so that the whole expression means that the variable is not bound.
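
For comparison, SPARQL 1.1 also provides FILTER NOT EXISTS, which expresses the same test without an auxiliary optional pattern. The sketch below assumes an endpoint that supports SPARQL 1.1; it is an alternative formulation, not the form used in this course.

```sparql
PREFIX dbont: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>

CONSTRUCT { ?album dc:creator ?artist . }
WHERE { ?artist foaf:made ?album .
    # Passes only when no death place is recorded for ?artist
    FILTER NOT EXISTS { ?artist dbont:deathPlace ?place_of_death }
}
```

This version selects on the absence of a statement in the same way, so it should return the same triples as the OPTIONAL and !BOUND query for this dataset.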

You should take care using this kind of query, since it depends on a risky inference sometimes called the closed-world assumption – namely, that any relevant statement not found in the dataset must be false.

Thus if the dataset contains information about places of death, but no statement giving the place of death of Paul McCartney, we infer by this assumption that Paul McCartney must still be alive, since otherwise his place of death would have been recorded.

This work is a derivative of ‘Using Linked Data Effectively’ by The Open University (2014), licensed under a CC BY 4.0 International Licence, adapted and used by the University of Southampton. http://www.euclid-project.eu/
