Web standards for the Semantic Web
Further standards are used in the technologies underlying the Semantic Web: RDF, RDFS and OWL. These allow complex statements to be made about data.
The Resource Description Framework (RDF) was originally introduced as a data model for metadata, that is, attributes of a document, image, program or similar resource, such as its author, date, location and coding standards.
First published as a W3C recommendation in 1999 [1], the framework has since been updated and generalised in its purpose to cover not only metadata (strictly interpreted) but knowledge of all kinds.
The basic idea of RDF is a very simple one: namely, that statements are represented as triples of the form subject–predicate–object, each triple expressing a relation (represented by the predicate resource) between the subject and object resources.
Formally, the subject is expressed by a URI or a blank node, the predicate by a URI, and the object by a URI, a blank node, or a literal such as a number or string.
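The positional rules for a triple can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the API of any RDF library: the class names `IRI`, `BlankNode` and `Literal` and the function `make_triple` are hypothetical names chosen for this sketch. It also accepts a blank node in the object position, which the RDF specification permits.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class IRI:
    """A resource identified by a URI (hypothetical helper class)."""
    value: str


@dataclass(frozen=True)
class BlankNode:
    """An anonymous resource with a purely local label."""
    label: str


@dataclass(frozen=True)
class Literal:
    """A plain data value such as a string or a number."""
    value: Union[str, int, float]


def make_triple(subject, predicate, obj):
    """Return (subject, predicate, object) after checking each position."""
    if not isinstance(subject, (IRI, BlankNode)):
        raise TypeError("subject must be a URI or a blank node")
    if not isinstance(predicate, IRI):
        raise TypeError("predicate must be a URI")
    if not isinstance(obj, (IRI, BlankNode, Literal)):
        raise TypeError("object must be a URI, a blank node or a literal")
    return (subject, predicate, obj)


# Example: a label statement about the Beatles.
beatles = IRI("http://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d#_")
label = IRI("http://www.w3.org/2000/01/rdf-schema#label")
triple = make_triple(beatles, label, Literal("The Beatles"))
```

A literal in the subject position, for example `make_triple(Literal("x"), label, beatles)`, is rejected with a `TypeError`.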
The original W3C recommendation for exposing RDF data was that it should be encoded in XML syntax, sometimes called RDF/XML.
It is for this reason that the semantic web ‘stack’ of languages has RDF implemented on top of XML.
However, notations have also been proposed which are easier for people to read and write, such as Turtle. In Turtle, a statement is formed simply by listing the elements of the triple on a line, in the order subject–predicate–object, followed by a full stop. URIs may be shortened through the use of abbreviations defined by ‘prefix’ and ‘base’ statements, as in the following example:
@base <http://musicbrainz.org/>.
@prefix mo: <http://purl.org/ontology/mo/>.
<artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d#_> a mo:MusicGroup.
Here the subject is abbreviated using the ‘base’ statement, and the object is abbreviated using the ‘prefix’ statement.
The very simple predicate ‘a’ relies on a further Turtle shorthand for very commonly used predicates, and refers to the ‘type’ relation between a resource and its class.
This can be seen from the following equivalent statement, in which all URIs are shown in their cumbersome unabbreviated form. The format in which every URI in a statement is fully expanded is also known as N-Triples. Note that this statement should occupy a single line, although it is shown here with wrapping so that it fits on the page.
<http://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d#_> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/ontology/mo/MusicGroup>.
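The expansion from the abbreviated to the unabbreviated form can be sketched as follows. This is a simplified illustration, not a Turtle parser: the function name `expand` is hypothetical, the statement is assumed to be already split into its three terms, and relative URIs are resolved against the base with the standard library function `urllib.parse.urljoin`.

```python
from urllib.parse import urljoin

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def expand(term, base, prefixes):
    """Expand one abbreviated Turtle term to a full URI in angle brackets."""
    if term == "a":
        # 'a' is the Turtle shorthand for the rdf:type predicate.
        return "<" + RDF_TYPE + ">"
    if term.startswith("<") and term.endswith(">"):
        # A (possibly relative) URI: resolve it against the base URI.
        return "<" + urljoin(base, term[1:-1]) + ">"
    # Otherwise treat the term as prefix:localname.
    prefix, local = term.split(":", 1)
    return "<" + prefixes[prefix] + local + ">"


base = "http://musicbrainz.org/"
prefixes = {"mo": "http://purl.org/ontology/mo/"}
statement = ["<artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d#_>", "a", "mo:MusicGroup"]

expanded = " ".join(expand(term, base, prefixes) for term in statement) + " ."
print(expanded)
```

The printed line consists of the same three full URIs as the unabbreviated statement above.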
Where multiple statements apply to the same subject, they can be abbreviated by placing a semi-colon after the first object, and then giving further predicate-object pairs separated by semi-colons, with a full stop after the final pair. For statements having the same subject and predicate, objects can be listed in a similar way separated by commas. These conventions are illustrated by the following statements:
@base <http://musicbrainz.org/>.
@prefix mo: <http://purl.org/ontology/mo/>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix dbpedia: <http://dbpedia.org/resource/>.
@prefix bbc: <http://www.bbc.co.uk/music/artists/>.
<artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d#_>
    rdfs:label "The Beatles";
    owl:sameAs dbpedia:The_Beatles,
               bbc:b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d\#artist.
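To make the effect of the semi-colon and comma conventions concrete, the sketch below unfolds a grouped description into the individual triples it stands for. It is an illustration only: the function name `flatten` and the list-of-pairs representation are assumptions made for this example, and terms are kept as plain strings rather than parsed.

```python
def flatten(subject, predicate_objects):
    """Yield one (subject, predicate, object) triple per listed object.

    Each (predicate, objects) pair corresponds to one ';'-separated part
    of the Turtle statement, and each object to one ','-separated item.
    """
    for predicate, objects in predicate_objects:
        for obj in objects:
            yield (subject, predicate, obj)


subject = "<artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d#_>"
grouped = [
    ("rdfs:label", ['"The Beatles"']),
    ("owl:sameAs", [
        "dbpedia:The_Beatles",
        "<http://www.bbc.co.uk/music/artists/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d#artist>",
    ]),
]

triples = list(flatten(subject, grouped))
for s, p, o in triples:
    print(s, p, o, ".")
```

The single abbreviated statement therefore stands for three separate triples, all with the same subject.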
You will have an opportunity to try and express your own RDF statements in step 1.10, using a similar example.
RDF Schema (RDFS) is an extension of RDF which allows resources to be classified explicitly as classes or properties; it also supports some further statements that depend on this classification, such as class-subclass or property-subproperty relationships, and domain and range of a property.
Some important resources in RDFS are as follows (for brevity we use the ‘rdfs’ prefix defined above):
rdfs:Class
A resource representing the class of all classes.
rdfs:subClassOf
Used as a predicate to mean that the subject is a subclass of the object.
rdfs:subPropertyOf
Used as a predicate to mean that the subject is a sub-property of the object.
rdfs:domain
Used as a predicate when the subject is a property and the object is the class that is the domain of this property.
rdfs:range
Used as a predicate when the subject is a property and the object is the class that is the range of this property.
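The following statements in Turtle serve to illustrate these RDFS resources. Note that they use abbreviated URIs: the ‘mo’ and ‘rdfs’ prefixes are given above, while ‘rdf’ and ‘foaf’ stand for the namespaces <http://www.w3.org/1999/02/22-rdf-syntax-ns#> and <http://xmlns.com/foaf/0.1/> respectively.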
mo:member rdf:type rdf:Property.
mo:member rdfs:domain mo:MusicGroup.
mo:member rdfs:range foaf:Agent.
mo:MusicGroup rdfs:subClassOf foaf:Group.
In these statements, the resource ‘mo:member’ denotes the property that relates a music group to each of its members – for instance, the Beatles to John, Paul, George and Ringo, as in the following triple:
dbpedia:The_Beatles mo:member dbpedia:Ringo_Starr.
The second and third statements above give the domain and range of the property ‘mo:member’. Intuitively, their meaning is that if ‘mo:member’ is employed as predicate in a triple, its subject will belong to the class ‘mo:MusicGroup’, and its object to the class ‘foaf:Agent’. The fourth statement means that any resource belonging to the class ‘mo:MusicGroup’ will also belong to the (more general) class ‘foaf:Group’.
An important gain in adding such statements is that they allow new facts to be inferred from existing ones.
Consider for instance how they may be combined with the statement (just given) that Ringo is a member of the Beatles. Using the domain and range statements for the property ‘mo:member’, it follows directly that the Beatles are a music group, and that Ringo is an agent; using the subClassOf statement, it follows further that the Beatles are a group.
Encoded in Turtle, these inferred facts are as follows:
dbpedia:The_Beatles rdf:type mo:MusicGroup.
dbpedia:Ringo_Starr rdf:type foaf:Agent.
dbpedia:The_Beatles rdf:type foaf:Group.
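The reasoning just described can be reproduced with a small forward-chaining loop. The sketch below is a deliberately minimal illustration and not a complete RDFS reasoner: it applies only the domain, range and subclass rules discussed here, the function name `rdfs_closure` is hypothetical, and triples are represented as tuples of abbreviated names.

```python
RDF_TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply the domain, range and subclass rules until nothing new appears."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                # Domain rule: (p rdfs:domain C) and (s p o) give (s rdf:type C).
                if p2 == DOMAIN and s2 == p:
                    new.add((s, RDF_TYPE, o2))
                # Range rule: (p rdfs:range C) and (s p o) give (o rdf:type C).
                if p2 == RANGE and s2 == p:
                    new.add((o, RDF_TYPE, o2))
                # Subclass rule: (s rdf:type C) and (C rdfs:subClassOf D)
                # give (s rdf:type D).
                if p == RDF_TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, RDF_TYPE, o2))
        if new <= facts:
            return facts
        facts |= new


schema = [
    ("mo:member", RDF_TYPE, "rdf:Property"),
    ("mo:member", DOMAIN, "mo:MusicGroup"),
    ("mo:member", RANGE, "foaf:Agent"),
    ("mo:MusicGroup", SUBCLASS, "foaf:Group"),
]
data = [("dbpedia:The_Beatles", "mo:member", "dbpedia:Ringo_Starr")]

inferred = rdfs_closure(schema + data) - set(schema + data)
for triple in sorted(inferred):
    print(*triple, ".")
```

The first pass yields the two facts that follow from the domain and range statements; the fact that the Beatles are a `foaf:Group` appears only on the second pass, because it depends on the newly inferred `mo:MusicGroup` membership.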
RDFS also contains some predicates for linking a resource to information useful in presentation and navigation, but not for inference. These include the following:
rdfs:comment
Associates a resource with a human-readable description of it.
rdfs:label
Associates a resource with a human-readable label for it.
rdfs:seeAlso
Associates a resource with another resource that might provide additional information about it.
rdfs:isDefinedBy
A sub-property of ‘rdfs:seeAlso’, indicating a resource that contains a definition of the subject resource.
The Web Ontology Language (OWL) extends RDFS to provide an implementation of a description logic, capable of expressing more complex general statements about individuals, classes and properties.
OWL was developed in the early 2000s and became a W3C standard (along with RDFS) in 2004. The acronym OWL was preferred to the more logical WOL because it is easier to pronounce, provides a handy logo, and is suggestive of wisdom. Of course the name also reminds us of the character in ‘Winnie the Pooh’ who misspells his name ‘Wol’.
The reason for choosing description logic, rather than a more expressive kind of mathematical logic, has already been mentioned: the aim was to achieve fast scalable reasoning services, and hence to use a logic for which efficient reasoning algorithms were already available.
In fact description logics are more a family of languages than a single language. They can be thought of as a palette of operators for constructing classes, properties and statements, from which the user can make different selections, so obtaining fragments with different profiles of expressivity and tractability.
The OWL standard has continued to develop since then, and the current version, OWL 2, provides for the fragments shown in Figure 1.4; their meanings are as follows:
Figure 1.4 OWL Languages
OWL 2 Full
Used informally to refer to RDF graphs considered as OWL 2 ontologies and interpreted using the RDF-Based Semantics.
OWL 2 DL
Used informally to refer to OWL 2 ontologies interpreted using the formal semantics of Description Logic (‘Direct Semantics’).
OWL 2 EL
A simple fragment limited to basic classification, allowing reasoning in polynomial time.
OWL 2 QL
A fragment designed to be translatable to querying in relational databases.
OWL 2 RL
A fragment designed to be efficiently implementable using rule-based reasoners.
As already explained, a detailed understanding of OWL is not necessary for working with Linked Data. When reasoning over huge amounts of data, only the simplest reasoning processes are computationally efficient, and these can for the most part be implemented using only the resources of RDFS. Very briefly, the additional resources in OWL are terms providing mainly for the following:
Class construction: forming new classes from existing classes, properties and individuals (e.g., ObjectIntersectionOf);
Property construction: distinguishing object properties (resources as values) from data properties (literals as values);
Class axioms: statements about classes, describing sub-class, equivalence and disjointness relationships;
Property axioms: statements about properties, including relationships such as equivalence and sub-property, and also attributes such as whether a property is functional, transitive, and so forth;
Individual axioms: statements about individuals, including class membership, and whether two resources represent the same individual or different individuals.
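As one small illustration of an individual axiom at work, the sketch below groups resources that are declared to be the same individual. It relies only on the fact that ‘owl:sameAs’ is symmetric and transitive, and it is in no way an OWL reasoner. The function name `same_as_groups` and the abbreviated resource names (‘mb:beatles’, ‘bbc:beatles’) are hypothetical and chosen for brevity.

```python
def same_as_groups(triples):
    """Group resources linked, directly or indirectly, by owl:sameAs."""
    parent = {}

    def find(x):
        # Return the representative of x's group (union-find with path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == "owl:sameAs":
            # Merge the groups of subject and object.
            parent[find(s)] = find(o)

    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


triples = [
    ("mb:beatles", "owl:sameAs", "dbpedia:The_Beatles"),
    ("mb:beatles", "owl:sameAs", "bbc:beatles"),
    ("mb:beatles", "rdfs:label", '"The Beatles"'),
]

groups = same_as_groups(triples)
print(groups)
```

All three names end up in a single group, even though no statement links ‘dbpedia:The_Beatles’ and ‘bbc:beatles’ directly.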
In the next step, we introduce the final key standard underlying the Semantic Web: SPARQL. Later in the course we cover SPARQL in a lot more detail, and you’ll learn how to write a range of queries to retrieve information from sets of triples.
This work is a derivative of ‘Using Linked Data Effectively’ by The Open University (2014), licensed under the CC BY 4.0 International Licence, adapted and used by the University of Southampton. http://www.euclid-project.eu/
[1] O. Lassila and R. Swick (1999) “Resource Description Framework (RDF) Model and Syntax Specification”. Published online at http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/.