What is linked data?
In 2006 Berners-Lee wrote an influential note suggesting principles for the publication of data on the semantic web. The original text of Berners-Lee’s suggestions can be found at this web address:
Since then, the volume of published linked data has grown from around 2 billion triples in 2007 to over 30 billion in 2011. These triples are interconnected by over 500 million RDF links, the main purpose of which is to establish chains of URIs that refer to the same individuals.
Figure 1.5: Linked Open Data Cloud diagram
Through such links, published datasets are combined into a vast body of data known as a ‘cloud’.
Figure 1.5 shows the Linked Open Data cloud diagram for 2007, in which nodes represent published datasets, and links represent sets of RDF triples through which the URIs in one dataset are paired with their counterparts in another dataset.
Thus the link from DBpedia to MusicBrainz means that DBpedia includes not only RDF triples that give information about the world, but also triples that link some DBpedia names to their synonyms in MusicBrainz.
We saw examples of such statements in the last section, including the following triple, which links the two names for the Beatles:
<http://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d> <http://www.w3.org/2002/07/owl#sameAs> <http://dbpedia.org/resource/The_Beatles>.
Note that since the ‘sameAs’ relation is transitive and symmetric, two statements of the form ‘X sameAs Y’ and ‘Y sameAs Z’ (or equivalently ‘Z sameAs Y’) can be combined to infer ‘X sameAs Z’; in this way, lists of synonymous names can be derived from the cloud.
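The inference described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular linked data tool works: triples are represented as plain (subject, predicate, object) tuples, the helper name `synonym_sets` is made up for this example, and the `http://example.org/band/beatles` URI is a hypothetical third name added so that a chain of two links exists. The MusicBrainz and DBpedia URIs are the ones from the triple shown earlier.

```python
from collections import defaultdict

SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# URIs from the triple in the text, plus one hypothetical URI (assumption).
MB = "http://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d"
DBP = "http://dbpedia.org/resource/The_Beatles"
EX = "http://example.org/band/beatles"  # hypothetical, for illustration only


def synonym_sets(triples):
    """Group URIs into sets of names connected by sameAs links.

    Because sameAs is symmetric and transitive, the direction of each
    link is irrelevant: any two URIs joined by a chain of links end up
    in the same set.
    """
    parent = {}

    def find(uri):
        # Return the representative URI of the group containing `uri`.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # shorten the path as we go
            uri = parent[uri]
        return uri

    for subject, predicate, obj in triples:
        if predicate != SAME_AS:
            continue  # ordinary triples about the world are ignored here
        root_s, root_o = find(subject), find(obj)
        if root_s != root_o:
            parent[root_s] = root_o  # merge the two groups

    groups = defaultdict(set)
    for uri in list(parent):
        groups[find(uri)].add(uri)
    return list(groups.values())


triples = [
    (MB, SAME_AS, DBP),   # 'X sameAs Y' (the triple from the text)
    (EX, SAME_AS, DBP),   # 'Z sameAs Y' (hypothetical link)
    (DBP, "http://xmlns.com/foaf/0.1/name", "The Beatles"),  # not a sameAs link
]

result = synonym_sets(triples)
for group in result:
    print(sorted(group))
```

Here the two sameAs statements share the DBpedia URI, so all three names fall into a single set even though no triple directly links the MusicBrainz URI to the hypothetical one.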
In the next steps, we will look at some of the principles behind linked data, and later we will discuss some application domains that make use of linked data.
© This work is a derivative of ‘Using Linked Data Effectively’ by The Open University (2014), licensed under a CC BY 4.0 International Licence, adapted and used by the University of Southampton. http://www.euclid-project.eu/