Principles of linked data

In his 2006 note, Berners-Lee set out four simple principles for publishing data on the web. These are best seen as rules of best practice rather than rules that must be obeyed: the idea is that the more people follow these principles, the more their data will be usable by others.

In brief, the principles are as follows:

  1. Use URIs to identify things.
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF, RDFS, SPARQL).
  4. Include links to other URIs, so that they can discover more things.

The rationale for these principles is probably obvious. By using URIs to identify individuals, classes, and properties, we obtain names that perform a double duty: as well as referring to the relevant thing, they give us a location on the web where we may look for information about that thing.

Other naming schemes accomplish only the first of these duties. However, to obtain benefit from a name that also serves as a web address, the URI should not be a broken link. It should point to relevant information, encoded in one of the expected formats. This benefit will be enhanced further if the information includes URIs that point to other locations on the web from which additional relevant information might be recovered.

What do you think would be an effective way of measuring how well a dataset conforms to these principles? How do you think datasets could be rated? You might want to think about how other things are rated in real life, such as restaurants, hotels etc.


Share this article:

This article is from the free online course:

Introduction to Linked Data and the Semantic Web

University of Southampton

Get a taste of this course

Find out what this course is like by previewing some of the course steps before you join: