This content is taken from the University of Southampton's online course, Introduction to Linked Data and the Semantic Web.

Skip to 0 minutes and 3 seconds BARRY NORTON: Once we have the four principles in mind for how to use those technologies in the best possible way, we also have a star rating scheme to look at the data that people expose on the web, and say how well does it conform to those principles and best practice? 5 stars is where following the four principles aims to get you, but there are star ratings for people who are part of the way there, who are not necessarily using all of the linked data technologies or maybe are not using them all properly. So we get one star for just putting data on the web.

Skip to 0 minutes and 43 seconds So if we have somehow a web page about the Beatles, that will give us at least one star because we’ve put something online about the Beatles, if it is resolvable over HTTP, which means if it’s on the web. We’ll get two stars for, at the technical level, the data being machine readable in some way. Now what you’ll find when people have legacy data is the crudest thing they can do, if the legacy data’s on paper, is to scan it all and put images online from scans, or put those images into PDFs and publish their company handbook or their company product catalogue just as images within a PDF or as images on the web. That would only get you one star.

Skip to 1 minute and 41 seconds To get two stars, it has to be machine readable.

Skip to 1 minute and 47 seconds In that case you need to run reliable optical character recognition over it so that the data is machine accessible. Or better, the normal case, you put a database online. So your database, perhaps your relational database, has been transformed into HTML pages, and the data’s there, machine readable within the HTML. For that, you get yourself two stars. You get yourself three stars by using some non-proprietary formats for that data. So no PDFs, no Excel documents, but instead, something more open. So plain text is as open as it can be. A CSV file instead of an Excel spreadsheet will get you three stars.

Skip to 2 minutes and 40 seconds If you want to make it up to four stars, you need to identify the things that you’re talking about, using an open standard. Ideally, linked data best principles would tell you to use HTTP URIs, but other identifiers are available, even within the URI scheme. So if you use stock book IDs, and you consistently use IDs when you’re referring to things within your data, you get yourself four stars. To get five stars, you have to link those IDs, link the things within your data to other people’s data as well. So the best way to get five stars is to follow the four linked data principles.

Skip to 3 minutes and 34 seconds So four star says consistently use IDs; linked data best principles say consistently use HTTP URIs as those IDs. Five star says link to other people. Following on that argument, the linked data best principles say the easiest way to do that is that you use HTTP URIs, other people use HTTP URIs, and you reuse other people’s URIs in your data. How do you do that? The easiest way is by using RDF. I relate two things that have URIs together under a particular relationship by asserting an RDF triple to relate the two things.
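As a rough illustration of that last point, the sketch below writes a single RDF triple as one line of N-Triples text, linking an identifier in our own data to someone else's identifier for the same thing. It is only a minimal sketch: the helper `to_ntriple` and the `example.org` URI are hypothetical, the DBpedia-style URI is assumed purely for illustration, and `owl:sameAs` is just one predicate that could express the link.

```python
def to_ntriple(subject: str, predicate: str, obj: str) -> str:
    """Serialise one triple whose three terms are all URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Hypothetical HTTP URI minted for the band within our own dataset.
OUR_BAND = "http://example.org/band/the-beatles"

# Assumed URI that another data provider uses for the same band
# (a DBpedia-style identifier, chosen here only as an example).
THEIR_BAND = "http://dbpedia.org/resource/The_Beatles"

# owl:sameAs states that two URIs identify the same thing.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# One subject, one relationship, one object: a single RDF triple.
triple = to_ntriple(OUR_BAND, SAME_AS, THEIR_BAND)
print(triple)
```

Because the object of the triple is another provider's URI, publishing this one statement is what connects the two datasets, which is the step the five-star level asks for.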

5-star linked open data

In 2010 Berners-Lee extended the note referenced earlier to propose a system for rating datasets, based on the five-star rating system used for hotels.

Closely related to the principles from the previous step, the 5 Star Linked Open Data system is as follows:

  • One-star (*): The data is available on the web with an open license.

  • Two-star (**): The data is structured and machine-readable.

  • Three-star (***): The data does not use a proprietary format.

  • Four-star (****): The data uses only open standards from W3C (RDF, SPARQL).

  • Five-star (*****): The data is linked to that of other data providers.

Note that every level here includes the previous levels: thus for instance three-star data must also be available on the web in machine-readable form.
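The cumulative nature of the levels can be sketched in a few lines of Python. This is an illustrative model only, not an official scoring tool: the `Dataset` class, its field names and the `star_rating` function are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical yes/no description of a published dataset."""
    on_web_open_licence: bool      # one star
    machine_readable: bool         # two stars
    non_proprietary_format: bool   # three stars
    uses_w3c_standards: bool       # four stars
    linked_to_other_data: bool     # five stars


def star_rating(d: Dataset) -> int:
    """Count stars in order, stopping at the first criterion that is not met."""
    criteria = [
        d.on_web_open_licence,
        d.machine_readable,
        d.non_proprietary_format,
        d.uses_w3c_standards,
        d.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break  # higher levels cannot count without the lower ones
        stars += 1
    return stars


# A scanned catalogue published as images inside a PDF.
scanned_pdf = Dataset(True, False, False, False, False)
# A CSV file published on the web.
csv_file = Dataset(True, True, True, False, False)
# Links to other data, but not machine readable: the links do not count.
linked_but_scanned = Dataset(True, False, False, False, True)

print(star_rating(scanned_pdf), star_rating(csv_file), star_rating(linked_but_scanned))
```

Stopping at the first unmet criterion is the design choice that encodes "every level includes the previous levels": a dataset cannot skip a level and still collect the stars above it.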

Watch Dr Barry Norton illustrate the different types of data rating.

This work is a derivative of ‘Using Linked Data Effectively’ by The Open University (2014), licensed under a CC BY 4.0 International Licence, adapted and used by the University of Southampton. http://www.euclid-project.eu/
