What’s coming this week?

Throughout this week, we will describe a set of technologies that allow datasets to be published over the web, and queried effectively by applications.

Unlike search engines such as Google and Yahoo, which are based on text-string matching, these technologies are ‘semantic’. This means that information is represented not in a natural language like English or Spanish, but in a graph-based data model that facilitates extension, integration, inference and uniform querying.

As a realistic application of semantic technologies, we will be using a portal through which learners can retrieve resources and information in the world of music. Consider for example the following tasks:

  • Retrieve a performance of the Beethoven violin concerto by a Chinese orchestra
  • Retrieve a photograph of the conductor of this performance
  • List male British rock musicians married to Scandinavians

Attempts to answer such queries through text-based search are unreliable: we might equally retrieve a performance in which the soloist was Chinese, or a rock musician who plays Scandinavian music.

Using semantic technologies, resources such as the audio file of the performance, or the photograph of the conductor, can be annotated using the Resource Description Framework (RDF).

In this framework, formal names can be assigned to what are called resources, which would include Beethoven, his violin concerto, the orchestra, and the conductor.

Names can also be assigned to types (or classes) of resource (composers, concertos, etc.), and to relationships (or properties) that link resources (e.g., the ‘composed-by’ relationship between composition and composer).

By reasoning over facts encoded in this way, a query system can confirm that a performance was given by the Beijing Symphony Orchestra, that this orchestra is based in Beijing, that Beijing is located in China, and so forth – thus combining geographical and musical knowledge in order to retrieve an answer.
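
The chain of reasoning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how an RDF query system is actually implemented: the `ex:` names, the property names (`ex:performedBy`, `ex:basedIn`, `ex:locatedIn`) and the helper functions are all invented for this example.

```python
# Facts as (subject, property, object) triples.
# Every "ex:" name is a hypothetical stand-in for a real URI.
TRIPLES = {
    ("ex:performance1", "ex:performanceOf", "ex:BeethovenViolinConcerto"),
    ("ex:performance1", "ex:performedBy", "ex:BeijingSymphonyOrchestra"),
    ("ex:BeethovenViolinConcerto", "ex:composedBy", "ex:Beethoven"),
    ("ex:BeijingSymphonyOrchestra", "ex:basedIn", "ex:Beijing"),
    ("ex:Beijing", "ex:locatedIn", "ex:China"),
}


def objects(subject, prop):
    """Return every object o such that (subject, prop, o) is a known fact."""
    return {o for (s, p, o) in TRIPLES if s == subject and p == prop}


def places_of(resource):
    """Follow ex:basedIn once, then ex:locatedIn repeatedly, collecting places."""
    found = set()
    frontier = objects(resource, "ex:basedIn")
    while frontier:
        place = frontier.pop()
        if place not in found:
            found.add(place)
            frontier |= objects(place, "ex:locatedIn")
    return found


def performed_by_orchestra_in(performance, country):
    """True if some performer of the performance is based in the given country."""
    return any(
        country in places_of(performer)
        for performer in objects(performance, "ex:performedBy")
    )
```

With these facts, `performed_by_orchestra_in("ex:performance1", "ex:China")` holds even though no single triple states it: the answer is obtained by combining the musical fact (who performed) with the geographical facts (where the orchestra is based, and where that city is located).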

A key decision in the design of these semantic technologies was to leave open the naming of resources and properties, provided that names conform to the format for web resource names, that is, provided they are Uniform Resource Identifiers (URIs).

All four of the URIs below could be names for Beethoven. They illustrate that a URI need not be human-readable (it might, for example, be an arbitrary string of letters and numbers), although identifiers should be resolvable to RDF representations that include human-readable labels, as explained later.

http://rdf.freebase.com/ns/en.ludwig_van_beethoven
http://dbpedia.org/resource/Ludwig_van_Beethoven
http://musicbrainz.org/artist/1f9df192-a621-4f54-8850-2c5373b7eac9#_
http://data.nytimes.com/N30866506154608358173

Note: we are aware the data.nytimes.com URI above does not currently work. We have left it there to serve the example and show an additional name for Beethoven.

Because different sources may use different names for the same resource, it is important to establish links between them if their data are to be combined, for instance through statements indicating that the above four URIs are synonymous. These statements, which can also be expressed in RDF, provide a means by which data published by many people or organisations can be combined into linked data.
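
As a rough sketch of how such synonym statements allow data to be combined, the Python below groups URIs that have been declared equivalent and then pools the facts recorded under any of them. The particular links, the `ex:` properties and the function names are assumptions made for this illustration; in RDF, equivalence of this kind is commonly stated with a property such as `owl:sameAs`.

```python
from collections import defaultdict

DBPEDIA = "http://dbpedia.org/resource/Ludwig_van_Beethoven"
FREEBASE = "http://rdf.freebase.com/ns/en.ludwig_van_beethoven"
MUSICBRAINZ = "http://musicbrainz.org/artist/1f9df192-a621-4f54-8850-2c5373b7eac9#_"
NYTIMES = "http://data.nytimes.com/N30866506154608358173"

# Hypothetical "same as" statements; three links are enough to connect four names.
SAME_AS = [(DBPEDIA, FREEBASE), (DBPEDIA, MUSICBRAINZ), (MUSICBRAINZ, NYTIMES)]

# Illustrative facts, each published under a different name for Beethoven.
FACTS = [
    (DBPEDIA, "ex:bornIn", "ex:Bonn"),
    (MUSICBRAINZ, "ex:composed", "ex:ViolinConcerto"),
]


def canonical_names(links):
    """Map each URI to one representative of its group (simple union-find)."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # shorten the path as we go
            uri = parent[uri]
        return uri

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Arbitrary but deterministic choice of representative.
            parent[max(root_a, root_b)] = min(root_a, root_b)
    return {uri: find(uri) for uri in list(parent)}


def merge_facts(facts, links):
    """Pool (property, object) pairs under one representative URI per group."""
    canon = canonical_names(links)
    merged = defaultdict(set)
    for subject, prop, obj in facts:
        merged[canon.get(subject, subject)].add((prop, obj))
    return dict(merged)
```

Calling `merge_facts(FACTS, SAME_AS)` yields a single entry that holds both facts, although they were originally attached to two different URIs.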

We will use the scenario of a music portal to illustrate how to describe resources in RDF, and how to query them using SPARQL.
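
To give a flavour of what querying such descriptions involves, here is a minimal Python sketch that matches a list of triple patterns against a small set of facts. It is not SPARQL, only a loose analogue of the idea of matching graph patterns that contain variables. The people, the `ex:` names and the `match` and `query` functions are all invented for this illustration.

```python
# Made-up facts about fictional people; "ex:" names are placeholders.
TRIPLES = [
    ("ex:musicianA", "ex:type", "ex:RockMusician"),
    ("ex:musicianA", "ex:gender", "ex:Male"),
    ("ex:musicianA", "ex:citizenOf", "ex:UnitedKingdom"),
    ("ex:musicianA", "ex:marriedTo", "ex:personB"),
    ("ex:personB", "ex:citizenOf", "ex:Sweden"),
    ("ex:musicianC", "ex:type", "ex:RockMusician"),
    ("ex:musicianC", "ex:gender", "ex:Male"),
    ("ex:musicianC", "ex:citizenOf", "ex:UnitedKingdom"),
    ("ex:musicianC", "ex:marriedTo", "ex:personD"),
    ("ex:musicianC", "ex:plays", "ex:ScandinavianFolk"),
    ("ex:personD", "ex:citizenOf", "ex:France"),
    ("ex:Sweden", "ex:partOf", "ex:Scandinavia"),
]


def match(pattern, triple, bindings):
    """Extend bindings so that pattern fits triple; return None if it cannot."""
    extended = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):  # a variable such as "?musician"
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:  # a fixed name that must match exactly
            return None
    return extended


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# "List male British rock musicians married to Scandinavians"
PATTERNS = [
    ("?musician", "ex:type", "ex:RockMusician"),
    ("?musician", "ex:gender", "ex:Male"),
    ("?musician", "ex:citizenOf", "ex:UnitedKingdom"),
    ("?musician", "ex:marriedTo", "?spouse"),
    ("?spouse", "ex:citizenOf", "?country"),
    ("?country", "ex:partOf", "ex:Scandinavia"),
]
```

Running `query(PATTERNS, TRIPLES)` binds `?musician` only to `ex:musicianA`. The second musician plays Scandinavian music but is not married to a Scandinavian, so he is excluded, which is exactly the distinction that text-based search struggles to make.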

For examples of existing music portals, you can look at the BBC music reviews site and the Internet Archive’s Live Music Archive (also sometimes known as the ‘etree’ site).

These applications make use of a music ontology and of MusicBrainz, a large dataset of musical information that we will be using later on in this course.

Linked data results from a coming together of earlier ideas and technologies. These include hypertext, databases, ontologies, markup languages, the Internet, and the World Wide Web. We will begin this week by looking at the technologies that underpin linked data.


This article is from the free online course:

Introduction to Linked Data and the Semantic Web

University of Southampton
