During the 1990s, Berners-Lee and collaborators developed proposals for a further stage of web development known as the Semantic Web.
This far-reaching concept, first publicised in a 2001 article in Scientific American [1], is partly implemented in the current stage of web development, sometimes called Web 3.0.
At present we cannot see clearly what lies beyond Web 3.0, but in Figure 1.1 we allow for future stages in Semantic Web development by including a loosely defined further stage ‘Web 4.0’.
Figure 1.1 The Web Evolution
In their 2001 article, Berners-Lee and co-authors pointed out that existing web content was usable by people but not by computer applications.
There were many computer applications available for tasks such as planning, scheduling and analysis, but they worked only on data files in some standard logical format, not on information presented in natural language text.
A person could plan an itinerary by looking at web pages giving flight schedules, hotel locations, and so forth, but it was not possible then, and is still not possible now, for programs to extract such information reliably from text-based web pages.
The initial aim of the Semantic Web is to provide standards through which people can publish documents that consist of data, or perhaps a mixture of data and text, so allowing programs to combine data from many datasets, just as a person can combine information from many text documents in order to solve a problem or perform a task.
Figure 1.2 From Web of Documents to Web of Data
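The idea of combining data from several datasets can be illustrated with a minimal Python sketch. It is not a Semantic Web standard in itself: the flight and hotel identifiers, the property names (`arrives_at`, `located_in`) and both helper functions are hypothetical, chosen only to echo the itinerary example above. Each fact is held as a (subject, property, value) triple, so two independently published datasets can be merged by simple set union and then queried together.

```python
# Hypothetical data: two independently published datasets,
# each a set of (subject, property, value) triples.
flights = {
    ("flight:XY123", "departs_from", "London"),
    ("flight:XY123", "arrives_at", "Liverpool"),
}
hotels = {
    ("hotel:Example", "located_in", "Liverpool"),
}


def merge(*datasets):
    """Combine datasets by set union; shared names link the facts together."""
    merged = set()
    for dataset in datasets:
        merged |= dataset
    return merged


def hotels_at_destination(triples, flight):
    """Find hotels located in any city where the given flight arrives."""
    destinations = {
        value for subject, prop, value in triples
        if subject == flight and prop == "arrives_at"
    }
    return sorted(
        subject for subject, prop, value in triples
        if prop == "located_in" and value in destinations
    )


combined = merge(flights, hotels)
print(hotels_at_destination(combined, "flight:XY123"))
```

The query succeeds only because both datasets use the same name for the city, which is exactly the kind of agreement that shared standards are meant to provide.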
In information science, an ontology is a specification of a conceptualisation.
This means that an ontology is a formal description of the objects, concepts and entities that exist in a particular domain, along with the relationships among them [2]. Ontologies are one of the essential pillars of the Semantic Web.
Datasets usually encode facts about individual objects and events, such as the following two facts about the Beatles (shown here in English rather than a database format):
The Beatles are a music group
The Beatles are a group
There is something odd about this pair of facts: having said that the Beatles are a music group, why must we add the more generic fact that they are a group?
Must we list these two facts for all music groups – not to mention all groups of acrobats or actors etc.? Must we also add all other consequences of being a music group, such as performing music and playing musical instruments?
Ontologies allow more efficient storage and use of data by encoding generic facts about classes (or types of object), such as the following:
Every music group is a group
Every theatre group is a group
It is now sufficient to state that the Beatles (and the Rolling Stones, etc.) are music groups, and the more general fact that they are groups can be derived through inference.
Ontologies thus enhance the value of data by allowing a computer application to infer, automatically, many essential facts that may be obvious to a person but not to a program.
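The inference described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how a production reasoner works: the class names (`MusicGroup`, `TheatreGroup`, `Group`), the two data structures and the function names are all hypothetical. The ontology holds the generic "every X is a Y" facts, the dataset holds only the most specific fact about each individual, and the more general facts are derived by following the subclass links.

```python
# Class-level (ontology) facts: each pair means "every <sub> is a <super>".
SUBCLASS_OF = {
    ("MusicGroup", "Group"),
    ("TheatreGroup", "Group"),
}

# Instance-level (dataset) facts: each pair means "<individual> is a <class>".
INSTANCE_OF = {
    ("The Beatles", "MusicGroup"),
    ("The Rolling Stones", "MusicGroup"),
}


def superclasses(cls, subclass_of):
    """Return every class reachable from cls by following subclass links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


def infer_types(instance_of, subclass_of):
    """Add, for each stated fact, the facts implied by the class hierarchy."""
    inferred = set(instance_of)
    for individual, cls in instance_of:
        for sup in superclasses(cls, subclass_of):
            inferred.add((individual, sup))
    return inferred


all_types = infer_types(INSTANCE_OF, SUBCLASS_OF)
print(("The Beatles", "Group") in all_types)
```

The dataset never states that the Beatles are a group, yet the derived set contains that fact; adding a new music group requires one new instance-level fact and no change to the ontology.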
To allow automatic inference, ontologies may be encoded in some version of mathematical logic. There are many formal logics, which vary in expressivity (the meanings that can be expressed) and tractability (the speed with which inferences can be drawn).
To be useful in practical applications, a logic must give up some expressivity in exchange for tractability, and description logic, which underlies the Web Ontology Language OWL, does precisely this.
However, despite these restrictions on expressivity, OWL cannot yet be used efficiently for inference over very large datasets, as required by Linked Data applications.
For this reason, most reasoning for Linked Data relies on the far simpler logical resources of RDF-Schema, with OWL used sparingly if at all. We cover these standards later in this week, but first we will go over some of the protocols that allow the implementation of the Web itself.
[1] Berners-Lee, T., Hendler, J. and Lassila, O. (2001) ‘The Semantic Web’. Scientific American, 284(5), pp. 34-43. Available online at http://www.scientificamerican.com/article.cfm?id=the-semantic-web. There is currently a charge of $7.99 USD to download a copy.
[2] Gruber, T.R. (1993) ‘A translation approach to portable ontology specifications’. Knowledge Acquisition, 5(2), pp. 199-220.
© This work is a derivative of ‘Using Linked Data Effectively’ by The Open University (2014), licensed under a CC BY 4.0 International Licence, and has been adapted and used by the University of Southampton. http://www.euclid-project.eu/