What is metadata?

Metadata is often described as ‘data that describes other data’. In practice, metadata turns a series of numbers into something meaningful.

Metadata covers a wide range of information, such as who recorded the data and why, what units the data are in, and any copyright that applies to using the data. Scientists often want more data to explore and compare with other datasets, so metadata is vital when using data that someone else has produced.
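
To make this concrete, here is a minimal Python sketch of how metadata gives meaning to a bare list of numbers. Everything in it is invented for illustration: the readings, the field names (`variable`, `units`, `recorded_by`, `purpose`, `licence`) and the station name are assumptions, not part of any real dataset or standard.

```python
# Hypothetical example: the values, field names and station are invented.
readings = [12.1, 13.4, 11.8]  # on their own, just a series of numbers

metadata = {
    "variable": "air temperature",
    "units": "degrees Celsius",
    "recorded_by": "Example Weather Station",  # who recorded the data
    "purpose": "daily monitoring",             # why it was recorded
    "licence": "CC BY 4.0",                    # conditions on reuse
}


def describe(values, meta):
    """Return a one-line, human-readable summary of the values."""
    mean = sum(values) / len(values)
    return (
        f"Mean {meta['variable']}: {mean:.1f} {meta['units']} "
        f"(recorded by {meta['recorded_by']}, licence {meta['licence']})"
    )


summary = describe(readings, metadata)
print(summary)
```

Without the `metadata` dictionary, the same three numbers could be temperatures, rainfall totals or river levels, and you could not safely compare them with anyone else's data.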

Metadata: Data with no nasty surprises
Identification
You find a tin but the label is missing. How long has it been there? What’s in it exactly?
Identification is easier if the tin is labelled, so it is obvious where to put it in a kitchen, or where to store it in a warehouse.
Use
You could go ahead and open the unlabelled tin, but how confident are you that you will recognise and be able to use the contents? Do you want a tin of custard for dinner? How would you cook it without instructions? The label has essential information including the sell-by date and the exact weight or volume.
Standardisation
Some parts of the label are for humans to read, and some are for computers. People want to know the calories or allergens, while barcodes identify the item in a database for stock control or sales records.
Summary
The mystery unlabelled tin says a lot about encountering undocumented data that someone else has produced. The label, or metadata, is vital for identification and correct use of the contents. The more information provided on the label, especially standardised information, the better.
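
The idea of a standardised, machine-readable label can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the set of required fields and the example record are hypothetical and do not follow any particular metadata standard.

```python
import json

# Hypothetical set of fields that every "label" is expected to carry.
REQUIRED_FIELDS = {"title", "units", "creator", "date_recorded"}


def missing_fields(meta):
    """Return the required fields that are absent or empty, sorted by name."""
    return sorted(field for field in REQUIRED_FIELDS if not meta.get(field))


# An incomplete label: the creator is blank and the date is missing.
label = {"title": "River level", "units": "metres", "creator": ""}

# A consistent text form that a computer can store, exchange and search.
machine_readable = json.dumps(label, sort_keys=True)
print(machine_readable)

# A simple check flags the gaps before anyone relies on the data.
print(missing_fields(label))
```

Agreeing on the field names in advance is what lets a computer check the label automatically, much as a barcode only works because every scanner reads it the same way.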

In the next Step, you’ll hear Professor David Wallom explain why metadata is so important.

This article is from the free online course:

Big Data and the Environment

University of Reading