
Metadata and the FAIR principles

Systematically documented data is key to making data understandable, findable, citable, accessible, and reusable (FAIR).

Metadata is structured information that describes, explains, and locates data, making the data easier to retrieve, use, and manage. It is a critical component in making data FAIR.

What is Metadata?

Metadata is “data about data”: descriptions that make it easier to catalogue and discover data. A metadata record is a standardized document that reports:

  • WHO created the data?
  • WHAT is the content of the data?
  • WHEN were the data created?
  • WHERE are the data located geographically?
  • HOW were the data developed?
  • WHY were the data developed?

When you read the nutrition facts on food packaging or look up a book in a library catalogue, you are reading metadata!
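
As a minimal sketch of the six questions above, a metadata record can be written as a plain dictionary and saved as JSON next to the data. The field names, the `missing_fields` helper, and every value below are hypothetical and chosen only for illustration; a real project would follow the field names of its chosen metadata standard.

```python
import json

# Hypothetical metadata record for an invented motion capture recording.
# All values are placeholders, not real data.
record = {
    "who": "Jane Doe, Example University",
    "what": "Optical motion capture of a solo violin performance",
    "when": "2023-05-14",
    "where": "Example Motion Lab, Oslo, Norway",
    "how": "Eight infrared cameras recording at 200 Hz",
    "why": "To study bowing gestures in expressive playing",
}

# The six questions every record in this sketch is expected to answer.
REQUIRED = ("who", "what", "when", "where", "how", "why")


def missing_fields(rec):
    """Return the required questions that are absent or left blank."""
    return [key for key in REQUIRED if not str(rec.get(key, "")).strip()]


# Serialize the record so it can be stored alongside the data files.
metadata_json = json.dumps(record, indent=2)
print(metadata_json)
print("Missing:", missing_fields(record))
```

Running a check like `missing_fields` before archiving a recording is one simple way to notice an unanswered question while the answer is still easy to find.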

Why is it important?

Metadata captures information about a dataset and supports both data management and data distribution. It helps avoid data duplication, makes it possible to share reliable information, and promotes the work of scientists and their contributions to a field of study. Reusing metadata saves time and resources in the long run. In this sense, metadata completes a dataset.

Metadata gives a user the ability to:

  • Search, retrieve, and evaluate dataset information from both inside and outside an organization
  • Find data: Determine what data exists for a topic and/or geographic location
  • Determine applicability: Decide if a dataset meets a particular need
  • Discover how to acquire, process, and use the identified dataset
  • Understand the dataset, including definitions of column names, or expected numerical ranges found in the data
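
The last point, column definitions and expected numerical ranges, is often captured in a data dictionary. The sketch below assumes a hypothetical recording with two columns; the column names, units, and ranges are invented for illustration and do not come from any particular motion capture system.

```python
# Hypothetical data dictionary: each column gets a description, a unit,
# and the numerical range its values are expected to fall within.
data_dictionary = {
    "time_s": {
        "description": "Time since the start of the recording",
        "unit": "s",
        "min": 0.0,
        "max": 600.0,
    },
    "marker_x_mm": {
        "description": "Horizontal position of the tracked marker",
        "unit": "mm",
        "min": -2000.0,
        "max": 2000.0,
    },
}


def out_of_range(row, dictionary):
    """List the columns in one data row that are missing or outside the expected range."""
    problems = []
    for column, spec in dictionary.items():
        value = row.get(column)
        if value is None:
            problems.append(f"{column}: missing")
        elif not spec["min"] <= value <= spec["max"]:
            problems.append(
                f"{column}: {value} outside [{spec['min']}, {spec['max']}] {spec['unit']}"
            )
    return problems


good_row = {"time_s": 1.5, "marker_x_mm": 250.0}
bad_row = {"time_s": -2.0}

print(out_of_range(good_row, data_dictionary))
print(out_of_range(bad_row, data_dictionary))
```

Because the expected ranges live in the metadata rather than in someone's memory, a later user can run the same check without asking the original author.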

Metadata can exist at both the project level and the data level. At the project level, it explains things like the aims of the study, the technology used, and who is linked to the data. At the data level, it contains more detailed information such as file type, format, and notes about missing values.

How to Get Started with Metadata

Metadata standards provide a structure with which to describe data. As with data management plans, your institution might have a specific metadata standard that you should use. Some data types also have their own standards, but movement analysis does not yet have one. If you are choosing your own, there are good collections of standards and online tools. The RDA Metadata Directory and the DCC list of Metadata Standards are good collections of standards, while FAIRsharing offers recommendations as well as collections of standards. Starting with something general and simple like Dublin Core is a good idea.
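
To give a feel for what a Dublin Core description can look like, here is a minimal sketch that writes a handful of Dublin Core elements to XML with Python's standard library. The element names (`title`, `creator`, `date`, and so on) and the namespace URI belong to the Dublin Core element set; the wrapping `metadata` element, the `to_dublin_core` helper, and all field values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core(fields):
    """Serialize a dict of Dublin Core element names and values to an XML string."""
    # "metadata" is a hypothetical wrapper element chosen for this sketch.
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values for an invented recording.
fields = {
    "title": "Violin bowing motion capture, take 1",
    "creator": "Jane Doe",
    "date": "2023-05-14",
    "format": "text/csv",
    "description": "Marker positions recorded during a solo violin performance",
    "rights": "Access on application",
}

xml_text = to_dublin_core(fields)
print(xml_text)
```

Starting from a small, general element set like this keeps the first step manageable; more specific fields can be added later if an institutional or domain standard requires them.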

FAIR Principles

The FAIR principles are intended to provide guidelines to improve the Findability, Accessibility, Interoperability, and Reuse of data. They are intended to help facilitate knowledge discovery by humans and machines. Metadata is of critical importance to data developers, data users, and organizations when making data FAIR.

  • Findable: The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Data should have rich metadata and a persistent identifier.
  • Accessible: Once users find the data they need, they must know how the data can be accessed. This includes understanding any authorization and authentication requirements. The metadata should remain available even if the actual data are not.
  • Interoperable: The data need to interoperate with applications or workflows for analysis, storage, and processing, and to integrate with other data. Proprietary file types should be avoided whenever possible. Metadata should use a shared, accessible, and broadly applicable language for knowledge representation.
  • Reusable: The ultimate goal of FAIR is to optimize the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
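
The four principles can be turned into a rough self-check on a metadata record. The sketch below is an informal illustration, not an official FAIR assessment: the field names, the short list of open file extensions, and the placeholder identifier are all assumptions made for this example.

```python
import os

# Illustrative, non-exhaustive set of extensions treated as open formats here.
OPEN_EXTENSIONS = {".csv", ".tsv", ".json", ".txt"}


def fair_self_check(meta):
    """Return a rough yes/no answer for each FAIR principle (informal sketch)."""
    extension = os.path.splitext(meta.get("filename", ""))[1].lower()
    return {
        # Findable: a persistent identifier plus a description to search on.
        "findable": bool(meta.get("identifier")) and bool(meta.get("description")),
        # Accessible: the record states how access can be obtained.
        "accessible": bool(meta.get("access_procedure")),
        # Interoperable: the file is stored in a non-proprietary format.
        "interoperable": extension in OPEN_EXTENSIONS,
        # Reusable: terms of use and origin of the data are documented.
        "reusable": bool(meta.get("license")) and bool(meta.get("provenance")),
    }


# Placeholder record; the identifier is not a real DOI.
example = {
    "identifier": "doi:10.0000/example",
    "description": "Marker positions from a solo violin performance",
    "access_procedure": "Apply to the data steward; login required",
    "filename": "take01.csv",
    "license": "Restricted, research use only",
    "provenance": "Recorded with eight infrared cameras at 200 Hz",
}

print(fair_self_check(example))
```

Note that the example record passes the "accessible" check even though the data are restricted: what matters in this sketch is that the access procedure is documented, not that the data are public.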


The general idea is that Open Data (mentioned earlier this week) needs to be FAIR. After all, if you cannot find and access the data, they are not open. However, FAIR data do not necessarily need to be open. In movement analysis, we frequently handle personal data and copyrighted material, so these data would not be publicly available. When such data follow the FAIR principles, they are still findable, and they can be accessed by anyone who applies to use them. The data's FAIRness thus ensures that they remain reusable even if they are not published openly.

Figure: Examples of data growing increasingly FAIR.


This article is from the free online course Motion Capture: The Art of Studying Human Activity.
