Metadata

Depending on what is being uploaded to the public databases, you might need to supply metadata. Metadata can describe data attached to the biosample or data attached to the reads.

Biosample metadata

There are many possible types of biosample metadata. NLM-NCBI provides a selection of templates, downloadable as spreadsheets, that list the metadata required or optional for the type of data you are uploading. These templates are called packages, and each contains a prescribed set of mandatory fields and many optional fields. Which package to choose depends on the biological source of the sequencing data.

In general, the pathogen package is appropriate for sequences generated from a primary sample, such as an environmental or host sample. If the sequence is generated from cultured or passaged material, then a MIxS package may be more appropriate. Detailed information about the attributes of the pathogen package can be found here.

In general, mandatory fields cannot be left empty. If the data is unavailable, you may be able to enter ‘not collected’, ‘not applicable’ or ‘missing’ instead. Also make sure that no two rows contain identical information in all columns. Optional fields can be left empty. You can add your own custom fields to describe the sample better if you choose. An example of what a biosample record looks like when using the pathogen package can be found here.
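The rules for mandatory fields and duplicate rows can be checked before upload. The following is a minimal sketch in Python, not an official NCBI tool. The function name `check_biosample_rows`, the `ignore` parameter and the column names in `MANDATORY` are assumptions made for illustration; take the real mandatory columns from the package template you downloaded.

```python
# Hypothetical subset of mandatory columns, for illustration only.
# Replace with the mandatory columns listed in your package template.
MANDATORY = ["sample_name", "organism", "collection_date", "geo_loc_name"]


def check_biosample_rows(rows, mandatory=MANDATORY, ignore=()):
    """Return a list of problems found in a list of row dictionaries.

    A placeholder such as 'missing' or 'not collected' is a non-empty
    value, so it passes the mandatory-field check.
    Columns named in `ignore` are skipped when comparing rows.
    """
    problems = []
    seen = {}  # maps row content to the first row number that had it
    for number, row in enumerate(rows, start=1):
        # Mandatory fields must contain something other than whitespace.
        for field in mandatory:
            if not (row.get(field) or "").strip():
                problems.append(
                    f"row {number}: mandatory field '{field}' is empty"
                )
        # Rows that match in every compared column cannot be told apart.
        key = tuple(sorted(
            (column, (value or "").strip())
            for column, value in row.items()
            if column not in ignore
        ))
        if key in seen:
            problems.append(f"row {number}: identical to row {seen[key]}")
        else:
            seen[key] = number
    return problems
```

The rows can come from `csv.DictReader` with `delimiter="\t"` if the template is saved as tab-separated text. An empty list returned by the function means that none of these particular problems were found, not that the submission will be accepted.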

Read metadata (SRA)

Read metadata describes the reads, and can include the sequencing platform and the date of sequencing. A template can be found here; this file also contains the controlled vocabulary, that is, the permitted values for fields that accept only fixed terms. Note that both the biosample metadata and the SRA metadata can also be filled out in the online wizard at the time of submission.
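Checking values against a controlled vocabulary is easy to automate. The sketch below is written in Python and makes several assumptions: the function name `find_vocab_errors` is hypothetical, and the field names and permitted terms in `CONTROLLED_VOCAB` are an illustrative subset only. Copy the actual terms from the template file rather than relying on this list.

```python
# Illustrative subset only (an assumption for this sketch).
# The authoritative terms are listed in the SRA metadata template.
CONTROLLED_VOCAB = {
    "platform": {"ILLUMINA", "OXFORD_NANOPORE", "PACBIO_SMRT"},
    "library_layout": {"single", "paired"},
}


def find_vocab_errors(rows, vocab=CONTROLLED_VOCAB):
    """Return (row number, field, value) for each value not in the vocabulary.

    The comparison is exact, so a difference in letter case or a
    misspelling is reported as an error.
    """
    errors = []
    for number, row in enumerate(rows, start=1):
        for field, allowed in vocab.items():
            value = (row.get(field) or "").strip()
            if value not in allowed:
                errors.append((number, field, value))
    return errors
```

For example, a row with `platform` set to `Illumina` would be reported under this sketch, because only the exact term in the vocabulary is accepted.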

Exercise and discussion: download and examine a template that is relevant for your context. Check the required fields, data formats, and guidelines needed for submission – do you have all the relevant information? Make comments in the section below.
© Wellcome Connecting Science
This article is from the free online course Antimicrobial Databases and Genotype Prediction: Data Sharing and Analysis.
