
The International Nucleotide Sequence Database Collaboration (INSDC)

INSDC network illustration with three members DDBJ ENA GenBank

The most widely shared biological data type is sequence data, in either raw or processed (e.g. assembled) form. Much effort has therefore gone into ensuring that this data type is easily accessible to all.

If you want to share your data, or access data shared by others, you will inevitably come across the International Nucleotide Sequence Database Collaboration (INSDC). The INSDC aims to store and preserve sequence data.

There are three primary members: DDBJ (DNA Data Bank of Japan), NLM-NCBI (National Library of Medicine – National Center for Biotechnology Information) and ENA (European Nucleotide Archive). Their databases are mirrored, so sequences shared with one are shared with all.

Why sharing data matters was discussed earlier in this course, in week 1 and in step 3.7. To reiterate, sharing data is important for many reasons and can help researchers and the scientific community:

Backup data

  • Data is stored in three redundant systems
  • If your own system is corrupted, the data can still be retrieved

Public Health is a global problem

  • Contribute to global knowledge
  • Identify whether your sequences are related to sequences from other parts of the world
  • Provide access to quality data for other places in the world
  • Your data contributes to database curation
  • Your data can be integrated into databases and bioinformatics tools, which makes those tools easier for you to use
  • Your data can be used in the Pathogen Detection Portal and other global resources
  • Data can be used to develop novel strategies and identify novel markers of resistance

The repositories DDBJ, ENA and NLM-NCBI all hold the same data; however, each has a different interface and different mechanisms for uploading and downloading. This section focuses on NLM-NCBI, as it is feature-rich and allows automated uploads. You can find further information for ENA here.


Figure 1: INSDC structure and some of the tools these repositories connect to directly (https://doi.org/10.1186/s42522-020-00026-3)
© Wellcome Connecting Science
This article is from the free online

Antimicrobial Databases and Genotype Prediction: Data Sharing and Analysis

Created by
FutureLearn - Learning For Life
