
Public databases benefit epidemiology

Overview of public databases and their importance in epidemiology
© COG-Train

During the COVID-19 pandemic, the most significant mission for researchers has been to produce valuable data on the infection and transmission of SARS-CoV-2. Equally important has been sharing such data quickly and unreservedly so that others can examine it in real time.

Such data sharing has enabled connectivity between researchers and policymakers and between regions across the world, helping to build the foundations of a global surveillance programme. Data sharing has become a valuable way to evaluate (and sometimes prevent) the effect of variants arriving in different regions. A relevant example of efficient data production and sharing is the initial identification of the highly contagious Omicron variant in South Africa by Tulio de Oliveira and his team in November 2021. As a consequence, many countries rapidly became aware of this variant and initiated studies on its possible effects. Tailored PCR methods were developed or adapted to detect this variant without nucleotide sequencing, reducing the cost of detection. This is an extraordinary example of how high-quality research can inform global public health.

At the beginning of the pandemic, relevant information was shared on social media or in specialised blogs. Of outstanding importance was the discussion forum Virological, where the first whole SARS-CoV-2 genome was shared by Chinese and Australian scientists and which later became the venue for discussing relevant epidemiological and genomic information. For instance, many of the variants of interest or concern (such as Alpha, Gamma, Lambda, and Mu) were first described on this forum. Its community has also discussed the possible origin of SARS-CoV-2, whole-genome sequencing protocols, and variant classification and nomenclature.

Early in 2020, few specialised databases were available for sharing large quantities of genomic data. For that reason, the National Center for Biotechnology Information (NCBI) nucleotide database in the USA and the Global Initiative on Sharing All Influenza Data, known as GISAID, gathered data, curated it, and shared it openly. GISAID became the preferred database for sharing SARS-CoV-2 genomes worldwide, holding sequencing data for a staggering 9.5 million SARS-CoV-2 genomes by March 2022.

Epidemiological databases were also created to share and visualise COVID-19 information. Databases like Our World in Data are reliable information sources used even by the World Health Organization and regional health organisations. This database collects information from at least 57 different sources and provides trustworthy indicators such as infections, hospitalisations, officially reported deaths, and excess deaths. This is especially helpful for developing countries that do not possess open databases or cannot openly share most of their information.
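As an illustration of what an indicator such as excess deaths represents, the short Python sketch below compares observed all-cause deaths with an expected baseline and scales the result by population. It is only a minimal sketch: the `WeeklyDeaths` record, the helper names, and all figures are hypothetical, and real databases use more elaborate baseline models than the fixed value assumed here.

```python
from dataclasses import dataclass


@dataclass
class WeeklyDeaths:
    """One week of all-cause mortality for a region (illustrative values only)."""
    week: str
    observed: int     # deaths actually registered that week
    expected: float   # assumed baseline, e.g. the average for that week in earlier years


def excess_deaths(rows):
    """Sum of observed minus expected deaths over the given weeks."""
    return sum(r.observed - r.expected for r in rows)


def per_100k(count, population):
    """Scale a raw count to a rate per 100,000 inhabitants."""
    return count * 100_000 / population


# Hypothetical figures, not taken from any real dataset.
weeks = [
    WeeklyDeaths("2021-W01", observed=1200, expected=1000.0),
    WeeklyDeaths("2021-W02", observed=1150, expected=1000.0),
    WeeklyDeaths("2021-W03", observed=1050, expected=1000.0),
]
population = 2_000_000

total_excess = excess_deaths(weeks)
print(f"Excess deaths: {total_excess:.0f}")
print(f"Excess deaths per 100,000: {per_100k(total_excess, population):.1f}")
```

Expressing the count per 100,000 inhabitants is what makes regions of very different size comparable, which is one reason aggregated databases publish normalised indicators alongside raw counts.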

Databases of research findings became remarkably popular during the pandemic, for example preprint servers like bioRxiv and medRxiv, where non-peer-reviewed studies are shared with preliminary or complete results. Only minimum requirements must be fulfilled to post on these servers. Although much of the information produced was derived from non-verified or even fraudulent data that open scrutiny could not entirely verify, preprints nevertheless changed access to scientific information forever. Established peer-reviewed journals are still the most dependable route to reliable research data. Additionally, most publishers make their published papers available directly or through open-access libraries such as PMC or PubMed. Finally, wet-lab, statistics, and bioinformatics protocols have also been shared on public platforms such as GitHub. Such platforms provide quick updates on protocols, comments, and adapted pipelines that developers share directly.

This article is from the free online

From Swab to Server: Testing, Sequencing, and Sharing During a Pandemic

Created by
FutureLearn - Learning For Life
