SARS-CoV-2 genomic landscape

Overview of the genomic landscape of the SARS-CoV-2 virus.
© COG-Train

All of us have been impacted in some way by SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2). You are now likely to be familiar with Wuhan, a city in China that you may never have heard of before the pandemic, where this virus was first detected in 2019.

SARS-CoV-2 is a positive-sense, single-stranded RNA virus with a genome of ~30 kb (Figure 1). This is tiny compared to the human genome, which is approximately 6 Gbp across both copies of each chromosome (about 3 Gbp per haploid set), yet coronaviruses have some of the largest genomes of any known RNA viruses. The virus particle has a crown-like appearance and contains four main structural proteins: the spike (S) glycoprotein and the envelope (E), membrane (M) and nucleocapsid (N) proteins. The genome also encodes 16 non-structural proteins and 5-8 accessory proteins.
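To make the size comparison concrete, here is a minimal Python sketch of the arithmetic. It uses only the two rounded figures quoted above (~30 kb and ~6 Gbp); the constant and function names are illustrative choices, not part of any bioinformatics library.

```python
# Approximate genome sizes, using the rounded figures quoted in the text.
SARS_COV_2_GENOME_BASES = 30_000            # ~30 kb of single-stranded RNA
HUMAN_GENOME_BASE_PAIRS = 6_000_000_000     # ~6 Gbp (both chromosome copies)


def fold_difference(larger: int, smaller: int) -> float:
    """Return how many times larger one genome is than another."""
    return larger / smaller


ratio = fold_difference(HUMAN_GENOME_BASE_PAIRS, SARS_COV_2_GENOME_BASES)
print(f"The human genome is roughly {ratio:,.0f} times larger.")
```

With these rounded inputs the human genome comes out at about 200,000 times the size of the viral genome, which is one reason whole viral genomes can be sequenced quickly and in large numbers.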

We are interested in the structure of a virus/organism because it gives us information about how the virus is able to shed, spread, infect and subsequently cause disease (pathogenesis).

Schematic illustration of SARS-CoV-2: a purple sphere covered with green spikes. Spike, Nucleocapsid, Membrane, Envelope and ssRNA are indicated inside the sphere

Figure 1 – The SARS-CoV-2 virus with its proteins. Source: Frontiers

The accessory proteins participate in viral replication, assembly and virus-host interactions.

The non-structural proteins are like maintenance workers: they act as enzymes, coenzymes and binding proteins that facilitate replication, transcription and translation of the viral genome.

Finally, the structural proteins are essential for binding to and invading host cells. The 3D shape of the spike protein is shown in Figure 2 below. This protein allows the virus to enter human cells by binding to human ACE2 (angiotensin-converting enzyme 2) receptors in the respiratory epithelium.

This article provides a detailed explanation of this process.

Illustrative image of the SARS-CoV-2 genome and its proteins: a bar with rectangles representing SARS-CoV-2 genes and protein structures of protease, endoribonuclease and spike

Figure 2 – Various enzymes and the spike protein of SARS-CoV-2. Source: StatPearls Publishing LLC

The spike protein is also a frequent site of mutations, which give rise to diverse new variants. Mutations are natural random events which occur in viruses during replication. A variant carries one or more mutations that may make it differ in its transmissibility, virulence, pathogenicity or response to vaccines. Currently, in the classification scheme used by the US CDC, variants are grouped into four categories: Variant of Interest (VOI), Variant Being Monitored (VBM), Variant of High Consequence (VOHC) and Variant of Concern (VOC).
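The idea that a variant is defined by its mutations relative to a reference can be sketched in a few lines of Python. This is a toy illustration only: the two short sequences are made up, the function name `list_substitutions` is hypothetical, and the code assumes the sequences are already aligned and of equal length, so it finds substitutions but not insertions or deletions. Real analyses rely on dedicated alignment and variant-calling tools.

```python
def list_substitutions(reference: str, variant: str) -> list[str]:
    """List substitutions between two pre-aligned sequences of equal length.

    Each substitution is written as <reference base><1-based position><new base>.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")

    substitutions = []
    for position, (ref_base, var_base) in enumerate(zip(reference, variant), start=1):
        if ref_base != var_base:
            substitutions.append(f"{ref_base}{position}{var_base}")
    return substitutions


# Made-up RNA fragments for illustration; these are not real SARS-CoV-2 sequences.
toy_reference = "AUGGCUAGCUUA"
toy_variant = "AUGGCCAGCUGA"

print(list_substitutions(toy_reference, toy_variant))  # ['U6C', 'U11G']
```

The `U6C` style of label (original base, position, new base) mirrors the way substitutions are commonly written, which makes it straightforward to compare the mutation lists of different samples.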

This article provides a detailed explanation of this process, and you can read more about these proteins here.

A VOC usually contributes to outbreaks due to its increased transmissibility and/or ability to evade immunity. The most notable VOCs are currently called Alpha, Beta, Gamma and Delta, first described in the United Kingdom, South Africa, Brazil and India, respectively. The latest and most transmissible variant has been labelled Omicron, and was also first reported in South Africa. Where a variant is first described is strongly linked to the amount and quality of genomic surveillance in that country: we can only find variants if we are looking for them at the genomic level.

The CDC’s website has up-to-date information to keep you abreast of the range of variants.

Mutations may also lead to the formation of new lineages. A lineage is a group of genetically closely related viral variants derived from a common ancestor. Tracking lineages informs us about outbreaks and about the spread of a virus in a community or within populations.

Zoonotic coronaviruses have caused several outbreaks over the last two decades, and many of the coronaviruses present in other mammals have the potential to infect humans. Continued efforts to understand the origins and evolution of SARS-CoV-2 therefore remain critical.


Features, Evaluation, and Treatment of Coronavirus (COVID-19)

CDC Coronavirus Disease 2019 (COVID-19)

Defining a New Strain of a Virus

Omicron: What Makes the Latest SARS-CoV-2 Variant of Concern So Concerning?

The Genomic Landscape of Severe Acute Respiratory Syndrome Coronavirus 2

SARS-CoV-2 Proteins: Are They Useful as Targets for COVID-19 Drugs and Vaccines?

Public Health Responses to COVID-19 Outbreaks on Cruise Ships — Worldwide, February–March 2020

COVID-19, a worldwide public health emergency

Antivirals Against Coronaviruses: Candidate Drugs for SARS-CoV-2 Treatment?

On the origin and evolution of SARS-CoV-2

Aerosol and Surface Stability of SARS-CoV-2 as Compared with SARS-CoV-1

Relationship between the ABO Blood Group and the COVID-19 Susceptibility

This article is from the free online course Making sense of genomic data: COVID-19 web-based bioinformatics