
The importance of sequencing in vaccine trials

In this article, Stephen Bridgett describes the importance of sequencing in vaccine clinical trials.


Viruses replicate and evolve constantly. These changes may alter how effective a vaccine is against the currently circulating viruses. The best way of detecting these alterations is by genomic sequencing.

When a virus or bacterium enters a host, such as a human, the host responds with an innate (nonspecific) immune defence and an adaptive (specific) immune response. The innate response includes cell membranes, mucus, enzymes, cytokines, Toll-like receptors, and macrophages. The adaptive immune response includes antibodies and B and T lymphocytes (types of ‘white blood cell’).

A vaccine aims to stimulate the adaptive immune response, so that it ‘remembers’ and will more quickly recognise and protect against future infections by the targeted pathogen.

Figure 1 shows how antibodies (produced by the adaptive immune system’s B-cells) can attach to the surface of the virus or bacterium. These attached antibodies can prevent the virus from attaching to and entering the cell by blocking the binding site at the tip of the virus spikes. They can also activate the complement cascade and flag the virus for immune cells, such as macrophages, to engulf (phagocytose).


Figure 1: Antibodies attaching to the Spike protein on the surface of the SARS-CoV-2 virus, preventing the virus from attaching to ACE2 receptors on human cells. Source: NIH

A vaccine can also stimulate development of a T-cell response, for example: CD4 T-cells can increase the antibody response, and CD8 T-cells identify and remove host cells that have been infected by the virus.

Vaccine development

Before COVID-19, developing and testing a new vaccine could take 10 to 15 years. This reflected the time needed to identify and test targets, run clinical trials, and have regulatory agencies evaluate the data, along with other factors related to vaccine development. However, vaccine technology has evolved over the years:

  • Attenuated (weakened, eg. heat-treated) live vaccines were among the first vaccines used (eg. Anthrax, BCG).
  • Recombinant DNA inserted into yeast was later used to synthesise a surface protein from a virus or bacterium (eg. Hepatitis B surface antigen).
  • More recently, mRNA enclosed in lipid nanoparticles, or DNA carried inside a harmless viral vector, has been used so that the host’s own cells synthesise part of a viral protein (eg. the SARS-CoV-2 Spike protein).

The unprecedented speed of COVID-19 vaccine development resulted from combining available state-of-the-art vaccine technologies with running clinical trial phases in parallel. Additionally, in response to this public health emergency, regulatory agencies processed COVID-19 vaccines as a top priority.

In vaccine development, ‘Original Antigenic Sin’ (OAS) is an important phenomenon in which “the development of immunity against pathogens/antigens is shaped by the first exposure to a related pathogen/antigen”. For example, if someone is first vaccinated using the Spike protein, the body remembers the immune response it produced (eg. antibodies from B-cells, and the T-cell response). If they are later exposed to the complete virus, or even to a different viral variant, the immune system will tend to respond in a similar way to how it responded to the Spike protein vaccination.

Different antibodies are produced in response to the COVID-19 vaccine (and to the virus itself), and these bind to several positions on Spike. Knowledge of the RNA sequence and of the resulting protein structure and shape can help us understand how antibodies raised against the existing vaccine Spike protein would interact with the evolving virus. Figure 2 illustrates how different classes of antibodies bind to the receptor-binding domain (RBD) on the Spike protein. Tong et al. (2021) explain that “seven major epitopic regions of SARS-CoV-2 spike are consistently targeted by human antibodies”.


Figure 2: Characterizing SARS-CoV-2 antibodies, showing each class of antibody bound to the Receptor-Binding Domain (RBD) (grey) of the Spike protein. Each of the SARS-CoV-2 virus spikes is composed of three identical copies of the Spike protein, each having its own RBD. This is based on cryo-electron microscopy imaging. Source: C&EN

Vaccine trials

In developing a new vaccine, it is very important to determine, as early as possible, how safe and effective the new vaccine is, and how long its effectiveness will last. Clinical trials of a new vaccine have four phases:

  • Phase 1: the first time the vaccine is tested in humans, using a small group of healthy adults, to assess safety (recording side-effects) and immune response.
  • Phase 2a/2b: determines the most effective dose and generates more safety experience. The vaccine is tested in more people.
  • Phase 3: determines how effective the vaccine is. Volunteers are given either the vaccine or a placebo, observed for side-effects, and followed up to see who subsequently develops an infection, and how severe it is. This phase involves hundreds or thousands of volunteers, as using a large number of people gives greater statistical power to the study. The results are submitted for approval to a regulator (eg. MHRA/EMA/FDA).
  • Phase 4: monitors if the approved/licensed vaccine stays safe and effective when it is rolled out to the public.
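Phase 3 efficacy is commonly summarised as one minus the ratio of the attack rate in the vaccinated group to the attack rate in the placebo group. The following is a minimal Python sketch of that calculation. The function name and all counts are hypothetical and invented purely for illustration; real trial analyses also account for follow-up time and report confidence intervals.

```python
def vaccine_efficacy(cases_vaccine, n_vaccine, cases_placebo, n_placebo):
    """Return efficacy as 1 - (attack rate vaccinated / attack rate placebo)."""
    if n_vaccine <= 0 or n_placebo <= 0:
        raise ValueError("group sizes must be positive")
    attack_rate_vaccine = cases_vaccine / n_vaccine
    attack_rate_placebo = cases_placebo / n_placebo
    if attack_rate_placebo == 0:
        raise ValueError("no cases in the placebo group; efficacy is undefined")
    return 1 - attack_rate_vaccine / attack_rate_placebo


# Hypothetical counts, not taken from any real trial.
efficacy = vaccine_efficacy(cases_vaccine=10, n_vaccine=10_000,
                            cases_placebo=100, n_placebo=10_000)
print(f"Estimated efficacy: {efficacy:.0%}")
```

With these invented numbers the vaccinated group has one tenth of the placebo group's attack rate, so the estimate is about 90%.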

Ideally, in phase 3, each volunteer would be given a similar dose of identical strains of the pathogen a fixed number of weeks after receiving the vaccine or placebo, so that the effectiveness of vaccine and placebo could be compared directly. However, because SARS-CoV-2 can have a severe, potentially fatal, effect in some volunteers, intentionally infecting participants could pose ethical challenges.

Sequencing of virus variants

Some types of viruses and bacteria have fairly stable genomes over time, whereas others change (drift or evolve) to become more infectious and/or evade or suppress immunity in the host population, or generate different symptoms that can be more or less severe. For example, the initial Wuhan SARS-CoV-2 lineage was replaced by Alpha and later by Delta, which gave rise to a Delta+ (AY.4.2) variant, and Delta was in turn replaced more recently by Omicron; these variants differed in their rates of infection and severity of symptoms. Not all variants or lineages spread worldwide, and where and when they emerge is random and geographically scattered.

The current COVID-19 vaccines are mostly based on the original Wuhan variant, so these vaccines would be expected to be most effective against virus variants with a Spike protein most similar to Wuhan, and potentially less effective against variants with Spike proteins that differ significantly.
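One simple way to quantify how far a variant's Spike protein differs from the vaccine strain is to count amino-acid substitutions between aligned sequences. The sketch below assumes the two sequences are already aligned and of equal length. The function names and the ten-residue fragments are invented for illustration and are not real Spike sequences.

```python
def spike_differences(reference, variant):
    """List (position, reference_residue, variant_residue) for each mismatch.

    Assumes both protein sequences are pre-aligned and the same length.
    Positions are 1-based, as in the usual mutation notation.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_aa, var_aa)
        for pos, (ref_aa, var_aa) in enumerate(zip(reference, variant), start=1)
        if ref_aa != var_aa
    ]


def percent_identity(reference, variant):
    """Percentage of aligned positions where the residues are identical."""
    if not reference:
        raise ValueError("sequences must not be empty")
    mismatches = len(spike_differences(reference, variant))
    return 100 * (1 - mismatches / len(reference))


# Toy fragments, invented for illustration only.
reference_fragment = "ACDEFGHIKL"
variant_fragment = "ACDKFGHIKL"

for pos, ref_aa, var_aa in spike_differences(reference_fragment, variant_fragment):
    print(f"{ref_aa}{pos}{var_aa}")
print(f"Identity: {percent_identity(reference_fragment, variant_fragment):.1f}%")
```

Raw identity is only a rough guide: a single substitution inside an antibody binding site can matter more than several elsewhere, so the position of each change also needs to be considered.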

Sequencing of the virus variants can benefit vaccine research, including during the phase 3 and 4 trials.

In Phase 3, sequencing enables comparison of the viral variants that subsequently infect people in the placebo group with those that infect people in the vaccinated group, and between different age/risk groups. For example, an increased proportion of a particular viral variant in the vaccinated group (relative to the placebo group and to the strains currently circulating in the population) could indicate that the vaccine is less effective against that variant. Identifying the variant in each infection could also help to determine whether a particular variant is associated with more severe symptoms.
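The comparison of variant proportions between trial arms can be framed as a 2x2 table: infections with a given variant versus all other variants, in the vaccinated versus the placebo group. The sketch below computes an odds ratio and a one-sided Fisher exact p-value using only the Python standard library. This is one possible approach, not necessarily the method used in any particular trial, and the counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Gives the probability of seeing a top-left count of at least `a`
    if the variant were equally common in both groups, with all row
    and column totals held fixed (hypergeometric distribution).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)
    upper = min(row1, col1)
    numerator = sum(comb(row1, k) * comb(row2, col1 - k)
                    for k in range(a, upper + 1))
    return numerator / denominator


# Hypothetical sequenced infections (invented for illustration):
#              variant X   other variants
# vaccinated       9             3
# placebo         30            60
a, b, c, d = 9, 3, 30, 60
odds_ratio = (a * d) / (b * c)
p_value = fisher_exact_greater(a, b, c, d)
print(f"Odds ratio: {odds_ratio:.1f}, one-sided p-value: {p_value:.4f}")
```

An odds ratio well above 1 with a small p-value would suggest that the variant is over-represented among infections in vaccinated participants, which is the pattern the paragraph above describes.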

The developing COVID-19 pandemic necessitated fast development and trials of vaccines, with subsequent application for ‘emergency-use’ authorisation (EUA) from regulatory authorities to enable faster roll-out of vaccination programmes and so reduce the number of deaths. However, concerns have been raised that this speed resulted in data integrity issues in Pfizer’s phase 3 COVID-19 vaccine trials. Because the phase 3 trials were shorter than is typical (a median 2-month follow-up for EUA rather than the more usual FDA median of 1-4 years), phase 4 has increased importance.

Virus sequencing can also be of benefit in Phase 4, especially in identifying the viral variants in ‘break-through’ infections, ie. infections in people who have been partially or fully vaccinated. Sequencing can also identify how the emerging variants in the general population are changing over time, and any correlation between symptom severity and new variants. Genomic data can also inform decision making on booster vaccinations, and on whether the vaccine should be modified to use mRNA sequences from more recent viral variants.
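Tracking how variants change over time comes down to counting sequenced samples per lineage in each time window and converting the counts to proportions. The sketch below shows this on a handful of invented records. The week labels, lineage names, and function name are hypothetical, and real surveillance data would come from a lineage-assignment pipeline rather than a hard-coded list.

```python
from collections import Counter, defaultdict


def variant_shares_by_week(samples):
    """Return {week: {lineage: share}} from (week, lineage) records."""
    counts = defaultdict(Counter)
    for week, lineage in samples:
        counts[week][lineage] += 1
    shares = {}
    for week in sorted(counts):  # ISO week labels sort chronologically
        total = sum(counts[week].values())
        shares[week] = {lineage: n / total
                        for lineage, n in counts[week].items()}
    return shares


# Hypothetical sequenced samples, invented for illustration.
samples = [
    ("2021-W48", "Delta"), ("2021-W48", "Delta"),
    ("2021-W48", "Delta"), ("2021-W48", "Omicron"),
    ("2021-W50", "Delta"), ("2021-W50", "Omicron"),
    ("2021-W50", "Omicron"), ("2021-W50", "Omicron"),
]

for week, lineage_shares in variant_shares_by_week(samples).items():
    summary = ", ".join(f"{lineage} {share:.0%}"
                        for lineage, share in sorted(lineage_shares.items()))
    print(f"{week}: {summary}")
```

The same tabulation can be restricted to break-through infections and compared with the general population to see whether a variant is over-represented among vaccinated people.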

COVID-19 has become endemic: it can re-infect people who have been vaccinated or have had COVID-19. As such, similar to influenza vaccinations, annual or 6-monthly booster vaccinations may be needed for those at risk for many years, at least until a very effective treatment is developed to treat those at most risk. Thus, long-term sequencing of the virus will be needed to guide future development and trials of improved vaccines.


Vaccines are an effective way to limit the infection and spread of infectious diseases, and to reduce the severity of symptoms. The ongoing COVID-19 pandemic has hastened the development and trials of several vaccines. Sequencing of the virus variants assists in vaccine development and in phase 3 and 4 vaccine trials, and guides improvements to booster vaccines against new viral variants for which existing vaccines may be less effective.

For more information on the immune system and the difference between innate and adaptive immunity, there is an interesting overview from Technology Networks.

© COG-Train
This article is from the free online course The Power of Genomics to Understand the COVID-19 Pandemic

Created by
FutureLearn - Learning For Life
