Genomic variant interpretation workflow

An article explaining the workflow for genomic variant interpretation

When a change occurs in the genome, it could have an impact on the function of the cell and in some cases can even lead to disease.

Genomic variant interpretation is the practice of using various sources of evidence to decide whether a genomic change could be disease-causing (i.e., a pathogenic variant). Predicting pathogenicity is not always straightforward: sometimes we don’t have enough evidence to make a clear call, and sometimes the disease is so complex that the evidence at hand is difficult to interpret.

Next Generation Sequencing (NGS) is an advanced genomic sequencing technique that has made it possible to survey many (and in some cases all) nucleotides in the genome at once. NGS has transformed the way we interpret variants and has massively impacted how people with genetic conditions are diagnosed. Although many of the steps in an NGS workflow are similar to standard clinical diagnostic testing (e.g., from obtaining a test request to delivering a report), the process of variant interpretation (which includes variant annotation, prioritisation and classification) is unique. The full process is summarised below in Figure 1. The components illustrated in this figure are common to all high-throughput sequencing tests, but modifications to the details of each component can result in differences in data quality and accuracy.

A workflow from test request to report for genomic sequencing: 1) Test request 2) NGS DNA sequencing 3) Sequence mapping & alignment 4) Variant annotation: Gene nomenclature; Consequence 5) Variant prioritisation: Gene-disease association; Variant type; Sequence quality; Population frequency; Inheritance 6) Variant classification & case interpretation: Supportive of pathogenicity?; Decide if variant classification matches clinical features 7) Variant confirmation & segregation analysis 8) Reporting. A line indicates a secondary process from 6 to 4: interactive process over time.

Figure 1. An overview of the genomic sequencing workflow. Adapted from Marshall et al. 2020. Source: npj Genomic Medicine

Variant interpretation relies on gathering information about the variant under review from several data resources to ultimately decide if the variant is likely to be involved in the disease under investigation. The steps in the interpretation process are as follows:

  1. Variant annotation: The process of adding information about a variant or a gene to support interpretation. The more information available, the better equipped we are to interpret the potential impact of a variant.
  2. Variant prioritisation: The annotated information is reviewed to decide how likely a variant is to have a disease-causing impact, and to prioritise variants on this basis. This review should consider the frequency of a variant (e.g., common variants are unlikely to cause a rare disease); its inheritance pattern (e.g., two pathogenic alleles should be present in a recessive disorder, whereas only one pathogenic allele will lead to a disease state in a dominant disorder); and its potential impact (e.g., does a genetic variant lead to a change in the protein, or to another alteration that might disrupt function?).
  3. Variant classification: Decide whether a variant is expected to be pathogenic on the basis of the standardised guideline for the interpretation of variants issued by the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP).
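
The prioritisation step can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not a clinical tool: the `Variant` fields, the `DISRUPTIVE` set of consequence terms, the `prioritise` function and the 0.001 frequency cut-off are all hypothetical choices made for this example, and a real pipeline would draw on far richer annotation.

```python
from dataclasses import dataclass

# Consequence terms treated as potentially disruptive in this sketch.
# This is an illustrative subset chosen for the example, not a standard list.
DISRUPTIVE = {
    "missense_variant",
    "stop_gained",
    "frameshift_variant",
    "splice_donor_variant",
}


@dataclass
class Variant:
    gene: str                # gene the variant falls in (from annotation)
    consequence: str         # predicted effect on the gene product (from annotation)
    allele_frequency: float  # population allele frequency, between 0 and 1
    alt_allele_count: int    # 1 = heterozygous, 2 = homozygous in the patient


def prioritise(variants, disease_genes, inheritance, max_af=0.001):
    """Keep variants that are rare, potentially disruptive, in a gene linked
    to the disease, and compatible with the expected inheritance pattern."""
    kept = [
        v for v in variants
        if v.gene in disease_genes
        and v.allele_frequency <= max_af   # common variants rarely cause rare disease
        and v.consequence in DISRUPTIVE    # could the variant disrupt function?
    ]
    if inheritance == "dominant":
        # One altered allele is enough to be of interest.
        return kept
    if inheritance == "recessive":
        # Two altered alleles are needed in the same gene: one homozygous
        # variant, or two heterozygous variants (a *candidate* compound
        # heterozygote; confirming it would need phasing or parental data).
        alleles_per_gene = {}
        for v in kept:
            alleles_per_gene[v.gene] = alleles_per_gene.get(v.gene, 0) + v.alt_allele_count
        return [v for v in kept if alleles_per_gene[v.gene] >= 2]
    raise ValueError(f"unknown inheritance pattern: {inheritance!r}")
```

Calling `prioritise(variants, {"GENE_A"}, "recessive")` with a list of `Variant` objects returns only those that pass every filter. The output is a shortlist for expert review and classification, not a verdict on pathogenicity.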

Variant interpretation is an involved, iterative process that requires a lot of manual intervention. However, new developments in artificial intelligence tools have helped automate some parts of the workflow. A multidisciplinary team of experts, including clinicians, genetic counsellors and laboratory scientists, is often involved in the process.

Because the decision on whether a variant is likely to be pathogenic depends on the data available, publicly available datasets need to be as comprehensive and diverse as possible. Representation of information from all global population groups and from as many clinical resources as possible is essential in this process. For instance, it is well known that the frequency of genetic variants differs between populations. If allele frequency information is used to decide whether a variant is likely to be involved in disease pathogenesis, it is essential to include information from groups that match the ethnicity of the patient. Unfortunately, we know that as few as 22% of participants in genomics research are of non-European ancestry and that the majority of available genetic data come from just three countries: the United Kingdom (40%), the United States of America (19%) and Iceland (12%). This lack of diversity in genetic research presents an ethical barrier to the realisation of precision medicine for all.
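
To see why population-matched frequency data matter, consider the following minimal Python sketch. The function name `is_rare_in_population`, the population labels, the frequencies and the 0.001 cut-off are all hypothetical values invented for illustration. They do not come from any real dataset.

```python
def is_rare_in_population(freqs_by_population, patient_population, max_af=0.001):
    """Return True or False when a matched frequency exists, and None when
    the reference data contain no entry for the patient's population."""
    af = freqs_by_population.get(patient_population)
    if af is None:
        # No matched data: rarity cannot be judged for this patient.
        return None
    return af <= max_af


# Hypothetical allele frequencies for one variant in two reference populations.
example_freqs = {"POP_A": 0.0002, "POP_B": 0.03}

rare_in_a = is_rare_in_population(example_freqs, "POP_A")  # rare in POP_A
rare_in_b = is_rare_in_population(example_freqs, "POP_B")  # common in POP_B
rare_in_c = is_rare_in_population(example_freqs, "POP_C")  # POP_C not represented
```

The same variant passes a rarity filter for a patient from `POP_A`, fails it for a patient from `POP_B`, and cannot be assessed at all for a patient from the unrepresented `POP_C`. This is the practical consequence of gaps in reference data.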

© Wellcome Connecting Science
This article is from the free online course Interpreting Genomic Variation: Overcoming Challenges in Diverse Populations
