Glossary

Glossary of less known terms used in this course

If you are new to this topic, you will be introduced to many new terms. Download the glossary and use this as your reference throughout the course. Even if you have worked in genomics for some time, this glossary will be useful to refresh your memory. This glossary allows us to use a common language throughout this course.

If a term you are looking for is not on the list please ask for it in the comments section.

A

Allele

An allele is one of two or more versions of a DNA sequence (a single base or a segment of bases) at a given genomic location. An individual inherits two alleles, one from each parent, for any given genomic location where such variation exists. If the two alleles are the same, the individual is homozygous for that allele; if they differ, the individual is heterozygous.

Alternative splicing

Alternative splicing is when a single gene produces a number of distinct mRNAs.

American College of Medical Genetics and Genomics (ACMG)

An organisation composed of multidisciplinary professionals committed to the practice of medical genetics.

Amorph (null allele)

An allele that results in a complete loss of gene function.

Aneuploidy

An abnormality in the number of chromosomes (loss or gain of an entire chromosome).

Antimorph (dominant negative allele)

An allele whose product interferes with the function of the wild-type allele (a dominant negative effect), which can result in loss of function of both alleles.

Artefact (sequencing)

Variations introduced by non-biological processes. The chemical reactions used in sequencing can introduce nucleotide changes that look like genuine variants. These artefacts contribute to the error rate of a sequencing method and are accounted for during analysis.

Association for Clinical Genomic Science (ACGS)

A constituent group of the British Society for Genetic Medicine which brings together scientists working within genetics into one professional association.

Association for Molecular Pathology (AMP)

A scientific society that advances the clinical practice, science, and excellence of molecular and genomic laboratory medicine through education, innovation, and advocacy to enable the highest quality health care.

Autozygosity

Occurs when an individual inherits from each parent a chromosomal segment that is identical because both copies descend from a common ancestor. This occurs at high rates in the offspring of closely related mates (inbreeding) but also occurs at lower levels among the offspring of distantly related mates.

B

British Society for Genetic Medicine (BSGM)

An independent professional organisation whose purpose is to support the promotion, encouragement and advancement of genetic and genomic science in clinical and research practice for the public benefit.

C

CATH database

A free, publicly available, hierarchical classification of protein domain structures, which clusters proteins at four major levels: Class (C), Architecture (A), Topology (T) and Homologous superfamily (H).

Cis state

In genetics, the cis state refers to the arrangement where two or more genetic loci (genes or genetic markers) are located on the same chromosome and are often inherited together.

Clinico-molecular diagnosis

A diagnosis which combines both variant classification and clinical fit.

Clinical Genome Resource (ClinGen)

A free, online database that defines the clinical relevance of genes and variants for use in precision medicine and research.

Clinical Genomic Variants Resource (ClinVar)

A free, online resource that catalogues reports of human variations classified for diseases and drug responses, with supporting evidence.

Coding Sequence (CDS)

A region of DNA or RNA whose sequence determines the sequence of amino acids in a protein.

Copy number deletion

Loss of a portion of a DNA segment, resulting in fewer copies of a specific genetic sequence.

Copy Number Variation (CNV)

Genetic alterations that involve the duplication or deletion of segments of DNA (typically 50 bases or more), resulting in changes in copy numbers of specific regions of the genome.

D

Deciphering Developmental Disorders in Africa (DDD-Africa)

A project that aims to find genetic causes of developmental disorders in African populations.

E

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI)

The European Bioinformatics Institute (EBI) is the European Molecular Biology Laboratory’s (EMBL) UK site. It maintains the world’s most comprehensive range of freely available and up-to-date molecular data resources, in addition to research and training programmes.

F

Founder variant

A genetic alteration observed with high frequency in a group that is or was geographically or culturally isolated, in which one or more of the ancestors was a carrier of the altered gene. This phenomenon is often called a founder effect.

Frameshift deletion

Deletion of nucleotides that alters the reading frame, often resulting in a nonfunctional protein.

Frameshift insertion

Addition of nucleotides that disrupts the reading frame, typically leading to a nonfunctional protein.

G

Gene Curation Coalition (GenCC)

The GenCC brings together groups engaged in the evaluation of gene-disease validity with a willingness to share data publicly, to develop consistent terminology for gene curation activities and to facilitate the consistent assessment of genes that have been reported in association with disease.

Genome assembly

The process of taking a large number of short DNA sequences and putting them back together to create a representation of the original chromosomes from which the DNA originated.

Genome Reference Consortium Human Build 37 (GRCh37)

The official name for the human reference genome assembly released on 27 February 2009. It is not the current version but is still used in some large-scale projects. It is usually abbreviated as GRCh37 and also referred to as hg19.

Genome Reference Consortium Human Build 38 (GRCh38)

The official name for the current human reference genome assembly. It is usually abbreviated as GRCh38 and also referred to as hg38.

Genome Aggregation Database (gnomAD)

A resource developed by an international coalition of investigators that aggregates and harmonises exome and genome sequencing data from large-scale sequencing projects and makes summary data available to the wider scientific community.

Gonadal mosaicism

Occurs when some of the sperm cells in the testes or some of the egg cells in the ovaries carry a pathogenic variant that is not found in other cells of the body.

H

Haploinsufficiency

Occurs when one copy of a gene is inactivated or deleted and the remaining functional copy cannot produce enough gene product to preserve normal function.

Haplotype

A set of alleles or DNA variants along a single chromosome that are inherited together from a single parent.

Heterozygous

Refers to a gene or an individual having two different alleles at a genetic locus.

Homozygous

Refers to a gene or an individual having two copies of the same allele at a locus.
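
As a minimal sketch of the heterozygous and homozygous definitions above, assuming a genotype at one locus is represented by two allele strings (the function name `zygosity` is hypothetical):

```python
def zygosity(allele_1: str, allele_2: str) -> str:
    """Classify a diploid genotype at one locus from its two alleles."""
    # Same allele inherited from both parents: homozygous; otherwise heterozygous.
    return "homozygous" if allele_1 == allele_2 else "heterozygous"


print(zygosity("A", "A"))
print(zygosity("A", "G"))
```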

Homologous chromosomes

Pairs of chromosomes that have the same structure and size and contain the same genes in the same order. In humans, one chromosome of the pair is inherited from the individual’s mother, and the other from the father.

Human Genome Project (HGP)

A project led by an international group of researchers that generated the first sequence of the human genome.

Human Heredity and Health in Africa (H3Africa)

An Africa-wide consortium working towards improving capacity for genomics research on the African continent.

Hypermorph

An allele that results in increased gene activity or expression.

Hypomorph

An allele that leads to reduced gene function.

I

In-frame deletion

Deletion of nucleotides that maintains the reading frame of a gene (three nucleotides, or a multiple of three, are removed).

In-frame insertion

Addition of nucleotides that maintains the reading frame of a gene (three nucleotides, or a multiple of three, are added).
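
The difference between frameshift and in-frame insertions and deletions comes down to whether the net change in length is a multiple of three. The sketch below illustrates this, assuming the variant lies entirely within a coding sequence and is given as reference and alternate allele strings; the function name `indel_frame_effect` is hypothetical.

```python
def indel_frame_effect(ref: str, alt: str) -> str:
    """Classify a coding indel by its net length change (assumes it is fully within the CDS)."""
    length_change = len(alt) - len(ref)
    if length_change == 0:
        return "not an indel"
    kind = "insertion" if length_change > 0 else "deletion"
    # A net change that is a multiple of three keeps downstream codons intact.
    frame = "in-frame" if length_change % 3 == 0 else "frameshift"
    return f"{frame} {kind}"


print(indel_frame_effect("A", "ACGT"))  # three bases added
print(indel_frame_effect("ACG", "A"))   # two bases removed
```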

Inversions

A genetic rearrangement in which a segment of DNA between two breakpoints is reversed in orientation within the chromosome.

L

Large structural variants

Structural variants are genetic alterations that cause changes in the structure, organisation, or arrangement of larger DNA segments.

Linkage disequilibrium

Refers to the non-random association of alleles at two or more genetic loci. It occurs when alleles at different loci are inherited together more frequently than would be expected by chance.
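
Non-random association can be quantified by comparing the observed frequency of a haplotype with the frequency expected if the alleles were independent. The sketch below computes two commonly used measures, D and r², for one allele at each of two biallelic loci. The function name and the example frequencies are hypothetical.

```python
from typing import Tuple


def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (D, r_squared) for allele A at locus 1 and allele B at locus 2.

    p_ab is the frequency of haplotypes carrying both A and B;
    p_a and p_b are the individual allele frequencies (each strictly between 0 and 1).
    """
    # D is the observed haplotype frequency minus the frequency expected by chance.
    d = p_ab - p_a * p_b
    # r squared rescales D by the allele frequencies so that it lies between 0 and 1.
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared


d, r_squared = linkage_disequilibrium(p_ab=0.4, p_a=0.5, p_b=0.5)
print(round(d, 4), round(r_squared, 4))
```

When D is zero the alleles are inherited together no more often than expected by chance, which is linkage equilibrium.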

Locus (genetic locus)

The specific physical location of a gene or other DNA sequence on a chromosome, like a genetic street address.

Long non-coding RNAs

RNA transcripts longer than 200 nucleotides that are transcribed but not translated into protein.

M

MANE (Matched Annotation from NCBI and EMBL-EBI)

A joint NCBI and EMBL-EBI project that aims to define a genome-wide set of representative transcripts, annotated identically in RefSeq and Ensembl, for use as a default transcript set.

Missense/nonsynonymous variants

Variation that changes one amino acid to another in a protein.

N

National Center for Biotechnology Information (NCBI)

Part of the United States National Library of Medicine. It advances science and health by providing access to biomedical and genomic information.

National Institutes of Health (NIH)

The primary agency of the United States government responsible for biomedical and public health research.

Natural selection

Organisms that have traits that are advantageous to survival are more likely to reproduce and pass on their genes/variants to the next generation. This process is known as natural selection and causes species to change and diverge over time.

Neomorph

An allele that results in a novel or gain-of-function activity not found in the wild-type gene.

Next-Generation Sequencing (NGS)

Next-Generation Sequencing is a high-throughput sequencing methodology in which many DNA fragments are sequenced in parallel.

Non-homologous chromosomes

Chromosomes that do not pair during meiosis because they contain different sets of genes. These chromosomes belong to different chromosome pairs within the organism’s genome.

Nonsense-Mediated Decay (NMD)

Nonsense-mediated decay is a cellular surveillance process that eliminates aberrant mRNAs carrying a premature stop codon, which would otherwise produce truncated proteins.

Nonsense/stop gain variants

A variation that converts an amino acid-coding codon into a stop codon, leading to premature termination of the protein.

O

Online Mendelian Inheritance in Man (OMIM/MIM)

A comprehensive, online resource of human genes and genetic phenotypes that is freely available.

P

Pangenome

A collection of the common and unique genomes of a given species. It combines the genetic information of all the genomes sampled.

Pathogenic variant

A change in the DNA sequence that causes a person to have or be at risk of developing a certain genetic disorder or disease.

Penetrance

The proportion of people with a particular genetic variant who exhibit symptoms of the associated genetic disorder.
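
As a simple worked illustration of this proportion, the sketch below divides the number of carriers who show symptoms by the total number of carriers. The function name and the counts are hypothetical and are not taken from any study.

```python
def penetrance(affected_carriers: int, total_carriers: int) -> float:
    """Proportion of variant carriers who show symptoms of the associated disorder."""
    if total_carriers <= 0:
        raise ValueError("total_carriers must be positive")
    if not 0 <= affected_carriers <= total_carriers:
        raise ValueError("affected_carriers must be between 0 and total_carriers")
    return affected_carriers / total_carriers


# Hypothetical cohort: 30 of 40 carriers are symptomatic.
print(penetrance(30, 40))
```

A value of 1.0 corresponds to complete penetrance; anything lower is incomplete (reduced) penetrance.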

Pfam

A database of protein families that includes their annotations and multiple sequence alignments generated using statistical models.

Probability of Loss-of-function Intolerance (pLI)

A gene-level score that estimates the probability that a gene is intolerant to loss of function (LoF) of a single allele, that is, the likelihood that the gene falls into the class of LoF-haploinsufficient genes.

Pseudogene

A nonfunctional copy of a protein-coding gene.

R

Reference Sequence (RefSeq)

A sequence collection maintained by NCBI in the United States, which provides a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins.

Repeat expansions

An increase in the number of copies of a short repeated DNA sequence (a short tandem repeat) within a gene, which can lead to certain genetic disorders.

S

Single nucleotide deletion

Removal of a single nucleotide from a DNA or RNA sequence.

Single nucleotide insertion

Addition of a single nucleotide into a DNA or RNA sequence.

Single Nucleotide Variations (SNV)

Single nucleotide variations involve changes at a single nucleotide within the DNA sequence, encompassing substitutions, insertions, or deletions.

Skewed X-inactivation

X inactivation is considered skewed if one of the two X chromosomes is the active copy in fewer than 10% or more than 90% of cells. This can result in symptoms of an X-linked disease if the chromosome carrying the pathogenic variant is preferentially active.

Small indels

Short for “insertion-deletions”, they involve the addition, the removal, or both, of a few nucleotides (typically more than one and fewer than 50 bases) within a DNA or RNA sequence.
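
The size ranges used in this glossary for insertions and deletions can be summarised in a short sketch. The thresholds are the "typical" values quoted in the definitions (one base, fewer than 50 bases, 50 bases or more), not hard biological limits. The function name `indel_size_class` is hypothetical.

```python
def indel_size_class(affected_bases: int) -> str:
    """Bin an insertion or deletion by the number of bases added or removed."""
    if affected_bases < 1:
        raise ValueError("affected_bases must be at least 1")
    if affected_bases == 1:
        return "single nucleotide insertion or deletion"
    if affected_bases < 50:
        return "small indel"
    # 50 bases or more: the typical lower bound quoted for copy number variation.
    return "copy number or large structural variant"


for size in (1, 12, 50, 10_000):
    print(size, indel_size_class(size))
```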

Small RNAs

Small RNAs are RNAs shorter than 200 nucleotides. Examples include transfer RNAs (tRNAs) and microRNAs (miRNAs).

Splicing

A process in mRNA maturation in which introns (non-coding regions of RNA) are removed from the newly synthesised pre-mRNA transcript and the exons (coding regions) are joined together.

Start loss variants

Variation that prevents the initiation of translation at the usual start codon.

Stop loss variants

Variation that prevents the termination of translation, resulting in a longer protein.

Substitution

Replacement of one nucleotide with another in a DNA or RNA sequence. These are the most common variations observed.

Synonymous/silent variants

Variation that does not alter the encoded amino acid. It often occurs in the third position of a codon, because the genetic code is redundant and several codons that differ only at that position encode the same amino acid.
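
The substitution consequences defined in this glossary (synonymous, missense, nonsense and stop loss) can be told apart by translating the codon before and after the change. The sketch below does this with the standard genetic code, written on the DNA coding strand. The function and variable names are hypothetical, and start loss is not handled because it depends on knowing that the codon is the start codon of the transcript.

```python
BASES = "TCAG"
# Standard genetic code, ordered TTT, TTC, TTA, TTG, TCT, ... GGG; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def substitution_consequence(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change by comparing the encoded amino acids."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense (stop gain)"
    if ref_aa == "*":
        return "stop loss"
    return "missense"


print(substitution_consequence("GAA", "GAG"))  # Glu -> Glu, third-position change
print(substitution_consequence("GAA", "GTA"))  # Glu -> Val
print(substitution_consequence("TGG", "TGA"))  # Trp -> stop
print(substitution_consequence("TAA", "CAA"))  # stop -> Gln
```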

T

Telomere

Telomeres are structures made from DNA sequences and proteins found at the ends of chromosomes.

Transcriptome

The transcriptome is the full range of messenger RNA molecules expressed by an organism.

Translocations

Rearrangement of genetic material between non-homologous chromosomes or within the same chromosome.

Trans state

In genetics, the trans state refers to the arrangement where two or more genetic loci (genes or genetic markers) are located on different chromosomes or on opposite homologous chromosomes.

V

Variants of Uncertain Significance (VUS)

A genetic variant that has been identified through genetic testing but whose significance to the function or health of an organism is not known.

Variant calling tools

Bioinformatics tools used to infer genetic variation information from DNA sequencing usually obtained from high-throughput methodologies.

© Wellcome Connecting Science
This article is from the free online course Interpreting Genomic Variation: Overcoming Challenges in Diverse Populations.

Created by
FutureLearn - Learning For Life
