
Glossary of Genomics Terms

This article outlines definitions of some of the key terms related to the study of genomics.
© St George’s, University of London
The glossary and acronyms tables below provide a useful guide to the key terms related to genomics.
There is also a PDF version of these glossary and acronyms lists in the downloads section at the bottom of this step.
Many thanks to Miranda Shanks for her help in developing these lists.


Alignment: The process of matching reads back to their original position in the reference genome.
Allele: One of a number of alternative forms of the same gene or genetic locus. We inherit one copy of our genetic code from our mother and one copy from our father. Each copy is known as an allele.
Array CGH: Microarray-based comparative genomic hybridisation. A technique used to detect chromosome imbalances by comparing patient DNA with control DNA. It is useful for detecting small chromosome deletions and duplications that would not be detected with more traditional karyotyping techniques.
Base: A unit of DNA. There are four bases which form the cross links (or rungs) of the DNA double helix: adenine (A), thymine (T), guanine (G) and cytosine (C).
Capture: See Target enrichment.
Cell differentiation: The process by which a cell becomes specialised in order to perform a specific function.
Centromere: The point at which the sister chromatids are joined.
Chromosome: A structure located in the cell nucleus, composed of DNA bound around proteins called histones. The normal number of chromosomes in each human cell nucleus is 46: 22 pairs of autosomes and one pair of sex chromosomes, which determine sex. Males have an X and a Y chromosome whilst females have two X chromosomes.
Chromatid: One of the two identical copies produced by the replication of a single chromosome. A sister chromatid is therefore either of the two identical copies.
Codon: A sequence of three adjacent nucleotides that forms a unit of the genetic code and determines the insertion of a specific amino acid in a polypeptide chain during protein synthesis.
Clinical exome sequencing: Varies between laboratories, but usually involves sequencing all genes known to be associated with human disease.
Coverage: The number of reads giving information about the base present at a given position in the reference sequence.
Crystallography: The experimental science of determining the arrangement of atoms in crystalline solids.
DNA: Deoxyribonucleic acid. DNA is a molecule consisting of two long chains of nucleotides twisted together to form a double helix. Genes are made from DNA.
DNA sequencing: The process of determining the order of the nucleotides in a strand of DNA.
Epigenome: Chemical marks on the DNA that regulate whether genes are turned “on” or “off”.
Exome: The coding portion of the genes. The exome constitutes 1-2% of the genome.
Gene: A portion of DNA that serves as the basic unit of heredity.
Genetic code: The DNA or RNA sequence that determines the amino acid sequence used in the synthesis of an organism’s proteins.
Genome: The entirety of an individual’s genetic material, including approximately 20,000 genes and the genetic material between genes.
Gene panel: A collection of genes to be sequenced together, which are usually linked by common biological pathways or known disease associations.
Germline mutations: Genetic errors that occur in the egg or sperm before fertilisation, so that the mutation is passed on to the offspring.
Haploid: Describes a cell that has a single set of unpaired chromosomes.
Histones: Basic proteins which function as spools for thread-like DNA to wrap itself around, allowing DNA to be packaged efficiently.
Homologous: Corresponding in structure and in origin, but not necessarily in function.
Homologous chromosomes: A pair of chromosomes which have the same genes at the same loci.
Indel: An insertion or deletion of one or more bases in a DNA sequence.
Karyotype: The number and appearance of the chromosomes when viewed under a microscope.
Mapping: See Alignment.
Meiosis: The type of cell division that produces sperm and eggs. Four daughter cells are produced from the original parent cell. There are two stages to meiosis, meiosis 1 and meiosis 2. At the end of meiosis, each daughter cell has only 23 chromosomes.
Missense mutation: A base substitution or change that results in a codon which causes the insertion of a different amino acid into a protein. See Non-synonymous change.
Mitosis: The division of the nucleus that occurs at the end of the normal cell cycle. Two daughter cells are produced from one parent cell, both with the same number of chromosomes (46) as the parent cell.
Monosomy: Where only one chromosome from a pair is present in a cell.
Mosaicism: Where a genetic error occurs after fertilisation, resulting in two genetically distinct cell lines.
Next generation sequencing: High-throughput DNA sequencing where millions of DNA bases are sequenced in parallel.
Non-pathogenic: Not disease-causing.
Non-synonymous change: A base substitution or change that results in a codon which causes the insertion of a different amino acid into a protein. See Missense mutation.
Nucleotide: A unit composed of a DNA base, a phosphate and a pentose sugar.
Pharmacogenomics: The branch of genetics concerned with determining the likely response of an individual to therapeutic drugs.
Phenotype: The set of observable characteristics of an individual, resulting from their genotype and its interaction with the environment.
Pull down: See Target enrichment.
Read: A computer-generated sequence of bases representing the sequenced code from an original DNA fragment.
Reference sequence/genome: An assembled version of a genome that can be used to make comparisons to the genomes from other individuals.
RNA: Ribonucleic acid, a nucleic acid present in all living cells. Its principal role is to act as a messenger carrying instructions from DNA for controlling the synthesis of proteins.
Single nucleotide polymorphism: A single base substitution occurring at high frequency (more than 1%) in the general population.
Somatic mosaicism: The existence of genetically distinct cell lines in the body of an individual, but not in the sex cells (germline).
Spindle: Fibres which draw the chromosomes to opposite poles of the cell during mitosis and meiosis.
Target enrichment: A method for selecting a specific portion of the genome to undergo sequencing.
Transcription: The first step of gene expression, in which a particular segment of DNA is copied into RNA by the enzyme RNA polymerase.
Translation: The step after transcription, in which cellular ribosomes use the messenger RNA to produce a specific protein.
Uracil: A pyrimidine base that is a component of RNA. It forms a base pair with adenine during the generation of messenger RNA, taking the place that thymine, its structural analogue, occupies in DNA.
Variant of uncertain significance: An alteration to the DNA sequence where it is unclear (on the basis of the available evidence) whether it is disease-causing or not.
Whole (human) exome sequencing: Sequencing the portion of the genome which codes for proteins.
Whole (human) genome sequencing: Sequencing of the entire length of the human genome.
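
Several of the terms above (transcription, translation, codon, missense mutation, read, coverage) describe steps that can be followed in a few lines of code. The Python sketch below is purely illustrative and is not part of the course material: the sequences, the read positions, the function names and the five-entry codon table are all made-up assumptions for the example (the full genetic code has 64 codons), and transcription is simplified to swapping thymine for uracil on the coding strand.

```python
# Illustrative sketch only: toy sequences and a deliberately partial codon table.

# A few entries of the standard genetic code (mRNA codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met", "GAG": "Glu", "GAA": "Glu", "GUG": "Val", "UAA": "Stop",
}

def transcribe(coding_dna):
    """Simplified transcription: the mRNA matches the coding strand with T replaced by U."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """Read the mRNA one codon (three bases) at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

def coverage(aligned_reads, position):
    """Count the aligned reads that overlap one position of the reference.

    Each read is a (start position on the reference, sequence of bases) pair.
    """
    return sum(start <= position < start + len(seq) for start, seq in aligned_reads)

reference = "ATGGAGTAA"
variant = "ATGGTGTAA"  # single base substitution (A to T) at index 4

ref_protein = translate(transcribe(reference))
var_protein = translate(transcribe(variant))

# The substitution changes one codon (GAG to GUG) and therefore one amino acid,
# which is what the glossary calls a missense (non-synonymous) change.
is_missense = ref_protein != var_protein

# Three toy reads already aligned to the reference, all overlapping index 4.
aligned_reads = [(0, "ATGGAG"), (2, "GGAGTA"), (3, "GAGTAA")]
depth_at_variant = coverage(aligned_reads, 4)

print(ref_protein, var_protein, is_missense, depth_at_variant)
```

Real analyses use dedicated alignment and variant-calling software rather than hand-written functions like these; the sketch is only meant to make the definitions concrete.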


The list below gives the key abbreviations used throughout the course.
ABL: Abelson murine leukemia viral oncogene homolog 1
Array CGH: Array comparative genomic hybridisation
BCR: Breakpoint cluster region
BRCA1 gene: Breast cancer 1, early onset gene
BRCA2 gene: Breast cancer 2, early onset gene
CML: Chronic myelogenous leukaemia
CNV: Copy number variation
DDD: Deciphering Developmental Disorders
DNA: Deoxyribonucleic acid
DS: Down syndrome
GWAS: Genome-wide association studies
HD: Huntington’s disease
MED: Multiple epiphyseal dysplasia
mRNA: Messenger RNA
NGS: Next generation sequencing
PGD: Pre-implantation genetic diagnosis
RNA: Ribonucleic acid
SNPs: Single nucleotide polymorphisms
TPMT: Thiopurine S-methyltransferase
tRNA: Transfer RNA
VUS: Variant of uncertain significance
WES: Whole exome sequencing
WGS: Whole genome sequencing
This article is from the free online course The Genomics Era: the Future of Genetics in Medicine.

Created by
FutureLearn - Learning For Life
