Getting you ready for the course
To help you feel comfortable working through the course, we have collated some helpful documents on the platform, some resources you may find useful, and an A to Z glossary of key terms and acronyms.
The glossary covers key terms you will come across over the next five weeks that may need further explanation, or that you would simply like to know more about.
The glossary terms and acronyms below will appear as links in the course pages, so you can look back whenever you come across anything you’re not sure about. Here is a version of the glossary in PDF format so you can download it and keep it for future reference.
Autosomal Dominant (AD) describes the inheritance pattern of a genetic condition where one copy of the variant gene is inherited from one of the parents.
Autosomal Recessive (AR) describes the inheritance pattern of a genetic condition where two copies of the variant gene are inherited (one from each parent).
Cloud computing is the practice of using a network of remote servers hosted on the Internet to store, manage, and process data.
De novo mutations are genetic variants that occur in an offspring but are not present in either parent.
Evolutionary Sequence Conservation (ESC) is where sequence similarity is used as evidence of structural and functional conservation, and evolutionary relationships between sequences.
Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects.
Exome is the entire protein coding sequence of the genome.
Gene panels are sets of tens to hundreds of genes used to identify variants in the human genome linked to specific phenotypes or conditions.
Genotype, in its broadest sense, is the genetic characteristics of an individual. When referring to a particular trait, it describes the variant forms of a gene that are carried by an organism.
Incidental Findings - unexpected genetic changes found during sequencing of the genome.
In silico means performed using computer modelling or simulation.
Linux is a command-line based computer operating system.
Locus Specific Database (LSDB): A database describing variants found at particular gene loci.
Microbiology is the branch of science that deals with micro-organisms.
A Missense variant is a single base pair change that causes a different amino acid to be formed at that position in the sequence.
Multiple Sequence Alignment is generally the alignment of three or more biological sequences (protein or nucleic acid) of similar length. From the output of the alignments, homology can be inferred and the evolutionary relationships between the sequences studied.
Next Generation Sequencing (NGS) - the process by which millions of fragments of DNA can be sequenced in parallel from the same sample.
Nonsense variant is a single base change in the nucleotide sequence that causes the formation of a stop codon, leading either to a truncated protein or to nonsense-mediated decay of the transcript.
Nonsynonymous variant is a single base change in the nucleotide sequence that changes the codon, leading to the formation of an alternate amino acid.
Ophthalmology is the branch of medicine concerned with the study and treatment of disorders and diseases of the eye.
Phenotype is the set of observable characteristics or traits of an individual.
Pfam is a database of protein domain families.
Pseudonymous Data is a type of data that allows the potential, under certain circumstances, for the manager of the database to re-identify each individual at a future time, usually via a ‘key’ that decodes the pseudonym back into the NHS number. In this sense, pseudonymous data are neither identifiable nor anonymous because all personal identifiers have been removed but identification is still possible through the pseudonym.
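The idea of a 'key' that links pseudonyms back to identifiers can be illustrated with a minimal Python sketch. This is an illustration only, not how any particular database is managed: the function name `pseudonymise`, the field name `nhs_number` and the example record are hypothetical, and the NHS number shown is a made-up placeholder.

```python
import secrets


def pseudonymise(records, id_field="nhs_number"):
    """Replace the identifier in each record with a random pseudonym.

    Returns the released records (identifier removed) and the key that
    maps each pseudonym back to the original identifier. Only the
    database manager would hold the key.
    """
    key = {}
    released = []
    for record in records:
        pseudonym = secrets.token_hex(8)  # random, carries no personal information
        key[pseudonym] = record[id_field]
        # Copy every field except the identifier, then attach the pseudonym.
        stripped = {k: v for k, v in record.items() if k != id_field}
        stripped["pseudonym"] = pseudonym
        released.append(stripped)
    return released, key


# Hypothetical record with a placeholder identifier.
records = [{"nhs_number": "0000000000", "finding": "example variant"}]
released, key = pseudonymise(records)
```

Anyone who sees only `released` cannot identify the individual, but whoever holds `key` can look up the pseudonym and recover the identifier, which is why such data are not fully anonymous.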
Reference Genome Sequence (RGS) is a digital sequence assembled from sequencing the DNA from a number of donors.
A Read is a short stretch of sequence data produced by the sequencer from a fragment of the genome.
Sense variant is a single base change in nucleotide sequence that encodes the same amino acid, as several codons encode for the same amino acid.
Single Nucleotide Polymorphism (SNP) is a position in the genome where a single base change occurs. SNPs are among the most common types of genetic variation.
Single Nucleotide Variant (SNV) is a position in the genome where an alternate base is found in the test genome relative to the reference genome.
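As a rough illustration of comparing a test sequence with a reference, the short Python sketch below lists the positions where the two differ by a single base. It assumes the two sequences are already aligned and of equal length, which real variant callers do not assume; the function name `find_snvs` and the example sequences are made up for this illustration.

```python
def find_snvs(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned and
    have the same length (a simplifying assumption for this sketch).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Made-up sequences differing only at the third base.
snvs = find_snvs("ACGTAC", "ACCTAC")
```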
The SourceForge website offers a repository for source code, together with tools that allow the code to be modified and updated so that the whole community can contribute, while the original developers retain control to ensure code quality.
Synonymous variant is a single base change in nucleotide sequence that encodes the same amino acid, as several codons encode for the same amino acid.
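The difference between synonymous, missense and nonsense variants can be seen by translating a codon before and after a single base change. The Python sketch below is a simplified illustration using the standard genetic code; the function name `classify_variant` is made up, and changes that alter a reference stop codon are simply reported as "nonsynonymous" because they are not covered in this glossary.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (standard genetic code).
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_variant(ref_codon, alt_codon):
    """Classify a single-base change within one DNA codon."""
    differences = sum(1 for a, b in zip(ref_codon, alt_codon) if a != b)
    if len(ref_codon) != 3 or len(alt_codon) != 3 or differences != 1:
        raise ValueError("codons must be 3 bases long and differ at one base")
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"      # same amino acid encoded
    if ref_aa == "*":
        return "nonsynonymous"   # reference stop codon changed (not covered here)
    if alt_aa == "*":
        return "nonsense"        # new stop codon created
    return "missense"            # different amino acid encoded


same = classify_variant("GAA", "GAG")      # Glu -> Glu
changed = classify_variant("GAA", "GTA")   # Glu -> Val
stopped = classify_variant("TAC", "TAA")   # Tyr -> stop
```

Both missense and nonsense changes are nonsynonymous, since in each case the codon no longer encodes the original amino acid.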
Splice-site is the two-base-pair position at the intron/exon boundary at which splicing occurs to produce the mature mRNA transcript.
Variant of Unknown Significance (VOUS) is a variation in a genetic sequence whose association with disease risk is unknown.
Whole Exome Sequencing (WES) is sequencing of only the exons within a genome by NGS.
Whole Genome Sequencing (WGS) is sequencing of the entire genome by NGS.
X-linked describes the inheritance pattern of a genetic condition caused by a variant gene on the X chromosome. Males who inherit the variant will usually be affected because they have only one X chromosome, whereas females, who have two, may be unaffected carriers or show milder symptoms, depending on the condition.