Introduction to annotation

In this introductory article for the Week 2 activity, we set the scene for exploring genome annotation.
© Wellcome Genome Campus Advanced Courses and Scientific Conferences

DNA contains the code, or instructions, to make proteins and other components of cells. These instructions are organised into genes, which can be considered the smallest units of information. Typically, a gene contains a stretch of protein-coding sequence (coding sequence, or CDS, for short), but it may also contain other functional regions that do not encode protein. For example, certain regions of DNA provide a “meeting point” for the components of the transcription machinery, which then progresses along the CDS to make a transcript. These regions are called promoters and can be considered parts of genes.

Another example: bacterial chromosomes, which are typically circular, begin replication at a specific region of the genome, commonly called the ori (origin of replication). Promoters and origins are both examples of important regions of the genome that are not protein coding. So how can we record where these sequences are in a given genome? We will try to answer this and other related questions in the coming activities.
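As a rough preview of the idea, and not a description of any particular annotation file format, a feature can be thought of as a label plus a pair of coordinates on the genome. The short Python sketch below illustrates this. The `Feature` class, the feature labels, and all coordinates are made up for illustration and do not come from a real genome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotated region of a genome (hypothetical, minimal model)."""
    kind: str    # illustrative label, e.g. "CDS", "promoter", "origin"
    start: int   # first base of the region, 1-based and inclusive
    end: int     # last base of the region, 1-based and inclusive
    strand: str  # "+", "-", or "." when strand is not meaningful

    def length(self) -> int:
        # Both ends are inclusive, hence the + 1.
        return self.end - self.start + 1


# Invented coordinates: an origin, a promoter, and the CDS just downstream of it.
annotation = [
    Feature("origin", 1, 250, "."),
    Feature("promoter", 900, 960, "+"),
    Feature("CDS", 1000, 1899, "+"),
]


def features_of_kind(features, kind):
    """Return only the features carrying the given label."""
    return [f for f in features if f.kind == kind]


# Regions that are recorded in the annotation but do not code for protein.
non_coding = [f for f in annotation if f.kind != "CDS"]
```

In this sketch the promoter and the origin sit in the same list as the CDS, which reflects the point made above: an annotation records where functional regions are, whether or not they encode protein.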

One important problem with draft genomes is that their annotation may be incomplete. Remember that draft genomes typically have a number of gaps (areas of unknown sequence). If a gene sequence falls in such a gap, we cannot recover the nucleotides that make up that gene. As a consequence, the gene may be missing from the genome assembly or truncated.
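The effect of gaps can be sketched in a few lines of Python. This is a minimal illustration that assumes we somehow knew where a gene truly lies, which in practice we would not. The function name `classify_gene` and all coordinates are hypothetical, and gaps are assumed not to overlap one another.

```python
def classify_gene(gene, gaps):
    """Say how assembly gaps would affect a gene.

    gene: (start, end) tuple, 1-based and inclusive.
    gaps: list of (start, end) tuples, 1-based and inclusive,
          assumed not to overlap one another.
    """
    gene_start, gene_end = gene
    for gap_start, gap_end in gaps:
        # The whole gene lies inside the gap: none of it is in the assembly.
        if gap_start <= gene_start and gene_end <= gap_end:
            return "missing"
        # The gene and the gap share at least one position: part of it is lost.
        if gene_start <= gap_end and gap_start <= gene_end:
            return "truncated"
    return "intact"


# Invented example: one gap covering positions 5000 to 5999.
gaps = [(5000, 5999)]
results = {
    "geneA": classify_gene((1000, 1899), gaps),  # well away from the gap
    "geneB": classify_gene((4800, 5200), gaps),  # runs into the gap
    "geneC": classify_gene((5100, 5400), gaps),  # entirely inside the gap
}
```

With these made-up coordinates, `geneA` is classified as intact, `geneB` as truncated, and `geneC` as missing.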

Annotations from finished genomes are therefore generally more reliable than those from draft genomes.

This article is from the free online course Bacterial Genomes II: Accessing and Analysing Microbial Genome Data Using Artemis.

