Introduction to annotation

This article introduces the Week 2 activity and sets the scene for exploring genome annotation.
Image: two seated learners, seen from behind, discussing the analyses displayed on their monitors in a computer room.
© Wellcome Genome Campus Advanced Courses and Scientific Conferences

DNA contains the code, or instructions, to make proteins and other components of cells. These instructions are organised into genes, which can be considered the smallest units of information. Typically, a gene contains a stretch of protein-coding sequence (coding sequence, or CDS for short) but may also contain other functional regions that do not encode protein. For example, certain regions of DNA provide a “meeting point” for the components of the transcription machinery, which then progresses over the CDS section of the DNA to make a transcript. These regions are called promoters and can be considered parts of genes.

Another example is the origin of replication. Bacterial chromosomes, which are typically circular, contain a region whose role is to initiate replication of the chromosome. This region is commonly called ori (origin). Promoters and origins are both examples of important regions of the genome that are not protein coding. So how can we record where these sequences are in a given genome? We will try to answer this and other related questions in the coming activities.
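As a first intuition for what "recording where a sequence is" can mean, the minimal Python sketch below stores each region as a record with a type, start and end coordinates, and a strand. The `Feature` class, the type labels and all coordinates are invented for illustration; this is not the format used by any particular annotation tool, and real annotation files carry more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotated region of a genome (hypothetical record layout)."""
    kind: str    # illustrative type label, e.g. "CDS", "promoter", "rep_origin"
    start: int   # first base of the region, 1-based and inclusive
    end: int     # last base of the region, 1-based and inclusive
    strand: str  # "+" or "-"
    label: str   # free-text name for the region

    def length(self) -> int:
        """Number of bases covered by the feature."""
        return self.end - self.start + 1


# Invented toy annotation: an origin, a promoter and the CDS it drives.
annotation = [
    Feature("rep_origin", 1, 250, "+", "ori"),
    Feature("promoter", 300, 340, "+", "promoter of geneA"),
    Feature("CDS", 350, 1249, "+", "geneA"),
]

# Regions that matter but do not code for protein.
non_coding = [f for f in annotation if f.kind != "CDS"]

for f in non_coding:
    print(f.kind, f.start, f.end, f.strand, f.label)
```

Keeping the coordinates separate from the sequence itself means that coding and non-coding regions can be described in the same way and filtered by type.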

One important problem with draft genomes is that their annotation may not be complete. Remember that draft genomes typically have a number of gaps (areas of unknown sequence). If part or all of a gene falls within a gap, we cannot determine the nucleotides that make up that gene. As a consequence, the gene may be missing from the genome assembly or appear truncated.
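The effect of gaps on a gene can be sketched in a few lines of Python. The example assumes that gaps are written as runs of the letter N, a common convention in draft assemblies. The sequence, the gene coordinates and the function names `find_gaps` and `overlaps` are invented for illustration.

```python
import re


def find_gaps(sequence):
    """Return (start, end) for every run of N, 1-based and inclusive."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"[Nn]+", sequence)]


def overlaps(start, end, gap):
    """True if the region start..end shares at least one base with the gap."""
    gap_start, gap_end = gap
    return start <= gap_end and gap_start <= end


# Toy draft sequence: 9 known bases, a 6-base gap, then 9 known bases.
sequence = "ATGAAACCC" + "N" * 6 + "GGGTTTTAA"
gaps = find_gaps(sequence)

# Hypothetical gene spanning the whole toy sequence (positions 1 to 24).
gene_start, gene_end = 1, 24
interrupted = any(overlaps(gene_start, gene_end, g) for g in gaps)

print("gaps:", gaps)
print("gene interrupted by a gap:", interrupted)
```

A gene flagged in this way would be a candidate for a truncated or missing annotation in the draft assembly.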

Annotations from finished genomes are therefore much more reliable than those from draft genomes.

This article is from the free online course Bacterial Genomes II: Accessing and Analysing Microbial Genome Data Using Artemis.

Created by
FutureLearn - Learning For Life
