
Wellcome Genome Campus Advanced Courses and Scientific Conferences

DNA double helix, a book of DNA bases (letters), and using a magnifying glass to search for changes

Welcome to Week 1 - DNA and protein sequences

In this week’s activities you will learn some basic bioinformatics concepts: how DNA and protein sequences are represented and stored, and how data files and data formats are used.

You will learn how to represent DNA and protein sequences so that they can be stored in data files and handled by computers

Unlike humans, computers are not so flexible in reading different formats and layouts. For example, humans can easily understand this message:

“W E L C O M E to the FiRSt week of this course”

Computer programs can have a tough time trying to decipher the different formatting of the words presented here. It’s much easier to write an algorithm that can read:

“welcome to the first week of this course”.

Therefore, it is important that we humans follow a fairly strict code (or common vocabulary) when writing DNA and protein information, so that computers can be programmed to read it and perform tasks for us.
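To make this idea concrete, here is a minimal Python sketch of how a program might tidy a loosely typed DNA sequence into one strict form. The function name `normalise_dna` and the four-letter alphabet are assumptions made for illustration only; they are not part of the course material, and real tools often accept a wider set of letters.

```python
# Illustrative assumption: only the four standard DNA letters are allowed.
DNA_ALPHABET = set("ACGT")


def normalise_dna(raw: str) -> str:
    """Return the sequence in upper case with all whitespace removed.

    Raises ValueError if any remaining character is not A, C, G or T.
    """
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces removes spaces, tabs and line breaks.
    cleaned = "".join(raw.split()).upper()

    # Collect any characters that fall outside the agreed alphabet.
    unexpected = sorted(set(cleaned) - DNA_ALPHABET)
    if unexpected:
        raise ValueError(f"unexpected characters: {unexpected}")
    return cleaned


print(normalise_dna("a c g T  tGa"))  # ACGTTGA
```

Once every sequence is written the same way, later steps such as searching or comparing sequences no longer need to guess at spacing or letter case.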

Next, you will get familiar with different data files

You will learn about other sequence file formats that enable more comprehensive records of accompanying data. These files are used to store additional information about the nucleotide or protein sequence of interest.

Finally, we will finish this week’s activities with a short quiz where you can put your new knowledge to the test. Do not forget to engage in the ‘discussion’ with fellow learners.