Welcome to week 2 of this course on Clinical Bioinformatics, Unlocking Genomics in Healthcare. This week, we'll be looking at the role of the clinical bioinformatician in more detail and seeing exactly what happens during the clinical bioinformatics workflow. The genome represents 3 billion bases of information. If we made each base just a millimetre long, the human genome would pretty much go around the earth. If we represented just the exome, it would stretch from Manchester to London. This is an awful lot of data and far too much to be managed at a human level. We need to find a way of taming this data and turning it into clinically actionable information.
A clinical bioinformatician's core role is to create and implement the IT infrastructure and analysis pipelines that will manage this data and filter it down to clinically useful information. So what are the stages? How do we go about taming the data? Well, the first stage is to organise the information, to pull together all the data we have from next-generation sequencing machines to assemble a genome or exome. We need to complete one very big jigsaw. The second task is to identify the variants. We compare the patient's genome with a reference genome to see where the differences are. This could be 20,000 differences.
It's as if we were trying to find one special pin among this whole set of pins scattered in the image. The third step is then to triage these variants, to try and find the one that's most likely to be associated with disease. Most of these 20,000 variants will be associated with the variability we'd expect to see within the population: common variants. But some of these variants will be rarer and potentially damaging. As bioinformaticians, we won't be able to identify the actual disease variant.
But we should be able to get it down to a small number of possibilities, filtering the data down from 20,000 variants to a few tens or twenties, a data set that is small enough to pass to a human expert for further analysis. What we're hoping to do this week, therefore, is to show you what the core role of a clinical bioinformatician is: how exactly we take the vast amount of raw data we get from the next-generation sequencing machines, how we clean it, assemble it, and then filter it down from a few hundreds of gigabytes to a small and well-annotated set of data that can be looked at by someone who is expert in the specific disease area.
Taking data from a computational to a human scale, we are pinpointing the right data to support clinical decisions for patient care.
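The second and third stages described in the video (identifying variants against a reference, then triaging them) can be illustrated with a deliberately tiny sketch. This is not how a clinical pipeline is implemented: real workflows align sequencing reads and use dedicated variant callers and curated population databases. The sequences, the `Variant` class, the function names, the frequency table and the 1% rarity threshold below are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    position: int  # 0-based position in the toy reference sequence
    ref: str       # base found in the reference
    alt: str       # base found in the patient


def call_variants(reference: str, patient: str) -> list[Variant]:
    """Report every position where two pre-aligned sequences differ."""
    if len(reference) != len(patient):
        raise ValueError("toy example assumes equal-length, pre-aligned sequences")
    return [
        Variant(i, r, p)
        for i, (r, p) in enumerate(zip(reference, patient))
        if r != p
    ]


def triage(
    variants: list[Variant],
    population_frequency: dict[tuple[int, str], float],
    max_frequency: float = 0.01,  # assumed cut-off for "rare"
) -> list[Variant]:
    """Keep variants that are rare, or absent, in the population table."""
    return [
        v
        for v in variants
        if population_frequency.get((v.position, v.alt), 0.0) <= max_frequency
    ]


# Made-up data: two differences, one of which is common in the population.
reference = "ACGTACGTAC"
patient = "ACCTACGAAC"
population_frequency = {(2, "C"): 0.35}  # hypothetical allele frequency

variants = call_variants(reference, patient)
candidates = triage(variants, population_frequency)

print(f"{len(variants)} variants found, {len(candidates)} left after triage")
for v in candidates:
    print(f"position {v.position}: {v.ref} -> {v.alt}")
```

The shape of the computation mirrors the talk: a large set of differences is reduced by discarding what is common in the population, leaving a short list for a human expert to review.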
Welcome to Week 2
In this short video, Andy Brass introduces the assembly and filtering pipelines constructed in the bioinformatics workflow and begins to outline what we need to do to ensure that the analyses meet the quality required for clinical use.
A key theme this week, which you will have seen in the videos with the three bioinformaticians in Week 1, is the scale of the data being dealt with. To give you an idea of the sheer size, please see the image of the printed encyclopedia version of the human genome below. It runs to 130 volumes, with 43,000 characters per page.
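As a rough sanity check on those figures, the short calculation below estimates how many pages each volume would need. It assumes one printed character per base, the 3 billion bases quoted in the video, and volumes of equal length; none of these details are stated for the printed edition itself, so treat the result as an order-of-magnitude estimate only.

```python
# Figures quoted in the course text
GENOME_BASES = 3_000_000_000   # "3 billion bases" from the video
VOLUMES = 130
CHARS_PER_PAGE = 43_000

# Assumption: one character per base, evenly spread across all volumes
total_pages = GENOME_BASES / CHARS_PER_PAGE
pages_per_volume = total_pages / VOLUMES

print(f"about {total_pages:,.0f} pages in total")
print(f"about {pages_per_volume:,.0f} pages per volume")
```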
This week you will learn to:
- Describe the role of the clinical bioinformatician in analysing genomic data
- List the five steps in the bioinformatician’s workflow
- Explain how data is taken from the computational level to the human level
- Describe the importance of accuracy and precision in genomic data analysis
© University of Manchester