What is data science? Part one

What is data science?
In this discussion, we’re going to ask the question, what is data science? We’re going to hear from our resident experts in the field about what data science is in the health and care sector. We’re going to explore recent developments. And we’re going to look at some great examples to showcase data science in action. So our first question is what is data science? John? So data science is the science of looking at data. Data is the primary object of study. And so it brings together a lot of people who are interested in data. Those who’ve been traditionally working with data for many years– such as statisticians.
Those for whom data is an interesting object to study, because we can get nuggets out of it through algorithms– computer scientists, such as myself. And people who are interested in the kind of insights that we can get from the data. The experts in the field. So in the field of health and social care, for example, the practitioners– those who actually care for the data, and collect the data, and are interested in what the data supports– are going to come together with people who implement algorithms, such as myself. And are going to come together with statisticians, who understand what inferences mean in the field. And we’re going to work together as a team.
That’s the most important thing about data science: it’s a team effort. And the same question to yourself, Marilyn. In your eyes, what is data science? So to me data science is about bringing people together. And to look at data, as John said, it is very important in a health and social care setting to actually start from the beginning. So for me, data science is actually quite fundamental. It’s seeing what data you actually have your hands on. So in health and social care for example, historically there’s lots of data collected routinely– so health care records, large databases filled with disease information. And the power hasn’t really been unlocked yet, as to what you can do with that data.
So we collect it for auditing purposes. But data science is about the methods that would unlock looking at the data in more insightful ways. So using analytical methods to draw out insights from that data. And if you do that well– using data science– then that can help you to make predictions, for example, about the outcomes or health of an individual or the population. Excellent. And that’s what we want to see. Positive outcomes. Thank you very much. Kerem, I was wondering as well, in your opinion, what do you think data science is? From my perspective, we have been collecting a lot of data since the start of human civilization.
Trade data, what people do, how people move from one place to another, et cetera. But when we look at, say, the last 10 or 20 years, the amount of data being collected has just grown immensely. And what we need now is not to use the traditional tools we have been using for hundreds of years, but actually to come up with tools that can deal with this massive amount of data, and actually help us to come up with something sensible. What does this data tell us? What’s the story behind it? I’m interested to build on your last point there about the recent developments that have brought data science more to the fore.
I was wondering, what do you believe has changed in the last five years, that we’re seeing such a prevalence of data science being undertaken? And the outcomes that are being generated from it. So for me, as a computer scientist, there’s two things. One is the collected data. There’s a huge amount of collected data out there now, but it’s collected by people and also collected by devices. Every time you move around, your mobile phone collects information on where you are and what you’re doing. The other thing that’s happened in recent times is that the algorithms that we’re using– which traditionally have been called artificial intelligence, or machine learning, or data mining– those algorithms have really come to the fore.
So the power of the computer can now be brought to bear upon the data to find those nuggets of information, those insights that the data actually supports, which you wouldn’t find using traditional means. Have you got anything to add to that, Marilyn, in terms of recent developments in the last five years or so that have made a positive difference? So one of the things that’s changed over the last four or five years in terms of machine learning and the power of data analytics is the computing power– so that’s increasing all the time. So traditionally we would maybe analyse our data on a standalone computer with a few bits of software or a couple of programmes.
But now we can do that across multiple computers, across multiple servers, all across the world. And we can actually start to do that with real time data as well. Which means that we can be looking at different data sets, and linking those different data sets. That’s really key. Because that allows you to look at complex patterns that you might not have otherwise found within the data, just looking at it individually with standalone methods. OK. Thank you very much. And the same question to yourself, Kerem? I think what I’ll add to this is obviously there’s a lot of algorithmic development in computer science, as well as statistical tools. But also I think the accessibility has increased a lot.
Now anyone in the world can easily create some infographics to understand what the data tells us. So I think that’s also one of the big gains we’ve had over the last five, ten years. Great. Really interesting to hear about the developments over the last few years. I was wondering if there’s a specific example that you have in your head that illustrates this kind of development and the opportunity that data science brings?
So a good example from the world of artificial intelligence is image recognition. Where the computer is given a picture of something, and it has to recognise that picture. The way that it works is that we use a training set where the computer is given both the picture and what it’s a picture of. So if it’s an elephant, it’ll say elephant. If it’s a llama, it’ll say llama. And we train the computer using all these data– so we might use thousands, tens of thousands of pictures, all labelled with what they are. And we ask it to learn a model– as we call it, a predictive model.
And what the computer can then do, once it’s learned that predictive model, is that you can give it a picture it’s never seen before and it will tell you what it is. Recent developments in machine learning, such as deep learning neural networks, can do this fantastically well. And so the power of machine learning can now be brought to bear on problems such as this, and lots of others. Marilyn, I was also wondering if you had an example where you see data science being played out very well? So because I work in the area of digital health and social care, I will use an example from health care.
So there are several companies just now getting together to work on large problems such as– let’s take the example of image analysis. So currently– if you have a brain scan, for example– you can use lots of data to look at that, and have lots of experts look at it. If you add the power of machine learning, the machine learning algorithms can actually detect patterns or anomalies that could help consultants to diagnose whether you have a brain tumour for example. The power that that unlocks is that that could avoid unnecessary invasive procedures later on.
So if the algorithm can reliably tell you, within a certain degree of confidence, that you do or do not need the procedure, that’s much more useful for the patient’s welfare, and also for the GP or the consultant. And finally, Kerem, to yourself. Do you have an example where data science has provided some great outcomes? Absolutely. When we think about it– for example, when we want to go somewhere, the first thing we do is go online and then check the maps, and then obviously find the shortest path to get there, right?
But now what we are seeing in the last few years– and it’s getting better and better– is that in GPS devices used in cars, you have real time data coming from traffic that actually optimises your path– which way you should go. Maybe you should make a right turn, because the road ahead of you is just stuck in a traffic jam. So that type of optimization algorithm has actually increased the efficiency of solving these problems in real time over the last five, 10 years.
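
The routing example that closes the discussion can be illustrated with a small sketch. This is not how any particular GPS device works. It is a minimal illustration, assuming a made-up road network with travel times in minutes, of how a shortest-path search (here Dijkstra's algorithm, chosen for illustration) picks a different route once a live traffic delay is added. The place names, travel times and the `shortest_route` function are all hypothetical.

```python
import heapq


def shortest_route(graph, start, goal):
    """Dijkstra's algorithm: return (total_minutes, path) from start to goal."""
    queue = [(0, start, [start])]  # entries are (minutes so far, node, path taken)
    visited = set()
    while queue:
        minutes, node, path = heapq.heappop(queue)
        if node == goal:
            return minutes, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, cost in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (minutes + cost, neighbour, path + [neighbour]))
    return float("inf"), []  # no route found


# Hypothetical road network: travel times in minutes with no traffic.
roads = {
    "home": {"high_street": 5, "ring_road": 8},
    "high_street": {"hospital": 4},
    "ring_road": {"hospital": 6},
    "hospital": {},
}

free_flow = shortest_route(roads, "home", "hospital")

# Assumed live traffic report: a jam adds 10 minutes to the high street leg.
roads["high_street"]["hospital"] += 10
with_traffic = shortest_route(roads, "home", "hospital")

print(free_flow)     # route via high_street, 9 minutes
print(with_traffic)  # route via ring_road, 14 minutes
```

The only thing that changes between the two calls is one travel time, which is the point of the example: the algorithm stays the same and the real time data changes the answer.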

In this discussion, Steve is joined once again by Dr John Levine, Dr Marilyn Lennon and Dr Kerem Akartunali to discuss the data science process.

The data science process can be simplified into 6 steps and is an iterative process. The 6 steps are:

  1. Collect raw data
  2. Process the data
  3. Clean the data
  4. Perform exploratory data analysis
  5. Apply algorithms and models
  6. Communicate/Visualise/Report
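
The six steps above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's own method: the readings are made-up values in an assumed "age,systolic_bp" text format, the function names are hypothetical, and the "model" is a simple least-squares straight line chosen only to keep the example short.

```python
import statistics

# Step 1: raw data is collected (hypothetical, made-up readings: "age,systolic_bp").
raw = ["34,118", "51,130", "47,", "62,141", "n/a,125", "29,115", "70,150"]


def process(lines):
    """Step 2: split each raw line into its two text fields."""
    return [line.split(",") for line in lines]


def clean(rows):
    """Step 3: keep only rows where both fields are whole numbers."""
    return [(int(a), int(b)) for a, b in rows if a.isdigit() and b.isdigit()]


def explore(records):
    """Step 4: exploratory analysis, here just a count and two means."""
    ages = [a for a, _ in records]
    bps = [b for _, b in records]
    return {"n": len(records),
            "mean_age": statistics.mean(ages),
            "mean_bp": statistics.mean(bps)}


def fit_line(records):
    """Step 5: least-squares fit of bp = slope * age + intercept."""
    mean_a = statistics.mean(a for a, _ in records)
    mean_b = statistics.mean(b for _, b in records)
    slope = (sum((a - mean_a) * (b - mean_b) for a, b in records)
             / sum((a - mean_a) ** 2 for a, _ in records))
    return slope, mean_b - slope * mean_a


def report(summary, slope, intercept):
    """Step 6: communicate the result as a short sentence."""
    return (f"{summary['n']} usable records; fitted line: "
            f"bp = {slope:.2f} * age + {intercept:.1f}")


records = clean(process(raw))
summary = explore(records)
slope, intercept = fit_line(records)
print(report(summary, slope, intercept))
```

Because the process is iterative, a real project would loop back: if the exploratory step showed that too many rows were dropped in cleaning, for example, the collection or processing step would be revisited before any model was trusted.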
This article is from the free online course The Power of Data in Health and Social Care.
