Exploring and modelling data
In this presentation, we’re going to look at what data analytics is. In particular, we’re going to look at two important sets of techniques: machine learning algorithms, which show the kind of insights that we can generate automatically from the data, and data visualisation techniques, which are used for visualising both the raw data and the insights that are generated by the algorithms. Data analytics is the area of data science which is concerned with automatically discovering insights from the data. What we do is use machine learning algorithms to find patterns in the data which aren’t obvious.
A good example of this would be where you’ve got a large data set concerning subjects and you’ve got a risk factor associated with them. And you want to somehow find out how that risk factor is correlated with the rest of the data. Now, it might not be an obvious correlation. It might not be that you can pick out one particular element of the data to find that risk factor. It might be a subtle combination of elements. And that’s what machine learning algorithms can do for you. What we do in general is that we present the machine learning algorithm with a collection of data and ask it to find patterns in that data.
There are three particular types of pattern that we’re interested in. The first one is called regression. This is where one variable, a numeric value, depends on lots of other variables in the data. So what we’re trying to learn is: what is this numeric value going to be for a data point that I haven’t seen before? There are many different ways in which we can do that regression. It may be that the relationship between the variable that we’re concerned with and all the other variables is quite a complex one, or it could be a very simple one, such as a straight line.
The second type of machine learning algorithm is concerned with classification. This is where there is a distinct value in the data, what we call a label. It could be something as simple as “this person is at risk” or “this person is not at risk”. So it’s one of a number of values, and what we’re trying to do is to predict that value from the other data. We start with a collection of labelled data and we learn a model, which allows us to infer that label for unseen data.
The last method is called clustering. This is where we have a large collection of data, but we don’t have any labels on the data. What we’d like to do is try to find some labels by looking at where the data are tightly clustered. So it could be that we have a particular cluster in our data and we want to find that cluster and maybe give it a label.
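To make the three types of pattern concrete, here is a minimal sketch in Python using only the standard library. It is not taken from the presentation: the function names, the tiny data sets and the choice of methods (a least-squares straight line for regression, a nearest-centroid rule for classification, and a basic k-means loop for clustering) are all assumptions made for illustration, and real health data would need far more care.

```python
from statistics import mean

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(mean(coords) for coords in zip(*points))

# Regression: predict a numeric value, here with a simple straight line.
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar

# Classification: learn from labelled data, then infer labels for unseen data.
def train_nearest_centroid(points, labels):
    """Return one centroid per label (a very simple stand-in for a model)."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {label: centroid(members) for label, members in groups.items()}

def predict_label(model, point):
    """Pick the label whose centroid is closest to the point."""
    return min(model, key=lambda label: sq_dist(model[label], point))

# Clustering: no labels, so look for groups of points that sit close together.
def k_means(points, k, iterations=10):
    """Basic k-means, seeded with the first k points so runs are repeatable."""
    centres = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assign every point to its nearest current centre.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: sq_dist(centres[i], point))
            clusters[nearest].append(point)
        # Move each centre to the mean of its cluster (keep it if empty).
        centres = [centroid(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres, clusters

# Invented example data (hypothetical measurements, not real subjects).
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted_value = slope * 5 + intercept  # numeric value for an unseen x = 5

labelled_points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
labels = ["not at risk", "not at risk", "at risk", "at risk"]
model = train_nearest_centroid(labelled_points, labels)
unseen_label = predict_label(model, (8.5, 9.0))

unlabelled = [(1.0, 1.0), (1.5, 2.0), (1.2, 0.8),
              (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centres, clusters = k_means(unlabelled, k=2)
```

In this sketch the regression returns a number, the classifier returns one of the labels it was trained on, and the clustering returns groups with no names attached; giving a cluster a meaningful label is still left to the analyst.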

In this presentation, Dr John Levine will provide his thoughts on exploring and modelling data in the health and care sector.

This article is from the free online

The Power of Data in Health and Social Care

Created by
FutureLearn - Learning For Life