Data challenges

Working with large datasets often comes with many challenges including storage and processing issues. Watch Dr Jon Blower explain more.
One of the most common challenges we find when working with data is simply identifying and getting hold of the right data to answer the particular question at hand. We can be quite creative in finding other data sources that contain information to help us solve whatever the problem might be. To pick one example, we recently worked with Highways England on the problem of identifying fog patches on motorway networks, which is a very difficult problem for various reasons. Unfortunately, we don’t have direct measurements of fog from sensors that we can trust and that are of sufficiently good quality. But we were able to look at traffic patterns, the movements and speeds of traffic in the different lanes of the motorway, and use them as a kind of proxy measurement that might indicate the presence of fog.
Understanding the quality of the data, by which I really mean the fitness for purpose of the data to address a particular problem, is extremely important. For example, shadows in satellite images might affect the quality of the image. We might be looking at gaps in sensor records caused by weather conditions, by interruptions in internet connectivity, or by something like that. And we really need to understand all those particular nuances of the data in order to be able to use them effectively.
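
As one illustration of checking fitness for purpose, gaps in a sensor record can be found by comparing the spacing between consecutive timestamps with the interval at which the sensor is expected to report. The sketch below is a minimal example and not the method used in the projects described in the video: the function name `find_gaps`, the 10-minute reporting interval and the sample readings are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval, tolerance=timedelta(0)):
    """Return (start, end) pairs where consecutive readings are further
    apart than expected_interval plus tolerance."""
    ordered = sorted(timestamps)
    gaps = []
    # Compare each reading with the one that follows it.
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > expected_interval + tolerance:
            gaps.append((earlier, later))
    return gaps


# Hypothetical readings, nominally one every 10 minutes, with one outage.
readings = [
    datetime(2020, 1, 1, 9, 0),
    datetime(2020, 1, 1, 9, 10),
    datetime(2020, 1, 1, 9, 20),
    datetime(2020, 1, 1, 10, 0),
    datetime(2020, 1, 1, 10, 10),
]

gaps = find_gaps(readings, timedelta(minutes=10))
for start, end in gaps:
    print(f"No readings between {start:%H:%M} and {end:%H:%M}")
```

A check like this only reveals that data are missing. Deciding whether a gap matters for a particular question still depends on someone who understands the sensor and the problem.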
In many of our projects, we need to combine data from lots of different sources. One of the key challenges is simply that data collected by different organisations may be registered differently. For example, they may use different names for the same place. They may use different ways of identifying a position on the Earth’s surface. Or there may be even more difficult challenges than that. So we need to really understand the nature of the data in order to be able to combine them successfully.
When getting to grips with the complexities of data from different sources, one of the most important techniques that we use is visualisation: creating a picture of the data that we can understand as humans. That gives us a lot of insight into the nature of the data, what it’s good for and what it’s not so good for, and it also helps us communicate the results of our work to our customers and to the wider world.
We have a lot of technical solutions and computing solutions for working with big data in all its various forms, but what’s really important to remember is that we need experts, human experts, domain experts, who understand the nature of the data and how it can be applied in any given situation.
As you’ve worked through this course and watched industry experts discuss their big data projects, you’ve heard about many of the challenges when working with big data. Watch Dr Jon Blower summarise these and highlight some of the possible solutions.
This article is from the free online course Big Data and the Environment.
