This content is taken from the University of Reading & Institute for Environmental Analytics's online course, Big Data and the Environment. Join the course to learn more.

Hello, there. My name’s Richard Lamb. In this exercise, we’re giving you the opportunity to handle some big data. We’re going to be looking at power usage in households across London using an open dataset that was compiled by the Greater London Authority and the UK Power Networks Association. So what we’d like you to do is download the data, look at it, explore it, visualise it, and think about how you might use the data and process it, to help contribute to the research programme for which it was originally created. By the end of this session, I hope you will have explored topics such as data quality, visualisation, and data handling.

Power usage in London

In this section you’ll look at a dataset and consider how it could be used to solve a real-world challenge. Watch Richard Lamb of Innovate UK explain the exercise in more detail. You will be able to download the dataset in the next Step.

London is a leading global city with a history stretching back over 2000 years. Whilst many people think of London as a centre of tourist attractions such as the London Eye, Buckingham Palace and Tower Bridge, Greater London is home to over 8.5 million people and covers an area of over 600 square miles. Like residents of every major city, Londoners consume large amounts of electrical power each day for heating, lighting, charging mobile phones, watching TV and, more recently, charging a growing number of electric bikes and cars.

Generating this power can use large amounts of fossil fuels, but the UK has set an ambitious target to reduce its greenhouse gas emissions by 80% by 2050, relative to 1990 levels, and so as a nation we have started to transition to more sustainable types of power generation such as wind and solar. Indeed, one of the world’s largest offshore wind farms, the London Array, which is located in the Thames estuary, generates enough power for nearly half a million homes. There’s still a long way to go, but to make this transition work, we need to change the way in which power is generated, distributed and consumed.

In a project run by the Greater London Authority and UK Power Networks Association, smart meters were provided to over five and a half thousand households across Greater London to monitor electricity consumption. Readings were taken at half-hourly intervals for over two years and compiled into a dataset which is available for you to download and explore. The dataset contains energy consumption in kilowatt-hours (kWh) per half hour, a unique household identifier, the date and time, and an ACORN group which helps to describe the affluence of the households in the survey.

This is a large dataset, with over 167 million rows and over 11 GB in size. Whilst you may find yourself working with much larger datasets, these files are typical of the type of information that a data analyst might use.
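A file of this size may not fit comfortably in memory on a typical laptop, so one common approach is to read it one row at a time and keep only a running summary. The sketch below illustrates the idea with Python's standard library. The column names (`household_id`, `timestamp`, `kwh`, `acorn_group`) and the sample rows are assumptions made up for illustration; check the headers of the real files before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; the real file's headers may differ.
ID_COL = "household_id"
KWH_COL = "kwh"


def total_kwh_per_household(lines):
    """Sum consumption per household while streaming rows one at a time."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        try:
            totals[row[ID_COL]] += float(row[KWH_COL])
        except (ValueError, TypeError):
            # Skip rows whose reading is missing or not numeric.
            continue
    return dict(totals)


# Tiny made-up sample standing in for the real 11 GB download.
sample = io.StringIO(
    "household_id,timestamp,kwh,acorn_group\n"
    "H001,2013-01-01 00:00,0.25,GroupA\n"
    "H001,2013-01-01 00:30,0.50,GroupA\n"
    "H002,2013-01-01 00:00,0.125,GroupB\n"
    "H002,2013-01-01 00:30,Null,GroupB\n"
)
totals = total_kwh_per_household(sample)
print(totals)
```

To try this on a downloaded file, pass an open file object instead of the sample, for example `open(path, newline="")`, where `path` is wherever you saved the data. Only the per-household totals are held in memory, so the approach scales with the number of households rather than the number of rows.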

When working with large datasets, there are many considerations. The quality of the data is likely to be one of your first concerns, quickly followed by which tools are available to bring clarity and generate the necessary insight.
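As a starting point for thinking about data quality, the following sketch counts a few kinds of problem you might look for in half-hourly meter readings: repeated readings, non-numeric or negative values, and timestamps that are malformed or do not fall on a half-hour boundary. The column names, timestamp format and sample rows are hypothetical, and the checks are examples rather than a complete list for this dataset.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def quality_report(lines, ts_format="%Y-%m-%d %H:%M"):
    """Count simple quality issues in a stream of meter readings."""
    issues = Counter()
    seen = set()  # (household, timestamp) pairs already encountered
    for row in csv.DictReader(lines):
        issues["rows"] += 1

        # The same household reporting twice for the same time slot.
        key = (row["household_id"], row["timestamp"])
        if key in seen:
            issues["duplicate_reading"] += 1
        seen.add(key)

        # Readings should be numeric and not negative.
        try:
            if float(row["kwh"]) < 0:
                issues["negative_kwh"] += 1
        except ValueError:
            issues["non_numeric_kwh"] += 1

        # Timestamps should parse and sit on a half-hour boundary.
        try:
            ts = datetime.strptime(row["timestamp"], ts_format)
            if ts.minute not in (0, 30):
                issues["off_grid_timestamp"] += 1
        except ValueError:
            issues["bad_timestamp"] += 1
    return dict(issues)


# Made-up sample rows, each chosen to trigger one kind of issue.
sample = io.StringIO(
    "household_id,timestamp,kwh,acorn_group\n"
    "H001,2013-01-01 00:00,0.25,GroupA\n"
    "H001,2013-01-01 00:00,0.25,GroupA\n"
    "H001,2013-01-01 00:30,Null,GroupA\n"
    "H002,2013-01-01 00:15,0.1,GroupB\n"
    "H002,not-a-date,-1.0,GroupB\n"
)
report = quality_report(sample)
print(report)
```

Note that the duplicate check remembers every (household, timestamp) pair it has seen, which is fine for a sample but would need a lot of memory across all 167 million rows. For the full dataset you might run it on one file or one household at a time.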

In the next Step you will download the data. Don’t forget to mark this one as complete before you move on.
