Working with Data

In this step, you will undertake a simple task to de-personalise (anonymise), analyse and model some data.

Download the dataset called “MOOC anonymisation” from the USMART Playground, save it to your desktop, and complete the following tasks (about 15 minutes):

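  • De-personalise/anonymise the dataset by masking the variable ‘name’ and k-anonymising any fields involving the variable ‘age’ (that is, generalising the values so that each record is indistinguishable from at least k − 1 others).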
  • Do you think anything needs to be done with outliers (data that does not seem to fit the general pattern of the dataset)?
  • Analyse the data to look for a correlation (relationship) between the variables ‘salary’ and ‘age’.
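
If you prefer to work in code rather than a spreadsheet, the three tasks above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the sample rows, the helper names (mask_name, age_band, smallest_group, iqr_outliers, pearson), the 10-year age bands and the 1.5 × IQR outlier rule are all hypothetical choices, not part of the course material, and the real dataset may have different columns.

```python
import hashlib
from collections import Counter
from math import sqrt
from statistics import mean, quantiles

# Hypothetical rows for illustration only; the real dataset will differ.
records = [
    {"name": "Alice Smith", "age": 23, "salary": 21000},
    {"name": "Bilal Khan", "age": 27, "salary": 24000},
    {"name": "Chloe Brown", "age": 25, "salary": 23000},
    {"name": "David Jones", "age": 34, "salary": 30000},
    {"name": "Eilidh Ross", "age": 38, "salary": 33000},
    {"name": "Farah Ali", "age": 31, "salary": 29000},
    {"name": "Gordon Reid", "age": 45, "salary": 40000},
    {"name": "Hannah Lee", "age": 42, "salary": 38000},
    {"name": "Iain Murray", "age": 48, "salary": 250000},
    {"name": "Jade Wilson", "age": 29, "salary": 26000},
]

def mask_name(name, salt="demo-salt"):
    """Mask a name by replacing it with a short salted hash (one possible approach)."""
    return hashlib.sha256((salt + name).encode("utf-8")).hexdigest()[:8]

def age_band(age, width=10):
    """Generalise an exact age into a band such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def smallest_group(rows, key):
    """Size of the smallest group sharing a value of `key` (the k in k-anonymity)."""
    return min(Counter(row[key] for row in rows).values())

def iqr_outliers(values):
    """Values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - spread or v > q3 + spread]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Task 1: mask names and generalise ages into bands.
anonymised = [
    {"name": mask_name(r["name"]), "age_band": age_band(r["age"]), "salary": r["salary"]}
    for r in records
]
k = smallest_group(anonymised, "age_band")
print("Smallest age-band group (k):", k)

# Task 2: flag possible outliers in salary.
outliers = iqr_outliers([r["salary"] for r in records])
print("Possible salary outliers:", outliers)

# Task 3: correlation between age and salary, with and without the outliers.
kept = [r for r in records if r["salary"] not in outliers]
r_all = pearson([r["age"] for r in records], [r["salary"] for r in records])
r_kept = pearson([r["age"] for r in kept], [r["salary"] for r in kept])
print("Correlation (all rows):", round(r_all, 3))
print("Correlation (outliers removed):", round(r_kept, 3))
```

The correlation here is computed on exact ages before banding, because generalising ages into bands loses precision. Comparing the two correlation values is one way to judge how much an outlier influences the result, which may help you answer the second task.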

Additional Tasks

You may also now wish to go to the data.gov.uk website and search for datasets with ‘time’ and ‘date’ fields.

Look at how the ‘time’ and ‘date’ values are recorded in each dataset.

Discuss your findings in the comments section below.

  • How was each recorded?
  • Would this make analysis easier or more difficult?
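
To see why the recording format matters for analysis, here is a minimal Python sketch that tries to read date and time values written in several different styles and convert them to one common form. The list of candidate formats and the function name parse_timestamp are assumptions made for illustration; datasets on data.gov.uk may use other formats entirely. Note that a value such as 03/04/2021 is ambiguous, and this sketch assumes the day comes first (UK style).

```python
from datetime import datetime

# Hypothetical list of formats to try, in order; extend it for the data you find.
CANDIDATE_FORMATS = [
    "%Y-%m-%dT%H:%M:%S",  # ISO 8601 date and time, e.g. 2021-04-03T09:15:00
    "%Y-%m-%d",           # ISO 8601 date only, e.g. 2021-04-03
    "%d/%m/%Y %H:%M",     # day-first date with time, e.g. 03/04/2021 14:30
    "%d/%m/%Y",           # day-first date only, e.g. 03/04/2021
]

def parse_timestamp(text):
    """Return a datetime for the first matching format, or None if none match."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None

# Hypothetical sample values in mixed styles.
samples = ["2021-04-03T09:15:00", "2021-04-03", "03/04/2021 14:30", "3rd April 2021"]
for value in samples:
    parsed = parse_timestamp(value)
    print(value, "->", parsed.isoformat() if parsed else "unrecognised")
```

Values that come back as unrecognised would need manual cleaning before analysis, which is one reason a single consistent format makes a dataset easier to work with.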

This article is from the free online course:

The Power of Data in Health and Social Care

University of Strathclyde