This content is taken from the University of Leeds & Institute of Coding's online course, Evidence and Data Collection for Problem Solving. Join the course to learn more.

How can data collection go bad?

In the last step we heard our expert Alexandra explain how Aisha’s data is missing important information about marketing consent from her customers. This means she can’t legally use their data in her marketing campaign.

Restrictions like this are designed to protect people’s personal data and stop companies using it for purposes other than those originally intended. Data is a very powerful tool, which means it must be collected and used only in ways the law allows. For example, data protection law requires you to store personal data securely, so keeping it in an unencrypted file is likely to be a breach of the law.

Likewise, if you transfer data to another company, even one within the same group, you must have explicit permission from users, and you must ensure that they can revoke that consent at any time. You shouldn’t pass a data file to another company and just hope they handle it correctly.

The law around data protection makes it harder for companies to sell your data to others or use it for purposes you haven’t consented to. However, if you have given broad consent, for example to receive marketing messages, your data can be used in ways which might surprise you.
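To make the consent point concrete, here is a minimal sketch in Python of how a business like Aisha’s might check consent before building a mailing list. The `Customer` record, its field names and the example addresses are all hypothetical, invented for illustration; they are not taken from Aisha’s actual data. The sketch assumes that a missing consent value must be treated the same as a refusal.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Customer:
    # Hypothetical record layout, for illustration only.
    email: str
    # True = consent given, False = refused or revoked, None = never recorded.
    marketing_consent: Optional[bool]


def marketable(customers: List[Customer]) -> List[Customer]:
    """Keep only customers with an explicit, recorded 'yes' to marketing."""
    return [c for c in customers if c.marketing_consent is True]


# Made-up example data.
customers = [
    Customer("a@example.com", True),
    Customer("b@example.com", False),
    Customer("c@example.com", None),  # consent was never collected
]

for customer in marketable(customers):
    print(customer.email)
```

Only the first customer is kept. The third is excluded because no consent was ever recorded, which is the situation Aisha’s data is in.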

Targeting customers

Most companies want to collect data from individuals and combine it with other data to learn more about them. Companies use that data to promote services or products: they make money from additional sales, and customers get special offers that are relevant to them. For example, American retailer Target used information about people’s gender and their buying habits to predict whether or not they were likely to be pregnant and, based solely on what they were buying, what stage of pregnancy they had reached. This allowed the company to send customers coupons targeted to exactly their stage of pregnancy.

Why is this so powerful? Because the customers had not explicitly told Target they were pregnant: the company inferred it from their purchases. Worse still, by sending pregnancy-related coupons, Target could inadvertently reveal that private and sensitive information to anyone who saw them.

This isn’t illegal. As long as a company has your consent to use your data for marketing purposes, it can make inferences like this based on all it knows about you. But most people would, if told, find this kind of use a little disturbing.

Data bias

More disturbing still is the potential for even legal data collection to deliver misleading results. When machine learning is applied to combined datasets, it can be a valuable tool for predicting behaviour. For example, in the UK, West Midlands Police use a system called the National Data Analytics Solution (NDAS), which combines data from police records, social services, the NHS and schools to work out the most effective places to deploy police.

However, this demonstrates another potentially poor use of data. NDAS has been criticised because one of its datasets records the number of “stop and searches” in an area. Stop and search is a tactic in which police stop and search an individual they suspect of a crime. It has been criticised as potentially racist, given the high number of stop and searches carried out in specific minority ethnic communities. If the data you collect has a bias like this built in, then combining it with more data can make those biases worse.

Data security

Aisha is unlikely to need to use big data in her business, or to see the financial benefits of selling her customers’ information. But even her data has issues.

Remember, the data in Aisha’s case is purely for illustration. The customers are not real people, and they are not real emails, passwords or credit card details.

However, the passwords in her customer data are stored in ‘plaintext’ – they are not encrypted, and are readable by a human. As people often reuse passwords, anyone with access to that file could combine those passwords and the email addresses to access that person’s accounts elsewhere.
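One common alternative to keeping passwords readable is to store only a salted hash of each one. Note that this is hashing rather than the encryption mentioned above: a hash cannot be turned back into the original password. The sketch below uses Python’s standard library. The function names and the iteration count are assumptions chosen for illustration, and this is not a description of how Aisha’s ordering system works or should be built.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor for this sketch; check current security guidance
# before choosing a value for a real system.
ITERATIONS = 200_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) to store in place of the password itself."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare the results."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse")
print(verify_password("correct horse", salt, digest))
print(verify_password("wrong guess", salt, digest))
```

With this approach, someone who opens the customer file sees only salts and digests, not passwords they could try on other websites. Because each password gets its own random salt, two customers who chose the same password still end up with different stored values.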

Storing passwords in plaintext might also indicate a security flaw in her online ordering system that could be exploited by hackers. Her system is also storing card payment information and security codes: anyone with access to that spreadsheet could potentially commit financial fraud.

Have your say:

  • Do you think it’s wrong to combine data from different sources in order to advertise to people, or sell more products?
  • Are there other situations you know of where data has been used in a way that might be considered unethical?

Share your thoughts in the Comments.
