
The ethics of data science

What principles should you follow in order to be an ethical data scientist? We discuss ethical considerations in this article.

Ethics plays an important role in data science

We often deploy algorithms to make predictions about the way things are, and we can even retrieve the full address of a Twitter user from two innocent-looking numbers.

Great work, but perhaps it is time to ask whether treating data as a commodity, the new oil, is really a good thing, given how easy it is to link that data to individuals.

This is where ethics starts to play an important role in data science, shifting the focus from algorithms and data to social responsibility. It is time we considered some of the ethical issues surrounding all of this wealth of data oil.

The digital trace we leave

First and foremost, when gathering data, we must remember that there are organisations, places, and people behind the data. At the very beginning of the course, we discussed how data is perceived as the new oil. Every move we make, every action we take leaves a digital trace.

These traces are often considered a commodity on which algorithms can make inferences about who we are, what we do, and what we prefer.

You may have experienced an extreme case of this recently. Have you ever received a notification promoting an item you had mentioned in conversation with a friend or partner? Did you opt into having an algorithm eavesdrop on your private conversation?

The convenience of the algorithm

More likely than not, you accepted the terms and conditions when they popped up so that you could benefit from the convenience of a new algorithm that would recommend a small subset of millions of possible services, choices, or products that you might not have found the time to research yourself.

The people behind the algorithms

We are living in an age of information overload, and we need these algorithms to simplify things for us while we live our lives.

Behind these algorithms are data scientists and programmers who make choices about the best algorithm to fit the application and about the data used to build or train the model.

If you are considering gathering data and applying some of the algorithms presented in this course, you might consider whether what you are doing is likely to impinge on the rights of an individual to remain anonymous.
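
One way to reduce the risk to anonymity before analysis is to remove direct identifiers and replace user IDs with keyed hashes. The sketch below is a minimal illustration, not a complete anonymisation method: the field names, the key, and the helper functions are all hypothetical, and pseudonymised records can still be re-identified if the remaining fields are distinctive enough.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be kept separate from the data.
SECRET_KEY = b"replace-with-a-secret-key"

# Hypothetical field names, chosen only for this illustration.
DIRECT_IDENTIFIERS = {"name", "email", "address"}


def pseudonymise_id(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 hash of user_id as a hex string.

    The same ID and key always give the same token, so records from
    one person can still be grouped without storing the raw ID.
    """
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_record(record: dict) -> dict:
    """Drop direct identifiers and swap the user ID for a pseudonym."""
    cleaned = {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIERS and field != "user_id"
    }
    cleaned["user_token"] = pseudonymise_id(record["user_id"])
    return cleaned


if __name__ == "__main__":
    raw = {"user_id": "u123", "name": "Alex", "email": "alex@example.com", "purchase": "book"}
    print(strip_record(raw))
```

Using a keyed hash rather than a plain hash means that someone without the key cannot simply hash a list of known IDs to reverse the tokens.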

The General Data Protection Regulation (GDPR)

One recent change to data protection law is the General Data Protection Regulation (GDPR), which requires clear disclosure of any data collection, the intended purpose for processing the data, and how long that data will be stored.

This requires websites and mobile application developers to make it very clear how they will use our internet traffic, in the form of cookies, to make inferences about what we prefer.
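
For a data scientist, the three disclosures mentioned above (what is collected, for what purpose, and for how long) can be kept alongside the data itself. The following is a rough sketch under that assumption; the class, its fields, and the helper function are hypothetical names for illustration, and the code is not a statement of what the GDPR legally requires.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CollectedItem:
    """A hypothetical record of one piece of collected data."""
    description: str      # what was collected
    purpose: str          # why it is being processed
    collected_at: datetime
    retention: timedelta  # how long it may be stored

    def expires_at(self) -> datetime:
        """Return the moment at which the retention period ends."""
        return self.collected_at + self.retention


def items_to_delete(items, now):
    """Return the items whose retention period has ended at time `now`."""
    return [item for item in items if now >= item.expires_at()]


if __name__ == "__main__":
    cookie = CollectedItem(
        description="browsing cookie",
        purpose="product recommendations",
        collected_at=datetime(2024, 1, 1),
        retention=timedelta(days=7),
    )
    for item in items_to_delete([cookie], now=datetime(2024, 1, 10)):
        print(f"Delete '{item.description}' (purpose: {item.purpose})")
```

Recording the purpose and retention period at collection time makes it easier to check later whether a given use of the data matches what was originally disclosed.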

We neglect terms and conditions

Often we neglect to read the terms and conditions before agreeing, which means a large proportion of the population are unaware that their data provides so much insight into their purchasing behaviour, political opinion, sexual orientation, and medical status.

In addition, not everyone knows that there is an API for that, which gives anyone with the right skills access to that data for their own purposes.

© Coventry University. CC BY-NC 4.0
This article is from the free online course Applied Data Science.

Created by
FutureLearn - Learning For Life
