Ethics in data science
© Coventry University. CC BY-NC 4.0

As data scientists, we design algorithms or experiments and use them to derive results. But how do these results impact society and our surrounding environment?

Experiments are a particular class of algorithms that involve processing data.

Designing algorithms enables us to build a system that can be re-used in different contexts, including contexts completely outside our control.

To some extent, this is what we want to happen: it enables a high degree of scalability, because income and business success do not rely on one person conducting the analysis, which can be picked up by anyone. On the other hand, it also means we lack control over how our algorithms are used and how their outputs are interpreted.

We therefore need to anticipate how our algorithms could be used and how their outputs could be interpreted. We have previously seen examples of interpreting results in Step 2.13: Sense checking sensational statistics.

In designing our algorithms and experiments, we must take into account:

  • Any applicable laws and regulations. For example, the General Data Protection Regulation (GDPR) or California Consumer Privacy Act (CCPA).

  • Privacy and anonymity of individuals. We need to be aware that data science can be used to link different datasets, which can put anonymity under threat. Privacy is also a concern when the consumers of a dataset change.

  • Ethical use of data. Not all data that is available can or should be used. We should follow guidelines published by professional bodies, such as the ACM Code of Ethics and Professional Conduct and the BCS Code of Conduct.

  • Validity of data and absence of bias. We need to make sure the data we hold accurately reflects the facts and is representative. Equally, the further processing of data must maintain representativeness and not introduce any bias.

  • Interpretation of results. Statistical models may be used in a predictive manner, and different audiences may draw different conclusions from them. Such models provide no guarantee that other, unpredicted events will not occur. The Black Swan theory explains the potential impact of unexpected events.
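
To make the point about linking datasets more concrete, here is a minimal sketch in Python. It assumes two small, entirely invented datasets: a set of "anonymised" health records with names removed, and a public register that still carries names. The field names, the choice of postcode, birth year and sex as linking attributes, and the `reidentify` helper are all illustrative assumptions, not a method prescribed by this course or by any regulation.

```python
from collections import Counter

# Hypothetical "anonymised" records: names removed, other attributes kept.
health_records = [
    {"postcode": "CV1 5FB", "birth_year": 1984, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "CV1 5FB", "birth_year": 1990, "sex": "M", "diagnosis": "diabetes"},
    {"postcode": "CV3 2AA", "birth_year": 1984, "sex": "F", "diagnosis": "migraine"},
    {"postcode": "CV3 2AA", "birth_year": 1984, "sex": "F", "diagnosis": "eczema"},
]

# Hypothetical public register that includes names (all invented).
public_register = [
    {"name": "Alice Example", "postcode": "CV1 5FB", "birth_year": 1984, "sex": "F"},
    {"name": "Bob Example", "postcode": "CV1 5FB", "birth_year": 1990, "sex": "M"},
    {"name": "Carol Example", "postcode": "CV3 2AA", "birth_year": 1984, "sex": "F"},
]

# Assumed linking attributes shared by both datasets.
LINK_FIELDS = ("postcode", "birth_year", "sex")


def link_key(record):
    """Return the combination of linking attributes for one record."""
    return tuple(record[field] for field in LINK_FIELDS)


def reidentify(anonymous, named):
    """Pair names with anonymous records when the link is unambiguous.

    A pair is reported only if the attribute combination occurs exactly
    once in each dataset.
    """
    anonymous_counts = Counter(link_key(r) for r in anonymous)
    named_counts = Counter(link_key(r) for r in named)
    unique_anonymous = {
        link_key(r): r for r in anonymous if anonymous_counts[link_key(r)] == 1
    }
    return [
        (person["name"], unique_anonymous[link_key(person)])
        for person in named
        if named_counts[link_key(person)] == 1
        and link_key(person) in unique_anonymous
    ]


matches = reidentify(health_records, public_register)
for name, record in matches:
    print(name, "->", record["diagnosis"])

# Size of the smallest group sharing the same linking attributes.
smallest_group = min(Counter(link_key(r) for r in health_records).values())
print("Smallest group size:", smallest_group)
```

In this invented data, two of the three named people can be matched to a single health record even though the health records contain no names. The third cannot, because two records share her attribute combination. A smallest group size of 1 is a simple warning sign that removing names alone has not protected anonymity.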

Whether designing or applying algorithms, we need to make sure we do so in an ethical fashion, taking into account all of the above. Even when doing so, we may encounter unexpected use of our technology at a future point in time.


ACM Ethics. (2018). ACM code of ethics and professional conduct.

Chappelow, J. (2020, March 11). Black swan. Investopedia.

European Commission. (2020). EU data protection rules.

State of California Department of Justice. (2020). California Consumer Privacy Act (CCPA).

The Chartered Institute for IT. (2020). BCS code of conduct.

This article is from the free online course Get ready for a Masters in Data Science and AI.