
5 ethical questions in data science

This article covers five ethical questions in data science, amid growing concern about its ethical use by organisations.

Ethical Questions in Data Science

With the rapid growth of data science, concern has grown over its ethical use by organisations. For example:

  • Data science algorithms are used to approve or deny bank loans and to set insurance premiums. However, the question arises: what is the social cost of a wrong decision on a loan or a premium?
  • Companies use data science to scan resumes and recommend the best candidate for a role. However, the question arises: what is the chance of a bias towards gender or age in a hiring algorithm trained on past data?
  • Companies use cookies to monitor individuals' online behaviour and advertise based on their browsing history. However, the question arises: what if individuals view companies tracking their behaviour as an intrusion of their privacy?
  • Airlines use data science to set differential prices for individuals based on their needs, and rideshare companies (e.g., Uber) engage in surge pricing based on demand. However, the question arises: is there a risk of these companies exploiting individuals when they are in desperate need of these services?


As data science algorithms assist and replace human decision-making, there are questions every organisation should keep in mind. Some of the leading ethical concerns arising from the misuse of data include:

1. Unfair discrimination

The incorrect and unchecked use of data science can lead to unfair discrimination against individuals based on their gender, demographics and socio-economic conditions.

As Jeff Welser, a vice president and lab director at IBM Corp.'s Almaden Research Center in San Jose, told SiliconANGLE in December, 'If you have really large data sets, you might not even realize that the data are slightly biased towards gender or whatever you're analysing… It might be that you've overtrained on those characteristics.'

2. Reinforcing human biases

Gartner (‘Gartner Says Nearly Half of CIOs Are Planning to Deploy Artificial Intelligence’, 2020) predicts that by 2022, 85 percent of data science projects will deliver erroneous outcomes due to bias in data, algorithms or the teams responsible for managing them.

Data science algorithms use past data to predict future outcomes. Data are generated based on human decisions made in the past. Training the algorithm purely based on past data could lead to some of these biases being included in the algorithms.

Algorithms are also influenced by analysts’ biases, as they may choose data and hypotheses that seem important to them.
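As a toy illustration of the point above (entirely synthetic data and a deliberately naive 'model', not any real hiring system), the sketch below shows how a predictor trained purely on past hiring decisions simply echoes whatever disparity exists in the historical record:

```python
# Synthetic past hiring decisions: (gender, hired). In this made-up history,
# equally large groups of candidates were hired at different rates.
past_decisions = (
    [("male", True)] * 70 + [("male", False)] * 30
    + [("female", True)] * 40 + [("female", False)] * 60
)

def historical_hire_rate(gender):
    """Naive 'model': predict using the observed hire rate for each group."""
    outcomes = [hired for g, hired in past_decisions if g == gender]
    return sum(outcomes) / len(outcomes)

# The 'model' reproduces the historical disparity as its prediction.
print(historical_hire_rate("male"))    # 0.7
print(historical_hire_rate("female"))  # 0.4
```

Real machine-learning models are far more complex than this group-rate lookup, but the mechanism is the same: if the training data encode a biased pattern, a model optimised to fit that data will tend to learn and repeat the pattern.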

3. Lack of transparency

Data science algorithms can sometimes be a black box where the model predicts an outcome but does not explain the rationale behind the result.

Numerous recent machine learning algorithms fall into this category. With black box solutions, it is not easy for a business to understand and explain the reason for a business decision.

As Andrews notes, ‘Whether an AI system produces the right answer is not the only concern… Executives need to understand why it is effective and offer insights into its reasoning when it’s not.’

4. Privacy

Data privacy has become a major focus in the past few years. Sensitive data are stored by various organisations and are subject to hacking and misuse.

During the 2016 United States presidential election, Cambridge Analytica, a data analytics firm that worked on Donald Trump's campaign, used Facebook data to influence voters' behaviour.

The Guardian's headline read: 'Revealed: 50 million Facebook profiles harvested for Cambridge Analytica in major data breach'. This incident highlighted ethical concerns over the misuse of personal data.

There has been an increase in data breaches across the world. Rules and regulations, such as the General Data Protection Regulation (GDPR), have been introduced to monitor the way companies store and use sensitive data.

Organisations are often not transparent about what data they collect and how they use it to make decisions. Most web browsers and websites capture enormous amounts of user data, often without users' knowledge or consent.

For example, Google (Chrome and Gmail) and Facebook store individuals' browsing data and monetise it by selling insights from users' data for advertising.

The human side of analytics is the biggest challenge to implementing big data
Paul Gibbons
© Torrens University
This article is from the free online course Introduction to Digital Transformation: Understand and Manage Digital Transformation in the Workplace, available on FutureLearn.
