This content is taken from Griffith University's online course, Big Data Analytics: Opportunities, Challenges and the Future.

Skip to 0 minutes and 5 seconds People use social media to connect with friends, family, communities, and businesses. They share personal information, photos, and give a glimpse into their lives, their interests, where they’ve been, and what they’ve been up to. They choose what they share and who they share it with. They control their own data privacy through the profile privacy settings. But how much do those controls truly protect our data? Back in 2014, a personality quiz called This Is Your Digital Life by Global Science Research circulated on the social media platform Facebook. Around 270,000 people completed the quiz. This quiz requested access to some information from your Facebook profile, which is a standard request encountered with any number of third party apps.

Skip to 0 minutes and 45 seconds Click OK, and off you go. But what happens with the data? In this case, by taking the quiz, you allowed it access to your personal data as well as the personal data of your friends. In total, a whopping 87 million Facebook profiles were accessed and potentially compromised. Global Science Research got access to not just the public page, but also date of birth, current city, and the page likes for each and every one of those profiles. This data was personally identifiable information. It could be linked back to individuals. Facebook has insisted that this did not constitute a data breach. The users technically consented via the privacy settings despite never personally authorising the app’s access.

Skip to 1 minute and 25 seconds The app was created by an academic researcher, data scientist and psychologist named Aleksandr Kogan. He used the Facebook data to explore how people use emojis to convey emotion. Kogan shared the data set with the London based data mining company Cambridge Analytica for use in political consulting. This was a breach of Facebook’s terms and conditions. And subsequently, the survey was removed from Facebook apps. So why does political consulting now come into the picture? Well, Cambridge Analytica used the data to build profiles of voters to create highly targeted political ad campaigns. It came to light that this company potentially influenced the 2016 US presidential election, the Brexit referendum, and other political campaigns. This prompted investigations all around the globe.

Skip to 2 minutes and 10 seconds Both Facebook CEO Mark Zuckerberg and Cambridge Analytica CEO Alexander Nix were summoned by the respective governments to answer questions. 600 questions from 42 US senators in 10 hours over two days at Capitol Hill. Zuckerberg fielded a wide range of questions about the Cambridge Analytica data leak, fake news, and Russian election meddling. Our democratic institutions are undergoing a stress test. And I believe that American companies owe something to America. I think the damage done to our democracy relative to Facebook and its platform being weaponised are incalculable. Enabling the cynical manipulation of American citizens for the purpose of influencing an election is deeply offensive. And it’s very dangerous.

Skip to 2 minutes and 54 seconds Putting our private information on offer without concern for possible misuses I think is simply irresponsible. Who’s going to conduct an audit when we’re talking about are there other Cambridge Analyticas out there? Do you have procedures in place to inform key government players when a foreign entity is attempting to buy a political ad or when it might be taking other steps to interfere in an election? Several legislative initiatives are currently in the works in the US, including multiple bills to tighten and strengthen regulations on online political ads, data collection, and data privacy.

Skip to 3 minutes and 24 seconds The CONSENT Act and the Social Media Privacy Protection and Consumer Rights Act look at consent for data collection and ownership of data with some parallels to the General Data Protection Regulation in the EU. The proposed legislation has not been well received by corporations. A trade association, whose members include Google, Facebook, and Twitter, instead promoted a model of self-regulation to be carried out by the platforms as an alternative to the proposed federal law. The proposed law would require more disclosure with regard to political advertising, and there are concerns about the platforms’ commitment to consumer protection. This isn’t the first time Facebook has been scrutinised over privacy.

Skip to 4 minutes and 4 seconds In 2011, the Federal Trade Commission enacted a consent decree regarding how user data was tracked and shared, mandating that Facebook must notify users and obtain their permission before data concerning them is shared beyond the privacy settings they have established. The consent decree specified a US$40,000 fine for each violation. The Cambridge Analytica situation has the potential to result in millions of dollars in fines to the tech giant if it is found in breach. Concerns around data privacy and social media are being examined around the world. This incident is receiving a lot of attention in the UK where Cambridge Analytica is based. A parliamentary committee in the UK requested Zuckerberg appear to give evidence on the matter.

Skip to 4 minutes and 44 seconds However, he refused to appear. Alexander Nix from Cambridge Analytica received a formal summons to give evidence in the committee in June 2018 in relation to disinformation and fake news. Since the incident surfaced, Cambridge Analytica has entered insolvency proceedings and is in administration. The UK information commissioner Elizabeth Denham has been spearheading the investigation into the Facebook and Cambridge Analytica scandal. The use of data illegally or even negligently has to change. And I think this investigation gives confidence and scope to other regulators around the world to look at misuse of data, new data techniques, and ensure that there’s compliance with the law. So this could happen in other jurisdictions.

Skip to 5 minutes and 26 seconds And I think parliamentarians and legislators and regulators are taking a close look at this case. I think the time for self-regulation is over. And I think the public expects these large tech companies to comply with laws and ethics and community standards. There has been significant speculation around what impact these proceedings will have on big data analytics. It is clear the changing legal landscape will impact the way data can be recorded, accessed, and used. But how will public perception of data analytics and the information individuals choose to share online be affected? Will consumer behaviours change? Or will it be business as usual as people enjoy the convenience of these platforms? And how should businesses respond to this?

Skip to 6 minutes and 8 seconds How will their brand trust be impacted by their observance of data security and their transparency around past and present utilisation of big data? It will be interesting to see how this case continues to unfold and how it will impact the big data landscape.

Data privacy and consent

In March 2018, a major scandal over the use of Facebook data broke in the news. It involved data from millions of Facebook users that was used without their consent for political advertising campaigns, including the Trump presidential campaign and the Brexit vote.

It started with a personality quiz

Let’s examine the data and analytical side behind the story a bit more. It started with around 300,000 Facebook users who shared their own and, often unknowingly, some of their friends’ data by taking a personality quiz [1]. The data included public profiles, page likes, birthdays, current cities, and even private messages [2,3], in addition to the quiz answers themselves. Based on the quiz answers, users were profiled along five psychographic dimensions: openness to experience, conscientiousness, extraversion, agreeableness, and neuroticism [4].

It led to targeted political advertising

All this data formed the training set to be used later by the algorithms to build a prediction model. The users’ Facebook data and their answers to the quiz formed the features in the training set. The users’ values along the psychographic dimensions formed the labels in the training set.
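
To make the feature and label terminology concrete, here is a minimal sketch of what one such training set could look like. It is an illustration only: the `QuizTaker` class, the page names, the 0 to 1 score scale, and the choice to use page likes as the only features are all assumptions made for this example, not details of the actual data set.

```python
from dataclasses import dataclass

# The five psychographic dimensions named in the text.
TRAITS = ("openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism")

@dataclass
class QuizTaker:
    page_likes: set      # Facebook data: pages the user liked (hypothetical field)
    trait_scores: dict   # quiz result: trait name -> score in [0, 1] (hypothetical scale)

def build_training_set(takers, vocabulary):
    """Return (features, labels) with one row per quiz taker."""
    # Feature row: 1 if the user liked the page, else 0, in vocabulary order.
    features = [[int(page in t.page_likes) for page in vocabulary] for t in takers]
    # Label row: the user's score on each of the five dimensions.
    labels = [[t.trait_scores[trait] for trait in TRAITS] for t in takers]
    return features, labels

# Two invented quiz takers and a three-page vocabulary.
vocabulary = ["hiking", "poetry", "trucks"]
takers = [
    QuizTaker({"hiking", "poetry"}, dict(zip(TRAITS, (0.9, 0.4, 0.7, 0.6, 0.2)))),
    QuizTaker({"trucks"}, dict(zip(TRAITS, (0.2, 0.8, 0.3, 0.5, 0.6)))),
]
features, labels = build_training_set(takers, vocabulary)
```

Each row of `features` lines up with the same row of `labels`, which is the pairing a learning algorithm needs in order to relate Facebook data to psychographic scores.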

A suite of algorithms then ran over the training set. The result was a model that could predict the psychographic labels (a user's values along the psychographic dimensions) based solely on their features (Facebook data). The model was applied to a total of 2.1 million Facebook users [4].
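
The sources do not say which algorithms were used, so the following is only a sketch of the general idea, using a simple nearest-neighbour method as a stand-in. It predicts one trait score for a user who never took the quiz by averaging the scores of the quiz takers whose page likes are most similar. The function names, page names, and scores are invented for illustration.

```python
def jaccard(a, b):
    """Similarity of two sets of page likes: shared likes / all likes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def predict_trait(new_likes, training_set, k=2):
    """Average the trait score of the k quiz takers with the most similar likes.

    training_set is a non-empty list of (page_likes, trait_score) pairs.
    """
    ranked = sorted(training_set,
                    key=lambda row: jaccard(new_likes, row[0]),
                    reverse=True)
    nearest = ranked[:k]
    return sum(score for _, score in nearest) / len(nearest)

# Hypothetical quiz takers: (page likes, openness score from the quiz).
training_set = [
    ({"hiking", "poetry", "jazz"}, 0.9),
    ({"poetry", "museums"}, 0.8),
    ({"football", "trucks"}, 0.3),
    ({"trucks", "barbecue"}, 0.2),
]

# A user who did not take the quiz, known only by their page likes.
predicted_openness = predict_trait({"poetry", "jazz"}, training_set)
```

The key point is that `predict_trait` never sees a quiz answer from the new user. It relies only on the Facebook data, which is why a model trained on a few hundred thousand quiz takers could be applied to millions of other profiles.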

The psychographic profiles of these 2.1 million people were used to create groups of similar users. The grouping then allowed the creation of targeted political advertising, tailored to each group’s attributes, to influence their political preferences.
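
Grouping similar users is a clustering task. The sketch below uses plain k-means as one common way to do this; the text does not state which grouping method was actually used, and the profile values are invented. Each profile is a tuple of scores in the order openness, conscientiousness, extraversion, agreeableness, neuroticism.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))

def k_means(points, k, iterations=10):
    """Plain k-means, seeded with the first k points so the result is repeatable."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return [nearest(p, centroids) for p in points]

# Hypothetical predicted profiles for four users.
profiles = [
    (0.9, 0.4, 0.7, 0.6, 0.2),
    (0.2, 0.8, 0.3, 0.5, 0.6),
    (0.8, 0.5, 0.8, 0.7, 0.3),
    (0.3, 0.9, 0.2, 0.4, 0.7),
]
groups = k_means(profiles, k=2)  # one group index per user
```

Users that share a group index have similar predicted profiles, so a campaign could write one advertisement per group rather than one per person.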

How was this different to traditional voter analytics?

Creating targeted political advertising to influence voters is not a new practice. However, while traditional voter analytics predicted rough political beliefs and voting behaviour, the detailed data from Facebook could be used to discover fine psychographic nuances in voters [5]. The resulting groups carry names such as ‘Flag and Family Republicans’ or ‘Education-Focused Democrats’ [6] and enable campaigns to deliver highly personalised advertising.

Your task

How did you feel about this incident? Did this news prompt you to change your social media habits in any way?

Share your thoughts in the comments.


Video references and acknowledgements


The UK Information Commissioner’s Office have kindly provided an image of Elizabeth Denham for use in the step video.

Excerpt from interview reproduced with permission from Radio New Zealand:


Accompanying text references

  1. Zuckerberg M. Facebook post; 2018 Mar 21 [online]. Available from: https://www.facebook.com/zuck/posts/10104712037900071

  2. Coulter M. How to find out if your Facebook data was shared with Cambridge Analytica using new tool launched by the social network; Evening Standard; 2018 Apr 10 [online]. Available from: https://www.standard.co.uk/tech/how-to-find-out-if-your-facebook-data-was-shared-with-cambridge-analytica-using-new-tool-launched-by-a3810551.html 

  3. Lapowsky I. Cambridge Analytica Could Have Also Accessed Private Facebook Messages; Wired; 2018 Apr 10 [online]. Available from: https://www.wired.com/story/cambridge-analytica-private-facebook-messages/ 

  4. Hern A. Cambridge Analytica: how did it turn clicks into votes?; The Guardian; 2018 May 6 [online]. Available from: https://www.theguardian.com/news/2018/may/06/cambridge-analytica-how-turn-clicks-into-votes-christopher-wylie

  5. Rosenberg M, Confessore N, Cadwalladr C. How Trump Consultants Exploited the Facebook Data of Millions; The New York Times; 2018 Mar 17 [online]. Available from: https://www.nytimes.com/2018/03/17/us/politics/cambridge-analytica-trump-campaign.html 

  6. Balz D. Democrats Aim to Regain Edge In Getting Voters to the Polls; Washington Post; 2006 Oct 8 [online]. Available from: http://www.washingtonpost.com/wp-dyn/content/article/2006/10/07/AR2006100700388.html
