
Working with data APIs

This article explains how to set up your environment for loading and analysing your Twitter data.
© Coventry University. CC BY-NC 4.0

This week we will learn how to analyse and visualise social media data. The activities will be more practical than theoretical, but we will take you through the steps and tools necessary for leveraging Twitter data to gain various insights based on very little data.

Our goal this week will be to explore a social media dataset obtained through the Twitter API containing geographical information stored with the tweets.

First we need data. The dataset you will use in this course was extracted through the Twitter API using a popular Python library called Tweepy, which handles all the requests to the Twitter API for us. Twitter requires users of its API to sign up for a developer account and to have that account approved. To lower the barrier to getting started with this week’s activities, we have extracted a sample of this data for you.

After tackling this week’s tasks, you may feel confident enough to access Twitter data for your own analysis. To obtain a developer account and the required credentials, you’ll need to create a Twitter account and request access through the official request form.

We will be working on a single dataset, exploring some of the tools that help us summarise the locations attached to tweets and examine their geographical distribution.
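As a rough illustration of what summarising tweet locations can look like, the sketch below pulls point coordinates out of a few tweet records and counts them per grid cell. It assumes the records follow the Twitter API v1.1 layout, where a `coordinates` field holds a GeoJSON point in longitude, latitude order. The sample records and the helper names `extract_point` and `grid_counts` are made up for this example, and the format of the course sample may differ.

```python
import json
from collections import Counter

# Hypothetical sample records shaped like Twitter API v1.1 tweet objects.
SAMPLE_LINES = [
    '{"id": 1, "text": "Free bread", "coordinates": '
    '{"type": "Point", "coordinates": [-0.1276, 51.5072]}}',
    '{"id": 2, "text": "No location", "coordinates": null}',
    '{"id": 3, "text": "Spare apples", "coordinates": '
    '{"type": "Point", "coordinates": [-0.1421, 51.5010]}}',
]


def extract_point(tweet):
    """Return (latitude, longitude) for a tweet dict, or None if it has no point."""
    geo = tweet.get("coordinates")
    if not geo or geo.get("type") != "Point":
        return None
    lon, lat = geo["coordinates"]  # GeoJSON order is longitude, then latitude
    return (lat, lon)


def grid_counts(points, precision=1):
    """Count points per grid cell by rounding latitude and longitude."""
    return Counter(
        (round(lat, precision), round(lon, precision)) for lat, lon in points
    )


tweets = [json.loads(line) for line in SAMPLE_LINES]
points = [p for p in map(extract_point, tweets) if p is not None]
cells = grid_counts(points)

print(len(points), "of", len(tweets), "tweets carry a point location")
print(cells.most_common(3))
```

Rounding to one decimal place is only a coarse way of grouping nearby tweets. A real analysis would more likely plot the points on a map or use a proper spatial index.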

The Twitter API

The Twitter API is a service for retrieving tweets. It is used for app building, for commercial purposes, and commonly for research on social dynamics, such as political opinions, responses to national and international events, and social network analysis. This makes it a great source of information on individuals (politicians, celebrities), social movements (demonstrations, calls for action, political opinions), world events (news and blogs), and even the state of our planet (earthquake and climate sensor feeds).

Let’s first review the data we will use in this week’s tasks.

About the data

The data you will be using in this lesson was gathered from Twitter over a period of several days. The Twitter API provides functionality for defining a geographical area and returns only those tweets sent from that area. We defined a large rectangular area covering London and neighbouring suburbs. We explored the data and found that, on any given day, the account sending the most tweets within our predefined area was one connected with a mobile application called OLIO.
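The two steps described above, keeping only tweets inside a rectangular area and then finding the most active account per day, can be sketched in plain Python. The bounding box values, the records, and the function names here are hypothetical. The article does not give the coordinates actually used, so the box is only a rough guess at Greater London. The west, south, east, north ordering mirrors the longitude-first, south-west-corner-first convention that Twitter documented for its v1.1 streaming `locations` filter.

```python
from collections import Counter

# Hypothetical box roughly around Greater London: (west, south, east, north).
LONDON_BOX = (-0.56, 51.26, 0.28, 51.70)

# Made-up records for illustration: (screen_name, date, longitude, latitude).
RECORDS = [
    ("WhatsOnOLIO", "2020-01-01", -0.12, 51.50),
    ("WhatsOnOLIO", "2020-01-01", -0.20, 51.55),
    ("someone", "2020-01-01", -0.10, 51.51),
    ("WhatsOnOLIO", "2020-01-01", -1.90, 52.48),  # outside the box
    ("someone", "2020-01-02", 0.05, 51.48),
]


def in_box(lon, lat, box):
    """Return True if the point lies inside the (west, south, east, north) box."""
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


def top_account_per_day(records, box):
    """Map each date to the (screen_name, count) with the most in-box tweets."""
    per_day = {}
    for name, day, lon, lat in records:
        if in_box(lon, lat, box):
            per_day.setdefault(day, Counter())[name] += 1
    return {day: counts.most_common(1)[0] for day, counts in per_day.items()}


result = top_account_per_day(RECORDS, LONDON_BOX)
print(result)
```

A simple rectangle test like this ignores tweets that carry only a named place rather than exact coordinates. Handling those would need the place's own bounding box.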

OLIO connects people and local businesses to tackle the issue of surplus food, or food waste. This could be food nearing its sell-by date in local stores, a surplus of home-grown vegetables, or even the left-over groceries in your fridge when you go away. The platform is also used for the reuse of household items. Users make an item available through the app by adding a photo, a description, and instructions on where the item can be picked up. Other users browse items near them, reserve what they need, and arrange a pick-up via private messaging.

OLIO posts items listed through its app on Twitter under the screen name @WhatsOnOLIO. For the purpose of this lesson, we will focus on the tweets broadcast in the United Kingdom.

Our choice of this dataset was somewhat arbitrary and is not necessarily an endorsement of the mobile app OLIO or of the Twitter account @WhatsOnOLIO, although we recognise its positive social impact.

This article is from the free online

Applied Data Science

Created by
FutureLearn - Learning For Life
