
Visualising geolocation data

In this article you will explore the most common visualisations for geolocation data.
© Coventry University. CC BY-NC 4.0

Now that we have location information for our tweets, we will want to visualise it on a map to give us some idea of how the tweets are distributed. Towns, cities, and counties are clearly defined areas, either geographically bounded by the terrain (e.g. divided by a sea, river, or mountain range), or defined through administrative processes, such as a local authority.

These boundaries or borders can be used to visualise our summary counts of tweets sent from each location. There are many sources of information we can use to find the boundaries around our locations. Boundary data is commonly published in a standard format known as GeoJSON.

GeoJSON is a data standard designed for representing geographical features, based on the JSON format. Each location is stored as a feature whose geometry can be a point (representing addresses and locations), a line (representing streets and boundaries), or a polygon (defining regions such as countries and provinces). We have prepared a GeoJSON file with all the boundary lines for the locations in our dataset. Before we begin to explore how we might visualise our data, let’s introduce some of the most common visualisations for geolocation data:
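To make the structure of a GeoJSON file more concrete, here is a minimal sketch that parses a small document with Python's built-in json module and lists the geometry type of each feature. The feature names and coordinates are made up for illustration and are not taken from the course dataset. Note that GeoJSON stores coordinates in longitude, latitude order.

```python
import json

# A tiny, made-up GeoJSON document (coordinates are [longitude, latitude]).
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "Example point"},
     "geometry": {"type": "Point", "coordinates": [-1.51, 52.41]}},
    {"type": "Feature",
     "properties": {"name": "Example street"},
     "geometry": {"type": "LineString",
                  "coordinates": [[-1.52, 52.40], [-1.50, 52.42]]}},
    {"type": "Feature",
     "properties": {"name": "Example region"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[-1.6, 52.3], [-1.4, 52.3], [-1.4, 52.5],
                                   [-1.6, 52.5], [-1.6, 52.3]]]}}
  ]
}
"""

collection = json.loads(geojson_text)

# Map each feature's name to the type of geometry it carries.
geometry_types = {
    feature["properties"]["name"]: feature["geometry"]["type"]
    for feature in collection["features"]
}

for name, geometry_type in geometry_types.items():
    print(f"{name}: {geometry_type}")
```

A polygon is a list of rings, and each ring repeats its first coordinate at the end to close the shape, which is why the example region has five coordinate pairs for a four-sided area.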

Point map

Point maps are the simplest visualisation for mapping data. Each item of data is represented by a single point or marker according to its position on the map. These are generally suitable as a starting point when exploring your data, as they provide a means to display data that might have a wide (or even global) distribution, for example, instances of earthquakes or restaurant locations. Point maps can also be used as a basis for drawing lines between points to represent calculated distances or display routes.
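As an illustration of calculating the distance between two points on a map, the sketch below uses the haversine formula, which gives the great-circle distance between two latitude and longitude pairs. This is one common choice rather than the method used in the course notebook, and the city coordinates are approximate values chosen for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)
    a = (math.sin(delta_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate (latitude, longitude) pairs, for illustration only.
points = {
    "Coventry": (52.4068, -1.5197),
    "London": (51.5074, -0.1278),
}

distance = haversine_km(*points["Coventry"], *points["London"])
print(f"Coventry to London: {distance:.1f} km")
```

The formula treats the Earth as a perfect sphere, so the result is a straight-line estimate over the surface and not a road or route distance.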

Heat map

The heat map is more suited to aggregating location data than a point map. When we include a time element, a heat map can be animated to show how the quantities over specific locations change over time. You may have seen examples of this before, such as on weather maps of heat or cloud cover, and earthquake sensor data showing the different shock intensities.

In more technical terms, the heat map represents points as a surface of relative density. We use a heat map when there are many points that are close together or overlapping, and not easily distinguished from one another. The relative density of points is represented by a dynamic colour scheme. In general, the colour scheme ranges from cool (low density of data points) to warm (high density of data points).
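The idea of relative density can be sketched by dividing the map into a grid and counting how many points fall in each cell. This is a simplification: mapping libraries typically smooth the points into a continuous surface, whereas the hypothetical density_grid function below only bins them. The sample coordinates and the cell size are made up for the example.

```python
import math
from collections import Counter


def density_grid(points, cell_size_deg):
    """Count points per grid cell; cells are indexed by (row, column)."""
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_size_deg),
                math.floor(lon / cell_size_deg))
        counts[cell] += 1
    return counts


# Made-up (latitude, longitude) pairs forming two small clusters.
points = [
    (52.41, -1.52), (52.42, -1.51), (52.43, -1.53),
    (51.51, -0.13), (51.52, -0.12),
]

grid = density_grid(points, cell_size_deg=0.5)

# Scale counts to the range 0..1 so they can drive a cool-to-warm colour scale.
busiest = max(grid.values())
relative_density = {cell: count / busiest for cell, count in grid.items()}

for cell, density in sorted(relative_density.items()):
    print(cell, round(density, 2))
```

A smaller cell size gives a finer picture but leaves more cells with few or no points, so the choice depends on how densely the data covers the map.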

Choropleth map

Choropleth maps display geographical areas or regions, which are coloured or shaded according to a data variable. The value of the variable in each region is represented by its position along a colour progression. Typically, this can be a blend from one colour to another, a single hue progression, transparent to opaque, light to dark, or an entire colour spectrum.

When summarising over locations that are bounded or demarcated in some way, either geographically or politically, a choropleth map is a more natural choice of visualisation, and one you will have seen particularly when discussing populations, electoral results, poverty, income, etc. Given the appropriate geolocation information, the map can display the data variable by country, province, town, or city.
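A light-to-dark colour progression can be sketched as a linear blend between two colours, with each region's value deciding how far along the blend it sits. The region names, counts, and the two end colours below are assumptions made for illustration, and the function names are hypothetical. A mapping library would normally handle this step for you.

```python
def blend(start_rgb, end_rgb, t):
    """Linearly blend two RGB colours; t=0 gives start, t=1 gives end."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start_rgb, end_rgb))


def choropleth_colours(values, light=(255, 255, 255), dark=(0, 0, 128)):
    """Assign each region a hex colour from light (lowest) to dark (highest)."""
    lowest = min(values.values())
    highest = max(values.values())
    span = (highest - lowest) or 1  # avoid dividing by zero if all values match
    colours = {}
    for region, value in values.items():
        t = (value - lowest) / span
        red, green, blue = blend(light, dark, t)
        colours[region] = f"#{red:02x}{green:02x}{blue:02x}"
    return colours


# Made-up tweet counts per region.
tweet_counts = {"Region A": 10, "Region B": 55, "Region C": 100}

colours = choropleth_colours(tweet_counts)
for region, colour in colours.items():
    print(region, colour)
```

Because the scale is linear, a single region with a very large value pushes every other region towards the light end. Grouping values into a few classes is a common alternative when the data is heavily skewed.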

Your task

Visualising geolocation data (10 mins)
In this task you will learn how to apply several visualisation techniques to geolocation data. You will aggregate the data to obtain frequency counts of total tweets per region and explore several types of visualisation including point maps, heatmaps and choropleth maps.
At the end of the task, you will have the practical skills to apply these visualisations to your own datasets, for example, political party votes, household income, and crime by region. Visit the Jupyter Notebook.
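The aggregation step can be sketched with Python's collections.Counter. The tweet records and the "region" field name below are hypothetical and chosen only for illustration. The notebook's dataset may store location differently.

```python
from collections import Counter

# Made-up tweet records; "region" is a hypothetical field name.
tweets = [
    {"text": "first tweet", "region": "Region A"},
    {"text": "second tweet", "region": "Region B"},
    {"text": "third tweet", "region": "Region A"},
    {"text": "no location", "region": None},
    {"text": "fourth tweet", "region": "Region A"},
]

# Count tweets per region, skipping tweets with no region recorded.
tweets_per_region = Counter(
    tweet["region"] for tweet in tweets if tweet.get("region")
)

for region, count in tweets_per_region.most_common():
    print(region, count)
```

The resulting counts are the kind of summary that a choropleth map colours by region.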

In this lesson we looked at how to extract, interrogate, and visualise location data stored with tweets. We learned how to process and extract information from the JSON format, and how to summarise by location according to geolocation data.

Further information

Many of the techniques that you have been introduced to in this lesson can be straightforwardly applied to other datasets with location data. Why not explore the USGS Earthquake Hazards Program?

For a general list of free datasets, visit Awesome public datasets.

In the last part of this week, we will ask ourselves if we should be doing this.

References

USGS. (n.d.). GeoJSON summary format. Web link

GitHub. (n.d.). Awesome public datasets. Web link

This article is from the free online course Applied Data Science.
