
Data Visualisation with ggplot2: Setting Facets and Scales

© Wellcome Genome Campus Advanced Courses and Scientific Conferences

Now that we have learned how to choose our data and apply layers of aesthetics and geometries, let’s explore other possible layers of information, such as facets and scales.

Good practice

A good practice is to load the packages you need before starting your analysis. We also recommend listing those packages at the top of the script you prepare for a project. This is a list of convenient packages to use with ggplot2:

> library(ggplot2)
> library(RColorBrewer)
> library(viridis)
> install.packages("ggsci")
> library(ggsci)

Note that we need an extra package compared to the previous step.
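
Calling “install.packages()” in a script downloads the package again on every run. One option is to install only what is missing. The following is a minimal sketch of that idea; the helper name find_missing() is hypothetical and not part of any package, and the install and load lines are left commented out so that nothing is downloaded by accident.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
find_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

pkgs <- c("ggplot2", "RColorBrewer", "viridis", "ggsci")
to_install <- find_missing(pkgs)
print(to_install)  # empty when everything is already installed

# Uncomment to install whatever is missing, then load everything:
# if (length(to_install) > 0) install.packages(to_install)
# invisible(lapply(pkgs, library, character.only = TRUE))
```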

Setting your working directory in RStudio

Step 1. We recommend that you work in the project folder Project_Test that we created previously, either by clicking directly on Project_Test or by using the following commands:

> setwd("/Users/imac/Desktop/exerciseR/Project_Test")
> getwd()
[1] "/Users/imac/Desktop/exerciseR/Project_Test"

Step 2. As a reminder, you can create a specific script file to write your commands and related comments.

Setting Faceting

Step 1. Here is an example of how to facet the output by Species using “facet_wrap()”, which requires the facets argument to be specified (here, Species):

> key <- ggplot(data = iris, aes(x = Sepal.Length, y = Petal.Length, color = Species))
> key + geom_point() + geom_smooth(se = FALSE) +
facet_wrap(~Species)

faceted output of species
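
The arrangement of the panels can also be controlled. As a small sketch, the nrow and ncol arguments of “facet_wrap()” set the grid; here the three Species panels are stacked in a single column. The object name p_stacked is only illustrative, and the smoothing layer is left out to keep the example short.

```r
library(ggplot2)

key <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species))

# Stack the three Species panels vertically instead of side by side
p_stacked <- key + geom_point() + facet_wrap(~Species, ncol = 1)
p_stacked
```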

Step 2. Other arguments to “facet_wrap()” adjust the panels. For example, the scales argument can fit the y-axis of each panel to its own values, which makes better use of the plotting area:

> key + geom_point() + geom_smooth(se = FALSE) +
facet_wrap(~Species, scales = "free_y")

faceted output of species optimised
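
The scales argument accepts four values: "fixed" (the default), "free_x", "free_y" and "free". The sketch below builds three of these variants side by side so they can be compared; the object names are illustrative only.

```r
library(ggplot2)

key <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species))
base <- key + geom_point()

# Default: all panels share the same x and y axes (scales = "fixed")
p_fixed <- base + facet_wrap(~Species)

# Each panel gets its own x-axis range; the y-axis stays shared
p_free_x <- base + facet_wrap(~Species, scales = "free_x")

# Both axes are fitted to the data in each panel
p_free <- base + facet_wrap(~Species, scales = "free")
p_free
```

Free scales use the space well, but they make it harder to compare values across panels, so the choice depends on what the figure should show.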

Setting Scales

Scales for positions and axis

The default scales for continuous x and y aesthetics are “scale_x_continuous()” and “scale_y_continuous()”. Variants include reversing the axis order or transforming to a log scale. Please see the ggplot2 documentation for usage.

It is also possible to plot discrete variables using “scale_x_discrete()” or “scale_y_discrete()”.
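
As a brief sketch of these variants, the first plot below uses “scale_y_log10()” for a log-transformed y-axis, and the second uses “scale_x_discrete()” to reorder and relabel the Species categories. The labels and object names are chosen here purely for illustration.

```r
library(ggplot2)

# Continuous variant: log10-transformed y-axis
p_log <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point() +
  scale_y_log10()

# Discrete x-axis: choose the category order and the axis labels
p_discrete <- ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_boxplot() +
  scale_x_discrete(
    limits = c("virginica", "versicolor", "setosa"),
    labels = c("I. virginica", "I. versicolor", "I. setosa")
  )

p_log
p_discrete
```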

Step 1. To set x-axis limits, use “scale_x_continuous()” with the limits option:

> key + geom_point() + geom_smooth(se = FALSE) +
facet_wrap(~Species, scales = "free_y") +
scale_x_continuous(limits = c(1, 10))

faceted output with scaled x and y axis
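
One point worth knowing: limits set on a scale replace out-of-range values with NA before statistics such as the smoothing line are computed, whereas “coord_cartesian()” only zooms the view and keeps every observation. The sketch below compares the two using an illustrative window of 5 to 7 on Sepal.Length and counts how many x values survive in each case.

```r
library(ggplot2)

key <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species))

# Scale limits: points with Sepal.Length outside [5, 7] become NA
p_limits <- key + geom_point() + scale_x_continuous(limits = c(5, 7))

# Coordinate limits: same visible window, but every point is kept
p_zoom <- key + geom_point() + coord_cartesian(xlim = c(5, 7))

# Count the non-missing x values that each plot works with
n_in_window <- sum(iris$Sepal.Length >= 5 & iris$Sepal.Length <= 7)
n_limits <- sum(!is.na(layer_data(p_limits)$x))
n_zoom <- sum(!is.na(layer_data(p_zoom)$x))
c(in_window = n_in_window, limits = n_limits, zoom = n_zoom)
```

If a fitted line should be based on all the data, zooming with “coord_cartesian()” is usually the safer choice.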

Step 2. To reverse the x-axis, use “scale_x_reverse()”:

> key + geom_point() + geom_smooth(se = FALSE) +
facet_wrap(~Species, scales = "free_y") +
scale_x_reverse()

faceted output with reversed x axis

Scales for colours, sizes and shapes

Continuous colour scales can be specified in many ways. Some palettes, such as those from RColorBrewer, come from packages that are installed alongside ggplot2, and other colour palettes can easily be installed, loaded and used. Usage of “scale_colour_gradient()” and “scale_fill_gradient()”, as examples, can be found in the ggplot2 documentation.

Step 1. Scales can be set manually by choosing specific colours, sizes and shapes:

> key + geom_point() + geom_smooth(se = FALSE) +
facet_wrap(~Species, scales = "free_y") +
scale_shape_manual(values = c(3, 16, 17)) +
scale_size_manual(values = c(2, 3, 4)) +
scale_color_manual(values = c('#669999', '#a3c2c2', '#b30059'))

scales set by choosing colours graph
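
A manual scale only has a visible effect when its aesthetic is mapped to a variable. In the plot above only colour is mapped to Species, so the shape and size scales have nothing to act on. As a sketch, the version below also maps shape to Species and uses named vectors so that each value is tied to a specific level; the vector names species_shapes and species_colours are illustrative.

```r
library(ggplot2)

# Named vectors tie each value to a Species level, whatever the level order
species_shapes <- c(setosa = 3, versicolor = 16, virginica = 17)
species_colours <- c(setosa = "#669999", versicolor = "#a3c2c2", virginica = "#b30059")

p_manual <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length,
                             color = Species, shape = Species)) +
  geom_point(size = 2) +
  scale_shape_manual(values = species_shapes) +
  scale_color_manual(values = species_colours)
p_manual
```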

Step 2. Scales can be set using existing colour palettes from the RColorBrewer package:

> key + geom_point() + geom_smooth(se = FALSE) +
facet_wrap(~Species, scales = "free_y") +
scale_color_brewer(palette = "Dark2")  # "Dark2" is one example; any RColorBrewer palette name works

Note. We use “scale_color_brewer()” to customize colours for lines or points, whereas we would use “scale_fill_brewer()” for the fill colours of areas, histogram bars, boxplots, etc.

graph, scales set using RColorBrewer

Note. RColorBrewer palettes can be consulted with

> display.brewer.all()

RColorBrewer palette
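
Beyond displaying the palettes, RColorBrewer can return the hex codes themselves and some metadata about each palette. The sketch below uses the "Dark2" palette as an arbitrary example.

```r
library(RColorBrewer)

# Hex codes for three colours of the qualitative "Dark2" palette
dark2 <- brewer.pal(n = 3, name = "Dark2")
dark2

# Palette metadata: maximum number of colours, category, colour-blind friendliness
brewer.pal.info["Dark2", ]

# Display only the colour-blind-friendly palettes
display.brewer.all(colorblindFriendly = TRUE)
```

The returned hex codes can be passed to “scale_color_manual()” when only some colours of a palette are wanted.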

Step 3. Scales can also use colour palettes from the viridis package, which offers several options:

> new_key <- key + geom_point(aes(color = Species)) + 
geom_smooth(aes(color = Species, fill = Species), method = "lm") +
facet_wrap(~Species, scales = "free_y")

> new_key + scale_shape_manual(values=c(3, 16, 17)) +
scale_size_manual(values=c(2,3,4)) +
scale_color_viridis(discrete = TRUE, option = "D") +
scale_fill_viridis(discrete = TRUE)

Note. We changed options for “geom_point()” and “geom_smooth()” to show you some other possible display variations

graph, scales in different options with other color palettes from the viridis package
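
The option argument selects the colour map: "A" is magma, "B" is inferno, "C" is plasma, "D" is viridis (the default) and "E" is cividis. As a sketch, the code below first prints the hex codes produced by the underlying palette function, then applies the plasma map with the lightest end trimmed; the end value of 0.9 is an arbitrary choice for illustration.

```r
library(ggplot2)
library(viridis)

# The palette function returns hex codes that can be inspected or reused
viridis_cols <- viridis(3, option = "D")
viridis_cols

key <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species))

# Plasma map (option "C"), trimmed at the light end so points stay visible on white
p_plasma <- key + geom_point() +
  scale_color_viridis(discrete = TRUE, option = "C", end = 0.9)
p_plasma
```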

Step 4. Scales can use packages designed to offer color palettes taken from sources such as highly accessed journals. Examples are “scale_color_npg()” (from Nature Publishing Group), “scale_color_lancet()” (from The Lancet journal) or even “scale_color_tron()” (from the film “Tron: Legacy”). Remember that scale_color functions have their scale_fill counterparts

> new_key + scale_color_tron() + scale_fill_tron()

graph, scales set using the Tron colour palette from the ggsci package
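
Each ggsci scale has a matching palette generator, such as “pal_npg()”, which is handy for checking the colours before plotting or for reusing them in a manual scale. A minimal sketch, with illustrative object names:

```r
library(ggplot2)
library(ggsci)

# pal_npg() returns a function; calling it with n gives n hex codes
npg_cols <- pal_npg()(3)
npg_cols

# Reuse the same colours through a manual scale
p_npg <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) +
  geom_point() +
  scale_color_manual(values = npg_cols)
p_npg
```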

The particular case of missing values

Step 1. Missing values (NA) can exist in any data set, and need to be taken into account when plotting data. Let’s use this very simple data frame containing NA values

  • Plotting with default colors in ggplot2. By default, a grey colour will be used for NA
> df_test <- data.frame(x = 1:10, y = 1:10, 
z = c(1, 2, 3, NA, 5, 6, 7, NA, 8, NA))
> plot_test <- ggplot(df_test, aes(x, y)) +
geom_tile(aes(fill = z), size = 10)
> plot_test

Plot with default colours

  • We can ask for NA values to have no colour
> plot_test + scale_fill_gradient(na.value = NA)

graph, no colours on missing values

  • Or color NA values in a chosen colour, such as “red3” here
> plot_test + scale_fill_gradient(na.value = "red3")

graph, red colour on missing values

  • Or use another colour palette, instead of the default ggplot2 colours, and you will still be able to specify the colour of NA values. Because we need many colours in this palette, we use “scale_fill_gradientn()” instead of “scale_fill_gradient()”
> plot_test + scale_fill_gradientn(colours = viridis(7), na.value = "white")

graph, viridis colour palette for NA
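
The examples above use a continuous fill. Discrete scales accept the same na.value argument. The sketch below uses a small made-up data frame, df_groups, in which two observations have no group, and colours those points black.

```r
library(ggplot2)

# Made-up data: two of the six observations have a missing group
df_groups <- data.frame(
  x = 1:6,
  y = 1:6,
  group = c("a", "b", NA, "a", NA, "b")
)

p_na <- ggplot(df_groups, aes(x, y, color = group)) +
  geom_point(size = 4) +
  scale_color_manual(values = c(a = "#669999", b = "#b30059"), na.value = "black")
p_na
```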

This article is from the free online course Bioinformatics for Biologists: An Introduction to Linux, Bash Scripting, and R.
