Using Pandas with Seaborn

Learn more about using Pandas with Seaborn

Numerical plots

Plots are a way of visualising the relationship between variables. These variables can be either numerical or categorical (categories such as a group, class, or division). We will unpack the former in this step.

Numerical variables are quantitative. Such variables are numerical measures of individuals or elements used in a data set. To display such variables and data, we use numerical or continuous plots. In a nutshell, numerical plots:

involve numerical variables
compare trends and display outliers.

This content is taken from the FutureLearn online course Data Visualisation with Python: Seaborn and Scatter Plots.

For example:

Scatter plots and line plots

Scatter plots

Seaborn integrates with Pandas, so we can pass a DataFrame directly to one of its plot functions. You can either create a DataFrame from scratch using the DataFrame syntax, or import an existing file. Seaborn will then pull the data out of the DataFrame for you.
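As a rough sketch of the two routes described above, the snippet below builds a small DataFrame from scratch and also reads one from CSV text. The measurements are made-up illustrative values (they are not taken from iris.csv), and io.StringIO is used as a stand-in for a file on disk so that the example is self-contained.

```python
import io

import pandas as pd

# Route 1: build a DataFrame from scratch.
# These values are hypothetical, for illustration only.
from_scratch = pd.DataFrame(
    {
        "petal.length": [1.4, 1.3, 4.7, 5.1],
        "petal.width": [0.2, 0.2, 1.4, 1.9],
        "variety": ["Setosa", "Setosa", "Versicolor", "Virginica"],
    }
)

# Route 2: import existing data. StringIO stands in for a CSV file here;
# with a real file you would pass its path to pd.read_csv instead.
csv_text = from_scratch.to_csv(index=False)
from_file = pd.read_csv(io.StringIO(csv_text))

print(from_file.head())
```

Either DataFrame can then be handed to a Seaborn plot function through its data argument.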

We will start with creating scatter plots in Seaborn and then move to other types of plots in the subsequent activities.

As you may remember, scatter plots are used to show the relationship between two continuous variables, one plotted on each axis. Each point represents one observation, so the plot shows how one variable changes as the other changes across the data set.

Graphic with four graphs showing "strong, positive, linear", "moderate, negative, linear", "null/no relationship", and "moderate, negative, linear" respectively.

To draw a scatter plot in Seaborn, we use the scatterplot() function. Note that scatterplot() takes its arguments as keywords.

Some of the common arguments are as follows:

data: the Pandas DataFrame that you want to plot.
x: the name of the column in the DataFrame to source X values from.
y: the name of the column in the DataFrame to source Y values from.

For now, these arguments should suffice to draw a scatter plot. However, we will introduce you to other arguments as and when we use them.
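The three arguments listed above can be tried out on a tiny DataFrame, as in the minimal sketch below. The values are hypothetical, and the matplotlib.use("Agg") line is only an assumption that lets the snippet run without a display; in a Jupyter Notebook you can leave it out.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter

import pandas as pd
import seaborn as sns

# Hypothetical measurements, for illustration only.
df = pd.DataFrame(
    {
        "petal.length": [1.4, 1.3, 1.5, 1.7],
        "petal.width": [0.2, 0.2, 0.1, 0.4],
    }
)

# data, x and y are all passed by keyword.
ax = sns.scatterplot(data=df, x="petal.length", y="petal.width")

# scatterplot() returns the Matplotlib Axes it drew on,
# so the axis labels it chose can be inspected.
print(ax.get_xlabel(), ax.get_ylabel())
```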

Demonstration

Let’s look at an easy way to display the relationship between Iris petal length and width. Follow the steps below in the Jupyter Notebook you downloaded in the previous step.

Step 1

First, we import Pandas and Seaborn. The convention is to alias Seaborn to sns.

import seaborn as sns
import pandas as pd

Step 2

Then, we’ll read the iris.csv file again. You can extract it from the zipped folder again, if you do not have it already from the previous steps.

Code:

data = pd.read_csv("iris.csv")

Step 3

Next, we will filter the data to just the Setosa species.

Code:

setosa = data[data.variety == "Setosa"]
setosa
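To see what the filter in this step is doing, here is a small sketch on a hypothetical four-row DataFrame (standing in for iris.csv). It splits the one-line filter into two parts: the comparison, which produces a True/False value per row, and the indexing, which keeps only the True rows. Note that data.variety and data["variety"] refer to the same column; the bracket form is needed for names such as "petal.length" that are not valid Python identifiers.

```python
import pandas as pd

# Hypothetical rows standing in for iris.csv.
data = pd.DataFrame(
    {
        "petal.length": [1.4, 4.7, 1.3, 5.1],
        "variety": ["Setosa", "Versicolor", "Setosa", "Virginica"],
    }
)

# The comparison yields one True/False value per row...
mask = data["variety"] == "Setosa"

# ...and indexing with that mask keeps only the rows marked True.
setosa = data[mask]

print(mask.tolist())
print(setosa)
```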

Output:

Step 4

Finally, we plot the data by using the following code snippet.

Code:

sns.scatterplot(
    data=setosa,       # the filtered DataFrame from Step 3
    x="petal.length",  # column to use for the x-axis
    y="petal.width",   # column to use for the y-axis
)

Output:

Screenshot of the Jupyter Notebook output displaying the relationship between Iris petal length and width. The image shows a scatter plot. X-axis labelled "petal.length" reads from left to right: 1.0, 1.2, 1.4, 1.6, 1.8. Y-axis labelled "petal.width" reads from bottom to top: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6. Most of the dots are located in the region of y-axis: 0.2 to 0.4 and x-axis: 1.2 to beyond 1.6.

In this code, we simply tell the scatterplot() function to source the x-coordinates from the ‘petal.length’ column in the DataFrame and the y-coordinates from the ‘petal.width’ column.

The third variable

When we introduced you to scatter plots in Course 1, and again in this one, we mentioned that they are used to plot two variables. But if a third variable also plays a role in the comparison, Seaborn can display it automatically. How is that possible?

You can display the third variable (in the example below, the variety of the Iris species) by adding either a hue or a style to the markers on your plot.

Adding hue

Seaborn can automatically help us compare data with more than two variables. One simple way is to plot different data series in different colours (or hues, as Seaborn refers to them). To illustrate, let us draw another scatter plot, but tell Seaborn to add a new colour (hue) for each variety of Iris.

You will be using the entire DataFrame that was read from the Iris dataset CSV for this plot, and not just Setosa.

Code:

sns.scatterplot(
    data=data,         # the full DataFrame, not just Setosa
    x="petal.length",
    y="petal.width",
    hue="variety",     # one colour per Iris variety
)

Output:

As you can see, the Iris variety is treated as the third variable in this resulting plot.
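One way to confirm what hue does is to inspect the legend that Seaborn builds. The sketch below uses a hypothetical six-row DataFrame in place of iris.csv, and the matplotlib.use("Agg") line is only an assumption so that it runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter

import pandas as pd
import seaborn as sns

# Hypothetical measurements, two rows per variety, for illustration only.
data = pd.DataFrame(
    {
        "petal.length": [1.4, 1.3, 4.7, 4.5, 6.0, 5.1],
        "petal.width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
        "variety": [
            "Setosa", "Setosa",
            "Versicolor", "Versicolor",
            "Virginica", "Virginica",
        ],
    }
)

ax = sns.scatterplot(data=data, x="petal.length", y="petal.width", hue="variety")

# Seaborn adds a legend entry for each value found in the hue column.
labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(labels)
```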

Adding style

Another simple way to plot the third variable is to draw each data series with a different marker style. To illustrate, let us draw another scatter plot using the same data set, but tell Seaborn to add a new style for each variety of Iris. The code below also keeps the hue argument, so each variety differs in both colour and marker.

For this demonstration, you will again be using the entire DataFrame that was read from the Iris data set.

Code:

sns.scatterplot(
    data=data,
    x="petal.length",
    y="petal.width",
    hue="variety",    # one colour per Iris variety
    style="variety",  # one marker shape per Iris variety
)

Output:

As you can see, each Iris variety now has its own marker style as well as its own colour in the resulting plot.
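By default Seaborn picks the marker shapes itself. As an optional refinement that this step does not cover, the scatterplot() function also accepts a markers argument; the sketch below assumes you want to choose the shapes yourself and passes a dictionary from variety to Matplotlib marker code. The data values and the particular shapes are hypothetical, and the matplotlib.use("Agg") line is only there so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter

import pandas as pd
import seaborn as sns

# Hypothetical measurements, two rows per variety, for illustration only.
data = pd.DataFrame(
    {
        "petal.length": [1.4, 1.3, 4.7, 4.5, 6.0, 5.1],
        "petal.width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
        "variety": [
            "Setosa", "Setosa",
            "Versicolor", "Versicolor",
            "Virginica", "Virginica",
        ],
    }
)

# Illustrative choice of shapes: circle, square, triangle.
marker_by_variety = {"Setosa": "o", "Versicolor": "s", "Virginica": "^"}

ax = sns.scatterplot(
    data=data,
    x="petal.length",
    y="petal.width",
    hue="variety",
    style="variety",
    markers=marker_by_variety,
)

# Because hue and style use the same column, each variety appears once in the legend.
style_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(style_labels)
```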

To add more features and arguments to your plots, refer to the Seaborn documentation [1].

Did you notice?

In the previous example, Seaborn has gone ahead and labelled the axes automatically for you, which saves you having to do it with extra code. What else did you notice was different in plotting using Seaborn as compared to Matplotlib?
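For comparison, here is a minimal Matplotlib-only sketch of the same kind of plot, using hypothetical values. When the columns are passed to scatter() directly, Matplotlib leaves the axes unlabelled, so the labels have to be set with extra calls. The matplotlib.use("Agg") line is only an assumption so that the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical measurements, for illustration only.
df = pd.DataFrame(
    {
        "petal.length": [1.4, 1.3, 1.5, 1.7],
        "petal.width": [0.2, 0.2, 0.1, 0.4],
    }
)

fig, ax = plt.subplots()
ax.scatter(df["petal.length"], df["petal.width"])

# At this point the x-axis has no label (an empty string).
label_before = ax.get_xlabel()

# The labels must be added explicitly.
ax.set_xlabel("petal.length")
ax.set_ylabel("petal.width")
label_after = ax.get_xlabel()

print(repr(label_before), repr(label_after))
```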

References

[1] seaborn.scatterplot [Document]. Seaborn; [date unknown]. Available from: https://seaborn.pydata.org/generated/seaborn.scatterplot.html
