Confidence bands

Learn about Confidence Bands

Scatter plots are used to show the values of discrete data, and line plots are used to show the values of continuous data where ‘in-between’ values can be interpolated.

Similarly, error bars on point estimates show uncertainty at individual points, while confidence bands show uncertainty over a continuous range of points. For line plots, confidence bands are usually preferable. We’ve already seen Seaborn draw them automatically for us.

To draw confidence bands in Matplotlib, we make use of the Axes.fill_between method. This takes at least two arguments but to draw error bands we’ll use it with three.

  • The first argument is a sequence of x-positions.
  • The second and third are the corresponding y-positions of the lower and upper boundary curves. The area between these two curves will be filled in.

As an aside, if we call the ‘fill_between’ method with only two arguments, the second boundary defaults to y = 0, so the area between the curve and the x-axis will be filled.
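
As a minimal sketch of the difference between the two call styles, the example below fills once with two arguments and once with three. The sine data and the names x, y, lower and upper are made up for illustration and are not part of the course dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data only: one period of a sine wave
x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x)
lower = y - 0.2  # hypothetical lower boundary
upper = y + 0.2  # hypothetical upper boundary

fig, (ax_two, ax_three) = plt.subplots(1, 2, figsize=(10, 4))

# Two arguments: the second boundary defaults to 0,
# so the fill runs from the curve to y = 0
ax_two.plot(x, y)
ax_two.fill_between(x, y, alpha=0.3)
ax_two.set_title("fill_between(x, y)")

# Three arguments: the fill covers the region between lower and upper
ax_three.plot(x, y)
ax_three.fill_between(x, lower, upper, alpha=0.3)
ax_three.set_title("fill_between(x, lower, upper)")
```

In a notebook, the matplotlib.use("Agg") line can be dropped and the figure displayed by ending the cell with fig.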

Demonstration: Errors on line plots

To demonstrate this, let us bring back a dataset we used previously: the time series of temperatures in New York for the first week of June 2016.

Step 1

First, let us import the date class from the ‘datetime’ module.

Code:

from datetime import date

Step 2

We then draw the figure and axes for the plot we are intending to display.

Code:

import matplotlib.pyplot as plt  # harmless if already imported earlier in the notebook

fig, ax = plt.subplots()
fig.set_size_inches(12, 8)

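Step 3

We then read New_York_Hourly.csv into the notebook, combining the date and time columns into a single datetime column and keeping only the columns we need. We then filter the data down to the first week of June 2016.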

Code:

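import pandas as pd

# Read only the needed columns and combine "date" and "TimeEST"
# into one datetime column named "date_TimeEST"
weather_data = pd.read_csv(
    "New_York_Hourly.csv",
    parse_dates=[["date", "TimeEST"]],
    usecols=["date", "TimeEST", "TemperatureF", "Dew PointF", "Humidity"],
)
# Keep 1 June (inclusive) to 8 June (exclusive), in chronological order
june_weather = weather_data[
    (weather_data["date_TimeEST"] >= "2016-06-01")
    & (weather_data["date_TimeEST"] < "2016-06-08")
].sort_values("date_TimeEST")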
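
In recent pandas releases (2.2 onwards), combining columns through a nested parse_dates list is deprecated. One possible alternative is to read the columns as text and combine them with pd.to_datetime. The sketch below uses a few invented rows in place of the real file, and the time format (for example "12:51 AM") is an assumption about how TimeEST is stored, so the format string may need adjusting for the actual data.

```python
import io
import pandas as pd

# Hypothetical rows standing in for New_York_Hourly.csv (values are made up)
sample_csv = io.StringIO(
    "date,TimeEST,TemperatureF\n"
    "2016-05-31,11:51 PM,70.0\n"
    "2016-06-01,12:51 AM,69.1\n"
    "2016-06-07,11:51 PM,75.0\n"
    "2016-06-08,12:51 AM,74.0\n"
)
weather_data = pd.read_csv(sample_csv, usecols=["date", "TimeEST", "TemperatureF"])

# Build the combined timestamp column explicitly; the format is an assumption
weather_data["date_TimeEST"] = pd.to_datetime(
    weather_data["date"] + " " + weather_data["TimeEST"],
    format="%Y-%m-%d %I:%M %p",
)

# Keep 1 June (inclusive) to 8 June (exclusive), ordered by time
start, end = pd.Timestamp("2016-06-01"), pd.Timestamp("2016-06-08")
in_first_week = (weather_data["date_TimeEST"] >= start) & (weather_data["date_TimeEST"] < end)
june_weather = weather_data[in_first_week].sort_values("date_TimeEST")
```

The resulting june_weather has the same date_TimeEST column name as before, so the plotting steps work unchanged.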

Step 4

We don’t have real measurement errors for this dataset, so we’ll generate some by assuming a ±5% error on every data point.

Code:

error_min = june_weather["TemperatureF"] * 0.95
error_max = june_weather["TemperatureF"] * 1.05
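
Because the bounds are a percentage of each reading, the band does not have a fixed width: it spans 10% of the temperature at every point, so it is slightly wider where the line is higher. A small sketch with made-up readings (the temps series is illustrative, not the New York data):

```python
import pandas as pd

# Hypothetical temperature readings in °F
temps = pd.Series([60.0, 70.0, 80.0])

# Same ±5% rule as in the step above
error_min = temps * 0.95
error_max = temps * 1.05

# Width of the band at each point: 10% of the reading
band_width = error_max - error_min
print(band_width.round(2).tolist())  # [6.0, 7.0, 8.0]
```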

Step 5

We plot the figure so far with the code snippet below.

Code:

ax.plot(june_weather["date_TimeEST"], june_weather["TemperatureF"])
ax.set_xlabel("Date")
ax.set_ylabel("Temperature (°F)")
fig

Output:

Screenshot of a line chart before the confidence band is added. The y-axis is labelled "Temperature (°F)" and runs from 65 to 85. The x-axis is labelled "Date" and runs from 2016-06-01 to 2016-06-08. A single erratic zigzag blue line starts just below 75 on 2016-06-01 and ends between 75 and 80 on 2016-06-08.

Step 6

Then, we fill the area between those two bounds (error_min and error_max), using the x-values from the original data (the date and time).

Code:

ax.fill_between(june_weather["date_TimeEST"], error_min, error_max, color="red", alpha=0.1)

The colour of the band is set with the color argument, and we set 10% opacity by setting the alpha to 0.1.

Output:

Screenshot of the same line chart with a confidence band drawn using the fill_between method. The y-axis is labelled "Temperature (°F)" and runs from 65 to 85. The x-axis is labelled "Date" and runs from 2016-06-01 to 2016-06-08. The erratic zigzag blue line now sits on top of a thicker pink band. The line starts just below 75 on 2016-06-01 and ends between 75 and 80 on 2016-06-08.

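In the output, the pale red band around the line shows the assumed ±5% range for each reading. Because the error here is a fixed percentage rather than a measured quantity, the band illustrates the technique and not a real confidence level.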
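
If the CSV file is not to hand, the whole demonstration can be approximated with synthetic data. The sketch below is only a stand-in under stated assumptions: the hourly temperatures are generated from a sine wave and are not taken from the New York dataset, and the names hours and temperature are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in: one timestamp per hour for 1-7 June 2016 (7 * 24 points)
hours = pd.date_range("2016-06-01", periods=7 * 24, freq=pd.Timedelta(hours=1))

# Made-up daily cycle around 72 °F; not real weather data
hour_index = np.arange(len(hours))
temperature = 72 + 8 * np.sin(2 * np.pi * (hour_index - 9) / 24)

# Assumed ±5% error on every point, as in Step 4
error_min = temperature * 0.95
error_max = temperature * 1.05

fig, ax = plt.subplots(figsize=(12, 8))
ax.plot(hours, temperature)
# Shade the region between the lower and upper bounds at 10% opacity
ax.fill_between(hours, error_min, error_max, color="red", alpha=0.1)
ax.set_xlabel("Date")
ax.set_ylabel("Temperature (°F)")
```

In a notebook, the matplotlib.use("Agg") line can be removed and the figure shown by ending the cell with fig.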

There are more ways to use the fill_between method, and you can read about them in the official documentation:

Read: fill_between documentation [1]

Reflect and share

Here you wrote a program using the fill_between method on a line plot. Can you think of examples or use cases where you might want to use this method in a combined plot of both scatter and line?

References

  1. matplotlib.pyplot.fill_between [Document]. Matplotlib; 2020. Available from: https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.fill_between.html

This article is from the free online course Data Visualisation with Python: Seaborn and Scatter Plots

Created by
FutureLearn - Learning For Life
