# Highlighting data

Sometimes you might want to draw attention to selected data points on a plot by giving them a different colour from the rest. Other times, you might want to present data points in different colours and annotate them with text.

Here, we’ll use Matplotlib in Python to demonstrate how to highlight selected data points on different plots with a different colour.

Let’s see how we can highlight data in our plots, to draw attention to it; this is a way of using the preattentive attributes of colour.

There are three different types of plots that we will look into for highlighting data.

1. Scatter plot
2. Line plot
3. Bar plot

## Scatter plots

A scatter plot (also known as a scatter diagram or x-y graph) is a type of data visualisation that shows the relationship between two variables. Each observation is drawn as a point positioned by its values on the x- and y-axes, and the resulting cloud of points looks scattered, which gives this type of visualisation its name.

### When should you use a scatter plot?

Scatter plots are generally used to observe and show relationships between two numeric variables. The dots in a scatter plot describe the values of individual data points and patterns when the data are taken as a whole. A scatter plot should be used to:

• identify correlations between variables
• identify patterns in data
• analyse unexpected gaps in the data
• identify if there are any outlier points.

### Demonstration: Highlighting scatter plots

Let’s look at how to highlight particular data points in a scatter plot. For this example, we will return to the Iris data set. Follow the steps to proceed.

#### Step 1

First, import the Pandas and Matplotlib libraries.

Code:

    import pandas as pd
    import matplotlib.pyplot as plt

#### Step 2

Then use the following code to load the Iris data set into a Pandas DataFrame. You might already have extracted the file from the zipped folder.

Code:

iris_data = pd.read_csv("iris.csv")
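Before going further, it can help to confirm that the file has the columns the later steps rely on. The sketch below assumes the CSV uses the column names `petal.length`, `petal.width`, and `variety`, as the code in this demonstration does. The inline rows and the names `sample_csv`, `sample_data`, and `required_columns` are illustrative stand-ins so that the snippet runs without the real file.

```python
import io

import pandas as pd

# Stand-in for iris.csv: three sample rows with the column names
# that the rest of this demonstration assumes.
sample_csv = io.StringIO(
    "sepal.length,sepal.width,petal.length,petal.width,variety\n"
    "5.1,3.5,1.4,0.2,Setosa\n"
    "7.0,3.2,4.7,1.4,Versicolor\n"
    "6.3,3.3,6.0,2.5,Virginica\n"
)
sample_data = pd.read_csv(sample_csv)

# Check that the columns used for plotting and filtering are present.
required_columns = {"petal.length", "petal.width", "variety"}
missing_columns = required_columns - set(sample_data.columns)

print(sample_data.head())
print("Missing columns:", sorted(missing_columns))
```

With the real file, the same check can be run on `iris_data` instead of `sample_data`. If your copy of the data set uses different column names, the later steps would need adjusting to match.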

#### Step 3

Next, split the Iris data into a separate DataFrame for each variety: Versicolor, Setosa, and Virginica.

Code:

    versicolor = iris_data[iris_data.variety == "Versicolor"]
    setosa = iris_data[iris_data.variety == "Setosa"]
    virginica = iris_data[iris_data.variety == "Virginica"]

#### Step 4

Create the figure and axes, then set the figure size. Here we will use eight by eight inches.

Code:

    fig, ax = plt.subplots()
    fig.set_size_inches(8, 8)

Output:

The figure shows a blank plot as the result of the code.

#### Step 5

Now, let’s plot petal length against petal width for each variety in its own colour, and add axis labels, a title, and a legend.

Code:

    ax.scatter(versicolor["petal.length"], versicolor["petal.width"], marker="x", label="Versicolor", facecolor="green")
    ax.scatter(setosa["petal.length"], setosa["petal.width"], label="Setosa", marker="x", facecolor="blue")
    ax.scatter(virginica["petal.length"], virginica["petal.width"], label="Virginica", marker="x", facecolor="red")
    ax.set_xlabel("Petal Length (cm)")
    ax.set_ylabel("Petal Width (cm)")
    ax.set_title("Iris Petal Sizes")
    ax.legend()

Output:

The figure shows petal length and width and species highlighted with different colour markers as the result of the code.

#### Step 6

Highlight a specific variety of Iris. Set the facecolor of the other species to lightgrey and highlight the Versicolor data points.

Code:

    ax.scatter(versicolor["petal.length"], versicolor["petal.width"], marker="x", label="Versicolor", facecolor="green")
    ax.scatter(setosa["petal.length"], setosa["petal.width"], label="Setosa", marker="x", facecolor="lightgrey")
    ax.scatter(virginica["petal.length"], virginica["petal.width"], label="Virginica", marker="x", facecolor="lightgrey")
    fig

Output:

The figure shows petal length and width and Versicolor species highlighted with green markers as the result of the code.
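If you want to switch the highlighted variety without repeating the plotting calls, the same idea can be wrapped in a function. The following is a minimal sketch rather than part of the original demonstration: `plot_highlighted` is a hypothetical helper name, the small `sample` DataFrame stands in for the full Iris data, and the `Agg` backend is chosen only so that the snippet runs without a display. It draws on a fresh set of axes each time, so the legend colours match the markers.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Small stand-in for the Iris data (not the full data set).
sample = pd.DataFrame(
    {
        "petal.length": [1.4, 1.5, 4.7, 4.5, 6.0, 5.1],
        "petal.width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
        "variety": [
            "Setosa", "Setosa",
            "Versicolor", "Versicolor",
            "Virginica", "Virginica",
        ],
    }
)


def plot_highlighted(data, highlight, colour="green"):
    """Hypothetical helper: draw every variety in light grey except `highlight`."""
    fig, ax = plt.subplots(figsize=(8, 8))
    # groupby yields one (variety name, rows) pair per variety.
    for variety, group in data.groupby("variety"):
        group_colour = colour if variety == highlight else "lightgrey"
        ax.scatter(
            group["petal.length"],
            group["petal.width"],
            marker="x",
            label=variety,
            color=group_colour,
        )
    ax.set_xlabel("Petal Length (cm)")
    ax.set_ylabel("Petal Width (cm)")
    ax.set_title("Iris Petal Sizes")
    ax.legend()
    return fig, ax


fig, ax = plot_highlighted(sample, "Versicolor")
```

Calling `plot_highlighted(sample, "Virginica", colour="red")` would produce the same plot with Virginica highlighted instead.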

#### Step 7

Now, let’s highlight specific points in the Iris data set. Most points in the Iris plot clearly appear to belong to a specific variety; however, there are a few points between Versicolor and Virginica that could belong to either of those varieties.

To highlight just these points, we need to first identify them and put them into their own DataFrames. Here is how the points were selected for the next example:

Code:

    vs_overlaps = versicolor[versicolor["petal.length"] > 4.9]
    vg_overlaps = virginica[(virginica["petal.width"] < 1.75) & (virginica["petal.length"] < 5.2)]

(If the Versicolor petal length is greater than 4.9 cm, we consider the point an outlier; if the Virginica petal length is less than 5.2 cm and the petal width is less than 1.75 cm, we consider that point an outlier too.)
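To see how these boolean filters behave, here is a small sketch on a handful of stand-in rows (not the full data set). The names `versicolor_sample`, `virginica_sample`, `vs_mask`, and `vg_mask` are illustrative assumptions; the thresholds are the ones used above.

```python
import pandas as pd

# Stand-in rows (not the full Iris data set).
versicolor_sample = pd.DataFrame(
    {"petal.length": [4.7, 4.5, 5.0, 5.1], "petal.width": [1.4, 1.5, 1.7, 1.6]}
)
virginica_sample = pd.DataFrame(
    {"petal.length": [6.0, 5.1, 5.0, 4.5], "petal.width": [2.5, 1.9, 1.5, 1.7]}
)

# Single condition: Versicolor rows with a petal length above 4.9.
vs_mask = versicolor_sample["petal.length"] > 4.9

# Two conditions combined with &. Each comparison needs its own
# parentheses because & binds more tightly than < and >.
vg_mask = (virginica_sample["petal.width"] < 1.75) & (
    virginica_sample["petal.length"] < 5.2
)

# Indexing with a boolean mask keeps only the rows where it is True.
vs_overlaps_sample = versicolor_sample[vs_mask]
vg_overlaps_sample = virginica_sample[vg_mask]

print(len(vs_overlaps_sample), "Versicolor rows selected")
print(len(vg_overlaps_sample), "Virginica rows selected")
```

Each mask is a Series of `True`/`False` values, one per row, so you can inspect it directly (for example with `vs_mask.sum()`) to check how many points a threshold selects before plotting.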

#### Step 8

Then, we’ll draw the entire data set in light grey on a new figure. This will look familiar:

Code:

    fig, ax = plt.subplots()
    fig.set_size_inches(8, 8)
    ax.scatter(versicolor["petal.length"], versicolor["petal.width"], marker="x", facecolor="lightgrey")
    ax.scatter(setosa["petal.length"], setosa["petal.width"], label="Setosa", marker="x", facecolor="lightgrey")
    ax.scatter(virginica["petal.length"], virginica["petal.width"], marker="x", facecolor="lightgrey")

Output:

The figure shows the entire data set as light grey as the result of the code.

Note: The Versicolor and Virginica points are not labelled here. This is so that they do not appear on the legend twice.

#### Step 9

Next, we plot the filtered points. Since we are plotting these after the other points have been plotted, they will be placed on top of the existing grey points (i.e. we do not have to remove the filtered points from the original DataFrames).

Code:

    ax.scatter(vs_overlaps["petal.length"], vs_overlaps["petal.width"], label="Versicolor", marker="x", facecolor="green")
    ax.scatter(vg_overlaps["petal.length"], vg_overlaps["petal.width"], label="Virginica", marker="x", facecolor="red")

#### Step 10

And lastly, add the axis labels and title, then add the legend so that the labelled data sets appear on it with the correct colours.

Code:

    ax.set_xlabel("Petal Length (cm)")
    ax.set_ylabel("Petal Width (cm)")
    ax.set_title("Iris Petal Sizes")
    ax.legend()

Output:

The figure shows the filtered points and labels as the result of the code.
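As mentioned at the start, highlighted points can also be annotated with text. The sketch below is one possible way to do that with Matplotlib's `ax.annotate`, not part of the original demonstration. It uses a few stand-in rows instead of the full data set, the same Versicolor threshold as Step 7, and illustrative names (`points`, `highlighted`, and the legend label "Versicolor overlap"). The `Agg` backend is chosen only so that the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in rows (not the full Iris data set).
points = pd.DataFrame(
    {
        "petal.length": [4.7, 5.0, 5.1, 6.0, 5.0],
        "petal.width": [1.4, 1.7, 1.6, 2.5, 1.5],
        "variety": ["Versicolor", "Versicolor", "Versicolor", "Virginica", "Virginica"],
    }
)

# Same Versicolor threshold as in Step 7.
highlighted = points[
    (points["variety"] == "Versicolor") & (points["petal.length"] > 4.9)
]

fig, ax = plt.subplots(figsize=(8, 8))

# Background layer: every point in light grey, with no legend label.
ax.scatter(points["petal.length"], points["petal.width"], marker="x", color="lightgrey")

# Highlight layer: drawn second, so it sits on top of the grey markers.
ax.scatter(
    highlighted["petal.length"],
    highlighted["petal.width"],
    marker="x",
    color="green",
    label="Versicolor overlap",
)

# Place a small text label just above and to the right of each highlighted point.
for _, row in highlighted.iterrows():
    ax.annotate(
        f"({row['petal.length']}, {row['petal.width']})",
        xy=(row["petal.length"], row["petal.width"]),
        xytext=(5, 5),
        textcoords="offset points",
    )

ax.set_xlabel("Petal Length (cm)")
ax.set_ylabel("Petal Width (cm)")
ax.set_title("Iris Petal Sizes")
ax.legend()
```

Using `textcoords="offset points"` keeps each label a fixed distance from its marker regardless of the axis scale, which tends to be easier to read than placing text in data coordinates.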

Next, we have line plots and bar plots.

## Share with us!

Which plot did you find most interesting to highlight? Why?

Share your thoughts in the comment section below.
