A look at HOG image feature extraction, and how to use it in Scikit-Image

In the previous video we looked at Histogram of Oriented Gradients, or HOG, a common method of extracting edge information as features from image data.

This article gives a quick review of the HOG process and demonstrates how to use it in the software package scikit-image. The calculation is explained here in a simplified form to aid your understanding; in practice, we would use a function in a library to calculate the features for us.

Broadly speaking, there are three main steps to the extraction of HOG features, plus an optional fourth.

### Step 1: Horizontal and vertical edge detection

The first step is to calculate the magnitude of the edges in both the horizontal and vertical direction for every pixel in the image. This is done by applying a simple derivative filter to each pixel in the two directions. Horizontally:

[-1, 0, 1]

and vertically:

[-1, 0, 1] transposed (the same filter, running from top to bottom)

If you aren’t familiar with image filters, all this is saying is subtract the value of the pixel to the left of each pixel from the one to the right (for the horizontal filter), and the value of the pixel above each pixel from the one below it (for the vertical filter).

There are more complex and robust filters for picking out edges, but these simple approaches will suffice for now. For more on kernels and filters see the preceding course in this series Introduction to Image Analysis for Plant Phenotyping, if you haven’t already.
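As a sketch of this step, the two derivative filters can be applied with simple NumPy array slicing. The small array here is a made-up stand-in for an image, with a single vertical edge:

```python
import numpy as np

# A small synthetic grayscale "image" (values 0-255) with a vertical edge
img = np.array([[0, 0, 255, 255],
                [0, 0, 255, 255],
                [0, 0, 255, 255],
                [0, 0, 255, 255]], dtype=float)

# Horizontal filter: pixel to the right minus pixel to the left ([-1, 0, 1])
gx = np.zeros_like(img)
gx[:, 1:-1] = img[:, 2:] - img[:, :-2]

# Vertical filter: pixel below minus pixel above ([-1, 0, 1] transposed)
gy = np.zeros_like(img)
gy[1:-1, :] = img[2:, :] - img[:-2, :]

print(gx[0])  # strong response either side of the vertical edge: [0. 255. 255. 0.]
print(gy[:, 0])  # no change down the columns, so all zeros
```

The border pixels are left at zero here for simplicity; a library implementation handles the edges of the image more carefully.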

### Step 2: Calculation of magnitude and gradient / direction of edges

At this point we now have twice as much data as we started with – the filtered image with edge detection in both the horizontal and vertical directions. You can think of this as a pair of (x, y) components for each pixel, giving the strength of the edge measured in the two directions (x) and (y). The next step is to convert these (x) and (y) components into a magnitude and gradient (or direction) of the edge detected for every pixel.

We don’t need to go into the details here, but this is just a bit of trigonometry. We just need to calculate the length and the angle of the longest side (the hypotenuse) of the right-angled triangle whose other two sides are (x) and (y):

magnitude = √(x² + y²), direction = arctan(y / x)

Don’t worry too much about the calculation here, in practice this is all looked after by the software, it’s the concept of using the magnitude and direction of each edge that is important.
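For the curious, the calculation for a single pixel looks like this (a sketch using NumPy, with made-up gradient values):

```python
import numpy as np

# Horizontal and vertical gradient components for one pixel (made-up values)
gx, gy = 3.0, 4.0

magnitude = np.hypot(gx, gy)            # length of the hypotenuse: sqrt(gx² + gy²)
angle = np.degrees(np.arctan2(gy, gx))  # direction of the edge, in degrees

print(magnitude)  # 5.0
print(angle)      # ~53.13
```

Using `np.arctan2` rather than a plain arctangent avoids division by zero when the horizontal component is zero.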

### Step 3: Collating data in cells to make histograms

We now have a measured estimate of the magnitude and direction of the edge for every pixel in the image – effectively still twice as much data as we started with.

To make it a useful feature extraction method we need to reduce the amount of data from the raw pixel data. As explained in the video, this is where taking cells (grids) across the image, and making histograms of the data in those cells comes in.

We divide the image up into cells of a fixed size, and choose a fixed set of equally spaced angles as bins for the histogram. So for example you might pick 16 x 16 pixel cells and 4 angles, and HoG will make an estimate of the magnitude of the edge at each of the 4 orientations, as in the example pictured below.

An example HoG visualisation. Each set of lines represents a cell of pixels, and the grayscale intensity of each line represents the magnitude of the edge at that angle orientation. Note lines only exist at certain angles – this reflects the bins used to build the histogram. With more bins, more lines would be present.

The magnitude estimate is made by taking each pixel in the cell and distributing the value of the magnitude for that pixel (calculated in step two) between the two histogram bins nearest the gradient for that pixel (also calculated in step two).

This is weighted so that if the gradient is equally placed between two histogram bins, the magnitude value is divided equally between the two bins, but if it is much closer to one bin than the other, most of the magnitude is placed in the nearest bin.

At the end of this step, for every cell, we now have a set of numbers, one for each of the angles we choose to use. So if, for example, our original image is 1600 x 1600 pixels, and we use 16 x 16 pixel cells for HoG, and 4 angles, we now have 100 x 100 x 4 = 40,000 extracted features representing our image. Still quite a lot, but a lot less than the over two and a half million pixels in the original image.
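To illustrate the weighted binning described above, here is a hypothetical helper function (not part of scikit-image) that splits each pixel's magnitude between its two nearest orientation bins:

```python
import numpy as np

def cell_histogram(magnitudes, angles, n_bins=4):
    """Distribute each pixel's magnitude between the two nearest
    orientation bins, weighted by angular distance. Angles are
    unsigned gradients over [0, 180) degrees."""
    bin_width = 180.0 / n_bins
    hist = np.zeros(n_bins)
    for mag, ang in zip(magnitudes, angles):
        pos = (ang % 180.0) / bin_width      # position measured in "bin units"
        lo = int(np.floor(pos)) % n_bins     # nearest bin at or below the angle
        hi = (lo + 1) % n_bins               # nearest bin above (wraps around)
        frac = pos - np.floor(pos)           # closeness to the upper bin
        hist[lo] += mag * (1.0 - frac)
        hist[hi] += mag * frac
    return hist

# A pixel whose gradient sits exactly between the 0° and 45° bins:
print(cell_histogram([10.0], [22.5]))  # magnitude split equally: [5. 5. 0. 0.]
```

A pixel whose gradient lands exactly on a bin centre contributes its whole magnitude to that single bin.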

### Step 4 (optional): Normalise in blocks

A final optional step is to normalise the edge magnitudes between cells within so-called blocks of the image. This is done so that HoG is less sensitive to variations in light across the image.
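As a sketch of one common normalisation scheme (plain L2 normalisation, with made-up histogram values for a single block):

```python
import numpy as np

# Hypothetical orientation histograms for the cells in one block, flattened
block = np.array([8.0, 2.0, 0.0, 1.0])

# L2 normalisation; the small epsilon avoids division by zero for empty blocks
eps = 1e-6
normalised = block / np.sqrt(np.sum(block ** 2) + eps ** 2)

print(np.round(normalised, 3))
```

After this step the histogram values describe the relative strengths of the edges in the block, rather than their absolute brightness, which is what makes the features more robust to lighting changes.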

## HoG in scikit-image

The implementation of HoG we will use in the practical later this week is in the package scikit-image. It comes with Anaconda so you may already have it installed, but otherwise you just need the usual command:

pip install scikit-image

Then to import the HOG function it's just:

from skimage.feature import hog

Then to use it, assuming we have some image named img in memory, for a grayscale image we can use the following:

hog_features, hog_image = hog(img, orientations=4, pixels_per_cell=(16, 16), cells_per_block=(1, 1), visualize=True)

The function takes the keyword parameters orientations, which is the number of directions you want for each cell; pixels_per_cell, which, as the name suggests, is the size of each cell, provided as a tuple; and cells_per_block, which controls the optional normalisation mentioned in step 4 above.

Here we have set it to (1, 1) so that each cell is considered on its own. The other parameter, visualize, produces an output image for visualisation purposes if it is set to True.

This function then outputs the feature data itself and the image for visualisation. Here we have named them hog_features and hog_image respectively. We’ll demonstrate the use of the HoG function in scikit-image in more detail in the practical later on this week.
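To tie things together, here is a short end-to-end sketch using one of scikit-image's bundled sample images (assuming scikit-image is installed):

```python
from skimage.data import camera
from skimage.feature import hog

img = camera()  # a built-in 512 x 512 grayscale sample image

hog_features, hog_image = hog(img, orientations=4,
                              pixels_per_cell=(16, 16),
                              cells_per_block=(1, 1),
                              visualize=True)

# 512 / 16 = 32 cells per side, 4 orientations per cell
print(hog_features.shape)  # (4096,) i.e. 32 x 32 x 4 features
print(hog_image.shape)     # same size as the input: (512, 512)
```

Note how the count of extracted features follows the same arithmetic as the 1600 x 1600 example above: (image size ÷ cell size)² × number of orientations.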