
Annotation tools

An overview of software tools for image annotation, with links.
Figure: A screenshot of the Fiji application, showing the menu bar, an image of some plant roots on a blue background with the root tips selected, and an image of a leaf with the outline partially drawn around it.

In order to train machine learning models on image data, we often need to label regions of images.

To do this, we could use simple image-painting tools to “colour in” the regions we are interested in. In practice, however, this is such a common task that dedicated tools have been built to help with the process of annotating image data.

In this article, we’ll discuss some useful features of annotation tools, and provide a few links to commonly used tools. Describing particular tools in detail is, we feel, of limited use, as the tools themselves update and change over time. Therefore, the best place to find out details of the tools, and be informed of any updates or changes, is to read the features and documentation on their respective websites.

Some of the tools are entirely manual, providing different selection tools to help you label. Other tools may be semi-automatic. For example, as you colour in a region, some tools can identify the features of the region you are colouring and extend the label for you automatically. These tools can greatly decrease the time spent labelling regions, but must be monitored to make sure they are performing as you expect.

The kinds of labelling that are commonly applied to images include the following, in approximate order of increasing manual effort to perform.

Centres of objects

Sometimes marking just the centre of an object might be enough. For example, marking the centre of seeds scattered across a tray would record their positions, as well as indirectly providing a count of the seeds. Using such annotations, a learning system would typically examine image features in a small region around each point, with the region size dependent on the size of the object being annotated.
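The idea above can be sketched in a few lines of Python. The seed coordinates and the window size are made-up values for illustration only; a real system would read these from an annotation file.

```python
# A minimal sketch of point ("centre") annotations, using
# hypothetical seed centres in (x, y) pixel coordinates.
seed_centres = [(120, 45), (310, 52), (98, 210), (405, 198)]

# The object count falls out of the annotations for free.
print(f"Number of seeds: {len(seed_centres)}")

# A learning system would typically crop a small window around each
# point; the half-width below is an assumed value that would depend
# on the expected size of the object.
window = 32
for x, y in seed_centres:
    left, top = x - window, y - window
    right, bottom = x + window, y + window
    # (left, top, right, bottom) defines the crop region for this seed
```

Note that the point annotation alone says nothing about the object's extent; the window size has to come from prior knowledge of the objects being annotated.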

Bounding boxes

Drawing a rectangle to enclose the item of interest is a common method of annotation; it is popular because it provides a good balance between speed and informativeness. It is quick to mark with a simple drag of the mouse, yet reveals information about location, count, size of area, and the approximate pixels of the object. The downside is that some of the enclosed pixels are likely to be background pixels.
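As a sketch of the common (x, y, width, height) convention for bounding boxes, where (x, y) is the upper-left corner, the snippet below uses made-up pixel values:

```python
# One bounding box in (x, y, w, h) form; values are illustrative only.
box = (50, 30, 120, 80)

x, y, w, h = box
x2, y2 = x + w, y + h  # lower-right corner
area = w * h           # size of the enclosed region in pixels
print(f"Corners: ({x}, {y}) to ({x2}, {y2}), area {area} px")
```

Other tools store the two opposite corners instead of width and height; converting between the two conventions is a simple addition or subtraction, as above.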

Areas of an image

We normally mark regions of an image that we want to learn to classify or segment using labels. At its simplest, this is in the form of coloured regions on an image, coloured according to the label we are applying. This really can be as simple as painting image regions in a certain colour, each colour representing one label category. For example, we may have a ‘disease’ label which we want to ‘paint’ over areas of the plant which are diseased, with the rest of the plant acquiring a ‘healthy’ label by default. Some annotation tools really help here with the setting of particular labels for chosen colours.
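A label mask of this kind can be represented as an image-sized grid of integers, one label per pixel. The tiny grid and label values below are an illustrative assumption, not a format any particular tool uses:

```python
# A minimal sketch of "painting" a label mask, assuming a tiny 4x6
# image where each pixel holds an integer label:
# 0 = healthy (the default), 1 = disease.
HEALTHY, DISEASE = 0, 1

height, width = 4, 6
mask = [[HEALTHY] * width for _ in range(height)]

# Paint a diseased patch by setting those pixels' labels.
for row in range(1, 3):
    for col in range(2, 5):
        mask[row][col] = DISEASE

diseased_pixels = sum(row.count(DISEASE) for row in mask)
print(f"Diseased pixels: {diseased_pixels} of {height * width}")
```

In an annotation tool, each integer label is typically shown as a distinct colour overlaid on the original image, which is exactly the "painting" described above.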

Fine-grained vs. coarse annotations

Fine-grained annotations are where we mark individual regions in an image that comprise details of an object – for example we might mark the individual parts which make up a flower (petals, stigma, stamen etc.). This is in contrast to applying one coarse label of ‘flower’ collectively to all the components.

Crowd-sourcing annotations

As well as performing the above annotation tasks manually, sometimes we repeat the task amongst multiple annotators. This can give us confidence in the quality of an annotation set (if two people mark 35 seeds and one person marks 45, we can place more confidence in the 35 count). It can also reveal the areas in an image we are most confident contain pixels of interest, such as where all the annotated pixel regions overlap (called the intersection of the masks).
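Combining masks from several annotators can be sketched as below, using three hypothetical binary masks over a one-dimensional strip of eight pixels for brevity (1 marks a pixel of interest):

```python
# Made-up binary masks from three annotators over the same 8 pixels.
annotator_a = [0, 1, 1, 1, 0, 0, 1, 0]
annotator_b = [0, 0, 1, 1, 1, 0, 1, 0]
annotator_c = [0, 1, 1, 0, 0, 0, 1, 0]

# Pixels where ALL annotators agree: the intersection of the masks.
intersection = [min(vals) for vals in zip(annotator_a, annotator_b, annotator_c)]

# Pixels marked by ANY annotator: the union of the masks.
union = [max(vals) for vals in zip(annotator_a, annotator_b, annotator_c)]

print(intersection)  # high-confidence pixels
print(union)         # everything at least one annotator marked
```

The intersection gives a conservative, high-confidence mask, while the union gives a generous one; per-pixel vote counts between these two extremes can also serve as a soft confidence map.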

The extreme form of this is crowd-sourcing annotations. This involves asking members of the public to annotate images, normally via a web interface annotation tool. This also includes a tutorial phase to teach the annotators, as they will not be experts in the topic. An example of this approach is the Zooniverse citizen science projects linked below. This can be a fast and relatively cheap way to acquire annotations, but because it is carried out by non-experts, often for free, the quality can vary a lot. The workaround for this is simply annotating each image many times, and using statistics about the annotations to help weed out poor quality ones.

It is also possible today to pay professional annotation companies to annotate data for you. As this is fast and has a prescribed level of accuracy (depending on cost!), this is a popular option with industry.

How are the annotations stored?

Once annotated, the annotation data can be stored in a variety of ways. Region annotations might be marked in pixel mask images (see Figure). Centre annotations (sometimes called point annotations) are likely stored as sets of x, y coordinates in text files. Bounding boxes can be stored in a similar way, with one coordinate marking the upper left of the box (for example), and two more values representing the width and height respectively.
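As an illustration of how simply point annotations can be stored, the sketch below writes coordinates as comma-separated text using Python's standard csv module; the column layout is an assumption, and an in-memory buffer stands in for a file on disk:

```python
# Sketch of storing point and bounding-box annotations as plain text.
import csv
import io

points = [(120, 45), (310, 52)]                  # centre annotations
boxes = [(50, 30, 120, 80), (200, 10, 60, 60)]   # x, y, w, h boxes

buffer = io.StringIO()  # stands in for an annotation file on disk
writer = csv.writer(buffer)
writer.writerow(["x", "y"])
writer.writerows(points)

print(buffer.getvalue())
```

Bounding boxes can be written the same way with four columns instead of two, while region masks are usually saved as separate mask images rather than text.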

Figure: Example pixel mask image, first as a binary mask (top left), then with different colours representing different regions in the image (a fine-grained instance segmentation, top right). Original image bottom left.
This article is from the free online course Machine Learning for Image Data, created by FutureLearn.
