Summary

Here is a quick recap of the material covered this week, which has focused on data and features. This week we have looked at: types of data and features; feature …

Filling in missing data

In some cases it may be useful to fill in gaps in data. A constant such as zero can be used, or often the mean of the feature. However, this …
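As a minimal sketch of the two options mentioned above (a constant such as zero, or the mean of the feature), the following uses only the Python standard library. The function name `fill_missing`, its parameters, and the sample `heights` values are assumptions made for illustration, not part of the course material.

```python
import math
from statistics import mean


def fill_missing(values, strategy="mean", constant=0.0):
    """Return a copy of `values` with NaN entries replaced.

    `strategy` is a hypothetical switch: "mean" uses the mean of the
    values that are present, "constant" uses `constant`.
    """
    present = [v for v in values if not math.isnan(v)]
    if strategy == "mean":
        if not present:
            raise ValueError("cannot compute a mean when every value is missing")
        fill = mean(present)
    elif strategy == "constant":
        fill = constant
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if math.isnan(v) else v for v in values]


# Illustrative feature column with two gaps.
heights = [1.5, math.nan, 2.5, math.nan]
filled_mean = fill_missing(heights)
filled_zero = fill_missing(heights, strategy="constant")
```

Here the mean is computed only from the values that are present, so the gaps themselves do not distort the fill value.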

NaN sentinel items

Sentinel items are specific sets of characters that are used to indicate the absence of data. A sentinel could appear as NaN (not a number) or NA (not available) depending on …
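One way to handle sentinels when reading raw text values is to map every recognised spelling to a single missing-value marker. The sketch below assumes a small, hypothetical set of spellings (`SENTINELS`) and a hypothetical helper `parse_value`; real datasets may use other markers.

```python
import math

# Assumed sentinel spellings (compared case-insensitively); adjust per dataset.
SENTINELS = {"nan", "na", "n/a", ""}


def parse_value(text):
    """Convert a raw text field to a float, mapping sentinels to NaN."""
    cleaned = text.strip()
    if cleaned.lower() in SENTINELS:
        return math.nan
    return float(cleaned)


# Illustrative raw column as it might appear in a text file.
raw = ["3.2", "NaN", "NA", " 4.8 "]
parsed = [parse_value(item) for item in raw]

# NaN is not equal to itself, so use math.isnan rather than == to find gaps.
missing_mask = [math.isnan(v) for v in parsed]
```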

Missing data

This video is a continuation of our discussion on missing data. Often, datasets are missing data for one or more features for some examples, due to human or technical error. …

Pre-processing data

This video gives an introduction to the next activity, looking at missing data. In this activity, the primary focus is on measured or derived features, rather than pixel data. In …

Haar-like features

Haar-like features use filters to detect regions containing areas of different image contrast (brightness and darkness in an image). They were first designed for use in face detection, and …
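To make the idea of a contrast filter concrete, here is a rough sketch of one two-rectangle Haar-like feature: the sum of pixel values in the left half of a window minus the sum in the right half. It uses an integral image (summed-area table), the usual trick for computing rectangle sums quickly, although the text above does not say which method the course uses. The function names and the tiny sample image are assumptions for illustration.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_horizontal(ii, top, left, height, width):
    """Left half minus right half of the window; `width` should be even."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Illustrative 2x4 greyscale patch: bright on the left, dark on the right.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
edge_response = two_rect_horizontal(ii, 0, 0, 2, 4)
```

A large positive or negative response indicates a strong brightness difference between the two halves; a value near zero indicates little contrast.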

Quality of annotations

This is the final part of our series of videos looking at labelling image data. In this last video, we discuss the importance of quality labelling and annotation to machine …

Filenames and folder structure

This is part four of our series of videos looking at labelling image data. This video provides an overview of how you might label images by using filenames and folder …
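As a small sketch of labelling by folder structure, the snippet below assumes a hypothetical layout in which each image sits in a folder named after its class (for example `dataset/cat/img_001.jpg`). The paths and the helper `label_from_path` are invented for illustration.

```python
from pathlib import PurePosixPath


def label_from_path(path):
    """Use the name of the image's parent folder as its label."""
    return PurePosixPath(path).parent.name


# Hypothetical file paths; no files are read from disk.
paths = ["dataset/cat/img_001.jpg", "dataset/dog/img_002.jpg"]
labels = [label_from_path(p) for p in paths]
```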

Shapes and polygons

This is the third in our series of videos looking at labelling image data. In this video, we look at annotating the shape of objects more accurately using polygons, and …

Points and bounding boxes

This video is a continuation of our series of videos looking at labelling image data. In this video, we look at labelling objects within images using points and bounding boxes.

Why do we label data?

Often, additional labelling needs to be added to images before using them to train machine learning models. This labelling provides information that the machine learning system can use to learn. …

Extracting features from images

Image data can consist of millions of pixels, which are in turn represented by millions of numbers. To reduce that amount of data into a smaller set of useful features, …

Classes of features

This video gives an overview of the main classes of data you might use in machine learning. These include: numerical; Boolean (true or false); frequency of appearance (e.g. words in …