# Perceptrons to neural networks to deep learning

A video discussing how perceptron models are combined in neural networks and lead to deep learning approaches.

All deep learning systems are made up of layers of much simpler ‘building blocks’.

In this article we will briefly go through some of the component parts.

## Perceptron

A perceptron is just about the simplest form of neural network-type classification system it is possible to have. It allows us to perform binary classification. In fact we can think about a perceptron as one neuron in a more complicated neural network model. Mathematically, a perceptron is a basic linear model:

\[
f(x) = \begin{cases} 1 & \text{if } w \cdot x + b > 0 \\ 0 & \text{otherwise} \end{cases}
\]

In this equation, \(x\) is a vector of inputs, and \(w\) is a corresponding vector of weights. \(b\) is a bias value, which allows us to shift the decision boundary of the function.

In other words, if \(b\) is a negative number, then the weighted sum of all the input values \(\sum_{i=1}^m w_i x_i\) must be greater than the absolute (positive) value of \(b\) for the perceptron to output the “1” class (as opposed to the “0” class).

The perceptron, then, is essentially a single equation that calculates a weighted sum of all the input values and passes it through a threshold-like activation function, producing an output of 1 if the sum is sufficiently large. The adjustable bias can be thought of as setting the sensitivity of the activation.
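The equation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `perceptron` and the hand-picked weights and bias are assumptions made for the example, chosen so that the unit behaves like a logical AND of two binary inputs.

```python
def perceptron(x, w, b):
    """Return 1 if the weighted sum of the inputs plus the bias is positive, else 0."""
    weighted_sum = sum(w_i * x_i for w_i, x_i in zip(w, x))
    return 1 if weighted_sum + b > 0 else 0


# Hypothetical weights and bias, set by hand rather than learned from data.
w = [1.0, 1.0]
b = -1.5  # negative bias: the weighted sum must exceed 1.5 to output 1

and_outputs = [perceptron([p, q], w, b) for p, q in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(and_outputs)  # [0, 0, 0, 1]
```

Making the bias more negative (say, -2.5) would stop the unit firing even when both inputs are 1, which is the "sensitivity" effect described above.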

## Neural network

What we think of as a neural network is in essence a collection of these perceptrons working together, with the output of one connected to the input of the next, forming connected layers in the network.

Typical components of a neural network are an input layer, which receives incoming data; hidden layers, which are sets of interconnected neurons; and an output layer, where we can see the results. A very simple neural network has one of each of these layers (see figure below). Of course, more practically useful neural networks contain several hidden layers, with many nodes in each, and may use more complex activation functions.

Illustration of nodes and connections in a simple neural network. The left-most nodes form an input layer of 3 nodes; these are ‘fully connected’ to the middle nodes, forming one hidden layer. The right-most nodes are the output nodes.
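To make the layered structure concrete, here is a rough sketch of a forward pass through a network with 3 inputs, one fully connected hidden layer, and an output layer. The sizes of the hidden and output layers (3 units and 1 unit), the helper names `dense_layer` and `forward`, and all weights and biases are assumptions chosen by hand for illustration; in practice these values would be learned from training data. With these particular values the network outputs 1 when an odd number of inputs are switched on.

```python
def step(z):
    """Threshold activation used by a perceptron."""
    return 1 if z > 0 else 0


def dense_layer(inputs, weights, biases):
    """Apply one fully connected layer: every unit sees every input."""
    outputs = []
    for unit_weights, unit_bias in zip(weights, biases):
        weighted_sum = sum(w * x for w, x in zip(unit_weights, inputs))
        outputs.append(step(weighted_sum + unit_bias))
    return outputs


# Hypothetical hand-set parameters (not learned).
# The hidden units fire when at least 1, 2, or 3 inputs are on, respectively.
hidden_weights = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
hidden_biases = [-0.5, -1.5, -2.5]
output_weights = [[1.0, -1.0, 1.0]]
output_biases = [-0.5]


def forward(x):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = dense_layer(x, hidden_weights, hidden_biases)
    return dense_layer(hidden, output_weights, output_biases)


print(forward([1, 0, 0]))  # [1]
print(forward([1, 1, 0]))  # [0]
```

The output of each layer simply becomes the input to the next, which is all that "connected layers" means here.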

## Deep neural networks

A deep network, at its simplest, is a neural network with many hidden layers, which is where the term “deep” comes from. It can also be characterised by the use of more advanced activation functions and, in some cases, by organising neurons in a way that suits the data domain. For example, convolutional neural networks (CNNs) use spatial windows, which fit image data well, and pooling functions that reduce resolution where required.
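As one illustration of how pooling reduces resolution, the sketch below applies max pooling over non-overlapping 2×2 windows of a small grid of values. The function name `max_pool_2x2` and the example grid are assumptions for this example; real CNN libraries offer configurable window sizes and strides, and other pooling functions (such as averaging) are also used.

```python
def max_pool_2x2(image):
    """Halve the resolution by keeping the maximum of each non-overlapping 2x2 window."""
    pooled = []
    for r in range(0, len(image) - 1, 2):
        row = []
        for c in range(0, len(image[0]) - 1, 2):
            window = [
                image[r][c], image[r][c + 1],
                image[r + 1][c], image[r + 1][c + 1],
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# A made-up 4x4 "image" of pixel intensities.
image = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool_2x2(image)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4×4 input becomes a 2×2 output, so later layers have a quarter as many values to process.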

Deep networks may have many millions (even billions) of parameters, that is, weights and biases whose values must be learned from training data. Training therefore takes a long time and requires a large number of training examples to be effective.
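To get a feel for how quickly the number of parameters grows, the sketch below counts the weights and biases in a fully connected network given the number of nodes in each layer. The function name `count_parameters` and the layer sizes are assumptions for illustration (784 inputs would correspond to, say, a 28×28 pixel image); convolutional layers share weights and would be counted differently.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    Each pair of adjacent layers contributes n_in * n_out weights
    plus one bias per node in the receiving layer.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total


small = count_parameters([3, 4, 2])             # a toy network
larger = count_parameters([784, 512, 512, 10])  # hypothetical image classifier

print(small)   # 26
print(larger)  # 669706
```

Even this modest second network, with only two hidden layers, already has well over half a million values to learn.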