# Model architecture of RNN

What is the model architecture of RNNs? In this article, Prof. Hao Ni explains the mathematical formulation of the RNN model.
In Step 4.3, we discussed the motivation for using RNNs to model time series data. In this article, I introduce the mathematical formulation of the RNN network architecture.

The RNN provides a universal model for continuous functions on sequential data. It is composed of an input layer, a hidden layer and an output layer. The hidden neurons exhibit a recurrent structure, which is depicted in Figure 1.

Figure 1: An illustration of the RNN network architecture.

More specifically, let \(\bar{x} = (x_{t})_{t = 1}^{T} \in \mathbb{R}^{T \times d}\), \(\bar{s} = (s_{t})_{t = 1}^{T} \in \mathbb{R}^{T \times n_1}\) and \(\bar{o} = (o_{t})_{t = 1}^{T} \in \mathbb{R}^{T \times e}\) denote the input sequence, the hidden state sequence and the output sequence, respectively.

Mathematically, an RNN model is defined as follows:

• Input Layer (\(l^{0}: \mathbb{R}^{T \times d} \rightarrow \mathbb{R}^{T \times d}\): \(\bar{x} \mapsto \bar{x}\));
• Hidden Layer (\(l^{1}: \mathbb{R}^{T \times d} \rightarrow \mathbb{R}^{T \times n_1}\): \(\bar{x} \mapsto \bar{s}\));

\[s_{t} = h(Ux_{t} + Ws_{t-1}), \text{ for } t \in \{1, \cdots, T\},\]

where \(s_{0}\) denotes the initial hidden state, typically set to zero;

• Output Layer (\(l^{2}: \mathbb{R}^{T \times n_1} \rightarrow \mathbb{R}^{T \times e} \text{ or } \mathbb{R}^{e}\): \(\bar{s} \mapsto \bar{o} = (o_{t})_{t = 1}^{T}\) (sequential output) or \(o_{T}\) (static output)),

\[o_{t} = g(Vs_{t}), \text{ for } t \in \{1, \cdots, T\}.\]

Here \(U\) is an \(n_{1} \times d\) matrix of weights, \(W\) is an \(n_1 \times n_1\) matrix of weights and \(V\) is an \(e \times n_1\) matrix of weights. \(g\) and \(h\) are two activation functions. Note that \((U, W, V)\) are trainable model parameters, shared across all time steps.
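As a concrete sketch, the forward pass defined above can be written in a few lines of NumPy. The dimensions, the random weights, and the choices \(h = \tanh\) and \(g = \) identity below are illustrative assumptions for this example, not part of the formulation itself.

```python
import numpy as np

# Hypothetical dimensions: input d, hidden n1, output e, sequence length T.
d, n1, e, T = 3, 4, 2, 5
rng = np.random.default_rng(0)

# Trainable parameters (U, W, V), randomly initialised for illustration.
U = rng.standard_normal((n1, d))   # input-to-hidden weights, n1 x d
W = rng.standard_normal((n1, n1))  # hidden-to-hidden weights, n1 x n1
V = rng.standard_normal((e, n1))   # hidden-to-output weights, e x n1

def rnn_forward(x_bar, s0=None):
    """Forward pass: s_t = h(U x_t + W s_{t-1}), o_t = g(V s_t)."""
    s = np.zeros(n1) if s0 is None else s0   # initial hidden state s_0
    outputs = []
    for x_t in x_bar:                        # iterate over time steps t = 1..T
        s = np.tanh(U @ x_t + W @ s)         # h = tanh (a common choice)
        outputs.append(V @ s)                # g = identity here
    return np.stack(outputs), s              # sequential output and final state

x_bar = rng.standard_normal((T, d))          # an input sequence in R^{T x d}
o_bar, s_T = rnn_forward(x_bar)              # o_bar has shape (T, e)
```

For the static-output variant, one simply keeps only the final output \(o_{T}\) (here, `o_bar[-1]`). Note that the same `(U, W, V)` are reused at every time step, which is exactly the weight sharing that distinguishes an RNN from a plain feedforward network.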