
Fall Detection Using Sequential Data

Person lying down

Processing individual frames one at a time is not enough to distinguish a fall from other activities of daily living, such as lying down, because the final posture can look much the same; information about motion is also needed. Here we look at how deep learning can detect falls by taking advantage of motion information. A deep neural network can extract the most discriminative features from its training data and utilise those features in fall detection.
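To make the point about motion concrete, here is a minimal sketch that compares two hip-height traces. The numbers, the 10 frames per second rate and the function name are all made up for illustration. Both traces end in the same posture, so only the frame-to-frame change tells them apart.

```python
# Hypothetical hip heights (metres), sampled at an assumed 10 frames per second.
FRAME_RATE_HZ = 10.0

lying_down = [0.90, 0.85, 0.80, 0.74, 0.68, 0.61, 0.54, 0.47, 0.40, 0.33, 0.26, 0.20]
fall = [0.90, 0.90, 0.90, 0.90, 0.90, 0.90, 0.90, 0.90, 0.88, 0.70, 0.35, 0.20]


def peak_downward_speed(heights, frame_rate_hz):
    """Largest frame-to-frame drop in height, converted to metres per second."""
    drops = [(a - b) * frame_rate_hz for a, b in zip(heights, heights[1:])]
    return max(drops)


# The last frame alone cannot separate the two activities.
same_final_pose = lying_down[-1] == fall[-1]

# The motion between frames can.
slow = peak_downward_speed(lying_down, FRAME_RATE_HZ)
fast = peak_downward_speed(fall, FRAME_RATE_HZ)

print(f"same final pose: {same_final_pose}")
print(f"peak downward speed, lying down: {slow:.2f} m/s")
print(f"peak downward speed, fall:       {fast:.2f} m/s")
```

A hand-written threshold on speed is only a stand-in here. The approach described in this article lets a neural network learn such motion features from data.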

Modelling Sequential Data

In machine learning, there are many scenarios where the output at each point (of time, space, etc.) depends not only on the input at the same point, but also on the inputs at other points. Such data is referred to as sequential data, and classifying such data is called sequence labelling. Data can be sequential along any continuous or discrete dimension, such as time or space. When it comes to training a deep neural network with sequential data, a recurrent neural network (RNN) can be used. The difference between a regular neural network, also called a feed-forward network, and an RNN is that the RNN contains a feedback loop.

Feedback loop in recurrent neural network

When unfolded over time, this loop produces a recurrent connection in the network, which models the correlation among the elements of the sequential data. To train a neural network with long sequences, or sequences deep in time, a variant of the RNN called a long short-term memory (LSTM) neural network can be used.
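The unfolding of the feedback loop can be sketched in a few lines of NumPy. This is a minimal illustration with randomly initialised, untrained weights; the sizes and the names `rnn_forward`, `w_in` and `w_rec` are assumptions chosen for the example.

```python
import numpy as np


def rnn_forward(inputs, w_in, w_rec, bias):
    """Unroll a single-layer RNN over a sequence.

    inputs: array of shape (time_steps, input_size).
    Returns the hidden state at every step, shape (time_steps, hidden_size).
    """
    hidden = np.zeros(w_rec.shape[0])
    states = []
    for x_t in inputs:
        # The feedback loop: the previous hidden state re-enters the layer.
        hidden = np.tanh(w_in @ x_t + w_rec @ hidden + bias)
        states.append(hidden)
    return np.stack(states)


rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4  # illustrative sizes
w_in = rng.normal(scale=0.5, size=(hidden_size, input_size))
w_rec = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
bias = np.zeros(hidden_size)

# The same input vector appears at step 0 and step 2, with different history.
x = np.array([1.0, 0.0, -1.0])
sequence = np.stack([x, np.array([0.5, 0.5, 0.5]), x])

states = rnn_forward(sequence, w_in, w_rec, bias)
print(states.shape)
print(np.allclose(states[0], states[2]))
```

At step 0 the hidden state is still zero, so the layer behaves like a feed-forward layer. At step 2 the same input meets a different hidden state, which is how the network's output comes to depend on earlier elements of the sequence.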

Modelling Fall with Video Data

In an indoor environment, a fall may occur in a dark or poorly lit room or area. To be robust against changes in lighting conditions and to preserve the privacy of the individual being monitored, a depth camera, which produces depth maps, can be used instead of a regular camera. A deep neural network trained with 2D data (images) is capable of encoding spatial information, but to extract the motion features required for fall detection, the network has to be trained with 3D data (video). The 3D locations of the major body joints carry most of the kinematic information needed to discriminate between different actions, including falls, and processing such data is computationally inexpensive compared with regular camera data.

There is a correlation among the states of the body skeleton at different time steps. To exploit this sequential information embedded in the input sequence, an RNN is used as the deep learning model. Because actions such as falls unfold over several frames, the training data can be seen as sequential over a long period, and hence an LSTM neural network is used. A deep neural network trained with 3D data can extract both spatial and temporal features, but in most cases a fall occupies only a sub-region of the frame, so treating the whole frame equally may dilute the salient fall features. Hence the LSTM can be trained in conjunction with another 3D DNN to classify fall and non-fall activities.
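As a rough illustration of the LSTM part only, the sketch below runs a single LSTM layer over a sequence of 3D joint positions and turns the final hidden state into a fall score. Everything specific is an assumption for the example: 20 joints, 30 frames, 16 hidden units, random untrained weights and random input, and the name `lstm_fall_probability`. The accompanying 3D DNN mentioned above is not modelled.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_fall_probability(joints, params):
    """Run one LSTM layer over a skeleton sequence and score the last state.

    joints: array of shape (frames, num_joints, 3) with 3D joint positions.
    params: dict with W (4H x D), U (4H x H), b (4H), w_out (H) and b_out.
    """
    frames = joints.reshape(joints.shape[0], -1)  # flatten the joints per frame
    hidden_size = params["U"].shape[1]
    h = np.zeros(hidden_size)  # short-term output
    c = np.zeros(hidden_size)  # long-term cell memory
    for x_t in frames:
        gates = params["W"] @ x_t + params["U"] @ h + params["b"]
        i, f, o, g = np.split(gates, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
        g = np.tanh(g)  # candidate memory content
        c = f * c + i * g  # keep part of the old memory, add new content
        h = o * np.tanh(c)
    return sigmoid(params["w_out"] @ h + params["b_out"])


rng = np.random.default_rng(0)
num_frames, num_joints, hidden = 30, 20, 16  # assumed sizes
input_size = num_joints * 3

# Random, untrained weights: the score below is meaningless until trained.
params = {
    "W": rng.normal(scale=0.1, size=(4 * hidden, input_size)),
    "U": rng.normal(scale=0.1, size=(4 * hidden, hidden)),
    "b": np.zeros(4 * hidden),
    "w_out": rng.normal(scale=0.1, size=hidden),
    "b_out": 0.0,
}

# Random stand-in for a recorded skeleton sequence.
sequence = rng.normal(size=(num_frames, num_joints, 3))
probability = lstm_fall_probability(sequence, params)
print(float(probability))
```

The cell memory `c` is what lets an LSTM carry information across many frames. In practice the weights would be learned from labelled fall and non-fall sequences, typically with a deep learning framework rather than hand-written NumPy.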

Modelling Fall with Range-Doppler Radar

As with a camera, data from a range-Doppler (or frequency-modulated continuous-wave) radar can be used to train a deep neural network to detect falls among other activities of daily living. Radar signals corresponding to human gross-motor activities are nonstationary in nature and can reveal the velocities, accelerations, and higher-order Doppler terms of the limbs and other body parts in motion. Range is another important piece of information: it can be obtained from wideband radar signals and reveals the location of the person at any point in time.

A continuous-wave (CW) radar detects the radial velocity of a moving object. The motion alters the frequency of the signal the object reflects; this change, known as the Doppler shift, carries information about the velocity of the torso, limbs, and other parts of the human body. Training a deep neural network with combined signatures from two domains (Doppler shift and range) yields higher fall classification success rates than using a single domain.
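For a target moving much slower than light, the Doppler shift of the reflected signal follows the standard relation f_d = 2 v f_c / c, where v is the radial velocity and f_c the carrier frequency. The sketch below evaluates it for two illustrative speeds. The 24 GHz carrier and both velocities are assumptions for the example, not values from this article.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def doppler_shift_hz(radial_velocity_m_s, carrier_frequency_hz):
    """Doppler shift of a reflected wave: f_d = 2 * v * f_c / c (for v << c)."""
    return 2.0 * radial_velocity_m_s * carrier_frequency_hz / SPEED_OF_LIGHT_M_S


CARRIER_HZ = 24e9  # assumed carrier frequency, not taken from the article

# Illustrative radial velocities in metres per second.
for label, velocity in [("slow sit-down", 0.3), ("fall", 2.5)]:
    print(f"{label}: {doppler_shift_hz(velocity, CARRIER_HZ):.1f} Hz")
```

The factor of two arises because the wave travels to the moving body and back. A faster motion therefore produces a proportionally larger frequency shift.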

The Doppler data can be represented as a spectrogram (a visual representation of how the frequency spectrum of the signal varies with time) and the range data as a range map.
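A spectrogram can be computed with a short-time Fourier transform: the signal is cut into overlapping windows and the spectrum of each window becomes one column of the image. The sketch below uses a synthetic two-tone signal as a stand-in for a radar return. The sample rate, window size, hop and tone frequencies are arbitrary choices for illustration.

```python
import numpy as np


def spectrogram(signal, window_size, hop):
    """Magnitude of a short-time Fourier transform.

    Returns an array of shape (num_frequency_bins, num_windows).
    """
    window = np.hanning(window_size)
    starts = range(0, len(signal) - window_size + 1, hop)
    columns = [np.abs(np.fft.rfft(window * signal[s:s + window_size])) for s in starts]
    return np.stack(columns, axis=1)


SAMPLE_RATE_HZ = 1000.0  # assumed sample rate
t = np.arange(2000) / SAMPLE_RATE_HZ  # two seconds of samples

# Synthetic stand-in for a radar return: 50 Hz for one second, then 200 Hz.
signal = np.where(t < 1.0, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 200 * t))

spec = spectrogram(signal, window_size=100, hop=50)
freqs = np.fft.rfftfreq(100, d=1.0 / SAMPLE_RATE_HZ)

print(spec.shape)
print(freqs[spec[:, 0].argmax()], freqs[spec[:, -1].argmax()])
```

Each column of `spec` shows which frequencies dominate during one short window, so a sudden change in Doppler shift shows up as a jump along the time axis.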

Radar data spectrogram

The combined radar data can now be treated like any other image and used as input to a convolutional neural network, which classifies it as a fall or another activity of daily living.
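One way to combine the two domains, sketched below, is to stack the spectrogram and the range map as two channels of a single image, much like the colour channels of a photograph, and pass the result through convolutional filters. This is an assumption about how the combination might be done, not a description of a specific published model. The maps are random stand-ins of an assumed equal size, and the single 3x3 filter is untrained.

```python
import numpy as np


def normalise(image):
    """Scale an array to the range [0, 1]."""
    span = image.max() - image.min()
    return (image - image.min()) / span if span > 0 else np.zeros_like(image)


def conv2d_valid(image, kernel):
    """Cross-correlate a (C, H, W) image with a (C, kH, kW) kernel, no padding."""
    _, k_h, k_w = kernel.shape
    out_h = image.shape[1] - k_h + 1
    out_w = image.shape[2] - k_w + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[:, r:r + k_h, c:c + k_w] * kernel)
    return out


rng = np.random.default_rng(0)

# Random stand-ins for a Doppler spectrogram and a range map of equal size.
doppler_map = rng.random((32, 32))
range_map = rng.random((32, 32))

# Stack the two domains as channels of one image.
radar_image = np.stack([normalise(doppler_map), normalise(range_map)])

# One randomly initialised 3x3 filter; a trained CNN would learn many of these.
kernel = rng.normal(scale=0.1, size=(2, 3, 3))
feature_map = np.maximum(conv2d_valid(radar_image, kernel), 0.0)  # ReLU

print(radar_image.shape, feature_map.shape)
```

A full classifier would stack several such convolutional layers, followed by pooling and a final layer that outputs the fall or non-fall decision.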

© University of York
This article is from the free online course Intelligent Systems: An Introduction to Deep Learning and Autonomous Systems.
