
Infrared marker-based motion capture

Read a detailed description of infrared marker-based motion capture and techniques used in processing the data: gap-filling and smoothing.
© Kristian Nymoen, University of Oslo

Infrared marker-based motion capture is one subcategory of optical motion capture systems.

As discussed in the previous video, these systems can be very precise, with high resolution and high recording speeds. This type of technology is widely used for animation in the film and gaming industries, and for medical purposes and rehabilitation. An increasing number of music researchers are now also using such systems in their studies of music and movement.

Cameras and markers

Infrared marker-based motion capture systems use reflective markers on the body or on an instrument. Such markers vary in size, and the best size to choose depends on the type of movement to be recorded. For instance, quite small markers are used to capture facial expressions, and larger markers can be used for full-body movement.

In order to “see” the markers, each of the mocap cameras emits infrared light. The light is reflected off the markers back towards the cameras, and each camera records it as a two-dimensional image. The computer can then determine the location of each marker in space by combining the images from the different cameras, as sketched below.

Cameras finding the location of a marker
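
The idea of combining camera views can be illustrated with a small triangulation sketch. This is a simplified illustration under stated assumptions, not a description of how any particular mocap system works: it assumes that each camera's position and the direction of the ray from the camera towards the marker are already known (in a real system these would follow from calibration and from the marker's position in the two-dimensional image). The marker is then estimated as the midpoint of the shortest segment between two rays. All function names are hypothetical.

```python
def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def point_on_ray(origin, direction, t):
    """Point reached by travelling t units of `direction` from `origin`."""
    return [o + t * d for o, d in zip(origin, direction)]


def triangulate_two_rays(origin1, dir1, origin2, dir2):
    """Estimate a marker position from two camera rays.

    Returns the midpoint of the shortest segment connecting the two rays.
    With perfect data the rays intersect and the midpoint is the marker.
    """
    w = [p - q for p, q in zip(origin1, origin2)]
    a = dot(dir1, dir1)
    b = dot(dir1, dir2)
    c = dot(dir2, dir2)
    d = dot(dir1, w)
    e = dot(dir2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the marker cannot be located")
    t1 = (b * e - c * d) / denom  # parameter of closest point on ray 1
    t2 = (a * e - b * d) / denom  # parameter of closest point on ray 2
    p1 = point_on_ray(origin1, dir1, t1)
    p2 = point_on_ray(origin2, dir2, t2)
    return [(x + y) / 2 for x, y in zip(p1, p2)]


# Two hypothetical cameras looking at a marker placed at (1, 2, 3).
marker = triangulate_two_rays([0, 0, 0], [1, 2, 3], [4, 0, 0], [-3, 2, 3])
print(marker)
```

With noisy measurements the two rays will not meet exactly, which is one reason why using more than two cameras tends to give a more reliable estimate.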

Three-dimensional motion capture data

Infrared mocap systems provide three-dimensional position data for each marker. The three dimensions are measured along the axes X, Y, and Z, and the orientation of these axes is determined when the system is calibrated. In a rectangular room, it often makes sense to let the axes run between opposing walls, and from floor to ceiling.

Recording motion data at a rate of 100 Hz means that 100 measurements are made per second, each with 3 data points (X, Y, Z) per marker. Considering that a full-body motion capture may require up to 30 (or even more) markers, we end up with a large amount of data. Software for simple processing and visualisation of the data is usually available from the mocap system provider. However, for music-related research it is often necessary to use analysis software that is tailored for our needs. One such example is the MoCap Toolbox from the University of Jyväskylä in Finland.
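
To make the amount of data concrete, the arithmetic above can be written out as a short sketch. The function name and the 60-second duration are assumptions chosen only for illustration.

```python
def values_recorded(rate_hz, n_markers, duration_s, dims=3):
    """Number of coordinate values in a recording of the given length."""
    frames = round(rate_hz * duration_s)
    return frames * n_markers * dims


# 100 Hz, 30 markers, three coordinates (X, Y, Z) per marker.
per_second = values_recorded(100, 30, 1)
per_minute = values_recorded(100, 30, 60)
print(per_second, per_minute)
```

At these settings a single second of recording already contains 9,000 numbers, and one minute contains 540,000.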

Data processing

The recorded data often needs some processing. This may involve small adjustments to correct minor errors in the recording, or transformations of the data to calculate, for instance, the velocity or acceleration of the movement.

Occlusion: Gap-filling

A common problem with motion capture data is that a marker is “lost” in the recording. This happens when a marker is occluded or when it moves out of the cameras' field of view. Small gaps in the marker data can easily be repaired with so-called “gap filling”, which interpolates between the closest recorded data points to estimate the marker position within the gap.


For longer gaps in the data it may be impossible to estimate the marker position accurately. That is why it is important to make the recordings as good as possible in the first place.
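
Gap filling by interpolation can be sketched in a few lines. The example below is a minimal illustration, assuming that lost frames are stored as `None` and that straight-line (linear) interpolation is good enough; actual mocap software may use other interpolation methods. The function name and the `max_gap` parameter are hypothetical. Gaps longer than `max_gap` frames, and gaps at the very start or end of the recording, are left unfilled.

```python
def fill_gaps(values, max_gap=10):
    """Linearly interpolate runs of None that are at most max_gap frames long.

    `values` holds one coordinate (for example X) of one marker, one entry
    per frame. Gaps without a valid neighbour on both sides stay as None.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i  # first missing frame of this gap
        while i < n and filled[i] is None:
            i += 1
        end = i  # first valid frame after the gap, or n if none
        if start == 0 or end == n or end - start > max_gap:
            continue  # cannot or should not fill this gap
        left, right = filled[start - 1], filled[end]
        span = end - (start - 1)
        for k in range(start, end):
            frac = (k - (start - 1)) / span
            filled[k] = left + frac * (right - left)
    return filled


x = [0.0, 1.0, None, None, 4.0, None]
print(fill_gaps(x))             # the two-frame gap is filled, the trailing gap is not
print(fill_gaps(x, max_gap=1))  # the two-frame gap is now too long to fill
```

In practice the same function would be applied separately to the X, Y, and Z coordinates of each marker.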


Noise: Smoothing

Sometimes the recorded mocap data is noisy, containing small random errors. This may be caused by poor lighting conditions or a bad calibration of the system. The noise level can still be reduced afterwards by applying a smoothing filter to the data.
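
The text does not name a specific smoothing filter, so the sketch below uses a centred moving average as one simple possibility. The function name and the default window length are assumptions for illustration only.

```python
def moving_average(values, window=5):
    """Smooth a sequence with a centred moving average.

    Near the ends of the sequence the window shrinks symmetrically, so the
    first and last samples are returned unchanged.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        k = min(half, i, n - 1 - i)  # frames available on both sides
        segment = values[i - k:i + k + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


noisy = [0.0, 0.0, 3.0, 0.0, 0.0]  # a single spike in otherwise flat data
print(moving_average(noisy, window=3))
```

A longer window removes more noise but also blurs fast movements, so the window length is a trade-off that depends on the type of movement being studied.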



Transformations

Finally, after gap-filling and smoothing the data, it may be necessary to transform it in different ways. The research question should guide the decision about which types of data processing and transformation are needed. Some popular transformations include:

  • Position data is intuitive, and is often useful directly.
  • Sometimes we need to look at how fast the position changes. The rate of change of position (the derivative of position) is what we call velocity. Velocity is closely related to kinetic energy, and may therefore be useful in certain types of analysis.
  • Similarly, we may need to look at how fast the velocity changes. This is called acceleration, and is calculated as the derivative of velocity. Acceleration is closely related to force.
  • Higher-order derivatives may also be useful. It has been suggested that jerk, which is the derivative of acceleration, may convey certain motion properties related to affect and emotion.
  • We may also use the position data from markers to calculate joint angles, orientation/rotation, periodicities, and so forth.
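
The transformations listed above can be sketched as follows. This is a minimal illustration, assuming evenly sampled data and using simple finite differences for the derivatives; the function names are hypothetical and dedicated toolboxes may compute these quantities differently. The joint angle function assumes three markers, with the angle measured at the middle one, and that no two of them coincide.

```python
import math


def derivative(values, rate_hz):
    """Time derivative by central differences (one-sided at the two ends)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    dt = 1.0 / rate_hz
    out = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        out.append((values[hi] - values[lo]) / ((hi - lo) * dt))
    return out


def joint_angle(a, b, c):
    """Angle in degrees at marker b, between the segments b->a and b->c."""
    u = [p - q for p, q in zip(a, b)]
    v = [p - q for p, q in zip(c, b)]
    norm_u = math.sqrt(sum(p * p for p in u))
    norm_v = math.sqrt(sum(p * p for p in v))
    cos_angle = sum(p * q for p, q in zip(u, v)) / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


rate = 100  # frames per second
# One coordinate of a marker moving with constant acceleration 1 (x = t^2 / 2).
x = [0.5 * (i / rate) ** 2 for i in range(6)]
velocity = derivative(x, rate)
acceleration = derivative(velocity, rate)
jerk = derivative(acceleration, rate)

# Hypothetical shoulder, elbow, and wrist markers forming a right angle.
elbow_angle = joint_angle([1, 0, 0], [0, 0, 0], [0, 1, 0])
print(velocity[2], acceleration[2], elbow_angle)
```

Differentiation amplifies noise, and each further derivative amplifies it more, so smoothing the position data first is usually advisable before calculating acceleration or jerk.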

Numerous more advanced processing and transformation techniques are also in use, and we suggest exploring these further with the MoCap Toolbox.


This article is from the free online

Music Moves: Why Does Music Make You Move?
