This content is taken from the Raspberry Pi Foundation & National Centre for Computing Education's online course, Representing Data with Images and Sound: Bringing Data to Life.
3.6

Storing sound

As you learned earlier in the course, the size of files is important; generally, we want to keep media files as small as possible while maintaining sufficient quality. The file size of images depends on their resolution and bit depth. In this step we’re taking a look at what affects the size of sound files.

Length of recording

Obviously, a longer recording is represented by more binary numbers than a shorter one. A recording of an entire concert needs more storage space than the sound file of the brief ping your phone plays to let you know you’ve received a text message.

Longer recording = larger file size

Sample rate

How a sound was sampled also affects audio file size. The sample rate — how often the analogue signal was sampled in one second — is one factor.

Higher sample rate = larger file size

A higher sample rate leads to a larger file, but if the sample rate is too low, the recording will not contain enough samples to capture all of the detail of the sound.

For example, telephone calls are often sampled at 8 kHz, a much lower sample rate than the CD standard of 44.1 kHz. This means that some details in people’s voices are lost in phone conversations, which affects ‘s’ and ‘f’ sounds and gives the speaker a ‘telephone voice’. The lower sample rate makes it easier to transmit multiple phone signals along the same connection, while still providing enough data for the listener to understand what the speaker is saying.
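To make the difference between the two sample rates concrete, here is a minimal Python sketch that counts how many samples each rate produces for a recording of the same length. The function name `samples_in_recording` and the 10-second length are illustrative assumptions, not part of the course material.

```python
def samples_in_recording(sample_rate_hz: int, length_seconds: float) -> int:
    """Number of samples taken for one track of the given length."""
    return round(sample_rate_hz * length_seconds)


TELEPHONE_RATE_HZ = 8_000   # 8 kHz, as quoted for telephone calls
CD_RATE_HZ = 44_100         # 44.1 kHz, the CD standard

LENGTH_SECONDS = 10         # assumed example length

phone_samples = samples_in_recording(TELEPHONE_RATE_HZ, LENGTH_SECONDS)
cd_samples = samples_in_recording(CD_RATE_HZ, LENGTH_SECONDS)

print(phone_samples)               # 80000
print(cd_samples)                  # 441000
print(cd_samples / phone_samples)  # 5.5125
```

For the same length of recording, the CD rate stores roughly five and a half times as many samples as the telephone rate.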

Sample resolution

The size of an audio file is also affected by the recording’s sample resolution (or audio bit depth), meaning how many bits are used to represent each sample.

For both image and audio files, the computer has to determine how many different “levels” are available to represent a single point. For RGB images, this determines how many different levels of red, green, and blue a single pixel can have; for audio files, this determines the number of different amplitude (loudness) values that a sample of the sound taken at a specific time point can have.

A larger audio bit depth leads to a larger audio file, but it also allows sounds with a wider range of volumes, from very quiet to very loud, to be recorded without distortion.
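The number of levels available follows directly from the bit depth: n bits can represent 2 to the power of n distinct values. The short Python sketch below shows this for a few bit depths; the function name `levels_for_bit_depth` and the choice of 8, 16, and 24 bits are assumptions made for illustration.

```python
def levels_for_bit_depth(bit_depth: int) -> int:
    """Number of distinct values a single sample can take."""
    return 2 ** bit_depth


for bits in (8, 16, 24):
    print(bits, "bits ->", levels_for_bit_depth(bits), "levels")

# 8 bits -> 256 levels
# 16 bits -> 65536 levels
# 24 bits -> 16777216 levels
```

Each extra bit doubles the number of levels, while adding only one more bit of storage per sample.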

Higher sample resolution = larger file size

A classical concert, with sounds ranging from the delicate ring of a triangle to the blast of a tuba or heavy percussion, needs to be recorded with a higher sample resolution than a piece of electronic music with a smaller range of volumes.

Mono vs stereo

There’s one final factor that influences audio file size: whether a sound is monophonic or stereophonic, also known as mono and stereo. Mono sound is just one track of sound, whereas stereo sound contains two different tracks to add an impression of positioning and direction to the recording. As you’d expect, needing two tracks doubles the storage space required.

A stereo recording of the same length, sample rate, and sample resolution has twice the file size of a mono recording.

Bit rate

The sample rate and audio bit depth of an audio file are often combined into a single measure, known as the bit rate, which is the number of bits per second that are used to store a sound recording:

Bit rate = sample rate × audio bit depth = bits per second
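As a rough illustration of this formula, the Python sketch below calculates the bit rate of a single track. The function name `bit_rate` is hypothetical, and the 8-bit depth used for the telephone example is an assumption chosen only to have a second case to compare against CD audio (44.1 kHz, 16 bits).

```python
def bit_rate(sample_rate_hz: int, bit_depth: int) -> int:
    """Bits per second for one track: sample rate x audio bit depth."""
    return sample_rate_hz * bit_depth


# CD-quality track: 44.1 kHz sample rate, 16-bit samples
print(bit_rate(44_100, 16))  # 705600

# Telephone-style track: 8 kHz sample rate, assumed 8-bit samples
print(bit_rate(8_000, 8))    # 64000
```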

Calculating audio file sizes

For a mono file, multiplying the bit rate by the length of the sound in seconds gives you the overall file size in bits:

Mono file size = bit rate × length of sound

This is the same as:

Mono file size = sample rate × audio bit depth × length of sound
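The mono formula can be sketched in Python as follows. The function name `mono_file_size_bits` and the example values (8 kHz, 8 bits, 60 seconds) are assumptions for illustration; the result is the size of the raw sample data only.

```python
def mono_file_size_bits(sample_rate_hz: int, bit_depth: int,
                        length_seconds: int) -> int:
    """Mono file size in bits: sample rate x audio bit depth x length."""
    return sample_rate_hz * bit_depth * length_seconds


# Assumed example: one minute of audio at 8 kHz with 8-bit samples
size_bits = mono_file_size_bits(8_000, 8, 60)
size_bytes = size_bits // 8  # 8 bits in a byte

print(size_bits)   # 3840000
print(size_bytes)  # 480000
```

Dividing by 8 converts the answer from bits to bytes, which is the unit file sizes are usually quoted in.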

For a stereo file, the only difference in the equation is an extra factor of 2, which takes into account the use of two tracks:

Stereo file size = bit rate × length of sound × 2

This is the same as:

Stereo file size = sample rate × audio bit depth × length of sound × 2
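As a worked example, the Python sketch below applies the mono and stereo formulas to a three-minute recording at CD quality (44.1 kHz, 16 bits). The function name `audio_file_size_bits`, its `tracks` parameter, and the three-minute length are assumptions made for illustration. The figures cover the uncompressed sample data only; real audio files may be compressed or carry extra header data, so their sizes can differ.

```python
def audio_file_size_bits(sample_rate_hz: int, bit_depth: int,
                         length_seconds: int, tracks: int = 1) -> int:
    """Uncompressed size in bits; tracks is 1 for mono, 2 for stereo."""
    return sample_rate_hz * bit_depth * length_seconds * tracks


SAMPLE_RATE_HZ = 44_100   # CD standard
BIT_DEPTH = 16            # bits per sample
LENGTH_SECONDS = 180      # assumed three-minute recording

mono_bits = audio_file_size_bits(SAMPLE_RATE_HZ, BIT_DEPTH, LENGTH_SECONDS)
stereo_bits = audio_file_size_bits(SAMPLE_RATE_HZ, BIT_DEPTH,
                                   LENGTH_SECONDS, tracks=2)

print(mono_bits)               # 127008000
print(stereo_bits)             # 254016000
print(stereo_bits // 8)        # 31752000 bytes
print(stereo_bits // mono_bits)  # 2
```

The stereo recording comes to about 31.8 million bytes, exactly twice the size of the mono recording with the same length, sample rate, and sample resolution.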