
What is lossy compression?

Besides lossless compression, the other type of data compression is lossy compression. Lossy compression algorithms reduce the number of bits necessary to store a file by removing unnecessary or less important data.

Activity: compress an emoji

Here we use an emoji as an example to see how we can reduce its file size using this type of compression.

The emoji is a 10×10 pixel image:

bbbbyybbbb
bbyyyyyybb
byyyyyyyyb
byybyybyyb
yyyyyyyyyy
yybyyyybyy
byybbbbyyb
byyyyyyyyb
bbyyyyyybb
bbbbyybbbb

 

One method of reducing the size of this file is to look at the pixels in 2×2 blocks, work out which colour dominates within each block, and assign that colour to the block:

 

bb bb yy bb bb
bb yy yy yy bb

by yy yy yy yb
by yb yy by yb

yy yy yy yy yy
yy by yy yb yy

by yb bb by yb
by yy yy yy yb

bb yy yy yy bb
bb bb yy bb bb

 

Starting from the top left-hand corner, the first 2×2 block looks like this:

 

bb
bb

 

Black dominates here, so we can call this block B.

 

The next 2×2 block is:

 

bb
yy

 

Here, neither black nor yellow dominates the block, so we pick the mid-point between the RGB values of the two colours.
Black is (0, 0, 0) and yellow is (255, 255, 0), so the mid-point is (127, 127, 0) once 127.5 is rounded down. We will call this H.
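
The mid-point calculation can be sketched in a few lines of Python. The function name `midpoint` and the constants are illustrative only; integer division is assumed here so that 127.5 rounds down to 127, matching the value used for H.

```python
def midpoint(colour_a, colour_b):
    """Return the channel-by-channel mid-point of two RGB colours, rounded down."""
    return tuple((a + b) // 2 for a, b in zip(colour_a, colour_b))

BLACK = (0, 0, 0)
YELLOW = (255, 255, 0)

# H is the colour used for blocks where neither black nor yellow dominates.
HALF = midpoint(BLACK, YELLOW)
print(HALF)
```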

 

We do this for all the 2×2 blocks in the 10×10 pixel emoji:

 

bb bb yy bb bb
bb yy yy yy bb

by yy yy yy yb
by yb yy by yb

yy yy yy yy yy
yy by yy yb yy

by yb bb by yb
by yy yy yy yb

bb yy yy yy bb
bb bb yy bb bb

 

So the compressed file looks like this:

 

BHYHB
HYYYH
YYYYY
HYHYH
BHYHB
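
The block-by-block procedure above could be written as a short Python function. This is a minimal sketch, assuming the image is stored as a list of strings of `b` and `y` characters with even width and height; the names `EMOJI` and `compress` are made up for this example.

```python
EMOJI = [
    "bbbbyybbbb",
    "bbyyyyyybb",
    "byyyyyyyyb",
    "byybyybyyb",
    "yyyyyyyyyy",
    "yybyyyybyy",
    "byybbbbyyb",
    "byyyyyyyyb",
    "bbyyyyyybb",
    "bbbbyybbbb",
]

def compress(rows):
    """Replace every 2x2 block with B (mostly black), Y (mostly yellow) or H (a tie)."""
    compressed = []
    for top in range(0, len(rows), 2):
        line = ""
        for left in range(0, len(rows[top]), 2):
            # Collect the four pixels of the block: two from each of the two rows.
            block = rows[top][left:left + 2] + rows[top + 1][left:left + 2]
            blacks = block.count("b")
            yellows = block.count("y")
            if blacks > yellows:
                line += "B"
            elif yellows > blacks:
                line += "Y"
            else:
                line += "H"
        compressed.append(line)
    return compressed

for line in compress(EMOJI):
    print(line)
```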

 

Image quality

This file contains 25 characters compared to 100 in the original emoji — a 75% reduction in image file size. But what about the quality of the image?

 

Translating the compressed file back into an uncompressed image gives us this:

 

bbhhyyhhbb
bbhhyyhhbb
hhyyyyyyhh
hhyyyyyyhh
yyyyyyyyyy
yyyyyyyyyy
hhyyhhyyhh
hhyyhhyyhh
bbhhyyhhbb
bbhhyyhhbb
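
Going back from the compressed file to a 10×10 image means expanding every symbol into a 2×2 block of the same colour. A minimal sketch, assuming the compressed file is a list of strings; the names `COMPRESSED` and `decompress` are made up for this example.

```python
COMPRESSED = ["BHYHB", "HYYYH", "YYYYY", "HYHYH", "BHYHB"]

def decompress(rows):
    """Expand each symbol into a 2x2 block of identical lower-case pixels."""
    pixels = []
    for row in rows:
        # Double every symbol horizontally...
        expanded = "".join(symbol.lower() * 2 for symbol in row)
        # ...and repeat the resulting row to double it vertically.
        pixels.extend([expanded, expanded])
    return pixels

for line in decompress(COMPRESSED):
    print(line)
```

The original positions of the black and yellow pixels inside each block cannot be recovered, which is why the eyes and mouth do not come back.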

 

We can now load this newly compressed emoji and compare it to the original:

 

On the left, a 500 pixel by 500 pixel version of the 10 pixel by 10 pixel smiley face emoji created in week 2. On the right, a 500 pixel by 500 pixel version of the compressed 5 pixel by 5 pixel smiley face emoji created above. The eyes are no longer visible, the mouth is now just a single brown pixel, and the edge of the emoji is less smooth and contains some brown pixels.

 

As you can see, in this example the lossy compression has led to a serious reduction in image quality.
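
One rough way to put a number on the loss is to count how many pixels differ between the original emoji and the image rebuilt from the compressed file. This is only a sketch of a very simple measure (the names `ORIGINAL`, `RESTORED` and `changed` are made up here); real image-quality metrics are more sophisticated.

```python
ORIGINAL = [
    "bbbbyybbbb", "bbyyyyyybb", "byyyyyyyyb", "byybyybyyb", "yyyyyyyyyy",
    "yybyyyybyy", "byybbbbyyb", "byyyyyyyyb", "bbyyyyyybb", "bbbbyybbbb",
]
RESTORED = [
    "bbhhyyhhbb", "bbhhyyhhbb", "hhyyyyyyhh", "hhyyyyyyhh", "yyyyyyyyyy",
    "yyyyyyyyyy", "hhyyhhyyhh", "hhyyhhyyhh", "bbhhyyhhbb", "bbhhyyhhbb",
]

# Compare the two images pixel by pixel and count the mismatches.
changed = sum(
    before != after
    for original_row, restored_row in zip(ORIGINAL, RESTORED)
    for before, after in zip(original_row, restored_row)
)
total = sum(len(row) for row in ORIGINAL)
print(f"{changed} of {total} pixels changed colour")
```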

 

Activity: JPEG compression algorithm

One real-life compression algorithm is JPEG compression, which works on image files.

The JPEG compression algorithm is considerably more sophisticated than the example above, and at high quality settings it usually causes only a minor visible reduction in image quality.

Let’s try it out on the following image of a puppy:

An image of a puppy

 

  • Download the image to your computer, or find a similar file.
  • The few lines of code needed to compress an image with the JPEG algorithm are available in this repl.it project; alternatively, copy and paste the code below into a Python file.
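from PIL import Image

# Open the bitmap and convert it to RGB (JPEG cannot store palette or alpha data)
im = Image.open('puppy.bmp').convert('RGB')
# Save as a JPEG; a lower quality value gives a smaller file with more visible artefacts
im.save('puppy.jpg', 'JPEG', quality=90)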

 

  • If you’ve created a new Python file, make sure you have saved the image of the puppy as puppy.bmp in the same directory as your Python file.
  • Run the Python script. If you’re using repl.it, download the puppy.jpg file to your computer.
  • Compare the file sizes of the original puppy.bmp image and the compressed puppy.jpg image. The original should be about 2 MB, while the compressed file should be around 140 KB.
  • Open both image files and compare what they look like. The compressed image should have little discernible loss in quality.
  • In the Python script, the line of code that saves the file includes an option for image quality (quality=90). For JPEG output, Pillow accepts values from 0 (worst) to 95 (best).
  • Reduce the quality value, run the script again, and see what effect the change you made has on the size and quality of the output file.
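
To compare several quality values in one run, the experiment can be sketched as below. This assumes Pillow is installed. So that the sketch runs without puppy.bmp, it builds a synthetic 256×256 test image in memory, which is an assumption of this example rather than part of the activity; the helper name `jpeg_size` is also made up.

```python
import io
from PIL import Image

def jpeg_size(image, quality):
    """Return the number of bytes `image` occupies when encoded as JPEG."""
    buffer = io.BytesIO()
    image.save(buffer, "JPEG", quality=quality)
    return len(buffer.getvalue())

# Stand-in test image (assumption): a colour gradient with a noisy blue channel.
# To use the real file instead: im = Image.open('puppy.bmp').convert('RGB')
pixels = bytes(
    value
    for y in range(256)
    for x in range(256)
    for value in (x, y, (x * y) % 256)
)
im = Image.frombytes("RGB", (256, 256), pixels)

# Encode the same image at three quality settings and record each size.
sizes = {quality: jpeg_size(im, quality) for quality in (90, 50, 10)}
for quality, size in sizes.items():
    print(f"quality={quality}: {size} bytes")
```

Lower quality values should give smaller files; opening the saved images side by side shows what was given up to get there.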

The risks of lossy compression

Lossy compression algorithms, such as the JPEG algorithm and the MP3 algorithm, can reduce the size of files, which is a crucial factor when files need to be transferred from one computer to another, such as when you view an image in a web browser or watch a film on Netflix.

However, it is important to remember that this type of compression is a destructive process that causes data to be lost. Performing repeated rounds of compression on a file can cause such a severe loss of data that the output becomes unrecognisable.
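
The effect of repeated compression can be explored with a sketch like the following, assuming Pillow is installed. It re-encodes an image as JPEG several times and measures how far each result has drifted from the original. The synthetic test image, the helper names `recompress` and `mean_error`, and the choice to lower the quality on every round are all assumptions made to keep the example short and the effect visible.

```python
import io
from PIL import Image, ImageChops, ImageStat

def recompress(image, quality):
    """Encode `image` as JPEG in memory and decode it again (one lossy round trip)."""
    buffer = io.BytesIO()
    image.save(buffer, "JPEG", quality=quality)
    buffer.seek(0)
    return Image.open(buffer).convert("RGB")

def mean_error(a, b):
    """Average absolute per-channel difference between two same-sized RGB images."""
    difference = ImageChops.difference(a, b)
    return sum(ImageStat.Stat(difference).mean) / 3

# Stand-in 64x64 test image (assumption); swap in a real photo to try it properly.
pixels = bytes(
    value
    for y in range(64)
    for x in range(64)
    for value in (4 * x, 4 * y, (x * y) % 256)
)
original = Image.frombytes("RGB", (64, 64), pixels)

current = original
errors = []
for round_number in range(1, 6):
    # Each round compresses the output of the previous round, not the original.
    current = recompress(current, quality=80 - 10 * round_number)
    errors.append(mean_error(original, current))
    print(f"round {round_number}: mean error {errors[-1]:.2f}")
```

Once a round has discarded detail, later rounds have no way to restore it, so the error measured against the original reflects the accumulated loss.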

This article is from the free online course Data Representation in Computing: Bring Data to Life

Created by
FutureLearn - Learning For Life
