# Making a digital image: revisited

A description of code used to convert a digital image into a spreadsheet as an exercise.

Now that we know a bit more Python, we can return to the ‘Making a digital image’ practical from last week and discuss the code in more detail.

NOTE: this is quite complex code, so don’t worry if you don’t understand what some of it is doing. Feel free to move on to the next step and return to this later.

As a reminder, the code takes a digital image and converts it into a spreadsheet with the cells coloured according to the pixel values of the image. To do this we identified the following steps:

1. Make an empty spreadsheet using Python
2. Open an image, cycle through pixels in the image file and get the numeric data
3. For every pixel, convert that numeric data to the colour format used by Excel
4. For every pixel, colour specific spreadsheet cells with the right colour (and also write the data in the cell)
5. Save our new spreadsheet

## Making a spreadsheet in Python

The first thing we need to do is import xlsxwriter and make a new (empty) workbook called ‘robot.xlsx’ (or whatever name you choose). To do this you need the following code:

    import xlsxwriter
    workbook = xlsxwriter.Workbook('robot.xlsx')

Then, to add a worksheet to our blank workbook:

    worksheet = workbook.add_worksheet()

Finally, to make the cells (or pixels) of the spreadsheet roughly square we use set_default_row to adjust the row height:

    worksheet.set_default_row(50)

## Opening an image and reading the pixels

To open an image with OpenCV (imported in Python as cv2) we use the imread function:

    import cv2
    img = cv2.imread('robot.jpg')

This looks for an image called ‘robot.jpg’ and saves it using the variable name img.

Once we have the image data, we need to cycle through it pixel by pixel to add it to the spreadsheet. Since the data is stored in a 3-dimensional array (rows, columns and colour channels), we can use nested for loops to cycle through the pixels and their channels:

    for i, row in enumerate(img):
        for j, column in enumerate(row):
            for k, rgb in enumerate(column):
                print(rgb)
                # do some other stuff

The enumerate function is a way of counting the row and column pixel positions, and the colour channel positions, and storing them in the temporary variables i, j and k respectively. We’ll need these counters in the final version of the code.

Here, in the loop we would just print out the data in the ith row, jth column and kth channel, but the script in the practical has code to perform steps 3 and 4 in the list above.
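To see what enumerate produces at each level without needing an image file, the sketch below runs the same three nested loops over a tiny hand-made nested list. The name `tiny_img` and its values are made up for illustration; a real image returned by cv2.imread is a NumPy array, but it can be looped over in the same way.

```python
# A made-up 2x2 "image": 2 rows, 2 pixels per row, 3 channel values per pixel.
tiny_img = [
    [[10, 20, 30], [40, 50, 60]],
    [[70, 80, 90], [100, 110, 120]],
]

visited = []
for i, row in enumerate(tiny_img):          # i counts the rows
    for j, column in enumerate(row):        # j counts the pixels along a row
        for k, rgb in enumerate(column):    # k counts the colour channels
            visited.append((i, j, k, rgb))
            print(f"row {i}, column {j}, channel {k}: {rgb}")
```

Each printed line pairs a position (i, j, k) with the single number stored there, which is exactly the information the later steps need.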

Somewhat confusingly, for each pixel, OpenCV gives colour data in the order blue, green, red rather than red, green, blue, so in the final version of the nested loops we need to add the lines:

    column = list(column)
    column.reverse()

to reverse the order of the channels.
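As a small illustration of the reversal, the sketch below applies the same two lines to a made-up pixel (the name `bgr_pixel` and its values are assumptions for the example). It also shows slicing with a step of -1, which is another common way to get a reversed copy.

```python
# A made-up pixel in OpenCV's blue, green, red order.
bgr_pixel = [138, 123, 47]

column = list(bgr_pixel)   # make a copy, so the original is left untouched
column.reverse()           # reverses the list in place: now red, green, blue

# Slicing with a step of -1 builds a reversed copy in one expression.
rgb_pixel = bgr_pixel[::-1]

print(column)
print(rgb_pixel)
```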

## Numeric pixel colour data to Excel format

In the step above we showed how to access the data at every pixel in the image. Now we need to do something with that data. Specifically, we need to use the red, green and blue values to colour specific cells of our spreadsheet.

Before we can do that, though, we need to consider the format that xlsxwriter needs to colour the cells. Unfortunately, this is not the same as our 8-bit numeric values between 0 and 255. Instead, xlsxwriter needs a string made up of the ‘#’ symbol followed by a hexadecimal value for each of the red, green and blue channels.

Hexadecimal numbers are numbers written in base 16, with the letters A to F used in addition to the digits 0 to 9 to represent the values 10 to 15. This is a convenient way to represent 8-bit numbers from 0 to 255 using just two digits, since 255 is represented as FF in hexadecimal.

Each pair of digits in the hexadecimal representation represents one of the channels red, green and blue. So, for example, ‘#2F0000’ has data in the red channel only, ‘#007B00’ in the green channel only, and ‘#00008A’ in the blue channel only. We could combine all this colour data into a single cell (in this case it would be #2F7B8A), but we choose to split the colour channels for illustration purposes.

Converting from the numeric data in the image file to the RGB hexadecimal format is not straightforward, so in the script we use a function we have written called get_hex_str. By all means have a look and try to figure out what it’s doing, but don’t worry if you don’t fully understand it or hexadecimal numbers in general; this is just a means to an end. Once we have the function, it’s easy to use it for every pixel, and for every colour channel of that pixel:

    for i, row in enumerate(img):
        for j, column in enumerate(row):
            column = list(column)
            column.reverse()
            for k, rgb in enumerate(column):
                hex_str = get_hex_str(rgb, k)
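The script’s own get_hex_str is not reproduced here. As a rough idea of what such a function could look like, the sketch below is one possible implementation, assuming the first argument is a value from 0 to 255 and the second is the channel position after the reversal (0 for red, 1 for green, 2 for blue). The version in the practical may well be written differently.

```python
def get_hex_str(value, channel):
    """Hypothetical stand-in: build a '#RRGGBB' string with only one channel set."""
    # Two uppercase hexadecimal digits, zero-padded: 47 -> '2F', 255 -> 'FF'.
    pair = format(int(value), "02X")
    # Start with all three channels at zero, then fill in the requested one.
    pairs = ["00", "00", "00"]
    pairs[channel] = pair
    return "#" + "".join(pairs)


print(get_hex_str(47, 0))    # red channel only
print(get_hex_str(123, 1))   # green channel only
print(get_hex_str(138, 2))   # blue channel only
print(format(255, "02X"))    # the largest 8-bit value as two hex digits
```

With these inputs the three strings match the single-channel examples given in the text, ‘#2F0000’, ‘#007B00’ and ‘#00008A’.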

## Colour cells using xlsxwriter

The code above looks at every pixel in turn, then converts the integer data into a text string. This string is in the RGB hexadecimal format we need to colour cells using xlsxwriter.

To use our hexadecimal string for every pixel and channel, we now need the functions add_format, set_bg_color and write in xlsxwriter, as follows:

    for i, row in enumerate(img):
        for j, column in enumerate(row):
            column = list(column)
            column.reverse()  # OpenCV order (blue, green, red) -> red, green, blue
            for k, rgb in enumerate(column):
                hex_str = get_hex_str(rgb, k)
                cell_format = workbook.add_format()
                cell_format.set_bg_color(hex_str)
                worksheet.write(3*i+k, j, rgb, cell_format)

This first creates a new format called cell_format, then sets the background colour of this format to hex_str, and finally writes the original data (called rgb) to the (3i+k)th row and jth column of the spreadsheet using the new format.

Try to figure out for yourself why we use 3*i+k for the row number in the loop. Hint: we are combining the pixel image row number and the colour channel number.
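If you want to check your answer, the short sketch below lists which spreadsheet row each combination of image row i and channel k ends up in, for a made-up image with two rows. It only does the arithmetic and does not touch the spreadsheet; the names `n_image_rows` and `sheet_rows` are introduced here for illustration.

```python
n_image_rows = 2   # made-up image height for illustration
n_channels = 3     # red, green, blue

sheet_rows = []
for i in range(n_image_rows):
    for k in range(n_channels):
        sheet_row = 3 * i + k   # same expression as in the worksheet.write call
        sheet_rows.append(sheet_row)
        print(f"image row {i}, channel {k} -> spreadsheet row {sheet_row}")
```

Each image row takes up three consecutive spreadsheet rows, one per channel, and no two combinations share a row.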

## Saving the spreadsheet

To save the spreadsheet, all you need is:

    workbook.close()