This content is taken from the Queensland University of Technology's online course, Robotic Vision: Making Robots See.

## Queensland University of Technology

Skip to 0 minutes and 7 seconds Now we will consider the case where a camera is looking at a bunch of points and these points all lie on a plane. The plane has got its own coordinate system which we denote by coordinate frame zero. Clearly every point that lies on this plane has got a Z coordinate of zero, which is shown down here. The coordinate capital Z multiplies all of the elements in the third column of our camera matrix, but because it’s zero we can effectively remove that column from the matrix and we can remove that row from the world coordinate vector. What we’re left with now is a three by three matrix and we’ll refer to this three by three matrix as a “planar homography”.
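The step of dropping the third column can be written out explicitly. As a sketch, assume the camera matrix is a three by four matrix with elements $c_{jk}$, and that the image coordinates $(u, v)$ are recovered from homogeneous coordinates $(\tilde{u}, \tilde{v}, \tilde{w})$. These symbols are notation chosen here for illustration and are not taken from the video.

```latex
\begin{pmatrix} \tilde{u} \\ \tilde{v} \\ \tilde{w} \end{pmatrix}
=
\begin{pmatrix}
c_{11} & c_{12} & c_{13} & c_{14} \\
c_{21} & c_{22} & c_{23} & c_{24} \\
c_{31} & c_{32} & c_{33} & c_{34}
\end{pmatrix}
\begin{pmatrix} X \\ Y \\ 0 \\ 1 \end{pmatrix}
=
\underbrace{
\begin{pmatrix}
c_{11} & c_{12} & c_{14} \\
c_{21} & c_{22} & c_{24} \\
c_{31} & c_{32} & c_{34}
\end{pmatrix}
}_{H}
\begin{pmatrix} X \\ Y \\ 1 \end{pmatrix},
\qquad
u = \frac{\tilde{u}}{\tilde{w}}, \quad
v = \frac{\tilde{v}}{\tilde{w}}
```

Because $Z = 0$, the third column never contributes to the product, which is why it can be deleted together with the $Z$ row of the world coordinate vector.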

Skip to 0 minutes and 46 seconds Just as for the camera matrix, there is an arbitrary scale factor and once again we can normalise the homography matrix by choosing one particular element that we’re going to set to the value of one. Since this three by three matrix has one element that we’ve set to one, there are eight unique numbers remaining in the homography matrix. And we can estimate the homography matrix if we have four world points and the corresponding position of those points on the image plane of our camera. To illustrate the concept of corresponding points, imagine that I’ve got two planes: one is perhaps the image plane of the camera; the other might be a physical plane in the world that the camera is looking at.

Skip to 1 minute and 27 seconds Alternatively, the first could be a view of a plane in the world and the second image could be another view of the same plane in the world, where we’ve moved the camera between the two views. Now we’ve got four points in each of these planes, which I’m going to denote by the subscripts one through four and I’m going to arrange the coordinates of those points into the columns of a matrix. But what’s really important here is the ordering of these columns. We have to ensure what’s called correspondence. P1 and Q1 must correspond to the same point in the world and so it goes for P2, P3 and P4.

Skip to 2 minutes and 6 seconds Each point P and the corresponding point Q must refer to the same point in the world. Let’s look at a practical example of how we can use this technique to perform something called “perspective rectification”. Now this is a picture that I took of the Notre Dame Cathedral in Paris. It’s a very tall cathedral, so I’m on the ground in front, looking up and taking a picture. And clearly because my camera is tilted upwards I’ve got a very distorted view of the front of the cathedral. But I know some things about cathedrals and particularly I know that the front of the cathedral is most likely to be a plane.

Skip to 2 minutes and 40 seconds So I pick four points on the front of the cathedral that I believe all lie in a single plane and I label them P1 through to P4. I know that those points in a non-distorted image will form a rectangle in the image plane, not a trapezoid, so I can compute the image plane coordinates Q1, Q2, Q3 and Q4 in order to have a rectangle in the image. So now I have two sets of corresponding points: I have the points P1 through P4 and I have the points Q1 through Q4, and I can compute an homography.
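To see why four correspondences are enough, it helps to write the mapping out. As a sketch, write each pair of corresponding points as $p_i = (x_i, y_i)$ and $q_i = (u_i, v_i)$, and the elements of the homography as $h_{jk}$ with $h_{33}$ chosen as the element set to one. This notation, and the choice of $h_{33}$, are assumptions made here for illustration.

```latex
\begin{align}
u_i &= \frac{h_{11} x_i + h_{12} y_i + h_{13}}{h_{31} x_i + h_{32} y_i + 1}, &
v_i &= \frac{h_{21} x_i + h_{22} y_i + h_{23}}{h_{31} x_i + h_{32} y_i + 1}
\end{align}

% Multiplying through by the denominator gives equations linear in the unknowns:
\begin{align}
h_{11} x_i + h_{12} y_i + h_{13} - u_i x_i h_{31} - u_i y_i h_{32} &= u_i \\
h_{21} x_i + h_{22} y_i + h_{23} - v_i x_i h_{31} - v_i y_i h_{32} &= v_i
\end{align}
```

Each correspondence contributes two linear equations, so four correspondences give eight equations for the eight unknown elements, provided no three of the four points lie on a single line.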

Skip to 3 minutes and 16 seconds So if I build up a matrix P that contains as columns the points P1 through P4 and the matrix Q, whose columns are the points Q1 through Q4, then I can compute an homography. It’s shown here and is very simple to do in MATLAB. Now that I have this homography matrix H, I can use it to transform any point, P, in my original image, to the corresponding point, Q, in a second image. And this is what the second image looks like. We see that the cathedral has been straightened up. We can see that the vertical edges of the cathedral are in fact vertical lines in the image.
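The lecture computes the homography with a MATLAB toolbox. As a stand-in, here is a minimal pure-Python sketch of the same idea: it builds two linear equations per correspondence (with the bottom-right element of H fixed at one) and solves the resulting eight by eight system. The function names `solve_linear`, `estimate_homography` and `apply_homography`, the list-of-tuples point layout and the pixel coordinates are all hypothetical choices for illustration, not the toolbox's API or the points used in the video.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b] in floating point
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def estimate_homography(P, Q):
    """Return a 3x3 H (last element fixed to 1) mapping each P[i] to Q[i]."""
    A, b = [], []
    for (x, y), (u, v) in zip(P, Q):
        # Two linear equations per correspondence
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(H, point):
    """Map one (x, y) point through H, dividing out the homogeneous scale."""
    x, y = point
    u, v, w = (row[0] * x + row[1] * y + row[2] for row in H)
    return (u / w, v / w)


# Hypothetical pixel coordinates: a trapezoid P and the rectangle Q it should become.
# The ordering matters: P[i] and Q[i] must refer to the same physical point.
P = [(200, 100), (600, 100), (700, 900), (100, 900)]
Q = [(100, 100), (700, 100), (700, 900), (100, 900)]

H = estimate_homography(P, Q)
mapped = [apply_homography(H, p) for p in P]  # should reproduce Q
```

Shuffling the order of `Q` without shuffling `P` in the same way would still yield a matrix, but it would describe a different and usually meaningless mapping, which is why correspondence matters.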

Skip to 3 minutes and 53 seconds It’s important to remember that there’s a very strong assumption made in this process and that is that all of the points in the image lie on a plane. Certainly many of the points in this image lie on the frontal plane of the cathedral, but not all do. If we look at points around here, which are on the edges of the bell towers, then they do not lie on the frontal plane and the transformation won’t be correct for them. It will introduce a distortion in that part of the image. You can’t get anything for free, but we’ve certainly improved the geometric correctness of the bulk of the cathedral.

Skip to 4 minutes and 27 seconds Given that I’ve computed the matrix H using MATLAB, then it’s a very simple matter to apply the homography to every single point in the image. And we perform that by a process known as “image warping”.

Skip to 4 minutes and 41 seconds To do image warping, we consider every single pixel in the output image, and the output image in this case is the geometrically correct, rectified image of the cathedral. To illustrate this I’m going to choose just one particular point in the output image and it’s the pixel at coordinate (600, 100). Now if I know that pixel coordinate, I want to try and work out what’s the corresponding pixel coordinate in the input image.

Skip to 5 minutes and 9 seconds The homography is a mapping from the original image to the new image, so in order to map this coordinate I need to use the inverse of the homography and that gives me the coordinate of the corresponding point in the input image and it’s got a coordinate of (757, 51). The way image warping works then is we go and find the pixel at coordinate (757, 51) and we take that pixel value and we insert it into the new image at coordinate (600, 100). So for every single pixel in the output image, we work out where it comes from in the input image.
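The backward lookup for a single output pixel can be sketched in a few lines of Python. The matrix `H` below is a made-up homography, not the one from the cathedral example, so the input coordinate it produces will not be (757, 51); the helper names `invert_3x3` and `apply_homography` are likewise hypothetical.

```python
def invert_3x3(H):
    """Invert a 3x3 matrix using the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = H
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("homography is not invertible")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def apply_homography(H, point):
    """Map one (x, y) point through H, dividing out the homogeneous scale."""
    x, y = point
    u, v, w = (row[0] * x + row[1] * y + row[2] for row in H)
    return (u / w, v / w)


# Hypothetical homography from the input image to the rectified output image
H = [[1.2, 0.1, -50.0],
     [0.0, 1.5, 20.0],
     [0.0, 0.0005, 1.0]]

output_pixel = (600, 100)

# The inverse homography tells us where this output pixel comes from
input_coord = apply_homography(invert_3x3(H), output_pixel)

# Sanity check: mapping forward again should return the output pixel
round_trip = apply_homography(H, input_coord)
```

Iterating over output pixels, rather than pushing input pixels forward, guarantees that every output pixel receives a value and leaves no holes in the warped image.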

Skip to 5 minutes and 49 seconds You can see here that the coordinates in the input image are fractional and that requires a technique called “image interpolation” to find what is the actual pixel value at this particular fractional coordinate. In a nutshell, that’s the process of image warping. Another application of image warping is this often-used effect now in swimming telecasts, where we take the flag and the name of the competitors and we overlay them on the lanes of the swimming pool. It’s actually quite an easy trick to do and it involves these homographies. Now imagine that I could swim well, well enough to get into a swimming tournament, so there’s my flag and there’s my name.
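The video does not say which interpolation scheme is used. One common choice is bilinear interpolation, which blends the four pixels surrounding the fractional coordinate. The sketch below assumes a greyscale image stored as a list of rows with zero-based indices; the function name `bilinear_sample` is hypothetical.

```python
import math


def bilinear_sample(image, u, v):
    """Sample a greyscale image (list of rows) at fractional column u, row v."""
    height, width = len(image), len(image[0])
    # Clamp so that the 2x2 neighbourhood stays inside the image
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)
    u0, v0 = int(math.floor(u)), int(math.floor(v))
    u1, v1 = min(u0 + 1, width - 1), min(v0 + 1, height - 1)
    du, dv = u - u0, v - v0
    # Blend horizontally along the top and bottom rows, then vertically
    top = (1 - du) * image[v0][u0] + du * image[v0][u1]
    bottom = (1 - du) * image[v1][u0] + du * image[v1][u1]
    return (1 - dv) * top + dv * bottom


image = [[10, 20],
         [30, 40]]
centre = bilinear_sample(image, 0.5, 0.5)  # 25.0, the mean of all four pixels
```

At integer coordinates the function returns the stored pixel value unchanged, so it only alters the result where the coordinate is genuinely fractional.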

Skip to 6 minutes and 30 seconds Now I’ve got this image that I created, just using ordinary computer graphics, that’s the easy bit. Now I want to lay that image into my lane in the swimming pool. All I need to do that is to find the four corresponding points, so the four corners of this rectangle that holds the image that I want to overlay and the four points in the swimming pool where I’d like it to be laid. Once I have that information I can warp that original image into this very distorted image, which I could then insert into or overlay onto the original image of the swimming pool.
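The overlay step can be sketched as a loop over the pixels of the pool image. This is a simplified illustration under several assumptions: images are greyscale lists of rows, the inverse homography `H_inv` (from pool-image coordinates back to graphic coordinates) is already known, and a nearest-neighbour lookup stands in for proper interpolation. The function name `overlay` is hypothetical.

```python
def overlay(background, graphic, H_inv):
    """Paste `graphic` onto a copy of `background`.

    H_inv maps background pixel coordinates (u, v) to graphic coordinates (x, y).
    Background pixels that map outside the graphic are left untouched.
    """
    result = [row[:] for row in background]
    g_h, g_w = len(graphic), len(graphic[0])
    for v in range(len(background)):
        for u in range(len(background[0])):
            xs, ys, w = (r[0] * u + r[1] * v + r[2] for r in H_inv)
            if w == 0:
                continue  # point maps to infinity, nothing to paste
            x, y = round(xs / w), round(ys / w)  # nearest-neighbour lookup
            if 0 <= x < g_w and 0 <= y < g_h:
                result[v][u] = graphic[y][x]
    return result


# Tiny example: the graphic is shifted one pixel to the right in the background,
# so the inverse mapping subtracts one from the column coordinate.
background = [[0, 0, 0, 0],
              [0, 0, 0, 0]]
graphic = [[1, 2],
           [3, 4]]
H_inv = [[1, 0, -1],
         [0, 1, 0],
         [0, 0, 1]]
result = overlay(background, graphic, H_inv)
# result == [[0, 1, 2, 0], [0, 3, 4, 0]]
```

In the swimming pool case `H_inv` would be a full perspective mapping obtained from the four corner correspondences rather than a simple shift, but the loop is the same.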

Skip to 7 minutes and 5 seconds For those of you who are doing the project associated with this course, the homography is going to be very, very useful. You’ve probably already built a two-dimensional robot, that sits on a worksheet and can move its end effector to any particular XY coordinate on the robot worksheet. Now imagine that we take a picture of that robot worksheet. I have an image of the robot worksheet. The homography lets me create a mapping between a coordinate in the image of the worksheet, which has got a coordinate of (U, V) in the image plane and I can map that to a physical coordinate, (X, Y) on the robot’s worksheet.

Skip to 7 minutes and 44 seconds I can map from an image plane coordinate to a robot worksheet coordinate, or I can map from a robot worksheet coordinate back to a camera image coordinate. Now homographies are going to be very, very helpful for you in completing the project. Just to summarise the capability in the toolbox for computing and using homographies. Given two sets of corresponding points P and Q, we can compute a three by three homography matrix. The columns of P and Q represent points. Now P might be image coordinates of known points in an image, Q might be coordinates of points on the robot’s physical worksheet.

Skip to 8 minutes and 25 seconds Alternatively, P could be a set of image coordinates in one image and Q could be a set of image coordinates in another image. Given that I have the three by three homography matrix H, I can then map a set of points P, in the first plane, to a set of points Q, in the second plane.
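Applying a homography to a whole set of points stored as columns can be sketched as follows. The matrix `H`, the pixel coordinates and the function name `transform_columns` are hypothetical; the sketch assumes `P` holds image coordinates and the result is read as worksheet coordinates.

```python
def transform_columns(H, P):
    """Apply the 3x3 homography H to points stored as columns of the 2xN list P."""
    xs, ys = P
    out_x, out_y = [], []
    for x, y in zip(xs, ys):
        a, b, w = (r[0] * x + r[1] * y + r[2] for r in H)
        # Divide by the homogeneous scale to get ordinary coordinates
        out_x.append(a / w)
        out_y.append(b / w)
    return [out_x, out_y]


# Hypothetical homography from image coordinates (u, v) to worksheet coordinates (X, Y)
H = [[0.5, 0.0, -10.0],
     [0.0, 0.5, -20.0],
     [0.001, 0.0, 1.0]]

# Two image points, one per column: (20, 40) and (220, 440)
P = [[20, 220],
     [40, 440]]

Q = transform_columns(H, P)  # row 0 holds the X values, row 1 the Y values
```

Mapping in the opposite direction, from worksheet coordinates back to image coordinates, uses the same function with the inverse of `H`.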

# Planar homography

Before we continue with the project, let’s cover the important concept of homography.

Given everything we’ve learnt in this program, we can now derive a linear relationship between the coordinates of points on an arbitrary plane and the coordinates of those points in an image. This is planar homography and it has a number of uses.

In this video, we cover:

• Points on a plane
• Planar homography
• Corresponding points
• Perspective rectification
• Image warping
• Perspective relationship between planes
• Estimating and using the homography on the robotic vision worksheet.