|
8 Mile: Image Processing Algorithm Guide
The factor that will contribute to a team’s victory or defeat in an event like 8 Mile, is rarely because of the hardware. The difference is in the image processing algorithms and codes. Since the camera is the most important piece of sensor in 8 mile, image processing takes the front seat. The basics of IP are explained below. The codes where ever used are written in MATLAB.
Introduction to MATLAB: MATLAB or Matrix Laboratory is a very powerful tool for image processing. In the following tutorial we will be extensively using MATLAB functions and inbuilt algorithms. To get started with MATLAB, the in built help is very useful and comprehensive. Otherwise you can use the following tutorials.
http://amath.colorado.edu/courses/4720/2000Spr/Labs/Worksheets/Matlab_tutorial/matlabimpr.html
The webcam that is mounted on the bot will have a continuous feed of the road in front of it. The easiest way to follow the road in this case is to follow the lanes that are present on the road. Note that the arena picture present on the website shows that the road has only two lanes and the lanes are as always separated by a non continuous white line.
This tutorial mainly worked in the MATLAB environment. To use MATLAB for this purpose, you need to have the Image Processing Toolbox installed in it. The following link provides not only an introduction into the basic techniques of the image processing toolbox but it also gives you an effective algorithm for lane detection and lane following.
http://www.mathworks.com/products/image/demos.html?file=/products/demos/image/IntroIPdemo/launchpage.html#
This is a link to a live demo relating to the lane detection. The documentation part of this is present in the help section of MATLAB.
Algorithms for path followingThe first method that can be used for the path following. We can use a cost determining function to predict the resulting change in pose after a short period of time after a set of possible steering commands. A cost function is then evaluated for each of the predicted poses and the steering command which minimizes the cost function is then chosen.
One particular form of the cost function is
C(Φi) = E2lateral+[Rθsin(Eθ /2)]2.

where Elateral and Eθ are the lateral and the heading offsets of the vehicle relative to the target point on the trajectory. Also note that there is a length parameter Rθ in the cost function which is used to scale the heading error relative to the position error.

A very effective way of finding the shortest path with the lowest cost using the above given cost function is the Dijkstra’s algorithm. This algorithm is explained very shortly here and further explanation can be found using Wikipedia or any of the following websites.
http://students.ceid.upatras.gr/~papagel/project/kef5_7_1.htm
http://www-b2.is.tokushima-u.ac.jp/~ikeda/suuri/dijkstra/Dijkstra.shtml
http://www.ifors.ms.unimelb.edu.au/tutorial/dijkstra_new/index.html
http://www.cs.sunysb.edu/~skiena/combinatorica/animations/dijkstra.html
Dijkstra’s Algorithm:
For a given source vertex node) in the graph, the algorithm finds the path with lowest cost (i.e. the shortest path) between that vertex and every other vertex. It can also be used for finding costs of shortest paths from a single vertex to a single destination vertex by stopping the algorithm once the shortest path to the destination vertex has been determined. For example, if the vertices of the graph represent cities and edge path costs represent driving distances between pairs of cities connected by a direct road, Dijkstra's algorithm can be used to find the shortest route between one city and all other cities.
Traffic Light and Stopping line Detection:
One of the most challenging parts of the event 8mile is the identification of the traffic lights. The aim is to cross the stopping line only when the light is green. A very simple algorithm can be used for the purpose. First, detect the stopping line. It is rather simple to detect. Run a vertical scan across the side of the captured image and find if there exists a broad white strip of high intensity pixels. This is the stopping line.

In the shown figure, run a scan for the captured image in the shown direction and obtain the y coordinate of the strip (the red circle). With proper calibration, the distance of this line from the bot can be calculated.
After stopping at the traffic lights, tilt the camera up, or use the second camera if you have one, face the traffic lights. Then measure the centre of the red color in the image. This centre will shift when light changes. When light is green, the red centre will be more towards the centre, while it will be off centre when light is red. (Provided the cam is adjusted so that the traffic lights lie toward the border of the image when the bot has stopped at the line.)
Green centre may also be used, but since the background contains a lot of greenery, it is not recommended.
Background Detection:
It is often very useful to separate the background from the foreground in the image, it reduces the load on the processor. A background is generally defined as a non changing (or slightly changing) set of pixels, that can be neglected in the processing of the image. To get the background pixels, first line up about 4 captued images, sequentially. Then take the mean of any pixel in all the images and obtain another image.
I(xmean,ymean)= Sum{I(xi,yi)}/4 …(i – 1 to 4). This is your reference image.
Now obtain the deviation of the pixels of the new captured image from the reference image. If the deviation is within a threshold, then the pixel is a background pixel, else it is a foreground object.
Neural Networks:
Neural networks are composed of simple elements operating in parallel. These elements are inspired by biological nervous systems. As in nature, the network function is determined largely by the connections between elements. You can train a neural network to perform a particular function by adjusting the values of the connections (weights) between elements.
With proper biasing and weight balancing, neural network based Image processing robots, can be very useful. The MATLAB help itself provides extensive help on implementation of neural networks.
For further study into image processing based on Neural networks, see the following link:
http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=00485891
Basic Filtering Operations:Filtering is perhaps the single most important operation that will be used in an event like 8mile. Filtering is a technique using which a certain part of a given image can be sharpened to help in the processing of the image. This can also be used to remove disturbances and noise from the image. This is only applicable where a binary mask defines the region. A binary mask is an array of 0s and 1s, which defines the region to be filtered. The portion marked 1 is filtered, and the 0 is part is left out. This increases the speed of the processing.
To filter a region in an image, use the roifilt2 function. When you call roifilt2, you specify a grayscale image, a binary mask, and a filter. roifilt2 filters the input image and returns an image that consists of filtered values for pixels where the binary mask contains 1's and unfiltered values for pixels where the binary mask contains 0's. This type of operation is called masked filtering. It is important to note here that the roifilt2 is best suited to operations that return data in the same range as in the original image. Certain filtering operations can result in values outside the normal image data range (i.e. [0, 1] for images of class double, [0,255] for the images of class uint8, and [0, 65535] for images of class uint16).
Example: Filtering a region in an image
The following is a simple example to show the use of the roifilt2 function. This example will take a standard image that is present in the MATLAB library and increase the contrast of a specific region of an image.
1. Read in the image.
I = imread('pout.tif');
2. Create the mask.
The process of creating a mask can be found in the MATLAB help. The region of interest is specified by the binary mask.
3. Create the filter.
h = fspecial('unsharp');
For further help on filter commands in MATLAB use the following link:
http://www.math.hkbu.edu.hk/~cstong/sci3710/filter_tutor.html4. Call roifilt2, specifying the image to be filtered, the mask, and the filter.
I2 = roifilt2(h,I,BW);
imshow(I)
figure, imshow(I2)
Convolution:
Linear filtering of an image is accomplished through an operation called convolution. Convolution is a neighbourhood operation in which each output pixel is the weighted sum of neighbouring input pixels. The matrix of weights is called the convolution kernel, also known as the filter. A convolution kernel is a correlation kernel that has been rotated 180 degrees.
For example, suppose the image is
A = [17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9]
and the convolution kernel is
h = [8 1 6
3 5 7
4 9 2]
We will now illustrate how to compute the (2,4) output pixel using these steps: Rotate the convolution kernel 180 degrees about its center element. Slide the center element of the convolution kernel so that it lies on top of the (2,4) element of A. Multiply each weight in the rotated convolution kernel by the pixel of A underneath. Sum the individual products from step 3.
Hence the (2,4) output pixel is
1x2 + 8x9 + 15x4 + 7x7 + 14x5 + 16x3 + 13x6
+ 20x1 + 22x8 = 575.
Correlation:
The operation called correlation is closely related to convolution. In correlation, the value of an output pixel is also computed as a weighted sum of neighbouring pixels. The difference is that the matrix of weights, in this case called the correlation kernel, is not rotated during the computation. The filter design functions in the Image Processing Toolbox return correlation kernels.
We will now illustrate how to compute the (2,4) output pixel of the correlation of A, assuming h is a correlation kernel instead of a convolution kernel, using these steps: Slide the centre element of the correlation kernel so that lies on top of the (2,4) element of A. Multiply each weight in the correlation kernel by the pixel of A underneath. Sum the individual products from step 3.
The (2,4) output pixel from the correlation
is
1x8 + 8x1 + 15x6 + 7x3 + 14x5 + 16x7 + 13x4
+ 20x9 + 22x2 = 585
Filtering using imfilter:
Filtering of images, either by correlation or convolution can be performed using the toolbox function imfilter. This example filters an image with a 5-by-5 filter containing equal weights. Such a filter is often called an averaging filter.
I = imread('coins.png');
h = ones(5,5) / 25;
I2 = imfilter(I,h);
imshow(I), title('Original Image');
figure, imshow(I2), title('Filtered
Image')

Data Types:
The imfilter function handles data types similarly to the way the image arithmetic functions do, as described in Image Arithmetic Saturation Rules. The output image has the same data type, or numeric class, as the input image. The imfilter function computes the value of each output pixel using double-precision, floating-point arithmetic. If the result exceeds the range of the data type, the imfilter function truncates the result to that data type's allowed range. If it is an integer data type, imfilter rounds fractional values.
Because of the truncation behavior, you might sometimes want to consider converting your image to a different data type before calling imfilter. In this example, the output of imfilter has negativevalues when the input is of class double.
A = magic(5)
A = [17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9]
h = [-1 0 1]
imfilter(A,h)
Ans = [24 -16 16 14 -8
5 -16 9 9 -14
6 9 14 9 -20
12 9 9 -16 -21
18 14 -16 -16 -2]
Notice that the result has negative values.
Now suppose A is of class uint8, instead of double.
A = uint8(magic(5));
imfilter(A,h)
ans =
24 0 0 14 0
5 0 9 9 0
6 9 14 9 0
12 9 9 0 0
18 14 0 0 0
Since the input to imfilter is of class uint8, the output also is of class uint8, and so the negative values are truncated to 0. In such cases, it might be appropriate to convert the image to another type, such as a signed integer type, single, or double, before calling imfilter.
Correlation and Convolution Options:
The imfilter function can perform filtering using either correlation or convolution. It uses correlation by default, because the filter design functions, described in Filter Design, and the fspecial function, described in Using Predefined Filter Types, produce correlation kernels.
However, if you want to perform filtering using convolution instead, you can pass the string 'conv' as an optional input argument to imfilter. For example:
A = magic(5);
h = [-1 0 1]
imfilter(A,h) % filter using correlation
ans =
24 -16 -16 14 -8
5 -16 9 9 -14
6 9 14 9 -20
12 9 9 -16 -21
18 14 -16 -16 -2
imfilter(A,h,'conv') % filter using convolution
ans =
-24 16 16 -14 8
-5 16 -9 -9 14
-6 -9 -14 -9 20
-12 -9 -9 16 21
-18 -14 16 16 2
Boundary Padding Options:
When computing an output pixel at the boundary of an image, a portion of the convolution or correlation kernel is usually off the edge of the image.
Zero Padding of Outside Pixels
When you filter an image, zero padding can result in a dark band around the edge of the image, as shown in this example.
I = imread('eight.tif');
h = ones(5,5) / 25;
I2 = imfilter(I,h);
imshow(I), title('Original Image');
figure, imshow(I2), title('Filtered Image with Black Border)

To eliminate the zero-padding artifacts around the edge of the image, imfilter offers an alternative boundary padding method called border replication. In border replication, the value of any pixel outside the image is determined by replicating the value from the nearest border pixel.
Multidimensional Filtering:
The imfilter function can handle both multidimensional images and multidimensional filters. A convenient property of filtering is that filtering a three-dimensional image with a two-dimensional filter is equivalent to filtering each plane of the three-dimensional image individually with the same two-dimensional filter. This example shows how easy it is to filter each colour plane of a truecolor image with the same filter:
Relationship to Other Filtering Functions
MATLAB has several two-dimensional and multidimensional filtering functions. The function filter2 performs two-dimensional correlation, conv2 performs two-dimensional convolution, and convn performs multidimensional convolution. Each of these filtering functions always converts the input to double, and the output is always double. These other filtering functions always assume the input is zero padded, and they do not support other padding options.
In contrast, the imfilter function does not convert input images to double. The imfilter function also offers a flexible set of boundary padding options, as described in Boundary Padding Options.
Note that the following link contains a very useful GUI for Multivariate Image Analysis of Multispectral images.
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=6072&objectType=fileA GUI for MIA of multispectral image data sets (PCA, Simplisma, MCR, classification) and basic processing techniques (threshold, histogram, profile plotting, image filters)
