An algorithm for distinguishing apples from bananas

		custom search of this site
	An algorithm for distinguishing apples from bananas
	[chongo's home] [Astronomy] [Mathematics] [Prime Numbers] [Programming] [Technology] [contacting Landon]

Abstract

The number of set pixels approximates the size of the object when the objects contrasts well with its background. An algorithm is presented that can reliably distinguish between an apple and a banana is presented. This algorithm, which runs in linear or O(n) time with respect to image size, counts the number of set pixels in a low grade B&W image. Using a wide collection of apple and banana varieties and sizes, the algorithm was able to successfully distinguish between apples and bananas.

Background

Image recognition is an on-going area of research in Artificial Intelligence. Systems for general image recognition have been published in the Journal of Artificial Intelligence Research as well as the MIT Artificial Intelligence Lab is being made. These recognition systems tend of be both complex and computationally intensive.

When the recognition problem is simplified down to distinguishing between just a few objects under controlled conditions, simple but effective algorithms can be used. A simplified image recognition problem allows one to use algorithms that are neither complex nor computationally intensive.

The problem

How can one, by computer, distinguish between an apple and a banana given limited computational resources and limited decision time?

For this paper we will consider the situation where one has a conveyer belt (see figure 1) on which both apples and bananas are placed.

Fig 1. Conveyer Belt setup

Problem constraints

In order to comply with the problem condition of ``limited computational resources'' we will assume the use of a low end computer or PC. For example, one may have only a $1000 PC that is typically found in retail personal computer stores in the later 1990's.

In keeping with the low cost constrains, we will assume that the computer has connected to it a low end black and white digital camera that produces an uncompressed TIFF image. Our solution below does not depend on the camera producing TIFF format images. Any format in which the values of individual pixels can be quickly determined is sufficient. Non-compressive cameras and no significant image processing software is in keeping with the same ``limited computational resources'' problem theme.

We assume that fruit is placed on the conveyer belt so that there is some space in-between each individual fruit. We will assume that there is some mechanism, perhaps a light beam & photocell device, to detect when fruit is under the camera. We will assume that the camera is positioned such that when the fruit detector detects a price of fruit, only that fruit is inside the view of the camera.

We will further assume that the conveyer belt is a low gloss flat black color.

It should be noted that due to the speed of the belt, one has only a small amount of time to made the apple / banana decision. This constraint is in keeping with the problem condition of ``limited decision time''.

It also be noted that we do not assume any particular orientation of the fruit other than it is flat on the conveyer belt. For apples this can mean almost any position where as a banana's orientation is more restricted by gravity.

The solution

Given a B&W TIFF image of either an apple or a banana, we will count the number of set bits (white pixels) in the image. Apples will have a lower set bit than bananas.

In the above problem scenario, when the fruit detector determines that a piece of fruit is under the camera, a picture is taken and a TIFF image is transfered to the computer's memory. The transfer of the TIFF image to the computer is detected by a program. The program in turn counts the number of set bits and based on a pre-determined threshold value, determines if the fruit is an apple or a banana.

Fruit detector detects an object in position under the camera
The camera records a image of its field of view
The image is transfered to the computer's memory
A computer program detects the presence of a new image
The computer program counts the number of set image bits (white pixels)
If the set bit count > threshold then the object is a banana
otherwise it is assumed to be an apple

Fig 2. Apple/Banana distinguishing algorithm

The image transfer and bit count process can be done fairly quickly. Low end camera images are typically small. Thus the transfer from the camera to computer memory as well as bit counting will not take very long.

An as example, a 413 by 413 pixel TIFF B&W image is typically less than 22k bytes in size. A low end camera takes an image in 1/60th of a second. Given the speed of most video signals, the download time is usually much less than 1/60th of a second. Thus the total camera, download and memory scan time would be no more than 1/30th of a second.

Checking the solution

We will check the correctness by the examining images of a wide variety of apples and bananas. We will use the disting program to simulate steps 5 & 6 in the distinguishing algorithm (see figure 2).

We will assume that the image of a detected fruit has been transfered to the computer (steps 1 thru 4, see figure 2). For purposes of this test, the image is stored on disk and is read in by disting.

In our test, we used a wide variety of apples and bananas that are commonly found in major produce stores of the the United States. While apples in US markets come in multiple varieties, bananas in US markets are often of a single variety.

Type 4015 - Red Delicious Apple (also type 4016)
Type 4020 - Golden Delicious Apple
Type 4103 - Breaburn Apple
Type 4131 - Fuji Apple
Type 4133 - Gala Apple
Type 4139 - Granny Smith Apple
Type 4153 - McIntosh Apple
Type 4011 - Regular Cabanita Banana

Table 1. Apple and Banana varieties used

The quality of the fruit selected was typical of a major US produce section of a supermarket. Apples were reasonably ripe and free of spots. Bananas were somewhat green (as is the unfortunate case of most bananas in the US) with few spots.

Produce quality grade apples and bananas vary in terms of color, shape and size. To account for these differences, we selected 2 of each type of apple and banana. With the kind assistance of a local produce vendor, we selected the both largest and the smallest (by weight) of each variety used (see table 1).

Given that we cannot assume any particular orientation of the fruit other than it is flat on on the conveyer belt, we took 2 images of each fruit. In the case of apples (which are reasonably symmetric around the axis of the stem) we imaged them both on their sides as well as on their ends. In the case of the bananas, which lying flat are restricted their sizes, we imaged both sides.

The original images taken were from an 8 bit color camera, 413 by 413 pixels in size. At the distance taken, the resolution was approximately 20 pixels per centimeter (50 pixels per inch). Lighting was from behind the camera.

Type - Name	Large image	Small image
4015 - Red Delicious	image 1 image 2	image 1 image 2
4020 - Golden Delicious	image 1 image 2	image 1 image 2
4103 - Breaburn	image 1 image 2	image 1 image 2
4131 - Fuji	image 1 image 2	image 1 image 2
4133 - Gala	image 1 image 2	image 1 image 2
4139 - Granny Smith	image 1 image 2	image 1 image 2
4153 - McIntosh	image 1 image 2	image 1 image 2
4011 - Regular Cabanita (banana)	image 1 image 2	image 1 image 2

Table 2. Original Color images

We converted the original color images (see table 1) into B&W images by two means. One method was by a dithering process, the other was by a saturation process. Color to B&W dithering was perform in a standard fashion by the XV program using its TIFF dithering method. The saturation conversion process used a luminosity step function. The step was set to the highest luminosity level such that very few pixels off of the flat back background (simulated conveyer belt) registered as white pixels. The luminosity level was set by experimentation on the reference frame images (pictures without fruit).

Reference frame	Color image	Dithered B&W image	Saturation B&W image
Flat black Conveyer belt	color	dithered	saturated
All white full luminosity	color	dithered	saturated

Table 3. Reference frame images

The results of converting table 2 images into B&W images are found in table 4 below.

Type - Name	Large dithered	Small dithered	Large saturated	Small saturated
4015 - Red Delicious	image 1 image 2	image 1 image 2	image 1 image 2	image 1 image 2
4020 - Golden Delicious	image 1 image 2	image 1 image 2	image 1 image 2	image 1 image 2
4103 - Breaburn	image 1 image 2	image 1 image 2	image 1 image 2	image 1 image 2
4131 - Fuji	image 1 image 2	image 1 image 2	image 1 image 2	image 1 image 2
4133 - Gala	image 1 image 2	image 1 image 2	image 1 image 2	image 1 image 2
4139 - Granny Smith	image 1 image 2	image 1 image 2	image 1 image 2	image 1 image 2
4153 - McIntosh	image 1 image 2	image 1 image 2	image 1 image 2	image 1 image 2
4011 - Regular Cabanita (banana)	image 1 image 2	image 1 image 2	image 1 image 2	image 1 image 2

Table 4. B&W images

The results

The white pixel counts as determined by the disting program are detailed in table 5 below.

Type - Name	Dithered count	Saturated count
4015 - Red Delicious	865 (0.507%) 476 (0.279%) 1073 (0.629%) 320 (0.188%)	2608 (1.529%) 2569 (1.506%) 4065 (2.383%) 1614 (0.946%)
4020 - Golden Delicious	4456 (2.612%) 3422 (2.006%) 2409 (1.412%) 3022 (1.772%)	11997 (7.033%) 9660 (5.663%) 8310 (4.872%) 9539 (5.592%)
4103 - Breaburn	1485 (0.871%) 1636 (0.959%) 1367 (0.801%) 1613 (0.946%)	6566 (3.849%) 7383 (4.328%) 5377 (3.152%) 5835 (3.421%)
4131 - Fuji	2820 (1.653%) 2828 (1.658%) 1482 (0.869%) 2342 (1.373%)	9305 (5.455%) 10145 (5.947%) 5652 (3.313%) 9079 (5.323%)
4133 - Gala	3018 (1.769%) 1901 (1.114%) 2063 (1.209%) 2321 (1.361%)	9603 (5.630%) 7117 (4.172%) 7509 (4.402%) 8364 (4.903%)
4139 - Granny Smith	1654 (0.970%) 1277 (0.749%) 1796 (1.053%) 1040 (0.610%)	6690 (3.922%) 7433 (4.358%) 6942 (4.070%) 6490 (3.805%)
4153 - McIntosh	1410 (0.827%) 970 (0.569%) 1415 (0.830%) 2066 (1.211%)	5140 (3.013%) 4935 (2.893%) 4500 (2.638%) 8224 (4.821%)
4011 - Regular Cabanita (banana)	8360 (4.901%) 12300 (7.211%) 5699 (3.341%) 6844 (4.012%)	25946 (15.211%) 29089 (17.053%) 20263 (11.879%) 18603 (10.906%)

Table 5. White pixel counts

The lowest and highest pixel counts from table 4 for apples and bananas for both dithered and saturated images are found in table 6 below.

Fruit	Dithered	Saturated
Apples	320 (0.188%) min 4456 (2.629%) max	1614 (0.946%) min 11997 (7.033%) max
Bananas	5699 (3.341%) min 12300 (7.211%) max	18603 (10.906%) min 29089 (17.053%) max

Table 6. Min and Max pixel counts by fruit

Dithered Apple image with the most white pixels	Saturated Apple image with the most white pixels
Dithered Banana image with the most white pixels	Saturated Banana image with the most white pixels

All of the varieties, sizes and orientations of apples that we tested had notably fewer set pixels than that of the bananas. This relationship held for both dithered and saturated images.

The large Golden Delicious apple produced the most number of white pixels in saturated images. However that saturated apple image had only 78.2% of the white pixels of the smallest banana. The large Golden Delicious also produced the most number of white pixels in dithered images. However that dithered apple image had only 64.5% of the white pixels of the smallest banana. However as one can see from table 6, the typical apple imaged was even more distinct. Typically bananas have between 3 and 4 times the number of white pixels as apples.

Statistic	Dithered Apple	Saturated Apple	Dithered Banana	Saturated Banana	Dithered Apple/Banana	Saturated Apple/Banana
Average	1876.7	6880.4	8300.7	23475.2	4.423	3.412
Standard Deviation	929.7	2486.6	2880.3	4887.9	(n/a)	(n/a)

Table 7. Image statistics

Dithering of images produced the sharpest distinction between apples and bananas. Presumably dithered images would result in fewer mis-identifications errors. But in the worst case that we observed, the apple and banana while pixel count populations were very distinct.

The time to both read the image and count the pixels was so insignificant as to not be measurable, often taking less than 10 milliseconds on a 200 Mhz Pentium or SGI R5k O2.

Conclusions

One can distinguish between apples and bananas by scanning their black and white image against a flat black background and counting the number of white pixels. This was accomplished over a wide range of sizes and varieties. No special orientation other than the fruit lying flat was required. B&W image production by dithering was shown to produce the best results. A single image scan was required resulting in an algorithm that runs on linear or O(n) time with respect to the image size.