Threshold to amplify black lines - image-processing

Given an image (like the one below), I need to convert it into a binary image (black and white pixels only). This sounds easy enough, and I have tried two thresholding functions. The problem is that I can't get clean edges with either of them. Any help would be greatly appreciated.
The filters I have tried threshold on the Euclidean distance in the RGB and HSV spaces.
Sample image:
Here it is after running an RGB threshold filter (at 40%; above this, more artefacts appear).
Here it is after running an HSV threshold filter (at 30% the paths become barely visible, but the result is clearly unusable because of the noise).
The code I am using is pretty straightforward: convert the input image to the appropriate color space and check the Euclidean distance to the color black.
sqrt(R*R + G*G + B*B)
since I am comparing with black (0, 0, 0)

Your problem appears to be the variation in lighting over the scanned image which suggests that a locally adaptive thresholding method would give you better results.
The Sauvola method calculates the value of a binarized pixel based on the mean and standard deviation of pixels in a window of the original image. This means that if an area of the image is generally darker (or lighter) the threshold will be adjusted for that area and (likely) give you fewer dark splotches or washed-out lines in the binarized image.
http://www.mediateam.oulu.fi/publications/pdf/24.p
I also found a method by Shafait et al. that implements the Sauvola method with greater time efficiency. The drawback is that you have to compute two integral images of the original, one at 8 bits per pixel and the other potentially at 64 bits per pixel, which might present a problem with memory constraints.
http://www.dfki.uni-kl.de/~shafait/papers/Shafait-efficient-binarization-SPIE08.pdf
I haven't tried either of these methods, but they do look promising. I found Java implementations of both with a cursory Google search.
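To make the idea concrete, here is a minimal pure-Python sketch of a Sauvola-style threshold. It uses the commonly cited form T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation. The function name, the window size, and the values k = 0.2 and R = 128 are assumptions chosen for illustration, not values taken from the papers above, and the brute-force window loop is far slower than the integral-image variant.

```python
import math

def sauvola_binarize(gray, window=15, k=0.2, R=128.0):
    """Binarize a 2D list of grey levels (0-255) with a Sauvola-style rule.

    Returns a 2D list where 1 marks dark foreground and 0 marks background.
    window, k and R are illustrative defaults (assumptions).
    """
    h, w = len(gray), len(gray[0])
    half = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Gather the window around (x, y), clipped at the image border.
            vals = [gray[j][i]
                    for j in range(max(0, y - half), min(h, y + half + 1))
                    for i in range(max(0, x - half), min(w, x + half + 1))]
            mean = sum(vals) / len(vals)
            std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
            # Local threshold: lowered in flat regions, raised near strong contrast.
            threshold = mean * (1.0 + k * (std / R - 1.0))
            out[y][x] = 1 if gray[y][x] < threshold else 0
    return out

# A bright page with one dark vertical line in column 3.
page = [[50 if x == 3 else 200 for x in range(7)] for _ in range(7)]
binary = sauvola_binarize(page, window=5)
```

Because the threshold follows the local mean, a uniformly bright or uniformly dim region is classified as background either way, which is the property that helps with uneven lighting.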

Running an adaptive threshold over the V channel in the HSV color space should produce brilliant results. A window larger than 11x11 should give the best results, and don't forget to choose a negative value for the constant.
Adaptive thresholding basically is:
if (pixel value + constant > average pixel value in the window around the pixel)
    Pixel_Binary = 1;
else
    Pixel_Binary = 0;
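The rule above can be written out as a small runnable sketch. This is plain Python over a 2D list of grey levels rather than a call into any particular library; the function name and the border handling (clipping the window at the image edge) are assumptions made for illustration.

```python
def adaptive_mean_threshold(gray, window, constant):
    """Apply the rule: 1 if pixel + constant > local mean, else 0.

    gray is a 2D list of intensities; window is the side of the square
    neighbourhood. The window is clipped at the borders (an assumption).
    """
    h, w = len(gray), len(gray[0])
    half = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        rows = range(max(0, y - half), min(h, y + half + 1))
        for x in range(w):
            cols = range(max(0, x - half), min(w, x + half + 1))
            total = sum(gray[j][i] for j in rows for i in cols)
            mean = total / (len(rows) * len(cols))
            out[y][x] = 1 if gray[y][x] + constant > mean else 0
    return out

# A bright image with one dark vertical line in column 3.
image = [[50 if x == 3 else 200 for x in range(7)] for _ in range(7)]
binary = adaptive_mean_threshold(image, window=5, constant=10)
```

The sign of the constant decides how perfectly flat regions are classified: with a positive constant they come out as 1, with a negative constant as 0, so it is worth trying both on the actual V channel.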

Due to the noise and the illumination variation, you may need adaptive local thresholding (thanks to Beaker for his answer too).
Therefore, I tried the following steps:
Convert it to grayscale.
Do the mean or the median local thresholding, I used 10 for the window size and 10 for the intercept constant and got this image (smaller values might also work):
Please refer to http://homepages.inf.ed.ac.uk/rbf/HIPR2/adpthrsh.htm if you need more information on this technique.
To make sure the thresholding was working fine, I skeletonized it to see if there is a line break. This skeleton may be the one needed for further processing.
To get rid of the remaining noise, you can just keep the longest connected component in the skeletonized image.
Thank you.

You probably want to do this as a three-step operation.
Use leveling, not just thresholding: take the input and scale the intensities (gamma correct) with parameters that simply dull the midtones, without removing the darks or the lights (your RGB threshold is too strong, for instance; you lost some of your lines).
Edge-detect the resulting image using a small kernel convolution (5x5 for binary images should be more than enough). Use a simple [1 2 3 2 1 ; 2 3 4 3 2 ; 3 4 5 4 3 ; 2 3 4 3 2 ; 1 2 3 2 1] kernel (normalised).
Threshold the resulting image. You should now have a much better binary image.

You could try a black top-hat transform. This involves subtracting the image from the closing of the image. I used a structuring element window size of 11 and a constant threshold of 0.1 (25.5 on a 255 scale).
You should get something like:
Which you can then easily threshold:
Best of luck.
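For reference, a black top-hat can be sketched in a few lines of plain Python: a closing is a dilation (local maximum) followed by an erosion (local minimum), and the top-hat is the closing minus the original. The square structuring element, the clipped borders, and the function names are assumptions made for this sketch; a real image library will be much faster.

```python
def _window_filter(gray, size, reducer):
    """Apply max (dilation) or min (erosion) over a square window, clipped at borders."""
    h, w = len(gray), len(gray[0])
    half = size // 2
    return [[reducer(gray[j][i]
                     for j in range(max(0, y - half), min(h, y + half + 1))
                     for i in range(max(0, x - half), min(w, x + half + 1)))
             for x in range(w)]
            for y in range(h)]

def black_top_hat(gray, size=11):
    """Closing of the image minus the image itself; dark thin details come out bright."""
    closed = _window_filter(_window_filter(gray, size, max), size, min)
    return [[c - g for c, g in zip(closed_row, gray_row)]
            for closed_row, gray_row in zip(closed, gray)]

def fixed_threshold(img, level):
    """Constant threshold: 1 where the value exceeds level, else 0."""
    return [[1 if v > level else 0 for v in row] for row in img]

# Background brightens from left to right; column 4 holds a dark line.
scan = [[20 if x == 4 else 100 + 10 * x for x in range(9)] for _ in range(9)]
lines = fixed_threshold(black_top_hat(scan, size=5), 25.5)
```

In this toy input the uneven background is removed by the top-hat, so a single constant threshold is enough to isolate the line.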

Related

How to assess image quality using image comparison

I would like to compare videos by quality (how blurry they are) with a C program. Someone told me to learn about the DFT (Discrete Fourier Transform) for image analysis and to use an FFT or DFT tool to learn the difference between blurred and detailed (non-blurry) copies of the same image.
(copied from other question):
Let's say we have different files with different video quality: one is extremely clear, another is blurred, and another has rough colors. Compare all the files frame by frame and report to the user which has better quality.
Can anyone help me with this?
Let's say we have various files having different video quality:
one is extremely clear, other is blurred, one is having rough colors.
Compare all files basically frame by frame and report to the user which has better quality.
(1) Color Quality detection...
To check which has better color, analyze the histograms of the test images. A histogram is a count of how many pixels have intensity X, where X ranges from 0 to 255 (because each of the red, green and blue channels holds one of 256 possible intensities).
There are many tutorials online about how to create a histogram since it's a basic task in computer graphics.
Generally it goes like:
First make 3 arrays (eg: hist_Red) to hold data for red, green and blue channels.
Break up (using FOR loop) each pixel into individual R/G/B channel components:
example:
temp_Red = this_pixel >> 16 & 0x0ff;
temp_Grn = this_pixel >> 8 & 0x0ff;
temp_Blu = this_pixel >> 0 & 0x0ff;
Then add +1 to that specific red/green/blue intensity in relevant histogram.
example:
hist_Red[ temp_Red ] += 1;
hist_Grn[ temp_Grn ] += 1;
hist_Blu[ temp_Blu ] += 1;
By counting the red, green and blue intensities this way, you will have the distribution of each channel in an array that could be used to build charts like the ones below. Check which image's arrays have the most values to find the image with the better color quality:
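The histogram steps above can be put together as a short runnable sketch. It assumes pixels packed as 0xRRGGBB integers, as in the snippets. Reading "most values" as "the number of non-empty bins" is my own interpretation, and the helper name occupied_bins is hypothetical.

```python
def channel_histograms(pixels):
    """Build 256-bin histograms for red, green and blue from 0xRRGGBB pixels."""
    hist_red = [0] * 256
    hist_grn = [0] * 256
    hist_blu = [0] * 256
    for pixel in pixels:
        hist_red[(pixel >> 16) & 0xFF] += 1
        hist_grn[(pixel >> 8) & 0xFF] += 1
        hist_blu[pixel & 0xFF] += 1
    return hist_red, hist_grn, hist_blu

def occupied_bins(hist):
    """Count intensities that occur at least once (one possible 'richness' score)."""
    return sum(1 for count in hist if count > 0)

frame = [0xFF0000, 0x00FF00, 0x0000FF, 0xFF00FF]
red, green, blue = channel_histograms(frame)
score = occupied_bins(red) + occupied_bins(green) + occupied_bins(blue)
```

Comparing such a score between two frames only makes sense when both show the same scene at the same resolution.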
(2) Detailed vs Blurred detection...
You can try using a convolution filter to detect blur in an image. Give the filter a kernel (i.e. a matrix). The 3x3 matrix shown below gives an edge-detect filter; blurred images give fewer edges (and therefore more black pixels).
Then assume that more black pixels equals a more blurred image (less detail).
You can read about convolutions here
Lode's Computer Graphics Tutorial: Image Filtering
Image Convolution with C/C++ code
PDF Image Manipulation: Filters and Convolutions
PDF Read page 10 onwards : Convolution filters

photoshop parameters not good on graphicsmagick

I'm trying to translate a Photoshop sharpening setting to GraphicsMagick, and I found this helpful article:
https://redskiesatnight.com/2005/04/06/sharpening-using-image-magick/
The problem is that if I use the Photoshop-equivalent values explained in the article in GraphicsMagick, the images are not as sharp and clear as in Photoshop.
For example, I use these settings in Photoshop:
Strength: 500%
Radius: 2.0 Pixel
Threshold: 8
In the article the parameters are explained like this:
The radius parameter
The radius parameter specifies (official documentation)
“the radius of the Gaussian, in pixels, not counting the center pixel”
Unsharp masking, like many other image-processing filters, is a
convolution kernel operation. The filter processes the image pixel by
pixel. For each pixel it examines a block of pixels surrounding it
(the kernel) and does some calculations on them to render the output
pixel value. The radius parameter determines which pixels surrounding
the center pixel will be considered in the convolution kernel: (think
of a circle) the larger the radius, the more pixels that will need to
be processed for each individual pixel.
Image Magick’s radius is similar to the same parameter in Photoshop,
GIMP and other image editors. In practical terms, it affects the size
of the “halos” created to increase contrast along the edges in the
image, increasing acutance and thus apparent sharpness.
How do you know how big of a radius to use? It depends on your output
target resolution, for one thing. It also depends on your personal
preferences, as well as the specific needs of the image at hand. As
far as the resolution issue goes, the GIMP User Manual recommends that
unsharp mask radius be set as follows:
radius = (output ppi / 30) * 0.2
Which is very similar to another commonly found rule of thumb:
radius = output ppi / 150
So for a monitor with 72 PPI resolution, you’d use a radius of approximately 0.5; if you’re targeting a printer
at 300 PPI you’d use a value of 2.0. Use these as a starting point;
different images have different sharpening requirements, and
individual preference is also a consideration. [Aside: there are a few
postings around the net (including some referenced in this article)
that suggest that Image Magick accepts, but does not honor, fractional
radii; that is, if you specify a radius of 0.5 or 1.2 it is rounded,
or defaults to an integer, or is silently ignored, etc. This is not
true, at least as of version 5.4.7, which is the one that I am using
as I write this article. You can easily see for yourself by doing
something like the following:
$ convert -unsharp 1.2x1.2+5+0 test.tif testo1.tif
$ convert -unsharp 1.4x1.4+5+0 test.tif testo2.tif
$ composite -compose difference testo1.tif testo2.tif diff.tif
$ display diff.tif
you can also load
them into the GIMP or Photoshop into different layers and change the
blend mode to “Difference”; the resulting image is not black (you may
need to look closely for a 0.2 difference in radius). No, this
mistaken impression likely comes from the fact that there is a
relationship between the radius and sigma parameters, and if you do
not specify sigma properly in relation to the radius, the radius may
indeed be changed, or at least not work as expected. Read on for more
on this.]
Please note that the default radius (if you do not specify anything)
is 0, a special value which tells the unsharp mask algorithm to
“select an appropriate value for the radius”!.
The sigma parameter
The sigma parameter specifies (official documentation)
“the standard deviation of the Gaussian, in pixels”
This is the most confusing parameter of the four, probably because it
is “invisible” in other implementations of unsharp masking, and it is
most sparsely documented. The best explanation I have found for it
came from a google search that unearthed an archived mailing list
thread which had the following snippet:
Comparing the results of
convert -unsharp 1.2x1+4+0 test test1.2x1+4+0
and
convert -unsharp 30x1+4+0 test test30x1+4+0
results in no significant differences but the latter takes approx. 50 times
longer to complete.
That is not surprising. A radius of 30 involves on the order of 61x61
input pixels in the convolution of each output pixel. A radius of 1.2
involves 3x3 or 5x5 pixels.
Please can anybody give me any hints, what 'sigma' means?
It describes the relative weight of pixels as a function of their
distance from the center of the convolution kernel. For small sigma,
the outer pixels have little weight.
Another important clue comes from
the documentation for the -unsharp option to convert (emphasis mine):
The -unsharp option sharpens an image. We convolve the image with a
Gaussian operator of the given radius and standard deviation (sigma).
For reasonable results, radius should be larger than sigma. Use a
radius of 0 to have the method select a suitable radius.
Combining the two clues provides some good insight: sigma is a
parameter that gives you some control over how the strength (given by
the amount parameter) of the sharpening is “graduated” or lessened as
you radiate away from a given pixel at the center of the convolution
matrix to the limit defined by the radius. My testing confirms this
inferred conclusion, namely that a bigger sigma causes more pronounced
sharpening for a given radius. That is why the poster in the mailing
list question (above) did not see any significant difference in the
sharpening even though he was using an amount of 400% (!!) and a
threshold of 0%; with a sigma of only 1.0, the strength of the filter
falls off too rapidly to be noticed despite the large difference in
radius between the two invocations. This is also why the man page says
“for reasonable results, radius should be larger than sigma”; if it is
not, then the sigma parameter does not have a graduated effect, as
designed, to “soften” the halos toward their edges; instead it simply
applies the amount evenly to the edge of the radius (which may be what
you want in some circumstances). A general rule of thumb for choosing
sigma might be:
if radius < 1, then sigma = radius, else sigma = sqrt(radius)
Summary: choose your radius first, then choose a sigma smaller than or equal to
that. Experimentation will yield the best results. Please note that
the default sigma (if you do not specify anything) is 1.0. This is the
main culprit for why most people don’t see as much effect with Image
Magick’s unsharp mask operator as they do with other implementations
of unsharp mask if they are using a larger radius: unless you bump up
this parameter you are not getting the full benefit of the larger
radius!
[Aside: you might be wondering what happens if sigma is specified
larger than the radius. The answer, as the documentation states, is
that the result may not be “reasonable”. In my testing, the usual
result is that the sharpening is extended at the specified amount to
the edge of the specified radius, and larger values of sigma have
little if any effect. In some cases (e.g. for radius < 0) specifying a
larger sigma increased the effective radius (e.g. to 1); this may be
the result of a “sanity check” on the parameters in the code. In any
case, keep in mind that the algorithm is designed for sigma to be less
than or equal to the radius, and results may be unexpected if used
otherwise.]
The amount parameter
The amount parameter specifies (official documentation)
“the percentage of the difference between the original and the blur
image that is added back into the original”
The amount parameter can be thought of as the “strength” of the
sharpening: higher values increase contrast in the halos more than
lower ones. Very large values may cause highlights on halos to blow
out or shadows on them to block up. If this happens, consider using a
lower amount with a higher radius instead.
amount should be specified as a decimal value indicating the
percentage. So, for example, if in Photoshop you would use an amount
of 170 (170%), in Image Magick you would use 1.7.
Please note that the default amount (if you do not specify anything)
is 1.0 (i.e. 100%).
The threshold parameter
The threshold parameter specifies (official documentation)
“as a fraction of MaxRGB, needed to apply the difference amount”
The threshold specifies a minimum amount of difference between the
center pixel vs. surrounding pixels in the convolution kernel
necessary to apply the local contrast enhancement. Increasing this
value causes the algorithm to become less sensitive to differences
that may define edges. Specifying a positive threshold is often used
to avoid sharpening smooth areas that may contain noise (e.g. an area
of blue sky). If you have a noisy image, strongly consider raising the
threshold, or using some kind of smart sharpening technique instead.
The threshold parameter should be specified as a decimal value
indicating this percentage. This is different than GIMP or Photoshop,
which both specify the threshold in actual pixel levels between 0 and
the maximum (for 8-bit images, 255).
Please note that the default threshold (if you do not specify
anything) is 0.05 (i.e. 5%; this corresponds to a threshold of .05 *
255 = 12-13 in Photoshop). Photoshop uses a default threshold of 0
(i.e. no threshold) and the unsharp masking is applied evenly
throughout the image. If that is what you want you will need to
specify a 0.0 value for Image Magick’s threshold. This is undoubtedly
another source of confusion regarding Image Magick’s sharpening
algorithm.
So I followed the article and came up with this command:
gm convert file1.jpg -unsharp 2x1.41+5+0.03 file1_2x1.41+5+0.03.jpg
But as I said, the images don't get as sharp as they do in Photoshop. We also experimented with a lot of other values, without getting good images. So is it possible to reproduce Photoshop's sharpening with GraphicsMagick, or is the library just not good enough? We can't simply use Photoshop for the sharpening because we want to process the images on our Linux server, and Photoshop does not run well on Linux.
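For reference, the conversion rules quoted from the article can be written as a small helper, which makes it easy to try other Photoshop settings consistently. This is only a sketch of the article's rules of thumb (sigma from the radius, amount as a fraction, threshold as a fraction of the maximum level); the function name is hypothetical, and it does not claim that the resulting output will match Photoshop.

```python
import math

def photoshop_to_unsharp(amount_percent, radius_px, threshold_levels, max_level=255):
    """Build a radius x sigma + amount + threshold string from Photoshop-style values.

    Follows the quoted article's rules of thumb:
      sigma     = radius if radius < 1 else sqrt(radius)
      amount    = percentage / 100
      threshold = levels / max_level
    """
    sigma = radius_px if radius_px < 1 else math.sqrt(radius_px)
    amount = amount_percent / 100.0
    threshold = threshold_levels / max_level
    return f"{radius_px:g}x{sigma:.2f}+{amount:g}+{threshold:.2f}"

# Strength 500%, radius 2.0 px, threshold 8 levels
args = photoshop_to_unsharp(500, 2.0, 8)
print(args)  # 2x1.41+5+0.03
```

The printed string is the same argument used in the command above, so the command does follow the article; any remaining difference from Photoshop would have to come from somewhere other than the parameter conversion.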

Simple way to check if an image bitmap is blur

I am looking for a "very" simple way to check if an image bitmap is blurry. I do not need an accurate and complicated algorithm involving FFT, wavelets, etc. Just a very simple idea, even if it is not accurate.
I've thought of computing the average Euclidean distance between pixel (x,y) and pixel (x+1,y) over their RGB components and then applying a threshold, but it works very badly. Any other ideas?
Don't calculate the average differences between adjacent pixels.
Even when a photograph is perfectly in focus, it can still contain large areas of uniform colour, like the sky for example. These will push down the average difference and mask the details you're interested in. What you really want to find is the maximum difference value.
Also, to speed things up, I wouldn't bother checking every pixel in the image. You should get reasonable results by checking along a grid of horizontal and vertical lines spaced, say, 10 pixels apart.
Here are the results of some tests with PHP's GD graphics functions using an image from Wikimedia Commons (Bokeh_Ipomea.jpg). The Sharpness values are simply the maximum pixel difference values as a percentage of 255 (I only looked in the green channel; you should probably convert to greyscale first). The numbers underneath show how long it took to process the image.
If you want them, here are the source images I used:
original
slightly blurred
blurred
Update:
There's a problem with this algorithm in that it relies on the image having a fairly high level of contrast as well as sharp focused edges. It can be improved by finding the maximum pixel difference (maxdiff), and finding the overall range of pixel values in a small area centred on this location (range). The sharpness is then calculated as follows:
sharpness = (maxdiff / (offset + range)) * (1.0 + offset / 255) * 100%
where offset is a parameter that reduces the effects of very small edges so that background noise does not affect the results significantly. (I used a value of 15.)
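A compact sketch of this measure in Python, for a greyscale image given as a 2D list, might look as follows. The 10-pixel grid spacing, offset = 15 and the 9x9 search area come from the description here; the function name and the clipping of the search area at the image border are assumptions.

```python
def sharpness(gray, step=10, offset=15, area=9):
    """Return the sharpness percentage for a 2D list of grey levels (0-255)."""
    h, w = len(gray), len(gray[0])
    maxdiff, loc = 0, (0, 0)
    # Scan horizontal grid lines for the largest adjacent-pixel difference.
    for y in range(0, h, step):
        for x in range(w - 1):
            d = abs(gray[y][x + 1] - gray[y][x])
            if d > maxdiff:
                maxdiff, loc = d, (x, y)
    # Scan vertical grid lines the same way.
    for x in range(0, w, step):
        for y in range(h - 1):
            d = abs(gray[y + 1][x] - gray[y][x])
            if d > maxdiff:
                maxdiff, loc = d, (x, y)
    # Range of pixel values in a small area centred on the strongest edge.
    half = area // 2
    cx, cy = loc
    local = [gray[j][i]
             for j in range(max(0, cy - half), min(h, cy + half + 1))
             for i in range(max(0, cx - half), min(w, cx + half + 1))]
    value_range = max(local) - min(local)
    return (maxdiff / (offset + value_range)) * (1.0 + offset / 255) * 100.0

hard_edge = [[0] * 10 + [255] * 10 for _ in range(20)]
soft_ramp = [[round(x * 255 / 19) for x in range(20)] for _ in range(20)]
```

On these two synthetic inputs the hard edge scores about 100% and the gradual ramp scores well under 40%.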
This produces fairly good results. Anything with a sharpness of less than 40% is probably out of focus. Here are some examples (the locations of the maximum pixel difference and the 9×9 local search areas are also shown for reference):
(source)
(source)
(source)
(source)
The results still aren't perfect, though. Subjects that are inherently blurry will always result in a low sharpness value:
(source)
Bokeh effects can produce sharp edges from point sources of light, even when they are completely out of focus:
(source)
You commented that you want to be able to reject user-submitted photos that are out of focus. Since this technique isn't perfect, I would suggest that you instead notify the user if an image appears blurry instead of rejecting it altogether.
I suppose that, philosophically speaking, all natural images are blurry... How blurry, and how much that matters, depends upon your application. Broadly speaking, the blurriness or sharpness of images can be measured in various ways. As a first easy attempt I would check the energy of the image, defined as the normalised sum of the squared pixel values:
$E = \frac{1}{N} \sum I^2$, where I is the image and N is the number of pixels (defined for grayscale)
First you may apply a Laplacian of Gaussian (LoG) filter to detect the "energetic" areas of the image and then check the energy. The blurry image should show considerably lower energy.
See an example in MATLAB using a typical grayscale lena image:
This is the original image
This is the blurry image, blurred with a Gaussian filter
This is the LoG image of the original
And this is the LoG image of the blurry one
If you just compute the energy of the two LoG images you get:
$E_{or} = 1265$, $E_{bl} = 88$
which is a huge amount of difference...
Then you just have to select a threshold to judge which amount of energy is good for your application...
Calculate the average L1-distance of adjacent pixels:
N1=1/(2*N_pixel) * sum( abs(p(x,y)-p(x-1,y)) + abs(p(x,y)-p(x,y-1)) )
then the average L2 distance:
N2= 1/(2*N_pixel) * sum( (p(x,y)-p(x-1,y))^2 + (p(x,y)-p(x,y-1))^2 )
Then the ratio N2 / (N1*N1) is a measure of blurriness. This is for grayscale images; for color, do this for each channel separately.
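As a quick sanity check of the ratio, here is a small Python sketch for a greyscale image stored as a 2D list. It sums over the pixels that have both a left and an upper neighbour and uses that count as N_pixel, which is an assumption about the boundary handling; the function name is hypothetical.

```python
def blur_ratio(gray):
    """Return N2 / (N1 * N1) for a 2D list of grey levels, or None for a flat image."""
    h, w = len(gray), len(gray[0])
    n_pixel = (h - 1) * (w - 1)   # pixels with both a left and an upper neighbour
    sum_abs = 0.0
    sum_sq = 0.0
    for y in range(1, h):
        for x in range(1, w):
            dx = gray[y][x] - gray[y][x - 1]
            dy = gray[y][x] - gray[y - 1][x]
            sum_abs += abs(dx) + abs(dy)
            sum_sq += dx * dx + dy * dy
    n1 = sum_abs / (2 * n_pixel)
    n2 = sum_sq / (2 * n_pixel)
    if n1 == 0:
        return None               # no variation at all, ratio undefined
    return n2 / (n1 * n1)

step_edge = [[0] * 4 + [255] * 4 for _ in range(8)]    # one hard edge
ramp = [[10 * x for x in range(8)] for _ in range(8)]  # smooth gradient
```

On these toy inputs the hard edge gives a ratio of 14 and the smooth ramp gives 2, so in this case a larger value goes with the sharper image.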

Fast image thresholding

What is a fast and reliable way to threshold images with possible blurring and non-uniform brightness?
Example (blurring but uniform brightness):
Because the image is not guaranteed to have uniform brightness, it's not feasible to use a fixed threshold. An adaptive threshold works alright, but because of the blurriness it creates breaks and distortions in the features (here, the important features are the Sudoku digits):
I've also tried using Histogram Equalization (using OpenCV's equalizeHist function). It increases contrast without reducing differences in brightness.
The best solution I've found is to divide the image by its morphological closing (credit to this post) to make the brightness uniform, then renormalize, then use a fixed threshold (using Otsu's algorithm to pick the optimal threshold level):
Here is code for this in OpenCV for Android:
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(19,19));
Mat closed = new Mat(); // closed will have type CV_32F
Imgproc.morphologyEx(image, closed, Imgproc.MORPH_CLOSE, kernel);
Core.divide(image, closed, closed, 1, CvType.CV_32F);
Core.normalize(closed, image, 0, 255, Core.NORM_MINMAX, CvType.CV_8U);
Imgproc.threshold(image, image, -1, 255, Imgproc.THRESH_BINARY_INV
+Imgproc.THRESH_OTSU);
This works great but the closing operation is very slow. Reducing the size of the structuring element increases speed but reduces accuracy.
Edit: based on DCS's suggestion I tried using a high-pass filter. I chose the Laplacian filter, but I would expect similar results with Sobel and Scharr filters. The filter picks up high-frequency noise in the areas which do not contain features, and suffers from similar distortion to the adaptive threshold due to blurring. It also takes about as long as the closing operation. Here is an example with a 15x15 filter:
Edit 2: Based on AruniRC's answer, I used Canny edge detection on the image with the suggested parameters:
double mean = Core.mean(image).val[0];
Imgproc.Canny(image, image, 0.66*mean, 1.33*mean);
I'm not sure how to reliably automatically fine-tune the parameters to get connected digits.
Using Vaughn Cato and Theraot's suggestions, I scaled down the image before closing it, then scaled the closed image up to regular size. I also reduced the kernel size proportionately.
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(5,5));
Mat temp = new Mat();
Imgproc.resize(image, temp, new Size(image.cols()/4, image.rows()/4));
Imgproc.morphologyEx(temp, temp, Imgproc.MORPH_CLOSE, kernel);
Imgproc.resize(temp, temp, new Size(image.cols(), image.rows()));
Core.divide(image, temp, temp, 1, CvType.CV_32F); // temp will now have type CV_32F
Core.normalize(temp, image, 0, 255, Core.NORM_MINMAX, CvType.CV_8U);
Imgproc.threshold(image, image, -1, 255,
Imgproc.THRESH_BINARY_INV+Imgproc.THRESH_OTSU);
The image below shows the results side-by-side for 3 different methods:
Left - regular size closing (432 pixels), size 19 kernel
Middle - half-size closing (216 pixels), size 9 kernel
Right - quarter-size closing (108 pixels), size 5 kernel
The image quality deteriorates as the size of the image used for closing gets smaller, but the deterioration isn't significant enough to affect feature recognition algorithms. The speed increases slightly more than 16-fold for the quarter-size closing, even with the resizing, which suggests that closing time is roughly proportional to the number of pixels in the image.
Any suggestions on how to further improve upon this idea (either by further increasing the speed, or by reducing the deterioration in image quality) are very welcome.
Alternative approach:
Assuming your intention is to have the numerals to be clearly binarized ... shift your focus to components instead of the whole image.
Here's a pretty easy approach:
Do a Canny edge map on the image. As a first try, set the low threshold to 0.66*[mean value] and the high threshold to 1.33*[mean value] (where [mean value] is the mean of the grey-level values).
You would need to fiddle with the parameters a bit to get an image where the major components/numerals are visible clearly as separate components. Near perfect would be good enough at this stage.
Considering each Canny edge as a connected component (i.e. use the cvFindContours() or its C++ counterpart, whichever) one can estimate the foreground and background greylevels and reach a threshold.
For the last bit, do take a look at sections 2. and 3. of this paper. Skipping most of the non-essential theoretical parts it shouldn't be too difficult to have it implemented in OpenCV.
Hope this helped!
Edit 1:
Based on the Canny edge thresholds here's a very rough idea just sufficient to fine-tune the values. The high_threshold controls how strong an edge must be before it is detected. Basically, an edge must have gradient magnitude greater than high_threshold to be detected in the first place. So this does the initial detection of edges.
Now, the low_threshold deals with connecting nearby edges. It controls how much nearby disconnected edges will get combined together into a single edge. For a better idea, read "Step 6" of this webpage. Try setting a very small low_threshold and see how things come about. You could discard that 0.66*[mean value] thing if it doesn't work on these images; it's just a rule of thumb anyway.
We use Bradley's algorithm for a very similar problem (segmenting letters from the background, with uneven light and uneven background color). It is described here: http://people.scs.carleton.ca:8008/~roth/iit-publications-iti/docs/gerh-50002.pdf, with C# code here: http://code.google.com/p/aforge/source/browse/trunk/Sources/Imaging/Filters/Adaptive+Binarization/BradleyLocalThresholding.cs?r=1360. It works on the integral image, which can be calculated using OpenCV's integral function. It is very reliable and fast; it is not itself implemented in OpenCV, but it is easy to port.
Another option is the adaptiveThreshold method in OpenCV, but we did not give it a try: http://docs.opencv.org/modules/imgproc/doc/miscellaneous_transformations.html#adaptivethreshold. The MEAN version is the same as Bradley's, except that it uses a constant to modify the mean value instead of a percentage, which I think is better.
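To show how little code the port needs, here is a rough Python sketch of Bradley-style thresholding on an integral image. The window of one eighth of the image width and the 15% margin are the values usually quoted for this method; the function name and the plain-list image representation are assumptions, and this is not a translation of the AForge source.

```python
def bradley_threshold(gray, window=None, percent=15):
    """Set a pixel to 0 if it is more than `percent` darker than its local mean, else 255."""
    h, w = len(gray), len(gray[0])
    if window is None:
        window = max(w // 8, 1)
    half = window // 2
    # Integral image with an extra zero row and column for easy box sums.
    integral = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            count = (y1 - y0) * (x1 - x0)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            # Integer form of: pixel <= mean * (1 - percent / 100)
            if gray[y][x] * count * 100 <= total * (100 - percent):
                out[y][x] = 0
    return out

# Background brightens from left to right; column 8 holds a dark stroke.
scan = [[20 if x == 8 else 100 + 8 * x for x in range(16)] for _ in range(16)]
binary = bradley_threshold(scan, window=5)
```

Each pixel costs four lookups in the integral image regardless of the window size, which is where the speed comes from.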
Also, good article is here: https://dsp.stackexchange.com/a/2504
You could try working on a per-tile basis if you know you have a good crop of the grid. Working on 9 subimages rather than the whole pic will most likely lead to more uniform brightness on each subimage. If your cropping is perfect you could even try going for each digit cell individually; but it all depends on how reliable is your crop.
An elliptical structuring element is more expensive to process than a rectangular one.
Try changing:
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(19,19));
to:
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(19,19));
This can speed up your solution enough, with low impact on accuracy.

Algorithm for determining the prominent colour of a photograph

When we look at a photo of a group of trees, we are able to identify that the photo is predominantly green and brown, or for a picture of the sea we are able to identify that it is mostly blue.
Does anyone know of an algorithm that can be used to detect the prominent color or colours in a photo?
I can envisage a 3D clustering algorithm in RGB space or something similar. I was wondering if someone knows of an existing technique.
Convert the image from RGB to a color space with brightness and saturation separated (HSL/HSV)
http://en.wikipedia.org/wiki/HSL_and_HSV
Then find the dominating values for the hue component of each pixel. Make a histogram for the hue values of each pixel and analyze in which angle region the peaks fall in. A large peak in the quadrant between 180 and 270 degrees means there is a large portion of blue in the image, for example.
There can be several difficulties in determining one dominant color. Pathological example: an image whose left half is blue and right half is red. Also, the hue will not deal very well with grayscales obviously. So a chessboard image with 50% white and 50% black will suffer from two problems: the hue is arbitrary for a black/white image, and there are two colors that are exactly 50% of the image.
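A minimal sketch of the hue-histogram idea, using Python's standard colorsys module, could look like this. Skipping pixels with low saturation or low value is one way to sidestep the arbitrary hue of greys; the cut-offs of 0.2, the default bin count, and the function name are assumptions.

```python
import colorsys

def dominant_hue_sector(pixels, bins=12, min_saturation=0.2, min_value=0.2):
    """Return ((start_deg, end_deg), histogram) for the most populated hue bin.

    pixels is an iterable of (r, g, b) tuples with components in 0-255.
    Near-grey and near-black pixels are ignored because their hue is unreliable.
    Returns (None, histogram) if no pixel is colourful enough.
    """
    hist = [0] * bins
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if s < min_saturation or v < min_value:
            continue
        hist[min(int(h * bins), bins - 1)] += 1
    if not any(hist):
        return None, hist
    peak = max(range(bins), key=hist.__getitem__)
    width = 360.0 / bins
    return (peak * width, (peak + 1) * width), hist

sea = [(0, 0, 255)] * 3 + [(255, 0, 0)] + [(128, 128, 128)] * 2
sector, hist = dominant_hue_sector(sea, bins=4)
```

With four bins the mostly blue sample lands in the 180 to 270 degree quadrant, matching the example above. A half-blue, half-red image would still produce two equal peaks, so ties need their own handling.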
It sounds like you want to start by computing an image histogram or color histogram of the image. The predominant color(s) will be related to the peak(s) in the histogram.
You might want to change the image from RGB to indexed; then you could use a regular histogram and detect the peaks (Matlab does this with rgb2ind(), as you probably already know), and the problem would be reduced to your regular "finding peaks in an array".
Then
n = hist(Y) bins the elements in vector Y into 10 equally spaced containers and returns the number of elements in each container as a row vector; n = hist(Y,nbins) uses nbins containers instead.
Those values in n will tell you how many elements are in each bin. Then it's just a matter of fiddling with the number of bins to make them wide enough, and with how many elements a bin must hold for you to count it as a predominant color, then taking the bins that contain that many elements, calculating the index that corresponds to their middle, and converting it to RGB again.
Whatever you're using for your processing probably has similar functions to those.
1. Average all pixels in the image.
2. Remove all pixels that are farther away from the average color than the standard deviation.
3. GOTO 1 with the remaining pixels until arbitrarily few are left (1 or maybe 1%).
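The loop above could be sketched in Python roughly as follows. "Standard deviation" is taken here to mean the root-mean-square distance of the pixels from the mean colour in RGB space, which is my own reading of the step. The early exit when nothing more can be removed is also an addition, to guarantee termination, and the function name is hypothetical.

```python
import math

def dominant_colour(pixels, min_fraction=0.01):
    """Iteratively trim outliers and return the mean (r, g, b) of what remains."""
    remaining = list(pixels)
    target = max(1, int(len(remaining) * min_fraction))
    while True:
        n = len(remaining)
        # Step 1: average all remaining pixels.
        mean = tuple(sum(p[c] for p in remaining) / n for c in range(3))
        if n <= target:
            return mean
        # Step 2: drop pixels farther from the mean than the RMS distance.
        dists = [math.dist(p, mean) for p in remaining]
        std = math.sqrt(sum(d * d for d in dists) / n)
        kept = [p for p, d in zip(remaining, dists) if d <= std]
        if not kept or len(kept) == n:
            return mean           # nothing left to trim (added stopping rule)
        # Step 3: repeat with the remaining pixels.
        remaining = kept

sky = [(0, 0, 250)] * 90 + [(250, 0, 0)] * 10
print(dominant_colour(sky))  # (0.0, 0.0, 250.0)
```

In this example the ten red pixels are discarded in the first pass and the blue majority is returned. For an image split evenly between two colours the result is the blend of both, so this only works when one colour clearly dominates.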
You might also want to pre-process the image, for example apply high-pass filter (removing only very low frequencies) to even out lighting in the photo — http://en.wikipedia.org/wiki/Checker_shadow_illusion

Resources