What algorithm for chessboard recognition (w/ OpenCV)?

I am playing around with OpenCV. The task I made up for myself, to make the process interesting, is recognizing a chessboard with a random set of pieces on it; the pieces can sit between squares or even on top of each other.
There are complications: the pieces' colors are very close to the chessboard's. The final goal is to mark the intersections (know their coordinates) and fill the cells according to their color (black/white), with support for perspective, shadows, etc.
example
another example
The red squares are created automatically by my current code, but the result is not as good as I expected.
Any recommendations? Which image processing algorithms would be appropriate?

Related

OpenCV detect square with difficult background

I am working on an Android app that will recognize a Go board and create an SGF file from it.
I need to detect the whole board in order to warp it and to be able to find the correct lines and stones, like below.
(source: eightytwo.axc.nl)
Right now I use an OpenCV RGB Mat and do the following:
Separate the channels.
Run Canny on each separate channel:
Imgproc.Canny(channel, temp_canny, 30, 100);
Combine (bitwise OR) all channels:
Core.bitwise_or(temp_canny, canny, canny);
Find the board contour (the whole pipeline is sketched below).
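For reference, a minimal Python sketch of the same pipeline (split channels, Canny each, OR them together, take the largest contour); the file name is hypothetical, and the assumption that the board is the largest contour is exactly what breaks when edges disappear:

import cv2
import numpy as np

img = cv2.imread("board.jpg")                 # hypothetical input image
edges = np.zeros(img.shape[:2], dtype=np.uint8)
for channel in cv2.split(img):                # B, G, R
    edges |= cv2.Canny(channel, 30, 100)      # same thresholds as above

# Dilating first helps close small gaps in the board edge.
edges = cv2.dilate(edges, np.ones((3, 3), np.uint8))
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
board = max(contours, key=cv2.contourArea)    # assumes the board is the largest outline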
Still, I am not able to detect the board consistently, as some lines tend to disappear; as you can see in the picture below, the black lines on the board and the stones are clearly visible, but the board edge is missing in some places.
(source: eightytwo.axc.nl)
How can I improve this detection? Or should I implement multiple ways of detecting the board and switch between them when one fails?
* Important to keep in mind *
Go boards vary in color
Go boards can be empty or completely filled with stones
This implies I can't rely on detecting the outer black line on the board
Backgrounds are not always plain white
This is a small collection of pictures with Go boards I would like to detect
* Update * 23-05-2016
I have kind of run out of inspiration for solving this with OpenCV, so new inspiration is much appreciated!
In the meantime I have started using machine learning; the first results are nice and I'll keep you posted, but I still have high hopes of creating an OpenCV implementation.
I'm working on the same problem!
My approach is to assume two things:
The camera is steady
The board is steady
That allows me to deduce the warping parameters from a single frame taken while the board is still empty (before playing). I then use these parameters to warp every subsequent frame, no matter how many stones are occluding the board edges.
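A minimal sketch of that idea in Python/OpenCV, assuming the four board corners have already been found in the empty reference frame (the corner coordinates, output size, and file name below are made up):

import cv2
import numpy as np

# Board corners in the empty reference frame, clockwise from top-left
# (hypothetical values -- find them once, by hand or with a detector).
src = np.float32([[112, 84], [505, 97], [518, 470], [98, 462]])
size = 400                                     # output board size in pixels
dst = np.float32([[0, 0], [size, 0], [size, size], [0, size]])

M = cv2.getPerspectiveTransform(src, dst)      # computed once, reused for all frames

# Camera and board are steady, so the same transform rectifies every
# later frame regardless of how many stones occlude the edges.
frame = cv2.imread("frame_0042.jpg")           # hypothetical frame
warped = cv2.warpPerspective(frame, M, (size, size))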

Detecting multiple shapes in a picture and calculating their middles

This question can be answered in any programming language, since I would like help with the algorithm, but I prefer Delphi. I have the task of detecting and counting multiple shapes (between 1 and N, mostly circles or ellipses) in random pictures, calculating their middles, and returning those as picture coordinates. The middle of each shape can have a filling (but it doesn't matter). The shapes are at least one pixel away from each other. None of the shapes blends into another shape or into the corner of the picture.
The picture always has the same background color, which actually doesn't matter, because the borders/frames of the shapes always differ in color from the background. This makes it easy to detect the shapes. I was thinking about going pixel by pixel, collecting the coordinates, and then drawing an invisible rectangle/square around every shape to calculate its middle. I also heard about scanline approaches, but I don't think they would be faster in this case. So my questions are:
How many shapes are in the picture?
How can I calculate the (more or less) exact middle of each?
A few pictures to visualize the task:
This is a picture with random shapes (mostly closed circles).
As you can see, they are spaced apart just fine.
Then I could easily draw/calculate an imaginary rectangle/square around every shape and calculate its middle, like this:
Once I have the rectangles/squares, I can easily calculate the middle.
How do I start?
PS: I've drawn the circles in MS Paint. I should add that all shapes are CLOSED, which makes it possible to flood-fill EVERY shape in the picture without problems!
Thank you for your help.
Calculate MSER (maximally stable extremal regions) for the image. I can't explain the algorithm here; you can refer to the Maximally stable extremal regions article for more information.
That will give you the centroids too.
The algorithm is available as a built-in function in OpenCV and in MATLAB 2012b.
Another method, possibly simpler than the previous one, is to apply a connected-components algorithm and count the number of objects. More information can be found in Digital Image Processing by Gonzalez and Woods.
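A sketch of the connected-components route in Python/OpenCV (the file name and threshold polarity are assumptions); note that label 0 is the background, and that the centroid of an outline is a reasonable "middle" for roughly symmetric shapes:

import cv2

img = cv2.imread("shapes.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file
# Otsu picks the threshold automatically; INV assumes dark shapes on a
# light background -- flip it if your colors are the other way around.
_, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)

n, labels, stats, centroids = cv2.connectedComponentsWithStats(bw)
print("shape count:", n - 1)                           # label 0 = background
for cx, cy in centroids[1:]:
    print("middle at", cx, cy)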

Measuring an object from a picture using a known object size

So what I need to do is measure a foot's length from an image taken by an ordinary user. The image will contain a foot wearing a black sock, a coin (or another object of known size), and a white sheet of paper (e.g. A4) on which the other two objects will rest.
What do I already have?
I have already worked with OpenCV, but only on simple projects.
I have already started reading articles about camera calibration ("Learning OpenCV"), but I still don't know if I have to go that far.
What I need now is some orientation, because I still don't understand whether I'm following the right path to solve this problem. I have some questions: will I really need to calibrate the camera to get two or three measurements of the foot? How can I find the points of interest that define the line to measure? Is each picture handled from scratch, or are there techniques that carry over between pictures?
PS: sorry about my English, I really have to improve it :-/
First, some image acquisition things:
Can you count on the black sock and white background? The colors don't matter as much as the high contrast between the sock and background.
Can you standardize the viewing angle? Looking directly down at the foot will reduce perspective distortion.
Can you standardize the lighting of the scene? That will ease a lot of the processing discussed below.
Lastly, you'll get a better estimate if you zoom (or position the camera closer) so that the foot fills more of the image frame.
Analysis. (Note: this discussion is directed at your question of identifying the axes of the foot. Identifying and analyzing the coin would use a similar process, but with some differences.)
The next task is to isolate the region of interest (ROI). If your camera is looking down at the foot, then the ROI can be limited to the white rectangle. My answer to this Stack Overflow post is a good start to square/rectangle identification: What is the simplest *correct* method to detect rectangles in an image?
If the foot lies completely in the white rectangle, you can clip the image to the rect found in step #1. This will limit the image analysis to the region inside the white paper.
"Binarize" the image using a threshold function: http://opencv.willowgarage.com/documentation/cpp/miscellaneous_image_transformations.html#cv-threshold. If you choose the threshold parameters well, you should be able to reduce the image to a black region (sock pixels) and white regions (non-sock pixel).
Now the fun begins: you might try matching contours, but if this were my problem, I would use bounding boxes for a quick solution or moments for a more interesting (and possibly robust) solution.
Use cvFindContours to find the contours of the black (sock) region: http://opencv.willowgarage.com/documentation/structural_analysis_and_shape_descriptors.html#findcontours
Use cvApproxPoly to convert the contour to a polygonal shape http://opencv.willowgarage.com/documentation/structural_analysis_and_shape_descriptors.html#approxpoly
For the simple solution, use cvMinAreaRect2 to find an arbitrarily oriented bounding box for the sock shape. The short axis of the box should correspond to the line in largura.jpg and the long axis should correspond to the line in comprimento.jpg.
http://opencv.willowgarage.com/documentation/structural_analysis_and_shape_descriptors.html#minarearect2
If you want more (possible) accuracy, you might try cvMoments to compute the moments of the shape. http://opencv.willowgarage.com/documentation/structural_analysis_and_shape_descriptors.html#moments
Use cvGetSpatialMoment to determine the axes of the foot. More information on the spatial moment may be found here: http://en.wikipedia.org/wiki/Image_moments#Examples_2 and here http://opencv.willowgarage.com/documentation/structural_analysis_and_shape_descriptors.html#getspatialmoment
With the axes known, you can then rotate the image so that the long axis is axis-aligned (i.e. vertical). Then you can simply count pixels horizontally and vertically to obtain the lengths of the lines. Note that there are several assumptions in this moment-oriented process. It's a fun solution, but it may not provide any more accuracy, especially since the accuracy of your size measurements is largely dependent on the camera positioning issues discussed above.
Lastly, I've provided links to the older C interface. You might take a look at the new C++ interface (I simply haven't gotten around to migrating my code to 2.4).
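For the bounding-box route above, here is a compact sketch using the current Python API (cv2.minAreaRect is the modern counterpart of cvMinAreaRect2; the file name and the largest-blob assumption are mine):

import cv2

img = cv2.imread("foot.jpg", cv2.IMREAD_GRAYSCALE)     # hypothetical file
# Otsu threshold: a black sock on white paper should separate cleanly.
_, sock = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)

contours, _ = cv2.findContours(sock, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
foot = max(contours, key=cv2.contourArea)              # assume sock = largest blob

(cx, cy), (w, h), angle = cv2.minAreaRect(foot)        # oriented bounding box
length_px, width_px = max(w, h), min(w, h)             # long axis = foot length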
Antonio Criminisi likely wrote the last word on this subject years ago. See his "Single View Metrology" paper, and his PhD thesis if you have time.
You don't have to calibrate the camera if you have a known-size object in your image. Well... at least if your camera doesn't distort too much and you're not expecting high-quality measurements.
A simple approach would be to detect the white (perspective-distorted) rectangle, map its corners to an undistorted rectangle (using e.g. cv::warpPerspective()), and use the known size of that rectangle to determine the sizes of other objects in the picture. But this only works for objects in the same plane as the paper, preferably not too far away from it.
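A sketch of that rectification in Python, assuming A4 paper and already-located corners (the corner coordinates, file name, and 2 px/mm scale are illustrative):

import cv2
import numpy as np

# Detected corners of the A4 sheet in the photo, ordered top-left,
# top-right, bottom-right, bottom-left (hypothetical values).
corners = np.float32([[205, 110], [640, 130], [660, 720], [180, 700]])

scale = 2.0                                   # rectify at 2 px per mm
w, h = int(210 * scale), int(297 * scale)     # A4 is 210 x 297 mm
dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

M = cv2.getPerspectiveTransform(corners, dst)
flat = cv2.warpPerspective(cv2.imread("photo.jpg"), M, (w, h))
# In `flat`, pixel distances between points on the paper divide by
# `scale` to give millimetres -- but only for objects in the paper plane.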
I am not sure whether you need to build this yourself; if you just need the measurements and not the code, you can use KLONK Image Measurement for this. There are free and paid versions.

Finding a grid in an image

Given a match-3 game screenshot (for example http://www.gameplay3.com/images/games/jewel-quest-ii-01S.jpg), what would be the correct way to find the bounding box of the grid (the table of tiles)? The board doesn't have to be a perfect rectangle (as can be seen in the screenshot), but each cell is perfectly square.
I've tried several games and found that there are some per-game image transformations that can enhance the tiles inside the grid (for example, in this game it's enough to take the V channel of the HSV color space). Then I can enlarge the tiles so that they overlap, find the largest contour of the image, and get the bounding box from it.
The problem with the above approach is that every game (or even every level within the same game) may need a different transformation to get hold of the tiles. So the question is: is there a standard way to enhance either the tiles inside the grid or the grid's lines? (I've tried finding lines with the Hough transform, but although the grid seems quite visible to the eye, Hough doesn't find it.)
Also, what if the image comes from a phone camera instead of a desktop screenshot? In my experience, captured images have less well-defined colors (depending on the lighting) and can be slightly distorted, since there is no way to hold the phone exactly parallel to the screen.
I would go with the following approach for a screenshot:
Find edges in the image using, for example, a Canny edge detector.
Perform a Hough line transform. This should work quite nicely on the edge image.
If you have some information about the size of the tiles, you can eliminate false-positive lines using a spatial model of the grid (e.g. keep only lines at a small angle to the x/y axes of the image, and/or consistent with the expected distance/angle between tile borders).
Identify the tile borders under the found Hough lines by looking for Canny edges under/next to the lines. (Steps 1-3 are sketched below.)
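A rough Python sketch of steps 1-3; the Canny thresholds, Hough parameters, and 5-degree angle tolerance are guesses to be tuned per game:

import cv2
import numpy as np

img = cv2.imread("screenshot.png")            # hypothetical screenshot
edges = cv2.Canny(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 50, 150)

# Probabilistic Hough; tune minLineLength/maxLineGap to the tile size.
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                        minLineLength=100, maxLineGap=10)

# Spatial model from step 3: keep only near-horizontal/vertical lines.
grid = []
for x1, y1, x2, y2 in (lines[:, 0] if lines is not None else []):
    angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1))) % 180
    if min(angle, 180 - angle) < 5 or abs(angle - 90) < 5:
        grid.append((x1, y1, x2, y2))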
Which implementation of the Hough transform did you use? How did you preprocess the image?
Another approach would be machine learning. Since you are working with OpenCV, you could use a Haar-like feature detector. An example of face detection using Haar-like features can be found here:
OpenCV Haar Face Detector example
Another machine learning approach would be to use a Histogram of Oriented Gradients (HOG) detector in combination with a Support Vector Machine (SVM). An example is located here:
HOG example
You can find general information about HOG detection at:
HOG detection
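If you go the Haar route, the OpenCV side is straightforward once a cascade has been trained; in the sketch below the cascade file is hypothetical (you would first have to train it, e.g. with opencv_traincascade, on positive/negative tile samples):

import cv2

# "tiles_cascade.xml" is a hypothetical cascade trained on tile images.
cascade = cv2.CascadeClassifier("tiles_cascade.xml")

img = cv2.imread("screenshot.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
for x, y, w, h in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4):
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)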

How to remove the background image and get the foreground image

There are two images:
http://bbs.shoucangshidai.com/attachments/month_1001/1001211535bd7a644e95187acd.jpg
http://bbs.shoucangshidai.com/attachments/month_1001/10012115357cfe13c148d3d8da.jpg
One is the background image, and the other is a photo of a person in front of the same background, at the same size. What I want to do is remove the second image's background and extract only the person's silhouette. The common method is to subtract the first image from the second one, but my problem is that if the color of the person's clothes is similar to the background, the result of the subtraction is awful and I cannot get the whole person's silhouette. Who has a good idea for removing the background? Please give me some advice.
Thank you in advance.
If you have a good estimate of the image background, subtracting it from the image with the person is a good first step. But it is only the first step. After that, you have to segment the image, i.e. you have to partition the image into "background" and "foreground" pixels, with constraints like these:
in the foreground areas, the average difference from the background image should be high
in the background areas, the average difference from the background image should be low
the areas should be smooth. Outline length and curvature should be minimal.
the borders of the areas should have a high contrast in the source image
If you are mathematically inclined, these constraints can be modeled perfectly with the Mumford-Shah functional. See here for more information.
But you can probably adapt other segmentation algorithms to the problem.
If you want a fast and simple (but not perfect) version, you could try this:
subtract the two images
find the largest contiguous "blob" of pixels with a background-foreground difference above some threshold. This is the first rough estimate of the "person area" in the foreground image, but the segmentation does not yet meet criteria 3 and 4 above. (Steps 1-2 are sketched in code below.)
Find the outline of the largest blob. (EDIT: Note that you don't have to start at the outline. You can also start with a larger polygon, as the following steps will automatically shrink it to the optimal position.)
now go through each point in the outline and smooth the outline: for each point, find the position that minimizes c1*L - c2*G, where L is the length of the outline polygon if the point were moved there, G is the gradient at that position, and c1/c2 are constants that control the process; then move the point there. This has the effect of smoothing the contour polygon in areas of low gradient in the source image, while keeping it tied to high gradients (i.e. the visible borders of the person). You can try different expressions for L and G; for example, L could take both length and curvature into account, and G could also take the gradients in the background and difference images into account.
you will probably have to re-normalize the outline polygon, i.e. make sure that the points on the outline are spaced regularly; either that, or make sure in the previous step that the distances between the points stay regular ("geodesic snakes").
repeat the last two steps until convergence
You now have an outline polygon that touches the visible person-background border and continues smoothly where the border is not visible or has low contrast.
Look up "Snakes" (e.g. here) for more information.
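For orientation, a minimal Python sketch of steps 1 and 2 above (the subtraction and the largest blob); the file names and threshold are assumptions, and the snake refinement of the later steps is deliberately left out:

import cv2

bg = cv2.imread("background.jpg", cv2.IMREAD_GRAYSCALE)    # hypothetical files
fg = cv2.imread("person.jpg", cv2.IMREAD_GRAYSCALE)

diff = cv2.absdiff(fg, bg)
_, mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)  # threshold is a guess

# Largest contiguous blob = first rough estimate of the person area.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
person = max(contours, key=cv2.contourArea)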
Low-pass filter (blur) the images before you subtract them.
Then use that difference signal as a mask to select the pixels of interest.
A wide-enough filter will ignore the too-small (high-frequency) features that otherwise end up carving out "awful" regions inside your object of interest. It will also reduce the highlighting of pixel-level noise and misalignment (the highest-frequency information).
In addition, if you have more than two frames, introducing some temporal hysteresis will let you form more stable regions of interest over time.
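A sketch of both ideas (blur before differencing, plus a slowly-updated running average as a simple form of temporal hysteresis); the video name, kernel size, update rate, and threshold are all assumptions:

import cv2

cap = cv2.VideoCapture("scene.mp4")            # hypothetical video
avg = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Low-pass filter first so pixel-level noise doesn't dominate the mask.
    f = cv2.GaussianBlur(frame, (21, 21), 0).astype("float32")
    if avg is None:
        avg = f.copy()
    else:
        cv2.accumulateWeighted(f, avg, 0.05)   # slow update = hysteresis
    diff = cv2.absdiff(f, avg)                 # the difference signal
    mask = (diff.max(axis=2) > 25).astype("uint8") * 255   # pixels of interest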
One technique that I think is common is to use a mixture model. Grab a number of background frames and, for each pixel, build a mixture model of its color.
When you then feed in a frame with the person in it, the mixture model's probability densities give you, for each pixel, the probability that its color belongs to the foreground or to the background.
Once you have P(pixel is foreground) and P(pixel is background), you can simply threshold the probability images.
Another possibility is to use the probabilities as inputs to some more clever segmentation algorithm. One example is graph cuts, which I have noticed works quite well.
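For the mixture-model route, OpenCV ships a ready-made per-pixel Gaussian mixture background model (MOG2). A minimal sketch, with hypothetical file names, where the returned mask is effectively the thresholded foreground probability:

import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

# Learn the per-pixel mixtures from background-only frames first...
for path in ["bg1.jpg", "bg2.jpg", "bg3.jpg"]:   # hypothetical frames
    subtractor.apply(cv2.imread(path))

# ...then apply the frame with the person in it; learningRate=0 freezes
# the model so the person isn't absorbed into the background.
mask = subtractor.apply(cv2.imread("person.jpg"), learningRate=0)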
However, if the person is wearing clothes that are visually indistinguishable from the background, none of the methods described above will work. You'd either have to use another sensor (like IR or UV) or have a quite elaborate "person model" that could "add" the legs in the right position if it finds what it thinks is a torso and head.
Good luck with the project!
Background vs. foreground detection is very subjective; the application scenario defines what is background and what is foreground. In the application you describe, I take it you are implicitly saying that the person is the foreground.
Under that assumption, what you seek is a person detection algorithm. A possible solution:
Run a Haar feature detector + boosted cascade of weak classifiers (see the OpenCV wiki for details).
Compute inter-frame motion (differences).
If there is a positive face detection for a frame, cluster the motion pixels around the face (kNN algorithm).
Voila... you should have a simple person detector (a simplified sketch follows below).
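A simplified Python sketch of that recipe (face detection plus inter-frame differencing); the frame files are hypothetical, and a fixed box around each face stands in for the kNN clustering:

import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

prev = cv2.imread("frame1.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical frames
curr = cv2.imread("frame2.jpg", cv2.IMREAD_GRAYSCALE)

faces = face_cascade.detectMultiScale(curr, 1.1, 4)     # step 1
_, motion = cv2.threshold(cv2.absdiff(curr, prev), 25, 255,
                          cv2.THRESH_BINARY)            # step 2

# Step 3, simplified: keep motion pixels in a generous box below and
# around each detected face (a stand-in for proper kNN clustering).
person = motion * 0
for x, y, w, h in faces:
    x0, y0 = max(0, x - 2 * w), max(0, y - h)
    person[y0:y + 8 * h, x0:x + 3 * w] = motion[y0:y + 8 * h, x0:x + 3 * w]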
Post the photo on Craigslist and tell them that you'll pay $5 for someone to do it.
Guaranteed you'll get hits in minutes.
Instead of a straight subtraction, you could step through both images pixel by pixel and only "subtract" (mask out) the pixels that are exactly the same. Of course, that won't account for minor variations in color.
