Recommended pattern recognition technique for chess board - image-processing

I'm trying to do an application which, among other things, is able to recognize chess positions on a computer screen from screenshots. I have very limited experience with image processing techniques and don't wish to invest a great amount of time in studying this, as this is just a pet project of mine.
Can anyone recommend me one or more image processing techniques that would yield me a good result?
The conditions are:
The image is always crispy clean, no noise, poor light conditions etc (since it's a screenshot)
I'm expecting a very low impact on computer performance while doing 1 image / second
I've thought of two modes to start the process:
Feed the piece shapes to the program (so that it knows what a queen, king etc. looks like)
just feed the program an initial image which contains the startup position, from which the program can (after it recognizes the position of the board) pick each chess piece
The process should be relatively easy to understand, as I don't have a very good grasp of image processing techniques (yet)
I'm not interested in using any specific technology, so technology-agnostic documentation would be ideal (C/C++, C#, Java examples would also be fine).
Thanks for taking the time to read this, and I hope to get some good answers.

It' an interesting problem, but you need to specify a lot more than in your original question in order to find an acceptable answer.
On the input images: "screenshots" is quote vague a category. Can you assume that the chessboard will always be entirely in view? Will you have multiple views of the same board? Can you assume that no pieces will be partially or completely occluded in all views?
On the imaged objects and the capture system: will the same chessboard and pieces be used, under very similar illumination? Will the same lens/camera/digitization pipeline be used?

Salut Andrei,
I have done a coin counting algorithm from a picture so the process should be helpful.
The algorithm is called Generalized Hough transform
Make the picture black and white, it is easier that way
Take the image from 1 piece and "slide it over the screenshot"
For each cell you calculate the nr of common pixel in the 2 images
Where you have the largest number there you have the piece
Hope this helps.

Yeah go with Salut Andrei,
Convert the picture into greyscale
Slice into 64 squares and store in array
Using Mat lab can identify the pieces easily
Color can be obtained from Calculating the percentage of No. dot pixels(black pixels)
threshold=no.black pixels /no. of black pixels + no. of white pixels,
If ur value is above threshold then WHITE else BLACK

I'm working on a similar project in c# finding which piece is which isn't the hard part for me. First step is to find a rectangle that shows just the board and cuts everything else out. I first hard-coded it to search for the colors of the squares but would like to make it more robust and reliable regardless of the color scheme. Trying to make it find squares of pixels that match within a certain threshold and extrapolate the board location from that.

Related

Quantifying differences in an image sequence to measure activity

I'm looking for a program that will enable me to quantity the difference between images in an image sequence over time.
We are hoping to use timelapse images to measure the activity of tadpoles by comparing how the images change over time. Tracking the movement of individuals isn’t necessary. The tadpoles are dark and the background of the aquarium is light, however the background isn’t uniform and some of the decor items like dark rocks and foliage make it so that all the tadpoles aren’t visible at all times.
Basically need a program that will allow me to quantity the differences/motion detected in an image sequence (i.e 209 images) and produce data that can be exported...
Any and all suggestions appreciated!!
Your question is rather vague and you don't supply any images or real indication of what you expect as results, so my answer will not be as thorough as it might otherwise be.
You don't mention any tools you are familiar with, but my recommendation would be Python and OpenCV. Alternatives are probably scikit-image, Python Wand.
In general, when trying to detect movement across a series of images, you would:
try and work out what the background is
look for movement by sutracting, or differencing, frames from the background
clean up the difference image
identify objects - maybe by shape or size or colour
maybe track objects
produce statistics
As regards working out the background, I did an example here by finding the median pixel across all images at each location in the images. There is also an OpenCV tutorial here.
As regards cleaning up images, you can probably remove noise in the background subtraction with a small median filter, say 3x3 or 5x5 depending on the resolution of your images.
As regards detecting tadpoles, you will probably want to use OpenCV findContours() and filter by size, or colour, or circularity. There are some fairly decent tutorials on PyImageSearch. There is also an ImageMagick "Connected Component" analysis to find a tennis player that I did here.

IDEAL solution to separate text from background?

Suppose I have gray-scale photographic pictures of texts sheets. Each sheet of paper is exactly white and text is exactly black.
Unfortunately, the light is not uniform, also perspective shading occurs, also sheets of papers may be curved. Of course, there are some small hi freq noise on an image.
I AM SURE that there should be nearly IDEAL solution to separate text and background in this situation.
So what is it? :)
I don't believe it is impossible or even hard to turn such gray-scale images into nearly perfect black and white pictures. I cant prove this but I judge on my own perception: I need no any intelligence to recognize such pictures by an eye. They can be in any language even unfamiliar, but I will SEE what is written exactly.
So, how to teach computer to do the same?
UPDATE
Consider original image
Any global thresolding will cause artefacts (1) and nonuniform text representation (2)
I need some thresolding, which looks for local statistics.
Switch to adaptive thresholding.
Here you will find some introduction - http://homepages.inf.ed.ac.uk/rbf/HIPR2/adpthrsh.htm
Adaptive thresholding is designed to deal with exactly this kind of problems.

Object recognition and measuring size

I'd like to create a system for use in a factory to measure the size of the objects coming off the assembly line. The objects are slabs of stone, approximately rectangular, and I'd like the width and height. Each stone is photographed in the same position with a flash, so the conditions are pretty controlled. The tricky part is the stones sometimes have patterns on their surface (often marble with ripples and streaks) and they are sometimes almost black, blending in with the shadows.
I tried simply subtracting each image from a reference image of the background, but there are enough small changes in the lighting and the positions of rollers and small bits of machinery that the output is really noisy.
The approach I plan to try next is to use Canny's edge detection algorithm and then use some kind of numerical optimization (Nelder-Mead maybe) to match a 4-sided polygon to the edges. Before I home-brew something, though, is there an existing approach that works well in this kind of situation?
If it helps, it would be possible to 'seed' the algorithm with a patch of the image known to be within the slab (they're always lined up in the corner) to help identify its surface pattern and colors. I could also produce a training set of annotated images if necessary.
Some sample images of the background and some stone slabs:
Have you tried an existing image segmentation algorithm?
I would start with the maxflow algorithm for image segmentation by Vladimir Kolmogorov here: http://pub.ist.ac.at/~vnk/software.html
In the papers they fix areas of an image to belong to a particular segment, which would help for you problem, but it may not be obvious how to do this in the software.
Deep learning algorithms for parsing scenes by Richard Socher might also help: http://www.socher.org/
And Eric Sudderth has at least one interesting method for visual scene understanding here: http://www.cs.brown.edu/~sudderth/software.html
I also haven't actually used any of this software, which is mostly, if not all, for research and not particularly user friendly.

How to align two different pictures in such a way, that they match as close as possible?

I need to automatically align an image B on top of another image A in such a way, that the contents of the image match as good as possible.
The images can be shifted in x/y directions and rotated up to 5 degrees on z, but they won't be distorted (i.e. scaled or keystoned).
Maybe someone can recommend some good links or books on this topic, or share some thoughts how such an alignment of images could be done.
If there wasn't the rotation problem, then I could simply try to compare rows of pixels with a brute-force method until I find a match, and then I know the offset and can align the image.
Do I need AI for this?
I'm having a hard time finding resources on image processing which go into detail how these alignment-algorithms work.
So what people often do in this case is first find points in the images that match then compute the best transformation matrix with least squares. The point matching is not particularly simple and often times you just use human input for this task, you have to do it all the time for calibrating cameras. Anyway, if you want to fully automate this process you can use feature extraction techniques to find matching points, there are volumes of research papers written on this topic and any standard computer vision text will have a chapter on this. Once you have N matching points, solving for the least squares transformation matrix is pretty straightforward and, again, can be found in any computer vision text, so I'll assume you got that covered.
If you don't want to find point correspondences you could directly optimize the rotation and translation using steepest descent, trouble is this is non-convex so there are no guarantees you will find the correct transformation. You could do random restarts or simulated annealing or any other global optimization tricks on top of this, that would most likely work. I can't find any references to this problem, but it's basically a digital image stabilization algorithm I had to implement it when I took computer vision but that was many years ago, here are the relevant slides though, look at "stabilization revisited". Yes, I know those slides are terrible, I didn't make them :) However, the method for determining the gradient is quite an elegant one, since finite difference is clearly intractable.
Edit: I finally found the paper that went over how to do this here, it's a really great paper and it explains the Lucas-Kanade algorithm very nicely. Also, this site has a whole lot of material and source code on image alignment that will probably be useful.
for aligning the 2 images together you have to carry out image registration technique.
In matlab, write functions for image registration and select your desirable features for reference called 'feature points' using 'control point selection tool' to register images.
Read more about image registration in the matlab help window to understand properly.

An algorithm for a drawing and painting robot - any tips?

Algorithm for a drawing and painting robot -
Hello
I want to write a piece of software which analyses an image, and then produces an image which captures what a human eye perceives in the original image, using a minimum of bezier path objects of varying of colour and opacity.
Unlike the recent twitter super compression contest (see: stackoverflow.com/questions/891643/twitter-image-encoding-challenge), my goal is not to create a replica which is faithful to the image, but instead to replicate the human experience of looking at the image.
As an example, if the original image shows a red balloon in the top left corner, and the reproduction has something that looks like a red balloon in the top left corner then I will have achieved my goal, even if the balloon in the reproduction is not quite in the same position and not quite the same size or colour.
When I say "as perceived by a human", I mean this in a very limited sense. i am not attempting to analyse the meaning of an image, I don't need to know what an image is of, i am only interested in the key visual features a human eye would notice, to the extent that this can be automated by an algorithm which has no capacity to conceptualise what it is actually observing.
Why this unusual criteria of human perception over photographic accuracy?
This software would be used to drive a drawing and painting robot, which will be collaborating with a human artist (see: video.google.com/videosearch?q=mr%20squiggle).
Rather than treating marks made by the human which are not photographically perfect as necessarily being mistakes, The algorithm should seek to incorporate what is already on the canvas into the final image.
So relative brightness, hue, saturation, size and position are much more important than being photographically identical to the original. The maintaining the topology of the features, block of colour, gradients, convex and concave curve will be more important the exact size shape and colour of those features
Still with me?
My problem is that I suffering a little from the "when you have a hammer everything looks like a nail" syndrome. To me it seems the way to do this is using a genetic algorithm with something like the comparison of wavelet transforms (see: grail.cs.washington.edu/projects/query/) used by retrievr (see: labs.systemone.at/retrievr/) to select fit solutions.
But the main reason I see this as the answer, is that these are these are the techniques I know, there are probably much more elegant solutions using techniques I don't now anything about.
It would be especially interesting to take into account the ways the human vision system analyses an image, so perhaps special attention needs to be paid to straight lines, and angles, high contrast borders and large blocks of similar colours.
Do you have any suggestions for things I should read on vision, image algorithms, genetic algorithms or similar projects?
Thank you
Mat
PS. Some of the spelling above may appear wrong to you and your spellcheck. It's just international spelling variations which may differ from the standard in your country: e.g. Australian standard: colour vs American standard: color
There is an model that can implemented as an algorithm to calculate a saliency map for an image, determining which parts of the image would get the most attention from a human.
The model is called itti koch model
You can find a startin paper here
And more resources and c++ sourcecode here
I cannot answer your question directly, but you should really take a look at artist/programmer (Lisp) Harold Cohen's painting machine Aaron.
That's quite a big task. You might be interested in image vectorizing (don't know what it's called officially), which is used to take in rasterized images (such as pictures you take with a camera) and outputs a set of bezier lines (i think) that approximate the image you put in. Since good algorithms often output very high quality (read: complex) line sets you'd also be interested in simplification algorithms which can help enormously.
Unfortunately I am not next to my library, or I could reccomend a number of books on perceptual psychology.
The first thing you must consider is the physiology of the human eye is such that when we examine an image or scene, we are only capturing very small bits at a time, as our eyes dart around rapidly. Our mind peices the different parts together to try and form a whole.
You might start by finding an algorithm for the path of an eyeball as it darts around. Perhaps it is attracted to contrast?
Next is that our eyes adjust the "exposure" depending on the context. It's like those high dynamic range images, if they were peiced together not by multiple exposures of a whole scene, but by many small images, each balanced on its own, but blended into its surroundings to form a high dynamic range.
Now there was a finding in a monkey brain that there is a single neuron that lights up if there's a diagonal line in the upper left of its field of vision. Similar neurons can be found for vertical lines, and horizontal lines in various areas of that monkey's field of vision. The "diagonalness" determines the frequency with which that neuron fires.
one might speculated that other neurons might be found and mapped to other qualities such as redness, or texturedness, and other things.
There's something humans can do that I've not seen a computer program ever able to do. it's something called "closure", where a human is able to fill in information about something that they are seeing, that doesn't actually exist in the image. an example:
*
* *
is that a triangle? If you knew that it was in advance, then you could probably make a program to connect the dots. But what if it's just dots? How can you know? I wouldn't attempt this one unless I had some really clever way of dealing with that one.
There are many other facts about human perception you might be able to use. Good luck, you've not picked a straightforward task.
i think a thing that could help you in this enormous task is human involvement. i mean data. like you could have many people sitting staring at random dots (like from the previous post) and connect them as they see right. you could harness that data.

Resources