I am working on an Android app that will recognize a Go board and create an SGF file from it.
I need to detect the whole board in order to warp it and then find the correct lines and stones, like below.
[example image] (source: eightytwo.axc.nl)
Right now I use an OpenCV RGB Mat and do the following:
separate the channels
Canny each channel separately:
Imgproc.Canny(channel, temp_canny, 30, 100);
combine (bitwise OR) all channels:
Core.bitwise_or(temp_canny, canny, canny);
find the board contour
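For reference, here is a minimal sketch of that pipeline in OpenCV for Java; the 30/100 Canny thresholds are the ones above, while the class name, variable names and the contour step are illustrative:

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;
    import java.util.ArrayList;
    import java.util.List;

    public class BoardEdges {
        // Per-channel Canny + bitwise OR, then contour search.
        static Mat combinedCanny(Mat rgb) {
            List<Mat> channels = new ArrayList<>();
            Core.split(rgb, channels);                    // separate the channels
            Mat canny = Mat.zeros(rgb.size(), CvType.CV_8UC1);
            Mat tempCanny = new Mat();
            for (Mat channel : channels) {
                Imgproc.Canny(channel, tempCanny, 30, 100);  // edge-detect each channel
                Core.bitwise_or(tempCanny, canny, canny);    // merge the edge maps
            }
            return canny;
        }

        // Board contour: largest external contour in the combined edge map.
        static MatOfPoint largestContour(Mat canny) {
            List<MatOfPoint> contours = new ArrayList<>();
            Imgproc.findContours(canny, contours, new Mat(),
                    Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
            MatOfPoint best = null;
            double bestArea = 0;
            for (MatOfPoint c : contours) {
                double area = Imgproc.contourArea(c);
                if (area > bestArea) { bestArea = area; best = c; }
            }
            return best;
        }
    }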
Still, I am not able to detect the board consistently, as some lines tend to disappear, as you can see in the picture below: the black lines on the board and the stones are clearly visible, but the board edge is missing in some places.
[example image] (source: eightytwo.axc.nl)
How can I improve this detection? Or should I implement multiple ways of detecting the board and switch between them when one fails?
* Important to keep in mind *
Go boards vary in color
Go boards can be empty or completely filled with stones
this implies I can't rely on detecting the outer black line on the board
backgrounds are not always plain white
this is a small collection of pictures with Go boards I would like to detect
* Update * 23-05-2016
I have kind of run out of inspiration for solving this with OpenCV, so new inspiration is much appreciated!!!
In the meantime I have started using machine learning; the first results are nice and I'll keep you posted, but I still have high hopes for an OpenCV implementation.
I'm working on the same problem!
My approach is to assume two things:
The camera is steady
The board is steady
That allows me to deduce the warping parameters from a single frame while the board is still empty (before play begins). I then use these parameters to warp every frame, no matter how many stones are occluding the board edges.
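In code that could look roughly like the sketch below (OpenCV for Java); the corner coordinates are whatever your empty-board detection returns, and the 480x480 output size is an arbitrary choice:

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;

    public class BoardWarp {
        // Compute the transform once, from the four corners found on the
        // empty-board frame (ordered TL, TR, BR, BL to match dst).
        static Mat warpFromCorners(MatOfPoint2f boardCorners) {
            MatOfPoint2f dst = new MatOfPoint2f(
                    new Point(0, 0), new Point(480, 0),
                    new Point(480, 480), new Point(0, 480));
            return Imgproc.getPerspectiveTransform(boardCorners, dst);
        }

        // Reuse the same transform for every later frame.
        static Mat warpFrame(Mat frame, Mat warp) {
            Mat warped = new Mat();
            Imgproc.warpPerspective(frame, warped, warp, new Size(480, 480));
            return warped;   // stones may occlude edges; the warp still holds
        }
    }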
Related
I have an image with a collection of objects in K given perceived colors. Provided I extract those objects, how could I cluster them by their perceived color?
Let me give you an example. I am trying to cluster two football teams, so there will be two teams, referees and a keeper (or two, but that's a rare situation) in the image: 3, 4 or 5 clusters.
To the human eye it's an easy situation. In the picture above we have white players, red players and a referee in black, but it turns out not to be so easy for automatic processing.
What I have tried so far:
1) I started working in the BGR colorspace, then tried HSV, and now I am exploring CIE Luv, as I read that distances in it reflect the perceived differences between colors.
2) [BGR and HSV] Taking the most common color from the contour (not the bounding box). This didn't work at all because of noise (the green field getting in the way), the quality of the image, the position of the player, etc. The colors were pretty much random.
3) [CIE Luv] Resizing all players' boxes to a common size and taking a small patch from the middle (as marked by the black rectangle in the example below).
Taking the mean value of all pixels in each player's window and adding it to a list (so, one pixel with the mean value per player), then running K-means (with a defined number of clusters) on that list. This has proven somewhat successful; for the image above I get reddish, white and blackish cluster centres.
Unfortunately, the assignment of players back to these clusters is pretty much random. I do that by calculating the mean color for each player as described above and then measuring the distance to each cluster. A player might be assigned to the white cluster in one frame and to the red one in the next. Part of the problem may be that the window in the middle of the player's box will sometimes catch a number, grass or shorts instead of the jersey.
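For reference, the clustering step currently looks roughly like this (OpenCV for Java); the class and variable names are illustrative, and k = 3 matches the example above:

    import org.opencv.core.*;

    public class TeamClusters {
        // One mean-Luv row per player (CV_32F), clustered with K-means.
        static Mat clusterCenters(Mat meanColorsPerPlayer, int k) {
            Mat labels = new Mat();
            Mat centers = new Mat();
            TermCriteria crit = new TermCriteria(
                    TermCriteria.EPS + TermCriteria.MAX_ITER, 100, 0.1);
            Core.kmeans(meanColorsPerPlayer, k, labels, crit, 5,
                    Core.KMEANS_PP_CENTERS, centers);
            return centers;   // k rows of Luv cluster centres
        }
    }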
I have already spent a considerable amount of time trying to figure this out, so I'd be grateful for any help.
I may be overcomplicating the problem since you just have 3 classes, but try training an SVM classifier based on HOG descriptors. Maybe also try LDA to improve speed.
Some references -
1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.627.6465&rep=rep1&type=pdf - skip to recognition part
2] https://rodrigob.github.io/documents/2013_ijcnn_traffic_signs.pdf - skip to recognition part.
3] https://www.learnopencv.com/handwritten-digits-classification-an-opencv-c-python-tutorial/ - if you want to jump into the code right away
This will work well as long as your detection is good, and it can also help to identify different players based on their shirt numbers (and maybe more?) if you train it right.
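A minimal sketch of the HOG + SVM training in OpenCV for Java; the window size, kernel choice and helper names are illustrative, not tuned values:

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;
    import org.opencv.ml.Ml;
    import org.opencv.ml.SVM;
    import org.opencv.objdetect.HOGDescriptor;

    public class JerseyClassifier {
        // One HOG row per player crop; labels is CV_32S, one class id per row.
        static SVM trainHogSvm(Mat samples, Mat labels) {
            SVM svm = SVM.create();
            svm.setType(SVM.C_SVC);
            svm.setKernel(SVM.LINEAR);
            svm.train(samples, Ml.ROW_SAMPLE, labels);
            return svm;
        }

        // HOG descriptor of a player crop, as a single row vector.
        static Mat hogRow(Mat playerCropBgr) {
            Mat gray = new Mat();
            Imgproc.cvtColor(playerCropBgr, gray, Imgproc.COLOR_BGR2GRAY);
            Mat resized = new Mat();
            Imgproc.resize(gray, resized, new Size(64, 128)); // default HOG window
            MatOfFloat descriptor = new MatOfFloat();
            new HOGDescriptor().compute(resized, descriptor);
            return descriptor.reshape(1, 1);  // one row per sample
        }
    }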
EDIT: Okay, I have another idea, based on colour segmentation, since that was your original approach and it requires less work (maybe not? colour segmentation is a pain! also LIGHTING! LIGHTING! LIGHTING!).
Create a green mask with a threshold so that you detect as little grass as possible when doing your K-means. Then, instead of the mean, try the median; that will get you closer to red, because white is detected as 0 and the mean just drops drastically, while the median doesn't. It will be far more robust, and you should be able to sort players better (hair color and skin color shouldn't affect it too much).
EDIT 2: Just noticed that if you use the black rectangle you'll mostly capture the shirt number (which is white), which will mess up your classifier; use the original box with the green masked out.
EDIT 3: Also, you can simply create three thresholds for your required colors and split them up; you don't really need K-means here. Basically you just need your detected boxes to give out a value inside each threshold. Try the median method I mentioned above; it should improve things. You might also need some more minor tweaks here and there (blur, morphology, etc. to improve detection).
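A rough sketch of the green-mask + median idea in OpenCV for Java; the HSV range for grass green is a guess that needs tuning per video:

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class JerseyColor {
        // Median BGR color of the non-grass pixels in a player box.
        static double[] medianNonGreen(Mat boxBgr) {
            Mat hsv = new Mat();
            Imgproc.cvtColor(boxBgr, hsv, Imgproc.COLOR_BGR2HSV);
            Mat grass = new Mat();
            Core.inRange(hsv, new Scalar(35, 40, 40), new Scalar(85, 255, 255), grass);

            List<Double> b = new ArrayList<>(), g = new ArrayList<>(), r = new ArrayList<>();
            for (int y = 0; y < boxBgr.rows(); y++) {
                for (int x = 0; x < boxBgr.cols(); x++) {
                    if (grass.get(y, x)[0] == 0) {      // pixel is NOT grass
                        double[] p = boxBgr.get(y, x);
                        b.add(p[0]); g.add(p[1]); r.add(p[2]);
                    }
                }
            }
            return new double[] { median(b), median(g), median(r) };
        }

        static double median(List<Double> v) {
            if (v.isEmpty()) return 0;
            Collections.sort(v);
            return v.get(v.size() / 2);
        }
    }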
I am playing around with OpenCV. The task I made up for myself, to make the process interesting, is recognizing a chessboard with a random set of pieces on it; they can be placed between the squares as well, or even on top of each other.
There are complications: the pieces' colors are very close to the chessboard's. The final goal is to mark the intersections (know their coordinates) and fill the cells according to their color (black/white), with support for perspective, shadows, etc.
[example image]
[another example image]
The red squares are created automatically by my current code, but the result is not as good as I expected.
Any recommendations? Which image-processing algorithms should I look at?
First of all I'm a total newbie in image processing, so please don't be too harsh on me.
That being said, I'm developing an application to analyse changes in blood flow in the extremities using thermal images obtained by a camera. The user is able to define a region of interest by placing a shape (circle, rectangle, etc.) on the current image. The user should then be able to see how the average temperature changes from frame to frame inside the specified ROI.
The problem is that some of the images are not steady, due to (small) movement by the test subject. My question is how can I determine the movement between the frames, so that I can relocate the ROI accordingly?
I'm using the Emgu OpenCV .Net wrapper for image processing.
What I've tried so far is calculating the center of gravity using GetMoments() on the biggest contour found, and computing the direction vector between it and the previous center of gravity. The ROI is then translated using this vector, but the results are not that promising yet.
Is this the right way to do it, or am I totally barking up the wrong tree?
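For reference, the centroid step looks roughly like this in plain OpenCV for Java (the project uses the Emgu wrapper, but the calls map one-to-one; the names here are illustrative):

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;
    import org.opencv.imgproc.Moments;

    public class RoiShift {
        // Centroid of the biggest contour, via image moments.
        static Point centroid(MatOfPoint contour) {
            Moments m = Imgproc.moments(contour);
            return new Point(m.get_m10() / m.get_m00(),
                             m.get_m01() / m.get_m00());
        }

        // Translate the ROI by the displacement between two centroids.
        static Rect shiftRoi(Rect roi, Point prev, Point curr) {
            return new Rect((int) (roi.x + curr.x - prev.x),
                            (int) (roi.y + curr.y - prev.y),
                            roi.width, roi.height);
        }
    }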
------Edit------
Here are two sample images showing slight movement downwards to the right:
http://postimg.org/image/wznf2r27n/
Comparison between the contours:
http://postimg.org/image/4ldez2di1/
As you can see the shape of the contour is pretty much the same, although there are some small differences near the toes.
It seems I was finally able to find a solution to my problem, using optical flow based on the Lucas-Kanade method.
Just in case anyone else is wondering how to implement it in Emgu/C#, here's the link to an Emgu examples project, where they use the Lucas-Kanade and Farneback algorithms:
http://sourceforge.net/projects/emguexample/files/Image/BuildBackgroundImage.zip/download
You may need to adapt a few things, e.g. the parameters for the corner detection (the frame.GoodFeaturesToTrack(..) method), but it's definitely something to start with.
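For anyone who prefers plain OpenCV, a rough Java sketch of the same idea: track corners from one frame to the next and shift the ROI by the mean motion. Parameter values are illustrative, not the ones from the linked example.

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;
    import org.opencv.video.Video;

    public class FlowShift {
        // Mean displacement of corners tracked from prevGray to currGray.
        static Point meanMotion(Mat prevGray, Mat currGray) {
            MatOfPoint corners = new MatOfPoint();
            Imgproc.goodFeaturesToTrack(prevGray, corners, 200, 0.01, 10);
            MatOfPoint2f prevPts = new MatOfPoint2f(corners.toArray());
            MatOfPoint2f nextPts = new MatOfPoint2f();
            MatOfByte status = new MatOfByte();
            MatOfFloat err = new MatOfFloat();
            Video.calcOpticalFlowPyrLK(prevGray, currGray, prevPts, nextPts, status, err);

            Point[] p0 = prevPts.toArray(), p1 = nextPts.toArray();
            byte[] ok = status.toArray();
            double dx = 0, dy = 0;
            int n = 0;
            for (int i = 0; i < ok.length; i++) {
                if (ok[i] == 1) { dx += p1[i].x - p0[i].x; dy += p1[i].y - p0[i].y; n++; }
            }
            return n > 0 ? new Point(dx / n, dy / n) : new Point(0, 0);
        }
    }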
Thanks for all the ideas!
I'm trying to build an application which, among other things, is able to recognize chess positions on a computer screen from screenshots. I have very limited experience with image processing techniques and don't wish to invest a great amount of time in studying this, as this is just a pet project of mine.
Can anyone recommend me one or more image processing techniques that would yield me a good result?
The conditions are:
The image is always crisp and clean, with no noise, poor lighting conditions, etc. (since it's a screenshot)
I'm expecting a very low impact on computer performance while processing 1 image per second
I've thought of two modes to start the process:
Feed the piece shapes to the program (so that it knows what a queen, king etc. looks like)
Just feed the program an initial image containing the starting position, from which the program can (after it recognizes the position of the board) pick out each chess piece
The process should be relatively easy to understand, as I don't have a very good grasp of image processing techniques (yet)
I'm not interested in using any specific technology, so technology-agnostic documentation would be ideal (C/C++, C#, Java examples would also be fine).
Thanks for taking the time to read this, and I hope to get some good answers.
It's an interesting problem, but you need to specify a lot more than in your original question in order to find an acceptable answer.
On the input images: "screenshots" is quite a vague category. Can you assume that the chessboard will always be entirely in view? Will you have multiple views of the same board? Can you assume that no pieces will be partially or completely occluded in all views?
On the imaged objects and the capture system: will the same chessboard and pieces be used, under very similar illumination? Will the same lens/camera/digitization pipeline be used?
Hi Andrei,
I have done a coin-counting algorithm from a picture, so the process should be helpful.
The algorithm is called the generalized Hough transform:
Make the picture black and white; it is easier that way
Take the image of one piece and slide it over the screenshot
For each position, calculate the number of common pixels in the two images
Where you have the largest number, there you have the piece
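What the steps above describe is essentially template matching; here is a minimal sketch with OpenCV's matchTemplate (Java for illustration; both images are assumed already binarized):

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;

    public class PieceFinder {
        // Slide the piece template over the board image and return the
        // top-left corner of the position with the highest overlap score.
        static Point findPiece(Mat boardBw, Mat pieceBw) {
            Mat result = new Mat();
            Imgproc.matchTemplate(boardBw, pieceBw, result, Imgproc.TM_CCORR_NORMED);
            Core.MinMaxLocResult mm = Core.minMaxLoc(result);
            return mm.maxLoc;
        }
    }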
Hope this helps.
Yeah, go with the approach above:
Convert the picture to greyscale
Slice it into 64 squares and store them in an array
Using MATLAB you can identify the pieces easily
Color can be obtained by calculating the percentage of dark (black) pixels:
ratio = no. of black pixels / (no. of black pixels + no. of white pixels)
If your value is above the threshold, then WHITE, else BLACK
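A small sketch of that black-pixel-ratio test in OpenCV for Java (the 128 binarization cutoff is a placeholder, and which side of the threshold means WHITE depends on how the pieces are rendered):

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;

    public class SquareColor {
        // ratio = black / (black + white) for one 8-bit grayscale square.
        static double blackPixelRatio(Mat squareGray) {
            Mat bw = new Mat();
            Imgproc.threshold(squareGray, bw, 128, 255, Imgproc.THRESH_BINARY);
            double white = Core.countNonZero(bw);
            double total = bw.rows() * bw.cols();
            return (total - white) / total;
        }
    }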
I'm working on a similar project in C#. Finding which piece is which isn't the hard part for me; the first step is to find a rectangle that contains just the board and cuts everything else out. I first hard-coded it to search for the colors of the squares, but I would like to make it more robust and reliable regardless of the color scheme, by finding squares of pixels that match within a certain threshold and extrapolating the board location from that.
I am a bit new to image processing, so I'd like to ask you about finding the best approach to my problem, not for help with code.
I couldn't come up with a good idea yet, so I wanted to ask for your advice. I hope you can help.
I'm working on a project with OpenCV that counts vehicles in a video file or a live camera stream. Other people working on such projects generally track the moving objects and then count them, but I wanted to take a different viewpoint: asking the user to set a ROI (region of interest) on the video window and working only within this region (for several reasons, e.g. to avoid dealing with the whole frame, and for some performance gain), as seen below. (By the way, the user can set more than one ROI, and is asked to make the ROI's height roughly twice that of a normal car, by sense of proportion.)
I've made some basic progress so far: background updating, morphological filters, thresholding, and extracting the moving objects as a binary image, something like below.
After that, I tried to count the white pixels of the final thresholded foreground frame and to estimate whether it was a car by checking the total number of white pixels (I set a lower bound through a static calculation based on the ROI height). To illustrate, I drew a sample graphic:
As you can see from the graphic, it was easy to compute the white-pixel count, check whether it traces a curve over time, and decide whether it was a car or just noise.
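A minimal sketch of that counting step in OpenCV for Java (the names and the threshold are illustrative):

    import org.opencv.core.*;

    public class RoiCounter {
        // Count foreground pixels inside the user-defined ROI of the
        // thresholded frame and compare against the precomputed lower bound.
        static boolean looksLikeCar(Mat binaryForeground, Rect roi, int minPixels) {
            return Core.countNonZero(binaryForeground.submat(roi)) >= minPixels;
        }
    }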
I was quite successful until two cars passed through my ROI together at the same time. My algorithm broke down and counted them as one car, as you can guess :/ I tried different approaches to this problem, and to similar ones like long vehicles, but I couldn't find an optimal solution so far.
My question is: is it impossible to handle this task with this pixel-counting approach? If it is possible, what would you suggest? I hope some of you have faced something similar before and can help me.
All ideas are welcome; thanks in advance, friends.
Isolate the traffic from the background: take two images, run a high-pass filter on one of them and convert the other to a binary image, then use the binary image to mask the filtered one. You should be able to use edge detection to identify the roof of each vehicle as a quadrilateral, and then compute a relative measure of it (see the sketch after the list below).
You then have four scenarios:
no quadrilaterals - no cars
large quadrilaterals - trucks
multiple small quadrilaterals - several cars
a single small quadrilateral - one car
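A rough sketch of the quadrilateral step in OpenCV for Java; the approxPolyDP epsilon is a common rule of thumb, and area cutoffs for car vs. truck would still be needed on top:

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;
    import java.util.ArrayList;
    import java.util.List;

    public class RoofFinder {
        // Contours in the masked edge image that simplify to four corners.
        static List<MatOfPoint> findQuads(Mat edges) {
            List<MatOfPoint> contours = new ArrayList<>();
            Imgproc.findContours(edges, contours, new Mat(),
                    Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
            List<MatOfPoint> quads = new ArrayList<>();
            for (MatOfPoint c : contours) {
                MatOfPoint2f c2f = new MatOfPoint2f(c.toArray());
                MatOfPoint2f approx = new MatOfPoint2f();
                double eps = 0.02 * Imgproc.arcLength(c2f, true);
                Imgproc.approxPolyDP(c2f, approx, eps, true);
                if (approx.total() == 4) quads.add(new MatOfPoint(approx.toArray()));
            }
            return quads;
        }
    }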
In answer to your question "Is it possible to do this using pixel counting?":
The short answer is "no", for the very reason you're quoting: mere pixel counting of static images is not enough.
If you are limited to pixel counting, you can try looking at the pixel-count velocity (the change in pixel counts between successive frames); you might pick out different "velocity" shapes when one car, two cars or a truck passes.
But just plain pixel counting? No. You need shape (geometric) information as well.
If you apply any kind of thresholding algorithm (e.g. for background subtraction), don't forget to update the background whenever light levels change (e.g. between day and night). Also consider the grief caused by partly cloudy weather, with sharp cloud shadows moving across your image.
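As one illustration of continuous background updating, OpenCV's MOG2 subtractor adapts its model on every frame; the learning rate below (0.01, a guess) controls how quickly gradual light changes are absorbed into the background:

    import org.opencv.core.*;
    import org.opencv.video.BackgroundSubtractorMOG2;
    import org.opencv.video.Video;

    public class AdaptiveBackground {
        static BackgroundSubtractorMOG2 mog2 = Video.createBackgroundSubtractorMOG2();

        // Foreground mask with a slow, continuous background update.
        static Mat foreground(Mat frame) {
            Mat fgMask = new Mat();
            mog2.apply(frame, fgMask, 0.01);
            return fgMask;
        }
    }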