I was hoping someone could point me in the right direction here. With a picture of a die (from above) I want to recognize which side is up.
I understand the basics in play here, but I'm having trouble grasping the power of OpenCV. I imagine I want a picture of each side of the die. Then I can somehow compare them all to the current image to be classified. How can I use OpenCV to do this?
Thanks,
Jonathan
While that would work and OpenCV has template matching functions, it would probably be harder than necessary. Good results would require that the lighting is more or less unchanged between all images, that the camera is fixed, and that no projective distortions occur.
Instead, I would do something like this:
In the image, locate the die. The difficulty here will vary with how the die and the background look. If you have a white die on a plain black (or some other color) background, then finding the die will be easy.
When the die has been located, find the eyes. This can be done by simply finding all black blobs.
If necessary, make sure that the found eyes form a coherent pattern. E.g. if the side up is four you expect to find the eyes as the corners in a square, not on a straight line.
Count valid eyes. There's your side up.
This outline is quite vague as there are lots of ways of performing each step. I do however believe that everything you need is available in OpenCV.
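As a rough illustration of the "find all black blobs and count them" steps, here is a minimal sketch in plain Python. It assumes the die face has already been located and cropped to a small grayscale grid (lists of 0-255 values). The names `count_dark_blobs` and `make_die_face`, the `threshold` and the `min_area` filter are made up for this example. In OpenCV itself, functions such as `cv2.threshold`, `cv2.findContours` or `cv2.connectedComponents` would normally do this work.

```python
from collections import deque


def count_dark_blobs(gray, threshold=80, min_area=2):
    """Count 4-connected regions of dark pixels (candidate eyes) in a grayscale grid."""
    rows, cols = len(gray), len(gray[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or gray[r][c] >= threshold:
                continue
            # Flood-fill one dark region and measure its area.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and gray[ny][nx] < threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Ignore specks smaller than min_area (hypothetical noise filter).
            if area >= min_area:
                blobs += 1
    return blobs


def make_die_face(pips, size=9):
    """Synthetic white face with a 2x2 dark pip at each given (row, col) corner."""
    face = [[255] * size for _ in range(size)]
    for r, c in pips:
        for dy in (0, 1):
            for dx in (0, 1):
                face[r + dy][c + dx] = 0
    return face


four = make_die_face([(1, 1), (1, 6), (6, 1), (6, 6)])
print(count_dark_blobs(four))  # prints 4
```

The pattern check from the outline (for example, that four eyes sit at the corners of a square) is not included here. It could be added by collecting each blob's centroid during the flood fill.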
Good luck!
I've been trying to extract hand-drawn circles from a document for a while now but every attempt I make doesn't have the level of consistency I need.
Process Album
The problem I keep coming up against is when 2 "circles" are too close they become a single contour, ruining my attempt to detect if a contour is curved. I'm sure there must be a better way to extract these circles, but their imperfection and inconsistency are really stumping me.
I've tried many other ways to single out the curves, the most accurate of which being:
Rather than use dilation to bridge the gap between the segmented contours, find the endpoints and attempt to continue the curve until it hits another contour.
Problem: I can't effectively find the turning points of the contour, otherwise this would be my preferable method
I apologize if this question is deemed "too specific", but I feel like Computer Vision stuff like this can always be applied elsewhere.
Thanks ahead of time for any and all help, I'm about at the end of my rope here.
EDIT: I've just realized the album wasn't working correctly, I think it should be fixed now though.
It looks like a very challenging problem so it is very likely that the things I am going to write wouldn't work very well in practice.
In order to ease the problem, I would probably try to remove as much of other stuff from the image as possible.
If the template of the document is always the same, it might be worth trying to remove the horizontal and vertical lines along with the grayed areas. For example, given the empty template, subtract it from the document that you are processing. It might be possible to get rid of the text as well. This would result in an image containing only parts of the hand-drawn circles.
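The subtraction idea can be sketched as follows, in plain Python on tiny grayscale grids. This assumes the empty template and the scanned document are already aligned pixel for pixel, which in practice would need a registration step first. The function name `subtract_template` and the `tolerance` value are made up for this example.

```python
def subtract_template(document, template, tolerance=40):
    """Return a binary mask: 1 where the document is clearly darker than the template."""
    mask = []
    for doc_row, tpl_row in zip(document, template):
        mask.append([1 if tpl - doc > tolerance else 0
                     for doc, tpl in zip(doc_row, tpl_row)])
    return mask


# Template: white page with one printed horizontal rule (middle row).
template = [
    [255, 255, 255, 255],
    [0,   0,   0,   0],
    [255, 255, 255, 255],
]
# Document: same page with a hand-drawn vertical stroke in column 1.
document = [
    [255, 30, 255, 255],
    [0,   0,  0,   0],
    [255, 30, 255, 255],
]

mask = subtract_template(document, template)
print(mask)  # prints [[0, 1, 0, 0], [0, 0, 0, 0], [0, 1, 0, 0]]
```

Note that the stroke disappears where it crosses the printed rule, since both are dark there. That is one reason the result is only "parts of" the hand-drawn circles.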
On such an image, detecting circles or ellipses with the Hough transform might give some results (although the shapes might be far from circles or ellipses).
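To show why a Hough-style vote can tolerate broken outlines, here is a small sketch in plain Python. It is a simplification: it assumes a single known `radius`, whereas a real detector (for example `cv2.HoughCircles`) searches a range of radii and works on gradient information. The function name `hough_circle_centers` and the synthetic arc are made up for this example.

```python
import math
from collections import Counter


def hough_circle_centers(edge_points, radius, angle_steps=72):
    """Let every edge point vote for all centres that lie `radius` away from it."""
    votes = Counter()
    for x, y in edge_points:
        cells = set()
        for k in range(angle_steps):
            theta = 2 * math.pi * k / angle_steps
            cells.add((round(x - radius * math.cos(theta)),
                       round(y - radius * math.sin(theta))))
        # Each edge point votes at most once per accumulator cell.
        for cell in cells:
            votes[cell] += 1
    return votes


# Synthetic, incomplete circle: three quarters of an arc around (20, 15), radius 8.
points = set()
for deg in range(0, 270, 5):
    t = math.radians(deg)
    points.add((round(20 + 8 * math.cos(t)), round(15 + 8 * math.sin(t))))

votes = hough_circle_centers(points, radius=8)
best, count = votes.most_common(1)[0]
print(best, count, len(points))
```

Even with a quarter of the outline missing, the true centre collects a vote from every remaining edge point. Hand-drawn shapes that are far from circular will spread their votes over several cells, so the peak will be much weaker.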
With image processing libraries like OpenCV you can determine whether there are faces in an image, or even check whether those faces are smiling.
Would it be possible to somehow determine whether the person is looking directly into the camera? As it is hard even for the human eye to tell if someone is looking into the camera or at a nearby point, I think this will be very tricky.
Does anyone agree?
thanks
You can try using an eye detection program. I remember doing this a few years ago, and it wasn't that strong: when we tilted our head slightly away from the camera, or closed our eyes, the eyes couldn't be detected.
If it is not clear, what I really meant was that our face must be facing straight at the camera with our eyes open before it can detect our eyes. You can try doing something similar with a few tweaks here and there.
Off the top of my head: split the image into different sections and use a different eye classifier for each ROI. For example, for the upper half of the image you can train a classifier on how eyes look when they look downwards, and for the lower half train a classifier on how eyes look when they look upwards. For the whole image, apply the normal eye detection in case the user moves their head while looking at the camera.
But of course, this would rely on extremely strong classifiers and very clear images or video to tell where the eye is looking, which would make detection extremely slow even if my method is successful.
There may be other ideas that you can explore too. It's slightly tricky, but it's not totally impossible. If OpenCV can't do it, maybe OpenGL? There are so many libraries available. I wish you the best of luck!
I'm trying to detect a texture using OpenCV. The texture would be similar to that of the brush on a paintbrush, so in an image it would have many little lines close together. I've tried using Hough Lines to distinguish the texture from other things, but it hasn't been working out too well, as too many false positives are detected. Other than that, I've had ideas about using template matching as well as fast Fourier transforms, but I haven't tried testing or implementing them.
So, would anyone else have any idea of a possible method to do this? Maybe use some other line detector or an edge detector? Or would that bring up too many false positives?
This texture should be detectable in a cluttered scene, and the algorithm for doing so should be relatively fast, since I want to track it in real time if possible. Sorry for not being able to post a sample of the texture I want (not enough rep, lol), but you can simply search for paintbrush/paintbrush texture if you really need to see what it looks like. If you've seen a paintbrush before, it should be pretty obvious which part I'm talking about (the part with the brush).
Thanks in advance, really appreciate it.
I have a random 2D image. I would like to be able to present the image in 3D. This doesn't have to be very detailed, even if the image were arbitrarily broken into layers like a pop-up cutout from a children's book.
The goal would be that a given image would look normal when directly viewed but that if a viewer were to move/tilt left, right, up, down there would be a 3d effect.
This is similar but not exactly the same as this question here:
How to create 3D streoscopic images using MATLAB with image tool?
This is complete overkill:
http://make3d.cs.cornell.edu/
And this is probably on the right track:
http://www.imagemagick.org/Usage/distorts/#perspective
My ideal implementation would be an automated PHP script using ImageMagick that is fed an image and spits out either of the following (in order of preference):
Images representing each layer, from nearest to deepest (closer to the child's pop-up book layer analogy)
5 images representing the said views (direct, left, right, top, bottom)
Has this been done (either of the above ideal implementations), or does anyone know how to do all, or part, of this?
As far as the first part of your question is concerned, it sounds like your ideal implementation is http://make3d.cs.cornell.edu/, except that:
you want it simpler (return images from a fixed set of angles as opposed to a walkthrough)
you want it with imagemagick and PHP
I think that last restriction is unrealistic because there's a fair amount of maths and computer vision behind this kind of problem. ImageMagick will help you with lower-level image processing tasks like affine transforms, but it doesn't really provide the required higher-level computer vision functionality like 3D image reconstruction.
So my advice would be to try and work around that restriction somehow. If you implement the approach using more suitable tools (like C++ and OpenCV, for example, or Matlab, as the Make3D guys did), then you can wrap that in a CGI application so your PHP scripts can access it. Cornell (the authors of Make3D) had a similar thing going a while back, but it looks like they're not doing it any more.
For the second part of your question, the theory behind what you want to do has been fairly well-researched. See here for a list of depth estimation papers. Here is what things look like in source.
Algorithm for a drawing and painting robot
Hello
I want to write a piece of software which analyses an image, and then produces an image which captures what a human eye perceives in the original image, using a minimum of bezier path objects of varying colour and opacity.
Unlike the recent twitter super compression contest (see: stackoverflow.com/questions/891643/twitter-image-encoding-challenge), my goal is not to create a replica which is faithful to the image, but instead to replicate the human experience of looking at the image.
As an example, if the original image shows a red balloon in the top left corner, and the reproduction has something that looks like a red balloon in the top left corner then I will have achieved my goal, even if the balloon in the reproduction is not quite in the same position and not quite the same size or colour.
When I say "as perceived by a human", I mean this in a very limited sense. I am not attempting to analyse the meaning of an image, and I don't need to know what an image is of. I am only interested in the key visual features a human eye would notice, to the extent that this can be automated by an algorithm which has no capacity to conceptualise what it is actually observing.
Why this unusual criterion of human perception over photographic accuracy?
This software would be used to drive a drawing and painting robot, which will be collaborating with a human artist (see: video.google.com/videosearch?q=mr%20squiggle).
Rather than treating marks made by the human which are not photographically perfect as necessarily being mistakes, the algorithm should seek to incorporate what is already on the canvas into the final image.
So relative brightness, hue, saturation, size and position are much more important than being photographically identical to the original. Maintaining the topology of the features (blocks of colour, gradients, convex and concave curves) will be more important than the exact size, shape and colour of those features.
Still with me?
My problem is that I am suffering a little from the "when you have a hammer everything looks like a nail" syndrome. To me it seems the way to do this is using a genetic algorithm with something like the comparison of wavelet transforms (see: grail.cs.washington.edu/projects/query/) used by retrievr (see: labs.systemone.at/retrievr/) to select fit solutions.
But the main reason I see this as the answer is that these are the techniques I know. There are probably much more elegant solutions using techniques I don't know anything about.
It would be especially interesting to take into account the ways the human vision system analyses an image, so perhaps special attention needs to be paid to straight lines, and angles, high contrast borders and large blocks of similar colours.
Do you have any suggestions for things I should read on vision, image algorithms, genetic algorithms or similar projects?
Thank you
Mat
PS. Some of the spelling above may appear wrong to you and your spellcheck. It's just international spelling variations which may differ from the standard in your country: e.g. Australian standard: colour vs American standard: color
There is a model that can be implemented as an algorithm to calculate a saliency map for an image, determining which parts of the image would get the most attention from a human.
The model is called the Itti-Koch model
You can find a starting paper here
And more resources and C++ source code here
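To give a feel for what a saliency map is, here is a heavily simplified sketch in plain Python. It is not the Itti-Koch model itself, which combines colour, intensity and orientation channels over several scales. It only computes a single-scale centre-surround contrast on intensity, the basic idea of "a region stands out when it differs from its neighbourhood". The function names and the `center` and `surround` window radii are made up for this example.

```python
def box_mean(img, r, c, radius):
    """Mean intensity in a square window around (r, c), clipped at the image border."""
    rows, cols = len(img), len(img[0])
    total = count = 0
    for y in range(max(0, r - radius), min(rows, r + radius + 1)):
        for x in range(max(0, c - radius), min(cols, c + radius + 1)):
            total += img[y][x]
            count += 1
    return total / count


def center_surround_saliency(img, center=1, surround=4):
    """Saliency = |mean of a small window - mean of a larger window| at each pixel."""
    return [[abs(box_mean(img, r, c, center) - box_mean(img, r, c, surround))
             for c in range(len(img[0]))]
            for r in range(len(img))]


# Dark 15x15 image with one bright 3x3 patch (think of the red balloon).
img = [[10] * 15 for _ in range(15)]
for y in range(6, 9):
    for x in range(6, 9):
        img[y][x] = 250

sal = center_surround_saliency(img)
peak = max((v, r, c) for r, row in enumerate(sal) for c, v in enumerate(row))
print(peak[1:])  # prints (7, 7), the middle of the bright patch
```

Flat regions score zero and the isolated patch scores highest, which is the kind of ranking a drawing robot could use to decide which features to reproduce first.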
I cannot answer your question directly, but you should really take a look at artist/programmer (Lisp) Harold Cohen's painting machine Aaron.
That's quite a big task. You might be interested in image vectorizing (don't know what it's called officially), which takes in rasterized images (such as pictures you take with a camera) and outputs a set of bezier lines (I think) that approximate the image you put in. Since good algorithms often output very high quality (read: complex) line sets, you'd also be interested in simplification algorithms, which can help enormously.
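As one concrete example of a simplification algorithm, here is a minimal sketch of Ramer-Douglas-Peucker in plain Python. This is my pick for illustration, not something the answer above names. It also works on polylines (lists of points) rather than bezier curves, so a traced path would have to be sampled into points first. The function names and the `epsilon` tolerance are made up for this example.

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def simplify(points, epsilon):
    """Drop points that deviate less than epsilon from the simplified path."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        # Everything in between is close enough to the straight segment.
        return [first, last]
    # Keep the farthest point and simplify both halves recursively.
    left = simplify(points[:index + 1], epsilon)
    right = simplify(points[index:], epsilon)
    return left[:-1] + right


path = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
print(simplify(path, 0.5))  # prints [(0, 0), (2, 0), (2, 2)]
```

A larger `epsilon` throws away more detail, which fits the goal of keeping the overall shape rather than photographic accuracy.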
Unfortunately I am not next to my library, or I could recommend a number of books on perceptual psychology.
The first thing you must consider is that the physiology of the human eye is such that when we examine an image or scene, we are only capturing very small bits at a time, as our eyes dart around rapidly. Our mind pieces the different parts together to try and form a whole.
You might start by finding an algorithm for the path of an eyeball as it darts around. Perhaps it is attracted to contrast?
Next is that our eyes adjust the "exposure" depending on the context. It's like those high dynamic range images, if they were pieced together not from multiple exposures of a whole scene, but from many small images, each balanced on its own, but blended into its surroundings to form a high dynamic range.
Now there was a finding in a monkey brain that there is a single neuron that lights up if there's a diagonal line in the upper left of its field of vision. Similar neurons can be found for vertical lines, and horizontal lines in various areas of that monkey's field of vision. The "diagonalness" determines the frequency with which that neuron fires.
One might speculate that other neurons might be found and mapped to other qualities such as redness, or texturedness, and other things.
There's something humans can do that I've never seen a computer program able to do. It's something called "closure", where a human is able to fill in information about something they are seeing that doesn't actually exist in the image. An example:
*
* *
Is that a triangle? If you knew in advance that it was, then you could probably make a program to connect the dots. But what if it's just dots? How can you know? I wouldn't attempt this one unless I had some really clever way of dealing with it.
There are many other facts about human perception you might be able to use. Good luck, you've not picked a straightforward task.
I think one thing that could help you in this enormous task is human involvement. I mean data. For example, you could have many people sit and stare at random dots (like the ones in the previous post) and connect them as they see fit. You could harness that data.