learning steps for image recognition algorithm - image-processing

I have decided to spend my personal time after office hours learning the building blocks of how images (JPEG, for example) are parsed and displayed on screen. My main interest is object recognition in images, so I want to know where to start. I know there is math involved, so I need a step-by-step list of which resources on the Internet to look at.

I need a lot more information about what you want, but take a look at OpenCV for good examples:
http://sourceforge.net/projects/opencvlibrary/

I'd get Ritter's book (warning: costly!) and give it serious studying. If you just want to grab existing code and go play then perhaps you should look at libraries like OpenCV (see Lou's answer).

The ultimate goal of most image processing is to extract information about some high-level and application-dependent objects from an image available in low-level (pixel) form. The objects may be of everyday interest, as in robotics; cosmic ray showers or particle tracks, as in physics; chromosomes, as in biology; or houses, roads, and differently used agricultural surfaces, as in aerial photography or synthetic-aperture radar.
This task of pattern recognition is usually preceded by multiple steps of image restoration and enhancement, image segmentation, or feature extraction, steps which can be described in general terms. The final description in problem-dependent terms, and even more so the eventual image reconstruction, escapes such generality, and the literature of application areas has to be consulted.

Related

Is there an OCR API that can count objects?

Is there an OCR API that could be used for recognizing and counting objects in an image? Or can this be done with another image processing technique?
For example, if I take a close-up photo of three boxes, the API would just return the number 3 as a result.
You can look into OpenCV, which is popular for programmers learning about image processing and vision. You'll find an endless number of posts here on StackOverflow about OpenCV.
http://opencv.org/
Some freeware GUIs and free starter versions of commercial image processing packages will allow you to test image processing techniques without having to write the code. ImageJ is old but still worth checking out:
http://rsbweb.nih.gov/ij/
I don't want to show favoritism towards any of my sisters and brothers in the image processing world, but if you google for "machine vision free" or "computer vision free" and add words such as "GUI" you should be able to quickly find some free software that will allow you to test different image processing techniques just by using your mouse.
Along with your OCR algorithm, you'll need a segmentation method to count objects.
One such technique is the connected components algorithm:
http://en.wikipedia.org/wiki/Connected-component_labeling
The typical connected-components algorithm relies on some preprocessing:
1. Find a binarization threshold.
2. Apply the binarization threshold to generate an image of black (0) and white (1) values.
3. Run the connected-components algorithm and label all components (objects).
4. Filter the results by size and other parameters. For example, you probably don't want to include foreground objects that are only a few pixels in size.
5. Check the size of the list of filtered components.
This is a simple, low-level method, but it's useful in many situations. Even if you think you need a more complicated technique, I would strongly recommend that you first become familiar with connected components before moving on. Until one grasps the subtleties of lighting, binarization, and component labeling, it's unlikely one can learn much useful about more complicated algorithms. There really are no shortcuts.
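The steps listed above can be sketched in a few lines of plain Python. This is only a minimal illustration, not a production method: the image is a toy list of lists, the threshold is passed in as a fixed value rather than computed from the image, 4-connectivity is assumed, and the names `binarize`, `label_components`, and `count_objects` are made up for this example.

```python
from collections import deque


def binarize(image, threshold):
    """Return a 0/1 image: 1 where the pixel is brighter than the threshold."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


def label_components(binary):
    """Group 4-connected foreground pixels; return one pixel list per component."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def count_objects(image, threshold, min_size):
    """Binarize, label, drop components smaller than min_size, and count the rest."""
    components = label_components(binarize(image, threshold))
    return len([c for c in components if len(c) >= min_size])


# Toy grayscale image: two 2x2 bright blobs and one isolated bright pixel.
IMAGE = [
    [10, 10, 10, 10, 10, 10, 10, 10],
    [10, 200, 210, 10, 10, 10, 220, 10],
    [10, 205, 215, 10, 10, 10, 10, 10],
    [10, 10, 10, 10, 190, 195, 10, 10],
    [10, 10, 10, 10, 198, 199, 10, 10],
    [10, 10, 10, 10, 10, 10, 10, 10],
]

print(count_objects(IMAGE, threshold=128, min_size=3))
```

With `min_size=3` the isolated bright pixel is filtered out as noise; with `min_size=1` it would be counted as a third object. That sensitivity to threshold and size parameters is exactly what is worth experimenting with on real images.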
There are other, more complicated methods, but before suggesting which might be appropriate you would have to be more specific about what kind of objects you want to find.
With any image processing question, always include one or more sample images. It's generally not useful to talk about image processing algorithms without first understanding the image set with which you are working. What may be obvious to you will not be obvious to others, especially those who have spent years working on OCR applications and who have had to deal with a wide variety of backgrounds, scripts, and specifications.

Using OpenCV to find people who wear a certain hat

I would like to use computer vision to do the following:
A camera is mounted outside a building, capturing a videostream of the street below. The camera is installed approximately 5-6 meters above the street.
Whenever a person wearing a certain kind of hat (white, round) is captured by the camera, an event should be triggered.
Which algorithm should I look into to implement this kind of behavior?
Is this best achieved by training the algorithm with sample data, or is there another way to tell it to look for this type of hat?
Also, how do I use multiple frames of video to increase the quality of detection?
Edit: Added a picture of the hat
Before we do everything in comments I will start an answer here.
The first link you posted describes a simple color-based detection. You can try that, but it will fail if there are other pixel clusters of similar color in the image. Your idea of combining it with tracking is good: Identify clusters, build trajectories over several images, and only accept plausible trajectories as a hit. For robust tracking you may want to look into Kalman filtering. A problem you will most likely encounter is that a "white" hat will hardly be "white" in the images your camera delivers.
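As a rough illustration of "build trajectories and only accept plausible ones", here is a small sketch in plain Python. It assumes the color-based step has already produced a list of blob centroids per frame, and it links them greedily by nearest neighbour. It does not implement a Kalman filter; a Kalman filter would replace the "last known position" used here with a predicted position. The function names and the `max_jump` and `min_length` parameters are hypothetical.

```python
import math


def link_tracks(frames, max_jump):
    """Greedily link per-frame centroids into tracks by nearest neighbour.

    frames: list of lists of (x, y) centroids, one inner list per video frame.
    max_jump: largest distance a blob may move between consecutive frames.
    """
    finished, active = [], []
    for detections in frames:
        unused = list(detections)
        still_active = []
        for track in active:
            last = track[-1]
            best = min(unused, key=lambda p: math.dist(last, p), default=None)
            if best is not None and math.dist(last, best) <= max_jump:
                track.append(best)
                unused.remove(best)
                still_active.append(track)
            else:
                finished.append(track)  # no nearby detection: the track ends here
        # Every unmatched detection starts a new candidate track.
        still_active.extend([p] for p in unused)
        active = still_active
    return finished + active


def plausible_tracks(frames, max_jump=15.0, min_length=4):
    """Keep only tracks that persist for at least min_length frames."""
    return [t for t in link_tracks(frames, max_jump) if len(t) >= min_length]


# One blob moving steadily to the right, plus two single-frame false alarms.
FRAMES = [
    [(10, 50), (200, 20)],
    [(14, 51)],
    [(19, 52), (90, 90)],
    [(23, 53)],
    [(28, 54)],
]

for track in plausible_tracks(FRAMES):
    print(track)
```

Only the blob that moves consistently over several frames survives; white pixel clusters that appear in a single frame are discarded.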
The second link you refer to - boosted Classifiers Based on Haar-like Features - is for detection of more complex objects. It probably won't help you find white blobs. Invest your time and energy in learning about tracking.
I'm happy to repeat myself here: "Solving a computer vision problem" is not something like "sorting an array". OpenCV is not the C++ Standard Library. You can use a std::map without knowing anything about red-black trees. But (IMHO) you can't use vision APIs without knowing a good deal of the math and theory. Working solutions in computer vision are typically heavily tuned towards the specific problem scenario. Sorry if that sounds pedantic, but it explains why your question got beaten.

How to detect architecture and sculpture in opencv?

Can someone tell me how I can detect pictures of architecture or sculpture?
I think the Hough transform is a good approach, but I'm new to CV and maybe there are better methods to detect patterns. I heard about Haar cascades. Can I use them for architecture, too?
For example, I want to detect these kinds of pictures:
http://img842.imageshack.us/img842/4748/resizeimg0931.jpg
If you want an algorithm to detect them, keep in mind that detecting an object in an image requires a description of that object that a machine can understand. For sculpture or architecture, how can you have such a uniform definition when they vary so much in every sense? For example, both of your input images differ a lot. How can we differentiate between an ordinary house and a piece of architecture? Your question raises a lot of problems. Even with the Hough transform, how are you supposed to differentiate a big house from a big piece of architecture?
Check out this Stack Overflow question: Image Processing: Algorithm Improvement for 'Coca-Cola Can' Recognition
The asker wants to detect Coca-Cola cans, but not Coca-Cola bottles. If you look at it closely, you will see that cans and bottles look almost alike, so it is difficult to differentiate between them. You can read about many of the difficulties in the answers there. A major problem is that, in some cases, it is difficult even for humans to tell them apart.
In your second image, even if you train some cascades for it, there is a chance they will also detect live lions if any are present in an image, since a sculpted lion and a real lion look almost the same to a machine.
Haar cascades may not be very effective, since you have to train them on a lot of these kinds of images.
If you have some sample images and want to check whether those things appear in your image, maybe you can use SURF features or similar. But you need some sample images to compare against first. For a demo of SURF, check out this Stack Overflow question: OpenCV 2.4.1 - computing SURF descriptors in Python
Another option is template matching. But it is slow, and it is not scale or orientation invariant. You also need some template images for this.
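To make the trade-offs of template matching concrete, here is a minimal brute-force sketch in plain Python. It uses the sum of squared differences as the similarity score, which is one common choice among several, and the function name `match_template` is chosen for this example only. The four nested loops show why the method is slow, and because the template is compared pixel by pixel at one fixed size and angle, a scaled or rotated object will not score well.

```python
def match_template(image, template):
    """Slide the template over the image; return ((row, col), score) of the best fit.

    The score is the sum of squared differences, so lower is better and 0 is exact.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, float("inf")
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            score = 0
            for ty in range(th):
                for tx in range(tw):
                    diff = image[y + ty][x + tx] - template[ty][tx]
                    score += diff * diff
            if score < best_score:
                best_pos, best_score = (y, x), score
    return best_pos, best_score


IMAGE = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 0],
]
TEMPLATE = [
    [9, 1],
    [1, 9],
]

position, score = match_template(IMAGE, TEMPLATE)
print(position, score)
```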
I think I have seen some papers on this topic (but I don't remember them now). Maybe googling will turn them up. I will update the answer if I find them.

A good method for detecting the presence of a particular feature in an image

I have made a videochat, but as usual, a lot of men like to, ehm, abuse the service (I leave it up to you to figure out the nature of such abuse), which is not something I endorse in any way, nor do most of my users. No, I have not stolen chatroulette.com :-) Frankly, I am half-embarrassed to bring this up here, but my question is technical and rather specific:
I want to filter/deny users based on their video content when this content is of an offending character, such as a user flashing his junk on camera. What kind of image comparison algorithm would suit my needs?
I have spent a week or so reading some scientific papers and have become aware of multiple theories and their implementations, such as SIFT, SURF and some of the wavelet based approaches. Each of these has drawbacks and advantages of course. But since the nature of my image comparison is highly specific - to deny service if a certain body part is encountered on video in a range of positions - I am wondering which of the methods will suit me best?
Currently, I lean towards something along the following (Wavelet-based plus something I assume to be some proprietary innovations):
http://grail.cs.washington.edu/projects/query/
With the above, I can simply draw the offending body part and expect offending content to be considered a match based on a threshold. Then again, I am unsure whether the method is invariant to transformations and, if it is, to what kind; the paper isn't really specific on that.
Alternatively, I am thinking that a SURF implementation could do, but I am afraid that it could give me false positives. Can such an implementation be trained to recognize or give weight to a specific feature?
I am aware that there exist numerous questions on SURF and SIFT here, but most of them are generic in that they usually explain how to "compare" two images. My comparison is feature specific, not generic. I need a method that does not just compare two similar images, but one which can give me a rank/index/weight for a feature (however the method lets me describe it, be it an image itself or something else) being present in an image.
It looks like you need object recognition rather than feature detection, e.g. the Viola-Jones method.
Take a look at the facedetect.cpp example shipped with OpenCV (there are also several ready-to-use Haar cascades: face detector, body detector...). It also uses image features, namely Haar-like features, which resemble Haar wavelets. You might also be interested in using color information; take a look at the CamShift algorithm (also available in OpenCV).
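For a feel of what a Haar-like feature is, here is a small sketch in plain Python. It is not OpenCV's implementation and it leaves out the boosted cascade entirely; it only shows the two building blocks that the Viola-Jones approach relies on: an integral image (summed-area table) and a two-rectangle feature computed from it. All function names here are made up for illustration.

```python
def integral_image(image):
    """Summed-area table with an extra zero row and column for easy lookups."""
    height, width = len(image), len(image[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return table[bottom][right] - table[top][right] - table[bottom][left] + table[top][left]


def two_rect_feature(table, top, left, height, width):
    """Left half minus right half: responds strongly to a vertical edge."""
    half = width // 2
    left_sum = rect_sum(table, top, left, height, half)
    right_sum = rect_sum(table, top, left + half, height, half)
    return left_sum - right_sum


# Bright left half, dark right half: a vertical edge in the middle.
EDGE = [
    [10, 10, 1, 1],
    [10, 10, 1, 1],
    [10, 10, 1, 1],
    [10, 10, 1, 1],
]
FLAT = [[5] * 4 for _ in range(4)]

print(two_rect_feature(integral_image(EDGE), 0, 0, 4, 4))
print(two_rect_feature(integral_image(FLAT), 0, 0, 4, 4))
```

The feature is large on the edge image and zero on the flat one. A trained cascade combines many such features at different positions and sizes, and the integral image is what makes each one cheap to evaluate.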
This is more about computer vision. You have to recognize objects in your image or video sequence. For that, you can use a lot of different algorithms (many of them work in a transformed domain, such as the spectral domain, which is why you may have to apply a transformation first).
In order to be accurate, you will also need a knowledge base or, at least, some descriptors that will define the object.
Try OpenCV, it has some algorithms already implemented (and basic descriptors included).
There are applications/algorithms out there that you can "train" (like neural networks) and are able to identify objects based on the training. Most of them (at least, the good ones) are not very popular and can only be found in research groups specialized in computer vision, object recognition, AI, etc.
Good luck!

what are the steps in object detection?

I'm new to image processing and I want to do a project on object detection. Please help me by suggesting a step-by-step procedure for this project. Thanx.
Object detection is a very complex problem that includes some real hardcore math and long tuning of parameters to the computation methods involved. Your best bet is to use some freely available library for that - Google will help.
There are a lot of algorithms on this topic and no single one is the best of all. It's usually a mixture of them that makes the best solution to the problem.
For example, for object movement detection you could look at frame differencing and mixture of Gaussians.
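Frame differencing is simple enough to sketch in plain Python. The example below is only an illustration under assumed values: frames are toy grayscale lists of lists, and the `threshold` and `min_changed` parameters are hypothetical knobs that would need tuning for a real camera. Mixture of Gaussians is a more elaborate background model and is not shown here.

```python
def motion_mask(previous, current, threshold):
    """Mark pixels (1) whose brightness changed by more than the threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def motion_detected(previous, current, threshold=25, min_changed=2):
    """Report motion when enough pixels changed between two frames."""
    mask = motion_mask(previous, current, threshold)
    changed = sum(sum(row) for row in mask)
    return changed >= min_changed


FRAME_A = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
# Same scene with slight sensor noise and a bright object entering on the right.
FRAME_B = [
    [12, 10, 10, 10],
    [10, 10, 200, 210],
    [10, 9, 205, 190],
]

print(motion_mask(FRAME_A, FRAME_B, 25))
print(motion_detected(FRAME_A, FRAME_B))
```

Small brightness fluctuations stay below the threshold, while the four pixels covered by the new object are flagged.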
Also, it depends heavily on your application, the environment (e.g. noise, signal quality), the processing capacity you have available, the allowable error margin...
Besides, for it to work, most of the time you first need to do some kind of image processing on the input data, such as a median filter, a Sobel filter, contrast enhancement, and so on.
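As one example of such preprocessing, a 3x3 median filter can be written in a few lines of plain Python. This is a sketch only: the border handling (border pixels are copied unchanged) is an arbitrary simplification, and libraries such as OpenCV provide much faster versions.

```python
from statistics import median


def median_filter_3x3(image):
    """Replace each interior pixel with the median of its 3x3 neighbourhood."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # border pixels are copied unchanged
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


# Flat gray image with a single "salt" noise pixel in the middle.
NOISY = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 255, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]

for row in median_filter_3x3(NOISY):
    print(row)
```

The isolated bright pixel disappears because it is never the median of its neighbourhood, which is why median filtering is a common first step against salt-and-pepper noise.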
I think you should start by reading all you can: books, Google and, very importantly, a lot of papers on the subjects you are interested in (many are freely available on the Internet).
And first of all, I think it's fundamental (at least it has been for me) to have a good library for testing. The one I use is OpenCV. It's very complete, implements many of the more advanced current algorithms, is very active, has a big community, and it's free.
Open Computer Vision Library (OpenCV)
Good luck ;)
Take a look at AForge.NET. It's nowhere near Project Natal's levels of accuracy or usefulness, but it does give you the tools to learn the algorithms easily. It's an image processing and AI library and there are several tutorials on colored object tracking and motion detection.
Another one to look at is OpenCV from Intel. I believe it's a bit more advanced, but it's written in C.
Take a look at this. It might get you started in this complex field. The algorithm pages that it links to are interesting reading.
http://sun-valley.stanford.edu/projects/helicopters/final.html
This lecture by Jeff Hawkins will give you an idea about the state of the art in this super-difficult field.
Seems that video disappeared... but this vid should cover similar ground.
