How can I compare images of the same origin that were cropped? - image-processing

Suppose I have an image file/URL, and I want my software to search it within a set of up to 100 images (or at least in that order of magnitude). The target image that the software should find should be the "same" image as the given image, but it should still be able to "forgive" slight processing on either of them (the two images may have been cropped differently, or they were compressed differently).
The question is: is this a feasible task, given that I won't have any of the images before the search takes place (i.e., there won't be any indexing prior to the search)? Is it likely to work in subsecond time (remember that the comparison set is quite small)? And if feasible, which tools can I use for this task? This could be software components or even an online service (I can live with that for a proof of concept). Can OpenSURF help me here?
To focus my question further - I'm not asking which algorithms to use, at this point I would rather use an existing tool/API/service.

The target image that the software should find should be the "same" image as the given image, but it should still be able to "forgive" slight processing on either of them.
If "slight processing" doesn't involve rotation, but only "cropping", then simple cross-correlation should work. If there could be perspective correction, rotation, or lens distortion correction, then things are more complicated.
I think this method is quite forgiving of slight color corrections. Anyway, you can always convert both images to grayscale and compare the grayscale versions if you want.
To focus my question further - I'm not asking which algorithms to use, at this point I would rather use an existing tool/API/service.
You can start from cvMatchTemplate from the OpenCV library (the link points to the C version of the API, but it's also available for C++ and Python). Use the cropped image as a template, and look for it in all your images.
If the images you compare have dark features on light backgrounds, you may benefit from using CV_TM_CCOEFF or CV_TM_CCOEFF_NORMED methods. They both subtract the average over the template area from both images. Normalized methods (CV_TM_*_NORMED) generally work better but are slower than their non-normalized counterparts.
You may consider preprocessing the images before the cross-correlation. If you normalize them first, the cross-correlation will be less sensitive to slight brightness/contrast modifications. If you detect edges first, as suggested by @misha, you'll lose color/lightness information, but the results for contour overlapping will be much better.
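To make the idea concrete, here is a minimal NumPy sketch of zero-mean normalized cross-correlation, which is roughly what the CV_TM_CCOEFF_NORMED method computes. It is a slow brute-force illustration, not OpenCV's implementation; the function name and the synthetic test data are assumptions made for this example.

```python
import numpy as np

def match_template_ncc(image, template):
    """Return (row, col, score) of the best zero-mean normalized cross-correlation."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    best = (0, 0, -1.0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()
            denom = t_norm * np.sqrt((p ** 2).sum())
            if denom == 0:
                continue  # flat patch or flat template: correlation is undefined
            score = float((p * t).sum() / denom)
            if score > best[2]:
                best = (r, c, score)
    return best

# Synthetic data: a random "full" image and a crop taken at row 10, column 15
rng = np.random.default_rng(0)
full = rng.random((40, 60))
crop = full[10:30, 15:45]
# Simulate a mild brightness/contrast change on the cropped copy
crop_mod = 0.8 * crop + 0.1
row, col, score = match_template_ncc(full, crop_mod)
```

Because the mean is subtracted and the result is normalized, the brightness/contrast change on the crop does not move the peak. For real images you would use the library routine instead, which is far faster.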

jetxee set you off on the right track. However, if you simply use template matching, you can run into problems where the background interferes with your template matching result. For example, if your template is a building and your background is primarily light (e.g. desert sand), then the template matching will fail because the lighter background will always return a higher cross-correlation than the darker template. Here is an example of this problem.
The way you solve it is the same as what is in the link:
Perform edge detection on both your template and the target image.
Throw the original template and image away.
Perform template detection using the edge-detected template and the edge-detected target image.
As far as forgiving slight processing, the edge detection step will take care of that. As long as the edges in the two images are not modified significantly (blurred, optically distorted), the approach will work.
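The three steps above can be sketched as follows. This is a toy illustration under stated assumptions: a gradient magnitude stands in for a real edge detector, plain unnormalized correlation stands in for the template matcher, and the "sand" and "building" scene is synthetic. All function names are made up for the example.

```python
import numpy as np

def edge_map(img):
    """Gradient magnitude from finite differences (a stand-in for a real edge detector)."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return np.hypot(gx, gy)

def correlate_valid(image, template):
    """Plain (unnormalized) cross-correlation at every valid offset."""
    th, tw = template.shape
    out = np.empty((image.shape[0] - th + 1, image.shape[1] - tw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = (image[r:r + th, c:c + tw] * template).sum()
    return out

def argmax2d(a):
    """Row and column of the largest value in a 2-D array."""
    return tuple(int(i) for i in np.unravel_index(np.argmax(a), a.shape))

# Synthetic scene: bright "sand" with one dark "building" at rows 12-19, cols 20-29
scene = np.full((32, 48), 0.9)
scene[12:20, 20:30] = 0.2
template = scene[10:22, 18:32].copy()   # building plus a 2-pixel sand margin

# Matching on raw intensities is pulled towards the bright background
raw_peak = argmax2d(correlate_valid(scene, template))
# Matching on edge maps peaks where the template was actually cut out: (10, 18)
edge_peak = argmax2d(correlate_valid(edge_map(scene), edge_map(template)))
```

In this toy scene the raw correlation peaks over plain sand, while the edge-based correlation peaks at the true offset, because the flat background contributes nothing to the edge maps.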

I know you are not looking specifically for algorithms, but nonetheless, let me suggest the following which can accomplish exactly what you are trying to do, very efficiently...
For cropped versions of the same image, including rotation, the Fourier-Mellin transform or a log-polar transform (watch out for the artsy semi-nude drawing - good source however) will give you the translation, rotation and scale coefficients between the two images, allowing you to determine what operations were needed to go from one to the other.
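As a partial illustration, the translation part of this approach is phase correlation, sketched below with NumPy's FFT. This covers only the shift; the Fourier-Mellin method applies the same idea to log-polar resampled magnitude spectra to recover rotation and scale, which is not shown here. The function name is an assumption, and the example uses a circular shift so that the result is exact.

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate the circular shift (dy, dx) such that b ≈ np.roll(a, (dy, dx), axis=(0, 1))."""
    fa = np.fft.fft2(a)
    fb = np.fft.fft2(b)
    cross = fb * np.conj(fa)
    cross /= np.abs(cross) + 1e-12      # keep only the phase
    corr = np.fft.ifft2(cross).real     # ideally a single sharp peak at the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Map shifts larger than half the image size to negative values
    if dy > a.shape[0] // 2:
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return int(dy), int(dx)

rng = np.random.default_rng(0)
a = rng.random((64, 64))
b = np.roll(a, (5, -7), axis=(0, 1))
shift = phase_correlation_shift(a, b)   # (5, -7)
```

With real crops the shift is not circular, so the peak is weaker and some windowing or padding is usually needed; treat this as a starting point only.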

Related

Feature detection on a small, noisy image with OpenCV

I have an image that is pretty noisy and small (the relevant portion is 381 × 314), and the features are very subtle.
The source image and the cropped relevant area are here as well: http://imgur.com/a/O8Zc2
The task is to count the number of white-ish dots within the relevant area using Python but I would be happy with just isolating the lighter dots and lines within the area and removing the background structure (in this case the cell).
With OpenCV I've tried histogram equalization (destroys the details), finding contours (didn't work), and using color ranges (too close in color?).
Any suggestions or guidance on other things to try? I don't believe I can get a higher res image so is this task possible with the rather difficult source?
(This is not a Python answer, since I never used the Python/OpenCV binding. The images below were created using Mathematica. But I just used basic image processing functions, so you should be able to implement that in Python on your own.)
A very general "trick" in image processing is to think about removing the thing you're looking for, instead of actually looking for it. Because often, removing it is much easier than finding it. You could, for instance, apply a morphological opening, a median filter or a Gaussian filter to it:
These filters effectively remove details smaller than the filter size, and leave the coarser structures more or less untouched. So you can just take the difference from the original image and look for local maxima:
(You'll have to play around with different "detail removal filters" and filter sizes. There's no way to tell which one works best with just one image.)
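Since the original figures were made in Mathematica, here is a rough NumPy sketch of the same idea, assuming a median filter as the "detail removal" step. The filter size, the threshold and the synthetic test image are placeholders you would have to tune, and the function names are made up for this example.

```python
import numpy as np

def median_filter(img, size=5):
    """Median over a size x size neighbourhood, with edge replication at the borders."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    h, w = img.shape
    shifted = [padded[dy:dy + h, dx:dx + w]
               for dy in range(size) for dx in range(size)]
    return np.median(np.stack(shifted), axis=0)

def bright_spots(img, size=5, threshold=0.2):
    """Coordinates of local maxima of (image - smoothed background) above threshold."""
    img = np.asarray(img, dtype=float)
    detail = img - median_filter(img, size)          # what the filter removed
    pad = np.pad(detail, 1, mode="constant", constant_values=-np.inf)
    h, w = detail.shape
    neighbours = np.stack([pad[dy:dy + h, dx:dx + w]
                           for dy in range(3) for dx in range(3)])
    is_peak = (detail >= neighbours.max(axis=0)) & (detail > threshold)
    return [(int(r), int(c)) for r, c in zip(*np.nonzero(is_peak))]

# Synthetic test: a smooth horizontal ramp as background plus three single-pixel dots
background = np.ones((30, 1)) * np.linspace(0.2, 0.6, 40)[None, :]
image = background.copy()
for r, c in [(5, 7), (15, 20), (22, 33)]:
    image[r, c] += 0.5
spots = bright_spots(image)
```

Counting the dots is then just `len(spots)`. On a real micrograph the noise will produce spurious maxima, so expect to adjust `size` and `threshold`, or to smooth slightly before taking the difference.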

Sparse Image matching in iOS

I am building an iOS app that, as a key feature, incorporates image matching. The problem is that the images I need to recognize are small 10x10 orienteering plaques with simple large text on them. They can be quite reflective and will be outside (so the light conditions will be variable). Sample image
There will be up to 15 of these types of image in the pool and really all I need to detect is the text, in order to log where the user has been.
The problem I am facing is that with the image matching software I have tried, aurasma and slightly more successfully arlabs, they can't distinguish between them as they are primarily built to work with detailed images.
I need to accurately detect which plaque is being scanned and have considered using GPS to refine the selection, but the only reliable way I have found is to get the user to manually enter the text. One of the key attractions we have based the product around is being able to detect these images that are already in place and not have to set up any additional material.
Can anyone suggest a piece of software that would work (i.e. is iOS friendly) or a method of detection that would be effective and interactive/pleasing for the user?
Sample environment:
http://www.orienteeringcoach.com/wp-content/uploads/2012/08/startfinishscp.jpeg
The environment can change substantially, basically anywhere a plaque could be positioned they are; fences, walls, and posts in either wooded or open areas, but overwhelmingly outdoors.
I'm not an iOS programmer, but I will try to answer from an algorithmic point of view. Essentially, you have a detection problem ("Where is the plaque?") and a classification problem ("Which one is it?"). Asking the user to keep the plaque in a pre-defined region is certainly a good idea. This solves the detection problem, which is often harder to solve with limited resources than the classification problem.
For classification, I see two alternatives:
The classic "Computer Vision" route would be feature extraction and classification. Local Binary Patterns and HOG are feature extractors known to be fast enough for mobile (the former more than the latter), and they are not too complicated to implement. Classifiers, however, are non-trivial, and you would probably have to search for an appropriate iOS library.
Alternatively, you could try to binarize the image, i.e. classify pixels as "plate" / white or "text" / black. Then you can use an error-tolerant similarity measure for comparing your binarized image with a binarized reference image of the plaque. The chamfer distance measure is a good candidate. It essentially boils down to comparing the distance transforms of your two binarized images. This is more tolerant to misalignment than comparing binary images directly. The distance transforms of the reference images can be pre-computed and stored on the device.
Personally, I would try the second approach. A (non-mobile) prototype of the second approach is relatively easy to code and evaluate with a good image processing library (OpenCV, Matlab + Image Processing Toolbox, Python, etc).
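A non-mobile prototype of the second approach could look roughly like the sketch below. It assumes the images are already binarized (True marks "text" pixels) and uses one common, asymmetric form of the chamfer distance: the mean of the reference distance transform sampled at the query's text pixels. The brute-force distance transform is only suitable for small images; the labels "A" and "B" and the stroke shapes are hypothetical.

```python
import numpy as np

def distance_transform(mask):
    """Euclidean distance from every pixel to the nearest True pixel (brute force)."""
    mask = np.asarray(mask, dtype=bool)
    fg = np.argwhere(mask)                            # (M, 2) foreground coordinates
    grid = np.indices(mask.shape).reshape(2, -1).T    # (N, 2) all pixel coordinates
    d = np.sqrt(((grid[:, None, :] - fg[None, :, :]) ** 2).sum(axis=2))
    return d.min(axis=1).reshape(mask.shape)

def chamfer_distance(query_mask, reference_dt):
    """Mean distance from each query 'text' pixel to the nearest reference 'text' pixel."""
    query_mask = np.asarray(query_mask, dtype=bool)
    return float(reference_dt[query_mask].mean())

def classify(query_mask, reference_dts):
    """Return the label whose precomputed distance transform gives the lowest distance."""
    return min(reference_dts,
               key=lambda k: chamfer_distance(query_mask, reference_dts[k]))

# Two toy "plaques": a vertical and a horizontal stroke
ref_a = np.zeros((16, 16), dtype=bool)
ref_a[3:13, 7:9] = True
ref_b = np.zeros((16, 16), dtype=bool)
ref_b[7:9, 3:13] = True
# The distance transforms can be computed once and stored on the device
references = {"A": distance_transform(ref_a), "B": distance_transform(ref_b)}

query = np.roll(ref_a, 1, axis=1)    # the vertical stroke, misaligned by one pixel
label = classify(query, references)
```

The one-pixel misalignment only raises the distance to the correct reference slightly, whereas a direct pixel-by-pixel comparison of the binary images would already disagree on half of the stroke.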
I managed to find a solution that is working quite well. It's not fully optimized yet, but I think it's just a matter of tweaking filters, as I'll explain later on.
Initially I tried to set up OpenCV, but it was very time consuming and had a steep learning curve. It did give me an idea, though. The key to my problem is really detecting the characters within the image and ignoring the background, which was basically just noise. OCR was designed exactly for this purpose.
I found the free library tesseract (https://github.com/ldiqual/tesseract-ios-lib) easy to use and with plenty of customizability. At first the results were very random, but applying a sharpening filter, a monochromatic filter and a color invert worked well to clean up the text. Next I marked out a target area on the UI and used that to cut out the rectangle of the image to process. Processing is slow on large images, and this cut the time dramatically. The OCR filter allowed me to restrict the allowable characters, and as the plaques follow a standard configuration, this improved the accuracy.
So far it's been successful with the grey background plaques, but I haven't found the correct filter for the red and white editions. My goal will be to add color detection and remove the need to feed in the data type.

JPEG artifacts detection

Are there known algorithms that can detect image degradation programmatically, without looking at the image?
I am thinking of obvious (visible) artifacts of lossy re-encoding, like color distortion, edge noise, blockiness, etc.
For example, images encoded from original source with JPEG quality 80 are fine.
I hope this is the right place to ask, but if moderators think that I should have asked at DSP Stack Exchange or similar, please relink.
There is a library called CMFD (Copy-Move Forgery Detection) which performs artifact detection and implements other algorithms to detect image forgery. It's freely available from http://www5.cs.fau.de/research/software/copy-move-forgery-detection/.
From the few tests I've made, it does detect forgeries pretty well, but there are a lot of false positives.
You need to evaluate methods for finding the artifacts that you define. Once you have those characterized, you'll need to code up each method to find those artifacts. These methods will probably best be employed on a difference image - the original (or intermediate) minus the encoded file. You'll probably have to analyze each color channel separately. The simplest would be a threshold - are any parts of the encoded image off by some threshold? For blockiness and edge noise, I imagine you'll probably use some kind of Hough transform to recognize shapes/lines in the difference image, and possibly a wavelet transform or something similar that can be tuned to particular frequency patterns to pick out ringing around edges.
Edit (in response to klo's comment):
Without a reference, I'm not sure that you'll be able to accomplish what you want. You can still try applying the techniques I mentioned on individual color channels. The hard part will be that without a reference, you won't necessarily be measuring any artifact, but rather image features too. You can still use some a priori information, such as the fact that any blockiness will be oriented exactly with the image frame - not rotated. Any real image would probably be unlikely to have many nicely blocky features completely oriented with a frame. You could also apply an edge-finding algorithm like difference of Gaussians or Canny edge detection, and then apply wavelet filters near the located edges to look for ringing.
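One way to use the frame-alignment prior is sketched below, as a rough no-reference heuristic rather than an established metric. It assumes 8x8 blocks aligned to the top-left corner (the usual JPEG layout) and compares the average pixel step across block boundaries with the average step elsewhere. The function name and the test data are made up for illustration.

```python
import numpy as np

def blockiness(gray, block=8):
    """Ratio of the mean absolute pixel step across block boundaries to the step elsewhere."""
    gray = np.asarray(gray, dtype=float)
    scores = []
    for axis in (0, 1):
        # One average step value per row transition (axis 0) or column transition (axis 1)
        step = np.abs(np.diff(gray, axis=axis)).mean(axis=1 - axis)
        idx = np.arange(step.size)
        on_boundary = (idx % block) == block - 1   # step between pixels 7|8, 15|16, ...
        scores.append(step[on_boundary].mean() / (step[~on_boundary].mean() + 1e-12))
    return float(np.mean(scores))

rng = np.random.default_rng(0)
clean = rng.random((64, 64))                     # unstructured test image, no block grid
# Crude stand-in for very heavy compression: keep only the mean of each 8x8 block
means = clean.reshape(8, 8, 8, 8).mean(axis=(1, 3))
blocky = np.kron(means, np.ones((8, 8)))

score_clean = blockiness(clean)
score_blocky = blockiness(blocky)
```

A ratio near 1 suggests no grid-aligned structure, while a ratio well above 1 hints at blocking. Real image content that happens to line up with the grid will also raise the score, which is exactly the ambiguity described above.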
For known programmable methods, see Python's scikits-image (which you know about from your post to their mailing list) or possibly OpenCV, which has Python bindings. I'm not familiar with Matlab's capabilities, but that would probably work as well.

Map image model to data

I'm working on software that checks if some laser-cut parts were cut correctly, using AutoCAD data as reference. I have parsed the DXF files, converted them to a BMP (and to an XML file that gives me all the information), and now I want to compare this to the real, acquired data.
I have applied enough preprocessing to get a reasonably thresholded, binary picture. This is, however, distorted (unfortunately, telecentric lenses are expensive, and the user places the object into a device, causing some translation, some scaling and a tiny amount of rotation, as in 1-2 degrees).
I have considered Hough transform, but memory is an issue. I have played around with bounding box transformation, but the unknown shape makes this hard. I've read about TILT (no symmetry) and registration algorithms, but I'd like to get another opinion.
I'm looking for some papers, some ideas, some pointers on how to go on.
Thanks.
The first step is to undistort the image (see camera calibration; ignore the 3D part).
Then think about the shape matching. Depending on how small the error you are trying to find is, this could be very easy or very, very difficult, but those links should get you started.
You may want to look at features that can discriminate the two. Are there simple features that can accurately distinguish a properly cut piece vs. an incorrectly cut piece? If so, you can use the same idea as the Hough transform/template matching, but reducing the template to certain distinguishing features (edges, corners, etc.) to reduce the memory required.
You may want to look at the SIFT/SURF features that aim to match images by a certain set of features while being invariant to the rotation and scale of the objects within the image. There are libraries out there that implement these features (shown on the SURF page).
This, however, won't help with the distortion. If you're using the same camera for all images, then you should be able to de-skew them accordingly.

Genetic algorithms for image processing project

I'm thinking of starting a project for school where I'll use genetic algorithms to optimize digital sharpening of images. I've been playing around with unsharp masking (USM) techniques in Photoshop. Basically, I want to create software that optimizes the parameters (i.e. blur radius, types of blur, blending the image) to create the "best-fit" set of filters.
I'm sort of quickly planning this project before starting it, and I can't think of a good fitness function for the 'selection' part. How would I determine the 'quality' of the filter sets, or measure how sharp the image is?
Also, I will be programming in Python (with the Python Imaging Library) since it's the only language I'm proficient with. Should I learn a low-level language instead?
Any advice/tips on anything is greatly appreciated. Thanks in advance!
tl;dr How do I measure how 'sharp' an image is?
If it's for tuning parameters, you could take a known image and apply a known blurring/low-pass filter. Then sharpen this with your GA+USM algorithm. Calculate your fitness function making use of the original image, e.g. something as simple as the mean absolute error. You may need to create different datasets, e.g. landscape images (mostly sharp, in focus with large depth of field), portrait images (could have large areas deliberately out of focus and "soft"), along with low-noise and noisy images. Sharpening noisy images is actually quite a challenge.
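A minimal sketch of such a fitness function follows, under a few assumptions: a box blur stands in for whatever blur the real pipeline uses, images are float arrays in [0, 1], and a tiny exhaustive search stands in for the genetic algorithm. The function names and the parameter grid are hypothetical.

```python
import numpy as np

def box_blur(img, radius):
    """Separable box blur with edge replication; radius 0 returns a copy of the image."""
    if radius == 0:
        return img.copy()
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    h, w = img.shape
    rows = sum(padded[dy:dy + h, :] for dy in range(k)) / k   # vertical pass
    return sum(rows[:, dx:dx + w] for dx in range(k)) / k     # horizontal pass

def unsharp_mask(img, radius, amount):
    """Classic USM: add back 'amount' times the detail removed by the blur."""
    return np.clip(img + amount * (img - box_blur(img, radius)), 0.0, 1.0)

def fitness(params, sharp, degraded):
    """Negative mean absolute error against the known sharp image (higher is better)."""
    radius, amount = params
    return -float(np.abs(unsharp_mask(degraded, radius, amount) - sharp).mean())

y, x = np.mgrid[0:64, 0:64]
sharp = 0.5 + 0.2 * np.sin(0.8 * x) * np.cos(0.6 * y)   # known "sharp" reference
degraded = box_blur(sharp, 1)                            # known degradation

# Stand-in for the GA: score a small grid of (radius, amount) candidates
candidates = [(r, a) for r in (1, 2, 3) for a in (0.0, 0.5, 1.0, 1.5, 2.0)]
best = max(candidates, key=lambda p: fitness(p, sharp, degraded))
```

In a real GA, `fitness` would be evaluated for each individual in the population, ideally averaged over several reference images of different types.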
It would definitely be worth taking a look at Bruce Fraser's work on sharpening techniques for Photoshop etc.
Also, it might be worth checking out Imatest (www.imatest.com) to see if there is anything regarding sharpness/resolution. You might also consider resolution charts.
Finally, I seriously doubt one set of ideal parameters exists for USM; the optimum parameters will be image dependent and indeed a matter of personal preference (that's why I suggest starting from a known sharp image and blurring it). Understanding the type of image is probably just as important, and is in itself a very interesting and challenging problem. Perhaps basic heuristics like image variance and an edge histogram would reveal suitable clues, though.
Anyway, just a thought. Hopefully some of the above is useful.
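For what it's worth, the variance and edge-histogram heuristics can be computed in a few lines. This is only a sketch: the gradient-magnitude histogram is one possible reading of "edge histogram", and the bin count, value range and test images are arbitrary choices for the example.

```python
import numpy as np

def sharpness_clues(gray, bins=8):
    """Return (variance, normalized histogram of gradient magnitudes)."""
    gray = np.asarray(gray, dtype=float)
    gy, gx = np.gradient(gray)
    magnitude = np.hypot(gx, gy)
    hist, _ = np.histogram(magnitude, bins=bins, range=(0.0, 1.0))
    return float(gray.var()), hist / hist.sum()

rng = np.random.default_rng(1)
sharp = rng.random((64, 64))
# Soften with a 3x3 circular mean (np.roll wraps around, which is fine for a toy example)
soft = sum(np.roll(sharp, (dy, dx), axis=(0, 1))
           for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0

var_sharp, hist_sharp = sharpness_clues(sharp)
var_soft, hist_soft = sharpness_clues(soft)
```

The softened image has a lower variance and more of its gradient magnitudes in the lowest histogram bin. Both numbers depend heavily on image content, so they are better suited to comparing versions of the same image than to judging unrelated images.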

Resources