Perceptual Image Comparison - image-processing

I'm trying to do image comparison to detect changes in a video processing application. These are two images that look identical to me, but are different according to both
http://pdiff.sourceforge.net/
and http://www.itec.uni-klu.ac.at/lire/nightly/api/net/semanticmetadata/lire/imageanalysis/LireFeature.html
Can anyone explain the difference? Eventually I need to find a library that can detect differences without producing false positives.

The two images are different.
I used GIMP (open source) to stack the two images one on top of the other and set the top layer's mode to Difference. It showed a very faint black image, i.e. very little difference. I then used Curves to raise the tones, and it revealed what seem to be JPEG artifacts, even though the files given are PNG. I recommend GIMP; I sometimes use it instead of Photoshop.
Using GIMP to do a blink comparison between layers at 400% view, I would guess that the first image is closer to the original. The second may be a saved copy of the first, or made from the original but saved at a lower quality setting.
It seems that the metadata has been stripped off both images (I haven't taken a definitive look), so no clues there.
There was a program called Unique Filer that I used for years. It is tunable and rather good. But any comparator is likely to generate a number of false positives if you tune it well enough to make sure it doesn't miss duplicates. If you only want to catch images that are very similar like this pair, then you can tune it very tightly. It is old and may not work on Windows 7 or later.
I would like to find good image checkers / comparators too. I've considered writing my own program.
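If you do write your own, the GIMP difference-layer check is easy to reproduce programmatically. Here is a minimal sketch with Pillow (the filenames are placeholders, and both images are assumed to have the same dimensions):

```python
# A minimal sketch of the GIMP "difference layer" trick in Python/Pillow.
# Filenames are placeholders; assumes both images have the same dimensions.
from PIL import Image, ImageChops

a = Image.open("image1.png").convert("RGB")
b = Image.open("image2.png").convert("RGB")

diff = ImageChops.difference(a, b)

# Like raising the tones with Curves: check the brightest difference pixel.
# A small but nonzero maximum means "visually identical but not bit-identical",
# which is exactly what faint JPEG-style artifacts look like.
print("min/max per-pixel difference:", diff.convert("L").getextrema())

# Amplify the faint differences so they become visible, as Curves would.
diff.point(lambda v: min(255, v * 20)).save("diff_amplified.png")
```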

Related

How to generate an image using parts of another image?

Before clarifying my question, please just consider these two generative portraits by Sergio Albiac:
Since I really like this kind of portrait, I wanted to find a way of producing them myself.
I don't have much for now; the only things I can deduce from these examples are:
- each portrait takes at least two inputs: one target image (the portrait) and one or more source images (pictures of text) whose parts are used to generate a stylized portrait
- matching the parts from the source images with the target image is done using template matching
What I'd like to know is how to proceed, what things to learn and look for? What other concepts should I consider before trying to make this work?
Cheers
The Cover Maker plugin for Fiji/ImageJ does a similar thing.
It first builds a database of your source images, indexed by color/intensity. These source images are then used to build your target image. (Unlike your example images, though, it only works with a constant tile size throughout the image.)
Have a look at the Python source code for details.
EDIT: If you want to avoid the constant tile size, you could use e.g. a quadtree segmentation or a k-means segmentation to get regions of similar intensity/texture in your target image, and then do the template matching for the segmented regions.
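If you want to experiment before reading the plugin source, here is a rough sketch of the constant-tile-size idea, matching tiles by mean colour only (no template matching; the paths and the tile size are illustrative):

```python
# Rough sketch of the Cover Maker idea with a constant tile size:
# index source tiles by mean colour, then replace each target tile with
# the closest source tile. Paths and tile size are illustrative.
from pathlib import Path
from PIL import Image

TILE = 20  # constant tile size, as in the Cover Maker plugin

def mean_rgb(img):
    # Downscaling to 1x1 with a box filter averages all pixels for us.
    return img.resize((1, 1), Image.BOX).getpixel((0, 0))

# 1) Build the database: one (mean colour, tile image) entry per source image.
database = []
for path in Path("sources").glob("*.png"):
    tile = Image.open(path).convert("RGB").resize((TILE, TILE))
    database.append((mean_rgb(tile), tile))

# 2) Walk the target in TILE-sized blocks and paste the best-matching tile.
target = Image.open("portrait.png").convert("RGB")
out = Image.new("RGB", target.size)
for y in range(0, target.height - TILE + 1, TILE):
    for x in range(0, target.width - TILE + 1, TILE):
        block = target.crop((x, y, x + TILE, y + TILE))
        r, g, b = mean_rgb(block)
        best = min(database,
                   key=lambda e: (e[0][0]-r)**2 + (e[0][1]-g)**2 + (e[0][2]-b)**2)
        out.paste(best[1], (x, y))
out.save("mosaic.png")
```

Swapping the mean-colour match for actual template matching over the segmented regions would get you closer to the Albiac look.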

Besides standard/progressive, the 3rd kind of JPEG compression: load by channel?

This question might be an "Open Question" and many of you might be eager to close it, but please don't. Let me explain.
As we all know, JPEG has two kinds of compression (at least in Photoshop's save dialog):
- optimized, where the image loads roughly line by line
- progressive, where the image first loads mosaic-like, then gets progressively better until the original resolution
I have read a lot of PNG/JPEG optimization articles before, but now I've encountered this awesome third kind of compression, via a random Google Image search. The JPEG in question is this:
http://storage.googleapis.com/marc-pres/boston-event-1012/images/google-data-center.jpg
Try loading the link in Chrome/Firefox (IE/Safari display the image only once it has fully loaded).
You can observe:
- the image loads first in black & white
- then what looks like the red channel loads
- next the green channel loads
- last the blue channel loads
I tried loading it again over an emulated very slow connection, and observed that the JPEG not only loads channel by channel, but progressively as well. So the first thing loaded is a black-and-white mosaic, then a green-ish mosaic, then gradually a full-color mosaic, and finally the full-resolution, full-color image.
This is amazing technology. Suppose you are building an e-magazine where each page has a lot of pictures and you want the user to flip quickly through the pages; this kind of image is exactly what works best. For a fast preview, load the black-and-white thumbnail; if the user stays, fully load the original image.
So my question is: how could I generate such an image using Python's Pillow or ImageMagick, or any kind of open source software?
If you think this question is inappropriate, please comment; don't just close it.
Update 1:
It turns out Google uses this technology in all of its JPEG pictures.
Update 2: I found another clue:
The image data in a JPEG file can be sliced up in many different ways, and the slices (or "scans" as they're usually called) can be stored in the file in many different orders.
In most JPEG files, the first scan in the file contains all of the image's color components, interleaved together if it is a color image. In a non-progressive JPEG, the file will contain just that one scan. In a progressive JPEG, other scans will follow, each of which may contain one component or multiple components.
But there's nothing that requires it to be done that way. If the first scan in the file does not contain all the color components, we might call such a file "non-interleaved".
Your example files are non-interleaved, and they are also progressive. Progressive non-interleaved JPEGs seem to be more widely supported than non-progressive non-interleaved JPEGs.
The standard IJG libjpeg software is capable of creating non-interleaved files. It's not exactly easy, but you can use its cjpeg utility with the -scans option documented in the wizard.txt file.
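As far as I know, Pillow can't emit custom scan scripts, but you can drive cjpeg from Python. A sketch, assuming cjpeg is on your PATH and the input has been converted to PPM (cjpeg reads PPM/BMP, not JPEG); the scan script follows the wizard.txt format, and the exact scan ordering below is my guess at reproducing the luma-first effect:

```python
# Sketch: generate a progressive, non-interleaved JPEG by driving the IJG
# cjpeg "scan script" feature. Assumes cjpeg is on PATH and that the input
# is a PPM file.
import subprocess

# Each line is one scan: "components : Ss Se Ah Al ;" (see libjpeg's
# wizard.txt). Component 0 is luma (Y), 1 and 2 are the chroma channels,
# which is why the image appears grayscale first and gains colour later.
SCAN_SCRIPT = """\
0: 0 0 0 0;    # DC of Y only -> grayscale preview appears first
1: 0 0 0 0;    # DC of Cb
2: 0 0 0 0;    # DC of Cr
0: 1 63 0 0;   # AC of Y -> full-resolution luma detail
1: 1 63 0 0;   # AC of Cb
2: 1 63 0 0;   # AC of Cr -> full colour arrives last
"""

def make_noninterleaved_jpeg(ppm_in: str, jpg_out: str) -> None:
    with open("scans.txt", "w") as f:
        f.write(SCAN_SCRIPT)
    # cjpeg writes the JPEG to stdout when no -outfile is given.
    with open(jpg_out, "wb") as out:
        subprocess.run(["cjpeg", "-scans", "scans.txt", ppm_in],
                       stdout=out, check=True)

make_noninterleaved_jpeg("input.ppm", "output.jpg")
```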

OpenCV Issue of Image Subtraction?

I am trying to subtract two images using the function cvAbsDiff(img1, img2, dest);
It works, but sometimes when I bring my hand in front of my head or body, the hand is not clear and the background comes into the picture: the background image (head) overlays my foreground (hand).
It works correctly on plain surfaces, i.e. when the background is even, like a wall.
Please check out my image so that you can better understand my problem:
http://www.2shared.com/photo/hJghiq4b/bg_overlays_foreground.html
If you have any solution/hint, please help.
There's nothing wrong with your code. Background subtraction is not a preferred way to do motion detection or silhouette detection because it's not very robust. The problem arises because the background and the foreground are similar in colour in many regions, which on subtraction pushes the foreground to the back. You might try:
- optical flow for motion detection
- if your task is just detecting a silhouette or hand, training a HOG classifier for it
If you do not want to try a new approach, you could play with the threshold value (in your case 30). When you subtract regions of similar colour, their difference is less than 30, so the later thresholding at 30 just blacks them out. You could also try HSV or some other colour space.
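To illustrate the threshold advice (translated to the modern cv2 API rather than the legacy cvAbsDiff call; the value 30 is the threshold from the question, and the filenames are placeholders):

```python
# Sketch of the threshold advice with the modern cv2 API.
import cv2

background = cv2.imread("background.png")
frame = cv2.imread("frame.png")

# The cv2 equivalent of cvAbsDiff(img1, img2, dest).
diff = cv2.absdiff(frame, background)
gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)

# Anything differing by less than the threshold goes black, which is why
# skin-in-front-of-skin regions (small colour difference) vanish from the mask.
_, mask = cv2.threshold(gray, 30, 255, cv2.THRESH_BINARY)
cv2.imwrite("mask.png", mask)

# Variant: difference in HSV, so hue/saturation changes count even when
# brightness is similar; the max over channels keeps any kind of change.
hsv_a = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
hsv_b = cv2.cvtColor(background, cv2.COLOR_BGR2HSV)
hsv_diff = cv2.absdiff(hsv_a, hsv_b).max(axis=2)
_, hsv_mask = cv2.threshold(hsv_diff, 30, 255, cv2.THRESH_BINARY)
cv2.imwrite("mask_hsv.png", hsv_mask)
```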
Putting in the relevant code would help, as would knowing what you're actually trying to achieve.
Which two images are you subtracting? I've done subtraction of subsequent images (images taken a fraction of a second apart), and background subtraction generally yields the edges of moving objects, for example the edges of a hand, not the entire silhouette of a hand. I'm guessing you're taking the difference of the current frame and a static startup frame. It's possible that parts aren't different enough (skin + skin).
I've got some computer problems tonight; I'll test it out tomorrow (please put up at least the steps you actually carry out) and let you know.
I'm still not sure what your ultimate goal is, although I'm guessing you want to do some gesture-recognition (since you have a vector called "fingers").
As Manpreet said, your biggest problem is robustness, and that comes from the subjects having similar colour.
I reproduced your image by having my face in the static comparison image, then moving it. If I started with only the background, it was already much more robust and in any case didn't display any "overlaying".
The quick fix: make sure you have a clean, subject-free static image.
Otherwise, you'll want a dynamic comparison image; the simplest would be comparing frame_n with frame_n-1. This will generally give you just the moving edges though, so if you want the entire silhouette you can either:
1) Use a different segmentation algorithm (this is what I recommend: background subtraction is fast, and you can use it to determine a much smaller ROI in which to search, then apply a more robust algorithm for segmentation).
2) Try a compromise between the static and dynamic comparison image, for example an average of the past 10 frames or something like that. I don't know how well this works, but it would be quite simple to implement and worth a try :).
Also, try with CV_THRESH_OTSU instead of 30 for your threshold value, and see if you like that better.
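Here's a sketch of suggestion 2) combined with the Otsu threshold, using the modern cv2 API (the camera index and the averaging weight are placeholder choices):

```python
# Sketch combining a running-average comparison image with an Otsu threshold
# instead of the fixed value 30. Camera index and weight are placeholders.
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
avg = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if avg is None:
        avg = gray.astype(np.float32)
    # alpha = 0.1 weights roughly the last ~10 frames, as suggested above.
    cv2.accumulateWeighted(gray, avg, 0.1)

    diff = cv2.absdiff(gray, cv2.convertScaleAbs(avg))
    # Otsu picks the threshold from the histogram (the CV_THRESH_OTSU idea).
    _, mask = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    cv2.imshow("mask", mask)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break

cap.release()
```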
Also, I noticed the output often flares (regions which haven't changed switch from black to white). Checking with the live stream, I'm quite certain it's because the webcam is autofocusing/adjusting white balance etc. If you're getting that too, turning off the autofocus etc. should help (which, by the way, isn't done through OpenCV but depends on the camera; possibly check this: How to programatically disable the auto-focus of a webcam?)

Improve OCR accuracy from scanned documents

I'm scanning a lot of A3 documents using a standard Brother A3 multifunction printer and then using FineReader Pro to OCR the images.
However, I'm getting a lot of errors in the characters recognized, and lots of non-alphanumeric strange characters.
Can someone give me any tips for programmatically improving the OCR accuracy, either pre-processing on the scanned images, or post-processing on the recognized text?
Edit: I've attached a sample PDF. It includes some sample images from which I get the poorest results.
Do you have a sample image you can post somewhere? Then we can quickly tell you what is causing most of your problems. FineReader is one of the better OCR engines out there, so there are definitely reasons why you are getting poor results.
It could be related to poor contrast and threshold settings, image skew, dirty rollers in the scanner, complex and coloured backgrounds, dithered backgrounds, font sizes that are too small, scanning DPI that is too low, etc.
After seeing the attached image, there are a few issues:
1. There are lots of dirty specks on the background page. FineReader seems to do a reasonable job with these on your images.
2. There is some slight skew, but that is not causing any problems.
3. FineReader is getting confused by the bold, tall Arial-type font used for the column headers.
4. A big problem is the bottom region of the pages, where the contrast is poor and the image is fuzzy. This seems to be a problem with the scanner, but could be due to printing problems.
The printing is quite poor, and I am guessing it is a scan from a newspaper. Most of your errors are due to scanning issues, so it would be hard to programmatically improve the results.
Firstly, I would try scanning in grayscale at a slightly higher resolution and see if that helps; FineReader works well with grayscale images. If you must have a B/W image, see if the scanner driver includes a setting for dynamic thresholding and turn it on.
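For the pre-processing half of your question, here is a sketch of the above ideas with OpenCV (grayscale, upscaling standing in for a higher-dpi rescan, despeckling, and a dynamic threshold; all parameter values are guesses you would need to tune for these scans):

```python
# Sketch of the suggested pre-processing: grayscale, higher resolution,
# denoising, and an adaptive ("dynamic") threshold. Parameter values are
# guesses to tune against your documents.
import cv2

img = cv2.imread("page3.png", cv2.IMREAD_GRAYSCALE)

# Upscale, standing in for rescanning at a higher dpi.
img = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)

# Remove the dirty specks on the background.
img = cv2.fastNlMeansDenoising(img, h=10)

# An adaptive threshold copes with the poor-contrast, fuzzy bottom regions
# far better than a single global threshold would.
bw = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                           cv2.THRESH_BINARY, 31, 15)
cv2.imwrite("page3_clean.png", bw)
```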
Your images would not be an easy task for any OCR engine. You will get better results if you can improve the scanning. Page 3 has a lot of noise in the bottom right corner.
What version of FineReader are you using? FR10 would probably give better results than previous versions.

How do you scale an image for print without degrading the quality?

I was wondering, how would you print an image scaled to three times its original size without making it look like crap? Just changing the DPI to 300 and printing still looks bad. Is there a way to convert it gracefully?
You may have the problem of trying to add detail that isn't there. Hopefully you're aware of this.
The best way to enlarge an image that I know of is to use bicubic interpolation. If it's any help, Photoshop recommends using 'bicubic smoother' for enlargement.
Also, be careful with DPI vs PPI.
This is called supersampling or interpolation. There's no "perfect" algorithm, since that would imply generating new information where there was none ("between" the pixels); but some methods are better than others at fooling the eye/brain into filling the voids, or at least at not producing big square boxes.
Start with the Wikipedia articles on nearest-neighbor, bilinear and bicubic interpolation (the three offered by Photoshop). A few more, such as tricubic interpolation and Lanczos resampling, could also be of interest; check the theory and comparison links too.
In short, this isn't a cut-and-dried issue, but an active research field, full of subjectivity and practical trade-offs.
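To compare the options yourself, here is a small Pillow sketch that produces the same 3x enlargement with each of the common filters (the filenames are placeholders):

```python
# Small Pillow sketch: enlarge an image 3x with each common resampling
# filter so you can compare them side by side. Filenames are placeholders.
from PIL import Image

img = Image.open("original.png")
w, h = img.size

filters = {
    "nearest": Image.NEAREST,    # big square boxes
    "bilinear": Image.BILINEAR,  # smoother, somewhat blurry
    "bicubic": Image.BICUBIC,    # Photoshop's recommendation for enlarging
    "lanczos": Image.LANCZOS,    # sharper, can ring around hard edges
}
for name, f in filters.items():
    img.resize((w * 3, h * 3), resample=f).save(f"scaled_{name}.png")
```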
You could vectorize your image, scale it, and, if you wish, convert it back to the original format (JPG, GIF, PNG...). However, this works best for simple images.
Do you know how to vectorize? There are some sites that do it online; just do some Google research and you'll find some.
Changing the DPI won't matter if you don't have enough pixels in your image for the size you are printing: a 10-inch-wide print at 300 DPI needs 3000 pixels of width. In the biz it's called GIGO (Garbage In, Garbage Out).
If your image is in HTML then create a media="print" stylesheet and feed a high-res image that way.
