image similarity algorithm for charts

for automating the tests of a legacy Windows application I need to compare screenshots of charts.
Pixel comparison works fine as long as the Windows session opens with the same resolution, DPI, color depth, font size, font family, etc.; otherwise, the screenshot taken during the test may differ slightly from the one recorded during the development of the test.
Therefore, I am looking for a method that allows slight variations and produces a score rather than a boolean.
I started by scaling the retrieved screenshot to match the size of the recorded one; of course, pixel comparison then fails.
Then I tried SSIM to get a similarity score (using https://github.com/rhys-e/structural-similarity). It definitely does not work for my case -- see the simplified experiment below.
Any suggestions?
Thanks in advance,
Adrian.
SSIM experiments
This is the reference picture:
This one contains a black line slightly above the one in the reference --> score 0.9447093986742424
This one is completely different --> score 0.9516260505445076
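For what it's worth, a minimal sketch of the kind of comparison I mean, using scikit-image's SSIM rather than the Java library above (the file names and window size are placeholders):

    # Minimal SSIM sketch with scikit-image; file names are placeholders.
    import cv2
    from skimage.metrics import structural_similarity as ssim

    ref = cv2.imread("reference.png", cv2.IMREAD_GRAYSCALE)
    test = cv2.imread("screenshot.png", cv2.IMREAD_GRAYSCALE)

    # Scale the retrieved screenshot to the recorded size before comparing.
    test = cv2.resize(test, (ref.shape[1], ref.shape[0]), interpolation=cv2.INTER_AREA)

    # score is in [-1, 1]; a larger odd win_size lets thin chart lines
    # influence a bigger neighborhood, which may separate the cases above.
    score, diff = ssim(ref, test, win_size=11, full=True)
    print("SSIM:", score)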

Related

How to generate an image using parts of another image?

Before clarifying my question, please just consider these two generative portraits by Sergio Albiac:
Since I really like this kind of portraits I wanted to find a way of producing them myself.
I don't have much for now; the only things I can deduce from these examples are:
- each portrait takes at least two inputs: one target image (the portrait) and one or more source images (pictures of text) whose parts are used to generate a stylized portrait
- matching the parts from the source images with the target image is done using template matching
What I'd like to know is how to proceed: what things should I learn and look for? What other concepts should I consider before trying to make this work?
Cheers
The Cover Maker plugin for Fiji/ImageJ does a similar thing.
It first builds a database from your source images, indexed by color/intensity. These source images are then used to build your target image. (Unlike your example images, though, it only works with a constant tile size throughout the image.)
Have a look at the python source code for details.
EDIT: If you want to avoid the constant tile size, you could use e.g. a quadtree segmentation or a k-means segmentation to get regions of similar intensity/texture in your target image, and then do the template matching for the segmented regions.
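A minimal sketch of the constant-tile-size idea (this is not Cover Maker's actual code; the paths and tile size are placeholders):

    # Index source tiles by mean RGB, then build the target from nearest tiles.
    import glob
    import numpy as np
    from PIL import Image

    TILE = 32  # constant tile size, as in Cover Maker
    sources = []
    for path in glob.glob("sources/*.png"):  # placeholder folder
        img = Image.open(path).convert("RGB").resize((TILE, TILE))
        arr = np.asarray(img, dtype=np.float32)
        sources.append((arr.mean(axis=(0, 1)), arr))  # (mean RGB, pixels)

    target = np.asarray(Image.open("portrait.png").convert("RGB"), dtype=np.float32)
    h = (target.shape[0] // TILE) * TILE
    w = (target.shape[1] // TILE) * TILE
    out = np.zeros((h, w, 3), dtype=np.uint8)

    for y in range(0, h, TILE):
        for x in range(0, w, TILE):
            mean = target[y:y+TILE, x:x+TILE].mean(axis=(0, 1))
            # nearest source tile in mean-RGB space
            best = min(sources, key=lambda s: np.sum((s[0] - mean) ** 2))
            out[y:y+TILE, x:x+TILE] = best[1].astype(np.uint8)

    Image.fromarray(out).save("mosaic.png")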

Perceptual Image Comparison

I'm trying to do image comparison to detect changes in a video processing application. These are two images that look identical to me, but are different according to both
http://pdiff.sourceforge.net/
and http://www.itec.uni-klu.ac.at/lire/nightly/api/net/semanticmetadata/lire/imageanalysis/LireFeature.html
Can anyone explain the difference? Eventually I need to find a library that can detect differences that doesn't have any false positives.
The two images are different.
I used GIMP (open source) to stack the two images one on top of the other and set the top layer's mode to Difference. It showed a very faint black image, i.e. very little difference. I then used Curves to raise the tones, and it revealed what appear to be JPEG artifacts, even though the given files are PNG. I recommend GIMP, and I sometimes use it instead of Photoshop.
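The same difference-and-amplify trick can be scripted if you need it in a pipeline; a minimal sketch assuming both PNGs have identical dimensions:

    # Subtract the two images and boost the faint residual so that
    # JPEG-style artifacts become visible; file names are placeholders.
    import numpy as np
    from PIL import Image

    a = np.asarray(Image.open("image1.png").convert("RGB"), dtype=np.int16)
    b = np.asarray(Image.open("image2.png").convert("RGB"), dtype=np.int16)

    diff = np.abs(a - b)                  # very faint, near-black image
    boosted = np.clip(diff * 16, 0, 255)  # crude "curves" boost
    print("max per-channel difference:", diff.max())

    Image.fromarray(boosted.astype(np.uint8)).save("diff_boosted.png")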
Using GIMP to do a blink comparison between the layers at 400% zoom, I would guess that the first image is closer to the original. The second may be a saved copy of the first, or derived from the original but saved at a lower quality setting.
It seems that the metadata has been stripped from both images (I haven't looked definitively), so no clues there.
There was a program called Unique Filer that I used for years. It is tunable and rather good. But any comparator is likely to generate a number of false positives if you tune it well enough to make sure it doesn't miss duplicates. If you only want to catch images that are very similar like this pair, then you can tune it very tightly. It is old and may not work on Windows 7 or later.
I would like to find good image checkers / comparators too. I've considered writing my own program.

Improve OCR accuracy from scanned documents

I'm scanning a lot of A3 documents using a standard Brother A3 Multifunction and then use FineReader Pro for OCR'ing the images.
However, I'm getting a lot of errors in the characters recognized, and lots of non-alphanumeric strange characters.
Can someone give me any tips for programmatically improving the OCR accuracy, either pre-processing on the scanned images, or post-processing on the recognized text?
Edit: See the sample PDF; it includes some sample images from which I get the poorest results.
Do you have a sample image you can post somewhere? Then we can quickly tell you what is causing most of your problems. FineReader is one of the better OCR engines out there, so there are definitely reasons why you are getting poor results.
It could be related to poor contrast and threshold settings, image skew, dirty rollers in the scanner, complex and coloured backgrounds, dithered backgrounds, font sizes too small, scanning DPI too low, etc.
After seeing the attached image, there are a few small issues:
1. There are lots of dirty specks on the background page. FineReader seems to do a reasonable job with this on your images.
2. There is some slight skew, but that is not causing any problems.
3. FineReader is getting confused by the bold, tall Arial-type font used for the column headers.
4. A big problem is the bottom region of the pages, where the contrast is poor and the image is fuzzy. This seems to be a problem with the scanner but could be due to printing problems.
The printing is quite poor and I am guessing it is a scan from a newspaper. Most of your errors are due to scanning issues so it would be hard to programmatically improve the results.
Firstly, I would try scanning the image in grayscale using a slightly higher resolution and see if that helps. FineReader works well with grayscale images. If you have to have a B/W image then see if the scanner driver includes a setting for dynamic thresholding and turn it on.
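If the driver has no such setting, dynamic thresholding can be approximated in software. A minimal OpenCV sketch; the file name, block size, and constant are assumptions to tune against your scans:

    import cv2

    gray = cv2.imread("scan_page.png", cv2.IMREAD_GRAYSCALE)  # placeholder
    gray = cv2.medianBlur(gray, 3)  # knock out the dirty specks first

    # A locally computed threshold copes with the low-contrast, fuzzy
    # bottom region better than a single global threshold would.
    bw = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, blockSize=51, C=15)
    cv2.imwrite("scan_page_bw.png", bw)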
Your images would not be an easy task for any OCR engine. You will get better results if you can improve the scanning. Page 3 has a lot of noise in the bottom right corner.
What version of FineReader are you using? FR10 would probably give better results than previous versions.

Generate font from an image of text

Is it possible to generate a specific set of fonts from the image given below?
My idea is to generate a specific font for the given image of text by manually selecting portions of the image and mapping them to a set of letters, generating the font from those, and then using this font to make the image readable for an OCR. Is generating a font possible with any open-source implementation? Also, please suggest any good OCRs.
Abbyy FineReader 10 gets better than expected results but predictably gets confused when the characters touch.
Your problem is that the line spacing is too small. The descenders of each line overlap the character bounding boxes of the characters in the line directly below. This makes character segmentation almost impossible because the characters are touching and overlapping. The number of combinations of overlapping characters is virtually impossible to train for. The 'g' and 'y' characters are the worst offenders.
A double line spaced version of this would probably OCR reasonably well.
A custom solution that segmented and separated each line, combined with a good dictionary, would definitely improve the results. There would still be some errors to correct manually, though. The custom routine would have to deal with the ascenders and descenders and try to segment the image into lines which can then be fed to a decent OCR engine. One way would be to analyse every character blob on the page and allocate it to a line. Leptonica (www.leptonica.com - C imaging library) would probably make this job a little easier.
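A rough sketch of that blob-to-line allocation using OpenCV's connected components instead of Leptonica (the line-break heuristic and file name are assumptions):

    import cv2

    img = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)  # placeholder
    _, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Every connected dark blob is (roughly) a character or character cluster.
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(bw)
    blobs = sorted(range(1, n), key=lambda i: centroids[i][1])  # sort by centroid y

    lines, current, last_y = [], [], None
    for i in blobs:
        y = centroids[i][1]
        # start a new line when the centroid jumps by more than half a blob height
        if last_y is not None and y - last_y > stats[i, cv2.CC_STAT_HEIGHT] * 0.5:
            lines.append(current)
            current = []
        current.append(i)
        last_y = y
    if current:
        lines.append(current)

    print("found", len(lines), "candidate text lines")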
I would not try this without increasing the resolution to 200 or 300 dpi first.
With this custom solution, training a font becomes an option if the OCR engine does a poor job initially.
Abbyy (www.abbyy.com) or Google Tesseract OCR 3.00 would be a good place to start.
No guarantees as to whether all of this will work, though. This is quite a difficult page to OCR, and you need to work out whether it is better to have it typed up manually overseas. That depends on the number of pages you need to process.

character matching in grayscale image

I made patterns: images of the letter "A" at different sizes (from 12 to 72: 12, 14, ..., 72).
I tested the pattern-matching method and it gave good results.
One way to select text regions from an image is to run that algorithm for all lowercase and uppercase letters and digits at different sizes. And fonts!
I don't like that approach. Instead, I want to make something like a universal pattern, or better: scan the image with windows of different sizes and select those regions where some function (the probability that there is a character in that window) exceeds some fixed value.
Do you know any methods or ideas to make that function?
It must work on the original (grayscale) image.
I suppose you are developing an OCR, right?
You have decided to go quite an unusual way, since everyone else does matching on bi-tonal images. That makes everything much simpler: once you have binarized the image properly (which is a very difficult task in itself), you do not have to deal with different brightness levels, uneven backgrounds, etc., and of course fewer computational resources are needed. However, if doing everything in grayscale is actually your goal and you want to show other OCR scientists that it is doable -- well, I wish you good luck then.
The letter-location approach you described is extremely computation intensive: you have to scan the whole image (image_size^2), match against the pattern at every position (* pattern_size^2), and then repeat for every pattern (* pattern_num). This would be incredibly slow.
Instead, try to simplify your algorithm by breaking it into two stages. The first should look for features in the picture (such as connected dark regions, or splitting the image into large squares and throwing away all the light ones), and only then perform pattern matching on the small number of areas found. This is still at least N^2, but you could try to reduce the complexity further by working on rows or columns of the image first (by creating a projection histogram). There are a lot of different simplification methods you can play with.
Once you have located those objects in the picture and are about to match patterns against them, you actually know their size, so you don't have to store the letter A at every size; you can just rescale the cropped object to one fixed size, say 72, and match against that.
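A minimal sketch of that rescale-then-match step with OpenCV's normalized template matching (the pattern file and bounding box are placeholders):

    import cv2

    PATTERN_SIZE = 72
    pattern_A = cv2.imread("A_72.png", cv2.IMREAD_GRAYSCALE)  # one stored "A"
    pattern_A = cv2.resize(pattern_A, (PATTERN_SIZE, PATTERN_SIZE))
    image = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)

    x, y, w, h = 100, 50, 40, 48  # bounding box from the first (feature) stage
    crop = cv2.resize(image[y:y+h, x:x+w], (PATTERN_SIZE, PATTERN_SIZE))

    # Normalized correlation gives a similarity score in [-1, 1]
    # instead of a yes/no answer.
    score = cv2.matchTemplate(crop, pattern_A, cv2.TM_CCOEFF_NORMED)[0, 0]
    print("match score:", score)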
As for fonts, you don't really have much choice: you will need to match against all possible shapes of A to make sure you have found an A. But since you now match against just one size of A, you have more computing power left over to try the different A's.
