I'm trying to do OCR on this kind of image:
Unfortunately, Tesseract is unable to retrieve the number because of the noisy points around the characters.
I tried playing with ImageMagick to enhance the quality of the image, but no luck.
Examples:
convert input.tif -level 0%,150% output.tif
convert input.tif -colorspace CMYK -separate output_%d.tif
Is there any way to efficiently retrieve the characters in this kind of image?
Many thanks.
A simple closing operation (dilation followed by erosion) will give you the desired output. Below is a Python implementation.
import cv2
import numpy as np
img = cv2.imread(r'D:\Image\noiseOCR.png', 0)
kernel = np.ones((3, 3), np.uint8)
closing = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
The digits in this image are the largest connected components, so another approach is connected-component analysis.
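If you want to try that, a rough sketch with OpenCV could look like the following (the file name, the Otsu binarization and the area cutoff are assumptions to tune for your images):

import cv2
import numpy as np

# Read the noisy scan as grayscale (hypothetical file name)
img = cv2.imread('noiseOCR.png', 0)
# Binarize with Otsu; invert so the digits become white foreground
_, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
# Label connected components and get per-component statistics
n, labels, stats, _ = cv2.connectedComponentsWithStats(bw, connectivity=8)
# Keep only components larger than an area cutoff; label 0 is the background
min_area = 50
mask = np.zeros_like(bw)
for i in range(1, n):
    if stats[i, cv2.CC_STAT_AREA] >= min_area:
        mask[labels == i] = 255
cv2.imwrite('digits_only.png', mask)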
I can't understand why these two scripts produce different results, given that the second one is just the first one split into two commands.
First script:
convert lena_std.tif -compress None -resize 160x160 -compress None -resize 32x32 test1.bmp
Second script:
convert lena_std.tif -compress None -resize 160x160 test2.bmp
convert test2.bmp -compress None -resize 32x32 test3.bmp
I use the following command to check the difference between the results:
convert test1.bmp test3.bmp -metric AE -compare diff.bmp
I use ImageMagick on Ubuntu 22.04. My convert -version indicates: Version: ImageMagick 6.9.11-60 Q16 x86_64 2021-01-25.
Because when you scale you interpolate pixels.
Roughly, the code considers the pixel at (x, y) in the result and computes where it comes from in the source. This is usually not an exact pixel: it is more like an area when you scale down, or part of a pixel when you scale up. So to make up the color of the pixel at (x, y), some math is applied: if you scale down, some averaging of the source area; if you scale up, something that depends on how close the source point is to the edge of a pixel and how different the colors of neighboring pixels are.
This math can be very simple (the color of the closest pixel), simple (a linear average), a bit more complex (bicubic interpolation) or plain magic (sinc/Lanczos), with the more complex forms giving better results.
So in one case you obtain a result computed directly from the source, and in the other you obtain the final result from an approximation of what the image would look like at the intermediate size.
Another way to see it is that each interpolation has a spatial frequency response (like a filter in acoustics): in one case you apply a single filter, and in the other you compose two filters.
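If it helps to see this concretely, here is a minimal sketch in Python with Pillow (assuming lena_std.tif is available locally; the choice of Lanczos is only an example):

from PIL import Image, ImageChops

img = Image.open('lena_std.tif')
# One-step resize straight to the final size
one_step = img.resize((32, 32), Image.LANCZOS)
# Two-step resize through the intermediate 160x160 size
two_step = img.resize((160, 160), Image.LANCZOS).resize((32, 32), Image.LANCZOS)
# A non-empty bounding box of the difference means the two results diverge
diff = ImageChops.difference(one_step, two_step)
print(diff.getbbox())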
I am trying to recreate GIMP's AutoInputLevels (Colors > Levels > Auto Input Levels) in ImageMagick (I need to batch-process 1000 files). This is for an infrared image. I tried -contrast-stretch, -normalize and -auto-level, but they didn't help. Any suggestions?
Thanks.
Edit: I won't be able to provide representative images. However, when I say those options didn't help: I am using other operations in GIMP, and doing the same in ImageMagick (auto level and hard_light) does not produce equivalent results.
Adding to @Mark Setchell's answer, I can tweak it a bit and get close using:
Input:
convert redhat.jpg -channel rgb -contrast-stretch 0.6%x0.6% im.png
ImageMagick Result:
GIMP AutoInputLevels Result:
And get a numerical comparison:
compare -metric rmse gimp.png im.png null:
363.484 (0.00554641)
which is about 0.5% difference.
As you didn't provide a representative image, or expected result, I synthesised an image with three different distributions of red, green and blue pixels, using code like this:
#!/usr/bin/env python3
from PIL import Image
import numpy as np
w, h = 640, 480
# Synthesize red channel, mu=64, sigma=3
r = np.random.normal(64, 3, size=h*w)
# Synthesize green channel, mu=128, sigma=10
g = np.random.normal(128, 10, size=h*w)
# Synthesize blue channel, mu=192, sigma=6
b = np.random.normal(192, 6, size=h*w)
# Merge channels to RGB, round and clip to the valid 0-255 range
RGB = np.dstack((r, g, b)).reshape((h, w, 3)).round().clip(0, 255).astype(np.uint8)
# Save
Image.fromarray(RGB).save('result.png')
Then I applied the GIMP AutoInputLevels contrast stretch that you are asking about, and saved the result. The resulting histogram is:
which seems to show that the black-point and white-point levels have been set independently for each channel. So, in ImageMagick I guess you would want to separate the channels, apply some contrast-stretch to each channel independently and then recombine the channels along these lines:
magick input.png -separate -contrast-stretch 0.35%x0.7% -combine result.png
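If you end up scripting the batch in Python instead, a per-channel stretch along the same lines might look like this (the percentile values mirror the command above and are assumptions to tune; the file names are placeholders):

import numpy as np
from PIL import Image

img = np.asarray(Image.open('input.png'), dtype=np.float64)
out = np.empty_like(img)
for c in range(img.shape[2]):
    # Independent black and white points for each channel
    lo, hi = np.percentile(img[..., c], (0.35, 100 - 0.7))
    out[..., c] = np.clip((img[..., c] - lo) / (hi - lo) * 255, 0, 255)
Image.fromarray(out.astype(np.uint8)).save('result.png')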
If you provide representative input and output images, you may get a better answer.
I am trying to get the grayscale image values 0-255 of the 512x512 Lena image. Some have suggested using MATLAB; however, I do not have MATLAB. Has anyone used GIMP for this?
Just use ImageMagick. It is installed on most Linux distros and available for OSX and Windows:
convert lena.jpg -colorspace gray -depth 8 txt:-
The Octave solution is to read the image using
im = imread("lena512.jpg");
The image im can then be shown using imshow (im).
Conversion to grayscale can be performed using
lenagy = 0.3*im(:,:,1) + 0.6*im(:,:,2) + 0.1*im(:,:,3);
The result is that lenagy consists of a 2-D array, which can be saved to a file using for example
save lenagy.org lenagy
I have a problem with normalization.
Let me explain what the problem is and how I am attempting to solve it.
I take a three-channel color image, convert it to grayscale, and apply uniform or non-uniform quantization.
I should then apply normalization to this image, but I have a problem: even though the image is grayscale, it still has three channels.
How can I apply normalization to a three-channel image?
Should the min and the max be taken across all three channels?
Could someone give me a hand?
The language I am using is Processing 2.
P.S.
Can you do the same thing with a color image instead of a grayscale image?
You can convert between the 1-channel and 3-channel representations easily. I'd recommend scikit-image (http://scikit-image.org/).
from skimage.io import imread
from skimage.color import rgb2gray, gray2rgb

rgb_img = imread('path/to/my/image')
gray_img = rgb2gray(rgb_img)
# Now normalize the gray image to the 0-1 range
gray_norm = gray_img / gray_img.max()
# Now convert back to a 3-channel image
rgb_norm = gray2rgb(gray_norm)
I worked on a similar problem some time back. One of the good solutions was to:
Convert the image from RGB to HSI
Leaving the Hue and Saturation channels unchanged, simply normalize across the Intensity channel
Convert back to RGB
This logic can be applied across several other image-processing tasks, for example applying histogram equalization to RGB images.
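A rough OpenCV sketch of that idea, using HSV as a stand-in for HSI since OpenCV ships a conversion for it (the file names are placeholders):

import cv2

bgr = cv2.imread('input.jpg')
# Convert to HSV and normalize only the V (intensity-like) channel
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)
v = cv2.normalize(v, None, 0, 255, cv2.NORM_MINMAX)
# Recombine and convert back to BGR, leaving hue and saturation untouched
out = cv2.cvtColor(cv2.merge((h, s, v)), cv2.COLOR_HSV2BGR)
cv2.imwrite('normalized.jpg', out)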
When I resize an image with photoshop's "Save for Web", it looks different than if I convert it with ImageMagick. Is there a setting I can change in ImageMagick to get same results as Photoshop? Here is an example.
The original:
"Save for Web" 30.01%
vs
convert -geometry 30.01% home-button-full.png home-button-ipad.png
Enlarged so it's easier to see the difference:
Photoshop:
ImageMagick:
The only immediate differences that are discoverable are these:
Photoshop's result is 76x86 pixels in size.
ImageMagick's result is 76x87 pixels in size.
Photoshop's number of colors used by the PNG is 378.
ImageMagick's number of colors used by the PNG is 401.
Photoshop's file size for the PNG is 4,239 bytes.
ImageMagick's file size for the PNG is 3,410 bytes.
I only know how to fix the first difference:
convert orig.png -scale 76x86\! scaled-76x86.png
(This command's result has reduced the number of unique colors to 358... but that is by accident only.)
As long as we don't know what other kind of filtering Photoshop's Save for Web... applies, we have little chance of mimicking its results exactly. You could try this:
convert orig.png -scale 76x86\! -interpolate bicubic scaled-76x86.png
Check which re-sampling method (bicubic, bilinear, etc.) you used in Photoshop and make sure ImageMagick is using the same method.
-interpolate type (interpolation type, where type is bicubic, bilinear, average, etc.)
According to the docs, ImageMagick uses bilinear by default, whereas Photoshop uses bicubic by default.
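In Pillow terms, the different resampling choices look like this (a sketch only; the target size is the one from the question and the filter list is just for comparison):

from PIL import Image

img = Image.open('home-button-full.png')
# Different filters give visibly different edges at small output sizes
for name, flt in [('nearest', Image.NEAREST),
                  ('bilinear', Image.BILINEAR),
                  ('bicubic', Image.BICUBIC),
                  ('lanczos', Image.LANCZOS)]:
    img.resize((76, 86), flt).save('scaled-' + name + '.png')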
Try using the -quality parameter if you want lossy compression instead. ImageMagick defaults to 100 (lossless) for JPEGs.
http://www.imagemagick.org/script/command-line-options.php#quality
http://www.simplesystems.org/RMagick/doc/imageattrs.html#quality