How to read a captcha with Tesseract and ImageMagick

I'm having some issues reading a captcha image using ImageMagick and Tesseract.
I have tried many options and they all failed. Can this format actually be read?
Thanks in advance

The point of a captcha is to make it difficult for computers to read, so it's natural to have many failed attempts.
However, this example seems to lack enough entropy to halt any OCR. Use any combination of noise-reduction pre-processing techniques before passing the image to an OCR engine.
For example: drop the colors (we don't need them), slightly blur and erode the shapes together, and drop the outlier grays.
convert TBWyI.jpg -colorspace Gray \
-blur 3x1 -morphology Erode Diamond \
-level 20% output.jpg
Which produces ...
And Tesseract is more than happy with that.
tesseract output.jpg stdout
#=> '6DEAV
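To make the erode step a little more concrete, here is a minimal pure-Python sketch of grayscale erosion with a diamond-shaped neighbourhood (the pixel plus its four direct neighbours). It is an illustration only, not ImageMagick's implementation: the function name `erode_diamond` and the list-of-lists image are assumptions, and pixels outside the image are simply ignored.

```python
def erode_diamond(pixels):
    """Grayscale erosion: each pixel becomes the minimum of itself
    and its four direct neighbours (a radius-1 diamond)."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    offsets = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, skipping positions outside the image
            neighbours = [
                pixels[y + dy][x + dx]
                for dy, dx in offsets
                if 0 <= y + dy < height and 0 <= x + dx < width
            ]
            row.append(min(neighbours))
        result.append(row)
    return result


# A single dark pixel on a white background grows into a plus shape
image = [
    [255, 255, 255],
    [255,   0, 255],
    [255, 255, 255],
]
eroded = erode_diamond(image)
```

Because erosion takes the minimum, dark strokes on a light background get thicker, which is why broken captcha glyphs tend to join up after this step.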

Related

Apply imagemagick transformation on only part of an image, whilst keeping the rest "stock"?

I have many documents per day that are photographed and I need to organise them by QR code. The problem is that zbarimg is struggling to pick up many of the QR codes in the photos, so I have been trialling processing them with ImageMagick first, using morphology open, thresholding, etc., which has yielded much better results.
The only issue with this is these apply to the whole image, which makes the rest of the file unusable for me, as I deal with the rest of the image based on colours and information which all gets destroyed in the processing. Could anybody give me an example on how I could apply my imagemagick filters to only a part of an image (coordinate based is fine) and leave the rest of the image untouched, so I can continue my process? I will be applying this to all images in a folder, so it's a batch file running this for me in most instances.
I have tried using crops, however this obviously leaves me with only the cropped portion of the image, which doesn't actually help when trying to process the rest of the file.
I'm running my scripts on Windows 11, if that means anything in terms of the solution.
Many thanks!
Tom
EDIT:
Thank you all for the advice given!
I solved my problem using the following:
convert a.jpg ( -clone 0 -fill white -colorize 100 -fill black -draw "polygon 500,300 500,1500 1300,1500 1300,300" -alpha off -write mpr:mask +delete ) -mask mpr:mask +repage -threshold 50% -morphology open square:4 +mask c.jpg
I did post this as an answer, but (and I have no idea why, I'm brand new to stack exchange) my answer was deleted. I used the clone to make the mask with the coordinates needed, then added the threshold and morphology that would make my QR codes more legible!
Thanks again everyone, really helped me out on my journey to figure it out :D
You can use -region to specify a region to process. So starting with this:
You can then specify a region to colorise with blue and then change the region to blur part of the blue and part of the original:
magick swirl.jpg -region 100x100+50+50 -fill blue -colorize 100% -region 100x100+100+100 -blur x20 result.png
The solution using -region may be the most direct. In ImageMagick versions where -region is not supported the same result can usually be achieved by cropping and modifying a clone inside parentheses.
magick swirl.jpg ( +clone -crop 100x100+50+50 -fill blue -colorize 50 ) -flatten result.png
The cloned, cropped, and modified piece maintains its original geometry, so the -flatten operation puts it back where it was on the input image after the parentheses.
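The idea shared by the answers above (transform only the pixels inside a rectangle and leave everything else alone) can be sketched in plain Python. This is an illustration under assumptions, not what ImageMagick does internally: `apply_in_region` and `threshold_half` are hypothetical names, and the image is a list of rows of 0-255 gray values.

```python
def apply_in_region(pixels, left, top, width, height, transform):
    """Return a copy of pixels with transform applied only inside the
    rectangle width x height at offset (left, top)."""
    result = [row[:] for row in pixels]  # copy so the input stays untouched
    for y in range(max(top, 0), min(top + height, len(pixels))):
        for x in range(max(left, 0), min(left + width, len(pixels[y]))):
            result[y][x] = transform(pixels[y][x])
    return result


def threshold_half(value):
    """Rough stand-in for a 50% threshold on 8-bit values."""
    return 255 if value > 127 else 0


image = [
    [10, 200, 130],
    [90, 140,  60],
]
# Threshold only the 2x1 rectangle starting at column 1, row 0
processed = apply_in_region(image, 1, 0, 2, 1, threshold_half)
```

Only the two pixels inside the rectangle change; the rest of the image keeps its original values, so later colour-based steps still see unmodified data.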

Generating Low Resolution Black and White Text with Image Magick

I need to generate low resolution black and white images of text in ImageMagick. These images will be plotted on a small LED matrix. The text needs to be 7 pixels high.
For now, I'm using:
convert -negate -threshold 15% -font Courier -size 80x11 caption:'hello' out.bmp
Output image:
Even with the height being more than I need, due to the low resolution and anti-aliasing correction, the letters are not pretty or symmetric. Has anyone done this and can help me out?
Version: ImageMagick 6.8.9-9 Q16 x86_64
The solution I found was to use a TrueType font. I just got a free font from the internet and used it at the size it was built for.
P.S.: I switched to OpenCV as well. My Python app generates images dynamically. The cost of invoking ImageMagick several times per minute (it could get close to a hundred) is too high.
Posting a snippet, hope it helps.
import cv2 as cv  # not used in this snippet
from PIL import ImageFont, ImageDraw, Image
# Create an 80x10 image filled with transparent black
img = Image.new('RGBA', (80, 10), (0,0,0,0))
draw = ImageDraw.Draw(img)
# Load a TrueType font at size 8
font = ImageFont.truetype("font.ttf", 8)
# Draw text using the loaded font
draw.text((0, 0), "Hello World!", font=font)
img.save("out.bmp")
Output Image:
I would be inclined to output the letters larger than required, then to trim any extraneous spare space so as to make the most of the available resolution, then resize down to your specific needs:
convert -size 320x32 -font Courier label:'hello' -trim +repage -resize 80x8 +write out.gif
Mark, I think he wants a binary result. But you have an excellent idea.
Let's take Mark's result, threshold it, and then scale it down to 8 pixels tall. This ImageMagick command seems to work better than my earlier post.
Mark's Output:
convert wcwuj.gif -threshold 60% +write thresh.gif -scale x8 result.gif
Threshold Result:
Scaled Result:
Perhaps making Mark's image much larger and choosing a better threshold will produce a better result.
You have not told us which version of ImageMagick or which platform you are using, and you do not show your result for us to see what might be wrong. Also, your ImageMagick syntax is not proper, though ImageMagick 6 is rather forgiving.
This is what I get using ImageMagick 6.9.10.8 Q16 Mac OSX Sierra. The first output is 8 pixels tall and the second output is scaled by 1000% (10x).
This forum does not seem to convert bmp to a usable format for display, so I am using GIF in place of BMP. But my results look the same whether BMP or GIF
convert -size x8 -font Courier label:'hello' -negate -threshold 20% +write out.gif -scale 1000% out2.gif
I have tried changing threshold, but much larger or smaller values make it worse. A range from about 10-30% produces the same results.
I have also tried using -monochrome in place of -threshold and get the following:
convert -size x8 -font Courier label:'hello' -negate -monochrome +write out3.gif -scale 1000% out4.gif
You might try a dot-matrix type font. See https://www.1001fonts.com/digital+dot-matrix-fonts.html?page=1&items=10. I have not tried any of them.
You could try some of the old X11 fonts. These were hand-drawn rather than being rendered from a set of curves, so they look good at very small sizes.
For example, if I run xfontsel I get things like this (enlarged for clarity):
Take a look in /usr/share/fonts/X11/misc.

How to detect frames to keep or discard based on luminosity levels?

I have a series of images from a slow motion capture of pulsing electrical discharges. Many of the frames are nearly black. I would like to selectively keep the frames that are more interesting, e.g. those with more luminosity.
I've considered using ImageMagick or GraphicsMagick (or any other; not married to any tool - I'm up for more efficient suggestions).
How would I go about selecting such images and then discarding the other images without appreciable luminosity levels? I'm assuming that I have to establish a baseline first of "black" and then perhaps visually find the least luminous frame image and then use that as the lower limit to use for getting meaningful images / frames...
Example of DISCARD ("empty" frame):
Example of KEEP (frame with "data"):
I would suggest ImageMagick to Erode the image (clean-up noise), reduce data to a monochrome binary image, and print the statistical mean of the image.
convert 5HzsV.jpg -format "%[mean]" -monochrome -morphology Erode Diamond info:
# => 0
convert lLZFX.jpg -format "%[mean]" -monochrome -morphology Erode Diamond info:
# => 149.992
So a bash script might be as easy as...
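for image in *.jpg
do
    # Mean of the eroded monochrome image; 0 means the frame is empty
    L=$(convert "$image" -format "%[mean]" -monochrome -morphology Erode Diamond info:)
    # The mean can be fractional (e.g. 149.992), so compare with awk instead of -gt
    if awk -v l="$L" 'BEGIN { exit !(l > 0) }'; then
        echo "Image $image is not empty! # $L"
    fi
done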
Of course that can be adjusted to meet your needs.
Because of the way images are encoded, you'll likely find that the 'interesting' images are bigger, since the uniformly dark background compresses better than a random spark. For instance, your empty JPEG is 21K while the interesting one is 39K.
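The file-size observation can be turned into a quick first-pass filter without decoding any image. The following is a minimal sketch, assuming all frames come from the same camera and encoder settings; the function name `frames_worth_keeping` and the 30,000-byte default (picked between the 21K and 39K examples) are assumptions to be tuned on real data.

```python
import os


def frames_worth_keeping(paths, min_bytes=30_000):
    """Return the paths whose file size is at least min_bytes.

    Larger JPEGs usually contain more detail, so size acts as a
    cheap proxy for 'this frame has something in it'.
    """
    return [path for path in paths if os.path.getsize(path) >= min_bytes]


# Example usage (hypothetical folder name):
# import glob
# keep = frames_worth_keeping(sorted(glob.glob("frames/*.jpg")))
```

This is only a heuristic; borderline frames are better checked with the mean-based test shown earlier.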

ImageMagick morphology - missing kernels - UnrecognizedKernelType

I am trying to use any of those:
http://www.imagemagick.org/Usage/morphology/#erode
but it only returns an error message:
convert original.png -morphology Erode Octagon converted.png
convert: UnrecognizedKernelType `Octagon' # error/convert.c/ConvertImageCommand/1967
The same error occurs with -morphology Dilate Octagon.
The answer to that question for future generations:
The Octagon shape was added in ImageMagick 6.6.9.x; with older versions you can use:
Diamond,
Square,
Disk,
Plus,
Cross,
Rectangle,
Ring
And those work fine.
You can even define their size, like this:
convert original.png -morphology Erode Square:2:2 converted.png
Just play around yourself.
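To get a feel for how the kernel shapes differ, the sketch below builds diamond and square masks from a radius in plain Python and prints them. It assumes the radius-based definitions described in the ImageMagick usage guide (a radius-1 square is 3x3, a radius-1 diamond is a 3x3 plus shape); the helper names are hypothetical and nothing here calls ImageMagick.

```python
def diamond_kernel(radius=1):
    """Cells whose Manhattan distance from the centre is <= radius."""
    size = 2 * radius + 1
    return [
        [1 if abs(x - radius) + abs(y - radius) <= radius else 0
         for x in range(size)]
        for y in range(size)
    ]


def square_kernel(radius=1):
    """A full (2*radius + 1) x (2*radius + 1) block."""
    size = 2 * radius + 1
    return [[1] * size for _ in range(size)]


def show(kernel):
    """Render a kernel as text, '#' for set cells and '.' for empty ones."""
    return "\n".join("".join("#" if cell else "." for cell in row)
                     for row in kernel)


print(show(diamond_kernel(2)))
print()
print(show(square_kernel(1)))
```

A larger kernel removes (or, for Dilate, adds) more pixels per pass, so small radii are usually the safer starting point for text.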

ImageMagick to preprocess image for tesseract-ocr

Is there any way to process an image like this with ImageMagick so that I can use tesseract-ocr to convert it to text?
Because of the lines in the background I get nonsense from conventional methods. Does anyone know how to deal with an image such as this?
'convert -density 300 -units PixelsPerInch -type Grayscale +compress input.png input.tif' followed by 'tesseract input.tif output -l eng' gives me utter garbage.
Or are there any alternatives to ImageMagick that I can use to pre-process such an image whether through command-line or in python?
Have you tried morphology operations (see "Morphology of Shapes") after converting the image to grayscale?
