I have simple text in an image, image_ball.png. Usually Tesseract's OCR works well, but for this particular image it always returns an empty string.
In [1]: from PIL import Image
In [2]: from pytesseract import image_to_string
In [3]: img = Image.open("image_ball.png")
In [4]: image_to_string(img)
Out[4]: u''
I have not found a workaround so far.
How can I figure out what is going wrong with this image?
The versions are:
In [6]: import PIL
In [7]: PIL.__version__
Out[7]: '4.0.0'
$ tesseract -v
tesseract 4.0.0
leptonica-1.77.0
libgif 5.1.4 : libjpeg 9c : libpng 1.6.36 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 1.0.2 : libopenjp2 2.3.0
Found AVX2
Found AVX
Found SSE
EDIT
I also tried converting the image to grayscale ('L' mode), but it is still not recognized.
In [6]: image = img.convert('L')
In [7]: image_to_string(image)
Out[7]: u''
EDIT 2
Single characters also seem to be a problem for Tesseract. Dilating or eroding the image does not seem to help: image_1.png
Dilating the image gives you the desired output.
import cv2
import numpy as np
import pytesseract

image = cv2.imread("Ball.png", cv2.IMREAD_GRAYSCALE)
image = cv2.dilate(image, np.ones((5, 5), np.uint8))  # thicken the strokes with a 5x5 kernel
print(pytesseract.image_to_string(image, config='--psm 7'))
Ball
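For reference, the page segmentation mode matters a lot when the image contains only a word or a single character; a small sketch of trying a few modes on the original image (7 = single line, 8 = single word, 10 = single character in Tesseract's psm numbering):

from PIL import Image
from pytesseract import image_to_string

img = Image.open("image_ball.png")
# Tesseract's default segmentation assumes a full page of text,
# so sparse images often come back empty without an explicit mode.
for psm in (7, 8, 10):
    print(psm, repr(image_to_string(img, config='--psm %d' % psm)))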
I'm wondering how to convert this working command line sequence for ImageMagick into a Python script using the Wand library:
convert test.gif -fuzz 5% -layers Optimize test5.gif
Python code is:
from wand.api import library
from wand.color import Color
from wand.drawing import Drawing
from wand.image import Image
import ctypes

library.MagickSetImageFuzz.argtypes = (ctypes.c_void_p, ctypes.c_double)

with Image(filename='test.gif') as img:
    library.MagickSetImageFuzz(img.wand, img.quantum_range * 0.05)
    with Drawing() as ctx:
        ctx(img)
    img.optimize_layers()
    img.save(filename='test5.gif')
But I got a different result than from the ImageMagick command line.
Why...
This matches the CLI, but results may vary if the GIF is animated or was previously optimized.
from wand.image import Image

with Image(filename='test.gif') as img:
    img.fuzz = img.quantum_range * 0.05
    img.optimize_layers()
    img.save(filename='test5.gif')
I am trying to denoise multiple grayscale text images from a folder. I have already converted all the images to grayscale. All I want is to remove noise or blurriness from all the images without changing the text, and I am using OpenCV for this. I have written the code shown below; when I run it, it shows no error but displays nothing. Please help me solve this problem. I am new to image processing, which is why I am confused. Here is my code...
import numpy as np
from PIL import Image
import cv2
import glob

src_path = r"C:\Users\usama\Documents\FYP-Data\FYP Project Data\grayscale images\*.png"  # images folder path

def get_string(src_path):
    for filename in glob.glob(src_path):
        img = cv2.imread(filename)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        kernel = np.ones((1, 1), np.uint8)
        img = cv2.dilate(img, kernel, iterations=1)
        img = cv2.erode(img, kernel, iterations=1)
        cv2.imwrite(src_path + "filename", img)
You should narrow down which files you load. I prefer to do this with glob, which allows easy shell-style wildcard patterns when searching for files. I would expect that either you are loading a file that is not an image, or that you are missing a cv2.waitKey(0) to display and then exit the view.
import cv2
from glob import glob

for filename in glob('*.jpg'):
    img = cv2.imread(filename)
    bilateral_blur = cv2.bilateralFilter(img, 9, 75, 75)
    cv2.imshow('denoised_images', bilateral_blur)
    cv2.waitKey(0)
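If the goal is to save the denoised images rather than just display them, a minimal sketch built on the same bilateral filter (the "denoised" output folder name is an assumption, not something from the question):

import os
import cv2
from glob import glob

src_dir = r"C:\Users\usama\Documents\FYP-Data\FYP Project Data\grayscale images"
out_dir = os.path.join(src_dir, "denoised")  # hypothetical output folder
os.makedirs(out_dir, exist_ok=True)

for path in glob(os.path.join(src_dir, "*.png")):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)    # the images are already grayscale
    denoised = cv2.bilateralFilter(img, 9, 75, 75)  # edge-preserving smoothing keeps text sharp
    cv2.imwrite(os.path.join(out_dir, os.path.basename(path)), denoised)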
I am trying to implement the face detection tutorial of OpenCV, but my Google Colab kernel crashes when the following code is run:
from google.colab import files
xml = files.upload()

import numpy as np
import cv2 as cv

face_cascade = cv.CascadeClassifier('haarcascade_frontalface_default.xml')
eye_cascade = cv.CascadeClassifier('haarcascade_eye.xml')

img = cv.imread('elonMusk.jpg')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

faces = face_cascade.detectMultiScale(gray, 1.3, 5)
for (x, y, w, h) in faces:
    cv.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
    roi_gray = gray[y:y+h, x:x+w]
    roi_color = img[y:y+h, x:x+w]
    eyes = eye_cascade.detectMultiScale(roi_gray)
    for (ex, ey, ew, eh) in eyes:
        cv.rectangle(roi_color, (ex, ey), (ex + ew, ey + eh), (0, 255, 0), 2)

cv.imshow('img', img)
cv.waitKey(0)
cv.destroyAllWindows()
Error displayed: Runtime died. Automatically restarting.
All the required XML and JPG files were uploaded.
The code is exactly the same as in the OpenCV face detection tutorial:
https://docs.opencv.org/3.4/d7/d8b/tutorial_py_face_detection.html
Google Colab cannot run OpenCV's GUI functions: cv.imshow() needs a display, which Colab does not provide, so calling it kills the kernel. Either display the image inline instead, or run the script locally in a Jupyter notebook or any other IDE.
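If you want to stay in Colab, the usual workaround is to replace cv.imshow with Colab's inline display helper; a minimal sketch, assuming the cascade file and elonMusk.jpg were uploaded as in the question:

from google.colab.patches import cv2_imshow  # Colab-safe replacement for cv.imshow
import cv2 as cv

img = cv.imread('elonMusk.jpg')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
face_cascade = cv.CascadeClassifier('haarcascade_frontalface_default.xml')

for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
    cv.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)

cv2_imshow(img)  # renders inline in the notebook; no waitKey/destroyAllWindows needed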
This code runs fine and gives the expected output:
import multiprocessing
import cv2
import os

path = r"/home/pi/Desktop/calibration.jpg"
image = cv2.imread(path)

def cvtcolor(img):
    print "converting to gray ..."
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    print "converted to gray"

if True:
    p = multiprocessing.Process(name='test',
                                target=cvtcolor,
                                kwargs={'img': image})
    p.start()

    p2 = multiprocessing.Process(name='test',
                                 target=cvtcolor,
                                 kwargs={'img': image})
    p2.start()
outputs:
converting to gray ...
converting to gray ...
converted to gray
converted to gray
However, this code hangs when executed
import multiprocessing
import cv2
import os

path = r"/home/pi/Desktop/calibration.jpg"
image = cv2.imread(path)

def cvtcolor(img):
    print "converting to gray ..."
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    print "converted to gray"

cvtcolor(image)

if True:
    p = multiprocessing.Process(name='test',
                                target=cvtcolor,
                                kwargs={'img': image})
    p.start()
The function executed in the main process completes, but the function executed in the "test" process hangs forever:
converting to gray ...
converted to gray
converting to gray ...
I am using OpenCV version 3.2.0, installed as detailed here, on Raspbian Jessie (Raspberry Pi).
Does anyone have an explanation or a solution for this?
Have a look at what is returned. If you call the BGR2GRAY conversion directly, you get back an array whose shape matches the input image but with only one channel, i.e. gray. When you run the same function via multiprocessing you do not get an array back, and it will have no shape attribute. Try printing the output to see what form it is in, and then maybe reconstruct an image from it.
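A minimal sketch of the check suggested above, printing what cv2.cvtColor returns inside the child process (keeping the question's Python 2 print syntax); note that in the second example the child hangs inside cvtColor, so these prints may never be reached there:

import multiprocessing
import cv2

image = cv2.imread("/home/pi/Desktop/calibration.jpg")

def cvtcolor_debug(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    print type(gray)                    # expected: <type 'numpy.ndarray'>
    print getattr(gray, 'shape', None)  # expected: (height, width) for a single-channel image

p = multiprocessing.Process(name='test', target=cvtcolor_debug, kwargs={'img': image})
p.start()
p.join()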
I'm using Tesseract to recognize numbers from images of a screen taken with a phone camera. I've done some preprocessing of the image (processed image) and, using Tesseract, I get mixed results. Using the following code on the image above, I get the output "EOE". However, with this image (processed image), I get an exact match: "39:45.8".
import cv2
import pytesseract
from PIL import Image, ImageEnhance
from matplotlib import pyplot as plt

orig_name = "time3.jpg"
image_name = "time3_.jpg"

img = cv2.imread(orig_name, 0)
img = cv2.medianBlur(img, 5)
img_th = cv2.adaptiveThreshold(img, 255,
                               cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
cv2.imshow('image', img_th)
cv2.waitKey(0)
cv2.imwrite(image_name, img_th)

im = Image.open(image_name)
time = pytesseract.image_to_string(im, config="-psm 7")
print(time)
Is there anything I can do to get more consistent results?
I did three additional things to get it correct for the first image.
1. You can set a whitelist for Tesseract. In your case we know there will only be characters from the list 01234567890.:, and this improves the accuracy significantly.
2. I resized the image to make it easier for Tesseract.
3. I switched from psm mode 7 to 11 (recognize as much text as possible).
Code:
import cv2
import pytesseract
from PIL import Image, ImageEnhance
orig_name = "./time1.jpg";
img = cv2.imread(orig_name)
height, width, channels = img.shape
imgResized = cv2.resize(img, ( width*3, height*3))
cv2.imshow("img",imgResized)
cv2.waitKey()
im = Image.fromarray(imgResized)
time = pytesseract.image_to_string(im, config ='--tessdata-dir "/home/rvq/github/tesseract/tessdata/" -c tessedit_char_whitelist=01234567890.: -psm 11 -oem 0')
print(time)
Note:
You can use Image.fromarray(imgResized) to convert an OpenCV image (a NumPy array) to a PIL Image, so you don't have to write it to disk and read it again.
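Applied to the code in the question, a small sketch of skipping the disk round-trip while keeping the same preprocessing:

import cv2
import pytesseract
from PIL import Image

img = cv2.imread("time3.jpg", 0)
img = cv2.medianBlur(img, 5)
img_th = cv2.adaptiveThreshold(img, 255,
                               cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)

# Convert the NumPy array straight to a PIL image; no imwrite/Image.open round trip.
im = Image.fromarray(img_th)
print(pytesseract.image_to_string(im, config="-psm 7"))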