How can I improve Tesseract results quality?

I'm trying to read the NIRPP number (French social security number) from a French carte vitale (health insurance card) using Tesseract OCR (TesseractOCRiOS 4.0.0). Here is what I'm doing:
First, I request a picture of the whole card:
Then, using a custom cropper, I ask the user to zoom in specifically on the card number:
Then I take the cropped image (1291×202 px) and try to read the number with Tesseract:
let tesseract = G8Tesseract(language: "eng")
tesseract?.image = pickedImage
tesseract?.recognize()
print("\(tesseract?.recognizedText ?? "")")
But I'm getting pretty bad results: Tesseract finds the right number only about 30% of the time, and even then I sometimes have to trim stray characters (alpha characters, dots, dashes, and so on).
So is there a solution for me to improve these results?
Thanks for your help.

To improve your results:
1. Zoom the image to an appropriate level; the right amount of zoom improves accuracy considerably.
2. Configure Tesseract so that only digits are whitelisted (assuming what you are trying to extract contains only digits). Whitelisting only digits improves your chances of recognizing 0 as the digit 0 and not the letter O. With TesseractOCRiOS this is the charWhitelist property, e.g. tesseract?.charWhitelist = "0123456789".
3. If the extracted text must match a regex, validate Tesseract's output against that regex as well.
4. Pre-process your image to remove any background colors, and apply morphology effects such as erosion to increase the space between your characters/digits. If they are too close together, Tesseract will have a hard time recognizing them correctly. Most image-processing libraries come with these effects built in.
5. Use TIFF as the image format.
Once you have the right preprocessing pipeline and configuration for Tesseract, you will usually get a very good and consistent result.
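The whitelist and regex ideas above are easiest to sketch in Python (on iOS the same whitelist goes through TesseractOCRiOS instead). Note the 15-digit NIRPP format (13 digits plus a 2-digit key) and the helper name `clean_nirpp` are my assumptions for illustration:

```python
import re

# Tesseract configuration: page-segmentation mode 7 treats the image as a
# single text line, and the whitelist restricts recognition to digits only.
TESS_CONFIG = "--psm 7 -c tessedit_char_whitelist=0123456789"

# A NIRPP is 13 digits plus a 2-digit control key (15 digits total).
NIRPP_RE = re.compile(r"\d{13}\s?\d{2}")

def clean_nirpp(raw_text):
    """Strip non-digit noise from OCR output and validate the result."""
    digits = re.sub(r"\D", "", raw_text)  # drop dots, dashes, letters...
    return digits if NIRPP_RE.fullmatch(digits) else None

# With pytesseract the config would be passed as:
#   pytesseract.image_to_string(img, config=TESS_CONFIG)
```

For example, `clean_nirpp("1 85 05 78 006 084-36")` strips the separators and returns the bare 15-digit string, while anything of the wrong length is rejected.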

There are a couple of things you need to do:
1. Apply a black-and-white or grayscale conversion to the image. You can use built-in functionality (e.g. Core Graphics) or a third-party library like OpenCV or GPUImage to apply the black-and-white or grayscale effect.
2. Then apply text detection using the Vision framework. From the Vision text-detection results you can crop the text regions according to the detected coordinates.
3. Pass these cropped (text-detected) images to TesseractOCRiOS.
I hope this works for your use case.
Thanks
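The grayscale step above is just a weighted average of the color channels; here is a minimal pure-Python sketch (the weights are the standard ITU-R BT.601 luma coefficients, the function names are mine — in practice you would let OpenCV, GPUImage, or Core Image do this):

```python
def to_grayscale(pixel):
    """Convert one (r, g, b) pixel to a grayscale value using BT.601 luma weights."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def grayscale_image(pixels):
    """Apply the conversion to a 2-D list of (r, g, b) pixels."""
    return [[to_grayscale(p) for p in row] for row in pixels]
```

For instance, pure white (255, 255, 255) stays 255, pure black stays 0, and pure red (255, 0, 0) maps to a dark gray.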

I had a similar issue. I discovered that Tesseract recognizes text well only if the given image is narrowed down to a region of interest.
I solved the problem using Apple's Vision framework. It has VNDetectTextRectanglesRequest, which returns the CGRect of each piece of detected text in the image. You can then crop the image to the regions where text is present and send those to Tesseract for recognition.
Ray Smith says:
Since HP had independently-developed page layout analysis technology that was used in products, (and therefore not released for open-source) Tesseract never needed its own page layout analysis. Tesseract therefore assumes that its input is a binary image with optional polygonal text regions defined.

Related

Best practices for tesseract ocr using python

I'm working on a project in which I want to recognize text from a credit-card-sized document. The document contains details like name, phone number, address, etc. I'm capturing the image and passing it into the Tesseract engine using
text = pytesseract.image_to_string(Image.open(filename), lang='eng'). Sometimes I get decent results for each field, but most of the time the result is very bad. How do I resolve this issue? What are the best practices? How do document readers work with OCR? Is it possible to do region-based OCR on the document?
A single approach can't read every kind of text; you have to apply different approaches to different types of document. If the text is not horizontal, you have to rotate it. If the text is skewed or curved, you have to apply a transformation (e.g. a Hough transform to estimate the orientation).
Moreover, to read text with the package, the text should be clear and horizontal; otherwise you need to create rules and transform it first.
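To make the rotation idea concrete, here is a toy pure-Python Hough transform that estimates the dominant text-line angle from a handful of points (in practice the points would be character centroids or edge pixels, and you would use cv2.HoughLines instead; all names here are mine):

```python
import math

def skew_angle(points, angle_step=1):
    """Estimate the dominant line angle (degrees) of a set of (x, y) points
    with a tiny Hough transform: each point votes for every (theta, rho)
    line that could pass through it; the best-supported theta wins."""
    votes = {}
    for theta_deg in range(0, 180, angle_step):
        theta = math.radians(theta_deg)
        for x, y in points:
            # Normal-form line equation: rho = x*cos(theta) + y*sin(theta)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            key = (theta_deg, rho)
            votes[key] = votes.get(key, 0) + 1
    (theta_deg, _rho), _count = max(votes.items(), key=lambda kv: kv[1])
    # theta is the line's normal direction; the line itself is perpendicular
    return (theta_deg + 90) % 180
```

A horizontal row of points yields 0°, a diagonal one yields 45°; the difference from 0° is the angle you would rotate by before running Tesseract.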

pytesseract - Read text from images with more accuracy

I am working with pytesseract. I want to read data from a driving-licence-style document. At present I am converting the .jpg image to binary (grayscale) format using OpenCV, but I am not getting accurate results. How do you solve this? Is there a standard image size?
Localize the detection by setting the rectangles where Tesseract has to look. You can then restrict, per rectangle, which type of data is present at that place (for example numerical, alphabetic, etc.). You can also supply a dictionary file to Tesseract to improve accuracy (this can be used for detecting the card holder's name by listing common names in a file). If there is disturbance in the background, design a filter to remove it. Good luck!
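pytesseract itself has no rectangle parameter, so a common pattern is to crop each field's region first and run Tesseract per crop with a field-specific whitelist. The box coordinates and field names below are made-up placeholders for illustration:

```python
# Hypothetical field layout: (left, top, right, bottom) boxes on the card image,
# each paired with a Tesseract character whitelist suited to that field.
FIELDS = {
    "licence_no": ((40, 30, 300, 60), "0123456789"),
    "name":       ((40, 80, 400, 110), "ABCDEFGHIJKLMNOPQRSTUVWXYZ "),
}

def field_configs(fields):
    """Build a per-field Tesseract config string from each whitelist."""
    return {
        name: f"--psm 7 -c tessedit_char_whitelist={whitelist}"
        for name, (_box, whitelist) in fields.items()
    }

# With PIL + pytesseract each crop would then be read roughly as:
#   crop = image.crop(box)
#   text = pytesseract.image_to_string(crop, config=field_configs(FIELDS)[name])
```

This keeps the recognition constrained per field, which is what makes the rectangle approach effective.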

Tesseract IOS performance

I am trying to do character recognition on an iPad. Basically, I want the user to draw characters and digits in real time and have the system recognize them. I tried Tesseract with the iOS wrapper found here: https://github.com/gali8/Tesseract-OCR-iOS But the results are really bad
Picture1:
Picture2:
The output from picture 1: LWJ3
The output from picture 2: Fnilmling Lu summaryofmajnr news and comments in Ihe Hang Kang Emnomic
Journal. the patent puhlkalinn M EJ Insight, nn Ilnnday. Nwv. ca; TOP
Is it supposed to be like this? Maybe the purpose of libraries such as Tesseract is to recognize photographs of printed text. But should the performance be this bad? Any tips on how to do this?
From what I have seen working with Tesseract, it is unable to detect handwriting. Tesseract works with standard fonts (the most suitable being Verdana), and you should also do some image filtering before passing the image to Tesseract.
The first image, with hand-written text, cannot be read by Tesseract. Furthermore, I tried another top-level, high-quality commercial OCR, and even it could not produce a good result from that image. If you absolutely need to recognize such images, use an ICR-capable program. I have and distribute a commercial application that can read those numbers very well with 100% accuracy, but the cost is premium; it is used in small-to-medium enterprise environments.
The second image reads very well in a commercial OCR application, and I would expect Tesseract to do better than the result you are showing. Perhaps producing a higher-resolution image will help improve the result.
Ilya Evdokimov
I suggest you add a filter before processing any image and handing it to Tesseract. https://github.com/BradLarson/GPUImage is a really popular image-processing filter library; you can use its luminance filter. By the way, you should post some code showing how you handle your image — I mean for the second one, as the first one is handwriting. Besides GPUImage, you can use CIFilter, which others have also suggested, to convert to a black-and-white image:
monochromeFilter = [CIFilter filterWithName:@"CIColorMonochrome" keysAndValues:@"inputColor", [CIColor colorWithRed:1.0 green:1.0 blue:1.0 alpha:1.0f], @"inputIntensity", [NSNumber numberWithFloat:1.5f], nil];

Parsing / Scraping information from an image

I am looking for a library that would help scrape the information from the image below.
I need the current value so it would have to recognise the values on the left and then estimate the value of the bottom line.
Any ideas if there is a library out there that could do something like this? Language isn't really important but I guess Python would be preferable.
Thanks
I don't know of any "out of the box" solution for this and I doubt one exists. If all you have is the image, then you'll need to do some image processing. A simple binarization method (like Otsu binarization) would make it easier to process:
The binarization makes it easier because now the pixels are either "on" or "off."
The locations for the lines can be found by searching for some number of pixels that are all on horizontally (5 on in a row while iterating on the x axis?).
Then a possible solution would be to pass the image to an OCR engine to get the numbers (Tesseract is an open-source OCR engine, written in C++, hosted at Google). You'd still have to find out where the numbers are in the image by iterating through it.
Then, you'd have to find where the lines are relative to the keys on the left and do a little math and you can get your answer.
OpenCV is a beefy computer vision library that has things like the binarization. It is also a C++ library.
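The Otsu step mentioned above picks the threshold that best separates dark and bright pixels. A minimal pure-Python sketch (in real code you would call cv2.threshold with THRESH_OTSU; function names here are mine):

```python
def otsu_threshold(gray_values, levels=256):
    """Find the threshold that maximizes between-class variance."""
    hist = [0] * levels
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    sum_bg = 0.0
    weight_bg = 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_bg += hist[t]          # pixels at or below t ("background")
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg  # pixels above t ("foreground")
        if weight_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(gray_values, threshold):
    """Map each pixel to 0 ('off') or 255 ('on')."""
    return [255 if v > threshold else 0 for v in gray_values]
```

On a bimodal set of gray values the threshold lands between the two clusters, giving exactly the on/off pixels described above.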
Hope that helps.

Get the alphabets from an image using Java

I am looking for some code which can get me the alphabetic characters present in an image. As per my understanding, OCR works fine with a plain white background. I am looking for something which gives me the characters from an image with a random background. Can anyone help me out in this regard?
Thank you.
Some image preprocessing is necessary to remove the background noise before feeding the image to the OCR engine. For OCR, I use Tesseract. Here are some Java front-ends and libraries that use it:
VietOCR
Tess4J