Get the alphabet characters from an image using Java - image-processing

I am looking for some code that can extract the alphabetic characters present in an image. As I understand it, OCR works fine with a plain white background; I need something that can recognize characters against an arbitrary, noisy background. Can anyone help me out in this regard?
Thank you.

Some image preprocessing is necessary to remove the background noise before feeding the image to the OCR engine. For OCR, I use Tesseract. Here are some Java front-ends and libraries that use it:
VietOCR
Tess4J
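As a sketch of that preprocessing step (not tied to any of the tools above, and using an illustrative threshold and image), binarizing the image forces every pixel to pure black or pure white so gray background noise never reaches the OCR engine:

```python
# A minimal sketch of the "clean the background first" step, in pure
# Python on a row-major grayscale image (0 = black .. 255 = white).
# The threshold value and image are illustrative; a real pipeline would
# use a library (e.g. OpenCV) and a data-driven threshold before Tesseract.

def binarize(img, threshold=128):
    """Force every pixel to pure black or pure white, removing gray noise."""
    return [[0 if p < threshold else 255 for p in row] for row in img]

noisy = [
    [240, 30, 200],
    [90, 25, 230],
]
print(binarize(noisy))  # [[255, 0, 255], [0, 0, 255]]
```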

Related

OCR: scan specific part of image

I'm quite new to computer vision and am currently learning the Google Cloud Vision SDK using Go, and right now I have a problem.
I have an image scanned using the DetectTexts() method. The result was great: all of the text was detected.
However, I don't actually need all of that text, only some of it. Below is the image I use as a sample; what I want to get is the two blocks highlighted in red.
[Sample image omitted.] The DetectTexts() result:
WE-2, Sam WHO, Time, PM 1:57, SYS, mmHg, mmHg, DIA, mmHg, 90, 62, 82, mmHg, PUL, /MIN, MR AVGA, SET, START, STOP, MEM
I do not know what the best approach is. The approaches I have in mind are:
split out the regions highlighted in red into new images, then run OCR on those
or, get all of the text, then use some algorithm (NLP maybe?) to pick out the highlighted parts
Can somebody please suggest the correct and best approach to solve this problem?
You mentioned that you were using Go, which unfortunately I don't have any experience with, but I have approached this problem in other languages like Python and C#. What I would recommend is creating an ROI, or Region of Interest: cropping the image down to only the highlighted region you want to detect text from. I'm not entirely sure whether you can do that directly in Go, so you might have to do some raw pixel manipulation rather than use a member function. I assumed that the positions of the regions you want to detect text from remain the same. If you're open to it, you could create a simple Python script that generates the ROI and pipes the cropped image to Go.
import cv2

img = cv2.imread('inputImg.png')

# NumPy slicing is [rows, cols]: crop a 25x25 region of interest whose
# top-left corner is at row r1, column c1 (example coordinates below)
r1, c1 = 100, 50
output = img[r1:r1+25, c1:c1+25]

cv2.imwrite("path/to/output/outputimg.png", output)

How do I decide whether my image has bright text or dark text? [LabVIEW]

I'm working on a text-extraction algorithm and need some assistance with thresholding an image. My development platform is LabVIEW 2015 and I'm using "AutoBThreshold2.vi" from the Vision Development Module 2015. I decided to go with Otsu's algorithm for thresholding, which is available as the "Inter Class Variance" method. The problem is that I need to specify the "Look for" option to extract the text, and unfortunately my input images will not always be the same.
Kindly refer to the attached source code along with the sample images. My question: is there any way to find out whether the image has dark or bright objects on a dark or bright background? Meanwhile, I'm also playing with the histogram to work out the background and foreground types.
I'd really appreciate your help.
With the help of the NI forum, I was able to solve this problem.
https://forums.ni.com/t5/LabVIEW/Auto-Thresholding-an-image-for-text-extraction/m-p/3904533#M1108133
Use the Equalize VI before thresholding; see the linked thread for details.
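Outside LabVIEW, the same idea can be sketched in a few lines. Below is a minimal pure-Python version of Otsu's inter-class-variance threshold, plus one common heuristic for the "Look for" decision: text usually occupies far fewer pixels than the background, so whichever side of the threshold is in the minority is likely the text. The image values are made up for illustration.

```python
def otsu_threshold(pixels):
    """Otsu's method: pick the threshold maximizing inter-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w_bg, sum_bg = 0, 0.0, 0, 0.0
    for t in range(256):
        w_bg += hist[t]                      # background class: pixels <= t
        if w_bg == 0:
            continue
        w_fg = total - w_bg                  # foreground class: pixels > t
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def text_is_bright(pixels, t):
    """Heuristic: text occupies fewer pixels than the background, so if
    the above-threshold class is the minority, the text is bright."""
    above = sum(1 for p in pixels if p > t)
    return above < len(pixels) - above

# Synthetic image: mostly dark background (20) with a few bright strokes (220).
img = [20] * 90 + [220] * 10
t = otsu_threshold(img)
print(t, text_is_bright(img, t))  # 20 True
```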

How can I improve Tesseract results quality?

I'm trying to read the NIRPP number (the French social security number) from a French Carte Vitale using Tesseract OCR (I'm using TesseractOCRiOS 4.0.0). Here is what I'm doing:
First, I request a picture of the whole card.
Then, using a custom cropper, I ask the user to zoom in specifically on the card number.
I then take this image (1291x202 px) and try to read the number using Tesseract:
let tesseract = G8Tesseract(language: "eng")
tesseract?.image = pickedImage
tesseract?.recognize()
print("\(tesseract?.recognizedText ?? "")")
But I'm getting pretty bad results: Tesseract finds the right number only about 30% of the time, and even then I sometimes need to trim stray characters (alpha characters, dots, dashes...).
So is there a solution for me to improve these results?
Thanks for your help.
To improve your results:
Zoom your image to an appropriate level; the right amount of zoom will improve your accuracy by a lot.
Configure Tesseract so that only digits are whitelisted (I am assuming that what you are trying to extract contains only digits). Whitelisting digits improves your chances of recognizing 0 as the digit zero and not the letter O.
If your extracted text matches a regex, you should configure Tesseract to use that regex as well.
Preprocess your image to remove any background colors, and apply morphology operations such as erosion to increase the spacing between your characters/digits. If they are too close together, Tesseract will have a hard time recognizing them correctly. Most image-processing libraries come with these operations built in.
Use TIFF as the image format.
Once you have the right preprocessing pipeline and configuration for Tesseract, you will usually get very good and consistent results.
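Tesseract's digit whitelist is set through its tessedit_char_whitelist variable; independently of that, the whitelist-plus-regex idea can also be sketched as a pure-Python post-filter on the recognized string. The raw string and confusion map below are illustrative, not from the question:

```python
import re

# Hypothetical raw OCR output for a 15-digit NIRPP-style number; Tesseract
# often mixes in letters (O for 0, l for 1) and stray punctuation.
raw = "1 85O5.78-006 04B 8l"

def clean_digits(text: str) -> str:
    """Whitelist-style post-filter: map common confusions, drop non-digits."""
    confusions = str.maketrans(
        {"O": "0", "o": "0", "l": "1", "I": "1", "B": "8", "S": "5"}
    )
    return re.sub(r"\D", "", text.translate(confusions))

def looks_like_nirpp(digits: str) -> bool:
    """A NIRPP is 13 digits plus a 2-digit key: 15 digits total."""
    return re.fullmatch(r"\d{15}", digits) is not None

digits = clean_digits(raw)
print(digits)                    # 185057800604881
print(looks_like_nirpp(digits))  # True
```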
There are a couple of things you need to do:
1. Apply black-and-white or grayscale conversion to the image. You can use built-in functionality such as the Core Graphics framework, or a third-party library like OpenCV or GPUImage, to apply the black-and-white or grayscale effect.
2. Then apply text detection using the Vision framework. From the Vision text detection you can crop out the text regions according to the detected coordinates.
3. Pass these cropped (text-detected) images to TesseractOCRiOS.
I hope it will work for your use case.
Thanks
I have a similar issue. I discovered that Tesseract recognizes text well only if the given image is narrowed down to a region of interest.
I solved the problem using Apple's Vision framework. It has VNDetectTextRectanglesRequest, which returns the CGRect of each piece of detected text in the image. You can then crop the image to the regions where text is present and send those to Tesseract for recognition.
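The crop-then-recognize pipeline is language-agnostic; here is a minimal pure-Python sketch with a stand-in ocr function and made-up rectangles in place of Vision's detected CGRects:

```python
def crop(img, x, y, w, h):
    """Crop a (x, y, w, h) rectangle out of a row-major 2D image."""
    return [row[x:x + w] for row in img[y:y + h]]

def recognize_regions(img, rects, ocr):
    """Run OCR on each detected text rectangle instead of the full image."""
    return [ocr(crop(img, *r)) for r in rects]

# Toy 4x8 "image" and a fake OCR that just counts on-pixels per region.
img = [
    [0, 1, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
fake_ocr = lambda region: sum(map(sum, region))
print(recognize_regions(img, [(1, 0, 2, 2), (5, 0, 2, 2)], fake_ocr))  # [4, 3]
```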
Ray Smith says:
Since HP had independently-developed page layout analysis technology that was used in products, (and therefore not released for open-source) Tesseract never needed its own page layout analysis. Tesseract therefore assumes that its input is a binary image with optional polygonal text regions defined.

Tesseract iOS performance

I am trying to do character recognition on an iPad. I basically want the user to draw characters and digits in real time and have the system recognize them. I tried Tesseract with the iOS wrapper found here: https://github.com/gali8/Tesseract-OCR-iOS but the results are really bad.
Picture1:
Picture2:
The output from picture 1: LWJ3
The output from picture 2: Fnilmling Lu summaryofmajnr news and comments in Ihe Hang Kang Emnomic
Journal. the patent puhlkalinn M EJ Insight, nn Ilnnday. Nwv. ca; TOP
Is it supposed to be like this? Maybe the purpose of libraries such as Tesseract is to recognize photographs of printed text. But should the performance be this bad? Any tips on how to do this?
In my experience working with Tesseract, it is unable to recognize handwriting. Tesseract works with standard fonts; the most suitable is Verdana. Also, do some image filtering before passing the image to Tesseract.
The first image, with hand-written text, cannot be read by Tesseract. Furthermore, I tried another top-level, high-quality commercial OCR, and even it could not produce a good result from that image. If you absolutely need to recognize such images, use an ICR-capable program. I distribute a commercial application that can read those numbers very well with 100% accuracy, but it is premium-priced and used in small-to-medium enterprise environments.
The second image reads very well in a commercial OCR application, and I would expect Tesseract to do better than the result you are showing. Producing a higher-resolution image may help improve the result.
Ilya Evdokimov
I suggest adding a filter before processing any image and handing it to Tesseract. https://github.com/BradLarson/GPUImage is a really popular library for image-processing filters; you can use its luminance filter. By the way, you should post some code showing how you handle your image (I mean the second one, as the first one is handwriting). Besides GPUImage, you can use CIFilter, which others have also suggested, to convert to a black-and-white image.
monochromeFilter = [CIFilter filterWithName:@"CIColorMonochrome" keysAndValues:@"inputColor", [CIColor colorWithRed:1.0 green:1.0 blue:1.0 alpha:1.0f], @"inputIntensity", [NSNumber numberWithFloat:1.5f], nil];

Parsing / Scraping information from an image

I am looking for a library that would help scrape the information from the image below.
I need the current value, so it would have to recognize the values on the left and then estimate the value of the bottom line.
Any ideas if there is a library out there that could do something like this? Language isn't really important, but I guess Python would be preferable.
Thanks
I don't know of any "out of the box" solution for this and I doubt one exists. If all you have is the image, then you'll need to do some image processing. A simple binarization method (like Otsu binarization) would make it easier to process:
The binarization makes it easier because now the pixels are either "on" or "off."
The locations for the lines can be found by searching for some number of pixels that are all on horizontally (5 on in a row while iterating on the x axis?).
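That row-scan idea can be sketched in a few lines of pure Python; the image and minimum run length below are illustrative:

```python
# In a binarized image (0 = off, 1 = on), a horizontal gridline shows up
# as a row containing a long run of consecutive "on" pixels.

def find_line_rows(binary, min_run=5):
    """Return indices of rows containing >= min_run consecutive on-pixels."""
    rows = []
    for y, row in enumerate(binary):
        run = 0
        for px in row:
            run = run + 1 if px else 0
            if run >= min_run:
                rows.append(y)
                break
    return rows

# Tiny synthetic image: rows 1 and 4 are solid gridlines.
img = [
    [0, 0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
]
print(find_line_rows(img))  # [1, 4]
```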
Then a possible solution would be to pass the image to an OCR engine to get the numbers (Tesseract is an open-source OCR engine, written in C++ and hosted at Google). You'd still have to find out where the numbers are in the image by iterating through it.
Then, you'd have to find where the lines are relative to the keys on the left and do a little math and you can get your answer.
OpenCV is a beefy computer vision library that has things like the binarization. It is also a C++ library.
Hope that helps.
