OpenCV + Tesseract for text recognition on video frames in realtime - opencv

I am trying to use Tesseract on frames captured by OpenCV from the Windows screen. I am not using a camera feed; instead, I am trying to capture a certain message that appears in a message box of a certain color. I am able to crop out the part of the screen that shows the message box, and I want to use Tesseract to read the message. Tesseract reads the message correctly from a saved screenshot of the same cropped message-box image, but when I try to do the same on the real-time screen capture, it gives really bad output.
The screenshot is saved with OpenCV's imwrite() from the same Mat image that is passed to Tesseract.
Can anyone explain why this is happening?
How can I make it work?
Regards

Related

How do we fix our flickering depth image when using an Orbecc Astra Camera and Rviz?

We are trying to set up the Orbbec Astra Embedded S camera with ROS; our goal is to detect objects by reconstructing a 3D point cloud from the camera images. We are using ROS Noetic and the ROS package "astra_camera" (https://github.com/orbbec/ros_astra_camera), as well as RViz to visualize the images and the 3D point cloud.
Here are the rostopics:
/camera/color/camera_info
/camera/color/image_raw
/camera/depth/camera_info
/camera/depth/image_raw
/camera/depth/points
/camera/ir/camera_info
/camera/ir/image_raw
First Issue:
The color (/camera/color/image_raw) and IR (/camera/ir/image_raw) image streams seem to be working fine, but the big issue is the depth image stream (/camera/depth/image_raw): it flickers very fast and does not seem to detect anything.
Second Issue:
When launching the camera by running "roslaunch astra_camera astra_pro.launch" we received three warnings:
Publishing dynamic camera transforms (/tf) at 10 Hz
Camera calibration file /home/astra/.ros/camera_info/rgb_camera.yaml not found.
Camera calibration file /home/astra/.ros/camera_info/ir_camera.yaml not found.
By calibrating the color camera with a checkerboard, we were able to resolve the second warning, as calibration generated the rgb_image.yaml file containing the intrinsic parameters. We tried calibrating the IR camera as well, but the ir_camera.yaml file was not generated. We have not yet resolved the first warning.
Even though we are unsure if this is related to the issue regarding the flickering depth image stream, we believe it is worth mentioning.
We are ROS beginners and would be grateful for any feedback that could help us find a solution. If you need any further information, please let us know.
Thanks!
The following GIF shows the issue: Flickering-Issue

How can I improve Tesseract results quality?

I'm trying to read the NIRPP number (social security number) from a French vital card using Tesseract OCR (specifically TesseractOCRiOS 4.0.0). Here is what I'm doing:
First, I request a picture of the whole card :
Then, using a custom cropper, I ask the user to zoom specifically on the card number:
And then I catch this image (1291x202px) and using Tesseract I try to read the number:
let tesseract = G8Tesseract(language: "eng")
tesseract?.image = pickedImage
tesseract?.recognize()
print("\(tesseract?.recognizedText ?? "")")
But I'm getting pretty bad results: Tesseract finds the right number only about 30% of the time, and even then I sometimes need to trim stray characters (alpha characters, dots, dashes, ...).
So is there a solution for me to improve these results?
Thanks for your help.
To improve your results:
Zoom your image to an appropriate level. The right amount of zoom will improve your accuracy by a lot.
Configure Tesseract so that only digits are whitelisted (I am assuming that what you are trying to extract contains only digits). Whitelisting digits improves your chances of recognizing 0 as the digit zero rather than the letter O.
If your extracted text should match a regex, configure Tesseract to use that regex as well.
Pre-process your image to remove any background color, and apply morphology operations such as erosion to increase the space between your characters/digits. If they are too close together, Tesseract will have a hard time recognizing them correctly. Most image-processing libraries come with these operations built in.
Use TIFF as the image format.
Once you have the right preprocessing pipeline and Tesseract configuration, you will usually get a very good and consistent result.
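As a sketch of the whitelist and morphology points above: the digit whitelist is the standard Tesseract `tessedit_char_whitelist` variable (passed e.g. via a config string), and erosion can be illustrated without any imaging library. The nested-list "image" below is only for illustration; in a real pipeline you would use cv2.erode on an OpenCV Mat.

```python
# Assumed: standard Tesseract flags for page-segmentation mode and a
# character whitelist; pass this string to your Tesseract wrapper
# (e.g. pytesseract's `config=` parameter).
TESS_CONFIG = "--psm 7 -c tessedit_char_whitelist=0123456789"

def erode(img):
    """3x3 erosion on a binary image (nested lists of 0/255): a pixel
    stays white only if all 8 neighbours are white, which thins strokes
    and widens the gaps between characters. In practice use cv2.erode."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(img[y + dy][x + dx] == 255
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 255
    return out
```

`--psm 7` tells Tesseract to treat the image as a single text line, which fits a cropped card-number strip.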
There are a couple of things you need to do:
1. Convert the image to black-and-white or grayscale. You can use built-in functionality (e.g. the Core Graphics framework) or a third-party library like OpenCV or GPUImage for the black-and-white or grayscale conversion.
2. Run text detection using the Vision framework. From the detected text coordinates, you can crop out the text regions.
3. Pass these cropped (text-detected) images to TesseractOCRiOS.
I hope this works for your use case.
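As a library-free illustration of step 1, grayscale conversion is just a per-pixel luminance weighting, and black-and-white is a threshold on top of it (sketch in Python; in an iOS app you would use Core Graphics, OpenCV, or GPUImage instead):

```python
# Sketch of grayscale + black/white conversion. An "image" here is a
# nested list of (r, g, b) tuples, purely for illustration.

def to_grayscale(img):
    """ITU-R BT.601 luma: 0.299*R + 0.587*G + 0.114*B per pixel."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in img]

def to_black_and_white(gray, thresh=128):
    """Binarize: white (255) at or above the threshold, black (0) below."""
    return [[255 if px >= thresh else 0 for px in row] for row in gray]
```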
Thanks
I had a similar issue. I discovered that Tesseract recognizes text reliably only if the given image contains a tight region of interest.
I solved the problem using Apple's Vision framework. It has VNDetectTextRectanglesRequest, which returns the CGRect of each detected text region in the image. You can then crop the image to the regions where text is present and send them to Tesseract for recognition.
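The crop step in this approach can be sketched independently of Vision: once you have a bounding rectangle for the detected text, cropping is just row/column slicing. A minimal Python sketch (the (x, y, w, h) rect here is a stand-in for the CGRect that VNDetectTextRectanglesRequest returns; Vision's rects are normalized, so in practice you would first scale them by the image width/height):

```python
# Sketch of the crop step: slice the detected-text region out of the
# image before handing it to Tesseract. The image is a nested list of
# pixel rows; rect is (x, y, w, h) in pixel coordinates.

def crop(img, rect):
    x, y, w, h = rect
    return [row[x:x + w] for row in img[y:y + h]]
```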
Ray Smith says:
Since HP had independently-developed page layout analysis technology that was used in products, (and therefore not released for open-source) Tesseract never needed its own page layout analysis. Tesseract therefore assumes that its input is a binary image with optional polygonal text regions defined.

Draw overlay on iOS Camera and Save image with Overlay

I'm trying to implement Tesseract OCR as a car engine VIN number reader. I have partially done it with the help of the Tesseract OCR SDK. Now I'm trying to implement a camera overlay, something like this:
I would like to do any of these three:
crop the image automatically inside the rectangle
capture the image only inside the rectangle
save the image along with the rectangle overlay
I'm sorry, I'm new to iOS development. Kindly help me. Thank you.

Extract image within a tube using opencv or others

I'm new to image processing and I'm trying to extract the area within a test tube.
The tube will be placed in front of a camera and then an image is captured. The image obtained will be as shown below. However, there will be slight movements of the tube, so it's not in exactly the same position each time. The idea is to recognize the tube's position and extract the image corresponding to the inside of the test tube (which holds some reaction). Can somebody please guide me on how to achieve this using OpenCV or another image-processing library? Thanks a lot!

AV Foundation Capture Image

My basic task is to capture a part of an image and then use it. To give you an overview: I am creating an OCR-based app that lets the user take a picture with the camera. However, rather than processing the entire image, I only want a selected part of it (preferably a rectangle) to be sent for processing. To sum up, I want an overlay to be shown, and I want the image inside that overlay, rather than the entire captured image, to be used further.
From my understanding, AVFoundation is the tool for capturing the image; however, in my app I have used UIImagePicker. I am totally confused since I am a newbie and not sure how to proceed. I appreciate all the help. Thanks again.
There is a fine open-source library for OCR on iOS:
https://github.com/nolanbrown/Tesseract-iPhone-Demo
This works best if the image resolution is 150 * 150. For more information on other libraries, you can also refer to this SO question: OCR Lib