I am working on a test application to develop a small text detection and recognition app in Python using Google Colab. Can you suggest any code examples to achieve this? My requirement is that I should be able to detect and recognize text in an image using OpenCV.
Please advise.
If you want to work with OpenCV only, you need to build a pipeline with the following steps:
Pre-processing - use OpenCV morphological operations.
Text detection - use the CRAFT model or find contours in your image.
Recognition - use Tesseract-OCR.
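A rough sketch of that pipeline, assuming opencv-python and pytesseract are installed (in Colab you would also run `!apt install tesseract-ocr`); the filename and kernel size are placeholders to tune for your images:

```python
import cv2
import pytesseract

image = cv2.imread("sample.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Pre-processing: binarize, then dilate so the characters of a word merge into one blob.
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 3))
dilated = cv2.dilate(thresh, kernel, iterations=1)

# Detection: each external contour of the dilated image is a candidate text region.
contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Recognition: run Tesseract on each cropped region.
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    roi = gray[y:y + h, x:x + w]
    text = pytesseract.image_to_string(roi, config="--psm 7")  # treat the ROI as a single text line
    print((x, y, w, h), text.strip())
```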
In my personal experience, EasyOCR is also very good, with good accuracy; it is easy to use and lets you train on your own text as well.
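For comparison, a minimal EasyOCR sketch (`pip install easyocr`; the model weights are downloaded on first run, and the filename is a placeholder):

```python
import easyocr

reader = easyocr.Reader(["en"])          # detection + recognition in one object
results = reader.readtext("sample.png")  # list of (box, text, confidence) tuples
for box, text, conf in results:
    print(text, conf)
```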
I'm working on an object detection application using a camera and a sensor. I want to extract the features of a custom image using YOLOv4 before the detection step and save them to a text file for further clustering analysis with the sensor data. Would it be possible to extract these features before detection? If so, please suggest the steps.
Thanks in Advance
I have been trying an OpenCV iOS sample to achieve facial emotion recognition.
I got the OpenCV sample iOS project 'openCViOSFaceTrackingTutorial' from the link below.
https://github.com/egeorgiou/openCViOSFaceTrackingTutorial/tree/master/openCViOSFaceTrackingTutorial
This sample project uses face detection and it works fine. It uses the 'haarcascade_frontalface_alt2.xml' trained model.
I want to use the same project but with Haar cascades for other facial emotions such as Sad and Surprise. I have been searching for how to train a Haar cascade for emotions like Sad, Surprise etc., but couldn't find any clue.
Could someone advise me how to train a Haar cascade for emotions like Sad, Surprise etc. to use in this sample OpenCV iOS project? Or are there ready-made Haar cascades for such emotions that I could use with this iOS sample?
Thanks
You can read a tutorial on how to generate a Haar cascade file, but generating a cascade for emotions is not an easy task.
I would suggest extracting the mouth and eyes from the face using Haar cascades and processing those rectangles to detect emotions. You can get Haar cascades for the mouth and eyes from here. The mouth and eyes are complicated features, so detection will not work well if you search the whole image; first find the frontal face and then try to detect the mouth and eyes within the face rectangle only.
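Here is a Python sketch of that "face first, then eyes/mouth inside the face rectangle" idea; the same CascadeClassifier API exists in the C++/Objective-C build used by the iOS sample. The eye and smile cascades below ship with opencv-python; the smile cascade stands in for a dedicated mouth cascade, which you would load the same way from its own XML file, and "frame.jpg" is a placeholder:

```python
import cv2

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_alt2.xml")
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")
mouth_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_smile.xml")

gray = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2GRAY)

for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
    face_roi = gray[y:y + h, x:x + w]
    # Search for the eyes in the upper half of the face and the mouth in the lower half.
    eyes = eye_cascade.detectMultiScale(face_roi[: h // 2], 1.1, 5)
    mouths = mouth_cascade.detectMultiScale(face_roi[h // 2:], 1.5, 15)
    # These rectangles are what you would feed into an emotion classifier.
    print("face:", (x, y, w, h), "eyes:", len(eyes), "mouth candidates:", len(mouths))
```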
There are open source libraries available on GitHub for emotion detection; although they are not for iOS, you can use a similar algorithm to implement it on iOS.
I am very new to OpenCV and have managed to install it so far. I want to compare a face with other faces available in a library and find the closest match. I have tried different features but couldn't find a close match.
Any suggestion on choosing a detector?
The dimensions of the input image and the images in the library are the same.
thanks in advance
I think you want face recognition (who is it?), not detection (is it a face?).
Look here for what OpenCV has to offer in that area.
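As a starting point, a minimal sketch using the LBPH recognizer; it needs opencv-contrib-python for the cv2.face module, and the image paths and labels here are only placeholders:

```python
import cv2
import numpy as np

# Training set: grayscale face images with an integer label per person.
train_images = [cv2.imread(p, cv2.IMREAD_GRAYSCALE) for p in ["alice1.png", "alice2.png", "bob1.png"]]
train_labels = np.array([0, 0, 1], dtype=np.int32)

recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.train(train_images, train_labels)

# Query: returns the predicted label and a distance-like confidence (lower = closer match).
query = cv2.imread("unknown.png", cv2.IMREAD_GRAYSCALE)
label, confidence = recognizer.predict(query)
print("closest match:", label, "confidence:", confidence)
```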
My current project involves transcribing text from PDFs into text files. I first tried putting the image file directly into an OCR program (Tesseract), and it didn't do that well.
The original image files are basically old newspapers and have some background noise, which I am sure Tesseract has problems with. So I am trying to apply some image preprocessing before feeding them into Tesseract. Is there any suggestion for an open source image preprocessing engine that fits this situation well? Instructions on how to use it would be even more appreciated!
I have never heard of an "image preprocessing engine" for that purpose, but you can take a look at OpenCV (Open Source Computer Vision Library) and implement your own "pre-processing engine". OpenCV is a computer vision library that offers many features for image processing.
One interesting thing you might want to test as a preprocessing step is applying a threshold to the image to remove noise. Anyway, I've talked about this kind of thing in this thread.
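For example, a minimal thresholding step of the kind mentioned above (Otsu picks the cutoff automatically; the filename is a placeholder):

```python
import cv2

gray = cv2.imread("newspaper_page.png", cv2.IMREAD_GRAYSCALE)
_, binarized = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("newspaper_page_bin.png", binarized)  # feed this cleaned image into Tesseract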
As karlphillip mentioned, I highly doubt there's a readily available preprocessing engine for your purposes, as preprocessing techniques vary greatly with the desired result.
Some common approaches to clearing up the text in noisy images include:
1. Adaptive thresholding (Sauvola or Niblack binarization)
2. Applying a median filter of a size slightly larger than the text to get a background image, then subtracting the background from the original image (to remove larger noise like creases, stains, handwritten notes, etc.).
OpenCV has implementations of these filters/binarization methods (a small sketch of both steps follows below). If you have access to published literature, there's quite a bit of work on binarization of noisy documents.
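A sketch of the two steps above. Plain adaptiveThreshold is used here as a stand-in for Sauvola/Niblack (those implementations live in scikit-image and opencv-contrib's ximgproc); the filename and kernel sizes are assumptions to tune per document:

```python
import cv2

gray = cv2.imread("newspaper_page.png", cv2.IMREAD_GRAYSCALE)

# Estimate the background with a median filter larger than the text strokes,
# then subtract it to flatten stains, creases and uneven lighting.
background = cv2.medianBlur(gray, 21)
flattened = cv2.subtract(background, gray)   # text becomes bright on a dark field
flattened = cv2.bitwise_not(flattened)       # back to dark text on a light background

# Locally adaptive binarization on the flattened image.
binarized = cv2.adaptiveThreshold(flattened, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                  cv2.THRESH_BINARY, 31, 15)
cv2.imwrite("cleaned.png", binarized)
```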
Check out ScanTailor. It has pretty impressive pre-processing functionality and it is open source.
I'm attempting to implement an easter egg in a mobile app I'm working on. The easter egg will be triggered when a logo is detected in the camera view. The logo I'm trying to detect is this one: .
I'm not quite sure what the best way to approach this is, as I'm pretty new to computer vision. I'm currently finding horizontal edges using the Canny algorithm and then finding line segments using the probabilistic Hough transform. The output looks as follows (blue lines represent the line segments detected by the probabilistic Hough transform):
The next step I was going to take would be to look for a group of around 24 lines (fitting within a nearly square rectangle), each of approximately the same length. I'd use these two signals to indicate the potential presence of the logo. I realise this is probably a very naive approach and would welcome suggestions on how to detect this logo more reliably.
Thanks
You may want to go with SIFT using Rob Hess' SIFT library. It uses OpenCV and is also pretty fast. I guess that's easier than your current approach to logo detection :)
Also try looking at SURF, which claims to be faster and more robust than SIFT. This feature detection tutorial will help you.
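To give an idea of the approach, a sketch of SIFT keypoint matching between the logo template and a camera frame (cv2.SIFT_create is available in opencv-python 4.4+ now that the patent has expired; the filenames and the match-count threshold are placeholders):

```python
import cv2

logo = cv2.imread("logo.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(logo, None)
kp2, des2 = sift.detectAndCompute(frame, None)

# Lowe's ratio test keeps only distinctive matches; "enough" good matches
# suggests the logo is present in the frame.
matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print("good matches:", len(good), "-> logo likely present" if len(good) > 20 else "")
```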
You may just want to use LogoGrab's technology. It's the best out there and offers all sorts of APIs (both mobile and HTTP). http://www.logograb.com/technologyteam/
I'm not quite sure you would find enough distinctive features in the logo for a SIFT/SURF approach. As an alternative, you can try training a Haar-like feature classifier and use it for detecting the logo, just as OpenCV does for face detection.
You could also try TensorFlow's object detection API here:
https://github.com/tensorflow/models/tree/master/research/object_detection
The good thing about this API is that it contains state-of-the-art models for object detection and classification. The models that TensorFlow provides are free to train, and some of them promise quite astonishing results. I have already trained a model for the company I am working for that does quite an amazing job at logo detection in images and video streams. You can check out more about my work here:
https://github.com/kochlisGit/LogoLens
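For reference, running inference with a model exported from the Object Detection API usually looks like the sketch below; the model path, image file and score threshold are assumptions for illustration:

```python
import numpy as np
import tensorflow as tf
import cv2

# Load a SavedModel exported with the Object Detection API's exporter.
detect_fn = tf.saved_model.load("exported_model/saved_model")

image = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2RGB)
input_tensor = tf.convert_to_tensor(image[np.newaxis, ...], dtype=tf.uint8)

detections = detect_fn(input_tensor)
boxes = detections["detection_boxes"][0].numpy()    # normalized [ymin, xmin, ymax, xmax]
scores = detections["detection_scores"][0].numpy()
classes = detections["detection_classes"][0].numpy().astype(int)

# Keep only confident detections, e.g. the trained logo class.
for box, score, cls in zip(boxes, scores, classes):
    if score > 0.5:
        print("class", cls, "score", round(float(score), 2), "box", box)
```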