Tesseract for License Plate (especially Korean version) - opencv

I'm working on my project for License Plate Recognition using OpenCV & Tesseract.
I use OpenCV to preprocess the original image so that Tesseract can read it more reliably.
For example:
Original Image
Processed Image
But the result is "38다9502"; it read a 3 as a 5.
This happens frequently, especially when the digit is 3 or 5.
Are there any suggestions or solutions for this?

You can try retraining tesseract with some of your own data. It looks like a good candidate for simply fine-tuning the model. You may not even need much data, just give it several examples of the digits it is having trouble with.
Instructions for retraining are here: https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00

1) First, try a few image processing techniques, such as those described here: https://cvisiondemy.com/license-plate-detection-with-opencv-and-python/
2) If that shows no improvement, try image thresholding, which is covered here: https://docs.opencv.org/master/d7/d4d/tutorial_py_thresholding.html
3) If the steps above don't work, try enlarging the image.

I solved this by using multiple models supported by Tesseract.
With the Hangul model, only the Hangul character was recognized accurately, not the numbers.
With the English model, however, the numbers were recognized accurately.
So I used these models in parallel, which resulted in 99% LPR accuracy.
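One possible way to combine the two outputs is sketched below. This is only an illustration, not necessarily how the answer above did it: it assumes you already have one string from a run with the Korean model (for example `-l kor`) and one from a run with the English model (`-l eng`), and that both strings have the same length so characters line up by position. The function names and the sample strings are made up.

```python
def is_hangul_syllable(ch: str) -> bool:
    """True if ch is a precomposed Hangul syllable (U+AC00..U+D7A3)."""
    return "\uac00" <= ch <= "\ud7a3"


def merge_plate(kor_text: str, eng_text: str) -> str:
    """Keep Hangul from the Korean-model output and digits from the English one.

    Assumption: both strings describe the same plate and have the same
    length once whitespace is removed, so characters line up by position.
    """
    kor = "".join(kor_text.split())
    eng = "".join(eng_text.split())
    if len(kor) != len(eng):
        raise ValueError("outputs do not align; cannot merge by position")

    merged = []
    for k, e in zip(kor, eng):
        if is_hangul_syllable(k):
            merged.append(k)  # trust the Korean model for Hangul
        elif e.isdigit():
            merged.append(e)  # trust the English model for digits
        else:
            merged.append(k)  # fall back to the Korean output
    return "".join(merged)


# Invented outputs for illustration only.
print(merge_plate("38다9502", "38E9302"))
```

If the two outputs differ in length, a position-based merge like this cannot be trusted, which is why the sketch raises an error instead of guessing.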

Related

How to recognize or match two images?

I have one image stored in my bundle or in the application.
Now I want to scan images with the camera and compare them with my locally stored image. When the image matches I want to play a video, and if the user moves the camera away from that image I want to stop the video.
For that I tried the Wikitude SDK for iOS, but it does not work properly: it crashes at random times because of memory issues or some other reason.
I also considered Core ML and ARKit. But Core ML detects an image's properties such as name, type, colors etc., whereas I want to match the image. ARKit does not support all devices and iOS versions, and I have no idea whether image matching as I require it is even possible with it.
If anybody has an idea how to achieve this, please share. Any help will be appreciated. Thanks :)
The easiest way is ARKit's image detection. You already know the limits on which devices it supports, but the results it gives are broad and it is really easy to implement. Here is an example
Next is Core ML, which is the hardest way. You need to understand machine learning, at least briefly. Then comes the tough part: training with your dataset. The biggest drawback is that you have a single image. I would discard this method.
Finally, the middle-ground solution is to use OpenCV. It might be hard, but it suits your need. You can find different feature-matching methods to locate your image in the camera feed. example here. You can use Objective-C++ to write C++ code for iOS.
Your task is image similarity, which you can do simply and with more reliable results using machine learning. Since your task involves camera scanning, the better option is Core ML. You can refer to this link by Apple for image similarity. You can improve your results by training with your own datasets. If you need any more clarification, leave a comment.
Another approach is to use a so-called "siamese network". What this really means is that you run a model such as Inception-v3 or MobileNet on both images and compare the outputs.
However, these models usually give a classification output, i.e. "this is a cat". But if you remove that classification layer from the model, it gives an output that is just a bunch of numbers that describe what sort of things are in the image but in a very abstract sense.
If these numbers for two images are very similar -- if the "distance" between them is very small -- then the two images are very similar too.
So you can take an existing Core ML model, remove the classification layer, run it twice (once on each image), which gives you two sets of numbers, and then compute the distance between these numbers. If this distance is lower than some kind of threshold, then the images are similar enough.
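The final comparison step could look roughly like the sketch below. It assumes you already have the two feature vectors as plain lists of floats (getting them out of the model is not shown), uses Euclidean distance as one possible choice of "distance", and the threshold of 0.5 and the vectors are invented; a real threshold would have to be tuned on your own images.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equally long feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def images_match(features_a, features_b, threshold):
    """Treat two images as similar when their features are closer than threshold."""
    return euclidean_distance(features_a, features_b) < threshold


# Made-up feature vectors; real ones have hundreds or thousands of entries.
reference = [0.12, 0.80, 0.05, 0.33]
same_scene = [0.10, 0.78, 0.07, 0.35]
other_scene = [0.90, 0.10, 0.60, 0.02]

print(images_match(reference, same_scene, threshold=0.5))   # True
print(images_match(reference, other_scene, threshold=0.5))  # False
```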

How to do segmentation based on some filters(e.g. TRAFFIC SIGNALS) from live streaming data

I am supposed to recognize traffic symbols in live streaming data. Please tell me how to automate the segmentation step. I can recognize the symbols from segmented data using neural networks, but I am stuck on the segmentation part.
I have tried using YOLO, but I think I am missing something.
I have also tried it with OpenCV.
Please help.
INPUT IMAGE FRAME FROM LIVE STREAM
OUTPUT
I would suggest you follow this link:
https://github.com/AlexeyAB/darknet/tree/47c7af1cea5bbdedf1184963355e6418cb8b1b4f#how-to-train-pascal-voc-data
It's very simple to follow. You basically need to do two steps: install darknet, and create the data you want to use (road signs in your case).
So follow the installation guide and then find a dataset of road signs, or create your own. You will need the annotation files as well (you can generate them yourself easily if you use your own dataset; this is explained in the link as well). You don't need a huge number of pictures, because darknet will augment the images automatically (just resizing, though). If you use a pretrained version you should get "ok" results pretty fast, after roughly 500 iterations.
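If you generate the annotation files yourself, a small helper along these lines may be useful. It is a sketch that assumes the darknet text format of one line per object, `<class> <x_center> <y_center> <width> <height>`, with all four values relative to the image size; please check it against the README linked above. The function name and the sample box are made up.

```python
def to_yolo_line(class_id, box, image_size):
    """Format one bounding box as a darknet-style annotation line.

    box is (x_min, y_min, x_max, y_max) in pixels,
    image_size is (width, height) in pixels.
    """
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical road sign at pixels (100, 50)-(300, 250) in a 640x480 frame.
print(to_yolo_line(0, (100, 50, 300, 250), (640, 480)))
# prints: 0 0.312500 0.312500 0.312500 0.416667
```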

Face landmark extraction in OpenCV 3.0. Can anyone suggest any good open source libraries that will allow me to extract facial landmarks?

I am currently using OpenCV 3.0 in the hope that I will be able to create a program that does three things. First, it finds faces within a live video feed. Second, it extracts the locations of facial landmarks using ASM or AAM. Finally, it uses an SVM to classify the facial expression on the person's face in the video.
I have done a fair amount of research into this but can't find the most suitable open source AAM or ASM library for this. Also, if possible, I would like to be able to train the AAM or ASM to extract the specific face landmarks I require. For example, all the numbered points in the picture linked below:
www.imgur.com/XnbCZXf
If there are any alternatives to what I have suggested that provide the required functionality, feel free to suggest them.
Thanks in advance for any answers, all advice is welcome to help me along with this project.
In the comments, I see that you are opting to train your own face landmark detector using the dlib library. You had a few questions regarding what training set dlib used to generate their provided "shape_predictor_68_face_landmarks.dat" model.
Some pointers:
The author (Davis King) stated that he used the annotated images from the iBUG 300-W dataset. This dataset has a total of 11,167 images annotated with the 68-point convention. As a standard trick, he also mirrors each image to effectively double the training set size, i.e. 11,167 × 2 = 22,334 images. Here's a link to the dataset: http://ibug.doc.ic.ac.uk/resources/facial-point-annotations/
Note: the iBUG 300-W dataset includes two datasets that are not freely/publicly available: XM2VTS and FRGCv2. Unfortunately, these images make up the majority of iBUG 300-W (7,310 images, or 65.5%).
The original paper only trained on the HELEN, AFW, and LFPW datasets. So you ought to be able to generate a reasonably good model from only the publicly available images (HELEN, LFPW, AFW, IBUG), i.e. 3,857 images.
If you Google "one millisecond face alignment kazemi", the paper (and project page) will be the top hits.
You can read more about the details of the training procedure by reading the comments section of this dlib blog post. In particular, he briefly discusses the parameters he chose for training: http://blog.dlib.net/2014/08/real-time-face-pose-estimation.html
With the size of the training set in mind (thousands of images), I don't think you will get acceptable results with just a handful of images. Fortunately, there are many publicly available face datasets out there, including the dataset linked above :)
Hope that helps!
AAM and ASM are pretty old school, and their results are a little disappointing.
Most facial landmark trackers use a cascade of patches or deep learning. dlib performs pretty well (and is under the Boost Software License) with this demo, and there are others on GitHub, or a number of APIs such as this one that are free to use.
You can also have a look at my project, which uses C++/OpenCV/dlib, has all the functionality you mentioned, and is fully operational.
Try Stasm 4.0.0. It gives approximately 77 points on the face.
I advise you to use the FaceTracker library. It is written in C++ using OpenCV 2.x. You won't be disappointed with it.

Image pre-processing in OCR

Our project is all about OCR and, based on my research, the image goes through a pre-processing stage before character recognition. I know we can use OpenCV for that, but our rules don't allow it.
My question is: can someone tell me the pre-processing steps and the best method/algorithm to use for each?
From what I know:
1. YUV luminance
2. Greyscale
3. Otsu thresholding
4. Binarization
5. Hough transform
Original image > YUV luminance > greyscale; what's next?
Thanks!
In some of my older blog posts, I addressed some parts of your questions:
Binarization on various image qualities from mobile cameras:
http://www.ocr-it.com/guide-to-better-mobile-images-from-cell-phone-camera-for-higher-quality-ocr
Image pre-processing and segmentation for better OCR:
http://www.ocr-it.com/user-scenario-process-digital-camera-pictures-and-ocr-to-extract-specific-numbers
In reality, there is no step-by-step procedure, in my experience. You could use the original image for OCR if you wanted to, which means no pre-processing is necessary. Yes, pre-processing will help, but it depends on the source and type of your images (which you did not specify). For example, a typical office document scanned on a professional scanner with Kofax VRS requires no pre-processing before OCR. A mobile camera image requires a lot of pre-processing. A picture from a parking garage camera will also require a lot of pre-processing, but with different steps and algorithms than a mobile camera picture.
I would decide what the major limiting factor in your images is, pre-process against it, then look for the next correctable issue.
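Since OpenCV is ruled out in the question, here is a rough pure-Python sketch of the greyscale and Otsu steps from the question's list. It assumes the image is already available as rows of (r, g, b) tuples with values 0-255 (how you load it is up to you), it uses the common BT.601 luminance weights for the Y channel, and all function names are made up. Note that Otsu thresholding already produces a binary image, so it covers the "Binarization" step as well.

```python
def to_gray(rgb_pixels):
    """Convert rows of (r, g, b) tuples to 0-255 luminance (BT.601 weights)."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb_pixels]


def otsu_threshold(gray):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form the dark class, the rest the light class.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue  # no dark pixels yet
        w_light = total - w_dark
        if w_light == 0:
            break  # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray, t):
    """Map pixels above t to 255 (white) and the rest to 0 (black)."""
    return [[255 if v > t else 0 for v in row] for row in gray]


# Tiny made-up greyscale image: dark text pixels and a bright background.
gray = [[10, 12, 200], [11, 205, 210]]
t = otsu_threshold(gray)
print(t, binarize(gray, t))
```

Whether this is enough depends on your images, as the answer above says; uneven lighting, for example, usually needs a local rather than a global threshold.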

OpenCV detect numbers

I'm using OpenCV on the iPhone and need to detect numbers in an image. I split the image into smaller images so each image has only one number (1-9). All numbers are printed, NOT handwritten.
What would be the best approach to figure out the numbers with OpenCV?
UPDATE:
I have successfully found the numbers and extracted them. They look like this:
http://img198.imageshack.us/img198/5671/101ht.jpg
http://img824.imageshack.us/img824/539/606yu.jpg
Once extracted, they all have the same size and so on. I have saved a bunch of images and put them in an OCR directory where they are categorized by number, like: ocr/1/100.jpg 101.jpg.... and ocr/2/200.jpg 201.jpg....
Then I was going to use the same approach as in the Basic OCR tutorial: http://blog.damiles.com/?p=93
However, I'm programming for iPhone and can't use C++ code (compile errors and so on), and I don't have access to highgui.
I tried using cvMatchTemplate() to match a bunch of images, but it seems to work pretty badly...
Any other ideas I can try?
You could start by reading about Principal Component Analysis (PCA), Fisher's Linear Discriminant Analysis (LDA), and Support Vector Machines (SVMs). These are classification methods that are extremely useful for OCR, and libraries exist for most languages, including C++, Python, C# etc.
It turns out that OpenCV already includes excellent implementations of PCA and SVMs[dead link]. I haven't seen any OpenCV code examples for OCR, but you can use a modified version of face classification to perform character classification. An excellent resource for face recognition code for OpenCV is this website[dead link].
If the numbers are printed, the job is quite simple; you just need to figure out a good set of features to match. If the numbers are all in one font, you can get away with this approach:
1. Extract the number
2. Find the bounding box
3. Scale the image down to something like 10x8, trying to match the aspect ratio
4. Do this for a small training set and take the 'average' image for each number
5. For new images, follow the steps above, but make the last step an absolute image difference with each of the number templates. Then take the sum of the differences (the pixels in the difference image). The template with the minimum sum is your number.
All above are basic OpenCV operations.
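The averaging and matching steps above can be sketched in plain Python as follows. This is only an illustration: it assumes the digits have already been extracted, cropped and scaled to a common size, represents each image as a list of rows of grey values, and uses tiny made-up 3x3 images instead of the 10x8 size mentioned above. In a real project you would use the corresponding OpenCV calls.

```python
def average_template(samples):
    """Pixel-wise mean of equally sized greyscale images (lists of rows)."""
    n = len(samples)
    rows, cols = len(samples[0]), len(samples[0][0])
    return [[sum(s[r][c] for s in samples) / n for c in range(cols)]
            for r in range(rows)]


def difference_score(image, template):
    """Sum of absolute pixel differences; lower means more similar."""
    return sum(abs(p - t)
               for img_row, tpl_row in zip(image, template)
               for p, t in zip(img_row, tpl_row))


def classify(image, templates):
    """Return the label whose template has the smallest difference score."""
    return min(templates,
               key=lambda label: difference_score(image, templates[label]))


# Made-up training samples for two digits.
one_a = [[0, 255, 0], [0, 255, 0], [0, 255, 0]]
one_b = [[0, 255, 0], [0, 255, 0], [0, 200, 0]]
seven_a = [[255, 255, 255], [0, 0, 255], [0, 0, 255]]
seven_b = [[255, 255, 255], [0, 0, 255], [0, 255, 0]]

templates = {
    "1": average_template([one_a, one_b]),
    "7": average_template([seven_a, seven_b]),
}

query = [[0, 255, 0], [0, 230, 0], [0, 255, 0]]
print(classify(query, templates))  # 1
```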
Basically, your problem is just to classify a feature vector, which is the set of pixel intensities after some preprocessing steps. You can use any classifier for this task, e.g. neural networks, which should have a C implementation inside OpenCV. You might also try libsvm, a C library for Support Vector Machines.
There is a good site related to this problem with a lot of papers and a training database.
Maybe the simplest and most convenient way is to use an SVM as the ML algorithm
http://opencv.willowgarage.com/documentation/cpp/support_vector_machines.html
with gray images as feature vectors.
Objective C++?
Try renaming your .m files to .mm; you can then use C++ in your iPhone project.
Convolutional Neural Networks are by far the best algorithms for handwritten digits. They are implemented in most systems, such as at USPS. Here are a few papers explaining the algorithms.
http://yann.lecun.com/exdb/lenet/
This is a nice open source project: an OCR demo on iPhone. I hope it is useful to you.
Simple Digit Recognition OCR in OpenCV-Python
This might help you out. Converting the code from Python to C++ is not difficult, since the OpenCV APIs are largely the same for both.
Tesseract is also a nice free OCR engine that is readily available for iPhone and allows you to use your own sets of training images:
http://tinsuke.wordpress.com/2011/11/01/how-to-compile-and-use-tesseract-3-01-on-ios-sdk-5/
HOG + SVM (Try to play with kernels)
