I'm trying to build a credit card OCR engine like Card.io, but it is very hard to preprocess the card into a binary image without noise. I use the SWT algorithm, but it does not work well for all cards. There is a wide variety of credit cards, many with low-contrast backgrounds and embossed numbers, so it is very hard to design a common preprocessing algorithm that prepares every card well for OCR. Does anyone have experience with this kind of card processing? The image below is an example of a card which is hard for me to preprocess.
The thing is that credit cards are deliberately hard to read (and therefore to process). The best I could do was convert them to greyscale, apply a horizontal Sobel filter, and then play with some more transformations. Even then it was difficult to extract a binary image containing only the numbers... and the outcome definitely differs from image to image...
Good luck!
I know I am late, but I found a solution in Adrian Rosebrock's blog post over here:
https://www.pyimagesearch.com/2017/07/17/credit-card-ocr-with-opencv-and-python/
The technique is to perform template matching with OpenCV:
first filter the credit card image using grayscale conversion, morphological operations and thresholding,
then find the contours of interest, i.e. contours with a plausible width:height ratio.
Once we have the contours of interest (the contours around groups of digits) we can find and extract the individual digits, and then compare these digits with the digits in our template image.
In this way we can OCR a credit card.
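A minimal sketch of that kind of pipeline with opencv-python, assuming a roughly frontal photo and OpenCV 4.x; the file name, kernel size and aspect-ratio limits below are illustrative guesses, not values taken from the blog post:

    import cv2
    import numpy as np

    # Load the card photo (file name is an assumption) and go to grayscale.
    image = cv2.imread("card.jpg")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Top-hat emphasizes light digits against the darker card background.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 3))
    tophat = cv2.morphologyEx(gray, cv2.MORPH_TOPHAT, kernel)

    # Horizontal gradient, then closing, merges each digit group into one blob.
    grad = np.absolute(cv2.Sobel(tophat, cv2.CV_32F, 1, 0, ksize=3))
    grad = (255 * (grad - grad.min()) / (grad.max() - grad.min() + 1e-6)).astype("uint8")
    grad = cv2.morphologyEx(grad, cv2.MORPH_CLOSE, kernel)
    thresh = cv2.threshold(grad, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

    # Keep contours whose width:height ratio looks like a 4-digit group.
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    groups = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if 2.5 < w / float(h) < 4.5:      # rough ratio of a digit group (assumption)
            groups.append((x, y, w, h))

    # Each group is then split into single digits and matched against a digit
    # template image with cv2.matchTemplate, as described in the blog post.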
I want to perform credit card OCR using opencv-python (see the sample credit card image). How can this be done?
This is what comes to the top of my head (a very high-level algorithm):
Detect the outline using blob detection (probably verify that you have two sets of parallel lines)
With the two sets of parallel lines, you know how the characters are aligned. Rotate the entire image by that angle so the text is horizontal (which makes your work simpler).
Perform automatic binary thresholding (e.g. Otsu).
Usually, if you know approximately where to look for information (the coordinates relative to the card border), you can use any OCR algorithm once you have segmented (binarized) the text.
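If OpenCV is available, the deskew-then-binarize part of that outline might look roughly like this; the rotation angle is assumed to come from the parallel-line detection above, and the file name is a placeholder:

    import cv2

    gray = cv2.cvtColor(cv2.imread("card.jpg"), cv2.COLOR_BGR2GRAY)

    # Angle (in degrees) of the detected card edges; assumed to come from
    # the blob / parallel-line detection in the earlier steps.
    angle = 3.0

    h, w = gray.shape
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    rotated = cv2.warpAffine(gray, M, (w, h), flags=cv2.INTER_CUBIC,
                             borderMode=cv2.BORDER_REPLICATE)

    # Otsu picks the global threshold automatically from the histogram.
    binary = cv2.threshold(rotated, 0, 255,
                           cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]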
My first hurdle so far is that running vanilla Tesseract on images of MTG cards doesn't recognize the card title (honestly that's all I need, because I can use that text to pull the rest of the card info from a database). I think the issue might be that I need to train Tesseract to recognize the font used on MTG cards, but I'm wondering whether it might instead be an issue with Tesseract not looking at, or not detecting, text in one section of the image (specifically the title).
Edit: including an image of an MTG card for reference: http://gatherer.wizards.com/Handlers/Image.ashx?multiverseid=175263&type=card
OK, so after asking on the Reddit programming forums I think I found an answer that I am going to pursue:
The training feature of Tesseract is indeed for improving recognition rates on unusual fonts, but that's probably not the reason you have low success.
The environment the text sits in is not well controlled: the card background can be a texture in one of five colours, plus artifacts and lands. Tesseract greyscales the image before processing, so the contrast between the text and the background is not sufficient.
You could put your cards through a preprocessor which mutes coloured areas to white and enhances monotones. That should increase the contrast so Tesseract can make out the characters.
If anyone still following this believes the above path is the wrong one to start down, please say so.
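For the "mute coloured areas" preprocessor suggested above, one possible sketch is to blank out every strongly saturated pixel and keep only near-grey pixels; the saturation threshold and file names are assumptions to tune:

    import cv2

    card = cv2.imread("mtg_card.jpg")
    hsv = cv2.cvtColor(card, cv2.COLOR_BGR2HSV)
    gray = cv2.cvtColor(card, cv2.COLOR_BGR2GRAY)

    # Strongly saturated pixels belong to the coloured frame and artwork,
    # not to the black title text; push them to white (threshold is a guess).
    cleaned = gray.copy()
    cleaned[hsv[:, :, 1] > 60] = 255

    cv2.imwrite("cleaned.png", cleaned)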
TLDR
I believe you're on the right track with the preprocessing.
But you will need to do both: preprocess the images AND train Tesseract.
Preprocessing
Basically, you want to get the title text, and only the title text, for Tesseract to read. I suggest you follow the steps below:
Identify the borders of the card.
Cut out the title area for further processing.
Convert the image to black'n'white.
Use contours to identify the exact text area, and crop it out.
Send the image you got to Tesseract.
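A rough sketch of steps 2-5, assuming the card has already been cropped and perspective-corrected, and that the title sits in roughly the top tenth of the card (that band, the file name and the pytesseract wrapper are my assumptions):

    import cv2
    import numpy as np
    import pytesseract   # Python wrapper around the tesseract binary

    card = cv2.imread("card_straightened.jpg")

    # Step 2: cut out the title strip (the 4-12 % band is a guess).
    h = card.shape[0]
    title = card[int(0.04 * h):int(0.12 * h), :]

    # Step 3: black and white.
    gray = cv2.cvtColor(title, cv2.COLOR_BGR2GRAY)
    bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

    # Step 4: tighten the crop around the text using its contours.
    contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        x, y, w, hgt = cv2.boundingRect(np.vstack(contours))
        title = title[y:y + hgt, x:x + w]

    # Step 5: hand the crop to Tesseract in single-text-line mode.
    print(pytesseract.image_to_string(title, config="--psm 7").strip())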
How to create a basic preprocessing pipeline is shown in the YouTube video Automatic MTG card sorting: Part 2 - Automatic perspective correction with OpenCV. Also have a look at the third part in that series.
With this said, there are a number of problems you will encounter. How do you handle split cards? Will your algorithm manage white borders? What if the card is rotated or upside-down? Just to name a few.
Need for Training
But even if you manage to create a perfect preprocessing algorithm, you will still have to train Tesseract. This is due to the special font used on the cards (which, on top of that, differs depending on the age of the card!).
Consider the card "Kinjalli's Caller".
http://gatherer.wizards.com/Handlers/Image.ashx?multiverseid=435169&type=card
Note how similar the "j" is to the "i". An untrained Tesseract tends to mix them up.
Conclusion
All this considered, my answer to you is that you need to do both preprocessing of the card image AND to train Tesseract.
If you're still interested I would like to suggest that you have a look at this MTG-card reading project on GitHub. That way you don't have to reinvent the wheel.
https://github.com/klanderfri/ReadMagicCard
I have an image of a mobile phone credit recharge card and I want to extract only the recharge number (the gray area) as a sequence of digits that can be used to recharge the phone directly.
This is a sample photo only and cannot be considered standard: the rectangular area may differ in position and background, and the card itself may differ in size. The scratch area may not be fully scratched, and the camera's depth and position may differ too. I have read lots and lots of papers on the internet, but I can't find anything relevant; most papers discuss detection of handwritten numbers.
Any links or algorithm names would be very useful.
You can look at papers on vehicle licence plate detection with machine learning methods. Basically you need to extract the number region first: you can use a Sobel filter to extract the vertical edges, then threshold (to get a binary image) and apply morphological operations (close the blank spaces between the vertical edge lines and connect all regions that have a high density of edges). Finally, retrieve the contour and fill in the connected components with a mask.
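A hedged sketch of that localization step; the kernel size and the width/height filter are guesses you would tune against your own photos:

    import cv2

    gray = cv2.cvtColor(cv2.imread("recharge_card.jpg"), cv2.COLOR_BGR2GRAY)

    # Vertical edges: printed digits produce many strong vertical strokes.
    edges = cv2.convertScaleAbs(cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3))

    # Binarize, then close horizontally so neighbouring digit edges merge
    # into one connected region per number block.
    binary = cv2.threshold(edges, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25, 3))
    merged = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

    # Keep wide, flat contours as candidate regions containing the digits.
    contours, _ = cv2.findContours(merged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    candidates = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w > 3 * h:                 # wide, flat region (assumption)
            candidates.append((x, y, w, h))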
After you have extracted the numbers, you can use a machine learning method such as a neural network or an SVM to recognize them.
Hope it helps.
Extract the gray part from the image and then use Tesseract (OCR) to extract the text written in that gray area.
I don't think you will find a ready-made algorithm for reading these cards on the internet; nobody will disclose that. If you are a hardcore programmer, though, you can crack it with your own code. I tried it from screenshots, where the fonts were clearer and the algorithm was simple. Here the algorithm has to be more complex, since you are reading from a photo source rather than a screenshot.
Follow the following steps:
Load the image.
Select the digits (by contour finding and applying constraints on the area and height of the letters to avoid false detections). This splits the image and thus modularises the OCR operation you want to perform.
Use a simple k-nearest-neighbour algorithm to perform the identification and classification.
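A minimal sketch of that k-NN step with OpenCV's built-in classifier, assuming you already have a set of labelled 20x20 digit crops saved to disk (the file names and the k value are placeholders):

    import cv2
    import numpy as np

    # Labelled 20x20 grayscale digit crops; producing these is exactly the
    # contour-based segmentation step described above.
    train_digits = np.load("train_digits.npy")   # shape (N, 20, 20), uint8
    labels = np.load("train_labels.npy")         # shape (N,), digit values 0-9

    samples = train_digits.reshape(len(train_digits), -1).astype(np.float32)

    knn = cv2.ml.KNearest_create()
    knn.train(samples, cv2.ml.ROW_SAMPLE, labels.astype(np.float32))

    def classify(digit_crop):
        """Resize one segmented digit and ask the k-NN model for its label."""
        crop = cv2.resize(digit_crop, (20, 20)).reshape(1, -1).astype(np.float32)
        _, result, _, _ = knn.findNearest(crop, k=5)
        return int(result[0][0])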
If the end goal is just to make a bot, you could probably pull the text directly from the app rather than worrying about OCR; but if you want to learn more about machine learning and you haven't used them already, the MNIST and CIFAR-10 datasets are fantastic places to start.
If you preprocessed your image so that yellow pixels become black and all others become white, you would have a much cleaner source to work with.
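For example, something along these lines would map yellow text to black on a white background; the HSV range for "yellow" is an assumption to tune against the actual colours:

    import cv2
    import numpy as np

    image = cv2.imread("screenshot.png")
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

    # Rough HSV range for yellow text; adjust to the source material.
    yellow = cv2.inRange(hsv, np.array([20, 80, 80]), np.array([35, 255, 255]))

    # Yellow pixels -> black text, everything else -> white background.
    clean = cv2.bitwise_not(yellow)
    cv2.imwrite("clean.png", clean)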
If you want to push forward with Tesseract for this and the preprocessing isn't enough then you will probably have to retrain it for this font. You will need to prepare a corpus, process it similarly to how you expect your source data to look, and then use something like qt-box-editor to correct the data. This guide should be able to walk you through the basic steps of retraining.
I have to write an OpenCV application to extract a certain part from an image (a shopping bill). I'm not sure which filters or functions I should use to accomplish this, i.e. removing background noise such as hands. Can someone give me a hint on which functions and filters work best to remove such background noise and to extract the shopping bill part of the image?
Thanks
This is a very subjective question, for it is highly dependent on how your image is taken.
Knowing, and controlling if possible, things like the colour of the bill, its geometry, the possible colours of the background, the range of distances between the bill and the camera, the lighting, etc. is of the utmost importance here and will ultimately define the best approach, so I can't give you an objective answer to your question.
I can, however, try to point you in the right direction, so I suggest you take a look at the OpenCV tutorials, in particular the segmentation algorithms, Hough transforms, and generic object detectors.
general noise reduction: Gaussian or median blurring will give you a low-pass filtering operation.
fact 1: shopping bills are black and white
approach: use color detection. Taking the HLS equivalent of your image -using cvtColor- and looking at the lightness channel will help you.
fact 2: shopping bills have plain white background
approach: using binary threshold with an addition of contour finding algorithm -using findContours- could help you to extract the bill region.
fact 3: shopping bills have numbers
approach: You can add OCR to filter out regions not having any numbers inside. Hard to implement, though.
fact 4: shopping bills are quadrilateral
approach: shape detection is not hard to implement, and I assume there is plenty of existing work on it. I once recognized squares successfully with the Hu moment comparison method. If you have a straight-on (rectangular) shot of the bill, there may be open source implementations; search for "detect largest rectangle opencv". If your shots are from different angles, go through these papers, code and tutorials:
http://users.cecs.anu.edu.au/~nmb/papers/06-PRD.pdf
http://opencv-code.com/tutorials/automatic-perspective-correction-for-quadrilateral-objects/
https://github.com/drewnoakes/quadrilateral-finder
http://aishack.in/tutorials/an-introduction-to-contours/
http://www.scielo.org.mx/pdf/cys/v15n2/v15n2a5.pdf
Basically, you need to find the "extreme points" of your blobs somehow, then decide whether they form a quadrilateral or not.
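A hedged sketch of the "largest rectangle" route with contours and polygon approximation; the Otsu binarization, the epsilon factor, and the assumption of a fairly plain background are mine:

    import cv2

    image = cv2.imread("bill_photo.jpg")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)

    # A white bill on a darker background separates well with Otsu; on busy
    # backgrounds an edge map plus dilation is a common alternative.
    binary = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    bill = None
    for c in sorted(contours, key=cv2.contourArea, reverse=True):
        # Approximate each contour; a quadrilateral reduces to four points.
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * peri, True)
        if len(approx) == 4:
            bill = approx          # four corner points of the candidate bill
            break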
What method is suitable for detecting the MRZ in a photo of a document? I'm thinking about a cascade classifier (e.g. Viola-Jones), but it seems a bit odd to use one for this problem.
If you know that you will be looking for text in a passport, why not try to find the passport model points on it first? Match a template of a passport to it using ASM/AAM (Active Shape Model, Active Appearance Model) techniques. Once you have the passport position information you can cut out the regions you are interested in. This will take some time to implement, though.
Consider this approach as a great starting point:
A black top-hat followed by a horizontal derivative highlights long rows of characters.
Morphological closing operation(s) merge the nearby characters and character rows together into a single large blob.
Optional erosion operation(s) remove the small blobs.
Otsu thresholding followed by contour detection, then filtering away the contours which are apparently too small, too round, or located in the wrong place, will leave you with a small number of possible locations for the MRZ.
Finally, compute bounding boxes for the locations you found and see whether you can OCR them successfully.
It may not be the most efficient way to solve the problem, but it is surprisingly robust.
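Roughly, that chain translates to OpenCV calls like the ones below; the kernel sizes and the contour filters are illustrative values you would tune for your own scans:

    import cv2

    gray = cv2.cvtColor(cv2.imread("passport.jpg"), cv2.COLOR_BGR2GRAY)

    # Black top-hat: bring out the dark characters on the light page.
    rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (13, 5))
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, rect_kernel)

    # Horizontal derivative highlights the long rows of MRZ characters.
    grad = cv2.convertScaleAbs(cv2.Sobel(blackhat, cv2.CV_32F, 1, 0, ksize=3))

    # Closing merges characters, then whole character rows, into large blobs.
    closed = cv2.morphologyEx(grad, cv2.MORPH_CLOSE, rect_kernel)
    binary = cv2.threshold(closed, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    binary = cv2.morphologyEx(binary, cv2.MORPH_CLOSE,
                              cv2.getStructuringElement(cv2.MORPH_RECT, (21, 21)))
    binary = cv2.erode(binary, None, iterations=4)   # optional: drop small blobs

    # Keep long, page-spanning contours as MRZ candidates.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    mrz_boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w / float(h) > 5 and w > 0.6 * gray.shape[1]:   # filters are assumptions
            mrz_boxes.append((x, y, w, h))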
A better approach would be the use of projection profile methods. A projection profile method is based on the following idea:
Create an array A with an entry for every row in your b/w input document. Now set A[i] to the number of black pixels in the i-th row of your original image.
(You can also create a vertical projection profile by considering columns in the original image instead of rows.)
Now the array A is the projected row/column histogram of your document and the problem of detecting MRZs can be approached by examining the valleys in the A histogram.
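In code, the horizontal projection profile is just a per-row count over the binarized image; the fraction used below to decide what counts as a text row is an assumption:

    import cv2
    import numpy as np

    gray = cv2.cvtColor(cv2.imread("document.jpg"), cv2.COLOR_BGR2GRAY)

    # Binarize so that text (black) pixels become 1 and background becomes 0.
    binary = cv2.threshold(gray, 0, 255,
                           cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1] // 255

    # A[i] = number of black pixels in row i of the original image.
    A = binary.sum(axis=1)

    # Rows whose count exceeds some fraction of the page width are text rows;
    # a run of such rows near the bottom of a passport page is an MRZ candidate.
    text_rows = np.where(A > 0.3 * binary.shape[1])[0]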
This problem, however, is not completely solved, so there are many variations and improvements. Here's some additional documentation:
Projection profiles in Google Scholar: http://scholar.google.com/scholar?q=projection+profile+method
Tesseract-ocr, a great open source OCR library: https://code.google.com/p/tesseract-ocr/
Viola & Jones' Haar-like features generate many (many, many) features to try to describe an object and are fairly robust to scale and the like. Theirs was a novel approach to a difficult problem.
Here, however, you have plenty of constraints on the problem, and anything like that seems overkill. Rather than "optimizing early", I'd evaluate the standard off-the-shelf OCR tools and see where they get you. I believe you'll be pleasantly surprised.
PS:
You'll want to preprocess the image to isolate the characters on a white background. This can be done quite easily and will help the OCR algorithms significantly.
You might want to consider using stroke width transform.
You can follow these tips to implement it.