I'm making an automated text recognition script with Python on Ubuntu.
I'm using Gocr and the recognition render is too low.
Exemple:
Output: _O4_4E34E_4_O4_
I suppose that the type in the image is too bold, so I'm asking if there is a way to make it thinner using an python library or a linux command.
You probably will need to apply a morphological operation like "erosion" on your image, e.g by using OpenCV. This will make structures thinner. To the cost of the quality, though.
Look here: https://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_imgproc/py_morphological_ops/py_morphological_ops.html
Related
I am using Drake to implement a visual servoing scheme. For that I need to process color images and extract features.
I am using the ManipulationStation class, and I have already published the color images of the rgbd_sensor objects to LCM channels (in c++). Now I want to process the images during the simulation.
I know that it would be best to process images internally (without using the ImageWriter), and for that, I cannot use the image_array_t on LCM channels or ImageRgbaU8, I have to convert images to an Eigen or OpenCV type.
I will then use feature extraction functions in these libraries. These features will be used in a control law.
Do you have any examples on how to write a system or code in c++ that could convert my Drake images to OpenCV or Eigen types? What is the best way to do it?
Thanks in advance for the help!
Arnaud
There's no current example converting the drake::Image to Eigen or OpenCV but it should be pretty straightforward.
What you should do is create a system akin to ImageWriter. It takes an image as an input and does whatever processing you need on the image. You connect its input port to the RgbdSensor's output port (again, like ImageWriter.) The only difference, instead of taking the image and converting it to an lcm data type, you convert it to your OpenCV or Eigen type and apply your logic to that.
If you look, ImageWriter declares a periodic publish event. Change that to (probably) a discrete update event (still periodic, I'd imagine) and then change ImageWriter::WriteImage callback into your "Image-to-OpenCV" callback.
Finally, you could also declare an output port that would make your converted image to other systems.
I am running fedora 11.
I seek to identify the characters indicating the date on a satellite image, but its resolution was (deliberately?) deteriorated.
The goal is to automate this procedure. For that I use ocr program tesseract.
It works perfectly on my PC with the scans, but in this case it does not work.
Here's what I did:
Address of the image:
(source: meteo.pf)
I convert in tiff format (used by tesseract, (bpp ok))
I use tesseract: tesseract societeir.tif test, but no output.
When I increase the image scaling, the ocr online work, but tesseract not.
Do you have a suggestion?
One suggestion,
Since the date on the image is most likely going to be in the same position and of the same dimension, you can try to cut it out and save it as another image using an image processing tool. I normally would use gimp, but leptonica, imagemagick are other tools that I can think of. The recognition should be better on the new image
Copy the date region into memory, run enhancements on it then run the OCR against it?
I'm using OpenCV on the iPhone and need to detect numbers in an image. I split the image into smaller images so each image has only one number (1-9). All numbers are printed, NOT handwritten.
What would be the best approach to figure out the numbers with OpenCV?
UPDATE:
I have successfully found the numbers and extracted them. They look like this:
http://img198.imageshack.us/img198/5671/101ht.jpg
http://img824.imageshack.us/img824/539/606yu.jpg
When they are extracted they are in the same size and so on. I have saved a bunch of images and put them in a OCR dir where they are categorized into numbers. Like: ocr/1/100.jpg 101.jpg.... and ocr/2/200.jpg 201.jpg....
Then I was going to use the same approach as in the Basic OCR tutorial:http://blog.damiles.com/?p=93
However, I'm programming for iPhone and can't use C++ code (error on compiling and so on) and I don't have access to highgui.
I tried using cvMatchTemplate() and match a bunch of images but it seems to work pretty bad...
Any other ideas I can try?
You could start by reading about Principal Component Analysis (PCA), Fisher's Linear Discriminant Analysis (LDA), and Support Vector Machines (SVMs). These are classification methods that are extremely useful for OCR, and there are libraries in any language including C++, Python, C# etc.
It turns out that OpenCV already includes excellent implementations on PCAs and SVMs[dead link]. I haven't seen any OpenCV code examples for OCR, but you can use some modified version of face classification to perform character classification. An excellent resource for face recognition code for OpenCV is this website[dead link].
If the numbers are printed, the job is quite simple, you just need to figure out a nice set of features to match. If the numbers are one font, you can get away with this approach:
Extract the number
Find the bounding box
Scale the image down to something like 10x8, try to match the aspect ratio
Do this for a small training set, take the 'average' image for each number
For new images, follow the steps above, but the last is just a absolute image difference with each of the number-templates. Then take the sum of the differences (pixels in the difference image). The one with the minimum is your number.
All above are basic OpenCV operations.
Basically your problem is just to classify a feature vector, which is the set of pixel intensities after some preprocessing steps. You can use any classifier for this task, like eg. neural networks, which should have a C implementation inside OpenCV. You might also try a C libsvm library for Support Vector Machines.
There is a good site related to this problem with a lot of papers and a training database.
Maybe the most simple and convinient way is to use svm as ml algorithm
http://opencv.willowgarage.com/documentation/cpp/support_vector_machines.html
and gray images as feature vectors.
Objective C++?
Try renaming your .m files to .mm and you can then use c++ in your iPhone project.
Convolution Neural Networks are by far the best algorithms for hand written digits. The are implemented in most systems like USPS etc. Here are few papers explaining the algorithms.
http://yann.lecun.com/exdb/lenet/
This is a nice open source ,It is a ORCDemo on iPhone.Hope it is useful to you
Simple Digit Recognition OCR in OpenCV-Python
This might help you out. Converting the code from Python to C++ is not a difficult task, since OpenCV API's are same for the both.
Tesseract is also a nice free OCR engine that is readily available for iPhone and allows you to use your own sets of training images:
http://tinsuke.wordpress.com/2011/11/01/how-to-compile-and-use-tesseract-3-01-on-ios-sdk-5/
HOG + SVM (Try to play with kernels)
I have a set of image files that I can identify. Rather than an OCR, I'd like to search only for matches within the set. What's the ideal platform to quickly find matches?
OpenCV is an advanced computer vision library. It can recognize text blocks, colors, shapes, etc. so it might be of use.
Tesseract can be trained to handle languages, but I can't see a reason why you couldn't train it with shapes. Here's a really confusing training guide.
ImageMagick can also be useful. It's pretty hardcore endless parameter chaining, but you can get it to find images. It's not perfect for this application, but it's been done before. The documentation is insanely huge, but it's about as complete and illustrated as I could wish for (I'm a frequent user, as it's useful for quick image operations via CLI). Here's the image comparison documentation.
I would suggest OpenCV, but it's up to you. Good luck!
I am interested in studying some image processing. I imagine matlab is the best way to go about that but right now I don't have access to matlab. I tried octave but for some reason it can't even load a png, bmp or anything other than 1 specific format. R doesn't seem to be the key here either.
What is the language of choice here? Perl?
Also can anyone point me to any other good tutorials that I may have missed on image processing?
Opencv is an excellent image processing library. Although written in C it comes with some high level tools to display images handle image files, mouse events etc so you can experiment without writing a lot of windows code.
It also works with python, although I haven't used it with the PIL.
If you are interested in how the algorithms work then implementing them yourself using python and numpy for the matrix ops is easy.
I guess it depends on what you want to do. Matlab certainly is a high end choice, but for a lot of things the image modules of general purpose programming languages do the trick.
I did some pixel mangling and image processing with PIL, the python image library. It is perfectly sufficient for processing single RGB images of reasonable size (say, what a consumer digital camera delivers). It can handle alpha channels, has some filters, more or less quick methods of accessing the pixel information - and it is python, a very straightforward and readable language.
The recommended language in my computer vision class was Ch with the OpenCV library. Ch is basically an interpreted version of C, the syntax is quite similar but has a few nice features, like treating arrays as matrices. OpenCV will house pretty much any image processing function you could need.
I think any free programming environments will do basic image processing well. If speed is not an issue, Processing will work fine and you can easily extend your code to Java in the future.
Have a look at Adobe Pixel Bender. It's really fun to play with.