I am using FANN (pyfann in particular) for signature recognition. Before I can use the AI, I have to preprocess the image using imagelab, a collection of image processing libraries such as image.h, jpegio.h, etc. My problem is that I don't know how to combine the two so that I can use both libraries in a single program. I have to extract features from the signatures, such as the number of pixels and the length and width, but I don't know how to feed this data to FANN. Any help? I really don't know where to start.
I'm not too familiar with machine learning techniques, and I want to know whether I can transfer a final trained model to another machine. More specifically, I'm trying to solve a sound classification problem by training a model on a regular PC and then transferring the resulting model to an embedded system where no libraries are allowed (C programming). The system does not support file reading either.
So my question is:
Are there learning methods whose output models are simple enough to be implemented easily on other systems? How would you implement it? (Something like Q-learning? Although Q-learning wouldn't be appropriate for my project.)
I would like some pointers, thanks in advance.
Any arbitrary "blob" of data can be converted into a C byte array, then compiled and linked directly with your code. A code generator is simple enough to write, but there are tools that will do this directly, such as Segger Bin2C (and any number of other tools called "bin2c") or SRecord, the Swiss Army knife of embedded data converters.
Since SRecord can do so many things, getting it to do this one thing is less than obvious:
srec_cat mymodel.nn -binary -o model.c -C-Array model -INClude
will generate a model.c and model.h file defining a data array containing the byte content of mymodel.nn.
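For readers who would rather write the code generator themselves, here is a minimal sketch in Python. The function name `bytes_to_c_array`, the array name `model`, and the `_length` suffix are illustrative choices, not something SRecord or Bin2C prescribes.

```python
def bytes_to_c_array(data: bytes, name: str = "model", per_line: int = 12) -> str:
    """Render `data` as C source defining a const byte array and its length.

    `name` and the `<name>_length` symbol are hypothetical naming choices.
    """
    if not data:
        # An empty initializer list is not valid in C before C23.
        raise ValueError("cannot generate an array from empty data")
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        # Trailing commas are allowed inside a C initializer list.
        lines.append("    " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        f"const unsigned char {name}[] = {{\n{body}\n}};\n"
        f"const unsigned long {name}_length = {len(data)}UL;\n"
    )


def convert_file(src_path: str, dst_path: str, name: str = "model") -> None:
    """Read a binary file and write the generated C source next to it."""
    with open(src_path, "rb") as src:
        text = bytes_to_c_array(src.read(), name)
    with open(dst_path, "w") as dst:
        dst.write(text)
```

Calling `convert_file("mymodel.nn", "model.c")` on the PC side would then produce a source file that can be compiled into the embedded firmware, so the target never needs to read a file at run time.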
I often work with scanned papers. The papers contain tables (similar to Excel tables) which I need to type into the computer manually. To make the task worse, the tables can have different numbers of columns. Manually entering them into Excel is mundane, to say the least.
I thought I could save myself a week of work if I could write a program to OCR them. Would it be possible to detect the header text areas with OpenCV and then OCR the text at the detected image coordinates?
Can I achieve this with the help of OpenCV, or do I need an entirely different approach?
Edit: Example table is really just a standard table similar to what you can see in Excel and other spread-sheet applications, see below.
This question seems a little old, but I was also working on a similar problem and came up with my own solution, which I am explaining here.
When reading text with any OCR engine, there are many challenges in getting good accuracy, including the following main cases:
Presence of noise: poor image quality or unwanted elements/blobs in the background region. This requires some pre-processing such as noise removal, which can easily be done with a Gaussian filter or a median filter. Both are available in OpenCV.
Wrong orientation of the image: because of the wrong orientation, the OCR engine fails to segment the lines and words in the image correctly, which gives the worst accuracy.
Presence of lines: while doing word or line segmentation, the OCR engine sometimes merges the words and the ruling lines together, so it processes the wrong content and gives wrong results.
There are other issues also but these are the basic ones.
In this case I think the scan quality is quite good and the image is simple, so the following steps can be used to solve the problem.
Simple image binarization will remove the background content, leaving only the necessary content, as shown here.
Now we have to remove the lines, which in this case form the tabular grid. The grid can be identified using connected components: the large connected components are the lines, and removing them leaves only the text. So the final image that needs to be fed to the OCR engine will look like this.
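The two steps above (binarize, then drop large connected components) can be sketched in pure Python on a small image stored as a list of rows of grayscale values. This is only an illustration of the idea: the function names, the fixed threshold of 128, and the `max_extent` rule are assumptions, and on real scans one would normally use OpenCV's thresholding and `connectedComponentsWithStats` instead of this slow loop.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Return a 0/1 image: 1 where the pixel is darker than `threshold` (ink)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def connected_components(binary):
    """Yield each 4-connected component of 1-pixels as a list of (row, col)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            yield pixels


def remove_large_components(binary, max_extent):
    """Erase components whose bounding box is wider or taller than `max_extent`."""
    out = [row[:] for row in binary]
    for pixels in connected_components(binary):
        rows = [p[0] for p in pixels]
        cols = [p[1] for p in pixels]
        height = max(rows) - min(rows) + 1
        width = max(cols) - min(cols) + 1
        if max(height, width) > max_extent:  # assumed rule: grid lines are long
            for y, x in pixels:
                out[y][x] = 0
    return out
```

One caveat of this approach: a character that touches a grid line belongs to the same component as the line and is erased with it, so `max_extent` should be chosen a little larger than the tallest or widest glyph.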
For OCR we can use the Tesseract open-source OCR engine. I got the following results from OCR:
Caption title
header! header2 header3
row1cell1 row1cell2 row1cell3
row2cell1 row2cell2 row2cell3
As we can see, the result is quite accurate, but there are some issues, such as
header!, which should be header1. This happens because the OCR engine confused 1 with !. The problem can be solved by post-processing the result with regex-based operations.
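As a rough sketch of such a regex-based fix, the snippet below replaces a `!` that directly follows a letter at the end of a token with `1`. The rule itself is an assumption that suits this particular table (where cells end in digits); it would be wrong for text that legitimately contains exclamation marks.

```python
import re

# Assumed rule for this table: a "!" right after a letter, at the end of a
# token, is a misread "1".
_MISREAD_ONE = re.compile(r"(?<=[A-Za-z])!(?=\s|$)")


def fix_misread_digits(text):
    """Replace token-final '!' after a letter with the digit '1'."""
    return _MISREAD_ONE.sub("1", text)


print(fix_misread_digits("header! header2 header3"))
```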
After post-processing, the OCR result can be parsed to read the row and column values.
Also, in this case font information can be used to distinguish the sheet title, the headings, and the normal cell values.
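A minimal parsing sketch for text shaped like the OCR output above could look as follows. It assumes the first non-empty line is the caption, the second is the header row, and that no cell contains whitespace; `parse_ocr_table` is a hypothetical helper, not part of Tesseract.

```python
def parse_ocr_table(text):
    """Split OCR text into (caption, column names, list of row dicts).

    Assumes: line 1 is the caption, line 2 the header, and cells are
    whitespace-separated tokens without embedded spaces.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("expected at least a caption line and a header line")
    caption, header, *body = lines
    columns = header.split()
    rows = [dict(zip(columns, ln.split())) for ln in body]
    return caption, columns, rows
```

Tables whose cells contain spaces would need the column positions from the image (for example, the x-coordinates of the removed vertical lines) rather than a plain whitespace split.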
With the PDF below, I would like to do the following things.
Localize the four sudoku grids so as to treat each of them separately.
For each grid picture, I would like to obtain a matrix of the pictures corresponding to each cell.
Finally, I would like to "find" the values printed in each cell.
The problem is that I'm a real beginner with OpenCV (I've bought a book about OpenCV with Python but I've not received it yet).
I'm not a beginner in Python or in math, so every clue is welcome.
You're in luck:
sudoku solver part 1
part 2
part 3
part 4
Python 3.x wasn't supported by OpenCV at the time of writing, though (OpenCV 3.0 and later do support it).
Tesseract has nice Python bindings, too (and is more specialized for that 'OCR' job ;)
welcome to opencv, though !
I am trying to build an iOS application. On one of the screens the user can type something in a search bar, and I have to take the same action for different spellings of the same word.
For example, the user can type "elephant", "alephant" or "elefant". I have to take the same action for all three words.
Is there any library that identifies these words as similar? I cannot use a spellchecker, as I also need this in languages other than English.
I did some research and found that there are phonetic algorithms such as Text::Soundex for achieving this on the server side. Are there any such libraries for iOS?
Thanks in advance !!
A better alternative to Soundex would be Double Metaphone or, even better, Metaphone 3. You don't say what language you are using, but both of these algorithms are available in C++, C#, and Java.
There's no Soundex available in, for example, NSString, but if that's what you want, it's fairly easy to implement. Here's a Soundex NSString category from CocoaDev (albeit horribly formatted).
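To show how little code Soundex needs, here is a sketch of the classic American Soundex rules in Python (not the CocoaDev category, and not Objective-C). It only handles the letters A to Z, which is one reason Soundex is a poor fit for non-English input.

```python
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character American Soundex code, or '' if no A-Z letters."""
    letters = [ch for ch in word.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = _CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = _CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (first + "".join(digits) + "000")[:4]


for w in ("elephant", "elefant", "alephant"):
    print(w, soundex(w))
```

Because Soundex always keeps the first letter, "elephant" and "elefant" receive the same code while "alephant" does not, so a misspelled first letter is not caught by this algorithm alone.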
You could also use the Levenshtein distance algorithm to catch simple spelling errors. It is also easy to implement (read the Wikipedia article for the details), but here's an NSString category for that.
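For reference, a compact sketch of the Levenshtein distance in Python, using the standard two-row dynamic programming formulation (the NSString category mentioned above may be organized differently):

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string `a` into string `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


for w in ("alephant", "elefant"):
    print(w, levenshtein("elephant", w))
```

A search bar could then treat two words as the same when the distance is below a small threshold, for example 2 for words of this length; the right threshold is a tuning decision that depends on the vocabulary.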
Before you use these algorithms, normalize the input. There's the amazing CFStringTransform function in Core Foundation (see this great article about it on NSHipster, especially the last part about normalization) that can automatically transform input in different languages into normalized forms.
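To illustrate what such normalization does, here is a Python stand-in built on the standard `unicodedata` module. It is not CFStringTransform, and it covers only a small part of what that function offers (it strips diacritics and folds case, but does not transliterate other scripts to Latin); `normalize_query` is a hypothetical helper name.

```python
import unicodedata


def normalize_query(text):
    """Decompose characters, drop combining marks (accents), and fold case."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()
```

Running both the stored words and the user's query through the same normalization step before computing a phonetic code or an edit distance keeps differences in accents and letter case from counting as spelling errors.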
I want to learn which algorithm the Recognize() method of EmguCV's EigenObjectRecognizer uses, but I could not find any information about it. I used it in my thesis and I need to know which technique the method is based on. I know it uses eigenvectors and eigenvalues, but I am not sure how it uses them. Could anyone who knows point me in the right direction?
Thanks.
It uses a PCA-based algorithm, which is the same technique that eigenfaces use.
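The general eigenface idea can be sketched in a few lines of pure Python. This is not taken from EmguCV's source and makes no claim about its internals; it only shows how eigenvectors are typically used: compute the principal components of the training images, project every image onto them, and label a new image by its nearest training projection. All function names are illustrative, and power iteration on the full covariance matrix is only practical for toy-sized vectors.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_eigenvectors(cov, k, iterations=200):
    """Top-k eigenvectors of a symmetric matrix via power iteration + deflation."""
    d = len(cov)
    cov = [row[:] for row in cov]
    vectors = []
    for _ in range(k):
        start = [float(i + 1) for i in range(d)]  # arbitrary non-uniform start
        norm = math.sqrt(dot(start, start))
        v = [x / norm for x in start]
        for _ in range(iterations):
            w = [dot(row, v) for row in cov]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:  # no variance left in the deflated matrix
                break
            v = [x / norm for x in w]
        eigenvalue = dot(v, [dot(row, v) for row in cov])
        vectors.append(v)
        # Deflate: remove the component that was just found.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return vectors


def train(samples, k):
    """Return (mean image, eigenvector basis, projected training samples)."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(d)]
    centered = [[x - m for x, m in zip(s, mean)] for s in samples]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    basis = top_eigenvectors(cov, k)
    weights = [[dot(c, e) for e in basis] for c in centered]
    return mean, basis, weights


def recognize(model, labels, image):
    """Label `image` by the nearest training sample in eigenspace."""
    mean, basis, weights = model
    centered = [x - m for x, m in zip(image, mean)]
    w = [dot(centered, e) for e in basis]
    distances = [math.sqrt(sum((a - b) ** 2 for a, b in zip(w, t)))
                 for t in weights]
    best = min(range(len(distances)), key=distances.__getitem__)
    return labels[best], distances[best]
```

Here each "image" is a flat list of pixel values. A recognizer of this kind usually also applies a distance threshold so that images far from every training sample are reported as unknown.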