I downloaded the ready-made project from this link, and with the supplied image it works perfectly. But when I swap in an image with my own words, it doesn't recognise them as well as I expected, especially the "i" and "l" characters etc.
Does anybody know how to make this library perform better, or can you suggest another one?
the recognised text:
httpV/stackoverflowmom/queslions/ask/
the image:
Also, if, for example, the text is white and the background is blue, it returns nothing.
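For what it's worth, most OCR engines expect dark text on a light background, so a light-on-dark image usually needs to be binarised and inverted first. A rough preprocessing sketch in Python, assuming OpenCV and pytesseract (the file name is a placeholder, and your downloaded project may use a different Tesseract wrapper):

    import cv2
    import pytesseract

    img = cv2.imread("sign.png")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Otsu threshold with inversion, so white-on-blue becomes black-on-white
    _, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    print(pytesseract.image_to_string(bw))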
I'm trying to run one of the gym environments, CarRacing, with the code from https://gist.github.com/lmclupr/b35c89b2f8f81b443166e88b787b03ab, modified to work with the current versions of Keras etc.
And it works: that is, it starts training, and if I put a print on "action", I can see what actions it is taking at each moment.
However, the display is not as it should be (see https://www.youtube.com/watch?v=pTYOyl8To7g): instead I get a big screen that is totally black, and a smaller one where a black bar moves as if it wanted to show something.
Do you know what it could be? Is this a typical problem with gym, or with cv...?
PS: I don't know how a longer piece of code (approx. 250 lines) is usually shared on Stack Overflow. Tell me if necessary.
See the link below.
TL;DR: add env.render("human") to your code.
https://github.com/openai/gym/issues/492
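For context, a minimal loop with the render call added might look like this. This is a sketch assuming the older gym API, where step returns four values; the environment id and your action source will differ:

    import gym

    env = gym.make("CarRacing-v0")
    obs = env.reset()
    done = False
    while not done:
        env.render("human")                  # open/refresh the display window
        action = env.action_space.sample()   # stand-in for your agent's action
        obs, reward, done, info = env.step(action)
    env.close()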
I'm working on a college project that involves OCRing a certain digit code (with a few other characters as separators, mainly '.', '/' etc.).
That digit code (printed on products, for example) is usually in a "digital" font (e.g. a 7-segment-like font, or a pixelated font, etc.).
So I am trying to train Tesseract on several digital fonts I've found online, similar to those used with these codes.
The thing is, Tesseract recognizes the tiff files I provide as blank pages.
Things I've tried:
1. Creating a .box file using JTesseract & qt-box (and adjusting the boxes manually): in this case, the box & tiff are read by Tesseract and I get the output "1 Page", but no characters are recognized and the .tr file is blank.
2. Creating a .box file with Tesseract's makebox: in this case, no boxes are created at all.
PS: I manage to train it just fine using more traditional fonts (Arial, for example).
Any ideas?
I'm attaching an image of such an example font.
Thank you!
I managed to work around most of the issues. Posting it in case it could help anyone else:
I did two steps to get Tesseract to identify my text:
1. Image processing on the training images: I applied some image-processing methods (mainly dilate, erode and some blur) to "connect" the pixels in the text that were segmented or separated from one another. It's VERY IMPORTANT to apply exactly the same steps to the images that will later be fed to the OCR.
2. I noticed that simply saving images as TIFFs/PNGs via code doesn't write the DPI setting into the header for some reason (and Tesseract identified them as 0 DPI). I assume there's a way to do it in code, but I didn't have time, so I just opened the files in Photoshop and saved them from there.
I'm not entirely sure whether it was step 1, step 2 or both that solved my issue, but most characters were eventually identified.
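Both steps can in fact be done in code; here is a rough sketch with OpenCV and Pillow (not the exact processing used above: the file names, kernel size and DPI value are placeholders, and whether dilate or erode "connects" the glyphs depends on whether the text is light or dark):

    import cv2
    from PIL import Image

    # Step 1: morphological ops to reconnect segmented 7-segment/pixel glyphs
    img = cv2.imread("code_sample.png", cv2.IMREAD_GRAYSCALE)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    img = cv2.dilate(img, kernel, iterations=1)   # grow bright regions
    img = cv2.erode(img, kernel, iterations=1)    # shrink them back
    img = cv2.GaussianBlur(img, (3, 3), 0)        # soften remaining gaps

    # Step 2: write an explicit DPI into the header so Tesseract
    # does not see the image as 0 DPI
    Image.fromarray(img).save("code_sample.tiff", dpi=(300, 300))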
I was supplied a logo for a website in .ai format, which I cropped and saved as SVG 1.1, and placed in multiple places on a friend's Shopify store.
Image in question (view this in Safari for iOS)
Screenshot of the image fault on the site: i.stack.imgur.com/HmsIB.png
Link to actual page here: bambooboss.com/pages/about-us
The "Panda Head" image under the first photo is half blacked out when viewed with Safari, latest version of iOS on my iPhone 5. While it looks quite cool, it's definitely not anything like my friend's original creative vision...
I tried another answer here on SO, which suggested surrounding all linearGradient elements with <defs> tags. I applied that to the current image, but to no avail...
Anyone have a clue what's going on? Is it a compatibility issue? Or did something go wrong while saving to SVG from .ai?
Change the following line in the SVG's embedded stylesheet to:
.st38{opacity:.08;fill:url(#XMLID_108_);}
i.e. replace the scientific-notation opacity value 8.00000e-02 with the plain decimal .08.
Also, run this through an optimiser like SVGOMG. For something displayed at such a small size, this SVG is far too complicated and large: simplify all the gradients, maybe even merge them into one. You should be able to get the ungzipped version down an order of magnitude from 35 kB. Lovely logo, though.
I'm trying to write an app that recognises a logo saved in the app bundle and read in as a UIImage. I searched before asking this question; the only free solution seems to be OpenCV. I tried it in a demo I downloaded from toptal_logo_detector. The demo works and I can find my logo wherever I place it. However, the camera is very slow, too slow to use in a real app. Maybe there's a way to optimise it, but my question is a different one.
I have to recognise a vector logo (always the same logo) centered on a white background, something like this wifi logo:
Is the complexity of OpenCV my only solution? Is there a free and simpler way to achieve the result: yes, your logo is there / no, it isn't?
I found this tutorial (with a project download) that does what you want using OpenCV.
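If the logo always appears at roughly the same scale on a plain white background, plain template matching may already be enough. A minimal Python sketch (the file names and the 0.8 threshold are placeholders, and this approach won't tolerate rotation or large scale changes):

    import cv2

    scene = cv2.imread("photo.png", cv2.IMREAD_GRAYSCALE)
    logo = cv2.imread("logo.png", cv2.IMREAD_GRAYSCALE)

    # Slide the logo over the scene, recording normalised correlation scores
    result = cv2.matchTemplate(scene, logo, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)

    if max_val > 0.8:   # empirical threshold: tune on real captures
        print("logo found at", max_loc)
    else:
        print("no logo")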
I'm given exact-size .png renders from Application Design showing exactly what my app should look like on Retina 4", Retina 3.5", etc.
I would like to automate a comparison between these "golden master" renders and screenshots of what the app actually looks like when that screen is shown.
Ideally I would like to have something I can run via continuous integration so I can break the build if a .xib gets messed up.
How can I do this?
Already tried:
- Used Command-S in the iPhone simulator to grab a screenshot suitable for comparison
- Used GitHub's excellent image diff interface to manually compare the images
- Pulled them up side-by-side in Preview.app, at actual size (Command-0)
- Did some research on ImageMagick's comparison capabilities (examples; see the sketch after this list)
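ImageMagick's compare tool is scriptable, so a CI check could be driven from Python. A sketch, assuming ImageMagick is installed and with placeholder file names (compare writes the metric to stderr):

    import subprocess

    # AE = absolute error, i.e. the number of differing pixels
    result = subprocess.run(
        ["compare", "-metric", "AE", "expected.png", "actual.png", "diff.png"],
        capture_output=True, text=True)
    differing = float(result.stderr.split()[0])
    print("match" if differing == 0 else f"{differing:.0f} pixels differ")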
Possible approaches:
- Getting a screenshot of the app in code is already implemented
- Similarly, I'm pretty sure I can find code to simulate a tap on the screen
- Might need some way to exclude a mask or bounding box of areas known to not match exactly
Take a look at ios-snapshot-test-case, which was built for something close to this.
It will take a reference image the first time a test is run and then compare subsequent test outputs to the reference image. You could essentially use this but instead of creating reference images from the tests, you supply your own reference images.
In practice, this will be extremely tricky to get right. There are subtle differences in how text, gradients, etc. are rendered between iOS and whatever tool your designers are using.
I'd check out KIF for functional testing.
You can create a custom test (small example near the end of the readme just above "Use with other testing frameworks") that takes a screenshot and compares it to your expected screenshot for that view. Just call failWithException:stopTest: if it doesn't match.
As you mentioned, you will want to save a mask with each expected screenshot, and apply the mask before comparing. You will always have parts of the screen that won't match, like the time in the status bar at a minimum.
For the comparison itself, here are a couple of links, followed by a rough sketch:
- Building an image mask
- Slow, straightforward way to compare two images
- OpenCV: I've seen this recommended, but haven't tried it.
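As an illustration of the slow, straightforward approach with the mask applied first, here is a Python sketch (the file names and the white-means-compare mask convention are assumptions):

    import numpy as np
    from PIL import Image

    expected = np.asarray(Image.open("expected.png").convert("RGB"), dtype=np.int16)
    actual = np.asarray(Image.open("actual.png").convert("RGB"), dtype=np.int16)
    # White pixels in the mask must match; black pixels (status-bar clock,
    # animations, ...) are ignored
    mask = np.asarray(Image.open("mask.png").convert("L")) > 127

    diff = np.abs(expected - actual).max(axis=2)   # per-pixel channel difference
    mismatched = int(np.count_nonzero(diff[mask] > 0))
    print("match" if mismatched == 0 else f"{mismatched} masked pixels differ")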
I know this is an older question, but it's worth pointing out that KIF has built a "Perceptual Difference Testing Framework" called Lela:
https://github.com/kif-framework/Lela
If you're already using KIF this is the way to go. I believe it uses somewhat fuzzy image diffing so it may be able to get around the text rendering issues David Grandinetti mentioned. I haven't tried using it against external comps though.
If you're more comfortable with BDD/Cucumber/Gherkin syntax, you should also check out Zucchini, which uses reference images:
http://zucchiniframework.org/
I haven't used it but it's well spoken of.
I suggest you take a look at Visual CI.
It's software built for continuous-integration image comparison. It has a UI that lets you control settings, including which parts of your image to compare.
It's fairly new, but it may answer your requirements better.