Tesseract OCR Camera - iOS

I'm using Tesseract OCR 3.01 in my iOS application. It shows 90% accuracy for my data when I pick an image from my phone's library, but if I use the same image taken with the camera, it produces jumbled letters. I followed this tutorial; kindly guide me if something can be done to make sure it works from the camera as well as it works for gallery images.

Yup. There are three things, to be specific. First of all, OCR works better on black-and-white images than on color ones, so if you can convert your image to B&W it will increase accuracy.
The second thing is size and orientation. You need to force the image down to 640x480 or 320 size; this improves both recognition speed and accuracy. For orientation, there are a lot of ways to manage it.
Finally, if you can somehow allow the user to specify exactly which part of the image to run OCR on, that greatly improves both accuracy and speed, since the library no longer needs to scan the entire image for text (see the sketch below).
PS: I have been working on creating an OCR app for the past few weeks.
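A minimal sketch of the first and third points, assuming Core Image is available and that the caller already has the user's region of interest; the function name preprocessForOCR and its parameters are illustrative, not part of any Tesseract API:

```swift
import UIKit
import CoreImage

// Hypothetical helper: crop to the user's region of interest, then drop
// saturation to zero for a simple black-and-white conversion.
func preprocessForOCR(_ image: UIImage, regionOfInterest: CGRect) -> UIImage? {
    // Crop first so the filter only touches the pixels we care about.
    guard let cropped = image.cgImage?.cropping(to: regionOfInterest),
          let filter = CIFilter(name: "CIColorControls") else { return nil }
    filter.setValue(CIImage(cgImage: cropped), forKey: kCIInputImageKey)
    filter.setValue(0.0, forKey: kCIInputSaturationKey) // 0 = grayscale
    let context = CIContext()
    guard let output = filter.outputImage,
          let result = context.createCGImage(output, from: output.extent) else {
        return nil
    }
    return UIImage(cgImage: result)
}
```

Real preprocessing pipelines usually add contrast stretching or thresholding on top of this, but plain desaturation is often enough to see whether B&W helps your accuracy.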

Almost for sure the problem is "orientation". Apple tends to create images in one bitmap form: the image bits are laid out as if the camera were on its side, with the volume buttons top and right. Images that appear taller than they are wide are still laid out that way, but there is an "orientation" entry in the EXIF data included with the image.
I'm going to guess that Tesseract does not look at the EXIF, but expects the image in a "standard" format, so that the text is positioned as it would be for a person reading it.
You can test my hypothesis by using camera images taken with the volume buttons top right.
If they work, then what you will need to do is process the image yourself and re-arrange the bits per the orientation setting. This is not all that hard to do, but it will require you to read up on vImage and/or bitmap contexts.
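If my guess is right, a minimal sketch of the fix using UIKit's redraw path (the name normalizedImage is mine, not from any framework):

```swift
import UIKit

// Redraw the image so the pixel data matches the EXIF display orientation;
// the result always reports .up and needs no further rotation.
func normalizedImage(_ image: UIImage) -> UIImage {
    guard image.imageOrientation != .up else { return image }
    UIGraphicsBeginImageContextWithOptions(image.size, false, image.scale)
    defer { UIGraphicsEndImageContext() }
    // draw(in:) applies the orientation transform while rendering, laying
    // the bits out the way a person reading the text would see them.
    image.draw(in: CGRect(origin: .zero, size: image.size))
    return UIGraphicsGetImageFromCurrentImageContext() ?? image
}
```

vImage or a hand-rolled bitmap context gives you more control, but this one-redraw version is usually enough to produce sane OCR input.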

Related

iOS image quality improvement

Icons Are Pretty Right?
I'm working on a UI update in an iOS app, trying to make things look a bit better with some new icons, but I seem to be incapable of determining how to save an image correctly so that it looks good in the interface!
As you can see from this image, if I include a white background with the image, it looks great. If I take those same images and use an alpha background, they look terrible! It appears that either the images aren't using the @2x correctly, or something else is going horribly wrong.
These images are either saved with GIMP as a PNG with alpha or exported from Inkscape; the originals are vector graphics. We get the same results from both avenues. I am using both a base imageName.png and imageName@2x.png for scaling.
Somehow, magically, I changed a single image to greyscale in GIMP and changed the base size to 25px, and it showed up with its alpha correctly blended. Stock images from Apple also function correctly, so it absolutely seems to be something I'm doing incorrectly when saving the images.
The Setup in Xcode
Basic Questions
Is there a certain bit depth, ARGB vs. RGBA format, or some other quirk that I need to know about to get these images to show up correctly? Is there any way to verify that the program is loading imageName@2x rather than imageName? Is there some document that covers integrating graphics? (The iconography documentation isn't very helpful on technical details.)
Actual Images
With Background:
Without Background:
I think you will find success if you just save the image at 4x the size you actually want and specify the size manually.
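For example, a minimal sketch of that approach ("toolbar_icon" is a placeholder asset name):

```swift
import UIKit

// Ship one oversized asset and pin the on-screen size yourself rather
// than relying on the view to pick up the image's natural size.
let iconView = UIImageView(image: UIImage(named: "toolbar_icon"))
iconView.frame = CGRect(x: 0, y: 0, width: 25, height: 25)
iconView.contentMode = .scaleAspectFit // downscale the larger asset to fit
```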

supplying the right image size when not knowing what the size will be at runtime

I am displaying a grid of images (3 rows x 3 columns) in a collection view. Each image is a square whose width is 1/3 of the collection view's width, and the collection view is pinned to the left and right margins of the main view.
I do not know what the image height and width will be at runtime, because of the different screen sizes of the various iPhones. For example, each image will be 100x100 display pixels on a 5S but 130x130 on a 6+. I was advised to supply images that exactly match the size on screen, since bigger images often become pixelated or over-sharp when downsized. How does one tackle such a problem?
The usual solution is to supply three versions, for single-, double-, and triple-resolution screens, and downsize in real time by redrawing with drawInRect into a graphics context when the image is first needed.
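A sketch of that redraw-on-first-need idea; the cache and the function name thumbnail(for:size:) are illustrative:

```swift
import UIKit

// Cache of downscaled images, keyed by asset name and target size.
var thumbnailCache = [String: UIImage]()

func thumbnail(for name: String, size: CGSize) -> UIImage? {
    let key = "\(name)-\(Int(size.width))x\(Int(size.height))"
    if let cached = thumbnailCache[key] { return cached }
    guard let original = UIImage(named: name) else { return nil }
    // Scale 0 means "use the screen's scale", so the bitmap comes out at
    // the right pixel density on 1x, 2x, and 3x devices alike.
    UIGraphicsBeginImageContextWithOptions(size, false, 0)
    defer { UIGraphicsEndImageContext() }
    original.draw(in: CGRect(origin: .zero, size: size))
    let scaled = UIGraphicsGetImageFromCurrentImageContext()
    thumbnailCache[key] = scaled
    return scaled
}
```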
I do not know what the image height and width will be at runtime, because of different screen sizes of various iPhones. For example each image will be 100x100 display pixels on 5S, but 130x130 on 6+
Okay, so your first sentence is a lie. The second sentence proves that you do know what the size is to be on the different screen sizes. Clearly, if I tell you the name of a device, you can tell me what you think the image size should be. So, if you don't want to downscale a larger image at runtime because you don't like the resulting quality, simply supply actual images at the correct size and resolution for every device, and use the correct image on the actual device type you find yourself running on.
If your images are photos or other raster images created with a raster drawing tool, then somewhere you will have to scale the original to the sizes you want. You can either do this while running in iOS, or create the sets up front using a tool whose scaling you control, which can give better results. Unfortunately, the only perfect image is the original; everything else is a distortion of the truth.
For icons, the only accurate rendering solution is to use vector graphics. Tools like Adobe Illustrator let you create images which you can scale to different sizes without losing clarity. Unfortunately this still leaves you generating images up front. You can script this generation with most tools, and given that your images are all square, the total number needed is not huge: at most 3 for iPhone (the 4/5 share a width, plus the 6 and 6+) and 2 for iPad (one for the mini/iPad 1 and one for Retina).
Although iOS has no direct support that I know of for vector image rendering, there are some 3rd-party tools. http://www.paintcodeapp.com/ is an example which seems to let you import or draw vector images and then generates drawing code to run in your app. This kind of tool would give you what you want, as the images become vector drawings rendered at whatever scale you choose at run time. $99 though.
There is also SVGKit (https://github.com/SVGKit/SVGKit), though I'm not sure how good or bad it is. It seems to let you load and render directly from SVG files. Might be worth trying.
So in summary: either generate the relatively small set of sizes up front with a tool whose output you can control, take the hit in iOS and let it scale the images, or use a 3rd-party vector-rendering kit, which would give you what you want.

UIImage size management

I have an image that is used in multiple places within my iOS app at various sizes. Should I use one large image in conjunction with UIViewContentModeScaleAspectFit to scale it (currently this approach seems to cause distortion in the smaller renditions, despite being scaled to proportionate sizes), or should I have multiple versions of the image at different sizes?
It seems a little extreme to have so many copies of one image e.g. my_image_ipad, my_image_small, my_image_large, my_image_medium.
In general it depends on what it takes to get the look and quality you want. Some images will scale from large to small with little or no visible loss in quality; others, as you've found, scale poorly, especially at extreme scales. You're really going to have to just try various solutions and see what works. One way you can almost always cut down on application size is to remove the non-@2x artwork and ship with just the @2x.
It is better to use one image because it makes your app "lighter", but it may make your code bigger.

how to display an image (containing text) with good quality

I'm new to iOS and trying to make a simple app with a hierarchy of view controllers. In the last one I want to display a scrollable image (which can also be zoomed, at least 1.5x) containing a small black-and-white picture and a piece of text. Initially I planned to make a vector image, convert it to .jpg, and use UIScrollView for display. But I found that the .jpg (approx. 150 KB) didn't give good enough quality for displaying text. As I have to use a lot of images, I don't want to increase the image size. What is worse, I also want it to look good on a Retina display.
Can you recommend a way to display an image containing text with enough quality?
I mean that I don't want the user to see the individual pixels of the letters, just like when you read text in your e-mail on iOS. The image size should be as small as possible. The planned physical size of the image is approx. 5 cm x 15 cm.
Any help much appreciated.
Thanks
To get good edges you would need to use PNG, not JPEG, which will make the image sizes much larger. I have a better suggestion: more code, but a better solution.
The answer is to not put the text into the image, but to draw it over the image in real time.
You would:
associate text at some coordinate in the image (say, a CGRect) with the image
create a UIImageView subclass that, in its drawRect routine, after calling super, draws the text using the NSString drawing categories on UIKit (which let you draw into a context)
To get going on this, create a small one-view-controller project and get the subclass working there, then back-port it to your primary project.
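The answer above sketches a UIImageView subclass; an offscreen-render variant of the same idea, assuming the caller supplies the text and its CGRect (annotate is an illustrative name), looks like this:

```swift
import UIKit

// Draw a string over a picture at render time instead of baking it into
// the bitmap, so the glyphs stay crisp instead of showing JPEG artifacts.
func annotate(_ image: UIImage, text: String, in rect: CGRect) -> UIImage {
    UIGraphicsBeginImageContextWithOptions(image.size, false, image.scale)
    defer { UIGraphicsEndImageContext() }
    image.draw(at: .zero)
    let attributes: [NSAttributedString.Key: Any] = [
        .font: UIFont.systemFont(ofSize: 14),
        .foregroundColor: UIColor.black
    ]
    // The NSString/String drawing additions render at the context's scale,
    // so the text stays sharp on Retina screens.
    (text as NSString).draw(in: rect, withAttributes: attributes)
    return UIGraphicsGetImageFromCurrentImageContext() ?? image
}
```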

Capturing image at higher resolution in iOS

Currently I am working on an image app where I am able to capture an image using the AVFoundation framework. But what I am looking for is to capture an image at a certain resolution and DPI (maybe 300 DPI or greater).
How to do this ?
There have been numerous posts on here about trying to do OCR on camera-generated images. The problem is not that the resolution is too low, but that it's too high. I cannot find the link right now, but there was a question a year or so ago where, in the end, reducing the image size by a factor of four or so made the OCR engine work better. If you examine the image in Preview, what you want is on the order of 16x16 or 32x32 pixels per character, not 256x256. Frankly, I don't know the exact number, but I'm sure you can research this and find posts from actual framework users telling you the best size.
Here is a nice response on how to best scale a large image (with a link to code).
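As a rough sketch of the downscaling idea (the factor of 4 is the anecdotal starting point from above, not a verified constant):

```swift
import UIKit

// Shrink a full-resolution camera frame before handing it to an OCR engine.
func downscaleForOCR(_ image: UIImage, factor: CGFloat = 4) -> UIImage? {
    let target = CGSize(width: image.size.width / factor,
                        height: image.size.height / factor)
    // Scale 1 produces a bitmap measured in plain pixels, which is what
    // the OCR engine sees, rather than screen points.
    UIGraphicsBeginImageContextWithOptions(target, true, 1)
    defer { UIGraphicsEndImageContext() }
    image.draw(in: CGRect(origin: .zero, size: target))
    return UIGraphicsGetImageFromCurrentImageContext()
}
```

Tune the factor until characters land near the 16-32 pixel size mentioned above.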
