I'm using Tesseract-OCR-iOS (https://github.com/gali8/Tesseract-OCR-iOS/) to make an app that detects text on business cards.
I'm stuck at getting Tesseract to detect the text in an image.
If I pass in an image bundled with the app, Tesseract is able to detect the text. If I provide an image taken with the camera, Tesseract is not able to recognize it.
-(void)startTess:(UIImage *)img{
G8Tesseract *tesseract = [[G8Tesseract alloc] initWithLanguage:@"eng"];
tesseract.delegate = self;
tesseract.engineMode=G8OCREngineModeTesseractCubeCombined;
// Optional: Limit the character set Tesseract should try to recognize from
tesseract.charWhitelist = @"#.,()-,abcdefghijklmnopqrstuvwxyz0123456789";
// Specify the image Tesseract should recognize on
tesseract.image = [img g8_blackAndWhite];
// Optional: Limit the area of the image Tesseract should recognize on to a rectangle
CGRect tessRect = CGRectMake(0, 0, img.size.width, img.size.height);
tesseract.rect = tessRect;
// Optional: Limit recognition time with a few seconds
tesseract.maximumRecognitionTime = 4.0;
// Start the recognition
[tesseract recognize];
// Retrieve the recognized text
NSLog(@"text %@", [tesseract recognizedText]);
// You can retrieve more information about the recognized text with these methods:
NSArray *characterBoxes = [tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelSymbol];
NSArray *paragraphs = [tesseract recognizedBlocksByIteratorLevel:G8PageIteratorLevelParagraph];
NSArray *characterChoices = tesseract.characterChoices;
UIImage *imageWithBlocks = [tesseract imageWithBlocks:characterBoxes drawText:YES thresholded:NO];
self.imgView.image = imageWithBlocks;
NSString *result = [[characterBoxes valueForKey:@"description"] componentsJoinedByString:@"\n"];
_txtView.text=result;
}
Result when image provided from .xcassets:
Result when image taken directly from the camera:
In both the cases, Tesseract is recognizing the empty space with some random characters. I marked that area in both the images (top-left portion of image).
I made sure that the image taken from the device camera has its orientation set to up, as some people reported that Tesseract doesn't recognize images taken from the camera because they are rotated by 180 degrees.
UIImage *chosenImage = info[UIImagePickerControllerOriginalImage];
// Redraw the image (if necessary) so it has the correct orientation:
if (chosenImage.imageOrientation != UIImageOrientationUp) {
UIGraphicsBeginImageContextWithOptions(chosenImage.size, NO, chosenImage.scale);
[chosenImage drawInRect:(CGRect){0, 0, chosenImage.size}];
chosenImage = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
}
What is the best way to debug this and move forward?
I submitted an issue at git:
https://github.com/gali8/Tesseract-OCR-iOS/issues/358
Edit:
I have changed the iterator level to G8PageIteratorLevelTextline, and now the image taken by device camera gives the following output:
It is still not accurate. If someone can point out how to improve this, I would appreciate it.
The official Tesseract GitHub repository mentions various preprocessing methods. Along with those measures, I would suggest using uncompressed .tiff images instead of .jpg, because lossy compression adds artifacts that reduce binarization quality (.png is lossless, so this concern applies mainly to .jpg).
Related
My app lets the user take photos, and in every photo there is a small watermark. The problem is: The watermark appears bigger when the photo has been taken with the front camera. I want the watermark to have the same size no matter which camera has been used.
Any ideas?
My code:
UIImage *backgroundImage = image;
UIImage *watermarkImage = [UIImage imageNamed:@"Watermark.png"];
UIGraphicsBeginImageContext(backgroundImage.size);
[backgroundImage drawInRect:CGRectMake(0, 0, backgroundImage.size.width, backgroundImage.size.height)];
[watermarkImage drawInRect:CGRectMake(backgroundImage.size.width - watermarkImage.size.width, backgroundImage.size.height - watermarkImage.size.height, watermarkImage.size.width, watermarkImage.size.height)];
UIImage *result = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
self.imageView.image = result;
The watermark is the same size in pixels; the photo is not, since the two cameras have different resolutions. You need to resize the watermark in proportion to the image size. I believe you can use scaleImage:toSize: for this.
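The proportional sizing described above can be sketched in plain C, which compiles unchanged inside an Objective-C file. This is only an illustration: the `Rect` struct, the `watermark_rect` function, and the `width_fraction` parameter are hypothetical names that do not appear in the original code, and the bottom-right anchoring simply mirrors the question's drawing code.

```c
/* Rectangle in pixel units (a stand-in for CGRect so the sketch stays plain C). */
typedef struct {
    double x, y, width, height;
} Rect;

/*
 * Compute where to draw a watermark so that its width is a fixed fraction of
 * the photo's width. The watermark keeps its aspect ratio and is anchored at
 * the bottom-right corner of the photo.
 */
Rect watermark_rect(double photo_w, double photo_h,
                    double mark_w, double mark_h,
                    double width_fraction)
{
    Rect r;
    r.width  = photo_w * width_fraction;
    r.height = r.width * (mark_h / mark_w);
    r.x = photo_w - r.width;
    r.y = photo_h - r.height;
    return r;
}
```

The result could then be passed to `drawInRect:` via `CGRectMake(r.x, r.y, r.width, r.height)`, so the watermark covers the same share of the photo whichever camera produced it.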
Figured out a weird solution. I turned on edit mode: [picker setAllowsEditing:YES]; and now the watermark is the same size no matter what camera you use.
My iOS app utilizes a loop to cycle through images in a folder.
My application is supposed to loop through a total of 2031 images (sized 1200x900) inside a folder. The images were taken at 8fps and each image will be displayed as the loop continues to simulate a video clip. After the 696th picture, the images will cease to be displayed in the UIImageView although the app will continue looping.
To test whether the failure was caused by a missing picture, I started the loop at picture 200, but after picture 896 the UIImageView stops displaying the pictures.
The Code:
imgName = [NSString stringWithFormat:@"subject_basline_mat k (%d).png", jojo];
jojo++;
imageToCrop.image = [UIImage imageNamed:imgName]; //imageToCrop is the name of the UIImageView image and it is set to the image file here
imageToCrop.image = [self imageWithImage:imageToCrop.image convertToSize:self.imageToCrop.frame.size]; //Here the image is converted to fit the bounds of the simulator which is 320x240
The code loops due to a timer that loops it about once every 0.8 seconds.
I ran my code with Instruments to see if there was a memory problem, but Instruments is very heavy on my computer, so my application ran quite slowly. However, when I arrived at the 696th picture, the pictures kept displaying. It was almost as if my application running too quickly caused the pictures not to be displayed, which I don't really understand.
The only memory heavy part of the image switching seems to be the size conversion step which is called by the line imageToCrop.image = [self imageWithImage:imageToCrop.image convertToSize:self.imageToCrop.frame.size];
The method "imageWithImage" is here:
- (UIImage *)imageWithImage:(UIImage *)image convertToSize:(CGSize)size {
    @autoreleasepool {
        UIGraphicsBeginImageContext(size);
        [image drawInRect:CGRectMake(0, 0, size.width, size.height)];
        UIImage *destImage = UIGraphicsGetImageFromCurrentImageContext();
        UIGraphicsEndImageContext();
        return destImage;
    }
}
The line [image drawInRect:CGRectMake(0, 0, size.width, size.height)]; uses the most memory out of all the image management in the app.
Any ideas as to why my app will only display a certain number of images?
Try loading the full-size images from the app bundle by URL. For example:
@autoreleasepool {
    NSString *imgName = [NSString stringWithFormat:@"subject_basline_mat k (%d)", jojo];
    NSURL *imageURL = [[NSBundle mainBundle] URLForResource:imgName withExtension:@"png"];
    UIImage *image = [UIImage imageWithContentsOfFile:[imageURL path]];
    imageToCrop.image = [self imageWithImage:image convertToSize:self.imageToCrop.frame.size];
}
Almost certainly your problem is [UIImage imageNamed:imgName]. There are hundreds of posts here on the pitfalls of using it. The issue is that it caches the images; its real purpose is to serve a small number of images in your bundle.
If you have oodles of images, get the path to the image, then load the image through a URL or file path. That way it's not cached. Note that when you do this, you lose the automatic "get-retina-image-automatically" behavior, so you will need to pick the appropriately sized image depending on whether the device is retina or not.
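To get a feel for why caching matters with this many frames, here is a rough back-of-the-envelope sketch in plain C. It assumes each decoded image occupies 4 bytes per pixel (32-bit RGBA), which is an assumption rather than something stated in the question; the function names are hypothetical, and the system may evict cached images under memory pressure, so treat the result as an upper bound rather than a measurement.

```c
#include <stdint.h>

/* Bytes needed to hold one decoded bitmap, assuming 4 bytes per pixel (RGBA). */
uint64_t decoded_bytes(uint64_t width, uint64_t height)
{
    return width * height * 4u;
}

/* Total for `count` decoded images of the same size that stay alive at once. */
uint64_t cached_bytes(uint64_t width, uint64_t height, uint64_t count)
{
    return decoded_bytes(width, height) * count;
}
```

Under that assumption, one 1200x900 frame decodes to 4,320,000 bytes (about 4.3 MB), and 696 such frames kept alive together would add up to roughly 3 GB.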
I have to merge multiple high-resolution images into a single image, which consumes a lot of memory. I save the original images to a local directory and assign resized images to image views placed at different locations on the main image. When saving the final merged image, I read the original images back from the local directory. At that point memory usage increases, which causes a crash (due to memory) when there are many images.
Here is the code that retrieves an original image from the local directory:
UIImage *originalImage = [UIImage imageWithContentsOfFile:[self getOriginalImagePath:imageview.tag]];
Is there any other way to get images from the local directory without loading them into memory?
Thanks in advance
There is no way to load an image without it going into memory. With some image formats you could, in theory, implement your own reader that scales the image down while reading the file, so that the original size never ends up in memory, but that would require a lot of work for little gain.
Overall you would be better off just saving the different sizes of images as separate files and loading only the correct size (you seem to be scaling them based on the screen size, so there are not that many different versions required).
If you do keep resizing them on the fly, make sure that you get rid of the original versions as soon as possible, i.e., don't keep any image reference that is no longer required, and perhaps wrap the whole thing in @autoreleasepool (assuming ARC is being used):
@autoreleasepool {
    UIImage *originalImage = [UIImage imageWithContentsOfFile:[self getOriginalImagePath:imageview.tag]];
    UIImage *pThumbImage = [self scaleImageToSize:CGSizeMake(AppScreenBound.size.width, AppScreenBound.size.height) imageWithImage:originalImage];
    originalImage = nil;
    imageView.image = pThumbImage;
    pThumbImage = nil;
    // … ?
}
Treat any other image handling that creates intermediate versions similarly, i.e., get rid of references that are no longer required as soon as possible (such as by assigning nil or letting them fall out of scope), and put @autoreleasepool { … } around subsections that may generate temporary objects.
I found a solution and am posting it as an answer to my own question, since it might help other people. It is based on the Image I/O Programming Guide.
As an alternative to imageWithContentsOfFile:, one can use an image source.
Here is how I use it:
UIImage *originalWMImage = [self createCGImageFromFile:your-image-path];
The method createCGImageFromFile: reads the image content through an image source rather than loading it with imageWithContentsOfFile:
-(UIImage*) createCGImageFromFile :(NSString*)path
{
// Get the URL for the pathname passed to the function.
NSURL *url = [NSURL fileURLWithPath:path];
CGImageRef myImage = NULL;
CGImageSourceRef myImageSource;
CFDictionaryRef myOptions = NULL;
CFStringRef myKeys[2];
CFTypeRef myValues[2];
// Set up options if you want them. The options here are for
// caching the image in a decoded form and for using floating-point
// values if the image format supports them.
myKeys[0] = kCGImageSourceShouldCache;
myValues[0] = (CFTypeRef)kCFBooleanTrue;
myKeys[1] = kCGImageSourceShouldAllowFloat;
myValues[1] = (CFTypeRef)kCFBooleanTrue;
// Create the dictionary
myOptions = CFDictionaryCreate(NULL, (const void **) myKeys,
(const void **) myValues, 2,
&kCFTypeDictionaryKeyCallBacks,
& kCFTypeDictionaryValueCallBacks);
// Create an image source from the URL.
myImageSource = CGImageSourceCreateWithURL((__bridge CFURLRef)url, myOptions);
CFRelease(myOptions);
// Make sure the image source exists before continuing
if (myImageSource == NULL){
fprintf(stderr, "Image source is NULL.");
return NULL;
}
// Create an image from the first item in the image source.
myImage = CGImageSourceCreateImageAtIndex(myImageSource,
0,
NULL);
CFRelease(myImageSource);
// Make sure the image exists before continuing
if (myImage == NULL){
fprintf(stderr, "Image not created from image source.");
return NULL;
}
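// UIImage retains the CGImage, so release our own reference to avoid a leak.
UIImage *result = [UIImage imageWithCGImage:myImage];
CGImageRelease(myImage);
return result;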
}
Here is the code: the image is resized and simply assigned to the image view. Then I perform scaling and rotation on the image view.
UIImage *pThumbImage = [self scaleImageToSize:CGSizeMake(AppScreenBound.size.width, AppScreenBound.size.height) imageWithImage:pOrignalImage];
[imageView setImage:pThumbImage];
Here is the code used when saving. It runs inside a for loop (up to the number of images to merge onto the main image):
// get size of the second image
CGFloat backgroundWidth = canvasSize.width;
CGFloat backgroundHeight = canvasSize.height;
//Image View: to be merged
UIImageView* imageView = [[UIImageView alloc] initWithImage:stampImage];
[imageView setFrame:CGRectMake(0, 0, stampFrameSize.size.width , stampFrameSize.size.height)];
// Rotate Image View
CGAffineTransform currentTransform = imageView.transform;
CGAffineTransform newTransform = CGAffineTransformRotate(currentTransform, radian);
[imageView setTransform:newTransform];
// Scale Image View
CGRect imageFrame = [imageView frame];
// Create Final Stamp View
UIView *finalStamp = nil;
finalStamp = [[UIView alloc] initWithFrame:CGRectMake(0, 0, imageFrame.size.width, imageFrame.size.height)];
// Set Center of Stamp Image
[imageView setCenter:CGPointMake(imageFrame.size.width /2, imageFrame.size.height /2)];
[finalImageView addSubview:imageView];
// Create Image From image View;
UIGraphicsBeginImageContext(finalStamp.frame.size);
[finalStamp.layer renderInContext:UIGraphicsGetCurrentContext()];
UIImage *viewImage = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
UIImage *pfinalMainImage = nil;
// Create Final Image With Stamp
UIGraphicsBeginImageContext(CGSizeMake(backgroundWidth, backgroundHeight));
[canvasImage drawInRect:CGRectMake(0, 0, backgroundWidth, backgroundHeight)];
[viewImage drawInRect:CGRectMake(stampFrameSize.origin.x , stampFrameSize.origin.y , stampImageFrame.size.width , stampImageFrame.size.height) blendMode:kCGBlendModeNormal alpha:fAlphaValue];
pfinalImage = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
}
Everything is okay up to here. The problem occurs while saving or generating the merged image.
This is an old question, but I had to face something similar recently, so here is my answer.
I had to merge a lot of images into one and had the same problem: memory increased until the app crashed. The functions I had created returned UIImage, and that was the problem. ARC was not releasing the images in time, so I changed the functions to return CGImageRef and released the images explicitly at the proper time.
I had a C++ binarization routine that I used for later OCR operation.
However I found that it produced unnecessary slanting of text.
Searching for alternatives I found GPUImage of great value and it solved the slanting issue.
I am using GPUImage code like this to binarize my input images before applying OCR.
However the threshold value does not cover the range of images I get.
See two samples from my input images:
I can't handle both with same threshold value.
A low value seems to be fine for the latter, and a higher value is fine for the first one.
The second image seems to be of special complexity because I never get all the chars to be binarized right, irrespective of what value I set for threshold. On the other hand, my C++ binarization routine seems to do it right, but I don't have much insights to experiment into it like simplistic threshold value in GPUImage.
How should I handle that?
UPDATE:
I tried with GPUImageAverageLuminanceThresholdFilter with default multiplier = 1. It works fine with first image but the second image continues to be problem.
Some more diverse inputs for binarization:
UPDATE II:
After going through this answer by Brad, tried GPUImageAdaptiveThresholdFilter (also incorporating GPUImagePicture because earlier I was only applying it on UIImage).
With this, I got the second image binarized perfectly. However, the first one has a lot of noise after binarization when I set the blur size to 3.0, and the OCR adds extra characters. With a lower blur size, the second image loses precision.
Here it is:
+(UIImage *)binarize : (UIImage *) sourceImage
{
UIImage * grayScaledImg = [self toGrayscale:sourceImage];
GPUImagePicture *imageSource = [[GPUImagePicture alloc] initWithImage:grayScaledImg];
GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
stillImageFilter.blurSize = 3.0;
[imageSource addTarget:stillImageFilter];
[imageSource processImage];
UIImage *imageWithAppliedThreshold = [stillImageFilter imageFromCurrentlyProcessedOutput];
// UIImage *destImage = [thresholdFilter imageByFilteringImage:grayScaledImg];
return imageWithAppliedThreshold;
}
As a pre-processing step, you need adaptive thresholding here.
I got these results using OpenCV's grayscale and adaptive thresholding methods. With the addition of low-pass noise filtering (Gaussian or median), it should work like a charm.
I used provisia (it's a UI that helps you process images quickly) to find the block size I needed: 43 for the image you supplied here. The block size may change if you take the photo from closer or farther away. If you want a generic algorithm, you need to develop one that searches for the best size (search until numbers are detected).
EDIT: I just saw the last image. It is untreatably small. Even if you apply the best pre-processing algorithm, you are not going to detect those numbers. Upsampling would not be a solution, since it would introduce noise.
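For readers who want to see what adaptive thresholding does under the hood, here is a minimal sketch in plain C of a mean-based variant. It is not the OpenCV or GPUImage implementation; the function name `adaptive_threshold` and its parameters are hypothetical. It assumes an 8-bit grayscale buffer, compares each pixel against the mean of the surrounding `block` x `block` window minus `offset`, and uses an integral image so the cost does not grow with the block size.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Mean adaptive threshold: a pixel becomes white (255) when it is brighter
 * than the mean of the window around it minus `offset`, otherwise black (0).
 * Windows are clipped at the image borders.
 * Returns 0 on success, -1 on bad arguments or allocation failure.
 */
int adaptive_threshold(const uint8_t *gray, uint8_t *out,
                       int width, int height, int block, int offset)
{
    if (!gray || !out || width <= 0 || height <= 0 || block < 1)
        return -1;

    /* Integral image with one extra row and column of zeros. */
    size_t stride = (size_t)width + 1;
    uint64_t *sum = calloc(stride * ((size_t)height + 1), sizeof *sum);
    if (!sum)
        return -1;

    for (int y = 0; y < height; y++) {
        uint64_t row = 0;
        for (int x = 0; x < width; x++) {
            row += gray[(size_t)y * width + x];
            sum[(size_t)(y + 1) * stride + (x + 1)] =
                sum[(size_t)y * stride + (x + 1)] + row;
        }
    }

    int half = block / 2;
    for (int y = 0; y < height; y++) {
        int y0 = y - half < 0 ? 0 : y - half;
        int y1 = y + half >= height ? height - 1 : y + half;
        for (int x = 0; x < width; x++) {
            int x0 = x - half < 0 ? 0 : x - half;
            int x1 = x + half >= width ? width - 1 : x + half;

            /* Sum of the clipped window, read from four integral-image corners. */
            uint64_t total = sum[(size_t)(y1 + 1) * stride + (x1 + 1)]
                           + sum[(size_t)y0 * stride + x0]
                           - sum[(size_t)y0 * stride + (x1 + 1)]
                           - sum[(size_t)(y1 + 1) * stride + x0];
            int64_t area = (int64_t)(x1 - x0 + 1) * (y1 - y0 + 1);

            /* Compare pixel * area with (mean - offset) * area to avoid division. */
            int64_t lhs = (int64_t)gray[(size_t)y * width + x] * area;
            int64_t rhs = (int64_t)total - (int64_t)offset * area;
            out[(size_t)y * width + x] = lhs > rhs ? 255 : 0;
        }
    }

    free(sum);
    return 0;
}
```

With the block size of 43 mentioned above, a call might look like `adaptive_threshold(gray, out, width, height, 43, 10)`; the offset of 10 is an arbitrary starting point and would need tuning per image set.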
I finally ended up exploring on my own, and here is my result with GPUImage filter:
+ (UIImage *) doBinarize:(UIImage *)sourceImage
{
//first off, try to grayscale the image using iOS core Image routine
UIImage * grayScaledImg = [self grayImage:sourceImage];
GPUImagePicture *imageSource = [[GPUImagePicture alloc] initWithImage:grayScaledImg];
GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
stillImageFilter.blurSize = 8.0;
[imageSource addTarget:stillImageFilter];
[imageSource processImage];
UIImage *retImage = [stillImageFilter imageFromCurrentlyProcessedOutput];
return retImage;
}
+ (UIImage *) grayImage :(UIImage *)inputImage
{
// Create a graphic context.
UIGraphicsBeginImageContextWithOptions(inputImage.size, NO, 1.0);
CGRect imageRect = CGRectMake(0, 0, inputImage.size.width, inputImage.size.height);
// Draw the image with the luminosity blend mode.
// On top of a white background, this will give a black and white image.
[inputImage drawInRect:imageRect blendMode:kCGBlendModeLuminosity alpha:1.0];
// Get the resulting image.
UIImage *outputImage = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
return outputImage;
}
I achieve almost 90% using this - I am sure there must be better options but I tried with blurSize as far as I could and 8.0 is the value that works with most of my input images.
For anyone else, good luck with your trying!
SWIFT3
SOLUTION 1
extension UIImage {
func doBinarize() -> UIImage? {
let grayScaledImg = self.grayImage()
let imageSource = GPUImagePicture(image: grayScaledImg)
let stillImageFilter = GPUImageAdaptiveThresholdFilter()
stillImageFilter.blurRadiusInPixels = 8.0
imageSource!.addTarget(stillImageFilter)
stillImageFilter.useNextFrameForImageCapture()
imageSource!.processImage()
guard let retImage: UIImage = stillImageFilter.imageFromCurrentFramebuffer(with: UIImageOrientation.up) else {
print("unable to obtain UIImage from filter")
return nil
}
return retImage
}
func grayImage() -> UIImage? {
UIGraphicsBeginImageContextWithOptions(self.size, false, 1.0)
let imageRect = CGRect(x: 0, y: 0, width: self.size.width, height: self.size.height)
self.draw(in: imageRect, blendMode: .luminosity, alpha: 1.0)
let outputImage = UIGraphicsGetImageFromCurrentImageContext()
UIGraphicsEndImageContext()
return outputImage
}
}
The result would be
SOLUTION 2
Use GPUImageLuminanceThresholdFilter to achieve a 100% black-and-white effect without any gray:
let stillImageFilter = GPUImageLuminanceThresholdFilter()
stillImageFilter.threshold = 0.9
For example I need to detect flash light and this works for me
I'm trying to make an application in which I have to calculate the brightness seen by the camera, like this application: http://itunes.apple.com/us/app/megaman-luxmeter/id455660266?mt=8
I found this document: http://b2cloud.com.au/tutorial/obtaining-luminosity-from-an-ios-camera
But I don't know how to adapt it to the camera directly rather than to an image. Here is my code:
Image = [[UIImagePickerController alloc] init];
Image.delegate = self;
Image.sourceType = UIImagePickerControllerCameraCaptureModeVideo;
Image.showsCameraControls = NO;
[Image setWantsFullScreenLayout:YES];
Image.view.bounds = CGRectMake (0, 0, 320, 480);
[self.view addSubview:Image.view];
NSArray* dayArray = [NSArray arrayWithObjects:Image,nil];
for(NSString* day in dayArray)
{
for(int i=1;i<=2;i++)
{
UIImage* image = [UIImage imageNamed:[NSString stringWithFormat:@"%@%d.png",day,i]];
unsigned char* pixels = [image rgbaPixels];
double totalLuminance = 0.0;
for(int p=0;p<image.size.width*image.size.height*4;p+=4)
{
totalLuminance += pixels[p]*0.299 + pixels[p+1]*0.587 + pixels[p+2]*0.114;
}
totalLuminance /= (image.size.width*image.size.height);
totalLuminance /= 255.0;
NSLog(@"%@ (%d) = %f",day,i,totalLuminance);
}
}
Here are the issues :
"Instance method '-rgbaPixels' not found (return type defaults to 'id')"
&
"Incompatible pointer types initializing 'unsigned char *' with an expression of type 'id'"
Thanks a lot ! =)
Rather than doing expensive CPU-bound processing of each pixel in an input video frame, let me suggest an alternative approach. My open source GPUImage framework has a luminosity extractor built into it, which uses GPU-based processing to give live luminosity readings from the video camera.
It's relatively easy to set this up. You simply need to allocate a GPUImageVideoCamera instance to represent the camera, allocate a GPUImageLuminosity filter, and add the latter as a target for the former. If you want to display the camera feed to the screen, create a GPUImageView instance and add that as another target for your GPUImageVideoCamera.
Your luminosity extractor will use a callback block to return luminosity values as they are calculated. This block is set up using code like the following:
[(GPUImageLuminosity *)filter setLuminosityProcessingFinishedBlock:^(CGFloat luminosity, CMTime frameTime) {
// Do something with the luminosity
}];
I describe the inner workings of this luminosity extraction in this answer, if you're curious. This extractor runs in ~6 ms for a 640x480 frame of video on an iPhone 4.
One thing you'll quickly find is that the average luminosity from the iPhone camera is almost always around 50% when automatic exposure is enabled. This means that you'll need to supplement your luminosity measurements with exposure values from the camera metadata to obtain any sort of meaningful brightness measurement.
Why do you place the camera image into an NSArray *dayArray? Five lines later you remove it from that array but treat the object as an NSString. An NSString does not have rgbaPixels. The example you copy-pasted has an array of filenames corresponding to pictures taken at different times of the day. It then opens those image files and performs the analysis of luminosity.
In your case, there is no file to read. Both outer for loops, i.e. on day and i will have to go away. You already got access to the Image provided through the UIImagePickerController. Right after adding the subview, you could in principle access pixels as in unsigned char *pixels = [Image rgbaPixels]; where Image is the image you got from UIImagePickerController.
However, this may not be what you want to do. I imagine that your goal is rather to show the UIImagePickerController in capture mode and then to measure luminosity continuously. To this end, you could turn Image into a member variable, and then access its pixels repeatedly from a timer callback.
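Once the file-loading loops are removed, the per-frame calculation from the question reduces to something like the following plain C sketch, which can be called from Objective-C. The function name `average_luminance` is hypothetical, and the sketch assumes a tightly packed RGBA buffer with 8 bits per channel; how that buffer is obtained from the camera is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Average luminance of a tightly packed RGBA buffer, normalized to 0..1,
 * using the same weights as the loop in the question (0.299, 0.587, 0.114).
 * Returns -1.0 when the buffer is missing or empty.
 */
double average_luminance(const uint8_t *rgba, size_t width, size_t height)
{
    if (!rgba || width == 0 || height == 0)
        return -1.0;

    size_t count = width * height;
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        const uint8_t *p = rgba + i * 4;   /* R, G, B, A of pixel i */
        total += p[0] * 0.299 + p[1] * 0.587 + p[2] * 0.114;
    }
    return total / (double)count / 255.0;
}
```

A timer callback could call this once per captured frame and log or display the returned value.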
You can import the class below from GitHub to resolve this issue:
https://github.com/maxmuermann/pxl
Add the UIImage+Pixels.h and .m files to your project, then try running it again.