I've been challenged with the task of putting an oddly sized image (with fixed proportions, though) on a GL_QUAD (well, a GL_TRIANGLE_STRIP resem--you get the point), and that seemed fairly easy to me at first, except for the part where I need to do this on iOS (4.2+). The solution is awkwardly easy anyway: just take the image, make a texture out of it, map it to the correct vertices and you're good to go.
As you may very well know, OpenGL ES textures are required to have a width and height that are powers of 2, like 2, 4, 8, ..., 256, 512... (not sure this holds for regular OpenGL but I think it does... anyway, doesn't matter).
Since I have to download these images from the Intertubes (actually, from YouTube), I can't really do anything beforehand, so I have these 480x360 images (if I remember correctly) and I have to splat them on my triangle strips. Fortunately we have texture mapping, which allows us to select the portion of the texture to be mapped where we want, so the obvious solution would be to (optionally) up- or downsize the source image, pad it with some matte color, and live with it.
Enter iOS. I get the data from the Intertubes, I happily build the corresponding UIImage, then I make another UIImage (yes, I know, bear with me, I'll optimize it later) just scaled down to the nearest power of 2 in width, preserving aspect, so let's say 256x192. Then I make a bitmap context, paint it black (or, for that matter, any other colour, but I think you can see why I chose black in this case), draw the UIImage (a CGImage) on it, and return the UIImage built using the aforementioned bitmap context.
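The sizing arithmetic described above can be sketched in plain C (which also compiles inside an Objective-C file). The names floor_pow2, PaddedLayout and padded_layout are made up for illustration, and the sketch assumes a landscape image (height no larger than width) that is scaled to the largest power of two not exceeding its width and placed in a square texture.

```c
#include <assert.h>

/* Largest power of two that is <= n (n must be >= 1). */
static unsigned floor_pow2(unsigned n) {
    assert(n >= 1);
    unsigned p = 1;
    while (p <= n / 2) {
        p *= 2;
    }
    return p;
}

/* Hypothetical description of how the scaled image sits in the texture. */
typedef struct {
    unsigned tex_size; /* side of the square power-of-two texture */
    unsigned img_w;    /* width of the scaled image inside it */
    unsigned img_h;    /* height of the scaled image inside it */
    float max_t;       /* largest vertical texture coordinate covering the image */
} PaddedLayout;

static PaddedLayout padded_layout(unsigned w, unsigned h) {
    assert(w >= 1 && h >= 1 && h <= w); /* landscape assumption */
    PaddedLayout l;
    l.tex_size = floor_pow2(w);
    l.img_w = l.tex_size;
    /* Preserve the aspect ratio, rounding to the nearest pixel. */
    l.img_h = (unsigned)((double)h * l.tex_size / w + 0.5);
    l.max_t = (float)l.img_h / (float)l.tex_size;
    return l;
}
```

For a 480x360 source, padded_layout(480, 360) describes a 256x256 texture in which the image occupies 256x192, so the strip would map vertical texture coordinates from 0 to 0.75 instead of 0 to 1.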
I am now the happy owner of a 256x256 image ready to be mapped on my GL_TRIANGLE_STRIP. Except that it does not work. I tried with a prepared 512x512 image and it worked flawlessly. The code I'm pasting here does not include the retrieval of the image from YouTube, I just saved it locally to rule out networking problems. Also, I'm not including the GL code as it's clearly working.
- (void)viewDidLoad {
images = [[NSMutableArray alloc] init];
//NSURL *url = [NSURL URLWithString:@"http://i.ytimg.com/vi/d2wVgzXWE9Y/0.jpg"];
NSString *path = [[NSBundle mainBundle] pathForResource:@"opengl_texture" ofType:@"jpg"];
NSData *texData = [NSData dataWithContentsOfFile:path];
UIImage *rawImage = [[UIImage alloc] initWithData:texData];
float newWidth = (float)(1 << (int)floor(log2f(rawImage.size.width)));
// Scale means the scale of the current image relative to the resulting image.
float scale = rawImage.size.width / newWidth;
UIImage *midImage = [UIImage imageWithCGImage:[rawImage CGImage] scale:scale orientation:UIImageOrientationUp];
NSLog(@"%f %f %f", midImage.size.width, midImage.size.height, scale);
[rawImage release];
UIImage *image = [self padImage:midImage withColor:[UIColor redColor]];
NSLog(@"%f %f", image.size.width, image.size.height);
[images addObject:image];
textures = malloc(sizeof(GLuint));
glGenTextures(1, textures);
glBindTexture(GL_TEXTURE_2D, textures[0]);
GLuint width = CGImageGetWidth(image.CGImage);
GLuint height = CGImageGetHeight(image.CGImage);
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
void *imageData = malloc(width * height * 4);
CGContextRef context = CGBitmapContextCreate(imageData, width, height, 8, 4*width, colorSpace, kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
CGColorSpaceRelease( colorSpace );
CGContextClearRect( context, CGRectMake( 0, 0, width, height ) );
CGContextTranslateCTM( context, 0, height - height );
CGContextDrawImage( context, CGRectMake( 0, 0, width, height ), image.CGImage );
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, imageData);
[midImage release];
[image release];
[texData release];
- (UIImage *)padImage:(UIImage *)image withColor:(UIColor *)color {
CGFloat size = round(image.size.width);
NSLog(@"%f", size);
CGContextRef bContext = [self createBitmapContextOfSize:CGSizeMake(size, size)];
CGContextSetFillColorWithColor(bContext, [color CGColor]);
CGContextFillRect(bContext, CGRectMake(0, 0, size, size));
CGContextDrawImage(bContext, CGRectMake(0, 0, size, size), [image CGImage]);
UIImage *result = [UIImage imageWithCGImage:CGBitmapContextCreateImage(bContext)];
return result;
- (CGContextRef) createBitmapContextOfSize:(CGSize) size {
CGContextRef context = NULL;
CGColorSpaceRef colorSpace;
void * bitmapData;
int bitmapByteCount;
int bitmapBytesPerRow;
bitmapBytesPerRow = (size.width * 4);
bitmapByteCount = (bitmapBytesPerRow * size.height);
colorSpace = CGColorSpaceCreateDeviceRGB();
bitmapData = malloc( bitmapByteCount );
if (bitmapData == NULL) {
fprintf (stderr, "Memory not allocated!");
return NULL;
}
context = CGBitmapContextCreate (bitmapData,
8, // bits per component
CGContextSetAllowsAntialiasing (context,NO);
if (context== NULL) {
free (bitmapData);
fprintf (stderr, "Context not created!");
return NULL;
}
CGColorSpaceRelease( colorSpace );
return context;
Please don't bother mentioning obvious memory management issues unless you think they are the core of the problem. As for the "error message" or whatever: no, there's no such thing, the whole app just crashes.
Ok, now you can collectively smack my face with a large trout.
The problem was actually memory management: I was releasing objects that I did not own (namely midImage and texData). Convenience constructors such as dataWithContentsOfFile: and imageWithCGImage:scale:orientation: return autoreleased objects that the caller must not release, while explicit creation (alloc+init and friends) returns an object the caller owns and has to release. How many times have I crashed against this already? Lots. Was that enough? Obviously not.
Second question: where can I find a large post-it, like 1x1m at least?
My use case is that a user takes a photo of themselves on their phone and uploads it to an image hosting service as a JPEG. Other users can then download that image, which is then mapped to a Metal texture for use in a game.
My issue is that if I download that image and simply display it in a UIImageView, it looks correct, but when I take the downloaded image and transform it into a Metal texture it gets mirrored and rotated 90 degrees clockwise. I understand the image getting mirrored is due to Metal having a different coordinate system, but I don't understand the rotation. When I print the details for the image that has been passed into my function, it has all the same orientation details as the image in the UIImageView that displays correctly, so I have no idea where the issue is. Attached is my function that gives me my MTLTexture.
- (id<MTLTexture>) createTextureFromImage:(UIImage*) image device:(id<MTLDevice>) device
image =[UIImage imageWithCGImage:[image CGImage]
scale:[image scale]
orientation: UIImageOrientationLeft];
NSLog(@"orientation and size and stuff %ld %f %f", (long)image.imageOrientation, image.size.width, image.size.height);
CGImageRef imageRef = image.CGImage;
size_t width = self.view.frame.size.width;
size_t height = self.view.frame.size.height;
size_t bitsPerComponent = CGImageGetBitsPerComponent(imageRef);
size_t bitsPerPixel = CGImageGetBitsPerPixel(imageRef);
CGColorSpaceRef colorSpace = CGImageGetColorSpace(imageRef);
CGImageAlphaInfo alphaInfo = CGImageGetAlphaInfo(imageRef);
// NSLog(@"%@ %u", colorSpace, alphaInfo);
CGBitmapInfo bitmapInfo = kCGBitmapByteOrderDefault | alphaInfo;
// NSLog(@"bitmap info %u", bitmapInfo);
CGContextRef context = CGBitmapContextCreate( NULL, width, height, bitsPerComponent, (bitsPerPixel / 8) * width, colorSpace, bitmapInfo);
if( !context )
{
NSLog(@"Failed to load image, probably an unsupported texture type");
return nil;
}
CGContextDrawImage( context, CGRectMake( 0, 0, width, height ), image.CGImage);
MTLPixelFormat format = MTLPixelFormatRGBA8Unorm;
MTLTextureDescriptor *texDesc = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:format
id<MTLTexture> texture = [device newTextureWithDescriptor:texDesc];
[texture replaceRegion:MTLRegionMake2D(0, 0, width, height)
bytesPerRow:4 * width];
return texture;
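In Metal, texture coordinates have their origin in the top-left corner, so the vertical axis is flipped relative to what Core Graphics and OpenGL code usually assumes; that accounts for the mirroring. The rotation most likely comes from drawing image.CGImage directly: a CGImage carries only the raw pixels and ignores the UIImage's imageOrientation, which UIImageView honors. However, you now have a much simpler way to load textures with MTKTextureLoader: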
import MetalKit
let textureLoader = MTKTextureLoader(device: device)
let texture: MTLTexture = try textureLoader.newTextureWithContentsOfURL(filePath, options: nil)
This will create a new texture for you with the appropriate coordinates using the image located at filePath. The call throws on failure, so it needs try inside a throwing function or a do/catch block. If you don't want to use an NSURL, you also have the newTextureWithData and newTextureWithCGImage options.
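If you stay on the manual bitmap-context path instead, one way to compensate for a vertical mirror is to reverse the row order of the pixel buffer before uploading it. Whether you need this depends on how your texture coordinates are set up, so treat the following as a sketch under that assumption; flip_rows is a hypothetical helper, not part of Metal or Core Graphics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Reverses the order of the rows of a pixel buffer in place.
   Returns 0 on success, -1 if the temporary row cannot be allocated. */
static int flip_rows(uint8_t *pixels, size_t height, size_t bytes_per_row) {
    assert(pixels != NULL);
    if (height < 2 || bytes_per_row == 0) {
        return 0; /* nothing to swap */
    }
    uint8_t *tmp = malloc(bytes_per_row);
    if (tmp == NULL) {
        return -1;
    }
    /* Swap the top and bottom rows, working towards the middle. */
    for (size_t top = 0, bottom = height - 1; top < bottom; top++, bottom--) {
        uint8_t *a = pixels + top * bytes_per_row;
        uint8_t *b = pixels + bottom * bytes_per_row;
        memcpy(tmp, a, bytes_per_row);
        memcpy(a, b, bytes_per_row);
        memcpy(b, tmp, bytes_per_row);
    }
    free(tmp);
    return 0;
}
```

The buffer would be the memory backing the bitmap context, with bytes_per_row matching the value passed when the context was created.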
I want to take a snapshot of the content of a CCGLView in my view controller and display the resulting image in the same view controller.
Right now I'm using the following method to do so:
-(UIImage *) drawableToCGImage{
GLint backingWidth2, backingHeight2;
//Bind the color renderbuffer used to render the OpenGL ES view
// If your application only creates a single color renderbuffer which is already bound at this point,
// this call is redundant, but it is needed if you're dealing with multiple renderbuffers.
// Note, replace "_colorRenderbuffer" with the actual name of the renderbuffer object defined in your class.
glBindRenderbufferOES(GL_RENDERBUFFER_OES, self.glView.colorRenderBuffer);
// Get the size of the backing CAEAGLLayer
glGetRenderbufferParameterivOES(GL_RENDERBUFFER_OES, GL_RENDERBUFFER_WIDTH_OES, &backingWidth2);
glGetRenderbufferParameterivOES(GL_RENDERBUFFER_OES, GL_RENDERBUFFER_HEIGHT_OES, &backingHeight2);
NSInteger x = self.glView.frame.origin.x, y = self.glView.frame.origin.y, width2 = backingWidth2, height2 = backingHeight2;
NSInteger dataLength = width2 * height2 * 4;
GLubyte *data = (GLubyte*)malloc(dataLength * sizeof(GLubyte));
// Read pixel data from the framebuffer
glPixelStorei(GL_PACK_ALIGNMENT, 4);
glReadPixels(x, y, width2, height2, GL_RGBA, GL_UNSIGNED_BYTE, data);
// Create a CGImage with the pixel data
// If your OpenGL ES content is opaque, use kCGImageAlphaNoneSkipLast to ignore the alpha channel
// otherwise, use kCGImageAlphaPremultipliedLast
CGDataProviderRef ref = CGDataProviderCreateWithData(NULL, data, dataLength, NULL);
CGColorSpaceRef colorspace = CGColorSpaceCreateDeviceRGB();
CGImageRef iref = CGImageCreate(width2, height2, 8, 32, width2 * 4, colorspace, kCGBitmapByteOrder32Big | kCGImageAlphaPremultipliedLast,
ref, NULL, true, kCGRenderingIntentDefault);
// OpenGL ES measures data in PIXELS
// Create a graphics context with the target size measured in POINTS
NSInteger widthInPoints, heightInPoints;
// On iOS 4 and later, use UIGraphicsBeginImageContextWithOptions to take the scale into consideration
// Set the scale parameter to your OpenGL ES view's contentScaleFactor
// so that you get a high-resolution snapshot when its value is greater than 1.0
CGFloat scale = self.glView.contentScaleFactor;
widthInPoints = width2 / scale;
heightInPoints = height2 / scale;
UIGraphicsBeginImageContextWithOptions(CGSizeMake(widthInPoints, heightInPoints), NO, scale);
CGContextRef cgcontext = UIGraphicsGetCurrentContext();
// UIKit coordinate system is upside down to GL/Quartz coordinate system
// Flip the CGImage by rendering it to the flipped bitmap context
// The size of the destination area is measured in POINTS
CGContextSetBlendMode(cgcontext, kCGBlendModeCopy);
CGContextDrawImage(cgcontext, CGRectMake(0.0, 0.0, widthInPoints, heightInPoints), iref);
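// Retrieve the UIImage from the current context
UIImage *image = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
// Clean up
free(data);
CFRelease(ref);
CFRelease(colorspace);
CGImageRelease(iref);
return image;
}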
But it works only in the simulator; when I test it on a device, I don't get the content of the CCGLView. Why doesn't this method produce the snapshot on a device? Or is there another way to get it done?
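One thing that may be worth checking in the method above (an assumption, not something confirmed in this thread): glReadPixels expects x and y in pixels, measured from the lower-left corner of the framebuffer, while a view's frame is in points, measured from the top-left. A minimal C sketch of that conversion, using hypothetical RectPoints and RectPixels types in place of CGRect and assuming the rectangle is expressed in the GL view's own coordinate space:

```c
#include <assert.h>

typedef struct { double x, y, width, height; } RectPoints; /* top-left origin, points */
typedef struct { int x, y, width, height; } RectPixels;    /* bottom-left origin, pixels */

/* Rounds a non-negative value to the nearest whole pixel. */
static int round_px(double v) {
    assert(v >= 0.0);
    return (int)(v + 0.5);
}

/* Converts a rectangle in points into the pixel rectangle to read back.
   scale is the view's content scale factor, backing_height_px the
   renderbuffer height in pixels. */
static RectPixels gl_read_rect(RectPoints r, double scale, int backing_height_px) {
    assert(scale > 0.0);
    RectPixels out;
    out.x = round_px(r.x * scale);
    out.width = round_px(r.width * scale);
    out.height = round_px(r.height * scale);
    /* Flip Y: distance from the bottom of the buffer to the rect's bottom edge. */
    out.y = backing_height_px - round_px((r.y + r.height) * scale);
    return out;
}
```

For a rectangle at (10, 20) of size 100x50 points, a scale of 2 and a 960-pixel-high buffer, this gives x = 20, y = 820, width = 200, height = 100.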
I don't know why the previous method didn't work, but I found another way of doing it, and it's less expensive too :). I'm using the following method:
- (UIImage *)snapshot:(UIView *)view{
UIGraphicsBeginImageContextWithOptions(view.bounds.size, YES, 0);
[view drawViewHierarchyInRect:view.bounds afterScreenUpdates:YES];
UIImage *image = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
return image;
}
For more info, see the following link: https://developer.apple.com/library/ios/qa/qa1817/_index.html
I want to take a screenshot of OpenGL ES and UIKit content at the same time, and after a lot of research I found a way that looks exactly like this:
- (UIImage *)makeScreenshot {
GLint backingWidth, backingHeight;
// Bind the color renderbuffer used to render the OpenGL ES view
// If your application only creates a single color renderbuffer which is already bound at this point,
// this call is redundant, but it is needed if you're dealing with multiple renderbuffers.
// Note, replace "_colorRenderbuffer" with the actual name of the renderbuffer object defined in your class.
// glBindRenderbufferOES(GL_RENDERBUFFER_OES, _colorRenderbuffer);
// Get the size of the backing CAEAGLLayer
glGetRenderbufferParameterivOES(GL_RENDERBUFFER_OES, GL_RENDERBUFFER_WIDTH_OES, &backingWidth);
glGetRenderbufferParameterivOES(GL_RENDERBUFFER_OES, GL_RENDERBUFFER_HEIGHT_OES, &backingHeight);
// NSInteger x = 0, y = 0, width = backingWidth, height = backingHeight;
NSInteger x = _visibleFrame.origin.x, y = _visibleFrame.origin.y, width = _visibleFrame.size.width, height = _visibleFrame.size.height;
NSInteger dataLength = width * height * 4;
GLubyte *data = (GLubyte*)malloc(dataLength * sizeof(GLubyte));
// Read pixel data from the framebuffer
glPixelStorei(GL_PACK_ALIGNMENT, 4);
glReadPixels(x, y, width, height, GL_RGBA, GL_UNSIGNED_BYTE, data);
// Create a CGImage with the pixel data
// If your OpenGL ES content is opaque, use kCGImageAlphaNoneSkipLast to ignore the alpha channel
// otherwise, use kCGImageAlphaPremultipliedLast
CGDataProviderRef ref = CGDataProviderCreateWithData(NULL, data, dataLength, NULL);
CGColorSpaceRef colorspace = CGColorSpaceCreateDeviceRGB();
// CGImageRef iref = CGImageCreate(width, height, 8, 32, width * 4, colorspace, kCGBitmapByteOrder32Big | kCGImageAlphaPremultipliedLast, ref, NULL, true, kCGRenderingIntentDefault);
CGImageRef iref = CGImageCreate(width, height, 8, 32, width * 4, colorspace, kCGBitmapByteOrder32Big | kCGImageAlphaNoneSkipLast, ref, NULL, true, kCGRenderingIntentDefault);
// OpenGL ES measures data in PIXELS
// Create a graphics context with the target size measured in POINTS
NSInteger widthInPoints, heightInPoints;
if (NULL != UIGraphicsBeginImageContextWithOptions) {
// On iOS 4 and later, use UIGraphicsBeginImageContextWithOptions to take the scale into consideration
// Set the scale parameter to your OpenGL ES view's contentScaleFactor
// so that you get a high-resolution snapshot when its value is greater than 1.0
CGFloat scale = _baseView.contentScaleFactor;
widthInPoints = width / scale;
heightInPoints = height / scale;
UIGraphicsBeginImageContextWithOptions(CGSizeMake(widthInPoints, heightInPoints), NO, scale);
}
else {
// On iOS prior to 4, fall back to use UIGraphicsBeginImageContext
widthInPoints = width;
heightInPoints = height;
UIGraphicsBeginImageContext(CGSizeMake(widthInPoints, heightInPoints));
}
CGContextRef cgcontext = UIGraphicsGetCurrentContext();
// UIKit coordinate system is upside down to GL/Quartz coordinate system
// Flip the CGImage by rendering it to the flipped bitmap context
// The size of the destination area is measured in POINTS
CGContextSetBlendMode(cgcontext, kCGBlendModeCopy);
CGContextDrawImage(cgcontext, CGRectMake(0.0, 0.0, widthInPoints, heightInPoints), iref);
// Retrieve the UIImage from the current context
UIImage *image = UIGraphicsGetImageFromCurrentImageContext();
// Clean up
// return image;
UIImageView *GLImage = [[UIImageView alloc] initWithImage:image];
//order of getting the context depends on what should be rendered first.
// this draws the UIKit on top of the gl image
[GLImage.layer renderInContext:UIGraphicsGetCurrentContext()];
CGContextTranslateCTM(UIGraphicsGetCurrentContext(), -_visibleFrame.origin.x, -_visibleFrame.origin.y);
[_baseView.layer renderInContext:UIGraphicsGetCurrentContext()];
UIImage *finalImage = UIGraphicsGetImageFromCurrentImageContext();
// Do something with resulting image
return finalImage;
The interesting part is the merging section, where I have two blocks: the first generates the OpenGL ES image and the second merges it with the UIKit image. Is there a better way to do this with a single UIGraphicsBeginImageContext(); ... UIGraphicsEndImageContext(); block, rather than creating a UIImageView and then rendering it?
Something like:
CGContextRef cgcontext = UIGraphicsGetCurrentContext();
// UIKit coordinate system is upside down to GL/Quartz coordinate system
// Flip the CGImage by rendering it to the flipped bitmap context
// The size of the destination area is measured in POINTS
CGContextSetBlendMode(cgcontext, kCGBlendModeCopy);
CGContextDrawImage(cgcontext, CGRectMake(0.0, 0.0, widthInPoints, heightInPoints), iref);
// the merging part starts
CGContextTranslateCTM(UIGraphicsGetCurrentContext(), -_visibleFrame.origin.x, -_visibleFrame.origin.y);
[_baseView.layer renderInContext:UIGraphicsGetCurrentContext()];
// the merging part ends
// Retrieve the UIImage from the current context
UIImage *image = UIGraphicsGetImageFromCurrentImageContext();
Unfortunately, it's not merging. Can anyone correct the mistake here and/or suggest the best way to do this?
With iOS 7 Apple introduced UISnapshotting and they claim it's really fast, much faster than renderInContext:.
UIView *snapshot = [view snapshotViewAfterScreenUpdates:YES];
This method captures the current visual contents of the screen from
the render server and uses them to build a new snapshot view. You can
use the returned snapshot view as a visual stand-in for the screen’s
contents in your app. (...) this method is faster than trying to
render the contents of the screen into a bitmap image yourself.
Moreover, have a look at the links below. They should give you some insights and point you in the right direction.
Implementing Engaging UI on iOS from WWDC 2013, slides 32-41
How to render view into image faster?
Here's the question in brief:
For some layer compositing, I have to render an OpenGL texture in a CGContext. What's the fastest way to do that?
Thoughts so far:
Obviously, calling renderInContext won't capture OpenGL content, and glReadPixels is too slow.
For some 'context', I'm calling this method in a delegate class of a layer:
- (void) drawLayer:(CALayer *)layer inContext:(CGContextRef)ctx
I've considered using a CVOpenGLESTextureCache, but that requires an additional rendering, and it seems like some complicated conversion would be necessary post-rendering.
Here's my (terrible) implementation right now:
glBindRenderbuffer(GL_RENDERBUFFER, displayRenderbuffer);
NSInteger x = 0, y = 0, width = backingWidth, height = backingHeight;
NSInteger dataLength = width * height * 4;
GLubyte *data = (GLubyte *) malloc(dataLength * sizeof(GLubyte));
glPixelStorei(GL_PACK_ALIGNMENT, 4);
glReadPixels(x, y, width, height, GL_RGBA, GL_UNSIGNED_BYTE, data);
CGDataProviderRef ref = CGDataProviderCreateWithData(NULL, data, dataLength, NULL);
CGColorSpaceRef colorspace = CGColorSpaceCreateDeviceRGB();
CGImageRef iref = CGImageCreate(width, height, 8, 32, width * 4, colorspace, kCGBitmapByteOrder32Big | kCGImageAlphaPremultipliedLast,
ref, NULL, true, kCGRenderingIntentDefault);
CGFloat scale = self.contentScaleFactor;
NSInteger widthInPoints, heightInPoints;
widthInPoints = width / scale;
heightInPoints = height / scale;
CGContextSetBlendMode(context, kCGBlendModeCopy);
CGContextDrawImage(context, CGRectMake(0.0, 0.0, widthInPoints, heightInPoints), iref);
// Clean up
For anyone curious, the method shown above is not the fastest way.
When a UIView is asked for its contents, it will ask its layer (usually a CALayer) to draw them for it. The exception: OpenGL-based views, which use a CAEAGLLayer (a subclass of CALayer), go through the same method but return nothing. No drawing happens.
So, if you call:
[someUIView.layer drawInContext:someContext];
it will work, while
[someOpenGLView.layer drawInContext:someContext];
won't.
This also becomes an issue if you're asking a superview of any OpenGL-based view for its content: it will recursively ask each of its subviews for theirs, and any subview that uses a CAEAGLLayer will hand back nothing (you'll see a black rectangle).
I set out above to find an implementation of a delegate method of CALayer, drawLayer:inContext:, which I could use in any OpenGL-based views so that the view object itself would provide its contents (rather than the layer). The delegate method is called automatically: Apple expects it to work this way.
Where performance isn't an issue, you can implement a variation of a simple snapshot method in your view. The method would look like this:
- (void) drawLayer:(CALayer *)layer inContext:(CGContextRef)ctx {
GLint backingWidth, backingHeight;
glBindRenderbufferOES(GL_RENDERBUFFER_OES, _colorRenderbuffer);
glGetRenderbufferParameterivOES(GL_RENDERBUFFER_OES, GL_RENDERBUFFER_WIDTH_OES, &backingWidth);
glGetRenderbufferParameterivOES(GL_RENDERBUFFER_OES, GL_RENDERBUFFER_HEIGHT_OES, &backingHeight);
NSInteger x = 0, y = 0, width = backingWidth, height = backingHeight;
NSInteger dataLength = width * height * 4;
GLubyte *data = (GLubyte*)malloc(dataLength * sizeof(GLubyte));
// Read pixel data from the framebuffer
glPixelStorei(GL_PACK_ALIGNMENT, 4);
glReadPixels(x, y, width, height, GL_RGBA, GL_UNSIGNED_BYTE, data);
CGDataProviderRef ref = CGDataProviderCreateWithData(NULL, data, dataLength, NULL);
CGColorSpaceRef colorspace = CGColorSpaceCreateDeviceRGB();
CGImageRef iref = CGImageCreate(width, height, 8, 32, width * 4, colorspace, kCGBitmapByteOrder32Big | kCGImageAlphaPremultipliedLast,
ref, NULL, true, kCGRenderingIntentDefault);
CGContextDrawImage(ctx, self.bounds, iref);
BUT! This is not performance-effective.
glReadPixels, as noted just about everywhere, is not a fast call. Starting in iOS 5, Apple exposed CVOpenGLESTextureCacheRef - basically, a shared buffer that can be used both as a CVPixelBufferRef and as an OpenGL texture. Originally, it was designed to be used as a way of getting an OpenGL texture from a video frame: now it's more often used in reverse, to get a video frame from a texture.
So a much better implementation of the above idea is to take the CVPixelBufferRef that backs the texture created by CVOpenGLESTextureCacheCreateTextureFromImage, get direct access to its pixels, and wrap them in a CGImage that you cache and draw into your context in the delegate method above.
The code is here. On each rendering pass, you draw your texture into the texture cache, which is linked to the CVPixelBufferRef:
- (void) renderToCGImage {
// Setup the drawing
[ochrContext useProcessingContext];
glBindFramebuffer(GL_FRAMEBUFFER, layerRenderingFramebuffer);
glViewport(0, 0, (int) self.frame.size.width, (int) self.frame.size.height);
[ochrContext setActiveShaderProgram:layerRenderingShaderProgram];
// Do the actual drawing
glBindTexture(GL_TEXTURE_2D, self.inputTexture);
glUniform1i(layerRenderingInputTextureUniform, 4);
glVertexAttribPointer(layerRenderingShaderPositionAttribute, 2, GL_FLOAT, 0, 0, kRenderTargetVertices);
glVertexAttribPointer(layerRenderingShaderTextureCoordinateAttribute, 2, GL_FLOAT, 0, 0, kRenderTextureVertices);
// Draw and finish up
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
// Try running this code asynchronously to improve performance
dispatch_async(PixelBufferReadingQueue, ^{
// Lock the base address (can't get the address without locking it).
CVPixelBufferLockBaseAddress(renderTarget, 0);
// Get a pointer to the pixels
uint32_t * pixels = (uint32_t*) CVPixelBufferGetBaseAddress(renderTarget);
// Wrap the pixel data in a data-provider object.
CGDataProviderRef pixelWrapper = CGDataProviderCreateWithData(NULL, pixels, CVPixelBufferGetDataSize(renderTarget), NULL);
// Get a color-space ref... can't this be done only once?
CGColorSpaceRef colorSpaceRef = CGColorSpaceCreateDeviceRGB();
// Release the existing CGImage
// Get a CGImage from the data (the CGImage is used in the drawLayer: delegate method above)
currentCGImage = CGImageCreate(self.frame.size.width,
4 * self.frame.size.width,
kCGBitmapByteOrder32Big | kCGImageAlphaPremultipliedLast,
// Clean up
CVPixelBufferUnlockBaseAddress(renderTarget, 0);
And then implement the delegate method very simply:
- (void) drawLayer:(CALayer *)layer inContext:(CGContextRef)ctx {
CGContextDrawImage(ctx, self.bounds, currentCGImage);
I would like to implement an OCR application that recognizes text from photos.
I succeeded in compiling and integrating the Tesseract engine on iOS, and I get reasonable detection when photographing clear documents (or a photo of the same text taken from the screen), but for other text, such as signposts, shop signs, or text on a coloured background, the detection fails.
The question is: what kind of image-processing preparation is necessary to get better recognition? For example, I expect that we need to convert the images to grayscale or black and white, as well as fix the contrast, etc.
How can this be done on iOS? Is there a package for this?
I'm currently working on the same thing.
I found that a PNG saved in Photoshop worked fine, but an image that was originally sourced from the camera and then imported into the app never worked.
Don't ask me to explain it - but applying this function made these images work. Maybe it'll work for you too.
// this does the trick to have tesseract accept the UIImage.
UIImage * gs_convert_image (UIImage * src_img) {
CGColorSpaceRef d_colorSpace = CGColorSpaceCreateDeviceRGB();
/*
 * Note we specify 4 bytes per pixel here even though we ignore the
 * alpha value; you can't specify 3 bytes per-pixel.
 */
size_t d_bytesPerRow = src_img.size.width * 4;
unsigned char * imgData = (unsigned char*)malloc(src_img.size.height*d_bytesPerRow);
CGContextRef context = CGBitmapContextCreate(imgData, src_img.size.width,
8, d_bytesPerRow,
// These next two lines 'flip' the drawing so it doesn't appear upside-down.
CGContextTranslateCTM(context, 0.0, src_img.size.height);
CGContextScaleCTM(context, 1.0, -1.0);
// Use UIImage's drawInRect: instead of the CGContextDrawImage function, otherwise you'll have issues when the source image is in portrait orientation.
[src_img drawInRect:CGRectMake(0.0, 0.0, src_img.size.width, src_img.size.height)];
/*
 * At this point, we have the raw ARGB pixel data in the imgData buffer, so
 * we can perform whatever image processing here.
 */
// After we've processed the raw data, turn it back into a UIImage instance.
CGImageRef new_img = CGBitmapContextCreateImage(context);
UIImage * convertedImage = [[UIImage alloc] initWithCGImage:
return convertedImage;
I've also done a lot of experimentation preparing the image for Tesseract. Resizing, converting to grayscale, then adjusting brightness and contrast seems to work best.
I've also tried the GPUImage library: https://github.com/BradLarson/GPUImage
The GPUImageAverageLuminanceThresholdFilter seems to give me a great adjusted image, but Tesseract doesn't seem to work well with it.
I've also put OpenCV into my project and plan to try out its image routines, possibly even some box detection to find the text area (I'm hoping this will speed up Tesseract).
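The brightness and contrast step mentioned above could look roughly like the following C sketch, which operates on an 8-bit grayscale buffer. It uses a simple linear adjustment around mid-gray; that formula is an assumption chosen for illustration, not what GPUImage or OpenCV do internally, and adjust_gray and clamp_u8 are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamps a value to 0..255 and rounds it to the nearest integer. */
static uint8_t clamp_u8(double v) {
    if (v < 0.0) return 0;
    if (v > 255.0) return 255;
    return (uint8_t)(v + 0.5);
}

/* Linear adjustment of an 8-bit grayscale buffer, in place:
   contrast scales the distance from mid-gray (128),
   brightness is then added as a constant offset. */
static void adjust_gray(uint8_t *gray, size_t count, double contrast, double brightness) {
    assert(gray != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        double v = ((double)gray[i] - 128.0) * contrast + 128.0 + brightness;
        gray[i] = clamp_u8(v);
    }
}
```

With contrast 2.0 and brightness 10, a pixel of 100 becomes 82 and a pixel of 128 becomes 138, while values pushed outside 0..255 are clamped. Good settings depend on the images, so they would have to be found by experiment, as the post describes.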
I have used the code above but added two other function calls as well, to convert the image so that it will work with Tesseract.
First I used an image resize routine to convert to 640 x 640, which seems to be more manageable for Tesseract.
-(UIImage *)resizeImage:(UIImage *)image {
CGImageRef imageRef = [image CGImage];
CGImageAlphaInfo alphaInfo = CGImageGetAlphaInfo(imageRef);
CGColorSpaceRef colorSpaceInfo = CGColorSpaceCreateDeviceRGB();
if (alphaInfo == kCGImageAlphaNone)
alphaInfo = kCGImageAlphaNoneSkipLast;
int width, height;
width = 640;//[image size].width;
height = 640;//[image size].height;
CGContextRef bitmap;
if (image.imageOrientation == UIImageOrientationUp || image.imageOrientation == UIImageOrientationDown) {
// Passing 0 for bytesPerRow lets Core Graphics compute it for the new width
bitmap = CGBitmapContextCreate(NULL, width, height, CGImageGetBitsPerComponent(imageRef), 0, colorSpaceInfo, alphaInfo);
} else {
bitmap = CGBitmapContextCreate(NULL, height, width, CGImageGetBitsPerComponent(imageRef), 0, colorSpaceInfo, alphaInfo);
}
if (image.imageOrientation == UIImageOrientationLeft) {
NSLog(@"image orientation left");
CGContextRotateCTM (bitmap, radians(90));
CGContextTranslateCTM (bitmap, 0, -height);
} else if (image.imageOrientation == UIImageOrientationRight) {
NSLog(@"image orientation right");
CGContextRotateCTM (bitmap, radians(-90));
CGContextTranslateCTM (bitmap, -width, 0);
} else if (image.imageOrientation == UIImageOrientationUp) {
NSLog(@"image orientation up");
} else if (image.imageOrientation == UIImageOrientationDown) {
NSLog(@"image orientation down");
CGContextTranslateCTM (bitmap, width,height);
CGContextRotateCTM (bitmap, radians(-180.));
}
CGContextDrawImage(bitmap, CGRectMake(0, 0, width, height), imageRef);
CGImageRef ref = CGBitmapContextCreateImage(bitmap);
UIImage *result = [UIImage imageWithCGImage:ref];
return result;
For radians to work, make sure you declare it above the @implementation:
static inline double radians (double degrees) {return degrees * M_PI/180;}
Then I convert to grayscale.
I found this article, Convert image to grayscale, on converting to grayscale.
I have used the code from there successfully and can now read text in different colours on different colour backgrounds.
I have modified the code slightly to work as a method within a class rather than as its own class, as the original author had it.
- (UIImage *) toGrayscale:(UIImage*)img
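// With kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedLast the bytes
// of each pixel sit in memory as A, B, G, R.
const int RED = 3;
const int GREEN = 2;
const int BLUE = 1;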
// Create image rectangle with current image width/height
CGRect imageRect = CGRectMake(0, 0, img.size.width * img.scale, img.size.height * img.scale);
int width = imageRect.size.width;
int height = imageRect.size.height;
// the pixels will be painted to this array
uint32_t *pixels = (uint32_t *) malloc(width * height * sizeof(uint32_t));
// clear the pixels so any transparency is preserved
memset(pixels, 0, width * height * sizeof(uint32_t));
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
// create a context with RGBA pixels
CGContextRef context = CGBitmapContextCreate(pixels, width, height, 8, width * sizeof(uint32_t), colorSpace,
kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedLast);
// paint the bitmap to our context which will fill in the pixels array
CGContextDrawImage(context, CGRectMake(0, 0, width, height), [img CGImage]);
for(int y = 0; y < height; y++) {
for(int x = 0; x < width; x++) {
uint8_t *rgbaPixel = (uint8_t *) &pixels[y * width + x];
// convert to grayscale using recommended method: http://en.wikipedia.org/wiki/Grayscale#Converting_color_to_grayscale
uint32_t gray = 0.3 * rgbaPixel[RED] + 0.59 * rgbaPixel[GREEN] + 0.11 * rgbaPixel[BLUE];
// set the pixels to gray
rgbaPixel[RED] = gray;
rgbaPixel[GREEN] = gray;
rgbaPixel[BLUE] = gray;
// create a new CGImageRef from our context with the modified pixels
CGImageRef image = CGBitmapContextCreateImage(context);
// we're done with the context, color space, and pixels
// make a new UIImage to return
UIImage *resultUIImage = [UIImage imageWithCGImage:image
// we're done with image now too
return resultUIImage;