GPUImage performance - ios

I'm using GPUImage to apply filters and filter chains to images. A UISlider changes the filter values, and I reapply the filters continuously as the slider's value changes so that the user can see the output as they adjust it.
This makes processing very slow, and sometimes the UI hangs or the app even crashes after receiving a low memory warning.
How can I implement fast filtering with GPUImage? I have seen some apps that apply filters on the fly without their UI hanging for even a second.
Thanks,
Here's the sample code which I'm using as slider's value changes.
- (IBAction)foregroundSliderValueChanged:(id)sender {
    float value = ([(UISlider *)sender maximumValue] - [(UISlider *)sender value]) + [(UISlider *)sender minimumValue];
    [(GPUImageVignetteFilter *)self.filter setVignetteEnd:value];
    GPUImagePicture *filteredImage = [[GPUImagePicture alloc] initWithImage:_image];
    [filteredImage addTarget:self.filter];
    [filteredImage processImage];
    self.imageView.image = [self.filter imageFromCurrentlyProcessedOutputWithOrientation:_image.imageOrientation];
}

You haven't specified how you set up your filter chain, what filters you use, or how you're doing your updates, so it's hard to provide anything but the most generic advice. Still, here goes:
If processing an image for display to the screen, never use a UIImageView. Converting to and from a UIImage is an extremely slow process, and one that should never be used for live updates of anything. Instead, go GPUImagePicture -> filters -> GPUImageView. This keeps the image on the GPU and is far more efficient, processing- and memory-wise.
Only process as many pixels as you actually will be displaying. Use -forceProcessingAtSize: or -forceProcessingAtSizeRespectingAspectRatio: on the first filter in your chain to reduce its resolution to the output resolution of your GPUImageView. This will cause your filters to operate on image frames that are usually many times smaller than your full-resolution source image. There's no reason to process pixels you'll never see. You can then pass in a 0 size to these same methods when you need to finally capture the full-resolution image to disk.
Find more efficient ways of setting up your filter chain. If you have a common set of simple operations that you apply over and over to your images, think about creating a custom shader that combines these operations, as appropriate. Expensive operations also sometimes have a cheaper substitute, like how I use a downsampling-then-upsampling pass for GPUImageiOSBlur to use a much smaller blur radius than I would with a stock GPUImageGaussianBlur.
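To put a number on the second point, here is a small sketch in plain C that computes how many times fewer pixels the filters touch when processing is forced down to the view's pixel size. The function name and the example dimensions are made up for illustration; they are not part of GPUImage.

```c
/* Ratio of pixels processed at full source resolution versus at the
   display resolution. All four arguments are sizes in pixels.
   Hypothetical helper for illustration; not a GPUImage API. */
double pixel_reduction_factor(int src_w, int src_h, int view_w, int view_h) {
    double source_pixels = (double)src_w * (double)src_h;
    double view_pixels = (double)view_w * (double)view_h;
    return source_pixels / view_pixels;
}
```

For an assumed 3264 x 2448 photo shown in a 640 x 960 pixel view, pixel_reduction_factor(3264, 2448, 640, 960) comes out at roughly 13, so each filter pass would handle about one thirteenth of the pixels.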

Related

GPU Memory crash with GPUImage

I've been struggling with this issue for the past 3 days and I can't figure out why I get memory warnings and crashes from image filtering, or how to solve them.
It's that simple: the user can choose between different filters, and then save the picture.
I had different problems which I managed to overcome, and here is my final code (with everything unimportant removed).
- (void)viewDidLoad {
    // Setting up the filter chain, it's better to do it now to avoid lags
    // if we had done it in the "filter" methods, about 3 sec lag.
    stillImageSource = [[GPUImagePicture alloc] initWithImage:_photo];
    GPUImageBrightnessFilter *soft;
    GPUImageContrastFilter *medium;
    soft = [[GPUImageBrightnessFilter alloc] init];
    medium = [[GPUImageContrastFilter alloc] init];
    soft.brightness = -0.5;
    medium.contrast = 1.6;
}

- (IBAction)mediumFilter:(id)sender {
    [stillImageSource addTarget:medium];
    [medium useNextFrameForImageCapture];
    [stillImageSource processImage];
    //[medium endProcessing]; // Not sure what this does but i've seen it around.
    [self.imgPicture setImage:[medium imageFromCurrentFramebuffer]];
    [stillImageSource removeAllTargets];
}

- (IBAction)denseFilter:(id)sender {
    [stillImageSource addTarget:soft];
    [soft addTarget:medium];
    [medium useNextFrameForImageCapture];
    [stillImageSource processImage];
    //[soft endProcessing];
    //[medium endProcessing]; // Not sure what this does but i've seen it around.
    [self.imgPicture setImage:[medium imageFromCurrentFramebuffer]];
    [stillImageSource removeAllTargets];
}

- (IBAction)softFilter:(id)sender {
    [stillImageSource addTarget:soft];
    [soft useNextFrameForImageCapture];
    [stillImageSource processImage];
    //[soft endProcessing];
    [self.imgPicture setImage:[soft imageFromCurrentFramebuffer]];
    [stillImageSource removeAllTargets];
}
The user can freely click on the different filters. If he uses one every 3 seconds, it only gives memory warnings (using an empty iPhone 4S with no other app running). If he uses 3 in a row, the app crashes with a memory warning.
The memory shown by Xcode is at about 50M, which is okay considering we're dealing with an image.
What I've tried:
- Allocating and initializing the image and the filter inside the filter method. This gets memory down to 8M, but the user gets a good 3 second lag and still gets memory warnings. At the end of the filter method I'd set everything to nil, and this is interesting: sometimes memory goes back down to 8M, but most of the time it just stacks up to 40 to 75/80.
- Using the default filters. It works fine but it takes about 10 to 15 seconds to filter an image. My user will be dead by the time his image is filtered, so that can't be done.
- Setting everything to nil. Still, at the next view, the Xcode memory gauge shows about 70, and that's just regular memory. Since I've managed to get memory warnings and crashes with only 8, I'm pretty sure the GPU memory is also in the red.
- Reducing the size and compressing the image that I'm working with. It didn't change much. It reduces the memory usage (shown by Xcode) to about 45, but I still get the very same memory warnings and crashes after 3 or 4 filters.
Note: if I go very slowly, I really can apply as many filters as I want and I only get memory warnings, but no crash.
I'm open to suggestions and questions; I'm quite out of ideas. It's really just applying the most basic filter to a really classic picture. I can show other bits of code if necessary.
Here's what I replied to the email you just sent me:
Well, one large problem here is that you’re going to and from UIImages. Every time you do that, you’re allocating a very large image and you’re not giving some of my framebuffer optimizations a chance to run. This will cause large memory spikes, particularly if you are setting these resulting images to a UIImageView and thus holding on to them.
Rather than creating a new GPUImagePicture every time you want to filter it, create the input GPUImagePicture once and chain filters off of it using addTarget:. Replace your UIImageView with a GPUImageView and target your filtered images at that. When you need to update your filter options, either swap out the filters or set the new options on your filters, then call -processImage on the original source image. This will cause all image processing to fully reside on the GPU (being a lot faster) and will be far more memory efficient.
I’d also recommend using -forceProcessingAtSize: or -forceProcessingAtSizeRespectingAspectRatio: on your first filter in your chain, and set that to the target pixel size of your view. There’s no sense in processing images at a higher resolution than you’ll display. This will also dramatically reduce filtering time and memory usage. When you need to capture your final image to disk, you can reset the image size to 0,0 to remove these constraints and get the full resolution image out.

Effectively scaling multiple images in iOS

I have 15 images being displayed on a single view. I need to scale the images based on the user's voice (the louder they speak, the larger the images need to scale). At the moment I am using averagePowerForChannel on the AVAudioRecorder and frequently sampling the audio to scale all the images appropriately. The code I'm using to do the scaling looks something like this:
- (void)scaleImages:(float)scalingFactor {
    for (UIView *imageHolder in self.imageView.subviews) {
        UIView *image = [imageHolder.subviews objectAtIndex:0];
        image.transform = CGAffineTransformMakeScale(scalingFactor, scalingFactor);
        image.hidden = scalingFactor <= 0.0f;
    }
}
This works fine when I have a single image, but when I do this for all 15 images it becomes incredibly laggy and unresponsive. I have tried several different options - sampling less frequently, normalizing the sampling output, etc but nothing seems to make a difference.
How would I optimize this?
You might want to try the GPUImage framework. It uses the GPU (through OpenGL ES) to accelerate image processing.
https://github.com/BradLarson/GPUImage

GPUImage: Way to output tiled images?

I use GPUImage (https://github.com/BradLarson/GPUImage) in my iOS project and really like it.
Now I use it to process image with a filter (only color changes, no scaling/transform), and use the layer from the GPUImageView output to do something else, so my chain looks like:
GPUImagePicture -> (Color Filter) -> GPUImageView.
Now I want to change the output to be tiled images, where the rendered result will be used as a pattern. I have considered a few ways to do it:
Just use Quartz2D to generate tiled images to GPUImagePicture, then process it (so the result will also be tiled). But since GPUImagePicture will redraw using Quartz2D again, it could be less efficient. Am I right?
Modify or subclass GPUImageView to generate tiled result using OpenGL. It could be hard and I cannot figure out a good way to implement it.
Which will be better and is there any other way to do it?

CGContextDrawImage is EXTREMELY slow after large UIImage drawn into it

It seems that CGContextDrawImage(CGContextRef, CGRect, CGImageRef) performs MUCH WORSE when drawing a CGImage that was created by CoreGraphics (i.e. with CGBitmapContextCreateImage) than it does when drawing the CGImage which backs a UIImage. See this testing method:
-(void)showStrangePerformanceOfCGContextDrawImage
{
    ///Setup : Load an image and start a context:
    UIImage *theImage = [UIImage imageNamed:@"reallyBigImage.png"];
    UIGraphicsBeginImageContext(theImage.size);
    CGContextRef ctxt = UIGraphicsGetCurrentContext();
    CGRect imgRec = CGRectMake(0, 0, theImage.size.width, theImage.size.height);

    ///Why is this SO MUCH faster...
    NSDate * startingTimeForUIImageDrawing = [NSDate date];
    CGContextDrawImage(ctxt, imgRec, theImage.CGImage); //Draw existing image into context Using the UIImage backing
    NSLog(@"Time was %f", [[NSDate date] timeIntervalSinceDate:startingTimeForUIImageDrawing]);

    /// Create a new image from the context to use this time in CGContextDrawImage:
    CGImageRef theImageConverted = CGBitmapContextCreateImage(ctxt);

    ///This is WAY slower but why?? Using a pure CGImageRef (as opposed to one behind a UIImage) seems like it should be faster but AT LEAST it should be the same speed!?
    NSDate * startingTimeForNakedGImageDrawing = [NSDate date];
    CGContextDrawImage(ctxt, imgRec, theImageConverted);
    NSLog(@"Time was %f", [[NSDate date] timeIntervalSinceDate:startingTimeForNakedGImageDrawing]);
}
So I guess the question is, #1 what may be causing this and #2 is there a way around it, i.e. other ways to create a CGImageRef which may be faster? I realize I could convert everything to UIImages first but that is such an ugly solution. I already have the CGContextRef sitting there.
UPDATE : This seems to not necessarily be true when drawing small images? That may be a clue: this problem is amplified when large images (i.e. fullsize camera pics) are used. 640x480 seems to be pretty similar in terms of execution time with either method.
UPDATE 2 : Ok, so I've discovered something new. It's actually NOT the backing of the CGImage that is changing the performance. I can flip-flop the order of the 2 steps and make the UIImage method behave slowly, whereas the "naked" CGImage will be super fast. It seems whichever you perform second will suffer from terrible performance. This seems to be the case UNLESS I free memory by calling CGImageRelease on the image I created with CGBitmapContextCreateImage. Then the UIImage backed method will be fast subsequently. The inverse is not true. What gives? "Crowded" memory shouldn't affect performance like this, should it?
UPDATE 3 : Spoke too soon. The previous update holds true for images at size 2048x2048 but stepping up to 1936x2592 (camera size) the naked CGImage method is still way slower, regardless of order of operations or memory situation. Maybe there are some CG internal limits that make a 16MB image efficient whereas the 21MB image can't be handled efficiently. It's literally 20 times slower to draw the camera size than a 2048x2048. Somehow UIImage provides its CGImage data much faster than a pure CGImage object does. o.O
UPDATE 4 : I thought this might have to do with some memory caching thing, but the results are the same whether the UIImage is loaded with the non-caching [UIImage imageWithContentsOfFile] or with [UIImage imageNamed].
UPDATE 5 (Day 2) : After creating more questions than were answered yesterday, I have something solid today. What I can say for sure is the following:
The CGImages behind a UIImage don't use alpha. (kCGImageAlphaNoneSkipLast). I thought that maybe they were faster to be drawn because my context WAS using alpha. So I changed the context to use kCGImageAlphaNoneSkipLast. This makes the drawing MUCH faster, UNLESS:
Drawing into a CGContextRef with a UIImage FIRST, makes ALL subsequent image drawing slow
I proved this by 1)first creating a non-alpha context (1936x2592). 2) Filled it with randomly colored 2x2 squares. 3) Full frame drawing a CGImage into that context was FAST (.17 seconds) 4) Repeated experiment but filled context with a drawn CGImage backing a UIImage. Subsequent full frame image drawing was 6+ seconds. SLOWWWWW.
Somehow drawing into a context with a (Large) UIImage drastically slows all subsequent drawing into that context.
Well, after a TON of experimentation I think I have found the fastest way to handle situations like this. The drawing operation above, which was taking 6+ seconds, now takes .1 seconds. YES. Here's what I discovered:
Homogenize your contexts & images with a pixel format! The root of the question I asked boiled down to the fact that the CGImages inside a UIImage were using THE SAME PIXEL FORMAT as my context. Therefore fast. The CGImages were a different format and therefore slow. Inspect your images with CGImageGetAlphaInfo to see which pixel format they use. I'm using kCGImageAlphaNoneSkipLast EVERYWHERE now as I don't need to work with alpha. If you don't use the same pixel format everywhere, when drawing an image into a context Quartz will be forced to perform expensive pixel-conversions for EACH pixel. = SLOW
USE CGLayers! These make offscreen-drawing performance much better. How this works is basically as follows. 1) Create a CGLayer from the context using CGLayerCreateWithContext. 2) Do any drawing/setting of drawing properties on THIS LAYER's CONTEXT, which is gotten with CGLayerGetContext. READ any pixels or information from the ORIGINAL context. 3) When done, "stamp" this CGLayer back onto the original context using CGContextDrawLayerAtPoint. This is FAST as long as you keep in mind:
1) Release any CGImages created from a context (i.e. those created with CGBitmapContextCreateImage) BEFORE "stamping" your layer back into the CGContextRef using CGContextDrawLayerAtPoint. This creates a 3-4x speed increase when drawing that layer. 2) Keep your pixel format the same everywhere!! 3) Clean up CG objects AS SOON as you can. Things hanging around in memory seem to create strange situations of slowdown, probably because there are callbacks or checks associated with these strong references. Just a guess, but I can say that CLEANING UP MEMORY ASAP helps performance immensely.
I had a similar problem. My application has to redraw a picture almost as large as the screen size. The problem came down to drawing as fast as possible two images of the same resolution, neither rotated nor flipped, but scaled and positioned in different places of the screen each time. After all, I was able to get ~15-20 FPS on iPad 1 and ~20-25 FPS on iPad4. So... hope this helps someone:
Exactly as typewriter said, you have to use the same pixel format. Using one with AlphaNone gives a speed boost. But even more important, argb32_image call in my case did numerous calls converting pixels from ARGB to BGRA. So the best bitmapInfo value for me was (at the time; there is a probability that Apple can change something here in the future):
const CGBitmapInfo g_bitmapInfo = kCGBitmapByteOrder32Little | kCGImageAlphaNoneSkipLast;
CGContextDrawImage may work faster if rectangle argument was made integral (by CGRectIntegral). Seems to have more effect when image is scaled by factor close to 1.
Using layers actually slowed down things for me. Probably something was changed since 2011 in some internal calls.
Setting the interpolation quality for the context lower than the default (with CGContextSetInterpolationQuality) is important. I would recommend using (IS_RETINA_DISPLAY ? kCGInterpolationNone : kCGInterpolationLow). The IS_RETINA_DISPLAY macro is taken from here.
Make sure you get CGColorSpaceRef from CGColorSpaceCreateDeviceRGB() or the like when creating context. Some performance issues were reported for getting fixed color space instead of requesting that of the device.
Inheriting view class from UIImageView and simply setting self.image to the image created from context proved useful to me. However, read about using UIImageView first if you want to do this, for it requires some changes in code logic (because drawRect: isn't called anymore).
And if you can avoid scaling your image at the time of actual drawing, try to do so. Drawing non-scaled image is significantly faster - unfortunately, for me that was not an option.

How to implement fast image filters on iOS platform

I am working on an iOS application where the user can apply a certain set of photo filters. Each filter is basically a set of Photoshop actions with specific parameters. These actions are:
Levels adjustment
Brightness / Contrast
Hue / Saturation
Single and multiple overlay
I've reproduced all these actions in my code using arithmetic expressions that loop through all the pixels in the image. But when I run my app on iPhone 4, each filter takes about 3-4 sec to apply, which is quite a long time for the user to wait. The image size is 640 x 640 px, which is @2x of my view size because it's displayed on a Retina display. I've found that my main problem is the levels modifications, which call the pow() C function each time I need to adjust the gamma. I am using floats, not doubles, of course, because ARMv6 and ARMv7 are slow with doubles. I tried enabling and disabling Thumb and got the same result.
Here is an example of the simplest filter in my app, which runs pretty fast though (2 secs). The other filters include more expressions and pow() calls, which makes them slow.
https://gist.github.com/1156760
I've seen some solutions that use Accelerate framework vDSP matrix transformations for fast image modifications. I've also seen OpenGL ES solutions. I am not sure that they can meet my needs. But probably it's just a matter of translating my set of changes into some good convolution matrix?
Any advice would be helpful.
Thanks,
Andrey.
For the filter in your example code, you could use a lookup table to make it much faster. I assume your input image is 8 bits per color and you are converting it to float before passing it to this function. For each color, this only gives 256 possible values and therefore only 256 possible output values. You could precompute these and store them in an array. This would avoid the pow() calculation and the bounds checking since you could factor them into the precomputation.
It would look something like this:
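/* Needs #include <math.h> for powf(). */
/* Precompute the gamma curve once: one entry per possible 8-bit value. */
unsigned char table[256];
for (int i = 0; i < 256; i++) {
    float tmp = powf((float)i / 255.0f, 1.3f) * 255.0f;
    table[i] = tmp > 255.0f ? 255 : (unsigned char)tmp;
}
/* Apply the curve with a single table lookup per color component. */
for (int i = 0; i < length; ++i)
    m_OriginalPixelBuf[i] = table[m_OriginalPixelBuf[i]];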
In this case, you only have to perform pow() 256 times instead of 3*640*640 times. You would also avoid the branching caused by the bounds checking in your main image loop which can be costly. You would not have to convert to float either.
An even faster way may be to precompute the table outside the program and just put the 256 coefficients in the code.
None of the operations you have listed there should require a convolution or even a matrix multiply. They are all pixel-wise operations, meaning that each output pixel only depends on the single corresponding input pixel. You would need to consider convolution for operations like blurring or sharpening where multiple input pixels affect a single output pixel.
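Because each of these operations maps one input value to one output value, a chain of them can be collapsed into a single 256-entry table, so the image is still walked only once. The sketch below illustrates that idea in plain C for 8-bit components. The function names and the simple linear contrast and brightness formulas are assumptions made for illustration, not Photoshop's exact math.

```c
#include <stddef.h>

/* Clamp to 0..255 and round to the nearest integer. */
static unsigned char clamp_u8(float v) {
    if (v < 0.0f) return 0;
    if (v > 255.0f) return 255;
    return (unsigned char)(v + 0.5f);
}

/* Table for a contrast change about mid-gray (assumed linear model). */
void build_contrast_table(unsigned char table[256], float contrast) {
    for (int i = 0; i < 256; i++)
        table[i] = clamp_u8(((float)i - 127.5f) * contrast + 127.5f);
}

/* Table for a brightness offset, expressed in 8-bit units. */
void build_brightness_table(unsigned char table[256], float offset) {
    for (int i = 0; i < 256; i++)
        table[i] = clamp_u8((float)i + offset);
}

/* Collapse "apply first, then second" into one table. */
void compose_tables(unsigned char out[256],
                    const unsigned char first[256],
                    const unsigned char second[256]) {
    for (int i = 0; i < 256; i++)
        out[i] = second[first[i]];
}

/* Apply a table in place to every 8-bit component in the buffer. */
void apply_table(unsigned char *pixels, size_t length,
                 const unsigned char table[256]) {
    for (size_t i = 0; i < length; i++)
        pixels[i] = table[pixels[i]];
}
```

A gamma table like the one in the earlier snippet could be composed in the same way. Operations that mix channels, such as hue and saturation changes, do not fit a per-component table and would need a different approach.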
If you're looking for the absolute fastest way to do this, you're going to want to use the GPU to handle the processing. It's built to do massively parallel operations, like color adjustments on single pixels.
As I've mentioned in other answers, I measured a 14X - 28X improvement in performance when running an image processing operation using OpenGL ES instead of on the CPU. You can use the Accelerate framework to do faster on-CPU image manipulation (I believe Apple claims around a ~4-5X boost is possible here), but it won't be as fast as OpenGL ES. It can be easier to implement, however, which is why I've sometimes used Accelerate for this over OpenGL ES.
iOS 5.0 also brings over Core Image from the desktop, which gives you a nice wrapper around these kind of on-GPU image adjustments. However, there are some limitations to the iOS Core Image implementation that you don't have when working with OpenGL ES 2.0 shaders directly.
I present an example of an OpenGL ES 2.0 shader image filter in my article here. The hardest part about doing this kind of processing is getting the OpenGL ES scaffolding set up. Using my sample application there, you should be able to extract that setup code and apply your own filters using it. To make this easier, I've created an open source framework called GPUImage that handles all of the OpenGL ES interaction for you. It has almost every filter you list above, and most run in under 2.5 ms for a 640x480 frame of video on an iPhone 4, so they're far faster than anything processed on the CPU.
As I said in a comment, you should post this question on the official Apple Developer Forums as well.
That aside, one real quick check: are you calling pow( ) or powf( )? Even if your data is float, calling pow( ) will get you the double-precision math library function, which is significantly slower than the single-precision variant powf( ) (and you'll have to pay for the extra conversions between float and double as well).
And a second check: have you profiled your filters in Instruments? Do you actually know where the execution time is being spent, or are you guessing?
I actually wanted to do all this myself, but I found Silverberg's Image Filters. You can apply various Instagram-type image filters to your images. This is so much better than the other image filters out there, such as GLImageProcessing or Cimg.
Also check Instagram Image Filters on iPhone.
Hope this helps...
From iOS 5 upwards, you can use the Core Image filters to adjust a good range of image parameters.
To adjust contrast for example, this code works like a charm:
- (void)setImageContrast:(float)contrast forImageView:(UIImageView *)imageView {
    if (contrast > MIN_CONTRAST && contrast < MAX_CONTRAST) {
        CIImage *inputImage = [[CIImage alloc] initWithImage:imageView.image];
        CIFilter *exposureAdjustmentFilter = [CIFilter filterWithName:@"CIColorControls"];
        [exposureAdjustmentFilter setDefaults];
        [exposureAdjustmentFilter setValue:inputImage forKey:@"inputImage"];
        [exposureAdjustmentFilter setValue:[NSNumber numberWithFloat:contrast] forKey:@"inputContrast"]; //default = 1.00
        // [exposureAdjustmentFilter setValue:[NSNumber numberWithFloat:1.0f] forKey:@"inputSaturation"]; //default = 1.00
        // [exposureAdjustmentFilter setValue:[NSNumber numberWithFloat:0.0f] forKey:@"inputBrightness"];
        CIImage *outputImage = [exposureAdjustmentFilter valueForKey:@"outputImage"];
        CIContext *context = [CIContext contextWithOptions:nil];
        imageView.image = [UIImage imageWithCGImage:[context createCGImage:outputImage fromRect:outputImage.extent]];
    }
}
N.B. Default value for contrast is 1.0 (maximum value suggested is 4.0).
Also, contrast is calculated here on the imageView's image, so calling this method repeatedly will accumulate the contrast. That is, if you call this method with contrast value 2.0 first and then again with contrast value 3.0, you will get the original image with an effective contrast of 6.0 (2.0 * 3.0), not 5.0.
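The multiplication can be checked with a small sketch in plain C. It assumes a simple linear contrast model on a 0..1 color component (scale the distance from mid-gray); that model and the function name are illustrative assumptions, not Core Image's documented formula, and clamping is ignored.

```c
/* Assumed linear contrast model: scale the distance from mid-gray (0.5).
   Illustrative only; not claimed to match CIColorControls exactly. */
float apply_contrast(float value, float contrast) {
    return (value - 0.5f) * contrast + 0.5f;
}

/* Filtering an already-filtered value compounds the two factors. */
float apply_contrast_twice(float value, float first, float second) {
    return apply_contrast(apply_contrast(value, first), second);
}
```

Under this model, apply_contrast_twice(v, 2.0f, 3.0f) gives the same result as apply_contrast(v, 6.0f). Keeping a copy of the unfiltered image and always filtering from that copy avoids the compounding.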
Check the Apple documentation for more filters and parameters.
To list all available filters and parameters in code, just run this loop:
NSArray* filters = [CIFilter filterNamesInCategories:nil];
for (NSString* filterName in filters)
{
    NSLog(@"Filter: %@", filterName);
    NSLog(@"Parameters: %@", [[CIFilter filterWithName:filterName] attributes]);
}
This is an old thread, but I got to it from another link on SO, so people still read it.
With iOS 5, Apple added support for Core Image, and a decent number of Core Image filters. I'm pretty sure all the ones the OP mentioned are available.
Core Image uses OpenGL shaders under the covers, so it's really fast. It's much easier to use than OpenGL, however. If you aren't already working in OpenGL, and just want to apply filters to CGImage or UIImage objects, Core Image filters are the way to go.

Resources