Key differences between Core Image and GPUImage - iOS

What are the major differences between the Core Image and GPUImage frameworks (besides GPUImage being open source)? At a glance their interfaces seem pretty similar... Applying a series of filters to an input to create an output. I see a few small differences, such as the easy-to-use LookupFilter that GPUImage has. I am trying to figure out why someone would choose one over the other for a photo filtering application.

Since I'm the author of GPUImage, you may want to take what I say with a grain of salt. I should first say that I have a tremendous amount of respect for the Core Image team and how they continue to update the framework. I was a heavy Core Image user before I wrote GPUImage, and I patterned many of its design elements on how Core Image worked on the Mac.
Both frameworks are constantly evolving, so a comparison made today might not hold true in a few months. I can point to current capabilities and benchmarks, but there's no guarantee those won't flip when either of us updates things.
My philosophy with GPUImage was to create a lightweight wrapper around OpenGL (ES) quads rendered with shaders, and to do so with as simple an interface as possible. As I stated earlier, I pulled in aspects of Core Image that I really liked, but I also changed portions of their interface that had tripped me up in the past. I also extended things a bit, in that Core Image only deals with image processing, while I hook in movie playback, camera input, video recording, and image capture.
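As a rough sketch of what that interface looks like (assuming the Objective-C GPUImage classes bridged into Swift; treat the exact class and method names as approximate and check them against the current headers), a simple still-image chain goes source to filter to filter:

```swift
import GPUImage
import UIKit

// Hypothetical input; any UIImage will do.
let input = UIImage(named: "photo.jpg")!

// Source -> sepia -> blur, wired up as discrete targets.
let source = GPUImagePicture(image: input)
let sepia = GPUImageSepiaFilter()
let blur = GPUImageGaussianBlurFilter()
blur.blurRadiusInPixels = 2.0

source.addTarget(sepia)
sepia.addTarget(blur)

// Ask the last filter to hold onto its framebuffer, then render the chain.
blur.useNextFrameForImageCapture()
source.processImage()
let output = blur.imageFromCurrentFramebuffer()
```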
When I was originally kicking around the idea for this, Core Image had not yet come to iOS. By the time I released it, Core Image had just been added to iOS. However, the number of filters supported on iOS at that time was fairly limited (no blurs, for example), and Core Image on iOS did not allow you to create custom kernels as it did on the Mac.
GPUImage provided the means to do custom GPU-accelerated operations on images and video on iOS, where Core Image did not. Most people who started using it did so for that reason, because they had some effect that they could not do with stock Core Image filters.
Initially, GPUImage also had significant performance advantages for many common operations. However, the Core Image team has made significant improvements in processing speed with each iOS version and things are very close right now. For some operations, GPUImage is faster, and for others, Core Image is faster. They look to employ some pretty clever optimizations for things like blurs, which I've started to replicate in things like my GPUImageiOSBlurFilter. They also combine multi-stage operations intelligently, where I treat filter steps as discrete and separate items. In some cases on iOS, this gives me an advantage, and I've tried to reduce the memory consequences of this recently, but they handle many types of filter chains better than I do.
iOS 8 introduces the custom kernel support in Core Image on iOS that it has always had on the Mac. This makes it possible to write your own custom filters and other operations in Core Image on iOS, so that will no longer be as much of an advantage for GPUImage. Of course, anyone wanting to target an older iOS version will still be limited by what Core Image can do there, where GPUImage can target back to iOS 4.0.
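For illustration, a custom color kernel of the kind iOS 8 enabled looks roughly like the sketch below, using the string-based CIColorKernel initializer from the CIKernel Language era (newer SDKs steer you toward Metal-based kernels instead); the invert operation is just a placeholder:

```swift
import CoreImage

// A trivial color-inverting kernel written in the Core Image Kernel Language.
// Since iOS 8 these can be compiled at runtime on the device.
let kernelSource = """
kernel vec4 invertColor(__sample s) {
    return vec4(vec3(1.0) - s.rgb, s.a);
}
"""

func inverted(_ image: CIImage) -> CIImage? {
    guard let kernel = CIColorKernel(source: kernelSource) else { return nil }
    // The kernel runs per pixel over the requested extent.
    return kernel.apply(extent: image.extent, arguments: [image])
}
```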
Core Image also has some neat capabilities in terms of being able to do filtering while an iOS application is in the background (CPU-based at first, but iOS 8 adds GPU-side support for this now), where GPUImage's reliance on OpenGL ES prevents it from running when an application is in the background. There might be ways around this limitation in iOS 8, but I haven't worked through all the documentation yet.
My interests with GPUImage are in the field of machine vision. The image filters are a fun distraction, but I want to use this framework to explore what's possible with GPU-accelerated image analysis. I'm working on arbitrary object recognition and tracking operations, and that's the direction I'll continually evolve the framework toward. However, you have the code to the framework, so you don't have to rely on me.

This is an old thread, but I think it's worth noting that GPUImage also has some features that are not present in Core Image: notably a Hough transform and several edge detection filters.
Core Image seems to be all about applying filters and effects, and it's good to see GPUImage exploring more image/video analysis, becoming a bit more like OpenCV, but in a more efficient way.

Related

ARKit image detection - many images

I need to make an app that detects images and their position, and displays AR content on them. These images will change during the lifetime of the app, and there can be many of them. I'm wondering how to design this kind of app. ARKit can provide this functionality: detect an image and its orientation, and display AR content on it. But the problem is that ARKit can only detect a limited number of images at a time. If I have, for example, 300 images, there could be a problem. Maybe I could prepare an ML dataset to pre-detect an image, and then assign it as an ARKit trackable on the fly? Is this the right approach? What else could I do to make such an app with a dynamic and large set of images to detect?
Regarding a ML approach, you can use just about any state-of-the-art object detection network to pull the approximate coordinates of your desired target and extract that section of the frame, passing positives to ARKit or similar. The downside is that training will probably be resource-intensive. It could work, but I can't speak to its efficiency relative to other approaches.
In looking to extend this explanation, I see that ARKit 2.0 handles (what seems to be) what you're trying to do; is this insufficient?
To answer your question in the comments, CoreML seems to offer models for object recognition but not localization, so I suspect it'd be necessary to use their converter after training a model such as these. The input to this network would be frames from the camera, and the output would be the detected classes with detection probabilities and approximate coordinates, i.e. whether your targets are present and roughly where they are.
Again, though, if you're looking for 2D images rather than 3D+ objects, and especially if it's an ARKit app anyway, it really looks like ARKit's built-in tracking will be much more effective at substantially lower development cost.
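To make the built-in route concrete, here's a rough sketch of feeding ARKit a batch of reference images at runtime. The tuple input, the helper name, and the session-restart strategy are my own assumptions; the physical widths must reflect the real printed or displayed size of each image.

```swift
import ARKit

// Build ARReferenceImages from CGImages you already have (bundled, downloaded,
// or generated) and hand them to a world-tracking session. ARKit needs the
// physical width of each printed/displayed image, in metres, to place content.
func runDetection(on session: ARSession,
                  images: [(name: String, cgImage: CGImage, widthInMetres: CGFloat)]) {
    let referenceImages = Set(images.map { item -> ARReferenceImage in
        let ref = ARReferenceImage(item.cgImage, orientation: .up,
                                   physicalWidth: item.widthInMetres)
        ref.name = item.name          // lets you map detected anchors back to content
        return ref
    })

    let configuration = ARWorldTrackingConfiguration()
    configuration.detectionImages = referenceImages     // the detection set can be larger
    configuration.maximumNumberOfTrackedImages = 4      // simultaneous tracking stays small

    // Re-running the session is one way to rotate subsets of a large library in and out.
    session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
}
```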
At WWDC '19, ARKit 3 was touted to support up to 100 images for image detection. Image tracking supports a lower number of images, which I believe is still under 10. Currently, you have to recognize images yourself if you want more than that.
As an idea, you can identify rectangles in the camera feed and then apply a CIPerspectiveCorrection filter to extract a fully 2D image based on the detected rectangle. See the Tracking and Altering Images sample code, which does something similar.
You then compare the rectangle's image data against your set of 300 source images. ARKit likely stopped at 100 due to performance concerns, but it's possible you can go beyond that number with performance that's still acceptable by your own criteria.
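A hedged sketch of that idea, using Vision for the rectangle detection and Core Image for the rectification; the matching step against your 300 images is left out, and the single-observation limit and helper names are assumptions:

```swift
import Vision
import CoreImage

// Detect a rectangle in a camera frame and rectify it to a flat 2D image that
// can then be matched against your own source images (perceptual hash, feature
// matching, a CoreML classifier, etc.).
func rectifiedCandidate(in pixelBuffer: CVPixelBuffer,
                        completion: @escaping (CIImage?) -> Void) {
    let request = VNDetectRectanglesRequest { request, _ in
        guard let rect = (request.results as? [VNRectangleObservation])?.first else {
            completion(nil)
            return
        }

        let image = CIImage(cvPixelBuffer: pixelBuffer)
        let size = image.extent.size

        // Vision returns normalised corners; scale them to pixel coordinates.
        func scaled(_ p: CGPoint) -> CIVector {
            CIVector(x: p.x * size.width, y: p.y * size.height)
        }

        let flattened = image.applyingFilter("CIPerspectiveCorrection", parameters: [
            "inputTopLeft": scaled(rect.topLeft),
            "inputTopRight": scaled(rect.topRight),
            "inputBottomLeft": scaled(rect.bottomLeft),
            "inputBottomRight": scaled(rect.bottomRight)
        ])
        completion(flattened)
    }
    request.maximumObservations = 1

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])
}
```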

iOS: why does overriding drawRect resort to software rendering?

I'm not a huge fan of the iOS graphics APIs and their documentation, and I have been trying for a while now to form a high-level view of the rendering process, but I only have bits and pieces of information. Essentially, I am trying to understand (again, at a high level):
1) The role of the Core Graphics and Core Animation APIs in the rendering pipeline, all the way from a CGContext to the front frame buffer.
2) And, along the way (this has been the most confusing and least elaborated part of the documentation), which tasks are performed by the CPU and which by the GPU.
With Swift and Metal out, I'm hoping the APIs will be revisited.
Have you started with the WWDC videos? They cover many of the details extensively. For example, this year's Advanced Graphics & Animations for iOS Apps is a good starting point. The Core Image talks are generally useful as well (I haven't watched this year's yet). I highly recommend going back to previous years. They've had excellent discussions about the CPU/GPU pipeline in previous years. The WWDC 2012 Core Image Techniques was very helpful. And of course learning to use Instruments effectively is just as important as understanding the implementations.
Apple does not typically provide low-level implementation details in the main documentation. The implementation details are not interface promises, and Apple changes them from time to time to improve performance for the majority of applications. This can sometimes degrade performance on corner cases, which is one reason you should avoid being clever with performance tricks.
But the WWDC videos have exactly what you're describing, and will walk you through the rendering pipeline and how to optimize it. The recommendations they make tend to be very stable from release to release and device to device.
1) The role of the Core Graphics and Core Animation APIs in the rendering pipeline, all the way from a CGContext to the front frame buffer.
Core Graphics is a drawing library that implements the same primitives as PDF or PostScript. So you feed it bitmaps and various types of path and it produces pixels.
Core Animation is a compositor. It produces the screen display by compositing buffers (known as layers) from video memory. While compositing it may apply a transform, moving, rotating, adding perspective or doing something else to each layer. It also has a timed animation subsystem that can make timed adjustments to any part of that transform without further programmatic intervention.
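As a tiny sketch of that (the helper name is hypothetical): animating a layer's transform never touches its bitmap; the compositor just re-applies a changing rotation to the existing buffer each frame.

```swift
import UIKit

// Spin a layer's existing contents. No drawing code runs during the animation;
// Core Animation re-composites the same buffer with a new transform per frame.
func spin(_ view: UIView) {
    let rotation = CABasicAnimation(keyPath: "transform.rotation.z")
    rotation.fromValue = 0
    rotation.toValue = CGFloat.pi * 2
    rotation.duration = 1.0
    rotation.repeatCount = .infinity
    view.layer.add(rotation, forKey: "spin")
}
```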
UIKit wires things up so that you use CoreGraphics to draw the contents of your view to a layer whenever the contents themselves change. That primarily involves the CPU. For things like animations and transitions you then usually end up applying or scheduling compositing adjustments. So that primarily involves the GPU.
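And a small sketch of the CPU side (BadgeView is a hypothetical view): UIKit hands draw(_:) a current Core Graphics context, and the resulting bitmap becomes the layer's contents until the view is invalidated again.

```swift
import UIKit

// Drawing into the view's backing layer happens on the CPU via Core Graphics.
// UIKit calls draw(_:) only when the contents are invalidated (setNeedsDisplay()).
class BadgeView: UIView {
    override func draw(_ rect: CGRect) {
        guard let ctx = UIGraphicsGetCurrentContext() else { return }

        let circle = bounds.insetBy(dx: 4, dy: 4)
        ctx.setFillColor(UIColor.red.cgColor)
        ctx.fillEllipse(in: circle)

        ctx.setStrokeColor(UIColor.white.cgColor)
        ctx.setLineWidth(2)
        ctx.strokeEllipse(in: circle)
    }
}
```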
2) And, along the way (this has been the most confusing and least elaborated part of the documentation), which tasks are performed by the CPU and which by the GPU.
Individual layer drawing: CPU
Transforming and compositing layers to build up the display: GPU
iOS: why does overriding drawRect resort to software rendering?
It doesn't 'resort' to anything. The exact same pipeline is applied whether you wrote the relevant drawRect: or Apple did.
With Swift and Metal out, I'm hoping the APIs will be revisited.
Swift and Metal are completely orthogonal to this issue. The APIs are very well formed and highly respected. Your issues with them are — as you freely recognise — lack of understanding. There is no need to revisit them and Apple has given no indication that it will be doing so.

Is Quartz2D or OpenGL more appropriate in my situation

I'm planning on developing a 2D game. It's a traffic control game with many different entities: lanes with a variety of complexities, pedestrians, bike riders, cars with different privileges, and of course traffic lights, etc. Although it's going to be 2D, I want it to be as smooth as possible. The objects will mostly not be realistic (a pedestrian, for example, will be more like a cartoon character than a real person), but the flow of the game should feel natural. I'm having a little difficulty deciding whether to use Quartz or OpenGL. I read lots of threads on SO but I still need some more guidance. Thank you a lot.
From a performance point of view, OpenGL will be the best. Cocos2d is a very good framework; you can put images on the canvas with very good performance.
I haven't used GLKit (from iOS 5), but you can put an OpenGL view inside UIKit, which is good if you'd still like to draw using Core Graphics; you can layer UIKit and OpenGL.
I personally recommend Kobold2D: http://www.kobold2d.com/display/KKSITE/Home because it comes with many sample projects you can start from and modify.

confusion regarding quartz2d, core graphics, core animation, core images

I am working on a project which requires some image processing. I also asked a question regarding it and got a very good solution; here is the link: create whole new image in iOS by selecting different properties
But now I want to learn this in more detail, and I am confused about where I should start: Quartz 2D, Core Animation, Core Graphics, or Core Image?
The Apple documentation says this about Quartz 2D:
The Quartz 2D API is part of the Core Graphics framework, so you may see Quartz referred to as Core Graphics or, simply, CG.
And the Apple docs say this about Core Graphics:
The Core Graphics framework is a C-based API that is based on the Quartz advanced drawing engine.
It is confusing how the two relate to each other.
Core Animation, meanwhile, covers all the concepts of coordinates, bounds, frames, etc., which are also needed when drawing images.
And Core Image was introduced in iOS 5.
Where should I start learning, and in which sequence should I learn all of these?
Quartz and Core Graphics are effectively synonymous. I tend to avoid using "Quartz" because the term is very prone to confusion (indeed, the framework that includes Core Animation is "QuartzCore," confusing matters further).
I would say:
Learn Core Graphics (CoreGraphics.framework) if you need high performance vector drawing (lines, rectangles, circles, text, etc.), perhaps intermingled with bitmap/raster graphics with simple modifications (e.g. scaling, rotation, borders, etc.). Core Graphics is not particularly well suited for more advanced bitmap operations (e.g. color correction). It can do a lot in the way of bitmap/raster operations, but it's not always obvious or straightforward. In short, Core Graphics is best for "Illustrator/Freehand/OmniGraffle" type uses.
Learn Core Animation (inside QuartzCore.framework) if, well, you need to animate content. Basic animations (such as moving a view around the screen) can be accomplished entirely without Core Animation, using basic UIView functionality, but if you want to do fancier animation, Core Animation is your friend. Somewhat unintuitively, Core Animation is also home to the CALayer family of classes, which in addition to being animatable allow you to do some more interesting things, like quick (albeit poorly performing) view shadows and 3D transforms (giving you what might be thought of as "poor man's OpenGL"). But it's mainly used for animating content (or content properties, such as color and opacity).
Learn Core Image (CoreImage.framework on iOS; historically part of QuartzCore on the Mac) if you need high performance, pixel-accurate image processing. This could be everything from color correction to lens flares to blurs and anything in between. Apple publishes a filter reference that enumerates the various pre-built Core Image filters that are available. You can also write your own, though this isn't necessarily for the faint of heart. In short, if you need to implement something like "[pick your favorite photo editor] filters" then Core Image is your go-to.
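For the "photo editor filter" case, a minimal sketch with one of those pre-built filters might look like this; CISepiaTone and the 0.8 intensity are just example choices from the filter reference:

```swift
import CoreImage
import UIKit

// Apply a built-in filter to a UIImage and render the result.
// Creating a CIContext is relatively expensive, so reuse it in real code.
let context = CIContext()

func sepia(_ uiImage: UIImage, intensity: Double = 0.8) -> UIImage? {
    guard let input = CIImage(image: uiImage),
          let filter = CIFilter(name: "CISepiaTone") else { return nil }

    filter.setValue(input, forKey: kCIInputImageKey)
    filter.setValue(intensity, forKey: kCIInputIntensityKey)

    // Rendering happens here; everything before this is just a recipe.
    guard let output = filter.outputImage,
          let cgImage = context.createCGImage(output, from: output.extent) else { return nil }
    return UIImage(cgImage: cgImage)
}
```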
Does that clarify matters?
Core Animation is a technology that relies a lot more on OpenGL, which means it's GPU-bound.
Core Graphics on the other hand uses the CPU for rendering. It's a lot more precise (pixel-wise) than Core Animation, but will use your CPU.

Photo booth in iOS. Using OpenCV or OpenGL ES?

I want to make an application that filters video like Apple's Photo Booth app.
How can I do that?
Using OpenCV, OpenGL ES, or something else?
OpenCV and OpenGL have very different purposes:
OpenCV is a cross-platform computer vision library. It allows you to easily work with image and video files, and it provides several tools and methods for handling them, running filters, and applying various other image processing techniques, along with some more advanced features.
OpenGL is a cross-platform API to produce 2D/3D computer graphics. It is used to draw complex three-dimensional scenes from simple primitives.
If you want to perform cool effects on images, OpenCV is the way to go, since it provides tools/effects that can easily be combined to achieve the result you are looking for. And this approach doesn't stop you from processing the image with OpenCV and then rendering the result in an OpenGL window (if you have to). Remember, they have different purposes, and every now and then somebody uses them together.
The point is that the effects you want to perform in the image should be done with OpenCV or any other image processing library.
Actually, karlphillip, although what you have said is correct, OpenGL can also be used to perform hardware-accelerated image processing.
Apple even has an OpenGL sample project called GLImageProcessing that has hardware-accelerated brightness, contrast, saturation, hue, and sharpness adjustments.
