I am working on an image processing app for the iOS, and one of the various stages of my application is a vector based image posterization/color detection.
Now, I've written the code that can, per-pixel, determine the posterized color, but going through each and every pixel in an image, I imagine, would be quite difficult for the processor if the iOS. As such, I was wondering if it is possible to use the graphics processor instead;
I'd like to create a sort of "pixel shader" which uses OpenGL-ES, or some other rendering technology to process and posterize the image quickly. I have no idea where to start (I've written simple shaders for Unity3D, but never done the underlying programming for them).
Can anyone point me in the correct direction?
I'm going to come at this sideways and suggest you try out Brad Larson's GPUImage framework, which describes itself as "a BSD-licensed iOS library that lets you apply GPU-accelerated filters and other effects to images, live camera video, and movies". I haven't used it and assume you'll need to do some GL reading to add your own filtering but it'll handle so much of the boilerplate stuff and provides so many prepackaged filters that it's definitely worth looking into. It doesn't sound like you're otherwise particularly interested in OpenGL so there's no real reason to look into it.
I will add the sole consideration that under iOS 4 I found it often faster to do work on the CPU (using GCD to distribute it amongst cores) than on the GPU where I needed to be able to read the results back at the end for any sort of serial access. That's because OpenGL is generally designed so that you upload an image and then it converts it into whatever format it wants and if you want to read it back then it converts it back to the one format you expect to receive it in and copies it to where you want it. So what you save on the GPU you pay for because the GL driver has to shunt and rearrange memory. As of iOS 5 Apple have introduced a special mechanism that effectively gives you direct CPU access to OpenGL's texture store so that's probably not a concern any more.
Related
I need to render video from multiple IP cameras into several controls within the client application.
On top of the video, I should be able to add some OSD such as timestamp and camera name.
What I'm trying to do has nothing to do with 3D since we're talking about digital video with some text on it.
Which API is more suitable for this purpose? Direct3D or Direct2D?
Performance should also be a consideration here.
It used to be that Direct2D was a poor choice for Windows Phone (if you care about that system) because it wasn't supported, but Win Phone 8.1 has it now, so less of an issue.
My experience with D2D was that it offered fast, high quality 2D rendering, and I would say it is a good choice.
You might want to take a look at this article on Code Project. That looks appropriate for your purposes.
If you are certain you only need MS system support, then you're all set.
Another way to go would be a cross platform system like nanovg, which offers nice 2D rendering and would work on a Mac. Of course, you'd need to figure out how to do the video part on non windows systems.
Regarding D3D, you could certainly do it that way, but my guess would be it would make some things trickier to do. Don't forget you can combine the two as well...
I'm newly invovled developing image processing app on iOS, I have lots of experience on OpenCV, however everything is new for me on the iOS even OSX.
So I found there are mainly the core image library and the GPUImage library around for the normal image processing work. I'm insterested in knowing which one should I choose as a new on the iOS platform? I have seen some tests done on iOS 8 on iPhone 6, it appears the core image is faster than the GPUImage on the GPUImage's benchmark now.
I'm actually looking for a whole solution on image processing development,
What language ? Swift, Objective-C or Clang and C++ ?
What library ? GPUImage or Core Image or OpenCV or GEGL ?
Is there an example app ?
My goal is to develop some advance colour correction functions, I wish to make it as fast as possible, so in future I can make the image processing become video processing without much problem.
Thanks
I'm the author of GPUImage, so you might weigh my words appropriately. I provide a lengthy description of my design thoughts on this framework vs. Core Image in my answer here, but I can restate that.
Basically, I designed GPUImage to be a convenient wrapper around OpenGL / OpenGL ES image processing. It was built at a time when Core Image didn't exist on iOS, and even when Core Image launched there it lacked custom kernels and had some performance shortcomings.
In the meantime, the Core Image team has done impressive work on performance, leading to Core Image slightly outperforming GPUImage in several areas now. I still beat them in others, but it's way closer than it used to be.
I think the decision comes down to what you value for your application. The entire source code for GPUImage is available to you, so you can customize or fix any part of it that you want. You can look behind the curtain and see how any operation runs. The flexibility in pipeline design lets me experiment with complex operations that can't currently be done in Core Image.
Core Image comes standard with iOS and OS X. It is widely used (plenty of code available), performant, easy to set up, and (as of the latest iOS versions) is extensible via custom kernels. It can do CPU-side processing in addition to GPU-acelerated processing, which lets you do things like process images in a background process (although you should be able to do limited OpenGL ES work in the background in iOS 8). I used Core Image all the time before I wrote GPUImage.
For sample applications, download the GPUImage source code and look in the examples/ directory. You'll find examples of every aspect of the framework for both Mac and iOS, as well as both Objective-C and Swift. I particularly recommend building and running the FilterShowcase example on your iOS device, as it demonstrates every filter from the framework on live video. It's a fun thing to try.
In regards to language choice, if performance is what you're after for video / image processing, language makes little difference. Your performance bottlenecks will not be due to language, but will be in shader performance on the GPU and the speed at which images and video can be uploaded to / downloaded from the GPU.
GPUImage is written in Objective-C, but it can still process video frames at 60 FPS on even the oldest iOS devices it supports. Profiling the code finds very few places where message sending overhead or memory allocation (the slowest areas in this language compared with C or C++) is even noticeable. If these operations were done on the CPU, this would be a slightly different story, but this is all GPU-driven.
Use whatever language is most appropriate and easiest for your development needs. Core Image and GPUImage are both compatible with Swift, Objective-C++, or Objective-C. OpenCV might require a shim to be used from Swift, but if you're talking performance OpenCV might not be a great choice. It will be much slower than either Core Image or GPUImage.
Personally, for ease of use it can be hard to argue with Swift, since I can write an entire video filtering application using GPUImage in only 23 lines of non-whitespace code.
I have just open-sourced VideoShader, which allows you to describe video-processing pipeline in JSON-based scripting language.
https://github.com/snakajima/videoshader
For example, "cartoon filter" can be described in 12 lines.
{
"title":"Cartoon I",
"pipeline":[
{ "filter":"boxblur", "ui":{ "primary":["radius"] }, "attr":{"radius":2.0} },
{ "control":"fork" },
{ "filter":"boxblur", "attr":{"radius":2.0} },
{ "filter":"toone", "ui":{ "hidden":["weight"] } },
{ "control":"swap" },
{ "filter":"sobel" },
{ "filter":"canny_edge", "attr":{ "threshold":0.19, "thin":0.50 } },
{ "filter":"anti_alias" },
{ "blender":"alpha" }
]
}
It compiles this script into GLSL (OpenGL's shading language for GPU) at runtime, and all the pixel operations will be done in GPU.
Well if you are doing some ADVANCE image processing stuff then i suggest to go with OpenGL ES(i assume i don't need to cover the benefit of OpenGL over UIKit or Core Graphics) and you can start with below tutorials.
http://www.raywenderlich.com/70208/opengl-es-pixel-shaders-tutorial
https://developer.apple.com/library/ios/samplecode/GLImageProcessing/Introduction/Intro.html
I am new to iOS programming and am interested in working with images. Basically, I want to be able to obtain the (0,255) and RGB tuples of every pixel in a given image. What would be the best way of doing this? Would I need to use Open GL, or something similar?
Thanks
If you want to work with images, get a copy of Apple's 'Quartz 2D Programming Guide'. If you want even more detailed how-to, get a copy of the "Programming with Quartz" book on Amazon (its says Mac in the title as it predates iOS).
Essentially you are going to take images, draw them into bit map contexts, then determine the rgba layout by querying the image.
If you want to use system resources to assist you in making certain types of changes to images, there is a OSX framework recently moved to iOS called the Accelerate Framework. and it has a lot of functions in it for image manipulation (vImage).
For reading and writing images to the file system look at Apple's 'Image I/O Guide'. For advanced filtering there is Core Image, which allows you to apply filters to images.
EDIT: If you have any interest in really fast accellerated code that uses the GPU to perform some sophisticated filtering, you can checkout Brad Larson's GPU Image project on github.
I need to move large amounts of pixels on the screen on an iOS device. What is the most efficient way of doing this?
So far I'm using glTexSubImage2D(), but I wonder if this can be done any faster. I noticed that OpenGL ES 2.0 does not support pixel buffers, but there seems to be a pixel buffer used by Core Video. Can I use that? Or maybe there's an Apple extension for OpenGL that could help me achieve something similar (I think saw a very vague mention about a client storage extension in one of the WWDC 2012 videos, but I can't find any documentation about it)? Any other way that I can speed this up?
My main concern is that glTexSubImage2D() copies all the pixels that I send. Ideally, I'd like to skip this step of copying the data, since I already have it prepared...
The client storage extension you're probably thinking of is CVOpenGLESTextureCacheCreateTextureFromImage; a full tutorial is here. That's definitely going to be the fastest way to get data to the GPU.
Frustratingly the only mention I can find of it in Apple's documentation is the iOS 4.3 to 5.0 API Differences document — do a quick search for CVOpenGLESTextureCache.h.
Our software produces a lot of data that is georeferenced and recorded over time. We are considering ways to improve the visualisation, and showing the (processed) data in a 3D view, given it's georeferenced, seems a good idea.
I am looking for SO's recommendations for what 3D libraries are best to use as a base when building these kind of visualisations in a Delphi- / C++Builder-based Windows application. I'll add a bounty when I can.
The data
Is recorded over time (hours to days) and is GPS-tagged. So, we have a lot of data following a path over time.
Is spatial: it represents real 3D elements of the earth, such as the land, or 3D elements of objects around the earth.
Is high volume: we could have a point cloud, say, of hundreds of thousands to millions of points. Processed data may display as surfaces created from these point clouds.
From that, you can see that an interactive, spatially-based 3D visualisation seems a good approach. I'm envisaging something where you can easily and quickly navigate around in space, and data will load or be generated on the fly depending on what you're looking at. I would prefer we don't try to write our own 3D library from scratch - for something like this, there have to be good existing libraries we can work from.
So, I'm hoping for a library which supports:
good navigation (is the library based on Euler rotations only, for example? Can you 'pick' objects to rotate around or move with easily?);
modern GPUs (shader-only rendering is ok; being able to hook into the pipeline to write shaders that map values to colours and change dynamically would be great - think data values given a colour through a colour lookup table);
dynamic data / objects (data can be added as it's recorded; and if the data volume is too high, we should be able to page things in and out or recreate them, and only show a sensible subset so that whatever the user's viewport is looking at is there onscreen, but other data can be loaded/regenerated, preferably asynchronously, or at least quickly as the user navigates. Obviously data creation is dependent on us, but a library that has hooks for this kind of thing would be great.)
and technologically, works with Delphi / C++Builder and the VCL.
Libraries
There are two main libraries I've considered so far - I'm looking for knowledgeable opinions about these, or for other libraries I haven't considered.
1. FireMonkey
This is Embarcadero's new UI library, which is only available in XE2 and above. Our app is based on the VCL and we'd want to host this in a VCL window; that seems to be officially unsupported but unofficially works fine, or is available through third-parties.
The mix of UI framework and 3D framework with shaders etc sounds great. But I don't know how complex the library is, what support it has for data that's not a simple object like a cube or sphere, and how well-designed it is. That last link has major criticisms of the 3D side of the library - severe enough I am not sure it's worthwhile in its current state at the time of writing for a non-trivial 3D app.
Is it worth trying to write a new visualisation window in our VCL app using FireMonkey?
2. GLScene
GLScene is a well-known 3D OpenGL framework for Delphi. I have never used it myself so have no experience about how it works or is designed. However, I believe it integrates well into VCL windows and supports shaders and modern GPUs. I do not know how its scene graph or navigation work or how well dynamic data can be implemented.
Its feature list specifically mentions some things I'm interested in, such as easy rotation/movement, procedural objects (implying dynamic data is easy to implement), and helper functions for picking. It seems shaders are Cg only (not GLSL or another non-vendor-specific language.) It also supports "polymorphic image support for texturing (allows many formats as well as procedural textures), easily extendable" - that may just mean many image formats, or it may indicate something where the texture can be dynamically changed, such as for dynamic colour mapping.
Where to from here?
These are the only two major 3D libraries I know of for Delphi or C++Builder. Have I missed any? Are there pros and cons I'm not aware of? Do you have any experience using either of these for this kind of purpose, and what pitfalls should we be wary of or features should we know about and use?
We currently use Embarcadero RAD Studio 2010 and most of our software is written in C++. We have small amounts of Delphi and may consider upgrading IDEs, but we are most likely to wait until the 64-bit C++ compiler is released. For that reason, a library that works in RS2010 might be best.
Thanks for your input :) I'm after high-quality answers, so I'll add a bounty when I can!
I have used GLScene in my 3D geomapping software and although it's not used to an extent you're looking for I can vouch that it seems the most appropriate for what you're trying to do.
GLScene supports terrain rendering and adding customizable objects to the scene. Objects can be interacted with and you can create complex 3D models of objects using the various building blocks of GLScene.
Unfortunately I cannot state how it will work with millions of points, but I do know that it is quite optimized and performs great on minimal hardware - that being said - the target PC I found required a dedicated graphics card capable of using OpenGL 2.1 extensions or higher (I found small issues with integrated graphics cards).
The other library I looked at was DXscene - which appears quite similar to GLScene albeit using DirectX instead of OpenGL. From memory this was a commercial product where GLScene was licensed under GPL. (EDIT - the page seems to be down at the moment : http://www.ksdev.com/index.html)
GLScene is still in active development and provides a fairly comprehensive library of functions, base objects and texturing etc. Things like rotation, translation, pitch, roll, turn, ray casting - to name a few - are all provided for you. Visibility culling is provided for each base object as well as viewing cameras, lighting and meshes. Base objects include cubes, spheres, pipes, tetrahedrons, cones, terrain, grids, 3d text, arrows to name a few.
Objects can be picked with the mouse and moved along 1,2 or 3 axes. Helper functions are included to automatically calculate the top-most object the mouse is under. Complex 3D shapes can be built by attaching base objects to other base objects in a hierarchical manner. So, for example, a car could be built using a rectangle as the base object and attaching four cylinders to it for the wheels - then you can manipulate the 'car' as a whole - since the four cylinders are attached to the base rectangle.
The only downside I could bring to your attention is the sometimes limited help/support available to you. Yes, there is a reference manual and a number of demo applications to show you how to do things such as select objects and move them around, however the reference manual is not complete and there is potential to get 'stuck' on how to accomplish a certain task. Forum support is somewhat limited/sparse. If you have a sound knowledge of 3D basics and concepts I'm sure you could nut it out.
As for Firemonkey - I have had no experience with this so I can't comment. I believe this is more targeted at mobile applications with lower hardware requirements so you may have issues with larger data sets.
Here are some other links that you may consider - I have no experience with them:
http://www.truevision3d.com/
http://www.3impact.com/
Game Development in Delphi
The last one is targeted at game development - but may provide useful information.
Have you tried glData? http://gldata.sourceforge.net/
It is old (~2004, Delphi 7), and I have not personally used the library, but some of the output is amazing.
you can use the GLScene or OpenGL they are good 3D rendering and its very easy to use.
Since you are already using georeferenced data, maybe you should consider embedding GoogleEarth in your Delphi application like this? Then you can add data to it as points, paths, or objects.