Relation between pixel formats of ofGrabber and ofTexture - iOS

I'm currently creating an iOS app and I'm having trouble understanding the relationship between taking pixels from an ofGrabber and drawing them using an ofTexture.
My current code:
In setup():
//Set iOS orientation
ofSetOrientation(OF_ORIENTATION_90_LEFT);
//Inits the camera to specified dimensions and sets up texture to display on screen
grabber.initGrabber(640, 480, OF_PIXELS_BGRA); //Options: 1280x720, 640x480
//Allocate opengl texture
tex.allocate(grabber.width, grabber.height, GL_RGB);
//Create pix array large enough to store an rgb value for each pixel
//pix is a global that I use to do pixel manipulation before drawing
pix = new unsigned char[grabber.width * grabber.height * 3];
In update()
//Loads the new pixels into the opengl texture
tex.loadData(pix, grabber.width, grabber.height, GL_RGB);
In draw():
CGRect screenBounds = [[UIScreen mainScreen] bounds];
CGSize screenSize = CGSizeMake(screenBounds.size.width, screenBounds.size.height);
//Draws the texture we generated onto the screen
//On 1st generation iPad mini: width = 1024, height = 768
tex.draw(0, 0, screenSize.height, screenSize.width); //Reversed for 90 degree rotation
What I'm wondering:
1) Why do the ofGrabber and the ofTexture use seemingly different pixel formats? (These formats are the same ones used in the VideoGrabberExample.)
2) What exactly is the texture doing with the resolution? I'm loading the pix array into the texture. The pix array represents a 640x480 image, while the ofTexture draws a 1024x768 (768x1024 when rotated) image to the screen. How is it doing this? Does it just scale everything up, since the aspect ratio is basically the same?
3) Is there a list anywhere that describes the OpenGL and openFrameworks pixel formats? I've searched for this but haven't found much. For example, why is it OF_PIXELS_BGRA instead of OF_PIXELS_RGBA? For that matter, why does my code even work if I'm capturing BGRA-formatted data (which I assume includes an alpha value) yet I am only drawing RGB (and you can see that my pix array is sized for RGB data)?
I might also mention that in main() I have:
ofSetupOpenGL(screenSize.height, screenSize.width, OF_FULLSCREEN);
However, changing the width/height values above seems to have no effect whatsoever on my app.

ofGrabber is CPU based, so it uses OF_PIXELS_BGRA by the programmer's choice. Many cameras deliver frames in BGRA order, so matching that order spares the grabber a costly per-frame conversion (an extra memcpy) when grabbing from the source. ofTexture maps GPU memory, so it maps to what you'll see on screen (RGB). Note that GL_RGB is an OpenGL definition.
ofTexture scales to whatever size you tell it to draw at. This is done on the GPU, so it's quite cheap, and it doesn't even need to keep the same aspect ratio.
This is up to the programmer or your requirements. Some cameras provide BGRA streams, other cameras or files provide RGB directly, or even YUV I420. Color formats are very heterogeneous. openFrameworks will handle conversions in most cases; look into ofPixels to see where and how it's used. In a nutshell:
OF_PIXELS_XXX : Used by ofPixels, basically a RAM mapped bitmap
OF_IMAGE_XXX : Used by ofImage, which wraps ofPixels and makes it simpler to use
GL_XXX : Used by OpenGL and ofTexture, low level GPU Memory mapped
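To make the relationship concrete, here is a minimal sketch (not from the original post) of how the BGRA grabber pixels could be repacked into the RGB pix array before uploading to the texture. It assumes the older openFrameworks API where grabber.getPixels() returns the raw bytes; with newer versions you would use grabber.getPixels().getData() instead.
//In update(), before drawing:
if (grabber.isFrameNew()) {
    unsigned char* bgra = grabber.getPixels(); // BGRA, 4 bytes per pixel
    for (int i = 0; i < grabber.width * grabber.height; i++) {
        pix[i * 3 + 0] = bgra[i * 4 + 2]; // R sits at BGRA index 2
        pix[i * 3 + 1] = bgra[i * 4 + 1]; // G
        pix[i * 3 + 2] = bgra[i * 4 + 0]; // B sits at BGRA index 0
        // the alpha byte at index 3 is simply dropped
    }
    tex.loadData(pix, grabber.width, grabber.height, GL_RGB);
}
The grabber keeps its native BGRA layout on the CPU side, while the texture only ever sees the repacked RGB bytes, which is why the two formats can coexist.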

Related

CIFilter convolution skews CIImage dimensions into infinity

Applying a convolution kernel to an input image should produce an output image with the exact same dimensions. Yet, when using CIFilter.convolution3X3 with a non-zero bias on a CIImage, inspecting the output reveals that the width, height, and origin coordinate have been skewed into infinity, specifically CGFloat.greatestFiniteMagnitude. I've tried the 5x5 and 7x7 versions of this filter and I've tried setting different weights and biases, and the conclusion is the same - if the bias is anything other than zero, the output image's size and origin coordinate appear to be ruined.
The documentation for this filter is here.
Here is some code...
// create the filter
let convolutionFilter = CIFilter.convolution3X3()
convolutionFilter.bias = 1 // any non zero bias will do
// I'll skip setting convolutionFilter.weights because the filter's default weights (an identity matrix) should be fine
// make your CIImage input
let input = CIImage(...) // I'm making mine from data I got from the camera
// let's print the size and position so we can compare it with the output
print(input.extent.width, input.extent.height, input.extent.origin) // -> 960.0 540.0 (0.0, 0.0)
// pass the input through the filter
convolutionFilter.inputImage = input
guard let output = convolutionFilter.outputImage else {
    print("the filter failed for some reason")
    return // guard must exit scope; this code is assumed to live inside a function
}
// the output image now contains the instructions necessary to perform the convolution,
// but no processing has actually occurred; even so, the extent property will have
// been updated if a change in size or position was described
// examine the output's size (it's just another CIImage - virtual, not real)
print(output.extent.width, output.extent.height, output.extent.origin) // -> 1.7976931348623157e+308 1.7976931348623157e+308 (-8.988465674311579e+307, -8.988465674311579e+307)
Notice that 1.7976931348623157e+308 is CGFloat.greatestFiniteMagnitude.
This shouldn't be happening. The only other information I can provide is that I'm running this code on iOS 13.5 and the CIImages I am filtering are being instantiated from CVPixelBuffers grabbed from CMSampleBuffers that are automatically delivered to my code by the device's camera feed. The width and height are 960x540 before going through the filter.
Although it does not appear to be documented anywhere, this does seem to be the normal behavior, as #matt suggested, although I have no idea why the bias is the deciding factor. In general I suspect it has something to do with the fact that CIFilter's convolutions must operate outside the initial bounds of the image when processing the edge pixels; the kernel overlaps the edge and the undefined area outside it, which is treated as an infinite space of virtual RGBA(0,0,0,0) pixels.
After the extent is changed to infinity, the original image pixels themselves still sit at their original origin, width, and height, so you will have no trouble rendering them into a target pixel buffer with the same origin, width, and height; the CIContext you use for this rendering will simply ignore the "virtual" pixels that fall outside the bounds of the target pixel buffer.
Keep in mind that your convolution may have unintended effects at the edges of your image due to the interaction with the virtual RGBA(0,0,0,0) pixels adjacent to them, making you think the rendering has gone wrong or misaligned things. Often such problems can be avoided by calling your CIImage's clampedToExtent() method before applying the convolution.

Generate Image from Pixel Array (fast)

I would like to generate a grid picture or bitmap or anything similar from raw pixel data in Swift. Since the pixel locations, image size, etc. are not determined before the user opens the app or presses a refresh button, I need a fast way to generate 2732x2048 or more individual pixels and display them on the screen.
First I used UIGraphicsBeginImageContextWithOptions and drew each pixel with a 1x1 CGRect, but this obviously did not scale well.
Afterwards I used this approach: Pixel Array to UIImage in Swift
But this is still kind of slow with the bigger screens.
Could something like this be done with MetalKit? I would assume that a lower-level API would render something like this much faster.
Or is there a better way to process something like this, somewhere in between MetalKit and Core Graphics?
Some info regarding the structure of my data:
There is a struct with the pixel color data (red, green, blue, alpha) for each individual pixel, stored in an Array, and two image-size variables: imageHeight and imageWidth.
The most performant way to do this is to use a Metal compute function.
Apple has good documentation illustrating GPU programming:
Performing Calculations on a GPU
Processing a Texture in a Compute Function
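For illustration, here is a minimal compute kernel in the spirit of Apple's "Processing a Texture in a Compute Function" sample, written in the Metal Shading Language (which is C++ based). The function name, buffer layout, and bindings are assumptions: it expects the pixel struct array flattened into packed RGBA bytes.
#include <metal_stdlib>
using namespace metal;

// One thread per pixel: read a packed RGBA8 value from the buffer and write it
// into the output texture at the same coordinate.
kernel void writePixels(texture2d<float, access::write> outTexture [[texture(0)]],
                        constant uchar4 *pixelData [[buffer(0)]],
                        constant uint &imageWidth [[buffer(1)]],
                        uint2 gid [[thread_position_in_grid]])
{
    // Guard against threads dispatched past the image edge.
    if (gid.x >= outTexture.get_width() || gid.y >= outTexture.get_height()) {
        return;
    }
    uchar4 p = pixelData[gid.y * imageWidth + gid.x];
    outTexture.write(float4(p) / 255.0f, gid);
}
On the Swift side you would follow the linked samples: build a compute pipeline from this function, wrap the pixel array in an MTLBuffer, dispatch a grid of imageWidth x imageHeight threads, and display the resulting texture (for example in an MTKView).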

sampler value of texture from DisparityFloat16 pixel format on iOS OpenGLES

I want to use the depthDataMap from the iPhone X TrueDepth camera as a texture in my OpenGL ES project. I have downloaded some Swift samples, and it seems the depth map can be created and sampled as a float texture with Metal. But on OpenGL ES, the only way I found to create a texture from the depth buffer is:
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, depthWidth, depthHeight, 0, GL_DEPTH_COMPONENT, GL_UNSIGNED_SHORT, CVPixelBufferGetBaseAddress(depthBuffer));
The sampled value is different from the value exported as a CIImage from the DisparityFloat16 pixel type. The value is much lower, and it is not on a linear scale compared to the CIImage.
This is the sampled value in OpenGL ES.
This is the value obtained via: CIImage *image = [CIImage imageWithCVImageBuffer:depthData.depthDataMap];
Does anyone have the same issue?
Well, it looks like you're specifying the pixel data type as GL_UNSIGNED_SHORT; try changing it to GL_HALF_FLOAT (if using DisparityFloat16) or GL_FLOAT (if using DisparityFloat32).
Also, if you want to display the depth buffer as a texture, you should convert the depth data to values that mean something in a grayscale image. If you normalize your depth-buffer values to integers between 0 and 255, your picture will look a whole lot better.
For more information, Apple has examples of this exact thing. They use Metal, but the principle would work with OpenGL too. Here's a really nice tutorial with some sample code that does this as well.
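To make the suggested change concrete, here is a hedged sketch of the upload, assuming an OpenGL ES 3.0 context and a kCVPixelFormatType_DisparityFloat16 pixel buffer; since there is one 16-bit half float per pixel, it is uploaded as a single-channel float texture rather than GL_DEPTH_COMPONENT:
CVPixelBufferLockBaseAddress(depthBuffer, kCVPixelBufferLock_ReadOnly);
size_t depthWidth = CVPixelBufferGetWidth(depthBuffer);
size_t depthHeight = CVPixelBufferGetHeight(depthBuffer);
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
// If CVPixelBufferGetBytesPerRow reports row padding, also set
// GL_UNPACK_ROW_LENGTH to bytesPerRow / 2 (in pixels, not bytes).
glTexImage2D(GL_TEXTURE_2D, 0, GL_R16F,
             (GLsizei)depthWidth, (GLsizei)depthHeight, 0,
             GL_RED, GL_HALF_FLOAT,
             CVPixelBufferGetBaseAddress(depthBuffer));
CVPixelBufferUnlockBaseAddress(depthBuffer, kCVPixelBufferLock_ReadOnly);
In the shader you would then sample the .r component and remap or normalize it into a displayable range, as suggested above.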

How to play Video using OpenGL in iOS?

I am trying to play a video using OpenGL ES 2.0 in iOS. I am not able to find sample code or a starting point for how to achieve this. Can anybody help me with this?
What you are looking for is getting a raw buffer for the video in real time. I believe you need to look into AVFoundation and somehow extract the CVPixelBufferRef. If I remember correctly you have a few ways: one is on demand at a specific time, another is for processing, where you get a fast iteration over the frames in a block, and the one you probably need is receiving the frames in real time. With this you can extract a raw RGB buffer, which needs to be pushed to a texture and then drawn to the render buffer.
I suggest you create a texture once (per video) and try making it as small as possible, but ensure that a video frame will fit. You might need POT (power-of-two) textures, so to get the texture dimension from the video width you need something like:
GLint textureWidth = 1;
while (textureWidth < videoWidth) textureWidth <<= 1; // multiply by 2 until it covers the video width
So the texture size is expected to be larger than the video. To push the data into the texture you then use glTexSubImage2D, which expects a pointer to your raw data plus the rectangle in which to store it, which here is (0, 0, sampleWidth, sampleHeight). The texture coordinates must then be computed so that they are not in the range [0, 1] but, for x, [0, sampleWidth/textureWidth] (and likewise for y).
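As a hedged sketch of that upload path (videoTexture, textureHeight, sampleWidth, sampleHeight, and sampleRGBABuffer are illustrative names, and the frames are assumed to already be converted to RGBA):
// One-time setup: allocate an empty POT texture big enough for a video frame.
GLuint videoTexture;
glGenTextures(1, &videoTexture);
glBindTexture(GL_TEXTURE_2D, videoTexture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, textureWidth, textureHeight, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, NULL); // NULL: reserve storage, no data yet

// Per new sample: copy the frame into the lower-left corner of the texture.
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, sampleWidth, sampleHeight,
                GL_RGBA, GL_UNSIGNED_BYTE, sampleRGBABuffer);

// Quad texture coordinates then run from 0 to sampleWidth/(float)textureWidth in x
// and 0 to sampleHeight/(float)textureHeight in y instead of [0, 1].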
So then you just need to put it all together:
Have a system to keep generating the video raw sample buffers
Generate a texture to fit video size
On new sample update the texture using glTexSubImage2D (watch out for threads)
After the data is loaded into the texture draw the texture as full screen rectangle (watch out for threads)
You might need to watch out for video orientation and transformation, so if possible test your system with a few videos that have been recorded on the device in different orientations. I think there is now support for receiving the buffers already correctly oriented, but by default the samples at least used to be "wrong": a portrait-recorded video still had its samples in landscape, and a transformation matrix or orientation was given with the asset.

GPU Texture Splatting

Just as a quick example, I'm trying to do the following:
[texture 1] + [texture 2] + [alpha map] = [blended result]
With the third image as an alpha map, how could this be implemented in a DX9-compatible pixel shader to "blend" between the first two images, creating an effect similar to the fourth image?
Furthermore, how could this newly created texture be given back to the CPU, where it could be placed back inside the original array of textures?
The rough way is to blend the colors of the textures with the alpha map and return the result from the pixel shader:
float alpha = tex2D(AlphaSampler,TexCoord).r;
float3 texture1 = tex2D(Texture1Sampler,TexCoord).rgb;
float3 texture2 = tex2D(Texture2Sampler,TexCoord).rgb;
float3 color = lerp(texture1,texture2,alpha);
return float4(color.rgb,1);
For that you need a texture as render target (doc) with the size of the input textures, and a full-screen quad as geometry for rendering; an XYZRHW quad would be the easiest. You can then use this texture for further rendering. If you want to read the texels back, or do anything else where you must lock the result, you can use StretchRect (doc) or UpdateSurface (doc) to copy the data into a normal texture.
If performance isn't important (e.g. you preprocess the textures), you could compute this more easily on the CPU (but it's slower): lock the four textures, iterate over the pixels, and merge them directly.
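For illustration, a minimal CPU-side sketch of that merge, assuming the locked surfaces are exposed as tightly packed 8-bit buffers (the function and buffer names are made up, not D3D9 API):
#include <cstdint>
#include <cstddef>

// out = lerp(tex1, tex2, alpha), per pixel and per channel.
// tex1, tex2, and out are width*height RGBA8 pixels; alphaMap is one byte per pixel.
void blendTextures(const uint8_t* tex1, const uint8_t* tex2,
                   const uint8_t* alphaMap, uint8_t* out,
                   std::size_t width, std::size_t height)
{
    for (std::size_t i = 0; i < width * height; ++i) {
        const int a = alphaMap[i]; // 0 = only tex1, 255 = only tex2
        for (int c = 0; c < 4; ++c) {
            const int p1 = tex1[i * 4 + c];
            const int p2 = tex2[i * 4 + c];
            out[i * 4 + c] = static_cast<uint8_t>(p1 + ((p2 - p1) * a) / 255);
        }
    }
}
With real D3D9 surfaces you would also have to respect the pitch returned by the lock and any format differences, which is exactly the bookkeeping the GPU path avoids.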
