I really need to get an RGB buffer with 8 bits per channel from the GPU.
I need to pass it to a trained convolutional neural network, which only accepts data in that format.
I can't convert it on the CPU, as I'm already heavily CPU-bound and the conversion is quite slow.
I currently have an FBO with a renderbuffer attached, which is defined with:
glRenderbufferStorage(GL_RENDERBUFFER, GL_RGB8_OES, bufferWidth, bufferHeight);
There are no errors when I bind, define and render to the buffer.
But when I use
glReadPixels(0, 0, bufferWidth, bufferHeight, GL_RGB, GL_UNSIGNED_BYTE, rgbBufferRawName);
it gives an invalid enum error (0x0500). It works just fine when I pass GL_RED_EXT or GL_RGBA and produces correct buffers (I've checked it by uploading those buffers to a texture and rendering them, and they looked correct).
I tried setting glPixelStorei(GL_PACK_ALIGNMENT, 1); but that made no difference.
I'm on iOS 10 with an iPhone 6. I was using ES 2.0, but have now tried switching to ES 3.0 in the hope that it would solve the problem. It did not.
I would really appreciate help in getting an RGB8 buffer by any means.
Thanks.
According to the OpenGL ES 3.0 specification, GL_RGB is not a valid value for format:
https://www.khronos.org/opengles/sdk/docs/man3/html/glReadPixels.xhtml
You may want to either convert to RGB after retrieving the GL_RGBA-formatted buffer, or adjust your algorithm to work with RGBA.
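If you do stay with GL_RGBA reads, it may also be worth checking the one extra format/type pair your implementation guarantees besides GL_RGBA/GL_UNSIGNED_BYTE. A minimal sketch, reusing the buffer names from the question (rgbaScratchBuffer is an illustrative extra allocation):
GLint readFormat = 0, readType = 0;
glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_FORMAT, &readFormat);
glGetIntegerv(GL_IMPLEMENTATION_COLOR_READ_TYPE, &readType);
if (readFormat == GL_RGB && readType == GL_UNSIGNED_BYTE) {
    // This implementation happens to support tightly packed RGB reads.
    glPixelStorei(GL_PACK_ALIGNMENT, 1);
    glReadPixels(0, 0, bufferWidth, bufferHeight, GL_RGB, GL_UNSIGNED_BYTE, rgbBufferRawName);
} else {
    // Fall back to the always-supported pair and strip the alpha channel afterwards.
    glReadPixels(0, 0, bufferWidth, bufferHeight, GL_RGBA, GL_UNSIGNED_BYTE, rgbaScratchBuffer);
}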
I want to use the depthDataMap from the iPhone X TrueDepth camera as a texture in my OpenGL ES project. I have downloaded some Swift samples; it seems that the depth map can be created and sampled as a float texture in Metal. But in OpenGL ES, the only way I found to create a depth texture from the depth buffer is:
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, depthWidth, depthHeight, 0, GL_DEPTH_COMPONENT, GL_UNSIGNED_SHORT, CVPixelBufferGetBaseAddress(depthBuffer));
The sampled value is different from the value exported as a CIImage from the DisparityFloat16 pixel type: it is much lower, and not on a linear scale compared to the CIImage obtained via
CIImage *image = [CIImage imageWithCVImageBuffer:depthData.depthDataMap];
Does anyone have the same issue?
It looks like you're specifying the pixel data type as GL_UNSIGNED_SHORT; try changing it to GL_HALF_FLOAT (if using DisparityFloat16) or GL_FLOAT (if using DisparityFloat32).
Also, if you want to display the depth buffer as a texture, you should convert the depth data to values that mean something in a grayscale image. If you normalize your depth buffer values to integers between 0 and 255, your picture will look a whole lot better.
For more information, Apple has examples of this exact thing. They use Metal, but the principle would work with OpenGL too. Here's a really nice tutorial with some sample code that does this as well.
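For the normalization suggestion above, a minimal CPU-side sketch, assuming the depth data has already been copied into a float array (depthFloats, gray8 and the dimension variables are illustrative names; a real version should also skip NaN or infinite disparity values):
// Find the value range, then map it linearly onto 0..255 for a grayscale image.
float minD = FLT_MAX, maxD = -FLT_MAX;   // FLT_MAX comes from <float.h>
for (size_t i = 0; i < (size_t)depthWidth * depthHeight; i++) {
    if (depthFloats[i] < minD) minD = depthFloats[i];
    if (depthFloats[i] > maxD) maxD = depthFloats[i];
}
float range = (maxD > minD) ? (maxD - minD) : 1.0f;
for (size_t i = 0; i < (size_t)depthWidth * depthHeight; i++) {
    gray8[i] = (uint8_t)(255.0f * (depthFloats[i] - minD) / range);
}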
ARKit runs at 60 frames/sec, which equates to 16.6ms per frame.
My current code to convert the CVPixelBufferRef (kCVPixelFormatType_420YpCbCr8BiPlanarFullRange format) to a cv::Mat (YCrCb) runs in 30ms, which causes ARKit to stall and everything to lag.
Does anyone have any ideas on how to do a quicker conversion, or do I need to drop the frame rate?
There is a suggestion by Apple to use Metal, but I'm not sure how to do that.
Also, I could just take the grayscale plane (the first channel), which runs in <1 ms, but ideally I need the colour information as well.
In order to process an image in a pixel buffer using Metal, you need to do the following.
Call CVMetalTextureCacheCreateTextureFromImage to create CVMetalTexture object on top of the pixel buffer.
Call CVMetalTextureGetTexture to create a MTLTexture object, which Metal code (GPU) can read and write.
Write some Metal code to convert the color format.
I have an open source project (https://github.com/snakajima/vs-metal), which processes pixel buffers (from camera, not ARKit) using Metal. Feel free to copy any code from this project.
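A minimal sketch of the first two steps, assuming device is your MTLDevice and pixelBuffer is the CVPixelBufferRef (e.g. ARFrame.capturedImage); only the luma plane is wrapped here, the CbCr plane would use MTLPixelFormatRG8Unorm and plane index 1:
CVMetalTextureCacheRef textureCache = NULL;
CVMetalTextureCacheCreate(kCFAllocatorDefault, NULL, device, NULL, &textureCache);

CVMetalTextureRef cvTexture = NULL;
CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault, textureCache, pixelBuffer, NULL,
                                          MTLPixelFormatR8Unorm,   // luma plane is single-channel 8-bit
                                          CVPixelBufferGetWidthOfPlane(pixelBuffer, 0),
                                          CVPixelBufferGetHeightOfPlane(pixelBuffer, 0),
                                          0,                       // plane index 0 = Y
                                          &cvTexture);
id<MTLTexture> yTexture = CVMetalTextureGetTexture(cvTexture);
// yTexture can now be read/written by Metal shaders; remember to CFRelease(cvTexture)
// once the GPU is done with the frame.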
I tried converting YCbCr to RGB, doing the image processing on the RGB mat, and converting it back to YCbCr; it worked very slowly. I suggest only doing that for a static image. For realtime processing, we should process directly on the cv::Mat. ARFrame.capturedImage is a YCbCr buffer, so the solution is:
Separate the buffer into 2 cv::Mats (yPlane and cbcrPlane). Keep in mind that we do not clone the memory; we create the 2 cv::Mats with their base addresses set to the yPlane and cbcrPlane addresses.
Do the image processing on yPlane and cbcrPlane; size(cbcrPlane) = size(yPlane) / 2.
You can check out my code here: https://gist.github.com/ttruongatl/bb6c69659c48bac67826be7368560216
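A minimal sketch of the plane-wrapping step, assuming pixelBuffer is the CVPixelBufferRef from ARFrame.capturedImage (names are illustrative; the buffer must stay locked while the Mats are in use):
CVPixelBufferLockBaseAddress(pixelBuffer, 0);

cv::Mat yPlane((int)CVPixelBufferGetHeightOfPlane(pixelBuffer, 0),
               (int)CVPixelBufferGetWidthOfPlane(pixelBuffer, 0),
               CV_8UC1,
               CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0),
               CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0));   // a view, no copy

cv::Mat cbcrPlane((int)CVPixelBufferGetHeightOfPlane(pixelBuffer, 1),
                  (int)CVPixelBufferGetWidthOfPlane(pixelBuffer, 1),
                  CV_8UC2,                                            // interleaved Cb and Cr
                  CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1),
                  CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1));

// ... process yPlane / cbcrPlane in place ...

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);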
Can somebody tell me if it is possible to use full precision floating point 2D textures on the iPad 2? (full precision = single precision)
By printing out the implemented OpenGL extensions on the iPad 2 using
glGetString(GL_EXTENSIONS)
I found that both OES_texture_half_float and OES_texture_float are supported.
However, using GL_HALF_FLOAT_OES as the texture's type works fine,
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, w, h, 0, GL_RGBA, GL_HALF_FLOAT_OES, NULL);
whereas using GL_FLOAT results in an incomplete framebuffer object.
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, w, h, 0, GL_RGBA, GL_FLOAT, NULL);
Am I doing something wrong here, or are full precision floating point textures just not supported?
Thank you in advance.
The OES_texture_float extension provides for 32-bit floating point textures to be used as inputs, but that doesn't mean that you can render into them. The EXT_color_buffer_half_float extension adds the capability for iOS devices (I believe A5 GPUs and higher) to render into 16-bit half float textures, but not 32-bit full float ones.
I don't believe that any of the current iOS devices allow for rendering into full 32-bit float textures, just to use them as inputs when rendering a scene.
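A minimal sketch of checking renderability at runtime rather than inferring it from the extension string alone, reusing w and h from the question (the NEAREST filtering shown is the usual requirement for float attachments):
GLuint tex = 0, fbo = 0;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, w, h, 0, GL_RGBA, GL_FLOAT, NULL);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);

if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
    // Rendering into a full-float texture isn't supported on this device;
    // fall back to GL_HALF_FLOAT_OES or an 8-bit format.
}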
I want to record images, rendered with OpenGL, into a movie file with the help of AVAssetWriter. The problem is that the only way to access pixels from an OpenGL framebuffer is glReadPixels, which on iOS only supports the RGBA pixel format. But AVAssetWriter doesn't support this format; here I can use either ARGB or BGRA. As the alpha values can be ignored, I came to the conclusion that the fastest way to convert RGBA to ARGB would be to give glReadPixels the buffer shifted by one byte:
UInt8 *buffer = malloc(width*height*4+1);
glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, buffer+1);
The problem is that the glReadPixels call leads to an EXC_BAD_ACCESS crash. If I don't shift the buffer by one byte, it works perfectly (but obviously with the wrong colors in the video file). What's the problem here?
I came to the conclusion, that the fastest way to convert RGBA to ARGB would be to give glReadPixels the buffer shifted by one byte
This will, however, shift your alpha values by one pixel as well. Here's another suggestion:
Render the picture to a texture (using an FBO with that texture as color attachment). Then render that texture to another framebuffer with a swizzling fragment shader:
#version ...
precision mediump float;
uniform sampler2D image;
uniform vec2 image_dim;
void main()
{
    // We want to address texel centers by absolute fragment coordinates; this
    // requires a bit of work (OpenGL ES SL doesn't provide a texelFetch function).
    gl_FragColor.rgba =
        texture2D(image, vec2( (2.0*gl_FragCoord.x + 1.0)/(2.0*image_dim.x),
                               (2.0*gl_FragCoord.y + 1.0)/(2.0*image_dim.y) )
        ).argb; // this swizzles RGBA into ARGB order if read into an RGBA buffer
}
What happens if you put an extra 128 bytes of slack on the end of your buffer? It might be that OpenGL is trying to fill 4/8/16/etc bytes at a time for performance, and has a bug when the buffer is non-aligned or something. It wouldn't be the first time a performance optimization in OpenGL had issues on an edge case :)
Try calling
glPixelStorei(GL_PACK_ALIGNMENT,1)
before glReadPixels.
From the docs:
GL_PACK_ALIGNMENT
Specifies the alignment requirements for the start of each pixel row in memory.
The allowable values are
1 (byte-alignment),
2 (rows aligned to even-numbered bytes),
4 (word-alignment), and
8 (rows start on double-word boundaries).
The default value is 4 (see glGet). This often gets mentioned as a troublemaker in various "OpenGL pitfalls" type lists, although this is generally more to do with its row padding effects than buffer alignment.
As an alternative approach, what happens if you malloc 4 extra bytes, do the glReadPixels 4-byte aligned starting at buffer+4, and then pass buffer+3 to your AVAssetWriter (although I've no idea whether AVAssetWriter is more tolerant of alignment issues)?
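A sketch of that alternative (illustrative only; the first pixel's "alpha" byte is garbage and every subsequent alpha actually belongs to the previous pixel, which is fine if alpha is ignored):
UInt8 *buffer = malloc(width * height * 4 + 4);   // 4 bytes of slack at the front
glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, buffer + 4);  // 4-byte-aligned destination
// Read from buffer + 3, each 4-byte group is X R G B, i.e. ARGB order,
// where X is the previous pixel's alpha. Hand buffer + 3 to AVAssetWriter,
// assuming it tolerates the unaligned base address.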
You will need to shift bytes by doing a memcpy or other copy operation. Modifying the pointers will leave them unaligned, which may or may not be within the capabilities of any underlying hardware (DMA bus widths, tile granularity, etc.)
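If you do go the copy route, a minimal repacking loop might look like this (rgba is the buffer filled by glReadPixels, argb an equally sized destination; both names are illustrative):
for (size_t i = 0; i < (size_t)width * height; i++) {
    argb[4*i + 0] = rgba[4*i + 3];  // A
    argb[4*i + 1] = rgba[4*i + 0];  // R
    argb[4*i + 2] = rgba[4*i + 1];  // G
    argb[4*i + 3] = rgba[4*i + 2];  // B
}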
Using buffer+1 will mean the data is not written at the start of your malloc'd memory, but rather one byte in, so it will be writing over the end of your malloc'd memory, causing the crash.
If iOS's glReadPixels will only accept GL_RGBA then you'll have to go through and re-arrange them yourself I think.
UPDATE: sorry, I missed the +1 in your malloc; StilesCrisis is probably right about the cause of the crash.
I'm writing a media player framework for Apple TV, using OpenGL ES and ffmpeg.
Conversion to RGBA is required for rendering with OpenGL ES; software conversion using swscale is unbearably slow, so using information from the internet I came up with two ideas: using NEON (like here) or using fragment shaders with GL_LUMINANCE and GL_LUMINANCE_ALPHA.
As I know almost nothing about OpenGL, the second option still doesn't work :)
Can you give me any pointers how to proceed?
Thank you in advance.
It is most definitely worthwhile learning OpenGL ES 2.0 shaders:
You can load-balance between the GPU and CPU (e.g. video decoding of subsequent frames while GPU renders the current frame).
Video frames need to go to the GPU in any case: using YCbCr saves you 25% bus bandwidth if your video has 4:2:0 sampled chrominance.
You get 4:2:0 chrominance up-sampling for free, with the GPU hardware interpolator. (Your shader should be configured to use the same vertex coordinates for both Y and C{b,r} textures, in effect stretching the chrominance texture out over the same area.)
On iOS5 pushing YCbCr textures to the GPU is fast (no data-copy or swizzling) with the texture cache (see the CVOpenGLESTextureCache* API functions). You will save 1-2 data-copies compared to NEON.
I am using these techniques to great effect in my super-fast iPhone camera app, SnappyCam.
You are on the right track for implementation: use a GL_LUMINANCE texture for Y and GL_LUMINANCE_ALPHA if your CbCr is interleaved. Otherwise use three GL_LUMINANCE textures if all of your YCbCr components are noninterleaved.
Creating two textures for 4:2:0 bi-planar YCbCr (where CbCr is interleaved) is straightforward:
glBindTexture(GL_TEXTURE_2D, texture_y);
glTexImage2D(
GL_TEXTURE_2D,
0,
GL_LUMINANCE, // Texture format (8bit)
width,
height,
0, // No border
GL_LUMINANCE, // Source format (8bit)
GL_UNSIGNED_BYTE, // Source data format
NULL
);
glBindTexture(GL_TEXTURE_2D, texture_cbcr);
glTexImage2D(
GL_TEXTURE_2D,
0,
GL_LUMINANCE_ALPHA, // Texture format (16-bit)
width / 2,
height / 2,
0, // No border
GL_LUMINANCE_ALPHA, // Source format (16-bits)
GL_UNSIGNED_BYTE, // Source data format
NULL
);
where you would then use glTexSubImage2D() or the iOS5 texture cache to update these textures.
I'd also recommend using a 2D varying that spans the texture coordinate space (x: [0,1], y: [0,1]) so that you avoid any dependent texture reads in your fragment shader. The end result is super-fast and doesn't load the GPU at all in my experience.
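For completeness, here is a sketch of a matching fragment shader. The varying and uniform names (v_texcoord, texture_y, texture_cbcr) are illustrative, and the coefficients assume full-range BT.601 YCbCr; video-range or BT.709 content needs different constants:
precision mediump float;
varying vec2 v_texcoord;          // the 2D varying mentioned above
uniform sampler2D texture_y;      // GL_LUMINANCE: Y in .r
uniform sampler2D texture_cbcr;   // GL_LUMINANCE_ALPHA: Cb in .r, Cr in .a
void main()
{
    float y    = texture2D(texture_y,    v_texcoord).r;
    vec2  cbcr = texture2D(texture_cbcr, v_texcoord).ra - vec2(0.5);
    gl_FragColor = vec4(y + 1.402 * cbcr.y,
                        y - 0.344 * cbcr.x - 0.714 * cbcr.y,
                        y + 1.772 * cbcr.x,
                        1.0);
}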
Converting YUV to RGB using NEON is very slow. Use a shader to offload onto the GPU.