Get RGB "CVPixelBuffer" from ARKit - ios

I'm trying to get a CVPixelBuffer in RGB color space from Apple's ARKit. In the func session(_ session: ARSession, didUpdate frame: ARFrame) method of ARSessionDelegate I get an instance of ARFrame. On the page Displaying an AR Experience with Metal I found that this pixel buffer is in the YCbCr (YUV) color space.
I need to convert this to the RGB color space (I actually need a CVPixelBuffer, not a UIImage). I've found something about color conversion on iOS but I was not able to get it working in Swift 3.

There are several ways to do this, depending on what you're after. The best way to do it in real time (say, to render the buffer to a view) is to use a custom shader to convert the YCbCr CVPixelBuffer to RGB.
Using Metal:
If you make a new project, select "Augmented Reality App," and select "Metal" for the content technology, the project generated will contain the code and shaders necessary to make this conversion.
Using OpenGL:
The GLCameraRipple example from Apple uses an AVCaptureSession to capture the camera, and shows how to map the resulting CVPixelBuffer to GL textures, which are then converted to RGB in shaders (again, provided in the example).
Non Realtime:
The answer to this Stack Overflow question addresses converting the buffer to a UIImage, and offers a pretty simple way to do it.
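If you do need an actual RGB CVPixelBuffer rather than a UIImage, one non-realtime option is to let Core Image do the conversion while rendering into a buffer you create yourself. A minimal sketch of that idea (the helper name and the BGRA output format are my choices, not from the question):

import CoreImage
import CoreVideo

// Sketch: convert ARKit's YCbCr pixel buffer into a new BGRA CVPixelBuffer.
// CIContext.render(_:to:) performs the colour-space conversion for us.
func convertToBGRA(_ source: CVPixelBuffer,
                   context: CIContext = CIContext()) -> CVPixelBuffer? {
    var converted: CVPixelBuffer?
    CVPixelBufferCreate(kCFAllocatorDefault,
                        CVPixelBufferGetWidth(source),
                        CVPixelBufferGetHeight(source),
                        kCVPixelFormatType_32BGRA,
                        nil, // add IOSurface/compatibility attributes here if needed
                        &converted)
    guard let output = converted else { return nil }
    context.render(CIImage(cvPixelBuffer: source), to: output)
    return output
}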

I was also stuck on this question for several days. All of the code snippets I could find on the Internet for converting a CVPixelBuffer to a UIImage were written in Objective-C rather than Swift.
Finally, the following code snippet works perfectly for me, converting a YUV image to either JPG or PNG file format so that you can then write it to a local file in your application.
func pixelBufferToUIImage(pixelBuffer: CVPixelBuffer) -> UIImage {
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
    let context = CIContext(options: nil)
    let cgImage = context.createCGImage(ciImage, from: ciImage.extent)
    let uiImage = UIImage(cgImage: cgImage!)
    return uiImage
}
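As a follow-up, writing the result out to disk might look like this (a sketch using the Swift 3-era API; it assumes frame is the ARFrame from session(_:didUpdate:), and the file name and compression quality are arbitrary):

// Sketch: save the converted frame to the app's Documents directory.
let image = pixelBufferToUIImage(pixelBuffer: frame.capturedImage)
if let data = UIImageJPEGRepresentation(image, 0.9) {
    let url = FileManager.default
        .urls(for: .documentDirectory, in: .userDomainMask)[0]
        .appendingPathComponent("frame.jpg")
    try? data.write(to: url)
}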

The docs explicitly say that you need to access the luma and chroma planes:
ARKit captures pixel buffers in a planar YCbCr (also known as YUV) format. To render these images on a device display, you'll need to access the luma and chroma planes of the pixel buffer and convert pixel values to an RGB format.
So there's no way to directly get the RGB planes; you'll have to handle this in your shaders, either in Metal or OpenGL, as described by @joshue.
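For reference, the conversion those shaders apply is just a matrix multiply. Here is the same math written out in Swift for a single pixel, using the matrix found in Apple's "Augmented Reality App" Metal template (illustrative only; in practice you'd do this per fragment on the GPU):

import simd

// The per-pixel YCbCr -> RGB transform from the Metal template's fragment
// shader, for y, cb, cr each normalized to [0, 1] (full-range BT.601).
func rgb(y: Float, cb: Float, cr: Float) -> SIMD3<Float> {
    let ycbcrToRGBTransform = float4x4([
        SIMD4<Float>(+1.0000, +1.0000, +1.0000, +0.0000),
        SIMD4<Float>(+0.0000, -0.3441, +1.7720, +0.0000),
        SIMD4<Float>(+1.4020, -0.7141, +0.0000, +0.0000),
        SIMD4<Float>(-0.7010, +0.5291, -0.8860, +1.0000),
    ])
    let result = ycbcrToRGBTransform * SIMD4<Float>(y, cb, cr, 1.0)
    return SIMD3<Float>(result.x, result.y, result.z)
}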

You may want the Accelerate framework's image conversion functions, perhaps a combination of vImageConvert_420Yp8_Cb8_Cr8ToARGB8888 and vImageConvert_ARGB8888toRGB888 (if you don't want the alpha channel). In my experience these work in real time.
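ARKit's capturedImage is bi-planar (kCVPixelFormatType_420YpCbCr8BiPlanarFullRange), so the bi-planar sibling vImageConvert_420Yp8_CbCr8ToARGB8888 is the closer fit. A sketch under that assumption (the BT.601 matrix, the full-range pixel range, and the helper name are my choices; verify them against your buffer, and in real code cache the generated conversion):

import Accelerate
import CoreVideo

// Sketch: convert a bi-planar full-range YCbCr buffer into a 32BGRA buffer.
func convertYpCbCrToBGRA(_ pixelBuffer: CVPixelBuffer,
                         into output: CVPixelBuffer) -> vImage_Error {
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    CVPixelBufferLockBaseAddress(output, [])
    defer {
        CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly)
        CVPixelBufferUnlockBaseAddress(output, [])
    }

    // Wrap the luma plane, the interleaved CbCr plane, and the destination.
    var srcYp = vImage_Buffer(
        data: CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0),
        height: vImagePixelCount(CVPixelBufferGetHeightOfPlane(pixelBuffer, 0)),
        width: vImagePixelCount(CVPixelBufferGetWidthOfPlane(pixelBuffer, 0)),
        rowBytes: CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0))
    var srcCbCr = vImage_Buffer(
        data: CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1),
        height: vImagePixelCount(CVPixelBufferGetHeightOfPlane(pixelBuffer, 1)),
        width: vImagePixelCount(CVPixelBufferGetWidthOfPlane(pixelBuffer, 1)),
        rowBytes: CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1))
    var dest = vImage_Buffer(
        data: CVPixelBufferGetBaseAddress(output),
        height: vImagePixelCount(CVPixelBufferGetHeight(output)),
        width: vImagePixelCount(CVPixelBufferGetWidth(output)),
        rowBytes: CVPixelBufferGetBytesPerRow(output))

    // Describe the conversion once. The pixel range assumes 8-bit full range.
    var pixelRange = vImage_YpCbCrPixelRange(
        Yp_bias: 0, CbCr_bias: 128, YpRangeMax: 255, CbCrRangeMax: 255,
        YpMax: 255, YpMin: 0, CbCrMax: 255, CbCrMin: 0)
    var info = vImage_YpCbCrToARGB()
    vImageConvert_YpCbCrToARGB_GenerateConversion(
        kvImage_YpCbCrToARGBMatrix_ITU_R_601_4!, &pixelRange, &info,
        kvImage420Yp8_CbCr8, kvImageARGB8888, vImage_Flags(kvImageNoFlags))

    // Permute ARGB -> BGRA so the result matches a 32BGRA destination buffer.
    let permuteMap: [UInt8] = [3, 2, 1, 0]
    return vImageConvert_420Yp8_CbCr8ToARGB8888(
        &srcYp, &srcCbCr, &dest, &info, permuteMap, 255,
        vImage_Flags(kvImageNoFlags))
}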

I struggled with this for a long while as well, and ended up writing the following code, which works for me:
// Helper macro to ensure pixel values are bounded between 0 and 255
#define clamp(a) (a > 255 ? 255 : (a < 0 ? 0 : a))

- (void)processImageBuffer:(CVImageBufferRef)imageBuffer
{
    OSType type = CVPixelBufferGetPixelFormatType(imageBuffer);
    if (type == kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)
    {
        CVPixelBufferLockBaseAddress(imageBuffer, 0);

        // We know the layout at the base address from the YpCbCr8BiPlanarFullRange format (as per the docs)
        uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);

        // Get the number of bytes per row for the pixel buffer, plus its width and height
        size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
        size_t width = CVPixelBufferGetWidth(imageBuffer);
        size_t height = CVPixelBufferGetHeight(imageBuffer);

        // Get buffer info and planar pixel data
        CVPlanarPixelBufferInfo_YCbCrBiPlanar *bufferInfo = (CVPlanarPixelBufferInfo_YCbCrBiPlanar *)baseAddress;
        uint8_t *cbrBuff = (uint8_t *)CVPixelBufferGetBaseAddressOfPlane(imageBuffer, 1);

        // This just moves the pointer past the offset, to the start of the luma plane
        baseAddress = (uint8_t *)CVPixelBufferGetBaseAddressOfPlane(imageBuffer, 0);

        int bytesPerPixel = 4;
        uint8_t *rgbData = rgbFromYCrCbBiPlanarFullRangeBuffer(baseAddress,
                                                               cbrBuff,
                                                               bufferInfo,
                                                               width,
                                                               height,
                                                               bytesPerRow);

        [self doStuffOnRGBBuffer:rgbData width:width height:height bitsPerComponent:8 bytesPerPixel:bytesPerPixel bytesPerRow:bytesPerRow];

        free(rgbData);
        CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
    }
    else
    {
        NSLog(@"Unsupported image buffer type");
    }
}

uint8_t *rgbFromYCrCbBiPlanarFullRangeBuffer(uint8_t *inBaseAddress,
                                             uint8_t *cbCrBuffer,
                                             CVPlanarPixelBufferInfo_YCbCrBiPlanar *inBufferInfo,
                                             size_t inputBufferWidth,
                                             size_t inputBufferHeight,
                                             size_t inputBufferBytesPerRow)
{
    int bytesPerPixel = 4;
    NSUInteger yPitch = EndianU32_BtoN(inBufferInfo->componentInfoY.rowBytes);
    NSUInteger cbCrPitch = EndianU32_BtoN(inBufferInfo->componentInfoCbCr.rowBytes);
    uint8_t *rgbBuffer = (uint8_t *)malloc(inputBufferWidth * inputBufferHeight * bytesPerPixel);
    uint8_t *yBuffer = (uint8_t *)inBaseAddress;

    for (int y = 0; y < inputBufferHeight; y++)
    {
        uint8_t *rgbBufferLine = &rgbBuffer[y * inputBufferWidth * bytesPerPixel];
        uint8_t *yBufferLine = &yBuffer[y * yPitch];
        // Chroma is subsampled 2x2, so one CbCr row serves two luma rows
        uint8_t *cbCrBufferLine = &cbCrBuffer[(y >> 1) * cbCrPitch];

        for (int x = 0; x < inputBufferWidth; x++)
        {
            int16_t luma = yBufferLine[x];
            // Cb and Cr are interleaved; each pair covers two horizontal pixels
            int16_t cb = cbCrBufferLine[x & ~1] - 128;
            int16_t cr = cbCrBufferLine[x | 1] - 128;

            uint8_t *rgbOutput = &rgbBufferLine[x * bytesPerPixel];

            int16_t r = (int16_t)roundf(luma + cr * 1.4);
            int16_t g = (int16_t)roundf(luma + cb * -0.343 + cr * -0.711);
            int16_t b = (int16_t)roundf(luma + cb * 1.765);

            // ABGR image representation
            rgbOutput[0] = 0xff;
            rgbOutput[1] = clamp(b);
            rgbOutput[2] = clamp(g);
            rgbOutput[3] = clamp(r);
        }
    }
    return rgbBuffer;
}

Related

Metal Texture is not filterable

I am trying to mipmap a texture contained in an MTLTexture object. This texture was loaded from an OpenCV Mat. I can correctly run kernels on this texture, so I know my import process is correct.
Unfortunately, generating the mipmaps fails with this rather opaque error. I get a similar error even if I change temp to be BGRA.
-[MTLDebugBlitCommandEncoder generateMipmapsForTexture:]:1074:
failed assertion `tex(MTLPixelFormatR8Uint) is not filterable.'
// create an MTL Texture
{
    MTLTextureDescriptor *textureDescriptor = [MTLTextureDescriptor
        texture2DDescriptorWithPixelFormat:MTLPixelFormatR8Uint
                                     width:cols
                                    height:rows
                                 mipmapped:NO];
    textureDescriptor.usage = MTLTextureUsageShaderRead;
    _mImgTex = [_mDevice newTextureWithDescriptor:textureDescriptor];
}
{
    MTLTextureDescriptor *textureDescriptor = [MTLTextureDescriptor
        texture2DDescriptorWithPixelFormat:MTLPixelFormatR8Uint
                                     width:cols
                                    height:rows
                                 mipmapped:YES];
    textureDescriptor.mipmapLevelCount = 5;
    textureDescriptor.usage = MTLTextureUsageShaderRead | MTLTextureUsageShaderWrite;
    _mPyrTex = [_mDevice newTextureWithDescriptor:textureDescriptor];
}

// copy data to GPU
cv::Mat temp;
cv::cvtColor(image, temp, cv::COLOR_BGRA2GRAY);
MTLRegion region = MTLRegionMake2D(0, 0, cols, rows);
const int bytesPerPixel = 1 * 1; // 1 byte * 1 channel
const int bytesPerRow = bytesPerPixel * cols;
[_mImgTex replaceRegion:region mipmapLevel:0 withBytes:temp.data bytesPerRow:bytesPerRow];

// try to mipmap
id<MTLBlitCommandEncoder> blitEncoder = [commandBuffer blitCommandEncoder];
MTLOrigin origin = MTLOriginMake(0, 0, 0);
MTLSize size = MTLSizeMake(cols, rows, 1);
[blitEncoder copyFromTexture:_mImgTex sourceSlice:0 sourceLevel:0 sourceOrigin:origin sourceSize:size toTexture:_mPyrTex destinationSlice:0 destinationLevel:0 destinationOrigin:origin];
[blitEncoder generateMipmapsForTexture:_mPyrTex];
[blitEncoder endEncoding];
The documentation for generateMipmapsForTexture: says:
Mipmap generation works only for textures with color-renderable and color-filterable pixel formats.
If you look at the "Pixel Format Capabilities" table here, you can see that R8Uint supports neither Filter nor Color (i.e. it is not colour-renderable).
Perhaps R8Unorm (MTLPixelFormatR8Unorm) will work well for your needs. Otherwise you might need to write your own mip generation code with compute (although I'm not sure there's a use case for mipmaps of non-filterable textures).
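For example, recreating the pyramid texture with a normalized format might look like this (a Swift sketch; cols and rows stand in for the question's dimensions):

import Metal

// Sketch: an .r8Unorm pyramid texture; this format is colour-renderable and
// filterable, so generateMipmaps(for:) should accept it.
let device = MTLCreateSystemDefaultDevice()!
let cols = 640, rows = 480 // placeholder dimensions

let descriptor = MTLTextureDescriptor.texture2DDescriptor(
    pixelFormat: .r8Unorm, // instead of .r8Uint
    width: cols,
    height: rows,
    mipmapped: true)
descriptor.mipmapLevelCount = 5
descriptor.usage = [.shaderRead, .shaderWrite]
let pyramidTexture = device.makeTexture(descriptor: descriptor)

Note that kernels reading an .r8Unorm texture see normalized floats in [0, 1] rather than integer values, so shader code may need a matching tweak.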

Blit pvrtc texture metal

I'm trying to blit a buffer to a PVRTC texture. The reason I'm doing this is that I want to keep the texture in private storage.
Here is a quote from the documentation:
If the texture's pixel format is a compressed format, then sourceSize must be a multiple of the pixel format's block size or be clamped to the edge of the texture if the block extends outside the bounds of a texture. For a compressed format, sourceBytesPerRow is the number of bytes from the start of one row of blocks to the start of the next row of blocks.
Something is wrong in my code, because the texture looks broken afterwards.
MTLBlitOption options = MTLBlitOptionNone;
if (_pixelFormat == MTLPixelFormatPVRTC_RGB_4BPP || _pixelFormat == MTLPixelFormatPVRTC_RGBA_4BPP) {
    uint32_t blockWidth = 4;
    uint32_t blockHeight = 4;
    uint32_t bitsPerPixel = 4;
    uint32_t blockSize = blockWidth * blockHeight;
    uint32_t widthInBlocks = width / blockWidth;
    uint32_t heightInBlocks = height / blockHeight;
    options = MTLBlitOptionRowLinearPVRTC;
    levelBytesPerRow = widthInBlocks * ((blockSize * bitsPerPixel) / 8);
}
id<MTLBuffer> buffer = [device newBufferWithBytes:[data bytes] length:[data length] options:0];
[blitEncoder copyFromBuffer:buffer
               sourceOffset:0
          sourceBytesPerRow:levelBytesPerRow
        sourceBytesPerImage:[buffer length]
                 sourceSize:MTLSizeMake(width, height, 1)
                  toTexture:self.textureMetal
           destinationSlice:0
           destinationLevel:i
          destinationOrigin:MTLOriginMake(0, 0, 0)
                    options:options];

xcode CVpixelBuffer shows negative values

I am using Xcode and am currently trying to extract pixel values from the pixel buffer using the following code. However, when I print out the pixel values, some of them are negative. Has anyone encountered such a problem before?
Part of the code is below:
- (void)captureOutput:(AVCaptureOutput*)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection*)connection
{
    CVImageBufferRef Buffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(Buffer, 0);
    uint8_t* BaseAddress = (uint8_t*)CVPixelBufferGetBaseAddressOfPlane(Buffer, 0);
    size_t Width = CVPixelBufferGetWidth(Buffer);
    size_t Height = CVPixelBufferGetHeight(Buffer);
    if (BaseAddress)
    {
        IplImage* Temporary = cvCreateImage(cvSize(Width, Height), IPL_DEPTH_8U, 4);
        Temporary->imageData = (char*)BaseAddress;
        for (int i = 0; i < Temporary->width * Temporary->height; ++i) {
            // where I try to print the pixels
            NSLog(@"Pixel value: %d", Temporary->imageData[i]);
        }
    }
}
The issue is that imageData of IplImage is a signed char. Thus, anything greater than 127 will appear as a negative number.
You can simply assign it to an unsigned char, and then print that, and you'll see values in the range between 0 and 255, like you probably anticipated:
for (int i = 0; i < Temporary->width * Temporary->height; ++i) {
    unsigned char c = Temporary->imageData[i];
    NSLog(@"Pixel value: %u", c);
}
Or you can print that in hex:
NSLog(#"Pixel value: %02x", c);

How can I obtain all the image pixels from a UIImage object [duplicate]

My task is to obtain all the image pixels from a UIImage object, and then store them in a variable. It is not difficult for me to do that for a colour image:
CGColorSpaceRef colorSpace = CGImageGetColorSpace(image.CGImage);
size_t ele = CGColorSpaceGetNumberOfComponents(colorSpace);
CGFloat cols = image.size.width;
CGFloat rows = image.size.height;

// Create memory for the input image
unsigned char *img_mem;
img_mem = (unsigned char*) malloc(rows*cols*4);
unsigned char *my_img;
my_img = (unsigned char *)malloc(rows*cols*3);

CGContextRef contextRef = CGBitmapContextCreate(img_mem, cols, // Width of bitmap
                                                rows,          // Height of bitmap
                                                8,             // Bits per component
                                                cols*4,        // Bytes per row
                                                colorSpace,    // Colorspace
                                                kCGImageAlphaNoneSkipLast|kCGBitmapByteOrderDefault); // Bitmap info flags
CGContextDrawImage(contextRef, CGRectMake(0, 0, cols, rows), image.CGImage);
CGContextRelease(contextRef);
CGColorSpaceRelease(colorSpace);

unsigned int pos_new;
unsigned int pos_old;
for (int i = 0; i < rows; i++)
{
    pos_new = i*cols*3;
    pos_old = i*cols*4;
    for (int j = 0; j < cols; j++)
    {
        my_img[j*3+pos_new]   = img_mem[pos_old+j*4];
        my_img[j*3+pos_new+1] = img_mem[pos_old+j*4+1];
        my_img[j*3+pos_new+2] = img_mem[pos_old+j*4+2];
    }
}
free(img_mem);

// All the pixels are now in my_img
free(my_img);
My problem is that the above code works for a colour image, but I do not know how to do it for a grayscale image. Any ideas?
The trouble is that you've got hard-coded numbers in your code that make assumptions about your input and output image formats. Doing it this way completely depends on the exact format of your greyscale source image, and equally on what format you want the resultant image to be in.
If you are sure the images will always be, say, 8-bit single-channel greyscale, then you could get away with simply removing all occurrences of *4 and *3 in your code, and reducing the final inner loop to handle only a single channel:
for (int j = 0; j < cols; j++)
{
    my_img[j+pos_new] = img_mem[pos_old+j];
}
But if the output image is going to be 24-bit (as your code seems to imply) then you'll have to leave in all the occurrences of *3, and your inner loop would read:
for (int j = 0; j < cols; j++)
{
    my_img[j*3+pos_new]   = img_mem[pos_old+j];
    my_img[j*3+pos_new+1] = img_mem[pos_old+j];
    my_img[j*3+pos_new+2] = img_mem[pos_old+j];
}
This would create greyscale values in 24 bits.
To make it truly flexible you should look at the components of your colorSpace and dynamically code your pixel processing loops based on that, or at least throw some kind of exception or error if the image format is not what your code expects.
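As an illustration of pinning the format down, here is a sketch in Swift that draws any UIImage into an 8-bit greyscale context, so the buffer layout is known regardless of the source image's own format (the helper name is mine):

import UIKit

// Sketch: render the image into a known single-channel greyscale layout.
func greyscalePixels(of image: UIImage) -> [UInt8]? {
    guard let cgImage = image.cgImage else { return nil }
    let width = cgImage.width
    let height = cgImage.height
    var pixels = [UInt8](repeating: 0, count: width * height)
    let drawn = pixels.withUnsafeMutableBytes { buffer -> Bool in
        guard let context = CGContext(data: buffer.baseAddress,
                                      width: width,
                                      height: height,
                                      bitsPerComponent: 8,
                                      bytesPerRow: width,
                                      space: CGColorSpaceCreateDeviceGray(),
                                      bitmapInfo: CGImageAlphaInfo.none.rawValue)
        else { return false }
        // Drawing performs any colour-to-grey conversion for us.
        context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
        return true
    }
    return drawn ? pixels : nil
}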
Please refer to the (UIImage+Pixels) category at this link: http://b2cloud.com.au/tutorial/obtaining-pixel-data-from-a-uiimage

OpenNI RGB image to OpenCV BGR IplImage conversion?

The image which one gets from OpenNI image metadata is arranged as an RGB image. I would like to convert it to an OpenCV IplImage, which by default assumes the data is stored as BGR. I use the following code:
XnUInt8 *pImage = new XnUInt8[640*480*3];
memcpy(pImage, imageMD.Data(), 640*480*3*sizeof(XnUInt8));
XnUInt8 temp;
for (size_t row = 0; row < 480; row++) {
    for (size_t col = 0; col < 3*640; col += 3) {
        size_t index = row*3*640 + col;
        temp = pImage[index];
        pImage[index] = pImage[index+2];
        pImage[index+2] = temp;
    }
}
img->imageData = (char*) pImage;
What is the best (fastest) way in C/C++ to perform this conversion, so that the RGB image becomes BGR (in IplImage format)?
Isn't it easiest to just use OpenCV's color conversion function?
imgColor->imageData = (char*) pImage;
cvCvtColor( imgColor, imgColor, CV_BGR2RGB);
There are some interesting references out there. For instance, the QImage-to-IplImage conversion shown here, which also converts RGB to BGR:
static IplImage* qImage2IplImage(const QImage& qImage)
{
    int width = qImage.width();
    int height = qImage.height();

    // Creates an IplImage with 3 channels
    IplImage *img = cvCreateImage(cvSize(width, height), IPL_DEPTH_8U, 3);
    char *imgBuffer = img->imageData;

    // Remove alpha channel
    int jump = (qImage.hasAlphaChannel()) ? 4 : 3;

    for (int y = 0; y < img->height; y++)
    {
        QByteArray a((const char*)qImage.scanLine(y), qImage.bytesPerLine());
        for (int i = 0; i < a.size(); i += jump)
        {
            // Swap from RGB to BGR
            imgBuffer[2] = a[i];
            imgBuffer[1] = a[i+1];
            imgBuffer[0] = a[i+2];
            imgBuffer += 3;
        }
    }
    return img;
}
There are several posts here besides this one that show how to iterate over IplImage data.
There might be more to it than that (if the encoding is not openni_wrapper::Image::RGB). A good example can be found in the openni_image.cpp file, where they use the fillRGB function at line 170.
