How to convert a pixelBuffer from BGRA to YUV (iOS)

I want to convert a pixelBuffer from BGRA to YUV (420v).
With the conversion function below, most of the videos in my phone's photo album convert correctly,
except for one video from a colleague: after conversion its pixels are garbled,
although the source video itself plays normally. Its MediaInfo report:
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Main#L3.1
Format settings : CABAC / 1 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference frames : 1 frame
Format settings, GOP : M=1, N=15
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 6 s 623 ms
Source duration : 6 s 997 ms
Bit rate : 4 662 kb/s
Width : 884 pixels
Clean aperture width : 884 pixels
Height : 492 pixels
Clean aperture height : 492 pixels
Display aspect ratio : 16:9
Original display aspect ratio : 16:9
Frame rate mode : Variable
Frame rate : 57.742 FPS
Minimum frame rate : 20.000 FPS
Maximum frame rate : 100.000 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.186
Stream size : 3.67 MiB (94%)
Source stream size : 3.79 MiB (97%)
Title : Core Media Video
Encoded date : UTC 2021-10-29 09:54:03
Tagged date : UTC 2021-10-29 09:54:03
Color range : Limited
Color primaries : Display P3
Transfer characteristics : BT.709
Matrix coefficients : BT.709
Codec configuration box : avcC
This is my function; I do not know what is wrong with it.
CFDictionaryRef CreateCFDictionary(CFTypeRef* keys, CFTypeRef* values, size_t size) {
    return CFDictionaryCreate(kCFAllocatorDefault, keys, values, size,
                              &kCFTypeDictionaryKeyCallBacks,
                              &kCFTypeDictionaryValueCallBacks);
}
static void bt709_rgb2yuv8bit_TV(uint8_t R, uint8_t G, uint8_t B, uint8_t &Y, uint8_t &U, uint8_t &V)
{
    Y = 0.183 * R + 0.614 * G + 0.062 * B + 16;
    U = -0.101 * R - 0.339 * G + 0.439 * B + 128;
    V = 0.439 * R - 0.399 * G - 0.040 * B + 128;
}
CVPixelBufferRef RGB2YCbCr8Bit(CVPixelBufferRef pixelBuffer)
{
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);
    uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(pixelBuffer);
    int w = (int) CVPixelBufferGetWidth(pixelBuffer);
    int h = (int) CVPixelBufferGetHeight(pixelBuffer);
    // int stride = (int) CVPixelBufferGetBytesPerRow(pixelBuffer) / 4;
    OSType pixelFormat = kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange;
    CVPixelBufferRef pixelBufferCopy = NULL;
    const size_t attributes_size = 1;
    CFTypeRef keys[attributes_size] = {
        kCVPixelBufferIOSurfacePropertiesKey
    };
    CFDictionaryRef io_surface_value = CreateCFDictionary(nullptr, nullptr, 0);
    CFTypeRef values[attributes_size] = {io_surface_value};
    CFDictionaryRef attributes = CreateCFDictionary(keys, values, attributes_size);
    CVReturn status = CVPixelBufferCreate(kCFAllocatorDefault,
                                          w,
                                          h,
                                          pixelFormat,
                                          attributes,
                                          &pixelBufferCopy);
    if (status != kCVReturnSuccess) {
        std::cout << "YUVBufferCopyWithPixelBuffer :: failed" << std::endl;
        return nullptr;
    }
    if (attributes) {
        CFRelease(attributes);
        attributes = nullptr;
    }
    CVPixelBufferLockBaseAddress(pixelBufferCopy, 0);
    size_t y_stride = CVPixelBufferGetBytesPerRowOfPlane(pixelBufferCopy, 0);
    size_t uv_stride = CVPixelBufferGetBytesPerRowOfPlane(pixelBufferCopy, 1);
    int plane_h1 = (int) CVPixelBufferGetHeightOfPlane(pixelBufferCopy, 0);
    int plane_h2 = (int) CVPixelBufferGetHeightOfPlane(pixelBufferCopy, 1);
    uint8_t *y = (uint8_t *) CVPixelBufferGetBaseAddressOfPlane(pixelBufferCopy, 0);
    memset(y, 0x80, plane_h1 * y_stride);
    uint8_t *uv = (uint8_t *) CVPixelBufferGetBaseAddressOfPlane(pixelBufferCopy, 1);
    memset(uv, 0x80, plane_h2 * uv_stride);
    int y_bufferSize = w * h;
    int uv_bufferSize = w * h / 4;
    uint8_t *y_planeData = (uint8_t *) malloc(y_bufferSize * sizeof(uint8_t));
    uint8_t *u_planeData = (uint8_t *) malloc(uv_bufferSize * sizeof(uint8_t));
    uint8_t *v_planeData = (uint8_t *) malloc(uv_bufferSize * sizeof(uint8_t));
    int u_offset = 0;
    int v_offset = 0;
    uint8_t R, G, B;
    uint8_t Y, U, V;
    for (int i = 0; i < h; i ++) {
        for (int j = 0; j < w; j ++) {
            int offset = i * w + j;
            B = baseAddress[offset * 4];
            G = baseAddress[offset * 4 + 1];
            R = baseAddress[offset * 4 + 2];
            bt709_rgb2yuv8bit_TV(R, G, B, Y, U, V);
            y_planeData[offset] = Y;
            // Alternate-row sampling: take U from the even columns of even rows and V from the even columns of odd rows
            if (j % 2 == 0) {
                (i % 2 == 0) ? u_planeData[u_offset++] = U : v_planeData[v_offset++] = V;
            }
        }
    }
    for (int i = 0; i < plane_h1; i ++) {
        memcpy(y + i * y_stride, y_planeData + i * w, w);
        if (i < plane_h2) {
            for (int j = 0 ; j < w ; j+=2) {
                // NV12 and NV21 both belong to the YUV420SP family. The Y component is stored first, but instead of then storing all U or all V values, the U and V values are stored interleaved.
                // NV12 is the layout available on iOS: the Y component first, then interleaved UV.
                memcpy(uv + i * y_stride + j, u_planeData + i * w/2 + j/2, 1);
                memcpy(uv + i * y_stride + j + 1, v_planeData + i * w/2 + j/2, 1);
            }
        }
    }
    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
    CVPixelBufferUnlockBaseAddress(pixelBufferCopy, 0);
    return pixelBufferCopy;
}
The BGRA pixelBuffer looks normal.
The converted YUV pixelBuffer is garbled.

The video metadata contains the line "Color space: YUV", so it looks like this video isn't BGRA.
When you calculate the source pixel offset you must use the stride (the length of an image row in bytes) instead of the width, because the distance between rows may be larger than width * pixel_size_in_bytes. I recommend checking this case on images with odd widths.
int offset = i * stride + j;
You already have it, commented out, at the beginning of the function:
int stride = (int) CVPixelBufferGetBytesPerRow(pixelBuffer) / 4;
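As a rough illustration of the stride point, here is a minimal C++ sketch that works on plain memory rather than on a CVPixelBuffer. The names Nv12Image, toByte, and bgraToNv12 are hypothetical, the output planes are assumed to be tightly packed, and chroma is taken from the top-left pixel of each 2x2 block, which is a simplification and not the sampling used in the question. The coefficients are the BT.709 limited-range ones from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for an NV12 image with tightly packed planes.
struct Nv12Image {
    int width = 0;
    int height = 0;
    std::vector<uint8_t> y;   // width * height luma bytes
    std::vector<uint8_t> uv;  // (height / 2) rows of interleaved U,V pairs, width bytes per row
};

// Rounds to nearest and clamps to the 0..255 range.
static uint8_t toByte(double v) {
    if (v < 0.0) return 0;
    if (v > 255.0) return 255;
    return static_cast<uint8_t>(v + 0.5);
}

// Converts a BGRA image whose rows start srcBytesPerRow bytes apart.
// srcBytesPerRow may be larger than width * 4 when rows are padded.
Nv12Image bgraToNv12(const uint8_t* bgra, int width, int height, size_t srcBytesPerRow) {
    assert(width % 2 == 0 && height % 2 == 0);
    assert(srcBytesPerRow >= static_cast<size_t>(width) * 4);
    Nv12Image out;
    out.width = width;
    out.height = height;
    out.y.resize(static_cast<size_t>(width) * height);
    out.uv.resize(static_cast<size_t>(width) * height / 2);
    for (int row = 0; row < height; ++row) {
        // Step between rows by the stride in bytes, not by width * 4.
        const uint8_t* srcRow = bgra + static_cast<size_t>(row) * srcBytesPerRow;
        for (int col = 0; col < width; ++col) {
            const double b = srcRow[col * 4 + 0];
            const double g = srcRow[col * 4 + 1];
            const double r = srcRow[col * 4 + 2];
            out.y[static_cast<size_t>(row) * width + col] =
                toByte(0.183 * r + 0.614 * g + 0.062 * b + 16);
            if (row % 2 == 0 && col % 2 == 0) {
                // One U,V pair per 2x2 block; each chroma row holds width bytes.
                const size_t uvIndex = static_cast<size_t>(row / 2) * width + col;
                out.uv[uvIndex] = toByte(-0.101 * r - 0.339 * g + 0.439 * b + 128);
                out.uv[uvIndex + 1] = toByte(0.439 * r - 0.399 * g - 0.040 * b + 128);
            }
        }
    }
    return out;
}
```

With a real CVPixelBuffer, srcBytesPerRow would come from CVPixelBufferGetBytesPerRow, and the destination planes would need their own per-plane strides as well.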
It is better to use built-in functions for converting images. Here is an example from one of my projects:
vImage_CGImageFormat out_cg_format = CreateVImage_CGImageFormat( target_pixel_format );
CGColorSpaceRef color_space = CGColorSpaceCreateDeviceRGB();
vImageCVImageFormatRef in_cv_format = vImageCVImageFormat_Create(
0 );
vImage_Error err = kvImageNoError;
vImageConverterRef converter = vImageConverter_CreateForCVToCGImageFormat(in_cv_format, &out_cg_format, NULL, kvImagePrintDiagnosticsToConsole, &err);
vImage_Buffer src_planes[4] = {{0}};
vImage_Buffer dst_planes[4] = {{0}};
unsigned long source_plane_count = vImageConverter_GetNumberOfSourceBuffers(converter);
for( unsigned int i = 0; i < source_plane_count; i++ )
src_planes[i] = (vImage_Buffer){planes_in[i], pic_size.height, pic_size.width, strides_in[i]};
unsigned long target_plane_count = vImageConverter_GetNumberOfDestinationBuffers(converter);
for( unsigned int i = 0; i < target_plane_count; i++ )
dst_planes[i] = (vImage_Buffer){planes_out[i], pic_size.height, pic_size.width, strides_out[i]};
err = vImageConvert_AnyToAny(converter, src_planes, dst_planes, NULL, kvImagePrintDiagnosticsToConsole);


How do I blur a YUV videoframe with Agora SDK

I'm using the following method from the Advanced Video Example on Github to capture the raw video data:
- (AgoraVideoRawData *)mediaDataPlugin:(AgoraMediaDataPlugin *)mediaDataPlugin didCapturedVideoRawData:(AgoraVideoRawData *)videoRawData
I have already been able to convert the Y, U and V buffers to a CVPixelBuffer and then a CIImage and apply the blur, but I'm having trouble translating the CIImage data back into YUV buffers.
I have already managed to write arbitrary values into the YUV buffers, which results in a grey video frame being sent to the other user.
memset(videoRawData.yBuffer, 128, videoRawData.yStride * videoRawData.height);
memset(videoRawData.uBuffer, 128, videoRawData.uStride * videoRawData.height / 2);
memset(videoRawData.vBuffer, 128, videoRawData.vStride * videoRawData.height / 2);
Could someone point me in the right direction on how to translate CIImage data back into YUV buffers? Or, if there is a more efficient way to blur a YUV video data stream, I'm willing to try that.
I have found a solution that works for me. I will try to post a complete answer so others might find a solution that works for them. See the comments in the code for more explanation.
Define these helpers somewhere in your file. They are used later to extract the R, G and B values of each pixel:
#define Mask8(x) ( (x) & 0xFF )
#define R(x) ( Mask8(x) )
#define G(x) ( Mask8(x >> 8 ) )
#define B(x) ( Mask8(x >> 16) )
For simplicity, all code posted here is inside the - (AgoraVideoRawData *)mediaDataPlugin:(AgoraMediaDataPlugin *)mediaDataPlugin didCapturedVideoRawData:(AgoraVideoRawData *)videoRawData method.
- (AgoraVideoRawData *)mediaDataPlugin:(AgoraMediaDataPlugin *)mediaDataPlugin didCapturedVideoRawData:(AgoraVideoRawData *)videoRawData
{
    // create pixelbuffer from raw video data
    NSDictionary *pixelAttributes = @{(NSString *)kCVPixelBufferIOSurfacePropertiesKey: @{}};
    CVPixelBufferRef pixelBuffer = NULL;
    CVReturn result = CVPixelBufferCreate(kCFAllocatorDefault,
                                          videoRawData.width,
                                          videoRawData.height,
                                          kCVPixelFormatType_420YpCbCr8BiPlanarFullRange, // NV12
                                          (__bridge CFDictionaryRef)(pixelAttributes),
                                          &pixelBuffer);
    if (result != kCVReturnSuccess) {
        NSLog(@"Unable to create cvpixelbuffer %d", result);
        return videoRawData; // nothing to blur without a pixel buffer
    }
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);
    // copy the Y plane; source and destination rows may both be padded, so use each stride
    unsigned char *yDestPlane = (unsigned char *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
    size_t yDestStride = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
    for (int i = 0; i < videoRawData.height; i ++) {
        for (int j = 0; j < videoRawData.width; j ++) {
            yDestPlane[j + i * yDestStride] = videoRawData.yBuffer[j + i * videoRawData.yStride];
        }
    }
    // interleave the separate U and V planes into the single CbCr plane of NV12
    unsigned char *uvDestPlane = (unsigned char *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1);
    size_t uvDestStride = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1);
    for (int i = 0; i < videoRawData.height / 2; i ++) {
        for (int j = 0; j < videoRawData.width / 2; j ++) {
            uvDestPlane[2 * j + i * uvDestStride] = videoRawData.uBuffer[j + i * videoRawData.uStride];
            uvDestPlane[2 * j + 1 + i * uvDestStride] = videoRawData.vBuffer[j + i * videoRawData.vStride];
        }
    }
    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
    // create CIImage from pixel buffer
    CIImage *coreImage = [CIImage imageWithCVPixelBuffer:pixelBuffer];
    // apply pixel filter to image
    CIFilter *pixelFilter = [CIFilter filterWithName:@"CIPixellate"];
    [pixelFilter setDefaults];
    [pixelFilter setValue:coreImage forKey:kCIInputImageKey];
    [pixelFilter setValue:@40 forKey:@"inputScale"];
    CIVector *vector = [[CIVector alloc] initWithX:160 Y:160]; // x & y should be multiple of 'inputScale' parameter
    [pixelFilter setValue:vector forKey:@"inputCenter"];
    CIImage *outputBlurredImage = [pixelFilter outputImage];
    CIContext *blurImageContext = [CIContext contextWithOptions:nil];
    CGImageRef inputCGImage = [blurImageContext createCGImage:outputBlurredImage fromRect:[coreImage extent]];
    // draw the blurred image into an RGBA bitmap so the pixels can be read back
    NSUInteger blurredWidth = CGImageGetWidth(inputCGImage);
    NSUInteger blurredHeight = CGImageGetHeight(inputCGImage);
    NSUInteger bytesPerPixel = 4;
    NSUInteger bytesPerRow = bytesPerPixel * blurredWidth;
    NSUInteger bitsPerComponent = 8;
    UInt32 * pixels = (UInt32 *) calloc(blurredHeight * blurredWidth, sizeof(UInt32));
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    CGContextRef context = CGBitmapContextCreate(pixels, blurredWidth, blurredHeight, bitsPerComponent, bytesPerRow, colorSpace, kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
    CGContextDrawImage(context, CGRectMake(0, 0, blurredWidth, blurredHeight), inputCGImage);
    // loop through each RGB pixel, translate to YUV, and write straight into the
    // videoRawData planes using their strides (the planes are binary data, so
    // strlen must not be used to size the copy)
    UInt32 *currentPixel = pixels;
    for (int j = 0; j < blurredHeight; j++) {
        for (int i = 0; i < blurredWidth; i++) {
            UInt32 color = *currentPixel++;
            int R = R(color);
            int G = G(color);
            int B = B(color);
            int Y = ((66 * R + 129 * G + 25 * B + 128) >> 8) + 16;
            int U = ((-38 * R - 74 * G + 112 * B + 128) >> 8) + 128;
            int V = ((112 * R - 94 * G - 18 * B + 128) >> 8) + 128;
            videoRawData.yBuffer[i + j * videoRawData.yStride] = Y;
            // one U and one V sample per 2x2 block of pixels
            if (j % 2 == 0 && i % 2 == 0) {
                videoRawData.uBuffer[i / 2 + (j / 2) * videoRawData.uStride] = U;
                videoRawData.vBuffer[i / 2 + (j / 2) * videoRawData.vStride] = V;
            }
        }
    }
    // cleanup
    CGContextRelease(context);
    CGColorSpaceRelease(colorSpace);
    CGImageRelease(inputCGImage);
    free(pixels);
    CVPixelBufferRelease(pixelBuffer);
    return videoRawData;
}
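Plane data is binary and may legitimately contain zero bytes, so copies between YUV planes are best sized from the width, height and stride; string functions such as strlen are not suitable for that. A minimal C++ sketch of a stride-aware plane copy follows. The helper name copyPlane is hypothetical and is not part of the Agora SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `rows` rows of `rowBytes` bytes each from src to dst.
// Each buffer has its own stride (distance in bytes between row starts).
void copyPlane(const uint8_t* src, size_t srcStride,
               uint8_t* dst, size_t dstStride,
               size_t rowBytes, size_t rows) {
    assert(srcStride >= rowBytes && dstStride >= rowBytes);
    for (size_t row = 0; row < rows; ++row) {
        std::memcpy(dst + row * dstStride, src + row * srcStride, rowBytes);
    }
}
```

For an I420 frame, the Y plane would be copied with rowBytes = width and rows = height, and the U and V planes with rowBytes = width / 2 and rows = height / 2, assuming even dimensions.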

How do I convert ByteArray from ImageMetaData() to Bitmap?

I have this code:
Frame frame = mSession.update();
Camera camera = frame.getCamera();
System.out.println("Byte Array "+frame.getImageMetadata().getByteArray(0));
Bitmap bmp = BitmapFactory.decodeByteArray(bytes,0,bytes.length);
When I print Bitmap, I get a null object. I'm trying to get the image from the camera, that's the reason I'm trying to convert byteArray to Bitmap. If there's an alternative way, it would also be helpful.
Thank You.
The ImageMetaData describes the background image, but does not actually contain the image itself.
If you want to capture the background image as a Bitmap, you should look at the computervision sample which uses a FrameBufferObject to copy the image to a byte array.
I've tried something similar. It works. But I don't recommend anyone to try this way. It takes time because of nested loops.
CameraImageBuffer inputImage;
final Bitmap bmp = Bitmap.createBitmap(inputImage.width, inputImage.height, Bitmap.Config.ARGB_8888);
int width = inputImage.width;
int height = inputImage.height;
int frameSize = width*height;
// Write ByteBuffer to byte[]
byte[] imageBuffer= new byte[inputImage.buffer.remaining()];
inputImage.buffer.get(imageBuffer);
int[] rgba = new int[frameSize];
for (int i = 0; i < height; i++){
    for (int j = 0; j < width; j++) {
        // Java bytes are signed, so mask with 0xff to get values in 0..255
        int r = imageBuffer[(i * width + j)*4 + 0] & 0xff;
        int g = imageBuffer[(i * width + j)*4 + 1] & 0xff;
        int b = imageBuffer[(i * width + j)*4 + 2] & 0xff;
        rgba[i * width + j] = 0xff000000 | (b << 16) | (g << 8) | r;
    }
}
bmp.setPixels(rgba, 0, width , 0, 0, width, height);
The ByteBuffer is converted to an rgba buffer, which is then written to the Bitmap. CameraImageBuffer is the class provided in the computervision sample app.
You may not be able to get a bitmap using the image metadata. Use the approach below instead: override the onDrawFrame method of the surface view renderer.
@Override public void onDrawFrame(GL10 gl) {
    int w = 1080;
    int h = 1080;
    int b[] = new int[w * (0 + h)];
    int bt[] = new int[w * h];
    IntBuffer ib = IntBuffer.wrap(b);
    GLES20.glReadPixels(0, 0, w, h, GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, ib);
    for (int i = 0, k = 0; i < h; i++, k++) {
        for (int j = 0; j < w; j++) {
            int pix = b[i * w + j];
            int pb = (pix >> 16) & 0xff;
            int pr = (pix << 16) & 0x00ff0000;
            int pix1 = (pix & 0xff00ff00) | pr | pb;
            bt[(h - k - 1) * w + j] = pix1;
        }
    }
    Bitmap mBitmap = Bitmap.createBitmap(bt, w, h, Bitmap.Config.ARGB_8888);
    runOnUiThread(new Runnable() {
        @Override public void run() {
            // ...
        }
    });
}
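The loop above does two things at once: it swaps the red and blue channels and it flips the image vertically, because glReadPixels returns rows bottom-up and, read as little-endian ints, the RGBA bytes appear as 0xAABBGGRR. The same transformation as a standalone C++ sketch (the function name glRgbaToArgbFlipped is hypothetical, and a little-endian layout is assumed):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Input: pixels as 0xAABBGGRR values, bottom row first (assumed glReadPixels layout).
// Output: pixels as 0xAARRGGBB values, top row first.
std::vector<uint32_t> glRgbaToArgbFlipped(const std::vector<uint32_t>& src,
                                          int width, int height) {
    assert(src.size() == static_cast<size_t>(width) * height);
    std::vector<uint32_t> dst(src.size());
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            const uint32_t pix = src[static_cast<size_t>(row) * width + col];
            const uint32_t red = pix & 0xFFu;           // low byte holds red
            const uint32_t blue = (pix >> 16) & 0xFFu;  // third byte holds blue
            // Keep alpha and green in place, exchange red and blue.
            const uint32_t swapped = (pix & 0xFF00FF00u) | (red << 16) | blue;
            // Write into the mirrored row to flip the image vertically.
            dst[static_cast<size_t>(height - 1 - row) * width + col] = swapped;
        }
    }
    return dst;
}
```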

How to swap bit U with bit V in YUV format

I want to swap the U and V bits in YUV format, from NV12
YYYYYYYY UVUV // each letter represents a bit
to NV21
YYYYYYYY VUVU
I leave the Y plane alone and handle the U and V plane with the function below
uchar swap(uchar in) {
    uchar out = ((in >> 1) & 0x55) | ((in << 1) & 0xaa);
    return out;
}
But I cannot get the desired result; the colours of the output image are still not correct.
How can I swap the U and V planes correctly?
Found the problem. UV should be manipulated in byte format, not bit.
byte[] yuv = // ...
final int length = yuv.length;
for (int i1 = 0; i1 < length; i1 += 2) {
    if (i1 >= width * height) {
        byte tmp = yuv[i1];
        yuv[i1] = yuv[i1+1];
        yuv[i1+1] = tmp;
    }
}
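The same byte-wise swap can be sketched in C++ as follows. This is an illustration under the assumption of even width and height and a single contiguous frame (Y plane followed by the interleaved chroma plane); the function name swapChromaPairs is hypothetical. Because the operation is its own inverse, it converts NV12 to NV21 and back.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Swaps every interleaved chroma byte pair in place, leaving the Y plane untouched.
void swapChromaPairs(std::vector<uint8_t>& frame, int width, int height) {
    assert(width % 2 == 0 && height % 2 == 0);
    const size_t lumaSize = static_cast<size_t>(width) * height;
    assert(frame.size() == lumaSize + lumaSize / 2);
    // The chroma plane starts right after the Y plane; each pair is one byte U, one byte V.
    for (size_t i = lumaSize; i + 1 < frame.size(); i += 2) {
        std::swap(frame[i], frame[i + 1]);
    }
}
```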
Try this method:
IFrameCallback iFrameCallback = new IFrameCallback() {
    public void onFrame(ByteBuffer frame) {
        //get nv12 data
        byte[] b = new byte[frame.remaining()];
        frame.get(b);
        //nv12 data to nv21
        NV12ToNV21(b, 1280, 720);
        //send NV21 data
        BVPU.InputVideoData(nv21, nv21.length,
                System.currentTimeMillis() * 1000, 1280, 720);
    }
};
byte[] nv21;
private void NV12ToNV21(byte[] data, int width, int height) {
    nv21 = new byte[data.length];
    int framesize = width * height;
    // the Y plane is identical in NV12 and NV21
    System.arraycopy(data, 0, nv21, 0, framesize);
    // swap each interleaved (U, V) byte pair into (V, U)
    for (int j = 0; j + 1 < framesize / 2; j += 2) {
        nv21[framesize + j] = data[framesize + j + 1];
        nv21[framesize + j + 1] = data[framesize + j];
    }
}

Loading Texture2D data in DirectX 11 Compute Shader

I am trying to read some data from a texture2d in DirectX11 compute shader, however, the 'Load' function of a texture2D object keeps returning 0 even though the texture object is filled with the same float number.
This is a 160 * 120 texture2d with DXGI_FORMAT_R32G32B32A32_FLOAT. The following code is how I created this resource:
HRESULT TestResources(ID3D11Device* pd3dDevice, ID3D11DeviceContext* pImmediateContext) {
float *test = new float[4 * 80 * 60 * 4]; // 80 * 60, 4 channels, 1 big texture contains 4 80 * 60 subimage
for (int i = 0; i < 4 * 80 * 60 * 4; i++) test[i] = 0.7f;
D3D11_TEXTURE2D_DESC RTtextureDesc;
ZeroMemory(&RTtextureDesc, sizeof(D3D11_TEXTURE2D_DESC));
RTtextureDesc.Width = 160;
RTtextureDesc.Height = 120;
RTtextureDesc.MipLevels = 1;
RTtextureDesc.ArraySize = 1;
RTtextureDesc.Format = DXGI_FORMAT_R32G32B32A32_FLOAT;
RTtextureDesc.SampleDesc.Count = 1;
RTtextureDesc.SampleDesc.Quality = 0;
RTtextureDesc.Usage = D3D11_USAGE_DYNAMIC;
RTtextureDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
RTtextureDesc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
RTtextureDesc.MiscFlags = 0;
InitData.pSysMem = test;
InitData.SysMemPitch = sizeof(float) * 4;
V_RETURN(pd3dDevice->CreateTexture2D(&RTtextureDesc, &InitData, &m_pInputTex2Ds));
//V_RETURN(pd3dDevice->CreateTexture2D(&RTtextureDesc, NULL, &m_pInputTex2Ds));
ZeroMemory(&SRViewDesc, sizeof(SRViewDesc));
SRViewDesc.Format = RTtextureDesc.Format;
SRViewDesc.ViewDimension = D3D11_SRV_DIMENSION_TEXTURE2D;
SRViewDesc.Texture2D.MostDetailedMip = 0;
SRViewDesc.Texture2D.MipLevels = 1;
V_RETURN(pd3dDevice->CreateShaderResourceView(m_pInputTex2Ds, &SRViewDesc, &m_pInputTexSRV));
delete[] test;
return hr;
And then I try to run dispatch with X = Y = 2 and Z = 1 like the following:
void ComputeShaderReduction::ExecuteComputeShader(ID3D11DeviceContext* pd3dImmediateContext, UINT uInputNum, ID3D11UnorderedAccessView** ppUAVInputs, UINT X, UINT Y, UINT Z) {
pd3dImmediateContext->CSSetShader(m_pComputeShader, nullptr, 0);
pd3dImmediateContext->CSSetShaderResources(0, 1, &m_pInputTexSRV); // test code
pd3dImmediateContext->CSSetUnorderedAccessViews(0, uInputNum, ppUAVInputs, nullptr);
//pd3dImmediateContext->CSSetUnorderedAccessViews(0, 1, &m_pGPUOutUAVs, nullptr);
pd3dImmediateContext->UpdateSubresource(m_pConstBuf, 0, nullptr, &m_ConstBuf, 0, 0);
pd3dImmediateContext->CSSetConstantBuffers(0, 1, &m_pConstBuf);
pd3dImmediateContext->Dispatch(X, Y, Z);
pd3dImmediateContext->CSSetShader(nullptr, nullptr, 0);
ID3D11UnorderedAccessView* ppUAViewnullptr[1] = { nullptr };
pd3dImmediateContext->CSSetUnorderedAccessViews(0, 1, ppUAViewnullptr, nullptr);
ID3D11ShaderResourceView* ppSRVnullptr[1] = { nullptr };
pd3dImmediateContext->CSSetShaderResources(0, 1, ppSRVnullptr);
ID3D11Buffer* ppCBnullptr[1] = { nullptr };
pd3dImmediateContext->CSSetConstantBuffers(0, 1, ppCBnullptr);
I wrote a very simple compute shader that tries to read the data in the texture2d and output it. The compute shader looks like this:
#define subimg_dim_x 80
#define subimg_dim_y 60
Texture2D<float4> BufferIn : register(t0);
StructuredBuffer<float> Test: register(t1);
RWStructuredBuffer<float> BufferOut : register(u0);
groupshared float sdata[subimg_dim_x];
[numthreads(subimg_dim_x, 1, 1)]
void CSMain(uint3 DTid : SV_DispatchThreadID,
            uint3 threadIdx : SV_GroupThreadID,
            uint3 groupIdx : SV_GroupID) {
    sdata[threadIdx.x] = 0.0;
    if (threadIdx.x == 0) {
        float4 num = BufferIn.Load(uint3(groupIdx.x, groupIdx.y, 1));
        //BufferOut[groupIdx.y * 2 + groupIdx.x] = 2.0; //This one gives me 2.0 as output in the console
        BufferOut[groupIdx.y * 2 + groupIdx.x] = num.x; //This one keeps giving me 0.0, although in the texture r = g = b = a = 0.7 (x = y = z = w = 0.7), so it is supposed to print 0.7 in the console.
    }
}
I think the way I print the CS shader result on CPU end is correct.
void ComputeShaderReduction::CopyToCPUBuffer(ID3D11Device* pdevice, ID3D11DeviceContext* pd3dImmediateContext, ID3D11Buffer* pGPUOutBufs) {
ZeroMemory(&desc, sizeof(desc));
desc.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
desc.Usage = D3D11_USAGE_STAGING;
desc.BindFlags = 0;
desc.MiscFlags = 0;
if (!m_pCPUOutBufs && SUCCEEDED(pdevice->CreateBuffer(&desc, nullptr, &m_pCPUOutBufs))) {
pd3dImmediateContext->CopyResource(m_pCPUOutBufs, pGPUOutBufs);
else pd3dImmediateContext->CopyResource(m_pCPUOutBufs, pGPUOutBufs);
float *p;
pd3dImmediateContext->Map(m_pCPUOutBufs, 0, D3D11_MAP_READ, 0, &MappedResource);
p = (float*)MappedResource.pData;
for (int i = 0; i < 4; i++) printf("%d %f\n", i, p[i]);
pd3dImmediateContext->Unmap(m_pCPUOutBufs, 0);
The buffer bound to the UAV has only 4 elements. So, if all the float numbers in my texture2d are 0.7, the CopyToCPUBuffer function should print four 0.7s instead of 0.0s.
Does anyone know what could be wrong in my code, or can someone provide a complete example or a tutorial that shows how to read DirectX 11 texture2d data in a compute shader correctly?
Thanks in advance.
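To start with, the following is wrong. SysMemPitch is the number of bytes per row of the texture, not per pixel; for this 160-texel-wide R32G32B32A32_FLOAT texture that is 160 * 4 * sizeof(float) = 2560 bytes.
InitData.SysMemPitch = sizeof(float) * 4;
float4 num = BufferIn.Load(uint3(groupIdx.x, groupIdx.y, 1));
Also, the third component of the Load coordinate is the mip level, so this line tries to load from the 2nd mip of the texture, which has only 1 mip level. Pass 0 there instead.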
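To make the pitch arithmetic concrete, here is a small CPU-side C++ sketch. It assumes a tightly packed four-channel float image like the one in the question; the names kChannels, rowPitchBytes and texelChannel are hypothetical and are not Direct3D APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr size_t kChannels = 4;  // R32G32B32A32_FLOAT: four 32-bit floats per texel

// Bytes per row of a tightly packed float4 image (the value SysMemPitch expects).
size_t rowPitchBytes(size_t width) {
    return width * kChannels * sizeof(float);
}

// Reads one channel of texel (x, y) from CPU data laid out with the given row pitch.
float texelChannel(const std::vector<float>& data, size_t pitchBytes,
                   size_t x, size_t y, size_t channel) {
    assert(channel < kChannels);
    const size_t floatsPerRow = pitchBytes / sizeof(float);
    assert(x * kChannels + channel < floatsPerRow);
    const size_t index = y * floatsPerRow + x * kChannels + channel;
    assert(index < data.size());
    return data[index];
}
```

With a pitch of only sizeof(float) * 4, every row would be treated as a single texel wide, so the initial data would not line up with the 160 x 120 texture.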

Converting cv::Mat to MTLTexture

An intermediate step of my current project requires conversion of opencv's cv::Mat to MTLTexture, the texture container of Metal. I need to store the Floats in the Mat as Floats in the texture; my project cannot quite afford the loss of precision.
This is my attempt at such a conversion.
- (id<MTLTexture>)texForMat:(cv::Mat)image context:(MBEContext *)context
id<MTLTexture> texture;
int width = image.cols;
int height = image.rows;
Float32 *rawData = (Float32 *)calloc(height * width * 4,sizeof(float));
int bytesPerPixel = 4;
int bytesPerRow = bytesPerPixel * width;
float r, g, b,a;
for(int i = 0; i < height; i++)
Float32* imageData = (Float32*)(image.data + image.step * i);
for(int j = 0; j < width; j++)
r = (Float32)(imageData[4 * j]);
g = (Float32)(imageData[4 * j + 1]);
b = (Float32)(imageData[4 * j + 2]);
a = (Float32)(imageData[4 * j + 3]);
rawData[image.step * (i) + (4 * j)] = r;
rawData[image.step * (i) + (4 * j + 1)] = g;
rawData[image.step * (i) + (4 * j + 2)] = b;
rawData[image.step * (i) + (4 * j + 3)] = a;
MTLTextureDescriptor *textureDescriptor = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatRGBA16Float
texture = [context.device newTextureWithDescriptor:textureDescriptor];
MTLRegion region = MTLRegionMake2D(0, 0, width, height);
[texture replaceRegion:region mipmapLevel:0 withBytes:rawData bytesPerRow:bytesPerRow];
return texture;
But it doesn't seem to be working. It reads zeroes every time from the Mat, and throws up EXC_BAD_ACCESS. I need the MTLTexture in MTLPixelFormatRGBA16Float to keep the precision.
Thanks for considering this issue.
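One problem here is that you’re loading up rawData with Float32s but your texture is RGBA16Float, so the data will be corrupted (a 16-bit float is half the size of a Float32). This shouldn’t cause your crash, but it’s an issue you’ll have to deal with.
Also, as “chappjc” noted, you’re using ‘image.step’ when writing your data out. image.step is the row stride of the Mat in bytes, while rawData is indexed in floats and should be contiguous, so its row offset is just i * width * 4 floats (and the bytesPerRow passed to the texture is width * 4 * sizeof(float)). Indexing rawData with image.step runs past the end of the buffer, which would explain the EXC_BAD_ACCESS.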
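A minimal C++ sketch of the row-by-row repacking described above, written against plain pointers so it does not depend on OpenCV or Metal. The function name packRows is hypothetical; stepBytes plays the role of cv::Mat::step, and the source is assumed to hold 32-bit floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies a possibly padded float image into a contiguous buffer.
// data: pointer to the first row; stepBytes: bytes between row starts.
std::vector<float> packRows(const uint8_t* data, size_t stepBytes,
                            int width, int height, int channels) {
    const size_t rowFloats = static_cast<size_t>(width) * channels;
    const size_t rowBytes = rowFloats * sizeof(float);
    assert(stepBytes >= rowBytes);
    std::vector<float> packed(rowFloats * height);
    for (int row = 0; row < height; ++row) {
        // Source rows advance by the byte stride, destination rows by the float count.
        std::memcpy(packed.data() + row * rowFloats, data + row * stepBytes, rowBytes);
    }
    return packed;
}
```

The packed buffer has a row length of width * channels * sizeof(float) bytes, which is the bytesPerRow a 32-bit float RGBA texture would expect. Uploading to a 16-bit float texture would additionally need a float-to-half conversion, which this sketch does not attempt.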
