iOS raw pixel format RGGB has 2 channels? - image-processing

I'm trying to get pure raw data from AVFoundation, and I figured out that my device supports the pixel format CV14BayerRggb.
Following the documentation, I got a raw image file, which is good.
However, I also want the byte array from the AVCapturePhoto in the photoOutput delegate callback.
There is one thing I can't understand!
When I look into the pixel format description of the RGGB buffer, I see:
<CVPixelBuffer 0x283b51ad0 width=4032 height=3024 bytesPerRow=8064 pixelFormat=rgg4 iosurface=0x280e59970 poolName=RENDERED_RAW attributes={
    PixelFormatDescription = {
        BitsPerBlock = 16;
        BitsPerComponent = 8;
        ContainsAlpha = 0;
        ContainsGrayscale = 0;
        ContainsRGB = 1;
        ContainsYCbCr = 0;
        FillExtendedPixelsCallback = {length = 24, bytes = 0x0000000000000000948536ba010000000000000000000000};
        PixelFormat = 1919379252;
    };
After the Bayer filter, I know each pixel carries data for only one color; the other color values are filled in by interpolating the color data of neighboring pixels, which is demosaicing.
So why is BitsPerBlock (bits per pixel) 16 here?
With BitsPerComponent = 8, that reads like 2 channels.
Is there anything I misunderstand about the Bayer filter?
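For reference, this is roughly how I'm grabbing the bytes in the delegate (a sketch; it assumes photo.pixelBuffer carries the RENDERED_RAW buffer shown above):

import AVFoundation

// Sketch: pull the raw Bayer bytes out of the AVCapturePhoto delivered
// to the photoOutput delegate. Assumes photo.pixelBuffer is non-nil
// for a RAW capture.
func rawBytes(from photo: AVCapturePhoto) -> Data? {
    guard let pixelBuffer = photo.pixelBuffer else { return nil }
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }
    guard let base = CVPixelBufferGetBaseAddress(pixelBuffer) else { return nil }
    // bytesPerRow (8064) x height (3024); note 8064 = 4032 pixels x 2 bytes.
    let byteCount = CVPixelBufferGetBytesPerRow(pixelBuffer) * CVPixelBufferGetHeight(pixelBuffer)
    return Data(bytes: base, count: byteCount)
}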

Related

Get RGB value from each pixel of camera view

For my Android app I have code that looks like this:
Bitmap currentBitmap = textureView.getBitmap();
int pixelCount = textureView.getWidth() * textureView.getHeight();
int redSum = 0, greenSum = 0, blueSum = 0;
int[] pixels = new int[pixelCount];
// get pixels as RGB integers into the pixels[] array
currentBitmap.getPixels(pixels, 0, textureView.getWidth(), 0, 0, textureView.getWidth(), textureView.getHeight());
// extract the red, green and blue components from every pixel and add them to the sums
for (int pixelIndex = 0; pixelIndex < pixelCount; pixelIndex++) {
    redSum += Color.red(pixels[pixelIndex]);
    greenSum += Color.green(pixels[pixelIndex]);
    blueSum += Color.blue(pixels[pixelIndex]);
}
It takes every pixel from a live camera image and gets the RGB value from it. Is there a similar solution for Swift on iOS?
I am having trouble with the different image formats in Swift and how to get image data from them. My camera image is in the form of a CIImage.
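On iOS, one option that avoids manual pixel loops entirely is Core Image's built-in CIAreaAverage filter, which reduces a region to a single average pixel. A minimal sketch, assuming the camera frame is already a CIImage:

import CoreImage

// Sketch: average the RGB of a CIImage with the CIAreaAverage filter,
// then read back the single output pixel.
func averageColor(of image: CIImage, context: CIContext = CIContext()) -> (red: CGFloat, green: CGFloat, blue: CGFloat)? {
    let extent = CIVector(x: image.extent.origin.x,
                          y: image.extent.origin.y,
                          z: image.extent.size.width,
                          w: image.extent.size.height)
    guard let filter = CIFilter(name: "CIAreaAverage",
                                parameters: [kCIInputImageKey: image,
                                             kCIInputExtentKey: extent]),
          let output = filter.outputImage else { return nil }
    var bitmap = [UInt8](repeating: 0, count: 4)
    context.render(output,
                   toBitmap: &bitmap,
                   rowBytes: 4,
                   bounds: CGRect(x: 0, y: 0, width: 1, height: 1),
                   format: .RGBA8,
                   colorSpace: CGColorSpaceCreateDeviceRGB())
    return (CGFloat(bitmap[0]) / 255, CGFloat(bitmap[1]) / 255, CGFloat(bitmap[2]) / 255)
}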

ID3D11Device::CreateTexture2D fails with E_INVALIDARG in NV12 format for certain texture heights

I have the following texture description:
D3D11_TEXTURE2D_DESC texDesc = {};
texDesc.Width = 1920;
texDesc.Height = 953;
texDesc.MipLevels = 1;
texDesc.ArraySize = 1;
texDesc.Format = DXGI_FORMAT_NV12;
texDesc.SampleDesc.Count = 1;
texDesc.SampleDesc.Quality = 0;
texDesc.CPUAccessFlags = 0;
texDesc.Usage = D3D11_USAGE_DEFAULT;
texDesc.BindFlags = (D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE);
texDesc.MiscFlags = D3D11_RESOURCE_MISC_SHARED;
And I want to create the texture using the description with ID3D11Device::CreateTexture2D:
HRESULT hr = _pDevice->CreateTexture2D(&texDesc, 0, _ppTexOutput);
With the description given, hr is always E_INVALIDARG.
But it all works if texDesc.Height is set to, for example, 954. The texture is also created successfully for every height if texDesc.Format is set to DXGI_FORMAT_B8G8R8A8_UNORM.
Is there something about the DXGI_FORMAT_NV12 format that doesn't support certain texture heights/widths? Should I just use heights divisible by 2, or is there a more complicated rule behind this?
Yes, that format requires that both width and height be even. See here for reference. It explicitly says for DXGI_FORMAT_NV12:
Width and height must be even.
If you had the debug layer enabled, as Simon Mourier said in the comments, you would already know this. I strongly advise you to enable it, since it makes debugging in DirectX a lot easier.
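The underlying reason is the 4:2:0 chroma subsampling: NV12 stores a full-resolution Y plane plus a Cb/Cr plane subsampled by 2 in both directions, so odd dimensions leave no whole number of chroma samples. A sketch of the layout math (shown in Swift, though the arithmetic is language-neutral):

// NV12 layout: full-size luma plane + 2x2-subsampled interleaved chroma.
// Odd dimensions would leave a fractional number of chroma rows/columns.
let width = 1920, height = 954                     // both must be even
let lumaBytes   = width * height                   // one Y byte per pixel
let chromaBytes = (width / 2) * (height / 2) * 2   // interleaved Cb/Cr pairs
let totalBytes  = lumaBytes + chromaBytes          // width * height * 3 / 2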

Find average color of an area inside UIImageView [duplicate]

I am writing this method to calculate the average R,G,B values of an image. The following method takes a UIImage as an input and returns an array containing the R,G,B values of the input image. I have one question though: How/Where do I properly release the CGImageRef?
-(NSArray *)getAverageRGBValuesFromImage:(UIImage *)image
{
    CGImageRef rawImageRef = [image CGImage];
    // This function returns the raw pixel values
    const UInt8 *rawPixelData = CFDataGetBytePtr(CGDataProviderCopyData(CGImageGetDataProvider(rawImageRef)));
    NSUInteger imageHeight = CGImageGetHeight(rawImageRef);
    NSUInteger imageWidth = CGImageGetWidth(rawImageRef);
    // Here I sum the R,G,B values and get the average over the whole image
    int i = 0;
    unsigned int red = 0;
    unsigned int green = 0;
    unsigned int blue = 0;
    for (int column = 0; column < imageWidth; column++)
    {
        int r_temp = 0;
        int g_temp = 0;
        int b_temp = 0;
        for (int row = 0; row < imageHeight; row++) {
            i = (row * imageWidth + column) * 4;
            r_temp += (unsigned int)rawPixelData[i];
            g_temp += (unsigned int)rawPixelData[i+1];
            b_temp += (unsigned int)rawPixelData[i+2];
        }
        red += r_temp;
        green += g_temp;
        blue += b_temp;
    }
    NSNumber *averageRed = [NSNumber numberWithFloat:(1.0*red)/(imageHeight*imageWidth)];
    NSNumber *averageGreen = [NSNumber numberWithFloat:(1.0*green)/(imageHeight*imageWidth)];
    NSNumber *averageBlue = [NSNumber numberWithFloat:(1.0*blue)/(imageHeight*imageWidth)];
    // Then I store the result in an array
    NSArray *result = [NSArray arrayWithObjects:averageRed, averageGreen, averageBlue, nil];
    return result;
}
I tried two things:
Option 1:
I leave it as it is, but then after a few cycles (5+) the program crashes and I get the "low memory warning" error.
Option 2:
I add one line
CGImageRelease(rawImageRef)
before the method returns. Now it crashes after the second cycle, and I get an EXC_BAD_ACCESS error for the UIImage that I pass to the method. When I try to Analyze (instead of Run) in Xcode, I get the following warning at this line:
"Incorrect decrement of the reference count of an object that is not owned at this point by the caller"
Where and how should I release the CGImageRef?
Thanks!
Your memory issue results from the copied data, as others have stated. But here's another idea: Use Core Graphics's optimized pixel interpolation to calculate the average.
Create a 1x1 bitmap context.
Set the interpolation quality to medium (see later).
Draw your image scaled down to exactly this one pixel.
Read the RGB value from the context's buffer.
(Release the context, of course.)
This might result in better performance because Core Graphics is highly optimized and might even use the GPU for the downscaling.
Testing showed that medium quality seems to interpolate pixels by taking the average of color values. That's what we want here.
Worth a try, at least.
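In Swift, the steps above might look roughly like this (a sketch, assuming a CGImage input and RGBA byte order):

import CoreGraphics

// Sketch of the 1x1-context trick: draw the whole image into a single
// RGBA pixel and let Core Graphics' interpolation do the averaging.
func averageColor(of image: CGImage) -> (red: CGFloat, green: CGFloat, blue: CGFloat)? {
    var pixel = [UInt8](repeating: 0, count: 4)
    let drawn = pixel.withUnsafeMutableBytes { buffer -> Bool in
        guard let context = CGContext(data: buffer.baseAddress,
                                      width: 1,
                                      height: 1,
                                      bitsPerComponent: 8,
                                      bytesPerRow: 4,
                                      space: CGColorSpaceCreateDeviceRGB(),
                                      bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue) else { return false }
        context.interpolationQuality = .medium
        context.draw(image, in: CGRect(x: 0, y: 0, width: 1, height: 1))
        return true
    }
    guard drawn else { return nil }
    // premultipliedLast => bytes are R, G, B, A
    return (CGFloat(pixel[0]) / 255, CGFloat(pixel[1]) / 255, CGFloat(pixel[2]) / 255)
}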
Edit: OK, this idea seemed too interesting not to try. So here's an example project showing the difference. The measurements below were taken with the contained 512x512 test image, but you can change the image if you want.
It takes about 12.2 ms to calculate the average by iterating over all pixels in the image data. The draw-to-one-pixel approach takes 3 ms, so it's roughly 4 times faster. It seems to produce the same results when using kCGInterpolationQualityMedium.
I assume that the huge performance gain is a result of Quartz noticing that it does not have to decompress the JPEG fully, but can use only the lower-frequency parts of the DCT. That's an interesting optimization strategy when compositing JPEG-compressed pixels at a scale below 0.5. But I'm only guessing here.
Interestingly, when using your method, 70% of the time is spent in CGDataProviderCopyData and only 30% in the pixel data traversal. This hints at a lot of time being spent in JPEG decompression.
Note: Here's a late follow-up on the example image above.
You don't own the CGImageRef rawImageRef because you obtain it using [image CGImage]. So you don't need to release it.
However, you own rawPixelData because you obtained it using CGDataProviderCopyData and must release it.
CGDataProviderCopyData
Return Value:
A new data object containing a copy of the provider’s data. You are responsible for releasing this object.
I believe your issue is in this statement:
const UInt8 *rawPixelData = CFDataGetBytePtr(CGDataProviderCopyData(CGImageGetDataProvider(rawImageRef)));
You should be releasing the return value of CGDataProviderCopyData.
Your mergedColor works great on an image loaded from a file, but not for an image captured by the camera, because CGBitmapContextGetData() on a context created from a captured sample buffer doesn't return its bitmap. I changed your code to the following. It works on any image, and it is as fast as your code.
- (UIColor *)mergedColor
{
    CGImageRef rawImageRef = [self CGImage];

    // scale the image down to a one-pixel image
    uint8_t bitmapData[4];
    int bitmapByteCount;
    int bitmapBytesPerRow;
    int width = 1;
    int height = 1;

    bitmapBytesPerRow = (width * 4);
    bitmapByteCount = (bitmapBytesPerRow * height);
    memset(bitmapData, 0, bitmapByteCount);

    CGColorSpaceRef colorspace = CGColorSpaceCreateDeviceRGB();
    CGContextRef context = CGBitmapContextCreate(bitmapData, width, height, 8, bitmapBytesPerRow,
                                                 colorspace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
    CGColorSpaceRelease(colorspace);

    CGContextSetBlendMode(context, kCGBlendModeCopy);
    CGContextSetInterpolationQuality(context, kCGInterpolationMedium);
    CGContextDrawImage(context, CGRectMake(0, 0, width, height), rawImageRef);
    CGContextRelease(context);

    // BGRA byte order: blue is at index 0, red at index 2
    return [UIColor colorWithRed:bitmapData[2] / 255.0f
                           green:bitmapData[1] / 255.0f
                            blue:bitmapData[0] / 255.0f
                           alpha:1];
}
CFDataRef abgrData = CGDataProviderCopyData(CGImageGetDataProvider(rawImageRef));
const UInt8 *rawPixelData = CFDataGetBytePtr(abgrData);
...
CFRelease(abgrData);

OpenCV and Windows 10 transparent image

Suppose we have the following color:
const Scalar TRANSPARENT2 = Scalar(255, 0, 255, 0);
which is magenta but fully transparent: alpha = 0 (fully opaque would be 255).
Now I made the following test based on:
http://blogs.msdn.com/b/lucian/archive/2015/12/04/opencv-first-version-up-on-nuget.aspx
WriteableBitmap^ Grabcut::TestTransparent()
{
    Mat res(400, 400, CV_8UC4);
    res.setTo(TRANSPARENT2);

    WriteableBitmap^ wbmp = ref new WriteableBitmap(res.cols, res.rows);
    IBuffer^ buffer = wbmp->PixelBuffer;

    unsigned char* dstPixels;
    ComPtr<IBufferByteAccess> pBufferByteAccess;
    ComPtr<IInspectable> pBuffer((IInspectable*)buffer);
    pBuffer.As(&pBufferByteAccess);
    pBufferByteAccess->Buffer(&dstPixels);

    memcpy(dstPixels, res.data, res.step.buf[1] * res.cols * res.rows);
    return wbmp;
}
The issue I have is that the image created is not fully transparent; it has a bit of alpha.
I understand there is a flaw in the memcpy'd data, but I am not really sure how to solve it. Any idea how to get it to alpha 0?
More details:
To check, I tried saving the image so I could read it back and test whether it works. I saw that imwrite contains a snippet about transparency like in the image, but imwrite is not implemented yet. The transparency method is not working either.
Any light on this snippet?
Thanks.
Finally I did the conversion in the C# code. First, avoid calling CreateAlphaMat.
Then what I did was use a BitmapEncoder to convert the data:
WriteableBitmap wb = new WriteableBitmap(bitmap.PixelWidth, bitmap.PixelHeight);
using (IRandomAccessStream stream = new InMemoryRandomAccessStream())
{
    BitmapEncoder encoder = await BitmapEncoder.CreateAsync(BitmapEncoder.PngEncoderId, stream);
    Stream pixelStream = bitmap.PixelBuffer.AsStream();
    byte[] pixels = new byte[pixelStream.Length];
    await pixelStream.ReadAsync(pixels, 0, pixels.Length);
    encoder.SetPixelData(BitmapPixelFormat.Bgra8, BitmapAlphaMode.Premultiplied,
                         (uint)bitmap.PixelWidth, (uint)bitmap.PixelHeight, 96.0, 96.0, pixels);
    await encoder.FlushAsync();
    wb.SetSource(stream);
}
this.MainImage.Source = wb;
where bitmap is the WriteableBitmap from the OpenCV result. Now the image is fully transparent.
NOTE: Do not use a MemoryStream and then .AsRandomAccessStream, because it won't FlushAsync.
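The BitmapAlphaMode.Premultiplied flag is doing the real work here: in premultiplied storage every color channel is scaled by alpha, so fully transparent magenta (255, 0, 255, alpha 0) becomes (0, 0, 0, 0). A sketch of that conversion for a single BGRA pixel (shown in Swift, but the arithmetic is the same in any language):

// Sketch: convert one straight-alpha BGRA pixel to premultiplied storage.
// Fully transparent magenta (b:255, g:0, r:255, a:0) premultiplies to all zeros.
func premultiply(_ p: (b: UInt8, g: UInt8, r: UInt8, a: UInt8)) -> (b: UInt8, g: UInt8, r: UInt8, a: UInt8) {
    let a = Int(p.a)
    return (UInt8(Int(p.b) * a / 255),
            UInt8(Int(p.g) * a / 255),
            UInt8(Int(p.r) * a / 255),
            p.a)
}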

How to convert an FFMPEG AVFrame in YUVJ420P to an AVFoundation CVPixelBufferRef?

I have an FFMPEG AVFrame in YUVJ420P and I want to convert it to a CVPixelBufferRef with CVPixelBufferCreateWithBytes. The reason I want to do this is to use AVFoundation to show/encode the frames.
I selected kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange and tried converting it, since the AVFrame has the data in three planes:
Y480 Cb240 Cr240. According to what I've researched, this matches the selected kCVPixelFormatType. Being biplanar, though, I need to convert it into a buffer that contains Y480 and CbCr480 interleaved.
I tried to create a buffer with 2 planes:
frame->data[0] on the first plane,
frame->data[1] and frame->data[2] interleaved on the second plane.
However, I'm getting return error -6661 (invalid argument) from CVPixelBufferCreateWithBytes:
"Invalid function parameter. For example, out of range or the wrong type."
I don't have expertise in image processing at all, so any pointers to documentation that can get me started on the right approach to this problem are appreciated. My C skills aren't top of the line either, so maybe I'm making a basic mistake here.
uint8_t **buffer = malloc(2 * sizeof(int *));
buffer[0] = frame->data[0];
buffer[1] = malloc(frame->linesize[0] * sizeof(int));
for (int i = 0; i < frame->linesize[0]; i++) {
    if (i % 2) {
        buffer[1][i] = frame->data[1][i/2];
    } else {
        buffer[1][i] = frame->data[2][i/2];
    }
}
int ret = CVPixelBufferCreateWithBytes(NULL, frame->width, frame->height, kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, buffer, frame->linesize[0], NULL, 0, NULL, cvPixelBufferSample);
The frame is the AVFrame holding the raw data from FFMPEG decoding.
My C skills aren't top of the line either, so maybe I'm making a basic mistake here.
You're making several:
You should be using CVPixelBufferCreateWithPlanarBytes(). I do not know if CVPixelBufferCreateWithBytes() can be used to create a planar video frame; if so, it will require a pointer to a "plane descriptor block" (I can't seem to find the struct in the docs).
frame->linesize[0] is the bytes per row, not the size of the whole image. The docs are unclear, but the usage is fairly unambiguous.
frame->linesize[0] refers to the Y plane; you care about the UV planes.
Where is sizeof(int) from?
You're passing in cvPixelBufferSample; you might mean &cvPixelBufferSample.
You're not passing in a release callback. The documentation does not say that you can pass NULL.
Try something like this:
size_t srcPlaneSize = frame->linesize[1] * frame->height / 2;
size_t dstPlaneSize = srcPlaneSize * 2;
uint8_t *dstPlane = malloc(dstPlaneSize);
void *planeBaseAddress[2] = { frame->data[0], dstPlane };

// This loop is very naive and assumes that the line sizes are the same.
// It also copies padding bytes.
assert(frame->linesize[1] == frame->linesize[2]);
for (size_t i = 0; i < srcPlaneSize; i++) {
    // These might be the wrong way round.
    dstPlane[2*i]   = frame->data[2][i];
    dstPlane[2*i+1] = frame->data[1][i];
}

// This assumes the width and height are even (it's 4:2:0, after all).
assert(frame->width % 2 == 0 && frame->height % 2 == 0);
size_t planeWidth[2] = { frame->width, frame->width / 2 };
size_t planeHeight[2] = { frame->height, frame->height / 2 };
// I'm not sure where you'd get this.
size_t planeBytesPerRow[2] = { frame->linesize[0], frame->linesize[1] * 2 };

int ret = CVPixelBufferCreateWithPlanarBytes(
    NULL,
    frame->width,
    frame->height,
    kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange,
    NULL,
    0,
    2,
    planeBaseAddress,
    planeWidth,
    planeHeight,
    planeBytesPerRow,
    YOUR_RELEASE_CALLBACK,
    YOUR_RELEASE_CALLBACK_CONTEXT,
    NULL,
    &cvPixelBufferSample);
Memory management is left as an exercise to the reader, but for test code you might get away with passing in NULL instead of a release callback.
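Alternatively, you can sidestep the release callback entirely by letting Core Video allocate and own the buffer, then copying the planes in. A rough Swift sketch of that approach (the plane pointers and strides stand in for frame->data[i] and frame->linesize[i]):

import CoreVideo
import Foundation

// Sketch: build an NV12-style CVPixelBuffer that owns its memory.
// Copy the Y plane row by row, then interleave Cb/Cr by hand.
func makeBiPlanarBuffer(width: Int, height: Int,
                        yPlane: UnsafePointer<UInt8>, yStride: Int,
                        uPlane: UnsafePointer<UInt8>, uStride: Int,
                        vPlane: UnsafePointer<UInt8>, vStride: Int) -> CVPixelBuffer? {
    var pixelBuffer: CVPixelBuffer?
    guard CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                              kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange,
                              nil, &pixelBuffer) == kCVReturnSuccess,
          let buffer = pixelBuffer else { return nil }

    CVPixelBufferLockBaseAddress(buffer, [])
    defer { CVPixelBufferUnlockBaseAddress(buffer, []) }

    // Luma plane: source and destination strides may differ, so copy per row.
    let dstY = CVPixelBufferGetBaseAddressOfPlane(buffer, 0)!.assumingMemoryBound(to: UInt8.self)
    let dstYStride = CVPixelBufferGetBytesPerRowOfPlane(buffer, 0)
    for row in 0..<height {
        memcpy(dstY + row * dstYStride, yPlane + row * yStride, width)
    }

    // Chroma plane: interleave Cb and Cr samples (half resolution in both axes).
    let dstUV = CVPixelBufferGetBaseAddressOfPlane(buffer, 1)!.assumingMemoryBound(to: UInt8.self)
    let dstUVStride = CVPixelBufferGetBytesPerRowOfPlane(buffer, 1)
    for row in 0..<(height / 2) {
        for col in 0..<(width / 2) {
            dstUV[row * dstUVStride + 2 * col]     = uPlane[row * uStride + col]
            dstUV[row * dstUVStride + 2 * col + 1] = vPlane[row * vStride + col]
        }
    }
    return buffer
}

Since CVPixelBufferCreate owns its backing store, nothing needs to be freed when the buffer goes away; the cost is one extra copy of each plane.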
