Encoding raw YUV420P to h264 with AVCodec on iOS

I am trying to encode a single YUV420P image gathered from a CMSampleBuffer to an AVPacket so that I can send h264 video over the network with RTMP.
The posted code example seems to work, as avcodec_encode_video2 returns 0 (success); however, got_output is also 0 (the AVPacket is empty).
Does anyone have any experience with encoding video on iOS devices that might know what I am doing wrong?
- (void) captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
fromConnection:(AVCaptureConnection *)connection {
// sampleBuffer now contains an individual frame of raw video frames
CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CVPixelBufferLockBaseAddress(pixelBuffer, 0);
// access the data
int width = CVPixelBufferGetWidth(pixelBuffer);
int height = CVPixelBufferGetHeight(pixelBuffer);
int bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
unsigned char *rawPixelBase = (unsigned char *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
// Convert the raw pixel base to h.264 format
AVCodec *codec = 0;
AVCodecContext *context = 0;
AVFrame *frame = 0;
AVPacket packet;
//avcodec_init();
avcodec_register_all();
codec = avcodec_find_encoder(AV_CODEC_ID_H264);
if (codec == 0) {
NSLog(#"Codec not found!!");
return;
}
context = avcodec_alloc_context3(codec);
if (!context) {
NSLog(#"Context no bueno.");
return;
}
// Bit rate
context->bit_rate = 400000; // HARD CODE
context->bit_rate_tolerance = 10;
// Resolution
context->width = width;
context->height = height;
// Frames Per Second
context->time_base = (AVRational) {1,25};
context->gop_size = 1;
//context->max_b_frames = 1;
context->pix_fmt = PIX_FMT_YUV420P;
// Open the codec
if (avcodec_open2(context, codec, 0) < 0) {
NSLog(#"Unable to open codec");
return;
}
// Create the frame
frame = avcodec_alloc_frame();
if (!frame) {
NSLog(#"Unable to alloc frame");
return;
}
frame->format = context->pix_fmt;
frame->width = context->width;
frame->height = context->height;
avpicture_fill((AVPicture *) frame, rawPixelBase, context->pix_fmt, frame->width, frame->height);
int got_output = 0;
av_init_packet(&packet);
avcodec_encode_video2(context, &packet, frame, &got_output);
// Unlock the pixel data
CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
// Send the data over the network
[self uploadData:[NSData dataWithBytes:packet.data length:packet.size] toRTMP:self.rtmp_OutVideoStream];
}
Note: It is known that this code has memory leaks because I am not freeing the memory that is dynamically allocated.
UPDATE
I updated my code to use @pogorskiy's method. I only try to upload the frame if got_output returns 1, and I clear the packet buffer once I am done encoding video frames.
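For reference, a minimal sketch of what the updated encode-and-upload step might look like (uploadData:toRTMP: and rtmp_OutVideoStream are the names from the code above; the encoder may buffer frames internally, so got_output can legitimately stay 0 for the first few calls):
int got_output = 0;
av_init_packet(&packet);
packet.data = NULL; // let the encoder allocate the output payload
packet.size = 0;
int ret = avcodec_encode_video2(context, &packet, frame, &got_output);
if (ret == 0 && got_output == 1) {
    // only upload when the encoder actually produced a packet
    [self uploadData:[NSData dataWithBytes:packet.data length:packet.size] toRTMP:self.rtmp_OutVideoStream];
    av_free_packet(&packet); // clear the packet buffer once the data has been copied
}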

Related

Why is Yp Cb Cr image buffer all shuffled in iOS 13?

I have a computer vision app that takes grayscale images from the sensor and processes them. The image acquisition for iOS is written in Obj-C and the image processing is performed in C++ using OpenCV. As I only need the luminance data, I acquire the image in YUV (or Yp Cb Cr) 420 bi-planar full-range format and just assign the buffer's data to an OpenCV Mat object (see acquisition code below). This worked great so far, until the brand new iOS 13 came out... For some reason, on iOS 13 the image I obtain is misaligned, resulting in diagonal stripes. By looking at the image I obtain, I suspect this is the consequence of a change in the ordering of the buffer's Y, Cb, and Cr components or a change in the buffer's stride. Does anyone know if iOS 13 introduced this kind of change and how I could update my code to avoid it, preferably in a backward-compatible manner?
Here is my image acquisition code:
//capture config
- (void)initialize {
AVCaptureDevice *frontCameraDevice;
NSArray *devices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
for (AVCaptureDevice *device in devices) {
if (device.position == AVCaptureDevicePositionFront) {
frontCameraDevice = device;
}
}
if (frontCameraDevice == nil) {
NSLog(#"Front camera device not found");
return;
}
_session = [[AVCaptureSession alloc] init];
_session.sessionPreset = AVCaptureSessionPreset640x480;
NSError *error = nil;
AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:frontCameraDevice error: &error];
if (error != nil) {
NSLog(#"Error getting front camera device input: %#", error);
}
if ([_session canAddInput:input]) {
[_session addInput:input];
} else {
NSLog(#"Could not add front camera device input to session");
}
AVCaptureVideoDataOutput *videoOutput = [[AVCaptureVideoDataOutput alloc] init];
// This is the default, but making it explicit
videoOutput.alwaysDiscardsLateVideoFrames = YES;
if ([videoOutput.availableVideoCVPixelFormatTypes containsObject:
[NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarFullRange]]) {
OSType format = kCVPixelFormatType_420YpCbCr8BiPlanarFullRange;
videoOutput.videoSettings = [NSDictionary dictionaryWithObject:[NSNumber numberWithUnsignedInt:format]
forKey:(id)kCVPixelBufferPixelFormatTypeKey];
} else {
NSLog(#"YUV format not available");
}
[videoOutput setSampleBufferDelegate:self queue:dispatch_queue_create("extrapage.camera.capture.sample.buffer.delegate", DISPATCH_QUEUE_SERIAL)];
if ([_session canAddOutput:videoOutput]) {
[_session addOutput:videoOutput];
} else {
NSLog(#"Could not add video output to session");
}
AVCaptureConnection *captureConnection = [videoOutput connectionWithMediaType:AVMediaTypeVideo];
captureConnection.videoOrientation = AVCaptureVideoOrientationPortrait;
}
//acquisition code
- (void)captureOutput:(AVCaptureOutput *)output didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
if (_listener != nil) {
CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
OSType format = CVPixelBufferGetPixelFormatType(pixelBuffer);
NSAssert(format == kCVPixelFormatType_420YpCbCr8BiPlanarFullRange, @"Only YUV is supported");
// The first plane / channel (at index 0) is the grayscale plane
// See more information about the YUV format
// http://en.wikipedia.org/wiki/YUV
CVPixelBufferLockBaseAddress(pixelBuffer, 0);
void *baseaddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
CGFloat width = CVPixelBufferGetWidth(pixelBuffer);
CGFloat height = CVPixelBufferGetHeight(pixelBuffer);
cv::Mat frame(height, width, CV_8UC1, baseaddress, 0);
[_listener onNewFrame:frame];
CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
}
}
I found the solution to this problem. It was a row stride issue: apparently, in iOS 13, the row stride of the Yp Cb Cr 4:2:0 8-bit bi-planar buffer was changed, maybe so that it is always a power of 2. Therefore, in some cases, the row stride is no longer the same as the width. That was the case for me. The fix is easy: just get the row stride from the buffer's info and pass it to the OpenCV Mat constructor, as shown below.
void *baseaddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
size_t width = CVPixelBufferGetWidthOfPlane(pixelBuffer, 0);
size_t height = CVPixelBufferGetHeightOfPlane(pixelBuffer, 0);
size_t bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
cv::Mat frame(height, width, CV_8UC1, baseaddress, bytesPerRow);
Note that I also changed how I get the width and height, by using the dimensions of the plane instead of those of the buffer. For the Y plane, they should always be the same, so I am not sure whether this makes a difference.
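If anything downstream assumes the pixel data is contiguous (for example, if the Mat is handed off to another thread after the pixel buffer is unlocked), a copy can be made once the stride-aware Mat wraps the plane. A minimal sketch, using the variables from the fix above (the name contiguous is just for the illustration):
cv::Mat frame(height, width, CV_8UC1, baseaddress, bytesPerRow);
cv::Mat contiguous = frame.clone(); // copies row by row, dropping the per-row padding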
Also be careful: after the Xcode update to support the iOS 13 SDK, I had to uninstall my app from the test device because otherwise, Xcode kept running the old version instead of the newly compiled one.
This is not an answer, but we have a similar problem. I tried different resolutions used in photo capture; only one resolution (2592 x 1936) does not work, the other resolutions do work. I think changing the resolution, to 1440x1920 for example, may be a workaround for your problem.

Saving high quality images, doing live processing - what's the best approach?

I'm still learning about AVFoundation, so I'm unsure how best I should approach the problem of needing to capture a high quality still image, but provide a low-quality preview video stream.
I've got an app that needs to take high quality images (AVCaptureSessionPresetPhoto), but process the preview video stream using OpenCV - for which a much lower resolution is acceptable. Simply using the base OpenCV Video Camera class is no good, as setting the defaultAVCaptureSessionPreset to AVCaptureSessionPresetPhoto results in the full resolution frame being passed to processImage - which is very slow indeed.
How can I have a high-quality connection to the device that I can use for capturing the still image, and a low-quality connection that can be processed and displayed? A description of how I need to set up sessions/connections would be very helpful. Is there an open-source example of such an app?
I did something similar - I grabbed the pixels in the delegate method, made a CGImageRef of them, then dispatched that to the normal-priority queue, where it was modified. Since AVFoundation must be using a CADisplayLink for the callback method, it has the highest priority. In my particular case I was not grabbing all the pixels, so it worked on an iPhone 4 at 30 fps. Depending on which devices you want to run on, you have trade-offs in number of pixels, fps, and so on.
Another idea is to grab a power-of-2 subset of pixels - for instance, every 4th pixel in each row and every 4th row, as sketched below. Again, I did something similar in my app at 20-30 fps. You can then operate further on this smaller image in dispatched blocks.
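A rough sketch of the every-4th-pixel idea, assuming a 32-bit BGRA pixel buffer locked as in the code below (variable names other than imageBuffer are made up for the illustration):
// take every 4th pixel of every 4th row from a 32-bit BGRA buffer
uint8_t *base = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
size_t smallWidth = CVPixelBufferGetWidth(imageBuffer) / 4;
size_t smallHeight = CVPixelBufferGetHeight(imageBuffer) / 4;
uint8_t *small = (uint8_t *)malloc(smallWidth * smallHeight * 4); // caller frees
for (size_t y = 0; y < smallHeight; y++) {
    uint32_t *srcRow = (uint32_t *)(base + (y * 4) * bytesPerRow);
    uint32_t *dstRow = (uint32_t *)(small + y * smallWidth * 4);
    for (size_t x = 0; x < smallWidth; x++) {
        dstRow[x] = srcRow[x * 4]; // one pixel out of every 4 in the row
    }
}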
If this seems daunting offer a bounty for working code.
CODE:
// Image is oriented with bottle neck to the left and the bottle bottom on the right
- (void)captureOutput:(AVCaptureVideoDataOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
#if 1
AVCaptureDevice *camera = [(AVCaptureDeviceInput *)[captureSession.inputs lastObject] device];
if(camera.adjustingWhiteBalance || camera.adjustingExposure) NSLog(@"GOTCHA: %d %d", camera.adjustingWhiteBalance, camera.adjustingExposure);
printf("foo\n");
#endif
if(saveState != saveOne && saveState != saveAll) return;
@autoreleasepool {
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
//NSLog(#"PE: value=%lld timeScale=%d flags=%x", prStamp.value, prStamp.timescale, prStamp.flags);
/*Lock the image buffer*/
CVPixelBufferLockBaseAddress(imageBuffer,0);
NSRange captureRange;
if(saveState == saveOne) {
#if 0 // B G R A MODE !
NSLog(#"PIXEL_TYPE: 0x%lx", CVPixelBufferGetPixelFormatType(imageBuffer));
uint8_t *newPtr = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
NSLog(#"ONE VAL %x %x %x %x", newPtr[0], newPtr[1], newPtr[2], newPtr[3]);
}
exit(0);
#endif
[edgeFinder setupImageBuffer:imageBuffer];
BOOL success = [edgeFinder delineate:1];
if(!success) {
dispatch_async(dispatch_get_main_queue(), ^{ edgeFinder = nil; [delegate error]; });
saveState = saveNone;
} else
bottleRange = edgeFinder.sides;
xRange.location = edgeFinder.shoulder;
xRange.length = edgeFinder.bottom - xRange.location;
NSLog(#"bottleRange 1: %# neck=%d bottom=%d", NSStringFromRange(bottleRange), edgeFinder.shoulder, edgeFinder.bottom );
//searchRows = [edgeFinder expandRange:bottleRange];
rowsPerSwath = lrintf((bottleRange.length*NUM_DEGREES_TO_GRAB)*(float)M_PI/360.0f);
NSLog(#"rowsPerSwath = %d", rowsPerSwath);
saveState = saveIdling;
captureRange = NSMakeRange(0, [WLIPBase numRows]);
dispatch_async(dispatch_get_main_queue(), ^
{
[delegate focusDone];
edgeFinder = nil;
captureOutput.alwaysDiscardsLateVideoFrames = YES;
});
} else {
NSInteger rows = rowsPerSwath;
NSInteger newOffset = bottleRange.length - rows;
if(newOffset & 1) {
--newOffset;
++rows;
}
captureRange = NSMakeRange(bottleRange.location + newOffset/2, rows);
}
//NSLog(#"captureRange=%u %u", captureRange.location, captureRange.length);
/*Get information about the image*/
uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
size_t width = CVPixelBufferGetWidth(imageBuffer);
// Note Apple sample code cheats big time - the phone is big endian so this reverses the "apparent" order of bytes
CGContextRef newContext = CGBitmapContextCreate(NULL, width, captureRange.length, 8, bytesPerRow, colorSpace, kCGImageAlphaNoneSkipFirst | kCGBitmapByteOrder32Little); // Video in ARGB format
assert(newContext);
uint8_t *newPtr = (uint8_t *)CGBitmapContextGetData(newContext);
size_t offset = captureRange.location * bytesPerRow;
memcpy(newPtr, baseAddress + offset, captureRange.length * bytesPerRow);
CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
OSAtomicIncrement32(&totalImages);
int32_t curDepth = OSAtomicIncrement32(&queueDepth);
if(curDepth > maxDepth) maxDepth = curDepth;
#define kImageContext @"kImageContext"
#define kState @"kState"
#define kPresTime @"kPresTime"
CMTime prStamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer); // when it was taken?
//CMTime deStamp = CMSampleBufferGetDecodeTimeStamp(sampleBuffer); // now?
NSDictionary *dict = [NSDictionary dictionaryWithObjectsAndKeys:
[NSValue valueWithBytes:&saveState objCType:#encode(saveImages)], kState,
[NSValue valueWithNonretainedObject:(__bridge id)newContext], kImageContext,
[NSValue valueWithBytes:&prStamp objCType:#encode(CMTime)], kPresTime,
nil ];
dispatch_async(imageQueue, ^
{
// could be on any thread now
OSAtomicDecrement32(&queueDepth);
if(!isCancelled) {
saveImages state; [(NSValue *)[dict objectForKey:kState] getValue:&state];
CGContextRef context; [(NSValue *)[dict objectForKey:kImageContext] getValue:&context];
CMTime stamp; [(NSValue *)[dict objectForKey:kPresTime] getValue:&stamp];
CGImageRef newImageRef = CGBitmapContextCreateImage(context);
CGContextRelease(context);
UIImageOrientation orient = state == saveOne ? UIImageOrientationLeft : UIImageOrientationUp;
UIImage *image = [UIImage imageWithCGImage:newImageRef scale:1.0 orientation:orient]; // imageWithCGImage: UIImageOrientationUp UIImageOrientationLeft
CGImageRelease(newImageRef);
NSData *data = UIImagePNGRepresentation(image);
// NSLog(#"STATE:[%d]: value=%lld timeScale=%d flags=%x", state, stamp.value, stamp.timescale, stamp.flags);
{
NSString *name = [NSString stringWithFormat:@"%d.png", num];
NSString *path = [[wlAppDelegate snippetsDirectory] stringByAppendingPathComponent:name];
BOOL ret = [data writeToFile:path atomically:NO];
//NSLog(@"WROTE %d err=%d w/time %f path:%@", num, ret, (double)stamp.value/(double)stamp.timescale, path);
if(!ret) {
++errors;
} else {
dispatch_async(dispatch_get_main_queue(), ^
{
if(num) [delegate progress:(CGFloat)num/(CGFloat)(MORE_THAN_ONE_REV * SNAPS_PER_SEC) file:path];
} );
}
++num;
}
} else NSLog(#"CANCELLED");
} );
}
}
With AVCaptureSessionPresetPhoto, the session uses a small video preview (about 1000x700 on an iPhone 6) and a high-resolution photo (about 3000x2000).
So I use a modified 'CvPhotoCamera' class to process the small preview and take the photo at full size. I posted this code here: https://stackoverflow.com/a/31478505/1994445
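For reference, a rough sketch of the session layout this implies - one photo-preset session feeding both a video data output (the small preview frames for OpenCV) and a still image output (the full-resolution capture). This is not the CvPhotoCamera code from the link, just the underlying AVFoundation arrangement of that era, with error handling and canAddInput/canAddOutput checks omitted:
AVCaptureSession *session = [[AVCaptureSession alloc] init];
session.sessionPreset = AVCaptureSessionPresetPhoto;
AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:camera error:nil];
[session addInput:input];
// low-resolution preview frames delivered to captureOutput:didOutputSampleBuffer:fromConnection:
AVCaptureVideoDataOutput *videoOutput = [[AVCaptureVideoDataOutput alloc] init];
videoOutput.alwaysDiscardsLateVideoFrames = YES;
[videoOutput setSampleBufferDelegate:self queue:dispatch_queue_create("preview.processing", DISPATCH_QUEUE_SERIAL)];
[session addOutput:videoOutput];
// full-resolution still capture, requested on demand with
// captureStillImageAsynchronouslyFromConnection:completionHandler:
AVCaptureStillImageOutput *stillOutput = [[AVCaptureStillImageOutput alloc] init];
[session addOutput:stillOutput];
[session startRunning];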

The iPhone 3GS and kCVPixelFormatType_420YpCbCr8BiPlanarFullRange format?

I'm developing an iOS application with latest SDK and testing it on an iPhone 3GS.
I'm doing this on init method:
CFDictionaryRef formatDictionary = CVPixelFormatDescriptionCreateWithPixelFormatType(kCFAllocatorDefault, kCVPixelFormatType_420YpCbCr8BiPlanarFullRange);
CFNumberRef val = (CFNumberRef) CFDictionaryGetValue(formatDictionary, kCVPixelFormatBitsPerBlock);
if (val != nil)
{
CFNumberGetValue(val,kCFNumberSInt8Type, &_bytesPerPixel);
_bytesPerPixel /= 8;
}
else
_bytesPerPixel = 4;
But val is always nil.
And here:
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
fromConnection:(AVCaptureConnection *)connection
{
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
//Lock the image buffer//
CVPixelBufferLockBaseAddress(imageBuffer,0);
//Get information about the image//
uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
//size_t stride = CVPixelBufferGetBytesPerRow(imageBuffer);
//put buffer in open cv, no memory copied
cv::Mat image = cv::Mat(height, width, CV_8UC4, baseAddress);
// copy the image
//cv::Mat copied_image = image.clone();
[_previewBufferLock lock];
memcpy(baseAddress, _lastFrame, _previewSurfaceHeight * _previewSurfaceWidth * _bytesPerPixel);
[_previewBufferLock unlock];
//We unlock the image buffer//
CVPixelBufferUnlockBaseAddress(imageBuffer,0);
}
I added a breakpoint on the memcpy line and inspected my variable values (screenshot not included here).
But I'm getting an EXC_BAD_ACCESS here: memcpy(baseAddress, _lastFrame, _previewSurfaceHeight * _previewSurfaceWidth * _bytesPerPixel);
Does the iPhone 3GS support kCVPixelFormatType_420YpCbCr8BiPlanarFullRange?
No, the iPhone 3GS does not have support for kCVPixelFormatType_420YpCbCr8BiPlanarFullRange as an output buffer type from the camera, only kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange. The iPhone 4S and iPhone 5 have support for the full-range YUV output, but older devices do not.
You can test this availability using the following code:
videoOutput = [[AVCaptureVideoDataOutput alloc] init];
BOOL supportsFullYUVRange = NO;
NSArray *supportedPixelFormats = videoOutput.availableVideoCVPixelFormatTypes;
for (NSNumber *currentPixelFormat in supportedPixelFormats)
{
if ([currentPixelFormat intValue] == kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)
{
supportsFullYUVRange = YES;
}
}
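As a follow-up (a sketch, not tested on a 3GS): once supportsFullYUVRange is known, the output can be configured with the full-range format where available and fall back to the video-range format on older devices:
OSType format = supportsFullYUVRange ?
    kCVPixelFormatType_420YpCbCr8BiPlanarFullRange :
    kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange;
videoOutput.videoSettings = [NSDictionary dictionaryWithObject:[NSNumber numberWithUnsignedInt:format]
                                                         forKey:(id)kCVPixelBufferPixelFormatTypeKey];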

why is audio coming up garbled when using AVAssetReader with audio queue

Based on my research, people keep saying it's caused by mismatched/wrong formatting, but I'm using LPCM formatting for both input and output. How can you go wrong with that? The result I'm getting is just noise (like white noise).
I've decided to just paste my entire code; perhaps that will help:
#import "AppDelegate.h"
#import "ViewController.h"
@implementation AppDelegate
@synthesize window = _window;
@synthesize viewController = _viewController;
- (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions
{
self.window = [[UIWindow alloc] initWithFrame:[[UIScreen mainScreen] bounds]];
// Override point for customization after application launch.
self.viewController = [[ViewController alloc] initWithNibName:@"ViewController" bundle:nil];
self.window.rootViewController = self.viewController;
[self.window makeKeyAndVisible];
// Insert code here to initialize your application
player = [[Player alloc] init];
[self setupReader];
[self setupQueue];
// initialize reader in a new thread
internalThread =[[NSThread alloc]
initWithTarget:self
selector:@selector(readPackets)
object:nil];
[internalThread start];
// start the queue. this function returns immediately and begins
// invoking the callback, as needed, asynchronously.
//CheckError(AudioQueueStart(queue, NULL), "AudioQueueStart failed");
// and wait
printf("Playing...\n");
do
{
CFRunLoopRunInMode(kCFRunLoopDefaultMode, 0.25, false);
} while (!player.isDone /*|| gIsRunning*/);
// isDone represents the state of the Audio File enqueuing. This does not mean the
// Audio Queue is actually done playing yet. Since we have 3 half-second buffers in-flight
// continue to run for a short additional time so they can be processed
CFRunLoopRunInMode(kCFRunLoopDefaultMode, 2, false);
// end playback
player.isDone = true;
CheckError(AudioQueueStop(queue, TRUE), "AudioQueueStop failed");
cleanup:
AudioQueueDispose(queue, TRUE);
AudioFileClose(player.playbackFile);
return YES;
}
- (void) setupReader
{
NSURL *assetURL = [NSURL URLWithString:@"ipod-library://item/item.m4a?id=1053020204400037178"]; // from ilham's ipod
AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetURL options:nil];
// from AVAssetReader Class Reference:
// AVAssetReader is not intended for use with real-time sources,
// and its performance is not guaranteed for real-time operations.
NSError * error = nil;
AVAssetReader* reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
AVAssetTrack* track = [songAsset.tracks objectAtIndex:0];
readerOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:track
outputSettings:nil];
// AVAssetReaderOutput* readerOutput = [[AVAssetReaderAudioMixOutput alloc] initWithAudioTracks:songAsset.tracks audioSettings:nil];
[reader addOutput:readerOutput];
[reader startReading];
}
- (void) setupQueue
{
// get the audio data format from the file
// we know that it is PCM.. since it's converted
AudioStreamBasicDescription dataFormat;
dataFormat.mSampleRate = 44100.0;
dataFormat.mFormatID = kAudioFormatLinearPCM;
dataFormat.mFormatFlags = kAudioFormatFlagIsBigEndian | kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
dataFormat.mBytesPerPacket = 4;
dataFormat.mFramesPerPacket = 1;
dataFormat.mBytesPerFrame = 4;
dataFormat.mChannelsPerFrame = 2;
dataFormat.mBitsPerChannel = 16;
// create a output (playback) queue
CheckError(AudioQueueNewOutput(&dataFormat, // ASBD
MyAQOutputCallback, // Callback
(__bridge void *)self, // user data
NULL, // run loop
NULL, // run loop mode
0, // flags (always 0)
&queue), // output: reference to AudioQueue object
"AudioQueueNewOutput failed");
// adjust buffer size to represent about a half second (0.5) of audio based on this format
CalculateBytesForTime(dataFormat, 0.5, &bufferByteSize, &player->numPacketsToRead);
// check if we are dealing with a VBR file. ASBDs for VBR files always have
// mBytesPerPacket and mFramesPerPacket as 0 since they can fluctuate at any time.
// If we are dealing with a VBR file, we allocate memory to hold the packet descriptions
bool isFormatVBR = (dataFormat.mBytesPerPacket == 0 || dataFormat.mFramesPerPacket == 0);
if (isFormatVBR)
player.packetDescs = (AudioStreamPacketDescription*)malloc(sizeof(AudioStreamPacketDescription) * player.numPacketsToRead);
else
player.packetDescs = NULL; // we don't provide packet descriptions for constant bit rate formats (like linear PCM)
// get magic cookie from file and set on queue
MyCopyEncoderCookieToQueue(player.playbackFile, queue);
// allocate the buffers and prime the queue with some data before starting
player.isDone = false;
player.packetPosition = 0;
int i;
for (i = 0; i < kNumberPlaybackBuffers; ++i)
{
CheckError(AudioQueueAllocateBuffer(queue, bufferByteSize, &audioQueueBuffers[i]), "AudioQueueAllocateBuffer failed");
// EOF (the entire file's contents fit in the buffers)
if (player.isDone)
break;
}
}
-(void)readPackets
{
// initialize a mutex and condition so that we can block on buffers in use.
pthread_mutex_init(&queueBuffersMutex, NULL);
pthread_cond_init(&queueBufferReadyCondition, NULL);
state = AS_BUFFERING;
while ((sample = [readerOutput copyNextSampleBuffer])) {
AudioBufferList audioBufferList;
CMBlockBufferRef CMBuffer = CMSampleBufferGetDataBuffer( sample );
CheckError(CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
sample,
NULL,
&audioBufferList,
sizeof(audioBufferList),
NULL,
NULL,
kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
&CMBuffer
),
"could not read samples");
AudioBuffer audioBuffer = audioBufferList.mBuffers[0];
UInt32 inNumberBytes = audioBuffer.mDataByteSize;
size_t incomingDataOffset = 0;
while (inNumberBytes) {
size_t bufSpaceRemaining;
bufSpaceRemaining = bufferByteSize - bytesFilled;
@synchronized(self)
{
bufSpaceRemaining = bufferByteSize - bytesFilled;
size_t copySize;
if (bufSpaceRemaining < inNumberBytes)
{
copySize = bufSpaceRemaining;
}
else
{
copySize = inNumberBytes;
}
// copy data to the audio queue buffer
AudioQueueBufferRef fillBuf = audioQueueBuffers[fillBufferIndex];
memcpy((char*)fillBuf->mAudioData + bytesFilled, (const char*)(audioBuffer.mData + incomingDataOffset), copySize);
// keep track of bytes filled
bytesFilled +=copySize;
incomingDataOffset +=copySize;
inNumberBytes -=copySize;
}
// if the space remaining in the buffer is not enough for this packet, then enqueue the buffer.
if (bufSpaceRemaining < inNumberBytes + bytesFilled)
{
[self enqueueBuffer];
}
}
}
}
-(void)enqueueBuffer
{
@synchronized(self)
{
inuse[fillBufferIndex] = true; // set in use flag
buffersUsed++;
// enqueue buffer
AudioQueueBufferRef fillBuf = audioQueueBuffers[fillBufferIndex];
NSLog(#"we are now enqueing buffer %d",fillBufferIndex);
fillBuf->mAudioDataByteSize = bytesFilled;
err = AudioQueueEnqueueBuffer(queue, fillBuf, 0, NULL);
if (err)
{
NSLog(#"could not enqueue queue with buffer");
return;
}
if (state == AS_BUFFERING)
{
//
// Fill all the buffers before starting. This ensures that the
// AudioFileStream stays a small amount ahead of the AudioQueue to
// avoid an audio glitch playing streaming files on iPhone SDKs < 3.0
//
if (buffersUsed == kNumberPlaybackBuffers - 1)
{
err = AudioQueueStart(queue, NULL);
if (err)
{
NSLog(#"couldn't start queue");
return;
}
state = AS_PLAYING;
}
}
// go to next buffer
if (++fillBufferIndex >= kNumberPlaybackBuffers) fillBufferIndex = 0;
bytesFilled = 0; // reset bytes filled
}
// wait until next buffer is not in use
pthread_mutex_lock(&queueBuffersMutex);
while (inuse[fillBufferIndex])
{
pthread_cond_wait(&queueBufferReadyCondition, &queueBuffersMutex);
}
pthread_mutex_unlock(&queueBuffersMutex);
}
#pragma mark - utility functions -
// generic error handler - if err is nonzero, prints error message and exits program.
static void CheckError(OSStatus error, const char *operation)
{
if (error == noErr) return;
char str[20];
// see if it appears to be a 4-char-code
*(UInt32 *)(str + 1) = CFSwapInt32HostToBig(error);
if (isprint(str[1]) && isprint(str[2]) && isprint(str[3]) && isprint(str[4])) {
str[0] = str[5] = '\'';
str[6] = '\0';
} else
// no, format it as an integer
sprintf(str, "%d", (int)error);
fprintf(stderr, "Error: %s (%s)\n", operation, str);
exit(1);
}
// we only use time here as a guideline
// we're really trying to get somewhere between 16K and 64K buffers, but not allocate too much if we don't need it
void CalculateBytesForTime(AudioStreamBasicDescription inDesc, Float64 inSeconds, UInt32 *outBufferSize, UInt32 *outNumPackets)
{
// we need to calculate how many packets we read at a time, and how big a buffer we need.
// we base this on the size of the packets in the file and an approximate duration for each buffer.
//
// first check to see what the max size of a packet is, if it is bigger than our default
// allocation size, that needs to become larger
// we don't have access to file packet size, so we just default it to maxBufferSize
UInt32 maxPacketSize = 0x10000;
static const int maxBufferSize = 0x10000; // limit size to 64K
static const int minBufferSize = 0x4000; // limit size to 16K
if (inDesc.mFramesPerPacket) {
Float64 numPacketsForTime = inDesc.mSampleRate / inDesc.mFramesPerPacket * inSeconds;
*outBufferSize = numPacketsForTime * maxPacketSize;
} else {
// if frames per packet is zero, then the codec has no predictable packet == time
// so we can't tailor this (we don't know how many packets represent a time period)
// we'll just return a default buffer size
*outBufferSize = maxBufferSize > maxPacketSize ? maxBufferSize : maxPacketSize;
}
// we're going to limit our size to our default
if (*outBufferSize > maxBufferSize && *outBufferSize > maxPacketSize)
*outBufferSize = maxBufferSize;
else {
// also make sure we're not too small - we don't want to go the disk for too small chunks
if (*outBufferSize < minBufferSize)
*outBufferSize = minBufferSize;
}
*outNumPackets = *outBufferSize / maxPacketSize;
}
// many encoded formats require a 'magic cookie'. if the file has a cookie we get it
// and configure the queue with it
static void MyCopyEncoderCookieToQueue(AudioFileID theFile, AudioQueueRef queue ) {
UInt32 propertySize;
OSStatus result = AudioFileGetPropertyInfo (theFile, kAudioFilePropertyMagicCookieData, &propertySize, NULL);
if (result == noErr && propertySize > 0)
{
Byte* magicCookie = (UInt8*)malloc(sizeof(UInt8) * propertySize);
CheckError(AudioFileGetProperty (theFile, kAudioFilePropertyMagicCookieData, &propertySize, magicCookie), "get cookie from file failed");
CheckError(AudioQueueSetProperty(queue, kAudioQueueProperty_MagicCookie, magicCookie, propertySize), "set cookie on queue failed");
free(magicCookie);
}
}
#pragma mark - audio queue -
static void MyAQOutputCallback(void *inUserData, AudioQueueRef inAQ, AudioQueueBufferRef inCompleteAQBuffer)
{
AppDelegate *appDelegate = (__bridge AppDelegate *) inUserData;
[appDelegate myCallback:inUserData
inAudioQueue:inAQ
audioQueueBufferRef:inCompleteAQBuffer];
}
- (void)myCallback:(void *)userData
inAudioQueue:(AudioQueueRef)inAQ
audioQueueBufferRef:(AudioQueueBufferRef)inCompleteAQBuffer
{
unsigned int bufIndex = -1;
for (unsigned int i = 0; i < kNumberPlaybackBuffers; ++i)
{
if (inCompleteAQBuffer == audioQueueBuffers[i])
{
bufIndex = i;
break;
}
}
if (bufIndex == -1)
{
NSLog(#"something went wrong at queue callback");
return;
}
// signal waiting thread that the buffer is free.
pthread_mutex_lock(&queueBuffersMutex);
NSLog(#"signalling that buffer %d is free",bufIndex);
inuse[bufIndex] = false;
buffersUsed--;
pthread_cond_signal(&queueBufferReadyCondition);
pthread_mutex_unlock(&queueBuffersMutex);
}
@end
Update:
btomw's answer below solved the problem magnificently. But I want to get to the bottom of this (most novice developers like myself, and even btomw when he first started, usually shoot in the dark with parameters, formatting, etc. - see here for an example).
The reason why I provided nil as a parameter for
AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetURL options:audioReadSettings];
was because, according to the documentation and trial and error, I realized that any formatting I put other than LPCM would be rejected outright. In other words, when you use AVAssetReader for conversion, the result is always LPCM. So I thought the default format was LPCM anyway, and so I left it as nil... but I guess I was wrong.
The weird part in this (please correct me, anyone, if I'm wrong) is that, as I mentioned, suppose the original file was .mp3 and my intention was to play it back (or send the packets over a network, etc.) as mp3, and so I provided an mp3 ASBD: the asset reader will crash! So if I wanted to send it in its original form, would I just supply nil? The obvious problem with this is that there would be no way for me to figure out what ASBD it has once I receive it on the other side... or could I?
Update 2: You can download the code from GitHub.
So here's what I think is happening and also how I think you can fix it.
You're pulling a predefined item out of the iPod (music) library on an iOS device. You are then using an asset reader to collect its buffers and queue those buffers, where possible, in an AudioQueue.
The problem you are having, I think, is that you are setting the audio queue buffer's input format to Linear Pulse Code Modulation (LPCM - hope I got that right, I might be off on the acronym). The output settings you are passing to the asset reader output are nil, which means you'll get output that is most likely NOT LPCM, but is instead AIFF or AAC or MP3 or whatever format the song has as it exists in iOS's media library. You can, however, remedy this situation by passing in different output settings.
Try changing
readerOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:track outputSettings:nil];
to:
AudioChannelLayout channelLayout;
memset(&channelLayout, 0, sizeof(AudioChannelLayout)); // stereo layout for AVChannelLayoutKey
channelLayout.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;
NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
[NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
[NSNumber numberWithFloat:44100.0], AVSampleRateKey,
[NSNumber numberWithInt:2], AVNumberOfChannelsKey,
[NSData dataWithBytes:&channelLayout length:sizeof(AudioChannelLayout)], AVChannelLayoutKey,
[NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
[NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
nil];
readerOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:track outputSettings:outputSettings];
It's my understanding (per Apple's documentation) that passing nil as the output settings param gives you samples of the same file type as the original audio track. Even if you have a file that is LPCM, some other settings might be off, which might cause your problems. At the very least, this will normalize all the reader output, which should make things a bit easier to troubleshoot.
Hope that helps!
Edit:
the reason why I provided nil as a parameter for AVURLAsset *songAsset
= [AVURLAsset URLAssetWithURL:assetURL options:audioReadSettings];
was because according to the documentation and trial and error, I...
AVAssetReaders do 2 things: read back an audio file as it exists on disk (i.e., mp3, aac, aiff), or convert the audio into LPCM.
If you pass nil as the output settings, it will read the file back as it exists, and in this you are correct. I apologize for not mentioning that an asset reader will only allow nil or LPCM. I actually ran into that problem myself (it's in the docs somewhere, but requires a bit of diving), but didn't elect to mention it here as it wasn't on my mind at the time. Sooooo... sorry about that?
If you want to know the AudioStreamBasicDescription (ASBD) of the track you are reading before you read it, you can get it by doing this:
AVURLAsset* uasset = [[AVURLAsset URLAssetWithURL:<#assetURL#> options:nil]retain];
AVAssetTrack*track = [uasset.tracks objectAtIndex:0];
CMFormatDescriptionRef formDesc = (CMFormatDescriptionRef)[[track formatDescriptions] objectAtIndex:0];
const AudioStreamBasicDescription* asbdPointer = CMAudioFormatDescriptionGetStreamBasicDescription(formDesc);
//because this is a pointer and not a struct we need to move the data into a struct so we can use it
AudioStreamBasicDescription asbd = {0};
memcpy(&asbd, asbdPointer, sizeof(asbd));
//asbd now contains a basic description for the track
You can then convert asbd to binary data in whatever format you see fit and transfer it over the network. You should then be able to start sending audio buffer data over the network and successfully play it back with your AudioQueue.
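For example, one minimal way to package the struct for transfer (this assumes both ends share the same endianness and struct layout; a real protocol would frame it explicitly):
// sender: wrap the ASBD bytes so they can travel ahead of the audio data
NSData *asbdData = [NSData dataWithBytes:&asbd length:sizeof(asbd)];
// receiver: copy the bytes back into a struct before configuring the AudioQueue
AudioStreamBasicDescription receivedASBD = {0};
[asbdData getBytes:&receivedASBD length:sizeof(receivedASBD)];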
I actually had a system like this working not that long ago, but since I couldn't keep the connection alive when the iOS client device went to the background, I wasn't able to use it for my purpose. Still, if all that work lets me help someone else who can actually use the info, it seems like a win to me.

Mute Audio AVAssetWriterInput while recording

I am recording video and audio using an AVAssetWriter to append CMSampleBuffers from AVCaptureVideoDataOutput and AVCaptureAudioDataOutput respectively. What I want to do is, at the user's discretion, mute the audio during the recording.
I am assuming the best way is to somehow create an empty CMSampleBuffer, like
CMSampleBufferRef sb;
CMSampleBufferCreate(kCFAllocatorDefault, NULL, YES, NULL, NULL, NULL, 0, 1, &sti, 0, NULL, &sb);
[_audioInputWriter appendSampleBuffer:sb];
CFRelease(sb);
but that doesn't work, so I am assuming that I need to create a silent audio buffer. How do I do this and is there a better way?
I have done this before by calling a function that processes the data in the sample buffer and zeros all of it. You might need to modify this if your audio format is not using an SInt16 sample size.
You can also use this same technique to process the audio in other ways.
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
if(isMute){
[self muteAudioInBuffer:sampleBuffer];
}
}
- (void) muteAudioInBuffer:(CMSampleBufferRef)sampleBuffer
{
CMItemCount numSamples = CMSampleBufferGetNumSamples(sampleBuffer);
NSUInteger channelIndex = 0;
CMBlockBufferRef audioBlockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
size_t audioBlockBufferOffset = (channelIndex * numSamples * sizeof(SInt16));
size_t lengthAtOffset = 0;
size_t totalLength = 0;
SInt16 *samples = NULL;
CMBlockBufferGetDataPointer(audioBlockBuffer, audioBlockBufferOffset, &lengthAtOffset, &totalLength, (char **)(&samples));
for (NSInteger i=0; i<numSamples; i++) {
samples[i] = (SInt16)0;
}
}
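In the recording pipeline from the question, the zeroed buffer would then be appended as usual so the audio track keeps its timing. A sketch, assuming _audioInputWriter and isMute are the names used in the posts above:
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    if (isMute) {
        [self muteAudioInBuffer:sampleBuffer]; // zero the samples in place
    }
    if (_audioInputWriter.readyForMoreMediaData) {
        [_audioInputWriter appendSampleBuffer:sampleBuffer]; // silent buffers preserve the track's timing
    }
}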
