Adding audio buffer [from file] to 'live' audio buffer [recording to file] - iOS

What I'm trying to do:
Record up to a specified duration of audio/video, where the resulting output file has pre-defined background music from an external audio file added, without further encoding/exporting after recording.
As if you were recording video with the iPhone's Camera app and every recorded video in 'Camera Roll' had a background song. No exporting or loading after recording ends, and not as a separate audio track.
How I'm trying to achieve this:
By using AVCaptureSession: in the delegate method where the (CMSampleBufferRef) sample buffers are passed through, I push them to an AVAssetWriter to write to file. Because I don't want multiple audio tracks in my output file, I can't pass the background music through a separate AVAssetWriterInput, which means I have to add the background music to each sample buffer from the recording while it is being recorded, to avoid having to merge/export after recording.
The background music is a specific, pre-defined audio file (format/codec: m4a AAC) and needs no time-editing; it just has to be laid under the entire recording, from start to end. The recording will never be longer than the background music file.
Before starting to write to file, I have also set up an AVAssetReader that reads the specified audio file.
Some pseudo-code (threading excluded):
-(void)startRecording
{
    /*
     Initialize writer and reader here: [...]
     */
    backgroundAudioTrackOutput =
        [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:backgroundAudioTrack
                                                   outputSettings:nil];

    if ([backgroundAudioReader canAddOutput:backgroundAudioTrackOutput])
        [backgroundAudioReader addOutput:backgroundAudioTrackOutput];
    else
        NSLog(@"This doesn't happen");

    [backgroundAudioReader startReading];

    /* Some more code */
    recording = YES;
}
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    if (!recording)
        return;

    if (connection == videoConnection)
        [self writeVideoSampleBuffer:sampleBuffer];
    else if (connection == audioConnection)
        [self writeAudioSampleBuffer:sampleBuffer];
}
The AVCaptureSession is already streaming the camera video and microphone audio, and is just waiting for the BOOL recording to be set to YES. This isn't exactly how I'm doing it, but a short, roughly equivalent representation. When the delegate method receives a CMSampleBufferRef of type audio, I call my own method writeAudioSamplebuffer:sampleBuffer. If this were done normally, without a background track, I'd simply put something like [assetWriterAudioInput appendSampleBuffer:sampleBuffer]; there instead of calling my method. In my case, though, I need to overlap two buffers before writing:
-(void)writeAudioSamplebuffer:(CMSampleBufferRef)recordedSampleBuffer
{
    CMSampleBufferRef backgroundSampleBuffer =
        [backgroundAudioTrackOutput copyNextSampleBuffer];

    /* DO MAGIC HERE */
    CMSampleBufferRef resultSampleBuffer =
        [self overlapBuffer:recordedSampleBuffer
       withBackgroundBuffer:backgroundSampleBuffer];
    /* END MAGIC HERE */

    [assetWriterAudioInput appendSampleBuffer:resultSampleBuffer];
}
The problem:
I have to add incremental sample buffers from a local file to the live buffers coming in. The method I have created, named overlapBuffer:withBackgroundBuffer:, isn't doing much right now. I know how to extract the AudioBufferList, AudioBuffer, mData, etc. from a CMSampleBufferRef, but I'm not sure how to actually add them together. However, I haven't been able to test different ways of doing that, because the real problem happens before that point. Before the magic should happen, I am in possession of two CMSampleBufferRefs, one received from the microphone and one read from the file, and this is the problem:
The sample buffer received from the background music file is different from the one I receive from the recording session. It seems like the call to [self.backgroundAudioTrackOutput copyNextSampleBuffer]; returns a large number of samples. I realize this might be obvious to some people, but I've never before worked at this level of media technology. I see now that it was wishful thinking to call copyNextSampleBuffer each time I receive a sample buffer from the session, but I don't know when/where to put it.
As far as I can tell, the recording session delivers one audio sample in each sample buffer, while the file reader delivers many samples in each sample buffer. Could I create a counter to count each received recorded sample/buffer, use the first file sample buffer to extract each sample until the current file sample buffer has no more samples 'to give', then call [..copyNext..] and do the same with that buffer?
As I'm in full control of both the recording's and the file's codecs, formats, etc., I am hoping such a solution won't ruin the 'alignment'/synchronization of the audio. Given that both streams have the same sample rate, could this still be a problem?
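To make the counter idea concrete, this is roughly the consumption step I picture (untested sketch; fileBuffer and fileFrameOffset would be ivars kept between callbacks, the destination buffer is caller-allocated with as many frames as the mic buffer, and it assumes the reader output is configured to hand back interleaved 16-bit linear PCM rather than the raw AAC packets you get with outputSettings:nil):
#import <AVFoundation/AVFoundation.h>
#import <CoreMedia/CoreMedia.h>

// Hypothetical ivars kept between calls:
//   CMSampleBufferRef fileBuffer;   // current buffer read from the file
//   CMItemCount fileFrameOffset;    // frames of fileBuffer already consumed
- (void)fillBackgroundFrames:(SInt16 *)dst matching:(CMSampleBufferRef)micBuffer
{
    CMItemCount framesNeeded = CMSampleBufferGetNumSamples(micBuffer);

    while (framesNeeded > 0) {
        // Pull the next file buffer once the current one is used up.
        if (fileBuffer == NULL ||
            fileFrameOffset >= CMSampleBufferGetNumSamples(fileBuffer)) {
            if (fileBuffer) CFRelease(fileBuffer);
            fileBuffer = [backgroundAudioTrackOutput copyNextSampleBuffer];
            fileFrameOffset = 0;
            if (fileBuffer == NULL)
                return; // background track exhausted; leave the rest silent
        }

        AudioBufferList abl;
        CMBlockBufferRef block = NULL;
        CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
            fileBuffer, NULL, &abl, sizeof(abl), NULL, NULL,
            kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment, &block);

        CMItemCount framesAvailable =
            CMSampleBufferGetNumSamples(fileBuffer) - fileFrameOffset;
        CMItemCount framesToCopy = MIN(framesNeeded, framesAvailable);

        // Mono assumed here; for interleaved stereo, scale offsets by the channel count.
        SInt16 *src = (SInt16 *)abl.mBuffers[0].mData + fileFrameOffset;
        memcpy(dst, src, framesToCopy * sizeof(SInt16));

        dst += framesToCopy;
        fileFrameOffset += framesToCopy;
        framesNeeded -= framesToCopy;
        CFRelease(block);
    }
}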
Note
I'm not even sure if this is possible, but I see no immediate reason why it shouldn't be.
Also worth mentioning: when I try to use a video file instead of an audio file and continually pull video sample buffers, they line up perfectly.

I am not familiar with AVCaptureOutput, since all my sound/music sessions were built using AudioToolbox rather than AVFoundation. However, I guess you should be able to set the size of the capture buffer. If not, and you still get just one sample per callback, I recommend storing the data obtained from each capture callback in an auxiliary buffer. When the auxiliary buffer reaches the same size as the file-reading buffer, call [self overlapBuffer:auxiliarySampleBuffer withBackgroundBuffer:backgroundSampleBuffer];
I hope this helps. If not, I can provide an example of how to do this using Core Audio. Using Core Audio I have been able to obtain 1024-sample LPCM buffers from both microphone capture and file reading, so the overlapping is immediate.
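For the mixing step itself, once both buffers hold the same number of 16-bit LPCM samples, a per-sample sum with clamping is usually enough. A minimal sketch (the function name and gain parameter are just for illustration):
#include <stdint.h>
#include <stddef.h>

// Mix 'background' into 'recorded' in place, clamping to avoid wrap-around.
static void MixInt16(int16_t *recorded, const int16_t *background,
                     size_t sampleCount, float backgroundGain)
{
    for (size_t i = 0; i < sampleCount; i++) {
        int32_t mixed = recorded[i] + (int32_t)(background[i] * backgroundGain);
        if (mixed > INT16_MAX) mixed = INT16_MAX;
        if (mixed < INT16_MIN) mixed = INT16_MIN;
        recorded[i] = (int16_t)mixed;
    }
}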

Related

ARSession and Recording Video

I’m manually writing a video recorder. Unfortunately it’s necessary if you want to record video and use ARKit at the same time. I’ve got most of it figured out, but now I need to optimize it a bit because my phone gets pretty hot running ARKit, Vision and this recorder all at once.
To make the recorder, you need to use an AVAssetWriter with an AVAssetWriterInput (and an AVAssetWriterInputPixelBufferAdaptor). The input has an isReadyForMoreMediaData property you need to check before you can write another frame. I'm recording in real time (or as close to it as possible).
Right now, when the ARSession gives me a new frame, I immediately pass it to the AVAssetWriterInput. What I want to do is add it to a queue and have a loop check whether there are samples available to write. For the life of me I can't figure out how to do that efficiently.
I want to just run a while loop like this, but it seems like it would be a bad idea:
func startSession() {
    // …
    while isRunning {
        guard !pixelBuffers.isEmpty && writerInput.isReadyForMoreMediaData else {
            continue
        }
        // process sample
    }
}
Can I run this on a separate thread from the ARSession.delegateQueue? I don't want to run into issues with CVPixelBuffers from the camera being retained for too long.

AudioUnitRender got error kAudioUnitErr_CannotDoInCurrentContext (-10863)

I want to play the recorded audio directly through the speaker while a headset is plugged into an iOS device.
What I did is call AudioUnitRender in the AURenderCallback function so that the audio data is written to an AudioBuffer structure.
It works well if the "IO buffer duration" is not set or is set to 0.020 seconds. If the "IO buffer duration" is set to a small value (e.g. 0.005) by calling setPreferredIOBufferDuration, AudioUnitRender() returns an error:
kAudioUnitErr_CannotDoInCurrentContext (-10863).
Can anyone help figure out why, and how to resolve it? Thanks.
Just wanted to add: changing the output-scope sample rate to match the input-scope sample rate of the OS X kAudioUnitSubType_HALOutput audio unit I was using fixed this error for me.
The buffer is full, so wait until a subsequent render pass or use a larger buffer.
This same error code is used by AudioToolbox, AudioUnit and AUGraph, but it is only documented for AUGraph:
To avoid spinning or waiting in the render thread (a bad idea!), many of the calls to AUGraph can return kAUGraphErr_CannotDoInCurrentContext. This result is only generated when you call an AUGraph API from its render callback. It means that the lock that it required was held at that time, by another thread. If you see this result code, you can generally attempt the action again - typically the NEXT render cycle (so in the mean time the lock can be cleared), or you can delegate that call to another thread in your app. You should not spin or put-to-sleep the render thread.
https://developer.apple.com/reference/audiotoolbox/kaugrapherr_cannotdoincurrentcontext
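Applied to a RemoteIO render callback, that boils down to something like this (a sketch; the refCon carrying the audio unit and the use of input bus 1 are assumptions about your setup):
#import <AudioToolbox/AudioToolbox.h>
#include <string.h>

// If the render pass can't be serviced right now, output silence and let the
// next cycle retry instead of blocking or spinning in the render thread.
static OSStatus RecordingCallback(void                       *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp       *inTimeStamp,
                                  UInt32                      inBusNumber,
                                  UInt32                      inNumberFrames,
                                  AudioBufferList            *ioData)
{
    AudioUnit remoteIOUnit = (AudioUnit)inRefCon;            // assumed refCon
    OSStatus err = AudioUnitRender(remoteIOUnit, ioActionFlags, inTimeStamp,
                                   1 /* input bus */, inNumberFrames, ioData);
    if (err == kAudioUnitErr_CannotDoInCurrentContext) {
        for (UInt32 i = 0; i < ioData->mNumberBuffers; i++)
            memset(ioData->mBuffers[i].mData, 0, ioData->mBuffers[i].mDataByteSize);
        return noErr;                                        // try again next cycle
    }
    return err;
}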

`[AVCaptureSession canAddOutput:output]` returns NO intermittently. Can I find out why?

I am using canAddOutput: to determine whether I can add an AVCaptureMovieFileOutput to an AVCaptureSession, and I'm finding that canAddOutput: sometimes returns NO and mostly returns YES. Is there a way to find out why a NO was returned? Or a way to eliminate whatever is causing the NO? Or anything else I can do to prevent the user from seeing an intermittent failure?
Some further notes: this happens approximately once in 30 calls. As my app has not launched yet, it has only been tested on one device: an iPhone 5 running iOS 7.1.2.
Here is a quote from the documentation (discussion of canAddOutput:):
You cannot add an output that reads from a track of an asset other than the asset used to initialize the receiver.
An explanation that should help you (please check whether your code matches this guide; if you're doing everything right, it should not trigger the error, because canAddOutput: basically checks compatibility).
AVCaptureSession
Manages the connections between capture device inputs and outputs, similar to connecting filters in DirectShow. Once the inputs and outputs are connected and the session is started, data flows from the inputs to the outputs.
The main classes:
a) AVCaptureDevice - represents a capture device, e.g. a camera.
b) AVCaptureInput
c) AVCaptureOutput
Inputs and outputs are not one-to-one; for example, you can have a single video output while using video + audio inputs.
Switching between the front and back cameras:
AVCaptureSession *session = <# A capture session #>;
[session beginConfiguration];
[session removeInput:frontFacingCameraDeviceInput];
[session addInput:backFacingCameraDeviceInput];
[session commitConfiguration];
Adding a capture input:
To add a capture device to a capture session, you use an instance of AVCaptureDeviceInput (a concrete subclass of the abstract AVCaptureInput class). The capture device input manages the device's ports.
NSError *error = nil;
AVCaptureDeviceInput *input =
    [AVCaptureDeviceInput deviceInputWithDevice:device error:&error];
if (!input) {
    // Handle the error appropriately.
}
Adding outputs:
To get output from a capture session, you add one or more outputs. An output is an instance of a concrete subclass of AVCaptureOutput. You use:
AVCaptureMovieFileOutput to output to a movie file
AVCaptureVideoDataOutput if you want to process frames from the video being captured
AVCaptureAudioDataOutput if you want to process the audio data being captured
AVCaptureStillImageOutput if you want to capture still images with accompanying metadata
You add outputs to a capture session using addOutput:. You check whether a capture output is compatible with an existing session using canAddOutput:. You can add and remove outputs as you want while the session is running.
AVCaptureSession *captureSession = <# Get a capture session #>;
AVCaptureMovieFileOutput *movieOutput = <# Create and configure a movie output #>;
if ([captureSession canAddOutput:movieOutput]) {
    [captureSession addOutput:movieOutput];
}
else {
    // Handle the failure.
}
Saving a video file by adding a movie file output:
You save movie data to a file using an AVCaptureMovieFileOutput object. (AVCaptureMovieFileOutput is a concrete subclass of AVCaptureFileOutput, which defines much of the basic behavior.) You can configure various aspects of the movie file output, such as the maximum duration of the recording or the maximum file size. You can also prohibit recording if there is less than a given amount of disk space left.
AVCaptureMovieFileOutput *aMovieFileOutput = [[AVCaptureMovieFileOutput alloc] init];
CMTime maxDuration = <# Create a CMTime to represent the maximum duration #>;
aMovieFileOutput.maxRecordedDuration = maxDuration;
aMovieFileOutput.minFreeDiskSpaceLimit = <# An appropriate minimum given the quality of the movie format and the duration #>;
Processing preview video frame data: each viewfinder frame can be used for subsequent higher-level processing, such as face detection.
An AVCaptureVideoDataOutput object uses delegation to vend video frames. You set the delegate using setSampleBufferDelegate:queue:. In addition to the delegate, you specify a serial queue on which the delegate methods are invoked. You must use a serial queue to ensure that frames are delivered to the delegate in the proper order. You should not pass the queue returned by dispatch_get_current_queue, since there is no guarantee as to which thread the current queue is running on. You can use the queue to modify the priority given to delivering and processing the video frames.
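For example, a minimal video data output setup with its own serial queue (the queue label and the delegate are placeholders):
AVCaptureVideoDataOutput *videoDataOutput = [[AVCaptureVideoDataOutput alloc] init];

// A dedicated serial queue instead of dispatch_get_current_queue().
dispatch_queue_t videoQueue =
    dispatch_queue_create("com.example.videoDataQueue", DISPATCH_QUEUE_SERIAL);
[videoDataOutput setSampleBufferDelegate:self queue:videoQueue];

if ([captureSession canAddOutput:videoDataOutput]) {
    [captureSession addOutput:videoDataOutput];
}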
When processing frames there are limits on image size and processing time; if processing takes too long, the underlying sensor stops sending data to the preview layer and the callback.
You should set the session output to the lowest practical resolution for your application. Setting the output to a higher resolution than necessary wastes processing cycles and needlessly consumes power. You must ensure that your implementation of captureOutput:didOutputSampleBuffer:fromConnection: is able to process a sample buffer within the amount of time allotted to a frame. If it takes too long and you hold onto the video frames, AVFoundation will stop delivering frames, not only to your delegate but also to other outputs such as a preview layer.
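For instance (illustrative settings, reusing the videoDataOutput from above):
// Keep the per-frame workload manageable: modest resolution, and drop late
// frames rather than letting them back up.
captureSession.sessionPreset = AVCaptureSessionPresetMedium;
videoDataOutput.alwaysDiscardsLateVideoFrames = YES;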
Capturing still images:
AVCaptureStillImageOutput *stillImageOutput = [[AVCaptureStillImageOutput alloc] init];
NSDictionary *outputSettings = [[NSDictionary alloc] initWithObjectsAndKeys:
                                AVVideoCodecJPEG, AVVideoCodecKey, nil];
[stillImageOutput setOutputSettings:outputSettings];
Different formats are supported, including generating a JPEG stream directly.
If you want to capture a JPEG image, you should typically not specify your own compression format. Instead, you should let the still image output do the compression for you, since its compression is hardware-accelerated. If you need a data representation of the image, you can use jpegStillImageNSDataRepresentation: to get an NSData object without re-compressing the data, even if you modify the image's metadata.
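A typical capture call then looks roughly like this (connection lookup simplified, error handling omitted):
AVCaptureConnection *videoConnection =
    [stillImageOutput connectionWithMediaType:AVMediaTypeVideo];

[stillImageOutput captureStillImageAsynchronouslyFromConnection:videoConnection
    completionHandler:^(CMSampleBufferRef imageSampleBuffer, NSError *error) {
        if (imageSampleBuffer) {
            NSData *jpegData =
                [AVCaptureStillImageOutput jpegStillImageNSDataRepresentation:imageSampleBuffer];
            // Write jpegData to disk, attach metadata, etc.
        }
    }];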
Camera preview display:
You can provide the user with a preview of what's being recorded using an AVCaptureVideoPreviewLayer object. AVCaptureVideoPreviewLayer is a subclass of CALayer (see the Core Animation Programming Guide). You don't need any outputs to show the preview.
AVCaptureSession *captureSession = <# Get a capture session #>;
CALayer *viewLayer = <# Get a layer from the view in which you want to present the preview #>;
AVCaptureVideoPreviewLayer *captureVideoPreviewLayer =
    [[AVCaptureVideoPreviewLayer alloc] initWithSession:captureSession];
[viewLayer addSublayer:captureVideoPreviewLayer];
In general, the preview layer behaves like any other CALayer object in the render tree (see the Core Animation Programming Guide). You can scale the image and perform transformations, rotations and so on just as you would with any layer. One difference is that you may need to set the layer's orientation property to specify how it should rotate images coming from the camera. In addition, on iPhone 4 the preview layer supports mirroring (this is the default when previewing the front-facing camera).
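Sizing and orienting the preview layer usually looks something like this (illustrative values; on iOS 6 and later the rotation is set through the layer's connection):
captureVideoPreviewLayer.frame = viewLayer.bounds;
captureVideoPreviewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill;
captureVideoPreviewLayer.connection.videoOrientation = AVCaptureVideoOrientationPortrait;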
Referring to this answer, there is a possibility that the delegate method below is running in the background, which can leave the previous AVCaptureSession not disconnected properly, sometimes resulting in canAddOutput: returning NO.
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputMetadataObjects:(NSArray *)metadataObjects fromConnection:(AVCaptureConnection *)connection
The solution might be to call stopRunning in the above delegate (after doing the necessary actions and condition checks, of course; you need to finish off your previous sessions properly, right?).
Adding on to that, it would be better if you provided some code showing what you are trying to do.
It can be one of these two cases:
1) The session is already running.
2) You already added the output.
You can't add two outputs or two inputs, and you also can't create two different sessions.
It may be a combination of:
Calling this method when the camera is busy.
Not properly removing your previously connected AVCaptureSession.
You should try to only add it once (where I guess canAddOutput: will always be YES) and just pause/resume your session as needed:
// Stop session if possible
if (_captureSession.running && !_captureInProgress)
{
    [_captureSession stopRunning];
    NBULogVerbose(@"Capture session: {\n%@} stopped running", _captureSession);
}
You can take a look here.
I think this will help you:
canAddOutput:
Returns a Boolean value that indicates whether a given output can be added to the session.
- (BOOL)canAddOutput:(AVCaptureOutput *)output
Parameters
output
An output that you want to add to the session.
Return Value
YES if output can be added to the session, otherwise NO.
Availability
Available in OS X v10.7 and later.
Here is the link to the Apple doc: Click here

iOS: Playing PCM buffers from a stream

I'm receiving a series of UDP packets from a socket containing encoded PCM buffers. After decoding them, I'm left with an int16 * audio buffer, which I'd like to immediately play back.
The intended logic goes something like this:
init() {
    initTrack(track, output, channels, sample_rate, ...);
}

onReceiveBufferFromSocket(NSData data) {
    // Decode the buffer
    int16 *buf = handle_data(data);

    // Play data
    write_to_track(track, buf, length_of_buf, etc);
}
I'm not sure about everything that has to do with playing back the buffers, though. On Android, I'm able to achieve this by creating an AudioTrack object, setting it up with a sample rate, a format, channels, etc., and then just calling the "write" method with the buffer (as I wish I could in my pseudo-code above), but on iOS I'm coming up short.
I tried using Audio File Stream Services, but I'm guessing I'm doing something wrong, since no sound ever comes out and I feel like those functions don't actually do any playback by themselves. I also attempted to understand Audio Queue Services (which I think might be close to what I want), but I was unable to find any simple code samples for their usage.
Any help would be greatly appreciated, especially in the form of example code.
You need to use some type of buffer to hold your incoming UDP data. This is an easy and good circular buffer that I have used.
Then, to play back data from the buffer, you can use the Audio Unit framework. Here is a good example project.
Note: the first link also shows you how to play back audio using Audio Units.
You could use Audio Queue Services as well; make sure you're doing some kind of packet re-ordering. If you're using ffmpeg to decode the streams, there is an option for this.
Otherwise, audio queues are easy to set up.
https://github.com/mooncatventures-group/iFrameExtractor/blob/master/Classes/AudioController.m
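To give an idea of the audio queue route, here is a minimal playback sketch for interleaved 16-bit PCM. FillFromCircularBuffer() is a placeholder for however you drain the buffer holding your decoded UDP data, and the format values are assumptions you would match to your stream:
#import <AudioToolbox/AudioToolbox.h>
#include <string.h>

// Placeholder: copy up to 'capacity' bytes of decoded PCM into 'dst',
// returning the number of bytes actually written.
extern UInt32 FillFromCircularBuffer(void *dst, UInt32 capacity);

static void OutputCallback(void *inUserData, AudioQueueRef inAQ,
                           AudioQueueBufferRef inBuffer)
{
    UInt32 bytes = FillFromCircularBuffer(inBuffer->mAudioData,
                                          inBuffer->mAudioDataBytesCapacity);
    if (bytes == 0) {
        // Nothing buffered yet: enqueue silence so the queue keeps running.
        memset(inBuffer->mAudioData, 0, inBuffer->mAudioDataBytesCapacity);
        bytes = inBuffer->mAudioDataBytesCapacity;
    }
    inBuffer->mAudioDataByteSize = bytes;
    AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL);
}

- (void)startPlayback
{
    AudioStreamBasicDescription fmt = {0};
    fmt.mSampleRate       = 44100;                 // match your stream
    fmt.mFormatID         = kAudioFormatLinearPCM;
    fmt.mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    fmt.mChannelsPerFrame = 1;                     // match your stream
    fmt.mBitsPerChannel   = 16;
    fmt.mBytesPerFrame    = fmt.mChannelsPerFrame * (fmt.mBitsPerChannel / 8);
    fmt.mBytesPerPacket   = fmt.mBytesPerFrame;
    fmt.mFramesPerPacket  = 1;

    AudioQueueRef queue = NULL;
    AudioQueueNewOutput(&fmt, OutputCallback, NULL, NULL, NULL, 0, &queue);

    // Prime a few buffers so playback has something to chew on.
    for (int i = 0; i < 3; i++) {
        AudioQueueBufferRef buffer = NULL;
        AudioQueueAllocateBuffer(queue, 4096, &buffer);
        OutputCallback(NULL, queue, buffer);
    }
    AudioQueueStart(queue, NULL);
}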
You could also use Audio Units, though that's a bit more complicated.

Removing Silence from Audio Queue session recorded audio in ios

I'm using Audio Queue to record audio from the iPhone's mic and to stop recording when silence is detected (no audio input for 10 seconds), but I want to discard the silence from the audio file.
In the AudioInputCallback function I am using the following code to detect silence:
AudioQueueLevelMeterState meters[1];
UInt32 dlen = sizeof(meters);
OSStatus status = AudioQueueGetProperty(inAQ, kAudioQueueProperty_CurrentLevelMeterDB, meters, &dlen);
if (meters[0].mPeakPower < _threshold)
{ /* NSLog(@"Silence detected"); */ }
But how do I remove these packets? Or is there a better option?
Instead of removing the packets from the audio queue, you can delay the write by writing to a buffer first. The buffer can easily be carried around by putting it inside the inUserData.
When you finish recording, if the last 10 seconds are not silent, you write the buffer out to whatever file you are writing. Otherwise just free the buffer.
After the file is recorded and closed, simply open it and truncate the sample data you are not interested in (note: you can use the AudioFile/ExtAudioFile APIs to properly update any dependent chunk/header sizes).
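Roughly like this with ExtAudioFile, assuming a 16-bit mono LPCM recording (simplified sketch; error checks omitted):
#import <AudioToolbox/AudioToolbox.h>

// Copy everything except the trailing 'framesToDrop' frames into a new file.
static void CopyTrimmedFile(CFURLRef srcURL, CFURLRef dstURL, SInt64 framesToDrop)
{
    ExtAudioFileRef src = NULL, dst = NULL;
    ExtAudioFileOpenURL(srcURL, &src);

    AudioStreamBasicDescription fmt;
    UInt32 propSize = sizeof(fmt);
    ExtAudioFileGetProperty(src, kExtAudioFileProperty_FileDataFormat, &propSize, &fmt);

    SInt64 totalFrames = 0;
    propSize = sizeof(totalFrames);
    ExtAudioFileGetProperty(src, kExtAudioFileProperty_FileLengthFrames, &propSize, &totalFrames);
    SInt64 framesToKeep = totalFrames - framesToDrop;

    ExtAudioFileCreateWithURL(dstURL, kAudioFileCAFType, &fmt, NULL,
                              kAudioFileFlags_EraseFile, &dst);

    enum { kChunkFrames = 4096 };
    SInt16 samples[kChunkFrames];                      // mono, 16-bit assumed
    while (framesToKeep > 0) {
        UInt32 frames = (framesToKeep < kChunkFrames) ? (UInt32)framesToKeep : kChunkFrames;
        AudioBufferList abl;
        abl.mNumberBuffers = 1;
        abl.mBuffers[0].mNumberChannels = fmt.mChannelsPerFrame;
        abl.mBuffers[0].mDataByteSize   = frames * fmt.mBytesPerFrame;
        abl.mBuffers[0].mData           = samples;
        ExtAudioFileRead(src, &frames, &abl);
        if (frames == 0) break;
        ExtAudioFileWrite(dst, frames, &abl);
        framesToKeep -= frames;
    }
    ExtAudioFileDispose(src);
    ExtAudioFileDispose(dst);
}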
