Why is audio garbled when using AVAssetReader with an Audio Queue on iOS?

Based on my research, people keep saying that garbled audio comes from mismatched or wrong formatting, but I'm using LPCM for both input and output, so how can that go wrong? All I get is noise (like white noise).
I've decided to just paste my entire code; perhaps that will help:
#import "AppDelegate.h"
#import "ViewController.h"
@implementation AppDelegate
@synthesize window = _window;
@synthesize viewController = _viewController;
- (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions
{
self.window = [[UIWindow alloc] initWithFrame:[[UIScreen mainScreen] bounds]];
// Override point for customization after application launch.
self.viewController = [[ViewController alloc] initWithNibName:@"ViewController" bundle:nil];
self.window.rootViewController = self.viewController;
[self.window makeKeyAndVisible];
// Insert code here to initialize your application
player = [[Player alloc] init];
[self setupReader];
[self setupQueue];
// initialize reader in a new thread
internalThread =[[NSThread alloc]
initWithTarget:self
selector:@selector(readPackets)
object:nil];
[internalThread start];
// start the queue. this function returns immediately and begins
// invoking the callback, as needed, asynchronously.
//CheckError(AudioQueueStart(queue, NULL), "AudioQueueStart failed");
// and wait
printf("Playing...\n");
do
{
CFRunLoopRunInMode(kCFRunLoopDefaultMode, 0.25, false);
} while (!player.isDone /*|| gIsRunning*/);
// isDone represents the state of the Audio File enqueuing. This does not mean the
// Audio Queue is actually done playing yet. Since we have 3 half-second buffers in-flight,
// continue to run for a short additional time so they can be processed
CFRunLoopRunInMode(kCFRunLoopDefaultMode, 2, false);
// end playback
player.isDone = true;
CheckError(AudioQueueStop(queue, TRUE), "AudioQueueStop failed");
cleanup:
AudioQueueDispose(queue, TRUE);
AudioFileClose(player.playbackFile);
return YES;
}
- (void) setupReader
{
NSURL *assetURL = [NSURL URLWithString:@"ipod-library://item/item.m4a?id=1053020204400037178"]; // from ilham's ipod
AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetURL options:nil];
// from AVAssetReader Class Reference:
// AVAssetReader is not intended for use with real-time sources,
// and its performance is not guaranteed for real-time operations.
NSError * error = nil;
AVAssetReader* reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
AVAssetTrack* track = [songAsset.tracks objectAtIndex:0];
readerOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:track
outputSettings:nil];
// AVAssetReaderOutput* readerOutput = [[AVAssetReaderAudioMixOutput alloc] initWithAudioTracks:songAsset.tracks audioSettings:nil];
[reader addOutput:readerOutput];
[reader startReading];
}
- (void) setupQueue
{
// get the audio data format from the file
// we know that it is PCM.. since it's converted
AudioStreamBasicDescription dataFormat;
dataFormat.mSampleRate = 44100.0;
dataFormat.mFormatID = kAudioFormatLinearPCM;
dataFormat.mFormatFlags = kAudioFormatFlagIsBigEndian | kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
dataFormat.mBytesPerPacket = 4;
dataFormat.mFramesPerPacket = 1;
dataFormat.mBytesPerFrame = 4;
dataFormat.mChannelsPerFrame = 2;
dataFormat.mBitsPerChannel = 16;
// create a output (playback) queue
CheckError(AudioQueueNewOutput(&dataFormat, // ASBD
MyAQOutputCallback, // Callback
(__bridge void *)self, // user data
NULL, // run loop
NULL, // run loop mode
0, // flags (always 0)
&queue), // output: reference to AudioQueue object
"AudioQueueNewOutput failed");
// adjust buffer size to represent about a half second (0.5) of audio based on this format
CalculateBytesForTime(dataFormat, 0.5, &bufferByteSize, &player->numPacketsToRead);
// check if we are dealing with a VBR file. ASBDs for VBR files always have
// mBytesPerPacket and mFramesPerPacket as 0 since they can fluctuate at any time.
// If we are dealing with a VBR file, we allocate memory to hold the packet descriptions
bool isFormatVBR = (dataFormat.mBytesPerPacket == 0 || dataFormat.mFramesPerPacket == 0);
if (isFormatVBR)
player.packetDescs = (AudioStreamPacketDescription*)malloc(sizeof(AudioStreamPacketDescription) * player.numPacketsToRead);
else
player.packetDescs = NULL; // we don't provide packet descriptions for constant bit rate formats (like linear PCM)
// get magic cookie from file and set on queue
MyCopyEncoderCookieToQueue(player.playbackFile, queue);
// allocate the buffers and prime the queue with some data before starting
player.isDone = false;
player.packetPosition = 0;
int i;
for (i = 0; i < kNumberPlaybackBuffers; ++i)
{
CheckError(AudioQueueAllocateBuffer(queue, bufferByteSize, &audioQueueBuffers[i]), "AudioQueueAllocateBuffer failed");
// EOF (the entire file's contents fit in the buffers)
if (player.isDone)
break;
}
}
-(void)readPackets
{
// initialize a mutex and condition so that we can block on buffers in use.
pthread_mutex_init(&queueBuffersMutex, NULL);
pthread_cond_init(&queueBufferReadyCondition, NULL);
state = AS_BUFFERING;
while ((sample = [readerOutput copyNextSampleBuffer])) {
AudioBufferList audioBufferList;
CMBlockBufferRef CMBuffer = CMSampleBufferGetDataBuffer( sample );
CheckError(CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
sample,
NULL,
&audioBufferList,
sizeof(audioBufferList),
NULL,
NULL,
kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
&CMBuffer
),
"could not read samples");
AudioBuffer audioBuffer = audioBufferList.mBuffers[0];
UInt32 inNumberBytes = audioBuffer.mDataByteSize;
size_t incomingDataOffset = 0;
while (inNumberBytes) {
size_t bufSpaceRemaining;
bufSpaceRemaining = bufferByteSize - bytesFilled;
@synchronized(self)
{
bufSpaceRemaining = bufferByteSize - bytesFilled;
size_t copySize;
if (bufSpaceRemaining < inNumberBytes)
{
copySize = bufSpaceRemaining;
}
else
{
copySize = inNumberBytes;
}
// copy data to the audio queue buffer
AudioQueueBufferRef fillBuf = audioQueueBuffers[fillBufferIndex];
memcpy((char*)fillBuf->mAudioData + bytesFilled, (const char*)audioBuffer.mData + incomingDataOffset, copySize);
// keep track of bytes filled
bytesFilled +=copySize;
incomingDataOffset +=copySize;
inNumberBytes -=copySize;
}
// if the space remaining in the buffer is not enough for this packet, then enqueue the buffer.
if (bufSpaceRemaining < inNumberBytes + bytesFilled)
{
[self enqueueBuffer];
}
}
}
}
-(void)enqueueBuffer
{
@synchronized(self)
{
inuse[fillBufferIndex] = true; // set in use flag
buffersUsed++;
// enqueue buffer
AudioQueueBufferRef fillBuf = audioQueueBuffers[fillBufferIndex];
NSLog(@"we are now enqueuing buffer %d",fillBufferIndex);
fillBuf->mAudioDataByteSize = bytesFilled;
err = AudioQueueEnqueueBuffer(queue, fillBuf, 0, NULL);
if (err)
{
NSLog(@"could not enqueue queue with buffer");
return;
}
if (state == AS_BUFFERING)
{
//
// Fill all the buffers before starting. This ensures that the
// AudioFileStream stays a small amount ahead of the AudioQueue to
// avoid an audio glitch playing streaming files on iPhone SDKs < 3.0
//
if (buffersUsed == kNumberPlaybackBuffers - 1)
{
err = AudioQueueStart(queue, NULL);
if (err)
{
NSLog(@"couldn't start queue");
return;
}
state = AS_PLAYING;
}
}
// go to next buffer
if (++fillBufferIndex >= kNumberPlaybackBuffers) fillBufferIndex = 0;
bytesFilled = 0; // reset bytes filled
}
// wait until next buffer is not in use
pthread_mutex_lock(&queueBuffersMutex);
while (inuse[fillBufferIndex])
{
pthread_cond_wait(&queueBufferReadyCondition, &queueBuffersMutex);
}
pthread_mutex_unlock(&queueBuffersMutex);
}
#pragma mark - utility functions -
// generic error handler - if err is nonzero, prints error message and exits program.
static void CheckError(OSStatus error, const char *operation)
{
if (error == noErr) return;
char str[20];
// see if it appears to be a 4-char-code
*(UInt32 *)(str + 1) = CFSwapInt32HostToBig(error);
if (isprint(str[1]) && isprint(str[2]) && isprint(str[3]) && isprint(str[4])) {
str[0] = str[5] = '\'';
str[6] = '\0';
} else
// no, format it as an integer
sprintf(str, "%d", (int)error);
fprintf(stderr, "Error: %s (%s)\n", operation, str);
exit(1);
}
// we only use time here as a guideline
// we're really trying to get somewhere between 16K and 64K buffers, but not allocate too much if we don't need it
void CalculateBytesForTime(AudioStreamBasicDescription inDesc, Float64 inSeconds, UInt32 *outBufferSize, UInt32 *outNumPackets)
{
// we need to calculate how many packets we read at a time, and how big a buffer we need.
// we base this on the size of the packets in the file and an approximate duration for each buffer.
//
// first check to see what the max size of a packet is, if it is bigger than our default
// allocation size, that needs to become larger
// we don't have access to file packet size, so we just default it to maxBufferSize
UInt32 maxPacketSize = 0x10000;
static const int maxBufferSize = 0x10000; // limit size to 64K
static const int minBufferSize = 0x4000; // limit size to 16K
if (inDesc.mFramesPerPacket) {
Float64 numPacketsForTime = inDesc.mSampleRate / inDesc.mFramesPerPacket * inSeconds;
*outBufferSize = numPacketsForTime * maxPacketSize;
} else {
// if frames per packet is zero, then the codec has no predictable packet == time
// so we can't tailor this (we don't know how many Packets represent a time period
// we'll just return a default buffer size
*outBufferSize = maxBufferSize > maxPacketSize ? maxBufferSize : maxPacketSize;
}
// we're going to limit our size to our default
if (*outBufferSize > maxBufferSize && *outBufferSize > maxPacketSize)
*outBufferSize = maxBufferSize;
else {
// also make sure we're not too small - we don't want to go the disk for too small chunks
if (*outBufferSize < minBufferSize)
*outBufferSize = minBufferSize;
}
*outNumPackets = *outBufferSize / maxPacketSize;
}
// many encoded formats require a 'magic cookie'. if the file has a cookie we get it
// and configure the queue with it
static void MyCopyEncoderCookieToQueue(AudioFileID theFile, AudioQueueRef queue ) {
UInt32 propertySize;
OSStatus result = AudioFileGetPropertyInfo (theFile, kAudioFilePropertyMagicCookieData, &propertySize, NULL);
if (result == noErr && propertySize > 0)
{
Byte* magicCookie = (UInt8*)malloc(sizeof(UInt8) * propertySize);
CheckError(AudioFileGetProperty (theFile, kAudioFilePropertyMagicCookieData, &propertySize, magicCookie), "get cookie from file failed");
CheckError(AudioQueueSetProperty(queue, kAudioQueueProperty_MagicCookie, magicCookie, propertySize), "set cookie on queue failed");
free(magicCookie);
}
}
#pragma mark - audio queue -
static void MyAQOutputCallback(void *inUserData, AudioQueueRef inAQ, AudioQueueBufferRef inCompleteAQBuffer)
{
AppDelegate *appDelegate = (__bridge AppDelegate *) inUserData;
[appDelegate myCallback:inUserData
inAudioQueue:inAQ
audioQueueBufferRef:inCompleteAQBuffer];
}
- (void)myCallback:(void *)userData
inAudioQueue:(AudioQueueRef)inAQ
audioQueueBufferRef:(AudioQueueBufferRef)inCompleteAQBuffer
{
unsigned int bufIndex = -1;
for (unsigned int i = 0; i < kNumberPlaybackBuffers; ++i)
{
if (inCompleteAQBuffer == audioQueueBuffers[i])
{
bufIndex = i;
break;
}
}
if (bufIndex == -1)
{
NSLog(@"something went wrong at queue callback");
return;
}
// signal waiting thread that the buffer is free.
pthread_mutex_lock(&queueBuffersMutex);
NSLog(@"signalling that buffer %d is free",bufIndex);
inuse[bufIndex] = false;
buffersUsed--;
pthread_cond_signal(&queueBufferReadyCondition);
pthread_mutex_unlock(&queueBuffersMutex);
}
@end
Update:
btomw's answer below solved the problem magnificently. But I want to get to the bottom of this (most novice developers like myself, and even btomw when he first started, usually shoot in the dark with parameters, formatting, etc.; see here for an example).
The reason I provided nil as a parameter for
AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetURL options:audioReadSettings];
was that, according to the documentation and trial and error, any format I specified other than LPCM was rejected outright. In other words, when you use AVAssetReader, or even conversion, the result is always LPCM. So I thought the default format was LPCM anyway and left it as nil, but I guess I was wrong.
The weird part (please correct me, anyone, if I'm wrong) is this: suppose the original file was .mp3 and my intention was to play it back (or send the packets over a network, etc.) as mp3, so I provided an mp3 ASBD. The asset reader will crash! So if I want to send it in its original form, do I just supply nil? The obvious problem is that there would be no way for me to figure out what ASBD it has once I receive it on the other side. Or could I?
Update 2: You can download the code from GitHub.

So here's what I think is happening and also how I think you can fix it.
You're pulling a predefined item out of the iPod (music) library on an iOS device. You are then using an asset reader to collect its buffers and queue those buffers, where possible, in an AudioQueue.
The problem you are having, I think, is that you are setting the audio queue's input format to Linear Pulse Code Modulation (LPCM; I hope I got the acronym right), while the output settings you are passing to the asset reader output are nil. That means you'll get output that is most likely NOT LPCM but instead AIFF, AAC, MP3, or whatever format the song has in iOS's media library. You can, however, remedy this by passing in different output settings.
Try changing
readerOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:track outputSettings:nil];
to:
AudioChannelLayout channelLayout;
memset(&channelLayout, 0, sizeof(AudioChannelLayout));
channelLayout.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;
NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
[NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
[NSNumber numberWithFloat:44100.0], AVSampleRateKey,
[NSNumber numberWithInt:2], AVNumberOfChannelsKey,
[NSData dataWithBytes:&channelLayout length:sizeof(AudioChannelLayout)],
AVChannelLayoutKey,
[NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
[NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
nil];
readerOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:track outputSettings:outputSettings];
It's my understanding (per Apple's documentation) that passing nil as the output settings parameter gives you samples in the same format as the original audio track. Even if you have a file that is LPCM, some other settings might be off, which might cause your problems. At the very least, this will normalize all the reader output, which should make things a bit easier to troubleshoot.
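One setting worth checking is byte order: the queue's ASBD in the question sets kAudioFormatFlagIsBigEndian, while the settings above ask the reader for little-endian samples (AVLinearPCMIsBigEndianKey is NO). I can't say for sure that this is what bit you, but as a rough sketch in plain C (the helper names are made up for illustration), here is what happens when the same two bytes are read with the wrong byte order:

```c
#include <stdint.h>

/* Read one signed 16-bit sample stored little-endian (low byte first). */
int16_t read_s16_le(const uint8_t *p) {
    return (int16_t)(uint16_t)(p[0] | (p[1] << 8));
}

/* Read the same two bytes as if they were big-endian (high byte first). */
int16_t read_s16_be(const uint8_t *p) {
    return (int16_t)(uint16_t)((p[0] << 8) | p[1]);
}
```

A quiet little-endian sample such as the bytes 0x10 0x00 is 16 when read correctly, but 4096 when read as big-endian. Every sample gets scrambled this way, which would sound like noise rather than music.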
Hope that helps!
Edit:
the reason why I provided nul as a parameter for AVURLAsset *songAsset
= [AVURLAsset URLAssetWithURL:assetURL options:audioReadSettings];
was because according to the documentation and trial and error, I...
AVAssetReaders do 2 things; read back an audio file as it exists on disk (i.e.: mp3, aac, aiff), or convert the audio into lpcm.
If you pass nil as the output settings, it will read the file back as it exists, and in this you are correct. I apologize for not mentioning that an asset reader will only allow nil or LPCM. I actually ran into that problem myself (it's in the docs somewhere, but requires a bit of diving), but didn't elect to mention it here as it wasn't on my mind at the time. Sooooo... sorry about that?
If you want to know the AudioStreamBasicDescription (ASBD) of the track you are reading before you read it, you can get it by doing this:
AVURLAsset* uasset = [[AVURLAsset URLAssetWithURL:<#assetURL#> options:nil]retain];
AVAssetTrack*track = [uasset.tracks objectAtIndex:0];
CMFormatDescriptionRef formDesc = (CMFormatDescriptionRef)[[track formatDescriptions] objectAtIndex:0];
const AudioStreamBasicDescription* asbdPointer = CMAudioFormatDescriptionGetStreamBasicDescription(formDesc);
//because this is a pointer and not a struct we need to move the data into a struct so we can use it
AudioStreamBasicDescription asbd = {0};
memcpy(&asbd, asbdPointer, sizeof(asbd));
//asbd now contains a basic description for the track
You can then convert asbd to binary data in whatever format you see fit and transfer it over the network. You should then be able to start sending audio buffer data over the network and successfully play it back with your AudioQueue.
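As a rough idea of what "convert asbd to binary data" could look like, here is a sketch in plain C. WireASBD is a hypothetical stand-in that mirrors the ASBD fields (it is not the real struct), the 36-byte big-endian layout is just one choice I made up, and it assumes both ends use IEEE 754 doubles:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the AudioStreamBasicDescription fields. */
typedef struct {
    double   sampleRate;
    uint32_t formatID;
    uint32_t formatFlags;
    uint32_t bytesPerPacket;
    uint32_t framesPerPacket;
    uint32_t bytesPerFrame;
    uint32_t channelsPerFrame;
    uint32_t bitsPerChannel;
} WireASBD;

enum { WIRE_ASBD_SIZE = 8 + 7 * 4 };

/* Write and read 32-bit values in big-endian (network) byte order. */
void put_u32(uint8_t *out, uint32_t v) {
    out[0] = (uint8_t)(v >> 24); out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);  out[3] = (uint8_t)v;
}

uint32_t get_u32(const uint8_t *in) {
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Encode into a fixed-size buffer that is safe to send between hosts. */
void wire_asbd_encode(const WireASBD *d, uint8_t out[WIRE_ASBD_SIZE]) {
    uint64_t rateBits;
    memcpy(&rateBits, &d->sampleRate, sizeof rateBits); /* raw double bits */
    put_u32(out,     (uint32_t)(rateBits >> 32));
    put_u32(out + 4, (uint32_t)rateBits);
    const uint32_t fields[7] = {
        d->formatID, d->formatFlags, d->bytesPerPacket, d->framesPerPacket,
        d->bytesPerFrame, d->channelsPerFrame, d->bitsPerChannel
    };
    for (int i = 0; i < 7; ++i)
        put_u32(out + 8 + 4 * i, fields[i]);
}

/* Decode the buffer produced by wire_asbd_encode. */
void wire_asbd_decode(const uint8_t in[WIRE_ASBD_SIZE], WireASBD *d) {
    uint64_t rateBits = ((uint64_t)get_u32(in) << 32) | get_u32(in + 4);
    memcpy(&d->sampleRate, &rateBits, sizeof rateBits);
    d->formatID         = get_u32(in + 8);
    d->formatFlags      = get_u32(in + 12);
    d->bytesPerPacket   = get_u32(in + 16);
    d->framesPerPacket  = get_u32(in + 20);
    d->bytesPerFrame    = get_u32(in + 24);
    d->channelsPerFrame = get_u32(in + 28);
    d->bitsPerChannel   = get_u32(in + 32);
}
```

Sending each field explicitly rather than memcpy'ing the whole struct avoids surprises from padding or byte order if the receiver is not another iOS device.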
I actually had a system like this working not that long ago, but since I couldn't keep the connection alive when the iOS client device went to the background, I wasn't able to use it for my purpose. Still, if all that work lets me help someone else who can actually use the info, seems like a win to me.

Related

Recording audio and passing the data to a UIWebView (JavascriptCore) on iOS 8/9

We have an app that is mostly a UIWebView for a heavily javascript based web app. The requirement we have come up against is being able to play audio to the user and then record the user, play back that recording for confirmation and then send the audio to a server. This works in Chrome, Android and other platforms because that ability is built into the browser. No native code required.
Sadly, the iOS (iOS 8/9) web view lacks the ability to record audio.
The first workaround we tried was recording the audio with an AudioQueue and passing the data (LinearPCM 16bit) to a JS AudioNode so the web app could process the iOS audio exactly the same way as other platforms. This got to a point where we could pass the audio to JS, but the app would eventually crash with a bad memory access error or the javascript side just could not keep up with the data being sent.
The next idea was to save the audio recording to a file and send partial audio data to JS for visual feedback, a basic audio visualizer displayed during recording only.
The audio records and plays back fine to a WAVE file as Linear PCM signed 16bit. The JS visualizer is where we are stuck. It is expecting Linear PCM unsigned 8bit so I added a conversion step that may be wrong. I've tried several different ways, mostly found online, and have not found one that works which makes me think something else is wrong or missing before we even get to the conversion step.
Since I don't know what or where exactly the problem is I'll dump the code below for the audio recording and playback classes. Any suggestions would be welcome to resolve, or bypass somehow, this issue.
One idea I had was to record in a different format (CAF) using different format flags. Looking at the values that are produced, none of the signed 16-bit ints come even close to the max value; I rarely see anything above +/-1000. Is that because of the kLinearPCMFormatFlagIsPacked flag in the AudioStreamBasicDescription? Removing that flag causes the audio file to not be created because of an invalid format. Maybe switching to CAF would work, but we need to convert to WAVE before sending the audio back to our server.
Or maybe my conversion from signed 16bit to unsigned 8bit is wrong? I have also tried bitshifting and casting. The only difference is, with this conversion all the audio values get compressed to between 125 and 130. Bit shifting and casting change that to 0-5 and 250-255. That doesn't really solve any problems on the JS side.
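For reference, here is a minimal sketch in plain C of the usual signed 16-bit to unsigned 8-bit mapping (offset to unsigned, then keep the top byte). The function names and the gain parameter are hypothetical, not taken from the classes below. With input that stays within +/-1000, this mapping lands between 124 and 131, which is close to the 125 to 130 range described above, so the conversion itself may be fine and the recorded signal may simply be quiet.

```c
#include <stdint.h>

/* Map a signed 16-bit sample (-32768..32767) to unsigned 8-bit (0..255).
   Adding 32768 first keeps the value non-negative, so the shift is portable. */
uint8_t s16_to_u8(int16_t s) {
    return (uint8_t)(((int32_t)s + 32768) >> 8);
}

/* Same mapping with an integer gain applied first, saturating at the
   16-bit limits. Useful when the input is too quiet to show on a visualizer. */
uint8_t s16_to_u8_gain(int16_t s, int32_t gain) {
    int32_t v = (int32_t)s * gain;
    if (v > 32767)  v = 32767;
    if (v < -32768) v = -32768;
    return s16_to_u8((int16_t)v);
}
```

Silence maps to 128, so a visualizer that expects unsigned 8-bit data should treat 128 as its zero line.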
The next step would be, instead of passing the data to JS run it through a FFT function and produce values to be used directly by JS for the audio visualizer. I'd rather figure out if I have done something obviously wrong before going that direction.
AQRecorder.h - EDIT: updated audio format to LinearPCM 32bit Float.
#ifndef AQRecorder_h
#define AQRecorder_h
#import <AudioToolbox/AudioToolbox.h>
#define NUM_BUFFERS 3
#define AUDIO_DATA_TYPE_FORMAT float
#define JS_AUDIO_DATA_SIZE 32
@interface AQRecorder : NSObject {
AudioStreamBasicDescription mDataFormat;
AudioQueueRef mQueue;
AudioQueueBufferRef mBuffers[ NUM_BUFFERS ];
AudioFileID mAudioFile;
UInt32 bufferByteSize;
SInt64 mCurrentPacket;
bool mIsRunning;
}
- (void)setupAudioFormat;
- (void)startRecording;
- (void)stopRecording;
- (void)processSamplesForJS:(UInt32)audioDataBytesCapacity audioData:(void *)audioData;
- (Boolean)isRunning;
@end
#endif
AQRecorder.m - EDIT: updated audio format to LinearPCM 32bit Float. Added FFT step in processSamplesForJS instead of sending audio data directly.
#import <AVFoundation/AVFoundation.h>
#import "AQRecorder.h"
#import "JSMonitor.h"
@implementation AQRecorder
void AudioQueueCallback(void * inUserData,
AudioQueueRef inAQ,
AudioQueueBufferRef inBuffer,
const AudioTimeStamp * inStartTime,
UInt32 inNumberPacketDescriptions,
const AudioStreamPacketDescription* inPacketDescs)
{
AQRecorder *aqr = (__bridge AQRecorder *)inUserData;
if ( [aqr isRunning] )
{
if ( inNumberPacketDescriptions > 0 )
{
AudioFileWritePackets(aqr->mAudioFile, FALSE, inBuffer->mAudioDataByteSize, inPacketDescs, aqr->mCurrentPacket, &inNumberPacketDescriptions, inBuffer->mAudioData);
aqr->mCurrentPacket += inNumberPacketDescriptions;
[aqr processSamplesForJS:inBuffer->mAudioDataBytesCapacity audioData:inBuffer->mAudioData];
}
AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL);
}
}
- (void)debugDataFormat
{
NSLog(@"format=%i, sampleRate=%f, channels=%i, flags=%i, BPC=%i, BPF=%i", mDataFormat.mFormatID, mDataFormat.mSampleRate, (unsigned int)mDataFormat.mChannelsPerFrame, mDataFormat.mFormatFlags, mDataFormat.mBitsPerChannel, mDataFormat.mBytesPerFrame);
}
- (void)setupAudioFormat
{
memset(&mDataFormat, 0, sizeof(mDataFormat));
mDataFormat.mSampleRate = 44100.;
mDataFormat.mChannelsPerFrame = 1;
mDataFormat.mFormatID = kAudioFormatLinearPCM;
mDataFormat.mFormatFlags = kLinearPCMFormatFlagIsFloat | kLinearPCMFormatFlagIsPacked;
int sampleSize = sizeof(AUDIO_DATA_TYPE_FORMAT);
mDataFormat.mBitsPerChannel = 32;
mDataFormat.mBytesPerPacket = mDataFormat.mBytesPerFrame = (mDataFormat.mBitsPerChannel / 8) * mDataFormat.mChannelsPerFrame;
mDataFormat.mFramesPerPacket = 1;
mDataFormat.mReserved = 0;
[self debugDataFormat];
}
- (void)startRecording
{
[self setupAudioFormat];
mCurrentPacket = 0;
NSString *recordFile = [NSTemporaryDirectory() stringByAppendingPathComponent: @"AudioFile.wav"];
CFURLRef url = CFURLCreateWithString(kCFAllocatorDefault, (CFStringRef)recordFile, NULL);
OSStatus stat =
AudioFileCreateWithURL(url, kAudioFileWAVEType, &mDataFormat, kAudioFileFlags_EraseFile, &mAudioFile);
NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:stat userInfo:nil];
NSLog(@"AudioFileCreateWithURL OSStatus :: %@", error);
CFRelease(url);
bufferByteSize = 896 * mDataFormat.mBytesPerFrame;
AudioQueueNewInput(&mDataFormat, AudioQueueCallback, (__bridge void *)(self), NULL, NULL, 0, &mQueue);
for ( int i = 0; i < NUM_BUFFERS; i++ )
{
AudioQueueAllocateBuffer(mQueue, bufferByteSize, &mBuffers[i]);
AudioQueueEnqueueBuffer(mQueue, mBuffers[i], 0, NULL);
}
mIsRunning = true;
AudioQueueStart(mQueue, NULL);
}
- (void)stopRecording
{
mIsRunning = false;
AudioQueueStop(mQueue, false);
AudioQueueDispose(mQueue, false);
AudioFileClose(mAudioFile);
}
- (void)processSamplesForJS:(UInt32)audioDataBytesCapacity audioData:(void *)audioData
{
int sampleCount = audioDataBytesCapacity / sizeof(AUDIO_DATA_TYPE_FORMAT);
AUDIO_DATA_TYPE_FORMAT *samples = (AUDIO_DATA_TYPE_FORMAT*)audioData;
NSMutableArray *audioDataBuffer = [[NSMutableArray alloc] initWithCapacity:JS_AUDIO_DATA_SIZE];
// FFT stuff taken mostly from Apple's aurioTouch example
const Float32 kAdjust0DB = 1.5849e-13;
int bufferFrames = sampleCount;
int bufferlog2 = round(log2(bufferFrames));
float fftNormFactor = (1.0/(2*bufferFrames));
FFTSetup fftSetup = vDSP_create_fftsetup(bufferlog2, kFFTRadix2);
Float32 *outReal = (Float32*) malloc((bufferFrames / 2)*sizeof(Float32));
Float32 *outImaginary = (Float32*) malloc((bufferFrames / 2)*sizeof(Float32));
COMPLEX_SPLIT mDspSplitComplex = { .realp = outReal, .imagp = outImaginary };
Float32 *outFFTData = (Float32*) malloc((bufferFrames / 2)*sizeof(Float32));
//Generate a split complex vector from the real data
vDSP_ctoz((COMPLEX *)samples, 2, &mDspSplitComplex, 1, bufferFrames / 2);
//Take the fft and scale appropriately
vDSP_fft_zrip(fftSetup, &mDspSplitComplex, 1, bufferlog2, kFFTDirection_Forward);
vDSP_vsmul(mDspSplitComplex.realp, 1, &fftNormFactor, mDspSplitComplex.realp, 1, bufferFrames / 2);
vDSP_vsmul(mDspSplitComplex.imagp, 1, &fftNormFactor, mDspSplitComplex.imagp, 1, bufferFrames / 2);
//Zero out the nyquist value
mDspSplitComplex.imagp[0] = 0.0;
//Convert the fft data to dB
vDSP_zvmags(&mDspSplitComplex, 1, outFFTData, 1, bufferFrames / 2);
//In order to avoid taking log10 of zero, an adjusting factor is added in to make the minimum value equal -128dB
vDSP_vsadd(outFFTData, 1, &kAdjust0DB, outFFTData, 1, bufferFrames / 2);
Float32 one = 1;
vDSP_vdbcon(outFFTData, 1, &one, outFFTData, 1, bufferFrames / 2, 0);
// Average out FFT dB values
int grpSize = (bufferFrames / 2) / 32;
int c = 1;
Float32 avg = 0;
int d = 1;
for ( int i = 1; i < bufferFrames / 2; i++ )
{
if ( outFFTData[ i ] != outFFTData[ i ] || outFFTData[ i ] == INFINITY )
{ // NAN / INFINITE check
c++;
}
else
{
avg += outFFTData[ i ];
d++;
//NSLog(@"db = %f, avg = %f", outFFTData[ i ], avg);
if ( ++c >= grpSize )
{
uint8_t u = (uint8_t)((avg / d) + 128); //dB values seem to range from -128 to 0.
NSLog(@"%i = %i (%f)", i, u, avg);
[audioDataBuffer addObject:[NSNumber numberWithUnsignedInt:u]];
avg = 0;
c = 0;
d = 1;
}
}
}
[[JSMonitor shared] passAudioDataToJavascriptBridge:audioDataBuffer];
}
- (Boolean)isRunning
{
return mIsRunning;
}
@end
Audio playback and recording controller classes
Audio.h
#ifndef Audio_h
#define Audio_h
#import <AVFoundation/AVFoundation.h>
#import "AQRecorder.h"
@interface Audio : NSObject <AVAudioPlayerDelegate> {
AQRecorder* recorder;
AVAudioPlayer* player;
bool mIsSetup;
bool mIsRecording;
bool mIsPlaying;
}
- (void)setupAudio;
- (void)startRecording;
- (void)stopRecording;
- (void)startPlaying;
- (void)stopPlaying;
- (Boolean)isRecording;
- (Boolean)isPlaying;
- (NSString *) getAudioDataBase64String;
@end
#endif
Audio.m
#import "Audio.h"
#import <AudioToolbox/AudioToolbox.h>
#import "JSMonitor.h"
@implementation Audio
- (void)setupAudio
{
NSLog(@"Audio->setupAudio");
AVAudioSession *session = [AVAudioSession sharedInstance];
NSError * error;
[session setCategory:AVAudioSessionCategoryPlayAndRecord error:&error];
[session setActive:YES error:nil];
recorder = [[AQRecorder alloc] init];
mIsSetup = YES;
}
- (void)startRecording
{
NSLog(@"Audio->startRecording");
if ( !mIsSetup )
{
[self setupAudio];
}
if ( mIsRecording ) {
return;
}
if ( [recorder isRunning] == NO )
{
[recorder startRecording];
}
mIsRecording = [recorder isRunning];
}
- (void)stopRecording
{
NSLog(@"Audio->stopRecording");
[recorder stopRecording];
mIsRecording = [recorder isRunning];
[[JSMonitor shared] sendAudioInputStoppedEvent];
}
- (void)startPlaying
{
if ( mIsPlaying )
{
return;
}
mIsPlaying = YES;
NSLog(@"Audio->startPlaying");
NSError* error = nil;
NSString *recordFile = [NSTemporaryDirectory() stringByAppendingPathComponent: @"AudioFile.wav"];
player = [[AVAudioPlayer alloc] initWithContentsOfURL:[NSURL fileURLWithPath:recordFile] error:&error];
if ( error )
{
NSLog(@"AVAudioPlayer failed :: %@", error);
}
player.delegate = self;
[player play];
}
- (void)stopPlaying
{
NSLog(@"Audio->stopPlaying");
[player stop];
mIsPlaying = NO;
[[JSMonitor shared] sendAudioPlaybackCompleteEvent];
}
- (NSString *) getAudioDataBase64String
{
NSString *recordFile = [NSTemporaryDirectory() stringByAppendingPathComponent: @"AudioFile.wav"];
NSError* error = nil;
NSData *fileData = [NSData dataWithContentsOfFile:recordFile options: 0 error: &error];
if ( fileData == nil )
{
NSLog(@"Failed to read file, error %@", error);
return @"DATAENCODINGFAILED";
}
else
{
return [fileData base64EncodedStringWithOptions:0];
}
}
- (Boolean)isRecording { return mIsRecording; }
- (Boolean)isPlaying { return mIsPlaying; }
- (void)audioPlayerDidFinishPlaying:(AVAudioPlayer *)player successfully:(BOOL)flag
{
NSLog(@"Audio->audioPlayerDidFinishPlaying: %i", flag);
mIsPlaying = NO;
[[JSMonitor shared] sendAudioPlaybackCompleteEvent];
}
- (void)audioPlayerDecodeErrorDidOccur:(AVAudioPlayer *)player error:(NSError *)error
{
NSLog(@"Audio->audioPlayerDecodeErrorDidOccur: %@", error.localizedFailureReason);
mIsPlaying = NO;
[[JSMonitor shared] sendAudioPlaybackCompleteEvent];
}
@end
The JSMonitor class is a bridge between the UIWebView javascriptcore and the native code. I'm not including it because it doesn't do anything for audio other than pass data / calls between these classes and JSCore.
EDIT
The format of the audio has changed to LinearPCM Float 32bit. Instead of sending the audio data it is sent through an FFT function and the dB values are averaged and sent instead.

Core Audio is a pain to work with. Fortunately, AVFoundation provides AVAudioRecorder to record audio, and it also gives you access to the average and peak audio power, which you can send back to your JavaScript to update your UI visualizer. From the docs:
An instance of the AVAudioRecorder class, called an audio recorder,
provides audio recording capability in your application. Using an
audio recorder you can:
Record until the user stops the recording
Record for a specified duration
Pause and resume a recording
Obtain input audio-level data that you can use to provide level
metering
This Stack Overflow question has an example of how to use AVAudioRecorder.
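As far as I know, the recorder's metering values (averagePowerForChannel: and peakPowerForChannel:, after enabling metering and calling updateMeters) are in decibels, where 0 dB is full scale and -160 dB is the minimum. Before handing them to a JavaScript visualizer that wants 0 to 255, you would need some mapping. Here is a minimal sketch in plain C; the function name and the display floor (for example -60 dB) are my own assumptions, not part of AVFoundation:

```c
#include <stdint.h>

/* Map a decibel reading (0 dB = full scale) to a 0..255 level for display.
   floorDb is a hypothetical display floor, e.g. -60.0f; anything at or
   below it is shown as 0. The mapping is linear in dB. */
uint8_t db_to_level(float db, float floorDb) {
    if (db != db) return 0;            /* NaN guard */
    if (!(floorDb < 0.0f)) return 0;   /* the floor must be negative */
    if (db <= floorDb) return 0;
    if (db >= 0.0f) return 255;
    return (uint8_t)((1.0f - db / floorDb) * 255.0f + 0.5f);
}
```

Choosing a floor well above -160 dB keeps ordinary speech from being squashed into the top few values of the range.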

Programmatically Record a call and save file in iPhone [duplicate]

Is it theoretically possible to record a phone call on iPhone?
I'm accepting answers which:
may or may not require the phone to be jailbroken
may or may not pass apple's guidelines due to use of private API's (I don't care; it is not for the App Store)
may or may not use private SDKs
I don't want answers just bluntly saying "Apple does not allow that".
I know there would be no official way of doing it, and certainly not for an App Store application, and I know there are call recording apps which place outgoing calls through their own servers.

Here you go. Complete working example. Tweak should be loaded in mediaserverd daemon. It will record every phone call in /var/mobile/Media/DCIM/result.m4a. Audio file has two channels. Left is microphone, right is speaker. On iPhone 4S call is recorded only when the speaker is turned on. On iPhone 5, 5C and 5S call is recorded either way. There might be small hiccups when switching to/from speaker but recording will continue.
#import <AudioToolbox/AudioToolbox.h>
#import <libkern/OSAtomic.h>
//CoreTelephony.framework
extern "C" CFStringRef const kCTCallStatusChangeNotification;
extern "C" CFStringRef const kCTCallStatus;
extern "C" id CTTelephonyCenterGetDefault();
extern "C" void CTTelephonyCenterAddObserver(id ct, void* observer, CFNotificationCallback callBack, CFStringRef name, void *object, CFNotificationSuspensionBehavior sb);
extern "C" int CTGetCurrentCallCount();
enum
{
kCTCallStatusActive = 1,
kCTCallStatusHeld = 2,
kCTCallStatusOutgoing = 3,
kCTCallStatusIncoming = 4,
kCTCallStatusHanged = 5
};
NSString* kMicFilePath = @"/var/mobile/Media/DCIM/mic.caf";
NSString* kSpeakerFilePath = @"/var/mobile/Media/DCIM/speaker.caf";
NSString* kResultFilePath = @"/var/mobile/Media/DCIM/result.m4a";
OSSpinLock phoneCallIsActiveLock = 0;
OSSpinLock speakerLock = 0;
OSSpinLock micLock = 0;
ExtAudioFileRef micFile = NULL;
ExtAudioFileRef speakerFile = NULL;
BOOL phoneCallIsActive = NO;
void Convert()
{
//File URLs
CFURLRef micUrl = CFURLCreateWithFileSystemPath(NULL, (CFStringRef)kMicFilePath, kCFURLPOSIXPathStyle, false);
CFURLRef speakerUrl = CFURLCreateWithFileSystemPath(NULL, (CFStringRef)kSpeakerFilePath, kCFURLPOSIXPathStyle, false);
CFURLRef mixUrl = CFURLCreateWithFileSystemPath(NULL, (CFStringRef)kResultFilePath, kCFURLPOSIXPathStyle, false);
ExtAudioFileRef micFile = NULL;
ExtAudioFileRef speakerFile = NULL;
ExtAudioFileRef mixFile = NULL;
//Opening input files (speaker and mic)
ExtAudioFileOpenURL(micUrl, &micFile);
ExtAudioFileOpenURL(speakerUrl, &speakerFile);
//Reading input file audio format (mono LPCM)
AudioStreamBasicDescription inputFormat, outputFormat;
UInt32 descSize = sizeof(inputFormat);
ExtAudioFileGetProperty(micFile, kExtAudioFileProperty_FileDataFormat, &descSize, &inputFormat);
int sampleSize = inputFormat.mBytesPerFrame;
//Filling input stream format for output file (stereo LPCM)
FillOutASBDForLPCM(inputFormat, inputFormat.mSampleRate, 2, inputFormat.mBitsPerChannel, inputFormat.mBitsPerChannel, true, false, false);
//Filling output file audio format (AAC)
memset(&outputFormat, 0, sizeof(outputFormat));
outputFormat.mFormatID = kAudioFormatMPEG4AAC;
outputFormat.mSampleRate = 8000;
outputFormat.mFormatFlags = kMPEG4Object_AAC_Main;
outputFormat.mChannelsPerFrame = 2;
//Opening output file
ExtAudioFileCreateWithURL(mixUrl, kAudioFileM4AType, &outputFormat, NULL, kAudioFileFlags_EraseFile, &mixFile);
ExtAudioFileSetProperty(mixFile, kExtAudioFileProperty_ClientDataFormat, sizeof(inputFormat), &inputFormat);
//Freeing URLs
CFRelease(micUrl);
CFRelease(speakerUrl);
CFRelease(mixUrl);
//Setting up audio buffers
int bufferSizeInSamples = 64 * 1024;
AudioBufferList micBuffer;
micBuffer.mNumberBuffers = 1;
micBuffer.mBuffers[0].mNumberChannels = 1;
micBuffer.mBuffers[0].mDataByteSize = sampleSize * bufferSizeInSamples;
micBuffer.mBuffers[0].mData = malloc(micBuffer.mBuffers[0].mDataByteSize);
AudioBufferList speakerBuffer;
speakerBuffer.mNumberBuffers = 1;
speakerBuffer.mBuffers[0].mNumberChannels = 1;
speakerBuffer.mBuffers[0].mDataByteSize = sampleSize * bufferSizeInSamples;
speakerBuffer.mBuffers[0].mData = malloc(speakerBuffer.mBuffers[0].mDataByteSize);
AudioBufferList mixBuffer;
mixBuffer.mNumberBuffers = 1;
mixBuffer.mBuffers[0].mNumberChannels = 2;
mixBuffer.mBuffers[0].mDataByteSize = sampleSize * bufferSizeInSamples * 2;
mixBuffer.mBuffers[0].mData = malloc(mixBuffer.mBuffers[0].mDataByteSize);
//Converting
while (true)
{
//Reading data from input files
UInt32 framesToRead = bufferSizeInSamples;
ExtAudioFileRead(micFile, &framesToRead, &micBuffer);
ExtAudioFileRead(speakerFile, &framesToRead, &speakerBuffer);
if (framesToRead == 0)
{
break;
}
//Building interleaved stereo buffer - left channel is mic, right - speaker
for (int i = 0; i < framesToRead; i++)
{
memcpy((char*)mixBuffer.mBuffers[0].mData + i * sampleSize * 2, (char*)micBuffer.mBuffers[0].mData + i * sampleSize, sampleSize);
memcpy((char*)mixBuffer.mBuffers[0].mData + i * sampleSize * 2 + sampleSize, (char*)speakerBuffer.mBuffers[0].mData + i * sampleSize, sampleSize);
}
//Writing to output file - LPCM will be converted to AAC
ExtAudioFileWrite(mixFile, framesToRead, &mixBuffer);
}
//Closing files
ExtAudioFileDispose(micFile);
ExtAudioFileDispose(speakerFile);
ExtAudioFileDispose(mixFile);
//Freeing audio buffers
free(micBuffer.mBuffers[0].mData);
free(speakerBuffer.mBuffers[0].mData);
free(mixBuffer.mBuffers[0].mData);
}
void Cleanup()
{
[[NSFileManager defaultManager] removeItemAtPath:kMicFilePath error:NULL];
[[NSFileManager defaultManager] removeItemAtPath:kSpeakerFilePath error:NULL];
}
void CoreTelephonyNotificationCallback(CFNotificationCenterRef center, void *observer, CFStringRef name, const void *object, CFDictionaryRef userInfo)
{
NSDictionary* data = (NSDictionary*)userInfo;
if ([(NSString*)name isEqualToString:(NSString*)kCTCallStatusChangeNotification])
{
int currentCallStatus = [data[(NSString*)kCTCallStatus] integerValue];
if (currentCallStatus == kCTCallStatusActive)
{
OSSpinLockLock(&phoneCallIsActiveLock);
phoneCallIsActive = YES;
OSSpinLockUnlock(&phoneCallIsActiveLock);
}
else if (currentCallStatus == kCTCallStatusHanged)
{
if (CTGetCurrentCallCount() > 0)
{
return;
}
OSSpinLockLock(&phoneCallIsActiveLock);
phoneCallIsActive = NO;
OSSpinLockUnlock(&phoneCallIsActiveLock);
//Closing mic file
OSSpinLockLock(&micLock);
if (micFile != NULL)
{
ExtAudioFileDispose(micFile);
}
micFile = NULL;
OSSpinLockUnlock(&micLock);
//Closing speaker file
OSSpinLockLock(&speakerLock);
if (speakerFile != NULL)
{
ExtAudioFileDispose(speakerFile);
}
speakerFile = NULL;
OSSpinLockUnlock(&speakerLock);
Convert();
Cleanup();
}
}
}
OSStatus(*AudioUnitProcess_orig)(AudioUnit unit, AudioUnitRenderActionFlags *ioActionFlags, const AudioTimeStamp *inTimeStamp, UInt32 inNumberFrames, AudioBufferList *ioData);
OSStatus AudioUnitProcess_hook(AudioUnit unit, AudioUnitRenderActionFlags *ioActionFlags, const AudioTimeStamp *inTimeStamp, UInt32 inNumberFrames, AudioBufferList *ioData)
{
OSSpinLockLock(&phoneCallIsActiveLock);
if (phoneCallIsActive == NO)
{
OSSpinLockUnlock(&phoneCallIsActiveLock);
return AudioUnitProcess_orig(unit, ioActionFlags, inTimeStamp, inNumberFrames, ioData);
}
OSSpinLockUnlock(&phoneCallIsActiveLock);
ExtAudioFileRef* currentFile = NULL;
OSSpinLock* currentLock = NULL;
AudioComponentDescription unitDescription = {0};
AudioComponentGetDescription(AudioComponentInstanceGetComponent(unit), &unitDescription);
//'agcc', 'mbdp' - iPhone 4S, iPhone 5
//'agc2', 'vrq2' - iPhone 5C, iPhone 5S
if (unitDescription.componentSubType == 'agcc' || unitDescription.componentSubType == 'agc2')
{
currentFile = &micFile;
currentLock = &micLock;
}
else if (unitDescription.componentSubType == 'mbdp' || unitDescription.componentSubType == 'vrq2')
{
currentFile = &speakerFile;
currentLock = &speakerLock;
}
if (currentFile != NULL)
{
OSSpinLockLock(currentLock);
//Opening file
if (*currentFile == NULL)
{
//Obtaining input audio format
AudioStreamBasicDescription desc;
UInt32 descSize = sizeof(desc);
AudioUnitGetProperty(unit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0, &desc, &descSize);
//Opening audio file
CFURLRef url = CFURLCreateWithFileSystemPath(NULL, (CFStringRef)((currentFile == &micFile) ? kMicFilePath : kSpeakerFilePath), kCFURLPOSIXPathStyle, false);
ExtAudioFileRef audioFile = NULL;
OSStatus result = ExtAudioFileCreateWithURL(url, kAudioFileCAFType, &desc, NULL, kAudioFileFlags_EraseFile, &audioFile);
if (result != 0)
{
*currentFile = NULL;
}
else
{
*currentFile = audioFile;
//Writing audio format
ExtAudioFileSetProperty(*currentFile, kExtAudioFileProperty_ClientDataFormat, sizeof(desc), &desc);
}
CFRelease(url);
}
else
{
//Writing audio buffer
ExtAudioFileWrite(*currentFile, inNumberFrames, ioData);
}
OSSpinLockUnlock(currentLock);
}
return AudioUnitProcess_orig(unit, ioActionFlags, inTimeStamp, inNumberFrames, ioData);
}
__attribute__((constructor))
static void initialize()
{
CTTelephonyCenterAddObserver(CTTelephonyCenterGetDefault(), NULL, CoreTelephonyNotificationCallback, NULL, NULL, CFNotificationSuspensionBehaviorHold);
MSHookFunction(AudioUnitProcess, AudioUnitProcess_hook, &AudioUnitProcess_orig);
}
A few words about what's going on. The AudioUnitProcess function is used to process audio streams in order to apply effects, mix, convert and so on. We hook AudioUnitProcess in order to access the phone call's audio streams. While a phone call is active, these streams are processed in various ways.
We listen for CoreTelephony notifications in order to get phone call status changes. When we receive audio samples, we need to determine where they come from: the microphone or the speaker. This is done using the componentSubType field of the AudioComponentDescription structure. You might ask why we don't store the AudioUnit objects so that we don't need to check componentSubType every time. I tried that, but it breaks everything when you switch the speaker on or off on an iPhone 5, because the AudioUnit objects change; they are recreated. So we open audio files (one for the microphone and one for the speaker) and write samples to them, simple as that. When the phone call ends, we receive the appropriate CoreTelephony notification and close the files. We then have two separate files with audio from the microphone and the speaker that we need to merge. This is what void Convert() is for. It's pretty simple if you know the API. I don't think I need to explain it; the comments are enough.
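The interleaving step inside Convert() can be sketched in isolation. This is a minimal illustration in plain C (which also compiles as Objective-C), not the tweak's actual code: the function name interleave_stereo and its parameters are made up for this example, and it assumes both inputs hold the same number of equally sized mono frames.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Interleave two mono streams into one stereo stream.
 * frame_size is the size in bytes of one mono frame (what the listing
 * calls sampleSize). out must have room for 2 * frames * frame_size bytes.
 * Left channel comes from `left`, right channel from `right`. */
static void interleave_stereo(const void *left, const void *right, void *out,
                              size_t frames, size_t frame_size)
{
    assert(left != NULL && right != NULL && out != NULL);
    const unsigned char *l = left;
    const unsigned char *r = right;
    unsigned char *o = out;
    for (size_t i = 0; i < frames; i++) {
        /* Stereo frame i occupies slots 2*i (left) and 2*i + 1 (right). */
        memcpy(o + (2 * i) * frame_size, l + i * frame_size, frame_size);
        memcpy(o + (2 * i + 1) * frame_size, r + i * frame_size, frame_size);
    }
}
```

For 16-bit mono inputs, frame_size would be sizeof(int16_t), which plays the role of sampleSize in the listing, with the microphone as the left input and the speaker as the right one.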
About locks. There are many threads in mediaserverd. Audio processing and CoreTelephony notifications run on different threads, so we need some kind of synchronization. I chose spin locks because they are fast and because the chance of lock contention is small in our case. On the iPhone 4S and even the iPhone 5, all the work in AudioUnitProcess should be done as fast as possible; otherwise you will hear hiccups from the device speaker, which is obviously not good.
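The componentSubType check in the hook compares four-character codes. Here is a small sketch of how such codes can be packed into a 32-bit value and classified. It assumes the usual convention that the first character lands in the most significant byte (which, as far as I know, is how Apple's compilers treat a literal such as 'agcc'). The names fourcc, stream_kind and classify_subtype are hypothetical and exist only in this example; the four subtype codes are the ones from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { STREAM_UNKNOWN, STREAM_MIC, STREAM_SPEAKER } stream_kind;

/* Pack four characters into a 32-bit code, first character in the high byte. */
static uint32_t fourcc(const char code[4])
{
    assert(code != NULL);
    return ((uint32_t)(unsigned char)code[0] << 24) |
           ((uint32_t)(unsigned char)code[1] << 16) |
           ((uint32_t)(unsigned char)code[2] << 8) |
           (uint32_t)(unsigned char)code[3];
}

/* Map an audio unit subtype to the stream it carries, mirroring the
 * if / else if chain in AudioUnitProcess_hook. */
static stream_kind classify_subtype(uint32_t subtype)
{
    if (subtype == fourcc("agcc") || subtype == fourcc("agc2")) {
        return STREAM_MIC;      /* microphone path */
    }
    if (subtype == fourcc("mbdp") || subtype == fourcc("vrq2")) {
        return STREAM_SPEAKER;  /* speaker path */
    }
    return STREAM_UNKNOWN;      /* any other unit is passed through untouched */
}
```

Because the lookup depends only on the subtype value and not on the AudioUnit pointer, it keeps working when the units are recreated on a speaker switch.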

Yes. Audio Recorder by a developer named Limneos does that (and quite well). You can find it on Cydia. It can record any type of call on iPhone 5 and up without using any servers etc. The recording is stored on the device as an audio file. It also supports iPhone 4S, but for speaker only.
This tweak is known to be the first tweak ever that managed to record both streams of audio without using any 3rd party servers, VoIP or something similar.
The developer placed beeps on the other side of the call to alert the person you are recording, but those were removed too by hackers across the net. To answer your question: yes, it's very much possible, and not just theoretically.
Further reading
https://stackoverflow.com/a/19413363/202451
http://forums.macrumors.com/showthread.php?t=1566350
https://github.com/nst/iOS-Runtime-Headers

The only solution I can think of is to use the Core Telephony framework, and more specifically the callEventHandler property, to intercept when a call is coming in, and then to use an AVAudioRecorder to record the voice of the person with the phone (and maybe a little of the person on the other line's voice). This is obviously not perfect, and would only work if your application is in the foreground at the time of the call, but it may be the best you can get. See more about finding out if there is an incoming phone call here: Can we fire an event when ever there is Incoming and Outgoing call in iphone?.
EDIT:
.h:
#import <AVFoundation/AVFoundation.h>
#import<CoreTelephony/CTCallCenter.h>
#import<CoreTelephony/CTCall.h>
@property (strong, nonatomic) AVAudioRecorder *audioRecorder;
ViewDidLoad:
NSArray *dirPaths;
NSString *docsDir;
dirPaths = NSSearchPathForDirectoriesInDomains(
NSDocumentDirectory, NSUserDomainMask, YES);
docsDir = dirPaths[0];
NSString *soundFilePath = [docsDir
stringByAppendingPathComponent:@"sound.caf"];
NSURL *soundFileURL = [NSURL fileURLWithPath:soundFilePath];
NSDictionary *recordSettings = [NSDictionary
dictionaryWithObjectsAndKeys:
[NSNumber numberWithInt:AVAudioQualityMin],
AVEncoderAudioQualityKey,
[NSNumber numberWithInt:16],
AVEncoderBitRateKey,
[NSNumber numberWithInt: 2],
AVNumberOfChannelsKey,
[NSNumber numberWithFloat:44100.0],
AVSampleRateKey,
nil];
NSError *error = nil;
_audioRecorder = [[AVAudioRecorder alloc]
initWithURL:soundFileURL
settings:recordSettings
error:&error];
if (error)
{
NSLog(@"error: %@", [error localizedDescription]);
} else {
[_audioRecorder prepareToRecord];
}
CTCallCenter *callCenter = [[CTCallCenter alloc] init];
[callCenter setCallEventHandler:^(CTCall *call) {
if ([[call callState] isEqual:CTCallStateConnected]) {
[_audioRecorder record];
} else if ([[call callState] isEqual:CTCallStateDisconnected]) {
[_audioRecorder stop];
}
}];
AppDelegate.m:
// Makes sure that the recording keeps happening even when the app is in the background, though it can only go on for 10 minutes.
- (void)applicationDidEnterBackground:(UIApplication *)application
{
__block UIBackgroundTaskIdentifier task = 0;
task=[application beginBackgroundTaskWithExpirationHandler:^{
NSLog(@"Expiration handler called %f",[application backgroundTimeRemaining]);
[application endBackgroundTask:task];
task=UIBackgroundTaskInvalid;
}];
}
This is my first time using many of these features, so I'm not sure if this is exactly right, but I think you get the idea. It is untested, as I do not have access to the right tools at the moment. Compiled using these sources:
Recording voice in background using AVAudioRecorder
http://prassan-warrior.blogspot.com/2012/11/recording-audio-on-iphone-with.html
Can we fire an event when ever there is Incoming and Outgoing call in iphone?

Apple does not allow it and does not provide any API for it.
However, on a jailbroken device I'm sure it's possible. As a matter of fact, I think it's already done. I remember seeing an app when my phone was jailbroken that changed your voice and recorded the call - I remember it was a US company offering it only in the states. Unfortunately I don't remember the name...

I guess some hardware could solve this: a small recorder connected to the minijack port, with the earbuds and microphone passing through it. This recorder can be very simple. While not in a conversation, the recorder could feed the phone with data/the recording (through the jack cable). A simple start button (just like the volume controls on the earbuds) could be sufficient for timing the recording.
Some setups
http://www.danmccomb.com/posts/483/how-to-record-iphone-conversations-using-zoom-h4n/
http://forums.macrumors.com/showthread.php?t=346430

Encoding raw YUV420P to h264 with AVCodec on iOS

I am trying to encode a single YUV420P image gathered from a CMSampleBuffer to an AVPacket so that I can send h264 video over the network with RTMP.
The posted code example seems to work as avcodec_encode_video2 returns 0 (Success) however got_output is also 0 (AVPacket is empty).
Does anyone have any experience with encoding video on iOS devices that might know what I am doing wrong?
- (void) captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
fromConnection:(AVCaptureConnection *)connection {
// sampleBuffer now contains an individual frame of raw video frames
CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CVPixelBufferLockBaseAddress(pixelBuffer, 0);
// access the data
int width = CVPixelBufferGetWidth(pixelBuffer);
int height = CVPixelBufferGetHeight(pixelBuffer);
int bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
unsigned char *rawPixelBase = (unsigned char *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
// Convert the raw pixel base to h.264 format
AVCodec *codec = 0;
AVCodecContext *context = 0;
AVFrame *frame = 0;
AVPacket packet;
//avcodec_init();
avcodec_register_all();
codec = avcodec_find_encoder(AV_CODEC_ID_H264);
if (codec == 0) {
NSLog(@"Codec not found!!");
return;
}
context = avcodec_alloc_context3(codec);
if (!context) {
NSLog(@"Context no bueno.");
return;
}
// Bit rate
context->bit_rate = 400000; // HARD CODE
context->bit_rate_tolerance = 10;
// Resolution
context->width = width;
context->height = height;
// Frames Per Second
context->time_base = (AVRational) {1,25};
context->gop_size = 1;
//context->max_b_frames = 1;
context->pix_fmt = PIX_FMT_YUV420P;
// Open the codec
if (avcodec_open2(context, codec, 0) < 0) {
NSLog(@"Unable to open codec");
return;
}
// Create the frame
frame = avcodec_alloc_frame();
if (!frame) {
NSLog(@"Unable to alloc frame");
return;
}
frame->format = context->pix_fmt;
frame->width = context->width;
frame->height = context->height;
avpicture_fill((AVPicture *) frame, rawPixelBase, context->pix_fmt, frame->width, frame->height);
int got_output = 0;
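av_init_packet(&packet);
// av_init_packet does not touch data and size; leave them empty so the encoder allocates the output buffer
packet.data = NULL;
packet.size = 0;
avcodec_encode_video2(context, &packet, frame, &got_output);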
// Unlock the pixel data
CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
// Send the data over the network
[self uploadData:[NSData dataWithBytes:packet.data length:packet.size] toRTMP:self.rtmp_OutVideoStream];
}
Note: It is known that this code has memory leaks because I am not freeing the memory that is dynamically allocated.
UPDATE
I updated my code to use @pogorskiy's method. I only try to upload the frame if got_output returns 1, and I clear the buffer once I am done encoding video frames.

How to correctly read decoded PCM samples on iOS using AVAssetReader -- currently incorrect decoding

I am currently working on an application as part of my Bachelor in Computer Science. The application will correlate data from the iPhone hardware (accelerometer, gps) and music that is being played.
The project is still in its infancy; I have worked on it for only 2 months.
Where I am right now, and where I need help, is reading PCM samples from songs in the iTunes library and playing them back using an audio unit.
Currently, the implementation I would like to get working does the following: it chooses a random song from iTunes, reads samples from it when required, and stores them in a buffer, let's call it sampleBuffer. Later on, in the consumer model, the audio unit (which has a mixer and a remoteIO output) has a callback where I simply copy the required number of samples from sampleBuffer into the buffer specified in the callback. What I then hear through the speakers is not quite what I expect: I can recognize that it is playing the song, but it seems to be incorrectly decoded and has a lot of noise! I attached an image which shows the first ~half a second (24576 samples @ 44.1 kHz), and this does not resemble normal-looking output.
Before I get into the listing: I have checked that the file is not corrupted, and similarly I have written test cases for the buffer (so I know the buffer does not alter the samples). Although this might not be the best way to do it (some would argue to go the audio queue route), I want to perform various manipulations on the samples as well as changing the song before it is finished, rearranging what song is played, etc. Furthermore, maybe there are some incorrect settings in the audio unit; however, the graph that displays the samples (which shows the samples are decoded incorrectly) is taken straight from the buffer, so for now I am only looking to solve why reading from disk and decoding does not work correctly. Right now I simply want to get a play-through working.
I can't post images because I'm new to Stack Overflow, so here's the link to the image: http://i.stack.imgur.com/RHjlv.jpg
Listing:
This is where I set up the audioReadSettings which will be used for the AVAssetReaderAudioMixOutput:
// Set the read settings
audioReadSettings = [[NSMutableDictionary alloc] init];
[audioReadSettings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM]
forKey:AVFormatIDKey];
[audioReadSettings setValue:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsNonInterleaved];
[audioReadSettings setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
The following code listing is a method that receives an NSString with the persistent ID of the song:
-(BOOL)setNextSongID:(NSString*)persistand_id {
assert(persistand_id != nil);
MPMediaItem *song = [self getMediaItemForPersistantID:persistand_id];
NSURL *assetUrl = [song valueForProperty:MPMediaItemPropertyAssetURL];
AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetUrl
options:[NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES]
forKey:AVURLAssetPreferPreciseDurationAndTimingKey]];
NSError *assetError = nil;
assetReader = [[AVAssetReader assetReaderWithAsset:songAsset error:&assetError] retain];
if (assetError) {
NSLog(@"error: %@", assetError);
return NO;
}
CMTimeRange timeRange = CMTimeRangeMake(kCMTimeZero, songAsset.duration);
[assetReader setTimeRange:timeRange];
track = [[songAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
assetReaderOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:[NSArray arrayWithObject:track]
audioSettings:audioReadSettings];
if (![assetReader canAddOutput:assetReaderOutput]) {
NSLog(@"cant add reader output... die!");
return NO;
}
[assetReader addOutput:assetReaderOutput];
[assetReader startReading];
// just getting some basic information about the track to print
NSArray *formatDesc = ((AVAssetTrack*)[[assetReaderOutput audioTracks] objectAtIndex:0]).formatDescriptions;
for (unsigned int i = 0; i < [formatDesc count]; ++i) {
CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
const CAStreamBasicDescription *asDesc = (CAStreamBasicDescription*)CMAudioFormatDescriptionGetStreamBasicDescription(item);
if (asDesc) {
// get data
numChannels = asDesc->mChannelsPerFrame;
sampleRate = asDesc->mSampleRate;
asDesc->Print();
}
}
[self copyEnoughSamplesToBufferForLength:24000];
return YES;
}
The following presents the function -(void)copyEnoughSamplesToBufferForLength:
-(void)copyEnoughSamplesToBufferForLength:(UInt32)samples_count {
[w_lock lock];
int stillToCopy = 0;
if (sampleBuffer->numSamples() < samples_count) {
stillToCopy = samples_count;
}
NSAutoreleasePool *apool = [[NSAutoreleasePool alloc] init];
CMSampleBufferRef sampleBufferRef;
SInt16 *dataBuffer = (SInt16*)malloc(8192 * sizeof(SInt16));
int a = 0;
while (stillToCopy > 0) {
sampleBufferRef = [assetReaderOutput copyNextSampleBuffer];
if (!sampleBufferRef) {
// end of song or no more samples; break instead of returning so the
// buffer, lock and autorelease pool below are still released
break;
}
CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBufferRef);
CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(sampleBufferRef);
AudioBufferList audioBufferList;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBufferRef,
NULL,
&audioBufferList,
sizeof(audioBufferList),
NULL,
NULL,
0,
&blockBuffer);
int data_length = floorf(numSamplesInBuffer * 1.0f);
int j = 0;
for (int bufferCount=0; bufferCount < audioBufferList.mNumberBuffers; bufferCount++) {
SInt16* samples = (SInt16 *)audioBufferList.mBuffers[bufferCount].mData;
for (int i=0; i < numSamplesInBuffer; i++) {
dataBuffer[j] = samples[i];
j++;
}
}
CFRelease(sampleBufferRef);
sampleBuffer->putSamples(dataBuffer, j);
stillToCopy = stillToCopy - data_length;
}
free(dataBuffer);
[w_lock unlock];
[apool release];
}
Now the sampleBuffer will have incorrectly decoded samples. Can anyone help me understand why this is so? This happens for different files in my iTunes library (mp3, aac, wav, etc).
Any help would be greatly appreciated. Furthermore, if you need any other listing of my code, or perhaps what the output sounds like, I will attach it on request. I have been sitting on this for the past week trying to debug it and have found no help online. Everyone seems to be doing it my way, yet it seems that only I have this issue.
Thanks for any help at all!
Peter

Currently, I am also working on a project which involves extracting audio samples from iTunes Library into AudioUnit.
The audio unit render callback is included for your reference. The input format is set to SInt16StereoStreamFormat.
I have made use of Michael Tyson's circular buffer implementation - TPCircularBuffer as the buffer storage. Very easy to use and understand!!! Thanks Michael!
- (void) loadBuffer:(NSURL *)assetURL_
{
if (nil != self.iPodAssetReader) {
[iTunesOperationQueue cancelAllOperations];
[self cleanUpBuffer];
}
NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
[NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
[NSNumber numberWithFloat:44100.0], AVSampleRateKey,
[NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
[NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
[NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
nil];
AVURLAsset *asset = [AVURLAsset URLAssetWithURL:assetURL_ options:nil];
if (asset == nil) {
NSLog(@"asset is not defined!");
return;
}
NSLog(@"Total Asset Duration: %f", CMTimeGetSeconds(asset.duration));
NSError *assetError = nil;
self.iPodAssetReader = [AVAssetReader assetReaderWithAsset:asset error:&assetError];
if (assetError) {
NSLog (@"error: %@", assetError);
return;
}
AVAssetReaderOutput *readerOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:asset.tracks audioSettings:outputSettings];
if (! [iPodAssetReader canAddOutput: readerOutput]) {
NSLog (@"can't add reader output... die!");
return;
}
// add output reader to reader
[iPodAssetReader addOutput: readerOutput];
if (! [iPodAssetReader startReading]) {
NSLog(@"Unable to start reading!");
return;
}
// Init circular buffer
TPCircularBufferInit(&playbackState.circularBuffer, kTotalBufferSize);
__block NSBlockOperation * feediPodBufferOperation = [NSBlockOperation blockOperationWithBlock:^{
while (![feediPodBufferOperation isCancelled] && iPodAssetReader.status != AVAssetReaderStatusCompleted) {
if (iPodAssetReader.status == AVAssetReaderStatusReading) {
// Check if the available buffer space is enough to hold at least one cycle of the sample data
if (kTotalBufferSize - playbackState.circularBuffer.fillCount >= 32768) {
CMSampleBufferRef nextBuffer = [readerOutput copyNextSampleBuffer];
if (nextBuffer) {
AudioBufferList abl;
CMBlockBufferRef blockBuffer;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(nextBuffer, NULL, &abl, sizeof(abl), NULL, NULL, kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment, &blockBuffer);
UInt64 size = CMSampleBufferGetTotalSampleSize(nextBuffer);
int bytesCopied = TPCircularBufferProduceBytes(&playbackState.circularBuffer, abl.mBuffers[0].mData, size);
if (!playbackState.bufferIsReady && bytesCopied > 0) {
playbackState.bufferIsReady = YES;
}
CFRelease(nextBuffer);
CFRelease(blockBuffer);
}
else {
break;
}
}
}
}
NSLog(@"iPod Buffer Reading Finished");
}];
[iTunesOperationQueue addOperation:feediPodBufferOperation];
}
static OSStatus ipodRenderCallback (
void *inRefCon, // A pointer to a struct containing the complete audio data
// to play, as well as state information such as the
// first sample to play on this invocation of the callback.
AudioUnitRenderActionFlags *ioActionFlags, // Unused here. When generating audio, use ioActionFlags to indicate silence
// between sounds; for silence, also memset the ioData buffers to 0.
const AudioTimeStamp *inTimeStamp, // Unused here.
UInt32 inBusNumber, // The mixer unit input bus that is requesting some new
// frames of audio data to play.
UInt32 inNumberFrames, // The number of frames of audio to provide to the buffer(s)
// pointed to by the ioData parameter.
AudioBufferList *ioData // On output, the audio data to play. The callback's primary
// responsibility is to fill the buffer(s) in the
// AudioBufferList.
)
{
Audio* audioObject = (Audio*)inRefCon;
AudioSampleType *outSample = (AudioSampleType *)ioData->mBuffers[0].mData;
// Zero-out all the output samples first
memset(outSample, 0, inNumberFrames * kUnitSize * 2);
if ( audioObject.playingiPod && audioObject.bufferIsReady) {
// Pull audio from circular buffer
int32_t availableBytes;
AudioSampleType *bufferTail = TPCircularBufferTail(&audioObject.circularBuffer, &availableBytes);
memcpy(outSample, bufferTail, MIN(availableBytes, inNumberFrames * kUnitSize * 2) );
TPCircularBufferConsume(&audioObject.circularBuffer, MIN(availableBytes, inNumberFrames * kUnitSize * 2) );
audioObject.currentSampleNum += MIN(availableBytes / (kUnitSize * 2), inNumberFrames);
if (availableBytes <= inNumberFrames * kUnitSize * 2) {
// Buffer is running out or playback is finished
audioObject.bufferIsReady = NO;
audioObject.playingiPod = NO;
audioObject.currentSampleNum = 0;
if ([[audioObject delegate] respondsToSelector:@selector(playbackDidFinish)]) {
[[audioObject delegate] performSelector:@selector(playbackDidFinish)];
}
}
}
return noErr;
}
- (void) setupSInt16StereoStreamFormat {
// AudioSampleType is the canonical sample type for audio I/O (SInt16 on iOS).
// This obtains the byte size of the type for use in filling in the ASBD.
size_t bytesPerSample = sizeof (AudioSampleType);
// Fill the application audio format struct's fields to define a linear PCM,
// stereo, interleaved stream at the graph sample rate (graphSampleRate).
SInt16StereoStreamFormat.mFormatID = kAudioFormatLinearPCM;
SInt16StereoStreamFormat.mFormatFlags = kAudioFormatFlagsCanonical;
SInt16StereoStreamFormat.mBytesPerPacket = 2 * bytesPerSample; // *** kAudioFormatFlagsCanonical <- implicit interleaved data => (left sample + right sample) per Packet
SInt16StereoStreamFormat.mFramesPerPacket = 1;
SInt16StereoStreamFormat.mBytesPerFrame = SInt16StereoStreamFormat.mBytesPerPacket * SInt16StereoStreamFormat.mFramesPerPacket;
SInt16StereoStreamFormat.mChannelsPerFrame = 2; // 2 indicates stereo
SInt16StereoStreamFormat.mBitsPerChannel = 8 * bytesPerSample;
SInt16StereoStreamFormat.mSampleRate = graphSampleRate;
NSLog (@"The stereo stream format for the \"iPod\" mixer input bus:");
[self printASBD: SInt16StereoStreamFormat];
}
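One detail worth spelling out from the listing above is that it sizes its copy in bytes (via CMSampleBufferGetTotalSampleSize) rather than in frames. A minimal sketch in plain C of why that matters for interleaved 16-bit stereo follows. It assumes that a frame count such as the one CMSampleBufferGetNumSamples returns counts frames, not individual channel values, which I believe is the case for LPCM but have not verified here. The helper names values_in_frames and copy_interleaved are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of individual 16-bit values held by `frames` interleaved frames.
 * For stereo this is twice the frame count. */
static size_t values_in_frames(size_t frames, size_t channels)
{
    return frames * channels;
}

/* Copy every interleaved 16-bit value of a buffer whose size is known in
 * bytes (for example mDataByteSize of an AudioBuffer). The copy is bounded
 * by dst_capacity, given in values. Returns the number of values copied. */
static size_t copy_interleaved(const int16_t *src, size_t src_bytes,
                               int16_t *dst, size_t dst_capacity)
{
    assert(src != NULL && dst != NULL);
    size_t count = src_bytes / sizeof(int16_t);
    if (count > dst_capacity) {
        count = dst_capacity;  /* never write past the destination */
    }
    memcpy(dst, src, count * sizeof(int16_t));
    return count;
}
```

Under that assumption, looping up to the frame count over an interleaved stereo buffer would cover only the first half of its values, so sizing the copy from the byte count (or from frames times channels) is the safer choice.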

I guess it is kind of late, but you could try this library:
https://bitbucket.org/artgillespie/tslibraryimport
After using this to save the audio into a file, you could process the data with render callbacks from MixerHost.

If I were you, I would either use kAudioUnitSubType_AudioFilePlayer to play the file and access its samples with the unit's render callback.
Or
Use ExtAudioFileRef to extract the samples straight to a buffer.

How can I record a conversation / phone call on iOS?

Is it theoretically possible to record a phone call on iPhone?
I'm accepting answers which:
may or may not require the phone to be jailbroken
may or may not pass Apple's guidelines due to use of private APIs (I don't care; it is not for the App Store)
may or may not use private SDKs
I don't want answers just bluntly saying "Apple does not allow that".
I know there would be no official way of doing it, and certainly not for an App Store application, and I know there are call recording apps which place outgoing calls through their own servers.

Here you go: a complete working example. The tweak should be loaded into the mediaserverd daemon. It will record every phone call to /var/mobile/Media/DCIM/result.m4a. The audio file has two channels: left is the microphone, right is the speaker. On the iPhone 4S a call is recorded only when the speaker is turned on. On the iPhone 5, 5C and 5S a call is recorded either way. There might be small hiccups when switching to/from the speaker, but recording will continue.
#import <AudioToolbox/AudioToolbox.h>
#import <libkern/OSAtomic.h>
//CoreTelephony.framework
extern "C" CFStringRef const kCTCallStatusChangeNotification;
extern "C" CFStringRef const kCTCallStatus;
extern "C" id CTTelephonyCenterGetDefault();
extern "C" void CTTelephonyCenterAddObserver(id ct, void* observer, CFNotificationCallback callBack, CFStringRef name, void *object, CFNotificationSuspensionBehavior sb);
extern "C" int CTGetCurrentCallCount();
enum
{
kCTCallStatusActive = 1,
kCTCallStatusHeld = 2,
kCTCallStatusOutgoing = 3,
kCTCallStatusIncoming = 4,
kCTCallStatusHanged = 5
};
NSString* kMicFilePath = @"/var/mobile/Media/DCIM/mic.caf";
NSString* kSpeakerFilePath = @"/var/mobile/Media/DCIM/speaker.caf";
NSString* kResultFilePath = @"/var/mobile/Media/DCIM/result.m4a";
OSSpinLock phoneCallIsActiveLock = 0;
OSSpinLock speakerLock = 0;
OSSpinLock micLock = 0;
ExtAudioFileRef micFile = NULL;
ExtAudioFileRef speakerFile = NULL;
BOOL phoneCallIsActive = NO;
void Convert()
{
//File URLs
CFURLRef micUrl = CFURLCreateWithFileSystemPath(NULL, (CFStringRef)kMicFilePath, kCFURLPOSIXPathStyle, false);
CFURLRef speakerUrl = CFURLCreateWithFileSystemPath(NULL, (CFStringRef)kSpeakerFilePath, kCFURLPOSIXPathStyle, false);
CFURLRef mixUrl = CFURLCreateWithFileSystemPath(NULL, (CFStringRef)kResultFilePath, kCFURLPOSIXPathStyle, false);
ExtAudioFileRef micFile = NULL;
ExtAudioFileRef speakerFile = NULL;
ExtAudioFileRef mixFile = NULL;
//Opening input files (speaker and mic)
ExtAudioFileOpenURL(micUrl, &micFile);
ExtAudioFileOpenURL(speakerUrl, &speakerFile);
//Reading input file audio format (mono LPCM)
AudioStreamBasicDescription inputFormat, outputFormat;
UInt32 descSize = sizeof(inputFormat);
ExtAudioFileGetProperty(micFile, kExtAudioFileProperty_FileDataFormat, &descSize, &inputFormat);
int sampleSize = inputFormat.mBytesPerFrame;
//Filling input stream format for output file (stereo LPCM)
FillOutASBDForLPCM(inputFormat, inputFormat.mSampleRate, 2, inputFormat.mBitsPerChannel, inputFormat.mBitsPerChannel, true, false, false);
//Filling output file audio format (AAC)
memset(&outputFormat, 0, sizeof(outputFormat));
outputFormat.mFormatID = kAudioFormatMPEG4AAC;
outputFormat.mSampleRate = 8000;
outputFormat.mFormatFlags = kMPEG4Object_AAC_Main;
outputFormat.mChannelsPerFrame = 2;
//Opening output file
ExtAudioFileCreateWithURL(mixUrl, kAudioFileM4AType, &outputFormat, NULL, kAudioFileFlags_EraseFile, &mixFile);
ExtAudioFileSetProperty(mixFile, kExtAudioFileProperty_ClientDataFormat, sizeof(inputFormat), &inputFormat);
//Freeing URLs
CFRelease(micUrl);
CFRelease(speakerUrl);
CFRelease(mixUrl);
//Setting up audio buffers
int bufferSizeInSamples = 64 * 1024;
AudioBufferList micBuffer;
micBuffer.mNumberBuffers = 1;
micBuffer.mBuffers[0].mNumberChannels = 1;
micBuffer.mBuffers[0].mDataByteSize = sampleSize * bufferSizeInSamples;
micBuffer.mBuffers[0].mData = malloc(micBuffer.mBuffers[0].mDataByteSize);
AudioBufferList speakerBuffer;
speakerBuffer.mNumberBuffers = 1;
speakerBuffer.mBuffers[0].mNumberChannels = 1;
speakerBuffer.mBuffers[0].mDataByteSize = sampleSize * bufferSizeInSamples;
speakerBuffer.mBuffers[0].mData = malloc(speakerBuffer.mBuffers[0].mDataByteSize);
AudioBufferList mixBuffer;
mixBuffer.mNumberBuffers = 1;
mixBuffer.mBuffers[0].mNumberChannels = 2;
mixBuffer.mBuffers[0].mDataByteSize = sampleSize * bufferSizeInSamples * 2;
mixBuffer.mBuffers[0].mData = malloc(mixBuffer.mBuffers[0].mDataByteSize);
//Converting
while (true)
{
//Reading data from input files
UInt32 framesToRead = bufferSizeInSamples;
ExtAudioFileRead(micFile, &framesToRead, &micBuffer);
ExtAudioFileRead(speakerFile, &framesToRead, &speakerBuffer);
if (framesToRead == 0)
{
break;
}
//Building interleaved stereo buffer - left channel is mic, right - speaker
for (int i = 0; i < framesToRead; i++)
{
memcpy((char*)mixBuffer.mBuffers[0].mData + i * sampleSize * 2, (char*)micBuffer.mBuffers[0].mData + i * sampleSize, sampleSize);
memcpy((char*)mixBuffer.mBuffers[0].mData + i * sampleSize * 2 + sampleSize, (char*)speakerBuffer.mBuffers[0].mData + i * sampleSize, sampleSize);
}
//Writing to output file - LPCM will be converted to AAC
ExtAudioFileWrite(mixFile, framesToRead, &mixBuffer);
}
//Closing files
ExtAudioFileDispose(micFile);
ExtAudioFileDispose(speakerFile);
ExtAudioFileDispose(mixFile);
//Freeing audio buffers
free(micBuffer.mBuffers[0].mData);
free(speakerBuffer.mBuffers[0].mData);
free(mixBuffer.mBuffers[0].mData);
}
void Cleanup()
{
[[NSFileManager defaultManager] removeItemAtPath:kMicFilePath error:NULL];
[[NSFileManager defaultManager] removeItemAtPath:kSpeakerFilePath error:NULL];
}
void CoreTelephonyNotificationCallback(CFNotificationCenterRef center, void *observer, CFStringRef name, const void *object, CFDictionaryRef userInfo)
{
NSDictionary* data = (NSDictionary*)userInfo;
if ([(NSString*)name isEqualToString:(NSString*)kCTCallStatusChangeNotification])
{
int currentCallStatus = [data[(NSString*)kCTCallStatus] integerValue];
if (currentCallStatus == kCTCallStatusActive)
{
OSSpinLockLock(&phoneCallIsActiveLock);
phoneCallIsActive = YES;
OSSpinLockUnlock(&phoneCallIsActiveLock);
}
else if (currentCallStatus == kCTCallStatusHanged)
{
if (CTGetCurrentCallCount() > 0)
{
return;
}
OSSpinLockLock(&phoneCallIsActiveLock);
phoneCallIsActive = NO;
OSSpinLockUnlock(&phoneCallIsActiveLock);
//Closing mic file
OSSpinLockLock(&micLock);
if (micFile != NULL)
{
ExtAudioFileDispose(micFile);
}
micFile = NULL;
OSSpinLockUnlock(&micLock);
//Closing speaker file
OSSpinLockLock(&speakerLock);
if (speakerFile != NULL)
{
ExtAudioFileDispose(speakerFile);
}
speakerFile = NULL;
OSSpinLockUnlock(&speakerLock);
Convert();
Cleanup();
}
}
}
OSStatus(*AudioUnitProcess_orig)(AudioUnit unit, AudioUnitRenderActionFlags *ioActionFlags, const AudioTimeStamp *inTimeStamp, UInt32 inNumberFrames, AudioBufferList *ioData);
OSStatus AudioUnitProcess_hook(AudioUnit unit, AudioUnitRenderActionFlags *ioActionFlags, const AudioTimeStamp *inTimeStamp, UInt32 inNumberFrames, AudioBufferList *ioData)
{
OSSpinLockLock(&phoneCallIsActiveLock);
if (phoneCallIsActive == NO)
{
OSSpinLockUnlock(&phoneCallIsActiveLock);
return AudioUnitProcess_orig(unit, ioActionFlags, inTimeStamp, inNumberFrames, ioData);
}
OSSpinLockUnlock(&phoneCallIsActiveLock);
ExtAudioFileRef* currentFile = NULL;
OSSpinLock* currentLock = NULL;
AudioComponentDescription unitDescription = {0};
AudioComponentGetDescription(AudioComponentInstanceGetComponent(unit), &unitDescription);
//'agcc', 'mbdp' - iPhone 4S, iPhone 5
//'agc2', 'vrq2' - iPhone 5C, iPhone 5S
if (unitDescription.componentSubType == 'agcc' || unitDescription.componentSubType == 'agc2')
{
currentFile = &micFile;
currentLock = &micLock;
}
else if (unitDescription.componentSubType == 'mbdp' || unitDescription.componentSubType == 'vrq2')
{
currentFile = &speakerFile;
currentLock = &speakerLock;
}
if (currentFile != NULL)
{
OSSpinLockLock(currentLock);
//Opening file
if (*currentFile == NULL)
{
//Obtaining input audio format
AudioStreamBasicDescription desc;
UInt32 descSize = sizeof(desc);
AudioUnitGetProperty(unit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0, &desc, &descSize);
//Opening audio file
CFURLRef url = CFURLCreateWithFileSystemPath(NULL, (CFStringRef)((currentFile == &micFile) ? kMicFilePath : kSpeakerFilePath), kCFURLPOSIXPathStyle, false);
ExtAudioFileRef audioFile = NULL;
OSStatus result = ExtAudioFileCreateWithURL(url, kAudioFileCAFType, &desc, NULL, kAudioFileFlags_EraseFile, &audioFile);
if (result != 0)
{
*currentFile = NULL;
}
else
{
*currentFile = audioFile;
//Writing audio format
ExtAudioFileSetProperty(*currentFile, kExtAudioFileProperty_ClientDataFormat, sizeof(desc), &desc);
}
CFRelease(url);
}
else
{
//Writing audio buffer
ExtAudioFileWrite(*currentFile, inNumberFrames, ioData);
}
OSSpinLockUnlock(currentLock);
}
return AudioUnitProcess_orig(unit, ioActionFlags, inTimeStamp, inNumberFrames, ioData);
}
__attribute__((constructor))
static void initialize()
{
CTTelephonyCenterAddObserver(CTTelephonyCenterGetDefault(), NULL, CoreTelephonyNotificationCallback, NULL, NULL, CFNotificationSuspensionBehaviorHold);
MSHookFunction(AudioUnitProcess, AudioUnitProcess_hook, &AudioUnitProcess_orig);
}
A few words about what's going on. The AudioUnitProcess function processes audio streams in order to apply effects, mix, convert and so on. We hook AudioUnitProcess to get access to the phone call's audio streams. While a phone call is active, these streams are processed in various ways.
We listen for CoreTelephony notifications to track phone call status changes. When we receive audio samples we need to determine where they come from: microphone or speaker. This is done using the componentSubType field of the AudioComponentDescription structure. You might ask why we don't store the AudioUnit objects so that we don't need to check componentSubType every time. I tried that, but it breaks everything when you switch the speaker on or off on iPhone 5, because the AudioUnit objects are recreated. So we open two audio files (one for the microphone and one for the speaker) and write samples into them, simple as that. When the phone call ends we receive the corresponding CoreTelephony notification and close the files. That leaves two separate files, with audio from the microphone and from the speaker, that we need to merge. This is what void Convert() is for. It's pretty simple if you know the API. I don't think I need to explain it; the comments are enough.
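To make the two core ideas above concrete outside of mediaserverd, here is a minimal portable C++ sketch. It is not the tweak's code: fourCC, StreamSource, classifySubType and interleaveStereo are hypothetical names introduced for illustration, and the sketch assumes the four-character subtypes are packed big-endian, as multi-character constants such as 'agcc' are on Apple's compilers. It shows how a stream can be classified by its subtype and how two mono buffers are interleaved into one stereo buffer the way Convert() does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: packs four ASCII characters into a 32-bit code,
// first character in the most significant byte.
constexpr std::uint32_t fourCC(char a, char b, char c, char d)
{
    return (std::uint32_t(std::uint8_t(a)) << 24) |
           (std::uint32_t(std::uint8_t(b)) << 16) |
           (std::uint32_t(std::uint8_t(c)) << 8) |
            std::uint32_t(std::uint8_t(d));
}

enum class StreamSource { Unknown, Microphone, Speaker };

// Mirrors the componentSubType check in AudioUnitProcess_hook.
StreamSource classifySubType(std::uint32_t subType)
{
    if (subType == fourCC('a', 'g', 'c', 'c') || subType == fourCC('a', 'g', 'c', '2'))
        return StreamSource::Microphone;
    if (subType == fourCC('m', 'b', 'd', 'p') || subType == fourCC('v', 'r', 'q', '2'))
        return StreamSource::Speaker;
    return StreamSource::Unknown;
}

// Interleaves two mono buffers into one stereo buffer: left = mic, right = speaker.
// sampleSize is the number of bytes per mono frame, as in Convert().
std::vector<std::uint8_t> interleaveStereo(const std::vector<std::uint8_t>& mic,
                                           const std::vector<std::uint8_t>& speaker,
                                           std::size_t sampleSize)
{
    assert(sampleSize > 0);
    assert(mic.size() == speaker.size() && mic.size() % sampleSize == 0);

    const std::size_t frames = mic.size() / sampleSize;
    std::vector<std::uint8_t> mix(mic.size() * 2);
    for (std::size_t i = 0; i < frames; ++i)
    {
        // Each stereo frame is one mic sample followed by one speaker sample.
        std::memcpy(mix.data() + i * sampleSize * 2,
                    mic.data() + i * sampleSize, sampleSize);
        std::memcpy(mix.data() + i * sampleSize * 2 + sampleSize,
                    speaker.data() + i * sampleSize, sampleSize);
    }
    return mix;
}
```

Classifying by subtype on every call, instead of caching AudioUnit pointers, is what keeps the recording alive when the units are recreated on a speaker switch.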
About locks. There are many threads in mediaserverd. Audio processing and CoreTelephony notifications run on different threads, so we need some kind of synchronization. I chose spin locks because they are fast and because the chance of lock contention is small in our case. On iPhone 4S and even iPhone 5, all the work in AudioUnitProcess should be done as fast as possible; otherwise you will hear hiccups from the device speaker, which is obviously not good.
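For readers unfamiliar with spin locks, the idea can be sketched in portable C++. This is an illustration only, not what the tweak uses: SpinLock and CallState are hypothetical names, and std::atomic_flag stands in for OSSpinLock. The lock busy-waits instead of putting the thread to sleep, which is cheap when the critical section is as short as reading or writing a single flag.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for OSSpinLock, built on std::atomic_flag.
class SpinLock
{
public:
    // Spins until the flag was previously clear, i.e. until we own the lock.
    void lock()
    {
        while (flag_.test_and_set(std::memory_order_acquire))
        {
            // busy-wait
        }
    }

    // Returns true if the lock was acquired without waiting.
    bool try_lock()
    {
        return !flag_.test_and_set(std::memory_order_acquire);
    }

    void unlock()
    {
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Shared state guarded by the lock, analogous to phoneCallIsActive.
struct CallState
{
    SpinLock lock;
    bool active = false;

    void setActive(bool value)
    {
        std::lock_guard<SpinLock> guard(lock);
        active = value;
    }

    bool isActive()
    {
        std::lock_guard<SpinLock> guard(lock);
        return active;
    }
};
```

In this sketch the notification thread would call setActive and the audio thread isActive; both hold the lock only for a single load or store, so contention stays rare.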

Yes. Audio Recorder, by a developer named Limneos, does that (and quite well). You can find it on Cydia. It can record any type of call on iPhone 5 and up without using any servers. The call is saved on the device as an audio file. It also supports iPhone 4S, but for speaker only.
This tweak is known as the first one that managed to record both audio streams without using any third-party servers, VoIP or anything similar.
The developer added beeps on the other side of the call to alert the person being recorded, but hackers across the net have since removed them. To answer your question: yes, it's very much possible, and not just theoretically.
Further reading
https://stackoverflow.com/a/19413363/202451
http://forums.macrumors.com/showthread.php?t=1566350
https://github.com/nst/iOS-Runtime-Headers

The only solution I can think of is to use the Core Telephony framework, and more specifically the callEventHandler property, to intercept when a call is coming in, and then to use an AVAudioRecorder to record the voice of the person with the phone (and maybe a little of the person on the other line's voice). This is obviously not perfect, and would only work if your application is in the foreground at the time of the call, but it may be the best you can get. See more about finding out if there is an incoming phone call here: Can we fire an event when ever there is Incoming and Outgoing call in iphone?.
EDIT:
.h:
#import <AVFoundation/AVFoundation.h>
#import <CoreTelephony/CTCallCenter.h>
#import <CoreTelephony/CTCall.h>
@property (strong, nonatomic) AVAudioRecorder *audioRecorder;
ViewDidLoad:
NSArray *dirPaths;
NSString *docsDir;
dirPaths = NSSearchPathForDirectoriesInDomains(
NSDocumentDirectory, NSUserDomainMask, YES);
docsDir = dirPaths[0];
NSString *soundFilePath = [docsDir
stringByAppendingPathComponent:@"sound.caf"];
NSURL *soundFileURL = [NSURL fileURLWithPath:soundFilePath];
NSDictionary *recordSettings = [NSDictionary
dictionaryWithObjectsAndKeys:
[NSNumber numberWithInt:AVAudioQualityMin],
AVEncoderAudioQualityKey,
[NSNumber numberWithInt:16],
AVEncoderBitRateKey,
[NSNumber numberWithInt: 2],
AVNumberOfChannelsKey,
[NSNumber numberWithFloat:44100.0],
AVSampleRateKey,
nil];
NSError *error = nil;
_audioRecorder = [[AVAudioRecorder alloc]
initWithURL:soundFileURL
settings:recordSettings
error:&error];
if (error)
{
NSLog(@"error: %@", [error localizedDescription]);
} else {
[_audioRecorder prepareToRecord];
}
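// In real code keep callCenter in a strong property or ivar; under ARC a local is released when this method returns and the handler stops firing.
CTCallCenter *callCenter = [[CTCallCenter alloc] init];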
[callCenter setCallEventHandler:^(CTCall *call) {
if ([[call callState] isEqual:CTCallStateConnected]) {
[_audioRecorder record];
} else if ([[call callState] isEqual:CTCallStateDisconnected]) {
[_audioRecorder stop];
}
}];
AppDelegate.m:
// Keeps the recording going when the app moves to the background, though only for about 10 minutes.
- (void)applicationDidEnterBackground:(UIApplication *)application
{
__block UIBackgroundTaskIdentifier task = UIBackgroundTaskInvalid;
task = [application beginBackgroundTaskWithExpirationHandler:^{
NSLog(@"Expiration handler called %f", [application backgroundTimeRemaining]);
[application endBackgroundTask:task];
task = UIBackgroundTaskInvalid;
}];
}
This is my first time using many of these features, so I'm not sure this is exactly right, but I think you get the idea. Untested, as I do not have access to the right tools at the moment. Compiled from these sources:
Recording voice in background using AVAudioRecorder
http://prassan-warrior.blogspot.com/2012/11/recording-audio-on-iphone-with.html
Can we fire an event when ever there is Incoming and Outgoing call in iphone?

Apple does not allow it and does not provide any API for it.
However, on a jailbroken device I'm sure it's possible. As a matter of fact, I think it's already done. I remember seeing an app when my phone was jailbroken that changed your voice and recorded the call - I remember it was a US company offering it only in the states. Unfortunately I don't remember the name...

I guess some hardware could solve this: a small recorder connected to the minijack port, with the earbuds and microphone passing through it. This recorder can be very simple. While not in a conversation, the recorder could feed the recording back to the phone through the jack cable. A simple start button (just like the volume controls on the earbuds) could be sufficient for timing the recording.
Some setups
http://www.danmccomb.com/posts/483/how-to-record-iphone-conversations-using-zoom-h4n/
http://forums.macrumors.com/showthread.php?t=346430
