How to synchronize multiple audio files on iOS? - ios

I'd like to synchronize (one-shot) audio effects with the beat of (looping) background music on iOS.
How do I approach that task?
Edit: So, to give more details, say the background music loops over 4 bars. I want to be able to start the playback of another audio file (of an audio effect) on the next 8th (or 16th or 4th...) note.

Simple. Put them on the same timeline (e.g. audio playback callback), and use the input tempo to determine the metric intervals in samples (or ticks, if using MIDI events).
Update
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Samples per beat = samples per second * seconds per beat (60 / BPM). */
double SamplesPerBeat(const double audioSampleRate,
                      const double beatsPerMinute) {
    assert(8000.0 <= audioSampleRate && 192000.0 >= audioSampleRate);
    assert(20.0 <= beatsPerMinute && 500.0 >= beatsPerMinute);
    return audioSampleRate * 60.0 / beatsPerMinute;
}

/* Sample position at which the given beat starts. */
uint32_t StartPosition(const double audioSampleRate,
                       const double beatsPerMinute,
                       const uint32_t beatNumber) {
    const double samplesPerBeat = SamplesPerBeat(audioSampleRate, beatsPerMinute);
    return (uint32_t)floor(samplesPerBeat * beatNumber);
}
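To address the edit (starting another file on the next 8th or 16th note), the same arithmetic can be carried one step further by quantizing the current playback position up to the next subdivision boundary. Here is a minimal Swift sketch under that assumption; the function name nextSubdivisionStart and the parameter names are purely illustrative, and it assumes you track the current position in samples yourself:

import Foundation

/// Sample position of the next subdivision boundary at or after `currentSample`.
/// `subdivisionsPerBeat` is 2 for 8th notes, 4 for 16th notes, etc. (assuming a quarter-note beat).
func nextSubdivisionStart(currentSample: Int64,
                          sampleRate: Double,
                          beatsPerMinute: Double,
                          subdivisionsPerBeat: Int) -> Int64 {
    let samplesPerBeat = sampleRate * 60.0 / beatsPerMinute
    let samplesPerSubdivision = samplesPerBeat / Double(subdivisionsPerBeat)
    // Round the current position up to the next multiple of the subdivision length.
    let index = ceil(Double(currentSample) / samplesPerSubdivision)
    return Int64(index * samplesPerSubdivision)
}

// Example: at 120 BPM and 44.1 kHz, the next 16th-note boundary after sample 50_000
// let start = nextSubdivisionStart(currentSample: 50_000, sampleRate: 44_100,
//                                  beatsPerMinute: 120, subdivisionsPerBeat: 4)

You would then schedule the one-shot effect to begin at that sample position on the shared timeline.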

Related

AVAudioPlayerNode lastRenderTime

I use multiple AVAudioPlayerNode in AVAudioEngine to mix audio files for playback.
Once all the setup is done (engine prepared and started, audio file segments scheduled), I call the play() method on each player node to start playback.
Because it takes time to loop through all the player nodes, I take a snapshot of the first node's lastRenderTime value and use it to compute a start time for each node's play(at:) method, to keep playback in sync between nodes:
let delay = 0.0
let startSampleTime = time.sampleTime // time is the snapshot value
let sampleRate = player.outputFormat(forBus: 0).sampleRate
let startTime = AVAudioTime(
    sampleTime: startSampleTime + AVAudioFramePosition(delay * sampleRate),
    atRate: sampleRate)
player.play(at: startTime)
The problem is with the current playback time.
I use this computation to get the value, where seekTime is a value I keep track of in case we seek the player. It's 0.0 at start:
private var _currentTime: TimeInterval {
    guard player.engine != nil,
          let lastRenderTime = player.lastRenderTime,
          lastRenderTime.isSampleTimeValid,
          lastRenderTime.isHostTimeValid else {
        return seekTime
    }
    let sampleRate = player.outputFormat(forBus: 0).sampleRate
    let sampleTime = player.playerTime(forNodeTime: lastRenderTime)?.sampleTime ?? 0
    if sampleTime > 0 && sampleRate != 0 {
        return seekTime + (Double(sampleTime) / sampleRate)
    }
    return seekTime
}
While this produces a relatively correct value, I can hear a delay between the moment I call play and the first sound I hear. Because lastRenderTime immediately starts to advance once I call play(at:), there must be some kind of processing/buffering time offset.
The noticeable delay is around 100 ms, which is very large, and I need a precise current time value to do visual rendering in parallel.
It probably doesn't matter, but every audio file is AAC audio, and I schedule segments of them on the player nodes; I don't use buffers directly.
Segment lengths may vary. I also call prepare(withFrameCount:) on each player node once I have scheduled the audio data.
So my question is: is the delay I observe a buffering issue (should I schedule shorter segments, for example)? Is there a way to compute this value precisely so I can adjust my current playback time computation?
When I install a tap block on one AVAudioPlayerNode, the block is called with a buffer of length 4410 at a sample rate of 44100 Hz, which means 0.1 s of audio data. Should I rely on this to compute the latency?
I'm wondering if I can trust the length of the buffer I get in the tap block. Alternatively, I'm trying to compute the total latency for my audio graph. Can someone provide insight on how to determine this value precisely?
From a post on Apple's developer forums by theanalogkid:
On the system, latency is measured by:
Audio Device I/O Buffer Frame Size + Output Safety Offset + Output Stream Latency + Output Device Latency
If you're trying to calculate total roundtrip latency you can add:
Input Latency + Input Safety Offset to the above.
The timestamp you see at the render proc accounts for the buffer frame size and the safety offset, but the stream and device latencies are not accounted for.
iOS gives you access to the most important of the above information via AVAudioSession and as mentioned you can also use the "preferred" session settings - setPreferredIOBufferDuration and preferredIOBufferDuration for further control.
/* The current hardware input latency in seconds. */
@property(readonly) NSTimeInterval inputLatency NS_AVAILABLE_IOS(6_0);
/* The current hardware output latency in seconds. */
@property(readonly) NSTimeInterval outputLatency NS_AVAILABLE_IOS(6_0);
/* The current hardware IO buffer duration in seconds. */
@property(readonly) NSTimeInterval IOBufferDuration NS_AVAILABLE_IOS(6_0);
Audio Units also have the kAudioUnitProperty_Latency property you can query.
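As a rough illustration of how those session values could feed back into the earlier _currentTime computation, here is a minimal Swift sketch. The helper names (estimatedOutputLatency, adjustedCurrentTime) are made up for this example, it assumes the shared AVAudioSession is already configured and active, and whether outputLatency + ioBufferDuration fully explains the ~100 ms offset depends on the route and device:

import AVFoundation

/// Rough output latency between a rendered sample and the moment it becomes audible.
/// (Assumption: the IO buffer plus the reported output latency dominate the offset.)
func estimatedOutputLatency() -> TimeInterval {
    let session = AVAudioSession.sharedInstance()
    return session.outputLatency + session.ioBufferDuration
}

/// Current playback time adjusted for output latency, never going below seekTime.
func adjustedCurrentTime(player: AVAudioPlayerNode, seekTime: TimeInterval) -> TimeInterval {
    guard let nodeTime = player.lastRenderTime,
          let playerTime = player.playerTime(forNodeTime: nodeTime),
          playerTime.sampleRate > 0 else {
        return seekTime
    }
    let elapsed = Double(playerTime.sampleTime) / playerTime.sampleRate
    return max(seekTime, seekTime + elapsed - estimatedOutputLatency())
}

Subtracting the estimated latency keeps the visual rendering closer to what is actually heard, at the cost of the value lagging slightly behind the render clock.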

How to get the accurate time position of a live stream in AVPlayer

I'm using AVPlayer to play a live stream. The stream supports one hour of catch-up, which means the user can seek back up to one hour and play from there. My question is: how do I know the accurate position the player is currently playing at? I need to display the current position on the player view. For example, if the user is playing content from half an hour ago, display -30:00; if the user is playing the latest content, the player should show 00:00 or "live". Thanks.
Swift solution:
override func getLiveDuration() -> Float {
    var result: Float = 0.0
    if let items = player.currentItem?.seekableTimeRanges {
        if !items.isEmpty {
            let range = items[items.count - 1]
            let timeRange = range.timeRangeValue
            let startSeconds = CMTimeGetSeconds(timeRange.start)
            let durationSeconds = CMTimeGetSeconds(timeRange.duration)
            result = Float(startSeconds + durationSeconds)
        }
    }
    return result
}
To get the live position and seek to it, you can use the seekableTimeRanges of AVPlayerItem:
CMTimeRange seekableRange = [player.currentItem.seekableTimeRanges.lastObject CMTimeRangeValue];
CGFloat seekableStart = CMTimeGetSeconds(seekableRange.start);
CGFloat seekableDuration = CMTimeGetSeconds(seekableRange.duration);
CGFloat livePosition = seekableStart + seekableDuration;
[player seekToTime:CMTimeMake(livePosition, 1)];
Also, when you seek back some amount of time, you can get the current playing position by calling the currentTime method:
CGFloat current = CMTimeGetSeconds([self.player.currentItem currentTime]);
CGFloat diff = livePosition - current;
I know this question is old, but I had the same requirement and I believe the existing solutions don't properly address the intent of the question.
What I did for this requirement was to gather the current point in time, the starting time, and the total duration of the seekable range.
One thing to explain before going further: the current point in time can surpass (starting time + total duration). This is due to the way HLS is structured as ts segments. Ts segments are small chunks of playable video; you could have, in your seekable range, 5 ts segments of 10 seconds each. This doesn't mean that 50 seconds is the full length of the live stream; there is roughly one full segment more (so 60 seconds of playtime total), but it isn't categorized as seekable since you shouldn't seek to that segment. If you do, you'll notice rebuffering in most instances (because the source may still be creating the next ts segment when you have already reached the end of playback).
So I check whether the current stream time is beyond the seekable range; if so, we are live on the stream. If it isn't, you can easily calculate how far behind live you are by subtracting the starting time and total duration from the current time.
if let timeRange = player.currentItem?.seekableTimeRanges.last?.timeRangeValue {
    let start = timeRange.start.seconds
    let totalDuration = timeRange.duration.seconds
    let currentTime = player.currentTime().seconds
    let secondsBehindLive = currentTime - totalDuration - start
}
The code above gives you a negative number of seconds behind "live" (more specifically, behind the start of the latest ts segment), or zero or a positive number when the player is playing the latest ts segment.
To be honest, I don't really know when seekableTimeRanges will have more than one value; it has always been just one for the streams I have tested. But if you find more than one value in your streams, you may have to figure out whether to add up all the ranges' durations, which time range to use as the start value, and so on. At least for my use case, this was enough.
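To tie this back to the original question (showing -30:00 or "live" in the UI), a minimal Swift sketch of the formatting step might look like the following. The function name liveOffsetLabel and the 1-second tolerance for deciding when playback counts as live are my own assumptions:

import Foundation

/// Formats the offset from live as "LIVE" or "-MM:SS" (e.g. 30 minutes behind -> "-30:00").
/// `secondsBehindLive` is the (usually negative) value computed above.
func liveOffsetLabel(secondsBehindLive: Double, liveTolerance: Double = 1.0) -> String {
    guard secondsBehindLive < -liveTolerance else { return "LIVE" }
    let behind = Int((-secondsBehindLive).rounded())
    return String(format: "-%02ld:%02ld", behind / 60, behind % 60)
}

// Examples:
// liveOffsetLabel(secondsBehindLive: -1800)  // "-30:00"
// liveOffsetLabel(secondsBehindLive: 0.4)    // "LIVE"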

Varispeed with Libsndfile, Libsamplerate and Portaudio in C

I'm working on an audio visualizer in C with OpenGL, libsamplerate, PortAudio, and libsndfile. I'm having difficulty using src_process correctly within my overall design. My goal is to use src_process to achieve vinyl-like varispeed in real time within the visualizer. Right now my implementation changes the pitch of the audio without changing the speed, and it does so with lots of distortion due to what sounds like missing frames; when I lower the speed with the src_ratio, it almost sounds granular, like chopped-up samples. I keep experimenting with my buffering chunks, but nine times out of ten I get a libsamplerate error saying my input and output arrays are overlapping. I've also been looking at the speed-change example that comes with libsamplerate and I can't find where I went wrong. Any help would be appreciated.
Here's the code I believe is relevant. Thanks, and let me know if I can be more specific; this semester was my first experience with C and programming.
#define FRAMES_PER_BUFFER 1024
#define ITEMS_PER_BUFFER (FRAMES_PER_BUFFER * 2)

float src_inBuffer[ITEMS_PER_BUFFER];
float src_outBuffer[ITEMS_PER_BUFFER];

void initialize_SRC_DATA()
{
    data.src_ratio = 1;                          //Sets Default Playback Speed
    /*---------------*/
    data.src_data.data_in = data.src_inBuffer;   //Point to SRC inBuffer
    data.src_data.data_out = data.src_outBuffer; //Point to SRC OutBuffer
    data.src_data.input_frames = 0;              //Start with Zero to Force Load
    data.src_data.output_frames = ITEMS_PER_BUFFER
                                  / data.sfinfo1.channels; //Number of Frames to Write Out
    data.src_data.src_ratio = data.src_ratio;    //Sets Default Playback Speed
}
/* Open audio stream */
err = Pa_OpenStream( &g_stream,
                     NULL,
                     &outputParameters,
                     data.sfinfo1.samplerate,
                     FRAMES_PER_BUFFER,
                     paNoFlag,
                     paCallback,
                     &data );
/* Read FramesPerBuffer Amount of Data from inFile into buffer[] */
numberOfFrames = sf_readf_float(data->inFile, data->src_inBuffer, framesPerBuffer);

/* Looping of inFile if EOF is Reached */
if (numberOfFrames < framesPerBuffer)
{
    sf_seek(data->inFile, 0, SEEK_SET);
    /* Accumulate the wrapped-around read so input_frames covers both reads */
    numberOfFrames += sf_readf_float(data->inFile,
                                     data->src_inBuffer + (numberOfFrames * data->sfinfo1.channels),
                                     framesPerBuffer - numberOfFrames);
}

/* Inform SRC Data How Many Input Frames To Process */
data->src_data.end_of_input = 0;
data->src_data.input_frames = numberOfFrames;

/* Perform SRC Modulation, Processed Samples are in src_outBuffer[] */
if ((data->src_error = src_process(data->src_state, &data->src_data))) {
    printf("\nError : %s\n\n", src_strerror(data->src_error));
    exit(1);
}

/* Write Processed SRC Data to Audio Out and Visual Out */
for (i = 0; i < framesPerBuffer * data->sfinfo1.channels; i++)
{
    // gl_audioBuffer[i] = data->src_outBuffer[i] * data->amplitude;
    out[i] = data->src_outBuffer[i] * data->amplitude;
}
I figured out a solution that works well enough for me, and I'll explain it as best I can for anyone else with a similar issue. To get the varispeed to work, the way the API works is that you give it a certain number of input frames and it spits out a certain number of output frames. For an SRC ratio of 0.5, if you want 512 output frames per loop, you feed in 512 / 0.5 = 1024 input frames; when src_process runs, it compresses those 1024 frames into 512, speeding up the samples. I don't fully understand why it solved my issue, but the problem was that if the ratio is, say, 0.7, you end up with a fractional frame count, which doesn't work with integer-indexed arrays. Therefore there are missing samples at the end of each block unless framesPerBuffer is evenly divisible by the SRC ratio. So what I did was read 2 extra frames whenever framesPerBuffer is not evenly divisible by the ratio, and it fixed 99% of the glitches.
/* This if Statement Ensures Smooth VariSpeed Output */
if (fmod((double)framesPerBuffer, data->src_data.src_ratio) == 0)
{
    numInFrames = framesPerBuffer;
}
else
{
    numInFrames = (framesPerBuffer / data->src_data.src_ratio) + 2;
}

/* Read FramesPerBuffer Amount of Data from inFile into buffer[] */
numberOfFrames = sf_readf_float(data->inFile, data->src_inBuffer, numInFrames);

iOS audio queue - how to meter audio level in buffer?

I'm working on an app that should do some audio signal processing. I need to measure the audio level in each one of the buffers I get (through the callback function). I've been searching the web for some time, and I found that there is a built-in property called current level metering:
AudioQueueGetProperty(recordState->queue,kAudioQueueProperty_CurrentLevelMeter,meters,&dlen);
This property gets me the average or peak audio level, but it's not synchronised to the current buffer.
I figured out I need to calculate the audio level from the buffer data by myself, so I had this:
double calcAudioRMS (SInt16 * audioData, int numOfSamples)
{
    double RMS, adPercent;
    RMS = 0;
    for (int i = 0; i < numOfSamples; i++)
    {
        adPercent = audioData[i] / 32768.0f;
        RMS += adPercent * adPercent;
    }
    RMS = sqrt(RMS / numOfSamples);
    return RMS;
}
This function gets the audio data (cast to SInt16) and the number of samples in the current buffer. The numbers I get are indeed between 0 and 1, but they seem rather random and low compared to the numbers I got from the built-in audio level metering.
The recording audio format is:
format->mSampleRate = 8000.0;
format->mFormatID = kAudioFormatLinearPCM;
format->mFramesPerPacket = 1;
format->mChannelsPerFrame = 1;
format->mBytesPerFrame = 2;
format->mBytesPerPacket = 2;
format->mBitsPerChannel = 16;
format->mReserved = 0;
format->mFormatFlags = kLinearPCMFormatFlagIsSignedInteger |kLinearPCMFormatFlagIsPacked;
My question is: how do I get the right values from the buffer? Is there a built-in function/property for this, or should I calculate the audio level myself, and if so, how?
Thanks in advance.
Your calculation for RMS power is correct. I'd be inclined to say that you're using fewer samples than Apple does, or something similar, and that would explain the difference. You can check by inputting a loud sine wave and verifying that both Apple's metering and your code report an RMS power of 1/sqrt(2).
Unless there's a good reason not to, I would use Apple's power calculations. I've used them, and they seem good to me. Additionally, you generally don't want raw RMS power; you want RMS power in decibels, or you can use the kAudioQueueProperty_CurrentLevelMeterDB constant. (This depends on whether you're trying to build an audio meter or truly display the audio power.)
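For the decibel route, here is a minimal Swift sketch of the conversion, assuming the same 0–1 linear RMS values computed above; the function name rmsToDecibels and the -160 dB floor for silence are my own choices, not part of any API:

import Foundation

/// Converts a linear RMS value (0...1, as computed by calcAudioRMS) to dBFS.
/// The -160 dB floor is an arbitrary choice for silence/zero input.
func rmsToDecibels(_ rms: Double, floor floorDB: Double = -160.0) -> Double {
    guard rms > 0 else { return floorDB }
    return max(floorDB, 20.0 * log10(rms))
}

// Example: a full-scale sine wave has RMS 1/sqrt(2), i.e. about -3 dBFS.
// rmsToDecibels(1.0 / 2.0.squareRoot())  // ≈ -3.01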

Playing multiple files with a single file player audio unit

I'm trying to use a file player audio unit (kAudioUnitSubType_AudioFilePlayer) to play multiple files (not at the same time, of course). That's on iOS.
So I've successfully opened the files and stored their details in an array of AudioFileIDs that I set on the audio unit using kAudioUnitProperty_ScheduledFileIDs. Now I would like to define two ScheduledAudioFileRegions, one per file, and use them with the file player...
But I can't seem to find out:
How do I set the kAudioUnitProperty_ScheduledFileRegion property to store these two regions (in particular, how do I define the index of each region)?
How do I trigger the playback of a specific region? My guess is that the kAudioTimeStampSampleTimeValid flag should enable this, but how do I define which region I want to play?
Maybe I'm just plain wrong about the way I should use this audio unit, but documentation is very hard to come by and I haven't found any example showing the playback of two regions on the same player!
Thanks in advance.
You need to schedule a region every time you want to play a file. In the ScheduledAudioFileRegion you must set the AudioFileID to play. Playback begins when the unit's current time (in samples) is equal to or greater than the sample time of the scheduled region.
Example:
// get current unit time
AudioTimeStamp timeStamp;
UInt32 propSize = sizeof(AudioTimeStamp);
AudioUnitGetProperty(m_playerUnit, kAudioUnitProperty_CurrentPlayTime, kAudioUnitScope_Global, 0, &timeStamp, &propSize);
// when to start playback
timeStamp.mSampleTime += 100;
// schedule region
ScheduledAudioFileRegion region;
memset(&region, 0, sizeof(ScheduledAudioFileRegion));
region.mAudioFile = ...; // your AudioFileID
region.mFramesToPlay = ...; // count of frames to play
region.mLoopCount = 1;
region.mStartFrame = 0;
region.mTimeStamp = timeStamp;
AudioUnitSetProperty(m_playerUnit, kAudioUnitProperty_ScheduledFileRegion, kAudioUnitScope_Global, 0, &region, sizeof(region));
