I am working on an iOS MPEG-DASH player and I have an issue with the seeking feature.
The ExtAudioFileSeek(_:_:) documentation says:
"Sets the file's read position to the specified sample frame number. A subsequent call to the ExtAudioFileRead(_:_:_:) function returns samples from precisely this location, even if it is located in the middle of a packet."
Unfortunately, the AudioToolbox.AudioFile_ReadProc loop does not seek straight to the right frame; it walks through all file segments, requesting chunks of 16 and 27682 bytes.
This takes a lot of time (especially for long tracks) and forces the player to download all intermediate segments (which should not be required).
It also causes the app to crash when long tracks with high sound quality are played.
Here is my log trace. I converted the frame indices to millions ("M") for readability.
[PlayerEngine] - Trying to seek at 16.870537 %
[AudioSource] - Pausing AudioOutputUnit
[AudioSource] - AudioOutputUnit successfully stopped
[AudioOutputUnit] - Trying to seek at frame 31.006805M
[AudioOutputUnit] - Successfully sought at frame 31.006805M
[AudioInputUnit] - Audio frames have been flushed
[AudioInputUnit] - Seek to frame 31.006806M pending
[AudioConverter] - Converted frame buffer has been flush
[AudioSource] - Resuming AudioOutputUnit
[AudioSource] - AudioOutputUnit successfully resumed
[PlayerEngine] - Successfully sought at 16.870537 %
[AudioInputUnit] - Seeking to frame 31.006805M
[CoreAudioDecoder] - Trying to seek at frame 31.006805M
[CoreAudioDecoder] - Seek on frame 31.006806M done successfully
[CoreAudioDecoder] - AudioToolbox.AudioFile_ReadProc : inClientData, inPosition:1.315635M, requestCount:16
(...)
[CoreAudioDecoder] - AudioToolbox.AudioFile_ReadProc : inClientData, inPosition:1.331687M, requestCount:27682
Is this a bug in AudioToolbox, or is there a way to fix it?
Thanks a lot!
Cool!
Since you mention read callbacks, I assume you're using not only the ExtAudioFile API but also the AudioFile API, something like ExtAudioFileWrapAudioFileID(AudioFileOpenWithCallbacks(...)).
Compressed audio formats don't always have a simple mapping between frames and file offsets, so the naive behaviour you're seeing is probably due to one of these APIs (AudioFile?) understandably not knowing that mapping.
Try either setting the kExtAudioFileProperty_PacketTable property on the ExtAudioFile or kAudioFilePropertyPacketTableInfo on the wrapped audio file. The former probably makes more sense. I don't know if the whole packet table info will be available to you from the beginning or if it will be revealed to you over time, nor how the APIs will react to you setting these properties multiple times.
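If it helps, here is a rough sketch of what setting the packet table might look like at the C API level (the total frame count is an assumption you would have to supply from your DASH manifest or init segment, and I haven't verified that this alone changes the seeking behaviour):

#include <AudioToolbox/AudioToolbox.h>

// Rough sketch: tell the ExtAudioFile how many valid frames the stream contains,
// so it doesn't have to scan every packet to resolve a seek.
// 'totalValidFrames' is assumed to come from your DASH manifest / init segment.
static OSStatus setPacketTableInfo(ExtAudioFileRef extFile, SInt64 totalValidFrames)
{
    AudioFilePacketTableInfo info = {};
    info.mNumberValidFrames = totalValidFrames;
    info.mPrimingFrames     = 0;   // adjust if your encoder adds priming frames
    info.mRemainderFrames   = 0;

    return ExtAudioFileSetProperty(extFile,
                                   kExtAudioFileProperty_PacketTable,
                                   (UInt32)sizeof(info),
                                   &info);
}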
Good luck!
Related
I have verified that the video stream displays correctly in a QML video surface. Now I want to get the video frame data for some processing of my own, but so far it doesn't seem to be working. I made a simple pipeline like the one below to focus the test.
nvarguscamerasrc - appsink
I used QGst::Utils::ApplicationSink to get the frame data, referencing the "appsink-src" example.
/* making pipeline */
QGst::ElementPtr source, sink;
SubClassApplicationSink *appsink;
source = QGst::ElementFactory::make("nvarguscamerasrc");
sink = QGst::ElementFactory::make("appsink");
appsink = new SubClassApplicationSink();
// configure elements
source->setProperty("sensor-id", n);
appsink->setElement(sink);
appsink->enableDrop(true);
appsink->setMaxBuffers(7654321);
m_pipeline->add(source, sink);
source->link(sink);
My subclass of ApplicationSink implements the eos, preroll, and sample callbacks, and I log some values from the buffer I get with each new sample.
The same output is repeated every time the callback is called:
result: [start/end offsets are -1, no flags, memory count 1, memory size 1008]
I don't know why. What do you think?
I solved the issue: the problem was the pipeline's composition. After putting an "nvvidconv" element between "nvarguscamerasrc" and "appsink", I could get video frames successfully.
I don't know exactly why the nvvidconv element is needed, but it seems to be because of the source's output caps, "video/x-raw(memory:NVMM)", which means the frames live in DMA buffers for performance reasons (see the link and the sketch below).
https://forums.developer.nvidia.com/t/what-is-the-meaning-of-memory-nvmm/180522
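For reference, a minimal sketch of the adjusted pipeline construction, reusing the variables from the question (untested, QtGStreamer as before):

// Same setup as before, but with an nvvidconv element converting the NVMM (DMA)
// buffers from nvarguscamerasrc into plain system-memory video/x-raw for the appsink.
QGst::ElementPtr source, conv, sink;
source = QGst::ElementFactory::make("nvarguscamerasrc");
conv   = QGst::ElementFactory::make("nvvidconv");
sink   = QGst::ElementFactory::make("appsink");
appsink = new SubClassApplicationSink();

source->setProperty("sensor-id", n);
appsink->setElement(sink);

m_pipeline->add(source);
m_pipeline->add(conv);
m_pipeline->add(sink);
source->link(conv);
conv->link(sink);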
IXAudio2SourceVoice has a GetState function which returns an XAUDIO2_VOICE_STATE structure. This structure has a SamplesPlayed member, which is:
Total number of samples processed by this voice since it last started, or since the last audio stream ended (as marked with the XAUDIO2_END_OF_STREAM flag).
What I want to be able to do is stop the source voice, flush all its buffers, and then reset the SamplesPlayed counter to zero. Neither calling Stop nor FlushSourceBuffers will by itself reset SamplesPlayed. And while flagging the last buffer with XAUDIO2_END_OF_STREAM does correctly reset SamplesPlayed back to zero, this seemingly only works if that last buffer is played to completion; if the buffer is flushed, SamplesPlayed does not get reset. I have also tried calling Discontinuity both before and after stopping/flushing, with no effect.
My current workaround is, after stopping and flushing the source voice, to submit a tiny 1-sample silent buffer with the XAUDIO2_END_OF_STREAM flag set and then let the source voice play to process that buffer and thus reset SamplesPlayed to zero. This works fine-ish for my use case, but it seems pretty hacky/clumsy. Is there a better solution?
Looking at the XAudio2 source, there's no exposed way to do that in the API other than letting a packet play with XAUDIO2_END_OF_STREAM.
Calling Discontinuity sets the end-of-stream flag on the currently playing buffer, or, if none is playing, on the next queued buffer. You need to call Discontinuity and then let the voice play to completion before you recycle it.
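So a cleaned-up version of your own workaround is probably the way to go. A rough sketch (untested; the single silent 16-bit mono sample is a placeholder you would match to the voice's actual source format, and the busy-wait is only for illustration):

#include <xaudio2.h>

// Reset SamplesPlayed by stopping, flushing, and then letting one tiny
// end-of-stream buffer play out.
void ResetSamplesPlayed(IXAudio2SourceVoice* voice)
{
    voice->Stop();
    voice->FlushSourceBuffers();

    // One silent sample; match this to the voice's source format in real code.
    static const BYTE silence[2] = { 0, 0 };

    XAUDIO2_BUFFER buf = {};
    buf.Flags      = XAUDIO2_END_OF_STREAM;   // this is what resets SamplesPlayed
    buf.AudioBytes = sizeof(silence);
    buf.pAudioData = silence;
    voice->SubmitSourceBuffer(&buf);

    voice->Start();

    // Wait for the end-of-stream buffer to be consumed (illustrative busy-wait;
    // in real code, use the OnStreamEnd voice callback instead).
    XAUDIO2_VOICE_STATE state = {};
    do {
        voice->GetState(&state);
    } while (state.BuffersQueued > 0);

    voice->Stop();
}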
I am developing a MIDI player by referring to the following web page:
http://twocentstudios.com/2017/02/20/bouncing-midi-to-audio-on-ios/
I don't do any recording, I just want to play the SMF file.
However, when I call setPreload(true), it logs "ASSERTION FAILED: Preroll mode set during render" and my app hangs.
I searched for "Preroll mode set during render" but couldn't find any useful information.
Can someone please help?
EDIT:
Hi @dspr,
The percussion sounds even if I don't call AudioUnitSetProperty(kAUMIDISynthProperty_EnablePreload: 1).
I think this is because the bank for percussion is automatically assigned to channel 10.
However, in this state, the piano, guitar, and other instruments do not sound.
AVAudioUnitMIDIInstrument needs kAUMIDISynthProperty_EnablePreload to analyze which tone is assigned to which track in the SMF file, right?
Which method does AVAudioUnitMIDIInstrument use to preload SMF files?
(1) AudioUnitSetProperty (kAUMIDISynthProperty_EnablePreload: 1) to AVAudioUnitMIDISynth
(2) << How to preload? >>
(3) AudioUnitSetProperty (kAUMIDISynthProperty_EnablePreload: 0) to AVAudioUnitMIDISynth
(4) Start AVAudioSequencer
The MIDI player uses the kAUMIDISynthProperty_EnablePreload property of MIDISynth for that purpose. See Apple's comment about it below, and note the sentence at the end: "It should only be used prior to MIDI playback, and must be set back to 0 before attempting to start playback."
/*!
@constant kAUMIDISynthProperty_EnablePreload
@discussion Scope: Global
Value Type: UInt32
Access: Write
Setting this property to 1 puts the MIDISynth in a mode where it will attempt to load
instruments from the bank or file when it receives a program change message. This
is used internally by the MusicSequence. It should only be used prior to MIDI playback,
and must be set back to 0 before attempting to start playback.
*/
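For reference, a rough sketch of what that preload pass can look like at the C API level, assuming you can get the underlying AudioUnit from your AVAudioUnitMIDISynth (for example via its audioUnit property) and that you already know which channels and programs the SMF uses (untested):

#include <AudioToolbox/AudioToolbox.h>

// Preload the instruments a sequence needs, then turn preload off again
// before starting playback, as the header comment above requires.
// 'channels' and 'programs' are assumed to be extracted from the SMF beforehand.
static void preloadPrograms(AudioUnit synthUnit,
                            const UInt8 *channels, const UInt8 *programs, int count)
{
    UInt32 enable = 1;
    AudioUnitSetProperty(synthUnit, kAUMIDISynthProperty_EnablePreload,
                         kAudioUnitScope_Global, 0, &enable, (UInt32)sizeof(enable));

    // In preload mode, each program change makes the synth load that instrument.
    for (int i = 0; i < count; ++i) {
        MusicDeviceMIDIEvent(synthUnit, 0xC0 | channels[i], programs[i], 0, 0);
    }

    UInt32 disable = 0;
    AudioUnitSetProperty(synthUnit, kAUMIDISynthProperty_EnablePreload,
                         kAudioUnitScope_Global, 0, &disable, (UInt32)sizeof(disable));

    // Only now start the AVAudioSequencer.
}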
EDIT: Frankly, I have some reservations about your link.
One strategy I haven’t tried would be to pitch shift the MIDI up one octave, play it back at 2x, record it at 88.2kHz, then downsample to 44.1kHz. AVAudioSession presumably can’t go past 48kHz though.
Clearly, the person who wrote that has a poor grasp of audio and sampling. Playing a MIDI song transposed one octave up at double tempo is really not equivalent to playing the same song recorded as audio at double speed, whether you make the recording at 88.2kHz or any other sample rate. As a simple example, what happens if the file contains a drum set? A snare drum (40) would become a Chinese cymbal (52) played twice as slow.
As far as I understand the post, the sole purpose of the described hack is to make a recording. So if you simply want to play your MIDI file back, you can certainly find a simpler and better example.
What I'm trying to do:
Record up to a specified duration of audio/video, where the resulting output file will have pre-defined background music from an external audio file added - without further encoding/exporting after recording.
As if you were recording video using the iPhone's Camera app, and all the recorded videos in the Camera Roll had background songs. No exporting or loading after the recording ends, and not in a separate audio track.
How I'm trying to achieve this:
By using AVCaptureSession, in the delegate-method where the (CMSampleBufferRef)sample buffers are passed through, I'm pushing them to an AVAssetWriter to write to file. As I don't want multiple audio tracks in my output file, I can't pass the background-music through a separate AVAssetWriterInput, which means I have to add the background-music to each sample buffer from the recording while it's recording to avoid having to merge/export after recording.
The background-music is a specific, pre-defined audio file (format/codec: m4a aac), and will need no time-editing, just adding beneath the entire recording, from start to end. The recording will never be longer than the background-music-file.
Before starting the writing to file, I've also made ready an AVAssetReader, reading the specified audio-file.
Some pseudo-code(threading excluded):
-(void)startRecording
{
/*
Initialize writer and reader here: [...]
*/
backgroundAudioTrackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:backgroundAudioTrack outputSettings:nil];
if([backgroundAudioReader canAddOutput:backgroundAudioTrackOutput])
[backgroundAudioReader addOutput:backgroundAudioTrackOutput];
else
NSLog(#"This doesn't happen");
[backgroundAudioReader startReading];
/* Some more code */
recording = YES;
}
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
fromConnection:(AVCaptureConnection *)connection
{
if(!recording)
return;
if (connection == videoConnection)
[self writeVideoSampleBuffer:sampleBuffer];
else if (connection == audioConnection)
[self writeAudioSampleBuffer:sampleBuffer];
}
The AVCaptureSession is already streaming the camera video and microphone audio, and is just waiting for the BOOL recording to be set to YES. This isn't exactly how I'm doing it, but a short, roughly equivalent representation. When the delegate method receives a CMSampleBufferRef of type Audio, I call my own method writeAudioSamplebuffer:sampleBuffer. If this were done normally, without a background track as I'm trying to do, I'd simply put something like this: [assetWriterAudioInput appendSampleBuffer:sampleBuffer]; instead of calling my method. In my case, though, I need to overlap two buffers before writing:
-(void)writeAudioSamplebuffer:(CMSampleBufferRef)recordedSampleBuffer
{
CMSampleBufferRef backgroundSampleBuffer =
[backgroundAudioTrackOutput copyNextSampleBuffer];
/* DO MAGIC HERE */
CMSampleBufferRef resultSampleBuffer =
[self overlapBuffer:recordedSampleBuffer
withBackgroundBuffer:backgroundSampleBuffer];
/* END MAGIC HERE */
[assetWriterAudioInput appendSampleBuffer:resultSampleBuffer];
}
The problem:
I have to add incremental sample buffers from a local file to the live buffers coming in. The method I have created named overlapBuffer:withBackgroundBuffer: isn't doing much right now. I know how to extract AudioBufferList, AudioBuffer and mData etc. from a CMSampleBufferRef, but I'm not sure how to actually add them together - however - I haven't been able to test different ways to do that, because the real problem happens before that. Before the Magic should happen, I am in possession of two CMSampleBufferRefs, one received from microphone, one read from file, and this is the problem:
The sample buffer received from the background-music file is different from the one I receive from the recording session. It seems the call to [self.backgroundAudioTrackOutput copyNextSampleBuffer]; returns a large number of samples. I realize this might be obvious to some people, but I've never worked at this level of media technology before. I see now that it was wishful thinking to call copyNextSampleBuffer each time I receive a sampleBuffer from the session, but I don't know when/where to put it.
As far as I can tell, the recording-session gives one audio-sample in each sample-buffer, while the file-reader gives multiple samples in each sample-buffer. Can I somehow create a counter to count each received recorded sample/buffers, and then use the first file-sampleBuffer to extract each sample, until the current file-sampleBuffer has no more samples 'to give', and then call [..copyNext..], and do the same to that buffer?
As I'm in full control of both the recording and the file's codecs, formats etc, I am hoping that such a solution wouldn't ruin the 'alignment'/synchronization of the audio. Given that both samples have the same sampleRate, could this still be a problem?
Note
I'm not even sure if this is possible, but I see no immediate reason why it shouldn't.
Also worth mentioning that when I try to use a Video-file instead of an Audio-file, and try to continually pull video-sampleBuffers, they align up perfectly.
I am not familiar with AVCaptureOutput, since all my sound/music sessions were built using AudioToolbox instead of AVFoundation. However, I guess you should be able to set the size of the recording capture buffer. If not, and you are still getting just one sample at a time, I would recommend storing each piece of data obtained from the capture output in an auxiliary buffer. When the auxiliary buffer reaches the same size as the file-reading buffer, then call [self overlapBuffer:auxiliarySampleBuffer withBackgroundBuffer:backgroundSampleBuffer];
I hope this helps. If not, I can provide an example of how to do this using Core Audio. Using Core Audio I have been able to obtain 1024-sample LPCM buffers from both microphone capture and file reading, so the overlapping is immediate.
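For what it's worth, once the two buffers have the same length and format, the overlap step is just a per-sample sum with clipping. A minimal sketch, assuming interleaved 16-bit LPCM on both sides (getting the raw samples out of the CMSampleBufferRefs is left out):

#include <algorithm>
#include <cstddef>
#include <cstdint>

// Mix two equally long blocks of interleaved 16-bit LPCM into 'out'
// (which may alias 'recorded'), clamping to avoid wrap-around distortion.
static void mixInt16(const int16_t *recorded, const int16_t *background,
                     int16_t *out, size_t sampleCount)
{
    for (size_t i = 0; i < sampleCount; ++i) {
        int32_t sum = int32_t(recorded[i]) + int32_t(background[i]);
        out[i] = int16_t(std::min<int32_t>(32767, std::max<int32_t>(-32768, sum)));
    }
}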
I've been trying to find a threading approach on iOS that suits my project's needs. So far I've failed to find an acceptable solution.
My Problem :
I need to read audio from up to 16 mp3 files on disk simultaneously.
What I have tried:
First off, I tried using an NSTimer which repeats. The timer was not fast enough and the audio would drop out when I played any more than 4 files.
Second, I tried using an NSThread with a priority of 1. The audio just about played correctly, but the UI became wholly unresponsive.
Finally I tried dispatching blocks using GCD in my callback whenever I needed more audio from a file. Again the audio would drop out but the UI was responsive.
In all three of the examples above I also tried dividing up the work load by creating 4 threads and having each thread handle 4 audio files each but this caused really bad synchronization problems with the audio.
Are there other threading options that I can try, or do the above sum up what iOS has to offer?
Do you think that reading from 16 files on disk simultaneously is too much of a strain for iOS?
Is there a limit to how many threads iOS can handle?
To avoid making my question sound like a discussion, I will summarize it as follows:
Which iOS threading technology is best suited to very frequent, quickly completing calls that can be easily synchronized and will not impact UI responsiveness?
Any anecdotal advice from solving a similar audio programming problem is also appreciated.
EDIT 1
This is some stripped-down code I modelled on a suggestion from an SO user. All I'm after is solid advice on what setup is going to work best for me. Since my last post I have tried NSThread, and it does seem to leave me with audio dropouts. I also tried using NSCondition so that my thread isn't wasting processing power when it's not filling buffers, but using those locks seems like a really bad idea in audio callbacks.
OSStatus channelMixerCallback(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList *ioData) {
// (stripped down) 'audioInfoForBus' stands in for however the AudioInfo objects are stored
AudioInfo *myaudio = audioInfoForBus[inBusNumber];
if (myaudio.needsbufferfill == YES)
{
[refToSelf performSelector:@selector(GetAudioForItem:) onThread:engineDescribtion.producerthread withObject:myaudio waitUntilDone:NO];
}
return noErr;
}
-(void) startthread
{
engineDescribtion.producerthread = [[NSThread alloc] initWithTarget:self selector:@selector(dosinglerunloop) object:nil];
[engineDescribtion.producerthread start];
}
-(void)dosinglerunloop
{
BOOL isstarted=YES;
NSAutoreleasePool *pool=[[NSAutoreleasePool alloc]init];
do {
[[NSRunLoop currentRunLoop]addPort:[NSMachPort port] forMode:NSDefaultRunLoopMode];
[[NSRunLoop currentRunLoop]runMode:NSDefaultRunLoopMode beforeDate:[NSDate distantFuture]];
} while (isstarted);
[pool release];
}
- (void)GetAudioForItem:(AudioInfo *)info
{
// use the data in AudioInfo to seek to the
// correct place in the file
// and extract audio to buffers
}
Problem 0:
Your audio render callbacks should never lock. For example, even a single heap allocation can take a lock.
Your threads will all compete for the hardware. To keep the UI responsive, you should not have many highest-priority threads (audio playback should be the only one). Consider the number of cores, disks, etc. you have available in your design.
If you still have issues once you have correctly fixed that: Loading short files into memory can offload some of the disk's demand to memory.
You should profile to determine what is actually the problem: It may be CPU or I/O. You may be simply missing your render deadlines and equating audio dropouts to "can't read fast enough". If you are using a lot of CPU, then Disk I/O may not be the problem. Decoding and performing sample rate conversion on 16 mp3 files can require relatively high CPU (as one example of the things you need to look for).
pthreads will be fastest, but will require some work to implement correctly. That really doesn't matter at this point, because there seem to be a few higher-level issues to resolve first, and there are multiple APIs which should handle the task just fine.
Your program should be smart enough to detect when read buffers cannot be filled fast enough.
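In other words, the render callback should only ever consume buffers that are already filled; when one isn't ready, it should output silence and record the underrun, without ever taking a lock. A minimal sketch of that idea (the names and the fixed buffer size are hypothetical, not from your code):

#include <atomic>
#include <cstring>

// Hypothetical per-file state shared between the render callback (consumer)
// and the disk-reading thread (producer).
struct ChannelBuffer {
    std::atomic<bool> ready { false };   // producer sets this when the buffer is filled
    std::atomic<int>  underruns { 0 };   // render thread counts missed deadlines
    float             samples[4096];     // assumes frames per render <= 4096
};

// Called from the render callback: never blocks, never allocates.
static void renderChannel(ChannelBuffer &ch, float *out, size_t frames)
{
    if (ch.ready.load(std::memory_order_acquire)) {
        std::memcpy(out, ch.samples, frames * sizeof(float));
        ch.ready.store(false, std::memory_order_release);  // hand back to the producer
    } else {
        std::memset(out, 0, frames * sizeof(float));       // output silence
        ch.underruns.fetch_add(1, std::memory_order_relaxed);
    }
}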
You are pre filling the buffers, correct?
Presumably, you are using a run loop?
Well, there's only one disk… So any solution that requires 16 simultaneous reads might be an issue. (Depending on whether you're I/O bound or CPU bound.)
NSTimer is not going to get you consistent results.
I don't see any reason why NSThread would kill UI responsiveness, perhaps you had a bug.
I'm going with this system being disk-bound, because 16 channels of MP3 is no problem CPU-wise on modern machines - how much rattling is coming from your box? I would probably be tempted to use just one thread to fill the empty buffers, with each buffer sized to accommodate (averageDiskLatency*(bytes/msec)*16*bodgeFactor) bytes of audio stream (bodgeFactor meaning rounded up to an 8K boundary, plus a few extra 8Ks). Whenever threads/callbacks/whatever empty a buffer and so start on the other one, they should queue the empty buffer to the disk-read thread (a thread-safe producer-consumer queue) to get it filled up again. Probably, each buffer should include a 'fileControl' instance containing the fileSpec, file handle, state variable for EOF etc., error string space, and anything else needed for the read thread to work, as well as the buffer space itself.
This design allows the disk to read nice, large chunks without being annoyingly preempted half-way through reads and being avoidably forced to move lumps of metal too often.
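A minimal sketch of such a producer-consumer queue (generic C++, not tied to any particular audio API): the playback side pushes emptied buffers, the single disk-read thread pops them, refills them from the file using the attached fileControl data, and hands them back through a second queue of the same type.

#include <condition_variable>
#include <mutex>
#include <queue>

// One slot of audio data plus the per-file bookkeeping ('fileControl') that
// the read thread needs in order to refill it.
struct AudioChunk {
    // file handle, read offset, EOF flag, error text, etc. would live here
    char data[64 * 1024];
    size_t validBytes = 0;
};

class ChunkQueue {
public:
    void push(AudioChunk* chunk) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(chunk);
        }
        cond_.notify_one();
    }
    // Blocks until a chunk is available; only the disk-read thread should
    // ever block here, never the audio render callback.
    AudioChunk* pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !queue_.empty(); });
        AudioChunk* chunk = queue_.front();
        queue_.pop();
        return chunk;
    }
private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::queue<AudioChunk*> queue_;
};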
Rgds,
Martin
PS - If you haven't got one already, get an SSD - works wonders for multi-channel audio/video latency.