Read an AVAudioFile into a buffer starting at a certain time - ios

Let's say I have an AVAudioFile with a duration of 10 seconds. I want to load that file into an AVAudioPCMBuffer but I only want to load the audio frames that come after a certain number of seconds/milliseconds or after a certain AVAudioFramePosition.
It doesn't look like AVAudioFile's readIntoBuffer methods give me that kind of precision, so I'm assuming I'll have to work at the AVAudioBuffer level or lower?

You just need to set the AVAudioFile's framePosition property before reading.
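For example, a minimal Swift sketch of that approach (the `fileURL` and the 3-second offset are assumed for illustration; error handling is abbreviated):

```swift
import AVFoundation

// Minimal sketch: read everything from the 3-second mark onward.
// `fileURL` is assumed to point at the 10-second audio file.
let file = try AVAudioFile(forReading: fileURL)
let sampleRate = file.processingFormat.sampleRate

// Seek to 3.0 seconds by setting framePosition before reading.
let startFrame = AVAudioFramePosition(3.0 * sampleRate)
file.framePosition = startFrame

// Capacity for the remaining frames (file.length is in sample frames).
let framesToRead = AVAudioFrameCount(file.length - startFrame)
guard let buffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat,
                                    frameCapacity: framesToRead) else {
    fatalError("Could not allocate buffer")
}

try file.read(into: buffer, frameCount: framesToRead)
// `buffer` now holds only the audio from 3.0 s to the end of the file.
```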

Related

How to modify the size of AudioUnit Buffer?

I'm developing a recording app, and I have a requirement that the size of the input buffer should be 882 bytes. I know that I can modify the mDataByteSize of the buffer list (as shown in the attached picture).
But it can only be set to a power of 2. When I tried to set it to 882, I got the warning "AudioUnitRender error: -50".
I hope somebody can help me, because I'm stuck.
You can't demand a specific input buffer size in an Audio Unit recording callback. In fact, the Audio Unit API is allowed to change the number of samples per audio buffer at run time, so your app has to handle a different number of frames than requested on each callback.
Instead, your app should save the samples into a temporary FIFO buffer, and later remove samples in your desired block size once that temporary FIFO buffer becomes full enough. Typically a circular buffer is used to store the samples until it has filled to the size you need or larger. Then you can pull out exactly 882 bytes, or however many samples you need.
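A minimal Swift sketch of that FIFO idea (the class name and the 441-sample / 882-byte block size are illustrative; a real-time-safe version used from the render callback should be lock-free and pre-allocated):

```swift
import Foundation

// Illustrative FIFO that accumulates whatever the Audio Unit delivers and
// hands back fixed 441-sample (882-byte Int16) blocks once enough is queued.
// NOTE: a production version called from a render callback should be lock-free.
final class SampleFIFO {
    private var storage = [Int16]()
    private let lock = NSLock()

    // Called from the recording callback with however many samples arrived.
    func append(_ samples: [Int16]) {
        lock.lock(); defer { lock.unlock() }
        storage.append(contentsOf: samples)
    }

    // Returns exactly `count` samples once available, otherwise nil.
    func dequeueBlock(count: Int = 441) -> [Int16]? {
        lock.lock(); defer { lock.unlock() }
        guard storage.count >= count else { return nil }
        let block = Array(storage.prefix(count))
        storage.removeFirst(count)
        return block
    }
}
```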

Is it possible to split the recorded wav file into multiple wav files on iOS, given the duration of the splits?

I want to extract a few clips from the recorded wav file. I am not finding much help online regarding this issue. I understand we can't split from compressed formats like mp3, but how do we do it with caf/wav files?
One approach you may consider would be to calculate and read the bytes from the audio file and write them to a new file. Because you are dealing with LPCM formats, the calculations are relatively simple.
If, for example, you have a file of 16-bit mono LPCM audio sampled at 44.1kHz that is one minute in duration, then you have a total of (60 secs x 44100Hz) 2,646,000 samples. Times 2 bytes per sample gives a total of 5,292,000 bytes. And if you want the audio from 10sec to 30sec, then you need to read the bytes from 882,000 to 2,646,000 and write them to a separate file.
There is a bit of code involved, but it can be done using Audio File Services from the AudioToolbox framework.
Functions you'll need to use are AudioFileOpenURL, AudioFileCreateWithURL, AudioFileReadBytes, AudioFileWriteBytes, and AudioFileClose.
An algorithm would be something like this:
You first set up an AudioFileID, which is an opaque type that gets passed into the AudioFileCreateWithURL function. Then open the file you wish to splice up using AudioFileOpenURL.
Calculate the start and end bytes of what you want to copy.
Next, in a loop preferably, read in the bytes and write them to the new file. AudioFileReadBytes and AudioFileWriteBytes allow you to do this. What's good is that you can read and write however many bytes you decide on each iteration of the loop.
When finished, close both the new file and the original using AudioFileClose.
Then repeat for each file (audio extraction) to be written.
As an additional note, you would split a compressed format by converting it to LPCM first.
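A hedged Swift sketch of those steps, assuming 16-bit mono LPCM at 44.1 kHz as in the example above; the function name and clip times are illustrative, and checking of the returned OSStatus values is omitted:

```swift
import Foundation
import AudioToolbox

// Sketch: copy the LPCM bytes for [startSec, endSec) of `sourceURL` into a new
// WAV file at `destURL`. Assumes 16-bit mono 44.1 kHz as in the example above.
func extractClip(from sourceURL: URL, to destURL: URL,
                 startSec: Double, endSec: Double) {
    var sourceFile: AudioFileID?
    AudioFileOpenURL(sourceURL as CFURL, .readPermission, 0, &sourceFile)

    // Ask the source file for its data format so the new file matches it.
    var format = AudioStreamBasicDescription()
    var size = UInt32(MemoryLayout<AudioStreamBasicDescription>.size)
    AudioFileGetProperty(sourceFile!, kAudioFilePropertyDataFormat, &size, &format)

    var destFile: AudioFileID?
    AudioFileCreateWithURL(destURL as CFURL, kAudioFileWAVEType, &format,
                           .eraseFile, &destFile)

    // For 16-bit mono, mBytesPerFrame is 2: byte offset = seconds * rate * 2.
    let bytesPerFrame = Int64(format.mBytesPerFrame)
    let startByte = Int64(startSec * format.mSampleRate) * bytesPerFrame
    let endByte   = Int64(endSec   * format.mSampleRate) * bytesPerFrame

    // Copy in 64 KB chunks rather than all at once.
    let chunkSize: UInt32 = 64 * 1024
    var buffer = [UInt8](repeating: 0, count: Int(chunkSize))
    var readPos = startByte
    var writePos: Int64 = 0

    while readPos < endByte {
        var ioBytes = UInt32(min(Int64(chunkSize), endByte - readPos))
        AudioFileReadBytes(sourceFile!, false, readPos, &ioBytes, &buffer)
        if ioBytes == 0 { break }   // nothing more to read
        AudioFileWriteBytes(destFile!, false, writePos, &ioBytes, buffer)
        readPos += Int64(ioBytes)
        writePos += Int64(ioBytes)
    }

    AudioFileClose(destFile!)
    AudioFileClose(sourceFile!)
}
```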

Millisecond (and greater) precision for audio file elapsed time in iOS

I am looking for a low-latency way of finding out, in real time, how many seconds of an audio file have elapsed, with guaranteed millisecond precision. According to the AVAudioPlayer class reference, a call to -currentTime will return "the offset of the current playback position, measured in seconds from the start of the sound", and since an NSTimeInterval is a double, this implies fractions of a second are possible.
As a testing scenario, I have an audio file playing and the user taps a button. Playback DOES NOT pause/stop, but at the moment the button was tapped I would like to obtain information about the elapsed time. In the real application, the button may be pressed many times in one second, hence the need for millisecond precision.
My files are stored as AIFF files and are around 1-10 minutes in length. Ideally I would like to find out exactly which sample frame is 'up-next' when playback resumes - however, this level of precision is a little excessive and millisecond precision is perfectly acceptable.
Is AVAudioPlayer's -currentTime method sufficient to achieve guaranteed millisecond precision for a currently-playing audio file? Or, would it be preferable to use a lower-level API such as iOS's Audio Units?
If you want sub-millisecond relative time resolution, convert to raw PCM and count buffers * length + samples using a low-latency RemoteIO Audio Unit configuration. Most iOS devices will support RemoteIO buffers as small as roughly 6 ms (256 samples), with a callback for each buffer.
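A minimal sketch of that counting approach in Swift, assuming you already have a render callback (RemoteIO or similar) invoked once per buffer; the SampleClock name is illustrative:

```swift
import AVFoundation

// Illustrative frame counter: elapsed time = frames rendered / sample rate.
// Call advance(by:) from the audio render callback for each buffer delivered.
final class SampleClock {
    private var framesRendered: Int64 = 0
    let sampleRate: Double

    init(sampleRate: Double) { self.sampleRate = sampleRate }

    // NOTE: a production version should make this thread-safe (the audio
    // callback writes while the UI thread reads).
    func advance(by frameCount: AVAudioFrameCount) {
        framesRendered += Int64(frameCount)
    }

    // Sub-millisecond resolution elapsed time.
    var elapsedSeconds: Double { Double(framesRendered) / sampleRate }
}

// Requesting a ~6 ms IO buffer (256 frames at 44.1 kHz) bounds the worst-case
// error of a reading taken between callbacks to a few milliseconds.
func configureLowLatencySession() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playback, mode: .default, options: [])
    try session.setPreferredIOBufferDuration(256.0 / 44_100.0)
    try session.setActive(true)
}
```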

ios endless video recording

I'm trying to develop an iPhone app that will use the camera to record only the last few minutes/seconds.
For example, you record some video for 5 minutes, click "save", and only the last 30 s is saved. I don't want to actually record five minutes and then chop off the last 30 s (this won't work for me). This idea is called "loop recording".
This results in an endless video recording, but you keep only the last part.
The Precorder app does what I want to do. (I want to use this feature in another context.)
I think this should be easy to simulate with a circular buffer.
I started a project with AVFoundation. It would be awesome if I could somehow redirect video data to a circular buffer (which I will implement). I found information only on how to write it to a file.
I know I can chop the video into intervals and save them, but saving one and restarting the camera to record the next takes time, and it is possible to lose some important moments in the movie.
Any clues how to redirect data from camera would be appreciated.
Important! As of iOS 8 you can use VTCompressionSession and have direct access to the NAL units instead of having to dig through the container.
Well, luckily you can do this, and I'll tell you how, but you're going to have to get your hands dirty with either the MP4 or MOV container. A helpful resource for this (though more MOV-specific) is Apple's QuickTime File Format introduction:
http://developer.apple.com/library/mac/#documentation/QuickTime/QTFF/QTFFPreface/qtffPreface.html#//apple_ref/doc/uid/TP40000939-CH202-TPXREF101
First things first: you're not going to be able to start your saved movie from an arbitrary point 30 seconds before the end of the recording; you'll have to use some I-frame at approximately 30 seconds. Depending on what your keyframe interval is, it may be several seconds before or after that 30-second mark. You could use all I-frames and start from an arbitrary point, but then you'll probably want to re-encode the video afterward because it will be quite large.
So knowing that, let's move on.
The first step is, when you set up your AVAssetWriter, to set its AVAssetWriterInput's expectsMediaDataInRealTime property to YES.
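For example, a minimal Swift sketch of that writer setup (the output URL and video settings are illustrative, and the canAdd()/startWriting() housekeeping is omitted):

```swift
import AVFoundation

// Sketch of the writer setup described above. `movieURL` is assumed; the
// output settings are illustrative.
func makeRealTimeWriter(movieURL: URL) throws -> (AVAssetWriter, AVAssetWriterInput) {
    let writer = try AVAssetWriter(outputURL: movieURL, fileType: .mov)

    let videoSettings: [String: Any] = [
        AVVideoCodecKey: AVVideoCodecType.h264,
        AVVideoWidthKey: 1280,
        AVVideoHeightKey: 720
    ]
    let input = AVAssetWriterInput(mediaType: .video, outputSettings: videoSettings)

    // The important part for this technique: feed the writer in real time.
    input.expectsMediaDataInRealTime = true
    writer.add(input)
    return (writer, input)
}
```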
In the captureOutput callback you'll be able to do an fread from the file you are writing to. The first fread will get you a little bit of MP4/MOV (whatever format you're using) header (i.e. 'ftyp' atom, 'wide' atom, and the beginning of the 'mdat' atom). You want what's inside the 'mdat' section. So the offset you'll start saving data from will be 36 or so.
Each read will get you 0 or more AVC NAL Units. You can find a listing of NAL unit types from ISO/IEC 14496-10 Table 7-1. They will be in a slightly different format than specified in Annex B, but it's fine. Additionally, there will only be IDR slices and non-IDR slices in the MP4/MOV file. IDR will be the I-Frame you're looking to hang onto.
The NAL unit format in the MP4/MOV container is as follows:
4 bytes - Size
[Size] bytes - NALU Data
data[0] & 0x1F - NALU Type
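A small Swift sketch of walking that length-prefixed layout in a chunk of 'mdat' data you have read back (it assumes the chunk starts on a NALU boundary; the function name is illustrative):

```swift
import Foundation

// Walk a chunk of 'mdat' bytes and report each NAL unit's type and payload.
// The chunk is assumed to start on a NALU boundary: a 4-byte big-endian size,
// then that many bytes of NALU data, as described above.
func enumerateNALUs(in mdatChunk: Data, handler: (UInt8, [UInt8]) -> Void) {
    let bytes = [UInt8](mdatChunk)   // copy to plain bytes for simple indexing
    var offset = 0
    while offset + 4 <= bytes.count {
        // 4-byte big-endian size prefix.
        let size = bytes[offset..<offset + 4].reduce(0) { ($0 << 8) | Int($1) }
        let start = offset + 4
        guard size > 0, start + size <= bytes.count else { break }

        let naluType = bytes[start] & 0x1F   // 5 = IDR slice, 1 = non-IDR slice
        handler(naluType, Array(bytes[start..<start + size]))

        offset = start + size
    }
}
```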
So now you have the data you're looking for. When you go to save this file, you'll have to update the MP4/MOV container with the correct length and sample count; you'll have to update the 'stsz' atom with the correct sizes for each sample, and update things like the media headers and track headers with the correct duration of the movie, and so on. What I would probably recommend doing is creating a sample container on first run that you can more or less just overwrite/augment with the appropriate data for that particular movie. You'll want to do this because the encoders on the various iDevices don't all have the same settings and the 'avcC' atom contains encoder information.
You don't really need to know much about the AVC stream in this case, so you'll probably want to concentrate your experimenting around updating the container format you choose correctly. Good luck.

Get PTS from raw H264 mdat generated by iOS AVAssetWriter

I'm trying to simultaneously read and write an H.264 MOV file written by AVAssetWriter. I managed to extract individual NAL units, pack them into ffmpeg's AVPackets, and write them into another video format using ffmpeg. It works, and the resulting file plays well, except that the playback speed is not right. How do I calculate the correct PTS/DTS values from raw H.264 data? Or maybe there exists some other way to get them?
Here's what I've tried:
Limit the capture min/max frame rate to 30 and assume that the output file will be 30 fps. In fact, its fps is always less than the values I set, and I don't think the fps is constant from packet to packet.
Remember each written sample's presentation timestamp, assume that samples map one-to-one to NALUs, and apply the saved timestamp to the output packet. This doesn't work.
Set the PTS to 0 or AV_NOPTS_VALUE. Doesn't work.
From googling about it I understand that raw H.264 data usually doesn't contain any timing info. It can sometimes have some timing info inside SEI, but the files that I use don't have it. On the other hand, there are some applications that do exactly what I'm trying to do, so I suppose it is possible somehow.
You will either have to generate them yourself, or access the atoms containing timing information in the MP4/MOV container to generate PTS/DTS information. FFmpeg's mov.c in libavformat might help.
Each sample/frame you write with AVAssetWriter will map one-to-one with the VCL NALs. If all you are doing is converting, then have FFmpeg do all the heavy lifting. It will properly maintain the timing information when going from one container format to another.
The bitstream generated by AVAssetWriter does not contain SEI data. It only contains SPS/PPS/I/P frames. The SPS also does not contain VUI or HRD parameters.
-- Edit --
Also, keep in mind that if you are saving PTS information from the CMSampleBufferRefs, the time base may be different from that of the target container. For instance, the AVFoundation time base is nanoseconds, while an FLV file uses milliseconds.
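For example, a hedged Swift sketch of rescaling a capture timestamp into a container's time base (the helper name and the 1000-ticks-per-second FLV-style timescale are illustrative):

```swift
import CoreMedia

// Rescale a CMSampleBuffer's presentation timestamp into a container time base
// (e.g. 1000 ticks/second for a millisecond clock such as FLV's).
func containerPTS(for sampleBuffer: CMSampleBuffer,
                  containerTimescale: CMTimeScale = 1000) -> Int64 {
    let pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    let rescaled = CMTimeConvertScale(pts,
                                      timescale: containerTimescale,
                                      method: .default)
    return rescaled.value
}
```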
