I'm reading an audio file with AVAssetReaderAudioMixOutput, so I get AudioBuffer objects that normally contain 2*8192 samples. Now I want to run some analysis on a window of exactly 2*44100 samples, shifted forward by 1024 samples at a time (a 1024-sample hop). Is there a simple way to collect an exact number of samples, or do I have to build all of that on my own?
And is there a collection like a ring buffer that works well with AudioBuffer?
The best way I found to do this is with TPCircularBuffer (https://github.com/michaeltyson/TPCircularBuffer). It has a category that can deal directly with AudioBuffer objects. So I put them into the buffer until there are 2*44100 bytes in it, and then I remove the last 2*8192 bytes. Works like a charm!
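For anyone looking for a concrete starting point, here is a rough C sketch of that accumulate-then-slide idea using TPCircularBuffer's plain byte-oriented API (the AudioBufferList category works along the same lines). The interleaved 16-bit stereo format and the AnalyzeWindow() function are assumptions for illustration, not part of TPCircularBuffer.

    #include "TPCircularBuffer.h"
    #include <AudioToolbox/AudioToolbox.h>

    #define kChannels      2
    #define kBytesPerFrame (kChannels * sizeof(SInt16))
    #define kWindowFrames  44100   // analysis window: 2*44100 samples across 2 channels
    #define kHopFrames     1024    // slide the window by 1024 frames per analysis

    void AnalyzeWindow(const SInt16 *interleaved, UInt32 frames);  // your analysis, elsewhere

    static TPCircularBuffer fifo;

    void SetupFIFO(void) {
        // Make the ring buffer comfortably larger than one analysis window.
        TPCircularBufferInit(&fifo, (int32_t)(4 * kWindowFrames * kBytesPerFrame));
    }

    // Call this for every AudioBuffer pulled from AVAssetReaderAudioMixOutput.
    void FeedSamples(const AudioBuffer *buf) {
        TPCircularBufferProduceBytes(&fifo, buf->mData, buf->mDataByteSize);

        int32_t availableBytes = 0;
        SInt16 *window = (SInt16 *)TPCircularBufferTail(&fifo, &availableBytes);

        // Whenever a full window has accumulated, analyze it, then discard only one
        // hop so the next window overlaps the previous one by 44100 - 1024 frames.
        while (availableBytes >= (int32_t)(kWindowFrames * kBytesPerFrame)) {
            AnalyzeWindow(window, kWindowFrames);
            TPCircularBufferConsume(&fifo, (int32_t)(kHopFrames * kBytesPerFrame));
            window = (SInt16 *)TPCircularBufferTail(&fifo, &availableBytes);
        }
    }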
I'm developing a recording app, but I need the input buffer size to be 882 bytes. I know that I can modify the mDataByteSize of the AudioBufferList (as shown in the picture I attached).
But it seems it can only be set to a power of 2. When I tried to set it to 882, I got "AudioUnitRender error: -50".
I hope somebody can help me, because I'm stuck.
You can't demand a specific input size in an Audio Unit recording callback's bufferList. In fact, the Audio Unit API is allowed to change the number of samples per audio buffer at run time, so your app has to handle a number of frames per callback that differs from what it requested.
Instead, your app should save the samples in a temporary FIFO buffer, and later remove samples in your desired block size once that FIFO has filled up enough. Typically a circular buffer is used to store the temporary amount until it grows to the size you need or larger. Then you can pull out exactly 882 samples, or whatever number you need.
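A minimal, hand-rolled version of that FIFO might look like the sketch below (assuming 16-bit samples; ProcessBlock() stands in for whatever consumes the 882-sample blocks, and overflow checking is omitted for brevity).

    #include <AudioToolbox/AudioToolbox.h>
    #include <string.h>

    #define kBlockSamples 882
    #define kFifoCapacity (64 * 1024)   // must comfortably exceed kBlockSamples

    void ProcessBlock(const SInt16 *block, UInt32 samples);  // your consumer, elsewhere

    static SInt16 gFifo[kFifoCapacity];
    static UInt32 gCount = 0;           // samples currently stored in the FIFO

    // Call this from the input callback with whatever frame count the Audio Unit gave you.
    void PushSamples(const SInt16 *samples, UInt32 n) {
        memcpy(gFifo + gCount, samples, n * sizeof(SInt16));  // (real code: check for overflow)
        gCount += n;

        // Emit fixed-size blocks as soon as enough samples have accumulated.
        while (gCount >= kBlockSamples) {
            ProcessBlock(gFifo, kBlockSamples);
            memmove(gFifo, gFifo + kBlockSamples, (gCount - kBlockSamples) * sizeof(SInt16));
            gCount -= kBlockSamples;
        }
    }

A real circular buffer (TPCircularBuffer, for example) avoids the memmove, but the bookkeeping is the same.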
How can I get 3 seconds of samples of audio that is being recorded? I have used the RemoteIO audio unit and it delivers 512 samples at a time, which is roughly 10 milliseconds. I need 3 seconds of samples in total. Can you give me an idea of how to do it?
Here is another post of mine with the details of my code: Concatenating Audio Buffers in ObjectiveC
My worst-case scenario would be recording the audio to a file and then reading its samples back. I don't want to go that route.
Should I use an AudioQueue? Any advice?
I really need help. Thanks.
Save the buffers (given to you by the Audio Unit callback) to a plain C array, and increment the index used for saving data by 512 after every 512 samples of input data.
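A sketch of what that might look like inside a RemoteIO input callback follows. The 16-bit mono 44.1 kHz format, the stack buffer size, and the way the callback was registered (with the RemoteIO unit as inRefCon) are assumptions.

    #include <AudioToolbox/AudioToolbox.h>
    #include <string.h>

    #define kSampleRate   44100
    #define kTotalSamples (3 * kSampleRate)   // 3 seconds of mono 16-bit audio

    static SInt16 gSamples[kTotalSamples];
    static UInt32 gIndex = 0;                 // grows by inNumberFrames (e.g. 512) per callback

    static OSStatus RecordingCallback(void *inRefCon,
                                      AudioUnitRenderActionFlags *ioActionFlags,
                                      const AudioTimeStamp *inTimeStamp,
                                      UInt32 inBusNumber,
                                      UInt32 inNumberFrames,
                                      AudioBufferList *ioData)
    {
        AudioUnit remoteIOUnit = (AudioUnit)inRefCon;  // passed in when the callback was registered

        // Pull the recorded frames out of the RemoteIO unit into a local buffer.
        // (Assumes inNumberFrames never exceeds 4096.)
        SInt16 local[4096];
        AudioBufferList bufList;
        bufList.mNumberBuffers = 1;
        bufList.mBuffers[0].mNumberChannels = 1;
        bufList.mBuffers[0].mDataByteSize   = inNumberFrames * sizeof(SInt16);
        bufList.mBuffers[0].mData           = local;

        OSStatus status = AudioUnitRender(remoteIOUnit, ioActionFlags, inTimeStamp,
                                          inBusNumber, inNumberFrames, &bufList);
        if (status != noErr) return status;

        // Append this callback's frames to the big array until 3 seconds are collected.
        UInt32 framesToCopy = inNumberFrames;
        if (gIndex + framesToCopy > kTotalSamples) framesToCopy = kTotalSamples - gIndex;
        memcpy(gSamples + gIndex, local, framesToCopy * sizeof(SInt16));
        gIndex += framesToCopy;

        return noErr;
    }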
I have appended the frames that came with each render callback, as hotpaw2 said. You can find the detailed code showing how I handled the buffers in this post/question of mine:
Concatenating Audio Buffers in ObjectiveC
Thanks.
I want to extract a few clips from a recorded WAV file. I am not finding much help online regarding this. I understand we can't simply split compressed formats like MP3, but how do we do it with CAF/WAV files?
One approach you may consider would be to calculate and read the bytes from an audio file and write them to a new file. Because you are dealing with LPCM formats the calculations are relatively simple.
If for example you have a file of 16bit mono LPCM audio sampled at 44.1kHz that is one minute in duration, then you have a total of (60 secs x 44100Hz) 2,646,000 samples. Times 2 bytes per sample gives a total of 5,292,000 bytes. And if you want audio from 10sec to 30sec then you need to read the bytes from 882,000 to 2,646,000 and write them to a separate file.
There is a bit of code involved, but it can be done using the Audio File Services functions from the AudioToolbox framework.
Functions you'll need to use are AudioFileOpenURL, AudioFileCreateWithURL, AudioFileReadBytes, AudioFileWriteBytes, and AudioFileClose.
An algorithm would be something like this (a code sketch follows the steps):
You first set up an AudioFileID which is an opaque type that gets passed in to the AudioFileCreateWithURL function. Then open the file you wish to splice up using AudioFileOpenURL.
Calculate the start and end bytes of what you want to copy.
Next, in a loop preferably, read in the bytes and write them to the new file. AudioFileReadBytes and AudioFileWriteBytes allow you to do this. What's good is that you can read and write whatever number of bytes you decide on each iteration of the loop.
When finished, close the new file and the original using AudioFileClose.
Then repeat for each file (audio extraction) to be written.
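Here, for reference, is what those steps could look like in C. Error handling is reduced to asserts, ExtractClip() and the WAV file type are assumptions, and the byte offsets are derived from the file's own AudioStreamBasicDescription rather than hard-coded.

    #include <AudioToolbox/AudioToolbox.h>
    #include <assert.h>

    // Copy the audio between startSec and endSec of srcURL into a new WAV file at dstURL.
    void ExtractClip(CFURLRef srcURL, CFURLRef dstURL, double startSec, double endSec)
    {
        AudioFileID srcFile = NULL, dstFile = NULL;

        // 1. Open the source file.
        OSStatus err = AudioFileOpenURL(srcURL, kAudioFileReadPermission, 0, &srcFile);
        assert(err == noErr);

        // The new file needs the same LPCM description as the source.
        AudioStreamBasicDescription asbd = {0};
        UInt32 size = sizeof(asbd);
        err = AudioFileGetProperty(srcFile, kAudioFilePropertyDataFormat, &size, &asbd);
        assert(err == noErr);

        err = AudioFileCreateWithURL(dstURL, kAudioFileWAVEType, &asbd,
                                     kAudioFileFlags_EraseFile, &dstFile);
        assert(err == noErr);

        // 2. Calculate the start and end byte offsets within the audio data.
        //    (16-bit mono at 44.1 kHz: 10 s -> byte 882,000 and 30 s -> byte 2,646,000, as above.)
        SInt64 startByte = (SInt64)(startSec * asbd.mSampleRate) * asbd.mBytesPerFrame;
        SInt64 endByte   = (SInt64)(endSec   * asbd.mSampleRate) * asbd.mBytesPerFrame;

        // 3. Read and write in chunks of whatever size is convenient.
        enum { kChunkBytes = 32 * 1024 };
        char buf[kChunkBytes];
        SInt64 readPos = startByte, writePos = 0;

        while (readPos < endByte) {
            UInt32 ioBytes = (UInt32)(endByte - readPos < kChunkBytes ? endByte - readPos
                                                                      : kChunkBytes);
            err = AudioFileReadBytes(srcFile, false, readPos, &ioBytes, buf);
            if (err != noErr || ioBytes == 0) break;   // end of file or read error

            err = AudioFileWriteBytes(dstFile, false, writePos, &ioBytes, buf);
            assert(err == noErr);

            readPos  += ioBytes;
            writePos += ioBytes;
        }

        // 4. Close both files.
        AudioFileClose(dstFile);
        AudioFileClose(srcFile);
    }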
As an additional note, you would split a compressed format by converting it to LPCM first.
I'm trying to simultaneously read and write H.264 mov file written by AVAssetWriter. I managed to extract individual NAL units, pack them into ffmpeg's AVPackets and write them into another video format using ffmpeg. It works and the resulting file plays well except the playback speed is not right. How do I calculate the correct PTS/DTS values from raw H.264 data? Or maybe there exists some other way to get them?
Here's what I've tried:
Limiting the capture min/max frame rate to 30 and assuming the output file will be 30 fps. In fact, its fps is always lower than the values I set, and I don't think the fps is constant from packet to packet either.
Remembering each written sample's presentation timestamp, assuming that samples map one-to-one to NALUs, and applying the saved timestamp to the output packet. This doesn't work.
Setting PTS to 0 or AV_NOPTS_VALUE. Doesn't work.
From googling about it I understand that raw H.264 data usually doesn't contain any timing info. It can sometimes have some timing info inside SEI, but the files that I use don't have it. On the other hand, there are some applications that do exactly what I'm trying to do, so I suppose it is possible somehow.
You will either have to generate them yourself, or access the atoms containing timing information in the MP4/MOV container to generate the PTS/DTS information. FFmpeg's mov.c in libavformat might help.
Each sample/frame you write with AVAssetWriter will map one-to-one with the VCL NALs. If all you are doing is converting, then have FFmpeg do all the heavy lifting. It will properly maintain the timing information when going from one container format to another.
The bitstream generated by AVAssetWriter does not contain SEI data. It only contains SPS/PPS/I/P frames. The SPS also does not contain VUI or HRD parameters.
-- Edit --
Also, keep in mind that if you are saving PTS information from the CMSampleBufferRefs, the time base may be different from that of the target container. For instance, AVFoundation's time base is nanoseconds, while an FLV file's is milliseconds.
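As a rough illustration of that last point (not the asker's actual code), rescaling a saved timestamp into the output stream's time base with FFmpeg might look like this; the nanosecond source time base and the WriteNalu() wrapper are assumptions.

    #include <libavformat/avformat.h>
    #include <libavutil/mathematics.h>

    // pkt already contains one NAL unit's data; capturedPtsNs is the presentation
    // timestamp remembered from the CMSampleBufferRef, converted to nanoseconds.
    void WriteNalu(AVFormatContext *outCtx, AVStream *outStream,
                   AVPacket *pkt, int64_t capturedPtsNs)
    {
        AVRational srcTimeBase = { 1, 1000000000 };   // nanoseconds (assumption)

        // Rescale from the capture clock into the container's time base.
        pkt->pts = av_rescale_q(capturedPtsNs, srcTimeBase, outStream->time_base);
        pkt->dts = pkt->pts;             // valid only if there is no frame reordering
        pkt->stream_index = outStream->index;

        av_interleaved_write_frame(outCtx, pkt);
    }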
I am using VTCompressionSessionEncodeFrameWithOutputHandler to compress pixel buffers from the camera into a raw H.264 stream. I use kVTEncodeFrameOptionKey_ForceKeyFrame to make sure that every output from VTCompressionSessionEncodeFrame is not dependent on other pieces. I also set the kVTCompressionPropertyKey_AllowFrameReordering = false and kVTCompressionPropertyKey_RealTime = true options during session initialization, and VTCompressionSessionCompleteFrames is called after each VTCompressionSessionEncodeFrame call.
I also collect the samples produced by VTCompressionSessionEncodeFrame and periodically save them as an MP4 file (using the Bento4 library).
But the final track is always 1-2 seconds shorter than the samples fed to VTCompressionSessionEncodeFrame. After several attempts to resolve this, I became convinced that VTCompressionSessionEncodeFrame outputs frames that depend on later frames to be decoded properly, so these frames are lost, since they cannot be used to produce the "final chunks" of the track.
So the question: how can one force VTCompressionSessionEncodeFrame to produce totally independent data chunks?
Turns out this was... an FPS issue! NAL units do not carry timing themselves (aside from the PTS, which in my case is bound to the capture FPS), so it is quite important that they are produced at exactly the rate the movie's FPS expects. Nothing was lost; the saved frames were just played back faster (which was not so easy to spot, in fact).
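For the record, the shape of the fix (under my assumptions: a fixed 30 fps feed and a hypothetical didCompressFrame() handing samples to the muxer) was roughly this: give every frame a PTS and duration derived from the constant frame rate, and tell the encoder what rate to expect.

    #include <VideoToolbox/VideoToolbox.h>

    static const int32_t kFPS = 30;                        // assumed constant capture rate

    void didCompressFrame(CMSampleBufferRef sampleBuffer); // hands the sample to the muxer (elsewhere)

    void EncodeFrame(VTCompressionSessionRef session,
                     CVPixelBufferRef pixelBuffer,
                     int64_t frameIndex)
    {
        CMTime pts      = CMTimeMake(frameIndex, kFPS);    // frame N is presented at N/30 s
        CMTime duration = CMTimeMake(1, kFPS);             // each frame lasts 1/30 s

        VTCompressionSessionEncodeFrameWithOutputHandler(
            session, pixelBuffer, pts, duration,
            NULL,   // or a dictionary with kVTEncodeFrameOptionKey_ForceKeyFrame
            NULL,   // infoFlagsOut
            ^(OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer) {
                if (status == noErr && sampleBuffer != NULL) {
                    didCompressFrame(sampleBuffer);
                }
            });
    }

    // During session setup, also advertise the rate frames will arrive at:
    //     int32_t fps = kFPS;
    //     CFNumberRef fpsRef = CFNumberCreate(NULL, kCFNumberSInt32Type, &fps);
    //     VTSessionSetProperty(session, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);
    //     CFRelease(fpsRef);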