iOS endless video recording

I'm trying to develop an iPhone app that will use the camera to record only the last few minutes/seconds.
For example, you record for 5 minutes, click "save", and only the last 30 seconds are saved. I don't want to actually record five minutes and then chop off the last 30 seconds (that won't work for me). This idea is called "loop recording".
This results in an endless video recording, of which you keep only the last part.
The Precorder app does what I want to do (I want to use this feature in a different context).
I think this could be simulated fairly easily with a circular buffer.
I started a project with AVFoundation. It would be awesome if I could somehow redirect the video data into a circular buffer (which I will implement), but I have only found information on how to write it to a file.
I know I could chop the video into intervals and save them, but saving a segment and restarting the camera to record the next one takes time, and it is possible to lose important moments in the movie.
Any clues on how to redirect the data from the camera would be appreciated.

Important! As of iOS 8 you can use VTCompressionSession and have direct access to the NAL units instead of having to dig through the container.
Well, luckily you can do this and I'll tell you how, but you're going to have to get your hands dirty with either the MP4 or MOV container. A helpful resource for this (though more MOV-specific) is Apple's QuickTime File Format introduction:
http://developer.apple.com/library/mac/#documentation/QuickTime/QTFF/QTFFPreface/qtffPreface.html#//apple_ref/doc/uid/TP40000939-CH202-TPXREF101
First things first: you're not going to be able to start your saved movie from an arbitrary point 30 seconds before the end of the recording; you'll have to use some I-frame at approximately 30 seconds. Depending on what your keyframe interval is, it may be several seconds before or after that 30-second mark. You could use all I-frames and start from an arbitrary point, but then you'll probably want to re-encode the video afterward because it will be quite large.
So, knowing that, let's move on.
The first step is, when you set up your AVAssetWriter, to set its AVAssetWriterInput's expectsMediaDataInRealTime property to YES.
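For reference, a minimal Swift sketch of that setup (the output file, codec, and 1280x720 dimensions here are placeholder assumptions, not anything prescribed by the answer):

import AVFoundation

// Create a writer whose video input is flagged for real-time capture data.
func makeWriter(to url: URL) throws -> (AVAssetWriter, AVAssetWriterInput) {
    let writer = try AVAssetWriter(outputURL: url, fileType: .mov)
    let settings: [String: Any] = [
        AVVideoCodecKey: AVVideoCodecType.h264,
        AVVideoWidthKey: 1280,
        AVVideoHeightKey: 720
    ]
    let input = AVAssetWriterInput(mediaType: .video, outputSettings: settings)
    input.expectsMediaDataInRealTime = true   // the property this step is about
    if writer.canAdd(input) { writer.add(input) }
    return (writer, input)
}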
In the captureOutput callback you'll be able to do an fread from the file you are writing to. The first fread will get you a little bit of MP4/MOV (whatever format you're using) header (i.e. 'ftyp' atom, 'wide' atom, and the beginning of the 'mdat' atom). You want what's inside the 'mdat' section. So the offset you'll start saving data from will be 36 or so.
Each read will get you 0 or more AVC NAL Units. You can find a listing of NAL unit types from ISO/IEC 14496-10 Table 7-1. They will be in a slightly different format than specified in Annex B, but it's fine. Additionally, there will only be IDR slices and non-IDR slices in the MP4/MOV file. IDR will be the I-Frame you're looking to hang onto.
The NAL unit format in the MP4/MOV container is as follows:
4 bytes - Size
[Size] bytes - NALU Data
data[0] & 0x1F - NALU Type
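A rough Swift sketch of walking those length-prefixed NAL units in a chunk read out of the 'mdat' section (it assumes the chunk starts on a NAL unit boundary and uses 4-byte lengths, as described above):

import Foundation

// Calls `body` with (NAL unit type, NAL unit bytes) for each unit in `data`.
func enumerateNALUnits(in data: Data, _ body: (UInt8, Data) -> Void) {
    var offset = data.startIndex
    while offset + 4 <= data.endIndex {
        // 4 bytes - Size (big-endian)
        let size = data[offset..<offset + 4].reduce(0) { ($0 << 8) | Int($1) }
        let start = offset + 4
        guard size > 0, start + size <= data.endIndex else { break }
        let nalu = data[start..<start + size]
        // data[0] & 0x1F - NALU Type (5 = IDR slice, the I-frame to hang onto)
        let type = nalu[nalu.startIndex] & 0x1F
        body(type, Data(nalu))
        offset = start + size
    }
}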
So now you have the data you're looking for. When you go to save this file, you'll have to update the MP4/MOV container with the correct length and sample count: you'll have to update the 'stsz' atom with the correct size for each sample, update the media headers and track headers with the correct duration of the movie, and so on. What I would probably recommend doing is creating a sample container on first run that you can more or less just overwrite/augment with the appropriate data for that particular movie. You'll want to do this because the encoders on the various iDevices don't all have the same settings, and the 'avcC' atom contains encoder information.
You don't really need to know much about the AVC stream in this case, so you'll probably want to concentrate your experimenting around updating the container format you choose correctly. Good luck.

Related

Get byte range positions for a specified time range of an mp3 file

I'd like to be able to determine at what byte positions a segment of an NSData compressed mp3 file begins and ends.
For example, if I am playing an mp3 file using AVPlayer (or any player) that is 1 minute long and 1,000,000 bytes, I'd like to know approximately at which byte in the file the 30-second mark occurs, and then at which byte the 40-second mark occurs.
Note that due to the mp3 file being compressed I can't just divide the bytes in half to determine the 30 second mark.
If this can't be done with Swift/Objective-C, do you know if this determination can be done with any programming language? Thanks!
It turns out I had a different problem to solve. I was trying to approximate the byte position of a specific time, say, the 4:29 point of a 32:45 long podcast episode, within a few seconds of accuracy.
I used a function along these lines to calculate the approximate byte position:
startTimeBytesPosition = (startTimeInSeconds / episodeDuration) * episodeFileSize
That function worked like a charm for some episodes, but for others the resulting start time would be off by about 30-40 seconds.
It turns out this inaccuracy was happening because some mp3s contain metadata at the very beginning, and image files stored within that metadata can be 500 KB or more, so my time-from-byte-position calculation for any episode with an embedded image was off by about 500 KB (which translated into about 30-40 seconds in this case).
To resolve this, I am first determining the size in bytes of the metadata in an mp3 file, and then use that to offset the approximation function:
startTimeBytesPosition = metadataBytesOffset + (startTimeInSeconds / episodeDuration) * episodeFileSize
So far this code seems to be doing a good job of approximating time based on byte position accurately within a few seconds.
I should note that this assumes that the metadata for the image will always appear at the beginning of the mp3 file, and I don't know if that will always be the case.
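A sketch of that offset calculation in Swift, assuming the metadata is an ID3v2 tag at the start of the file (the 10-byte ID3v2 header stores the tag size as a 4-byte syncsafe integer); the function names are just for illustration:

import Foundation

// Size of a leading ID3v2 tag (header + body), or 0 if there isn't one.
func id3v2TagSize(of fileData: Data) -> Int {
    guard fileData.count >= 10, fileData.prefix(3) == Data("ID3".utf8) else { return 0 }
    let s = fileData.subdata(in: 6..<10)
    // Syncsafe integer: 7 bits per byte, high bit of each byte is zero.
    let bodySize = (Int(s[0]) << 21) | (Int(s[1]) << 14) | (Int(s[2]) << 7) | Int(s[3])
    return 10 + bodySize
}

// Same formula as above, shifted past the metadata.
func approximateBytePosition(startTimeInSeconds: Double,
                             episodeDuration: Double,
                             fileData: Data) -> Int {
    let metadataBytesOffset = id3v2TagSize(of: fileData)
    let episodeFileSize = Double(fileData.count)
    return metadataBytesOffset + Int(startTimeInSeconds / episodeDuration * episodeFileSize)
}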

Is it possible to split the recorded wav file into multiple wav files on iOS, given the duration of the splits?

I want to extract a few clips from a recorded wav file. I am not finding much help online regarding this issue. I understand we can't easily split compressed formats like mp3, but how do we do it with caf/wav files?
One approach you may consider would be to calculate and read the bytes from an audio file and write them to a new file. Because you are dealing with LPCM formats the calculations are relatively simple.
If for example you have a file of 16bit mono LPCM audio sampled at 44.1kHz that is one minute in duration, then you have a total of (60 secs x 44100Hz) 2,646,000 samples. Times 2 bytes per sample gives a total of 5,292,000 bytes. And if you want audio from 10sec to 30sec then you need to read the bytes from 882,000 to 2,646,000 and write them to a separate file.
There is a bit of code involved, but it can be done using Audio File Services from the AudioToolbox framework.
Functions you'll need to use are AudioFileOpenURL, AudioFileCreateWithURL, AudioFileReadBytes, AudioFileWriteBytes, and AudioFileClose.
An algorithm would be something like this:
You first set up an AudioFileID, which is an opaque type that gets passed into the AudioFileCreateWithURL function. Then open the file you wish to splice up using AudioFileOpenURL.
Calculate the start and end bytes of what you want to copy.
Next, preferably in a loop, read in the bytes and write them to the new file. AudioFileReadBytes and AudioFileWriteBytes allow you to do this. What's good is that you can read and write however many bytes you decide on each iteration of the loop.
When finished, close the new file and the original using AudioFileClose.
Then repeat for each clip (audio extraction) to be written.
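A rough Swift sketch of those steps, assuming 16-bit mono LPCM at 44.1 kHz and with error checking omitted for brevity (the WAV output type and 64 KB chunk size are arbitrary choices):

import AudioToolbox

func extractClip(from source: URL, to destination: URL,
                 startSeconds: Double, endSeconds: Double) {
    // 16-bit mono LPCM at 44.1 kHz: 2 bytes per frame.
    let bytesPerFrame: Int64 = 2
    var readOffset = Int64(startSeconds * 44_100) * bytesPerFrame
    let endByte    = Int64(endSeconds * 44_100) * bytesPerFrame

    // Open the source and create the destination with the same data format.
    var inFile: AudioFileID?
    var outFile: AudioFileID?
    AudioFileOpenURL(source as CFURL, .readPermission, 0, &inFile)
    var format = AudioStreamBasicDescription()
    var propSize = UInt32(MemoryLayout.size(ofValue: format))
    AudioFileGetProperty(inFile!, kAudioFilePropertyDataFormat, &propSize, &format)
    AudioFileCreateWithURL(destination as CFURL, kAudioFileWAVEType,
                           &format, .eraseFile, &outFile)

    // Read and write in chunks of whatever size we like; 64 KB here.
    var buffer = [UInt8](repeating: 0, count: 64 * 1024)
    var writeOffset: Int64 = 0
    while readOffset < endByte {
        var chunkBytes = UInt32(min(Int64(buffer.count), endByte - readOffset))
        AudioFileReadBytes(inFile!, false, readOffset, &chunkBytes, &buffer)
        AudioFileWriteBytes(outFile!, false, writeOffset, &chunkBytes, buffer)
        readOffset  += Int64(chunkBytes)
        writeOffset += Int64(chunkBytes)
    }
    AudioFileClose(inFile!)
    AudioFileClose(outFile!)
}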
As an additional note, you would split a compressed format by converting it to LPCM first.

Get PTS from raw H264 mdat generated by iOS AVAssetWriter

I'm trying to simultaneously read and write H.264 mov file written by AVAssetWriter. I managed to extract individual NAL units, pack them into ffmpeg's AVPackets and write them into another video format using ffmpeg. It works and the resulting file plays well except the playback speed is not right. How do I calculate the correct PTS/DTS values from raw H.264 data? Or maybe there exists some other way to get them?
Here's what I've tried:
Limiting the capture min/max frame rate to 30 and assuming that the output file will be 30 fps. In fact, its fps is always less than the values I set, and I don't think the fps is constant from packet to packet.
Remembering each written sample's presentation timestamp, assuming that samples map one-to-one to NALUs, and applying the saved timestamp to the output packet. This doesn't work.
Setting the PTS to 0 or AV_NOPTS_VALUE. This doesn't work either.
From googling about it I understand that raw H.264 data usually doesn't contain any timing info. It can sometimes have some timing info inside SEI, but the files that I use don't have it. On the other hand, there are some applications that do exactly what I'm trying to do, so I suppose it is possible somehow.
You will either have to generate them yourself, or access the atoms containing timing information in the MP4/MOV container to generate the PTS/DTS information. FFmpeg's mov.c in libavformat might help.
Each sample/frame you write with AVAssetWriter will map one to one with the VCL NALs. If all you are doing is converting then have FFmpeg do all the heavy lifting. It will properly maintain the timing information when going from one container format to another.
The bitstream generated by AVAssetWriter does not contain SEI data. It only contains SPS/PPS/I/P frames. The SPS also does not contain VUI or HRD parameters.
-- Edit --
Also, keep in mind that if you are saving PTS information from the CMSampleBufferRefs, then the time base may be different from that of the target container. For instance, AVFoundation's time base is nanoseconds, while an FLV file's is milliseconds.
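As an illustration of that last point, a small Swift sketch that grabs the presentation timestamp from a CMSampleBuffer and rescales it to milliseconds for an FLV-style muxer (the millisecond target is just an example):

import CoreMedia

// Presentation timestamp of a captured sample, converted to milliseconds.
func millisecondPTS(for sampleBuffer: CMSampleBuffer) -> Int64 {
    let pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    return Int64(CMTimeGetSeconds(pts) * 1000.0)
}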

OpenAL buffer update in real-time

I'm working in iOS and have a simple OpenAL project running.
The difference from most OpenAL projects I've seen is that I'm not loading a sound file. Instead I load an array of raw data into alBufferData. Using a couple of equations I can generate data that produces white noise, sine and pulse waves, and all of that is working well.
My problem is that I need a way to modify this data in real time, while the sound is playing.
Is there a way to modify this data without having to create a new buffer? (I tried the approach of creating a new buffer with new data and then using it instead, but it's nowhere near quick enough.)
Any help or suggestions of other ways to accomplish this would be much appreciated.
Thanks
I haven't done it on iOS, but with OpenAL on the PC what you would do is chain a few buffers together. Each buffer holds a small time period's worth of data. Periodically, check whether a playing buffer is done, and if so, add it to a free list for reuse. When you want to change the sound, write the new waveform into a free buffer and add it to the chain. You select the buffer size to balance latency against the required update rate: smaller buffers allow faster response to changes but need to be generated more often.
This page suggests that a half second update rate is doable. Whether you can go faster depends on the complexity of your calculations as well as on the overhead of the OS.
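A sketch of that buffer-chaining idea in Swift, using the standard OpenAL queueing calls; generateSamples() stands in for whatever white noise / sine / pulse generator you already have:

import OpenAL

// Reclaim buffers the source has finished playing, refill them with fresh
// samples, and queue them back onto the source's chain.
func refillProcessedBuffers(source: ALuint, generateSamples: () -> [Int16]) {
    var processed: ALint = 0
    alGetSourcei(source, AL_BUFFERS_PROCESSED, &processed)
    while processed > 0 {
        var buffer: ALuint = 0
        alSourceUnqueueBuffers(source, 1, &buffer)
        let samples = generateSamples()
        samples.withUnsafeBytes { raw in
            alBufferData(buffer, AL_FORMAT_MONO16, raw.baseAddress,
                         ALsizei(raw.count), 44_100)
        }
        alSourceQueueBuffers(source, 1, &buffer)
        processed -= 1
    }
}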
Changing the data during playback is not supported in OpenAL.
However, you can still try it and see if you get acceptable results (though you'll be racing against the OpenAL playback mechanism, and any lag in your app could throw it off, so do this at your own risk).
There's an Apple extension version of alBufferData that tells OpenAL to use the data you give it directly, rather than making its own local copy. You set it up like so:
typedef ALvoid AL_APIENTRY (*alBufferDataStaticProcPtr) (const ALint bid,
ALenum format,
const ALvoid* data,
ALsizei size,
ALsizei freq);
static alBufferDataStaticProcPtr alBufferDataStatic = NULL;
alBufferDataStatic = (alBufferDataStaticProcPtr) alcGetProcAddress(NULL, (const ALCchar*) "alBufferDataStatic");
Call alBufferDataStatic() like you would call alBufferData():
alBufferDataStatic(bufferId, format, data, size, frequency);
Since it's now using your sound data buffer rather than its own, you could conceivably modify that data and OpenAL won't be the wiser (provided you're not modifying things too close to where it's currently playing from in the buffer).
However, this approach is risky, since it depends on timing you're not fully in control of. To be 100% safe you'll need to use Audio Units.

VTCompressionSessionEncodeFrame: last seconds are lost?

I am using VTCompressionSessionEncodeFrameWithOutputHandler to compress pixel buffers from the camera into a raw H.264 stream. I am using kVTEncodeFrameOptionKey_ForceKeyFrame to be sure that every output from VTCompressionSessionEncodeFrame is not dependent on other pieces. Also, the kVTCompressionPropertyKey_AllowFrameReordering = false and kVTCompressionPropertyKey_RealTime = true options are set during session initialization, and VTCompressionSessionCompleteFrames is called after each VTCompressionSessionEncodeFrame call.
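For reference, a Swift sketch of the session configuration described above (property names are from VideoToolbox; the dictionary is what would be passed as the frame properties to VTCompressionSessionEncodeFrame):

import VideoToolbox

// No frame reordering, real-time mode; set once after creating the session.
func configure(_ session: VTCompressionSession) {
    VTSessionSetProperty(session, key: kVTCompressionPropertyKey_AllowFrameReordering,
                         value: kCFBooleanFalse)
    VTSessionSetProperty(session, key: kVTCompressionPropertyKey_RealTime,
                         value: kCFBooleanTrue)
}

// Per-frame options forcing every frame to be a keyframe.
let forceKeyFrameProperties = [kVTEncodeFrameOptionKey_ForceKeyFrame as String: true] as CFDictionary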
I also collect the samples produced by VTCompressionSessionEncodeFrame and periodically save them as an MP4 file (using the Bento4 library).
But the final track is always 1-2 seconds shorter than the samples fed to VTCompressionSessionEncodeFrame. After several attempts to resolve this, I became fairly sure that VTCompressionSessionEncodeFrame outputs frames that depend on later frames to be decoded properly, so those frames are lost, since they cannot be used to produce the "final chunks" of the track.
So the question is: how can one force VTCompressionSessionEncodeFrame to produce totally independent data chunks?
It turns out this was... an FPS issue! NAL units do not carry timing themselves (aside from the PTS, which is capture-FPS-bound in my case), so it is quite important that they are produced at exactly the rate the movie's FPS expects them. Nothing was lost; the saved frames were just played back faster (which was not so easy to spot, in fact).
