Bitrate is not getting limited for H.264 HW accelerated encode on iOS using the VideoToolbox API

Bitrate is not getting limited for H.264 HW accelerated encode on iOS using the VideoToolbox API, even with the kVTCompressionPropertyKey_AverageBitRate property set.
The bitrate sometimes shoots up to 4 Mbps (for both 1280x780 and 640x360) even though the encoder's bitrate is configured correctly.
This high bitrate is outside acceptable limits.
There is a single property for setting the bitrate in VideoToolbox, i.e. kVTCompressionPropertyKey_AverageBitRate. The documentation says: "This is not a hard limit; the bit rate may peak above this."
I have tried the following two things (a sketch of the second approach is shown below):
1. Setting the bitrate and data rate to hardcoded values as part of the encoderSpec attribute of VTCompressionSessionCreate at init time, and removing any reconfiguration of the bitrate after init.
2. Setting the bitrate and data rate at run time using VTSessionSetProperty.
Neither seems to work.
Is there any way to restrict the bitrate to a certain limit? Any help is greatly appreciated.
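For reference, a minimal sketch (C++, against the VideoToolbox C API) of the run-time approach; the 500 kbps average and the one-second data-rate window are illustrative assumptions, not values from the question:

#include <VideoToolbox/VideoToolbox.h>

// Sketch: cap an existing VTCompressionSessionRef at an average bitrate and
// a byte budget per one-second window. Values are illustrative only.
static void ConfigureBitrate(VTCompressionSessionRef session)
{
    int32_t averageBitRate = 500000; // bits per second (assumed target)
    CFNumberRef bitRateNum = CFNumberCreate(kCFAllocatorDefault,
                                            kCFNumberSInt32Type, &averageBitRate);
    VTSessionSetProperty(session, kVTCompressionPropertyKey_AverageBitRate,
                         bitRateNum);
    CFRelease(bitRateNum);

    // kVTCompressionPropertyKey_DataRateLimits takes [bytes, seconds] pairs:
    // here, at most averageBitRate/8 bytes over any 1-second window.
    int64_t bytesPerWindow = averageBitRate / 8;
    double windowSeconds = 1.0;
    CFNumberRef bytesNum = CFNumberCreate(kCFAllocatorDefault,
                                          kCFNumberSInt64Type, &bytesPerWindow);
    CFNumberRef secondsNum = CFNumberCreate(kCFAllocatorDefault,
                                            kCFNumberDoubleType, &windowSeconds);
    const void *limits[2] = { bytesNum, secondsNum };
    CFArrayRef limitsArray = CFArrayCreate(kCFAllocatorDefault, limits, 2,
                                           &kCFTypeArrayCallBacks);
    VTSessionSetProperty(session, kVTCompressionPropertyKey_DataRateLimits,
                         limitsArray);
    CFRelease(limitsArray);
    CFRelease(bytesNum);
    CFRelease(secondsNum);
}

Even with both properties set, the hardware rate control treats them as targets rather than guarantees, which matches the documentation quote above.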

If you are dealing with high-motion scenes, 4 Mbps may well be a reasonable value. In a non-real-time situation, I think you should try configuring the profile to High with Level 5, setting H264EntropyMode to CABAC, and extending the value of the MaxKeyFrameInterval key.
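A minimal sketch of that suggested configuration, assuming an existing session; the keyframe interval of 300 frames is an illustrative value:

#include <VideoToolbox/VideoToolbox.h>

// Sketch: non-real-time quality settings as suggested above (High profile
// Level 5, CABAC entropy coding, extended keyframe interval).
static void ConfigureForQuality(VTCompressionSessionRef session)
{
    VTSessionSetProperty(session, kVTCompressionPropertyKey_ProfileLevel,
                         kVTProfileLevel_H264_High_5_0);
    VTSessionSetProperty(session, kVTCompressionPropertyKey_H264EntropyMode,
                         kVTH264EntropyMode_CABAC);

    int32_t keyFrameInterval = 300; // illustrative: one keyframe per 300 frames
    CFNumberRef intervalNum = CFNumberCreate(kCFAllocatorDefault,
                                             kCFNumberSInt32Type, &keyFrameInterval);
    VTSessionSetProperty(session, kVTCompressionPropertyKey_MaxKeyFrameInterval,
                         intervalNum);
    CFRelease(intervalNum);
}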

Related

VTCompressionSession Bitrate/Datarate overshooting

I have been working on an H.264 hardware-accelerated encoder implementation using VideoToolbox's VTCompressionSession for a while now, and a consistent problem has been the unreliable bitrate coming out of it. I have read many forum posts and looked through existing code for this and tried to follow suit, but the bitrate out of my encoder is almost always somewhere between 5% and 50% off from what it is set to, and on occasion I've seen some huge errors, even 400% overshoot, where a single frame is twice the size of the given average bitrate.
My session is setup as follows:
kVTCompressionPropertyKey_AverageBitRate = desired bitrate
kVTCompressionPropertyKey_DataRateLimits = [desired bitrate / 8, 1]; accounting for bits vs bytes
kVTCompressionPropertyKey_ExpectedFrameRate = framerate (30, 15, 5, or 1 fps)
kVTCompressionPropertyKey_MaxKeyFrameInterval = 1500
kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration = 1500 / framerate
kVTCompressionPropertyKey_AllowFrameReordering = NO
kVTCompressionPropertyKey_ProfileLevel = kVTProfileLevel_H264_Main_AutoLevel
kVTCompressionPropertyKey_RealTime = YES
kVTCompressionPropertyKey_H264EntropyMode = kVTH264EntropyMode_CABAC
kVTCompressionPropertyKey_BaseLayerFrameRate = framerate / 2
And I adjust the average bitrate and datarate values throughout the session to try and compensate for the volatility (if it's too high, I reduce them a bit, if too low, I increase them, with restrictions on how high and low to go).
I create the session and then apply the above configuration as a single dictionary using VTSessionSetProperties and feed frames into it like this:
VTCompressionSessionEncodeFrame(compressionSessionRef,
                                static_cast<CVImageBufferRef>(pixelBuffer),
                                CMTimeMake(capturetime, 1000), // presentation timestamp, millisecond timescale
                                kCMTimeInvalid,                // no explicit frame duration
                                frameProperties,
                                frameDetailsStruct,            // sourceFrameRefcon
                                &encodeInfoFlags);
So I'm supplying timing information as the API says to do.
Then I add up the size of the output for each frame and divide by a periodic measurement window to determine the outgoing bitrate and its error from the desired value. This is where I see the significant volatility.
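For context, a sketch of that measurement with hypothetical names; the one-second window is an assumed choice:

#include <cstddef>

// Sketch: accumulate encoded frame sizes and derive the outgoing bitrate
// once per measurement window.
struct BitrateMeter {
    size_t bytesInWindow = 0;
    double windowStart = 0.0; // seconds, from any monotonic clock

    // Returns measured bits per second when a window has elapsed, else -1.
    double addFrame(size_t encodedBytes, double nowSeconds,
                    double windowSeconds = 1.0)
    {
        bytesInWindow += encodedBytes;
        double elapsed = nowSeconds - windowStart;
        if (elapsed < windowSeconds)
            return -1.0;
        double bitsPerSecond = (bytesInWindow * 8.0) / elapsed;
        bytesInWindow = 0;
        windowStart = nowSeconds;
        return bitsPerSecond;
    }
};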
I'm looking for any help in getting the bitrate under control, as I'm not sure what to do at this point. Thank you!
I think you should check the frameTimestamp set in VTCompressionSessionEncodeFrame; it seems to affect the bitrate. If you change the frame rate, change the frameTimestamp accordingly.
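A sketch of that idea: derive the timestamp from a frame counter and the current frame rate instead of wall-clock capture times, so the intervals the rate controller sees always match the configured rate (the 90 kHz timescale is an illustrative choice):

#include <CoreMedia/CoreMedia.h>

// Sketch: monotonic presentation timestamps from a frame index. 90000
// divides evenly by 30, 15, 5 and 1 fps, so the timestamps stay exact.
static CMTime TimestampForFrame(int64_t frameIndex, int32_t fps)
{
    const int32_t timescale = 90000;
    return CMTimeMake(frameIndex * (timescale / fps), timescale);
}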

Get peak volume of audio input on iOS

On iOS 7, how do I get the current microphone input volume in a range between 0 and 1?
I've seen several approaches like this one, but the results I get baffle me.
The return values of peakPowerForChannel: are documented to be in the range of -160 to 0 with 0 being the loudest and -160 near absolute silence.
Problem: Given a quiet room and a short but loud noise, the power goes all the way up in an instant but takes a very long time to drop back to the quiet level (way longer than the actual noise lasts).
What I want: Essentially I want an exact copy of the Audio Input patch of Quartz Composer with its Volume Peak output. Any tips?
To get a similar volume peak measurement, you might have to input raw audio via the iOS Audio Queue API (or the RemoteIO Audio Unit), and analyze the raw PCM waveform samples in each audio callback, looking for a magnitude maxima over your desired frame width or analysis time.
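A sketch of that per-buffer analysis, assuming 16-bit PCM samples already arriving via an Audio Queue or RemoteIO callback (the callback wiring is omitted):

#include <cstdlib>
#include <cstdint>
#include <cstddef>

// Sketch: scan one buffer of 16-bit PCM for its peak magnitude, normalized
// to 0..1. Call this from each audio input callback.
static float PeakLevel(const int16_t *samples, size_t count)
{
    int32_t peak = 0;
    for (size_t i = 0; i < count; ++i) {
        int32_t magnitude = std::abs(static_cast<int32_t>(samples[i]));
        if (magnitude > peak)
            peak = magnitude;
    }
    return static_cast<float>(peak) / 32768.0f;
}

Applying fast-attack, slow-decay smoothing to successive PeakLevel values may get closer to the Quartz Composer patch's behavior without the long hang time described above.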

Millisecond (and greater) precision for audio file elapsed time in iOS

I am looking for a low-latency way of finding out how many seconds have elapsed in an audio file, with guaranteed millisecond precision, in real time. According to the AVAudioPlayer class reference, a call to -currentTime will return "the offset of the current playback position, measured in seconds from the start of the sound"; however, an NSTimeInterval is a double, which implies fractions of a second are possible.
As a testing scenario, I have an audio file playing and the user taps a button. Playback DOES NOT pause or stop, but at the moment the button is tapped I would like to obtain information about the elapsed time. In the real application, the button may be pressed many times in one second, hence the need for millisecond precision.
My files are stored as AIFF files and are around 1-10 minutes in length. Ideally I would like to find out exactly which sample frame is 'up-next' when playback resumes - however, this level of precision is a little excessive and millisecond precision is perfectly acceptable.
Is AVAudioPlayer's -currentTime method sufficient to achieve guaranteed millisecond precision for a currently-playing audio file? Or, would it be preferable to use a lower-level API such as iOS's Audio Units?
If you want sub-millisecond relative time resolution, convert to raw PCM and count buffers * length + samples using a low-latency RemoteIO Audio Unit configuration. Most iOS devices will support RemoteIO buffers as small as 256 samples (about 6 ms), with a callback for each buffer.
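A sketch of that bookkeeping, with hypothetical names; the RemoteIO setup itself is omitted:

#include <cstdint>

// Sketch: count samples delivered to the render callback; elapsed time is
// then samples / sampleRate, accurate to well under a millisecond.
static uint64_t gSamplesElapsed = 0; // touched only on the audio thread

static void OnRenderBuffer(uint32_t inNumberFrames)
{
    gSamplesElapsed += inNumberFrames; // e.g. 256 frames per ~6 ms buffer
}

static double ElapsedSeconds(double sampleRate)
{
    return static_cast<double>(gSamplesElapsed) / sampleRate;
}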

How to transmit a signal with a data rate of 3.84 Mbps using a USRP1?

I want to send a signal with a data rate of 3.84 Mbps using a USRP1, but when I transmit the signal I get something like this in the terminal:
WARNING
Target data rate: 3840000 bps
Actual data rate: 4000000 bps
I'm trying to implement a TX path that works with the UMTS air interface, so I don't want this error in the data rate. Can anyone help?
Your sample rate depends on the master clock rate you are using with your USRP. The USRP1 has a master clock rate of 64 MHz, and by default you can only sample at integer decimations of that value, which is why you cannot sample at 3.84 MSps (64 MHz / 3.84 MSps ≈ 16.67, which is not an integer).
UHD is auto-correcting your requested sample rate to one the hardware supports (64 MHz / 16 = 4 MSps). This is actually desirable behavior.
You have two options:
Replace the master clock on the USRP1 with one that divides down to the rate you want.
Use a rational re-sampler. GNURadio provides this block for you, if you want to use it.
I would suggest trying a rational resampler before attempting a hardware mod, which may permanently damage your USRP if done incorrectly.
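To illustrate the arithmetic behind the resampler suggestion (just the ratios involved, not GNURadio API code): 3.84/4 = 24/25, so a 24:25 rational resampler on the 4 MSps hardware stream yields exactly 3.84 MSps.

#include <cstdio>

// Sketch: the USRP1 can only produce 64 MHz / N for integer N, so UHD
// rounds 3.84 MSps to 4 MSps (N = 16); resampling by 24/25 recovers the
// UMTS chip rate exactly.
int main()
{
    const double masterClock = 64e6;
    const double hardwareRate = masterClock / 16;         // 4 MSps
    const double outputRate = hardwareRate * 24.0 / 25.0; // 3.84 MSps

    std::printf("hardware rate: %.0f Sps\n", hardwareRate);
    std::printf("after 24/25 resampling: %.0f Sps\n", outputRate);
    return 0;
}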

Get PTS from raw H264 mdat generated by iOS AVAssetWriter

I'm trying to simultaneously read and write an H.264 .mov file written by AVAssetWriter. I managed to extract individual NAL units, pack them into FFmpeg's AVPackets, and write them into another video format using FFmpeg. It works, and the resulting file plays well, except that the playback speed is not right. How do I calculate the correct PTS/DTS values from raw H.264 data? Or is there some other way to get them?
Here's what I've tried:
Limiting the capture min/max frame rate to 30 and assuming the output file will be 30 fps. In fact, its fps is always lower than the values I set, and I don't think the fps is constant from packet to packet.
Remembering each written sample's presentation timestamp, assuming samples map one-to-one to NALUs, and applying the saved timestamp to the output packet. This doesn't work.
Setting PTS to 0 or AV_NOPTS_VALUE. Doesn't work.
From googling about it I understand that raw H.264 data usually doesn't contain any timing info. It can sometimes have some timing info inside SEI, but the files that I use don't have it. On the other hand, there are some applications that do exactly what I'm trying to do, so I suppose it is possible somehow.
You will either have to generate them yourself, or access the atoms containing the timing information in the MP4/MOV container to generate the PTS/DTS information. FFmpeg's mov.c in libavformat might help.
Each sample/frame you write with AVAssetWriter will map one-to-one to the VCL NALs. If all you are doing is converting, then let FFmpeg do all the heavy lifting. It will properly maintain the timing information when going from one container format to another.
The bitstream generated by AVAssetWriter does not contain SEI data. It only contains SPS/PPS/I/P frames. The SPS also does not contain VUI or HRD parameters.
-- Edit --
Also, keep in mind that if you are saving PTS information from the CMSampleBufferRefs, the time base may be different from that of the target container. For instance, AVFoundation's time base is nanoseconds, while an FLV file uses milliseconds.
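A sketch of that rescaling with FFmpeg's av_rescale_q, assuming the CMTime value/timescale pair was saved from CMSampleBufferGetPresentationTimeStamp; the millisecond target base (e.g. FLV) is illustrative:

extern "C" {
#include <libavutil/mathematics.h>
#include <libavutil/rational.h>
}

// Sketch: convert a saved CMTime (value/timescale) into a millisecond
// time base without losing precision to intermediate floating point.
static int64_t RescaleToMilliseconds(int64_t cmValue, int32_t cmTimescale)
{
    AVRational src = { 1, cmTimescale }; // e.g. {1, 1000000000} for nanoseconds
    AVRational dst = { 1, 1000 };        // milliseconds
    return av_rescale_q(cmValue, src, dst);
}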
