VideoToolbox HEVC decoding failing for iOS 14 on device

So while I'm sure I'm not about to provide enough info for anyone to fix my specific code, what I am itching to know is this:
Does anyone know what might have happened in iOS 14 to change HEVC decoding requirements?
I have a decoder built using VideoToolbox for an HEVC-encoded video stream coming over the network. It was and is working fine on iOS 13 devices and iOS 14 simulators, but it's failing most of the time on iOS 14 devices (up to 14.4 at the time of writing). "Most of the time", because sometimes it does just work, depending on where in the stream I'm trying to begin decoding.
An error I'm occasionally getting from my decompression output callback record is OSStatus -12909 – kVTVideoDecoderBadDataErr. So far, so unhelpful.
Or I may get no error output, like in a unit test which takes fixed packets of data in and should always generate video frames out. (This test likewise fails to generate expected frames when using iOS14 on devices.)
Anyone else had any issues with HEVC decoding in iOS 14 specifically? I'm literally fishing for clues here... I've tried toggling all the usual input flags for VTDecompressionSessionDecodeFrame() (._EnableAsynchronousDecompression, ._EnableTemporalProcessing, ...)
I've also tried redoing my entire rendering layer to use AVSampleBufferDisplayLayer with the raw CMSampleBuffers. It decodes perfectly!! But I can't use it... because I need to micromanage the timing of the output frames myself (and they're not always in order).
(If it helps, the fixed input packets I'm putting into my unit test include NALUs of the following types in order: NAL_UNIT_VPS, NAL_UNIT_SPS, NAL_UNIT_PPS, NAL_UNIT_PREFIX_SEI, NAL_UNIT_CODED_SLICE_CRA, and finally NAL_UNIT_CODED_SLICE_TRAIL_N and NAL_UNIT_CODED_SLICE_TRAIL_R. I took these from a working network stream at some point in the past to serve as a basic sanity test.)

So this morning I came across a solution / workaround. It still sort of begs the original question of "what happened?", but here it is, may it help someone:
The kVTVideoDecoderBadDataErr error was occurring on all NALU packets of type RASL_R or RASL_N that were typically coming in from my video stream immediately after the first content frame (a CRA-type NALU).
Simply skipping these packets (i.e. not passing them to VTDecompressionSessionDecodeFrame()) has resolved the issue for me and my decoder now works fine in both iOS 13 and 14.
The section on "Random Access Support" here says "RASL frames are ... usually discarded." I wonder if iOS 13 and earlier VideoToolbox implementations discarded these frames, while newer implementations don't, leaving it in this case up to the developer?
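In code, the workaround is just a NAL unit type check in front of the decode call. A minimal sketch, assuming the incoming stream is already split into individual HEVC NAL units; the helper and the commented-out session/sample-buffer plumbing are placeholders for whatever your decoder already does:

import Foundation

// HEVC's nal_unit_type lives in bits 1-6 of the first header byte.
func hevcNALUnitType(of nalu: Data) -> UInt8 {
    return (nalu[nalu.startIndex] >> 1) & 0x3F
}

let RASL_N: UInt8 = 8   // values from the HEVC spec's NAL unit type table
let RASL_R: UInt8 = 9

func shouldDecode(_ nalu: Data) -> Bool {
    let type = hevcNALUnitType(of: nalu)
    return type != RASL_N && type != RASL_R   // drop RASL slices entirely
}

// In the decode path, only forward the survivors:
// if shouldDecode(naluData) {
//     VTDecompressionSessionDecodeFrame(session, sampleBuffer: sampleBuffer,
//                                       flags: [], frameRefcon: nil, infoFlagsOut: nil)
// }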

Related

Long freeze on WebRTC iOS native SDK

I am working on an iOS app that uses WebRTC's native SDK to provide access to streams from different cameras. The codec used is H264/AVC.
Although most camera streams work perfectly fine, there are some that consistently freeze when the streams are first launched. It looks like the frames are not being decoded but I am not sure how to go about fixing it.
When I enable debug logging, I see a lot of the following in WebRTC's logs:
(rtp_frame_reference_finder.cc:240): Generic frame with packet range [21170, 21170] has no GoP, dropping frame.
(rtp_frame_reference_finder.cc:240): Generic frame with packet range [21169, 21169] has no GoP, dropping frame.
(video_receive_stream.cc:699): No decodable frame in 200 ms, requesting keyframe.
(video_receive_stream.cc:699): No decodable frame in 200 ms, requesting keyframe.
When there is a freeze, VideoBroadcaster::OnFrame in video_broadcaster.cc is never called, which prevents the entire stream flow from starting. When I test in Xcode and pause/unpause the debugger, almost always the stream will start working, and I see VideoBroadcaster::OnFrame getting fired and frames start being decoded. So somehow the pause/unpause process fixes the issue and kicks off the stream.
On the iOS SDK, the encoders are never set up. I have used the RTCVideoEncoderFactoryH264 encoder factory provided by the SDK. I have provided implementations for the interface/protocol RTCVideoEncoderFactory and also tried overriding the encoders in the SDK. In all of these cases, the createEncoder() function is never called. There are no issues with the decoder, however; it sets up correctly.
In the RTCInboundRTPVideoStream stats report, PLICount and NACKCount are steadily increasing. My understanding is that the receiver is letting the other peer know there is picture loss in the encoded video.
Since I don't know what exactly is preventing the frames from being decoded, I would like to restart the stream when I see PLICount or NACKCount increasing.
How can I do that without going through the whole SDP offer/answer process? The only way I see is to toggle the isEnabled flag on RTCMediaStreamTrack but that doesn't fix the problem for me.
Are there any encoding/decoding parameters I can update to restart the stream?
What could be the reason for pausing/unpausing the debugger fixing the issue?
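(For reference, the track toggle I mentioned is roughly the following; videoTrack stands for whatever remote RTCVideoTrack my renderer is attached to, and as noted it does not actually unfreeze the stream:)

import Foundation
import WebRTC

func nudge(_ videoTrack: RTCVideoTrack) {
    videoTrack.isEnabled = false
    DispatchQueue.main.asyncAfter(deadline: .now() + 0.5) {
        videoTrack.isEnabled = true
    }
}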

Using AVAudioSequencer to send MIDI to third-party AUv3 instruments

I'm having trouble controlling third-party AUv3 instruments with MIDI using AVAudioSequencer (iOS 12.1.4, Swift 4.2, Xcode 10.1) and would appreciate your help.
What I'm doing currently (a rough code sketch follows the list):
Get all AUs of type kAudioUnitType_MusicDevice.
Instantiate one and connect it to the AVAudioEngine.
Create some notes, and put them on a MusicTrack.
Hand the track data over to an AVAudioSequencer connected to the engine.
Set the destinationAudioUnit of the track to my selected Audio Unit.
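Condensed into code, it looks roughly like this. This is a sketch only: here the sequence is loaded from a bundled MIDI file rather than built note-by-note as I actually do, "sequence.mid" is a placeholder name, and error handling is stripped.

import AVFoundation
import AudioToolbox

let engine = AVAudioEngine()

// Find all installed instrument Audio Units (zeroed description fields act as wildcards).
var desc = AudioComponentDescription()
desc.componentType = kAudioUnitType_MusicDevice
let instruments = AVAudioUnitComponentManager.shared().components(matching: desc)

AVAudioUnit.instantiate(with: instruments[0].audioComponentDescription, options: []) { avUnit, _ in
    guard let instrument = avUnit else { return }

    engine.attach(instrument)
    engine.connect(instrument, to: engine.mainMixerNode, format: nil)

    let sequencer = AVAudioSequencer(audioEngine: engine)
    let midiURL = Bundle.main.url(forResource: "sequence", withExtension: "mid")!
    try! sequencer.load(from: midiURL, options: [])

    // Route every track to the selected instrument instead of the default sampler.
    for track in sequencer.tracks {
        track.destinationAudioUnit = instrument
    }

    try! engine.start()
    sequencer.prepareToPlay()
    try! sequencer.start()
}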
So far, so good, but...
When I play the sequence using AVAudioSequencer it plays fine the first time, using the selected Audio Unit. On the second time I get either silence, or a sine wave sound (and I wonder who is making that). I'm thinking the Audio Unit should not be going out of scope in between playbacks of the sequence, but I do stop the engine and restart it again for the new round. (But it should even be possible to swap AUs while the engine is running, so I think this is OK.)
Are there some steps that I'm missing? I would love to include code, but it is really hard to condense it down to its essence from a wall of text. But if you want to ask for specifics, I can answer. Or if you can point me to a working example that shows how to reliably send MIDI to AUv3 using AVAudioSequencer, that would be great.
Is AVAudioSequencer even supposed to work with other Audio Units than Apple's? Or should I start looking for other ways to send MIDI over to AUv3?
I should add that I can consistently send MIDI to the AUv3 using the InstrumentPlayer method from Apple's AUv3Host sample, but that involves a concurrent thread, and results in all sorts of UI sync and timing problems.
EDIT: I added an example project to GitHub:
https://github.com/jerekapyaho/so54753738
It seems that it's now working in iPadOS 13.7, but I don't think I'm doing anything that different than earlier, except this loads a MIDI file from the bundle, instead of generating it from data on the fly.
If someone still has iOS 12, it would be interesting to know if it's broken there, but working on iOS 13.x (x = ?)
In case you are using AVAudioUnitSampler as an audio unit instrument, the sine tone happens when you stop and start the audio engine without reloading the preset. Whenever you start the engine you need to load any instruments back into the sampler (e.g. a SoundFont), otherwise you may hear the sine. This is an issue with the Apple AUSampler, not with 3rd party instruments.
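A minimal sketch of what that reload looks like (soundFontURL is whatever bank you are using):

import AVFoundation
import AudioToolbox

func restart(_ engine: AVAudioEngine, sampler: AVAudioUnitSampler, soundFontURL: URL) throws {
    try engine.start()
    // Without reloading the preset, the sampler falls back to its built-in sine voice.
    try sampler.loadSoundBankInstrument(at: soundFontURL,
                                        program: 0,
                                        bankMSB: UInt8(kAUSampler_DefaultMelodicBankMSB),
                                        bankLSB: UInt8(kAUSampler_DefaultBankLSB))
}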
Btw you can test it under iOS 12 using the simulator.

AVPlayer audio stops after bitrate spike

My iOS app uses AVPlayer to decode H.264 videos with AAC audio tracks out of local device storage. Content with bit rate spikes causes audio to drop shortly (less than a second) after the spike is played, yet video playback continues normally. Playing the videos through Safari seems to work fine, and this behavior is repeatable on several models of iPhone ranging from the 6s through the 8 Plus.
I've been looking for any messages generated, delegates called with error information, or interesting KVOs, but there's been no helpful information so far. What might I do to get some sort of more detailed information that can point me in the right direction?
It turned out that the AVPlayer was configured to use methods for loading data in a custom way. The implementation of these methods failed to follow the pattern of satisfying the requests completely. (The Apple docs are vague about this.) The video portion of the AVPlayer asked for more data repeatedly, so eventually all its data got pulled. However, the audio portion patiently waited for the data to come in, because no error state was reported and not all the data was provided -- the presumption being that it was still pending.
So, in short, it sounds like there are provisions in the video handling code to treat missing data as a stall of some form and plow onward, whereas the audio doesn't have that feature. Not a bad design -- if audio cuts out it's very noticeable, and audio is also by far the smaller stream, so a stall there is much less likely.
Despite spending quite a few days on the problem before posting, the lack of any useful signals made it hard to chase down the problem. I eventually reasoned that if there's no error in producing output from the stream, the problem must be in the delivery of the stream, and the problem revealed itself once I started tweaking the data loading code.
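For anyone doing the same kind of custom loading: assuming it goes through AVAssetResourceLoaderDelegate, the fix amounts to answering every data request completely (or failing it explicitly) rather than leaving it hanging. A rough sketch, where dataForRange(_:_:) stands in for however the bytes are actually produced:

import AVFoundation

final class LocalDataLoader: NSObject, AVAssetResourceLoaderDelegate {
    func resourceLoader(_ resourceLoader: AVAssetResourceLoader,
                        shouldWaitForLoadingOfRequestedResource loadingRequest: AVAssetResourceLoadingRequest) -> Bool {
        guard let dataRequest = loadingRequest.dataRequest else { return false }

        guard let data = dataForRange(dataRequest.requestedOffset, dataRequest.requestedLength),
              data.count == dataRequest.requestedLength else {
            // Fail loudly instead of leaving the request dangling -- a silently
            // unanswered request is what starved the audio track here.
            loadingRequest.finishLoading(with: NSError(domain: NSURLErrorDomain,
                                                       code: NSURLErrorCannotLoadFromNetwork))
            return true
        }

        dataRequest.respond(with: data)   // hand over the entire requested range
        loadingRequest.finishLoading()    // finish only once the request is fully satisfied
        return true
    }

    private func dataForRange(_ offset: Int64, _ length: Int) -> Data? {
        // placeholder: read `length` bytes starting at `offset` from local storage
        return nil
    }
}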

HLS streaming on iPad with iOS 7 / 8 causes 10 seconds of frozen frame - no clue why

We have a problem with HLS h.264 mp4 on iPad devices using HLS streaming on iOS 7 and 8:
For the first 9-15 seconds (the length of the first TS segment), only the first key frame (IDR) of the second TS segment is shown, while the sound plays normally. When the second segment begins to play, the video continues as it should.
The HLS segmenter is Wowza with a 10-second segment length. The encoding software we use is TMPGEnc, latest version (uses x264). The funny thing is that HandBrake, XMedia Recode, and Adobe Media Encoder deliver videos which work. I am aware of the fact that this hints at a bug within our encoding software, but if someone has already had this problem with another software/segmenter combination and fixed it, I would like to know what the source of the problem was.
What we already tried:
changing almost every setting so that it sticks as close as possible to apple's recommendations
changing GOP structure, GOP length, encoding parameters which influence efficiency of encoding
analysis of the TS segments created by wowza, they are fine and begin all with keyframes
contact TMPG/Apple/Wowza support
So, did anyone stumble upon this problem before? Has anyone solved it?
EDIT:
It seems that TMPGEnc uses a defective x264 implementation. The mediastreamvalidator tool from Apple returned an error stating that our TS segment "does not contain any IDR access unit with a SPS and a PPS" - which it does actually, but obviously in the wrong places if that somehow matters.
Whatever tool you use for segmenting is not ensuring that each segment begins with an SPS+PPS+IDR. This could be an issue with your encoder or your segmenter. Basically, decoding cannot begin until all three of these things are encountered by the player. Try using mediafilesegmenter and mediastreamvalidator from Apple to analyze the problem.

iOS SDK avcodec_decode_video Optimization

I've recently started a project that relies on streaming FLV directly to an iOS device. As the most famous option, I went with ffmpeg (and an iOS wrapper - kxmovie). To my surprise, the iPhone 4 is incapable of playing even SD low-bitrate FLV videos. The current implementation decodes the video/audio/subtitle frames in a dispatch_async while loop and copies the YUV frame data to an object, which is then parsed into 3 textures - Y/U/V (in the case of an RGB color space, the data is just parsed directly) and rendered on screen. After much trial and error, I decided to kill the whole rendering pipeline and leave only the avcodec_decode_video2 function running. Surprisingly, the FPS did not improve and videos are still unplayable.
My question is: What can I do to improve the performance of avcodec_decode_video2?
Note:
I've tried a few commercial apps and they play the same file perfectly fine with no more than 50-60% CPU usage.
The library is based on the 1.2 branch and these are the build args:
'--arch=arm',
'--cpu=cortex-a8',
'--enable-pic',
"--extra-cflags='-arch armv7'",
"--extra-ldflags='-arch armv7'",
"--extra-cflags='-mfpu=neon -mfloat-abi=softfp -mvectorize-with-neon-quad'",
'--enable-neon',
'--enable-optimizations',
'--disable-debug',
'--disable-armv5te',
'--disable-armv6',
'--disable-armv6t2',
'--enable-small',
'--disable-ffmpeg',
'--disable-ffplay',
'--disable-ffserver',
'--disable-ffprobe',
'--disable-doc',
'--disable-bzlib',
'--target-os=darwin',
'--enable-cross-compile',
#'--enable-nonfree',
'--enable-gpl',
'--enable-version3',
And according to Instruments the following functions take about 30% CPU usage each:
Running Time Self Symbol Name
37023.9ms 32.3% 13874,8 ff_h264_decode_mb_cabac
34626.2ms 30.2% 9194,7 loop_filter
29430.0ms 25.6% 173,8 ff_h264_hl_decode_mb
As it turns out, even with NEON support, FFmpeg is still executed on the CPU, and thus it can't decode any faster than that. There are apps that use ffmpeg together with the HW decoder; my best guess would be that they strip the header and feed Apple's AssetReader the raw H.264 data.
Just for the fun of it, see what kind of performance you get from this; it does seem to play FLVs quickly, but I have not tested it on the iPhone 4:
https://github.com/mooncatventures-group/WebStreamX_flv_demo
You should use the --enable-asm optimization parameter to boost performance by another 10-15%.
Also, you must install the latest gas-preprocessor.pl.
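For reference, that would be one more entry in the configure args listed above:
'--enable-asm',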

Resources