HLS streaming on iPad with iOS 7 / 8 causes a 10-second frozen frame - no clue why

We have a problem with HLS (H.264 MP4) on iPad devices using HLS streaming on iOS 7 and 8:
For the first 9-15 seconds (the length of the first TS segment), only the first key frame (IDR) of the second TS segment is shown, while the sound plays normally. When the second segment begins to play, the video continues as it should.
The HLS segmenter is Wowza with a 10-second segment length. The encoding software we use is TMPGEnc, latest version (which uses x264). The funny thing is that HandBrake, XMedia Recode, and Adobe Media Encoder all deliver videos that work. I am aware that this hints at a bug in our encoding software, but if someone has already had this problem with another encoder/segmenter combination and fixed it, I would like to know what the source of the problem was.
What we already tried:
changing almost every setting so that it sticks as close as possible to Apple's recommendations
changing the GOP structure, GOP length, and encoding parameters that influence encoding efficiency
analyzing the TS segments created by Wowza; they are fine and all begin with keyframes
contacting TMPG/Apple/Wowza support
So, did anyone stumble upon this problem before? Has anyone solved it?
EDIT:
It seems that TMPGEnc uses a defective x264 implementation. The mediastreamvalidator tool from Apple returned an error stating that our TS segment "does not contain any IDR access unit with a SPS and a PPS" - which it actually does contain, just obviously in the wrong places, if that somehow matters.

Whatever tool you use for segmenting is not ensuring that each segment begins with an SPS + PPS + IDR. This could be an issue with your encoder or your segmenter. Basically, decoding cannot begin until all three of these are encountered by the player. Try using mediafilesegmenter and mediastreamvalidator from Apple to analyze the problem.
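To see what the player sees, here is a minimal sketch in Swift that reports the order of the relevant NAL units, assuming you have first extracted the raw H.264 elementary stream from a TS segment (e.g. with an ffmpeg stream copy to a .h264 file); the function name and file path are just placeholders:

```swift
import Foundation

// Scan an Annex-B H.264 elementary stream and report the order of
// SPS (type 7), PPS (type 8) and IDR (type 5) NAL units, so you can
// check whether an IDR shows up before its parameter sets.
func scanH264NALOrder(_ data: Data) -> [String] {
    let bytes = [UInt8](data)
    var found: [String] = []
    var i = 0
    while i + 3 < bytes.count {
        // 3-byte start code (a 4-byte start code ends with the same pattern).
        if bytes[i] == 0, bytes[i + 1] == 0, bytes[i + 2] == 1 {
            switch bytes[i + 3] & 0x1F {      // NAL type = low 5 bits of the header byte
            case 7: found.append("SPS")
            case 8: found.append("PPS")
            case 5: found.append("IDR slice")
            case 1: found.append("non-IDR slice")
            default: break
            }
            i += 3
        } else {
            i += 1
        }
    }
    return found
}

// Usage sketch:
// let es = try Data(contentsOf: URL(fileURLWithPath: "segment0.h264"))
// print(scanH264NALOrder(es).prefix(6))   // a good segment starts SPS, PPS, IDR slice
```

If the first slice of a segment appears before its SPS/PPS, that matches the mediastreamvalidator complaint from the question.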

Related

VideoToolbox HEVC decoding failing for iOS14 on device

So while I'm sure I'm not about to provide enough info for anyone to fix my specific code, what I am itching to know is this:
Does anyone know what might have happened in iOS 14 to change HEVC decoding requirements?
I have a decoder built using VideoToolbox for an HEVC-encoded video stream coming over the network. It was and is working fine on iOS 13 devices and in iOS 14 simulators, but it's failing most of the time on iOS 14 devices (up to 14.4 at the time of writing). "Most of the time", because sometimes it does just work, depending on where in the stream I try to begin decoding.
An error I'm occasionally getting from my decompression output callback record is OSStatus -12909 – kVTVideoDecoderBadDataErr. So far, so unhelpful.
Or I may get no error output, like in a unit test which takes fixed packets of data in and should always generate video frames out. (This test likewise fails to generate expected frames when using iOS14 on devices.)
Anyone else had any issues with HEVC decoding in iOS 14 specifically? I'm literally fishing for clues here... I've tried toggling all the usual input flags for VTDecompressionSessionDecodeFrame() (._EnableAsynchronousDecompression, ._EnableTemporalProcessing, ...)
I've also tried redoing my entire rendering layer to use AVSampleBufferDisplayLayer with the raw CMSampleBuffers. It decodes perfectly!! But I can't use it... because I need to micromanage the timing of the output frames myself (and they're not always in order).
(If it helps, the fixed input packets I'm putting into my unit test include NALUs of the following types in order: NAL_UNIT_VPS, NAL_UNIT_SPS, NAL_UNIT_PPS, NAL_UNIT_PREFIX_SEI, NAL_UNIT_CODED_SLICE_CRA, and finally NAL_UNIT_CODED_SLICE_TRAIL_N and NAL_UNIT_CODED_SLICE_TRAIL_R. I took these from a working network stream at some point in the past to serve as a basic sanity test.)
So this morning I came across a solution / workaround. It still sort of leaves the original question of "what happened?" open, but here it is; may it help someone:
The kVTVideoDecoderBadDataErr error was occurring on all NALU packets of type RASL_R or RASL_N, which typically come in from my video stream immediately after the first content frame (a CRA-type NALU).
Simply skipping these packets (i.e. not passing them to VTDecompressionSessionDecodeFrame()) has resolved the issue for me and my decoder now works fine in both iOS 13 and 14.
The section on "Random Access Support" here says "RASL frames are ... usually discarded." I wonder if iOS 13 and earlier VideoToolbox implementations discarded these frames, while newer implementations don't, leaving it in this case up to the developer?
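For reference, a minimal sketch of that skip-RASL workaround in Swift, assuming each incoming packet is a single HEVC NAL unit with its two-byte NAL header at the start; the feed function and its decode closure are hypothetical stand-ins for your own pipeline:

```swift
import Foundation

// HEVC NAL unit types from H.265 Table 7-1.
let RASL_N: UInt8 = 8
let RASL_R: UInt8 = 9

// The NAL unit type lives in bits 1..6 of the first header byte.
func hevcNALType(of nalUnit: Data) -> UInt8? {
    guard let first = nalUnit.first else { return nil }
    return (first >> 1) & 0x3F
}

// Drop RASL pictures instead of handing them to
// VTDecompressionSessionDecodeFrame(); everything else goes through.
func feed(_ nalUnit: Data, to decode: (Data) -> Void) {
    guard let type = hevcNALType(of: nalUnit) else { return }
    if type == RASL_N || type == RASL_R {
        return  // RASL frames reference data from before the CRA; skip them
    }
    decode(nalUnit)
}
```

In my case this was enough to stop the -12909 errors on iOS 14 while leaving iOS 13 behaviour unchanged.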

Transcoding fMP4 to HLS while writing on iOS using FFmpeg

TL;DR
I want to use FFmpeg on an iOS device to convert fMP4 fragments to TS segments (for HLS) as the fragments are being written.
Why?
I'm trying to achieve live uploading on iOS while maintaining a seamless, HD copy locally.
What I've tried
Rolling AVAssetWriters where each writes for 8 seconds, then concatenating the MP4s together via FFmpeg.
What went wrong - There are blips in the audio and video at times. I've identified 3 reasons for this.
1) Priming frames for audio written by the AAC encoder create gaps.
2) Since video frames are 33.33 ms long and audio frames roughly 0.022 s long (an AAC frame is 1024 samples), it's possible for them not to line up at the end of a file.
3) The lack of frame-accurate encoding, which is present on macOS but not available on iOS. Details here.
The second attempt: FFmpeg muxing a large video-only MP4 file with raw audio into TS segments. The work was based on the Kickflip SDK.
What went wrong - Every once in a while an audio-only file would get uploaded, with no video whatsoever. We were never able to reproduce it in-house, but it was pretty upsetting to our users when they didn't record what they thought they did. There were also issues with accurate seeking on the final segments, almost as if the TS segments were incorrectly timestamped.
What I'm thinking now
Apple was pushing fMP4 at WWDC this year (2016), and I hadn't looked into it much at all before that. Since an fMP4 file can be read and played while it's being written, I thought it would be possible for FFmpeg to transcode the file as it's being written as well, as long as we hold off sending the bytes to FFmpeg until each fragment within the file is finished.
However, I'm not familiar enough with the FFmpeg C API; I only used it briefly in attempt #2.
What I need from you
Is this a feasible solution? Is anybody familiar enough with fMP4 to know if I can actually accomplish this?
How will I know that AVFoundation has finished writing a fragment within the file so that I can pipe it into FFmpeg?
How can I take data from a file on disk, chunk at a time, pass it into FFmpeg and have it spit out TS segments?
Strictly speaking, you don't need to transcode the fMP4 if it contains H.264 + AAC; you just need to repackage the sample data as TS (using ffmpeg -codec copy or GPAC).
Regarding alignment (point 2 under your first attempt), I suppose this all depends on your encoder settings (frame rate, sample rate, and GOP size). It is certainly possible to make sure that audio and video align exactly at fragment boundaries (see, for example, this table). If you're targeting iOS, I would recommend using HLS protocol version 3 (or 4), which allows timing to be represented more accurately. This also allows you to stream audio and video separately (non-multiplexed).
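To make the alignment point concrete (my numbers, not part of the original answer): assuming 30 fps video and 48 kHz AAC, where an AAC frame is always 1024 samples, a fragment duration is only "clean" if it contains a whole number of both frame types. A tiny Swift check:

```swift
// Does a fragment duration hold a whole number of video frames
// and a whole number of AAC frames? Assumes 30 fps and 48 kHz.
func isCleanFragment(seconds: Double, fps: Double = 30, sampleRate: Double = 48_000) -> Bool {
    let videoFrames = seconds * fps
    let audioFrames = seconds * sampleRate / 1024      // AAC frame = 1024 samples
    let wholeVideo = abs(videoFrames - videoFrames.rounded()) < 1e-9
    let wholeAudio = abs(audioFrames - audioFrames.rounded()) < 1e-9
    return wholeVideo && wholeAudio
}

// Example results:
// isCleanFragment(seconds: 8)   -> true  (240 video frames, 375 AAC frames)
// isCleanFragment(seconds: 10)  -> false (468.75 AAC frames)
```

At 44.1 kHz the AAC frame is about 23.2 ms and exact coincidence with 30 fps video only happens at much longer intervals, which is one reason boundaries drift unless durations are chosen deliberately.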
I believe ffmpeg should be capable of pushing a live fMP4 stream (i.e. using a long-running HTTP POST), but playout requires the origin software to do something meaningful with it (i.e. stream it out as HLS).
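On the question of knowing when AVFoundation has finished a fragment: one approach is to set movieFragmentInterval on the AVAssetWriter and tail the growing file, only consuming bytes once a complete top-level box (ftyp/moov/moof/mdat) is fully on disk. This is only a sketch under that assumption (that the writer flushes whole boxes at fragment boundaries, which is worth verifying); 64-bit box sizes are ignored and the class name is made up:

```swift
import Foundation

// Tails a growing fragmented MP4 and returns only complete top-level boxes,
// so each finished moof+mdat pair can be handed to ffmpeg for repackaging.
final class FragmentTailer {
    private let handle: FileHandle
    private var offset: UInt64 = 0

    init(url: URL) throws {
        handle = try FileHandle(forReadingFrom: url)
    }

    /// Call periodically (e.g. on a timer roughly matching the fragment interval).
    func readCompleteBoxes() -> [Data] {
        var boxes: [Data] = []
        while true {
            handle.seek(toFileOffset: offset)
            let header = handle.readData(ofLength: 8)
            guard header.count == 8 else { break }          // box header not written yet
            let size = header.prefix(4).reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
            guard size >= 8 else { break }                  // malformed or 64-bit size; stop
            handle.seek(toFileOffset: offset)
            let box = handle.readData(ofLength: Int(size))
            guard box.count == Int(size) else { break }     // box still being written
            boxes.append(box)
            offset += UInt64(size)
        }
        return boxes
    }
}

// Writer side (sketch): ask AVAssetWriter for a fragment every 8 seconds.
// writer.movieFragmentInterval = CMTime(value: 8, timescale: 1)
```

Each Data chunk returned here would then be appended to whatever pipe or buffer feeds the FFmpeg remux step.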

iOS SDK avcodec_decode_video Optimization

I've recently started a project that relies on streaming FLV directly to an iOS device. As the most famous option, I went with FFmpeg (and an iOS wrapper, kxmovie). To my surprise, the iPhone 4 is incapable of playing even SD low-bitrate FLV videos. The current implementation decodes the video/audio/subtitle frames in a dispatch_async while loop and copies the YUV frame data into an object, which is then parsed into three textures - Y/U/V (in the case of an RGB color space, the data is just parsed directly) - and rendered on screen. After much trial and error, I decided to kill the whole rendering pipeline and leave only the avcodec_decode_video2 function running. Surprisingly, the FPS did not improve and videos are still unplayable.
My question is: what can I do to improve the performance of avcodec_decode_video2?
Note:
I've tried a few commercial apps and they play the same file perfectly fine with no more than 50-60% CPU usage.
The library is based on the 1.2 branch, and these are the build args:
'--arch=arm',
'--cpu=cortex-a8',
'--enable-pic',
"--extra-cflags='-arch armv7'",
"--extra-ldflags='-arch armv7'",
"--extra-cflags='-mfpu=neon -mfloat-abi=softfp -mvectorize-with-neon-quad'",
'--enable-neon',
'--enable-optimizations',
'--disable-debug',
'--disable-armv5te',
'--disable-armv6',
'--disable-armv6t2',
'--enable-small',
'--disable-ffmpeg',
'--disable-ffplay',
'--disable-ffserver',
'--disable-ffprobe',
'--disable-doc',
'--disable-bzlib',
'--target-os=darwin',
'--enable-cross-compile',
#'--enable-nonfree',
'--enable-gpl',
'--enable-version3',
And according to Instruments the following functions take about 30% CPU usage each:
Running Time         Self      Symbol Name
37023.9ms   32.3%    13874.8   ff_h264_decode_mb_cabac
34626.2ms   30.2%     9194.7   loop_filter
29430.0ms   25.6%      173.8   ff_h264_hl_decode_mb
As it turns out, even with NEON support, FFmpeg still decodes on the CPU and thus can't go any faster than that. There are apps that use FFmpeg together with the hardware decoder; my best guess would be that they strip the container headers and feed the raw H.264 data to Apple's AssetReader.
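If you can reach the hardware decoder directly, the usual route today is VideoToolbox (public API since iOS 8, so not an option for the iPhone-4-era wrappers discussed here). This is only a rough sketch of that alternative path, assuming you can pull the SPS/PPS out of the stream and wrap each access unit in a CMSampleBuffer with 4-byte length-prefixed NAL units (that packaging step is omitted):

```swift
import VideoToolbox
import CoreMedia

// Build a decompression session from raw SPS/PPS byte arrays.
func makeDecompressionSession(sps: [UInt8], pps: [UInt8]) -> VTDecompressionSession? {
    var formatDesc: CMFormatDescription?
    var session: VTDecompressionSession?

    sps.withUnsafeBufferPointer { spsPtr in
        pps.withUnsafeBufferPointer { ppsPtr in
            let paramSets = [spsPtr.baseAddress!, ppsPtr.baseAddress!]
            let paramSizes = [sps.count, pps.count]
            _ = CMVideoFormatDescriptionCreateFromH264ParameterSets(
                allocator: kCFAllocatorDefault,
                parameterSetCount: 2,
                parameterSetPointers: paramSets,
                parameterSetSizes: paramSizes,
                nalUnitHeaderLength: 4,
                formatDescriptionOut: &formatDesc)
        }
    }
    guard let format = formatDesc else { return nil }

    _ = VTDecompressionSessionCreate(
        allocator: kCFAllocatorDefault,
        formatDescription: format,
        decoderSpecification: nil,
        imageBufferAttributes: nil,
        outputCallback: nil,           // per-frame output handler used instead
        decompressionSessionOut: &session)
    return session
}

// Per access unit (sbuf: CMSampleBuffer built from the length-prefixed NALs):
// VTDecompressionSessionDecodeFrame(session, sampleBuffer: sbuf,
//     flags: [._EnableAsynchronousDecompression], infoFlagsOut: nil) {
//     status, _, imageBuffer, presentationTime, _ in
//     // imageBuffer is a hardware-decoded CVPixelBuffer
// }
```

Whether this helps for FLV input depends on demuxing the H.264 stream out of the FLV container first, which FFmpeg can still do cheaply even if you no longer use it for decoding.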
Just for the fun of it, see what kind of performance you get from this; it does seem to play FLVs quickly, but I have not tested it on the iPhone 4:
https://github.com/mooncatventures-group/WebStreamX_flv_demo
You should use the --enable-asm optimization parameter to boost performance by another 10-15%.
Also, you must install the latest gas-preprocessor.pl.

DirectX.Capture FrameRates

I'm using the DirectX.Capture library to save video from a webcam to an AVI. I need the saved video to have 50 fps or more, but when I use this:
capture.FrameRate = 59.994;
the FrameRate doesn't change at all. It was 30 before that line, and after that line it is still 30. I tried other values, even 20 and 10, and nothing changes.
What else should I do to be able to change that value? Or is it something related to my hardware, so that I can only hope it works on another machine?
Please help me, I don't know what to do.
Thanks
The source material (video, app, etc.) is probably only being updated at 30 fps, either because that is the way the video codec or app behaves, or because you have vsync turned on in the target app (check the vsync settings; it might be getting forced by the video card drivers if there is hardware acceleration). The behaviour of DirectX.Capture is probably to clamp to the highest available framerate from the source.
If you really want to make the video 50 fps, capture it at its native rate (30/29.97) and resample the video with some other software (note that this is a destructive operation, since 50 is not a clean multiple of 30). This will be no different from what DirectX.Capture would do if you could force it to 50 fps (even though that's nonsensical when the source material is at a lower framerate). FYI, most video files are between 25 and 30 fps.

Capture, encode then stream video from an iPhone to a server

I've got experience with building iOS apps but don't have experience with video. I want to build an iPhone app that streams real time video to a server. Once on the server I will deliver that video to consumers in real time.
I've read quite a bit of material. Can someone let me know if the following is correct and fill in the blanks for me.
To record video on the iPhone I should use the AVFoundation classes. When using an AVCaptureSession, the delegate method captureOutput:didOutputSampleBuffer:fromConnection: gives me access to each frame of video. Now that I have the video frame, I need to encode it.
I know that the AVFoundation classes only offer H.264 encoding via AVAssetWriter, and not via a class that easily supports streaming to a web server. Therefore, I am left with writing the video to a file.
I've read other posts that say they can use two AVAssetWriters to write 10-second blocks and then NSStream those blocks to the server. Can someone explain how to code two AVAssetWriters working together to achieve this? If anyone has code, could they please share it?
You are correct that the only way to use the hardware encoders on the iPhone is with the AVAssetWriter class, writing the encoded video to a file. Unfortunately, AVAssetWriter does not write the moov atom (which is required to decode the encoded video) to the file until the file is closed.
Thus one way to stream the encoded video to a server would be to write 10 second blocks of video to a file, close it, and send that file to the server. I have read that this method can be used with no gaps in playback caused by the closing and opening of files, though I have not attempted this myself.
I found another way to stream video here.
This example opens two AVAssetWriters. It writes the first frame to both files but immediately closes one of them so that its moov atom gets written. With that moov atom data, the second file can then be used as a pipe to get a stream of encoded video data. The example only handles video data, but it is very clean, easy-to-understand code that helped me figure out how to deal with many video issues on the iPhone.
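For the simpler write-a-block, close-it, ship-it approach from the first paragraph, here is a minimal Swift sketch of one writer cycle; the 960x540 H.264 settings and the upload hook are placeholders, and a real implementation would alternate two of these so no frames are dropped while one file is finishing:

```swift
import AVFoundation

// One cycle of the rolling-writer approach: append frames for ~10 seconds,
// then close the file (so the moov atom is written) and hand it off.
final class BlockWriter {
    private let writer: AVAssetWriter
    private let input: AVAssetWriterInput

    init(outputURL: URL) throws {
        writer = try AVAssetWriter(outputURL: outputURL, fileType: .mp4)
        input = AVAssetWriterInput(mediaType: .video, outputSettings: [
            AVVideoCodecKey: AVVideoCodecType.h264,
            AVVideoWidthKey: 960,          // placeholder dimensions
            AVVideoHeightKey: 540
        ])
        input.expectsMediaDataInRealTime = true
        writer.add(input)
    }

    func start(at time: CMTime) {
        guard writer.startWriting() else { return }
        writer.startSession(atSourceTime: time)
    }

    // Called from captureOutput(_:didOutput:from:) for each video frame.
    func append(_ sampleBuffer: CMSampleBuffer) {
        guard input.isReadyForMoreMediaData else { return }
        _ = input.append(sampleBuffer)
    }

    // After ~10 s of frames, finish the block and upload the closed file.
    func finish(thenUpload upload: @escaping (URL) -> Void) {
        input.markAsFinished()
        let url = writer.outputURL
        writer.finishWriting {
            upload(url)    // hypothetical upload hook
        }
    }
}
```

Audio would need its own AVAssetWriterInput, and the handoff between consecutive blocks is where gaps can appear, which is what the two-writer trick above avoids by streaming continuously from a single open file.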
