iOS SDK avcodec_decode_video Optimization

I've recently started a project that relies on streaming FLV directly to an iOS device. Like most people, I went with FFmpeg (and an iOS wrapper, kxmovie). To my surprise, the iPhone 4 is incapable of playing even SD, low-bitrate FLV videos. The current implementation decodes the video/audio/subtitle frames in a dispatch_async while loop and copies the YUV frame data into an object; the object is then split into three textures, Y/U/V (in the case of an RGB color space, the data is just parsed directly), and rendered on screen. After much trial and error, I decided to kill the whole rendering pipeline and leave only the avcodec_decode_video2 call running. Surprisingly, the FPS did not improve and the videos are still unplayable.
My question is: what can I do to improve the performance of avcodec_decode_video2?
Note:
I've tried a few commercial apps and they play the same file perfectly fine with no more than 50-60% CPU usage.
The library is based off the 1.2 branch and these are the build args:
'--arch=arm',
'--cpu=cortex-a8',
'--enable-pic',
"--extra-cflags='-arch armv7'",
"--extra-ldflags='-arch armv7'",
"--extra-cflags='-mfpu=neon -mfloat-abi=softfp -mvectorize-with-neon-quad'",
'--enable-neon',
'--enable-optimizations',
'--disable-debug',
'--disable-armv5te',
'--disable-armv6',
'--disable-armv6t2',
'--enable-small',
'--disable-ffmpeg',
'--disable-ffplay',
'--disable-ffserver',
'--disable-ffprobe',
'--disable-doc',
'--disable-bzlib',
'--target-os=darwin',
'--enable-cross-compile',
#'--enable-nonfree',
'--enable-gpl',
'--enable-version3',
And according to Instruments the following functions take about 30% CPU usage each:
Running Time          Self      Symbol Name
37023.9ms   32.3%   13874.8     ff_h264_decode_mb_cabac
34626.2ms   30.2%    9194.7     loop_filter
29430.0ms   25.6%     173.8     ff_h264_hl_decode_mb

As it turns out, even with NEON support, FFmpeg still runs entirely on the CPU, so it can't decode any faster than that. There are apps that use FFmpeg together with the hardware decoder; my best guess is that they strip the container header and feed Apple's AssetReader the raw H.264 data.
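If software decoding has to stay, one thing worth checking (I can't see the kxmovie setup, so this is a guess) is whether the decoder is doing more work than it needs to. Since loop_filter alone accounts for ~30% of the profile above, skipping the in-loop deblocking filter and enabling the decoder's "fast" path can claw back a noticeable amount of CPU, at some cost in picture quality. A minimal sketch against the 1.2-era API (constant names as in that branch; treat it as an illustration, not a drop-in fix):

```c
#include <libavcodec/avcodec.h>

/* Illustrative only: tweak an already-allocated AVCodecContext before
 * opening it, to reduce the per-frame CPU cost of H.264 software decoding. */
static int configure_fast_decode(AVCodecContext *ctx, AVCodec *codec)
{
    /* Skip the in-loop deblocking filter on non-reference frames
     * (AVDISCARD_ALL skips it everywhere) -- loop_filter was ~30%
     * of the profile above, so this is the biggest single lever. */
    ctx->skip_loop_filter = AVDISCARD_NONREF;

    /* Allow non-spec-compliant speed tricks in the decoder. */
    ctx->flags2 |= CODEC_FLAG2_FAST;

    /* Frame threading helps on multi-core devices; the iPhone 4 has a
     * single-core A4, so expect little from it there, but it costs
     * nothing to request. */
    ctx->thread_count = 2;
    ctx->thread_type  = FF_THREAD_FRAME;

    return avcodec_open2(ctx, codec, NULL);
}
```

Whether that is enough to reach real-time on an iPhone 4 is doubtful; the point above about moving to the hardware decoder still stands.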

Just for the fun of it, see what kind of performance you get from this; it does seem to play FLVs quickly, but I have not tested it on the iPhone 4:
https://github.com/mooncatventures-group/WebStreamX_flv_demo

You should use the --enable-asm configure option to boost performance by another 10-15%.
Also, you must install the latest gas-preprocessor.pl.

Related

VideoToolbox HEVC decoding failing for iOS14 on device

So while I'm sure I'm not about to provide enough info for anyone to fix my specific code, what I am itching to know is this:
Does anyone know what might have changed in iOS 14 regarding HEVC decoding requirements?
I have a decoder built using VideoToolbox for an HEVC encoded video stream coming over the network, that was and is working fine on iOS 13 devices, and iOS 14 simulators. But it's failing most of the time in iOS 14 (up to 14.4 at time of writing) on iOS devices. "Most of the time", because sometimes it does just work, depending on where in the stream I'm trying to begin decoding.
An error I'm occasionally getting from my decompression output callback record is OSStatus -12909 – kVTVideoDecoderBadDataErr. So far, so unhelpful.
Or I may get no error output, like in a unit test which takes fixed packets of data in and should always generate video frames out. (This test likewise fails to generate expected frames when using iOS14 on devices.)
Anyone else had any issues with HEVC decoding in iOS 14 specifically? I'm literally fishing for clues here... I've tried toggling all the usual input flags for VTDecompressionSessionDecodeFrame() (._EnableAsynchronousDecompression, ._EnableTemporalProcessing, ...)
I've also tried redoing my entire rendering layer to use AVSampleBufferDisplayLayer with the raw CMSampleBuffers. It decodes perfectly!! But I can't use it... because I need to micromanage the timing of the output frames myself (and they're not always in order).
(If it helps, the fixed input packets I'm putting into my unit test include NALUs of the following types in order: NAL_UNIT_VPS, NAL_UNIT_SPS, NAL_UNIT_PPS, NAL_UNIT_PREFIX_SEI, NAL_UNIT_CODED_SLICE_CRA, and finally NAL_UNIT_CODED_SLICE_TRAIL_N and NAL_UNIT_CODED_SLICE_TRAIL_R. I took these from a working network stream at some point in the past to serve as a basic sanity test.)
So this morning I came across a solution / workaround. It still sort of bears the original question of "what happened??" but here it is, may it help someone:
The kVTVideoDecoderBadDataErr error was occurring on all NALU packets of type RASL_R or RASL_N that were typically coming in from my video stream immediately after the first content frame (a CRA-type NALU).
Simply skipping these packets (i.e. not passing them to VTDecompressionSessionDecodeFrame()) has resolved the issue for me and my decoder now works fine in both iOS 13 and 14.
The section on "Random Access Support" here says "RASL frames are ... usually discarded." I wonder if iOS 13 and earlier VideoToolbox implementations discarded these frames, while newer implementations don't, leaving it in this case up to the developer?
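For reference, the skip described above boils down to checking the HEVC nal_unit_type, which lives in the upper six bits of the first NAL header byte (RASL_N = 8, RASL_R = 9 in the spec). A rough sketch, assuming `nalu` already points at the two-byte NAL unit header (after the start code or length prefix); the helper name is made up for the example and the session handling is omitted:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* HEVC NAL unit types for RASL pictures (ISO/IEC 23008-2, Table 7-1). */
enum { HEVC_NAL_RASL_N = 8, HEVC_NAL_RASL_R = 9 };

/* Return true if this NAL unit is a RASL picture that should not be
 * wrapped in a CMSampleBuffer and fed to
 * VTDecompressionSessionDecodeFrame(). `nalu` points at the first byte
 * of the two-byte NAL unit header. */
static bool is_rasl_nalu(const uint8_t *nalu, size_t len)
{
    if (len < 2)
        return false;
    uint8_t nal_unit_type = (nalu[0] >> 1) & 0x3F;
    return nal_unit_type == HEVC_NAL_RASL_N ||
           nal_unit_type == HEVC_NAL_RASL_R;
}
```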

Low fps (choppy video), no sound, "No JPEG data found in image" when using webcam/hdmi-usb dongle

This took me too long to figure out, so in the hopes of helping anyone out there dealing with any of these issues, I wanted to post the solution. But first, the problems:
I bought one of those cheap HDMI-to-USB dongles and connected my PS3 as a video source. In VLC the image looked crisp, but I was getting no sound and the video was really choppy. Checking the codec tab in the "info" section, I saw I was getting 1080p at 5 fps. I thought I had a defective dongle, but decided to check with other apps: tvtime/xawtv gave me a great framerate but a low resolution that I couldn't change; cheese let me set all the options, and I was getting a good framerate and good resolution (but no sound); and finally I tried OBS, which gave me a perfect result. So clearly the dongle is fine, and the problem was with VLC.
See my answer below for the solution to all those problems (and more!)
I found, through much research and experimentation, that the reason the video was choppy in VLC was that it was using the default "chroma" of YUV2, which, if I am not mistaken, is uncompressed. (You can check your webcam/dongle's capabilities by running v4l2-ctl --list-formats-ext -d /dev/video0, where /dev/video0 is your device.)
The correct setting to overcome this is mjpg. However, that results in a flood of errors saying:
[mjpeg @ 0x7f4e0002fcc0] No JPEG data found in image
This is caused by the fact that the default resolution and framerate (1080p@60fps) overwhelm what I guess is the MJPEG decoder. Setting it to 720p, or lowering the framerate to 30 fps, prevents the errors.
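To put rough numbers on why the uncompressed default struggles: YUYV 4:2:2 is 2 bytes per pixel, so 1080p@60fps needs far more bandwidth than USB 2.0 can carry, which is presumably why the dongle silently drops to ~5 fps. A quick back-of-the-envelope check (the USB figure is the usual practical estimate, not something I measured):

```c
#include <stdio.h>

int main(void)
{
    /* YUYV (YUY2) is packed 4:2:2, i.e. 2 bytes per pixel. */
    const double bytes_per_pixel = 2.0;
    const double width = 1920.0, height = 1080.0, fps = 60.0;
    const double required_mb_s = width * height * bytes_per_pixel * fps / 1e6;

    /* USB 2.0 is 480 Mbit/s on the wire; ~40 MB/s is a typical
     * real-world payload rate after protocol overhead. */
    const double usb2_practical_mb_s = 40.0;

    printf("1080p60 YUYV needs ~%.0f MB/s, USB 2.0 delivers ~%.0f MB/s\n",
           required_mb_s, usb2_practical_mb_s);            /* ~249 vs ~40 */
    printf("sustainable fps within that budget: ~%.1f\n",  /* ~9.6 */
           usb2_practical_mb_s * 1e6 / (width * height * bytes_per_pixel));
    return 0;
}
```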
Next, the sound was missing; this is because I am using PulseAudio and VLC cannot figure out which source to use.
I found the pulse source by running:
pactl list short sources
which yielded:
alsa_input.usb-MACROSILICON_USB_Video-02.multichannel-input
You can test that this is the correct source by running:
vlc pulse://alsa_input.usb-MACROSILICON_USB_Video-02.multichannel-input
I found that to combine the v4l2 video source with the correct PulseAudio source, you have to set the audio via the input-slave parameter to vlc. Unfortunately, that did not work for me as specified in the guides, and instead I had to set the video source as the slave. The final commands that worked for me were either of:
720p:
vlc pulse://alsa_input.YOUR-SOURCE-HERE-input --input-slave=v4l2:///dev/video0:chroma=mjpg:width=1280:height=720
1080p@30fps:
vlc pulse://alsa_input.YOUR-SOURCE-HERE-input --input-slave=v4l2:///dev/video0:chroma=mjpg:fps=30

Transcoding fMP4 to HLS while writing on iOS using FFmpeg

TL;DR
I want to convert fMP4 fragments to TS segments (for HLS) as the fragments are being written using FFmpeg on an iOS device.
Why?
I'm trying to achieve live uploading on iOS while maintaining a seamless, HD copy locally.
What I've tried
Rolling AVAssetWriters where each writes for 8 seconds, then concatenating the MP4s together via FFmpeg.
What went wrong - There are blips in the audio and video at times. I've identified 3 reasons for this.
1) Priming frames for audio written by the AAC encoder creating gaps.
2) Since video frames are 33.33 ms long and audio frames roughly 0.022 s (about 22 ms) long, it's possible for them not to line up at the end of a file.
3) The lack of frame-accurate encoding, which is present on Mac OS but not available on iOS (Details Here).
My second attempt: FFmpeg muxing a large video-only MP4 file with raw audio into TS segments. The work was based on the Kickflip SDK.
What went wrong - Every once in a while an audio-only file would get uploaded, with no video whatsoever. We were never able to reproduce it in-house, but it was pretty upsetting to our users when they didn't record what they thought they did. There were also issues with accurate seeking in the final segments, almost as if the TS segments were incorrectly timestamped.
What I'm thinking now
Apple was pushing fMP4 at WWDC this year (2016) and I hadn't looked into it much at all before that. Since an fMP4 file can be read, and played while it's being written, I thought that it would be possible for FFmpeg to transcode the file as it's being written as well, as long as we hold off sending the bytes to FFmpeg until each fragment within the file is finished.
However, I'm not familiar enough with the FFmpeg C API; I've only used it briefly within attempt #2.
What I need from you
Is this a feasible solution? Is anybody familiar enough with fMP4 to know if I can actually accomplish this?
How will I know that AVFoundation has finished writing a fragment within the file so that I can pipe it into FFmpeg?
How can I take data from a file on disk, chunk at a time, pass it into FFmpeg and have it spit out TS segments?
Strictly speaking, you don't need to transcode the fMP4 if it contains H.264 + AAC; you just need to repackage the sample data as TS (using ffmpeg -codec copy, or GPAC).
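Since the question specifically asks about the FFmpeg C API, here is a rough sketch of that stream-copy repackaging (fragmented MP4 in, MPEG-TS out, no re-encode). It's the standard remuxing pattern, without the chunk-feeding / custom AVIO part you would need in order to drive it from partially written fragments, so treat it as a starting point rather than a finished pipeline:

```c
#include <libavformat/avformat.h>

/* Copy all streams from an fMP4 file into an MPEG-TS file (no transcoding). */
static int remux_fmp4_to_ts(const char *in_path, const char *out_path)
{
    AVFormatContext *in = NULL, *out = NULL;
    AVPacket pkt;
    int ret;

    if ((ret = avformat_open_input(&in, in_path, NULL, NULL)) < 0)
        return ret;
    if ((ret = avformat_find_stream_info(in, NULL)) < 0)
        goto end;
    if ((ret = avformat_alloc_output_context2(&out, NULL, "mpegts", out_path)) < 0)
        goto end;

    /* One output stream per input stream, codec parameters copied as-is. */
    for (unsigned i = 0; i < in->nb_streams; i++) {
        AVStream *os = avformat_new_stream(out, NULL);
        if (!os) { ret = AVERROR(ENOMEM); goto end; }
        if ((ret = avcodec_parameters_copy(os->codecpar, in->streams[i]->codecpar)) < 0)
            goto end;
        os->codecpar->codec_tag = 0;
    }

    if ((ret = avio_open(&out->pb, out_path, AVIO_FLAG_WRITE)) < 0)
        goto end;
    if ((ret = avformat_write_header(out, NULL)) < 0)
        goto end;

    /* Read packets, rescale timestamps to the output time base, write. */
    while (av_read_frame(in, &pkt) >= 0) {
        AVStream *is = in->streams[pkt.stream_index];
        AVStream *os = out->streams[pkt.stream_index];
        av_packet_rescale_ts(&pkt, is->time_base, os->time_base);
        pkt.pos = -1;
        ret = av_interleaved_write_frame(out, &pkt);
        av_packet_unref(&pkt);
        if (ret < 0)
            break;
    }
    av_write_trailer(out);

end:
    avformat_close_input(&in);
    if (out && out->pb)
        avio_closep(&out->pb);
    avformat_free_context(out);
    return ret;
}
```

For actual HLS output you would swap the "mpegts" muxer for the "hls" (or "segment") muxer and set its segmenting options; the stream-copy part stays the same.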
Regarding alignment (points 1 and 2), I suppose this all depends on your encoder settings (frame rate, sample rate and GOP size). It is certainly possible to make sure that audio and video align exactly at fragment boundaries (see for example: this table). If you're targeting iOS, I would recommend using HLS protocol version 3 (or 4), which allows timing to be represented more accurately. This also allows you to stream audio and video separately (non-multiplexed).
I believe FFmpeg should be capable of pushing a live fMP4 stream (i.e. using a long-running HTTP POST), but playout requires origin software to do something meaningful with it (i.e. stream it out as HLS).

HLS streaming on iPad with iOS 7 / 8 causes a 10-second frozen frame - no clue why

We have a problem with HLS (H.264 MP4) on iPad devices using HLS streaming on iOS 7 and 8:
During the first 9-15 seconds (the length of the first TS segment), only the first key frame (IDR) of the second TS segment is shown, while the sound plays normally. When the second segment begins to play, the video continues as it should.
The HLS segmenter is Wowza with a 10-second segment length. The encoding software we use is TMPG, latest version (it uses x264). The funny thing is that HandBrake, XMedia Recode and Adobe Media Encoder deliver videos which work. I am aware that this hints at a bug in our encoding software, but if someone has already had this problem with another software/segmenter combination and fixed it, I would like to know what the source of the problem was.
What we already tried:
changing almost every setting so that it sticks as close as possible to Apple's recommendations
changing the GOP structure, GOP length, and encoding parameters which influence encoding efficiency
analyzing the TS segments created by Wowza; they are fine and all begin with keyframes
contacting TMPG/Apple/Wowza support
So, did anyone stumble upon this problem before? Has anyone solved it?
EDIT:
It seems that TMPGEnc uses a defective x264 implementation. The mediastreamvalidator tool from Apple returned an error stating that our TS segment "does not contain any IDR access unit with a SPS and a PPS" - which it does actually, but obviously in the wrong places if that somehow matters.
Whatever tool you use for segmenting is not ensuring that the segment begins with SPS+PPS+IDR. This could be an issue with your encoder or your segmenter. Basically, decoding cannot begin until all three of these things are encountered by the player. Try using mediafilesegmenter and mediastreamvalidator from Apple to analyze the problem.
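If you want to confirm programmatically what mediastreamvalidator is complaining about, a small libavformat probe of the first video packet in a segment is enough to see whether SPS (NAL type 7), PPS (8) and an IDR slice (5) really are at the front. A rough sketch, assuming the Annex B byte stream that the TS demuxer emits:

```c
#include <stdio.h>
#include <libavformat/avformat.h>

/* Print the H.264 NAL unit types in the first video packet of a TS segment.
 * A clean HLS segment should start with SPS (7), PPS (8) and an IDR slice (5). */
static void dump_first_packet_nals(const char *segment_path)
{
    AVFormatContext *fmt = NULL;
    AVPacket pkt;

    if (avformat_open_input(&fmt, segment_path, NULL, NULL) < 0)
        return;
    avformat_find_stream_info(fmt, NULL);

    while (av_read_frame(fmt, &pkt) >= 0) {
        int is_video = fmt->streams[pkt.stream_index]->codecpar->codec_type
                       == AVMEDIA_TYPE_VIDEO;
        if (is_video) {
            /* Walk the Annex B start codes (00 00 01, which also matches the
             * tail of 00 00 00 01) and print each NAL unit type. */
            for (int i = 0; i + 3 < pkt.size; i++) {
                if (pkt.data[i] == 0 && pkt.data[i + 1] == 0 && pkt.data[i + 2] == 1) {
                    printf("nal_unit_type = %d\n", pkt.data[i + 3] & 0x1F);
                    i += 3; /* jump past the start code */
                }
            }
        }
        av_packet_unref(&pkt);
        if (is_video)
            break;
    }
    avformat_close_input(&fmt);
}
```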

Optimize video uploads under different signal strengths

I have a question: my app is a short-video sharing application, much like Vine, but when it's used in the subway or other places with a weak signal, uploads sometimes fail and the user experience is poor.
I am a newbie at network programming and iOS. I did a lot of searching on Google and have some general sense of the options; let me sum up my findings, and please help with some suggestions.
My requirements are: 1. support resuming when an upload is interrupted; 2. upload successfully on a weak signal. I do NOT need to think about real-time problems or how to compress the video; treating the video as a plain file is totally OK. BTW, the server is REST-style, and I use POST to upload the data.
Questions:
Which is the better way to meet my requirements: using a stream (by "stream" I do NOT mean live-streaming video, just a data stream like NSOutputStream/NSInputStream, where the video is played only after all of it has been uploaded, NOT live video playing while downloading), or dividing the whole file into several chunks and uploading chunk by chunk?
Someone said using a stream is good for resource efficiency, since the stream reads the file into memory and controls the size of the buffer, and after setting up the connection with the server we use a delegate to handle failures, so it's easy to use.
Uploading chunk by chunk is said to be good for speed. I'm puzzled by this statement: when uploading by chunks, after successfully uploading one chunk we need to release the connection resources, set up another connection, and then upload again; I think all this preparation costs time.
If uploading by chunks, which chunk size would be good? One video file is almost 1 MB; someone said 8 KB is a safe choice, but......
Since the app needs to adapt to different signal strengths, is there any way to do that? For example, making the chunk size depend on the bandwidth, or some other approach (a rough sketch of this idea appears after the answers below).
Is there any private API that already supports resuming an interrupted upload, or any Apple API that can support this? My app needs to run on iOS 5 and above, so I can NOT use NSURLSession.
Is concurrent uploading a way to speed things up? If so, how would I implement it, and is there any API available?
Thank you in advance for helping a newbie like me. Thank you very much.
Your question touches on a lot of topics. iOS doesn't have a public API to stream video (such as the FaceTime components). The main issue is that sending frame by frame requires a lot of network traffic; if you instead use the normal video writer you get hardware compression, which is a lot better. There's more you can check here: Realtime Audio/Video Streaming FROM iPhone to another device (Browser, or iPhone), Upload live streaming video from iPhone like Ustream or Qik, How send to stream video from iOS device to server? and here.
If real time is not your problem, I would suggest you just use a good networking library such as MKNetworkKit or AFNetworking 2.0. They will take care of most of the aspects you asked about.
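On the adaptive-chunk-size idea from the question: a simple scheme is to aim for a fixed amount of time per request and size the next chunk from the throughput measured on the previous one, clamped to sensible bounds. This is generic logic rather than any particular Apple or library API; a sketch:

```c
#include <stdint.h>

enum {
    MIN_CHUNK = 8 * 1024,    /* never below 8 KB (weak signal)     */
    MAX_CHUNK = 512 * 1024   /* never above 512 KB (strong signal) */
};
#define TARGET_SECONDS 2.0   /* aim for roughly 2 s per chunk upload */

/* Pick the next upload chunk size from how fast the previous chunk went. */
static uint32_t next_chunk_size(uint32_t last_chunk_bytes, double last_chunk_seconds)
{
    if (last_chunk_seconds <= 0.0)
        return MIN_CHUNK;                      /* first chunk: start small */

    double bytes_per_sec = last_chunk_bytes / last_chunk_seconds;
    double next = bytes_per_sec * TARGET_SECONDS;

    if (next < MIN_CHUNK) next = MIN_CHUNK;
    if (next > MAX_CHUNK) next = MAX_CHUNK;
    return (uint32_t)next;
}
```

Combined with the server acknowledging the byte offset it has received after each chunk, this also gives you resume more or less for free: on reconnect, ask the server how far it got and continue from there.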
