Reduce/remove buffer lag on <video> element (iOS)

We have an FFmpeg stream being streamed to mobile devices. We're using the HTML5 <video src="..." webkit-playsinline> tag to display the video inline (inside a real-time streaming app). We've managed to reduce the delay at the FFmpeg end to the minimum, but there's still a lag at the iOS end, where the player presumably buffers for a couple of seconds.
Is there any way to reduce the client-side delay?
We need as close to real-time as possible and skipping is acceptable.

If you are using an HTML5 video tag then the iOS device will use QuickTime to play back the video. Apple offers no control over internal mechanisms such as buffer settings for its QuickTime player. For a project on Apple TV I even worked with someone at Apple in Cupertino, and they just won't allow any access to the information you would need on their device.
Typically if you use HLS:
Is this a real-time delivery system?
No. It has inherent latency corresponding to the size and duration of the media files containing stream segments. At least one segment must fully download before it can be viewed by the client, and two may be required to ensure seamless transitions between segments. In addition, the encoder and segmenter must create a file from the input; the duration of this file is the minimum latency before media is available for download. Typical latency with recommended settings is in the neighborhood of 30 seconds.
What is the latency?
Approximately 30 seconds, with recommended settings. See question #15.
For a live streaming scenario on iOS, you are better off tuning the streaming chain before the actual player:
capture -> transcoding -> upload -> streaming server -> delivery -> playback
Using FFmpeg you can tune for zero-latency streaming at the transcoding level, which I understand you have done. After that, using a well-established streaming server like Wowza and CDN delivery will help you get there (at a certain cost, of course, and assuming you need a streaming server, which you may not).
If you go fully native for your iOS app you may look at MPMoviePlayerController. I have no experience with native iOS app code, so I will let you decide whether it is worth the time (still, I doubt it will be possible because of the underlying QuickTime/HLS layer).
I also came across this, which sounds interesting, but I have not tested it, and even with such an approach you will face limitations.
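As a side note on the native route mentioned above: on current iOS versions the replacement for MPMoviePlayerController is AVPlayer, which at least exposes a couple of knobs for impatient playback. A minimal sketch (the stream URL is a placeholder), with no guarantee it beats the underlying HLS buffering:

    import AVFoundation

    // Placeholder stream URL - substitute your own.
    let url = URL(string: "https://example.com/live/stream.m3u8")!
    let player = AVPlayer(playerItem: AVPlayerItem(url: url))

    // Ask AVPlayer not to hold playback back while it builds a safety buffer.
    player.automaticallyWaitsToMinimizeStalling = false

    // Start at normal rate right away instead of waiting for the buffer to fill.
    player.playImmediately(atRate: 1.0)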
Even if it may not be the answer you were looking for I hope this helps.

Related

What options exist to stream video from iOS to browser?

I'm looking for a way to implement real-time streaming of video (and optionally audio) from an iOS device to a browser. In this case the iOS device is the server and the browser is the client.
Video resolution must be in the range 800x600-1920x1080. Probably the most important criterion is lag, which should be less than 500 ms.
I've tried a few approaches so far.
1. HLS
Server: Objective-C, AVFoundation, UIKit, custom HTTP-server implementation
Client: JS, VIDEO tag
Works well. Streams smoothly. The VIDEO tag in the browser handles the incoming video stream out of the box. This is great! However, it has lags that are hard to minimize. It feels like this protocol was built for non-interactive video streaming, something like Twitch where a few seconds of lag is fine.
I tried enabling Low-Latency HLS. A lot of requests, and a lot of hassle with the playlist. Let me know if this is the right option and I just have to push harder in this direction.
2. Compress every frame into JPEG and send to a browser via WebSockets
Server: Objective-C, AVFoundation, UIKit, custom HTTP-server implementation, WebSockets server
Client: JS, rendering via IMG tag
Works super-fast and super-smooth. Latency is 20-30 ms! However, when I receive a frame in the browser, I have to load it from a Blob via a base64-encoded URL. At the start all of this works fast and smoothly, but after a while the browser starts to slow down and lag. I'm not sure why; I haven't investigated too deeply yet. Another issue is that frames compressed as JPEGs are much larger (60-120 KB per frame) than the MP4 video stream of HLS. This means that more data is pumped through WiFi, and other WiFi consumers start to struggle. This approach works but doesn't feel like a perfect solution.
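For reference, the iOS-side half of this approach (grabbing each captured frame and compressing it to JPEG before pushing it over the socket) can be sketched roughly as below; the capture delegate and JPEG conversion are standard AVFoundation/UIKit calls, while send(_:) is a hypothetical stand-in for whatever WebSocket server implementation is used:

    import AVFoundation
    import CoreImage
    import UIKit

    final class FrameStreamer: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
        private let ciContext = CIContext()

        // Called by AVCaptureVideoDataOutput for every captured frame.
        func captureOutput(_ output: AVCaptureOutput,
                           didOutput sampleBuffer: CMSampleBuffer,
                           from connection: AVCaptureConnection) {
            guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

            // Convert the raw pixel buffer to JPEG; quality trades frame size against fidelity.
            let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
            guard let cgImage = ciContext.createCGImage(ciImage, from: ciImage.extent),
                  let jpeg = UIImage(cgImage: cgImage).jpegData(compressionQuality: 0.5) else { return }

            send(jpeg) // push the binary frame to connected browsers
        }

        private func send(_ data: Data) {
            // Hypothetical hook into the WebSocket server implementation.
        }
    }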
Any ideas or hints (frameworks, protocols, libraries, approaches, etc.) are appreciated!
HLS
… It feels like this protocol was built for non-interactive video streaming …
Yep, that's right. The whole point of HLS was to utilize generic HTTP servers as media streaming infrastructure, rather than proprietary streaming servers. As you've seen, several tradeoffs are made. The biggest problem is that media is chunked, which naturally causes latency of at least one chunk duration. In practice, it ends up being a couple of chunk durations.
"Low latency" HLS is a hack to return to the methods we had before HLS, with servers that just stream content from the origin, in a way compatible with all the HLS stuff we have to deal with now.
Compress every frame into JPEG and send to a browser via WebSockets
In this case, you've essentially recreated a video codec and added the overhead of WebSockets. Also, by base64-encoding rather than sending binary, you're adding extra CPU and memory requirements, as well as ~33% overhead in bandwidth.
If you really wanted to go this route, you could simply use MediaRecorder and an HTTP PUT request: stream the output of the recorder to the server and have the server relay it on to the client over HTTP. The client then just needs a <video> tag referencing some URL on the server, and nothing special for playback. You'll get nice low latency without all the overhead and hassle.
However, don't go that route. What happens if the bandwidth drops out? What if some packets are lost and you need to re-sync? How will you set up communication between each end to continually adjust quality, buffering, codec negotiation, etc.? What if peer-to-peer connections are advantageous?
Use WebRTC
It's a full purpose-built stack for maintaining low latency. Libraries are available for almost any stack on almost any platform. It works in browsers.
Rather than reinventing all of this, you can take advantage of what's there.
The downside is complexity... it isn't easy to get started with, but it is well worth it for most low-latency use cases.

How to reduce initial buffering time of AVPlayer to play video?

Hello Friends,
I am working on an OTT platform app, and I need to play video very smoothly without any delay, with Snapchat and Instagram as references. I am using Cloudinary for uploading videos and everything works well, but the first time, AVPlayer takes 1-2 seconds to start the video, which is a bad thing for me. Once a video has played, the next time I come to the same video it plays smoothly with a delay of at most half a second.
From what I have tried to learn through different blogs and Stack Overflow answers, I gather this is the default AVPlayer buffering time, and that it depends on the video duration and on fetching video information like title, metadata, etc. But I don't need to use that information anywhere.
I tried setting the AVPlayer property automaticallyWaitsToMinimizeStalling to false, but still no luck.
I tried a few solutions from Stack Overflow posts, but without success:
How to reduce iOS AVPlayer start delay
This is a demo video link which you can try: http://res.cloudinary.com/dtzhnffrp/video/upload/v1621013678/1on1/bgasthklqvixukocv6xy.mov
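For reference, the knobs that answers like the one linked above usually combine are roughly the following; the numeric values are placeholders to experiment with, not recommendations:

    import AVFoundation

    let url = URL(string: "http://res.cloudinary.com/dtzhnffrp/video/upload/v1621013678/1on1/bgasthklqvixukocv6xy.mov")!
    let item = AVPlayerItem(url: url)

    // Ask for a small forward buffer so playback can begin sooner.
    item.preferredForwardBufferDuration = 1

    // Cap the starting bitrate so the first seconds need less data (only matters for HLS/ABR sources).
    item.preferredPeakBitRate = 1_000_000

    let player = AVPlayer(playerItem: item)
    player.automaticallyWaitsToMinimizeStalling = false

    // Pre-buffer before the user taps play (once the item reports .readyToPlay), then start instantly.
    player.preroll(atRate: 1.0) { finished in
        if finished { player.play() }
    }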
If you can suggest what I can use on OTT platforms to play video smoothly, I would be really grateful...
Thanks In Advance
Most streaming services use ABR (adaptive bitrate streaming), which creates multiple resolution copies of the video and breaks each into chunks, typically 2-10 seconds long.
One benefit of ABR is that, to speed up playback start-up, the video can start on a lower-resolution bitrate and then 'step up' to higher bitrates as it proceeds.
You can often see this on popular streaming services where you will see the video quality is lower when the video starts and improves after a short time.
See here for more on ABR: https://stackoverflow.com/a/42365034/334402
This requires you to do work on the server side to prepare the video for HLS and DASH streaming, the two most common ABR streaming protocols.
Typically dedicated streaming servers, or a combination of encoders and packagers, are used to prepare and serve the ABR streams. There are also cloud services, for example AWS Media Services or Azure Media Services, which support on-demand streaming models.
You can also make the videos smaller, either by reducing the dimensions or by compressing them more. Both of these lower startup time, but sacrifice quality in exchange.
Cloudinary will create ABR versions for you, but the last I checked, you pay for each version created.

How to build a simple video streaming server?

I am a newbie in video streaming and I just built a sample website which plays videos. Here I just give the video file location to the video tag in HTML5. I noticed that on YouTube the video tag contains a blob URL, and had a look into this. I found that the video data comes in segments, and came across a term called pseudo-streaming. Meanwhile, it seems like the website I built downloads the whole file and plays the video. I am not trying to do any live streaming, just trying to stream local videos. I thought maybe the way video data is received in segments is done by a video streaming server. I came across the RED5 open-source streaming server, but most of the examples given are for live streaming, which I am not experimenting with. It's been a few days and I am not sure whether I am on the right track.
The segmented approach you refer to is there to support Adaptive Bit Rate (ABR) streaming.
ABR allows the client device or player to download the video in chunks, e.g. 10-second chunks, and select the next chunk from the bitrate most appropriate to the current network conditions. See here for an example:
https://stackoverflow.com/a/42365034/334402
For your existing site, so long as your server supports range requests, you are probably not actually downloading the whole video. With range requests, the browser or player requests just part of the file at a time, so it can start playback before the whole file is downloaded.
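To make the mechanism concrete, this is roughly what a range request looks like; the URL is a placeholder, and a 206 Partial Content status in the response tells you the server honours ranges:

    import Foundation

    // Placeholder URL - point this at your own MP4.
    var request = URLRequest(url: URL(string: "https://example.com/videos/sample.mp4")!)

    // Ask for only the first 1 MB instead of the whole file.
    request.setValue("bytes=0-1048575", forHTTPHeaderField: "Range")

    URLSession.shared.dataTask(with: request) { data, response, _ in
        if let http = response as? HTTPURLResponse {
            // 206 = Partial Content: the server supports range requests.
            print("Status:", http.statusCode, "- bytes received:", data?.count ?? 0)
        }
    }.resume()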
For MP4 files, it is worth noting that you need to have the header information, which is contained in a 'block' or 'atom' called the moov atom, at the start of the file rather than the end - many encoders place it at the end by default. There are a number of tools which will allow you to move it to the start, e.g.:
http://multimedia.cx/eggs/improving-qt-faststart/
You are definitely on the right track with your investigations - video hosting and streaming is a specialist area, so it is generally easier to leverage existing streaming technologies and services rather than building them yourself. Some good places to look to get a feel for open-source solutions:
https://gstreamer.freedesktop.org
http://www.videolan.org/vlc/streaming.html

How does HLS video on iOS pick which rendition to start out with?

I heard from a WWDC video that it measures the speed of previous HLS downloads to pick which rendition to use, but how does it choose which one to use at the very start? Is the download speed of the file listing the renditions, or the download speed of the file for a specific rendition, used at all? I want to make sure that I'm not tricking the video player into using too high a quality of rendition by loading metadata files instantly from the cache.
It picks the first entry:
The first entry in the variant playlist will be played at the initiation of a stream and is used as part of a test to determine which stream is most appropriate. The order of the other streams is irrelevant. Therefore, the first bit rate in the playlist should be the one that most clients can sustain.
From the Bit rate recommendations section of Apple's Technical Note TN2224:
Best Practices for Creating and Deploying HTTP Live Streaming Media for the iPhone and iPad
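As a concrete illustration (the bandwidths, resolutions and URIs below are made up), a variant playlist ordered this way would start most clients on the modest 800 kbps rendition and let them step up from there:

    #EXTM3U
    #EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
    low/index.m3u8
    #EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
    mid/index.m3u8
    #EXT-X-STREAM-INF:BANDWIDTH=6000000,RESOLUTION=1920x1080
    high/index.m3u8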

How to synchronize audio playback on 2 or more iOS devices?

I would like to write a web application that allows me to sync audio playback of an MP3 down to ~50ms, or close enough that the human ear can't detect the difference.
The idea would be that two or more smartphones could each be paired to a bluetooth speaker, and two or more speakers would play the same audio at the exact same time.
How would you suggest I go about setting this up, both client-side and server-side? I'm planning to use Rails/Ruby for the backend, and iOS/Objective-C for mobile dev.
I had thought of syncing to a global/atomic clock on the server, and having the server provide instructions to clients on when to start playing or jump into an already playing track. My concern is that, if I want to stream the audio, it will be impossible to load a song into memory and start playback accurately at the millisecond level.
Thoughts?
The jitter in internet packet delivery will be too large, so forget about syncing over the internet. However, you could check the accuracy of NTP, which is still used by the OS (I guess; I know that older UNIX systems used it) when you switch on automatic date/time in Settings, but my guess is that it won't be good enough either. Perhaps the OS also uses other time sources like GPS; I don't know how iOS does it, but accuracy within 20 ms is not to be expected. You could create an experimental app to check it out.
So, what's left is a sync closer to home, meaning between the devices directly. Of course you need to make sure that all devices have loaded (enough of) the song, and have preloaded it in AVAudioPlayer or whatever you're using, to be able to start playing immediately. (It may actually not be the best idea to use higher-level AVAudioPlayer APIs, as they may give higher delays, and what is more important, higher jitter, than lower-level APIs.)
Here are three ideas (one device needs to be the master that triggers the start of playback; the others are slaves waiting for the trigger):
Use an audio trigger pulse, like a high tone of a defined length and frequency. Then use FFT to recognise this tone.
Connect the devices via GameKit Bluetooth and transmit the trigger on these connections.
Use the iPhone 4+ flash as trigger: flash in a certain pattern. This would require you to sample the video data which is quite doable and can be very fast.
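Whichever trigger is used, the actual start on each device can then be scheduled against the device's audio clock; a minimal sketch with AVAudioPlayer (the file path and the half-second offset are arbitrary examples) looks like this:

    import AVFoundation

    // Assumes the track has already been downloaded locally, so no network delay is involved.
    let trackURL = URL(fileURLWithPath: "/path/to/track.mp3") // placeholder path
    let player = try! AVAudioPlayer(contentsOf: trackURL)
    player.prepareToPlay() // preload buffers so the actual start is immediate

    // Schedule playback a fixed offset into the future on the audio hardware clock;
    // every device applies the same offset relative to the agreed trigger moment.
    let startTime = player.deviceCurrentTime + 0.5
    player.play(atTime: startTime)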
I'm going with a solution that uses an atomic clock for synchronization, and an external service that allows server instructions/messages to be sent to all devices in close sync.
