I am working on an iOS app that uses WebRTC's native SDK to provide access to streams from different cameras. The codec used is H.264/AVC.
Although most camera streams work perfectly fine, some consistently freeze when they are first launched. It looks like the frames are not being decoded, but I am not sure how to go about fixing it.
When I enable debug logging, I see a lot of the following in WebRTC's logs:
(rtp_frame_reference_finder.cc:240): Generic frame with packet range [21170, 21170] has no GoP, dropping frame.
(rtp_frame_reference_finder.cc:240): Generic frame with packet range [21169, 21169] has no GoP, dropping frame.
(video_receive_stream.cc:699): No decodable frame in 200 ms, requesting keyframe.
(video_receive_stream.cc:699): No decodable frame in 200 ms, requesting keyframe.
When there is a freeze, VideoBroadcaster::OnFrame in video_broadcaster.cc is never called, which prevents the entire stream flow from starting. When I test in Xcode and pause/unpause the debugger, the stream will almost always start working: I see VideoBroadcaster::OnFrame getting fired and frames start being decoded. So somehow the pause/unpause process fixes the issue and kicks off the stream.
Separately, on the iOS SDK side the encoders are never set up. I have used the RTCVideoEncoderFactoryH264 encoder factory provided by the SDK, provided my own implementation of the RTCVideoEncoderFactory interface/protocol, and also tried overriding the encoders in the SDK. In all of these cases, the createEncoder() function is never called. There are no issues with the decoder, however; it sets up correctly.
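For reference, the factories get wired up more or less like this (a minimal Swift sketch using the stock SDK class names; the exact initializer may vary between WebRTC builds):

```swift
import WebRTC

// Minimal sketch of how the H.264 encoder/decoder factories are wired into
// the peer connection factory. Class names are from the stock ObjC/Swift SDK.
let encoderFactory = RTCVideoEncoderFactoryH264()
let decoderFactory = RTCVideoDecoderFactoryH264()
let factory = RTCPeerConnectionFactory(encoderFactory: encoderFactory,
                                       decoderFactory: decoderFactory)
// Note: as far as I can tell, createEncoder() is only invoked once the
// connection actually needs to encode outgoing video, so for a receive-only
// camera stream it may never be called at all.
```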
In the RTCInboundRTPVideoStream stats report, PLICount and NACKCount are steadily increasing. My understanding is that the receiver is letting the other peer know there is picture loss in the encoded video.
Since I don't know what exactly is preventing the frames from being decoded, I would like to restart the stream whenever I see PLICount or NACKCount increasing.
How can I do that without going through the whole SDP offer/answer process? The only way I see is to toggle the isEnabled flag on RTCMediaStreamTrack but that doesn't fix the problem for me.
Are there any encoding/decoding parameters I can update to restart the stream?
What could be the reason for pausing/unpausing the debugger fixing the issue?
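For what it's worth, detecting the rising counters is the easy half. Here is a minimal Swift sketch, assuming the SDK build exposes the standard statistics(completionHandler:) API and the W3C "inbound-rtp" stat keys (the exact Swift spelling may differ between builds); what to actually do inside onPictureLoss is exactly the open question above:

```swift
import WebRTC

// Sketch: poll the standard stats and fire a callback whenever pliCount grows.
// `onPictureLoss` is a hypothetical hook where the stream restart would go.
final class PictureLossWatcher {
    private let peerConnection: RTCPeerConnection
    private let onPictureLoss: () -> Void
    private var lastPliCount = 0.0

    init(peerConnection: RTCPeerConnection, onPictureLoss: @escaping () -> Void) {
        self.peerConnection = peerConnection
        self.onPictureLoss = onPictureLoss
    }

    // Call this periodically, e.g. from a Timer.
    func poll() {
        peerConnection.statistics { [weak self] report in
            guard let self = self else { return }
            for stat in report.statistics.values where stat.type == "inbound-rtp" {
                if let pli = stat.values["pliCount"] as? NSNumber,
                   pli.doubleValue > self.lastPliCount {
                    self.lastPliCount = pli.doubleValue
                    self.onPictureLoss()
                }
            }
        }
    }
}
```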
I'm looking for a way to implement real-time streaming of video (and optionally audio) from an iOS device to a browser. In this case the iOS device is the server and the browser is the client.
Video resolution must be in the range 800x600 to 1920x1080. Probably the most important criterion is lag, which should be less than 500 ms.
I've tried a few approaches so far.
1. HLS
Server: Objective-C, AVFoundation, UIKit, custom HTTP-server implementation
Client: JS, VIDEO tag
Works well. Streams smoothly. The VIDEO tag in the browser handles the incoming video stream out of the box. This is great! However, it has lag that is hard to minimize. It feels like this protocol was built for non-interactive video streaming, something like Twitch where a few seconds of lag is fine.
I tried enabling Low-Latency HLS: a lot of requests, a lot of hassle with the playlist. Let me know if this is the right option and I just have to push harder in this direction.
2. Compress every frame into JPEG and send to a browser via WebSockets
Server: Objective-C, AVFoundation, UIKit, custom HTTP-server implementation, WebSockets server
Client: JS, rendering via IMG tag
Works super-fast and super-smooth. Latency is 20-30 ms! However, when I receive a frame in the browser, I have to load it from a Blob via a base64-encoded URL. At the start all of this works fast and smoothly, but after a while the browser starts to slow down and lag. I'm not sure why; I haven't investigated too deeply yet. Another issue is that frames compressed as JPEGs are much larger (60-120 KB per frame) than the MP4 video stream of HLS. This means more data is pumped through WiFi, and other WiFi consumers start to struggle. This approach works but doesn't feel like a perfect solution.
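For context, the server side of this approach boils down to a per-frame JPEG conversion in the capture callback. A rough Swift sketch (the actual implementation here is Objective-C, the 0.5 compression quality is only illustrative, and sendFrame is a stand-in for the WebSocket layer):

```swift
import AVFoundation
import UIKit

// Sketch of approach 2's server side: convert each captured frame to JPEG
// and hand it to the WebSocket layer that pushes it to the browser.
final class FrameStreamer: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    var sendFrame: ((Data) -> Void)?          // stand-in for the WebSocket send
    private let ciContext = CIContext()       // reuse; creating one per frame is slow

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        let image = CIImage(cvPixelBuffer: pixelBuffer)
        guard let cgImage = ciContext.createCGImage(image, from: image.extent),
              let jpeg = UIImage(cgImage: cgImage).jpegData(compressionQuality: 0.5)
        else { return }
        sendFrame?(jpeg)   // send as a binary WebSocket frame, not base64
    }
}
```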
Any ideas or hints (frameworks, protocols, libraries, approaches, e.t.c.) are appreciated!
HLS
… It feels like this protocol was built for non-interactive video streaming …
Yep, that's right. The whole point of HLS was to utilize generic HTTP servers as media streaming infrastructure, rather than using proprietary streaming servers. As you've seen, several tradeoffs are made. The biggest problem is that media is chunked, which naturally causes latency of at least the duration of a chunk. In practice, it ends up being a couple of chunks.
"Low latency" HLS is a hack to return to the methods we had before HLS, with servers that just stream content from the origin, in a way compatible with all the HLS stuff we have to deal with now.
Compress every frame into JPEG and send to a browser via WebSockets
In this case, you've essentially recreated a video codec and added the overhead of Web Sockets. Also, by base64 encoding rather than sending binary, you're adding extra CPU and memory requirements, as well as ~33% overhead in bandwidth.
If you really wanted to go this route, you could simply use MediaRecorder and an HTTP PUT request: stream the output of the recorder to the server, which relays it on to the client over HTTP. The client then just needs a <video> tag referencing some URL on the server, and nothing special for playback. You'll get nice low latency without all the overhead and hassle.
However, don't go that route. Suppose the bandwidth drops out? What if some packets are lost and you need to re-sync? How will you set up communication between each end to continually adjust quality, buffering, codec negotiation, etc.? What if peer-to-peer connections are advantageous?
Use WebRTC
It's a full purpose-built stack for maintaining low latency. Libraries are available for most any stack on most any platform. It works in browsers.
Rather than reinventing all of this, you can take advantage of what's there.
The downside is complexity... it isn't easy to get started with, but well worth it for most low latency use cases.
So while I'm sure I'm not about to provide enough info for anyone to fix my specific code, what I am itching to know is this:
Does anyone know what might have changed in iOS 14 regarding HEVC decoding requirements??
I have a decoder built using VideoToolbox for an HEVC-encoded video stream coming over the network that was, and is, working fine on iOS 13 devices and iOS 14 simulators. But it's failing most of the time on iOS 14 devices (up to 14.4 at the time of writing). "Most of the time", because sometimes it does just work, depending on where in the stream I'm trying to begin decoding.
An error I'm occasionally getting from my decompression output callback record is OSStatus -12909 – kVTVideoDecoderBadDataErr. So far, so unhelpful.
Or I may get no error output, like in a unit test that takes fixed packets of data in and should always generate video frames out. (This test likewise fails to generate the expected frames when running iOS 14 on devices.)
Anyone else had any issues with HEVC decoding in iOS 14 specifically? I'm literally fishing for clues here... I've tried toggling all the usual input flags for VTDecompressionSessionDecodeFrame() (._EnableAsynchronousDecompression, ._EnableTemporalProcessing, ...)
I've also tried redoing my entire rendering layer to use AVSampleBufferDisplayLayer with the raw CMSampleBuffers. It decodes perfectly!! But I can't use it... because I need to micromanage the timing of the output frames myself (and they're not always in order).
(If it helps, the fixed input packets I'm putting into my unit test include NALUs of the following types in order: NAL_UNIT_VPS, NAL_UNIT_SPS, NAL_UNIT_PPS, NAL_UNIT_PREFIX_SEI, NAL_UNIT_CODED_SLICE_CRA, and finally NAL_UNIT_CODED_SLICE_TRAIL_N and NAL_UNIT_CODED_SLICE_TRAIL_R. I took these from a working network stream at some point in the past to serve as a basic sanity test.)
So this morning I came across a solution/workaround. It still sort of leaves the original question of "what happened??" open, but here it is; may it help someone:
The kVTVideoDecoderBadDataErr error was occurring on all NALU packets of type RASL_R or RASL_N, which typically arrive in my video stream immediately after the first content frame (a CRA-type NALU).
Simply skipping these packets (i.e. not passing them to VTDecompressionSessionDecodeFrame()) has resolved the issue for me and my decoder now works fine in both iOS 13 and 14.
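In case it saves anyone some typing, the check itself is tiny: the NAL unit type lives in bits 1-6 of the first byte of the HEVC NAL header, and RASL_N/RASL_R are types 8 and 9 in the H.265 spec. A rough Swift sketch, assuming nalu starts at the NAL header (i.e. after the Annex B start code or length prefix):

```swift
import Foundation

// HEVC NAL unit types for RASL slices (H.265 spec): RASL_N = 8, RASL_R = 9.
private let raslNaluTypes: Set<UInt8> = [8, 9]

/// Returns true if this NALU is a RASL slice and should not be passed
/// to VTDecompressionSessionDecodeFrame().
func isRaslNalu(_ nalu: Data) -> Bool {
    guard let firstByte = nalu.first else { return false }
    let naluType = (firstByte >> 1) & 0x3F   // bits 1-6 of the first header byte
    return raslNaluTypes.contains(naluType)
}
```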
The section on "Random Access Support" here says "RASL frames are ... usually discarded." I wonder if iOS 13 and earlier VideoToolbox implementations discarded these frames, while newer implementations don't, leaving it in this case up to the developer?
I'm currently experiencing an intermittent issue with some VOIP WebRTC voice calls.
The symptom is that the outbound audio can sometimes fade in and out, sounding extremely muffled or even disappearing momentarily. The two audio files referenced here are snippets from a good call and a bad call, both made very close together. The audio is captured server-side.
Good quality call - https://s3-eu-west-1.amazonaws.com/audio-samples-mlcl/Good.mp3
Poor quality call - https://s3-eu-west-1.amazonaws.com/audio-samples-mlcl/Poor.mp3
The tech stack consists of:
Electron application running on Mac/Windows
Electron wraps Chromium v66
WebRTC used within Chromium
OPUS codec used from client to server.
Wired network connection (stats show no packet loss, and jitter, RTT and delay are all very low)
SRTP used for media between client and TURN server (Coturn)
Coturn
Janus WebRTC Gateway
Freeswitch
The users have high-quality headsets, and we have tested headsets from various manufacturers connected to the Mac/Windows machines over USB.
Any ideas/help would be greatly appreciated.
This might be a result of auto gain control. Try disabling it by passing autoGainControl: false to getUserMedia. In Chrome/Electron, the legacy googAutoGainControl and googAutoGainControl2 constraints might still work.
My iOS app uses AVPlayer to decode H.264 videos with AAC audio tracks out of local device storage. Content with bit-rate spikes causes the audio to drop out shortly (less than a second) after the spike is played, yet video playback continues normally. Playing the videos through Safari seems to work fine, and this behavior is repeatable on several models of iPhone, ranging from the 6s through the 8 Plus.
I've been looking for any messages generated, delegates called with error information, or interesting KVOs, but there's been no helpful information so far. What might I do to get some sort of more detailed information that can point me in the right direction?
It turned out that the AVPlayer was configured to load its data through custom methods, and the implementation of those methods failed to follow the pattern of satisfying each request completely. (Apple's docs are a bit vague about this.) The video portion of the AVPlayer asked for more data repeatedly, so eventually all of its data got pulled. The audio portion, however, patiently waited for the data to come in, because neither was an error state reported nor was all the data provided -- the presumption being that it was still pending.
So, in short, it sounds like there are provisions in the video-handling code to treat missing data as a stall of some sort and plow onward, whereas the audio side doesn't have that feature. Not a bad design -- if audio cuts out it's very noticeable, and audio is also by far the smaller stream, so a stall there is much less likely.
Despite my spending quite a few days on the problem before posting, the lack of any useful signals made it hard to chase down. I eventually reasoned that if there's no error in producing output from the stream, the problem must be in the delivery of the stream, and it revealed itself once I started tweaking the data-loading code.
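For anyone hitting the same symptom: assuming the custom loading goes through an AVAssetResourceLoaderDelegate (the exact mechanism isn't named above, so treat this purely as an illustration), the pattern the player expects is to deliver every byte a request asked for and then finish it, or finish it with an error. Roughly:

```swift
import AVFoundation

// Sketch of the "satisfy each request completely" pattern for a custom loader.
// `loadBytes` is a hypothetical data source for the locally stored media.
final class CustomLoader: NSObject, AVAssetResourceLoaderDelegate {
    func resourceLoader(_ resourceLoader: AVAssetResourceLoader,
                        shouldWaitForLoadingOfRequestedResource loadingRequest: AVAssetResourceLoadingRequest) -> Bool {
        guard let dataRequest = loadingRequest.dataRequest else { return false }

        loadBytes(offset: dataRequest.requestedOffset,
                  length: dataRequest.requestedLength) { result in
            switch result {
            case .success(let data):
                dataRequest.respond(with: data)
                // Only finish once the *entire* requested range has been delivered;
                // otherwise the audio track will sit waiting for it indefinitely.
                loadingRequest.finishLoading()
            case .failure(let error):
                // Equally important: report failures instead of going silent.
                loadingRequest.finishLoading(with: error)
            }
        }
        return true
    }

    // Hypothetical: fetch `length` bytes starting at `offset` from local storage.
    private func loadBytes(offset: Int64, length: Int,
                           completion: @escaping (Result<Data, Error>) -> Void) {
        // ...
    }
}
```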
I am a newbie to multimedia work. I want to capture audio as samples and transfer them to another iOS device over the network. How should I start? I have gone through Apple's multimedia guide and the SpeakHere example, but it is full of C++ code, and it writes to a file before starting the services; I need the samples in a buffer. Please help me start my work in the right way.
Thanks in advance
I just spent a bunch of time working on real-time audio stuff. You can use AudioQueue, but it has latency issues of around 100-200 ms.
If you want to do something like the T-Pain app, you have to use one of:
RemoteIO API
Audio Unit API
They are equally difficult to implement, so I would just pick the RemoteIO path.
Source can be found here:
http://atastypixel.com/blog/using-remoteio-audio-unit/
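If diving straight into the raw RemoteIO C API feels like a lot, a gentler way to get at the samples buffer-by-buffer is an AVAudioEngine input tap, which sits on top of the same Remote I/O unit (this is not what the link above covers, just an easier starting point; the buffer size below is only illustrative):

```swift
import AVFoundation

// Minimal sketch: tap the microphone input and receive raw PCM buffers that
// can be handed to your own network layer. AVAudioSession configuration and
// the microphone permission prompt are omitted here.
let engine = AVAudioEngine()
let input = engine.inputNode
let format = input.outputFormat(forBus: 0)

input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
    // floatChannelData points at the raw samples for each channel.
    guard let channelData = buffer.floatChannelData else { return }
    let samples = UnsafeBufferPointer(start: channelData[0],
                                      count: Int(buffer.frameLength))
    // ... encode/packetize `samples` and send them over the network ...
    _ = samples
}

do {
    try engine.start()
} catch {
    print("Audio engine failed to start: \(error)")
}
```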
I have upvoted the answer above, but I wanted to add a piece of information that took me a while to figure out. When using AudioQueue for recording, the intuitive notion is that the callback fires at regular intervals corresponding to the number of samples requested. That notion is incorrect: AudioQueue seems to gather samples for a long period of time and then deliver them in very fast iterations of the callback.
In my case, I was using 20 ms buffers and receiving 320 samples per callback (which works out to a 16 kHz sample rate). When printing out the timestamps of the calls, I noticed a pattern of one call every 2 ms, then after a while one call taking ~180 ms. Since I was doing VoIP, this presented as an increasing delay on the receiving end. Switching to Remote I/O seems to have solved the issue.