I have a stream of numpy arrays as input, and using OpenCV I convert each array to a JPEG. I then stream the JPEGs to the browser with the multipart/x-mixed-replace; boundary=frame mimetype. Everything runs on the local machine.
I want to replace the JPEG stream with an RTSP stream. Which libraries/tools/tech stack should I use?
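For reference, the current pipeline looks roughly like the sketch below; Flask stands in here for whatever web framework actually serves the stream, and frames() is just a placeholder for the real numpy source.

import cv2
import numpy as np
from flask import Flask, Response

app = Flask(__name__)

def frames():
    # Placeholder for the real source: yields BGR numpy arrays.
    while True:
        yield np.zeros((480, 640, 3), dtype=np.uint8)

def mjpeg():
    for frame in frames():
        ok, jpg = cv2.imencode(".jpg", frame)  # numpy array -> JPEG bytes
        if not ok:
            continue
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + jpg.tobytes() + b"\r\n")

@app.route("/stream")
def stream():
    return Response(mjpeg(), mimetype="multipart/x-mixed-replace; boundary=frame")

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)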
I'm using ffmpeg to read an h264 RTSP stream from a Cisco 3050 IP camera and reencode it to disk as h264 (there are reasons why I'm not just using -codec:copy).
The ffmpeg version is as follows:
ffmpeg version 3.2.6 Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 6.3.0 (Alpine 6.3.0)
I've also tried with ffmpeg 2.8.14-0ubuntu0.16.04.1 and the latest ffmpeg built from source (I used this commit) and see the same behaviour as below.
The command I'm running is:
ffmpeg -rtsp_transport udp -i 'rtsp://<user>:<pw>@<ip>:554/StreamingSetting?version=1.0&action=getRTSPStream&ChannelID=1&ChannelName=Channel1' -r 10 -c:v h264 -crf 23 -x264-params keyint=60:min-keyint=60 -an -f ssegment -segment_time 60 -strftime 1 /output/%Y%m%d_%H%M%S.ts -abort_on empty_output
I get a variety of errors at a fairly steady rate of at least one per second. Here's a sample:
[rtsp @ 0x7f268c5e9220] max delay reached. need to consume packet
[rtsp @ 0x7f268c5e9220] RTP: missed 40 packets
[h264 @ 0x55b1e115d400] left block unavailable for requested intra mode
[h264 @ 0x55b1e115d400] error while decoding MB 0 12, bytestream 114567
[h264 @ 0x55b1e115d400] concealing 3889 DC, 3889 AC, 3889 MV errors in I frame
The most common one is 'error while decoding MB x x, bytestream x', and it corresponds to severe corruption in the video file when played back.
I see many references to that error message on Stack Overflow and elsewhere, but I've yet to find a satisfying explanation or workaround. It comes from this line, which appears to correspond to missing data at the end of the stream; 'left block unavailable' comes from here and also looks like missing data.
Others have suggested using -rtsp_transport tcp instead (1, 2, 3) which in my case just gives a slightly different mix of errors, and still video corruption:
[h264 @ 0x557923191b00] left block unavailable for requested intra4x4 mode -1
[h264 @ 0x557923191b00] error while decoding MB 0 28, bytestream 31068
[h264 @ 0x557923191b00] concealing 2609 DC, 2609 AC, 2609 MV errors in I frame
[rtsp @ 0x7f88e817b220] CSeq 5 expected, 0 received.
Using Wireshark I confirmed that in both UDP and TCP mode, all of the packets are making it from the camera to the PC (sequential RTP sequence numbers without any missing) which makes me think the data is being lost after it arrives at ffmpeg.
I also see similar behaviour when running the same command against a Panasonic WV-SFV110 camera, but with less frequent errors overall. Switching from UDP to TCP on the Panasonic camera reduces but does not completely eliminate the errors/corruption.
I also tried a similar command with VLC and got similar errors (cvlc rtsp://<user>:<pw>@<ip>/MediaInput/h264 :sout='#transcode{vcodec=h264}:std{access=file, mux=ts, dst="output.ts"}') -- presumably the decoding code hasn't diverged much since libav forked from ffmpeg.
The camera is plugged directly into a PoE port on the PC, so network congestion can't be a problem. Given that the PC has enough CPU to keep up with encoding the live stream, it seems to me to be a problem with ffmpeg that it still drops data from the TCP stream.
Qualitatively, there are several factors which seem to make the problem worse:
Higher video resolution
Higher system load on the machine running ffmpeg (e.g. transcoding to a low res .avi file produces fewer errors than transcoding to h264 VBR; using -codec:copy eliminates all errors except a couple while ffmpeg is starting up)
Greater motion within the camera view
What does the error mean? And what can I do about it?
Looking at the initial error message:
[rtsp @ 0x7f268c5e9220] max delay reached. need to consume packet
[rtsp @ 0x7f268c5e9220] RTP: missed 40 packets
I guess that you are losing UDP packets. The rest of the H.264 error messages are caused by receiving an incomplete bitstream.
Now the key is to isolate the issue. Is your network dropping packets? Or is your server too slow or overloaded while receiving the UDP (RTP) stream?
First I'd check the UDP receive buffer size of your OS (on Linux this is controlled by the net.core.rmem_max and net.core.rmem_default sysctls). https://access.redhat.com/documentation/en-US/JBoss_Enterprise_Web_Platform/5/html/Administration_And_Configuration_Guide/jgroups-perf-udpbuffer.html
If increasing the UDP buffer size doesn't help, use ffmpeg with -codec:copy to lower the CPU load. Do you still get errors?
Since you want to re-encode, consider using Intel Quick Sync (-vcodec h264_qsv) or some other hardware encoder to lower your CPU load.
The question is not so much whether the PC has enough CPU, but more about identifying the bottleneck in the processing pipeline. Your H.264 encoder (x264) may oversubscribe your CPU so that you get momentary peak loads that result in packet drops. Try limiting the number of threads for x264 and/or lowering the preset to 'fast' or 'faster'.
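For example, replacing the encoding options in the question's command with something along these lines (the thread count and preset here are arbitrary starting points, not tested values):
-c:v libx264 -preset faster -threads 2 -crf 23 -x264-params keyint=60:min-keyint=60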
It does sound like packet loss is an issue. Higher video resolution and greater motion both increase the bitrate of the encoded video stream, which will increase your packet loss. Depending on which packet is lost, you will see varying errors in the decoding process, as you indicated in your post.
The fact that higher system load makes things worse also indicates that packets might be dropped on the receiving machine, e.g. when ffmpeg takes too long to read them from the network because it is busy transcoding the video.
The first question is: what is your network topology? Streaming over the public Internet is a lot harder than streaming over your LAN. What kind of switches/routers are in the network?
The next question: what bitrate is your camera streaming at? Try reducing it and check the results. Be systematic in your approach (see the example command after the list below), i.e.:
Don't transcode at first.
Just receive the video.
Write it to a file.
Check for packet loss/video artifacts.
Start at a lower bitrate, e.g. 100 kbps, and increase it if no loss is evident.
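A bare receive-and-write test could look something like this (a sketch based on the command in the question, with the same placeholder credentials and URL):
ffmpeg -rtsp_transport tcp -i 'rtsp://<user>:<pw>@<ip>:554/StreamingSetting?version=1.0&action=getRTSPStream&ChannelID=1&ChannelName=Channel1' -c copy -an test.ts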
The next thing I would try is to increase the size of the receiver buffers. While I am not that familiar with ffmpeg, it looks like you can set it via recv_buffer_size as indicated here. You then need to work out a reasonably large size based on your camera configuration, enough to store e.g. a couple (5?) of seconds of video data. Check whether there are fewer artifacts, or longer periods without artifacts, as you increase the receiver buffer size.
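For example, via the rtsp demuxer's buffer_size option (a sketch; option behaviour varies between ffmpeg versions, and the 4 MB value is just an arbitrary few seconds' worth of a several-Mbps stream):
ffmpeg -rtsp_transport udp -buffer_size 4194304 -i 'rtsp://<user>:<pw>@<ip>:554/StreamingSetting?version=1.0&action=getRTSPStream&ChannelID=1&ChannelName=Channel1' -c copy test.ts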
Of course if your processor is too slow to transcode the video in real time, you will run out of buffer space sooner or later, in which case you might have to transcode to a lower resolution/bitrate, use less intensive encoder settings, or run the transcoding on a faster machine.
Also, note that adjusting receiver buffer size will not compensate for packet loss occurring on the public Internet so the above will help assuming you're streaming on a local network that supports the bitrate of the camera. If you exceed the bandwidth of the network you can expect packet loss. In that case streaming over TCP could help somewhat (at least until the receiver buffer overruns eventually).
More things you can try if the above does not help or solve the problem completely:
Sniff the incoming traffic with Wireshark or tcpdump.
Have a look at the traces. Filter the trace using "RTSP".
You should be able to see the RTP traffic, where consecutive RTP packets have increasing sequence numbers, e.g. 20, 21, 22, 23, etc. If you see missing sequence numbers, then you've got packet loss; in that case try streaming over TCP. Repeat the trace while streaming over TCP, and remember to increase the receiver buffer size when streaming over TCP as well.
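A capture for offline inspection in Wireshark can be taken along these lines (the interface name and camera address are placeholders to adapt):
tcpdump -i eth0 -s 0 -w camera.pcap host <camera-ip>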
In summary you have a pipeline architecture and you need to determine where in the pipeline the loss is occurring:
camera -> network -> receiver buffer (OS) -> application (ffmpeg)
I have a video source which gives me a raw h264 stream. I need to re-stream this live input in a way that is cross-compatible and playable without any plugin. I've tried using ffmpeg+ffserver to produce a fragmented mp4, but unfortunately my iPhone isn't playing it.
Is there a way to make it (raw h264 in an mp4 container) playable in iOS's Safari, or is there maybe another cross-platform container?
PS: I'm using a Raspberry Pi 3 to host the ffmpeg processes, so I'm avoiding re-encoding tasks; instead I'm just trying to fit my raw h264 into an iOS-compatible container and make it accessible through a media server.
For live streams you must use HTTP Live Streaming (HLS) with either the traditional MPEG-TS or fMP4 for newer iOS versions (see Apple HLS examples).
With FFmpeg you can use the hls muxer. The hls_segment_type option is used to choose between mpegts and fmp4.
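For example, a command along these lines would repackage a raw H.264 input into an fMP4 HLS playlist without re-encoding (the input name, segment length and playlist name are placeholders, and it assumes a reasonably recent ffmpeg that handles the Annex B to MP4 conversion during stream copy):
ffmpeg -i input.h264 -c:v copy -an -f hls -hls_time 4 -hls_segment_type fmp4 stream.m3u8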
I am developing a live-streaming application where the mobile application (iOS/Android) records video and audio and then encodes the raw frames to H.264 and AAC using VideoToolbox and AudioToolbox. These encoded streams are packaged into PES (Packetized Elementary Streams) separately (video & audio). Now we are stuck on what to transfer to the server, PES or MPEG-TS: which one gives the minimum latency and packet loss, as in Periscope, Meerkat, Qik, UStream and other live streaming applications?
For transmission, which network protocol is best suited: TCP or UDP?
And what is required at the server to receive these packets? I know FFmpeg will transcode and generate the segmented files (.ts) and the .m3u8 playlist for HLS streaming, but do we need any pipe before the FFmpeg layer?
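As far as I understand that last step, ffmpeg can read a muxed MPEG-TS stream from standard input and emit the HLS segments and playlist; something along these lines (just a sketch, assuming whatever server process receives the packets writes the TS bytes to ffmpeg's stdin):
ffmpeg -f mpegts -i pipe:0 -c copy -f hls -hls_time 4 -hls_list_size 6 live.m3u8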
Please give me some ideas on which is best and the pros and cons of each.
Thanks
Shiva.
I am writing a client-server application which does real-time video transmission from an Android-based phone to a server. The captured video from the phone camera is encoded using the Android-provided h264 encoder and transmitted via a UDP socket. The frames are not RTP-encapsulated; I skip that to reduce the overhead and hence the delay.
On the receiver, I need to decode the incoming encoded frames. The data sent on the UDP socket contains not only the encoded frame but also some other frame-related information as part of a header. Each frame is encoded as a NAL unit.
I am able to retrieve the frames from the received packets as a byte array. I can save this byte array as a raw h264 file and play it back using VLC, and everything works fine.
However, I need to do some processing on these frames and hence need to use them with OpenCV.
Can anyone help me with decoding a raw h264 byte array in opencv?
Can ffmpeg be used for this?
Short answer: ffmpeg and ffplay will work directly. I remember OpenCV can be built on top of those two, so it shouldn't be difficult to use the FFmpeg backend to convert the stream to cv::Mat. Follow the documentation:
OpenCV can use the FFmpeg library (http://ffmpeg.org/) as a backend to record, convert and stream audio and video. FFmpeg is a complete, cross-platform solution. If you enable FFmpeg while configuring OpenCV then CMake will download and install the binaries in OPENCV_SOURCE_CODE/3rdparty/ffmpeg/. To use FFmpeg at runtime, you must deploy the FFmpeg binaries with your application.
https://docs.opencv.org/3.4/d0/da7/videoio_overview.html
Last time I had to play with the DJI PSDK, they only allowed streaming to a UDP port (udp://192.168.5.293:23003) with H.264.
So I wrote a simple ffmpeg interface to stream to the PSDK, but I had to debug it beforehand, so I used ffplay to display the network stream to prove it was working. This is the command to show the stream; you would have to build on top of this to get it working as an OpenCV input:
ffplay -f h264 -i udp://192.168.1.45:23003
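As a rough sketch of that last step, assuming OpenCV was built with the FFmpeg backend and that ffmpeg can probe the raw H.264 stream on that port (the URL is the one from the ffplay example and may need adjusting):

import cv2

# Open the raw H.264 UDP stream through OpenCV's FFmpeg backend.
cap = cv2.VideoCapture("udp://192.168.1.45:23003", cv2.CAP_FFMPEG)
if not cap.isOpened():
    raise RuntimeError("Could not open the UDP H.264 stream")

while True:
    ok, frame = cap.read()  # frame is a BGR numpy array (cv::Mat)
    if not ok:
        break  # stream ended or a decode error occurred
    # ... per-frame processing goes here ...
    cv2.imshow("udp stream", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()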