How can I stream RTSP video with OpenCV without decoding? - opencv

So I'd like to read video from an RTSP stream using OpenCV and immediately save the raw video to a file without decoding. We need high performance and might be processing as many as 500 cameras on one machine. That means that even though the cost per stream is typically small, around 2% CPU, the extra 10x CPU needed to decode every frame adds up. When I run the ffmpeg command line, it uses so little CPU that it shows 0.0%.
The recommendation I've seen before is to just use ffmpeg. The reasons I'd like to use OpenCV are:
A) We might need to do some image analysis in the future. It wouldn't be on all frames, just a small sample of them, so every frame still needs to be saved to file without decoding, while some are separately decoded.
B) It is simpler to implement than ffmpeg (which in the future would need to pass selected frames to OpenCV).
Edit: I've tried using VideoWriter::fourcc('X', '2', '6', '4') and -1 in order to try to skip encoding. It seems to go ahead and encode anyway, even though the stream is already in that format?
Thoughts? Thanks!

Related

MP3 radio Stream buffer underrun detection

Any pointers on detecting, through a script on Linux, that an MP3 radio stream is breaking up? I am having issues with my radio station when the internet connection slows down, which causes the stream on the client side to stop, buffer, and then play.
There are a few ways to do this.
Method 1: Assume constant bitrate
If you know that you will have a constant bitrate, you can measure that bitrate over time on the server and determine when it slows below a threshold. Note that this isn't the most accurate method, and won't always work. Not all streams use a constant bitrate. But, this method is as easy as counting bytes received over the wire.
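As a rough sketch of the byte-counting idea, assuming a constant-bitrate stream: the class below keeps a sliding window of byte counts and reports when the measured rate falls under a threshold. The names BitrateMonitor, window_seconds, and min_kbps are made up for illustration, and the caller is assumed to feed in timestamps and byte counts from whatever read loop it already has.

```python
from collections import deque


class BitrateMonitor:
    """Hypothetical helper: average bitrate over a sliding time window."""

    def __init__(self, window_seconds=5.0, min_kbps=96.0):
        self.window_seconds = window_seconds
        self.min_kbps = min_kbps          # assumed threshold, tune per stream
        self._samples = deque()           # (timestamp, byte_count) pairs
        self._window_bytes = 0

    def add(self, timestamp, byte_count):
        """Record that byte_count bytes arrived at timestamp (seconds)."""
        self._samples.append((timestamp, byte_count))
        self._window_bytes += byte_count

    def kbps(self, now):
        """Average kilobits per second over the window ending at now."""
        cutoff = now - self.window_seconds
        # Drop samples that have fallen out of the window.
        while self._samples and self._samples[0][0] <= cutoff:
            _, old_bytes = self._samples.popleft()
            self._window_bytes -= old_bytes
        return self._window_bytes * 8 / 1000 / self.window_seconds

    def is_below_threshold(self, now):
        """True when the measured rate is under min_kbps."""
        return self.kbps(now) < self.min_kbps
```

Because kbps() takes the current time rather than the time of the last read, a stream that stops completely still shows up as a drop. The reading is too low until one full window of data has arrived, so it is probably best to ignore the first window_seconds after connecting.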
Method 2: Playback on server
You can run a headless player on the server (via cvlc or similar) and track when it has buffer underruns. This will work at any bitrate and will give you a decent idea of what's happening on the clients. This sort of player setup also enables utility functions like silence detection. The downside is that it takes a little bit of CPU to decode, and a bit more effort to automate.
Method 3 (preferred): Log output buffer on source
Your source encoder will have a buffer on its output, data waiting to be sent to the server. When this buffer grows over a particular threshold, log it. This means that output over the network stalled for whatever reason. This method gets the appropriate data right from the source, and ensures you don't have to worry about clock synchronization issues that can occur over time in your monitoring of audio streams. (44.1 kHz to your encoder might be 44.101 kHz to a player.) This method might require modifying your source client.
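A minimal sketch of the output-buffer idea, assuming you can wrap the queue of encoded chunks in your source client. MonitoredOutputBuffer and high_water_bytes are hypothetical names, not part of any real encoder; the sketch only shows where the logging would go.

```python
import logging
from collections import deque

log = logging.getLogger("source.outbuf")


class MonitoredOutputBuffer:
    """Hypothetical send queue that logs when pending data passes a threshold."""

    def __init__(self, high_water_bytes):
        self.high_water_bytes = high_water_bytes  # assumed threshold
        self._chunks = deque()
        self.pending_bytes = 0
        self.over_threshold = False

    def push(self, chunk):
        """Called by the encoder when a chunk is ready to be sent."""
        self._chunks.append(chunk)
        self.pending_bytes += len(chunk)
        self._check()

    def pop(self):
        """Called by the network writer when it can send the next chunk."""
        chunk = self._chunks.popleft()
        self.pending_bytes -= len(chunk)
        self._check()
        return chunk

    def _check(self):
        over = self.pending_bytes > self.high_water_bytes
        # Log only on transitions so a long stall does not flood the log.
        if over and not self.over_threshold:
            log.warning("output stalled: %d bytes pending", self.pending_bytes)
        elif not over and self.over_threshold:
            log.info("output recovered: %d bytes pending", self.pending_bytes)
        self.over_threshold = over
```

Logging only when the state changes keeps the log readable: each stall shows up as one warning followed by one recovery line.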

CPU load of streaming vs file downloading when routing data

I'm using a Raspberry Pi 2 to route between wifi and Ethernet: a computer on the eth side connects to the internet through the Pi's wifi connection. On the Raspberry Pi I started htop to monitor the CPU load, then on the computer I started Chrome and played a 20-minute 1080p YouTube video. The CPU load didn't seem to go beyond 5%. After that I closed the YouTube tab and started a download of a 5 GB binary file from the first row here (https://testdebit.info/). I noticed that the CPU load was much higher, around 10%!
Any explanation of such a difference?
It has to do with compression and how video is encoded. A normal file can be compressed, but not nearly as much as a video stream.
A video stream can achieve very high compressions due to the predictable characteristics of video, e.g. video from one frame to another doesn't change much. As such, video will send a whole frame (I-frame) and then update it with just the changes (P-frame). It's even possible to do backward prediction (B-frame). Here's a wikipedia reference.
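To make the I-frame/P-frame idea concrete, here is a toy sketch. It assumes every frame is a list of pixel values of the same length, and it stores the first frame whole and each later frame as only the positions that changed. Real codecs do far more than this (motion compensation, transforms, entropy coding), so treat it purely as an illustration; encode and decode are made-up helper names.

```python
def encode(frames):
    """Return a list of ("I", full_frame) and ("P", {index: new_value}) records."""
    encoded = []
    previous = None
    for frame in frames:
        if previous is None:
            # First frame: send everything (a key frame).
            encoded.append(("I", list(frame)))
        else:
            # Later frames: send only the pixels that differ from the previous one.
            changes = {i: new for i, (old, new) in enumerate(zip(previous, frame))
                       if old != new}
            encoded.append(("P", changes))
        previous = frame
    return encoded


def decode(encoded):
    """Rebuild the full frames from the records produced by encode()."""
    frames = []
    current = None
    for kind, payload in encoded:
        if kind == "I":
            current = list(payload)
        else:
            current = list(current)
            for index, value in payload.items():
                current[index] = value
        frames.append(current)
    return frames
```

With three four-pixel frames where only one pixel changes each time, each "P" record holds a single entry instead of four values, and the saving grows with frame size.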
Yes, I hear your next unspoken question: Doesn't more compression mean more CPU time to uncompress? That's true for a lot of types of compression, such as that used by zip files. But since raw video is not very information dense over time, you have compression techniques that in essence reduce the amount of data you send with very little CPU usage.
I hope this helps.

Fastest way to get frames from webcam

I have a bit of a problem developing one of my programs in C++ (Visual Studio). Right now I'm struggling with connecting multiple webcams (connected via USB cables), creating a separate thread for each of them to capture frames, and a separate frame for processing the image.
I use OpenCV to process frames, but the problem is that I don't get the most out of the webcam (it supports 25 fps, I get only 18). Is there some library that I could use to get frames, and then process them with OpenCV, that would make frame capture faster?
I was researching a bit, and the most popular way is to use DirectShow to get frames and OpenCV to process them.
Do you agree? Or do you have another solution?
I wouldn't be offended by some links :)
DirectShow is only used if you open your capture using the CV_CAP_DSHOW flag, like:
VideoCapture capture( CV_CAP_DSHOW + 0 ); // 0,1,2, your cam id there
(Without it, it defaults to VfW.)
The capture already runs in a separate thread, so wrapping it with more threads won't give you any gain.
Another obstacle with multiple cams is USB bandwidth, so if you have ports on the back and the front of your machine, don't plug all your cams into the same port/controller, or you will just saturate it.
OpenCV uses DirectShow. Using DirectShow (the primary video capture API in Windows) directly will get you on-par or better performance (even more likely so if OpenCV is set to use Video for Windows). USB cams typically hit the USB bandwidth limit and hence a frame rate limit; using DirectShow to capture in compressed formats or in formats with fewer bits per pixel is the way to reach higher frame rates within the same USB bandwidth.
Another typical problem causing low frame rates is slow synchronous processing delaying the capture. You typically identify this by putting trivial processing into the same capture loop and seeing higher FPS compared to processing-enabled operation.
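The check described above can be sketched as a small timing helper. This assumes grab and process are whatever callables your capture loop already uses; measure_fps is a made-up name, and the sketch measures the loop as a whole rather than the camera itself.

```python
import time


def measure_fps(grab, process, num_frames=100):
    """Run a synchronous capture loop and return the achieved frames per second."""
    start = time.perf_counter()
    for _ in range(num_frames):
        frame = grab()      # blocks until the next frame is available
        process(frame)      # runs in the same loop, so it delays the next grab
    elapsed = time.perf_counter() - start
    return num_frames / elapsed
```

Run it once with a trivial process (for example lambda frame: None) and once with the real processing function. If the trivial run gets close to the camera's rated frame rate and the real run does not, the processing is what is holding back the capture. With OpenCV's Python bindings, grab could be something like lambda: cap.read()[1] for an already opened cv2.VideoCapture.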

Do the video capture APIs on iPhone running iOS4+ encode video on-the-fly?

I'm just researching at the moment the possibility of writing an app to record an hour's worth of video/audio for a specific use case.
As the video will be an hour long I would want to encode on-the-fly and not after the recording has finished to keep disk usage to a minimum.
Do the video capture APIs write a large uncompressed file to disk that has to be encoded afterwards, or can they encode on-the-fly, resulting in an optimised file written to disk?
It's important that the video is recorded at a lower resolution than the iPhone's advertised 720/1080p as I need to keep the file sizes down due to length of video (which will need to be uploaded).
Any information you have would be appreciated or even just a pointer in the right direction.
No, they do not record uncompressed to disk (unless this is what you want). You can specify to record to a MOV/MP4 and have the video encoded in H.264. Additionally, you can control the average bit rate of the encoding. You can also specify the capture size and the output encoding size, along with scaling options if needed. For demo code, check out AVCamDemo in the WWDC 2010 sample code. This demo code may now be available in the docs.

How does the ability to compress a stream affect a compression algorithm?

I recently backed up my soon-to-expire university home directory by sending it as a tar stream and compressing it on my end: ssh user@host "tar cf - my_dir/" | bzip2 > uni_backup.tar.bz2.
This got me thinking: I only know the basics of how compression works, but I would imagine that this ability to compress a stream of data would lead to poorer compression since the algorithm needs to finish handling a block of data at one point, write this to the output stream and continue to the next block.
Is this the case? Or do these programs simply read a lot of data into memory, compress this, write it, and then do this over again? Or are there any clever tricks used in these “stream compressors”? I see that both bzip2 and xz's man pages talk about memory usage, and man bzip2 also hints at the fact that little is lost by chopping the data to be compressed into blocks:
Larger block sizes give rapidly diminishing marginal returns. Most of the compression comes from the first two or three hundred k of block size, a fact worth bearing in mind when using bzip2 on small machines. It is also important to appreciate that the decompression memory requirement is set at compression time by the choice of block size.
I would still love to hear if other tricks are used, or about where I can read more about this.
This question relates more to buffer handling than to the compression algorithm, although a bit could be said about that too.
Some compression algorithms are inherently "block based", which means they absolutely need to work with blocks of a specific size. This is the situation with bzip2, whose block size is selected by the "level" switch, from 100 kB to 900 kB.
So, if you stream data into it, it will wait for the block to be filled, and start compressing this block when it's full (alternatively, for the last block, it will work with whatever size it receives).
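A small sketch of this behaviour using Python's standard bz2 module, which wraps the same library. bzip2_stream is a made-up helper name; the point is only that compress() may hand back nothing for a while and then emit a whole compressed block at once.

```python
import bz2


def bzip2_stream(chunks, level=9):
    """Feed chunks to a bzip2 compressor and yield compressed data as it appears."""
    compressor = bz2.BZ2Compressor(level)  # level 1..9 selects the block size
    for chunk in chunks:
        out = compressor.compress(chunk)
        # compress() returns empty bytes while the current block is still filling.
        if out:
            yield out
    # flush() compresses the final, possibly partial, block.
    yield compressor.flush()
```

If you feed it many small chunks, you would expect far fewer output pieces than input chunks, since output only appears once a block is complete or the stream is flushed.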
Some other compression algorithms can handle streams, which means they can continuously compress new data using older data kept in a memory buffer. Algorithms based on "sliding windows" can do it, and typically zlib is able to achieve that.
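For comparison, here is a sketch of the sliding-window style with Python's standard zlib module. deflate_stream is a made-up name. Using Z_SYNC_FLUSH after every chunk is one possible choice, not the only one: it forces everything so far out to the receiver while keeping the window, at some cost in compression ratio.

```python
import zlib


def deflate_stream(chunks):
    """Compress chunks one at a time, emitting decodable output after each."""
    compressor = zlib.compressobj()
    for chunk in chunks:
        # Z_SYNC_FLUSH pushes out all pending output but keeps the history
        # window, so later chunks can still refer back to earlier data.
        yield compressor.compress(chunk) + compressor.flush(zlib.Z_SYNC_FLUSH)
    # The final flush ends the stream.
    yield compressor.flush()
```

On the receiving side, a zlib.decompressobj() fed the pieces in order can return each chunk as soon as its piece arrives. A chunk that repeats earlier data will typically come out smaller than the first occurrence, because the compressor can point back into the window.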
Now, even "sliding window" compressors may nonetheless choose to cut input data into blocks, either for easier buffer management or to enable multi-threading, as pigz does.

Resources