I have an algorithm that reads frames from a live RTSP stream (from a camera connected to my computer). I can read and process frames at a faster rate than the incoming stream delivers them (about 20 fps vs 6 fps).
I run cv2.VideoCapture on the stream and then, in a while loop, read frames using stream.read(). I just wanted to check: if I read frames faster than the stream can deliver them, does that mean stream.read() will sometimes return False? Or will it "wait" for the next frame to come in?
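For reference, here is a minimal sketch of the loop in question (the RTSP URL is a placeholder), with the boolean return value checked so that whichever behaviour occurs is visible:

import cv2

# Placeholder URL; substitute your camera's RTSP address.
stream = cv2.VideoCapture("rtsp://192.168.0.10:554/stream")

while True:
    ok, frame = stream.read()  # returns (success_flag, frame)
    if not ok:
        # Stream ended, or no frame could be retrieved/decoded.
        break
    # ... process frame here (processing runs at ~20 fps while frames arrive at ~6 fps) ...

stream.release()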
Related
Any pointers on how to detect, via a script on Linux, that an MP3 radio stream is breaking up? I am having issues with my radio station: when the internet connection slows down, the stream on the client side stops, buffers, and then plays.
There are a few ways to do this.
Method 1: Assume constant bitrate
If you know that you will have a constant bitrate, you can measure that bitrate over time on the server and determine when it slows below a threshold. Note that this isn't the most accurate method, and won't always work. Not all streams use a constant bitrate. But, this method is as easy as counting bytes received over the wire.
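As an illustration only (not a drop-in monitor), counting bytes per interval might look like the sketch below; the stream URL, threshold, and window length are made-up values:

import time
import urllib.request

STREAM_URL = "http://example.com/stream.mp3"  # hypothetical stream URL
THRESHOLD_BPS = 96_000                        # alert below 96 kbit/s (example value)
INTERVAL = 5.0                                # measurement window, seconds

with urllib.request.urlopen(STREAM_URL) as resp:
    while True:
        received = 0
        start = time.monotonic()
        while time.monotonic() - start < INTERVAL:
            chunk = resp.read(4096)
            if not chunk:
                print("stream ended")
                raise SystemExit
            received += len(chunk)
        bps = received * 8 / INTERVAL
        if bps < THRESHOLD_BPS:
            print(f"WARNING: measured bitrate {bps:.0f} bit/s is below threshold")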
Method 2: Playback on server
You can run a headless player on the server (via cvlc or similar) and track when it has buffer underruns. This will work at any bitrate and will give you a decent idea of what's happening on the clients. This sort of player setup also enables utility functions like silence detection. The downside is that it takes a little bit of CPU to decode, and a bit more effort to automate.
Method 3 (preferred): Log output buffer on source
Your source encoder will have a buffer on its output, data waiting to be sent to the server. When this buffer grows over a particular threshold, log it. This means that output over the network stalled for whatever reason. This method gets the appropriate data right from the source, and ensures you don't have to worry about clock synchronization issues that can occur over time in your monitoring of audio streams. (44.1 kHz to your encoder might be 44.101 kHz to a player.) This method might require modifying your source client.
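To make the idea concrete, here is a rough application-level sketch, not tied to any particular encoder; the queue, threshold, and logger are all illustrative names:

import logging
import queue

log = logging.getLogger("source")
outbound = queue.Queue()   # the encoder thread put()s encoded chunks here
BUFFER_THRESHOLD = 32      # roughly a few seconds of audio; tune to taste

def sender(sock):
    # Drain the outbound queue to the streaming server, logging stalls.
    while True:
        chunk = outbound.get()
        if outbound.qsize() > BUFFER_THRESHOLD:
            # The network is not keeping up with the encoder: log it for monitoring.
            log.warning("output buffer backed up: %d chunks waiting", outbound.qsize())
        sock.sendall(chunk)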
I am developing a VoIP app and need to play data from RTP packets which are sent by the server every 20 ms.
I have a buffer which accumulates samples from RTP packets. The audio unit render callback reads data from this buffer.
The problem is that I cannot synchronise the audio unit with the RTP stream. The preferred IO buffer duration cannot be set to exactly 20 ms, and the number of frames requested by the render callback cannot be matched to the packet's number of samples.
As a result, there are two possible situations (depending on sample rate and IO buffer duration):
a) the audio unit reads from my buffer faster than it is filled from RTP packets; in this case the buffer periodically does not contain the requested number of samples and I get distorted sound;
b) the buffer is filled faster than the audio unit reads from it; in this case the buffer periodically overflows and samples from new RTP packets are lost.
What should I do to avoid this issue?
If you have control over the packet rate, this is typically done via a "leaky bucket" algorithm. A circular FIFO/buffer can hold the "bucket" of incoming data, and a certain amount of padding needs to be kept in the FIFO/buffer to cover variations in network rate and latency. If the bucket gets too full, you ask the packet sender to slow down, etc.
On the audio playback end, various audio concealment methods (PSOLA time-pitch modification, etc.) can be used to slightly stretch or shrink the data to fit, if adequate buffer fill thresholds are exceeded.
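A bare-bones sketch of such a buffer with fill thresholds follows; all sizes are placeholders, and real code would apply concealment or time-stretching instead of emitting silence on underrun:

from collections import deque
import threading

class JitterBuffer:
    # Circular FIFO of decoded samples with low/high watermarks.
    def __init__(self, low=160 * 3, high=160 * 20):  # e.g. 3 and 20 packets of 160 samples
        self.samples = deque()
        self.low, self.high = low, high
        self.lock = threading.Lock()

    def push(self, pcm):   # called when an RTP packet has been decoded
        with self.lock:
            if len(self.samples) + len(pcm) > self.high:
                return False        # overflow: ask the sender to slow down, or drop
            self.samples.extend(pcm)
            return True

    def pull(self, n):     # called from the render callback
        with self.lock:
            if len(self.samples) < max(n, self.low):
                return [0] * n      # underrun: play silence (or conceal/stretch)
            return [self.samples.popleft() for _ in range(n)]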
If you are receiving audio
Try having the client periodically (e.g. every second) request that the server send audio at a bitrate chosen from the buffer size and connection speed.
For example, have each audio chunk be 300 kbit in size if there are, say, 20 chunks in the buffer and a 15000 kbit/s connection, and increase/decrease the audio bitrate dynamically as necessary.
If you are sending audio
Do the same, but in reverse. Have the server request periodically that the client changes the audio bitrate.
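A toy sketch of that feedback rule, treating the numbers from the example above purely as placeholders:

TARGET_BUFFER = 20   # desired number of buffered chunks (example value)

def choose_bitrate(buffered_chunks, link_kbps, current_kbps):
    # Pick the bitrate to request next, based on buffer fill and link speed.
    if buffered_chunks < TARGET_BUFFER // 2:
        return max(32, current_kbps // 2)       # buffer draining: back off hard
    if buffered_chunks > TARGET_BUFFER and current_kbps * 2 < link_kbps:
        return current_kbps * 2                 # plenty of headroom: step up
    return current_kbps                         # otherwise keep the current rate

# e.g. 20 chunks buffered on a 15000 kbit/s link while receiving 300 kbit/s audio
print(choose_bitrate(20, 15000, 300))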
So, I have set up a multichannel mixer and a Remote I/O unit to mix/play several buffers of PCM data that I read from audio files.
For short sound effects in my game, I load the whole file into a memory buffer using ExtAudioFileRead().
For my background music, let's say I have a 3-minute compressed audio file. Assuming it's encoded as MP3 @ 128 kbps (44,100 Hz stereo), that gives around 1 MB per minute, or 3 MB total; uncompressed in memory, it's around ten times that, if I remember correctly. I could use the exact same method as for small files; I believe ExtAudioFileRead() takes care of the decoding, using the (single) hardware decoder when available, but I'd rather not read the whole buffer at once, and instead 'stream' it at regular intervals from disk.
The first thing that comes to mind is going one step below to the (non-"extended") Audio File Services API and use AudioFileReadPackets(), like so:
1. Prepare two buffers, A and B, each big enough to hold (say) 5 seconds of audio. During playback, start reading from one buffer and switch to the other one when reaching the end (i.e., they make up the two halves of a ring buffer).
2. Read the first 5 seconds of audio from the file into buffer A.
3. Read the next 5 seconds of audio from the file into buffer B.
4. Begin playback (from buffer A).
5. Once the play head enters buffer B, load the next 5 seconds of audio into buffer A.
6. Once the play head enters buffer A again, load the next 5 seconds of audio into buffer B.
7. Go to #5.
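Schematically, the ping-pong logic might look like the sketch below (the file-reading function and half-buffer size are placeholders; the real thing would be driven by the render callback and AudioFileReadPackets()):

SECONDS_PER_HALF = 5

class PingPongBuffer:
    # Two half-buffers A and B used as the halves of a ring buffer.
    def __init__(self, read_from_file):
        self.read = read_from_file                   # e.g. wraps AudioFileReadPackets()
        self.halves = [self.read(SECONDS_PER_HALF),  # buffer A (step 2)
                       self.read(SECONDS_PER_HALF)]  # buffer B (step 3)
        self.playing = 0                             # playback starts from A (step 4)

    def on_half_consumed(self):
        # Called when the play head crosses into the other half (steps 5-7).
        finished = self.playing
        self.playing ^= 1                                     # play head is now in the other half
        self.halves[finished] = self.read(SECONDS_PER_HALF)  # refill the half just finished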
Is this the right approach, or is there a better way?
I'd suggest using the high-level AVAudioPlayer class to do simple background playback of an audio file. See:
https://developer.apple.com/library/ios/documentation/AVFoundation/Reference/AVAudioPlayerClassReference/Chapters/Reference.html#//apple_ref/doc/uid/TP40008067
If you require finer-grained control and lower latency, check out Apple's AUAudioFilePlayer. See AudioUnitProperties.h for a discussion. This is an Audio Unit that abstracts the complexities of streaming an audio file from disk. That said, it's still pretty complicated to set up and use, so definitely try AVAudioPlayer first.
I am writing a client-server application which does real-time video transmission from an Android-based phone to a server. The captured video from the phone camera is encoded using the Android-provided H.264 encoder and transmitted via a UDP socket. The frames are not RTP encapsulated; I need that to reduce the overhead and hence the delay.
On the receiver, I need to decode the incoming encoded frame. The data sent on the UDP socket contains not only the encoded frame but also some other frame-related information as part of its header. Each frame is encoded as a NAL unit.
I am able to retrieve the frames from the received packet as a byte array. I can save this byte array as a raw H.264 file and play it back using VLC, and everything works fine.
However, I need to do some processing on this frame and hence need to use it with opencv.
Can anyone help me with decoding a raw h264 byte array in opencv?
Can ffmpeg be used for this?
Short answer: ffmpeg and ffplay will work directly. OpenCV can be built on top of those two, so it shouldn't be difficult to use the FFmpeg backend to convert the stream to cv::Mat. Follow the documentation:
OpenCV can use the FFmpeg library (http://ffmpeg.org/) as a backend to record, convert and stream audio and video. FFmpeg is a complete, cross-platform solution. If you enable FFmpeg while configuring OpenCV, then CMake will download and install the binaries in OPENCV_SOURCE_CODE/3rdparty/ffmpeg/. To use FFmpeg at runtime, you must deploy the FFmpeg binaries with your application.
https://docs.opencv.org/3.4/d0/da7/videoio_overview.html
Last time, I had to play with the DJI PSDK, which only allows streaming H.264 to a UDP port (udp://192.168.5.293:23003).
So I wrote a simple ffmpeg interface to stream to the PSDK. But I had to debug it beforehand, so I used ffplay to show this network stream to prove it was working. This is the command that shows the stream; you have to build on top of this to make it work as an OpenCV input:
ffplay -f h264 -i udp://192.168.1.45:23003
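As an untested sketch (the address and port are copied from the ffplay command above; a raw H.264 stream without a container may still need format hints), OpenCV built with the FFmpeg backend can often open the same URL directly from Python:

import cv2

# Same raw-H.264-over-UDP source that ffplay displayed above.
cap = cv2.VideoCapture("udp://192.168.1.45:23003", cv2.CAP_FFMPEG)

while True:
    ok, frame = cap.read()
    if not ok:
        break              # no frame decoded (stream gap or end)
    # ... run your OpenCV processing on frame here ...

cap.release()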
I have a small problem developing one of my programs in C++ (Visual Studio). Right now I'm struggling with connecting multiple webcams (via USB cables), creating a separate thread for each of them to capture frames, and a separate thread for processing the images.
I use OpenCV to process the frames, but the problem is that I don't get the webcams' full capability (they support 25 fps, I get only 18). Is there some library I could use to capture frames faster, and then process them with OpenCV?
From a bit of research, the most popular way seems to be using DirectShow to get frames and OpenCV to process them.
Do you agree, or do you have another solution?
I wouldn't be offended by some links :)
DirectShow is only used if you open your capture using the CV_CAP_DSHOW flag, like:
VideoCapture capture( CV_CAP_DSHOW + 0 ); // 0, 1, 2, ... your cam id there
(without it, it defaults to VFW)
The capture already runs in a separate thread, so wrapping it with more threads won't give you any gain.
Another obstacle with multiple cams is USB bandwidth: if you have ports on the back and the front of your machine, don't plug all your cams into the same port/controller, or you will just saturate it.
OpenCV uses DirectShow. Using DirectShow (primary video capture API in Windows) directly will obviously get you par or better performance (and even more likely so if OpenCV is set to use Video for Windows). USB cams typically hit USB bandwidth and hence frame rate limit, using DirectShow to capture in compressed formats or in formats with less bits/pixel is the way to reach higher frame rates within the same USB bandwidth limit.
Another typical problem causing low frame rates is slow synchronous processing delaying the capture. You typically identify this by putting trivial processing into the same capture loop and seeing higher FPS compared to processing-enabled operation.
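A quick way to check this (a Python sketch for brevity; the same idea applies in C++) is to time the loop with the processing stubbed out and compare against the processing-enabled run:

import time
import cv2

cap = cv2.VideoCapture(0)   # first webcam; you can pass cv2.CAP_DSHOW as a second argument on Windows
frames, t0 = 0, time.monotonic()

while frames < 200:
    ok, frame = cap.read()
    if not ok:
        break
    # Stub: no processing here. Re-run with your real processing in place and compare the FPS.
    frames += 1

elapsed = time.monotonic() - t0
print(f"{frames / elapsed:.1f} fps")
cap.release()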