OpenCV: how to grab the real-time frame and not the next frame?

I realized that when I use OpenCV to grab video from an RTSP URL (cv2.VideoCapture('rtsp://...')), I am actually getting every frame of the stream and not the current real-time frame.
Example: if a video is 30 fps and 10 seconds long, and I get the first frame and then wait 1 second to get the next, I get frame number 2 and not the real-time frame (it should be frame number 30 or 31).
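A minimal sketch that reproduces this sequential behaviour with a local file (the same thing happens with a buffered RTSP capture; the file name is just a placeholder):
import time
import cv2

cap = cv2.VideoCapture("video.mp4")  # placeholder: any 30 fps clip

ok, first = cap.read()
print("position after 1st read:", cap.get(cv2.CAP_PROP_POS_FRAMES))  # 1.0

time.sleep(1.0)  # simulate one second of processing

ok, second = cap.read()
# Prints 2.0: read() returns the next buffered frame, not the frame that
# corresponds to the current wall-clock time.
print("position after 2nd read:", cap.get(cv2.CAP_PROP_POS_FRAMES))

cap.release()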
I am worried about this because if my code takes a little longer to do the video processing (deep learning convolutions), the result will always be delivered late and not in real time.
Any ideas on how I can always get the current frame when capturing from RTSP?
Thanks!

This is not about the code. Many IP cameras give you encoded output (H.265/H.264).
When you use VideoCapture(), the camera's output is decoded by the CPU. A delay like the one you mention, between 1 and 2 seconds, is normal.
What can be done to make it faster:
If you have GPU hardware, you can decode the data on it. This will give you really good results (in my experience with the latest NVIDIA GPUs, you get a delay of about 25 milliseconds). To achieve that in your code, you need (sketch below):
CUDA installation
CUDA-enabled OpenCV installation
The VideoReader class of OpenCV (in the cudacodec module)
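A minimal sketch of that path, assuming OpenCV was built with CUDA and the NVIDIA Video Codec SDK so that the cudacodec module is available (the URL is a placeholder):
import cv2

# Assumes an OpenCV build with CUDA + NVIDIA Video Codec SDK (cudacodec module).
reader = cv2.cudacodec.createVideoReader("rtsp://camera.local/stream")  # placeholder URL

while True:
    ok, gpu_frame = reader.nextFrame()   # decoded on the GPU, returned as a GpuMat
    if not ok:
        break
    frame = gpu_frame.download()         # copy to host memory only if CPU code needs it
    # ... run the deep learning model on `frame` ...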
You can also use VideoCapture() with the FFMPEG backend flag. FFmpeg has advanced methods for decoding encoded data and will probably give you the fastest output you can get with your CPU, but it will not reduce the delay by much.
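For the CPU path, the backend can be requested explicitly when opening the capture; a sketch (placeholder URL, and note that CAP_PROP_BUFFERSIZE is only honoured by some backends):
import cv2

# Ask OpenCV to use its FFmpeg backend for demuxing/decoding.
cap = cv2.VideoCapture("rtsp://camera.local/stream", cv2.CAP_FFMPEG)  # placeholder URL

# Optional: keep the internal buffer small so read() stays closer to "now".
# Not all backends honour this property.
cap.set(cv2.CAP_PROP_BUFFERSIZE, 1)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # ... process frame ...

cap.release()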

Related

Opening an OpenCV cap for a webcam for an extended period of time

I have a Django web application hosted locally on a desktop workstation that is supposed to retrieve video frames from a webcam connected via USB. Each frame is then used as input to an object detection model to count some objects, but since the objects are small, I need to retrieve the frames at a higher resolution (720, 1280) instead of the default resolution. After the counting is done, I stop reading frames from the webcam. Here is the sequence:
Press a button in the web application to start retrieving video from the webcam.
Create a new OpenCV cap; since I need a higher resolution, I have to specify it:
cap = cv2.VideoCapture(0 + cv2.CAP_DSHOW)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
Run a while loop that includes frame retrieval, counting logic, and visualization of the frame with inference results using cv2.imshow().
Stop retrieving video frames after the counting is done (call cv2.destroyAllWindows() and break out of the while loop), finally calling cap.release().
The problem with this sequence is that the video stream takes a few seconds (about 3) to load: after clicking the button in step 1, I only see the video stream from step 3 after about 3 seconds, which is quite slow, and there is an autofocus pass that takes about 1 second after the stream appears.
I am considering a new sequence that does not release the cap. I found that the video stream loads much faster and there is no autofocus each time it is loaded:
When starting the Django server, create the OpenCV cap and specify the required resolution (so this is only done once):
cap = cv2.VideoCapture(0 + cv2.CAP_DSHOW)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
Run a while loop that includes frame retrieval, counting logic, and visualization of the frame with inference results using cv2.imshow().
Stop retrieving video frames after the counting is done (break out of the while loop WITHOUT calling cap.release()).
Would the second flow cause any problems or damage to the webcam, especially if the cap stays open for a long time (weeks or even months)?
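For reference, a minimal sketch of the second flow as a shared, long-lived capture (the helper names here are illustrative, not part of Django):
import atexit
import threading
import cv2

_lock = threading.Lock()
_cap = None

def get_camera():
    """Return the single shared capture, (re)opening it if needed."""
    global _cap
    with _lock:
        if _cap is None or not _cap.isOpened():
            _cap = cv2.VideoCapture(0 + cv2.CAP_DSHOW)
            _cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
            _cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
        return _cap

@atexit.register
def _release_camera():
    # Release the device once, when the server process shuts down.
    if _cap is not None:
        _cap.release()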

My FPS falls drastically when recording a video

I'm using OpenCV to record video from a Full HD camera running at 30 FPS. I know the camera's FPS because I measure it by counting the number of valid frames (this also matches the specifications). I do this using a 5 ms timer. Here is the code that I run every 5 ms:
cv::Mat frame;
if (capture.read(frame)){
    showCurrentFrame(frame);
    fpsCounter.newFrame();
    ui->labelVideo->setText("Video (" + QString::number(frame.cols) + "x" + QString::number(frame.rows) + ")");
    ui->labFPS->setText("FPS: " + QString::number(fpsCounter.getFPS()));
    if (isRecording){
        recorder << frame;
        fpsRecordCounter++;
    }
}
I set the FPS of the recorded video to 30 when I press the "Start Recording" button:
recorder.open(currentVideoFile.toStdString(),
              VIDEO_CODEC_FOURCC, // tells it to record MJPEG
              VIDEO_REC_FPS,      // defined as 30
              frameSize);         // 1920x1080
I developed the program on my workstation, which runs CentOS 7, using Qt and OpenCV 2.4.5. When I record on the desktop PC, the FPS is consistently about 30.
However, this needs to record from a moving car, so I copy-pasted the code AS IS onto my laptop and compiled it with zero issues.
My laptop uses Debian Testing and OpenCV 2.4.9. It is HERE that the slowdown is observed.
Since I'm using Qt, I need to process the cv::Mat in order to display it. I do this in the showCurrentFrame function.
If I deactivate this function when recording, I get maybe 23 FPS (and can't see what I'm recording).
If I leave it as in the code above, I get about 16-17 FPS.
The first thing I thought was that my laptop was not powerful enough, but it shouldn't be. This is the model:
https://www.asus.com/Notebooks/ASUS_VivoBook_S550CA/specifications/
It's the i5 variant with a 500 GB HDD.
So I'm at a loss. Is it some sort of bug that was introduced in newer OpenCV versions, or is it simply that my laptop is not powerful enough?
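One way to narrow this down is to time each stage separately. A rough sketch in Python with cv2 (the equivalent calls exist in the C++ API); the codec and frame size are placeholders and must match the camera's actual output for the writer to encode anything:
import time
import cv2

cap = cv2.VideoCapture(0)
writer = cv2.VideoWriter("out.avi", cv2.VideoWriter_fourcc(*"MJPG"), 30, (1920, 1080))

t_read = t_show = t_write = 0.0
frames = 0
while frames < 300:
    t0 = time.perf_counter()
    ok, frame = cap.read()
    t1 = time.perf_counter()
    if not ok:
        break
    cv2.imshow("preview", frame)
    cv2.waitKey(1)
    t2 = time.perf_counter()
    writer.write(frame)   # MJPEG encoding happens here, on the CPU
    t3 = time.perf_counter()
    t_read += t1 - t0
    t_show += t2 - t1
    t_write += t3 - t2
    frames += 1

if frames:
    print("per frame: read %.1f ms, show %.1f ms, write %.1f ms"
          % (1000 * t_read / frames, 1000 * t_show / frames, 1000 * t_write / frames))

cap.release()
writer.release()
cv2.destroyAllWindows()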

FSK demodulation with GNU Radio

I'm trying to demodulate a signal using GNU Radio Companion. The signal is FSK (Frequency-shift keying), with mark and space frequencies at 1200 and 2200 Hz, respectively.
The data in the signal is text generated by a device called GeoStamp Audio. The device generates audio from GPS data fed into it in real time, and it can also decode that audio. I have the decoded text version of the audio for reference.
I have set up a flow graph in GNU Radio (see below), and it runs without error, but with all the variations I've tried, I still can't get the data.
The output of the flow graph should be binary (1s and 0s) that I can later convert to normal text, right?
Is it correct to feed in a wav audio file the way I am?
How can I recover the data from the demodulated signal -- am I missing something in my flow graph?
This is an FFT plot of the wav audio file before demodulation:
This is the result of the scope sink after demodulation (maybe looks promising?):
UPDATE (August 2, 2016): I'm still working on this problem (occasionally), and unfortunately still cannot retrieve the data. The result is a promising-looking string of 1's and 0's, but nothing intelligible.
If anyone has suggestions for figuring out the settings on the Polyphase Clock Sync or Clock Recovery MM blocks, or the gain on the Quad Demod block, I would greatly appreciate it.
Here is one version of an updated flow graph based on Marcus's answer (also trying other versions with polyphase clock recovery):
However, I'm still unable to recover data that makes any sense. The result is a long string of 1's and 0's, but not the right ones. I've tried tweaking nearly all the settings in all the blocks. I thought maybe the clock recovery was off, but I've tried a wide range of values with no improvement.
So, at first sight, my approach here would look something like:
What happens here is that we take the input, shift it in frequency domain so that mark and space are at +-500 Hz, and then use quadrature demod.
"Logically", we can then just make a "sign decision". I'll share the configuration of the Xlating FIR here:
Notice that the signal is first shifted so that the center frequency (the midpoint between 2200 and 1200 Hz) ends up at 0 Hz, and then filtered by a low pass (gain = 1.0, stopband starts at 1 kHz, passband ends at 1 kHz - 400 Hz = 600 Hz). At this point, the actual bandwidth still present in the signal is much lower than the sample rate, so you could also just downsample without losses (set the decimation to something higher, e.g. 16), but for the sake of analysis we won't do that.
The time sink should now show better values. Have a look at the edges; they are probably not extremely steep. For clock sync I'd hence recommend just trying the polyphase clock recovery instead of Mueller & Müller; choosing almost any "somewhat round" pulse shape could work.
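For reference, a rough GNU Radio Python sketch of the chain described above (the WAV file name, sample rate, baud rate, and loop parameters are assumptions to be tuned; it also assumes a mono WAV and uses Mueller & Mueller clock recovery, with the polyphase block as an alternative):
import math

from gnuradio import gr, blocks, filter, analog, digital
from gnuradio.filter import firdes

class afsk_demod(gr.top_block):
    """Shift, low-pass, quadrature demod, clock recovery, then a sign decision."""
    def __init__(self, wav_path="capture.wav"):        # placeholder file name
        gr.top_block.__init__(self, "AFSK demod sketch")
        samp_rate = 44100                              # assumed WAV sample rate
        baud = 1200                                    # assumed symbol rate
        center = (1200 + 2200) / 2.0                   # midpoint of mark/space -> moved to 0 Hz

        src = blocks.wavfile_source(wav_path, False)   # mono WAV assumed
        # Low pass: passband up to ~600 Hz, ~400 Hz transition, as described above.
        taps = firdes.low_pass(1.0, samp_rate, 600, 400)
        xlate = filter.freq_xlating_fir_filter_fcf(1, taps, center, samp_rate)
        # FM discriminator; gain scaled for roughly +-500 Hz deviation.
        demod = analog.quadrature_demod_cf(samp_rate / (2 * math.pi * 500))
        clock = digital.clock_recovery_mm_ff(samp_rate / baud, 0.25 * 0.175 ** 2, 0.5, 0.175, 0.005)
        slicer = digital.binary_slicer_fb()            # the "sign decision"
        sink = blocks.file_sink(gr.sizeof_char, "bits.raw")

        self.connect(src, xlate, demod, clock, slicer, sink)

if __name__ == "__main__":
    afsk_demod().run()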
For fun and giggles, I clicked together a quick demo demod (the GRC file and a screenshot of its output were attached here).

Mov file has more frames than written / possible iOS AVAssetWriter usage issue

I am manually generating a .mov video file.
Here is a link to an example file: link. I wrote a few image frames, and then after a long break wrote approximately 15 more image frames, just to emphasise my point for debugging purposes. When I extract images from the video, ffmpeg returns around 400 frames instead of the 15-20 I expected. Is this because the API I am using is inserting these frames automatically? Is it part of the .mov file format that requires this? Or is it due to the way the library extracts the image frames from the video? I have tried searching the internet but could not arrive at an answer.
My use case is that I am trying to write the current "sensor data" from Core Motion while writing a video. For each frame I receive from the camera, I use "AppendPixelBuffer" to write the frame to the video and then write the corresponding row of sensor data to a CSV file.
The end result I want is a 1:1 ratio of frames in the video to rows in the CSV file. I have confirmed that I am writing the CSV file correctly using various counters, etc., so my issue is clearly in my understanding of the movie format or the API.
Thanks for any help.
UPDATED
It looks like your ffmpeg extractor is wrong. To extract only the timestamped frames (and not frames sampled at 24Hz) in your file, try this:
ffmpeg -i video.mov -r 1/1 image-%03d.jpeg
This gives me the 20 frames expected.
OLD ANSWER
ffprobe reports that your video has a frame rate of 2.19 frames/s and a duration of 17s, which gives 2.19 * 17 = 37 frames, which is closer to your expected 15-20 than ffmpeg's 400.
So maybe the ffmpeg extractor is at fault?
It's hard to say without seeing how you encode and decode the file.
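As a sanity check on the container itself, you can ask ffprobe to count the decoded frames; a sketch, assuming ffprobe is on the PATH and the file is named video.mov:
import subprocess

def probe_frames(path):
    """Return ffprobe's frame count, average frame rate and duration for the first video stream."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-count_frames", "-select_streams", "v:0",
         "-show_entries", "stream=nb_read_frames,avg_frame_rate,duration",
         "-of", "default=noprint_wrappers=1", path],
        capture_output=True, text=True, check=True)
    return out.stdout

print(probe_frames("video.mov"))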

VTCompressionSessionEncodeFrame: last seconds are lost?

I am using VTCompressionSessionEncodeFrameWithOutputHandler to compress pixel buffers from the camera into a raw H.264 stream. I use kVTEncodeFrameOptionKey_ForceKeyFrame to make sure that every output from VTCompressionSessionEncodeFrame does not depend on other pieces. The session is also initialized with kVTCompressionPropertyKey_AllowFrameReordering = false and kVTCompressionPropertyKey_RealTime = true, and VTCompressionSessionCompleteFrames is called after each VTCompressionSessionEncodeFrame call.
I also collect the samples produced by VTCompressionSessionEncodeFrame and periodically save them as an MP4 file (using the Bento4 library).
But the final track is always 1-2 seconds shorter than the samples fed to VTCompressionSessionEncodeFrame. After several attempts to resolve this, I assumed that VTCompressionSessionEncodeFrame outputs frames that depend on later frames to be decoded properly, so those frames are lost, since they cannot be used to produce the "final chunks" of the track.
So the question: how can one force VTCompressionSessionEncodeFrame to produce totally independent data chunks?
Turns out this was... an FPS issue! NAL units do not carry timing themselves (aside from the PTS, which is bound to the capture FPS in my case), so it is quite important that they are produced at exactly the rate the movie's FPS expects. Nothing was lost; the saved frames were just played back faster (this was not so easy to spot, in fact).
