Obtaining matching image frames from videos with two different frame rates - image-processing

I am in the process of converting videos to images in Python 3.6 (i.e. cutting videos to get frames).
I have two types of videos.
The first is an RGB video recorded from a RealSense D435i with a frame rate of 30 (i.e. fps = 30).
The second is a thermal-IR video recorded from a FLIR ADAS camera with a frame rate of 9 (it was originally a frame-stream file, which I converted to images and then assembled into a video at 9 fps using python3-cv2).
The video formats are mp4 and avi respectively (though I have also converted the avi to mp4 and tested that).
They are equal in length.
I am trying to create matched image pairs from the thermal and RGB videos. However, when I cut them at the same frame rate, the frames don't match (they differ by 5-6 images).
I have about 200 videos, so it is very time-consuming and difficult for me to track them down one by one.
Any ideas on how I can turn this into a paired dataset?
Many thanks

You might be able to recreate the 30 fps video as a matching 9 fps video by matching up the timestamps of the two videos. OpenCV lets you specify the frame rate in the VideoWriter (I think you knew this, but just to be sure). OpenCV reports the timestamp of the current frame with
cv2.VideoCapture.get(cv2.CAP_PROP_POS_MSEC)
Grab one frame and timestamp from the FLIR camera, then keep grabbing frames and timestamps from the color camera until the timestamp catches up with or passes the FLIR timestamp. Then write the color frame with the VideoWriter. As long as both videos had a consistent frame rate and both started/stopped recording at the same time, this should get the two videos matched up as closely as possible.
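A minimal sketch of that loop in Python, assuming both recordings started at the same instant; the file names and the mp4v writer settings are placeholders, and the timestamps are only as reliable as what the containers report:

import cv2

rgb = cv2.VideoCapture("rgb_30fps.mp4")          # hypothetical file names
thermal = cv2.VideoCapture("thermal_9fps.mp4")

w = int(rgb.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(rgb.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter("rgb_matched_9fps.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"), 9, (w, h))

ok_rgb, rgb_frame = rgb.read()
while True:
    ok_th, _ = thermal.read()
    if not ok_th or not ok_rgb:
        break
    th_ts = thermal.get(cv2.CAP_PROP_POS_MSEC)
    # Advance the colour stream until its timestamp catches up with the thermal frame.
    while ok_rgb and rgb.get(cv2.CAP_PROP_POS_MSEC) < th_ts:
        ok_rgb, rgb_frame = rgb.read()
    if ok_rgb:
        out.write(rgb_frame)   # closest colour frame at or just after the thermal timestamp

rgb.release()
thermal.release()
out.release()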

What about just converting the 9 fps videos to 30 fps with a simple ffmpeg script over the list of files:
ffmpeg -i source9fps.mp4 -r 30 -c:v libx264 -b:v 10M out30fps.mp4
#-b:v ... bitrate
or:
ffmpeg -i source9fps.mp4 -r 30 -c:v copy out30fps.mp4
(Without reencoding, very fast)
30 vs 9 fps is a nasty combination though, there will be a skew and I'm not sure about the exact conversion algorithm (some blending might be better due to the skew at frames with a big temporal mismatch).
It'd be better if the second video was recorded at 5, 10 or 15 fps.
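If you go this route over ~200 files, a small Python wrapper around ffmpeg saves doing it by hand; the directory names here are placeholders:

import subprocess
from pathlib import Path

SRC = Path("thermal_9fps")    # hypothetical input folder of 9 fps videos
DST = Path("thermal_30fps")   # hypothetical output folder
DST.mkdir(exist_ok=True)

for src in sorted(SRC.glob("*.mp4")):
    # Re-encode each 9 fps video to 30 fps; adjust the bitrate to taste.
    subprocess.run(["ffmpeg", "-y", "-i", str(src), "-r", "30",
                    "-c:v", "libx264", "-b:v", "10M", str(DST / src.name)],
                   check=True)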

Related

iOS Swift buffer 30FPS Video for realtime object-detection

I have trained an ObjectDetector for iOS. Now I want to use it on a video with a frame rate of 30 FPS.
The ObjectDetector is a bit too slow: it needs 85 ms for one frame. For 30 FPS it should be below 33 ms.
Now I am wondering if it is possible to buffer the frames and the predictions for a specified time x and then play the video on the screen?
If you have already tried using a smaller/faster model (and also ensured that your model is fully optimized to run in CoreML on the neural engine), we had success doing inference only on every nth frame.
The results were suitable for our use-case and you couldn't really tell that we were only doing it at 5 fps because we were able to continue to display the camera output at full frame-rate.
If you don't need realtime then yes, certainly you could store the video and do the processing per frame afterwards; this would let you parallelize things into bigger batch sizes as well.
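The every-nth-frame idea is easier to see in code; this sketch is in Python/OpenCV terms rather than Swift, and run_detector and draw_boxes are hypothetical stand-ins for the model call and the overlay code:

import cv2

N = 6                          # run the detector on every 6th frame (placeholder value)
cap = cv2.VideoCapture(0)      # camera stream displayed at full frame rate
last_detections = []
frame_idx = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_idx % N == 0:
        last_detections = run_detector(frame)   # hypothetical (slow) model call
    draw_boxes(frame, last_detections)          # hypothetical overlay helper reusing stale boxes
    cv2.imshow("preview", frame)
    frame_idx += 1
    if cv2.waitKey(1) == 27:                    # Esc quits
        break

cap.release()
cv2.destroyAllWindows()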

Is it good to use minterpolate in FFmpeg for reducing blurred frames

I'm using FFmpeg to slice png files from videos.
I'm slicing the videos at an fps between 1 and 3, depending on some video metadata.
I can see that when the subjects in the video are moving fast or the camera is not steady, I get blurred frames. I tried to research how I can solve this (the quality of these frames is my main goal) and I came across the minterpolate option.
I think that if I use the blend option, which merges 3 frames into 1, the "noise" of the blurred subjects will be reduced.
So my current command now is like this:
./ffmpeg -i "/home/dev/ffmpeg/test/input/#3.mp4" -vf minterpolate=fps=1:mi_mode=blend,mpdecimate=hi=11456:lo=6720:frac=0.5 -vsync 0 "/home/dev/ffmpeg/test/output/3/(#%04d).png"
Am I right? Can you think of a better way to use FFmpeg to solve my problem?
You can create a better interpolation result if you use the mci method in ffmpeg, rather than the blend method. There are also more advanced techniques available.
If I understand you correctly you have a blurred image (let's call it B, in the middle) between two non-blurred images (let's call them A on the left and C on the right). Now you want to replace the middle frame B by a non-blurred version of B.
The minterpolate filter is used for interpolation in ffmpeg. It has two different approaches. The first is a blend mode, which fades out A and then fades in C to generate a new image for B.
Blend
If I run it with ffmpeg -framerate 10 -i %02d.png -vf minterpolate=fps=20:mi_mode=blend test-%02d.png, the interpolated middle frame is simply a cross-fade of A and C.
Motion estimation
You can also use the motion estimation or mci mode, which allows you to do motion-based interpolation. You can call it by changing the mode: ffmpeg -framerate 10 -i %02d.png -vf minterpolate=fps=20:mi_mode=mci test-%02d.png. That generates a circle in the middle, i.e. the motion is actually estimated instead of cross-faded.
Going further
The ffmpeg mci mode uses a classic algorithm. There are more advanced optical-flow and neural-network based approaches available, which can give better results with more complex images and more complex motion fields.

How to estimate bandwidth / speed requirements for real-time streaming video?

For a project I'm working on, I'm trying to stream video to an iPhone through its headphone jack. My estimated bitrate is about 200 kbps (if I'm wrong about this, please ignore it).
I'd like to squeeze as much performance out of this bitrate as possible, and sound is not important for me, only video. My understanding is that to stream real-time video I will need to encode it with some codec on-the-fly and send compressed frames to the iPhone for it to decode and render. Based on my research, it seems that H.265 is one of the most space-efficient codecs available, so I'm considering using that.
Assuming my basic understanding of live streaming is correct, how would I estimate the FPS I could achieve for a given resolution using the H.265 codec?
The best solution I can think of is to take a video file, encode it with H.265 and trim it to 1 minute of length to see how large the file is. The issue I see with this approach is that I think my calculations would include some overhead from the video container format (AVI, MKV, etc.) and from the audio channels that I don't care about.
I'm trying to stream video to an iPhone through its headphone jack.
Good luck with that. The headphone jack is audio only.
My estimated bitrate is about 200kbps
At what resolution? 320x240?
I'd like to squeeze as much performance out of this bitrate as possible and sound is not important for me, only video.
Then drop the sound streams altogether. Really though, 200 kbit isn't enough for video of any reasonable size or quality.
Assuming my basic understanding of live streaming is correct, how would I estimate the FPS I could achieve for a given resolution using the H.265 codec?
Nobody knows, because you've told us almost nothing about what's in this video. The bandwidth required for the video is a product of many factors, such as:
Resolution
Desired Quality
Color Space
Visual complexity of the scene
Movement and scene changes
Tweaks and encoding parameters (fast start? low latency?)
You're going to have to decide what sort of quality you're willing to accept, and decide subjectively what the balance between that quality and frame rate is. (Remember too that if there isn't much going on, you basically get frames for free since they take very little bandwidth. Experiment.)
The best solution I can think of is to take a video file, encode it with H.265 and trim it to 1 minute of length to see how large the file is.
Take many videos, typical of what you'll be dealing with, and figure it out from there.
The issue I see with this approach is that I think my calculations would include some overhead from the video container format (AVI, MKV, etc) and from the audio channels that I don't care about.
Your video stream won't have a container at all? Not even TS? You can use FFmpeg to dump the raw stream data for you.
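As a sketch of that last point, ffprobe can report the video stream's bitrate on its own, which sidesteps the container and audio overhead; this assumes ffprobe is on PATH, and not every container fills in the bit_rate field:

import json
import subprocess

def video_stream_kbps(path):
    """Bitrate of the first video stream in kbit/s, ignoring audio and container overhead."""
    out = subprocess.run(["ffprobe", "-v", "quiet", "-print_format", "json",
                          "-show_streams", "-select_streams", "v:0", path],
                         capture_output=True, text=True, check=True)
    stream = json.loads(out.stdout)["streams"][0]
    return int(stream["bit_rate"]) / 1000.0   # KeyError here means the container doesn't report it

print(video_stream_kbps("sample.mp4"))         # hypothetical 1-minute test clip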

OpenCV: GoPro video editing blur

I am attempting to post-process a video in OpenCV. The problem is that the GoPro video is very blurry, even with a high frame rate.
Is there any way that I can remove blur? I've heard about deinterlacing, but don't know if this applies to a GoPro 3+, or where even to begin.
Any help is appreciated.
You can record at a high frame rate to remove any blur; also make sure you are recording with enough natural light, so recording outdoors is recommended.
Look at this video: https://www.youtube.com/watch?v=-nU2_ERC_oE
At 30 fps there is some blur in the car, but at 60 fps the blur is nonexistent; just doubling the FPS can do some good. Since you have a HERO3+ you can record 720p at 60 fps and that will remove the blur. WVGA at 120 fps can also help (the example is 1080p but it still applies).

How to find out the frame rate of a video?

How do I find out the frame rate of a video? How can I do this in C++ with OpenCV?
I want to read a number of different videos along with their respective frames per second.
It has to work for all video formats: .avi, .mp4, .flv.
Easy (just take the reported value with a grain of salt):
#include <opencv2/opencv.hpp>
cv::VideoCapture cap("ma.avi");
double fps = cap.get(cv::CAP_PROP_FPS); // may be 0 or inaccurate, depending on the container/backend
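If the reported value comes back as 0 (which happens with some containers/backends), one fallback, shown here in Python for brevity since the same properties exist in the C++ API, is to decode the file once and derive the rate from the frame count and elapsed time:

import cv2

cap = cv2.VideoCapture("ma.avi")
fps = cap.get(cv2.CAP_PROP_FPS)
if fps <= 0:
    # Fallback: count decoded frames and divide by the elapsed time OpenCV reports.
    frames = 0
    while cap.read()[0]:
        frames += 1
    duration_s = cap.get(cv2.CAP_PROP_POS_MSEC) / 1000.0
    fps = frames / duration_s if duration_s > 0 else 0.0
print(fps)
cap.release()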
