Pass ffmpeg Stream to OpenCV

I would like to use the redirection operator to pipe the stream from ffmpeg into cv2, so that I can detect and mark the faces in the stream and then publish the result again as a second stream: one stream called withoutfacedetect and one called withfacedetect.
raspivid -w 1920 -h 1080 -fps 30 -o - -t 0 -vf -hf -b 6000000 | ffmpeg -f h264 -i - -vcodec copy -g 50 -strict experimental -f tee -map 0:v "[f=flv]rtmp://xx.xx.xx.xx/live/withoutfacedetect |[f=h264]pipe:1" > test.mp4
I then read up on cv2 and came across this article:
https://www.bogotobogo.com/python/OpenCV_Python/python_opencv3_Image_Object_Detection_Face_Detection_Haar_Cascade_Classifiers.php
I then ran the script with my picture and was very amazed that there was a square around my face.
But now back to business. What is the best way to do this?
Thanks to @Mark Setchell. I forgot to mention that I'm using a Raspberry Pi 4.

I'm still not 100% certain what you are really trying to do, and have more thoughts than I can express in a comment. I have not tried all of what I think you are trying to do, and I may be over-thinking it, but if I put down my thought-train, maybe others will add in some helpful thoughts/corrections...
Ok, the video stream comes from the camera into the Raspberry Pi initially as RGB or YUV. It seems silly to encode that to h264 just to pass it to OpenCV on its stdin, when AFAIK OpenCV cannot easily decode it back into BGR or anything it naturally uses for face detection.
So, I think I would alter the parameters to raspivid so that it generates RGB data-frames, and remove all the h264 bitrate stuff i.e.
raspivid -rf rgb -w 1920 -h 1080 -fps 30 -o - | ffmpeg ...
Now RGB is coming into ffmpeg, so use tee and map much as you already do: send the RGB to OpenCV on its stdin and h264-encode the second stream to rtmp as before.
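For example, a completely untested sketch of what the ffmpeg invocation might look like, using two explicit outputs instead of the tee muxer and piping the raw frames into a hypothetical face_pipe.py script (sketched further below):
raspivid -rf rgb -w 1920 -h 1080 -fps 30 -o - | ffmpeg -f rawvideo -pix_fmt rgb24 -s 1920x1080 -r 30 -i - \
    -map 0:v -c:v libx264 -b:v 6000000 -f flv rtmp://xx.xx.xx.xx/live/withoutfacedetect \
    -map 0:v -c:v rawvideo -pix_fmt rgb24 -f rawvideo pipe:1 | python3 face_pipe.py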
Then in OpenCV, you just need to do a read() from stdin of 1920x1080x3 bytes to get each frame. The frame will be in RGB, but you can use:
frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
to re-order the channels to BGR as OpenCV requires.
When you read the data from stdin you need to do:
frame = sys.stdin.buffer.read(1920*1080*3)
rather than:
frame = sys.stdin.read(1920*1080*3)
which mangles binary data such as images.
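Putting those pieces together, here is a minimal sketch of the OpenCV side (the hypothetical face_pipe.py mentioned above; the 1920x1080 frame size and the Haar cascade file are assumptions, not something taken from your setup):
import sys
import numpy as np
import cv2

WIDTH, HEIGHT = 1920, 1080
FRAME_BYTES = WIDTH * HEIGHT * 3

# Cascade shipped with pip-installed opencv-python; adjust the path for other installs
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

while True:
    raw = sys.stdin.buffer.read(FRAME_BYTES)
    if len(raw) < FRAME_BYTES:  # end of stream or short read
        break
    # Interpret the raw bytes as one RGB frame, then re-order to BGR for OpenCV
    frame = np.frombuffer(raw, dtype=np.uint8).reshape((HEIGHT, WIDTH, 3))
    bgr = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(bgr, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # bgr now carries the face boxes; from here you could write it to a second
    # ffmpeg process that publishes the withfacedetect rtmp stream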

Related

How to let FFMPEG fetch frames from OpenCV and stream them to HTTP server

There is a camera that shoots at 20 frames per second; each frame is 4000x3000 pixels.
The frames are sent to software that contains OpenCV. OpenCV resizes the frames to 1920x1080, and then they must be sent to FFMPEG to be encoded to H264 or H265 using Nvidia NVENC.
The encoded video then gets streamed over HTTP to a maximum of 10 devices.
The infrastructure is crazy good (10 Gb LAN) with state-of-the-art switches, routers etc...
Right now, I can get 90 FPS when encoding the images from an NVMe SSD, so the required encoding speed is achievable.
The question is: how do I get the images from OpenCV to FFMPEG?
The stream will be watched in a web app made with the MERN stack (in case that is relevant).
For cv::Mat you have cv::VideoWriter. If you wish to use FFmpeg, then assuming the Mat is continuous, which can be enforced:
if (!mat.isContinuous())
{
    mat = mat.clone();
}
you can simply feed mat.data into sws_scale
sws_scale(videoSampler, mat.data, stride, 0, mat.rows, videoFrame->data, videoFrame->linesize);
or directly into AVFrame
For cv::cuda::GpuMat, a VideoWriter implementation is not available, but you can use the NVIDIA Video Codec SDK and similarly feed cv::cuda::GpuMat::data into NvEncoderCuda; just make sure your GpuMat has 4 channels (BGRA):
NV_ENC_BUFFER_FORMAT eFormat = NV_ENC_BUFFER_FORMAT_ABGR;
std::unique_ptr<NvEncoderCuda> pEnc(new NvEncoderCuda(cuContext, nWidth, nHeight, eFormat));
...
cv::cuda::cvtColor(srcIn, srcIn, cv::ColorConversionCodes::COLOR_BGR2BGRA);
NvEncoderCuda::CopyToDeviceFrame(cuContext, srcIn.data, 0, (CUdeviceptr)encoderInputFrame->inputPtr,
(int)encoderInputFrame->pitch,
pEnc->GetEncodeWidth(),
pEnc->GetEncodeHeight(),
CU_MEMORYTYPE_HOST,
encoderInputFrame->bufferFormat,
encoderInputFrame->chromaOffsets,
encoderInputFrame->numChromaPlanes);
Here's my complete sample of using GpuMat with NVIDIA Video Codec SDK
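If you would rather not drive the libav*/NVENC APIs yourself, another option (untested here) is to pipe the resized frames from OpenCV into an ffmpeg process and let its h264_nvenc encoder do the work. A rough Python sketch; the resolution, frame rate, capture source and output target are all placeholders:
import subprocess
import cv2

WIDTH, HEIGHT, FPS = 1920, 1080, 20

# ffmpeg reads raw BGR frames on stdin and encodes them with NVENC
ffmpeg = subprocess.Popen([
    "ffmpeg", "-y",
    "-f", "rawvideo", "-pix_fmt", "bgr24",
    "-s", f"{WIDTH}x{HEIGHT}", "-r", str(FPS),
    "-i", "-",
    "-c:v", "h264_nvenc",
    "-f", "mpegts", "out.ts",   # replace with your HTTP streaming target
], stdin=subprocess.PIPE)

cap = cv2.VideoCapture(0)       # stand-in for the 4000x3000 camera source
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.resize(frame, (WIDTH, HEIGHT))
    ffmpeg.stdin.write(frame.tobytes())

ffmpeg.stdin.close()
ffmpeg.wait()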

Python and ffmpeg create different tiff stacks

Hello everybody out there with an interest in image processing,
Creating a multipage tiff file (tiff stack) out of a grayscale movie can be achieved without programming using ffmpeg and tiffcp (the latter being part of Debian's libtiff-tools):
ffmpeg -i movie.avi frame%03d.tif
tiffcp frame*.tif stack.tif
Programming it in Python also seemed to be feasible to me using the OpenCV and tifffile libraries:
import numpy as np
import cv2
import tifffile
cap = cv2.VideoCapture('movie.avi')
success, frame = cap.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
image = np.zeros((300, 400, 500), 'uint8')  # pre-allocate some space
i = 0
while success:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    image[i, :, :] = gray[80:480, 0:500]
    i += 1  # advance to the next slice of the stack
    success, frame = cap.read()
cap.release()
tifffile.imsave('image.tif', image, photometric='minisblack')
However, the results differ in size. Looking at the histogram of the Python output, I realized that it differs from the ffmpeg output.
Thanks to the answer below, I compared the output files with the file utility:
user@ubuntu:~$ file ffmpeg.tif tifffile.tif
ffmpeg.tif: TIFF image data, little-endian
tifffile.tif: TIFF image data, little-endian, direntries=17, height=400, bps=8, compression=none, PhotometricIntepretation=BlackIsZero, description={"shape": [300, 400, 500]}, width=500
In addition, I compared the files with ffmpeg:
user@ubuntu:~$ ffmpeg -i ffmpeg.tif -i tifffile.tif
[tiff_pipe @ 0x556cfec95d80] Stream #0: not enough frames to estimate rate; consider increasing probesize
Input #0, tiff_pipe, from 'ffmpeg.tif':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: tiff, gray, 500x400 [SAR 1:1 DAR 5:4], 25 tbr, 25 tbn, 25 tbc
[tiff_pipe @ 0x556cfeca6b40] Stream #0: not enough frames to estimate rate; consider increasing probesize
Input #1, tiff_pipe, from 'tifffile.tif':
Duration: N/A, bitrate: N/A
Stream #1:0: Video: tiff, gray, 500x400 [SAR 1:1 DAR 5:4], 25 tbr, 25 tbn, 25 tbc
Which additional diagnostics could I use in order to pin down the problem?
compression algorithm
By default ffmpeg uses the packbits compression algorithm for TIFF output. This can be changed with the -compression_algo option, and other accepted values are raw, lzw, and deflate:
ffmpeg -i input.avi -compression_algo lzw output_%04d.tif
pixel format
Another difference may be caused by the pixel format (color space and chroma subsampling). See ffmpeg -h encoder=tiff for a list of supported pixel formats.
Which pixel format gets used depends on your input, and the log/console output will indicate the selected pixel format.
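For example, if you want ffmpeg's TIFF frames to be single-channel grayscale with no compression, like the tifffile output above, something along these lines (untested for your input) should do it:
ffmpeg -i movie.avi -pix_fmt gray -compression_algo raw frame%03d.tif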
comparing outputs
I don't know what defaults are used by tifffile, but you can run ffmpeg -i ffmpeg.tif -i tifffile.tif and file ffmpeg.tif tifffile.tif to view details which may explain the discrepancy.
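As an extra diagnostic, tifffile itself can print the per-page metadata of both files; a small sketch (attribute names may differ slightly between tifffile versions):
import tifffile

for name in ("ffmpeg.tif", "tifffile.tif"):
    with tifffile.TiffFile(name) as tif:
        page = tif.pages[0]
        print(name, page.shape, page.dtype, page.compression, page.photometric)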

How to detect artifacts in video?

I'm using OpenCV to handle videos in mp4 format. The image below is a random frame extracted from a video, and you can see the obvious distortion on the sweater.
How can we detect such artifacts? Or can we avoid such artifacts by extracting nearby keyframes and how?
As @VC.One suggested, these distortions are due to video interlacing. Here is a good article about interlacing/deinterlacing: What is Deinterlacing? Facts, solutions, examples.
There are several tools to handle deinterlacing:
[Windows] The one suggested on 100fps.com: VirtualDub + DivX codec + AviSynth
[Windows] MediaCoder, suggested by @VC.One.
[Windows/Linux] FFmpeg provides several deinterlacing filters, e.g. yadif, kerndeint etc. Here is an example: ffmpeg -i input.mp4 -vf yadif output.mp4
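For the detection part of the question, ffmpeg's idet filter can report whether frames look interlaced before you decide to deinterlace, for example:
ffmpeg -i input.mp4 -vf idet -an -f null -
At the end of the run it logs how many frames were detected as TFF, BFF, progressive or undetermined.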

H.264 / H.265 Compression of a single Bitmap-Image

I hope someone can help me.
I started researching different compression methods to compress bitmap images both losslessly and lossily. The first methods I used were JPEG, JPEG 2000 and JPEG-XR. Now I want to compare these "standard" ones with H.264 and H.265; maybe they perform as well for single images as they do for video compression.
I tried using ffmpeg, but I can't find out which parameters I need; there are plenty... So maybe someone can help me or link me to an article/howto or something else?!
Thanks a lot!
EDIT:
I used the following command:
ffmpeg -i 01.bmp -c:v libx264 -preset veryslow -crf 40 test.avi
but this created a 7 kB file from a 76.8 kB input file... not a very good compression ratio... is there any possibility to achieve more?
"-crf 40" will choose bitrate around QP = 40, that is somehow low visual quality.
For H.264, QP = 0 ~ 51, where 0 is the best.
So you can consider use "-crf = 16", or even smaller number.
I believe the quality will be much better.
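For example, mirroring your original command (untested):
ffmpeg -i 01.bmp -c:v libx264 -preset veryslow -crf 16 test.avi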

C++ TIFF (raw) to JPEG : Faster than ImageMagick?

I need to convert many TIFF images to JPEG per second. Currently I'm using libmagick++ (Q16). I'm in the process of compiling ImageMagick Q8, as I read that it may improve performance (especially because I'm only working with 8-bit images).
CImg also looks like a good option, and GraphicsMagick claims to be faster than ImageMagick. I haven't tested either of those yet, but I was wondering if there are any other alternatives that could be faster than ImageMagick Q8?
I'm looking for a Linux only solution.
UPDATE with GraphicsMagick & ImageMagick Q8
Base comparison (see comment to Mark): 0.2 secs with ImageMagick Q16
I successfully compiled GraphicsMagick with Q8, but in the end it seems about 30% slower than ImageMagick (0.3 secs).
After compiling ImageMagick with Q8, there was a gain of about 25% (0.15 secs). Nice :)
UPDATE with VIPS
Thanks to Mark's post, I gave VIPS a try, using the 7.38 version found in the Ubuntu Trusty repositories:
time vips copy input.tiff output.jpg[Q=95]
real 0m0.105s
user 0m0.130s
sys 0m0.038s
Very nice :)
I also tried with the 7.42 version (from ppa:dhor/myway) but it seems slightly slower:
real 0m0.134s
user 0m0.168s
sys 0m0.039s
I will try to compile VIPS from source and see if I can beat that time. Well done Mark!
UPDATE with VIPS 8.0
Compiled from source, vips-8.0 gets practically the same performance as 7.38:
real 0m0.100s
user 0m0.137s
sys 0m0.031s
Configure command:
./configure CC=c99 CFLAGS=-O2 --without-magick --without-OpenEXR --without-openslide --without-matio --without-cfitsio --without-libwebp --without-pangoft2 --without-zip --without-png --without-python
I have a few thoughts...
Thought 1
If your input images are 15MB and, for argument's sake, your output images are 1MB, you are already using 80MB/s of disk bandwidth to process 5 images a second - which is already around 50% of what a sensible disk might sustain. I would do a little experiment with using a RAMdisk to see if that might help, or an SSD if you have one.
Thought 2
Try experimenting with using VIPS from the command line to convert your images. I benchmarked it like this:
# Create dummy input image with ImageMagick
convert -size 3288x1152! xc:gray +noise gaussian -depth 8 input.tif
# Check it out
ls -lrt
-rw-r--r--@ 1 mark staff 11372808 28 May 11:36 input.tif
identify input.tif
input.tif TIFF 3288x1152 3288x1152+0+0 8-bit sRGB 11.37MB 0.000u 0:00.000
# Convert to JPEG with ImageMagick
time convert input.tif output.jpg
real 0m0.409s
user 0m0.330s
sys 0m0.046s
# Convert to JPEG with VIPS
time vips copy input.tif output.jpg
real 0m0.218s
user 0m0.169s
sys 0m0.036s
Mmm, seems a good bit faster. YMMV of course.
Thought 3
Depending on the result of your test on disk speed, if your disk is not the limiting factor, consider using GNU Parallel to process more than one image at a time if you have a quad core CPU. It is pretty simple to use and I have always had excellent results with it.
For example, here I sequentially process 32 TIFF images created as above:
time for i in {0..31} ; do convert input-$i.tif output-$i.jpg; done
real 0m11.565s
user 0m10.571s
sys 0m0.862s
Now, I do exactly the same with GNU Parallel, doing 16 in parallel at a time
time parallel -j16 convert {} {.}.jpg ::: *tif
real 0m2.458s
user 0m15.773s
sys 0m1.734s
So, that's now 13 images per second, rather than 2.7 per second.
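If you later want to call libvips from code instead of shelling out to the vips CLI, the pyvips binding (there is also a C++ API) does the same conversion; a minimal sketch, with the file names and Q value as placeholders:
import pyvips

# Convert one TIFF to JPEG in-process; Q=95 mirrors the CLI test above
image = pyvips.Image.new_from_file("input.tif", access="sequential")
image.write_to_file("output.jpg", Q=95)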
