Change video stream resolution in YoloV4 demo - opencv

Here's what shows when loading the live stream demo for Yolov4:
Webcam index: 2
[ WARN:0] global ../modules/videoio/src/cap_gstreamer.cpp (935) open OpenCV | GStreamer warning: Cannot query video position: status=0, value=-1, duration=-1
Video stream: 2304 x 1536
Objects:
Then it starts detecting objects at 2 fps.
How do I change the video stream resolution to 1080p or 720p? The frame rate is very slow, and lowering the resolution appears to be the fix.
I can't find the setting in the Makefile or the cfg folder. Any thoughts? Is this an OpenCV problem?
Thanks!
cfg settings:
[net]
batch=64
subdivisions=8
# Training
#width=512
#height=512
width=320
height=320
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.0013
burn_in=1000
max_batches = 500500
policy=steps
steps=400000,450000
scales=.1,.1
I tried the built-in camera and my phone connected as an IP camera, and got 1080p on both with smooth results. I couldn't find anywhere to change the settings of the webcam, which is stuck at 2304x1536. Where would the camera settings be located?

After searching around for a solution to this issue myself, I finally found it!
In the darknet/src/ folder there is a file named "image_opencv.cpp". At lines 597 and 598 you will find the following two commented-out commands:
//cap->set(CV_CAP_PROP_FRAME_WIDTH, 1280);
//cap->set(CV_CAP_PROP_FRAME_HEIGHT, 960);
Simply uncommenting these commands produced a lot more errors. This is because YOLOv4 (and my install) uses OpenCV 4.1.1, which has a different syntax for these constants. Your resolution should change to 1920x1080 if you replace the two commands above with these:
cap->set(cv::CAP_PROP_FRAME_WIDTH, 1920);
cap->set(cv::CAP_PROP_FRAME_HEIGHT, 1080);
Note that the comment slashes have been removed to activate the commands. Rebuild darknet afterwards so the change takes effect.
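For readers who drive the camera from Python instead of editing darknet's C++ source, the same request might look roughly like this. It is a minimal sketch, not darknet code: `PRESETS` and `set_resolution` are hypothetical names, and the integers 3 and 4 are the values of `cv2.CAP_PROP_FRAME_WIDTH` and `cv2.CAP_PROP_FRAME_HEIGHT`. The capture object only needs `set()` and `get()` methods, as `cv2.VideoCapture` has.

```python
# Values of cv2.CAP_PROP_FRAME_WIDTH and cv2.CAP_PROP_FRAME_HEIGHT.
CAP_PROP_FRAME_WIDTH = 3
CAP_PROP_FRAME_HEIGHT = 4

# Hypothetical preset table for the two sizes asked about in the question.
PRESETS = {"720p": (1280, 720), "1080p": (1920, 1080)}


def set_resolution(cap, preset):
    """Request a preset size on an opened capture and return the size it reports.

    `cap` is expected to behave like cv2.VideoCapture (set/get methods).
    The driver may ignore the request, so the reported size is read back.
    """
    width, height = PRESETS[preset]
    cap.set(CAP_PROP_FRAME_WIDTH, width)
    cap.set(CAP_PROP_FRAME_HEIGHT, height)
    return (int(cap.get(CAP_PROP_FRAME_WIDTH)),
            int(cap.get(CAP_PROP_FRAME_HEIGHT)))
```

With a real camera this would be called as `set_resolution(cv2.VideoCapture(2), "720p")`, where 2 is the webcam index shown in the log above. If the returned size differs from the preset, the camera or backend did not honour the request.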

Related

Nvidia codec SDK samples: can't decode an encoded file correctly

I'm trying out the sample applications in the Nvidia video codec sdk, and am having trouble getting a useable decoded result.
My input file is YUV 4:2:0, taken from here, which is 352x288px.
I'm encoding using the AppEncD3D12.exe sample, with the following command:
.\AppEncD3D12.exe -i D:\akiyo_cif.y4m -s 352x288 -o D:\akiyo_out.mp4
This gives the output
GPU in use: NVIDIA GeForce RTX 2080 Super with Max-Q Design
[INFO ][17:46:39] Encoding Parameters:
codec : h264
preset : p3
tuningInfo : hq
profile : (default)
chroma : yuv420
bitdepth : 8
rc : vbr
fps : 30/1
gop : 250
bf : 1
multipass : 0
size : 352x288
bitrate : 0
maxbitrate : 0
vbvbufsize : 0
vbvinit : 0
aq : disabled
temporalaq : disabled
lookahead : disabled
cq : 0
qmin : P,B,I=0,0,0
qmax : P,B,I=0,0,0
initqp : P,B,I=0,0,0
Total frames encoded: 112
Saved in file D:\akiyo_out.mp4
Which looks promising. However, using the decode sample, a single frame of the output contains what look like 12 smaller frames of the input, in monochrome.
I'm running the decode sample like this:
PS D:\Nvidia\Video_Codec_SDK_11.1.5\Samples\build\Debug> .\AppDecD3D.exe -i D:\akiyo_out.mp4
GPU in use: NVIDIA GeForce RTX 2080 Super with Max-Q Design
Display with D3D9.
[INFO ][17:58:58] Media format: raw H.264 video (h264)
Session Initialization Time: 23 ms
[INFO ][17:58:58] Video Input Information
Codec : AVC/H.264
Frame rate : 30000/1000 = 30 fps
Sequence : Progressive
Coded size : [352, 288]
Display area : [0, 0, 352, 288]
Chroma : YUV 420
Bit depth : 8
Video Decoding Params:
Num Surfaces : 7
Crop : [0, 0, 0, 0]
Resize : 352x288
Deinterlace : Weave
Total frame decoded: 112
Session Deinitialization Time: 8 ms
I'm quite new to this so could be doing something stupid. Right now I don't know whether to look at encode or decode! Any ideas or tips most appreciated.
-I've tried other YUV files with the same result. I read that 4:2:2 is not supported, the above is 4:2:0.
Using the AppEncCuda sample, the decoded video (played with AppDecD3D.exe) is the correct size and in colour, but the video appears to scroll to the right as it is played, with colour information not scrolling at the same rate as the image
You have two problems:
1) According to the code and remarks in the AppEncD3D12 sample, it expects the input frames to be in ARGB format, but your input file is YUV, so the sample reads data from the YUV file and treats it as ARGB. If you want AppEncD3D12 to work with this file, you need to either convert each YUV frame to ARGB or change the code to accept YUV as input. The AppEncCuda sample expects YUV as input, which is why it gives you better results. You can also see this in the frame counts: AppEncD3D12 encoded a total of 112 frames, but AppEncCuda encoded a total of 300 frames, because YUV frames are smaller than ARGB frames.
2) Both samples save the output as raw H.264. The file is not really an MP4 despite the name you gave it. A few players can play a file of raw H.264 data, and you can try one of them on the output file. Another option is to use FFmpeg to create a valid MP4 file and pass the raw H.264 samples to it. The NVIDIA encoder encodes the video but does not handle the creation of video file containers (there are too many container types, such as avi, mpg, mp4, mkv, ts, etc.); you should use FFmpeg or another solution for that. The SDK samples contain a file FFmpegStreamer.h under the Utils folder that shows how to use FFmpeg to output H.264 video in MPEG-2 transport stream format to a file (*.ts) or to the network.
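The two frame counts are consistent with the first point. As a rough check, assuming 8-bit YUV 4:2:0 takes 1.5 bytes per pixel and ARGB takes 4 bytes per pixel, and ignoring the small y4m headers (the helper names are illustrative):

```python
def yuv420_frame_bytes(width, height):
    """Bytes in one 8-bit YUV 4:2:0 frame: full-size Y plus quarter-size U and V."""
    return width * height * 3 // 2


def argb_frame_bytes(width, height):
    """Bytes in one 8-bit ARGB frame: four bytes per pixel."""
    return width * height * 4


def frames_if_read_as_argb(yuv_frames, width, height):
    """Whole ARGB-sized frames that fit in the bytes of `yuv_frames` YUV frames."""
    total_bytes = yuv_frames * yuv420_frame_bytes(width, height)
    return total_bytes // argb_frame_bytes(width, height)


# 300 YUV frames of 352x288 reinterpreted as ARGB frames of the same size.
print(frames_if_read_as_argb(300, 352, 288))  # prints 112
```

A 352x288 YUV 4:2:0 frame is 152064 bytes and an ARGB frame is 405504 bytes, so the bytes of 300 YUV frames fill only 112 whole ARGB frames, which matches the "Total frames encoded: 112" line in the encoder log.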

cv2.VideoCapture(0, cv2.DSHOW) returns none

I'm trying to capture video from a built-in webcam on a laptop (or an external USB camera) using OpenCV, specifically VideoCapture with the DSHOW argument.
I know there is a way to set the resolution and even the FPS; however, the capture returns None when I pass the DirectShow API argument.
For example:
# returns my webcam's stream, but all optional arguments are ignored
camera = cv2.VideoCapture(0)
camera = cv2.VideoCapture(0, cv2.CAP_V4L2)
# returns none and loops infinitely or errors out when *if im.any()*
camera = cv2.VideoCapture(0, cv2.CAP_DSHOW)
This is the code that follows after the above;
# should set resolution, settings are always ignored
camera.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
camera.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)
while(True):
    retval, im = camera.read()
    if im.any(): # errors out when image is none
        cv2.imshow("image", im)
    k = cv2.waitKey(33)
    if k==27: # Esc key press
        print('Resolution: {0}x and {1}y'.format(im.shape[1],im.shape[0]))
        print('FPS: {0}'.format(camera.get(cv2.CAP_PROP_FPS)))
        break
camera.release()
cv2.destroyAllWindows()
Is DSHOW the correct API to use, and is it the only API that can change the resolution and FPS of a camera stream using OpenCV? Or is there something else I'm doing incorrectly?
More details about the system.
Ubuntu 18.04.6
python 3.9.5
opencv-python 4.5.2.52
Thank you in advance for the help!
Regards, Tiz
DSHOW (and MSMF) are Windows-only.
On Linux, use V4L, FFMPEG or GSTREAMER.
Also, please check the return value of capture.set();
not all properties/values will be supported on any given machine.
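A minimal sketch of that advice, assuming the backend is chosen from `sys.platform`. The helper names `backend_name_for` and `open_camera` are hypothetical; `cv2.CAP_DSHOW`, `cv2.CAP_V4L2` and `cv2.CAP_ANY` are existing OpenCV constants.

```python
import sys


def backend_name_for(platform=None):
    """Return the name of a cv2 capture-backend constant suited to the platform."""
    platform = sys.platform if platform is None else platform
    if platform.startswith("win"):
        return "CAP_DSHOW"   # DirectShow, Windows only
    if platform.startswith("linux"):
        return "CAP_V4L2"    # Video4Linux2
    return "CAP_ANY"         # let OpenCV choose


def open_camera(index=0, width=1920, height=1080):
    """Open a camera with a platform-appropriate backend and check each set()."""
    import cv2  # imported here so the helper above works without OpenCV

    cap = cv2.VideoCapture(index, getattr(cv2, backend_name_for()))
    if not cap.isOpened():
        raise RuntimeError("could not open camera {}".format(index))
    for prop, value in ((cv2.CAP_PROP_FRAME_WIDTH, width),
                        (cv2.CAP_PROP_FRAME_HEIGHT, height)):
        if not cap.set(prop, value):
            print("property {} = {} was rejected".format(prop, value))
    return cap
```

In the read loop, testing `retval` before touching `im` avoids the error on a None frame. A `set()` call that returns True is still only a request, so reading the value back with `get()` is the safer confirmation.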

OpenCV with multiple webcams - how to tell which camera is which in code?

Previously I've used industrial cameras with Ethernet connections and distinct IP addresses for multiple camera setups. Now I'm attempting a multiple camera setup with OpenCV and I'm not sure how to match the OpenCV VideoCapture ID to a certain camera.
I should probably use my current situation as an example to make my question more clear. I currently have 3 cameras connected. I'm using Ubuntu 18.04 if that matters. Here is my output from lsusb (omitting everything except the 3 Logitech webcams I have connected):
$ lsusb
Bus 001 Device 013: ID 046d:0843 Logitech, Inc. Webcam C930e
Bus 001 Device 003: ID 046d:0843 Logitech, Inc. Webcam C930e
Bus 001 Device 006: ID 046d:0892 Logitech, Inc. OrbiCam
As you can see I have 2 C930es and one OrbiCam connected. Based on this very helpful post:
https://superuser.com/questions/902012/how-to-identify-usb-webcam-by-serial-number-from-the-linux-command-line
I found I could get the serial number of the cams like so:
$ sudo lsusb -v -d 046d:0843 | grep -i serial
iSerial 1 D2DF1D2E
iSerial 1 99A8F15E
$ sudo lsusb -v -d 046d:0892 | grep -i serial
iSerial 1 C83E952F
Great, so I now have a way to uniquely identify each camera based on the serial numbers stored in the cam's memory (D2DF1D2E, 99A8F15E, and C83E952F).
The problem is, opening a webcam connection in OpenCV is done as follows:
vidCapForCamX = cv2.VideoCapture(OPEN_CV_VID_CAP_ID_FOR_CAM_X)
vidCapForCamY = cv2.VideoCapture(OPEN_CV_VID_CAP_ID_FOR_CAM_Y)
vidCapForCamZ = cv2.VideoCapture(OPEN_CV_VID_CAP_ID_FOR_CAM_Z)
Where camera X, Y, and Z are the 3 cameras I need to use, each for a different determined purpose, and OPEN_CV_VID_CAP_ID_FOR_CAM_X, Y, and Z are the OpenCV VideoCapture IDs. Right now, I'm relating cameras to the OpenCV VideoCapture IDs with the following manual process:
1) Make a test script like this:
# cam_test.py
import numpy as np
import cv2
cap = cv2.VideoCapture(4)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)
while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    # Display the resulting frame
    cv2.imshow('frame', frame)
    keyPress = cv2.waitKey(10)
    if keyPress == ord('q'):
        break
    # end if
# end while
# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
2) Try numbers 0-99 for the VideoCapture parameter until I find the 3 magic numbers for my 3 attached cameras. In my current example they are 0, 2, and 4.
3) Each time I find a valid VideoCapture ID, wave my hand in front of each camera until I determine which one that VideoCapture ID is for, then write down which camera in my project that needs to correspond to, ex in my case:
0 => serial D2DF1D2E => cam X
2 => serial 99A8F15E => cam Y
4 => serial C83E952F => cam Z
4) Edit my code (or a stored config file or database field) so cam X uses VideoCapture ID 0, cam Y uses VideoCapture ID 2, etc.
I should clarify that cameras X, Y, and Z are in different positions and serve different purposes, i.e. if I use VideoCapture ID 4 for cam X the application wouldn't work (they have to be mapped a certain way as above).
Clearly for a production application this routine is not acceptable.
I realize I can do something like this:
import cv2
openCvVidCapIds = []
for i in range(100):
    try:
        cap = cv2.VideoCapture(i)
        if cap is not None and cap.isOpened():
            openCvVidCapIds.append(i)
        # end if
    except:
        pass
    # end try
# end for
print(str(openCvVidCapIds))
This gets a list of the valid OpenCV VideoCapture IDs, but I still have to do the manual hand-wave routine to determine which OpenCV VideoCapture ID corresponds to each camera.
To make matters worse, swapping which camera is connected to which physical port on a device shuffles the OpenCV VideoCapture IDs, so if any camera connection is changed, or a cam is added or removed the manual process has to be repeated for all cameras.
So my question is, is there some genius way (in code, not a manual way) to relate the serial number of each camera or some other unique ID stored in the cam's memory to the magic numbers that OpenCV seems to come up with for VideoCapture IDs?
To put my question another way, I need to write a function camSerialNumToOpenCvVidCapId that could be used like so:
vidCapForCamX = cv2.VideoCapture(camSerialNumToOpenCvVidCapId("D2DF1D2E"))
vidCapForCamY = cv2.VideoCapture(camSerialNumToOpenCvVidCapId("99A8F15E"))
vidCapForCamZ = cv2.VideoCapture(camSerialNumToOpenCvVidCapId("C83E952F"))
Is this possible and how could this be done?
P.S. I'm comfortable with OpenCV C++ or Python, any helpful answers using either would be greatly appreciated.
--- Edit ---
This question:
OpenCV VideoCapture device index / device number
Has a response (not accepted) that pertains to using Windows API calls, but I'm using Ubuntu.
--- Edit2 ---
@Micka, here is what I have for cameras in /dev/:
$ ls -l /dev/video*
crw-rw----+ 1 root video 81, 0 Nov 20 12:26 /dev/video0
crw-rw----+ 1 root video 81, 1 Nov 20 12:26 /dev/video1
crw-rw----+ 1 root video 81, 2 Nov 20 12:26 /dev/video2
crw-rw----+ 1 root video 81, 3 Nov 20 12:26 /dev/video3
crw-rw----+ 1 root video 81, 4 Nov 20 12:26 /dev/video4
crw-rw----+ 1 root video 81, 5 Nov 20 12:26 /dev/video5
I'm not sure if this helps
--- Edit3 ---
After considering this some more what I really need is a cam property in OpenCV to identify each camera uniquely. After getting a list of available VideoCapture IDs as mentioned above, if there was a property like:
serialNum = cv2.get(cv2.CAP_PROP_SERIAL_NUM)
Then it would be easy, but there does not seem to be such a property or anything similar (after checking PyCharm auto-complete for cv2.CAP_PROP_* and reading the OpenCV docs for VideoCapture).
For the solution you found, you need root privileges. On my setup with Ubuntu20 this is not required for:
udevadm info --name=/dev/video0
This outputs the properties of the first camera detected. Pipe it through "grep" to filter for a specific property that differs between cameras, such as "ID_SERIAL=". You can then use "cut" to remove the leading "ID_SERIAL=" and leave just the value:
udevadm info --name=/dev/video0 | grep ID_SERIAL= | cut -d "=" -f 2
In Python you can run an external command to get this information, for example:
import subprocess
def get_cam_serial(cam_id):
    # Prepare the external command to extract serial number.
    p = subprocess.Popen('udevadm info --name=/dev/video{} | grep ID_SERIAL= | cut -d "=" -f 2'.format(cam_id),
                         stdout=subprocess.PIPE, shell=True)
    # Run the command
    (output, err) = p.communicate()
    # Wait for it to finish
    p.status = p.wait()
    # Decode the output
    response = output.decode('utf-8')
    # The response ends with a new line so remove it
    return response.replace('\n', '')
To acquire all the camera serial numbers, just loop through several camera IDs. On my setup, camera IDs 0 and 1 target the same camera. Also 2 and 4 target the second camera, so the loop can use a step of 2. Once all the serial numbers are extracted, place them in a dictionary so that each cam ID can be associated with its serial number. The complete code could be:
import subprocess
serials = {}
FILTER = "ID_SERIAL="
def get_cam_serial(cam_id):
    p = subprocess.Popen('udevadm info --name=/dev/video{} | grep {} | cut -d "=" -f 2'.format(cam_id, FILTER),
                         stdout=subprocess.PIPE, shell=True)
    (output, err) = p.communicate()
    p.status = p.wait()
    response = output.decode('utf-8')
    return response.replace('\n', '')
for cam_id in range(0, 10, 2):
    serial = get_cam_serial(cam_id)
    if len(serial) > 6:
        serials[cam_id] = serial
print('Serial numbers:', serials)
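The same lookup can be done without a shell pipeline by parsing the `udevadm info` text in Python. This is a sketch under the assumption that property lines have the form `E: KEY=value`; `parse_udev_property` is a hypothetical helper name, and `get_cam_serial` here is an alternative to the version above, not a drop-in copy of it.

```python
import subprocess


def parse_udev_property(udevadm_output, key="ID_SERIAL"):
    """Return the value of `key` from `udevadm info` text, or None if absent."""
    prefix = key + "="
    for line in udevadm_output.splitlines():
        # Property lines are assumed to look like "E: KEY=value".
        _, _, rest = line.partition(": ")
        if rest.startswith(prefix):
            return rest[len(prefix):]
    return None


def get_cam_serial(cam_id):
    """Query udevadm for /dev/video<cam_id>; returns None if the query fails."""
    result = subprocess.run(
        ["udevadm", "info", "--name=/dev/video{}".format(cam_id)],
        stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
        universal_newlines=True)
    if result.returncode != 0:
        return None
    return parse_udev_property(result.stdout)
```

Passing the command as a list avoids `shell=True`, and matching on the full `KEY=` prefix keeps `ID_SERIAL` from being confused with `ID_SERIAL_SHORT`. If `udevadm` is not installed, `subprocess.run` raises `FileNotFoundError`, which a caller may want to catch.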
This is not very difficult to do. On Linux, browse to the directory
/dev/v4l/by-id/
This directory lists all the webcams connected to your system, with names like usb-046d_081b_31296650-video-index0. Copy this ID and use it in your code in the following manner:
cv::VideoCapture camera;
camera.open("/dev/v4l/by-id/usb-046d_081b_31296650-video-index0");
cv::Mat frame;
camera >> frame;
For different cameras, you can first note down their IDs and then refer to them in your code.
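Instead of copying the names by hand, the directory can also be read from Python. The sketch below assumes that each entry in /dev/v4l/by-id/ is a symlink to a /dev/videoN node; `map_cameras_by_id` is a hypothetical helper name.

```python
import os
import re


def map_cameras_by_id(by_id_dir="/dev/v4l/by-id"):
    """Return {link_name: video_index} for every entry in by_id_dir."""
    mapping = {}
    if not os.path.isdir(by_id_dir):
        return mapping
    for name in sorted(os.listdir(by_id_dir)):
        # Resolve the symlink and keep it only if it points at a videoN node.
        target = os.path.realpath(os.path.join(by_id_dir, name))
        match = re.fullmatch(r"video(\d+)", os.path.basename(target))
        if match:
            mapping[name] = int(match.group(1))
    return mapping
```

The link names contain the vendor ID, product ID and serial number, so searching the keys for a known serial gives the index to pass to `cv2.VideoCapture`. The mapping can be rebuilt at startup, which should cope with cameras being moved between ports.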

Can't get an OpenCV cv2.VideoCapture RTSP stream to work from an IP camera

I am trying to read an RTSP stream from my IP camera using OpenCV on Linux. The camera is a Floureon IPC 360 from China. I am trying to develop some facial recognition code.
I am using the following code:
import numpy as np
import cv2
vcap = cv2.VideoCapture("rtsp://192.168.1.240:554/realmonitor?channel=0")
print(vcap)
while(1):
    ret, frame = vcap.read()
    print (ret,frame)
    cv2.imshow('VIDEO', frame)
    #cv2.imwrite('messigray.png',frame)
    cv2.waitKey(1)
$ python w.py
<VideoCapture 0x7fc685598230>
(False, None)
Traceback (most recent call last):
File "w.py", line 9, in <module>
cv2.imshow('VIDEO', frame)
cv2.error: OpenCV(4.1.0) /io/opencv/modules/highgui/src/window.cpp:352: error: (-215:Assertion failed) size.width>0 && size.height>0 in function 'imshow'
cv2.imshow is failing as the frame is 'None' & (ret is False).
In a separate window I can run openRTSP :
./openRTSP -4 -P 10 -F cam_eight -t -d 8 rtsp://192.168.1.240:554/realmonitor?channel=0
Which creates me a nice mp4 file that I can play:
107625 Sep 12 19:08 cam_eight-00000-00010.mp4
openRTSP works with or without the -t (TCP) option.
I have also tried supplying the admin:123456 credentials to the cv2.VideoCapture line, which openRTSP doesn't appear to require.
Any ideas why cv2.VideoCapture is apparently failing ?
I have tried variants of the above code, but nothing seems to work.
I have enabled ONVIF on the camera
According to other answers, OpenCV cannot acquire ONVIF streams by default, because it forces the stream to use the TCP protocol, while ONVIF relies on UDP.
You should define the environment variable OPENCV_FFMPEG_CAPTURE_OPTIONS to override the default TCP setting, as can be seen in the OpenCV source code:
OPENCV_FFMPEG_CAPTURE_OPTIONS=whatever
If you want to properly configure the capture options, then you should refer to the ffmpeg documentation, which is used internally by OpenCV.
As stated in the linked answer, keys and values are separated with ; and pairs are separated via |.
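As an illustration of that format, the sketch below builds the option string and sets the variable from Python. The helper names are hypothetical; `rtsp_transport` is an FFmpeg RTSP option, and using `udp` as its value is the assumption taken from the explanation above.

```python
import os


def build_capture_options(options):
    """Join {key: value} into the "key;value|key;value" form described above."""
    return "|".join("{};{}".format(k, v) for k, v in options.items())


def use_udp_transport():
    """Ask OpenCV's FFmpeg backend to use UDP for RTSP (set before opening)."""
    os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = build_capture_options(
        {"rtsp_transport": "udp"})
```

`use_udp_transport()` would be called before `cv2.VideoCapture(url, cv2.CAP_FFMPEG)` is created, since the variable is expected to be read when the capture is opened.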

CAP_PROP_FPS doesn't change in opencv 3

So I set
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FPS, 60)
I also tried the integer 5 instead of cv2.CAP_PROP_FPS. Nevertheless, the frame rate doesn't change. I get 30 when I run
print(cap.get(cv2.CAP_PROP_FPS))
Why?
The problem may be with the codec of the camera stream and not with the FPS itself. For example, if your camera only supports YUYV, you can probably only use certain specific frame rates. Try the app guvcview to check this in a GUI.
Try changing the codec to MJPG and then setting the FPS using CAP_PROP_FPS. I'm using a Logitech C922 Pro and this works for me to configure 1080p at 30 fps; if you have another camera, you will probably need to use a lower resolution to achieve 30 fps:
import cv2 as cv
CAMERANUM = 0  # assumed camera index (not given in the original); adjust for your system
def decode_fourcc(v):
    # Turn the numeric FOURCC code into its four-character string
    v = int(v)
    return "".join([chr((v >> 8 * i) & 0xFF) for i in range(4)])
def setfourccmjpg(cap):
    oldfourcc = decode_fourcc(cap.get(cv.CAP_PROP_FOURCC))
    codec = cv.VideoWriter_fourcc(*'MJPG')
    res=cap.set(cv.CAP_PROP_FOURCC,codec)
    if res:
        print("codec in ",decode_fourcc(cap.get(cv.CAP_PROP_FOURCC)))
    else:
        print("error, codec in ",decode_fourcc(cap.get(cv.CAP_PROP_FOURCC)))
cap = cv.VideoCapture(CAMERANUM)
setfourccmjpg(cap)
w=1920
h=1080
fps=30
res1=cap.set(cv.CAP_PROP_FRAME_WIDTH,w)
res2=cap.set(cv.CAP_PROP_FRAME_HEIGHT,h)
res3=cap.set(cv.CAP_PROP_FPS,fps)
Then resume your normal video capture polling loop.
Not all OpenCV capture properties are supported by every camera. Each camera supports a different set of properties and values, so you need to find out which ones your camera supports.
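One way to find out is to request each property and read it back. The sketch below is an assumption about how that could be organised, not an OpenCV facility: `apply_and_verify` and `unsupported` are hypothetical helpers, and `cap` is any object with `cv2.VideoCapture`-style `set()` and `get()` methods.

```python
def apply_and_verify(cap, requested):
    """Set each property on `cap`, read it back, and report the outcome.

    `requested` maps property IDs (for example cv2.CAP_PROP_FPS) to desired
    values. Returns {prop_id: (accepted, actual_value)}.
    """
    report = {}
    for prop_id, value in requested.items():
        accepted = bool(cap.set(prop_id, value))
        report[prop_id] = (accepted, cap.get(prop_id))
    return report


def unsupported(report, requested):
    """Property IDs whose read-back value differs from the requested one."""
    return [p for p, (_, actual) in report.items() if actual != requested[p]]
```

For the question above, `apply_and_verify(cap, {cv2.CAP_PROP_FPS: 60})` would show whether the request was accepted and which frame rate the camera actually reports afterwards.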

Resources