I am trying to estimate a guess for the intrinsic matrix, K, of a DJI Phantom 4 drone. I know the general form of this matrix,
but I can't seem to get the units right. Looking up the specs at https://www.dji.com/phantom-4/info#specs I find that the focal length is 8.88 (it doesn't say the units...) and the image dimensions are 4000x3000. What would K look like with these?
PS: I am scaling down the images so they are smaller. Will this affect the K matrix I should use for OpenCV?
The page the OP linked to lists a FOV of 94 degrees. With an image width of 4000 pixels this corresponds to a focal length of
f = (4000 / 2) pixels / tan(94 / 2 degrees) = 1865 pixels
Absent any other calibration data, one should therefore use an estimated camera matrix of the form:
K = [ [1865, 0 , 2000],
[0 , 1865, 1500],
[0 , 0 , 1 ] ]
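For reference, a minimal Python sketch of this estimate (the numbers are the spec values quoted above; this is only an FOV-based guess, not a calibration):
import numpy as np

# Rough intrinsics for the Phantom 4 from the spec values quoted above.
# Assumption: the listed 94 deg FOV is the horizontal FOV of the 4000x3000 still.
width_px, height_px = 4000, 3000
fov_h_deg = 94.0

f_px = (width_px / 2.0) / np.tan(np.radians(fov_h_deg / 2.0))  # ~1865 px

K = np.array([[f_px, 0.0,  width_px / 2.0],
              [0.0,  f_px, height_px / 2.0],
              [0.0,  0.0,  1.0]])
print(K)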
OP, you may have confused the specs of the P4 and the P4Pro, which have different sensors and lenses. The P4Pro, not the P4, has a focal length of 8.8mm. The P4 has a focal length of 3.61mm.
If you are indeed using images from a P4, Francesco's answer is correct.
However, if you are actually using images from a P4Pro, you need to use these values:
f = (4864 / 2) pixels / tan(84 / 2 degrees) = 2701 pixels
K = [ [2701, 0 , 2432],
[0 , 2701, 1824],
[0 , 0 , 1 ] ]
For future reference for anyone that may find this answer, here are the relevant specs for the P4 and P4Pro sensors/lenses:
Phantom 4:
Sensor size: 1/2.3" (6.17mm x 4.55mm)
Focal length (actual): 3.61mm
Focal length (35mm equivalent): 20mm
FOV: 94°
Image size: 4000×3000 pixels
Video frame size
UHD: 4096×2160 pixels
4K: 3840×2160 pixels
2.7K: 2704×1520 pixels
FHD: 1920×1080 pixels
HD: 1280×720 pixels
Phantom 4 Pro:
Sensor size: 1" (12.8mm x 9.6mm)
Focal length (actual): 8.88mm
Focal length (35mm equivalent): 24mm
FOV: 84°
Image size
3:2 aspect ratio: 5472×3648 pixels
4:3 aspect ratio: 4864×3648 pixels
16:9 aspect ratio: 5472×3078 pixels
Video frame size
C4K: 4096×2160 pixels
4K: 3840×2160 pixels
2.7K: 2720×1530 pixels
FHD: 1920×1080 pixels
HD: 1280×720 pixels
I think it is much better to work from the focal length in mm
https://www.dxomark.com/Cameras/DJI/Phantom4-Pro---Specifications
For P4 Pro:
The sensor is 13.2 mm x 8.8 mm and 5472 pixels across, so the pixel size is 13.2 / 5472 = 0.00241 mm (2.41 µm), and the spec focal length is 8.88 mm,
so the focal length in pixels = 8.88 / 0.00241 ≈ 3684.6 pixels.
Incidentally, in the image metadata there is a field
CalibratedFocalLength 3666.666504 (use exiftool to find it), so I think K should be
K = [ [3666.6, 0 , 2432],
[0 , 3666.6, 1824],
[0 , 0 , 1 ] ]
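A short sketch of the same arithmetic (assuming the 13.2 mm sensor width spans the 5472-pixel image width and taking the 8.88 mm focal length quoted in the question; small differences from 3684.6 come from rounding the pixel size):
# Focal length in pixels from the physical focal length and the pixel pitch.
sensor_width_mm = 13.2
sensor_width_px = 5472
focal_mm = 8.88

pixel_size_mm = sensor_width_mm / sensor_width_px   # ~0.00241 mm (2.41 um)
f_px = focal_mm / pixel_size_mm                     # ~3681 px
print(pixel_size_mm, f_px)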
Related
I'm trying to measure the distance in real time from a stereo pair to a person detected in the scene. First I calibrated both cameras separately with a 9x6 checkerboard (square size of 59 mm) and obtained an RMS error between 0.15 and 0.19 for both cameras. Using the obtained parameters I calibrated the stereo pair, and the RMS error was 0.36. Later, I rectified, undistorted and remapped the stereo pair, giving me this result:
rectified and undistorted stereo
Having done that, I computed stereo correspondence using StereoSGBM. This is how I did it:
Mat imgDisp = Mat(frame1.rows, frame1.cols, CV_16S);   // note: Mat takes (rows, cols, type)
cvtColor(frame1, frame1, CV_BGR2GRAY);
cvtColor(frame2, frame2, CV_BGR2GRAY);

// parameters for StereoSGBM
stereo.SADWindowSize = 3;
stereo.numberOfDisparities = 144;
stereo.preFilterCap = 63;
stereo.minDisparity = -39;
stereo.uniquenessRatio = 10;
stereo.speckleWindowSize = 100;
stereo.speckleRange = 32;
stereo.disp12MaxDiff = 1;
stereo.fullDP = false;
stereo.P1 = 216;
stereo.P2 = 864;

stereo(frame1, frame2, imgDisp);   // compute the disparity map (call omitted in the original snippet)

double minVal; double maxVal;
minMaxLoc(imgDisp, &minVal, &maxVal);
return imgDisp;
I attached the result from stereoSGBM here: disparity map.
To detect a person in the scene I used HOG + SVM (the default people detector) and tracked that person with optical flow (cvCalcOpticalFlowPyrLK()). Using the disparity map obtained in the stereo correspondence step, I obtained the disparity for each tracked corner of the person as follows:
int x = cornersA[k].x;
int y = cornersA[k].y;
short pixVal = mapaDisp.at<short>(y, x);
float dispFeatures = pixVal / 16.0f;   // SGBM disparities are fixed-point, scaled by 16
With the disparity of each tracked corner of the person in the scene, I computed the maximum disparity and computed the depth at that pixel using the formula (focal * baseline) / disparity:
float Disp = maxDisp_v[p];
cout << "max disp" << Disp << endl;
float d = ((double)(879.85 * 64.32) / (double)(Disp)) / 10; // distance in cm
** For the focal length I used the average of fx and fy from the two 3x3 camera matrices:
CM1: [ 9.0472706037497187e+02  0.                      3.7829164759284492e+02
       0.                      8.4576999835299739e+02  1.8649783393160138e+02
       0.                      0.                      1. ]
CM2: [ 9.1390904648169953e+02  0.                      3.5700689147467887e+02
       0.                      8.5514555697053311e+02  2.1723345133656409e+02
       0.                      0.                      1. ]
so fx camera1: 904.7; fy camera1: 845.7; fx camera2: 913.9; fy camera2: 855.1
** The value of T[0,0] matches the baseline I measured manually, so I assumed the baseline is correct.
** Because the checkerboard square size is in mm, I assumed the baseline must be in the same unit; that's why I put 64.32 mm as the baseline.
The resulting distance is approx. 55 cm, but the real distance is 300 cm. I have checked many times, but the measured distance is still incorrect: distanceResult
Please help! I have no idea what I'm doing wrong.
*** I'm using OpenCV 2.4.9 on OS X.
I think you are making a mistake with units:
focal length is provided in pixels,
baseline is provided in cm
disparity is provided in pixels.
Right?
According to the formula you then have pix * cm / pix = cm. But you divide it by 10 and get dm, so your distance is around 55 dm = 550 cm, which is within a factor of two of the true 300 cm. That is not a bad result for your approach.
You cannot use the simple parallel-cameras triangulation formula on rectified images, because you need to undo the rectification homographies.
Use cv2.reprojectImageTo3D
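A minimal sketch of that route in Python, assuming you already have the CV_16S SGBM disparity map and the Q matrix returned by stereoRectify (the names here are placeholders, not the OP's variables):
import cv2
import numpy as np

def depth_at(disparity_fixed, Q, x, y):
    # disparity_fixed: CV_16S StereoSGBM output (fixed-point, scaled by 16)
    # Q: 4x4 reprojection matrix from stereoRectify
    # Returns the depth at pixel (x, y); units follow the calibration
    # (mm here if the checkerboard square size was given in mm).
    disp = disparity_fixed.astype(np.float32) / 16.0
    points_3d = cv2.reprojectImageTo3D(disp, Q)   # H x W x 3 array of (X, Y, Z)
    return points_3d[y, x, 2]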
I want to flag whether a captured image is blurred. For this I have tried two methods using OpenCV, and I intend to use a threshold to decide whether the image is blurred:
1. Variance of the Laplacian, using the following:
Imgproc.Laplacian(src_gray_image, dest_lap_image, CvType.CV_16S, 3, 1, 0);
Core.meanStdDev(dest_lap_image, mean, std);
double varOfLaplacian = Math.pow(std.get(0, 0)[0], 2);
2. Gradient of the image in the x and y directions:
Imgproc.Sobel(image, Gx, CvType.CV_32F, 1, 0);
Imgproc.Sobel(image, Gy, CvType.CV_32F, 0, 1);
double normGx = Core.norm(Gx);   // L2 norms of the gradient images (not shown in the original snippet)
double normGy = Core.norm(Gy);
double sumSq = normGx * normGx + normGy * normGy;
gradient = (float)(1. / (sumSq / image.size().area() + 1e-6));
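For reference, here is a minimal Python sketch of both measures (the file name is a placeholder; the thresholds still have to be tuned per device, which is exactly the problem described below):
import cv2
import numpy as np

def variance_of_laplacian(gray):
    # Method 1: variance of the Laplacian (higher = sharper).
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def gradient_measure(gray):
    # Method 2: inverse of the mean squared Sobel gradient, mirroring the Java
    # snippet above (lower = sharper with this definition).
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    sum_sq = np.sum(gx * gx) + np.sum(gy * gy)
    return 1.0 / (sum_sq / gray.size + 1e-6)

gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
print(variance_of_laplacian(gray), gradient_measure(gray))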
These values differ a lot when the same scene is captured with different mobile phones.
E.g. Laplacian variance = 79 for camera1 and 5000 for camera2;
gradient value = 2×10^-4 for camera1 and 4×10^-5 for camera2.
Following are the meta data:
camera1:
4096x2304
Exposure time: 1/17
Aperture value: 1.53
ISO speed: 121
Focal length: 4.0
camera2:
1456x2592
Exposure time: 1/50
Aperture Value: 2.53
ISO: 160
Focal length: 3.5mm
What I am not able to understand is:
1. Which camera parameters determine sharpness and focus, and how do they affect the gradient and Laplacian variance values? Ideally these features should be camera independent.
2. How do we compute these values so that they are device independent?
3. Is there any other quick, basic blur-detection method that does not depend on image metadata?
I have calibrated my GoPro Hero 4 Black using the Camera Calibration Toolbox for MATLAB and calculated its fields of view and focal length using OpenCV's calibrationMatrixValues(). These, however, differ from GoPro's specifications. Instead of the specified 118.2°/69.5° FOVs I get 95.4°/63.4°, and a focal length of 2.8 mm instead of 17.2 mm. Obviously something is wrong.
I suppose the calibration itself is correct since image undistortion seems to be working well.
Can anyone please give me a hint where I made a mistake? I am posting my code below.
Thanks.
Code
cameraMatrix = new Mat(3, 3, 6);   // 6 == CvType.CV_64F
for (int i = 0; i < cameraMatrix.height(); i++)
    for (int j = 0; j < cameraMatrix.width(); j++) {
        cameraMatrix.put(i, j, 0);
    }
cameraMatrix.put(0, 0, 582.18394);
cameraMatrix.put(0, 2, 663.50655);
cameraMatrix.put(1, 1, 582.52915);
cameraMatrix.put(1, 2, 378.74541);
cameraMatrix.put(2, 2, 1.);
org.opencv.core.Size size = new org.opencv.core.Size(1280, 720);
//output parameters
double [] fovx = new double[1];
double [] fovy = new double[1];
double [] focLen = new double[1];
double [] aspectRatio = new double[1];
Point ppov = new Point(0, 0);
org.opencv.calib3d.Calib3d.calibrationMatrixValues(cameraMatrix, size,
6.17, 4.55, fovx, fovy, focLen, ppov, aspectRatio);
System.out.println("FoVx: " + fovx[0]);
System.out.println("FoVy: " + fovy[0]);
System.out.println("Focal length: " + focLen[0]);
System.out.println("Principal point of view; x: " + ppov.x + ", y: " + ppov.y);
System.out.println("Aspect ratio: " + aspectRatio[0]);
Results
FoVx: 95.41677635378488
FoVy: 63.43170132212425
Focal length: 2.8063085232812504
Principal point of view; x: 3.198308916796875, y: 2.3934605770833333
Aspect ratio: 1.0005929569269807
GoPro specifications
https://gopro.com/help/articles/Question_Answer/HERO4-Field-of-View-FOV-Information
Edit
Matlab calibration results
Focal Length: fc = [ 582.18394 582.52915 ] ± [ 0.77471 0.78080 ]
Principal point: cc = [ 663.50655 378.74541 ] ± [ 1.40781 1.13965 ]
Skew: alpha_c = [ -0.00028 ] ± [ 0.00056 ] => angle of pixel axes = 90.01599 ± 0.03208 degrees
Distortion: kc = [ -0.25722 0.09022 -0.00060 0.00009 -0.01662 ] ± [ 0.00228 0.00276 0.00020 0.00018 0.00098 ]
Pixel error: err = [ 0.30001 0.28188 ]
One of the images used for calibration
And the undistorted image
You have entered 6.17 mm and 4.55 mm for the sensor size in OpenCV, which corresponds to an aspect ratio of 1.36, whereas your resolution (1280x720) is 1.76 (approximately 16:9 format).
Did you crop your image before the MATLAB calibration?
The pixel size seems to be 1.55 µm from this GoPro page (which is, by the way, astonishingly small!). If pixels are square, and they should be on this type of consumer camera, that means your inputs are not consistent. The computed sensor size should be:
[Sensor width, Sensor height] = [1280, 720] * 1.55*10^-3 = [1.98, 1.12] mm
Even considering the maximal video resolution, which is 3840 x 2160, we obtain [5.95, 3.35] mm, still different from your input.
Please see this explanation of equivalent focal length to understand why the actual focal length of the camera is not 17.2 mm but 17.2 * 5.95 / 36 ≈ 2.8 mm. In that case, compute the FOV using the formulas here, for instance. You will indeed find values of 93.5°/61.7° (close to your outputs, but still not what is written in the specifications, probably because of optical distortion due to the wide-angle lens).
What I do not understand though, is how the focal distance returned can be right whereas sensor size entered is wrong. Could you give more info and/or send an image?
Edits after question updates
On these cameras, with a working resolution of 1280x720, the image is downsampled but not cropped, so what I said above about sensor dimensions does not apply. The sensor size to consider is indeed the one used (6.17 x 4.55), as explained in your first comment.
The FOV is constrained by the calibration matrix inputs (fx, fy, cx, cy) given in pixels and the resolution. You can check it by typing:
2*DEGREES(ATAN(1280/(2*582.18394))) (= 95.416776...°)
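The same check as a small Python snippet:
import math

fx = 582.18394   # from the MATLAB calibration
width = 1280
fov_x = 2 * math.degrees(math.atan(width / (2 * fx)))
print(fov_x)     # ~95.42 degrees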
This FOV value is smaller than expected, but judging by the undistorted image, your MATLAB distortion model is right and the calibration is correct. The barrel distortion due to the wide angle seems well corrected by the rewarp you applied.
However, the MATLAB toolbox uses a pinhole model, which is linear and cannot by itself account for lens distortion. I assume this from the page:
https://fr.mathworks.com/help/vision/ug/camera-calibration.html
Hence, my best guess is that unless you find a model which fits the GoPro camera more accurately (maybe a wide-angle lens model), the MATLAB calibration will return an intrinsic camera matrix corresponding to the "linear" undistorted image, and the FOV will indeed be smaller (in the case of barrel distortion). You will have to apply the distortion coefficients associated with the calibration to retrieve the actual FOV value.
We can see in the corrected image that the side parts of the FOV are pushed out of bounds. If you had warped the image entirely, you would find that some undistorted pixel coordinates exceed [-1280/2; +1280/2] (horizontally, and likewise vertically). Then, replacing opencv.core.Size(1280, 720) with the most extreme ranges obtained, you would hopefully retrieve the GoPro website values.
In conclusion, I think you can rely on the focal length value you obtained if you make measurements in the center of your image; otherwise there is too much distortion and it does not apply.
I would like you to clarify these questions for me, please:
1. If I calibrate my camera at a particular resolution, say 640x360, can I use the result for another resolution like 1024x768?
2. I also want to know how many centimeters one pixel covers in my image. It varies from system to system; how do I find that? Also, a pixel is not necessarily square, so I would have to find its length and width. How do I do that?
I am using a Logitech C170, which is a low-speed cam. Is it okay to get an error of around 8 mm when I measure distances in the image and compare them with real-world distances?
EDIT 1:
Since the number of mm per pixel is sensor_width/image_width, which is the inverse of the pixel density, I can calculate a_x/f (the density) and then take its inverse, right?
@marol
Intrinsic parameters of left camera:
Focal Length: fc_left = [ 1442.67707 1457.17435 ] ± [ 18.12442 19.46439 ]
Principal point: cc_left = [ 497.66112 291.77311 ] ± [ 42.37874 31.97065 ]
Skew: alpha_c_left = [ 0.00000 ] ± [ 0.00000 ] => angle of pixel axes = 90.00000 ± 0.00000 degrees
Distortion: kc_left = [ 0.02924 -0.65151 -0.01104 -0.01342 0.00000 ] ± [ 0.16553 1.57119 0.00913 0.01306 0.00000 ]
Intrinsic parameters of right camera:
Focal Length: fc_right = [ 1443.32678 1458.82558 ] ± [ 25.55850 26.08659 ]
Principal point: cc_right = [ 567.11672 258.09152 ] ± [ 20.46962 17.87495 ]
Skew: alpha_c_right = [ 0.00000 ] ± [ 0.00000 ] => angle of pixel axes = 90.00000 ± 0.00000 degrees
Distortion: kc_right = [ -0.58576 21.53289 -0.02278 0.00845 0.00000 ] ± [ 0.28148 9.37092 0.00787 0.00847 0.00000 ]
Extrinsic parameters (position of right camera wrt left camera):
Rotation vector: om = [ -0.04239 0.02401 -0.00677 ]
Translation vector: T = [ 71.66430 -0.79025 -8.76546 ]
If you mean "I have calibrated my camera using a set of images at resolution X, so I got calibration matrix K; can I use this matrix with images of a different resolution Y?", the direct answer is no, you cannot, since the calibration matrix K has the form:
K = [a_x, 0, c_x;
0, a_y, c_y;
0, 0, 1;]
where a_x = focal_length * pixel density (pixels per mm) in the x direction, a_y = focal_length * pixel density in the y direction (usually those densities are equal), and c_x = translation of the image plane to the principal point in the x direction (similarly for c_y). When you output your calibration matrix K you will see something like:
K = [a_x, 0, 320;
0, a_y, 180;
0, 0, 1]
And yes, you can see that c_x = 320 = 640 / 2 and c_y = 180 = 360 / 2. So your calibration matrix is tied to the image resolution, and you cannot use it directly with any other resolution without changing matrix K.
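That said, if the new images are only resized versions of the calibrated ones (same aspect ratio, no cropping), a common approximation is to scale the intrinsics by the resize factors. A minimal sketch under that assumption only (the example numbers are hypothetical):
import numpy as np

def scale_camera_matrix(K, old_size, new_size):
    # Approximate K for a resized (not cropped) image.
    # old_size and new_size are (width, height) in pixels.
    sx = new_size[0] / old_size[0]
    sy = new_size[1] / old_size[1]
    S = np.diag([sx, sy, 1.0])
    return S @ K

# Hypothetical numbers: a camera calibrated at 640x360, reused at 1280x720.
K_640 = np.array([[700.0,   0.0, 320.0],
                  [  0.0, 700.0, 180.0],
                  [  0.0,   0.0,   1.0]])
print(scale_camera_matrix(K_640, (640, 360), (1280, 720)))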
2. You have to divide the sensor size by the image size, i.e.
k_x = sensor_size_width / image_size_width (mm per pixel in x),
k_y = sensor_size_height / image_size_height.
The image sensor is the tiny plane of photosensitive material that absorbs light inside your camera. You can usually find this information in the camera manual; search for "sensor size".
EDIT: If you can't find the sensor size in the camera manual, which is normal for webcams, you can try the following: calibrate your camera to obtain matrix K. The values a_x and a_y contain this information. Since a_x = f * density, if you know the focal length (and you do; it is 2.3 mm, see here), you can find density = a_x / f. And since density = image_width / sensor_width, we finally have sensor_width = image_width / density = image_width * f / a_x. Similar reasoning applies for sensor_height.
EDIT2: For example if you get:
Focal Length: fc_left = [ 1442.67707 1457.17435 ] ± [ 18.12442 19.46439 ]
So we have a_x = 1442.67707. From our conclusions, and assuming an image size of 640 x 360, we have sensor width = 640 * 2.3 / 1442.67707 = 1.02 mm.
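The same back-of-the-envelope computation as a small sketch (a_x from the calibration printout above, 2.3 mm from the lens spec linked earlier):
a_x = 1442.67707        # pixels, from the calibration
focal_mm = 2.3
image_width_px = 640

density = a_x / focal_mm                    # pixels per mm on the sensor
sensor_width_mm = image_width_px / density  # ~1.02 mm
print(sensor_width_mm)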
-- Update 2 --
The following article is really useful (although it is using Python instead of C++) if you are using a single camera to calculate the distance: Find distance from camera to object/marker using Python and OpenCV
Best link is Stereo Webcam Depth Detection. The implementation of this open source project is really clear.
Below is the original question.
For my project I am using two cameras (stereo vision) to track objects and to calculate the distance. I calibrated them with the OpenCV sample code and generated a disparity map.
I already implemented a method to track objects based on color (this generates a threshold image).
My question: how can I calculate the distance to the tracked colored objects using the disparity map/matrix?
Below you can find a code snippet that gets the x, y and z coordinates of each pixel. The question: is Point.z in cm, pixels, or mm?
Can I get the distance to the tracked object with this code?
Thank you in advance!
cvReprojectImageTo3D(disparity, Image3D, _Q);
vector<CvPoint3D32f> PointArray;
CvPoint3D32f Point;
for (int y = 0; y < Image3D->rows; y++) {
float *data = (float *)(Image3D->data.ptr + y * Image3D->step);
for (int x = 0; x < Image3D->cols * 3; x = x + 3)
{
Point.x = data[x];
Point.y = data[x+1];
Point.z = data[x+2];
PointArray.push_back(Point);
//Depth > 10
if(Point.z > 10)
{
printf("%f %f %f", Point.x, Point.y, Point.z);
}
}
}
cvReleaseMat(&Image3D);
--Update 1--
For example I generated this thresholded image (of the left camera). I have almost the same for the right camera.
Besides the above threshold image, the application generates a disparity map. How can I get the Z-coordinates of the pixels of the hand in the disparity map?
I actually want to get all the Z-coordinates of the pixels of the hand to calculate the average Z-value (distance) (using the disparity map).
See these links: OpenCV: How-to calculate distance between camera and object using image?, Finding distance from camera to object of known size, http://answers.opencv.org/question/5188/measure-distance-from-detected-object-using-opencv/
If that doesn't solve your problem, write more details - why it isn't working, etc.
The math for converting disparity (in pixels or image width percentage) to actual distance is pretty well documented (and not very difficult), but I'll document it here as well.
Below is an example given a disparity image (in pixels) and an input image width of 2K (2048 pixels across):
Convergence Distance is determined by the rotation between camera lenses. In this example it will be 5 meters. Convergence distance of 5 (meters) means that the disparity of objects 5 meters away is 0.
CD = 5 (meters)
Inverse of convergence distance is: 1 / CD
IZ = 1/5 = 0.2M
Size of camera's sensor in meters
SS = 0.035 (meters) //35mm camera sensor
The width of a pixel on the sensor in meters
PW = SS/image resolution = 0.035 / 2048(image width) = 0.00001708984
The focal length of your cameras in meters
FL = 0.07 //70mm lens
InterAxial distance: The distance from the center of left lens to the center of right lens
IA = 0.0025 //2.5mm
The combination of the physical parameters of your camera rig
A = FL * IA / PW
Camera Adjusted disparity: (For left view only, right view would use positive [disparity value])
AD = 2 * (-[disparity value] / A)
From here you can compute actual distance using the following equation:
realDistance = 1 / (IZ - AD)
This equation only works for "toe-in" camera systems, parallel camera rigs will use a slightly different equation to avoid infinity values, but I'll leave it at this for now. If you need the parallel stuff just let me know.
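Putting those steps together, a minimal Python sketch using the example values above (the disparity sign convention for the left view follows the AD formula given; treat it as a sketch, not a general-purpose routine):
def toe_in_distance(disparity_px,
                    convergence_m=5.0,      # CD
                    sensor_width_m=0.035,   # SS
                    image_width_px=2048,
                    focal_m=0.07,           # FL
                    interaxial_m=0.0025):   # IA
    # Distance (meters) from a left-view disparity for a "toe-in" stereo rig.
    iz = 1.0 / convergence_m                 # inverse convergence distance
    pw = sensor_width_m / image_width_px     # pixel width on the sensor
    a = focal_m * interaxial_m / pw          # combined rig constant
    ad = 2.0 * (-disparity_px / a)           # camera-adjusted disparity (left view)
    return 1.0 / (iz - ad)

print(toe_in_distance(0.0))   # 5.0 m: zero disparity sits at the convergence distance
print(toe_in_distance(5.0))   # ~0.85 m: positive left-view disparity is closer than convergence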
# Two detected objects ("puntos" holds their bounding boxes as (x, y, w, h)).
# The pixel gap between them is converted to cm, presumably using an A4 sheet
# as reference: 29.7 cm mapped to 720 px in "imagen_A4".
if len(puntos) == 2:
    x1, y1, w1, h1 = puntos[0]
    x2, y2, w2, h2 = puntos[1]
    if x1 < x2:
        # object 1 is to the left: measure from its right edge to object 2's left edge
        distancia_pixeles = abs(x2 - (x1 + w1))
        distancia_cm = (distancia_pixeles * 29.7) / 720
        cv2.putText(imagen_A4, "{:.2f} cm".format(distancia_cm), (x1 + w1 + distancia_pixeles // 2, y1 - 30),
                    2, 0.8, (0, 0, 255), 1, cv2.LINE_AA)
        cv2.line(imagen_A4, (x1 + w1, y1 - 20), (x2, y1 - 20), (0, 0, 255), 2)
        cv2.line(imagen_A4, (x1 + w1, y1 - 30), (x1 + w1, y1 - 10), (0, 0, 255), 2)
        cv2.line(imagen_A4, (x2, y1 - 30), (x2, y1 - 10), (0, 0, 255), 2)
    else:
        distancia_pixeles = abs(x1 - (x2 + w2))
        distancia_cm = (distancia_pixeles * 29.7) / 720
        cv2.putText(imagen_A4, "{:.2f} cm".format(distancia_cm), (x2 + w2 + distancia_pixeles // 2, y2 - 30),
                    2, 0.8, (0, 0, 255), 1, cv2.LINE_AA)
        cv2.line(imagen_A4, (x2 + w2, y2 - 20), (x1, y2 - 20), (0, 0, 255), 2)
        cv2.line(imagen_A4, (x2 + w2, y2 - 30), (x2 + w2, y2 - 10), (0, 0, 255), 2)
        cv2.line(imagen_A4, (x1, y2 - 30), (x1, y2 - 10), (0, 0, 255), 2)

cv2.imshow('imagen_A4', imagen_A4)
cv2.imshow('frame', frame)
k = cv2.waitKey(1) & 0xFF
if k == 27:   # Esc ends the capture loop
    break

cap.release()
cv2.destroyAllWindows()
I think this is a good way to measure the distance between two objects.