How to find pixel disparity and pixel size (depth estimation in stereo vision)

I'm trying to estimate depth from a stereo system with two cameras. The simple equation that I use is:
Depth = (Baseline * Focal length) / (Pixel disparity * Pixel size)
but I can't find the pixel disparity and pixel size.
How do I find the pixel disparity and pixel size?
Thank you.

You can get the pixel size from the spec sheet of your camera sensor. Alternatively, the pixel size is not needed if you have calibrated your camera, because the calibrated focal length will already be in pixels.
So you can modify your formula as:
Depth (in cm) = Baseline (in cm) * Focal length (in pixels) / Disparity (in pixels)
To get the pixel disparity, you can use OpenCV's Block Matching and Semi-Global Block Matching techniques (see the Calib3d docs). Many more accurate disparity estimation algorithms have also been published.
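For example, here is a minimal Python sketch using OpenCV's Semi-Global Block Matching; the file names, baseline and focal length below are placeholder assumptions, and the matcher parameters will need tuning for your rig:

import cv2
import numpy as np

# Rectified left/right images (placeholder file names)
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-Global Block Matching; tune numDisparities/blockSize for your setup
stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)

# StereoSGBM returns fixed-point disparities scaled by 16
disparity = stereo.compute(left, right).astype(np.float32) / 16.0

baseline_cm = 10.0   # assumed baseline, from your rig
focal_px = 700.0     # assumed focal length in pixels, from calibration

# Depth (cm) = Baseline (cm) * Focal length (px) / Disparity (px)
valid = disparity > 0
depth_cm = np.zeros_like(disparity)
depth_cm[valid] = baseline_cm * focal_px / disparity[valid]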

Related

Depth reconstruction from disparity map using stereo camera

I'm working on depth reconstruction from disparity map. I use OpenCV to calibrate my stereo camera, then undistort and rectify the images. I use LibELAS to compute the disparity map.
My question is: according to the OpenCV documentation (https://docs.opencv.org/3.1.0/dd/d53/tutorial_py_depthmap.html), the depth is computed by depth = Baseline*focal_length/disparity. But according to the Middlebury dataset (http://vision.middlebury.edu/stereo/data/scenes2014/), the depth is computed by depth = baseline * focal_length / (disparity + doffs). The "doffs" is the "x-difference of principal points, doffs = cx1 - cx0".
What does "doffs" mean? How can I get it from the OpenCV calibration?
OpenCV calibration gives you the intrinsic matrices for both of your cameras. These are 3x3 matrices with the following structure (from the docs):
fx 0 cx
0 fy cy
0 0 1
cx and cy are the coordinates of your principal point. From there you can calculate doffs as you stated. For ideal cameras these parameters are the center of the image, but in real cameras they differ by a few pixels.
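As a rough sketch of that calculation in Python, assuming K0 and K1 are the 3x3 intrinsic matrices returned by your calibration for the left and right camera (all numbers below are placeholders):

import numpy as np

K0 = np.array([[700.0, 0.0, 320.5],
               [  0.0, 700.0, 240.5],
               [  0.0,   0.0,   1.0]])   # left camera intrinsics (placeholder)
K1 = np.array([[700.0, 0.0, 325.0],
               [  0.0, 700.0, 240.5],
               [  0.0,   0.0,   1.0]])   # right camera intrinsics (placeholder)

cx0 = K0[0, 2]
cx1 = K1[0, 2]
doffs = cx1 - cx0        # x-difference of the principal points

focal_px = K0[0, 0]      # focal length in pixels
baseline = 0.1           # assumed baseline in metres
disparity = 40.0         # example disparity in pixels

# Middlebury-style depth formula
depth = baseline * focal_px / (disparity + doffs)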

Radial distortion correction, camera parameters and openCV

I am trying to remove barrel/radial distortion from an image. The equations I have seen do not require the focal length of the camera, but the OpenCV API initUndistortRectifyMap requires it as part of the camera intrinsic matrix. Why is that? Is there any way to do it without it? As I understand it, the undistortion step is common to various distortion corrections.
The focal length is essential in distortion removal, since it provides information about the intrinsic parameters of the camera, and it is fairly simple to add it to the camera matrix. Just remember that you have to convert it from millimeters to pixels; doing the conversion separately for each axis also accounts for pixels that may not be square. For the conversion you need to know the sensor's width and height in millimeters, the horizontal (Sh) and vertical (Sv) number of pixels of the sensor, and the focal length in millimeters. The conversion is done using the following equations:
fx = (f(mm) x Sh(px))/sensorwidth(mm)
fy = (f(mm) x Sv(px))/sensorheight(mm)
More on the camera matrix elements can be found here.
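A rough Python sketch of that conversion, followed by the undistortion call; the focal length, sensor dimensions and distortion coefficients below are made-up placeholders:

import cv2
import numpy as np

f_mm = 4.0                            # focal length in mm, from the lens spec (placeholder)
sensor_w_mm, sensor_h_mm = 4.8, 3.6   # sensor size in mm (placeholder)
Sh, Sv = 1280, 960                    # horizontal/vertical pixel counts (placeholder)

fx = f_mm * Sh / sensor_w_mm          # fx = (f(mm) x Sh(px)) / sensorwidth(mm)
fy = f_mm * Sv / sensor_h_mm          # fy = (f(mm) x Sv(px)) / sensorheight(mm)
cx, cy = Sh / 2.0, Sv / 2.0           # principal point assumed at the image centre

K = np.array([[fx, 0, cx],
              [0, fy, cy],
              [0,  0,  1]], dtype=np.float64)

dist = np.array([-0.3, 0.1, 0.0, 0.0, 0.0])   # example distortion coefficients

img = cv2.imread("distorted.jpg")             # placeholder input image
map1, map2 = cv2.initUndistortRectifyMap(K, dist, None, K, (Sh, Sv), cv2.CV_32FC1)
undistorted = cv2.remap(img, map1, map2, cv2.INTER_LINEAR)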

Opencv: Find focal lenth in mm in an analog camera

I have successfully calibrated an analog camera using OpenCV. The output focal length and principal point are in pixels.
I know that with digital cameras you can simply multiply the pixel size of the sensor by the focal length in pixels to get the focal length in mm (or whatever unit).
How can I get the focal length in mm for this analog camera?
The lens manufacturers usually write focal length on the lens. Even the name of the lens contains it, e.g. "canon lens 1.8 50mm".
If not, you can try to measure it manually.
Take the lens off the camera. Take a small, well-illuminated object, place it 1-3 meters in front of the lens, and hold a sheet of paper behind the lens. Move the paper until you get a sharp, focused image of the object on it.
Now measure following:
a - distance from lens to the object;
y - object size;
y' - object image size on the paper;
f = a/(1+y/y') - focal length.
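As a quick sanity check of that formula, with made-up example measurements:

# Thin-lens estimate of the focal length from the manual measurement above
a = 2000.0     # distance from the lens to the object, in mm (example)
y = 100.0      # real object size, in mm (example)
y_img = 5.5    # size of the object's image on the paper, in mm (example)

f = a / (1 + y / y_img)
print(f"focal length ~ {f:.1f} mm")   # ~104 mm for these example numbers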
If your output is in pixels, you must be digitizing the analog input at some point. You just need to figure out the size of the pixel that you are creating.
For example, if you are scanning film in, then you use the pixel size of the scanner.

OpenCV: How-to calculate distance between camera and object using image?

I am a newbie in OpenCV. I am working with the following formula to calculate distance:
distance to object (mm) = focal length (mm) * real height of the object (mm) * image height (pixels)
                          -----------------------------------------------------------------------
                                       object height (pixels) * sensor height (mm)
Is there a function in OpenCV that can determine object distance? If not, any reference to sample code?
How to calculate distance given an object of known size
You need to know one of 2 things up front
Focal-length (in mm and pixels per mm)
Physical size of the image sensor (to calculate pixels per mm)
I'm going to use focal-length since I don't want to google for the sensor datasheet.
Calibrate the camera
Use the OpenCV calibrate.py tool and the Chessboard pattern PNG provided in the source code to generate a calibration matrix. I took about 2 dozen photos of the chessboard from as many angles as I could and exported the files to my Mac. For more detail check OpenCV's camera calibration docs.
Camera Calibration Matrix (iPhone 5S Rear Camera)
RMS: 1.13707201375
camera matrix:
[[ 2.80360356e+03 0.00000000e+00 1.63679133e+03]
[ 0.00000000e+00 2.80521893e+03 1.27078235e+03]
[ 0.00000000e+00 0.00000000e+00 1.00000000e+00]]
distortion coefficients: [ 0.03716712 0.29130959 0.00289784 -0.00262589 -1.73944359]
f_x = 2803
f_y = 2805
c_x = 1637
c_y = 1271
Checking the details of the chessboard photos you took, you will find the native resolution of the photos (3264x2448), and in their JPEG EXIF headers (visible in iPhoto) you can find the Focal Length value (4.15mm). These values will vary depending on the camera.
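If you prefer to do the same calibration directly through the OpenCV Python API instead of the calibrate.py tool, a rough sketch could look like this (the 9x6 inner-corner pattern size and the file names are assumptions):

import glob
import cv2
import numpy as np

pattern = (9, 6)                                  # inner corners of the chessboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("chessboard_*.jpg"):        # placeholder file names
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS:", rms)
print("camera matrix:\n", K)   # f_x = K[0,0], f_y = K[1,1], c_x = K[0,2], c_y = K[1,2]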
Pixels per millimeter
We need to know the pixels per millimeter (px/mm) on the image sensor. From the page on camera resectioning we know that f_x and f_y are focal-length times a scaling factor.
f_x = f * m_x
f_y = f * m_y
Since we have two of the variables for each formula we can solve for m_x and m_y. I just averaged 2803 and 2805 to get 2804.
m = 2804px / 4.15mm = 676px/mm
Object size in pixels
I used OpenCV (C++) to grab the RotatedRect of the points and determined the size of the object to be 41px. Note that I have already retrieved the corners of the object and I ask the bounding rectangle for its size.
cv::RotatedRect box = cv::minAreaRect(cv::Mat(points));
Small wrinkle
The object is 41px in a video shot on the camera at 640x480.
Convert px/mm to the lower resolution
3264/676 = 640/x
x = 133 px/mm
So given 41px / 133 px/mm, the size of the object on the image sensor is 0.308mm.
Distance formula
distance_mm = object_real_world_mm * focal-length_mm / object_image_sensor_mm
distance_mm = 70mm * 4.15mm / .308mm
distance_mm = 943mm
This happens to be pretty good. I measured 910mm and with some refinements I can probably reduce the error.
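Putting the arithmetic above into a short sketch with the same example numbers (these values come from this particular iPhone 5S setup):

f_px = (2803 + 2805) / 2.0     # averaged f_x and f_y from the calibration
f_mm = 4.15                    # focal length from the EXIF header
m_px_per_mm = f_px / f_mm      # ~676 px/mm at the native resolution (3264 px wide)

# Rescale px/mm to the 640x480 video resolution
m_video = m_px_per_mm * 640.0 / 3264.0      # ~133 px/mm

object_px = 41.0                            # object size measured in the video frame
object_on_sensor_mm = object_px / m_video   # ~0.31 mm

object_real_mm = 70.0                       # known real-world size of the object
distance_mm = object_real_mm * f_mm / object_on_sensor_mm
print(f"estimated distance ~ {distance_mm:.0f} mm")   # ~940 mm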
Feedback is appreciated.
Similar triangles approach
Adrian at pyimagesearch.com demonstrated a different technique using similar triangles. We had discussed this topic beforehand; he took the similar triangles approach and I used the camera intrinsics.
There is no such function available in OpenCV to calculate the distance between an object and the camera. See this:
Finding distance from camera to object of known size
You should know that the parameters depend on the camera and will change if the camera is changed.
To get a mapping between the real world and the camera without any prior information about the camera, you need to calibrate the camera... here you can find some theory.
For calculating depth, i.e. the distance between the camera and the object, you need at least two images of the same object taken by two different cameras... which is popularly called the stereo vision technique.

What is the baseline of a stereo camera?

Could someone here explain what exactly the baseline of a camera is?
You're apparently dealing with stereo, where the baseline is (at least normally) the distance between the two lenses.
I believe
Z (depth) = (focalLength * baseline) / disparity
Other coordinates can be found here: http://www.ptgrey.com/support/kb/index.asp?a=4&q=63&ST=
The baseline (distance between both cameras) will influence the depth range that you can observe with a stereo camera, and also your depth resolution. The same also applies to the focal length of the lenses that you use.
Assuming that you process a disparity range with a constant size, then the following rules apply:
Increasing the baseline will increase your depth resolution, but will also increase the minimum distance to the camera
Increasing the focal length will also increase the depth resolution and minimum distance to the camera, but also reduce the field of view.
This relationship can be studied with the following online calculator:
https://nerian.com/support/resources/calculator/
The baseline is the distance between the two cameras of a stereo pair.
When you do stereo calibration, the OpenCV method returns R and T (the rotation and translation between your cameras); the baseline is then the length of T.
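As a rough sketch, assuming you already have the per-view object/image point lists and the single-camera intrinsics (obj_points, img_points_left, img_points_right, K1, d1, K2, d2 and image_size are placeholders you would fill in from your own calibration data):

import cv2
import numpy as np

ret, K1, d1, K2, d2, R, T, E, F = cv2.stereoCalibrate(
    obj_points, img_points_left, img_points_right,
    K1, d1, K2, d2, image_size,
    flags=cv2.CALIB_FIX_INTRINSIC)

# T is the translation between the two cameras; its length is the baseline,
# expressed in the same units as your chessboard square size
baseline = float(np.linalg.norm(T))
print("baseline:", baseline)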
