Find Distance between barcode and camera? - image-processing

Is it possible to find the distance between the detected qr bar code (square) and the camera, if the size of the actual bar code and the (x,y) of all the corners of the bar code detected by the camera are known?
I want the method to work even if the camera is at an angle from the barcode.
I tried using a simple equation like f=d*z/D, where f is the focal length of the camera, D is the size of the object, d is the width of the detected object in pixels, and z is the distance between the camera and the barcode. First, I calculate the focal length by using data from a known distance and then get the z values accordingly.
The above method works pretty well, but it has a lot of error if the camera is at an angle.
Is there a better method to do this?
Also, I can use only one camera, using two cameras is not an option.

Use your current formula (which you state works well) against the longest side and its opposite, then average the results.
Alternatively, just average the lengths of the longest side and its opposite. The relationships are all linear so you should end up with the same answer.
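A minimal sketch of the averaging idea above, assuming the four corners are given in order around the square and that the focal length in pixels has already been calibrated with the asker's formula. The function names and the numbers in the usage line are made up for illustration.

```python
import math

def side_length(p, q):
    """Euclidean length in pixels between two (x, y) corners."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def distance_from_corners(corners, real_size, focal_px):
    """Estimate z = f * D / d, where d is the average of the longest
    detected side and the side opposite to it.

    corners   -- four (x, y) pixel corners, in order around the square
    real_size -- real side length D of the barcode (any length unit)
    focal_px  -- focal length f in pixels, from a prior calibration
    """
    sides = [side_length(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    longest = max(range(4), key=lambda i: sides[i])
    opposite = (longest + 2) % 4
    d = (sides[longest] + sides[opposite]) / 2.0
    return focal_px * real_size / d

# Hypothetical numbers: a 0.1 m barcode seen 100 px wide with f = 800 px.
print(distance_from_corners([(0, 0), (100, 0), (100, 100), (0, 100)], 0.1, 800.0))
```

The result is in the same unit as real_size.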

First you have to know the camera angle.
If you cannot read that parameter from a device, you could estimate it from other measurements.
For example, you know that a barcode is rectangular. So by detecting it you obtain its four corners, and from those you can estimate a homography matrix. Knowing the homography matrix, you can simplify your problem by multiplying the coordinates by the inverse homography.
The homography matrix is widely used in camera calibration when a known pattern, such as a chessboard, is presented.

Related

How could I transform a pixel from one camera image plane to another camera image plane?

There are two cameras. Calibration has been done between them, and both the intrinsic and extrinsic matrices have been obtained. I am able to get (U,V) in the first camera. How could I get (U,V) in the second camera? What kind of transformation could be used?
The positions of the cameras are fixed.
Homography is the way two 2D planes can be related.
Since these cameras are parallel to each other (i.e. stereo), the y coordinate of a point (x,y) in the first image will remain the same in the second image, i.e. y' = y. Only x will change (y is the vertical axis, x is the horizontal axis).
There are some techniques to find x'. The easiest one is normalized cross correlation. Choose a window around the point in the first image and compute the normalized cross correlation against every position along the same row of the second image. The result will be an array as wide as the image.
Unless you are searching for a point in a smooth region, maximum value in your array (peak) is expected to be your matching point.
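A rough sketch of that search, simplified to a single image row and a 1D window (a real implementation would use a 2D window around the point). The function names and the sample rows are made up for illustration.

```python
import math

def ncc(a, b):
    """Normalized cross correlation of two equal-length windows."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    # A flat (textureless) window has zero variance; score it as 0.
    return num / den if den > 0 else 0.0

def match_along_row(left_row, right_row, x, half):
    """Find x' in right_row that best matches the window centred on x
    in left_row. Returns (x', peak score)."""
    template = left_row[x - half:x + half + 1]
    scores = []
    for c in range(half, len(right_row) - half):
        window = right_row[c - half:c + half + 1]
        scores.append((ncc(template, window), c))
    best_score, best_x = max(scores)
    return best_x, best_score

# Made-up rows: the same intensity bump, shifted 3 pixels to the left.
left_row = [0, 0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0, 0]
right_row = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
x_match, score = match_along_row(left_row, right_row, x=7, half=2)
print(x_match, 7 - x_match)  # matched column and the resulting disparity
```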
Alternatively, you can try SIFT/SURF features, but I am not an expert on those. I only know there are functions you can use in Matlab (such as detectSURFFeatures).
Note that if you are using two different cameras, you have to calibrate both of them.

Triangulation to find distance to the object- Image to world coordinates

Localization of an object specified in the image.
I am working on a computer vision project to find the distance to an object using stereo images. I followed these steps using OpenCV to achieve my objective:
1. Calibration of the camera
2. SURF matching to find the fundamental matrix
3. Rotation and translation vectors using SVD, following the method described in the Hartley and Zisserman book.
4. stereoRectify to get the projection matrices P1, P2 and the rotation matrices R1, R2. The rotation matrices can also be found using the homography: R = CameraMatrix.inv() * H * CameraMatrix.
Problems:
I triangulated a point using the least squares triangulation method to find the real distance to the object. It returns a value of the form [ 0.79856 , .354541 .258]. How do I map it to real-world coordinates to find the distance to the object?
http://www.morethantechnical.com/2012/01/04/simple-triangulation-with-opencv-from-harley-zisserman-w-code/
Alternative approach:
Find the disparity between the object in two images and find the depth using the given formula
Depth= ( focal length * baseline ) / disparity
For the disparity we have to perform the rectification first, and the points must be undistorted. My rectified images are black.
Please help me out. It is important.
Here is a detailed explanation of how I implemented the code.
1. Calibrate the camera using a circles grid to get the camera matrix and distortion coefficients. The code is available on GitHub (Android).
2. Take two pictures of a car, the first from the left and the other from the right. Take the sub-image and calculate the fundamental matrix, essential matrix, rotation matrix and translation matrix.
3. I have tried the projection in two ways:
Take the first image's projection as the identity matrix, build a second 3x4 projection matrix from the rotation and translation matrices, and perform triangulation.
Get the projection matrices P1 and P2 from stereoRectify and perform triangulation.
My object is 65 meters away from the camera, and I don't know how to calculate this true distance from the triangulation result, which has the form [ 0.79856 , .354541 .258].
Question: Do I have to do some extra calibration to get the result? My code does not rely on knowing the geometric size of the object.
So you already computed the triangulation? Well, then you have points in camera coordinates, i.e. in the coordinate frame centered on one of the cameras (the left or right one depending on how your code is written and the order in which you feed your images to it).
What more do you want? The vector length (square root of the sum of the square coordinates) of those points is their estimated distance from the same camera. If you want their position in some other "world" coordinate system, you need to give the coordinate transform between that system and the camera - presumably through a calibration procedure.
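As a small illustration of the vector-length computation, here is a sketch that uses the triangulated point from the question. Note that the result is in whatever unit the translation between the two views was expressed in. If that translation came from decomposing an essential matrix, it is only known up to scale, so the distance would be too, which may be why the number looks nothing like 65 meters.

```python
import math

def distance_from_camera(point):
    """Euclidean norm of a point given in camera coordinates."""
    return math.sqrt(sum(c * c for c in point))

# The triangulated point quoted in the question.
triangulated = (0.79856, 0.354541, 0.258)
print(distance_from_camera(triangulated))
```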

How can I measure the distance of a detected object from the camera in video using OpenCV?

All I know is the height and width of an object in the video. Can someone guide me in calculating the distance of a detected object from the camera in video using C or C++? Is there any algorithm or formula to do that?
Thanks in advance.
Martin Ch was correct in saying that you need to calibrate your camera, but as vasile pointed out, it is not a linear change. Calibrating your camera means finding this matrix
camera_matrix = [fx,0 ,cx,
0,fy,cy,
0,0, 1];
This matrix operates on a 3 dimensional coordinate (x,y,z) and converts it into a 2 dimensional homogeneous coordinate. To convert to your regular euclidean (x,y) coordinate just divide the first and second component by the third. So now what are those variables doing?
cx/cy: They exist to let you change coordinate systems if you like. For instance, you might want the origin in screen space to be in the top left of the image while the origin in world space projects to the center of the image. In that case
cx = width/2;
cy = height/2;
If you are not changing coordinate systems just leave these as 0.
fx/fy: These specify your focal length in units of x pixels and y pixels. They are very often close to the same value, so you may be able to just give them the same value f. These parameters essentially define how strong perspective effects are. The mapping from a world coordinate to a screen coordinate (as you can work out for yourself from the above matrix), assuming no cx and cy, is
xsc = fx*xworld/zworld;
ysc = fy*yworld/zworld;
As you can see, the important quantity that makes things bigger closer up and smaller farther away is the ratio f/z. It is not linear, but by using homogeneous coordinates we can still use linear transforms.
In short: with a calibrated camera and a known object size in world coordinates, you can calculate its distance from the camera. If you are missing either one of those, it is impossible. Without knowing the object size in world coordinates, the best you can do is map its screen position to a ray in world coordinates by determining the ratio xworld/zworld (knowing fx).
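To make the mapping concrete, here is a small sketch of the projection through the camera matrix and of the inverse mapping from a pixel back to a ray. It is only an illustration: the function names and the numbers are made up, and lens distortion is ignored.

```python
def project(point, fx, fy, cx=0.0, cy=0.0):
    """Apply the camera matrix to (x, y, z) and divide by the third
    homogeneous component to get the screen coordinate."""
    x, y, z = point
    u_h = fx * x + cx * z
    v_h = fy * y + cy * z
    w = z
    return u_h / w, v_h / w

def pixel_to_ray(u, v, fx, fy, cx=0.0, cy=0.0):
    """Without the object size, a pixel only fixes the ratios
    (xworld / zworld, yworld / zworld), i.e. a ray from the camera."""
    return (u - cx) / fx, (v - cy) / fy

# Hypothetical point 4 units in front of a camera with fx = fy = 800.
u, v = project((1.0, 2.0, 4.0), 800.0, 800.0)
print(u, v)
print(pixel_to_ray(u, v, 800.0, 800.0))
```

Doubling z in the example halves both screen coordinates, which is the f/z effect described above.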
I don't think it is easy if you have to use the camera only.
Consider using a third device/sensor such as a Kinect or a stereo camera;
then you will get the depth (z) from the data.
https://en.wikipedia.org/wiki/OpenNI

Finding distance from camera to object of known size

I am trying to write a program using opencv to calculate the distance from a webcam to a one inch white sphere. I feel like this should be pretty easy, but for whatever reason I'm drawing a blank. Thanks for the help ahead of time.
You can use triangle similarity to calibrate the camera angle and find the distance.
You know your ball's size: D units (e.g. cm). Place it at a known distance Z, say 1 meter = 100cm, in front of the camera and measure its apparent width in pixels. Call this width d.
The focal length of the camera f (which is slightly different from camera to camera) is then f=d*Z/D.
When you see this ball again with this camera, and its apparent width is d' pixels, then by triangle similarity, you know that f/d'=Z'/D and thus: Z'=D*f/d' where Z' is the ball's current distance from the camera.
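The two steps above can be sketched as follows. The pixel widths are made-up numbers; only the one inch (2.54 cm) ball size and the 100 cm calibration distance come from the thread.

```python
def calibrate_focal_length(pixel_width, known_distance, real_width):
    """f = d * Z / D, from one measurement at a known distance."""
    return pixel_width * known_distance / real_width

def distance_to_object(focal_length, real_width, pixel_width):
    """Z' = D * f / d', by triangle similarity."""
    return real_width * focal_length / pixel_width

BALL_WIDTH_CM = 2.54  # one inch sphere

# Calibration: the ball appears 50 px wide (hypothetical) at 100 cm.
f = calibrate_focal_length(pixel_width=50.0, known_distance=100.0,
                           real_width=BALL_WIDTH_CM)

# Later the ball appears 25 px wide (hypothetical), so it is farther away.
print(distance_to_object(f, BALL_WIDTH_CM, pixel_width=25.0))
```

Halving the apparent width doubles the estimated distance, as the formula implies.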
To my mind you will need a camera model, i.e. a calibration model, if you want to measure distance or other things in the real world.
The pinhole camera model is simple, linear and gives good results (but it won't correct distortions, whether radial or tangential).
If you don't use that, then you'll be able to compute a disparity/depth map (for instance if you use stereo vision), but it is relative and doesn't give you an absolute measurement, only what is behind and what is in front of another object.
Therefore, I think the answer is: you will need to calibrate it somehow. Maybe you could ask the user to move the sphere toward the camera until the image plane is completely filled by the ball; with prior knowledge of the ball's size, you would then be able to compute the distance.
Julien,

Distance to the object using stereo camera

Is there a way to calculate the distance to specific object using stereo camera?
Is there an equation or something to get distance using disparity or angle?
NOTE: Everything described here can be found in the Learning OpenCV book in the chapters on camera calibration and stereo vision. You should read these chapters to get a better understanding of the steps below.
One approach that does not require you to measure all the camera intrinsics and extrinsics yourself is to use OpenCV's calibration functions. Camera intrinsics (lens distortion, skew, etc.) can be calculated with cv::calibrateCamera, while the extrinsics (the relation between the left and right camera) can be calculated with cv::stereoCalibrate. These functions take a number of points in pixel coordinates and try to map them to real-world object coordinates. OpenCV has a neat way to get such points: print out a black-and-white chessboard and use the cv::findChessboardCorners/cv::cornerSubPix functions to extract them. Around 10-15 image pairs of chessboards should do.
The matrices calculated by the calibration functions can be saved to disc so you don't have to repeat this process every time you start your application. You get some neat matrices here that allow you to create a rectification map (cv::stereoRectify/cv::initUndistortRectifyMap) that can later be applied to your images using cv::remap. You also get a neat matrix called Q, which is a disparity-to-depth matrix.
The reason to rectify your images is that once the process is complete for a pair of images (assuming your calibration is correct), every pixel/object in one image can be found on the same row in the other image.
There are a few ways you can go from here, depending on what kind of features you are looking for in the image. One way is to use OpenCV's stereo correspondence functions, such as Stereo Block Matching or Semi Global Block Matching. This will give you a disparity map for the entire image, which can be transformed to 3D points using the Q matrix (cv::reprojectImageTo3D).
The downside of this is that unless there is much texture information in the image, OpenCV isn't very good at building a dense disparity map (you will get gaps in it where it couldn't find the correct disparity for a given pixel). Another approach is to find the points you want to match yourself. Say you find the feature/object at x=40,y=110 in the left image and at x=22 in the right image (since the images are rectified, they should have the same y-value). The disparity is calculated as d = 40 - 22 = 18.
Construct a cv::Point3f(x,y,d), in our case (40,110,18). Find other interesting points the same way, then send all of the points to cv::perspectiveTransform (with the Q matrix as the transformation matrix, essentially this function is cv::reprojectImageTo3D but for sparse disparity maps) and the output will be points in an XYZ-coordinate system with the left camera at the center.
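To show what the Q matrix does to the point (40,110,18), here is a plain Python sketch of the same multiplication and division, without calling OpenCV. It assumes the Q layout described in the cv::stereoRectify documentation, with both cameras sharing the same principal point, and the focal length, principal point and baseline are made-up values, not results of a real calibration.

```python
def make_q(f, cx, cy, baseline):
    """Build a disparity-to-depth matrix for a rectified pair.
    Assumes equal principal points in both cameras and Tx = -baseline
    (right camera displaced along x), so the last entry is 0."""
    tx = -baseline
    return [
        [1.0, 0.0, 0.0, -cx],
        [0.0, 1.0, 0.0, -cy],
        [0.0, 0.0, 0.0, f],
        [0.0, 0.0, -1.0 / tx, 0.0],
    ]

def reproject(q, x, y, d):
    """Multiply Q by (x, y, d, 1) and divide by the fourth component."""
    vec = (x, y, d, 1.0)
    X, Y, Z, W = (sum(row[i] * vec[i] for i in range(4)) for row in q)
    return X / W, Y / W, Z / W

# Hypothetical calibration: f = 700 px, principal point (320, 240),
# baseline 0.12 m. The pixel and disparity come from the example above.
q = make_q(700.0, 320.0, 240.0, 0.12)
print(reproject(q, 40.0, 110.0, 18.0))
```

Under these assumptions the Z value reduces to focal length * baseline / disparity, and the output is in the unit of the baseline.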
I am still working on it, so I will not post entire source code yet. But I will give you a conceptual solution.
You will need the following data as input (for both cameras):
camera position
camera point of interest (point at which camera is looking)
camera resolution (horizontal and vertical)
camera field of view angles (horizontal and vertical)
You can measure the last one yourself, by placing the camera on a piece of paper and drawing two lines and measuring an angle between these lines.
Cameras do not have to be aligned in any way, you only need to be able to see your object in both cameras.
Now calculate a vector from each camera to your object. You have (X,Y) pixel coordinates of the object from each camera, and you need to calculate a vector (X,Y,Z). Note that in the simple case, where the object is seen right in the middle of the camera, the solution would simply be (camera.PointOfInterest - camera.Position).
Once you have both vectors pointing at your target, lines defined by these vectors should cross in one point in ideal world. In real world they would not because of small measurement errors and limited resolution of cameras. So use the link below to calculate the distance vector between two lines.
Distance between two lines
In that link: P0 is your first cam position, Q0 is your second cam position and u and v are vectors starting at camera position and pointing at your target.
You are not interested in the actual distance that the linked page calculates. You need the vector Wc; we can assume that the object is at the midpoint of Wc. Once you have the position of your object in 3D space, you can compute whatever distance you like.
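Until the full source is posted, here is a minimal sketch of that last step, using the standard closest-points formula for two lines. P0, Q0, u and v are as described above; the function name and the camera positions in the usage lines are made up for illustration.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(a, k):
    return tuple(x * k for x in a)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(p0, u, q0, v):
    """Return the midpoint of the shortest segment between the lines
    p0 + s*u and q0 + t*v, taken as the estimated object position."""
    w0 = sub(p0, q0)
    a, b, c = dot(u, u), dot(u, v), dot(v, v)
    d, e = dot(u, w0), dot(v, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("lines are (nearly) parallel")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    closest_on_first = add(p0, scale(u, s))
    closest_on_second = add(q0, scale(v, t))
    return scale(add(closest_on_first, closest_on_second), 0.5)

# Hypothetical setup: cameras 1 unit apart, both pointing at (0.5, 0, 2).
print(triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
                           (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0)))
```

With noisy direction vectors the two lines no longer intersect, and the midpoint is just a reasonable compromise between them.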
I will post the entire source code soon.
I have source code that detects a human face and returns not only depth but also real-world coordinates, with the left camera (or the right camera, I can't remember) as the origin. It is adapted from source code in "Learning OpenCV", with reference to some websites to get it working. The result is generally quite accurate.
