I am new to OpenCV, so bear with me if I am missing simple things here.
I am trying to work out a camera-based system that can continuously output the speed of a vehicle, with the following assumptions:
1. The camera is placed horizontally and the vehicle passes within 3 to 5 feet of the camera lens.
2. The speed will not be more than 30 km/h.
I was hoping to start from the concept of an optical mouse, which detects displacement in the surface pattern. However, I am unclear how to handle the background when the vehicle starts to enter the frame.
There are two methods I was interested in experimenting with, but I am looking for further input:
Detect the vehicle as it enters the frame and separate from background.
Use cvGoodFeaturesToTrack to find points on the vehicle.
Track the points across the next frame and calculate the horizontal velocity using the pyramidal Lucas-Kanade optical flow function.
Repeat
Please suggest corrections and amendments.
Also I request more experienced members to help me code this procedure efficiently since I don't know which are the most correct functions to use here.
Thanks in advance.
I assume you will be using a simple camera at 20 to 30 fps, placed perpendicular to the road but away from it. Your cars have a maximum velocity of about 8 m/s (30 km/h) in the object plane. Calculate the corresponding speed in the image plane with the help of the lens you are using:
( speed in object plane / distance of camera from road ) = ( speed in image plane / focal length )
This gives you the speed in pixels per second, provided you know the physical size of each pixel on the sensor.
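To get a rough feel for the numbers, here is a small sketch of that calculation. The focal length and pixel pitch are assumed values standing in for your actual lens and sensor data:

```python
# Rough estimate of the image-plane speed, using assumed lens parameters.
v_object = 30 / 3.6          # 30 km/h in m/s (~8.3 m/s)
Z = 1.5                      # distance from camera to car in m (about 5 ft)
f = 0.004                    # assumed focal length: 4 mm
pixel_pitch = 3e-6           # assumed pixel size: 3 micrometres

v_image = v_object * f / Z                 # speed in the image plane, m/s
px_per_second = v_image / pixel_pitch
px_per_frame = px_per_second / 30          # at 30 fps

print("%.0f px/s, %.0f px/frame" % (px_per_second, px_per_frame))
# roughly 7400 px/s, i.e. ~250 px of displacement per frame at 30 fps,
# which is why the warning about large inter-frame displacements below matters.
```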
Steps...
You can use frame differencing: subtract the current frame from the previous frame and take the absolute difference, then threshold the difference. This segments your moving car from the background. Remember that it segments all moving objects, so if you want a car and not a moving person you can use a shape characteristic such as the height-to-width ratio. Fit a rectangle to the segmented region and repeat the same steps in each frame, keeping a record of the coordinate of the leading edge of the bounding box. That way, from the moment a car enters the view until it passes out of it, you know how long the car has persisted. Use the number of frames, the frame rate and the coordinates of the leading edge of the bounding box to calculate the speed.
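A minimal sketch of this frame-differencing idea in Python, assuming a placeholder video file and threshold/area values you would tune (the contour return signature follows OpenCV 4.x):

```python
import cv2

cap = cv2.VideoCapture("traffic.mp4")     # placeholder video source
fps = cap.get(cv2.CAP_PROP_FPS)

ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
prev_x = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Absolute difference between consecutive frames, then threshold
    diff = cv2.absdiff(gray, prev_gray)
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(mask, None, iterations=2)

    # Fit a bounding box to the largest moving blob (assumed to be the car)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        c = max(contours, key=cv2.contourArea)
        if cv2.contourArea(c) > 500:          # ignore small noise blobs
            x, y, w, h = cv2.boundingRect(c)
            leading_edge = x + w              # x-coordinate of the leading edge
            if prev_x is not None:
                px_per_frame = leading_edge - prev_x
                print("speed: %.1f px/s" % (px_per_frame * fps))
            prev_x = leading_edge

    prev_gray = gray
```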
You can use goodFeaturesToTrack and the optical flow functions of OpenCV. That way you can distinguish between fast-moving and slow-moving objects, but keep refreshing the points that goodFeaturesToTrack gives you, or else any new car coming into the camera view will not be picked up. Record the displacement of the set of points picked by goodFeaturesToTrack in each frame; that is the displacement of the moving object. Calculate the speed in the same way. The basic idea is to record the number of frames the object persists in the camera's field of view: if your camera is fixed, so is your field of view, hence what matters is in how many frames you are able to catch the object.
Remember: the optical flow in OpenCV works for tracking slow-moving objects, or, more precisely, the displacement of the feature points (determined by goodFeaturesToTrack) must be small between two consecutive frames for the algorithm to work. Large displacements will give some erroneous predictions. That is why the speed in the image plane is important; at least qualitatively you should get an idea of it.
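A minimal sketch of this second approach, again with a placeholder file name; the feature/LK parameters and the refresh interval are assumptions you would tune:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("traffic.mp4")     # placeholder video source
fps = cap.get(cv2.CAP_PROP_FPS)

ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                              qualityLevel=0.3, minDistance=7)
frame_idx = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    if pts is not None and len(pts) > 0:
        new_pts, status, err = cv2.calcOpticalFlowPyrLK(
            prev_gray, gray, pts, None, winSize=(21, 21), maxLevel=3)
        good_new = new_pts[status.flatten() == 1]
        good_old = pts[status.flatten() == 1]
        if len(good_new) > 0:
            # Median horizontal displacement of the tracked points, in px/frame
            dx = np.median(good_new[:, 0, 0] - good_old[:, 0, 0])
            print("horizontal speed: %.1f px/s" % (dx * fps))
        pts = good_new.reshape(-1, 1, 2)

    # Refresh the feature points regularly so newly entering cars get picked up
    frame_idx += 1
    if frame_idx % 10 == 0 or pts is None or len(pts) < 10:
        pts = cv2.goodFeaturesToTrack(gray, maxCorners=100,
                                      qualityLevel=0.3, minDistance=7)

    prev_gray = gray
```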
NOTE: both methods are for single-object tracking. For multiple-object tracking you need some modifications, but you can start with either method; I think it will work.
I am working on a machine vision project and need to determine the angle of an object in x and y relative to the center of the frame (the center, in my mind, being where the camera is pointed). I originally did NOT do a camera calibration; I calculated the angle per pixel by taking a picture of a dense grid and doing some simple math. While doing some object tracking I noticed some strange behaviour, which I suspected was due to distortion. I also noticed that an object that should be dead center of my frame was not: the camera had to be shifted or the angle changed for that to be true.
I performed a calibration in OpenCV and got a principal point of (363.31, 247.61) at a resolution of 640x480. The angle per pixel obtained from cv2.calibrationMatrixValues() was very close to what I had calculated, but up to this point I was assuming the center of the frame was at 640/2, 480/2. I'm hoping someone can confirm: going forward, do I assume that my (0,0) in Cartesian coordinates is now at the principal point? Perhaps I can use my new camera matrix to correct the image so my original assumption holds? Or am I out to lunch and in need of some direction on how to achieve this?
Also, was my assumption of 640/2 correct, or should it technically have been (640-1)/2? Thanks all!
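For context, here is a small sketch of the angle-per-pixel relationship being asked about, measured from the principal point in the camera matrix rather than from the geometric image center. The focal lengths are placeholder values; only the principal point comes from the question:

```python
import numpy as np

fx, fy = 600.0, 600.0          # assumed focal lengths in pixels (placeholders)
cx, cy = 363.31, 247.61        # principal point from the calibration above

def pixel_to_angles(u, v):
    """Angles of the ray through pixel (u, v), relative to the optical axis."""
    ax = np.degrees(np.arctan((u - cx) / fx))   # horizontal angle
    ay = np.degrees(np.arctan((v - cy) / fy))   # vertical angle
    return ax, ay

print(pixel_to_angles(320, 240))   # geometric image center: not (0, 0) degrees
print(pixel_to_angles(cx, cy))     # principal point: (0, 0) degrees
```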
I'm writing an application in C++ which gets the camera pose using fiducial markers. It also takes a lat/lon coordinate in the real world as input and, as output, streams a video with an X marker showing the location of that coordinate on the screen.
When I move my head, the X stays in the same place spatially (because I know how to move it on the screen based on the camera pose, or even hide it when I look away).
My only problem is converting the real-life coordinate to a coordinate on the screen.
I know my own GPS coordinate and the target GPS coordinate.
I also have the screen size (height/width).
How can I, in OpenCV, translate all of this into an x,y pixel on the screen?
In my view, your question isn't very clear.
OpenCV is an image processing library.
You can't do this conversion with OpenCV alone; you need a solution built on your own algorithms. So I have some advice and an experiment to explain a few things.
You can simulate showing your real-life position on screen with any programming language. Imagine you want to develop measurement software that measures a house plan image on screen by drawing lines along the edges of the walls (you know the lengths of some walls from an image like the one below).
If you want to measure the wall of the WC at the bottom, you must know how many pixels correspond to how many feet. So first draw a line from the start to the end of a wall of known length and note its width in pixels; for example, 12'4" might equal 9 pixels of width. From then on you can calculate the length of the WC wall at the bottom using a basic proportion. Of course this is just a basic ratio.
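A tiny sketch of that proportion, using the numbers from the example above (the WC wall pixel count is hypothetical):

```python
# Basic proportion: a wall labelled 12'4" spans 9 pixels on the plan image.
known_length_ft = 12 + 4 / 12.0     # 12'4" in feet
known_length_px = 9.0               # measured width of that wall in pixels

ft_per_px = known_length_ft / known_length_px

# Any other wall can then be measured by counting its pixels:
wc_wall_px = 14.0                   # hypothetical pixel length of the WC wall
print("WC wall is about %.1f ft" % (wc_wall_px * ft_per_px))
```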
I know this is not exactly what you need, but I hope this answer gives you some ideas.
I'm using the OpenCV calibrateCamera function to calibrate my camera. I started from the tutorial implementation, but something seems wrong.
The camera is looking down on a table and I use a chessboard with an area that covers about 1/2 or 1/4 of my total image. Since I aim to track a flat object that slides over this table, I also slide my chessboard over this table.
So my first question is: is it OK that I move my chessboard over this table, or do I have to make some 3D movements in order to get a good result?
Because I was wondering: how does the function guess the distance between the table and the camera? It only has a guess of its focal point, and it has only one "eye", so there is no depth vision.
My second question: how does the bloody thing work? :p Can anyone show me some implementation of this function?
Thx!
The camera calibration needs a set of points to calculate the camera matrix, the position of the camera's principal point, and the distortion coefficients. If you want to use a chessboard you have to take its dimensions into consideration (I never used the circles pattern because detecting a chessboard is easier). The chessboard should have an even x odd number of squares so you get a correct rotation matrix. The calibration function needs a minimum of about 8 sets of chessboard corners; I use 30 to 50, depending on how precise you want to be. The return value of the calibration function is the re-projection error, which should be near zero if the calibration is good.
calibrateCamera takes the size of the chessboard pattern used (you can use different chessboard sizes) and the dimension of the squares (in mm, cm or even m, etc.); your results will be expressed in the units you give.
By the way, after detecting the chessboard corners you have to refine them with the cornerSubPix function; you can control how fine the refinement is via the function parameters.
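Putting those steps together, here is a minimal sketch in Python (the same functions exist in the C++ API), assuming a 9x6 inner-corner board with 25 mm squares and a placeholder image folder:

```python
import cv2
import numpy as np
import glob

pattern_size = (9, 6)   # inner corners of the chessboard (assumed)
square_size = 25.0      # mm; the results will be in these units

# Object points: the chessboard corners in the board's own coordinate system
objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
objp *= square_size

obj_points, img_points = [], []
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

for fname in glob.glob("calib/*.png"):        # placeholder path
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if found:
        # Refine the corner locations to sub-pixel accuracy
        corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
        obj_points.append(objp)
        img_points.append(corners)

rms, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("re-projection error:", rms)   # should be well below 1 pixel
```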
On the internet you can find a lot of documentation about this subject.
http://www.ics.uci.edu/~majumder/vispercep/cameracalib.pdf
I hope it helps !
Regarding the chessboard positions, I got the best results with 25-30 images.
First I take 3-4 images that show the chessboard at different distances: full frame, half, 1/3, 1/4.
Then I make sure to go to each corner and to the center of each edge, plus 4 rotations on each axis (X, Y, Z). Using a 640x480 sensor, my reprojection error was mostly around 0.1 or even better.
Here are a few links that got me in the right direction:
How to verify the correctness of calibration of a webcam?
I am trying to write a program using OpenCV to calculate the distance from a webcam to a one-inch white sphere. I feel like this should be pretty easy, but for whatever reason I'm drawing a blank. Thanks for the help ahead of time.
You can use triangle similarity to calibrate the camera angle and find the distance.
You know your ball's size: D units (e.g. cm). Place it at a known distance Z, say 1 meter = 100cm, in front of the camera and measure its apparent width in pixels. Call this width d.
The focal length of the camera f (which is slightly different from camera to camera) is then f=d*Z/D.
When you see this ball again with this camera, and its apparent width is d' pixels, then by triangle similarity, you know that f/d'=Z'/D and thus: Z'=D*f/d' where Z' is the ball's current distance from the camera.
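A minimal sketch of that calculation, with made-up pixel widths standing in for what you would actually measure in the webcam image:

```python
# Triangle-similarity distance estimate for the one-inch ball.
D = 2.54          # real ball diameter in cm (one inch)
Z = 100.0         # known calibration distance in cm
d = 38.0          # apparent width in pixels at the calibration distance (example)

f = d * Z / D     # effective focal length in pixels

# Later, the same ball appears d_new pixels wide:
d_new = 19.0      # example measurement
Z_new = D * f / d_new
print("distance: %.1f cm" % Z_new)   # half the apparent size -> twice the distance
```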
To my mind you will need a camera model, i.e. a calibration model, if you want to measure distance or other things in the real world.
The pinhole camera model is simple, linear and gives good results (but won't correct distortions, whether radial or tangential).
If you don't use that, you can still compute a disparity/depth map (for instance if you use stereo vision), but it is relative and doesn't give you an absolute measurement, only which objects are behind or in front of others.
Therefore, I think the answer is: you will need to calibrate somehow. Maybe you could ask the user to bring the sphere toward the camera until it perfectly fills the image plane; with prior knowledge of the ball's size, you would then be able to compute the distance.
Julien,
Is there a way to calculate the distance to a specific object using a stereo camera?
Is there an equation or something to get the distance using disparity or angle?
NOTE: Everything described here can be found in the Learning OpenCV book in the chapters on camera calibration and stereo vision. You should read these chapters to get a better understanding of the steps below.
One approach that does not require you to measure all the camera intrinsics and extrinsics yourself is to use OpenCV's calibration functions. Camera intrinsics (lens distortion/skew etc.) can be calculated with cv::calibrateCamera, while the extrinsics (the relation between left and right camera) can be calculated with cv::stereoCalibrate. These functions take a number of points in pixel coordinates and try to map them to real-world object coordinates. CV has a neat way to get such points: print out a black-and-white chessboard and use the cv::findChessboardCorners/cv::cornerSubPix functions to extract them. Around 10-15 image pairs of chessboards should do.
The matrices calculated by the calibration functions can be saved to disc so you don't have to repeat this process every time you start your application. You get some neat matrices here that allow you to create a rectification map (cv::stereoRectify/cv::initUndistortRectifyMap) that can later be applied to your images using cv::remap. You also get a neat matrix called Q, which is a disparity-to-depth matrix.
The reason to rectify your images is that once the process is complete for a pair of images (assuming your calibration is correct), every pixel/object in one image can be found on the same row in the other image.
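A sketch of the rectification step, in Python for brevity. The calibration matrices are assumed to have been saved earlier to a placeholder file, and the image size and file names are placeholders too:

```python
import cv2
import numpy as np

# Assumed saved calibration: M1, d1, M2, d2 (intrinsics), R, T (extrinsics).
calib = np.load("stereo_calib.npz")           # placeholder file
M1, d1, M2, d2 = calib["M1"], calib["d1"], calib["M2"], calib["d2"]
R, T = calib["R"], calib["T"]
image_size = (640, 480)                       # assumed image size

R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(M1, d1, M2, d2,
                                                  image_size, R, T)
map1x, map1y = cv2.initUndistortRectifyMap(M1, d1, R1, P1, image_size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(M2, d2, R2, P2, image_size, cv2.CV_32FC1)

# After remapping, corresponding points lie on the same row in both images,
# and Q is the disparity-to-depth matrix mentioned above.
left_rect = cv2.remap(cv2.imread("left.png"), map1x, map1y, cv2.INTER_LINEAR)
right_rect = cv2.remap(cv2.imread("right.png"), map2x, map2y, cv2.INTER_LINEAR)
```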
There are a few ways you can go from here, depending on what kind of features you are looking for in the image. One way is to use CVs stereo correspondence functions, such as Stereo Block Matching or Semi Global Block Matching. This will give you a disparity map for the entire image which can be transformed to 3D points using the Q matrix (cv::reprojectImageTo3D).
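A sketch of this dense route, assuming the pair has already been rectified and Q saved; the file names and matcher parameters are placeholders to tune:

```python
import cv2
import numpy as np

left = cv2.imread("left_rect.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rect.png", cv2.IMREAD_GRAYSCALE)
Q = np.load("Q.npy")   # disparity-to-depth matrix from the rectification step

matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM output is fixed-point

points_3d = cv2.reprojectImageTo3D(disparity, Q)   # XYZ per pixel, left camera at the origin
print(points_3d[110, 40])   # 3D coordinates of the pixel at row 110, column 40
```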
The downfall of this is that unless there is much texture information in the image, CV isn't really very good at building a dense disparity map (you will get gaps in it where it couldn't find the correct disparity for a given pixel), so another approach is to find the points you want to match yourself. Say you find the feature/object in x=40,y=110 in the left image and x=22 in the right image (since the images are rectified, they should have the same y-value). The disparity is calculated as d = 40 - 22 = 18.
Construct a cv::Point3f(x,y,d), in our case (40,110,18). Find other interesting points the same way, then send all of the points to cv::perspectiveTransform (with the Q matrix as the transformation matrix, essentially this function is cv::reprojectImageTo3D but for sparse disparity maps) and the output will be points in an XYZ-coordinate system with the left camera at the center.
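And a sketch of the sparse route, using the example numbers from the paragraph above (Q is again assumed to come from your rectification step):

```python
import cv2
import numpy as np

Q = np.load("Q.npy")                       # disparity-to-depth matrix (placeholder file)

x_left, y, x_right = 40.0, 110.0, 22.0     # example feature from the text
d = x_left - x_right                       # disparity = 18

pts = np.array([[[x_left, y, d]]], dtype=np.float32)   # one (x, y, d) point
xyz = cv2.perspectiveTransform(pts, Q)
print(xyz)   # 3D point in the left camera's coordinate system
```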
I am still working on it, so I will not post entire source code yet. But I will give you a conceptual solution.
You will need the following data as input (for both cameras):
camera position
camera point of interest (point at which camera is looking)
camera resolution (horizontal and vertical)
camera field of view angles (horizontal and vertical)
You can measure the last one yourself by placing the camera on a piece of paper, drawing two lines along the edges of its view, and measuring the angle between these lines.
Cameras do not have to be aligned in any way, you only need to be able to see your object in both cameras.
Now calculate a vector from each camera to your object. You have (X,Y) pixel coordinates of the object from each camera, and you need to calculate a vector (X,Y,Z). Note that in the simple case, where the object is seen right in the middle of the camera, the solution would simply be (camera.PointOfInterest - camera.Position).
Once you have both vectors pointing at your target, the lines defined by these vectors should cross at a single point in an ideal world. In the real world they will not, because of small measurement errors and the limited resolution of the cameras. So use the link below to calculate the distance vector between the two lines.
Distance between two lines
In that link: P0 is your first cam position, Q0 is your second cam position and u and v are vectors starting at camera position and pointing at your target.
You are not interested in the actual distance they calculate there. You need the vector Wc; we can assume that the object is in the middle of Wc. Once you have the position of your object in 3D space, you can also get whatever distance you like.
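A small sketch of that computation, following the Wc construction from the linked page. The camera positions and direction vectors below are placeholder values standing in for the ones you measured and computed:

```python
import numpy as np

P0 = np.array([0.0, 0.0, 0.0])     # first camera position (placeholder)
Q0 = np.array([1.0, 0.0, 0.0])     # second camera position (placeholder)
u = np.array([0.5, 1.0, 0.1])      # direction from first camera toward the object
v = np.array([-0.4, 1.0, 0.1])     # direction from second camera toward the object

w0 = P0 - Q0
a, b, c = u.dot(u), u.dot(v), v.dot(v)
d, e = u.dot(w0), v.dot(w0)
denom = a * c - b * b              # zero only if the rays are parallel

s = (b * e - c * d) / denom        # parameter along the first ray
t = (a * e - b * d) / denom        # parameter along the second ray

closest_on_first = P0 + s * u
closest_on_second = Q0 + t * v
object_position = (closest_on_first + closest_on_second) / 2.0   # middle of Wc
print(object_position)
```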
I will post the entire source code soon.
I have source code for detecting a human face that returns not only the depth but also real-world coordinates, with the left camera (or the right camera, I can't remember) as the origin. It is adapted from the source code of "Learning OpenCV" and from some websites I referred to in order to get it working. The results are generally quite accurate.