I am working with Turtlebots and ROS, and using a camera to find the pixel positions of a marker in the image. I've moved over from simulations to a physical system. The issue I'm having is that the pixel positions in my physical system do not match the pixel positions in the simulated system, despite the marker and everything else being in the same position as in the simulations. The vertical pixel position is shifted by about 40 pixels, even though the height between the camera and marker, the marker position, and the distance between the marker and camera are the same in both the physical and simulated systems. The simulated system does not need a camera calibration matrix; its camera is assumed to be ideal.
The resolution I'm using is 640x480, so the center pixel should be at cx=320 and cy=240. In the camera calibration matrix I was using in the physical system, cx was around 318, which is pretty accurate, but cy was around 202, which is far from what it should be. The difference in cy (about 38 pixels) is roughly the same as the vertical error I'm seeing in the pixel positions.
So is it right to assume that the error in the center pixel in the calibration could be causing the error in the pixel positions?
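One way to sanity-check this is the pinhole model, where the vertical pixel coordinate is v = fy * Y / Z + cy, so an offset in cy moves every projected point by the same number of pixels. The sketch below is only an illustration: the focal length of 600 pixels and the marker coordinates are made-up values, not taken from the calibration above, and lens distortion is ignored.

```python
def project(point_cam, fx, fy, cx, cy):
    """Project a 3D point given in the camera frame (X right, Y down, Z forward) to pixels."""
    X, Y, Z = point_cam
    return fx * X / Z + cx, fy * Y / Z + cy

# Hypothetical marker position in metres (assumption, for illustration only).
marker = (0.10, 0.05, 1.50)

# Ideal simulated camera: principal point at the image centre.
u_ideal, v_ideal = project(marker, 600.0, 600.0, 320.0, 240.0)

# Same focal length, but the principal point reported by the calibration.
u_real, v_real = project(marker, 600.0, 600.0, 318.0, 202.0)

du = u_real - u_ideal  # horizontal shift caused by cx alone
dv = v_real - v_ideal  # vertical shift caused by cy alone
print(du, dv)
```

The shift equals the difference in the principal point regardless of where the marker is. This does not tell you whether 202 is the camera's true principal point or a calibration artifact; it only shows that a cy offset of this size is consistent with the observed vertical shift.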
I have been trying to calibrate a USB camera (Logitech C920 I think) and I've been using the camera_calibrator ROS package found here http://wiki.ros.org/camera_calibration to calibrate the camera. I think the camera calibration did not go that well, seeing as I always have a pretty big error in either cx or cy. Here are the calibration matrices.
First calibration matrix, used 15x10 vertices with size 0.25
Recalibrated but did not actually use this yet, calibrated with 8x6 size 0.25
Same as previous, some difference between the two
The checkerboards were on A4 papers.
Thanks in advance.
I believe the answer to your question comes down to how to perform a better camera calibration.
Quoting from Calib.io:
Choose the right size calibration target.
Perform calibration at the approximate working distance (WD) of your final application.
The target should have a high feature count.
Collect images from different areas and tilts.
Use good lighting.
Calibration is only as accurate as the calibration target used. Use laser or inkjet printed targets only to validate and test.
Per sample, proper mounting of calibration target and camera.
Remove bad observations. Carefully inspect reprojection errors.
Obtaining a low re-projection error does not equal a good camera calibration. Be careful of over fitting.
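The advice about inspecting reprojection errors and removing bad observations could be sketched roughly as below. It assumes you already have, for each calibration image, the detected corner positions and their reprojections (for example from cv::projectPoints). The function names and the "2x the median" threshold are illustrative assumptions, not something Calib.io prescribes.

```python
import math
import statistics

def rms_reprojection_error(observed, reprojected):
    """RMS pixel distance between detected corners and their reprojections for one image."""
    squared = [(ox - px) ** 2 + (oy - py) ** 2
               for (ox, oy), (px, py) in zip(observed, reprojected)]
    return math.sqrt(sum(squared) / len(squared))

def flag_bad_views(per_view_errors, factor=2.0):
    """Return indices of images whose RMS error exceeds factor * median (arbitrary rule)."""
    threshold = factor * statistics.median(per_view_errors)
    return [i for i, err in enumerate(per_view_errors) if err > threshold]

# Tiny made-up example: two corners in one image.
example_rms = rms_reprojection_error(
    [(100.0, 100.0), (200.0, 100.0)],
    [(100.3, 100.4), (200.0, 100.0)],
)

# Made-up per-image RMS errors in pixels; the fourth image stands out.
errors = [0.21, 0.25, 0.19, 1.40, 0.23]
bad = flag_bad_views(errors)
print(example_rms, bad)
```

Images flagged this way are candidates for removal before recalibrating. As the last quoted point warns, a low error after pruning is not proof of a good calibration.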
I am trying to understand how to map points between two images of the same scene taken from different camera positions, as in this sketch (apologies for the rough sketch and the handwriting). There is a sample image taken from cam1 and a sample image taken from cam2, and I am trying to map between the two. Since the two cameras are the same model (a Logitech camera), I assume camera calibration isn't required. I compute SIFT descriptors, match features, and pass the good matches to a homography estimation with RANSAC, which gives me a 3x3 matrix. To verify the mapping, I select a few objects (say, the bins) in the cam1 image and try to map them into the cam2 image with the 3x3 matrix using warp_perspective, but the output isn't good. I selected the top-left and bottom-right corners of the objects (i.e. the bins) in the cam1 image and tried to draw a bounding box around the same object in the cam2 image.
But as is visible in the view-map output image, the bounding boxes don't line up with the bins.
I want to understand where I am going wrong. Is it that the camera positions affect the result, so a homography shouldn't be used here? Do I have to use multiple homographies, or do I need to know the translation between the camera positions? I'm very confused. Thank you.
A homography maps one plane onto another. It can only be used if all of the matches lie on a plane in the real world (e.g. on a planar wall), or if the feature points are far enough from both cameras that the transformation between the cameras can be expressed as a pure rotation. See this link for further explanation.
In your case the objects are located at different depths, so you need to perform stereo calibration of the cameras and then compute a depth map to be able to map pixels from one camera into the other.
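To see why depth matters, here is a minimal sketch with two identical ideal pinhole cameras separated by a sideways baseline. All numbers (focal length, baseline, point positions) are made up for illustration. Two points that fall on the same pixel in cam1 but sit at different depths land on different pixels in cam2, so no mapping that depends on pixel position alone, including a single homography, can be right for both.

```python
def project(point, f, cx, cy, cam_x=0.0):
    """Ideal pinhole projection for a camera at (cam_x, 0, 0) looking along +Z."""
    X, Y, Z = point
    return f * (X - cam_x) / Z + cx, f * Y / Z + cy

f, cx, cy = 600.0, 320.0, 240.0   # assumed intrinsics in pixels
baseline = 0.5                    # assumed distance between the cameras in metres

near = (0.2, 0.1, 2.0)            # point 2 m away
far = (0.5, 0.25, 5.0)            # point 5 m away on the same ray from cam1

p1_near = project(near, f, cx, cy)
p1_far = project(far, f, cx, cy)
p2_near = project(near, f, cx, cy, cam_x=baseline)
p2_far = project(far, f, cx, cy, cam_x=baseline)

# Horizontal shift between the two views; for this setup it equals f * baseline / Z.
shift_near = p1_near[0] - p2_near[0]
shift_far = p1_far[0] - p2_far[0]
print(p1_near, p1_far, p2_near, p2_far, shift_near, shift_far)
```

In this setup the shift is inversely proportional to depth, which is the quantity a stereo depth map recovers.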
So I have been tinkering a little bit with OpenCV and I want to be able to use a camera image to get the position of certain objects that are lying flat on a plane. These objects are simple shapes such as circles, squares, etc. They all have the same height of 5 cm. To be able to relate real-world points to pixels in the camera image, I painted 4 white squares on the plane with known distances between them.
So the steps I have been taking are:
Initialization:
1. Calibrate my camera using a checkerboard image and save the calibration data.
2. Get the input image. Call cv::undistort with the calibration data for my camera.
3. Find the center points of the 4 squares in the image and pass that data and the real world coordinates of the squares to the cv::solvePnP function. Save the rvec and tvec return parameters.
4. Warp the perspective of the image so you can get a top down view from the image. This is essentially following this tutorial: https://docs.opencv.org/3.4.1/d9/dab/tutorial_homography.html
5. Use the resulting image to again find the 4 white squares and then calculate a "pixels per meter" translation constant which can relate a certain amount of difference in pixels between points to the real world distance on the plane where the 4 squares are.
Finding object (this is done after initialization):
1. Get the input image. Call cv::undistort with the calibration data for my camera.
2. Warp the perspective of the image so you can get a top down view from the image. This is the same as step 4 during initialization.
3. Find the center point of the object to detect.
4. Since the center point of the object is on a higher plane than the one I calibrated on, I use the following formula to correct for this (d is the corrected pixel offset from the center of the image, x is the measured offset, camHeight is the camera height, which I measured with a tape measure, and h is the height of the object):
d = x - (h * (x / camHeight))
Here is an illustration of how I got this formula:
But still the coordinates are not matching up...
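As a quick check of the formula itself, the sketch below reproduces it with similar triangles. It assumes an ideal top-down view and that x is measured from the point directly below the camera, which is not necessarily the center of the warped image. The function names and numbers are hypothetical. Because the relation is linear, it works in pixels or in metres as long as camHeight and h share a unit.

```python
def ground_intersection(true_offset, cam_height, obj_height):
    """Where the ray through the object's top hits the ground plane (what the top-down view shows)."""
    return true_offset * cam_height / (cam_height - obj_height)

def correct_for_object_height(x_observed, cam_height, obj_height):
    """The formula from the post: d = x - h * (x / camHeight)."""
    return x_observed - obj_height * (x_observed / cam_height)

cam_height = 1.0     # assumed camera height above the plane, metres
obj_height = 0.05    # the 5 cm objects from the post
true_offset = 0.38   # made-up true horizontal distance from the point below the camera

x_seen = ground_intersection(true_offset, cam_height, obj_height)
d = correct_for_object_height(x_seen, cam_height, obj_height)
print(x_seen, d)
```

Under these assumptions the formula recovers the true offset, so if results still disagree the cause may lie elsewhere, for example in the reference point the offset is measured from or in the camera not being exactly perpendicular to the plane.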
So I am wondering whether this is correct at all. Specifically I have the following questions:
1. Is using cv::undistort before using cv::solvePnP correct? cv::solvePnP also takes the camera calibration data as input, so I'm not sure if I have to pass an undistorted image to it or not.
2. Similar to 1: during "Finding object" I call cv::undistort -> cv::warpPerspective. Is this undistort necessary here?
3. Is my calculation to correct for the parallel planes in step 4 correct? I feel like I am missing something but I can't see what. One thing I am wondering is whether I can get the camera height from OpenCV once solvePnP is done.
I am a newbie to CV, so if anything else is totally wrong please also point it out to me.
Thank you for reading this wall of text!
I have two images.
Say one is a 10x10 chessboard, which we call trainImage, and then there is another, queryImage, which is the same chessboard photographed using a phone camera. Now I have to find the position of the camera in (x,y,z) coordinates. Using OpenCV and feature detection I have been able to identify the chessboard in the photographed image, but how do I go about calculating the transformation of the chessboard so that I can eventually calculate the position of the camera? Any pointers on where to start looking would be really appreciated. Thanks.
Edit:
Reframing the problem statement: I have two images, trainImage and queryImage. I need to find the position of the camera, i.e. (x,y,z), if we assume that trainImage is at (0,0,0) in queryImage. From some reading, I found that for this I need rvec (rotation vector) and tvec (translation vector).
When I use the findHomography() function on the two images I get a 3x3 homography matrix, with which I can find the pixel points (x,y) in queryImage by multiplying it with the pixel points (x,y) in trainImage. How can I use this homographyMatrix to calculate tvec and rvec?
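For the last step of that chain, here is a minimal sketch of turning an rvec/tvec pair into a camera position. It assumes rvec and tvec follow the cv::solvePnP convention (X_cam = R * X_world + t), in which case the camera center in world coordinates is C = -R^T * t. It does not show how to obtain rvec and tvec from the homography; the input values are made up, and the helper names are hypothetical.

```python
import math

def rodrigues(rvec):
    """Convert an axis-angle rotation vector to a 3x3 rotation matrix."""
    theta = math.sqrt(sum(c * c for c in rvec))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (c / theta for c in rvec)
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]

def camera_position(rvec, tvec):
    """Camera center in world coordinates: C = -R^T * t."""
    R = rodrigues(rvec)
    return [-sum(R[row][col] * tvec[row] for row in range(3)) for col in range(3)]

# Made-up pose: camera rotated 180 degrees about X (looking straight down at the
# board, whose Z axis points up) with the board origin 2 units in front of the lens.
position = camera_position((math.pi, 0.0, 0.0), (0.0, 0.0, 2.0))
print(position)
```

With the board defining the world frame (trainImage at the origin), the third component is the camera's distance from the board plane, in whatever unit the board coordinates were given.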
So I am very new to OpenCV (2.1), so please keep that in mind.
So I managed to calibrate the cheap web camera that I am using (with a wide angle attachment), using the checkerboard calibration method to produce the intrinsic matrix and distortion coefficients.
I then have no trouble feeding these values back in and producing image maps, which I then apply to a video feed to correct the incoming images.
I run into an issue, however. I know that when it is warping/correcting the image, it creates several skewed sections and then crops the image to remove any black areas. My question is: can I view the complete warped image, including the regions that have black areas? Below is an example of the black regions with skewed sections I was trying to describe, in case my terminology was off:
An image better conveying the regions I am talking about can be found here! This image was discovered in this post.
Currently: The cvRemap() returns basically the yellow box in the image linked above, but I want to see the whole image as there is relevant data I am looking to get out of it.
What I've tried: Applying a scale conversion to the image map to fit the complete image (including stretched parts) into frame
CvMat *intrinsic = (CvMat*)cvLoad( "Intrinsics.xml" );
CvMat *distortion = (CvMat*)cvLoad( "Distortion.xml" );
cvInitUndistortMap( intrinsic, distortion, mapx, mapy );
cvConvertScale(mapx, mapx, 1.25, -shift_x); // Some sort of scale conversion
cvConvertScale(mapy, mapy, 1.25, -shift_y); // applied to the image map
cvRemap(distorted,undistorted,mapx,mapy);
The cvConvertScale, when I think I have aligned the x/y shift correctly (by guessing and checking), somehow distorts the image map, making the correction useless. There might be some math involved here that I am not following correctly.
Does anyone have any other suggestions to solve this problem, or an idea of what I might be doing wrong? I've also tried writing my own code to fix the distortion issues, but let's just say OpenCV already knows how to do it well.
From memory, you need to use InitUndistortRectifyMap(cameraMatrix,distCoeffs,R,newCameraMatrix,map1,map2), of which InitUndistortMap is a simplified version.
cvInitUndistortMap( intrinsic, distort, map1, map2 )
is equivalent to:
cvInitUndistortRectifyMap( intrinsic, distort, Identity matrix, intrinsic,
map1, map2 )
The new parameters are R and newCameraMatrix. R specifies an additional transformation (e.g. rotation) to perform (just set it to the identity matrix).
The parameter of interest to you is newCameraMatrix. In InitUndistortMap this is the same as the original camera matrix, but you can use it to get that scaling effect you're talking about.
You get the new camera matrix with GetOptimalNewCameraMatrix(cameraMat, distCoeffs, imageSize, alpha,...). You basically feed in intrinsic, distort, your original image size, and a parameter alpha (along with containers to hold the result matrix, see documentation). The parameter alpha will achieve what you want.
I quote from the documentation:
The function computes the optimal new camera matrix based on the free
scaling parameter. By varying this parameter the user may retrieve
only sensible pixels alpha=0, keep all the original image pixels if
there is valuable information in the corners alpha=1, or get something
in between. When alpha>0, the undistortion result will likely have
some black pixels corresponding to “virtual” pixels outside of the
captured distorted image. The original camera matrix, distortion
coefficients, the computed new camera matrix and the newImageSize
should be passed to InitUndistortRectifyMap to produce the maps for
Remap.
So for the extreme example with all the black bits showing you want alpha=1.
In summary:
Call cvGetOptimalNewCameraMatrix with alpha=1 to obtain newCameraMatrix.
Use cvInitUndistortRectifyMap with R being the identity matrix and newCameraMatrix set to the one you just calculated.
Feed the new maps into cvRemap.
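To illustrate why a new camera matrix gives the zoomed-out view while keeping the correction valid, here is a rough pure-Python sketch of what the map generation does for a single output pixel when R is the identity. This is my understanding of the per-pixel computation, not OpenCV's actual code. It uses only the first radial coefficient k1, and the intrinsics and the hand-picked new focal length of 400 are made-up values; in practice cvGetOptimalNewCameraMatrix would choose the new matrix from alpha.

```python
def source_pixel(u, v, K, new_K, k1):
    """For output pixel (u, v), return the source-image position to sample.

    K and new_K are (fx, fy, cx, cy) tuples; only radial term k1 is modelled.
    """
    fx, fy, cx, cy = K
    nfx, nfy, ncx, ncy = new_K
    # Back-project the output pixel with the NEW camera matrix.
    x = (u - ncx) / nfx
    y = (v - ncy) / nfy
    # Apply the distortion model in normalized coordinates.
    scale = 1.0 + k1 * (x * x + y * y)
    # Project into the original (distorted) image with the ORIGINAL matrix.
    return fx * x * scale + cx, fy * y * scale + cy

K = (500.0, 500.0, 320.0, 240.0)        # assumed intrinsics for a 640x480 image
zoomed_K = (400.0, 400.0, 320.0, 240.0) # smaller focal length = zoomed-out output
k1 = -0.2                               # assumed barrel distortion

corner_same = source_pixel(0, 0, K, K, k1)           # samples inside the source: corners are cropped
corner_zoomed = source_pixel(0, 0, K, zoomed_K, k1)  # reaches the source corner
edge_zoomed = source_pixel(320, 0, K, zoomed_K, k1)  # falls outside the source: a black pixel
print(corner_same, corner_zoomed, edge_zoomed)
```

The zoom is applied before the distortion model, so the correction stays consistent. Scaling the finished maps with cvConvertScale instead changes the lookup position after the distortion model has been applied, which would explain why that approach breaks the correction.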