Describing nonlinear transformation between two images, using homography - image-processing

A one to one point matching has already been established
between the blue dots on the two images.
The image2 is the distorted version of the image1. The distortion model seems to be
eyefish lens distortion. The question is:
Is there any way to compute a transformation matrix which describes this transition.
In fact a matrix which transforms the blue
dots on the first image to their corresponding blue dots on the second image?
The problem here is that we don’t know the focal length(means images are uncalibrated), however we do have
perfect matching between around 200 points on the two images.
the distorted image:

I think what you're trying to do can be treated as a distortion correction problem, without the need of the rest of a classic camera calibration.
A matrix transformation is a linear one and linear transformations map always straight lines into straight lines (http://en.wikipedia.org/wiki/Linear_map). It is apparent from the picture that the transformation is nonlinear so you cannot describe it with a matrix operation.
That said, you can use a lens distortion model like the one used by OpenCV (http://docs.opencv.org/doc/tutorials/calib3d/camera_calibration/camera_calibration.html) and obtaining the coefficients shouldn't be very difficult. Here is what you can do in Matlab:
Call (x, y) the coordinates of an original point (top picture) and (xp, yp) the coordinates of a distorted point (bottom picture), both shifted to the center of the image and divided by a scaling factor (same for x and y) so they lie more or less in the [-1, 1] interval. The distortion model is:
x = ( xp*(1 + k1*r^2 + k2*r^4 + k3*r^6) + 2*p1*xp*yp + p2*(r^2 + 2*xp^2));
y = ( yp*(1 + k1*r^2 + k2*r^4 + k3*r^6) + 2*p2*xp*yp + p1*(r^2 + 2*yp^2));
Where
r = sqrt(x^2 + y^2);
You have 5 parameters: k1, k2, k3, p1, p2 for radial and tangential distortion and 200 pairs of points, so you can solve the nonlinear system.
Be sure the x, y, xp and yp arrays exist in the workspace and declare them global:
global x y xp yp
Write a function to evaluate the mean square error given a set of arbitrary distortion coefficients, say it's called 'dist':
function val = dist(var)
global x y xp yp
val = zeros(size(xp));
k1 = var(1);
k2 = var(2);
k3 = var(3);
p1 = var(4);
p2 = var(5);
r = sqrt(xp.*xp + yp.*yp);
temp1 = x - ( xp.*(1 + k1*r.^2 + k2*r.^4 + k3*r.^6) + 2*p1*xp.*yp + p2*(r.^2 + 2*xp.^2));
temp2 = y - ( yp.*(1 + k1*r.^2 + k2*r.^4 + k3*r.^6) + 2*p2*xp.*yp + p1*(r.^2 + 2*yp.^2));
val = sqrt(temp1.*temp1 + temp2.*temp2);
Solve the system with 'fsolve":
[coef, fval] = fsolve(#dist, zeros(5,1));
The values in 'coef' are the distortion coefficients you're looking for. To correct the distortion of new points (xp, yp) not present in the original set, use the equations:
r = sqrt(xp.*xp + yp.*yp);
x_corr = xp.*(1 + k1*r.^2 + k2*r.^4 + k3*r.^6) + 2*p1*xp.*yp + p2*(r.^2 + 2*xp.^2);
y_corr = yp.*(1 + k1*r.^2 + k2*r.^4 + k3*r.^6) + 2*p2*xp.*yp + p1*(r.^2 + 2*yp.^2);
Results will be shifted to the center of the image and scaled by the factor you used above.
Notes:
Coordinates must be shifted to the center of the image as the distortion is symmetric with respect to it.
It should't be necessary to normalize to the interval [-1, 1] but it is comon to do it so the distortion coefficients obtained are more or less of the same order of magnitude (working with powers 2, 4 and 6 of pixel coordinates would need very small coefficients).
This method doesn't require the points in the image to be in an uniform grid.

Related

Error in radial distortion based on (zoomed in) OpenCV-style camera parameters

Our AR device is based on a camera with pretty strong optical zoom. We measure the distortion of this camera using classical camera-calibration tools (checkerboards), both through OpenCV and the GML Camera Calibration tools.
At higher zoom levels (I'll use 249 out of 255 as an example) we measure the following camera parameters at full HD resolution (1920x1080):
fx = 24545.4316
fy = 24628.5469
cx = 924.3162
cy = 440.2694
For the radial and tangential distortion we measured 4 values:
k1 = 5.423406
k2 = -2964.24243
p1 = 0.004201721
p2 = 0.0162647516
We are not sure how to interpret (read: implement) those extremely large values for k1 and k2. Using OpenCV's classic "undistort" operation to rectify the image using these values seems to work well. Unfortunately this is (much) too slow for realtime usage.
The thumbnails below look similar, clicking them will display the full size images where you can spot the difference:
Camera footage
Undistorted using OpenCV
That's why we want to take the opposite aproach: leave the camera footage be distorted and apply a similar distortion to our 3D scene using shaders. Following the OpenCV documentation and this accepted answer in particular, the distorted position for a corner point (0, 0) would be
// To relative coordinates
double x = (point.X - cx) / fx; // -960 / 24545 = -0.03911
double y = (point.Y - cy) / fy; // -540 / 24628 = -0.02193
double r2 = x*x + y*y; // 0.002010
// Radial distortion
// -0.03911 * (1 + 5.423406 * 0.002010 + -2964.24243 * 0.002010 * 0.002010) = -0.039067
double xDistort = x * (1 + k1 * r2 + k2 * r2 * r2);
// -0.02193 * (1 + 5.423406 * 0.002010 + -2964.24243 * 0.002010 * 0.002010) = -0.021906
double yDistort = y * (1 + k1 * r2 + k2 * r2 * r2);
// Tangential distortion
... left out for brevity
// Back to absolute coordinates.
xDistort = xDistort * fx + cx; // -0.039067 * 24545.4316 + 924.3162 = -34.6002 !!!
yDistort = yDistort * fy + cy; // -0.021906 * 24628.5469 + 440.2694 = = -99.2435 !!!
These large pixel displacements (34 and 100 pixels at the upper left corner) seem overly warped and do not correspond with the undistorted image OpenCV generates.
So the specific question is: what is wrong with the way we interpreted the values we measured, and what should the correct code for distortion be?

triangulate points using epipolar geometry

I'm using in OpenCV the method
triangulatePoints(P1,P2,x1,x2)
to get the 3D coordinates of a point by its image points x1/x2 in the left/right image and the projection matrices P1/P2.
I've already studied epipolar geometry and know most of the maths behind it. But what how does this algorithm get mathematically the 3D Coordinates?
Here are just some ideas, to the best of my knowledge, should at least work theoretically.
Using the camera equation ax = PX, we can express the two image point correspondences as
ap = PX
bq = QX
where p = [p1 p2 1]' and q = [q1 q2 1]' are the matching image points to the 3D point X = [X Y Z 1]' and P and Q are the two projection matrices.
We can expand these two equations and rearrange the terms to form an Ax = b system as shown below
p11.X + p12.Y + p13.Z - a.p1 + b.0 = -p14
p21.X + p22.Y + p23.Z - a.p2 + b.0 = -p24
p31.X + p32.Y + p33.Z - a.1 + b.0 = -p34
q11.X + q12.Y + q13.Z + a.0 - b.q1 = -q14
q21.X + q22.Y + q23.Z + a.0 - b.q2 = -q24
q31.X + q32.Y + q33.Z + a.0 - b.1 = -q34
from which we get
A = [p11 p12 p13 -p1 0; p21 p22 p23 -p2 0; p31 p32 p33 -1 0; q11 q12 q13 0 -q1; q21 q22 q23 0 -q2; q31 q32 q33 0 -1], x = [X Y Z a b]' and b = -[p14 p24 p34 q14 q24 q34]'. Now we can solve for x to find the 3D coordinates.
Another approach is to use the fact, from camera equation ax = PX, that x and PX are parallel. Therefore, their cross product must be a 0 vector. So using,
p x PX = 0
q x QX = 0
we can construct a system of the form Ax = 0 and solve for x.

Calculating original signal from Discerete Fourier Transform

I am trying to calculate the original equation using a dft.
DFT on (1,0,0,0) gives (1,1,1,1)
So what is the equation of wave representing the dataset (1,0,0,0)? I mean something as follows.
f(t)=sin(t)+0.13sin(3t)
DFT of an impulse is just a flat spectrum in the frequency domain, so you have amplitude 1 at each frequency bin:
f(t) = 1 + cos(2πt/4) + cos(4πt/4) + cos(6πt/4)
= 1 + cos(πt/2) + cos(πt) + cos(3πt/2)

What is the Project Tango lens distortion model?

The Project Tango C API documentation says that the TANGO_CALIBRATION_POLYNOMIAL_3_PARAMETERS lens distortion is modeled as:
x_corr_px = x_px (1 + k1 * r2 + k2 * r4 + k3 * r6) y_corr_px = y_px (1
+ k1 * r2 + k2 * r4 + k3 * r6)
That is, the undistorted coordinates are a power series function of the distorted coordinates. There is another definition in the Java API, but that description isn't detailed enough to tell which direction the function maps.
I've had a lot of trouble getting things to register properly, and I suspect that the mapping may actually go in the opposite direction, i.e. the distorted coordinates are a power series of the undistorted coordinates. If the camera calibration was produced using OpenCV, then the cause of the problem may be that the OpenCV documentation contradicts itself. The easiest description to find and understand is the OpenCV camera calibration tutorial, which does agree with the Project Tango docs:
But on the other hand, the OpenCV API documentation specifies that the mapping goes the other way:
My experiments with OpenCV show that its API documentation appears correct and the tutorial is wrong. A positive k1 (with all other distortion parameters set to zero) means pincushion distortion, and a negative k1 means barrel distortion. This matches what Wikipedia says about the Brown-Conrady model and will be opposite from the Tsai model. Note that distortion can be modeled either way depending on what makes the math more convenient. I opened a bug against OpenCV for this mismatch.
So my question: Is the Project Tango lens distortion model the same as the one implemented in OpenCV (documentation notwithstanding)?
Here's an image I captured from the color camera (slight pincushioning is visible):
And here's the camera calibration reported by the Tango service:
distortion = {double[5]#3402}
[0] = 0.23019999265670776
[1] = -0.6723999977111816
[2] = 0.6520439982414246
[3] = 0.0
[4] = 0.0
calibrationType = 3
cx = 638.603
cy = 354.906
fx = 1043.08
fy = 1043.1
cameraId = 0
height = 720
width = 1280
Here's how to undistort with OpenCV in python:
>>> import cv2
>>> src = cv2.imread('tango00042.png')
>>> d = numpy.array([0.2302, -0.6724, 0, 0, 0.652044])
>>> m = numpy.array([[1043.08, 0, 638.603], [0, 1043.1, 354.906], [0, 0, 1]])
>>> h,w = src.shape[:2]
>>> mDst, roi = cv2.getOptimalNewCameraMatrix(m, d, (w,h), 1, (w,h))
>>> dst = cv2.undistort(src, m, d, None, mDst)
>>> cv2.imwrite('foo.png', dst)
And that produces this, which is maybe a bit overcorrected at the top edge but much better than my attempts with the reverse model:
The Tango C-API Docs state that (x_corr_px, y_corr_px) is the "corrected output position". This corrected output position needs to then be scaled by focal length and offset by center of projection to correspond to a distorted pixel coordinates.
So, to project a point onto an image, you would have to:
Transform the 3D point so that it is in the frame of the camera
Convert the point into normalized image coordinates (x, y)
Calculate r2, r4, r6 for the normalized image coordinates (r2 = x*x + y*y)
Compute (x_corr_px, y_corr_px) based on the mentioned equations:
x_corr_px = x (1 + k1 * r2 + k2 * r4 + k3 * r6)
y_corr_px = y (1 + k1 * r2 + k2 * r4 + k3 * r6)
Compute distorted coordinates
x_dist_px = x_corr_px * fx + cx
y_dist_px = y_corr_px * fy + cy
Draw (x_dist_px, y_dist_px) on the original, distorted image buffer.
This also means that the corrected coordinates are the normalized coordinates scaled by a power series of the normalized image coordinates' magnitude. (this is the opposite of what the question suggests)
Looking at the implementation of cvProjectPoints2 in OpenCV (see [opencv]/modules/calib3d/src/calibration.cpp), the "Poly3" distortion in OpenCV is being applied the same direction as in Tango. All 3 versions (Tango Docs, OpenCV Tutorials, OpenCV API) are consistent and correct.
Good luck, and hopefully this helps!
(Update: Taking a closer look at a the code, it looks like the corrected coordinates and distorted coordinates are not the same. I've removed the incorrect parts of my response, and the remaining parts of this answer are still correct.)
Maybe it's not the right place to post, but I really want to share the readable version of code used in OpenCV to actually correct the distortion.
I'm sure that I'm not the only one who needs x_corrected and y_corrected and fails to find an easy and understandable formula.
I've rewritten the essential part of cv2.undistortPoints in Python and you may notice that the correction is performed iteratively. This is important, because the solution for polynom of 9-th power does not exist and all we can do is to apply its the reveresed version several times to get the numerical solution.
def myUndistortPoint((x0, y0), CM, DC):
[[k1, k2, p1, p2, k3, k4, k5, k6]] = DC
fx, _, cx = CM[0]
_, fy, cy = CM[1]
x = x_src = (x0 - cx) / fx
y = y_src = (y0 - cy) / fy
for _ in range(5):
r2 = x**2 + y**2
r4 = r2**2
r6 = r2 * r4
rad_dist = (1 + k4*r2 + k5*r4 + k6*r6) / (1 + k1*r2 + k2*r4 + k3*r6)
tang_dist_x = 2*p1 * x*y + p2*(r2 + 2*x**2)
tang_dist_y = 2*p2 * x*y + p1*(r2 + 2*y**2)
x = (x_src - tang_dist_x) * rad_dist
y = (y_src - tang_dist_y) * rad_dist
x = x * fx + cx
y = y * fy + cy
return x, y
To speed up, you can use only three iterations, on most cameras this will give enough precision to fit the pixels.

How can i calculate elongation of a convex hull in opencv?

I have found this way to calculate elongation basing on image moments
#ELONGATION
def elongation(m):
x = m['mu20'] + m['mu02']
y = 4 * m['mu11']**2 + (m['mu20'] - m['mu02'])**2
return (x + y**0.5) / (x - y**0.5)
mom = cv2.moments(unicocnt, 1)
elongation = elongation(mom)
How can i calculate elongation of a Convex Hull?
hull = cv2.convexHull(unicocnt)
where 'unicocnt' is a contour that was taken with find contours.
By default, convexHull output a vector of indices of points. You have to set the returnPoints argument to 1 to output a vector of points that you can then pass to cv2.moments.

Resources