Orthographic projection of 3D points onto a plane defined by a normal vector - iOS

I have 3D points and I need to make a 2D orthographic projection of them onto a plane defined by the origin and a normal n. In other words, I want to look at the points from the top, along the given vertical vector. How can I do this?
What I'm thinking is:
project each point P onto the 3D plane: P - (P dot n) * n
look at the 3D plane from the "back" with respect to the normal (not sure how to define this)
do an ortho projection using max-min coordinates of the points in the plane to define the clipping
I am working with iOS.

One way to do this would be to:
rotate the coordinate system so that the plane of interest lies in the x-y plane, and the normal vector n is aligned with the z-axis
project the points onto the x-y plane by setting their z-components to 0
Set up the coordinate transformation
There are infinitely many solutions to this problem since we can always rotate a solution in the x-y plane to get another valid solution.
To fix this, let's choose a vector v lying in the plane that will line up with the x-axis after the transformation. Any vector will do; let's take the vector in the plane with coordinates x=1 and y=0.
Since our plane passes through the origin, its equation is:
x*n1 + y*n2 + z*n3 = 0
z = -(x*n1 + y*n2)/n3
After substituting x=1, y=0, we see that
v = [1 0 -n1/n3]
We also need to make sure v is normalized, so set
v = v/sqrt(v1*v1 + v2*v2 + v3*v3)
EDIT: The above method will fail in cases where n3=0. An alternative method to find v is to take a point P1 from our point set that is not a scalar multiple of n and calculate v = P1 - (P1 dot n) * n, which is the projection of P1 onto the plane (normalize it as above). Just keep searching through your points until you find one for which this projection is nonzero (i.e. P1 is not parallel to n); as long as your points do not all lie along n, this is guaranteed to work.
Now we need a vector u that will line up with the y-axis after the transformation. We get this from the cross product of n and v:
u = n cross v
If n and v are normalized, then u is automatically normalized.
Next, create the matrix
M = [ v1 v2 v3 ]
    [ u1 u2 u3 ]
    [ n1 n2 n3 ]
Transform the points
Now given a 3 by N array of points P, we just follow the two steps above
P_transformed = M*P
P_plane = set the third row of P_transformed to zero
The x-y coordinates of P_plane are now a 2D coordinate system in the plane.
If you need to get the 3D spatial coordinates back, just do the reverse transformation with P_space = M_transpose*P_plane.
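For reference, here is a minimal NumPy sketch of the whole procedure. The function and variable names are illustrative (not from any particular library); it assumes the points are stored as a 3-by-N array and uses the projection fallback from the edit above when n3 = 0.

    import numpy as np

    def project_to_plane_2d(points, n):
        """points: 3xN array; n: plane normal. Returns 2xN in-plane coordinates and M."""
        n = n / np.linalg.norm(n)
        if abs(n[2]) > 1e-8:
            # v is the in-plane vector with x = 1, y = 0
            v = np.array([1.0, 0.0, -n[0] / n[2]])
        else:
            # fallback: project an axis that is not parallel to n onto the plane
            p = np.array([0.0, 0.0, 1.0])
            v = p - np.dot(p, n) * n
        v = v / np.linalg.norm(v)
        u = np.cross(n, v)               # unit length since n and v are
        M = np.vstack([v, u, n])         # rows: v, u, n
        transformed = M @ points         # rotate into the new frame
        return transformed[:2, :], M     # dropping the third row = projection

To recover 3D spatial coordinates, append a zero third row to the 2D result and multiply by M.T, as described above.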

Related

Calculate depth of pixels

Is it possible to calculate the depth of two pixels given that you have:
The pixel coordinates p1 and p2
A displacement vector D = P2 - P1 between the corresponding points in world space
The intrinsic camera matrix K (there is no camera translation or rotation)
Shouldn't it be possible to infer the world coordinates of the points P1 and P2? I just can't seem to figure it out.

Estimate depth of a 2D pixel given intrinsic, extrinsic, and a constraint of Y=0

I have a single-view camera at a certain height (h) from the ground. Through calibration I have obtained the intrinsic parameters K and the rotation matrix and translation vector [R|t], and, because I have full access to the camera and the environment, I can measure whatever I want.
My goal is to estimate the depth of a pixel [u,v] in the image, given that I know the corresponding point is on the floor (so it is at y=-h with respect to the camera).
Given this constraint, I did the following (without success):
create a new 3D point P1 from [u, v] and the camera parameters + focal length: [u - cx, v - cy, f]
multiply P1 by the inverse of my camera matrix K and call the result P2
multiply P2 by the inverse of the [R|t] matrix and call the result P3
P3 is a 4x1 vector, so we normalize it and bring it down to 3x1 [X1, Y1, Z1]. This point should be the world-coordinate projection of my [u, v] point
Solve X and Z when Y=-h in the following way:
x = x1 * (-h / y1)
z = z1 * (-h / y1)
Unfortunately it doesn't look right! I have been stuck on this problem for two weeks now, so it would be really great to get some help from the community. I'm sure it's something obvious that I am missing.
Thanks again
The homogeneous image coordinate is P1 = [u,v,1], or [f*u,f*v,f].
The multiplication with the inverse of the camera matrix gives you a ray along which the 3D point is located.
P2 ~= K⁻¹ * P1 (~= is equality up to a scale factor)
Let's assume the camera is located at C (which is (0,0,0,1) in the camera's coordinate system), and the vector P2 has the form [x,y,z,0]. (The zero at the end makes it translation invariant!)
Then the 3D point you are looking for is located at C + k*P2 and you must solve for the variable k.
Transform this into world coordinates:
Rt⁻¹ * (C + k*P2) = C2 + k*P3
where C2 = Rt⁻¹ * C is the camera position in world coordinates and P3 = Rt⁻¹ * P2 is the ray direction in world coordinates. The point P4 = C2 + k*P3 is your point at Y=-h.
Finally, plug in your constraint Y=-h and calculate k using the y components:
k = (-h - C2_y) / P3_y
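A small NumPy sketch of these steps, assuming a calibrated camera matrix K and extrinsics [R|t] that map world coordinates into the camera frame (the function name is illustrative):

    import numpy as np

    def pixel_to_floor_point(u, v, K, R, t, h):
        p1 = np.array([u, v, 1.0])               # homogeneous pixel coordinate
        ray_cam = np.linalg.inv(K) @ p1          # viewing ray in camera frame (up to scale)
        ray_world = R.T @ ray_cam                # P3: ray direction in world coordinates
        cam_center = -R.T @ t.ravel()            # C2: camera position in world coordinates
        k = (-h - cam_center[1]) / ray_world[1]  # solve the Y component for k
        return cam_center + k * ray_world        # P4: the 3D point on the floor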

Coordinate transformation in OpenCV

I have a polyline figure, given as an array of relative x and y point coordinates (0.0 to 1.0).
I have to draw the figure with random position, scale and rotation angle.
How can I do it in the best way?
You could use a simple transformation with an RT (rotation-translation) matrix.
Let X = (x y 1)^t be the homogeneous coordinates of one point of your figure, let R be a 2x2 rotation matrix, and let T be the 2x1 translation vector of the transformation you plan to make. The RT matrix A then has the form A = [R T; 0 0 1]. To get the transformed coordinates of point X, you do the simple calculation A*X = X', where X' are the new coordinates. To transform the whole figure at once, instead of a single column use a matrix in which each column holds one point: the x coordinate in the first row, y in the second and 1 in the third.
Of course you can use the functions provided by OpenCV, shown in this tutorial, or the ones intended for vectors of points instead of whole images, but the way above makes you actually understand what you are doing ;)
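A short NumPy sketch of this approach for a polyline with random position, scale and rotation (the function name is illustrative; the requested scale is folded into R by multiplying it):

    import numpy as np

    def transform_polyline(points, angle, scale, tx, ty):
        """points: Nx2 array of relative (x, y) coordinates; returns the transformed Nx2 array."""
        c, s = np.cos(angle), np.sin(angle)
        A = np.array([[scale * c, -scale * s, tx],
                      [scale * s,  scale * c, ty],
                      [0.0,        0.0,       1.0]])     # A = [R*scale T; 0 0 1]
        X = np.vstack([points.T, np.ones(len(points))])  # one column (x, y, 1)^t per point
        return (A @ X)[:2].T

    rng = np.random.default_rng()
    figure = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
    drawn = transform_polyline(figure, angle=rng.uniform(0, 2 * np.pi),
                               scale=rng.uniform(20, 100),
                               tx=rng.uniform(0, 400), ty=rng.uniform(0, 400))

The result can then be rounded to integer pixel coordinates and passed to cv2.polylines for drawing.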

Finding isocurve from triangulation with known uv at all vertices

I have a 3D triangulated mesh which is similar to a curved piece of paper in that it has 4 edges and lives in 3-dimensional space. The edges may have different lengths and curve differently, but the mesh could theoretically be continuously morphed to look like a piece of paper. A uv coordinate has been assigned to every vertex, and the range of u and v is between 0 and 1. Some vertices are obviously on the border. For the bottom border, u is in the range [0,1] and v is 0. Top border vertices have u within [0,1] and v = 1. The left and right borders have u = 0 or u = 1 (respectively) with v within [0,1].
Now think about the "isocurve" where u = 0.5. This would be the "line" (or collection of line segments?) from bottom to top of the "middle" of the surface. How would I go about finding that?
Or, let's say I wanted to find the 3D coordinate corresponding to the uv coordinate (0.2,0.7). How would I get there?
I don't want to implement this by putting data through a renderer (OpenGL, etc). I'm sure there must be a standard method. It feels like the inverse of a texture mapping function.
Essentially both of your questions boil down to the same thing: how to convert between UV and XYZ coordinates.
This is an interpolation problem. Considering a single triangle in your mesh, you know both the UV and XYZ coordinates at the 3 vertices. As such, you have the right amount of data to interpolate X,Y,Z as linear functions of U,V:
X(U,V) = a0 + a1*U + a2*V
Y(U,V) = b0 + b1*U + b2*V
Z(U,V) = c0 + c1*U + c2*V
The problem then becomes how to determine the coefficients ai,bi,ci. This can be done by solving a set of linear equations based on the given vertex data. For example, the X coefficients for a given triangle can be found by solving:
[X1]   [1 U1 V1]   [a0]
[X2] = [1 U2 V2] * [a1]
[X3]   [1 U3 V3]   [a2]
Once you have all of these coefficients for each triangle in the mesh you can then determine an XYZ coordinate for any UV pair:
1. Locate the triangle that contains the UV point.
2. Evaluate the X(U,V),Y(U,V),Z(U,V) functions for the given triangle.
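A small NumPy sketch of step 2, assuming you have already located the containing triangle and have its vertex data (the function name is illustrative):

    import numpy as np

    def uv_to_xyz(uv, tri_uv, tri_xyz):
        """uv: query (U, V); tri_uv: 3x2 vertex UVs; tri_xyz: 3x3 vertex XYZs."""
        A = np.column_stack([np.ones(3), tri_uv])  # rows: [1, Ui, Vi]
        coeffs = np.linalg.solve(A, tri_xyz)       # columns hold the a, b, c coefficients
        return np.array([1.0, uv[0], uv[1]]) @ coeffs

Step 1 (locating the containing triangle) can be done with a standard point-in-triangle test in UV space, e.g. via barycentric coordinates.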

Calculating homography matrix using arbitrary known geometrical relations

I am using OpenCV for an optical measurement system. I need to carry out a perspective transformation between two images, captured by a digital camera. In the field of view of the camera I placed a set of markers (which lie in a common plane), which I use as corresponding points in both images. Using the markers' positions I can calculate the homography matrix. The problem is that the measured object, whose images I actually want to transform, is positioned at a small distance from the markers and parallel to the markers' plane. I can measure this distance.
My question is, how to take that distance into account when calculating the homography matrix, which is necessary to perform the perspective transformation.
In my solution it is a strong requirement not to use the measured object points for calculation of homography (and that is why I need other markers in the field of view).
Please let me know if the description is not precise.
Presented in the figure is the exemplary image.
The red rectangle is the measured object. It is physically placed at a small distance behind the circular markers.
I capture images of the object from different camera positions. The measured object can deform between acquisitions. Using the circular markers, I want to transform the object's image to the same coordinates each time. I can measure the distance between the object and the markers, but I do not know how I should modify the homography matrix in order to work on the measured object (instead of the markers).
This question is quite old, but it is interesting and it might be useful to someone.
First, here is how I understood the problem presented in the question:
You have two images I1 and I2 acquired by the same digital camera at two different positions. These images both show a set of markers which all lie in a common plane pm. There is also a measured object, whose visible surface lies in a plane po parallel to the markers' plane but with a small offset. You computed the homography Hm12 mapping the markers' positions in I1 to the corresponding markers' positions in I2, and you measured the offset dm-o between the planes po and pm. From that, you would like to calculate the homography Ho12 mapping points on the measured object in I1 to the corresponding points in I2.
A few remarks on this problem:
First, notice that a homography is a relation between image points, whereas the distance between the markers' plane and the object's plane is a distance in world coordinates. Using the latter to infer something about the former requires a metric estimation of the camera poses, i.e. you need to determine the Euclidean and correctly scaled relative position & orientation of the camera for each of the two images. The Euclidean requirement implies that the digital camera must be calibrated, which should not be a problem for an "optical measurement system". The scale requirement implies that the true 3D distance between two given 3D points must be known. For instance, you need to know the true distance l0 between two arbitrary markers.
Since we only need the relative pose of the camera for each image, we may choose to use a 3D coordinate system centered and aligned with the coordinate system of the camera for I1. Hence, we will denote the projection matrix for I1 by P1 = K1 * [ I | 0 ]. Then, we denote the projection matrix for I2 (in the same 3D coordinate system) by P2 = K2 * [ R2 | t2 ]. We will also denote by D1 and D2 the coefficients modeling lens distortion respectively for I1 and I2.
As a single digital camera acquired both I1 and I2, you may assume that K1 = K2 = K and D1 = D2 = D. However, if I1 and I2 were acquired with a long delay between the acquisitions (or with a different zoom, etc), it will be more accurate to consider that two different camera matrices and two sets of distortion coefficients are involved.
Here is how you could approach such a problem:
The steps in order to estimate P1 and P2 are as follows:
1. Estimate K1, K2 and D1, D2 via calibration of the digital camera
2. Use D1 and D2 to correct images I1 and I2 for lens distortion, then determine the marker positions in the corrected images
3. Compute the fundamental matrix F12 (mapping points in I1 to epilines in I2) from the corresponding markers' positions and infer the essential matrix E12 = K2^T * F12 * K1
4. Infer R2 and t2 from E12 and one point correspondence (see this answer to a related question). At this point, you have an estimate of the camera poses, but only up to an unknown scale, since t2 has unit norm.
5. Use the measured distance l0 between two arbitrary markers to infer the correct norm for t2 (a rough OpenCV sketch of steps 3-5 follows this list).
6. For the best accuracy, you may refine P1 and P2 using a bundle adjustment, with K1 and ||t2|| fixed, based on the corresponding marker positions in I1 and I2.
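Here is a rough OpenCV sketch of steps 3-5. It assumes undistorted, corresponding marker positions pts1 and pts2 (Nx2 float arrays), a single camera matrix K shared by both images, and a known true distance l0 between the first two markers; these names are assumptions for the sake of the example.

    import cv2
    import numpy as np

    F, _ = cv2.findFundamentalMat(pts1, pts2, cv2.FM_8POINT)
    E = K.T @ F @ K                                    # essential matrix E12
    _, R2, t2, _ = cv2.recoverPose(E, pts1, pts2, K)   # t2 has unit norm here

    # Triangulate the first two markers to recover the true scale of t2
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R2, t2])
    Xh = cv2.triangulatePoints(P1, P2, pts1[:2].T, pts2[:2].T)
    X = (Xh[:3] / Xh[3]).T
    t2 = t2 * (l0 / np.linalg.norm(X[0] - X[1]))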
At this point, you have an accurate metric estimation of the camera poses P1 = K1 * [ I | 0 ] and P2 = K2 * [ R2 | t2 ]. Now, the steps to estimate Ho12 are as follows:
1. Use D1 and D2 to correct images I1 and I2 for lens distortion, then determine the marker positions in the corrected images (same as 2. above, no need to re-do that) and estimate Hm12 from these corresponding positions
2. Compute the 3x1 vector v describing the markers' plane pm by solving this linear equation: Z * Hm12 = K2 * ( R2 - t2 * v^T ) * K1^-1 (see HZ00 chapter 13, result 13.5 and equation 13.2 for a reference on that), where Z is a scaling factor. Infer the distance to origin dm = 1 / ||v|| and the normal n = v / ||v||, which describe the markers' plane pm in 3D.
3. Since the object plane po is parallel to pm, they have the same normal n. Hence, you can infer the distance to origin do for po from the distance to origin dm for pm and from the measured plane offset dm-o, as follows: do = dm ± dm-o (the sign depends on the relative position of the planes: positive if pm is closer to the camera for I1 than po, negative otherwise).
4. From n and do describing the object plane in 3D, infer the homography Ho12 = K2 * ( R2 - t2 * n^T / do ) * K1^-1 (see HZ00 chapter 13, equation 13.2); a small NumPy sketch of steps 3-4 appears after the note below.
The homography Ho12 maps points on the measured object in I1 to the corresponding points in I2, where both I1 and I2 are assumed to be corrected for lens distortion. If you need to map points from and to the original distorted image, don't forget to use the distortion coefficients D1 and D2 to transform the input and output points of Ho12.
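As an illustration of steps 3 and 4, here is a minimal NumPy sketch that shifts the recovered markers' plane by the measured offset and composes Ho12. The names are illustrative: K1, K2, R2, t2, the normal n and distance d_m from step 2, and the measured offset d_mo are all assumed to be already known.

    import numpy as np

    d_o = d_m + d_mo     # use '-' instead if po is closer to the I1 camera than pm
    Ho12 = K2 @ (R2 - np.outer(t2.ravel(), n) / d_o) @ np.linalg.inv(K1)

    # Map a point on the object from (undistorted) I1 into (undistorted) I2
    p1 = np.array([u1, v1, 1.0])
    p2 = Ho12 @ p1
    p2 = p2[:2] / p2[2]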
The reference I used:
[HZ00] "Multiple View Geometry in Computer Vision", by R. Hartley and A. Zisserman, 2000.
