I have a 3D triangulated mesh which is similar to a curved piece of paper in that it has 4 edges and lives in 3-dimensional space. The edges may differ in length and curvature, but the mesh could theoretically be continuously morphed into a flat piece of paper. A UV coordinate has been assigned to every vertex, with u and v each ranging over [0,1]. Some vertices are obviously on the border: bottom-border vertices have u in [0,1] and v = 0, top-border vertices have u in [0,1] and v = 1, and the left and right borders have u = 0 or u = 1 (respectively) with v in [0,1].
Now think about the "isocurve" where u = 0.5. This would be the "line" (or collection of line segments?) from bottom to top of the "middle" of the surface. How would I go about finding that?
Or, let's say I wanted to find the 3D coordinate corresponding to the uv coordinate (0.2,0.7). How would I get there?
I don't want to implement this by putting data through a renderer (OpenGL, etc). I'm sure there must be a standard method. It feels like the inverse of a texture mapping function.
Essentially both of your questions boil down to the same thing: how to convert between UV and XYZ coordinates.
This is an interpolation problem. Considering a single triangle in your mesh, you know both the UV and XYZ coordinates at the 3 vertices. As such, you have the right amount of data to interpolate X,Y,Z as linear functions of U,V:
X(U,V) = a0 + a1*U + a2*V
Y(U,V) = b0 + b1*U + b2*V
Z(U,V) = c0 + c1*U + c2*V
The problem then becomes how to determine the coefficients ai,bi,ci. This can be done by solving a set of linear equations based on the given vertex data. For example, the X coefficients for a given triangle can be found by solving:
[X1]   [1 U1 V1]   [a0]
[X2] = [1 U2 V2] * [a1]
[X3]   [1 U3 V3]   [a2]
Once you have all of these coefficients for each triangle in the mesh you can then determine an XYZ coordinate for any UV pair:
1. Locate the triangle that contains the UV point.
2. Evaluate the X(U,V),Y(U,V),Z(U,V) functions for the given triangle.
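Here is a minimal Python/NumPy sketch of both steps (the function names are mine, and the brute-force triangle search stands in for a proper spatial index over UV space):

import numpy as np

def uv_to_xyz(uv, tri_uv, tri_xyz):
    """Interpolate XYZ at a UV point lying in a triangle.
    tri_uv  : (3, 2) array, the triangle's vertex UV coordinates
    tri_xyz : (3, 3) array, the triangle's vertex positions"""
    # Solve [1 Ui Vi] * coeffs = (Xi, Yi, Zi) for the linear coefficients
    A = np.column_stack([np.ones(3), tri_uv])
    coeffs = np.linalg.solve(A, tri_xyz)      # columns hold the a, b, c coefficient sets
    return np.array([1.0, uv[0], uv[1]]) @ coeffs

def barycentric(uv, tri_uv):
    """Barycentric coordinates of uv in the UV-space triangle;
    all three are >= 0 iff the point lies inside."""
    T = np.column_stack([tri_uv[1] - tri_uv[0], tri_uv[2] - tri_uv[0]])
    w1, w2 = np.linalg.solve(T, np.asarray(uv, float) - tri_uv[0])
    return np.array([1.0 - w1 - w2, w1, w2])

def locate_and_map(uv, triangles, uvs, xyzs):
    """Brute-force point location over all triangles; triangles is an
    iterable of (i, j, k) vertex-index triples."""
    for i, j, k in triangles:
        idx = [i, j, k]
        if np.all(barycentric(uv, uvs[idx]) >= -1e-9):
            return uv_to_xyz(uv, uvs[idx], xyzs[idx])
    return None  # uv lies outside the mesh

The u = 0.5 isocurve falls out of the same machinery: intersect each triangle with the line u = 0.5 in UV space, map the two crossing points of every crossed triangle through uv_to_xyz, and chain the resulting segments into the polyline from bottom to top.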
Given an object's 3D mesh file and an image that contains the object, what are some techniques to get the orientation/pose parameters of the 3d object in the image?
I tried searching for some techniques, but most seem to require texture information of the object or at least some additional information. Is there a way to get the pose parameters using just an image and a 3d mesh file (wavefront .obj)?
Here's an example of a 2D image that can be expected.
FOV of camera
The field of view of the camera is the absolute minimum you need to even start with this (you cannot determine how to place an object when you have no idea how it would affect the scene). Basically you need a transform matrix that maps from the world GCS (global coordinate system) to camera/screen space and back. If you have no idea what I am writing about, then perhaps you should not try any of this before you learn the math.
For an unknown camera you can do some calibration based on markers or etalons (reference objects of known size and shape) in the view. But it is much better to use the real camera parameters (FOV angles in the x and y directions, focal length, etc.).
The goal is to create a function that maps world GCS (x,y,z) into screen LCS (x,y).
For more info read:
transform matrix anatomy
3D graphic pipeline
Perspective projection
Silhouette matching
To compare the similarity of the rendered and real images you need some kind of measure. As you need to match geometry, I think silhouette matching is the way to go (ignoring textures, shadows and such).
So first you need to obtain the silhouettes. Use image segmentation to create a ROI mask of your object in the real image. For the rendered image this is easy, as you can render the object in a single color, without any lighting, directly into the ROI mask.
Then you need a function that computes the difference between two silhouettes. You can use any kind of measure, but I think you should start with the non-overlapping area pixel count, since it is easy to compute: you simply count the pixels that are present in only one of the two ROI (region of interest) masks, as in the sketch below.
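As a minimal sketch, assuming both silhouettes are boolean NumPy masks of the same shape:

import numpy as np

def silhouette_difference(mask_a, mask_b):
    """Number of pixels set in exactly one of the two ROI masks (XOR count)."""
    return int(np.count_nonzero(mask_a ^ mask_b))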
estimate position
As you have the mesh, you know its size, so place it in the GCS so that the rendered image has a bounding box very close to the one in the real image. If you do not have the FOV parameters, you need to rescale and translate each rendered image so it matches the image's bounding box (and as a result you obtain only the orientation, not the position, of the object, of course). Cameras have perspective, so the farther from the camera you place your object, the smaller it will appear.
fit orientation
Render a few fixed orientations covering the orientation space with some step, e.g. 8 steps per axis giving 8^3 orientations. For each, compute the silhouette difference and choose the orientation with the smallest difference.
Then fit the orientation angles around it to minimize the difference. If you do not know how optimization or fitting works, see this:
How approximation search works
Beware: too small a number of initial orientations can cause false positives or missed solutions; too large a number will be slow. A sketch of this coarse-to-fine search follows.
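Here is a rough Python sketch of the idea; render_silhouette is a hypothetical stand-in for your own off-screen renderer, and silhouette_difference is the XOR count from above:

import itertools
import numpy as np

def fit_orientation(mesh, target_mask, steps=8):
    """Try steps^3 fixed orientations, then locally refine the best one."""
    def cost(a):
        # render_silhouette(mesh, yaw, pitch, roll) -> bool mask: placeholder
        return silhouette_difference(render_silhouette(mesh, *a), target_mask)

    grid = np.linspace(0.0, 2.0 * np.pi, steps, endpoint=False)
    best = np.array(min(itertools.product(grid, repeat=3), key=cost))

    step = 2.0 * np.pi / steps
    while step > 1e-3:                       # simple approximation search
        neighbours = [best] + [best + s * step * e
                               for s in (-1.0, 1.0) for e in np.eye(3)]
        best = min(neighbours, key=cost)
        step *= 0.5                          # halve the step each round
    return best                              # (yaw, pitch, roll) in radians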
Now that was the basics in a nutshell. As your mesh is not very simple, you may need to tweak this, e.g. use contours instead of silhouettes and a distance between contours instead of the non-overlapping pixel count, which is genuinely hard to compute. You should start with simpler meshes like a die or a coin, and once you have grasped all of this, move on to more complex shapes.
[Edit1] algebraic approach
If you know some points in the image that correspond to known 3D points in your mesh, then together with the FOV of the camera used you can compute the transform matrix placing your object.
if the transform matrix is M (OpenGL style):
M = xx, yx, zx, ox
    xy, yy, zy, oy
    xz, yz, zz, oz
     0,  0,  0,  1
Then any point from your mesh (x,y,z) is transformed to global world (x',y',z') like this:
(x',y',z') = M * (x,y,z)
The pixel position (x'',y'') comes from the camera's perspective projection (note the division by the depth z'+focus):
x'' = xs2 + focus*x'/(z'+focus);
y'' = ys2 + focus*y'/(z'+focus);
where the camera is at (0,0,-focus), the projection plane is at z=0 and the viewing direction is +z, so for any focal length focus (in pixels) and screen resolution (xs,ys):
xs2 = xs*0.5;
ys2 = ys*0.5;
FOVx = xs2/focus; // tangent of the horizontal half-angle of view
FOVy = ys2/focus; // tangent of the vertical half-angle of view
Putting all this together you obtain:
xi'' = xs2 + focus*( xx*xi + yx*yi + zx*zi + ox ) / ( xz*xi + yz*yi + zz*zi + oz + focus )
yi'' = ys2 + focus*( xy*xi + yy*yi + zy*zi + oy ) / ( xz*xi + yz*yi + zz*zi + oz + focus )
where (xi,yi,zi) is the i-th known point's 3D position in mesh local coordinates and (xi'',yi'') is the corresponding known 2D pixel position. The unknowns are the M values:
{ xx,xy,xz, yx,yy,yz, zx,zy,zz, ox,oy,oz }
So we get 2 equations per known point and 12 unknowns in total, so you need to know at least 6 points. Solve the system of equations and construct your matrix M.
You can also exploit the fact that M is a uniform orthogonal/orthonormal matrix, so the vectors
X = (xx,xy,xz)
Y = (yx,yy,yz)
Z = (zx,zy,zz)
are perpendicular to each other, so:
(X.Y) = (Y.Z) = (Z.X) = 0.0
which can lower the number of needed points by adding these equations to your system. You can also exploit the cross product: if you know 2 of the vectors, the third can be computed as
Z = (X x Y)*scale
so instead of 3 variables you need just a single scale (which is 1 for an orthonormal matrix). If we assume an orthonormal matrix, then:
|X| = |Y| = |Z| = 1
so we get 6 additional equations (3 dot products and 3 from the cross product) without any additional unknowns, so 3 points are indeed enough. A numerical sketch of such a pose fit follows.
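If you prefer a numerical route, here is a hedged sketch that enforces orthonormality by construction: parameterize the rotation as a 3-component rotation vector and fit all 6 pose parameters by nonlinear least squares. It assumes SciPy is available and uses the corrected projection model above; fit_pose and project are my names:

import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def project(pts, focus, xs2, ys2):
    """Pinhole projection used above: camera at (0,0,-focus), plane z=0."""
    z = pts[:, 2] + focus
    return np.column_stack([xs2 + focus * pts[:, 0] / z,
                            ys2 + focus * pts[:, 1] / z])

def fit_pose(pts3d, pix2d, focus, xs2, ys2):
    """Recover the rotation R (3x3) and translation o (3,) from >= 3 correspondences."""
    def residual(p):
        R = Rotation.from_rotvec(p[:3]).as_matrix()
        world = pts3d @ R.T + p[3:]            # (x', y', z') = M * (x, y, z)
        return (project(world, focus, xs2, ys2) - pix2d).ravel()

    sol = least_squares(residual, x0=np.zeros(6))
    return Rotation.from_rotvec(sol.x[:3]).as_matrix(), sol.x[3:]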
I have 3D points and I need to make an 2D orthographic projection of them onto a plane that is defined by the origin and a normal n. The meaning of this is basically looking at the points from the top (given the vertical vector). How can I do it?
What I'm thinking is:
project point P onto the 3D plane: P - (P·n)n
look at the 3D plane from the "back" with respect to the normal (not sure how to define this)
do an ortho projection using max-min coordinates of the points in the plane to define the clipping
I am working with iOS.
One way to do this would be to:
rotate the coordinate system so that the plane of interest lies in the x-y plane, and the normal vector n is aligned with the z-axis
project the points onto the x-y plane by setting their z-components to 0
Set up the coordinate transformation
There are infinitely many solutions to this problem since we can always rotate a solution in the x-y plane to get another valid solution.
To fix this, let's choose a vector v lying in the plane that will line up with the x-axis after the transformation. Any vector will do; let's take the vector in the plane with coordinates x=1 and y=0.
Since our plane intersects the origin, its equation is:
x*n1 + y*n2 + z*n3 = 0
z = -(x*n1 + y*n2)/n3
After substituting x=1, y=0, we see that
v = [1 0 -n1/n3]
We also need to make sure v is normalized, so set
v = v/sqrt(v1*v1 + v2*v2 + v3*v3)
EDIT: The above method fails when n3 = 0. An alternative way to find v is to take a point P1 from the point set that is not a scalar multiple of n and compute v = P1 - (P1·n)n, the projection of P1 into the plane (with n normalized). Just keep searching through your points until you find one satisfying |P1 · n| != ||P1||, i.e. one that is not parallel to n; this is guaranteed to work.
Now we need a vector u that will line up with the y-axis after the transformation. We get this from the cross product of n and v:
u = n cross v
If n and v are normalized, then u is automatically normalized.
Next, create the matrix
M = [ v1 v2 v3 ]
    [ u1 u2 u3 ]
    [ n1 n2 n3 ]
Transform the points
Now given a 3 by N array of points P, we just follow the two steps above
P_transformed = M*P
P_plane = set the third row of P_transformed to zero
The x-y coordinates of P_plane are now a 2D coordinate system in the plane.
If you need to get the 3D spatial coordinates back, just do the reverse transformation with P_space = M_transpose*P_plane.
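Here is a minimal NumPy sketch of the whole recipe, including the n3 = 0 fallback (the function names are mine):

import numpy as np

def plane_basis(n):
    """Rows of M are v, u, n as constructed above."""
    n = np.asarray(n, float)
    n = n / np.linalg.norm(n)
    if abs(n[2]) > 1e-9:
        v = np.array([1.0, 0.0, -n[0] / n[2]])   # in-plane vector with x=1, y=0
    else:
        v = np.array([-n[1], n[0], 0.0])         # fallback when n3 == 0
    v /= np.linalg.norm(v)
    u = np.cross(n, v)                           # u = n cross v
    return np.vstack([v, u, n])

def project_to_plane(points, n):
    """points: (N, 3) array. Returns (N, 2) in-plane coordinates."""
    M = plane_basis(n)
    transformed = points @ M.T                   # same as M*P, applied per point
    return transformed[:, :2]                    # drop the normal component

To get back to 3D, append a zero third column and apply M transposed, as described above.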
I have a polyline figure, given as an array of relative x and y point coordinates (0.0 to 1.0).
I have to draw the figure with random position, scale and rotation angle.
How can I do it in the best way?
You could use a simple transformation with an RT matrix.
Let X = (x y 1)^T be the homogeneous coordinates of one point of your figure. Let R be a 2x2 rotation matrix and T a 2x1 translation vector of the transformation you plan to make. The RT matrix A has the form A = [R T; 0 0 1]. To get the transformed coordinates of point X, you do the simple calculation A*X = X', where X' are the new coordinates. To transform the whole figure at once, instead of a single column you use a matrix in which each column holds one point: the x coordinate in the first row, y in the second and 1 in the third.
Of course you can use the functions provided by OpenCV, shown in this tutorial, or the ones intended for vectors of points instead of whole images, but the way above makes you actually understand what you are doing ;)
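For illustration, a minimal NumPy sketch under those assumptions (the scale and translation ranges are arbitrary placeholders):

import numpy as np

def random_rt(points, rng=None):
    """Randomly rotate/scale/translate an (N, 2) polyline via A = [sR T; 0 0 1]."""
    rng = rng if rng is not None else np.random.default_rng()
    theta = rng.uniform(0.0, 2.0 * np.pi)        # random rotation angle
    s = rng.uniform(0.5, 2.0)                    # random scale
    tx, ty = rng.uniform(0.0, 100.0, size=2)     # random translation
    c, sn = np.cos(theta), np.sin(theta)
    A = np.array([[s * c, -s * sn, tx],
                  [s * sn,  s * c, ty],
                  [0.0,     0.0,   1.0]])
    homo = np.column_stack([points, np.ones(len(points))])  # rows (x, y, 1)
    return (homo @ A.T)[:, :2]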
I've been working on a project where I use Bezier paths to draw the curves I need. Each basic shape in my project consists of three cubic Bezier curves arranged end to end so that the slopes match where they meet.
The basic problem I need to solve is whether the compound curve made of the three Bezier curves intersects with itself. After thinking about it for a while, I've figured out that given the constraints of the curves, I can simplify the task to something else:
The curvature of each of the three Bezier curves should be opposite in direction to that of the curve it abuts. In other words, there should be an inflection point where one Bezier curve abuts another. If that is not the case, I want to reject the parameter set that generated the curves and select a different one.
In any case, my basic question is how to detect whether there is an inflection point where the curves abut each other.
In the illustration, each of the three Bezier curves is shown using a different color. The left black curve curves in the opposite direction from the red curve at the point where they meet, but the right black curve curves in the same direction. There is an inflection point where the red and left black curve meet but not where the red and right black curve meet.
Edit:
Below, I've added another image showing the polygon enclosing the Bezier path. The crossing lines of the polygon, shown on the black curve, test for an inflection point, not a loop. I'm guessing that one curve intersecting another can be tested by checking whether the enclosing polygons intersect, as illustrated by the red and blue curves.
P.S. Since there has been some question about the constraints, I will list some of them here:
1. The leftmost point and the rightmost point have the same y value.
2. The x value of the control point of the leftmost point is less than the x value of the control point of the rightmost point. This keeps the black and blue curves from intersecting each other.
3. The slope at the leftmost and rightmost points is within about +/- 10 degrees of horizontal.
4. The intersection of the black and red curves, and the intersection of the red and blue curves, divide the full curve into approximate thirds. I don't have exact numbers, but a sample bound would be that the x value of the left end of the red curve is somewhere between 25% and 40% of the x value of the rightmost point.
5. The y values of the points of intersection are within +/- some small fraction of the overall width.
6. The slope at the intersections is > 0.6 and < 3.0 (positive or negative).
The equation for curvature is moderately simple. You only need the sign of the curvature, so you can skip a little math. You are basically interested in the sign of the cross product of the first and second derivatives.
This simplification only works because the curves join smoothly. Without equal tangents a more complex test would be needed.
The sign of curvature of curve P:
ax = P[1].x - P[0].x; // a = P1 - P0
ay = P[1].y - P[0].y;
bx = P[2].x - P[1].x - ax; // b = P2 - P1 - a
by = P[2].y - P[1].y - ay;
cx = P[3].x - P[2].x - bx*2 - ax; // c = P3 - P2 - 2b - a
cy = P[3].y - P[2].y - by*2 - ay;
bc = bx*cy - cx*by;
ac = ax*cy - cx*ay;
ab = ax*by - bx*ay;
r = ab + ac*t + bc*t*t;
Note that r is the cross product of the first and second derivative and the sign indicate the direction of curvature. Calculate r at t=1 on the left curve and at t=0 on the right curve. If the product is negative, then there is an inflection point.
If you have curves U, V and W where U is the left black, V is the middle red, and W is the right black, then calculate bc, ac, and ab above for each. The following test will be true if both joins are inflection points:
(Uab+Uac+Ubc)*(Vab) < 0 && (Vab+Vac+Vbc)*(Wab) < 0
The equation for curvature has a denominator that I have ignored. It does not affect the sign of curvature and would only be zero if the curve was a line.
Math summary:
// start with the classic bezier curve equation
P = (1-t)^3*P0 + 3*(1-t)^2*t*P1 + 3*(1-t)*t^2*P2 + t^3*P3
// convert to polynomial
P = P0 + 3*t*(P1 - P0) + 3*t^2*(P2 - 2*P1 + P0) + t^3*(P3 - 3*P2 + 3*P1 - P0)
// rename the terms to a,b,c
P = P0 + 3at + 3btt + cttt
// find the first and second derivatives
P' = 3a + 6bt + 3ctt
P" = 6b + 6ct
// and the cross product after some reduction
P' x P" = ab + act + bctt
One of the deterministic ways to check whether a Bezier curve has a double point (self-intersection) is to compute the inversion equation and evaluate its root, since the inversion equation of a Bezier curve is always zero at the point of self-intersection.
This is detailed in these course notes by T.W. Sederberg. For detecting whether two Bezier curves (red and black in the question) intersect each other, there are a few methods: binary subdivision (easier to implement), Bezier clipping (a very good balance of efficiency and code complexity), and implicitization (not worth it).
But a more efficient way is probably to subdivide the curves into small line segments and find the intersections that way. Here is one way to do it, especially useful when you only want to know whether the path self-intersects and do not need an accurate point of intersection.
If you have well-defined assumptions about the locations of the CVs of the piecewise Bezier curves (or poly-Bezier, as @Pomax mentioned above), you can try the curvature-based method you mention in your question.
Here is a plot of the curvature on a path similar to the one in the question. Under similar circumstances, it seems this could give you the solution you are looking for. Also, in the case of a cusp, this quick-and-dirty method still seems to work!
@Victor Engel: Thank you for the clarification. Now the solution is even easier.
In short, look at the torques of both curves. If they pull together, an "inflexion" occurs; if they are opposed, the "curvature" continues.
Remark: the words "inflexion" and "curvature" are used here in an intuitive rather than strictly mathematical sense, hence the quotation marks.
As before, a set of points P0,P1,P2,P3 defines the first cubic Bezier curve, and Q0,Q1,Q2 and Q3 the second.
Moreover, 4 vectors a, b, u, v will be useful:
a = P2P3
b = P1P2
u = Q0Q1
v = Q1Q2
I will omit the C1-continuity check, since you have already done it. (This means P3 = Q0 and a x u = 0.)
Finally, the torques Tp and Tq appear:
Tp = b x a
("x" denotes the vector (cross) product, but in the plane Tp is treated as a simple number, not a vector.)
if Tp = 0 { very rarely, vectors a and b can be parallel }
    b = P0P1
    Tp = b x a
    if Tp = 0
        No! It can't be! Who straightened the curve???
        STOP
    endif
endif
Tq = u x v
if Tq = 0 { vectors u and v can also be parallel }
    v = Q2Q3
    Tq = u x v
    if Tq = 0
        Oh no! What happened to my curve???
        STOP
    endif
endif
Now the final test:
if Tp*Tq < 0 then Houston! We have AN "INFLEXION"!
else WE CONTINUE THE TURN!
Suppose you have 4 points P0, P1, P2 and P3 describing a cubic Bezier curve (cBc for short). P0 and P3 are the start and end points; P1 and P2 are the control (directional) points.
When P0, P1 and P2 are collinear, then cBc has an inflection point at P0.
When P1, P2 and P3 are collinear, then cBc has an inflection point at P3.
Remarks.
1) Use the vector product (P0P1 x P1P2 and P1P2 x P2P3, respectively) to check collinearity; the result should be a zero-length vector.
2) Avoid the situation where P0, P1, P2 and P3 are all collinear; the cBc they create isn't a curve in the common sense.
Corollary.
When P1 = P2, the cBc has inflection points at both ends. A small sketch of these checks follows.
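In Python, with an epsilon tolerance replacing the exact zero test for floating point:

import numpy as np

def cross2(u, v):
    """z-component of the cross product of two 2D vectors."""
    return u[0] * v[1] - u[1] * v[0]

def collinear(p, q, r, eps=1e-12):
    return abs(cross2(q - p, r - q)) < eps

def endpoint_inflections(P0, P1, P2, P3):
    """(inflection at P0, inflection at P3) per the collinearity rule above."""
    P0, P1, P2, P3 = (np.asarray(p, float) for p in (P0, P1, P2, P3))
    return collinear(P0, P1, P2), collinear(P1, P2, P3)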
I am using OpenCV for an optical measurement system. I need to carry out a perspective transformation between two images, captured by a digital camera. In the field of view of the camera I placed a set of markers (which lie in a common plane), which I use as corresponding points in both images. Using the markers' positions I can calculate the homography matrix. The problem is that the measured object, whose images I actually want to transform, is positioned at a small distance from the markers and parallel to the markers' plane. I can measure this distance.
My question is, how to take that distance into account when calculating the homography matrix, which is necessary to perform the perspective transformation.
In my solution it is a strong requirement not to use the measured object points for calculation of homography (and that is why I need other markers in the field of view).
Please let me know if the description is not precise.
An example image is presented in the figure.
The red rectangle is the measured object. It is physically placed at a small distance behind the circular markers.
I capture images of the object from different camera positions. The measured object can deform between acquisitions. Using the circular markers, I want to transform the object's image to the same coordinates. I can measure the distance between the object and the markers, but I do not know how I should modify the homography matrix in order to work on the measured object (instead of the markers).
This question is quite old, but it is interesting and it might be useful to someone.
First, here is how I understood the problem presented in the question:
You have two images I1 and I2 acquired by the same digital camera at two different positions. These images both show a set of markers which all lie in a common plane pm. There is also a measured object, whose visible surface lies in a plane po parallel to the marker's plane but with a small offset. You computed the homography Hm12 mapping the markers positions in I1 to the corresponding markers positions in I2 and you measured the offset dm-o between the planes po and pm. From that, you would like to calculate the homography Ho12 mapping points on the measured object in I1 to the corresponding points in I2.
A few remarks on this problem:
First, notice that a homography is a relation between image points, whereas the distance between the markers' plane and the object's plane is a distance in world coordinates. Using the latter to infer something about the former requires a metric estimation of the camera poses, i.e. you need to determine the Euclidean and up-to-scale relative position & orientation of the camera for each of the two images. The Euclidean requirement implies that the digital camera must be calibrated, which should not be a problem for an "optical measurement system". The up-to-scale requirement implies that the true 3D distance between two given 3D points must be known. For instance, you need to know the true distance l0 between two arbitrary markers.
Since we only need the relative pose of the camera for each image, we may choose to use a 3D coordinate system centered and aligned with the coordinate system of the camera for I1. Hence, we will denote the projection matrix for I1 by P1 = K1 * [ I | 0 ]. Then, we denote the projection matrix for I2 (in the same 3D coordinate system) by P2 = K2 * [ R2 | t2 ]. We will also denote by D1 and D2 the coefficients modeling lens distortion respectively for I1 and I2.
As a single digital camera acquired both I1 and I2, you may assume that K1 = K2 = K and D1 = D2 = D. However, if I1 and I2 were acquired with a long delay between the acquisitions (or with a different zoom, etc), it will be more accurate to consider that two different camera matrices and two sets of distortion coefficients are involved.
Here is how you could approach such a problem:
The steps in order to estimate P1 and P2 are as follows:
Estimate K1, K2 and D1, D2 via calibration of the digital camera
Use D1 and D2 to correct images I1 and I2 for lens distortion, then determine the marker positions in the corrected images
Compute the fundamental matrix F12 (mapping points in I1 to epilines in I2) from the corresponding marker positions and infer the essential matrix E12 = K2^T * F12 * K1
Infer R2 and t2 from E12 and one point correspondence (see this answer to a related question). At this point, you have an affine estimation of the camera poses, but not an up-to-scale one since t2 has unit norm.
Use the measured distance l0 between two arbitrary markers to infer the correct norm for t2.
For the best accuracy, you may refine P1 and P2 using a bundle adjustment, with K1 and ||t2|| fixed, based on the corresponding marker positions in I1 and I2.
At this point, you have an accurate metric estimation of the camera poses P1 = K1 * [ I | 0 ] and P2 = K2 * [ R2 | t2 ]. Now, the steps to estimate Ho12 are as follows:
Use D1 and D2 to correct images I1 and I2 for lens distortion, then determine the marker positions in the corrected images (same as 2. above, no need to re-do that) and estimate Hm12 from these corresponding positions
Compute the 3x1 vector v describing the markers' plane pm by solving this linear equation: Z * Hm12 = K2 * ( R2 - t2 * v^T ) * K1^-1 (see HZ00 chapter 13, result 13.5 and equation 13.2 for a reference on that), where Z is a scaling factor. Infer the distance to origin dm = ||v|| and the normal n = v / ||v||, which describe the markers' plane pm in 3D.
Since the object plane po is parallel to pm, they have the same normal n. Hence, you can infer the distance to origin do for po from the distance to origin dm for pm and from the measured plane offset dm-o, as follows: do = dm ± dm-o (the sign depends on the relative position of the planes: positive if pm is closer to the camera for I1 than po, negative otherwise).
From n and do describing the object plane in 3D, infer the homography Ho12 = K2 * ( R2 - t2 * n^T / do ) * K1^-1 (see HZ00 chapter 13, equation 13.2)
The homography Ho12 maps points on the measured object in I1 to the corresponding points in I2, where both I1 and I2 are assumed to be corrected for lens distortion. If you need to map points from and to the original distorted image, don't forget to use the distortion coefficients D1 and D2 to transform the input and output points of Ho12.
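As a minimal NumPy sketch of that last step, under the assumption that K1, K2, R2, t2, n and the plane distance d have already been estimated as described (d must be in the same units as t2):

import numpy as np

def plane_homography(K1, K2, R2, t2, n, d):
    """Ho12 = K2 (R2 - t2 n^T / d) K1^-1, per HZ00 equation 13.2."""
    H = K2 @ (R2 - np.outer(t2, n) / d) @ np.linalg.inv(K1)
    return H / H[2, 2]                     # normalize the overall scale

# markers -> object: reuse n and shift the plane distance by the measured offset,
# i.e. pass d = dm + offset (or dm - offset, depending on which plane is closer)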
The reference I used:
[HZ00] R. Hartley and A. Zisserman, "Multiple View Geometry in Computer Vision", 2000.