Imagine a cube with each of its 6 walls in a different color. The cube rotates in random directions around its center point. When the user clicks or taps the screen, the rotation stops instantly and the cube is frozen in a random position. After the user releases the mouse button or lifts the finger from the screen, the cube should 'straighten', which means it should rotate around some axis by the smallest possible angle, just enough to present the most visible cube wall in the 'screen plane' with all of its edges parallel to the edges of the screen.
Is there a way to find this closest 'straight' rotation, assuming we have access to the 'frozen' position given either by a rotation matrix or by a quaternion (whichever is more convenient)?
One way of looking at the top-left 3x3 portion of a matrix is that it's just a description of the basis vectors; if you were to apply it to a point:
[A D G]   [x]   [A*x + D*y + G*z]       [A]       [D]       [G]
[B E H] * [y] = [B*x + E*y + H*z] = x * [B] + y * [E] + z * [H]
[C F I]   [z]   [C*x + F*y + I*z]       [C]       [F]       [I]
i.e. if you apply that matrix then the input x axis will end up running along (A, B, C), the input y axis will end up running along (D, E, F) and the input z axis will end up running along (G, H, I).
What I think you're asking is equivalent to "which axis is closest to perpendicular to the screen, i.e. has the greatest component along output z?" You can determine that by looking for the value from (C, F, I) that has the greatest magnitude; the most visible wall is the one perpendicular to that axis.
You can then use the z sign of the cross product of (A, B, C) and (D, E, F), which for a pure rotation is just (G, H, I), to decide whether you're looking along the axis positively or negatively, by the same logic that lets you use that test for back-face culling: whichever face would be visible if the camera were hypothetically moved backward to infinity is the one genuinely in front.
That also suggests an alternative test which you may prefer: apply the transform, and the face closest to perpendicular is whichever visible face projects to the largest area. It's the same logic behind the Lambert lighting model, with the fact that the faces were all the same size to begin with factored in. The advantage of that test is that you could do it directly on the GPU using occlusion queries, assuming none of the faces is actually occluded.
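For concreteness, here is a minimal NumPy sketch of the first test (the function and variable names are mine, and whether a positive sign means "toward the viewer" depends on whether your camera looks down +z or -z):

import numpy as np

def most_visible_face(R):
    """Given the frozen cube's 3x3 rotation matrix R (columns are the
    transformed x, y and z axes), return (axis, sign): the index of the
    cube axis closest to perpendicular to the screen, and whether it
    points toward positive or negative screen z."""
    z_components = R[2, :]                       # (C, F, I)
    axis = int(np.argmax(np.abs(z_components)))  # greatest magnitude wins
    sign = 1 if z_components[axis] >= 0 else -1
    return axis, sign

# Example: for the identity orientation the z axis (index 2) faces the screen.
print(most_visible_face(np.eye(3)))              # (2, 1)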
Simple solution.
1) Find the required wall and construct the vector from the cube center to the center of that wall; call it dir1.
2) Construct the vector from the cube center to the camera; call it dir2.
3) Build the "rotation_arc" quaternion between these vectors: http://www.euclideanspace.com/maths/algebra/vectors/angleBetween/index.htm
4) Rotate the cube by the quaternion from 3).
To find the wall: build 6 vectors from the center of the cube to the centers of the walls, then build the vector from the center of the cube to the camera. Compute the 6 angles between the first vectors and the second vector and select the wall with the smallest angle, as sketched below.
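A minimal NumPy sketch of the above (the helper names are mine; the shortest-arc construction follows the euclideanspace page linked in step 3):

import numpy as np

def find_wall(wall_normals, dir_to_camera):
    """Pick the wall whose center-to-wall vector makes the smallest angle
    with the cube-center-to-camera vector (i.e. the largest dot product)."""
    d2 = dir_to_camera / np.linalg.norm(dir_to_camera)
    dots = [np.dot(n / np.linalg.norm(n), d2) for n in wall_normals]
    return int(np.argmax(dots))

def rotation_arc(v_from, v_to):
    """Shortest-arc quaternion (x, y, z, w) rotating v_from onto v_to."""
    v_from = v_from / np.linalg.norm(v_from)
    v_to = v_to / np.linalg.norm(v_to)
    d = float(np.dot(v_from, v_to))
    if d < -0.999999:
        # Opposite vectors: rotate 180 degrees about any axis orthogonal to v_from.
        axis = np.cross([1.0, 0.0, 0.0], v_from)
        if np.linalg.norm(axis) < 1e-6:
            axis = np.cross([0.0, 1.0, 0.0], v_from)
        axis /= np.linalg.norm(axis)
        return np.array([axis[0], axis[1], axis[2], 0.0])
    xyz = np.cross(v_from, v_to)
    q = np.array([xyz[0], xyz[1], xyz[2], 1.0 + d])
    return q / np.linalg.norm(q)

Applying rotation_arc(dir1, dir2) as an extra rotation on top of the frozen orientation performs step 4).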
I am very new to ROS and am building a system from the ground up to understand the concepts better. I am trying to convert a depth map (received from a Visionary-T time-of-flight camera as a sensor_msgs/Image message) into a point cloud. I loop through the width and the height of the image (in my case 176x144 px), say (u, v), where the value at (u, v) is Z in meters. I then use the intrinsic camera parameters (c_x, c_y, f_x, f_y) to convert the image (u, v) coordinates to 3D (X, Y) coordinates in the camera frame, using the pinhole camera model:
X = (u - c_x) * Z / f_x;
Y = (v - c_y) * Z / f_y;
I then save these points into pcl::PointXYZ. My camera is mounted on top and the view is of a table with some objects on it. Although my table is flat, when I convert the depthmaps to pointclouds, I see that in the pointcloud, the table has a convex shape and is not flat.
Can someone please suggest what could be the reason for this convex shape and how I can rectify it?
There might be something wrong with how you use the intrinsics.
There is a post regarding the "reverse projection of image coordinates": Computing x,y coordinate (3D) from image point
Maybe it helps you.
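For comparison, here is a minimal NumPy sketch of the back-projection you describe (no ROS/PCL types; fx, fy, cx, cy are the intrinsics). Note that the pinhole formula assumes depth[v, u] is the Z coordinate measured along the optical axis, not the range along the viewing ray:

import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an H x W depth image (metres) to an N x 3 point array
    with the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinate grids
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.dstack((x, y, z)).reshape(-1, 3)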
I have 3D points and I need to make a 2D orthographic projection of them onto a plane defined by the origin and a normal n. The idea is basically to look at the points from the top, given the vertical vector. How can I do it?
What I'm thinking is:
project point P onto the 3D plane: P - (P · n) * n
look at the 3D plane from the "back" with respect to the normal (not sure how to define this)
do an ortho projection using max-min coordinates of the points in the plane to define the clipping
I am working with iOS.
One way to do this would be to:
rotate the coordinate system so that the plane of interest lies in the x-y plane, and the normal vector n is aligned with the z-axis
project the points onto the x-y plane by setting their z-components to 0
Set up the coordinate transformation
There are infinitely many solutions to this problem since we can always rotate a solution in the x-y plane to get another valid solution.
To fix this, let's choose a vector v lying in the plane that will line up with the x-axis after the transformation. Any vector in the plane will do; let's take the one with coordinates x=1 and y=0.
Since our plane passes through the origin, its equation is:
x*n1 + y*n2 + z*n3 = 0
z = -(x*n1 + y*n2)/n3
After substituting x=1, y=0, we see that
v = [1 0 -n1/n3]
We also need to make sure v is normalized, so set
v = v/sqrt(v1*v1 + v2*v2 + v3*v3)
EDIT: The above method fails when n3=0. An alternative way to find v is to take a point P1 from our point set that is not a scalar multiple of n and calculate v = P1 - (P1 · n) * n, which is the projection of P1 into the plane. Just keep searching through your points until you find one whose projection is non-zero, i.e. one that is not parallel to n; this is guaranteed to work.
Now we need a vector u that will line up with the y-axis after the transformation. We get this from the cross product of n and v:
u = n cross v
If n and v are normalized, then u is automatically normalized.
Next, create the matrix
M = [ v1 v2 v3 ]
    [ u1 u2 u3 ]
    [ n1 n2 n3 ]
Transform the points
Now given a 3 by N array of points P, we just follow the two steps above
P_transformed = M*P
P_plane = set the third row of P_transformed to zero
The x-y coordinates of P_plane are now a 2D coordinate system in the plane.
If you need to get the 3D spatial coordinates back, just do the reverse transformation with P_space = M_transpose*P_plane.
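Here is a minimal NumPy sketch of the whole procedure (function names are mine; when n3=0 it falls back to projecting a fixed vector into the plane rather than searching the point set, which is a simplification of the EDIT above):

import numpy as np

def plane_basis(n):
    """Rows of the returned 3x3 matrix M are v, u, n as described above."""
    n = n / np.linalg.norm(n)
    if abs(n[2]) > 1e-8:
        v = np.array([1.0, 0.0, -n[0] / n[2]])   # in-plane vector with x=1, y=0
    else:
        p = np.array([0.0, 0.0, 1.0])            # any vector not parallel to n
        v = p - np.dot(p, n) * n                 # project it into the plane
    v /= np.linalg.norm(v)
    u = np.cross(n, v)                           # completes the right-handed basis
    return np.vstack((v, u, n))

def project_to_plane_2d(points, n):
    """points: 3 x N array. Returns the 2 x N in-plane coordinates."""
    M = plane_basis(n)
    p_transformed = M @ points
    return p_transformed[:2, :]                  # dropping row 3 = orthographic projection

To go back to 3D space, append a zero third row and multiply by M.T, as in the last step above.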
I want to let the user control an object moving over the surface of a static sphere, using two buttons to rotate the object's direction clockwise and anti-clockwise as it constantly moves forward, similar to Asteroids.
In SceneKit there are three different orientation properties on an SCNNode and I really don't know where to start. I understand how to do everything except the rotation around the sphere.
You're looking for a parameterization of the surface of the sphere. You can find this online (but it can be tricky if you don't know the magic words to enter for your searches). Check out the entry on MathWorld.
The surface of the sphere is parameterized by two angle variables, call them s and t. Note that one variable will run from zero to 2 pi, and the other will run only from zero to pi. This is a gotcha that can be easy to miss. To convert these angles to rectangular (x, y, z) coordinates, you use the formula:
x = r cos(s) sin(t)
y = r sin(s) sin(t) // Yes it's sin(t) twice, that's not a typo.
z = r cos(t)
I find the following visualization helpful. A curve in a vertical plane (the xz-plane, for example) sweeps out an angle from zero to pi, half a rotation, and corresponds to the parameter t. If you set t equal to pi/2, so sin(t) = 1, then you can see how x and y turn into standard rectangular coordinates for a circular section. After the t parameter sweeps out half a circle, you can rotate that half circle all the way around, from zero to 2 pi, to form a full sphere, and that full sweep corresponds to the parameter s.
If you represent your object's position by coordinates (s, t) then you can, for the most part, safely convert to rectangular coordinates using the formula above without worrying about the domain of either parameter; however if s or t grow without bound (say, because your object orbits continuously for a long time) it might be worth the small extra effort to normalize the parameters. I'm not sure how sin or cos behave for very large inputs.
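A tiny sketch of the conversion (plain Python, not SceneKit API; r, s, t as above):

import math

def sphere_point(r, s, t):
    """(s, t) -> (x, y, z) on a sphere of radius r; s is the azimuth in
    [0, 2*pi), t the polar angle in [0, pi], using the formulas above."""
    return (r * math.cos(s) * math.sin(t),
            r * math.sin(s) * math.sin(t),
            r * math.cos(t))

# If s keeps growing because the object orbits continuously, wrap it so
# sin/cos never see huge arguments:
#     s = math.fmod(s, 2.0 * math.pi)

You would then assign the resulting point to the node's position each frame.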
Let's say I have a pinhole camera with known intrinsic parameters (camera matrix and distortion coefficients). Let's say there is a point far enough away from the camera that we can treat it as being at infinity.
Given the image coordinates of this point in pixels, I would like to calculate the camera rotation relative to the axis that connects the camera and this point (so the rotation is (0, 0) when the camera is aimed at the point, i.e. the point projects to the optical center of the image).
How can this be done using OpenCV?
Many thanks!
You need to specify an additional constraint - rotating the camera from its current pose to one that aligns the optical axis with an arbitrary ray leaves the camera free to rotate about the ray itself (i.e. it leaves the "roll" angle unspecified).
Let's assume that you want the roll to be zero, i.e. that you want the motion to be a pure pan-tilt. This has a unique solution as long as the ray you want to align to is not parallel to the vertical image axis (in which case pan and roll are the same motion).
Then the solution is computed as follows. Let's use the OpenCV camera frame: let Z = [0, 0, 1]' (where " ' " means transpose) be the camera focal axis, oriented going out of the lens, Y = [0, 1, 0]' the vertical axis going down, and X = Z x Y (where 'x' is the cross product) the horizontal camera axis going toward the right of the image. So "pan" is a rotation about Y, and "tilt" is a rotation about X.
Let U = [u1, u2, u3]', with ||U|| = 1, be the ray you want to rotate to. You want to apply a pan that brings Z onto the plane Puy defined by the vectors U and Y, then apply a tilt that brings Z onto U.
The angle of the first rotation is (angle between Z and Puy) = [90 deg - (angle between Z and Y x U)]. This is because Y x U is orthogonal to Puy. Look up the expressions for computing the angle between vectors on Wikipedia or elsewhere online. Once you have the angle (or its cosine and sine), the rotation about Y can be expressed as a standard rotation matrix Ry.
The angle of the second rotation, about X once Z lies in Puy, is the angle between Z and U after Ry has been applied to Z, or equivalently, between Z and inv(Ry) * U. Compute the angle between these vectors and use it to build a standard rotation matrix about X, Rx.
The final transformation is then Rx * Ry.
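A compact sketch of the same idea with OpenCV and NumPy (the function name and the atan2 formulation are mine; up to sign conventions they give the same pan/tilt angles as the construction above, in the X-right, Y-down, Z-forward frame):

import math
import numpy as np
import cv2

def pan_tilt_to_pixel(px, py, K, dist):
    """Pan (about Y) and tilt (about X) angles, in radians, that align the
    optical axis Z with the ray through pixel (px, py); roll is assumed zero."""
    pt = np.array([[[float(px), float(py)]]], dtype=np.float64)
    xn, yn = cv2.undistortPoints(pt, K, dist)[0, 0]   # normalized image coordinates
    U = np.array([xn, yn, 1.0])
    U /= np.linalg.norm(U)
    pan = math.atan2(U[0], U[2])                      # brings Z into the plane of U and Y
    tilt = math.atan2(-U[1], math.hypot(U[0], U[2]))  # then brings Z onto U
    return pan, tilt

Composing the two standard rotation matrices built from these angles maps Z onto U, which you can use as a sanity check.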
I am using OpenCV for an optical measurement system. I need to carry out a perspective transformation between two images, captured by a digital camera. In the field of view of the camera I placed a set of markers (which lie in a common plane), which I use as corresponding points in both images. Using the markers' positions I can calculate the homography matrix. The problem is that the measured object, whose images I actually want to transform, is positioned a small distance from the markers, parallel to the markers' plane. I can measure this distance.
My question is: how do I take that distance into account when calculating the homography matrix needed to perform the perspective transformation?
In my solution it is a strong requirement not to use the measured object points for calculation of homography (and that is why I need other markers in the field of view).
Please let me know if the description is not precise.
The figure shows an example image: the red rectangle is the measured object, physically placed a small distance behind the circular markers.
I capture images of the object from different camera positions. The measured object can deform between acquisitions. Using the circular markers, I want to transform the object's image to the same coordinates. I can measure the distance between the object and the markers, but I do not know how I should modify the homography matrix in order to work on the measured object (instead of the markers).
This question is quite old, but it is interesting and it might be useful to someone.
First, here is how I understood the problem presented in the question:
You have two images I1 and I2 acquired by the same digital camera at two different positions. These images both show a set of markers which all lie in a common plane pm. There is also a measured object, whose visible surface lies in a plane po parallel to the markers' plane but with a small offset. You computed the homography Hm12 mapping the marker positions in I1 to the corresponding marker positions in I2, and you measured the offset dm-o between the planes po and pm. From that, you would like to calculate the homography Ho12 mapping points on the measured object in I1 to the corresponding points in I2.
A few remarks on this problem:
First, notice that a homography is a relation between image points, whereas the distance between the markers' plane and the object's plane is a distance in world coordinates. Using the latter to infer something about the former requires a metric estimation of the camera poses, i.e. you need to determine the Euclidean and up-to-scale relative position & orientation of the camera for each of the two images. The Euclidean requirement implies that the digital camera must be calibrated, which should not be a problem for an "optical measurement system". The up-to-scale requirement implies that the true 3D distance between two given 3D points must be known. For instance, you need to know the true distance l0 between two arbitrary markers.
Since we only need the relative pose of the camera for each image, we may choose to use a 3D coordinate system centered and aligned with the coordinate system of the camera for I1. Hence, we will denote the projection matrix for I1 by P1 = K1 * [ I | 0 ]. Then, we denote the projection matrix for I2 (in the same 3D coordinate system) by P2 = K2 * [ R2 | t2 ]. We will also denote by D1 and D2 the coefficients modeling lens distortion respectively for I1 and I2.
As a single digital camera acquired both I1 and I2, you may assume that K1 = K2 = K and D1 = D2 = D. However, if I1 and I2 were acquired with a long delay between the acquisitions (or with a different zoom, etc), it will be more accurate to consider that two different camera matrices and two sets of distortion coefficients are involved.
Here is how you could approach such a problem:
The steps in order to estimate P1 and P2 are as follows (a code sketch of steps 3 to 5 is given after the list):
1. Estimate K1, K2 and D1, D2 via calibration of the digital camera.
2. Use D1 and D2 to correct images I1 and I2 for lens distortion, then determine the marker positions in the corrected images.
3. Compute the fundamental matrix F12 (mapping points in I1 to epilines in I2) from the corresponding marker positions and infer the essential matrix E12 = K2^T * F12 * K1.
4. Infer R2 and t2 from E12 and one point correspondence (see this answer to a related question). At this point, you have an affine estimation of the camera poses, but not an up-to-scale one, since t2 has unit norm.
5. Use the measured distance l0 between two arbitrary markers to infer the correct norm for t2.
6. For the best accuracy, you may refine P1 and P2 using a bundle adjustment, with K1 and ||t2|| fixed, based on the corresponding marker positions in I1 and I2.
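As an illustration only, here is how steps 3 to 5 might look with OpenCV in Python (the function and argument names are mine; it assumes the marker positions are already undistorted and passes a single camera matrix to cv2.recoverPose, i.e. K1 ≈ K2):

import cv2
import numpy as np

def metric_pose(pts1, pts2, K1, K2, i, j, l0):
    """pts1, pts2: N x 2 float arrays of corresponding marker positions in I1, I2.
    (i, j): indices of the two markers whose true distance l0 is known.
    Returns R2, t2 with metric scale (sketch of steps 3 to 5 above)."""
    F, _ = cv2.findFundamentalMat(pts1, pts2, cv2.FM_8POINT)
    E = K2.T @ F @ K1                                  # essential matrix E12
    _, R2, t2, _ = cv2.recoverPose(E, pts1, pts2, K1)  # t2 has unit norm here

    # Triangulate the two reference markers, then rescale t2 so that their
    # reconstructed distance matches the measured distance l0.
    P1 = K1 @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K2 @ np.hstack([R2, t2])
    X = cv2.triangulatePoints(P1, P2,
                              pts1[[i, j]].T.astype(float),
                              pts2[[i, j]].T.astype(float))
    X = (X[:3] / X[3]).T                               # Euclidean 3D points
    scale = l0 / np.linalg.norm(X[0] - X[1])
    return R2, t2 * scale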
At this point, you have an accurate metric estimation of the camera poses P1 = K1 * [ I | 0 ] and P2 = K2 * [ R2 | t2 ]. Now, the steps to estimate Ho12 are as follows:
1. Use D1 and D2 to correct images I1 and I2 for lens distortion, then determine the marker positions in the corrected images (same as step 2 above, no need to re-do that) and estimate Hm12 from these corresponding positions.
2. Compute the 3x1 vector v describing the markers' plane pm by solving this linear equation: Z * Hm12 = K2 * ( R2 - t2 * v^T ) * K1^-1 (see HZ00, chapter 13, result 13.5 and equation 13.2 for a reference on that), where Z is a scaling factor. Infer the normal n = v / ||v|| and the distance to origin dm = 1 / ||v||, which describe the markers' plane pm in 3D.
3. Since the object plane po is parallel to pm, they have the same normal n. Hence, you can infer the distance to origin do for po from the distance to origin dm for pm and from the measured plane offset dm-o, as follows: do = dm ± dm-o (the sign depends on the relative position of the planes: positive if pm is closer to the camera for I1 than po, negative otherwise).
4. From n and do describing the object plane in 3D, infer the homography Ho12 = K2 * ( R2 - t2 * n^T / do ) * K1^-1 (see HZ00, chapter 13, equation 13.2); a code sketch is given below.
The homography Ho12 maps points on the measured object in I1 to the corresponding points in I2, where both I1 and I2 are assumed to be corrected for lens distortion. If you need to map points from and to the original distorted image, don't forget to use the distortion coefficients D1 and D2 to transform the input and output points of Ho12.
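A small NumPy sketch of the final formula (the function name is mine; K1, K2, R2, t2 are the metric pose from above, and n, do describe the object plane in the camera-1 frame):

import numpy as np

def plane_homography(K1, K2, R2, t2, n, d_o):
    """Ho12 = K2 * (R2 - t2 * n^T / do) * K1^-1  (HZ00, chapter 13, eq. 13.2)."""
    H = K2 @ (R2 - np.outer(t2.ravel(), n) / d_o) @ np.linalg.inv(K1)
    return H / H[2, 2]          # homographies are defined up to scale

# Mapping a lens-distortion-corrected point (x1, y1) from I1 to I2:
#     p2 = H @ [x1, y1, 1];  pixel coordinates are p2[0]/p2[2], p2[1]/p2[2].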
The reference I used:
[HZ00] "Multiple view geometry for computer vision", by R.Hartley and A.Zisserman, 2000.