For a given terrain, how can you calculate its surface area?
As of now, I plan to build the terrain using Three.js with something like:
var geo = new THREE.PlaneGeometry(300, 300, 10, 10);
for (var i = 0; i < geo.vertices.length; i++)
geo.vertices[i].y = someHeight; // Makes the flat plain into a terrain
Next, if its possible to iterate through each underlying triangle of the geometry (i.e. triangles of TRIANGLE_STRIP given to the WebGL array) the area of each triangle could be summed up to get the total surface area.
Does this approach sound right? If so, how do you determine vertices of individual triangles?
Any other ideas to build the terrain in WebGL/Three.js are welcome.
I think your approach sounds right and shouldn't be hard to implement.
I'm not familiar with three.js, but I think it's quite easy to determine positions of the vertices. You know that the vertices are evenly distribute between x=0...300, z=0...300 and you know the y coordinate. So the [i,j]-th vertex has position [i*300/10, y, j*300/10].
You have 10x10 segments in total and each segment consists of 2 triangles. This is where you have to be careful.
The triangles could form two different shapes:
------ ------
| \ | | /|
| \ | or | / |
| \| | / |
------ ------
which could result in different shape and (I'm not entirely sure about this) into different surface areas.
When you find out how exactly three.js creates the surface, it should be relatively easy to iteratively sum the triangle surfaces.
It would be nice to be able to do the sum without actual iteration over all triangles, but, right now, I don't have any idea how to do it...
Related
Actually, I am in the middle work of adaptive thresholding using mean. I used 3x3 matrix so I calculate means value on that matrix and replace it into M(1,1) or middle position of the matrix. I got confused about how to do perform the process at first position f(0,0).
This is a little illustration, let's assume that I am using 3x3 Matrix (M) and image (f) first position f(0,0) = M(1,1) = 4. So, M(0,0) M(0,1) M(0,2) M(1,0) M(2,0) has no value.
-1 | -1 | -1 |
-1 | 4 | 3 |
-1 | 2 | 1 |
Which one is the correct process,
a) ( 4 + 3 + 2 + 1 ) / 4
b) ( 4 + 3 + 2 + 1) / 9
I asked this because I follow some tutorial adaptive mean thresholding, it shows a different result. So, I need to make sure that the process is correct. Thanks.
There is no "correct" way to solve this issue. There are many different solutions used in practice, they all have some downsides:
Averaging over only the known values (i.e. your suggested (4+3+2+1)/4). By averaging over fewer pixels, one obtains a result that is more sensitive to noise (i.e. the "amount of noise" left in the image after filtering is larger near the borders. Also, a bias is introduced, since the averaging happens over values to one side only.
Assuming 0 outside the image domain (i.e. your suggested (4+3+2+1)/9). Since we don't know what is outside the image, assuming 0 is as good as anything else, no? Well, no it is not. This leads to a filter result that has darker values around the edges.
Assuming a periodic image. Here one takes values from the opposite side of the image for the unknown values. This effectively happens when computing the convolution through the Fourier domain. But usually images are not periodic, with strong differences in intensities (or colors) at opposite sides of the image, leading to "bleeding" of the colors on the opposite of the image.
Extrapolation. Extending image data by extrapolation is a risky business. This basically comes down to predicting what would have been in those pixels had we imaged them. The safest bet is 0-order extrapolation (i.e. replicating the boundary pixel), though higher-order polygon fits are possible too. The downside is that the pixels at the image edge become more important than other pixels, they will be weighted more heavily in the averaging.
Mirroring. Here the image is reflected at the boundary (imagine placing a mirror at the edge of the image). The value at index -1 is taken to be the value at index 1; at index -2 that at index 2, etc. This has similar downsides as the extrapolation method.
I have implemented a Kanade–Lucas–Tomasi feature tracker. I have used it on two images, that show the same scene, but the camera has moved a bit between taking the pictures.
As a result I get the coordinates of the features. For example:
1. Picture:
| feature | (x,y)=val |
|---------|-----------------|
| 1 | (436,349)=33971 |
| 2 | (440,365)=29648 |
| 3 | ( 36,290)=29562 |
2nd Picture:
| feature | (x,y)=val |
|---------|--------------|
| 1 | (443.3,356.0)=0 |
| 2 | (447.6,373.0)=0 |
| 3 | ( -1.0, -1.0)=-4 |
So I know the position of the features 1 & 2 in both images and that feature 3 couldn't be found in the second image. The coordinates of the features 1 & 2 aren't the same, because the camera has zoomed in a bit and also moved.
Which algorithm is suitable to get the scale, rotation and translation between the two images? Is there a robust algorithm, that also considers outliers?
If you dont know what movement happened between the images, then you need to calculate the Homography between them. The homography however, needs 4 points to be calculated.
If you have 4 points in both images, that are relatively on a plane (same flat surface, e.g. a window), then you can follow the steps from here in math.stackexchange to compute the homography matrix that will transform between images.
Note that while rotation and translation may happen between 2 images, they also could have been taken from different angles. If this happens, then the homography is your only option. Instead, if the images are for sure just rotation and translation (e.g. 2 satelite images) then you may find some other method, but homography will also help.
Depending upon whether the camera is calibrated or uncalibrated, tracked features to compute essential or fundamental matrix respectively.
Factorize the matrix into R, T. Use Multiview geometry book for any help with the formulae. https://www.robots.ox.ac.uk/~vgg/hzbook/hzbook1/HZepipolar.pdf
Caution: These steps only work well if the features come from different depth planes and cover wide field of view. In case all features lie on a single plane, you should estimate homography and try to factorize that instead.
What is the easiest way to draw a textured Sphere in OpenGL ES 2.0 with GL_TRIANGLES?
I'm especially wondering how to calculate the vertices.
There are various ways of triangulating spheres. Popular ones, less popular ones, good ones, and not so good ones. Unfortunately the most widely used approach isn't very good.
Spherical Coordinates
This might be the most widely used approach. You iterate through the two angles in a spherical coordinate system in two nested loops, and generate points for each pair of angles. With angle theta iterating from -pi/2 to pi/2 and angle phi iterating from 0 to 2*pi, and sphere radius r, each point is calculated as:
x = r * cos(theta) * cos(phi)
y = r * cos(theta) * sin(phi)
z = r * sin(theta)
The calculation can be made more efficient if necessary, but I'll skip that aspect for this answer. The level (precision) of the tessellation is determined by the number of subdivisions of the angles.
The main advantage of this approach is that it's simple to implement, and easy to understand. You can picture the subdivision as the lines of latitude and longitude on a globe.
It does not result in a very good triangulation, though. The triangles around the equator have similar dimensions in all directions, but the triangles closer to the north/south pole get increasingly narrow. At the north/south pole you have a large number of very narrow triangles meeting in a single point. Good triangulations have all very similar sized triangles, and this one does not.
Recursive Subdivision of Octahedron
With this approach, you start with a regular octahedron, giving you 8 triangles. You then recursively subdivide each triangle into 4 sub-triangles, as illustrated here:
/\
/ \
/____\
/\ /\
/ \ / \
/____\/____\
So each triangle is subdivided by calculating 3 additional vertices that are midway between two of the existing vertices, and 4 triangles are formed from these 6 vertices. For calculating the midway point between two input points, you calculate the sum of the two vectors, and normalize the result to get the point back on the sphere.
The level (precision) of the tessellation is determined by the number of levels in the recursive subdivision. It starts with the 8 original triangles of the octahedron at level 0, results in 32 triangles at level 1, 128 at level 2, 512 at level 3, etc. You normally get a reasonably good looking sphere around level 3.
This approach results in a much more regular triangulation, and is therefore superior to the spherical coordinate approach.
The main disadvantage is that it might seem more complex. The calculation of the points is in fact very simple. It gets slightly more tricky if you want to use indexed vertices, instead of repeating common vertices. And even more painful if you want to build nice triangle strips. Not terribly difficult, but it takes some work.
This is my favorite approach of drawing spheres.
Other Polyhedra
You can do the same thing I described for the octahedron starting with other polyhedra. Regular polyhedra that consist of triangles are particularly suitable, which makes the tethrahedron and the icosahedron natural candidates. The octahedron is the most attractive IMHO because the initial coordinates are so easy to enumerate. Using an icosahedron would probably result in an even more regular triangulation, and the vertex coordinates can be looked up.
Subdivided Cube
I'm not sure if anybody is actually using this. But I tried it recently, and it was kind of fun. :) The idea is that you take a cube centered at the origin, and subdivide each of the six sides into smaller sub-squares. You can then turn the cube into a sphere by simply normalizing each of the vectors that describe a vertex.
The advantage of this approach is that it's very simple, including building triangle strips. The quality of the triangulation seems reasonably good. I don't think it's as regular as the recursively subdivided octahedron, but definitely better than the (much too) widely used spherical coordinate approach.
I'm trying to achieve terrain texturing using 3D texture that consists of several layers of material and to make smooth blending between materials.
Maybe my illustration will explain it better:
Just imagine that each color is a cool terrain texture, like grass, stone, etc.
I want to get them properly blended, but with current approach I get all textures between requested besides textures which I want to appear (it seems logical because, as I've read, 3D texture is treated as three-dimensional array instead of texture pillars).
Current (and foolish, obviously) approach is simple as a pie ('current' result is rendered using point interpolation, desired result is hand-painted):
Vertexes:
Vertex 1: Position = Vector3.Zero, UVW = Vector3.Zero
Vertex 2: Position = Vector3(0, 1, 0), UVW = Vector3(0, 1, 0.75f)
Vertex 3: Position = Vector3(0, 0, 1), UVW = Vector3(1, 0, 1)
As you can see, first vertex of the triangle uses first material (the red one), second vertex uses third material (the blue one) and third vertex uses last fourth material (the yellow one).
This is how it's done in pixel shader (UVW is directly passed without changes):
float3 texColor = tex3D(ColorTextureSampler, input.UVW);
return float4(texColor, 1);
The reason about my choice is my terrain structure. The terrain is being generated from voxels (each voxel holds material ID) using marching cubes. Each vertex is 'welded' because meshes is pretty big and I don't want to make every triangle individual (but I can still do it if there is no way to solve my question using connected vertices).
I recently came to an idea about storing material IDs of other two vertices of the triangle and their blend factors (I would have an float2 UV pair, float3 for material IDs and float3 for blend factor of each material id) in each vertex, but I don't see any way to accomplish this without breaking my mesh into individual triangles.
Any help would be greatly appreciated. I'm targeting for SlimDX with C# and Direct3D 9 API. Thanks for reading.
P.S.: I'm sorry if I made some mistakes in this text, English is not my native language.
Probably, your ColorTextureSampler using point filtering (D3DTEXF_POINT). Use either D3DTEXF_LINEAR or D3DTEXF_ANISOTROPIC to acheve desired interpolation effect.
I'm not very familiar with SlimDX 9, but you've got the idea.
BTW, nice illustration =)
Update 1
Result in your comment below seems appropriate to your code.
Looks like to get desired effect you must change overall approach.
It is not complete solution for you, but there is how we make it in plain 3D terrains:
Every vertex has 1 pair (u, v) of texure coodrinates
You have n textures to sample into (T1, T2, T3, ..., Tn) that represents different layers of terrain: sand, grass, rock, etc.
You have mask texture(s) n channels in total, that stores blending coefficients for each texture T in its channels: R channel holds alpha for T1, G channel for T2, B for T3, ... etc.
In pixel shader you sampling your layer textures as usual, and get color values float4 val1, val2, val3, ...
Then you sampling masks texture(s) for corresponding blend coefficients and get float blend1, blend2, blend3, ...
Then you applying some kind of blending algorith, for example simple linear interpolation:
float4 terrainColor = lerp( val1, val2, blend1 );
terrainColor = lerp( terrainColor, val3, blend2);
terrainColor = lerp( terrainColor, ..., blendN );
For example if your T1 is a grass, and you have a big grass field in a middle of your map, you will wave a big red field in the middle.
This algorithm is a bit slow, because of much texture sampling, but simple to implement, gives good visual results and most flexible. You can use not only mask as blend coefficients, but any values: for example height (sample more snow in mountain peaks, rock in mountains, dirt in low ground), slope (rock on steep, grass on flat), even fixed values, etc. Or mix up all of that. Also, you can vary a blending: use built-in lerp or something more complicated (warning! this example is stupid):
float4 terrainColor = val1 * val2 * blend1 + val2 * val3 * blend2;
terrainColor = saturate(terrainColor);
Playing with blend algo is the most interesting part of this aproach. And you can find many-many techniques in google.
Not sure, but hope it helps!
Happy coding! =)
I have calculated the intrinsic and extrinsic parameters of the camera with OpenCV.
Now, I want to calculate world coordinates (x,y,z) from screen coordinates (u,v).
How I do this?
N.B. as I use the kinect, I already know the z coordinate.
Any help is much appreciated. Thanks!
First to understand how you calculate it, it would help you if you read some things about the pinhole camera model and simple perspective projection. For a quick glimpse, check this. I'll try to update with more.
So, let's start by the opposite which describes how a camera works: project a 3d point in the world coordinate system to a 2d point in our image. According to the camera model:
P_screen = I * P_world
or (using homogeneous coordinates)
| x_screen | = I * | x_world |
| y_screen | | y_world |
| 1 | | z_world |
| 1 |
where
I = | f_x 0 c_x 0 |
| 0 f_y c_y 0 |
| 0 0 1 0 |
is the 3x4 intrinsics matrix, f being the focal point and c the center of projection.
If you solve the system above, you get:
x_screen = (x_world/z_world)*f_x + c_x
y_screen = (y_world/z_world)*f_y + c_y
But, you want to do the reverse, so your answer is:
x_world = (x_screen - c_x) * z_world / f_x
y_world = (y_screen - c_y) * z_world / f_y
z_world is the depth the Kinect returns to you and you know f and c from your intrinsics calibration, so for every pixel, you apply the above to get the actual world coordinates.
Edit 1 (why the above correspond to world coordinates and what are the extrinsics we get during calibration):
First, check this one, it explains the various coordinates systems very well.
Your 3d coordinate systems are: Object ---> World ---> Camera. There is a transformation that takes you from object coordinate system to world and another one that takes you from world to camera (the extrinsics you refer to). Usually you assume that:
Either the Object system corresponds with the World system,
or, the Camera system corresponds with the World system
1. While capturing an object with the Kinect
When you use the Kinect to capture an object, what is returned to you from the sensor is the distance from the camera. That means that the z coordinate is already in camera coordinates. By converting x and y using the equations above, you get the point in camera coordinates.
Now, the world coordinate system is defined by you. One common approach is to assume that the camera is located at (0,0,0) of the world coordinate system. So, in that case, the extrinsics matrix actually corresponds to the identity matrix and the camera coordinates you found, correspond to world coordinates.
Sidenote: Because the Kinect returns the z in camera coordinates, there is also no need from transformation from the object coordinate system to the world coordinate system. Let's say for example that you had a different camera that captured faces and for each point it returned the distance from the nose (which you considered to be the center of the object coordinate system). In that case, since the values returned would be in the object coordinate system, we would indeed need a rotation and translation matrix to bring them to the camera coordinate system.
2. While calibrating the camera
I suppose you are calibrating the camera using OpenCV using a calibration board with various poses. The usual way is to assume that the board is actually stable and the camera is moving instead of the opposite (the transformation is the same in both cases). That means that now the world coordinate system corresponds to the object coordinate system. This way, for every frame, we find the checkerboard corners and assign them 3d coordinates, doing something like:
std::vector<cv::Point3f> objectCorners;
for (int i=0; i<noOfCornersInHeight; i++)
{
for (int j=0; j<noOfCornersInWidth; j++)
{
objectCorners.push_back(cv::Point3f(float(i*squareSize),float(j*squareSize), 0.0f));
}
}
where noOfCornersInWidth, noOfCornersInHeight and squareSize depend on your calibration board. If for example noOfCornersInWidth = 4, noOfCornersInHeight = 3 and squareSize = 100, we get the 3d points
(0 ,0,0) (0 ,100,0) (0 ,200,0) (0 ,300,0)
(100,0,0) (100,100,0) (100,200,0) (100,300,0)
(200,0,0) (200,100,0) (200,200,0) (200,300,0)
So, here our coordinates are actually in the object coordinate system. (We have assumed arbitrarily that the upper left corner of the board is (0,0,0) and the rest corners' coordinates are according to that one). So here we indeed need the rotation and transformation matrix to take us from the object(world) to the camera system. These are the extrinsics that OpenCV returns for each frame.
To sum up in the Kinect case:
Camera and World coodinate systems are considered the same, so no need for extrinsics there.
No need for Object to World(Camera) transformation, since Kinect return value is already in Camera system.
Edit 2 (On the coordinate system used):
This is a convention and I think it depends also on which drivers you use and the kind of data you get back. Check for example that, that and that one.
Sidenote: It would help you a lot if you visualized a point cloud and played a little bit with it. You can save your points in a 3d object format (e.g. ply or obj) and then just import it into a program like Meshlab (very easy to use).
Edit 2 (On the coordinate system used):
This is a convention and I think it depends also on which drivers you use and the kind of data you get back. Check for example that, that and that one.
if you for instance use microsoft sdk: then Z is not the distance to the camera but the "planar" distance to the camera. This might change the appropriate formulas.