OpenGL ES 2.0 Sphere - ios

What is the easiest way to draw a textured Sphere in OpenGL ES 2.0 with GL_TRIANGLES?
I'm especially wondering how to calculate the vertices.

There are various ways of triangulating spheres. Popular ones, less popular ones, good ones, and not so good ones. Unfortunately the most widely used approach isn't very good.
Spherical Coordinates
This might be the most widely used approach. You iterate through the two angles in a spherical coordinate system in two nested loops, and generate points for each pair of angles. With angle theta iterating from -pi/2 to pi/2 and angle phi iterating from 0 to 2*pi, and sphere radius r, each point is calculated as:
x = r * cos(theta) * cos(phi)
y = r * cos(theta) * sin(phi)
z = r * sin(theta)
The calculation can be made more efficient if necessary, but I'll skip that aspect for this answer. The level (precision) of the tessellation is determined by the number of subdivisions of the angles.
The main advantage of this approach is that it's simple to implement, and easy to understand. You can picture the subdivision as the lines of latitude and longitude on a globe.
It does not result in a very good triangulation, though. The triangles around the equator have similar dimensions in all directions, but the triangles closer to the north/south pole get increasingly narrow. At the north/south pole you have a large number of very narrow triangles meeting in a single point. Good triangulations have triangles that are all of very similar size, and this one does not.
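The nested loops described above can be sketched as follows. This is only a minimal illustration in Python, not tied to any OpenGL binding: the function name `uv_sphere`, the `stacks`/`slices` parameters, and the texture coordinates (s, t) are assumptions added for the example. The resulting lists would still have to be flattened into vertex and index buffers for drawing with GL_TRIANGLES.

```python
import math

def uv_sphere(radius, stacks, slices):
    """Latitude/longitude sphere (hypothetical helper, not from the answer).

    Returns (vertices, triangles): vertices are (x, y, z, s, t) tuples,
    triangles are index triples suitable for GL_TRIANGLES.
    """
    vertices = []
    for i in range(stacks + 1):
        theta = -math.pi / 2 + math.pi * i / stacks   # -pi/2 .. pi/2
        for j in range(slices + 1):
            phi = 2 * math.pi * j / slices            # 0 .. 2*pi
            x = radius * math.cos(theta) * math.cos(phi)
            y = radius * math.cos(theta) * math.sin(phi)
            z = radius * math.sin(theta)
            # The seam column (j == slices) is duplicated so the
            # texture coordinate s can run from 0 to 1.
            vertices.append((x, y, z, j / slices, i / stacks))

    triangles = []
    row = slices + 1
    for i in range(stacks):
        for j in range(slices):
            a = i * row + j      # lower-left corner of the quad
            b = a + 1            # lower-right
            c = a + row          # upper-left
            d = c + 1            # upper-right
            # Counter-clockwise when viewed from outside the sphere.
            # Quads touching a pole produce one zero-area triangle each.
            triangles.append((a, b, d))
            triangles.append((a, d, c))
    return vertices, triangles
```

For example, `uv_sphere(1.0, 16, 32)` gives 17 * 33 vertices and 16 * 32 * 2 triangles, with the narrow triangles near the poles that the paragraph above describes.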
Recursive Subdivision of Octahedron
With this approach, you start with a regular octahedron, giving you 8 triangles. You then recursively subdivide each triangle into 4 sub-triangles, as illustrated here:
     /\
    /  \
   /____\
  /\    /\
 /  \  /  \
/____\/____\
So each triangle is subdivided by calculating 3 additional vertices that are midway between two of the existing vertices, and 4 triangles are formed from these 6 vertices. For calculating the midway point between two input points, you calculate the sum of the two vectors, and normalize the result to get the point back on the sphere.
The level (precision) of the tessellation is determined by the number of levels in the recursive subdivision. It starts with the 8 original triangles of the octahedron at level 0, results in 32 triangles at level 1, 128 at level 2, 512 at level 3, etc. You normally get a reasonably good looking sphere around level 3.
This approach results in a much more regular triangulation, and is therefore superior to the spherical coordinate approach.
The main disadvantage is that it might seem more complex. The calculation of the points is in fact very simple. It gets slightly more tricky if you want to use indexed vertices, instead of repeating common vertices. And even more painful if you want to build nice triangle strips. Not terribly difficult, but it takes some work.
This is my favorite approach to drawing spheres.
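A minimal sketch of the recursive subdivision, written in Python purely for illustration. The helper names are made up for this example, and it returns a plain (non-indexed) triangle list with repeated vertices, which is the simple variant mentioned above.

```python
import math

def normalize(v):
    length = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    return (v[0] / length, v[1] / length, v[2] / length)

def midpoint_on_sphere(a, b):
    # Sum of the two vectors, normalized to put the point back on the sphere.
    return normalize((a[0] + b[0], a[1] + b[1], a[2] + b[2]))

def subdivide(tri, level):
    """Split one triangle into 4 sub-triangles, `level` times."""
    if level == 0:
        return [tri]
    a, b, c = tri
    ab = midpoint_on_sphere(a, b)
    bc = midpoint_on_sphere(b, c)
    ca = midpoint_on_sphere(c, a)
    result = []
    for sub in ((a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)):
        result.extend(subdivide(sub, level - 1))
    return result

def octahedron_sphere(level):
    """Unit sphere as a list of triangles (3 points each)."""
    px, nx = (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)
    py, ny = (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)
    pz, nz = (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)
    # The 8 faces of the octahedron, counter-clockwise seen from outside.
    faces = [
        (px, py, pz), (py, nx, pz), (nx, ny, pz), (ny, px, pz),
        (py, px, nz), (nx, py, nz), (ny, nx, nz), (px, ny, nz),
    ]
    triangles = []
    for face in faces:
        triangles.extend(subdivide(face, level))
    return triangles
```

`octahedron_sphere(3)` yields the 512 triangles mentioned above. Multiply each point by r for a sphere of radius r; on a unit sphere the point itself can also serve as the vertex normal.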
Other Polyhedra
You can do the same thing I described for the octahedron starting with other polyhedra. Regular polyhedra that consist of triangles are particularly suitable, which makes the tetrahedron and the icosahedron natural candidates. The octahedron is the most attractive IMHO because the initial coordinates are so easy to enumerate. Using an icosahedron would probably result in an even more regular triangulation, and the vertex coordinates can be looked up.
Subdivided Cube
I'm not sure if anybody is actually using this. But I tried it recently, and it was kind of fun. :) The idea is that you take a cube centered at the origin, and subdivide each of the six sides into smaller sub-squares. You can then turn the cube into a sphere by simply normalizing each of the vectors that describe a vertex.
The advantage of this approach is that it's very simple, including building triangle strips. The quality of the triangulation seems reasonably good. I don't think it's as regular as the recursively subdivided octahedron, but definitely better than the (much too) widely used spherical coordinate approach.
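As a rough sketch of the cube idea (Python, for illustration only; the function name and the face parameterization are assumptions of this example): each face of the cube [-1, 1]^3 is split into n x n squares, and every corner is normalized onto the sphere.

```python
import math

def cube_sphere(n, radius=1.0):
    """Sphere from a subdivided cube, as a list of triangles (3 points each)."""
    def on_sphere(p):
        length = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)
        return tuple(radius * c / length for c in p)

    # Each face maps (u, v) in [-1, 1]^2 to a point on the cube surface.
    # The argument order is chosen so all faces wind counter-clockwise
    # when viewed from outside.
    faces = [
        lambda u, v: (1.0, u, v),
        lambda u, v: (-1.0, v, u),
        lambda u, v: (v, 1.0, u),
        lambda u, v: (u, -1.0, v),
        lambda u, v: (u, v, 1.0),
        lambda u, v: (v, u, -1.0),
    ]
    triangles = []
    for face in faces:
        for i in range(n):
            for j in range(n):
                u0, u1 = -1 + 2 * i / n, -1 + 2 * (i + 1) / n
                v0, v1 = -1 + 2 * j / n, -1 + 2 * (j + 1) / n
                p00 = on_sphere(face(u0, v0))
                p10 = on_sphere(face(u1, v0))
                p11 = on_sphere(face(u1, v1))
                p01 = on_sphere(face(u0, v1))
                triangles.append((p00, p10, p11))
                triangles.append((p00, p11, p01))
    return triangles
```

With n = 4 this produces 6 * 16 * 2 = 192 triangles. Because every face is a regular grid, emitting each row of squares as a triangle strip instead is straightforward.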

Related

Measure distance to object with a single camera in a static scene

Let's say I am placing a small object on a flat floor inside a room.
First step: Take a picture of the room floor from a known, static position in the world coordinate system.
Second step: Detect the bottom edge of the object in the image and map the pixel coordinate to the object position in the world coordinate system.
Third step: Measure the real distance to the object with a measuring tape.
I could move the small object, repeat these three steps for every pixel coordinate and create a lookup table (key: pixel coordinate; value: distance). This procedure is accurate enough for my use case. I know that it is problematic if there are multiple objects (one object could cover another).
My question: Is there an easier way to create this lookup table? Accidentally changing the camera angle by a few degrees destroys the hard work. ;)
Maybe it is possible to execute the three steps for a few specific pixel coordinates or positions in the world coordinate system and perform some "calibration" to calculate the distances with the computed parameters?
If the floor is flat, its equation is that of a plane, let
a.x + b.y + c.z = 1
in the camera coordinates (the origin is the optical center of the camera, XY forms the focal plane and Z the viewing direction).
Then a ray from the camera center to a point on the image at pixel coordinates (u, v) is given by
(u, v, f).t
where f is the focal length.
The ray hits the plane when
(a.u + b.v + c.f) t = 1,
i.e. at the point
(u, v, f) / (a.u + b.v + c.f)
Finally, the distance from the camera to the point is
p = √(u² + v² + f²) / (a.u + b.v + c.f)
This is the function that you need to tabulate. Assuming that f is known, you can determine the unknown coefficients a, b, c by taking three non-aligned points, measuring the image coordinates (u, v) and the distances, and solving a 3x3 system of linear equations.
From the last equation, you can then estimate the distance for any point of the image.
The focal distance can be measured (in pixels) by looking at a target of known size, at a known distance. By proportionality, the ratio of the distance over the size is f over the length in the image.
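The calibration described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up, (u, v) are taken to be pixel coordinates measured relative to the optical center of the image, and the 3x3 system is solved with Cramer's rule to avoid any library dependency.

```python
import math

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        solution.append(det3(mc) / d)
    return solution

def focal_length_px(target_size, target_distance, size_in_image_px):
    # Proportionality: distance / size = f / size_in_image.
    return size_in_image_px * target_distance / target_size

def calibrate_plane(samples, f):
    """samples: three (u, v, measured_distance) tuples, not aligned in the image.

    Each sample gives one linear equation
        a*u + b*v + c*f = sqrt(u^2 + v^2 + f^2) / distance
    in the unknown plane coefficients (a, b, c).
    """
    m = [[u, v, f] for u, v, _ in samples]
    rhs = [math.sqrt(u * u + v * v + f * f) / p for u, v, p in samples]
    return solve3(m, rhs)

def distance(u, v, f, plane):
    """Distance from the camera to the floor point seen at pixel (u, v)."""
    a, b, c = plane
    return math.sqrt(u * u + v * v + f * f) / (a * u + b * v + c * f)
```

After `plane = calibrate_plane(samples, f)`, calling `distance(u, v, f, plane)` for every pixel fills the lookup table. If the camera is bumped, three new measurements are enough to recalibrate.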
Most vision libraries (including OpenCV) have built-in functions that take a couple of points in the camera reference frame and the corresponding points on a Cartesian plane and generate your warp matrix (affine transformation) for you. (Some are fancy enough to include non-linear mappings given enough input points, but that brings you back to your time-to-calibrate issue.)
A final note: most vision libraries calibrate off some type of grid, such as a checkerboard pattern. If you wrote your calibration to work off such a sheet, you would only need to measure the distance to one target object, as the transformations would be calculated from the sheet and the target would just provide the world offsets.
I believe what you are after is called a Projective Transformation. The link below should guide you through exactly what you need.
Demonstration of calculating a projective transformation with proper math typesetting on the Math SE.
Although you can solve this by hand and write that into your code, I strongly recommend using a matrix math library, or even writing your own matrix math functions, before resorting to hand-calculating the equations: you would have to solve them symbolically to turn them into code, and that gets very long and is prone to miscalculation.
Here are just a few tips that may help you with clarification (applying it to your problem):
- Your A matrix (source) is built from the 4 xy points in your camera image (pixel locations).
- Your B matrix (destination) is built from your measurements in the real world.
- For fast recalibration, I suggest marking points on the ground so you can quickly place the cube at the 4 locations (and subsequently get the altered pixel locations in the camera) without having to remeasure.
- You will only have to do steps 1-5 once, during calibration. After that, whenever you want to know the position of something, just get the coordinates in your image and run them through step 6 and step 7.
- You will want your calibration points to be as far away from each other as possible (within reason: at extreme distances in a vanishing-point situation you rapidly lose pixel density and therefore source image accuracy). Make sure that no 3 points are collinear (simply put, make your 4 points approximately square in the real world, spanning almost the full camera FOV).
P.S. I apologize for not writing this out here, but they have fancy math editing and it looks way cleaner!
Final steps to applying this method to this situation:
In order to perform this calibration, you will have to set a global home position (it is likely easiest to choose this arbitrarily on the floor and measure your camera position relative to that point). From this position, you will need to measure your object's distance in both the x and y directions on the floor. Although a more tightly packed calibration set will give you more error, the easiest solution may simply be to use a sheet of known dimensions (I am thinking of a piece of printer paper, a large board, or something similar). This is easier because the sheet has built-in axes (i.e. two sides are orthogonal), so you can just use the four corners of the object with canned distances in your calibration. Example: for a piece of paper your points would be (0,0), (0,8.5), (11,8.5), (11,0).
Using those points and the pixels you get will create your transform matrix, but that still just gives you a global x,y position on axes that may be hard to measure on (they may be skewed, depending on how you measured/calibrated). So you will need to calculate your camera offset:
object in real world coords (from steps above): x1, y1
camera coords (Xc, Yc)
dist = sqrt( pow(x1-Xc,2) + pow(y1-Yc,2) )
If it is too cumbersome to try to measure the position of the camera from global origin by hand, you can instead measure the distance to 2 different points and feed those values into the above equation to calculate your camera offset, which you will then store and use anytime you want to get final distance.
As already mentioned in the previous answers, you'll need a projective transformation, or simply a homography. However, I'll consider it from a more practical point of view and try to keep the summary short and simple.
So, given the proper homography you can warp your picture of a plane such that it looks like you took it from above (like here). Even simpler you can transform a pixel coordinate of your image to world coordinates of the plane (the same is done during the warping for each pixel).
A homography is basically a 3x3 matrix and you transform a coordinate by multiplying it with the matrix. You may now think, wait 3x3 matrix and 2D coordinates: You'll need to use homogeneous coordinates.
However, most frameworks and libraries will do this handling for you. What you need to do is find (at least) four points (x/y-coordinates) on your world plane/floor (preferably the corners of a rectangle, aligned with your desired world coordinate system), take a picture of them, measure the pixel coordinates and pass both to the "find-homography-function" of your desired computer vision or math library.
In OpenCV that would be findHomography; here is an example (the method perspectiveTransform then performs the actual transformation).
In Matlab you can use something from here. Make sure you are using a projective transformation as transform type. The result is a projective tform, which can be used in combination with this method, in order to transform your points from one coordinate system to another.
In order to transform into the other direction you just have to invert your homography and use the result instead.
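To make the four-point procedure concrete, here is a dependency-free Python sketch of what such a "find-homography-function" does for exactly four correspondences. It is an illustration, not the OpenCV or Matlab implementation: the function names are hypothetical, the last matrix entry is fixed to 1, and the pixel coordinates in the usage note are made up.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def find_homography(src, dst):
    """src, dst: four (x, y) pairs each, no three collinear.

    Returns a 3x3 matrix H (nested lists) with H[2][2] fixed to 1.
    """
    a, b = [], []
    for (x, y), (X, Y) in zip(src, dst):
        # X = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), same for Y.
        a.append([x, y, 1, 0, 0, 0, -X * x, -X * y])
        b.append(X)
        a.append([0, 0, 0, x, y, 1, -Y * x, -Y * y])
        b.append(Y)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_homography(H, x, y):
    # Homogeneous coordinates: (x, y) becomes (x, y, 1); divide by w at the end.
    X = H[0][0] * x + H[0][1] * y + H[0][2]
    Y = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return X / w, Y / w
```

For example, with the corners of a sheet of paper seen at the (made-up) pixel positions `[(100, 400), (120, 200), (500, 220), (540, 420)]` and world positions `[(0, 0), (0, 8.5), (11, 8.5), (11, 0)]`, `find_homography(pixels, world)` maps pixels to floor coordinates. Swapping the two arguments gives the mapping in the other direction, which is equivalent to inverting the matrix.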

Nature of relationship between optic flow and depth

Assuming a static scene, with a single camera moving exactly sideways by a small distance, there are two frames and the following computed optic flow (I use OpenCV's calcOpticalFlowFarneback):
Here scatter points are detected features, which are painted in pseudocolor with depth values (red is little depth, close to the camera, blue is more distant). Now, I obtain those depth values by simply inverting optic flow magnitude, like d = 1 / flow. Seems kinda intuitive, in a motion-parallax-way - the brighter the object, the closer it is to the observer. So there's a cube, exposing a frontal edge and a bit of a side edge to the camera.
But then I'm trying to project those feature points from the camera plane to real-life coordinates to make a kind of top-view map, where X = (x * d) / f and Y = d (d is depth, x is the pixel coordinate, f is the focal length, and X and Y are real-life coordinates). And here's what I get:
Well, doesn't look cubic to me. Looks like the picture is skewed to the right. I've spent some time thinking about why, and it seems that 1 / flow is not an accurate depth metric. Playing with different values, say, if I use 1 / power(flow, 1 / 3), I get a better picture:
But, of course, the power of 1 / 3 is just a magic number out of my head. The question is: what is the relationship between optic flow and depth in general, and how am I supposed to estimate it for a given scene? We're just considering camera translation here. I've stumbled upon some papers, but no luck trying to find a general equation yet. Some, like that one, propose a variation of 1 / flow, which isn't going to work, I guess.
Update
What bothers me a little is that simple geometry points me to 1 / flow answer too. Like, optic flow is the same (in my case) as disparity, right? Then using this formula I get d = Bf / (x2 - x1), where B is distance between two camera positions, f is focal length, x2-x1 is precisely the optic flow. Focal length is a constant, and B is constant for any two given frames, so that leaves me with 1 / flow again multiplied by a constant. Do I misunderstand something about what optic flow is?
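For reference, the two formulas used in this question, d = Bf / (x2 - x1) and X = (x * d) / f, can be written out as a small Python sketch. It assumes an ideal pinhole camera translated purely sideways and pixel coordinates measured from the image center. The function names and the use of the absolute disparity are choices made for this example.

```python
def depth_from_disparity(x1, x2, baseline, f):
    """d = B * f / (x2 - x1) for a sideways camera shift of `baseline`."""
    disparity = abs(x2 - x1)
    if disparity == 0:
        return float("inf")  # no parallax: point at infinity
    return baseline * f / disparity

def top_view(x, depth, f):
    """Map pixel column x (relative to the image center) to (X, Y)."""
    return (x * depth / f, depth)
```

With f = 800 px and a baseline of 0.1 m, a feature that moves from x1 = 100 to x2 = 120 has a depth of 0.1 * 800 / 20 = 4 m and lands at X = 100 * 4 / 800 = 0.5 m in the top view.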
For a static scene, moving a camera precisely sideways by a known amount is exactly the same as a stereo camera setup. From this, you can indeed estimate depth, if your system is calibrated.
Note that calibration in this sense is rather broad. In order to get really accurate depth, you will in the end need to supply a scale parameter on top of the regular calibration stuff you have in OpenCV, or else there is a single uniform scale ambiguity in the 3D reconstruction. (This last step is often called going to the "metric" reconstruction from only the "Euclidean".)
Another thing that is a part of broad calibration is lens distortion compensation. Before anything else, you probably want to force your cameras to behave like pin-hole cameras (which real-world cameras usually don't).
With that said, optical flow is definitely very different from a metric depth map. If you properly calibrate and rectify your system first, optical flow is still not equivalent to disparity estimation. If your system is rectified, there is no point in doing a full optical flow estimation (such as Farnebäck), because the problem is thereafter constrained along the horizontal lines of the image. Doing a full optical flow estimation (giving 2 d.o.f.) after said rectification will likely introduce more error.
A great reference for all this stuff is the classic "Multiple View Geometry in Computer Vision"

Calculating the neighborhood distance

What method would you use to compute a distance that represents the number of "jumps" one has to do to go from one area to another area in a given 2D map?
Let's take the following map for instance:
(source: free.fr)
The end result of the computation would be a triangle like this:
    A  B  C  D  E  F
A
B   1
C   2  1
D   2  1  1
E   .  .  .  .
F   3  2  2  1  .
Which means that going from A to D, it takes 2 jumps.
However, to go from anywhere to E, it's impossible because the "gap" is too big, and so the value is "infinite", represented here as a dot for simplification.
As you can see on the example, the polygons may share points, but most often they are simply close together and so a maximum gap should be allowed to consider two polygons to be adjacent.
This, obviously, is a simplified example, but in the real case I'm faced with about 60000 polygons and am only interested in jump values up to 4.
As input data, I have the polygon vertices as an array of coordinates, from which I already know how to calculate the centroid.
My initial approach would be to "paint" the polygons on a white background canvas, each with its own color, and then walk the line between the centroids of two candidate polygons. Counting the colors I encounter could give me the number of jumps.
However, this is not really reliable as it does not take into account concave arrangements where one has to walk around the "notch" to go from one polygon to the other as can be seen when going from A to F.
I have tried looking for reference material on this subject but could not find any because I have a hard time figuring what the proper terms are for describing this kind of problem.
My target language is Delphi XE2, but any example would be most welcome.
You can create an inflated polygon with a small offset for every initial polygon, then check for intersection with neighbouring (inflated) polygons. Offsetting is useful to compensate for small gaps between polygons.
Both the inflating and the intersection problems can be solved with the Clipper library.
How to find potential neighbours depends on the real conditions. One simple method is to divide the plane into square cells and, for each polygon, check only the polygons that have vertices in the same cell or in the adjacent cells.
Every pair of intersecting polygons gives an edge in an (unweighted, undirected) graph. You want to find all paths of length <= 4, so just run a depth-limited BFS from every vertex (polygon), assuming that the graph is sparse.
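The depth-limited BFS step can be sketched as follows, in Python for brevity (it translates directly to Delphi). The adjacency dictionary is assumed to have been built already, for example with the inflate-and-intersect test. The function name and data layout are illustrative only.

```python
from collections import deque

def jump_counts(adjacency, max_jumps=4):
    """Breadth-first search from every polygon, cut off at max_jumps.

    adjacency: dict mapping polygon id -> set of neighbouring polygon ids.
    Returns a dict: start id -> {reachable id: number of jumps}.
    Polygons missing from an inner dict are more than max_jumps away
    (or unreachable).
    """
    result = {}
    for start in adjacency:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if dist[node] == max_jumps:
                continue  # do not expand beyond the depth limit
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        result[start] = dist
    return result
```

Using the direct neighbours implied by the 1 entries in the example table (A-B, B-C, B-D, C-D, D-F, with E isolated), `jump_counts(...)["A"]` gives 2 for D and 3 for F and contains no entry for E. Because the search stops at depth 4, each start polygon only touches a small neighbourhood, which keeps the total work manageable for about 60000 polygons when the graph is sparse.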
You can try single-linkage clustering or Voronoi diagrams. You can also brute-force it, or try density-based spatial clustering of applications with noise (DBSCAN) or k-means clustering.
I would try this:
1) Do a Delaunay triangulation of all the points of all polygons
2) Remove from the Delaunay graph all triangles that have their 3 points in the same polygon
Two polygons are neighbors by point if at least one triangle has at least one point in each polygon (or, obviously, if the polygons have a common point).
Two polygons are neighbors by side if each polygon has at least two adjacent points in the same quad, i.e. two adjacent triangles (or, obviously, if they have two common and adjacent points).
Once the gaps are filled with new polygons (triangles, possibly combined), use Dijkstra's algorithm weighted by the distance between nearest points (or polygon centroids) to compute the paths.

Smooth textured line with OpenGL ES 2.0 shaders

We have an iOS drawing app. Currently, the drawing is implemented with OpenGL ES 1.1. We use some algorithms to smooth the lines such as Bezier curves. So, when touch events occur, we get some set of points out of touch event points (based on algorithms) and draw these points. We also use brush texture for points to have more natural look.
I wonder if it's possible to implement these algorithms in OpenGL ES 2.0 shaders: something like calling an OpenGL function to draw lines made of touch points and getting a smoothed, brush-textured curve rendered as output.
Points P0, P1, ... P4 here are touch events and the points on red curve - output points, with such step for T so that the distance between two neighbor points on curve is not greater than 1 pixel.
And here is the link with Bezier algorithm explanation:
Bézier curve - Wikipedia, the free encyclopedia
Any help is much appreciated.
Thanks.
You cannot generate new vertices inside the vertex shader (you can do it in the geometry shader, which ES doesn't have). The number of output vertices is always the same as the number of input vertices; you can only change their positions (and other attributes, of course).
So you would have to draw a line strip made out of enough vertices to guarantee a smooth enough curve. What you can do is always submit the same line strip, with the curve parameter values T as 1D vertex positions. In the shader you then use this input position (the parameter value) to compute the actual 2D/3D position on the curve using the de Casteljau algorithm (or whatever) and the points P0 to P4, which you pass to the shader as constants (uniform variables in GLSL terms).
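To show what that per-vertex computation amounts to, here is the de Casteljau evaluation as a CPU-side Python sketch. It is not shader code: in the shader variant the same repeated interpolation would run once per vertex, with t as the vertex attribute and the control points as uniforms. The function names are chosen for this example.

```python
def de_casteljau(control_points, t):
    """Evaluate a Bezier curve of any degree at parameter t in [0, 1]."""
    points = [tuple(p) for p in control_points]
    # Repeatedly interpolate between neighbouring points until one is left.
    while len(points) > 1:
        points = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]

def sample_curve(control_points, n):
    """Return n + 1 curve points for t = 0, 1/n, ..., 1."""
    return [de_casteljau(control_points, i / n) for i in range(n + 1)]
```

`sample_curve([P0, P1, P2, P3, P4], n)` gives the points you would upload into a dynamic VBO in the CPU variant; n has to be large enough that neighbouring points are no more than about a pixel apart.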
But I'm not sure if that would really buy you anything over just computing those points on the CPU and putting them into a dynamic VBO. What you save is the copying of the curve points from CPU to GPU and the computation on the CPU, but on the other hand your vertex shader is much more complex. It needs to be evaluated which is the better approach. If you need to compute the curve points each frame (because the control points change each frame) and the curve is rather high detail, it might not be that bad an idea. But otherwise I don't think it really pays. And also your shader won't be adaptable that easily to a changing number of control points/curve degree at runtime.
But once again, you cannot put in 5 control points and generate N curve points on the GPU. The vertex shader always works on a single vertex and results in a single vertex, the same as the fragment shader always works on a single fragment (say pixel, though it isn't one yet) and results in a single (or no) fragment.

Math/OpenGL ES: Draw 3D bezier curve of varying width

I've been working on a problem for several weeks and have reached a point that I'd like to make sure I'm not overcomplicating my approach. This is being done in OpenGL ES 2.0 on iOS, but the principles are universal, so I don't mind the answers being purely mathematical in form. Here's the rundown.
I have 2 points in 3D space along with a control point that I am using to produce a bezier curve with the following equation:
B(t) = (1 - t)² P0 + 2(1 - t) t P1 + t² P2
The start/end points are being positioned at dynamic coordinates on a fairly large sphere, so x/y/z varies greatly, making a static solution not so practical. I'm currently rendering the points using GL_LINE_STRIP. The next step is to render the curve using GL_TRIANGLE_STRIP and control the width relative to height.
According to this quick discussion, a good way to solve my problem would be to find points that are parallel to the curve along both sides taking account the direction of it. I'd like to create 3 curves in total, pass in the indices to create a bezier curve of varying width, and then draw it.
There's also talk of interpolation and using a Loop-Blinn technique that seem to solve the specific problems of their respective questions. I believe that the solutions, however, might be too complex for what I'm going after. I'm also not interested in bringing textures into the mix. I prefer that the triangles are just drawn using the colors I'll calculate later on in my shaders.
So, before I go into more reading on Trilinear Interpolation, Catmull-Rom splines, the Loop-Blinn paper, or explore sampling further, I'd like to make sure what direction is most likely to be the best bet. I think I can say the problem in its most basic form is to take a point in 3D space and find two parallel points alongside it that take into account the direction in which the next point will be plotted.
Thank you for your time and if I can provide anything further, let me know and I'll do my best to add it.
This answer does not (as far as I see) favor one of the methods you mentioned in your question, but is what I would do in this situation.
I would calculate the normalized normal (or binormal) of the curve. Let's say I take the normalized normal and have it as a function of t (N(t)). With this I would write a helper function to calculate the offset point P:
P(t, o) = B(t) + o * N(t)
Where o means the signed offset of the curve in normal direction.
Given this function one would simply calculate the points to the left and right of the curve by:
Points = [P(t, -w), P(t, w), P(t + s, -w), P(t + s, w)]
Where w is the width of the curve you want to achieve.
Then connect these points via two triangles.
For use in a triangle strip this would mean the indices:
0 1 2 3
Edit
To do some work with the curve one would generally calculate the Frenet frame.
This is a set of 3 vectors (Tangent, Normal, Binormal) that gives the orientation in a curve at a given parameter value (t).
The Frenet frame is given by:
unit tangent = B'(t) / || B'(t) ||
unit binormal = (B'(t) x B''(t)) / || B'(t) x B''(t) ||
unit normal = unit binormal x unit tangent
In this example x denotes the cross product of two vectors and || v || means the length (or norm) of the enclosed vector v.
As you can see you need the first (B'(t)) and the second (B''(t)) derivative of the curve.
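Putting the pieces together for the quadratic curve from the question, a minimal Python sketch could look like this. The derivatives B'(t) = 2(1 - t)(P1 - P0) + 2t(P2 - P1) and B''(t) = 2(P0 - 2 P1 + P2) follow from the curve equation. The function names are made up for this example, and it assumes the three control points are not collinear (otherwise the cross product is zero and the normal is undefined).

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def bezier(p0, p1, p2, t):
    u = 1 - t
    return tuple(u * u * a + 2 * u * t * b + t * t * c
                 for a, b, c in zip(p0, p1, p2))

def bezier_d1(p0, p1, p2, t):
    # First derivative B'(t).
    return tuple(2 * (1 - t) * (b - a) + 2 * t * (c - b)
                 for a, b, c in zip(p0, p1, p2))

def bezier_d2(p0, p1, p2):
    # Second derivative B''(t); constant for a quadratic curve.
    return tuple(2 * (a - 2 * b + c) for a, b, c in zip(p0, p1, p2))

def frenet_normal(p0, p1, p2, t):
    d1 = bezier_d1(p0, p1, p2, t)
    tangent = normalize(d1)
    binormal = normalize(cross(d1, bezier_d2(p0, p1, p2)))
    return cross(binormal, tangent)

def ribbon(p0, p1, p2, w, segments):
    """Triangle-strip vertices P(t, -w), P(t, +w) for t = 0 .. 1."""
    strip = []
    for i in range(segments + 1):
        t = i / segments
        b = bezier(p0, p1, p2, t)
        n = frenet_normal(p0, p1, p2, t)
        strip.append(tuple(bc - w * nc for bc, nc in zip(b, n)))
        strip.append(tuple(bc + w * nc for bc, nc in zip(b, n)))
    return strip
```

The returned list can be drawn directly as a GL_TRIANGLE_STRIP with indices 0, 1, 2, 3, and so on. For a varying width, w could be replaced by a function of t evaluated inside the loop.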

Resources