In OpenGL, depth buffer values are calculated based on the near and far clipping planes of the scene. (Reference: Getting the true z value from the depth buffer)
How does this work in WebGL? My understanding is that WebGL is unaware of my scene's far and near clipping planes. The near and far clipping planes are used to calculate my projection matrix, but I never tell WebGL what they are explicitly so it can't use them to calculate depth buffer values.
How does WebGL set values in the depth buffer when my scene is rendered?
WebGL (like modern OpenGL and OpenGL ES) gets the depth value from the value you supply to gl_Position.z in your vertex shader (though you can also write directly to the depth buffer using certain extensions but that's far less common)
There is no scene in WebGL nor modern OpenGL. That concept of a scene is part of legacy OpenGL left over from the early 90s and long since deprecated. It doesn't exist in OpenGL ES (the OpenGL that runs on Android, iOS, ChromeOS, Raspberry PI, WebGL etc...)
Modern OpenGL and WebGL are just rasterization APIs. You write shaders which are small functions that run on the GPU. You provide those shaders with data through attributes (per iteration data), uniforms (global variables), textures (2d/3d arrays), varyings (data passed from vertex shaders to fragment shaders).
The rest is up to you and what your supplied shader functions do. Modern OpenGL and WebGL are for all intents and purposes just generic computing engines with certain limits. To get them to do anything is up to you to supply shaders.
See webglfundamentals.org for more.
In the Q&A you linked to it's the programmer supplied shaders that decide to use frustum math to decide how to set gl_Position.z. The frustum math is supplied by the programmer. WebGL/GL don't care how gl_Position.z is computed, only that it's a value between -1.0 and +1.0 so how to take a value from the depth buffer and go back to Z is solely up to how the programmer decided to calculate it in the first place.
This article covers the most commonly used math for setting gl_Position.z when rendering 3d with WebGL/OpenGL. Based on your question though I'd suggest reading the preceding articles linked at the beginning of that one.
As for what actual values get written to the depth buffer it's
ndcZ = gl_Position.z / gl_Position.w;
depthValue = (far - near) / 2 * ndcZ + (near - far) / 2
near and far default to 0 and 1 respectively though you can set them with gl.depthRange but assuming they are 0 and 1 then
ndcZ = gl_Position.z / gl_Position.w;
depthValue = .5 * ndcZ - .5
That depthValue would then be in the 0 to 1 range and converted to whatever bit depth the depth buffer is. It's common to have a 24bit depth buffer so
bitValue = depthValue * (2^24 - 1)
Related
The Metal Shading Language includes a lot of mathematic functions, but it seems most of the codes inside Metal official documentation just use it to map vertexes from pixel space to clip space like
RasterizerData out;
out.clipSpacePosition = vector_float4(0.0, 0.0, 0.0, 1.0);
float2 pixelSpacePosition = vertices[vertexID].position.xy;
vector_float2 viewportSize = vector_float2(*viewportSizePointer);
out.clipSpacePosition.xy = pixelSpacePosition / (viewportSize / 2.0);
out.color = vertices[vertexID].color;
return out;
Except for GPGPU using kernel functions to do parallel computation, what things that vertex function can do, with some examples? In a game, if all vertices positions are calculated by the CPU, why GPU still matters? What does vertex function do usually?
Vertex shaders compute properties for vertices. That's their point. In addition to vertex positions, they also calculate lighting normals at each vertex. And potentially texture coordinates. And various material properties used by lighting and shading routines. Then, in the fragment processing stage, those values are interpolated and sent to the fragment shader for each fragment.
In general, you don't modify vertices on the CPU. In a game, you'd usually load them from a file into main memory, put them into a buffer and send them to the GPU. Once they're on the GPU you pass them to the vertex shader on each frame along with model, view, and projection matrices. A single buffer containing the vertices of, say, a tree or a car's wheel might be used multiple times. Each time all the CPU sends is the model, view, and projection matrices. The model matrix is used in the vertex shader to reposition and scale the vertice's positions in world space. The view matrix then moves and rotates the world around so that the virtual camera is at the origin and facing the appropriate way. Then the projection matrix modifies the vertices to put them into clip space.
There are other things a vertex shader can do, too. You can pass in vertices that are in a grid in the x-y plane, for example. Then in your vertex shader you can sample a texture and use that to generate the z-value. This gives you a way to change the geometry using a height map.
On older hardware (and some lower-end mobile hardware) it was expensive to do calculations on a texture coordinate before using it to sample from a texture because you lose some cache coherency. For example, if you wanted to sample several pixels in a column, you might loop over them adding an offset to the current texture coordinate and then sampling with the result. One trick was to do the calculation on the texture coordinates in the vertex shader and have them automatically interpolated before being sent to the fragment shader, then doing a normal look-up in the fragment shader. (I don't think this is an optimization on modern hardware, but it was a big win on some older models.)
First, I'll address this statement
In a game, if all vertices positions are calculated by the CPU, why GPU still matters? What does vertex function do usually?
I don't believe I've seen anyone calculating positions for meshes that will be later used to render them on a GPU. It's slow, you would need to get all this data from CPU to a GPU (which means copying it through a bus if you have a dedicated GPU). And it's just not that flexible. There are much more things other than vertex positions that are required to produce any meaningful image and calculating all this stuff on CPU is just wasteful, since CPU doesn't care for this data for the most part.
The sole purpose of vertex shader is to provide rasterizer with primitives that are in clip space. But there are some other uses that are mostly tricks based on different GPU features.
For example, vertex shaders can write out some data to buffers, so, for example, you can stream out transformed geometry if you don't want to transform it again at a later vertex stage if you have multi-pass rendering that uses the same geometry in more than one pass.
You can also use vertex shaders to output just one triangle that covers the whole screen, so that fragment shaders gets called one time per pixel for the whole screen (but, honestly, you are better of using compute (kernel) shaders for this).
You can also write out data from vertex shader and not generate any primitives. You can do that by generating degenerate triangles. You can use this to generate bounding boxes. Using atomic operations you can update min/max positions and read them at a later stage. This is useful for light culling, frustum culling, tile-based processing and many other things.
But, and it's a BIG BUT, you can do most of this stuff in a compute shader without incurring GPU to run all the vertex assembly pipeline. That means, you can do full-screen effects using just a compute shader (instead of vertex and fragment shader and many pipeline stages in between, such as rasterizer, primitive culling, depth testing and output merging). You can calculate bounding boxes and do light culling or frustum culling in compute shader.
There are reasons to fire up the whole rendering pipeline instead of just running a compute shader, for example, if you will still use triangles that are output from vertex shader, or if you aren't sure how primitives are laid out in memory so you need vertex assembler to do the heavy lifting of assembling primitives. But, getting back to your point, almost all of the reasonable uses for vertex shader include outputting primitives in clip space. If you aren't using resulting primitives, it's probably best to stick to compute shaders.
I'm trying to render a forrest scene for an iOS App with OpenGL. To make it a little bit nicer, I'd like to implement a depth effect into the scene. However I need a linearized depth value from the OpenGL depth buffer to do so. Currently I am using a computation in the fragment shader (which I found here).
Therefore my terrain fragment shader looks like this:
#version 300 es
precision mediump float;
layout(location = 0) out lowp vec4 out_color;
float linearizeDepth(float depth) {
return 2.0 * nearz / (farz + nearz - depth * (farz - nearz));
}
void main(void) {
float depth = gl_FragCoord.z;
float linearized = (linearizeDepth(depth));
out_color = vec4(linearized, linearized, linearized, 1.0);
}
However, this results in the following output:
As you can see, the "further" you get away, the more "stripy" the resulting depth value gets (especially behind the ship). If the terrain tile is close to the camera, the output is somewhat okay..
I even tried another computation:
float linearizeDepth(float depth) {
return 2.0 * nearz * farz / (farz + nearz - (2.0 * depth - 1.0) * (farz - nearz));
}
which resulted in a way too high value so I scaled it down by dividing:
float linearized = (linearizeDepth(depth) - 2.0) / 40.0;
Nevertheless, it gave a similar result.
So how do I achieve a smooth, linear transition between the near and the far plane, without any stripes? Has anybody had a similar problem?
the problem is that you store non linear values which are truncated so when you peek the depth values later on you got choppy result because you lose accuracy the more you are far from znear plane. No matter what you evaluate you will not obtain better results unless:
Lower accuracy loss
You can change znear,zfar values so they are closer together. enlarge znear as much as you can so the more accurate area covers more of your scene.
Another option is to use more bits per depth buffer (16 bits is too low) not sure if can do this in OpenGL ES but in standard OpenGL you can use 24,32 bits on most cards.
use linear depth buffer
So store linear values into depth buffer. There are two ways. One is compute depth so after all the underlying operations you will get linear value.
Another option is to use separate texture/FBO and store the linear depths directly to it. The problem is you can not use its contents in the same rendering pass.
[Edit1] Linear Depth buffer
To linearize depth buffer itself (not just the values taken from it) try this:
Vertex:
varying float depth;
void main()
{
vec4 p=ftransform();
depth=p.z;
gl_Position=p;
gl_FrontColor = gl_Color;
}
Fragment:
uniform float znear,zfar;
varying float depth; // original z in camera space instead of gl_FragCoord.z because is already truncated
void main(void)
{
float z=(depth-znear)/(zfar-znear);
gl_FragDepth=z;
gl_FragColor=gl_Color;
}
Non linear Depth buffer linearized on CPU side (as you do):
Linear Depth buffer GPU side (as you should):
The scene parameters are:
// 24 bits per Depth value
const double zang = 60.0;
const double znear= 0.01;
const double zfar =20000.0;
and simple rotated plate covering whole depth field of view. Booth images are taken by glReadPixels(0,0,scr.xs,scr.ys,GL_DEPTH_COMPONENT,GL_FLOAT,zed); and transformed to 2D RGB texture on CPU side. Then rendered as single QUAD covering whole screen on unit matrices ...
Now to obtain original depth value from linear depth buffer you just do this:
z = znear + (zfar-znear)*depth_value;
I used the ancient stuff just to keep this simple so port it to your profile ...
Beware I do not code in OpenGL ES nor IOS so I hope I did not miss something related to that (I am used to Win and PC).
To show the difference I added another rotated plate to the same scene (so they intersect) and use colored output (no depth obtaining anymore):
As you can see linear depth buffer is much much better (for scenes covering large part of depth FOV).
I have a vertex buffer with an unordered access view, which I'm using to fill the vertices using a compute shader, which treats the UAV as a RWStructuredBuffer, using an equivalent struct to the vertex definition. There are 216000 vertices (i.e. 60 x 60 x 60). But my compute shader seems to fill only about 8000 of them, leaving the rest with their initial values. Is there a limit on the number of elements in a structured buffer that can be written in this way?
As it turns out, if you turn on DirectX error-checking, assigning the UAV of a vertex buffer as a RWStructuredBuffer in the shader is considered to be an error. So although this actually works (for a limited number of vertices), it's not supported.
We have an iOS drawing app. Currently, the drawing is implemented with OpenGL ES 1.1. We use some algorithms to smooth the lines such as Bezier curves. So, when touch events occur, we get some set of points out of touch event points (based on algorithms) and draw these points. We also use brush texture for points to have more natural look.
I wonder if it's possible to implement these algorithms in OpenGL ES 2.0 shaders. Something like to call an OpenGL function to draw lines made of touch points and on output have smoothed brush-textured curve rendered.
Points P0, P1, ... P4 here are touch events and the points on red curve - output points, with such step for T so that the distance between two neighbor points on curve is not greater than 1 pixel.
And here is the link with Bezier algorithm explanation:
Bézier curve - Wikipedia, the free encyclopedia
Any help is much appreciated.
Thanks.
You cannot generate new vertices inside the vertex shader (you can do it in the geometry shader, which ES doesn't have). The number of output vertices is always the same as the number of input vertices, you can only change their positions (and ohter attributes of course).
So you would have to draw a line strip made out of enough vertices to guarantee a smooth enough curve. What you can do is put in always the same line strip, having the curve parameter values T as 1D vertex positions. In the shader you then use this input position (the parameter value) to compute the actual 2D/3D position on the curve using the DeCasteljau algorithm (or whatever) and the points P0 to P4 which you put into the shader as constants (uniform variables in GLSL terms).
But I'm not sure if that would really buy you anything over just computing those points on the CPU and putting them into a dynamic VBO. What you save is the copying of the curve points from CPU to GPU and the computation on the CPU, but on the other hand your vertex shader is much more complex. It needs to be evaluated which is the better approach. If you need to compute the curve points each frame (because the control points change each frame) and the curve is rather high detail, it might not be that bad an idea. But otherwise I don't think it really pays. And also your shader won't be adaptable that easily to a changing number of control points/curve degree at runtime.
But once again, you cannot put in 5 control points and generate N curve points on the GPU. The vertex shader always works on a single vertex and results in a single vertex, the same as the fragment shader always works on a single fragment (say pixel, though it isn't one yet) and result in a single (or no) fragment.
I have two sets of vertexes used as a line strip:
Vertexes1
Vertexes2
It's important to know that these vertexes have previously unknown values, as they are dynamic.
I want to make an animated transition (morph) between these two. I have come up with two different ways of doing this:
Option 1:
Set a Time uniform in the vertex shader, that goes from 0 - 1, where I can do something like this:
// Inside main() in the vertex shader
float originX = Position.x;
float destinationX = DestinationVertexPosition.x;
float interpolatedX = originX + (destinationX - originX) * Time;
gl_Position.x = interpolatedX;
As you probably see, this has one problem: How do I get the "DestinationVertexPosition" in there?
Option 2:
Make the interpolation calculation outside the vertex shader, where I loop through each vertex and create a third vertex set for the interpolated values, and use that to render:
// Pre render
// Use this vertex set to render
InterpolatedVertexes
for (unsigned int i = 0; i < vertexCount; i++) {
float originX = Vertexes1[i].x;
float destinationX = Vertexes2[i].x;
float interpolatedX = originX + (destinationX - originX) * Time;
InterpolatedVertexes[i].x = interpolatedX;
}
I have highly simplified these two code snippets, just to make the idea clear.
Now, from the two options, I feel like the first one is definitely better in terms of performance, given stuff happens at the shader level, AND I don't have to create a new set of vertexes each time the "Time" is updated.
So, now that the introduction to the problem has been covered, I would appreciate any of the following three things:
A discussion of better ways of achieving the desired results in OpenGL ES 2 (iOS).
A discussion about how Option 1 could be implemented properly, either by providing the "DestinationVertexPosition" or by modifying the idea somehow, to better achieve the same result.
A discussion about how Option 2 could be implemented.
In ES 2 you specify such attributes as you like — there's therefore no problem with specifying attributes for both origin and destination, and doing the linear interpolation between them in the vertex shader. However you really shouldn't do it component by component as your code suggests you want to as GPUs are vector processors, and the mix GLSL function will do the linear blend you want. So e.g. (with obvious inefficiencies and assumptions)
int sourceAttribute = glGetAttribLocation(shader, "sourceVertex");
glVertexAttribPointer(sourceAttribute, 3, GL_FLOAT, GL_FALSE, 0, sourceLocations);
int destAttribute = glGetAttribLocation(shader, "destVertex");
glVertexAttribPointer(destAttribute, 3, GL_FLOAT, GL_FALSE, 0, destLocations);
And:
gl_Position = vec4(mix(sourceVertex, destVertex, Time), 1.0);
Your two options here have a trade off: supply twice the geometry once and interpolate between that, or supply only one set of geometry, but do so for each frame. You have to weigh geometry size vs. upload bandwidth.
Given my experience with iOS devices, I'd highly recommend option 1. Uploading new geometry on every frame can be extremely expensive on these devices.
If the vertices are constant, you can upload them once into one or two vertex buffer objects (VBOs) with the GL_STATIC_DRAW flag set. The PowerVR SGX series has hardware optimizations for dealing with static VBOs, so they are very fast to work with after the initial upload.
As far as how to upload two sets of vertices for use in a single shader, geometry is just another input attribute for your shader. You could have one, two, or more sets of vertices fed into a single vertex shader. You just define the attributes using code like
attribute vec3 startingPosition;
attribute vec3 endingPosition;
and interpolate between them using code like
vec3 finalPosition = startingPosition * (1.0 - fractionalProgress) + endingPosition * fractionalProgress;
Edit: Tommy points out the mix() operation, which I'd forgotten about and is a better way to do the above vertex interpolation.
In order to inform your shader program as to where to get the second set of vertices, you'd use pretty much the same glVertexAttribPointer() call for the second set of geometry as the first, only pointing to that VBO and attribute.
Note that you can perform this calculation as a vector, rather than breaking out all three components individually. This doesn't get you much with a highp default precision on current PowerVR SGX chips, but could be faster on future ones than doing this one component at a time.
You might also want to look into other techniques used for vertex skinning, because there might be other ways of animating vertices that don't require two full sets of vertices to be uploaded.
The one case that I've heard where option 2 (uploading new geometry on each frame) might be preferable is in specific cases where using the Accelerate framework to do vector manipulation of the geometry ends up being faster than doing the skinning on-GPU. I remember the Unity folks were talking about this once, but I can't remember if it was for really small or really large sets of geometry. Option 1 has been faster in all the cases I've worked with myself.