I have been testing WebGL to see whether I can batch-draw polygons in a particular way. I am going to simplify the use case, but it goes something along the lines of the following:
First, my vertices are simply:
vertices[v0_xyz, v1_xyz, ... vn_xyz]
In my case, each vertex must have a z value in the range (0 - 100) (I pick 100 arbitrarily) because I want all of those vertices to be depth tested against each other using those z values. On batch N + 1, I am limited to depth values (0 - 100) again, but I need the vertices in this batch to be guaranteed to be drawn atop all previous batches (layers of vertices). In other words, vertices within each batch are depth tested against each other, but each batch is just drawn atop the previous one as if there were no depth testing.
At first I was going to try drawing to a texture with a framebuffer and depthbuffer attachment, draw to the canvas, repeat for the next group of vertices, but I realized that I might be able to do just this:
// sketch (WebGL); drawBatch is a placeholder for the actual draw calls of one batch
function drawBuffers(gl, vertexBatches) {
  // clear both the color and the depth
  gl.clearDepth(1.0);
  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);
  // iterate over all vertex batches
  for (const vertexBatch of vertexBatches) {
    // draw the batch with depth testing
    drawBatch(gl, vertexBatch);
    // clear the depth buffer
    /* QUESTION: does this guarantee that subsequent batches
       will be drawn atop previous batches, or will the pixels be written at
       random (sometimes underneath, sometimes above)?
    */
    gl.clearDepth(1.0);
    gl.clear(gl.DEPTH_BUFFER_BIT);
  }
}
I tested the above by drawing two overlapping quads, clearing the depth buffer, translating left and in negative z (in an attempt to "go under" the previous batch), and drawing the two overlapping quads again. I think that this works because I see that the second pair of quads is drawn in front of the first pair even though their z values are behind the previous pair's z values.
I am not certain that my test is reliable though. Could there be some undefined behavior involved? Is it just a coincidence that my test works as a result of the clearDepth setting and shapes?
May I have clarification so I can confirm whether my method will work for sure?
Thank you.
Since WebGL is based on OpenGL ES, see the OpenGL ES 1.1 Full Specification, 4.1.6 Depth Buffer Test, page 104:
The depth buffer test discards the incoming fragment if a depth comparison fails.
....
The comparison is specified with
void DepthFunc( enum func );
This command takes a single symbolic constant: one of NEVER, ALWAYS, LESS, LEQUAL, EQUAL, GREATER, GEQUAL, NOTEQUAL. Accordingly, the depth buffer test passes never, always, if the incoming fragment’s z_w value is less than, less than or equal to, equal to, greater than, greater than or equal to, or not equal to the depth value stored at the location given by the incoming fragment’s (x_w, y_w) coordinates.
This means that if the clear value for the depth buffer (set with glClearDepth) is 1.0 (1.0 is the initial value)
gl.clearDepth(1.0);
and the depth buffer is cleared
gl.clear(gl.DEPTH_BUFFER_BIT);
and the depth function glDepthFunc is LESS or LEQUAL (LESS is the initial value)
gl.enable(gl.DEPTH_TEST);
gl.depthFunc(gl.LEQUAL);
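then the first fragment drawn to any (x_w, y_w) coordinates after the clear will pass the depth test and will overwrite the color stored at that location, regardless of what earlier batches drew there.
(Of course gl.BLEND has to be disabled and the fragment has to lie inside the clip volume. Note that with LESS, a fragment whose depth is exactly 1.0 fails against the cleared value; with LEQUAL it passes.)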
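To make the effect of the per-batch depth clear concrete, here is a minimal software model of it. It is only a sketch under stated assumptions: a 1-D "framebuffer", a LESS depth test, depths already normalized to [0, 1), and hypothetical names (draw_batches, batch_a, batch_b) that are not part of WebGL.

```python
def draw_batches(batches, width, clear_between=True):
    """Toy model of a 1-D framebuffer with depthFunc(LESS)."""
    color = [None] * width
    depth = [1.0] * width              # clearDepth(1.0) + clear(DEPTH_BUFFER_BIT)
    for batch in batches:
        for x, z, c in batch:          # fragment: pixel x, depth z, color c
            if z < depth[x]:           # LESS comparison against the stored depth
                depth[x] = z
                color[x] = c
        if clear_between:
            depth = [1.0] * width      # clear only the depth; colors are kept
    return color


# Batch A has two fragments on pixel 0; batch B is farther away than both.
batch_a = [(0, 0.2, "A-near"), (0, 0.6, "A-far"), (1, 0.2, "A-near")]
batch_b = [(0, 0.9, "B")]

with_clear = draw_batches([batch_a, batch_b], width=2)
without_clear = draw_batches([batch_a, batch_b], width=2, clear_between=False)
print(with_clear)
print(without_clear)
```

Within batch A the nearer fragment wins, and with the clear in place batch B covers pixel 0 even though its depth is larger. Without the clear, batch B is rejected by the depth test.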
I'm using a Metal shader to draw many particles onto the screen. Each particle has its own position (which can change) and often two particles have the same position. How can I check whether the texture2d I write into already has a pixel at a certain position? (I want to make sure that I only draw a particle at a certain position if no particle has been drawn there yet, because I get an ugly flickering if many particles are drawn at the same position.)
I've tried outTexture.read(particlePosition), but this obviously doesn't work, because of the texture access qualifier, which is access::write.
Is there a way I can have read and write access to a texture2d at the same time? (If there isn't, how could I still solve my problem?)
There are several approaches that could work here. In concurrent systems programming, what you're talking about is termed first-write wins.
1) If the particles only need to preclude other particles from being drawn (and aren't potentially obscured by other elements in the scene in the same render pass), you can write a special value to the depth buffer to signify that a fragment has already been written to a particular coordinate. For example, you'd turn on depth test (using the depth compare function Less), clear the depth buffer to some distant value (like 1.0), and then write a value of 0.0 to the depth buffer in the fragment function. The first fragment at a pixel passes (0.0 is less than 1.0); any subsequent write to that pixel will fail to pass the depth test (0.0 is not less than 0.0) and will not be drawn.
2) Use framebuffer read-back. On iOS, Metal allows you to read from the currently-bound primary renderbuffer by attributing a parameter to your fragment function with [[color(0)]]. This parameter will contain the current color value in the renderbuffer, which you can test against to determine whether it has been written to. This does require you to clear the texture to a predetermined color that will never otherwise be produced by your fragment function, so it is more limited than the above approach, and possibly less performant.
All of the above applies whether you're rendering to a drawable's texture for direct presentation to the screen, or to some offscreen texture.
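The depth-buffer variant of first-write wins can be modeled in a few lines. This is a sketch, not Metal code: it assumes a Less comparison (so the first 0.0 passes against the cleared 1.0 and later ones fail), a 1-D render target, and a hypothetical helper name render_particles.

```python
def render_particles(particles, width):
    """Toy model: each particle writes depth 0.0; Less lets only the first one through."""
    color = ["clear"] * width
    depth = [1.0] * width                  # depth buffer cleared to the far value
    for x, c in particles:
        fragment_depth = 0.0               # value written by the fragment function
        if fragment_depth < depth[x]:      # depth compare function Less
            depth[x] = fragment_depth
            color[x] = c
    return color


# Three particles land on pixel 1; only the first one ("red") is kept.
result = render_particles(
    [(1, "red"), (1, "blue"), (2, "green"), (1, "white")], width=4
)
print(result)
```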
To answer the read and write part: you can specify read/write access for the output texture as such:
texture2d<float, access::read_write> outTexture [[texture(1)]],
Also, your texture descriptor must specify usage:
textureDescriptor?.usage = [.shaderRead, .shaderWrite]
In iOS or OS X, what texture coordinates are used in a Metal Shading Language kernel function? For example, given an MTLTexture and uint2 gid [[thread_position_in_grid]], are gid.x and gid.y between 0..1 (x and y are floats) or 0..inTexture.get_width() (x and y are integers)?
Thanks in Advance
thread_position_in_grid is an index (an integer) in the grid that takes values in the ranges you specify in dispatchThreadgroups:threadsPerThreadgroup:. It's up to you to decide how many thread groups you want, and how many threads per group.
In the following sample code you can see that threadsPerGroup.width * numThreadgroups.width == inputImage.width and threadsPerGroup.height * numThreadgroups.height == inputImage.height. In this case, a position in the grid will thus be a non-normalized (integer) pixel coordinate.
Each launch of a compute shader in Metal is accompanied by a dense rectangular 3D grid of thread IDs. The dimensions of the grid are set when you call [MTLComputeCommandEncoder dispatchThreadgroups:threadsPerThreadgroup:]. You can for example have a threadgroup size of {16,16,1} (256 threads in a threadgroup as a 16x16x1 square), and a threadgroup count of {1,2,1}, which will cause two threadgroups to be launched with a total of 512 threads in the shape {16,32,1}. These are the integers that appear at the top of your kernel as [[thread_position_in_grid]]. The thread position is the way that you tell which thread you are, just like the threadID parameter passed to a block by dispatch_apply().
Metal specifies no mapping from [[thread_position_in_grid]] to coordinates in a texture. This is done by you in software in your compute shader. If you want to read every other pixel in a region of a texture at some offset in the image, then you need to multiply the threadID by two and add an offset in your kernel before passing the new coordinate to texture2d.sample. Since Metal cannot launch partial threadgroups, it is up to you to make sure that unneeded threads do no harm. For example, when applied to a smaller texture, the full size of your 32x64 launch might cause you to write off the end of your texture. In this case you must check the threadID to see if the thread will write off the end and then either return out of the shader or skip over the texture write call for that thread to avoid the problem.
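The sizing and bounds-check logic described above can be sketched outside of Metal. The snippet below is an illustration only, with hypothetical helper names (threadgroup_count, every_other_pixel, run_kernel): it rounds the threadgroup count up so the grid covers the texture, maps a thread ID to "every other pixel plus an offset", and skips threads that fall outside the texture.

```python
def threadgroup_count(texture_size, threads_per_group):
    """Ceiling division per dimension so the grid covers the whole texture."""
    return tuple((t + g - 1) // g for t, g in zip(texture_size, threads_per_group))


def every_other_pixel(gid, offset):
    """Example of a software mapping from thread ID to a texel coordinate."""
    return (gid[0] * 2 + offset[0], gid[1] * 2 + offset[1])


def run_kernel(texture_size, threads_per_group):
    """Visit every thread in the grid; count writes and out-of-bounds skips."""
    width, height = texture_size
    groups = threadgroup_count(texture_size, threads_per_group)
    grid = (groups[0] * threads_per_group[0], groups[1] * threads_per_group[1])
    written, skipped = 0, 0
    for gy in range(grid[1]):
        for gx in range(grid[0]):
            if gx >= width or gy >= height:   # thread lies off the texture: do nothing
                skipped += 1
                continue
            written += 1                      # stands in for the texture write
    return groups, grid, written, skipped


groups, grid, written, skipped = run_kernel((20, 40), (16, 16))
print(groups, grid, written, skipped)
print(every_other_pixel((3, 5), (10, 20)))
```

For a 20x40 texture and 16x16 threadgroups, the grid is larger than the texture, so the early-out is what keeps the extra threads from writing past the edge.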
thread_position_in_grid is always made of unsigned integers, and provides these options:
16- or 32-bit
1D, 2D, or 3D
None of these options are related to texture coordinates. It may be helpful to ask another, related question, because you seem to be conflating the idea of textures and kernel functions.
I have a vertex buffer with an unordered access view, which I'm using to fill the vertices using a compute shader, which treats the UAV as a RWStructuredBuffer, using an equivalent struct to the vertex definition. There are 216000 vertices (i.e. 60 x 60 x 60). But my compute shader seems to fill only about 8000 of them, leaving the rest with their initial values. Is there a limit on the number of elements in a structured buffer that can be written in this way?
As it turns out, if you turn on DirectX error-checking, assigning the UAV of a vertex buffer as a RWStructuredBuffer in the shader is considered to be an error. So although this actually works (for a limited number of vertices), it's not supported.
I'm using SharpDX and I want to do antialiasing in the Depth buffer. I need to store the Depth Buffer as a texture to use it later. So is it a good idea if this texture is a Texture2DMS? Or should I take another approach?
What I really want to achieve is:
1) Depth buffer scaling
2) Depth test supersampling
(terms I found in section 3.2 of this paper: http://gfx.cs.princeton.edu/pubs/Cole_2010_TFM/cole_tfm_preprint.pdf)
The paper calls for a depth pre-pass. Since this pass requires no color, you should leave the render target unbound, and use an "empty" pixel shader. For depth, you should create a Texture2D (not MS) at 2x or 4x (or some other 2Nx) the width and height of the final render target that you're going to use. This isn't really "supersampling" (since the pre-pass is an independent phase with no actual pixel output) but it's similar.
For the second phase, the paper calls for doing multiple samples of the high-resolution depth buffer from the pre-pass. If you followed the sizing above, every pixel will correspond to some (2N)^2 depth values. You'll need to read these values and average them. Fortunately, there's a hardware-accelerated way to do this (called PCF) using SampleCmp with a COMPARISON sampler type. This samples a 2x2 stamp, compares each value to a specified value (pass in the second-phase calculated depth here, and don't forget to add some epsilon value (e.g. 1e-5)), and returns the averaged result. Do 2x2 stamps to cover the entire area of the first-phase depth buffer associated with this pixel, and average the results. The final result represents how much of the current line's spine corresponds to the foremost depth of the pre-pass. Because of the PCF's smooth filtering behavior, as lines become visible, they will slowly fade in, as opposed to the aliased "dotted" line effect described in the paper.
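The averaging step of the second phase can be illustrated on the CPU. This is a rough sketch, not HLSL: it assumes equal weights inside each 2x2 stamp (hardware PCF may weight samples bilinearly), a less-or-equal comparison with the epsilon applied as a tolerance on the stored depth, and a hypothetical function name visibility.

```python
def visibility(depth_block, fragment_depth, epsilon=1e-5):
    """Average of per-stamp comparison results over a square block of depths."""
    size = len(depth_block)
    stamp_results = []
    for y in range(0, size, 2):
        for x in range(0, size, 2):
            # One 2x2 stamp: compare each stored depth with the fragment depth.
            stamp = [depth_block[y + dy][x + dx] for dy in (0, 1) for dx in (0, 1)]
            passed = sum(1 for d in stamp if fragment_depth <= d + epsilon)
            stamp_results.append(passed / 4)
    return sum(stamp_results) / len(stamp_results)


# 4x4 block of pre-pass depths for one final pixel (2N = 4);
# the lower-right stamp is covered by something nearer (0.3).
block = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, 0.3, 0.3],
    [0.5, 0.5, 0.3, 0.3],
]
print(visibility(block, 0.5))
print(visibility(block, 0.2))
print(visibility(block, 0.9))
```

A fragment at depth 0.5 is visible in three of the four stamps, so it gets a partial value rather than a hard on/off result, which is what produces the gradual fade.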
I have found the distance to the intersection point with the function 'D3DXIntersectTri'.
Now, using the distance value, how can I find the coordinates of that point?
IDE: Delphi - JEDI
Language: Pascal
DirectX 9
EDIT:
Actually I have 2 cylinders and want to render only the intersecting part in 3 dimensions.
As explained in the MSDN article, you can calculate the point with the barycentric coordinates:
p = p1 + pU * (p2 - p1) + pV * (p3 - p1)
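As a worked example of the formula, here is a small sketch with made-up triangle and ray values. The function names are hypothetical. The second function shows the alternative route through the distance, assuming the distance is measured along the same direction vector that was used for the intersection test.

```python
def point_from_barycentric(p1, p2, p3, u, v):
    """p = p1 + u * (p2 - p1) + v * (p3 - p1), applied per component."""
    return tuple(a + u * (b - a) + v * (c - a) for a, b, c in zip(p1, p2, p3))


def point_from_distance(origin, direction, dist):
    """p = origin + dist * direction, applied per component."""
    return tuple(o + dist * d for o, d in zip(origin, direction))


# Triangle in the z = 0 plane, hit by a ray travelling straight down the z axis.
p1, p2, p3 = (0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)
hit_a = point_from_barycentric(p1, p2, p3, u=0.25, v=0.5)
hit_b = point_from_distance((1.0, 2.0, 5.0), (0.0, 0.0, -1.0), 5.0)
print(hit_a, hit_b)
```

Both routes give the same point for consistent inputs, which can serve as a sanity check on the values returned by the intersection call.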
Rendering to certain parts of the screen is the task of the stencil buffer. Unless you want to create a new vertex buffer from the intersection (which could be created by clipping parts away, which is not that easy), using the stencil buffer is more efficient.
The stencil buffer is a buffer that holds integer values. You have to create it with the depth buffer, specifying the correct format (e.g. D24S8). You can then specify when pixels are discarded. Here is the idea:
Clear stencil buffer to 0
Enable solid rendering
Enable stencil buffer
Set blend states to not draw anything (Source: 0, Destination: 1)
Disable depth testing, enable backface culling
Set the following stencil states:
CompareFunc to Always
StencilRef to 1
StencilWriteMask to 255
StencilFail to Replace
StencilPass to Replace
//this will set value 1 to every pixel that will be drawn
Draw the first cylinder
Now set the following stencil states:
CompareFunc to Equal
StencilFail to Keep //this keeps the value where the stencil test fails
StencilPass to Increment //this increments the value to 2 where stencil test passes
Draw the second cylinder
//Now there is a 2 in the stencil buffer where the cylinders intersect
Reset blend states
Reenable depth testing
Set StencilRef to 2 //render only pixels where stencil value == 2
Draw both cylinders
You might need to change the compare function to GreaterEqual before the last render pass. If pixels overlap, there can be values greater than two.
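The stencil passes above can be traced with a tiny software model. It is a sketch only: the "screen" is a single row of pixels, each cylinder is reduced to the set of pixels it covers (each pixel drawn once), and the function name stencil_intersection is hypothetical.

```python
def stencil_intersection(width, first, second):
    """Model the three stencil passes on a 1-D row of pixels."""
    stencil = [0] * width                  # clear stencil buffer to 0

    # Pass 1: CompareFunc Always, StencilRef 1, StencilPass Replace.
    for x in first:
        stencil[x] = 1

    # Pass 2: CompareFunc Equal (ref 1), StencilFail Keep, StencilPass Increment.
    for x in second:
        if stencil[x] == 1:
            stencil[x] += 1

    # Pass 3: StencilRef 2, CompareFunc Equal; draw both cylinders,
    # but only pixels whose stencil value is 2 receive color.
    covered = set(first) | set(second)
    visible = [x for x in range(width) if x in covered and stencil[x] == 2]
    return stencil, visible


stencil, visible = stencil_intersection(10, first=range(2, 6), second=range(4, 9))
print(stencil)
print(visible)
```

Only the pixels covered by both shapes end up with the value 2, so the final pass colors just that overlap.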