In a single pass (a single draw cycle), what is the maximum number of textures that can be passed to the shader and drawn at once in Metal using a kernel? I tried to draw six square textures in one draw cycle. Where the textures overlap, the actual texture is not presented as expected and some glitches appear.
The Metal Shading Language includes a lot of mathematical functions, but it seems most of the code in the official Metal documentation just uses them to map vertices from pixel space to clip space, like
RasterizerData out;
out.clipSpacePosition = vector_float4(0.0, 0.0, 0.0, 1.0);
float2 pixelSpacePosition = vertices[vertexID].position.xy;
vector_float2 viewportSize = vector_float2(*viewportSizePointer);
out.clipSpacePosition.xy = pixelSpacePosition / (viewportSize / 2.0);
out.color = vertices[vertexID].color;
return out;
Apart from GPGPU, where kernel functions do parallel computation, what can a vertex function do? Some examples would help. In a game, if all vertex positions are calculated by the CPU, why does the GPU still matter? What does a vertex function usually do?
Vertex shaders compute properties for vertices. That's their point. In addition to vertex positions, they also calculate lighting normals at each vertex, and potentially texture coordinates and various material properties used by lighting and shading routines. Then, during rasterization, those values are interpolated across each primitive and sent to the fragment shader for each fragment.
In general, you don't modify vertices on the CPU. In a game, you'd usually load them from a file into main memory, put them into a buffer and send them to the GPU. Once they're on the GPU, you pass them to the vertex shader on each frame along with model, view, and projection matrices. A single buffer containing the vertices of, say, a tree or a car's wheel might be used multiple times. Each time, all the CPU sends is the model, view, and projection matrices. The model matrix is used in the vertex shader to reposition and scale the vertex positions in world space. The view matrix then moves and rotates the world around so that the virtual camera is at the origin and facing the appropriate way. Then the projection matrix transforms the vertices into clip space.
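To make the matrix chain concrete, here is a minimal CPU-side sketch in Python of what a vertex shader does for each vertex. It is only an illustration: the function names are made up, the matrices are plain row-major lists, and the view and projection matrices are left as identity to keep it short (a real projection would be a perspective or orthographic matrix).

```python
# CPU-side illustration of the per-vertex work a vertex shader does.
# All names are hypothetical; matrices are row-major 4x4 lists.

def mat_mul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(m, v):
    """Apply a 4x4 matrix to a 4-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def scaling(s):
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]

def vertex_shader(position, model, view, projection):
    """Object space -> world space -> camera space -> clip space."""
    mvp = mat_mul(projection, mat_mul(view, model))
    return transform(mvp, position + [1.0])

IDENTITY = scaling(1)  # assumption: identity view/projection for brevity

# The same vertex data drawn twice; only the model matrix changes.
wheel_vertex = [1.0, 0.0, 0.0]
front = vertex_shader(wheel_vertex, translation(0, 0, 2), IDENTITY, IDENTITY)
rear = vertex_shader(wheel_vertex,
                     mat_mul(translation(0, 0, -2), scaling(2)),
                     IDENTITY, IDENTITY)
```

The point is that the vertex data never changes between the two draws; only the small matrices do.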
There are other things a vertex shader can do, too. You can pass in vertices that are in a grid in the x-y plane, for example. Then in your vertex shader you can sample a texture and use that to generate the z-value. This gives you a way to change the geometry using a height map.
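As a rough sketch of the height map idea (in Python on the CPU, purely for illustration; in a real shader the lookup would be a texture sample, and `displace_grid` is a made-up name):

```python
def displace_grid(width, height, heightmap):
    """Build x-y grid vertices and take each z from a height map (list of rows)."""
    vertices = []
    for y in range(height):
        for x in range(width):
            z = heightmap[y][x]  # stands in for a texture sample in the shader
            vertices.append((float(x), float(y), float(z)))
    return vertices

heightmap = [[0.0, 0.5],
             [1.0, 0.25]]
grid = displace_grid(2, 2, heightmap)
```

Changing the height map changes the geometry without touching the grid itself.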
On older hardware (and some lower-end mobile hardware) it was expensive to do calculations on a texture coordinate before using it to sample from a texture, because you lose some cache coherency. For example, if you wanted to sample several pixels in a column, you might loop over them, adding an offset to the current texture coordinate and then sampling with the result. One trick was to do the calculation on the texture coordinates in the vertex shader, have them automatically interpolated before being sent to the fragment shader, and then do a normal look-up in the fragment shader. (I don't think this is an optimization on modern hardware, but it was a big win on some older models.)
First, I'll address this statement:
In a game, if all vertex positions are calculated by the CPU, why does the GPU still matter? What does a vertex function usually do?
I don't believe I've seen anyone calculate vertex positions on the CPU for meshes that will later be rendered on a GPU. It's slow: you would need to get all this data from the CPU to the GPU (which means copying it over a bus if you have a dedicated GPU). And it's just not that flexible. Many more things than vertex positions are required to produce a meaningful image, and calculating all of this on the CPU is wasteful, since the CPU doesn't need this data for the most part.
The sole purpose of a vertex shader is to provide the rasterizer with primitives in clip space. But there are some other uses, mostly tricks based on different GPU features.
For example, vertex shaders can write data out to buffers. With multi-pass rendering that uses the same geometry in more than one pass, you can stream out the transformed geometry so you don't have to transform it again in a later vertex stage.
You can also use a vertex shader to output just one triangle that covers the whole screen, so that the fragment shader gets called once per pixel for the whole screen (but, honestly, you are better off using compute (kernel) shaders for this).
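One common way to get that single triangle is to derive the clip-space position from the vertex index alone, with no vertex buffer. The sketch below shows the arithmetic in Python (a shader would do the same with the vertex ID; the function name is made up). The resulting triangle is larger than the screen, and the part outside the [-1, 1] square is clipped away.

```python
def fullscreen_triangle_vertex(vertex_id):
    """Clip-space (x, y) for vertex 0, 1, 2 of one oversized triangle."""
    x = float((vertex_id << 1) & 2) * 2.0 - 1.0
    y = float(vertex_id & 2) * 2.0 - 1.0
    return (x, y)

# (-1, -1), (3, -1), (-1, 3): the hypotenuse x + y = 2 passes through (1, 1),
# so the whole [-1, 1] x [-1, 1] screen square lies inside the triangle.
corners = [fullscreen_triangle_vertex(i) for i in range(3)]
```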
You can also write out data from vertex shader and not generate any primitives. You can do that by generating degenerate triangles. You can use this to generate bounding boxes. Using atomic operations you can update min/max positions and read them at a later stage. This is useful for light culling, frustum culling, tile-based processing and many other things.
But, and it's a BIG BUT, you can do most of this in a compute shader without making the GPU run the whole vertex assembly pipeline. That means you can do full-screen effects using just a compute shader (instead of a vertex and fragment shader plus the pipeline stages in between, such as the rasterizer, primitive culling, depth testing and output merging). You can also calculate bounding boxes and do light culling or frustum culling in a compute shader.
There are reasons to fire up the whole rendering pipeline instead of just running a compute shader, for example, if you will still use triangles that are output from vertex shader, or if you aren't sure how primitives are laid out in memory so you need vertex assembler to do the heavy lifting of assembling primitives. But, getting back to your point, almost all of the reasonable uses for vertex shader include outputting primitives in clip space. If you aren't using resulting primitives, it's probably best to stick to compute shaders.
I am working on an Android application that slims or fattens faces after detecting them. Currently, I have achieved that using the thin-plate spline algorithm.
http://ipwithopencv.blogspot.com.tr/2010/01/thin-plate-spline-example.html
The problem is that the algorithm is not fast enough for me, so I decided to move it to OpenGL. After some research, I see that a lookup table texture is the best option for this. I have a set of control points for the source image and their new positions for the warp effect.
How should I create a lookup table texture to get the warp effect?
Are you really sure you need a lookup texture?
It seems that it'd be better if you had a textured rectangular mesh (or a non-rectangular mesh, of course, as the face detection algorithm you have most likely returns a face-like mesh) and warped it according to the algorithm.
Not only would you be able to do that in a vertex shader, thus processing each mesh node in parallel, but there are also fewer values to process compared to dynamic texture generation.
The most compatible method to achieve that is to give each mesh point a Y coordinate of 0 and X coordinate where the mesh index would be stored, and then pass a texture (maybe even a buffer texture if target devices support it) to the vertex shader, where at the needed index the R and G channels contain the desired X and Y coordinates.
Inside the vertex shader, the coordinates are to be loaded from the texture.
This approach allows for dynamic warping without reloading geometry, if the target data texture is properly updated — for example, inside a pixel shader.
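A rough sketch of that lookup, written in Python on the CPU just to show the data flow (the names are made up, and a list of RGBA tuples stands in for the data texture):

```python
def warp_vertex(vertex, position_texture):
    """vertex = (index, 0); return the (x, y) stored in R and G at that index."""
    index = int(vertex[0])
    r, g = position_texture[index][:2]
    return (r, g)

# Static mesh: X carries the mesh index, Y is 0 (never re-uploaded).
mesh = [(0, 0), (1, 0), (2, 0)]
# Data "texture": one RGBA texel per mesh point; R, G hold the warped position.
position_texture = [(0.0, 0.0, 0.0, 0.0),
                    (0.4, 0.1, 0.0, 0.0),
                    (1.0, 0.2, 0.0, 0.0)]
warped = [warp_vertex(v, position_texture) for v in mesh]
```

Updating only `position_texture` changes the warp while the mesh stays as it is.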
The simple question is - is there any difference between gl.LINEAR_MIPMAP_NEAREST and gl.NEAREST_MIPMAP_LINEAR? I've used the first, with bad results (see below) and found the second on the web. Interestingly, both are defined (in Chrome), and I wonder what their difference is.
The real question is - If I have a texture atlas with transparency (containing glyphs), can I use mipmapping? When zooming to small sizes, the glyphs flicker, which I want to eliminate by mipmapping.
But when I turn on mipmapping (only changing the TEXTURE_MIN_FILTER from LINEAR to LINEAR_MIPMAP_NEAREST, and calling generateMipmap() afterwards), the transparency is completely gone and the entire texture turns black.
I understand that mipmapping may cause the black ink to bleed into the transparent area, but it should not fill the entire texture at all mipmap levels (including the original size).
What scrap of knowledge am I missing?
From the docs
GL_NEAREST
Returns the value of the texture element that is nearest (in Manhattan distance) to the center of the pixel being textured.
GL_LINEAR
Returns the weighted average of the four texture elements that are closest to the center of the pixel being textured.
GL_NEAREST_MIPMAP_NEAREST
Chooses the mipmap that most closely matches the size of the pixel being textured and uses the GL_NEAREST criterion (the texture element nearest to the center of the pixel) to produce a texture value.
GL_LINEAR_MIPMAP_NEAREST
Chooses the mipmap that most closely matches the size of the pixel being textured and uses the GL_LINEAR criterion (a weighted average of the four texture elements that are closest to the center of the pixel) to produce a texture value.
GL_NEAREST_MIPMAP_LINEAR
Chooses the two mipmaps that most closely match the size of the pixel being textured and uses the GL_NEAREST criterion (the texture element nearest to the center of the pixel) to produce a texture value from each mipmap. The final texture value is a weighted average of those two values.
GL_LINEAR_MIPMAP_LINEAR
Chooses the two mipmaps that most closely match the size of the pixel being textured and uses the GL_LINEAR criterion (a weighted average of the four texture elements that are closest to the center of the pixel) to produce a texture value from each mipmap. The final texture value is a weighted average of those two values.
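To see how the two names in the question differ, here is a small 1D sketch in Python. It is only an illustration under simplifying assumptions: the mip chain is a list of lists, the level of detail (`lod`) is passed in directly rather than derived from screen-space derivatives as a GPU would, and edge texels are clamped. All function names are made up.

```python
import math

def sample_nearest(level, u):
    """GL_NEAREST-style lookup in a 1D level; u is in [0, 1]."""
    return level[min(int(u * len(level)), len(level) - 1)]

def sample_linear(level, u):
    """GL_LINEAR-style lookup: blend the two closest texels (1D, clamped)."""
    x = u * len(level) - 0.5  # texel centers sit at i + 0.5
    lo = math.floor(x)
    t = x - lo
    i0 = min(max(lo, 0), len(level) - 1)
    i1 = min(max(lo + 1, 0), len(level) - 1)
    return level[i0] * (1.0 - t) + level[i1] * t

def sample_mipmapped(mips, u, lod, within, between):
    """Mimics GL_<within>_MIPMAP_<between>; each is 'nearest' or 'linear'."""
    fetch = sample_linear if within == "linear" else sample_nearest
    lod = min(max(lod, 0.0), len(mips) - 1)
    if between == "nearest":
        return fetch(mips[int(lod + 0.5)], u)  # one mip level only
    lo = int(lod)
    hi = min(lo + 1, len(mips) - 1)
    t = lod - lo
    return fetch(mips[lo], u) * (1.0 - t) + fetch(mips[hi], u) * t

mips = [[0.0, 4.0, 8.0, 12.0], [2.0, 10.0], [6.0]]
linear_mipmap_nearest = sample_mipmapped(mips, 0.5, 0.75, "linear", "nearest")
nearest_mipmap_linear = sample_mipmapped(mips, 0.5, 0.75, "nearest", "linear")
```

So the first word describes filtering inside a mip level, and the second describes how mip levels are chosen or blended.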
As for why your stuff turns black, have you checked the JavaScript console for errors? The most likely reason is that your texture is not a power of 2 in both dimensions. If that's the case, trying to use mips by switching from gl.LINEAR to gl.LINEAR_MIPMAP_NEAREST will not work, because in WebGL 1 mips are not supported for textures that are not a power of 2 in both dimensions.
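The check itself is simple. Here it is sketched in Python (the function names are made up; the same bit trick works in JavaScript):

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...: exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0

def can_use_mipmaps_webgl1(width, height):
    """Assumes the WebGL 1 rule: both dimensions must be powers of two."""
    return is_power_of_two(width) and is_power_of_two(height)
```

A 512x256 atlas passes this check; a 640x480 one does not, so it would need to be resized or padded before calling generateMipmap().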
Could someone explain the math behind the function Tex2D in HLSL?
Here is one example: given a quad with 4 vertices, the texture coordinates on it are (0,0) (0,1) (1,0) (1,1), and the texture's width and height are 640 and 480. How does the shader determine how many times to sample? If it maps texels to pixels directly, does that mean the shader needs to sample 640*480 times, with the texture coordinates increasing along some kind of gradient? Also, I would appreciate more references and articles on this topic.
Thanks.
After the vertex shader, the rasterizer "converts" triangles into pixels. Each pixel is associated with a screen position, and the vertex attributes of the triangles (e.g. texture coordinates) are interpolated across the triangles, so that an interpolated value is stored in each pixel according to the pixel position.
The pixel shader runs once per pixel (in most cases).
The number of times the texture is sampled per pixel depends on the sampler used. If you use a point sampler the texture is sampled once, 4 times if you use a bilinear sampler and a few more if you use more complex samplers.
So if you're drawing a fullscreen quad, the texture you're sampling is the same size as the render target, and you're using a point sampler, the texture will be sampled width*height times (once per pixel).
You can think of a texture as a 2-dimensional array of texels. tex2D simply returns the texel at the requested position, performing some kind of interpolation depending on the sampler used (texture coordinates are usually relative to the texture size, so the hardware converts them to absolute coordinates).
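As a rough model of what a point sampler and a bilinear sampler do, here is a sketch in Python. It is not how tex2D is implemented in hardware; it assumes texel centers at half-integer positions, clamp addressing, and a single-channel texture stored as a list of rows. The function names are made up.

```python
import math

def tex2d_point(texture, u, v):
    """Point sampling: one texel fetch; (u, v) are in [0, 1]."""
    h, w = len(texture), len(texture[0])
    x = min(int(u * w), w - 1)  # relative -> absolute coordinates
    y = min(int(v * h), h - 1)
    return texture[y][x]

def tex2d_bilinear(texture, u, v):
    """Bilinear sampling: four texel fetches blended by distance."""
    h, w = len(texture), len(texture[0])
    x = u * w - 0.5
    y = v * h - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def texel(ix, iy):  # clamp addressing at the borders
        return texture[min(max(iy, 0), h - 1)][min(max(ix, 0), w - 1)]

    top = texel(x0, y0) * (1 - fx) + texel(x0 + 1, y0) * fx
    bottom = texel(x0, y0 + 1) * (1 - fx) + texel(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy

texture = [[0.0, 1.0],
           [2.0, 3.0]]
point = tex2d_point(texture, 0.75, 0.25)
blended = tex2d_bilinear(texture, 0.5, 0.5)
```

The shader itself only asks for one value per tex2D call; the sampler decides how many texels are read to produce it.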
This link might be useful: Rasterization
Just as a quick example, I'm trying to do the following:
[image 1] + [image 2] + [alpha map] = [blended result]
With the third image as an alpha map, how could this be implemented in a DX9-compatible pixel shader to "blend" between the first two images, creating an effect similar to the fourth image?
Furthermore, how could this newly created texture be given back to the CPU, where it could be placed back inside the original array of textures?
A rough way is to blend the colors of the textures with the alpha map and return the result from the pixel shader:
float alpha = tex2D(AlphaSampler,TexCoord).r;
float3 texture1 = tex2D(Texture1Sampler,TexCoord).rgb;
float3 texture2 = tex2D(Texture2Sampler,TexCoord).rgb;
float3 color = lerp(texture1,texture2,alpha);
return float4(color.rgb,1);
For this you need a texture as the render target (doc), with the size of the input textures, and a fullscreen quad as geometry for rendering; an xyzrhw quad would be the easiest. You can then use this texture for further rendering. If you want to read the texels, or do anything else where you must lock the result, you could work with StretchRect (doc) or UpdateSurface (doc) to copy the data into a normal texture.
If performance isn't important (e.g. you preprocess the textures), it may be easier to compute this on the CPU (but it's slower). Lock the 4 textures, iterate over the pixels and merge them directly.
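The CPU variant could look roughly like this. It is a Python sketch of the per-pixel loop only; locking and unlocking the D3D9 textures is left out, textures are assumed to be rows of (r, g, b) float tuples, and the names are made up. `lerp` follows the HLSL definition, x + s * (y - x).

```python
def lerp(a, b, t):
    """Same formula as HLSL lerp(a, b, t)."""
    return a + (b - a) * t

def blend_textures(tex1, tex2, alpha_map):
    """Blend two RGB textures per pixel; alpha_map holds floats in [0, 1]."""
    result = []
    for row1, row2, row_a in zip(tex1, tex2, alpha_map):
        result.append([
            tuple(lerp(c1, c2, a) for c1, c2 in zip(p1, p2)) + (1.0,)
            for p1, p2, a in zip(row1, row2, row_a)
        ])
    return result

red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)
blended_rows = blend_textures([[red, red]], [[blue, blue]], [[0.0, 0.5]])
```

An alpha of 0 keeps the first texture, 1 gives the second, and values in between mix the two, which matches the shader snippet above.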