Does an MTLBlitCommandEncoder perform linear sampling? - ios

If I wanted to copy a low-resolution texture onto a higher-resolution texture and I used a blit command encoder, would it perform linear sampling on the texture while stretching it?

Blit command encoders can't do that at all, let alone perform interpolation while doing so. All of the copy methods take only a single size parameter, which is the size in both the source and, implicitly, the destination of the region to copy. It can't resize.
To do what you want, you need to use a render command encoder and draw a quad which samples from the source texture and uses the destination texture as the render target (color attachment). At that point, you have control over sampling/interpolation via the fragment shader and the sampler object you use to sample from the source when determining the color of a fragment.
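For illustration, the fragment side of such a pass might look like the sketch below (Metal Shading Language; the struct, function, and binding names are placeholders rather than code from the answer). The sampler's filter mode is what gives you the interpolation you'd otherwise expect from a stretching blit.
#include <metal_stdlib>
using namespace metal;

// Assumes a vertex stage that outputs a quad covering the destination,
// with texture coordinates interpolated over the source texture.
struct BlitVertexOut {
    float4 position [[position]];
    float2 texCoord;
};

fragment float4 upscaleFragment(BlitVertexOut in           [[stage_in]],
                                texture2d<float> sourceTex [[texture(0)]])
{
    // filter::linear requests bilinear filtering as the low-resolution
    // source is stretched over the larger destination.
    constexpr sampler linearSampler(coord::normalized,
                                    address::clamp_to_edge,
                                    filter::linear);
    return sourceTex.sample(linearSampler, in.texCoord);
}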

Related

Copy Metal frame buffer to MTLTexture with different Pixel Format

I need to grab the screen pixels into a texture to perform post-processing.
Previously, I have been using a blit command encoder to copy from texture to texture, with the MTLDrawable texture as the source and my own texture as the destination. They both have the same MTLPixelFormatBGRA8Unorm format, so everything works just fine.
However, I now need to use a framebuffer color attachment texture of MTLPixelFormatRGBA16Float for HDR rendering. So when I grab the screen pixels, I am actually grabbing from this color attachment texture instead of the drawable texture, and I am getting this error:
[MTLDebugBlitCommandEncoder internalValidateCopyFromTexture:sourceSlice:sourceLevel:sourceOrigin:sourceSize:toTexture:destinationSlice:destinationLevel:destinationOrigin:options:]:447: failed assertion [sourceTexture pixelFormat](MTLPixelFormatRGBA16Float) must equal [destinationTexture pixelFormat](MTLPixelFormatBGRA8Unorm)
I don't think I should need to change my destination texture to the RGBA16Float format, because that would take up double the memory. One full-screen texture (the color attachment) in that format should be enough for HDR to work, right?
Is there another method to successfully perform this kind of copy?
In OpenGL there is no error when copying with glCopyTexImage2D.
Metal automatically converts from the source format to the destination format during rendering, so you could just do a no-op rendering pass to perform the conversion.
Alternatively, if you want to avoid boilerplate no-op rendering code, you can use the MPSImageConversion performance shader, which does essentially the same thing.
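If you go the no-op render pass route, the fragment function can be a plain pass-through. A sketch (names are illustrative) might look like the following; the RGBA16Float-to-BGRA8Unorm conversion happens automatically when the returned value is stored to the destination attachment, with values outside [0, 1] clamped for a Unorm format.
#include <metal_stdlib>
using namespace metal;

struct PassThroughVertexOut {
    float4 position [[position]];
    float2 texCoord;
};

// Samples the RGBA16Float source; Metal converts the result to the
// destination attachment's pixel format (e.g. BGRA8Unorm) on store.
fragment float4 passThroughFragment(PassThroughVertexOut in    [[stage_in]],
                                    texture2d<float> hdrSource [[texture(0)]])
{
    constexpr sampler nearestSampler(coord::normalized, filter::nearest);
    return hdrSource.sample(nearestSampler, in.texCoord);
}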

Write pixel data to certain mipmap level of texture2d

As you might know, the Metal Shading Language offers a few ways to read pixel data from a texture2d in a kernel function: either a simple read(short2 coord) or sample(float2 coord, [various additional parameters]). But I noticed that when it comes to writing something into a texture, there is only the write method.
The problem is that the sample method allows sampling from a specific mipmap level, which is very convenient: the developer just needs to create a sampler with a mipFilter and use normalized coordinates.
But what if I want to write into a specific mipmap level of the texture? The write method doesn't have a mipmap parameter the way the sample method does, and I cannot find any alternative.
I'm pretty sure there should be a way to choose the mipmap level when writing data to a texture, because the Metal Performance Shaders framework includes kernels that populate the mipmaps of textures.
Thanks in advance!
You can do this with texture views.
The purpose of texture views is to reinterpret the contents of a base texture by selecting a subset of its levels and slices and potentially reading/writing its pixels in a different (but compatible) pixel format.
The -newTextureViewWithPixelFormat:textureType:levels:slices: method on the MTLTexture protocol returns a new instance of id<MTLTexture> that has the first level specified in the levels range as its base mip level. By creating one view per mip level you wish to write to, you can "target" each level in the original texture.
For example, to create a texture view on the second mip level of a 2D texture, you might call the method like this:
id<MTLTexture> viewTexture =
    [baseTexture newTextureViewWithPixelFormat:baseTexture.pixelFormat
                                   textureType:baseTexture.textureType
                                        levels:NSMakeRange(1, 1)
                                        slices:NSMakeRange(0, 1)];
When binding this new texture as an argument, its mip level 0 will correspond to mip level 1 of its base texture. You can therefore use the ordinary texture write function in a shader to write to the selected mip level:
myShaderTexture.write(color, coords);
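For example, a compute kernel that is handed such a view as its output texture (the kernel and argument names here are only illustrative) writes into mip level 1 of the base texture while using coordinates relative to the view's own level 0:
#include <metal_stdlib>
using namespace metal;

kernel void fillSelectedMip(texture2d<float, access::write> mipView [[texture(0)]],
                            uint2 gid [[thread_position_in_grid]])
{
    // Guard against threads outside the (smaller) mip level's dimensions.
    if (gid.x >= mipView.get_width() || gid.y >= mipView.get_height()) {
        return;
    }
    // Writes land in whichever base-texture level the view was created on.
    mipView.write(float4(1.0, 0.0, 0.0, 1.0), gid);
}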

What can a vertex function do except for mapping to clip space?

The Metal Shading Language includes a lot of mathematical functions, but it seems most of the code in the official Metal documentation just uses the vertex function to map vertices from pixel space to clip space, like this:
RasterizerData out;
out.clipSpacePosition = vector_float4(0.0, 0.0, 0.0, 1.0);
float2 pixelSpacePosition = vertices[vertexID].position.xy;
vector_float2 viewportSize = vector_float2(*viewportSizePointer);
out.clipSpacePosition.xy = pixelSpacePosition / (viewportSize / 2.0);
out.color = vertices[vertexID].color;
return out;
Besides GPGPU work that uses kernel functions for parallel computation, what can a vertex function do? Could you give some examples? In a game, if all vertex positions are calculated by the CPU, why does the GPU still matter? What does a vertex function usually do?
Vertex shaders compute properties for vertices. That's their point. In addition to vertex positions, they also calculate lighting normals at each vertex. And potentially texture coordinates. And various material properties used by lighting and shading routines. Then, in the fragment processing stage, those values are interpolated and sent to the fragment shader for each fragment.
In general, you don't modify vertices on the CPU. In a game, you'd usually load them from a file into main memory, put them into a buffer and send them to the GPU. Once they're on the GPU, you pass them to the vertex shader on each frame along with model, view, and projection matrices. A single buffer containing the vertices of, say, a tree or a car's wheel might be used multiple times. Each time, all the CPU sends is the model, view, and projection matrices. The model matrix is used in the vertex shader to reposition and scale the vertices' positions in world space. The view matrix then moves and rotates the world around so that the virtual camera is at the origin and facing the appropriate way. Then the projection matrix modifies the vertices to put them into clip space.
There are other things a vertex shader can do, too. You can pass in vertices that are in a grid in the x-y plane, for example. Then in your vertex shader you can sample a texture and use that to generate the z-value. This gives you a way to change the geometry using a height map.
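As a sketch of that height-map idea (the struct, uniform, and function names below are made up for illustration), a vertex function might displace a flat grid like this; note that sampling a texture in a vertex function requires an explicit LOD:
#include <metal_stdlib>
using namespace metal;

struct GridVertexIn {
    float3 position [[attribute(0)]];
    float2 texCoord [[attribute(1)]];
};

struct GridVertexOut {
    float4 position [[position]];
    float2 texCoord;
};

struct Uniforms {
    float4x4 modelViewProjection;
    float heightScale;
};

vertex GridVertexOut heightMapVertex(GridVertexIn in             [[stage_in]],
                                     constant Uniforms &uniforms [[buffer(1)]],
                                     texture2d<float> heightMap  [[texture(0)]])
{
    constexpr sampler heightSampler(coord::normalized, address::clamp_to_edge, filter::linear);
    // Sample the height map in the vertex stage (explicit level 0) and use it as z.
    float height = heightMap.sample(heightSampler, in.texCoord, level(0)).r;

    GridVertexOut out;
    float4 displaced = float4(in.position.xy, height * uniforms.heightScale, 1.0);
    out.position = uniforms.modelViewProjection * displaced;
    out.texCoord = in.texCoord;
    return out;
}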
On older hardware (and some lower-end mobile hardware) it was expensive to do calculations on a texture coordinate before using it to sample from a texture because you lose some cache coherency. For example, if you wanted to sample several pixels in a column, you might loop over them adding an offset to the current texture coordinate and then sampling with the result. One trick was to do the calculation on the texture coordinates in the vertex shader and have them automatically interpolated before being sent to the fragment shader, then doing a normal look-up in the fragment shader. (I don't think this is an optimization on modern hardware, but it was a big win on some older models.)
First, I'll address this statement
In a game, if all vertices positions are calculated by the CPU, why GPU still matters? What does vertex function do usually?
I don't believe I've seen anyone calculating on the CPU the positions of meshes that will later be rendered on a GPU. It's slow: you would need to get all that data from the CPU to the GPU (which means copying it across a bus if you have a dedicated GPU), and it's just not that flexible. There is much more than vertex positions required to produce any meaningful image, and calculating all of it on the CPU is wasteful, since the CPU doesn't need this data for the most part.
The primary purpose of a vertex shader is to provide the rasterizer with primitives that are in clip space. But there are some other uses, mostly tricks based on various GPU features.
For example, vertex shaders can write data out to buffers, so you can stream out transformed geometry if you don't want to transform it again in a later vertex stage, e.g. when multi-pass rendering uses the same geometry in more than one pass.
You can also use a vertex shader to output just one triangle that covers the whole screen, so that the fragment shader gets called once per pixel for the whole screen (though, honestly, you are better off using compute (kernel) shaders for this).
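A minimal sketch of that full-screen-triangle trick (function and struct names are illustrative): three vertices generated from the vertex ID alone, with no vertex buffer required.
#include <metal_stdlib>
using namespace metal;

struct FullScreenOut {
    float4 position [[position]];
    float2 texCoord;
};

vertex FullScreenOut fullScreenTriangle(uint vertexID [[vertex_id]])
{
    // Clip-space corners (-1,-1), (3,-1), (-1,3): one big triangle that
    // covers the whole [-1, 1] square after clipping.
    const float2 positions[3] = { float2(-1.0, -1.0), float2(3.0, -1.0), float2(-1.0, 3.0) };
    FullScreenOut out;
    out.position = float4(positions[vertexID], 0.0, 1.0);
    // Flip y so texture coordinates run top-down.
    out.texCoord = out.position.xy * float2(0.5, -0.5) + 0.5;
    return out;
}
You encode it with a draw of three vertices and nothing bound at the vertex buffer slots.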
You can also write out data from the vertex shader without generating any visible primitives, by emitting degenerate triangles. You can use this to compute bounding boxes: using atomic operations, you update min/max positions and read them back at a later stage. This is useful for light culling, frustum culling, tile-based processing, and many other things.
But, and it's a BIG BUT, you can do most of this in a compute shader without making the GPU run the whole vertex-assembly pipeline. That means you can do full-screen effects using just a compute shader (instead of a vertex and fragment shader with the many pipeline stages in between, such as rasterization, primitive culling, depth testing, and output merging). You can also calculate bounding boxes and do light culling or frustum culling in a compute shader.
There are reasons to fire up the whole rendering pipeline instead of just running a compute shader, for example if you will actually use the triangles output by the vertex shader, or if you aren't sure how primitives are laid out in memory and need the vertex assembler to do the heavy lifting of assembling them. But, getting back to your point, almost all reasonable uses of a vertex shader involve outputting primitives in clip space. If you aren't using the resulting primitives, it's probably best to stick to compute shaders.

OpenGL Image warping using lookup table

I am working on an Android application that slims or fattens faces by detecting them. Currently, I have achieved that by using the thin-plate spline algorithm.
http://ipwithopencv.blogspot.com.tr/2010/01/thin-plate-spline-example.html
The problem is that the algorithm is not fast enough for me, so I decided to switch to OpenGL. After some research, I see that a lookup table texture is the best option for this. I have a set of control points for the source image and their new positions for the warp effect.
How should I create the lookup table texture to get the warp effect?
Are you really sure you need a lookup texture?
It seems it would be better to have a textured rectangular mesh (or a non-rectangular mesh, of course, since the face detection algorithm you have most likely returns a face-like mesh) and warp it according to the algorithm.
Not only would you be able to do that in a vertex shader, processing each mesh node in parallel, but there are also far fewer values to process compared to generating a texture dynamically.
The most compatible way to achieve that is to give each mesh point a Y coordinate of 0 and an X coordinate that stores the mesh index, and then pass a texture (maybe even a buffer texture, if the target devices support it) to the vertex shader, where the R and G channels at the needed index contain the desired X and Y coordinates.
Inside the vertex shader, the coordinates are then loaded from the texture.
This approach allows for dynamic warping without reloading the geometry, as long as the data texture is properly updated, for example from a pixel shader.

iOS Metal Shader - Texture read and write access?

I'm using a Metal shader to draw many particles onto the screen. Each particle has its own position (which can change), and often two particles have the same position. How can I check whether the texture2d I write into has already been written to at a certain position? (I want to make sure that I only draw a particle at a position if no particle has been drawn there yet, because I get ugly flickering if many particles are drawn at the same position.)
I've tried outTexture.read(particlePosition), but this obviously doesn't work, because of the texture access qualifier, which is access::write.
Is there a way I can have read and write access to a texture2d at the same time? (If there isn't, how could I still solve my problem?)
There are several approaches that could work here. In concurrent systems programming, what you're talking about is termed first-write wins.
1) If the particles only need to preclude other particles from being drawn (and aren't potentially obscured by other elements in the scene in the same render pass), you can write a special value to the depth buffer to signify that a fragment has already been written to a particular coordinate. For example, you'd turn on depth testing with a compare function of Less, clear the depth buffer to some distant value (like 1.0), and then write a value of 0.0 to the depth buffer in the fragment function. The first fragment at a pixel passes (0.0 < 1.0), but any subsequent write to that pixel fails the depth test (0.0 < 0.0 is false) and is not drawn.
2) Use framebuffer read-back. On iOS, Metal allows you to read from the currently-bound primary renderbuffer by attributing a parameter to your fragment function with [[color(0)]]. This parameter will contain the current color value in the renderbuffer, which you can test against to determine whether it has been written to. This does require you to clear the texture to a predetermined color that will never otherwise be produced by your fragment function, so it is more limited than the above approach, and possibly less performant.
All of the above applies whether you're rendering to a drawable's texture for direct presentation to the screen, or to some offscreen texture.
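For illustration, the framebuffer read-back variant (approach 2) might look roughly like the following fragment function; the sentinel clear color and all names here are assumptions, not code from the question.
#include <metal_stdlib>
using namespace metal;

struct ParticleFragmentIn {
    float4 position [[position]];
    float4 color;
};

fragment float4 particleFragment(ParticleFragmentIn in [[stage_in]],
                                 float4 currentColor   [[color(0)]])
{
    // If this pixel still holds the clear color, no particle has been drawn
    // here yet, so write this particle's color; otherwise keep what is there.
    const float4 clearColor = float4(0.0, 0.0, 0.0, 0.0);
    if (all(currentColor == clearColor)) {
        return in.color;
    }
    return currentColor;
}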
To answer the read-and-write part: you can specify read/write access for the output texture like this:
texture2d<float, access::read_write> outTexture [[texture(1)]],
Also, your texture descriptor must specify the usage:
textureDescriptor?.usage = [.shaderRead, .shaderWrite]
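With that in place, a compute kernel can read and conditionally write the same texture, roughly like this sketch (names are illustrative; the read-then-write is not atomic across threads, and read_write texture access requires a GPU and pixel format that support function texture read-writes):
#include <metal_stdlib>
using namespace metal;

kernel void drawParticleIfEmpty(texture2d<float, access::read_write> outTexture [[texture(1)]],
                                constant float2 *particlePositions              [[buffer(0)]],
                                uint id                                         [[thread_position_in_grid]])
{
    // Assumes one thread per particle.
    uint2 pixel = uint2(particlePositions[id]);
    float4 existing = outTexture.read(pixel);
    // Only draw if nothing has been written at this pixel yet (still the clear color).
    if (all(existing == float4(0.0))) {
        outTexture.write(float4(1.0), pixel);
    }
}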
