in-place processing of a Metal texture - ios

Is it possible to process an MTLTexture in-place without osx_ReadWriteTextureTier2?
It seems like I can set two texture arguments to be the same texture. Is this supported behavior?
Specifically, I don't mind not having texture caching update after a write. I just want to in-place (and sparsely) modify a 3d texture. It's memory prohibitive to have two textures. And it's computationally expensive to copy the entire texture, especially when I might only be updating a small portion of it.

Per the documentation, regardless of feature availability, it is invalid to declare two separate texture arguments (one read, one write) in a function signature and then set the same texture for both.
Any Mac that supports osx_GPUFamily1_v2 supports function texture read-writes (by declaring the texture with access::read_write).
The distinction between "Tier 1" (which has no explicit constant) and osx_ReadWriteTextureTier2 is that the latter supports additional pixel formats for read-write textures.
If you determine that your target Macs don't support the kind of texture read-writes you need (because you need to deploy to OS X 10.11 or because you're using an incompatible pixel format for the tier of machine you're deploying to), you could operate on your texture one plane at a time, reading from your 3D texture, writing to a 2D texture, and then blitting the result back into the corresponding region in your 3D texture. It's more work, but it'll use much less than double the memory.
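The plane-at-a-time fallback can be sketched on the CPU to show the data flow. This is only an illustration in Python, not Metal code: nested lists stand in for the 3D texture and the 2D scratch texture, and `process_volume_in_planes`, `blur_x`, and `dirty_slices` are hypothetical names introduced here. The point is that the only extra allocation is a single plane, and only the slices you name are touched.

```python
# CPU-side sketch only: nested lists stand in for GPU textures, no Metal calls.

def process_volume_in_planes(volume, kernel, dirty_slices):
    """Update `volume` (depth x height x width nested lists) one plane at a time.

    Only one 2D scratch plane is allocated, rather than a second 3D texture.
    """
    height = len(volume[0])
    width = len(volume[0][0])
    scratch = [[0] * width for _ in range(height)]  # stands in for the 2D texture

    for z in dirty_slices:
        plane = volume[z]
        # "Compute pass": read from the 3D texture, write to the 2D scratch.
        for y in range(height):
            for x in range(width):
                scratch[y][x] = kernel(plane, x, y)
        # "Blit pass": copy the scratch back into slice z of the 3D texture.
        for y in range(height):
            plane[y][:] = scratch[y]
    return volume


def blur_x(plane, x, y):
    """Hypothetical kernel: average each texel with its horizontal neighbours."""
    row = plane[y]
    left = row[max(x - 1, 0)]
    right = row[min(x + 1, len(row) - 1)]
    return (left + row[x] + right) / 3


# A 3 x 4 x 4 volume where slice z is filled with the value z.
volume = [[[float(z)] * 4 for _ in range(4)] for z in range(3)]
volume[1][0] = [0.0, 3.0, 0.0, 0.0]
process_volume_in_planes(volume, blur_x, dirty_slices=[1])
```

Because the kernel reads neighbouring texels, writing straight back into the plane being read would corrupt later reads; the scratch plane is what avoids that.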

Related

How do I use indexed normals as an attribute? (WebGL) [duplicate]

I have some vertex data. Positions, normals, texture coordinates. I probably loaded it from a .obj file or some other format. Maybe I'm drawing a cube. But each piece of vertex data has its own index. Can I render this mesh data using OpenGL/Direct3D?
In the most general sense, no. OpenGL and Direct3D only allow one index per vertex; the index fetches from each stream of vertex data. Therefore, every unique combination of components must have its own separate index.
So if you have a cube, where each face has its own normal, you will need to replicate the position and normal data a lot. You will need 24 positions and 24 normals, even though the cube will only have 8 unique positions and 6 unique normals.
Your best bet is to simply accept that your data will be larger. A great many model formats use multiple indices; you will need to fix up this vertex data before you can render with it. Many mesh loading tools, such as Open Asset Importer, will perform this fixup for you.
It should also be noted that most meshes are not cubes. Most meshes are smooth across the vast majority of vertices, only occasionally having different normals/texture coordinates/etc. So while this often comes up for simple geometric shapes, real models rarely have substantial amounts of vertex duplication.
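The fixup described above can be sketched in a few lines. This is a minimal Python sketch, not what any particular loader does: `build_single_index_mesh` and the cube data are made up for illustration, and triangle winding is ignored. Every unique (position index, normal index) pair becomes one output vertex.

```python
def build_single_index_mesh(positions, normals, corners):
    """Convert (position index, normal index) corners into a single-index mesh.

    Each unique index pair becomes one output vertex; repeated pairs are shared.
    """
    vertex_of_pair = {}   # (position index, normal index) -> output vertex index
    out_positions, out_normals, out_indices = [], [], []
    for pair in corners:
        if pair not in vertex_of_pair:
            vertex_of_pair[pair] = len(out_positions)
            out_positions.append(positions[pair[0]])
            out_normals.append(normals[pair[1]])
        out_indices.append(vertex_of_pair[pair])
    return out_positions, out_normals, out_indices


# Unit cube: 8 positions (index = x + 2*y + 4*z) and 6 face normals.
positions = [(x, y, z) for z in (0, 1) for y in (0, 1) for x in (0, 1)]
normals = [(-1, 0, 0), (1, 0, 0), (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1)]
# Each face lists its four corner positions in order around the quad
# (winding is not made consistent in this sketch).
faces = [(0, 2, 6, 4), (1, 3, 7, 5), (0, 1, 5, 4),
         (2, 3, 7, 6), (0, 1, 3, 2), (4, 5, 7, 6)]

corners = []
for normal_index, (a, b, c, d) in enumerate(faces):
    # Two triangles per quad; every corner carries the face's normal index.
    for position_index in (a, b, c, a, c, d):
        corners.append((position_index, normal_index))

out_positions, out_normals, out_indices = build_single_index_mesh(
    positions, normals, corners)
```

For the cube this yields the 24 positions and 24 normals mentioned above, addressed by a single index list of 36 entries.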
GL 3.x and D3D10
For D3D10/OpenGL 3.x-class hardware, it is possible to avoid performing fixup and use multiple indexed attributes directly. However, be advised that this will likely decrease rendering performance.
The following discussion will use the OpenGL terminology, but Direct3D v10 and above has equivalent functionality.
The idea is to manually access the different vertex attributes from the vertex shader. Instead of sending the vertex attributes directly, the attributes that are passed are actually the indices for that particular vertex. The vertex shader then uses the indices to access the actual attribute through one or more buffer textures.
Attributes can be stored in multiple buffer textures or all within one. If the latter is used, then the shader will need an offset to add to each index in order to find the corresponding attribute's start index in the buffer.
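To make the offset idea concrete, here is a CPU emulation in Python of the lookup the vertex shader would perform. It is only a sketch: a real shader would fetch from a buffer texture (for example with texelFetch in GLSL) rather than index a Python list, and `fetch_vertex`, `POSITION_OFFSET`, and `NORMAL_OFFSET` are names invented for this example.

```python
# CPU emulation of the shader-side lookup; a real shader would use texelFetch
# on a buffer texture instead of Python list indexing.

positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
normals = [(0.0, 0.0, 1.0)]

# One combined buffer: all positions first, then all normals.
combined = positions + normals
POSITION_OFFSET = 0
NORMAL_OFFSET = len(positions)   # where the normals start in the buffer


def fetch_vertex(position_index, normal_index):
    """Mimic the vertex shader: the per-vertex inputs are indices, not data."""
    position = combined[POSITION_OFFSET + position_index]
    normal = combined[NORMAL_OFFSET + normal_index]
    return position, normal


# Per-vertex "attributes" are index pairs; all three corners share normal 0.
index_pairs = [(0, 0), (1, 0), (2, 0)]
triangle = [fetch_vertex(p, n) for p, n in index_pairs]
```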
Regular vertex attributes can be compressed in many ways. Buffer textures have fewer means of compression, allowing only a relatively limited number of vertex formats (via the image formats they support).
Please note again that any of these techniques may decrease overall vertex processing performance. Therefore, it should only be used in the most memory-limited of circumstances, after all other options for compression or optimization have been exhausted.
OpenGL ES 3.0 provides buffer textures as well. Higher OpenGL versions allow you to read buffer objects more directly via SSBOs rather than buffer textures, which might have better performance characteristics.
I found a way that allows you to reduce this sort of repetition that runs a bit contrary to some of the statements made in the other answer (but doesn't specifically fit the question asked here). It does however address my question which was thought to be a repeat of this question.
I just learned about interpolation qualifiers, specifically "flat". It's my understanding that putting the flat qualifier on your vertex shader output causes only the provoking vertex to pass its values to the fragment shader.
This means for the situation described in this quote:
So if you have a cube, where each face has its own normal, you will need to replicate the position and normal data a lot. You will need 24 positions and 24 normals, even though the cube will only have 8 unique positions and 6 unique normals.
You can have 8 vertices, 6 of which carry the unique normals and 2 whose normal values are disregarded, so long as you carefully order your primitives' indices such that the "provoking vertex" contains the normal data you want to apply to the entire face.
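Here is one way the index ordering could be worked out for the cube, as a Python sketch. It assumes the last vertex of each triangle is the provoking vertex (OpenGL's default convention; other APIs may use the first), it ignores winding, and the helper names are made up for this example. Each face is given its own corner, and both of that face's triangles are written so they end on that corner.

```python
def assign_provoking_vertices(faces, chosen=None):
    """Pick a distinct corner for every quad via simple backtracking."""
    chosen = chosen or []
    if len(chosen) == len(faces):
        return chosen
    for vertex in faces[len(chosen)]:
        if vertex not in chosen:
            result = assign_provoking_vertices(faces, chosen + [vertex])
            if result is not None:
                return result
    return None


def triangulate_with_provoking_last(quad, provoking):
    """Split a quad into two triangles that both end on `provoking`."""
    k = quad.index(provoking)
    p, q, r, s = quad[k:] + quad[:k]     # rotate so the provoking corner is first
    return [(q, r, p), (r, s, p)]


# Cube corners are numbered x + 2*y + 4*z; each quad lists corners in order.
faces = [(0, 2, 6, 4), (1, 3, 7, 5), (0, 1, 5, 4),
         (2, 3, 7, 6), (0, 1, 3, 2), (4, 5, 7, 6)]
face_normals = [(-1, 0, 0), (1, 0, 0), (0, -1, 0),
                (0, 1, 0), (0, 0, -1), (0, 0, 1)]

provoking = assign_provoking_vertices(faces)
vertex_normals = [None] * 8            # two entries stay unused (any value works)
indices = []
for quad, corner, normal in zip(faces, provoking, face_normals):
    vertex_normals[corner] = normal
    indices.extend(triangulate_with_provoking_last(quad, corner))
```

The result is 8 vertices and 12 triangles, where the last index of every triangle points at a vertex holding that face's normal.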
EDIT: My understanding of how it works:

MTLTexture vs CGImageRef

What is the main difference between MTLTexture vs CGImageRef? When do we need to use MTLTexture instead of CGImageRef (and vice versa)?
I have an app (say a video game) that draws everything by itself on a dedicated surface. This includes animation at 60fps (so I need to redraw the surface every 16ms). I don't know the most efficient way to build my app using Metal.
First of all, MTLTexture comes from a low-level graphics API. An MTLTexture refers to an "image" that resides in memory accessible to the GPU (not necessarily on the GPU itself). You can then write a program that uses Metal, specifically render (MTLRenderPipelineState) or compute (MTLComputePipelineState) pipeline states that contain shaders (programs that run on the GPU), to read textures, sample them, write to them, and use them as attachments (output rendering results to them). Textures can also be copied to buffers (MTLBuffer) and to other textures, for example if you want to read back texture data on the CPU. But MTLTexture is mostly intended to be used by the GPU rather than the CPU. Also, MTLTexture is not limited to being 2D; it can also be a cube texture or even a 3D texture.
CGImage, on the other hand, comes from a higher-level API (Core Graphics or Quartz 2D) that is intended for 2D use. You don't need shaders or GPU pipelines to create or modify CGImages, and there are many functions to work with these images "out of the box".
I would say, if you have a 3D video game, you can check out Metal, but it's a low level API, and setting up Metal is a much more involved process than setting up OpenGL, for example. You can't use Core Graphics for 3D games as-is. If Metal seems too hard, you can check out higher-level APIs from Apple, such as SceneKit, which are also intended for game development.
I can't say much about 2D game development, but you can definitely use Metal for it, it might just be a bit "overkill".
In conclusion, you need to find a balance between complexity and control and choose what best suits you.

WebGL: How to interact between javascript and shaders, and how to use multiple shaders

I have seen demos on WebGL that
color rectangular surface
attach textures to the rectangles
draw wireframes
have semitransparent textures
What I do not understand is how to combine these effects into a single program, and how to interact with objects to change their look.
Suppose I want to create a scene with all the above, and have the ability to change the color of any rectangle, or change the texture.
I am trying to understand the organization of the code. Here are some short, related questions:
I can create a vertex buffer with corresponding color buffer. Can I have some rectangles with texture and some without?
If not, I have to create one vertex buffer for all objects with colors, and another with textures. Can I attach a different texture to each rectangle in a vector?
For a case with some rectangles with colors, and others with textures, it requires two different shader programs. All the demos I see have only one, but clearly more complicated programs have multiple. How do you switch between shaders?
How to draw wireframe on and off? Can it be combined with textures? In other words, is it possible to write a shader that can turn features like wireframe on and off with a flag, or does it take two different calls to two different shaders?
All the demos I have seen use an index buffer with triangles. Is Quads no longer supported in WebGL? Obviously for some things triangles would be needed, but if I have a bunch of rectangles it would be nice not to have to create an index of triangles.
For all three of the above scenarios, if I want to change the points, the color, the texture, or the transparency, am I correct in understanding the glSubBuffer will allow replacing data currently in the buffer with new data.
Is it reasonable to have a single object maintaining these kinds of objects and updating color and textures, or is this not a good design?
The question you ask is not just about WebGL, but also about OpenGL and 3D.
The most common way to interact is to set attributes at startup and to set uniforms both at startup and at run time.
In general, the answer to all of your questions is "use an engine".
Think of it this way: you have JavaScript, a language that runs on the CPU; then you have WebGL, which is a library for JS that allows low-level communication with the GPU (remember, low-level); and then you have the shader, a GPU program that you must provide, which works only with specific data.
Doing anything more than "simple" calls for a tool that lets you avoid using WebGL directly (and very often avoid writing shaders directly). We call that tool an engine. An engine usually bundles some set of abilities and leaves out others (the difference between a 2D and a 3D engine, for example). Engine functions call preset WebGL functions in a specific order, so you never need to touch the WebGL API yourself. An engine also contains fairly complicated logic to build only a single pair, or a few pairs, of shaders from a few simple engine API calls. The reason is that swapping shader programs while the program runs is expensive.
Your questions
I can create a vertex buffer with corresponding color buffer. Can I
have some rectangles with texture and some without? If not, I have to
create one vertex buffer for all objects with colors, and another with
textures. Can I attach a different texture to each rectangle in a
vector?
Let's take a buffer, which we call the vertex buffer. We put various data in the vertex buffer. The data doesn't go in as individual values, but as sets, one set per vertex. Each distinct item in a set is called an attribute. An attribute can have whatever meaning for its vertex the vertex shader or fragment shader code decides.
If we have a buffer full of data for triangles, it is possible to add, for example, an attribute that says whether a specific vertex should texture the triangle or not, and do the texturing logic in the shader. However, I think the data size of the attributes must be the same for every vertex (so textured triangles will take up the same space as untextured ones).
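As a rough illustration of "same size for every vertex", here is a Python sketch that packs an interleaved vertex buffer on the CPU. The layout (position, color, uv, and a use_texture flag) and the names `VERTEX_FORMAT`, `STRIDE`, and `pack_vertex` are assumptions made for this example; the resulting bytes are the kind of data you would upload to a vertex buffer.

```python
import struct

# Hypothetical layout: position (3 floats), color (4 floats), uv (2 floats),
# and a use_texture flag stored as a float so every vertex has the same size.
VERTEX_FORMAT = "<3f4f2ff"
STRIDE = struct.calcsize(VERTEX_FORMAT)   # bytes per vertex


def pack_vertex(position, color=(1.0, 1.0, 1.0, 1.0), uv=None):
    """Pack one vertex; untextured vertices still carry (unused) uv slots."""
    use_texture = 0.0 if uv is None else 1.0
    u, v = uv if uv is not None else (0.0, 0.0)
    return struct.pack(VERTEX_FORMAT, *position, *color, u, v, use_texture)


colored = pack_vertex((0.0, 0.0, 0.0), color=(1.0, 0.0, 0.0, 1.0))
textured = pack_vertex((1.0, 0.0, 0.0), uv=(0.5, 0.5))
vertex_buffer = colored + textured
```

The shader would then read the flag and either sample the texture or use the vertex color; the untextured vertex pays for uv slots it never uses.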
For a case with some rectangles with colors, and others with textures,
it requires two different shader programs. All the demos I see have
only one, but clearly more complicated programs have multiple. How do
you switch between shaders?
Not true, even very complicated programs might have only one pair of shaders (one WebGL program). But still it is possible to change program on the run:
https://www.khronos.org/registry/webgl/specs/latest/1.0/#5.14.9
WebGL API function useProgram
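If you do end up with more than one program, one common way to keep the number of switches down is to group draws by program before submitting them. The Python sketch below only counts switches on a made-up draw list; `count_program_switches` and the program and mesh names are hypothetical, and no WebGL calls are made.

```python
def count_program_switches(draws):
    """Count how many times the bound program would change across `draws`."""
    switches, current = 0, None
    for program, _mesh in draws:
        if program != current:
            switches += 1
            current = program
    return switches


# Hypothetical draw list: (program name, mesh name) pairs in submission order.
draws = [("color", "rect0"), ("texture", "rect1"),
         ("color", "rect2"), ("texture", "rect3")]

# sorted() is stable, so draws that share a program keep their relative order.
batched = sorted(draws, key=lambda draw: draw[0])
```

Reordering like this is only safe when the draws do not depend on submission order, which is usually not the case for blended, semitransparent geometry.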
How to draw wireframe on and off? Can it be combined with textures? In
other words, is it possible to write a shader that can turn features
like wireframe on and off with a flag, or does it take two different
calls to two different shaders?
The WebGL API lets you draw as wireframe by choosing a line primitive (such as LINES or LINE_LOOP) as the draw mode. This is independent of the shader program, and you can switch it with each draw call. It is also possible to write a shader that draws as wireframe and control it with a flag (the flag can be either uniform-based or attribute-based).
All the demos I have seen use an index buffer with triangles. Is Quads
no longer supported in WebGL? Obviously for some things triangles
would be needed, but if I have a bunch of rectangles it would be nice
not to have to create an index of triangles.
WebGL does not support quads; it supports only points, lines, and triangles. I guess this is because shaders are simpler without quads.
For all three of the above scenarios, if I want to change the points,
the color, the texture, or the transparency, am I correct in
understanding the glSubBuffer will allow replacing data currently in
the buffer with new data.
I would say it is rare to update buffer data at run time; it slows a program down a lot. There is no glSubBuffer in WebGL; the corresponding call is bufferSubData. Anyway, don't use it ;)
Is it reasonable to have a single object maintaining these kinds of
objects and updating color and textures, or is this not a good design?
Yes, this is called a scene graph; it is widely used and can also be combined with other techniques such as display lists.

DirectX9 and Incompatible Texture size

I'm working with DirectX9 and now I'm having problems with the texture creation.
I'm using the functions CreateTexture and LoadSurfaceFromMemory with D3DFMT_DXT1 compression. I checked the device caps of my graphics card, and D3DPTEXTURECAPS_POW2 and D3DPTEXTURECAPS_NONPOW2CONDITIONAL are both off, which I think means that my graphics card supports non-power-of-two textures, so I can use textures of any size.
My problem is that most of the textures work well (and their sizes aren't powers of two), but some don't, such as "1228 x 453"; if I resize it to "1228 x 452", the texture works well.
What's going on?
Sorry for my English!.
Thanks.
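The BCn texture formats (DXT1 is BC1) are block based. The blocks pack pixels into groups of 4x4 elements, so the texture dimensions must be multiples of 4 for these formats. That is why 1228 x 453 fails while 1228 x 452 works: 452 is a multiple of 4 and 453 is not.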
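If you would rather pad than crop, the arithmetic is simple. The Python sketch below rounds each dimension up to the 4-texel block size and computes the DXT1 size of one mip level at 8 bytes per block; the function names are made up for this example, and it does not touch any Direct3D API.

```python
BLOCK_DIM = 4          # BCn/DXT formats compress 4x4 texel blocks
DXT1_BLOCK_BYTES = 8   # DXT1 (BC1) stores each block in 8 bytes


def round_up_to_block(size):
    """Round a texture dimension up to the next multiple of the block size."""
    return (size + BLOCK_DIM - 1) // BLOCK_DIM * BLOCK_DIM


def dxt1_size_in_bytes(width, height):
    """Compressed size of one mip level, counting partial blocks as whole."""
    blocks_wide = (width + BLOCK_DIM - 1) // BLOCK_DIM
    blocks_high = (height + BLOCK_DIM - 1) // BLOCK_DIM
    return blocks_wide * blocks_high * DXT1_BLOCK_BYTES


# The texture from the question: the width is already aligned, the height is not.
padded = (round_up_to_block(1228), round_up_to_block(453))
```

With the padded size you would create the texture at 1228 x 456 and adjust the texture coordinates so that only the original 453 rows are sampled.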
Unfortunately, this is a graphics card issue. Even if the card claims support for non power of two textures, support is often buggy / limited.
You could pad the texture and use a subtexture, but the best approach is to build a texture atlas (in general you should be doing this anyway to conserve memory bandwidth)

How do PowerVR GPUs provide a depth buffer?

iOS devices use a PowerVR graphics architecture. The PowerVR architecture is a tile-based deferred rendering model. The primary benefit of this model is that it does not use a depth buffer.
However, I can access the depth buffer on my iOS device. Specifically, I can use an offscreen Frame Buffer Object to turn the depth buffer into a color texture and render it.
If the PowerVR architecture doesn't use a depth buffer, how is it that I'm able to render a depth buffer?
It is true that a tile-based renderer doesn't need a traditional depth buffer in order to work.
A TBR splits the screen into tiles and completely renders the contents of each tile using fast on-chip memory to store temporary colors and depths. Then, when the tile is finished, the final values are moved to the actual framebuffer. However, depth values in a depth buffer are traditionally temporary because they are only used for hidden surface removal, so in this case they can be discarded completely after the tile is rendered.
That means that effectively tile-based renderers don't really need a full screen depth buffer in slower video memory, saving both bandwidth and memory.
The Metal API easily exposes this functionality, allowing you to set the storeAction of the depth buffer to the 'don't care' value, meaning that it will not back up the resulting depth values into main memory.
The exception to this case is that you may need the depth buffer contents after rendering (e.g. for a deferred renderer or as a source for some algorithm that operates on depth values). In that case the hardware must ensure that the depth values are stored in the framebuffer for you to use.
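The tile-by-tile flow described in this answer can be mimicked on the CPU. The Python sketch below is a toy model, not how any real GPU is implemented: `render_tiled` is a made-up function, fragments are plain (x, y, depth, color) tuples, and dicts stand in for on-chip and main memory. It shows depth living only in per-tile scratch storage unless you ask for it to be stored.

```python
def render_tiled(width, height, tile, fragments, store_depth=False):
    """Resolve (x, y, depth, color) fragments tile by tile.

    Depth lives only in a per-tile scratch dict unless store_depth is set.
    """
    framebuffer = {}                               # "slow" memory: final colors
    depth_buffer = {} if store_depth else None     # only allocated on request

    for tile_y in range(0, height, tile):
        for tile_x in range(0, width, tile):
            tile_color, tile_depth = {}, {}        # "on-chip" scratch for one tile
            for x, y, depth, color in fragments:
                inside = (tile_x <= x < tile_x + tile
                          and tile_y <= y < tile_y + tile)
                if inside and depth < tile_depth.get((x, y), float("inf")):
                    tile_depth[(x, y)] = depth     # smaller depth is closer
                    tile_color[(x, y)] = color
            framebuffer.update(tile_color)         # colors are always written out
            if store_depth:
                depth_buffer.update(tile_depth)    # like a 'store' action
            # Otherwise tile_depth is simply dropped here ('don't care').
    return framebuffer, depth_buffer


fragments = [(0, 0, 0.8, "red"), (0, 0, 0.3, "blue"), (5, 5, 0.5, "green")]
colors, depths = render_tiled(8, 8, 4, fragments)
colors_kept, depths_kept = render_tiled(8, 8, 4, fragments, store_depth=True)
```

In the first call no full-screen depth storage is ever allocated; in the second, the per-tile depths are copied out, which corresponds to the case where you need the depth buffer after rendering.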
Tile-based deferred rendering — as the name clearly says — works on a tile-by-tile basis. Each portion of the screen is loaded into internal caches, processed and written out again. The hardware takes the list of triangles overlapping the current tile and the current depth values and from those comes up with a new set of colours and depths and then writes those all out again. So whereas a completely dumb GPU might do one read and one write to every relevant depth buffer value per triangle, the PowerVR will do one read and one write per batch of geometry, with the ray casting-style algorithm doing the rest in between.
It isn't really possible to implement OpenGL on a 'pure' tile-based deferred renderer because the depth buffer needs to be readable. It also generally isn't efficient to make the depth buffer write only because the colour buffer is readable at any time (either explicitly via glReadPixels or implicitly according to whenever you present the frame buffer as per your OS's mechanisms), meaning that the hardware may have to draw a scene, then draw more onto it.
PowerVR does use a depth buffer, but in a different way than a regular (immediate mode rendering) GPU.
The deferred part of tile-based deferred rendering means that the triangles for a given scene are first processed (transformed, shaded, clipped, etc.) and saved into an intermediate buffer. Only after the entire scene is processed are the tiles rendered one by one.
Having all the processed triangles in one buffer allows the hardware to perform hidden surface removal - removing the triangles that will end up being hidden/overdrawn by other triangles. This significantly reduces the number of rendered triangles, resulting in improved performance and reduced power consumption.
Hidden surface removal typically uses something called a Tag Buffer as well as a depth buffer. (Both are small on-chip memories, as they store a tile at a time.)
Not sure why you're saying that PowerVR doesn't use a depth buffer. My guess is that it is just a "marketing" way of saying that there is no need to perform expensive writes to and reads from system memory in order to perform the depth test.
p.s
Just to add to Tommy's answer: the primary benefits of tile-based deferred rendering are:
Since fragments are processed a tile at a time, all color/depth/stencil buffer reads and writes are performed in fast on-chip memory. While the color buffer still has to be read from/written to system memory once per tile, in many cases the depth and stencil buffers need to be written to system memory only if they are required for later use (like your use case). System memory traffic is a significant source of power consumption, so you can see how this reduces power consumption.
Deferred rendering enables hidden surface removal. Fewer rendered triangles means less fragment processing, which means fewer texture memory accesses.
