How to get the MTLVertexAttributeDescriptorArray size value in Metal

In OpenGL, you can retrieve the maximum supported number of vertex attributes with glGetIntegerv(GL_MAX_VERTEX_ATTRIBS, &n).
So, how can I get the maximum supported number of vertex attributes per vertex descriptor in Metal, other than by looking it up in the Metal Feature Set Tables?

There is currently no API for querying most Metal implementation limits. You should determine which family/version your device supports, and use the values from the table, or else choose a sensible default.
For all extant Metal implementations, the maximum number of vertex attributes per vertex descriptor is 31. If you need more than that, you can fetch additional data from buffer arguments based on the current instance and vertex ID.
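As a rough illustration of that workaround, here is a minimal Metal Shading Language sketch (the buffer index, struct layout, and attribute set are hypothetical) that reads extra per-vertex data straight from a buffer argument indexed by the vertex ID:

#include <metal_stdlib>
using namespace metal;

// Hypothetical payload that did not fit into the 31 vertex-descriptor attributes.
struct ExtraVertexData {
    float4 customA;
    float4 customB;
};

struct VertexIn {
    float3 position [[attribute(0)]];
    float3 normal   [[attribute(1)]];
};

vertex float4 vertex_main(VertexIn in                         [[stage_in]],
                          uint vid                            [[vertex_id]],
                          const device ExtraVertexData *extra [[buffer(1)]])
{
    // Fetch the overflow data manually, indexed by vertex ID.
    ExtraVertexData e = extra[vid];
    return float4(in.position + e.customA.xyz, 1.0);
}

The same trick works with [[instance_id]] when the extra data is per instance rather than per vertex.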

Related

Binding all textures on one huge descriptor set

I have a design question for a Vulkan game engine:
In my game engine I bind all "static" texture resources in one huge descriptor set (256k descriptors), and my shaders access those samplers through dynamic indexing.
[For example, when I want to sample a normal map that belongs to a certain game object, I add a new uint to the material's UBO that represents the index of the object's normal-map descriptor inside the huge descriptor set, then I sample it and compute the final object color.]
I wonder whether this way of accessing object textures is efficient compared to binding each object's textures in its own per-object descriptor set (alongside the material UBO).
Can the size of a descriptor set drastically affect texel access speed?
Or is my idea just bad?
Again, sorry about my English.
There are no performance issues with indexing from an array of sampler descriptors. The only real reason not to do things this way is that implementations may not let you dynamically index such arrays. But if you're requiring that from the implementation (all desktop implementations allow it), then just keep doing it; it's a common technique for reducing the number of state changes you have to issue on the CPU.
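For reference, a minimal C++ sketch of creating such a layout with the descriptor-indexing features (Vulkan 1.2 or VK_EXT_descriptor_indexing; the descriptor count, binding flags, and stage are illustrative assumptions):

#include <vulkan/vulkan.h>

// Hypothetical helper: one binding holding a large array of combined image samplers.
VkDescriptorSetLayout makeBindlessTextureLayout(VkDevice device, uint32_t textureCount)
{
    VkDescriptorSetLayoutBinding binding{};
    binding.binding         = 0;
    binding.descriptorType  = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
    binding.descriptorCount = textureCount;               // e.g. 256k
    binding.stageFlags      = VK_SHADER_STAGE_FRAGMENT_BIT;

    // Flags typically wanted for "bindless"-style usage.
    VkDescriptorBindingFlags flags = VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT |
                                     VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT;

    VkDescriptorSetLayoutBindingFlagsCreateInfo flagsInfo{};
    flagsInfo.sType         = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_BINDING_FLAGS_CREATE_INFO;
    flagsInfo.bindingCount  = 1;
    flagsInfo.pBindingFlags = &flags;

    VkDescriptorSetLayoutCreateInfo layoutInfo{};
    layoutInfo.sType        = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
    layoutInfo.pNext        = &flagsInfo;
    layoutInfo.flags        = VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT;
    layoutInfo.bindingCount = 1;
    layoutInfo.pBindings    = &binding;

    VkDescriptorSetLayout layout = VK_NULL_HANDLE;
    vkCreateDescriptorSetLayout(device, &layoutInfo, nullptr, &layout);
    return layout;
}

Note that truly non-uniform indexing of the array in the shader additionally requires the shaderSampledImageArrayNonUniformIndexing feature and the nonuniformEXT qualifier in GLSL.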

Meaning and Implications of InternalFormat, Format, and Type parameter for WebGL Textures

In WebGL, calls to texSubImage2D and readPixels require Format and Type parameters. In addition, texImage2D requires an InternalFormat parameter. While it is easy to find documentation about which combinations of these parameters are valid, it is unclear exactly what these parameters mean and how to go about using them efficiently, particularly given that some internal formats can be paired with multiple types, e.g.
R16F/HALF_FLOAT vs R16F/FLOAT or GL_R11F_G11F_B10F/FLOAT vs GL_R11F_G11F_B10F/GL_UNSIGNED_INT_10F_11F_11F_REV (where the notation I am using is InternalFormat/Type)
Also, both of these API calls can be used with a pixels parameter that can be a TypedArray; in this case it is unclear which choices of TypedArray are valid for a given InternalFormat/Format/Type combo (and which choice is optimal in terms of avoiding casting).
For instance, is it true that the internal memory used by the GPU per texel is determined solely by the InternalFormat, either in an implementation-dependent way (e.g. WebGL1 unsized formats) or, for some newly added InternalFormats in WebGL2, in a fully specified way?
Are the Format and Type parameters related primarily to how data is marshalled into and out of ArrayBuffers? For instance, if I use GL_R11F_G11F_B10F/GL_UNSIGNED_INT_10F_11F_11F_REV, does this mean I should be passing texSubImage2D a Uint32Array with each element of the array having its bits carefully twiddled in JavaScript, whereas if I use GL_R11F_G11F_B10F/FLOAT then I should use a Float32Array with three times the number of elements of the prior case, and WebGL will handle the bit twiddling for me? Does WebGL try to check that the TypedArray I have passed is consistent with the Format/Type I have chosen, or does it operate directly on the underlying ArrayBuffer? Could I have used a Float64Array in the last instance? And what to do about HALF_FLOAT?
It looks like the bulk of the question can be answered by referring to section 3.7.6 Texture Objects of the WebGL2 spec, in particular the table in the documentation for texImage2D, which clarifies which TypedArray is required for each Type:
TypedArray    WebGL Type
------------  ----------------------------
Int8Array     BYTE
Uint8Array    UNSIGNED_BYTE
Int16Array    SHORT
Uint16Array   UNSIGNED_SHORT
Uint16Array   UNSIGNED_SHORT_5_6_5
Uint16Array   UNSIGNED_SHORT_5_5_5_1
Uint16Array   UNSIGNED_SHORT_4_4_4_4
Int32Array    INT
Uint32Array   UNSIGNED_INT
Uint32Array   UNSIGNED_INT_5_9_9_9_REV
Uint32Array   UNSIGNED_INT_2_10_10_10_REV
Uint32Array   UNSIGNED_INT_10F_11F_11F_REV
Uint32Array   UNSIGNED_INT_24_8
Uint16Array   HALF_FLOAT
Float32Array  FLOAT
My guess is that:
InternalFormat determines how much GPU memory is used to store the texture.
Format and Type govern how data is marshalled between the texture and JavaScript.
Type determines which kind of TypedArray must be used.
Format plus the pixelStorei parameters (section 6.10) determine how many elements the TypedArray will need and which elements will actually be used (will things be tightly packed, will some rows be padded, etc.).
Todo:
Work out details for
encoding/decoding some of the more obscure Type values to and from JavaScript;
calculating the typed-array size requirement and stride info given the Type, Format, and pixelStorei parameters.
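To make the marshalling concrete, here is the GL_R11F_G11F_B10F case expressed against the OpenGL ES 3.0 C API that WebGL2 mirrors (a sketch with hypothetical helper functions; a suitably allocated GL_R11F_G11F_B10F texture is assumed to be bound). The packed-word path corresponds to passing a Uint32Array in WebGL, the float path to passing a Float32Array three times as long:

#include <GLES3/gl3.h>
#include <cstdint>
#include <vector>

void uploadPacked(GLsizei w, GLsizei h)
{
    // One packed 32-bit word per texel, bits laid out as 10F/11F/11F (reversed);
    // equivalent to a Uint32Array in WebGL, with the twiddling done by the caller.
    std::vector<uint32_t> packed(static_cast<size_t>(w) * h);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h,
                    GL_RGB, GL_UNSIGNED_INT_10F_11F_11F_REV, packed.data());
}

void uploadFloats(GLsizei w, GLsizei h)
{
    // Three full floats per texel; equivalent to a Float32Array in WebGL.
    // The implementation converts to the packed internal format for you.
    std::vector<float> texels(static_cast<size_t>(w) * h * 3);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h,
                    GL_RGB, GL_FLOAT, texels.data());
}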

How to select WebGL GLSL sampler type from texture format properties?

WebGL's GLSL has sampler2D, isampler2D, and usampler2D for reading float, int, and unsigned int from textures inside a shader. When creating a texture in WebGL1/2 we specify a texture InternalFormat, Format, and Type. According to the OpenGL Sampler Wiki Page, using a sampler with incompatible types for a given texture can lead to undefined values.
Is there a simple rule to determine how to map a texture's InternalFormat, Format, and Type definitively to the correct GLSL sampler type?
(Without loss of generality, I have focused on ?sampler2D, but of course there are also 3D, Cube, etc. textures, which I assume follow exactly the same rules.)
WebGL1 doesn't have those different sampler types.
In WebGL2 the sampler type is determined by the internal format: formats that end in I, like RGB8I, need an isampler; formats that end in UI, like RGB8UI, need a usampler; everything else uses a plain sampler.
There's a list of the formats on page 5 of the WebGL2 Reference Guide
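A small sketch of that rule, written over the OpenGL ES 3.0 enums that WebGL2 mirrors (not an exhaustive list of formats):

#include <GLES3/gl3.h>

enum class SamplerKind { Float, Int, Uint };

// Formats ending in "I" -> isampler*, "UI" -> usampler*, everything else -> sampler*.
SamplerKind samplerForInternalFormat(GLenum internalFormat)
{
    switch (internalFormat) {
        // Signed integer formats -> isampler2D
        case GL_R8I:   case GL_RG8I:   case GL_RGBA8I:
        case GL_R16I:  case GL_RG16I:  case GL_RGBA16I:
        case GL_R32I:  case GL_RG32I:  case GL_RGBA32I:
            return SamplerKind::Int;
        // Unsigned integer formats -> usampler2D
        case GL_R8UI:  case GL_RG8UI:  case GL_RGBA8UI:
        case GL_R16UI: case GL_RG16UI: case GL_RGBA16UI:
        case GL_R32UI: case GL_RG32UI: case GL_RGBA32UI:
            return SamplerKind::Uint;
        // Normalized and floating-point formats -> sampler2D
        default:
            return SamplerKind::Float;
    }
}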
Also note
(1) You should avoid the OpenGL reference pages for WebGL2, as they will often not match. Instead, you should be reading the OpenGL ES 3.0.x reference pages.
(2) WebGL2 has stronger restrictions. The docs you referenced said the values can be undefined; WebGL2 doesn't allow this. From the WebGL2 spec:
5.22 A sampler type must match the internal texture format
Texture lookup functions return values as floating point, unsigned integer or signed integer, depending on the sampler type passed to the lookup function. If the wrong sampler type is used for texture access, i.e., the sampler type does not match the texture internal format, the returned values are undefined in OpenGL ES Shading Language 3.00.6 (OpenGL ES Shading Language 3.00.6 §8.8). In WebGL, generates an INVALID_OPERATION error in the corresponding draw call, including drawArrays, drawElements, drawArraysInstanced, drawElementsInstanced, and drawRangeElements.
If the sampler type is floating point and the internal texture format is normalized integer, it is considered as a match and the returned values are converted to floating point in the range [0, 1].

Dynamic output from compute shader

If I am generating 0-12 triangles in a compute shader, is there a way I can stream them to a buffer that will then be used for rendering to screen?
My current strategy is:
create a buffer of float3 of size threads * 12, so it can store the maximum possible number of triangles;
write to the buffer using an index that depends on the thread position in the grid, so there are no race conditions.
If I want to render from this though, I would need to skip the empty memory. It sounds ugly, but probably there is no other way currently. I know CUDA geometry shaders can have variable-length output, but I wonder if/how games on iOS can generate variable-length data on the GPU.
UPDATE 1:
As soon as I wrote the question, I thought about the possibility of using a second buffer that would point out how many triangles are available for each block. The vertex shader would then process all vertices of all triangles of that block.
This does not solve the unused-memory problem though, and since I have a large number of threads, the total memory wasted would be considerable.
What you're looking for is the Metal equivalent of D3D's "AppendStructuredBuffer". You want a type that can have structures added to it atomically.
I'm not familiar with Metal, but it does support atomic operations such as add, which is all you really need to roll your own append buffer. Initialise a counter to 0, have each thread atomically add 1 to it, and use the returned original value as the index to write to in your buffer.
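In Metal Shading Language that might look roughly like the following sketch (the Triangle layout, buffer indices, and per-thread triangle generation are placeholders):

#include <metal_stdlib>
using namespace metal;

struct Triangle {
    float3 v0, v1, v2;
};

kernel void emit_triangles(device Triangle    *outTriangles [[buffer(0)]],
                           device atomic_uint *counter      [[buffer(1)]],
                           uint                tid          [[thread_position_in_grid]])
{
    // Placeholder: each thread decides how many triangles (0-12) it emits
    // and what their vertices are; that logic is application-specific.
    uint count = tid % 13;
    for (uint i = 0; i < count; ++i) {
        Triangle t;
        t.v0 = float3(tid, i, 0.0);
        t.v1 = float3(tid, i, 1.0);
        t.v2 = float3(tid, i, 2.0);

        // Atomically reserve a slot; the returned value is the slot index,
        // so the output buffer is packed with no gaps.
        uint slot = atomic_fetch_add_explicit(counter, 1u, memory_order_relaxed);
        outTriangles[slot] = t;
    }
}

The counter value can then be used to drive an indirect draw, so the CPU never has to read it back.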

Direct3D 10 Hardware Instancing using Structured Buffers

I am trying to implement hardware instancing with Direct3D 10+ using Structured Buffers for the per instance data but I've not used them before.
I understand how to implement instancing when combining the per vertex and per instance data into a single structure in the Vertex Shader - i.e. you bind two vertex buffers to the input assembler and call the DrawIndexedInstanced function.
Can anyone tell me the procedure for binding the input assembler and making the draw call etc. when using Structured Buffers with hardware instancing? I can't seem to find a good example of it anywhere.
It's my understanding that Structured Buffers are bound as ShaderResourceViews, is this correct?
Yup, that's exactly right. Just don't put any per-instance vertex attributes in your vertex buffer or input layout; instead, create a ShaderResourceView of the buffer and bind it to the vertex shader stage. You can then use the SV_InstanceID semantic to query which instance you're on and just fetch the relevant struct from your buffer.
StructuredBuffers are very similar to normal buffers. The only differences are that you specify the D3D11_RESOURCE_MISC_BUFFER_STRUCTURED flag on creation, fill in StructureByteStride, and when you create a ShaderResourceView the Format is DXGI_FORMAT_UNKNOWN (the format is specified implicitly by the struct in your shader).
StructuredBuffer<MyStruct> myInstanceData : register(t0);
is the syntax in HLSL for a StructuredBuffer and you just access it using the [] operator like you would an array.
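On the host side, creation and binding might look roughly like this D3D11 sketch (error handling omitted; MyStruct, the register slot, and the counts are placeholders that must match the shader):

#include <d3d11.h>

struct MyStruct { float world[16]; };   // must match the struct declared in HLSL

void createAndBindInstanceData(ID3D11Device* device,
                               ID3D11DeviceContext* context,
                               const MyStruct* instances,
                               UINT instanceCount)
{
    // Structured buffer: note the MISC flag and the StructureByteStride.
    D3D11_BUFFER_DESC desc = {};
    desc.ByteWidth           = sizeof(MyStruct) * instanceCount;
    desc.Usage               = D3D11_USAGE_DEFAULT;
    desc.BindFlags           = D3D11_BIND_SHADER_RESOURCE;
    desc.MiscFlags           = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
    desc.StructureByteStride = sizeof(MyStruct);

    D3D11_SUBRESOURCE_DATA init = {};
    init.pSysMem = instances;

    ID3D11Buffer* buffer = nullptr;
    device->CreateBuffer(&desc, &init, &buffer);

    // SRV over the whole buffer; Format must be DXGI_FORMAT_UNKNOWN because
    // the element layout comes from the struct declared in the shader.
    D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc = {};
    srvDesc.Format              = DXGI_FORMAT_UNKNOWN;
    srvDesc.ViewDimension       = D3D11_SRV_DIMENSION_BUFFER;
    srvDesc.Buffer.FirstElement = 0;
    srvDesc.Buffer.NumElements  = instanceCount;

    ID3D11ShaderResourceView* srv = nullptr;
    device->CreateShaderResourceView(buffer, &srvDesc, &srv);

    // Bind to the vertex shader at slot 0 to match register(t0) in HLSL,
    // then draw as usual with DrawIndexedInstanced.
    context->VSSetShaderResources(0, 1, &srv);
}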
Is there anything else that's unclear?
