Supporting WebGL1 + WebGL2

I have a certain library that uses WebGL1 to render things.
It heavily uses float textures and instanced rendering.
Nowadays support for WebGL1 seems patchy: some devices support WebGL2, where these features are core, but not WebGL1, or they support WebGL1 but not the extensions.
At the same time, support for WebGL2 isn't great either. Maybe one day it will be, but for now it isn't.
So I started looking at what it would take to support both versions.
For shaders, I think I can mostly get away with #define-ing things, for example #define texture2D texture and other similar aliases.
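Something along these lines seems workable (a minimal sketch; isWebGL2, vsBody and fsBody are assumed placeholders for a version flag and the existing GLSL ES 1.00 shader sources):

// Keep one shader body written in WebGL1 (GLSL ES 1.00) style and prepend a
// version-specific preamble when compiling it on a WebGL2 context.
const vsPrefix300 = `#version 300 es
#define attribute in
#define varying out
#define texture2D texture
`;
const fsPrefix300 = `#version 300 es
#define varying in
#define texture2D texture
#define gl_FragColor fragColor
out highp vec4 fragColor;
`;
const vsSource = (isWebGL2 ? vsPrefix300 : '') + vsBody;
const fsSource = (isWebGL2 ? fsPrefix300 : '') + fsBody;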
When it comes to extensions it becomes more problematic, since in WebGL2 the extension objects no longer exist.
As an experiment, I tried copying the extension functions onto the context object, e.g. gl.drawArraysInstanced = (...args) => ext.drawArraysInstancedANGLE(...args).
When it comes to textures, not much needs to change; perhaps alias something like gl.RGBA8 = gl.RGBA when running in WebGL1, so the same code "just works" when running in WebGL2.
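Concretely, that experiment might look something like this (a hand-rolled sketch, not library code; isWebGL2 is an assumed flag and a missing extension isn't handled):

// On a WebGL1 context, alias the ANGLE_instanced_arrays entry points and a
// couple of constants so the rest of the code can use the WebGL2 names.
if (!isWebGL2) {
  const ext = gl.getExtension('ANGLE_instanced_arrays');
  gl.drawArraysInstanced = ext.drawArraysInstancedANGLE.bind(ext);
  gl.drawElementsInstanced = ext.drawElementsInstancedANGLE.bind(ext);
  gl.vertexAttribDivisor = ext.vertexAttribDivisorANGLE.bind(ext);
  // WebGL1 has no sized internal formats; the unsized format stands in.
  gl.RGBA8 = gl.RGBA;
}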
So then comes the question: has anyone tried this?
I am worried about it hurting performance, especially the extra indirection for function calls.
It will also make the code less obvious to read if the assumption is that it can run in WebGL1. After all, no WebGL1 context has drawArraysInstanced or RGBA8. It also upsets TypeScript typings and other minor things.
The other option is to have branches all over the code: two versions of shaders (or #ifdef trickery), branching everywhere texture formats are needed, and everywhere instancing is done.
Having something like what follows all over the place is pretty ugly:
if (version === 1) {
  instancedArrays.vertexAttribDivisorANGLE(m0, 1);
  instancedArrays.vertexAttribDivisorANGLE(m1, 1);
  instancedArrays.vertexAttribDivisorANGLE(m2, 1);
  instancedArrays.vertexAttribDivisorANGLE(m3, 1);
} else {
  gl.vertexAttribDivisor(m0, 1);
  gl.vertexAttribDivisor(m1, 1);
  gl.vertexAttribDivisor(m2, 1);
  gl.vertexAttribDivisor(m3, 1);
}
Finally, maybe there's a third way I didn't think about.
Got any recommendations?

Unfortunately I think most answers will be primarily opinion-based.
The first question is why support both? If your idea runs fine on WebGL1 then just use WebGL1. If you absolutely must have WebGL2 features then use WebGL2 and realize that many devices don't support WebGL2.
If you're intent on doing it, twgl tries to make it easier by providing a function that copies all the WebGL1 extensions into their WebGL2 API positions. For example, as you mentioned, instead of
ext = gl.getExtension('ANGLE_instanced_arrays');
ext.drawArraysInstancedANGLE(...)
You instead do
twgl.addExtensionsToContext(gl);
gl.drawArraysInstanced(...);
I don't believe there will be any noticeable perf difference. Especially since those functions are only called a few hundred times a frame, the wrapping is not going to be the bottleneck in your code.
The point, though, is not really to support WebGL1 and WebGL2 at the same time. Rather, it's just to make the way you write code the same for both APIs.
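In practice the setup can be as small as this (a sketch of that approach; it assumes twgl is loaded and a canvas element exists):

// Prefer WebGL2; otherwise fall back to WebGL1 and copy the extension
// functions onto the context so later code sees one API surface.
const canvas = document.querySelector('canvas');
let gl = canvas.getContext('webgl2');
const isWebGL2 = !!gl;
if (!isWebGL2) {
  gl = canvas.getContext('webgl');
  twgl.addExtensionsToContext(gl);
}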
Still, there are real differences between the two APIs. For example, to create a FLOAT RGBA texture in WebGL1 you use
gl.texImage2D(target, level, gl.RGBA, width, height, 0, gl.RGBA, gl.FLOAT, ...)
In WebGL2 it's
gl.texImage2D(target, level, gl.RGBA32F, width, height, 0, gl.RGBA, gl.FLOAT, ...)
WebGL2 will fail if you try to call it the same as WebGL1 in this case. There are other differences as well.
Note though that your example of needing RGBA8 is not true.
gl.texImage2D(target, level, gl.RGBA, width, height, 0, gl.RGBA, gl.UNSIGNED_BYTE, ...)
will work just fine in WebGL1 and WebGL2. The spec specifically says that combination results in RGBA8 on WebGL2.
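One way to keep such differences in one place is a tiny lookup helper (a sketch; isWebGL2, width and height are assumed to exist in your code):

// Pick the internalFormat/format/type triple for a float RGBA texture.
function floatTextureFormat(gl, isWebGL2) {
  return {
    internalFormat: isWebGL2 ? gl.RGBA32F : gl.RGBA,
    format: gl.RGBA,
    type: gl.FLOAT,
  };
}
const f = floatTextureFormat(gl, isWebGL2);
gl.texImage2D(gl.TEXTURE_2D, 0, f.internalFormat, width, height, 0,
    f.format, f.type, null);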
The biggest difference, though, is that there is no reason to use WebGL2 if you can get by with WebGL1. Or, vice versa, if you need WebGL2 then you probably cannot easily fall back to WebGL1.
For example, you mentioned using defines for shaders, but what are you going to do about features in WebGL2 that aren't in WebGL1, like texelFetch, the integer % operator, integer attributes, etc.? If you need those features you mostly need to write a WebGL2-only shader. If you don't need them, then there was really no point in using WebGL2 in the first place.
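For illustration, a fragment shader like this sketch (GLSL ES 3.00) has no direct GLSL ES 1.00 translation, so a #define preamble can't bridge it:

// A WebGL2-only fragment shader: texelFetch and integer % don't exist in
// GLSL ES 1.00, so this can only be paired with a simpler WebGL1 fallback.
const fs300 = `#version 300 es
precision highp float;
uniform sampler2D u_data;
out vec4 outColor;
void main() {
  ivec2 texel = ivec2(gl_FragCoord.xy) % textureSize(u_data, 0);
  outColor = texelFetch(u_data, texel, 0);
}`;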
Of course, if you really want to go for it, maybe you want to make a fancier renderer when the user has WebGL2 and fall back to a simpler one on WebGL1.
TL;DR: IMO, pick one or the other.

I found this question while writing the documentation for my library, which has many objectives, but one of them is exactly this: to support WebGL1 and WebGL2 at the same time for higher cross-device compatibility.
https://xemantic.github.io/shader-web-background/
For example, I discovered with BrowserStack that Samsung phones don't support rendering to floating-point textures in WebGL1, while it works perfectly fine for them in WebGL2. At the same time, WebGL2 will never appear on Apple devices, but rendering to half-float textures is pretty well supported there.
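The capability check behind that kind of finding looks roughly like this (a sketch, not the library's actual code; it probes whether a float texture can be attached to a framebuffer):

// Returns true if the context can render into an RGBA float texture.
function canRenderToFloatTexture(gl, isWebGL2) {
  if (isWebGL2) {
    // Float textures are core in WebGL2, but rendering to them still
    // requires EXT_color_buffer_float.
    if (!gl.getExtension('EXT_color_buffer_float')) return false;
  } else if (!gl.getExtension('OES_texture_float')) {
    return false;
  }
  const tex = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, tex);
  gl.texImage2D(gl.TEXTURE_2D, 0, isWebGL2 ? gl.RGBA32F : gl.RGBA,
      1, 1, 0, gl.RGBA, gl.FLOAT, null);
  const fb = gl.createFramebuffer();
  gl.bindFramebuffer(gl.FRAMEBUFFER, fb);
  gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
      gl.TEXTURE_2D, tex, 0);
  const ok = gl.checkFramebufferStatus(gl.FRAMEBUFFER) === gl.FRAMEBUFFER_COMPLETE;
  gl.bindFramebuffer(gl.FRAMEBUFFER, null);
  gl.deleteFramebuffer(fb);
  gl.deleteTexture(tex);
  return ok;
}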
My library does not provide a full WebGL abstraction, but rather configures the pipeline for fragment shaders. Here is the source on GitHub, with the WebGL strategy code depending on the version:
https://github.com/xemantic/shader-web-background/blob/main/src/main/js/webgl-utils.js
So, to answer your question: it is doable and desirable, but doing it in a totally generic way, for every WebGL feature, might be quite challenging. I guess the first question to ask is "What would be the common denominator?" in terms of supported extensions.

Related

What exactly is a constant buffer (cbuffer) used for in HLSL?

Currently I have this code in my vertex shader class:
cbuffer MatrixBuffer {
    matrix worldMatrix;
    matrix viewMatrix;
    matrix projectionMatrix;
};
I don't know why I need to wrap those variables in a cbuffer. If I delete the buffer my code works as well. I would really appreciate it if someone could give me a brief explanation of why cbuffers are necessary.
The reason it works either way is due to the legacy way constants were handled in Direct3D 8/Direct3D 9. Back then, there was only a single shared array of constants for the entire shader (one for VS and one for PS). This required that you had to change the constant array every single time you called Draw.
In Direct3D 10, constants were reorganized into one or more Constant Buffers to make it easier to update some constants while leaving others alone, and thus sending less data to the GPU.
See the classic presentation Windows to Reality: Getting the Most out of Direct3D 10 Graphics in Your Games for a lot of details on the impacts of constant update.
The upshot here is that if you don't specify cbuffer, all the constants get put into a single implicit constant buffer bound to register b0 to emulate the old 'one constants array' behavior.
There are compiler flags to control the acceptance of legacy constructs: /Gec for backwards compatibility mode to support old Direct3D 8/9 intrinsics, and /Ges to enable a more strict compilation to weed out older constructs. That said, the HLSL compiler will pretty much always accept global constants without cbuffer and stick them into a single implicit constant buffer because this pattern is extremely common in shader code.

Vulkan texture rendering on multiple meshes

I am in the middle of rendering different textures on multiple meshes of a model, but I do not have many clues about the procedure. Someone suggested creating, for each mesh, its own descriptor set and calling vkCmdBindDescriptorSets() and vkCmdDrawIndexed() for rendering, like this:
// Pipeline with descriptor set layout that matches the shared descriptor sets
vkCmdBindPipeline(...pipelines.mesh...);
...
// Mesh A
vkCmdBindDescriptorSets(...&meshA.descriptorSet... );
vkCmdDrawIndexed(...);
// Mesh B
vkCmdBindDescriptorSets(...&meshB.descriptorSet... );
vkCmdDrawIndexed(...);
However, the above approach is quite different from the chopper sample and Vulkan's samples, which leaves me with no idea where to start the change. I really appreciate any help pointing me in the right direction.
Cheers
You have a conceptual object which is made of multiple meshes which have different texturing needs. The general ways to deal with this are:
Change descriptor sets between parts of the object. Painful, but it works on all Vulkan-capable hardware.
Employ array textures. Each individual mesh fetches its data from a particular layer in the array texture. Of course, this restricts you to having each sub-mesh use textures of the same size. But it works on all Vulkan-capable hardware (up to 128 array elements, minimum). The array layer for a particular mesh can be provided as a push-constant, or a base instance if that's available.
Note that if you manage to be able to do it by base instance, then you can render the entire object with a multi-draw indirect command. Though it's not clear that a short multi-draw indirect would be faster than just baking a short sequence of drawing commands into a command buffer.
Employ sampler arrays, as Sascha Willems suggests. Presumably, the array index for the sub-mesh is provided as a push-constant or a multi-draw's draw index. The problem is that, regardless of how that array index is provided, it will have to be a dynamically uniform expression. And Vulkan implementations are not required to allow you to index a sampler array with a dynamically uniform expression. The base requirement is just a constant expression.
This limits you to hardware that supports the shaderSampledImageArrayDynamicIndexing feature. So you have to ask for that, and if it's not available, then you've got to work around that with #1 or #2. Or just don't run on that hardware. But the last one means that you can't run on any mobile hardware, since most of them don't support this feature as of yet.
Note that I am not saying you shouldn't use this method. I just want you to be aware that there are costs. There's a lot of hardware out there that can't do this. So you need to plan for that.
The person who suggested the above code fragment was me, I guess ;)
This is only one way of doing it. You don't necessarily have to create one descriptor set per mesh or per texture. If your mesh e.g. uses 4 different textures, you could bind all of them at once to different binding points and select them in the shader.
And if you take a look at NVIDIA's chopper sample, they do it pretty much the same way, only with some more abstraction.
The example also sets up descriptor sets for the textures used:
VkDescriptorSet *textureDescriptors = m_renderer->getTextureDescriptorSets();
binds them a few lines later:
VkDescriptorSet sets[3] = { sceneDescriptor, textureDescriptors[0], m_transform_descriptor_set };
vkCmdBindDescriptorSets(m_draw_command[inCommandIndex], VK_PIPELINE_BIND_POINT_GRAPHICS, layout, 0, 3, sets, 0, NULL);
and then renders the mesh with the bound descriptor sets:
vkCmdDrawIndexedIndirect(m_draw_command[inCommandIndex], sceneIndirectBuffer, 0, inCount, sizeof(VkDrawIndexedIndirectCommand));
vkCmdDraw(m_draw_command[inCommandIndex], 1, 1, 0, 0);
If you take a look at initDescriptorSets you can see that they also create separate descriptor sets for the cubemap, the terrain, etc.
The LunarG examples should work similarly, though if I'm not mistaken they never use more than one texture?

getShaderPrecisionFormat return values

I'm confused about the method getShaderPrecisionFormat: what it's used for and what it's telling me, because for me it always returns the exact same precision for all arguments; the only differences are between INT and FLOAT.
To be clear:
calls with gl.FRAGMENT_SHADER and gl.VERTEX_SHADER in combination with gl.LOW_FLOAT, gl.MEDIUM_FLOAT and gl.HIGH_FLOAT always return
WebGLShaderPrecisionFormat { precision: 23, rangeMax: 127, rangeMin: 127 }
calls with gl.FRAGMENT_SHADER and gl.VERTEX_SHADER in combination with gl.LOW_INT, gl.MEDIUM_INT and gl.HIGH_INT always return
WebGLShaderPrecisionFormat { precision: 0, rangeMax: 24, rangeMin: 24 }
I also experimented with supplying two additional arguments, "range" and "precision", but was unable to get any different results. I assume I made a mistake, but from the docs I'm unable to figure out on my own how to use it correctly.
It looks like you're using these calls correctly.
If you're running on a desktop/laptop, the result is not surprising. I would expect WebGL to be layered on top of a full OpenGL implementation on such systems. Even if these systems support ES 2.0, which mostly matches the WebGL feature level, that's most likely just a reduced API that ends up using the same underlying driver/GPU features as the full OpenGL implementation.
Full OpenGL does not really support precision qualifiers. It does have the keywords in GLSL, but that's just for source-code compatibility with OpenGL ES. In the words of the GLSL 4.50 spec:
Precision qualifiers are added for code portability with OpenGL ES, not for functionality. They have the same syntax as in OpenGL ES, as described below, but they have no semantic meaning, which includes no effect on the precision used to store or operate on variables.
It then goes on to define the use of IEEE 32-bit floats, which have the 23 bits of precision you are seeing from your calls.
You would most likely get a different result if you try the same thing on a mobile device, like a phone or tablet. Many mobile GPUs support 16-bit floats (aka "half floats"), and take advantage of them. Some of them can operate on half floats faster than they can on floats, and the reduced memory usage and bandwidth is beneficial even if the operations themselves are not faster. Reducing memory/bandwidth usage is critical on mobile devices to improve performance, as well as power efficiency.
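For reference, the call takes exactly the two enum arguments, and the range/precision values come back on the returned WebGLShaderPrecisionFormat object. A quick way to dump what an implementation reports (a sketch, assuming an existing gl context):

// rangeMin/rangeMax are log2 of the absolute values of the minimum and
// maximum representable values; precision is the number of mantissa bits
// (0 for integer formats).
for (const shader of ['VERTEX_SHADER', 'FRAGMENT_SHADER']) {
  for (const prec of ['LOW_FLOAT', 'MEDIUM_FLOAT', 'HIGH_FLOAT',
                      'LOW_INT', 'MEDIUM_INT', 'HIGH_INT']) {
    const fmt = gl.getShaderPrecisionFormat(gl[shader], gl[prec]);
    console.log(shader, prec, fmt.precision, fmt.rangeMin, fmt.rangeMax);
  }
}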

Interpreting glReadPixels()

After more than an hour of looking for an answer (trying stuff in Munshi's "OpenGL ES 2.0 Programming Guide", searching Apple's documentation, searching StackOverflow), I'm still at a loss for getting glReadPixels to work. I've tried so many different ways, and the best I've got is fluctuating (and therefore wrong) results.
I've set up the simple case of a quad being rendered with shaders to the screen, and I've manually assigned gl_FragColor to pure red, so there should be absolutely no fluctuation on the screen. Then I try something like the following code before presentRenderbuffer:
GLubyte *pixels = (GLubyte *)malloc(3);
glReadPixels(100, 100, 1, 1, GL_RGB, GL_UNSIGNED_BYTE, pixels);
NSLog(@"%d", (int)pixels[0]);
free(pixels);
Basically, I'm trying to read a single pixel at (100, 100) and get its red value, which I expect to be either 1 or 255. Instead I get values like 1522775, 3587, and 65536, though the image on the screen never changes. I did something like this on the Mac and it worked fine, but for some reason I can't get it to work on iOS. I have the above statement (and have tried a number of variations I've come across on the internet) after the call to glDrawArrays() and before the presentRenderbuffer: call. I've even tried the method from the "OpenGL ES 2.0 Programming Guide" that handles all combinations of read types and read formats by querying glGetIntegerv() for the framebuffer's information.
Any ideas? I'm sure someone will say, "use the search feature," but I've seriously come up dry on it and can't get any further. Thanks for your help!
From the OpenGL ES manual:
Only two format/type parameter pairs are accepted. GL_RGBA/GL_UNSIGNED_BYTE is always accepted, and the other acceptable pair can be discovered by querying GL_IMPLEMENTATION_COLOR_READ_FORMAT and GL_IMPLEMENTATION_COLOR_READ_TYPE.
So you'd better use GL_RGBA instead of GL_RGB (and allocate four bytes per pixel accordingly).

How do I choose a pixel format when creating a new Texture2D?

I'm using the SharpDX Toolkit, and I'm trying to create a Texture2D programmatically, so I can manually specify all the pixel values. And I'm not sure what pixel format to create it with.
SharpDX doesn't even document the toolkit's PixelFormat type (they have documentation for another PixelFormat class but it's for WIC, not the toolkit). I did find the DirectX enum it wraps, DXGI_FORMAT, but its documentation doesn't give any useful guidance on how I would choose a format.
I'm used to plain old 32-bit bitmap formats with 8 bits per color channel plus 8-bit alpha, which is plenty good enough for me. So I'm guessing the simplest choices will be R8G8B8A8 or B8G8R8A8. Does it matter which I choose? Will they both be fully supported on all hardware?
And even once I've chosen one of those, I then need to further specify whether it's SInt, SNorm, Typeless, UInt, UNorm, or UNormSRgb. I don't need the sRGB colorspace. I don't understand what Typeless is supposed to be for. UInt seems like the simplest -- just a plain old unsigned byte -- but it turns out it doesn't work; I don't get an error, but my texture won't draw anything to the screen. UNorm works, but there's nothing in the documentation that explains why UInt doesn't. So now I'm paranoid that UNorm might not work on some other video card.
Here's the code I've got, if anyone wants to see it. Download the SharpDX full package, open the SharpDXToolkitSamples project, go to the SpriteBatchAndFont.WinRTXaml project, open the SpriteBatchAndFontGame class, and add code where indicated:
// Add new field to the class:
private Texture2D _newTexture;
// Add at the end of the LoadContent method:
_newTexture = Texture2D.New(GraphicsDevice, 8, 8, PixelFormat.R8G8B8A8.UNorm);
var colorData = new Color[_newTexture.Width*_newTexture.Height];
_newTexture.GetData(colorData);
for (var i = 0; i < colorData.Length; ++i)
colorData[i] = (i%3 == 0) ? Color.Red : Color.Transparent;
_newTexture.SetData(colorData);
// Add inside the Draw method, just before the call to spriteBatch.End():
spriteBatch.Draw(_newTexture, new Vector2(0, 0), Color.White);
This draws a small rectangle with diagonal lines in the top left of the screen. It works on the laptop I'm testing it on, but I have no idea how to know whether that means it's going to work everywhere, nor do I have any idea whether it's going to be the most performant.
What pixel format should I use to make sure my app will work on all hardware, and to get the best performance?
The formats in the SharpDX Toolkit map to the underlying DirectX/DXGI formats, so you can, as usual with Microsoft products, get your info from the MSDN:
DXGI_FORMAT enumeration (Windows)
32-bit textures are a common choice for most texture scenarios and perform well even on older hardware. UNorm means, as already answered in the comments, "in the range of 0.0 .. 1.0" and is, again, a common way to access color data in textures.
If you look at the Hardware Support for Direct3D 10Level9 Formats (Windows) page, you will see that DXGI_FORMAT_R8G8B8A8_UNORM as well as DXGI_FORMAT_B8G8R8A8_UNORM are supported on DirectX 9 hardware. You will not run into compatibility problems with either of them.
Performance depends on how your device is initialized (RGBA or BGRA?) and on what hardware (i.e. supported DX feature level) and OS you are running your software. You will have to run your own tests to find out (though in the case of these common and similar formats the difference should be a single-digit percentage at most).
