Optimise OpenGL ES 2.0 2D drawing using dirty rectangles - ios

Is it possible to optimise OpenGL ES 2.0 drawing by using dirty rectangles?
In my case, I have a 2D app that needs to draw a background texture (full screen on iPad), followed by the contents of several VBOs on each frame. The problem is that these VBOs can potentially contain millions of vertices, taking anywhere up to a couple of seconds to draw everything to the display. However, only a small fraction of the display would actually be updated each frame.
Is this optimisation possible, and how (or perhaps more appropriately, where) would this be implemented? Would some kind of clipping plane need to be passed into the vertex shader?

Restricting the drawn area is really a job for the scissor test (glScissor) rather than glViewport alone: the viewport remaps your scene into the given rectangle, while the scissor actually rejects fragments outside it. Either way, this happens after the vertex shader stage, just before rasterization. As the GL cannot know the result of your own vertex program in advance, it cannot sort out any vertex before running the vertex program; only after that stage can it clip, and how efficiently it does so depends on the actual GPU.
Thus you have to sort and split your objects into smaller (e.g. rectangularly bounded) tiles and test them against the field of view (or your dirty region) yourself for full performance.
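To make the fragment-level half of this concrete, here is a minimal, hypothetical sketch of a dirty-rectangle redraw using the scissor test in OpenGL ES 2.0 on iOS. On iOS you would also need retained backing (kEAGLDrawablePropertyRetainedBacking) so the untouched part of the previous frame survives presentRenderbuffer:. And, as noted above, scissoring only saves fragment work; every submitted vertex still runs through the vertex shader, so the per-tile culling has to happen on the CPU. The draw helpers are placeholders, not real functions.

```c
// Hypothetical sketch (OpenGL ES 2.0 on iOS): limit redrawing to a dirty
// rectangle with the scissor test. Coordinates are window pixels, origin at
// the lower-left corner.
#include <OpenGLES/ES2/gl.h>

void redrawDirtyRect(GLint x, GLint y, GLsizei w, GLsizei h)
{
    glEnable(GL_SCISSOR_TEST);
    glScissor(x, y, w, h);               // fragments outside are rejected

    glClear(GL_COLOR_BUFFER_BIT);        // clears only the scissored area

    // drawBackgroundQuad();             // hypothetical: background texture
    // drawVBOsIntersecting(x, y, w, h); // hypothetical: CPU-culled tiles only

    glDisable(GL_SCISSOR_TEST);
}
```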

Related

Making GL_POINTS look like a 3D Rectangle

Suppose one has an array of GL_POINTS and wants to make each appear to have a distinct "height" or "depth", so instead of appearing like a flat scatter of squares they appear to be a scatter of 3D rectangles / right rectangular prisms.
Is there a technique in WebGL that will allow one to achieve this effect? One could of course use vertices that actually articulate those 3D rectangles, but my goal is to optimize for performance as I have ~100,000 of these rectangles to render, and I thought points would be the best primitive to use.
Right now I am thinking one could probably use a series of point sprites each with varying depth, then assign each point the sprite that corresponds most closely with the desired depth (effectively quantizing the depth data field). But is there a way to keep the depth field continuous?
Any pointers on this question would be greatly appreciated!
In my experience POINTS are not faster than making your own vertices. Also, if you use instanced drawing you can get away with almost the same amount of data: you need one quad and then a position, width, and height for each rectangle. I'm not sure instancing is as fast as just making all the vertices, though; it might depend on the GPU/driver.
As pointed out 😄 in many other Q&As on points, the maximum point size is GPU/driver specific and is allowed to be as low as 1 pixel. There are plenty of GPUs that cap the point size at 256 pixels (no idea why) and a few that cap it at 64. Yet another reason not to use POINTS.
Otherwise, though, POINTS always draw a square, so you'd have to draw a square large enough to contain your rectangle and then, in the fragment shader, discard the pixels outside the rectangle.
That's unlikely to be good for speed, though. Every pixel of the square still needs to be evaluated by the fragment shader, which is slower than drawing a rectangle with vertices, since then the pixels outside the rectangle are not even considered. Further, using discard in a shader is often slower than not using it. Take depth-buffer updates as an example: if there is no discard, nothing needs to be checked and the depth buffer can be updated unconditionally, separately from the shader. With discard, the depth buffer can't be updated until the GPU knows whether the shader kept or discarded the fragment.
As for making them appear 3D I'm not sure what you mean. Effectively points are just like drawing a square quad so you can put anything you want on that square. The majority of shaders on shadertoy can be adapted to draw themselves on points. I wouldn't recommend it as it would likely be slow but just pointing out that it's just a quad. Draw a texture on them, draw a procedural texture on them, draw a solid color on them, draw a procedural snail on them.
Another possible solution is to apply a normal map to the quad and then do lighting calculations on those normals, so each quad will have the correct lighting for its position relative to your light(s).
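For reference, here is a hedged sketch of the instanced-quad alternative mentioned above, written against the GL ES 2.0 EXT_instanced_arrays extension; WebGL's ANGLE_instanced_arrays exposes the equivalent calls (vertexAttribDivisorANGLE, drawArraysInstancedANGLE). The buffer names and attribute locations are assumptions. A continuous per-instance depth value could be added exactly like the rectangle data, which avoids quantizing the depth field.

```c
// Hedged sketch: draw many rectangles as instanced unit quads instead of
// GL_POINTS. quadBuf holds 6 corner vertices of a unit quad; instBuf holds
// one (x, y, width, height) per rectangle.
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>

void drawRects(GLuint quadBuf, GLuint instBuf, GLsizei rectCount)
{
    // Per-vertex unit-quad corners at (assumed) attribute location 0.
    glBindBuffer(GL_ARRAY_BUFFER, quadBuf);
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, 0);

    // Per-instance rectangle (x, y, w, h) at (assumed) attribute location 1,
    // advanced once per instance rather than once per vertex.
    glBindBuffer(GL_ARRAY_BUFFER, instBuf);
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 0, 0);
    glVertexAttribDivisorEXT(1, 1);

    // In the vertex shader the unit corner is expanded by the per-instance
    // rect, e.g. position = corner * rect.zw + rect.xy.
    glDrawArraysInstancedEXT(GL_TRIANGLES, 0, 6, rectCount);
}
```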

What is the fastest way to render subsets of a single texture onto a WebGL canvas?

If you have a single power-of-two width/height texture (say 2048) and you want to blit out scaled and translated subsets of it (say 64x92-sized tiles scaled down) as quickly as possible onto another texture (as a buffer so it can be cached when not dirty), then draw that texture onto a webgl canvas, and you have no more requirements - what is the fastest strategy?
Is it first loading the source texture, binding an empty texture to a framebuffer, rendering the source with drawElementsInstancedANGLE to the framebuffer, then unbinding the framebuffer and rendering to the canvas?
I don't know much about WebGL and I'm trying to write a non-stateful version of https://github.com/kutuluk/js13k-2d (that just uses draw() calls instead of sprites that maintain state, since I would have millions of sprites). Before I get too far into the weeds, I'm hoping for some feedback.
There is no generic fastest way. The fastest way differs by GPU and also by the specifics of what you're drawing.
Are you drawing lots of things the same size?
Are the parts of the texture atlas the same size?
Will you be rotating or scaling each instance?
Can their movement be based on time alone?
Will their drawing order change?
Do the textures have transparency?
Is that transparency 100% or not (0 or 1) or is it various values in between?
I'm sure there are tons of other considerations. For every consideration I might choose a different approach.
In general your idea of using drawElementsInstancedANGLE seems fine, but without knowing exactly what you're trying to do and on which devices, it's hard to say.
Here are some tests of drawing lots of stuff.
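As one concrete reference point for the pipeline the question proposes (render to a cached texture, then to the canvas), here is a hedged sketch using plain GL ES 2.0 calls; the WebGL API mirrors them (gl.createFramebuffer, gl.framebufferTexture2D, and so on). The draw helpers are hypothetical placeholders for your own tile and present code.

```c
// Hedged sketch: cache composited tiles in an offscreen texture via a
// framebuffer object, re-render it only when dirty, then draw that texture
// to the default framebuffer every frame.
#include <GLES2/gl2.h>

GLuint cacheTex, cacheFbo;

void createCache(GLsizei w, GLsizei h)
{
    glGenTextures(1, &cacheTex);
    glBindTexture(GL_TEXTURE_2D, cacheTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, w, h, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

    glGenFramebuffers(1, &cacheFbo);
    glBindFramebuffer(GL_FRAMEBUFFER, cacheFbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, cacheTex, 0);
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
}

void render(int dirty)
{
    if (dirty) {
        glBindFramebuffer(GL_FRAMEBUFFER, cacheFbo);
        // glViewport(0, 0, cacheWidth, cacheHeight);  // match the cache size
        /* drawTileSubsetsFromAtlas(); */  // hypothetical: blit scaled tiles
        glBindFramebuffer(GL_FRAMEBUFFER, 0);
    }
    /* drawFullscreenQuad(cacheTex); */    // hypothetical: present the cache
}
```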

WebGL: How to interact between javascript and shaders, and how to use multiple shaders

I have seen demos on WebGL that
color rectangular surface
attach textures to the rectangles
draw wireframes
have semitransparent textures
What I do not understand is how to combine these effects into a single program, and how to interact with objects to change their look.
Suppose I want to create a scene with all the above, and have the ability to change the color of any rectangle, or change the texture.
I am trying to understand the organization of the code. Here are some short, related questions:
I can create a vertex buffer with corresponding color buffer. Can I have some rectangles with texture and some without?
If not, I have to create one vertex buffer for all objects with colors, and another with textures. Can I attach a different texture to each rectangle in a vector?
For a case with some rectangles with colors, and others with textures, it requires two different shader programs. All the demos I see have only one, but clearly more complicated programs have multiple. How do you switch between shaders?
How to draw wireframe on and off? Can it be combined with textures? In other words, is it possible to write a shader that can turn features like wireframe on and off with a flag, or does it take two different calls to two different shaders?
All the demos I have seen use an index buffer with triangles. Are quads no longer supported in WebGL? Obviously for some things triangles would be needed, but if I have a bunch of rectangles it would be nice not to have to create an index of triangles.
For all three of the above scenarios, if I want to change the points, the color, the texture, or the transparency, am I correct in understanding that glSubBuffer will allow replacing data currently in the buffer with new data?
Is it reasonable to have a single object maintaining these kinds of objects and updating color and textures, or is this not a good design?
The question you ask is not just about WebGL, but also about OpenGL and 3D.
The most common way to interact is to set attributes at the start and uniforms both at the start and while the program runs.
In general, the answer to all of your questions is "use an engine".
Imagine it like this: you have JavaScript, a CPU-based language; then you have WebGL, which is like a library for JS that allows low-level communication with the GPU (remember, low level); and then you have the shader, which is a GPU program you must provide, and it only works with specific data.
Doing anything that is more than "simple" requires a tool that lets you skip using WebGL directly (and very often also skip writing shaders directly). That tool we call an engine. An engine usually binds together some set of abilities and skips others (the difference between a 2D and a 3D engine, for example). Engine functions call preset WebGL functions in a specific order, so you never have to touch the WebGL API yourself. An engine also provides fairly complicated logic for building only a single pair, or a few pairs, of shaders from a few simple engine API calls. The reason is that swapping the shader program at run time is costly.
Your questions
I can create a vertex buffer with corresponding color buffer. Can I
have some rectangles with texture and some without? If not, I have to
create one vertex buffer for all objects with colors, and another with
textures. Can I attach a different texture to each rectangle in a
vector?
Let's have a buffer, which we call a vertex buffer. We put various data into the vertex buffer. The data doesn't go in as individual values but as sets; each unique piece of data in a set we call an attribute. An attribute can have any meaning for its vertex that the vertex shader or fragment shader code decides.
If we have a buffer full of data for triangles, it is possible to add, for example, an attribute that says whether a specific vertex should texture the triangle or not, and to do the texturing logic in the shader. Anyway, I think the data size of the attributes must be equal for each vertex (so textured triangles will take up the same space as non-textured ones).
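As a hedged illustration of that per-vertex flag idea (all names are made up), a GLSL ES shader pair could blend between the vertex color and the texture sample based on an aUseTexture attribute:

```c
// Hedged sketch: per-vertex "use texture" flag choosing between vertex
// color and texture in the fragment shader. GLSL ES 1.00, stored as C
// string constants the way a GL ES / WebGL app typically embeds shaders.
static const char *vsSrc =
    "attribute vec2  aPosition;                         \n"
    "attribute vec2  aTexCoord;                         \n"
    "attribute vec4  aColor;                            \n"
    "attribute float aUseTexture;                       \n"
    "varying vec2  vTexCoord;                           \n"
    "varying vec4  vColor;                              \n"
    "varying float vUseTexture;                         \n"
    "void main() {                                      \n"
    "  vTexCoord   = aTexCoord;                         \n"
    "  vColor      = aColor;                            \n"
    "  vUseTexture = aUseTexture;                       \n"
    "  gl_Position = vec4(aPosition, 0.0, 1.0);         \n"
    "}                                                  \n";

static const char *fsSrc =
    "precision mediump float;                           \n"
    "uniform sampler2D uSampler;                        \n"
    "varying vec2  vTexCoord;                           \n"
    "varying vec4  vColor;                              \n"
    "varying float vUseTexture;                         \n"
    "void main() {                                      \n"
    "  vec4 texel  = texture2D(uSampler, vTexCoord);    \n"
    "  gl_FragColor = mix(vColor, texel, vUseTexture);  \n"  // 0 = color, 1 = texture
    "}                                                  \n";
```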
For a case with some rectangles with colors, and others with textures,
it requires two different shader programs. All the demos I see have
only one, but clearly more complicated programs have multiple. How do
you switch between shaders?
Not true; even very complicated programs might have only one pair of shaders (one WebGL program). But it is still possible to change programs on the fly:
https://www.khronos.org/registry/webgl/specs/latest/1.0/#5.14.9
WebGL API function useProgram
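For instance, a hedged sketch of switching programs mid-frame; glUseProgram here behaves exactly like gl.useProgram in WebGL, and the program handles and draw helpers are assumed to exist already:

```c
// Hedged sketch: one program for flat-colored rectangles, another for
// textured ones, switched per group of draw calls.
#include <GLES2/gl2.h>

void drawScene(GLuint colorProgram, GLuint textureProgram)
{
    glUseProgram(colorProgram);
    /* drawColoredRects(); */    // hypothetical

    glUseProgram(textureProgram);
    /* drawTexturedRects(); */   // hypothetical

    // Group draws by program: switching programs has a cost, so switch as
    // few times per frame as you can.
}
```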
How to draw wireframe on and off? Can it be combined with textures? In
other words, is it possible to write a shader that can turn features
like wireframe on and off with a flag, or does it take two different
calls to two different shaders?
WebGL has no dedicated wireframe mode (there is no polygon-mode switch); instead you draw the same vertices with line primitives (gl.LINES, gl.LINE_STRIP, gl.LINE_LOOP) rather than gl.TRIANGLES. That choice is independent of the shader program and you can switch it with each draw call. Anyway, it is also possible to write a shader that draws a wireframe-like effect and control it with a flag (the flag may be either a uniform or an attribute).
All the demos I have seen use an index buffer with triangles. Are quads
no longer supported in WebGL? Obviously for some things triangles
would be needed, but if I have a bunch of rectangles it would be nice
not to have to create an index of triangles.
WebGL supports only points, lines, and triangles; quads are not supported. I guess it is because without quads the implementation is simpler, and a quad is easily built from two triangles with an index buffer anyway.
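Since the asker would rather not hand-build triangle indices, here is a small, hedged helper that generates them for a list of quads; the 4-consecutive-vertices-per-quad ordering (bottom-left, bottom-right, top-left, top-right) is an assumption of the sketch.

```c
// Hedged sketch: build a triangle index buffer for quadCount quads, each
// contributing 4 consecutive vertices ordered bl, br, tl, tr.
#include <GLES2/gl2.h>

void fillQuadIndices(GLushort *indices, int quadCount)
{
    for (int q = 0; q < quadCount; ++q) {
        GLushort base = (GLushort)(q * 4);
        indices[q * 6 + 0] = base + 0;   // first triangle: bl, br, tl
        indices[q * 6 + 1] = base + 1;
        indices[q * 6 + 2] = base + 2;
        indices[q * 6 + 3] = base + 2;   // second triangle: tl, br, tr
        indices[q * 6 + 4] = base + 1;
        indices[q * 6 + 5] = base + 3;
    }
    // Upload once with glBufferData(GL_ELEMENT_ARRAY_BUFFER, ...) and draw
    // with glDrawElements(GL_TRIANGLES, quadCount * 6, GL_UNSIGNED_SHORT, 0).
}
```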
For all three of the above scenarios, if I want to change the points,
the color, the texture, or the transparency, am I correct in
understanding that glSubBuffer will allow replacing data currently in
the buffer with new data?
I would say it is rare to update buffer data on the run, and doing it a lot can slow a program down. glSubBuffer as such is not in WebGL; the call you are after is gl.bufferSubData (glBufferSubData in OpenGL), which replaces a range of an existing buffer with new data. Anyway, I would avoid it where you can ;)
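A minimal sketch of that call, assuming a color buffer laid out as four RGBA floats per vertex (the layout and names are made up for the example); the WebGL form is gl.bufferSubData(gl.ARRAY_BUFFER, byteOffset, data):

```c
// Hedged sketch: overwrite the color of one rectangle (4 vertices) in place
// inside an existing buffer, without re-uploading the whole buffer.
#include <GLES2/gl2.h>

void recolorRect(GLuint colorBuf, int firstVertex, const GLfloat newRGBA[4])
{
    GLfloat colors[16];                      // 4 vertices * RGBA
    for (int v = 0; v < 4; ++v)
        for (int c = 0; c < 4; ++c)
            colors[v * 4 + c] = newRGBA[c];

    glBindBuffer(GL_ARRAY_BUFFER, colorBuf);
    glBufferSubData(GL_ARRAY_BUFFER,
                    firstVertex * 4 * sizeof(GLfloat),   // byte offset
                    sizeof(colors), colors);             // new data
}
```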
Is it reasonable to have a single object maintaining these kinds of
objects and updating color and textures, or is this not a good design?
Yes, it is called a scene graph; it is widely used and can also be combined with other techniques like display lists.

iOS OpenGL ES 2.0 VBO confusion

I'm attempting to render a large number of textured quads on the iPhone. To improve render speeds I've created a VBO that I leverage to render my objects in a single draw call. This seems to work well, but I'm new to OpenGL and have run into issues when it comes to providing a unique transform for each of my quads (ultimately I'm looking for each quad to have a custom scale, position and rotation).
After a decent amount of Googling, it appears that the standard means of handling this situation is to pass a uniform matrix to the vertex shader and to have each quad take care of rendering itself. But this approach seems to negate the purpose of the VBO, by ultimately requiring a draw call per object.
In my mind, it makes sense that each object should keep its own model-view matrix, using it to transform, scale and rotate the object as necessary. But applying separate matrices to objects in a VBO has me lost. I've considered two approaches:
Send the model view matrix to the vertex shader as a non-uniform attribute and apply it within the shader.
Or transform the vertex data before it's stored in the VBO and sent to the GPU
But the fact that I'm finding it difficult to find information on how best to handle this leads me to believe I'm confusing the issue. What's the best way of handling this?
This is the "evergreen" question (a good one) on how to optimize the rendering of many simple geometries (a quad is in fact 2 triangles, 6 vertices most of the time unless we use a strip).
Anyway, the use of a VBO versus a VAO in this case should not give a significant advantage, since the amount of data to be transferred to the memory buffer is rather low (32 bytes per vertex, 96 bytes per triangle, 192 per quad), which is not a big effort for today's memory bandwidth (though it depends on how many quads you mean: if you have 20,000 quads per frame then it could become a problem anyway).
A possible approach could be to batch the drawing of the quads by building a new VAO each frame with the different quads positioned in your own coordinate system, something like shifting the quads' vertices to the correct position relative to a "virtual" mesh origin. Then you just perform a single draw of the newly created mesh in your VAO.
In this way, you could batch the drawing of multiple objects in fewer calls.
The problem arises if your quads need to scale and rotate and not just translate: you can compute the actual vertex positions on the CPU, but that could be way too costly in terms of computing power.
A simple suggestion, on top of the way you transfer the meshes, is to use a texture atlas for all the textures of your quads; this way you will need far fewer (if any) texture bind operations, which can be costly during rendering.
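A hedged sketch of that batching idea, limited to translation plus uniform scale; the structs and field names are assumptions, and the position/UV attribute pointers are assumed to already be set up against the dynamic VBO:

```c
// Hedged sketch: pre-transform every quad on the CPU into one shared vertex
// array each frame, stream it into a GL_DYNAMIC_DRAW VBO, and submit the
// whole batch with a single draw call.
#include <OpenGLES/ES2/gl.h>

typedef struct { float x, y, u, v; } Vertex;
typedef struct { float x, y, scale; Vertex corners[6]; } Quad; // unit quad, 2 triangles

void streamQuads(GLuint dynamicVbo, const Quad *quads, int count, Vertex *scratch)
{
    for (int q = 0; q < count; ++q) {
        for (int i = 0; i < 6; ++i) {
            Vertex v = quads[q].corners[i];           // unit-quad corner + UV
            v.x = v.x * quads[q].scale + quads[q].x;  // CPU-side "model" transform
            v.y = v.y * quads[q].scale + quads[q].y;
            scratch[q * 6 + i] = v;
        }
    }

    // One upload, one draw call for the whole batch.
    glBindBuffer(GL_ARRAY_BUFFER, dynamicVbo);
    glBufferData(GL_ARRAY_BUFFER, (GLsizeiptr)(count * 6 * sizeof(Vertex)),
                 scratch, GL_DYNAMIC_DRAW);
    glDrawArrays(GL_TRIANGLES, 0, count * 6);
}
```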

How to batch sprites in iOS/OpenGL ES 2.0

I have developed my own sprite library on top of OpenGL ES 2.0. Right now, I am not doing any batching of draw calls; instead, each sprite has its own VBO/VAO of four textured vertices, drawn as a triangle strip (The VAO/VBO itself is managed by the Texture atlas, so identical sprites reuse the same VAO/VBO, which is 'reference counted' and hence deleted when no sprite instances reference it).
Before drawing each sprite, I'll bind its texture, upload its uniforms/attributes to the shader (modelview matrix, opacity - Projection matrix stays constant all along), bind its Vertex Array Object (4 textured vertices + four indices), and call glDrawElements(). I do cull off-screen sprites (based on position and bounds), but still it is one draw call per sprite, even if all sprites share the same texture. The vertex positions and texture coordinates for each sprite never change.
I must say that, despite this inefficiency, I have never experienced performance issues, even when drawing many sprites on screen. I do split the sprites into opaque/non-opaque, draw the opaque ones first, and the non-opaque ones after, back to front. I have seen performance suffer only when I overdraw (tax the fill rate).
Nevertheless, the OpenGL instruments in Xcode will complain that I draw too many small meshes and that I should consolidate my geometry into less objects. And in the Unity world everyone talks about limiting the number of draw calls as if they were the plague.
So, how should I go around batching very many sprites, each with a different transform and opacity value (but the same texture), into one draw call? One thing that comes to mind is to modify the vertex data every frame, and stream it: applying the modelview matrix of each sprite to all its vertices, assembling the transformed vertices for all sprites into one mesh, and submitting it to the GPU. This approach does not solve the problem of varying opacity between sprites.
Another idea that comes to mind is to have all the textured vertices of all the sprites assembled into a single mesh (VBO), treated as 'static' (same vertex format I am using now), and a separate array with the stuff that changes per sprite every frame (transform matrix and opacity), and only stream that data each frame, and pull it/apply it on the vertex shader side. That is, have a separate array where the 'attribute' being represented is the modelview matrix/alpha for the corresponding vertices. Still have to figure out the exact implementation in terms of data format/strides etc. In any case, there is the additional complication that arises whenever a new sprite is created/destroyed, the whole mesh has to be modified...
Or perhaps there is an ideal, 'textbook' solution to this problem out there that I haven't figured out? What does cocos2d do?
When I initially started reading your post I thought that each quad used a different texture (since you stated "Before drawing each sprite, I'll bind its texture"), but then you said that each sprite has "the same texture".
A possible easy win is to control the way you bind your textures during the draw, since each call is a burden for the OpenGL driver. If (and I am not really sure about this from your post) you use different textures, I suggest going for a simple texture atlas where all the sprites are inside a single picture (preferably a power-of-two texture with mipmapping), and then you take the piece of the texture you need in the fragment shader using texture coordinates (this is the reason they exist, in the end).
If the positions of the sprites change at each frame (of course they do), a possible advantage would be to pack the new vertex coordinates of your sprites each frame and draw directly from memory (possibly via a VAO; a VBO could cost more since you would need to rebuild it each frame, to be tested in a real scenario). This would pack the calls nicely and I am pretty sure it would boost performance.
Consider that the VAO option should be feasible, since we are talking about a very small amount of data and the memory bandwidth should not be a real bottleneck (each quad, I guess, uses 12 floats for vertex coordinates, 8 for texture coordinates and 12 for normals: 128 bytes), so it shouldn't be a big problem.
About opacity, can't you pass a uniform to your fragment shader and use it to modulate the alpha? Am I wrong? It should work, though once the sprites are merged into a single batched draw call a uniform can no longer vary per sprite, so there you would need a per-vertex attribute instead (see the sketch below).
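A hedged sketch of that per-vertex variant, with made-up attribute and uniform names: the opacity is duplicated into each of a sprite's vertices, so a single batched draw call can still fade sprites individually.

```c
// Hedged sketch: per-vertex opacity for batched sprites, GLSL ES 1.00
// stored as C string constants.
static const char *batchVs =
    "attribute vec2  aPosition;                               \n"
    "attribute vec2  aTexCoord;                               \n"
    "attribute float aOpacity;                                \n"
    "uniform mat4 uProjection;                                \n"
    "varying vec2  vTexCoord;                                 \n"
    "varying float vOpacity;                                  \n"
    "void main() {                                            \n"
    "  vTexCoord   = aTexCoord;                               \n"
    "  vOpacity    = aOpacity;                                \n"
    "  gl_Position = uProjection * vec4(aPosition, 0.0, 1.0); \n"
    "}                                                        \n";

static const char *batchFs =
    "precision mediump float;                                 \n"
    "uniform sampler2D uAtlas;                                \n"
    "varying vec2  vTexCoord;                                 \n"
    "varying float vOpacity;                                  \n"
    "void main() {                                            \n"
    "  vec4 texel   = texture2D(uAtlas, vTexCoord);           \n"
    "  gl_FragColor = vec4(texel.rgb, texel.a * vOpacity);    \n"
    "}                                                        \n";
```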
I hope this helps.
Ciao,
Maurizio
