Hello world example of WebGL parallelism - webgl

There are many abstractions around WebGL for running parallel processing it seems, e.g.:
https://github.com/MaiaVictor/WebMonkeys
https://github.com/gpujs/gpu.js
https://github.com/turbo/js
But I am having a hard time understanding what a simple and complete example of parallelism would look like in plain GLSL code for WebGL. I don't have much experience with WebGL but I understand that there are fragment and vertex shaders and how to load them into a WebGL context from JavaScript. I don't know how to use the shaders or which one is supposed to do the parallel processing.
I am wondering if one could demonstrate a simple hello world example of a parallel add operation, essentially this but parallel form using GLSL / WebGL shaders / however it should be done.
var array = []
var size = 10000
while(size--) array.push(0)
for (var i = 0, n = 10000; i < n; i++) {
array[i] += 10
}
I guess I essentially don't understand:
If WebGL runs everything in parallel automatically.
Or if there is a max number of things run in parallel, so if you have 10,000 things, but only 1000 in parallel, then it would do 1,000 in parallel 10 times sequentially.
Or if you have to manually specify the amount of parallelism you want.
If the parallelism goes into the fragment shader or vertex shader, or both.
How to actually implement the parallel example.

First off, WebGL only rasterizes points, lines, and triangles. Using WebGL to do non rasterization (GPGPU) is basically a matter of realizing that the inputs to WebGL are data from arrays and the output, a 2D rectangle of pixels is also really just a 2D array so by creatively providing non graphic data and creatively rasterizing that data you can do non-graphics math.
WebGL is parallel in 2 ways.
It's running on a different processor, the GPU, so while it's computing something your CPU is free to do something else.
GPUs themselves compute in parallel. A good example: if you rasterize a triangle with 100 pixels, the GPU can process each of those pixels in parallel, up to the limit of that GPU. Without digging too deeply, it looks like an NVidia 1080 GPU has 2560 cores, so assuming they are not specialized and assuming the best case, one of those GPUs could compute 2560 things in parallel.
As for an example all WebGL apps are using parallel processing by points (1) and (2) above without doing anything special.
Adding 10 to 10000 elements though in place is not what WebGL is good at because WebGL can't read from and write to the same data during one operation. In other words, your example would need to be
const size = 10000;
const srcArray = [];
const dstArray = [];
for (let i = 0; i < size; ++i) {
srcArray[i] = 0;
}
for (let i = 0; i < size; ++i) {
dstArray[i] = srcArray[i] + 10;
}
Just like any programming language, there is more than one way to accomplish this. The fastest would probably be to copy all your values into a texture, then rasterize into another texture, looking up from the first texture and writing +10 to the destination. But therein lies one of the issues: transferring data to and from the GPU is slow, so you need to weigh that when deciding whether doing the work on the GPU is a win.
Another issue: just like you can't read from and write to the same array, you also can't randomly access the destination array. The GPU is rasterizing a line, point, or triangle. It's fastest at drawing triangles, but that means it's deciding which pixels to write to and in what order, so your problem also has to live with those limits. You can use points as a way to randomly choose a destination, but rendering points is much slower than rendering triangles.
Note that "Compute Shaders" (not yet part of WebGL) add the random access write ability to GPUs.
Example:
const gl = document.createElement("canvas").getContext("webgl");
const vs = `
attribute vec4 position;
attribute vec2 texcoord;
varying vec2 v_texcoord;
void main() {
gl_Position = position;
v_texcoord = texcoord;
}
`;
const fs = `
precision highp float;
uniform sampler2D u_srcData;
uniform float u_add;
varying vec2 v_texcoord;
void main() {
vec4 value = texture2D(u_srcData, v_texcoord);
// We can't choose the destination here.
// It has already been decided by however
// we asked WebGL to rasterize.
gl_FragColor = value + u_add;
}
`;
// calls gl.createShader, gl.shaderSource,
// gl.compileShader, gl.createProgram,
// gl.attachShader, gl.linkProgram,
// gl.getAttribLocation, gl.getUniformLocation
const programInfo = twgl.createProgramInfo(gl, [vs, fs]);
const size = 10000;
// Uint8Array values default to 0
const srcData = new Uint8Array(size);
// let's use slightly more interesting numbers
for (let i = 0; i < size; ++i) {
srcData[i] = i % 200;
}
// Put that data in a texture. NOTE: Textures
// are (generally) 2 dimensional and have a limit
// on their dimensions. That means you can't make
// a 1000000 by 1 texture. Most GPUs limit from
// between 2048 to 16384.
// In our case we're doing 10000 so we could use
// a 100x100 texture. Except that WebGL can
// process 4 values at a time (red, green, blue, alpha)
// so a 50x50 will give us 10000 values
const srcTex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, srcTex);
const level = 0;
const width = Math.sqrt(size / 4);
if (width % 1 !== 0) {
// we need some other technique to fit
// our data into a texture.
alert('size does not have integer square root');
}
const height = width;
const border = 0;
const internalFormat = gl.RGBA;
const format = gl.RGBA;
const type = gl.UNSIGNED_BYTE;
gl.texImage2D(
gl.TEXTURE_2D, level, internalFormat,
width, height, border, format, type, srcData);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
// create a destination texture
const dstTex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, dstTex);
gl.texImage2D(
gl.TEXTURE_2D, level, internalFormat,
width, height, border, format, type, null);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
// make a framebuffer so we can render to the
// destination texture
const fb = gl.createFramebuffer();
gl.bindFramebuffer(gl.FRAMEBUFFER, fb);
// and attach the destination texture
gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, dstTex, level);
// calls gl.createBuffer, gl.bindBuffer, gl.bufferData
// to put a 2 unit quad (2 triangles) into
// a buffer with matching texture coords
// to process the entire quad
const bufferInfo = twgl.createBufferInfoFromArrays(gl, {
position: {
data: [
-1, -1,
1, -1,
-1, 1,
-1, 1,
1, -1,
1, 1,
],
numComponents: 2,
},
texcoord: [
0, 0,
1, 0,
0, 1,
0, 1,
1, 0,
1, 1,
],
});
gl.useProgram(programInfo.program);
// calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
twgl.setBuffersAndAttributes(gl, programInfo, bufferInfo);
// calls gl.activeTexture, gl.bindTexture, gl.uniformXXX
twgl.setUniforms(programInfo, {
u_add: 10 / 255, // because we're using Uint8
u_srcData: srcTex,
});
// set the viewport to match the destination size
gl.viewport(0, 0, width, height);
// draw the quad (2 triangles)
const offset = 0;
const numVertices = 6;
gl.drawArrays(gl.TRIANGLES, offset, numVertices);
// pull out the result
const dstData = new Uint8Array(size);
gl.readPixels(0, 0, width, height, format, type, dstData);
console.log(dstData);
<script src="https://twgljs.org/dist/4.x/twgl-full.min.js"></script>
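To check what comes back in dstData, a CPU reference can be sketched in plain JavaScript. This is only an illustration: expectedAdd is a made-up helper name, and it assumes the usual normalized 8 bit behavior (a byte is read as byte / 255, the result is clamped to 0..1 and converted back to a byte). A real GPU may round borderline values slightly differently.

```javascript
// CPU reference for the pass above (hypothetical helper, not part of twgl).
// Assumption: the texture reads each byte as byte / 255, the shader adds
// add / 255, and the result is clamped to [0, 1] before being stored as a byte.
function expectedAdd(srcData, add) {
  const out = new Uint8Array(srcData.length);
  for (let i = 0; i < srcData.length; ++i) {
    const normalized = srcData[i] / 255 + add / 255;
    const clamped = Math.min(1, Math.max(0, normalized));
    out[i] = Math.round(clamped * 255);
  }
  return out;
}

// same input as the example: 10000 values cycling through 0..199
const referenceSrc = new Uint8Array(10000);
for (let i = 0; i < referenceSrc.length; ++i) {
  referenceSrc[i] = i % 200;
}
const referenceDst = expectedAdd(referenceSrc, 10);
console.log(referenceDst[0], referenceDst[199]);
```

Note the clamp: a source value of 250 would come back as 255, not 260, which is one of the limits of 8 bit textures.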
Making a generic math processor would require a ton more work.
Issues:
Textures are 2D arrays and WebGL only rasterizes points, lines, and triangles, so it's much easier to process data that fits into a rectangle than data that doesn't. For example, if you have 10001 values there is no roughly square rectangle with integer sides that holds exactly that many (10001 = 73 x 137). It might be best to pad your data and just ignore the part past the end. In other words, a 100x101 texture would be 10100 values, so just ignore the last 99 values.
The example above uses 8bit 4 channel textures. It would be easier to use 8bit 1 channel textures (less math) but also less efficient since WebGL can process 4 values per operation.
Because it uses 8bit textures it can only store integer values from 0 to 255. We could switch the texture to 32bit floating point textures. Floating point textures are an optional feature of WebGL1 (you need to enable extensions and check they succeeded). Rasterizing to a floating point texture is also an optional feature, in both WebGL1 and WebGL2. Most mobile GPUs as of 2018 do not support rendering to a floating point texture, so you have to find creative ways of encoding the results into a format they do support if you want your code to work on those GPUs.
Addressing the source data requires math to convert from a 1d index to a 2d texture coordinate. In the example above, since we are converting directly from srcData to dstData 1 to 1, no math is needed. If you needed to jump around srcData you'd need to provide that math.
WebGL1
vec2 texcoordFromIndex(int ndx) {
int column = int(mod(float(ndx),float(widthOfTexture)));
int row = ndx / widthOfTexture;
return (vec2(column, row) + 0.5) / vec2(widthOfTexture, heightOfTexture);
}
vec2 texcoord = texcoordFromIndex(someIndex);
vec4 value = texture2D(someTexture, texcoord);
WebGL2
ivec2 texcoordFromIndex(int ndx) {
int column = ndx % widthOfTexture;
int row = ndx / widthOfTexture;
return ivec2(column, row);
}
int level = 0;
ivec2 texcoord = texcoordFromIndex(someIndex);
vec4 value = texelFetch(someTexture, texcoord, level);
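The padding idea and the index math from the issues above can be mirrored on the JavaScript side when preparing data. This is a rough sketch under assumptions: packIntoRectangle and texelFromIndex are made-up names, one value per texel is assumed, and maxWidth stands in for whatever width you choose (it must not exceed the GPU's texture size limit).

```javascript
// Hypothetical helper: pick a width x height that holds all values and
// pad the tail with zeros so the data fills a whole rectangle.
function packIntoRectangle(values, maxWidth) {
  const width = Math.max(1, Math.min(values.length, maxWidth));
  const height = Math.max(1, Math.ceil(values.length / width));
  const padded = new Uint8Array(width * height); // extra entries stay 0
  padded.set(values);
  return { width, height, padded };
}

// JavaScript mirror of the WebGL2 texcoordFromIndex function above.
function texelFromIndex(ndx, width) {
  return { column: ndx % width, row: Math.floor(ndx / width) };
}

const packed = packIntoRectangle(new Uint8Array(10001), 100);
const paddingCount = packed.padded.length - 10001;
console.log(packed.width, packed.height, paddingCount);
console.log(texelFromIndex(10000, packed.width));
```

After reading results back you would only look at the first 10001 entries and ignore the padding.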
Let's say we want to sum every 2 numbers. We might do something like this
const gl = document.createElement("canvas").getContext("webgl2");
const vs = `
#version 300 es
in vec4 position;
void main() {
gl_Position = position;
}
`;
const fs = `
#version 300 es
precision highp float;
uniform sampler2D u_srcData;
uniform ivec2 u_destSize; // x = width, y = height
out vec4 outColor;
ivec2 texcoordFromIndex(int ndx, ivec2 size) {
int column = ndx % size.x;
int row = ndx / size.x;
return ivec2(column, row);
}
void main() {
// compute index of destination
ivec2 dstPixel = ivec2(gl_FragCoord.xy);
int dstNdx = dstPixel.y * u_destSize.x + dstPixel.x;
ivec2 srcSize = textureSize(u_srcData, 0);
int srcNdx = dstNdx * 2;
ivec2 uv1 = texcoordFromIndex(srcNdx, srcSize);
ivec2 uv2 = texcoordFromIndex(srcNdx + 1, srcSize);
float value1 = texelFetch(u_srcData, uv1, 0).r;
float value2 = texelFetch(u_srcData, uv2, 0).r;
outColor = vec4(value1 + value2);
}
`;
// calls gl.createShader, gl.shaderSource,
// gl.compileShader, gl.createProgram,
// gl.attachShader, gl.linkProgram,
// gl.getAttribLocation, gl.getUniformLocation
const programInfo = twgl.createProgramInfo(gl, [vs, fs]);
const size = 10000;
// Uint8Array values default to 0
const srcData = new Uint8Array(size);
// let's use slightly more interesting numbers
for (let i = 0; i < size; ++i) {
srcData[i] = i % 99;
}
const srcTex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, srcTex);
const level = 0;
// R8 stores one value per pixel, so 10000 values need a 100x100 texture
const srcWidth = Math.sqrt(size);
if (srcWidth % 1 !== 0) {
// we need some other technique to fit
// our data into a texture.
alert('size does not have integer square root');
}
const srcHeight = srcWidth;
const border = 0;
const internalFormat = gl.R8;
const format = gl.RED;
const type = gl.UNSIGNED_BYTE;
gl.texImage2D(
gl.TEXTURE_2D, level, internalFormat,
srcWidth, srcHeight, border, format, type, srcData);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
// create a destination texture
const dstTex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, dstTex);
const dstWidth = srcWidth;
const dstHeight = srcHeight / 2;
// should check srcHeight is evenly
// divisible by 2
gl.texImage2D(
gl.TEXTURE_2D, level, internalFormat,
dstWidth, dstHeight, border, format, type, null);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
// make a framebuffer so we can render to the
// destination texture
const fb = gl.createFramebuffer();
gl.bindFramebuffer(gl.FRAMEBUFFER, fb);
// and attach the destination texture
gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, dstTex, level);
// calls gl.createBuffer, gl.bindBuffer, gl.bufferData
// to put a 2 unit quad (2 triangles) into
// a buffer
const bufferInfo = twgl.createBufferInfoFromArrays(gl, {
position: {
data: [
-1, -1,
1, -1,
-1, 1,
-1, 1,
1, -1,
1, 1,
],
numComponents: 2,
},
});
gl.useProgram(programInfo.program);
// calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
twgl.setBuffersAndAttributes(gl, programInfo, bufferInfo);
// calls gl.activeTexture, gl.bindTexture, gl.uniformXXX
twgl.setUniforms(programInfo, {
u_srcData: srcTex,
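// the name must match the uniform declared in the shader (u_destSize);
// the source size is looked up in the shader with textureSize
u_destSize: [dstWidth, dstHeight],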
});
// set the viewport to match the destination size
gl.viewport(0, 0, dstWidth, dstHeight);
// draw the quad (2 triangles)
const offset = 0;
const numVertices = 6;
gl.drawArrays(gl.TRIANGLES, offset, numVertices);
// pull out the result
const dstData = new Uint8Array(size / 2);
gl.readPixels(0, 0, dstWidth, dstHeight, format, type, dstData);
console.log(dstData);
<script src="https://twgljs.org/dist/4.x/twgl-full.min.js"></script>
Note the example above uses WebGL2. Why? Because WebGL2 supports rendering to R8 format textures which made the math easy. One value per pixel instead of 4 values per pixel like the previous example. Of course it also means it's slower but making it work with 4 values would have really complicated the math for computing indices or might have required re-arranging the source data to better match. For example instead of value indices going 0, 1, 2, 3, 4, 5, 6, 7, 8, ... it would be easier to sum every 2 values if they were arranged 0, 2, 4, 6, 1, 3, 5, 7, 8 .... that way pulling 4 out at a time and adding the next group of 4 the values would line up. Yet another way would be to use 2 source textures, put all the even indexed values in one texture and the odd indexed values in the other.
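The rearrangement described above can be sketched on the CPU. This is only an illustration of the data layout, not GPU code: arrangeForPairSums and sumAdjacentTexels are made-up names, and the input length is assumed to be a multiple of 8.

```javascript
// Reorder each run of 8 values as evens then odds (0,2,4,6,1,3,5,7) so one
// RGBA texel holds 4 left operands and the next texel holds the 4 matching
// right operands.
function arrangeForPairSums(values) {
  if (values.length % 8 !== 0) {
    throw new Error('length must be a multiple of 8');
  }
  const out = new Uint8Array(values.length);
  for (let base = 0; base < values.length; base += 8) {
    for (let k = 0; k < 4; ++k) {
      out[base + k] = values[base + 2 * k];         // even-indexed values
      out[base + 4 + k] = values[base + 2 * k + 1]; // odd-indexed values
    }
  }
  return out;
}

// CPU stand-in for the shader: add texel 2n to texel 2n+1 component-wise.
function sumAdjacentTexels(arranged) {
  const out = new Uint8Array(arranged.length / 2);
  for (let base = 0, dst = 0; base < arranged.length; base += 8, dst += 4) {
    for (let k = 0; k < 4; ++k) {
      out[dst + k] = arranged[base + k] + arranged[base + 4 + k];
    }
  }
  return out;
}

const arranged = arrangeForPairSums(Uint8Array.of(0, 1, 2, 3, 4, 5, 6, 7));
const pairSums = sumAdjacentTexels(arranged);
console.log(Array.from(arranged), Array.from(pairSums));
```

With this layout a shader could fetch two neighboring RGBA texels and add them as vec4s, producing four pair sums per output pixel.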
WebGL1 provides both LUMINANCE and ALPHA textures, which are also one channel, but whether or not you can render to them is an optional feature, whereas in WebGL2 rendering to an R8 texture is a required feature.
WebGL2 also provides something called "transform feedback". This lets you write the output of a vertex shader to a buffer. It has the advantage that you just set the number of vertices you want to process (no need to have the destination data be a rectangle). That also means you can output floating point values (it's not optional like it is for rendering to textures). I believe (though I haven't tested) that it's slower than rendering to textures though.
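For reference, a minimal sketch of the transform feedback route might look like the following. It is written against the plain WebGL2 API without twgl, the function and variable names (addWithTransformFeedback, value, amount, result) are made up for illustration, and it has not been profiled or checked across devices, so treat it as a starting point only.

```javascript
// Vertex shader: one "vertex" per array element, result is captured in a buffer.
const tfVertexShader = `#version 300 es
in float value;
uniform float amount;
out float result;
void main() {
  result = value + amount;
}
`;
// Fragment shader does nothing, rasterization is discarded below.
const tfFragmentShader = `#version 300 es
precision highp float;
void main() {}
`;

function compileShader(gl, type, source) {
  const shader = gl.createShader(type);
  gl.shaderSource(shader, source);
  gl.compileShader(shader);
  if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
    throw new Error(String(gl.getShaderInfoLog(shader)));
  }
  return shader;
}

// src is a Float32Array; returns a new Float32Array with amount added.
function addWithTransformFeedback(gl, src, amount) {
  const program = gl.createProgram();
  gl.attachShader(program, compileShader(gl, gl.VERTEX_SHADER, tfVertexShader));
  gl.attachShader(program, compileShader(gl, gl.FRAGMENT_SHADER, tfFragmentShader));
  // the varyings to capture must be declared before linking
  gl.transformFeedbackVaryings(program, ['result'], gl.SEPARATE_ATTRIBS);
  gl.linkProgram(program);
  if (!gl.getProgramParameter(program, gl.LINK_STATUS)) {
    throw new Error(String(gl.getProgramInfoLog(program)));
  }

  // input: one float per vertex
  const vao = gl.createVertexArray();
  gl.bindVertexArray(vao);
  const srcBuffer = gl.createBuffer();
  gl.bindBuffer(gl.ARRAY_BUFFER, srcBuffer);
  gl.bufferData(gl.ARRAY_BUFFER, src, gl.STATIC_DRAW);
  const valueLoc = gl.getAttribLocation(program, 'value');
  gl.enableVertexAttribArray(valueLoc);
  gl.vertexAttribPointer(valueLoc, 1, gl.FLOAT, false, 0, 0);

  // output: same size, allocated but not filled
  const dstBuffer = gl.createBuffer();
  gl.bindBuffer(gl.ARRAY_BUFFER, dstBuffer);
  gl.bufferData(gl.ARRAY_BUFFER, src.byteLength, gl.STREAM_READ);
  // a buffer being written by transform feedback must not be bound elsewhere
  gl.bindBuffer(gl.ARRAY_BUFFER, null);

  const tf = gl.createTransformFeedback();
  gl.bindTransformFeedback(gl.TRANSFORM_FEEDBACK, tf);
  gl.bindBufferBase(gl.TRANSFORM_FEEDBACK_BUFFER, 0, dstBuffer);

  gl.useProgram(program);
  gl.uniform1f(gl.getUniformLocation(program, 'amount'), amount);
  gl.enable(gl.RASTERIZER_DISCARD); // no pixels needed, only the captured output
  gl.beginTransformFeedback(gl.POINTS);
  gl.drawArrays(gl.POINTS, 0, src.length);
  gl.endTransformFeedback();
  gl.disable(gl.RASTERIZER_DISCARD);
  gl.bindTransformFeedback(gl.TRANSFORM_FEEDBACK, null);

  // read the results back to the CPU
  const dst = new Float32Array(src.length);
  gl.bindBuffer(gl.ARRAY_BUFFER, dstBuffer);
  gl.getBufferSubData(gl.ARRAY_BUFFER, 0, dst);
  return dst;
}
```

Usage would be something like addWithTransformFeedback(canvas.getContext("webgl2"), new Float32Array(10000), 10). Note there is no rectangle anywhere: the draw call simply processes src.length points.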
Since you're new to WebGL might I suggest these tutorials.

Related

Summing the values in a Webgl2 R32F Texture by generating a MipMap

If I have rendered data into a R32F texture (of 2^18 (~250,000) texels) and I want to compute the sum of these values, is it possible to do this by asking the gpu to generate a mipmap?
(the idea being that the smallest mipmap level would have a single texel that contains the average of all the original texels)
What mipmap settings (clamp, etc) would I use to generate the correct average?
I'm not so good with webgl gymnastics, and would appreciate a snippet of how one would render into a R32F texture the numbers from 1 to 2^18 and then produce a sum over that texture.
For this number of texels, would this approach be faster than trying to transfer the texels back to the cpu and performing the sum in javascript?
Thanks!
There are no settings that define the algorithm used to generate mipmaps. Clamp settings and filter settings have no effect. There's only a hint you can set with gl.hint on whether to prefer quality over performance, but a driver has no obligation to even pay attention to that flag. Further, every driver is different. The results of generating mipmaps are one of the differences used to fingerprint WebGL.
In any case if you don't care about the algorithm used and you just want to read the result of generating mipmaps then you just need to attach the last mip to a framebuffer and read the pixel after calling gl.generateMipmap.
You likely wouldn't render into a texture all the numbers from 1 to 2^18 but that's not hard. You'd just draw a single quad 512x512. The fragment shader could look like this
#version 300 es
precision highp float;
out vec4 fragColor;
void main() {
// gl_FragCoord is at pixel centers (0.5, 1.5, ...) so floor it first
float i = 1. + floor(gl_FragCoord.x) + floor(gl_FragCoord.y) * 512.0;
fragColor = vec4(i, 0, 0, 0);
}
Of course you could pass in that 512.0 as a uniform if you wanted to work with other sizes.
Rendering to a floating point texture is an optional feature of WebGL2. Desktops support it but as of 2018 most mobile devices do not. Similarly being able to filter a floating point texture is also an optional feature which is also usually not supported on most mobile devices as of 2018 but is on desktop.
function main() {
const gl = document.createElement("canvas").getContext("webgl2");
if (!gl) {
alert("need webgl2");
return;
}
{
const ext = gl.getExtension("EXT_color_buffer_float");
if (!ext) {
alert("can not render to floating point textures");
return;
}
}
{
const ext = gl.getExtension("OES_texture_float_linear");
if (!ext) {
alert("can not filter floating point textures");
return;
}
}
// create a framebuffer and attach an R32F 512x512 texture
const numbersFBI = twgl.createFramebufferInfo(gl, [
{ internalFormat: gl.R32F, minMag: gl.NEAREST },
], 512, 512);
const vs = `
#version 300 es
in vec4 position;
void main() {
gl_Position = position;
}
`;
const fillFS = `
#version 300 es
precision highp float;
out vec4 fragColor;
void main() {
// gl_FragCoord is at pixel centers (0.5, 1.5, ...) so floor it first
float i = 1. + floor(gl_FragCoord.x) + floor(gl_FragCoord.y) * 512.0;
fragColor = vec4(i, 0, 0, 0);
}
`
// creates a buffer with a single quad that goes from -1 to +1 in the XY plane
// calls gl.createBuffer, gl.bindBuffer, gl.bufferData
const quadBufferInfo = twgl.primitives.createXYQuadBufferInfo(gl);
const fillProgramInfo = twgl.createProgramInfo(gl, [vs, fillFS]);
gl.useProgram(fillProgramInfo.program);
// calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
twgl.setBuffersAndAttributes(gl, fillProgramInfo, quadBufferInfo);
// tell webgl to render to our 512x512 texture
// calls gl.bindFramebuffer and gl.viewport
twgl.bindFramebufferInfo(gl, numbersFBI);
// draw 2 triangles (6 vertices)
gl.drawElements(gl.TRIANGLES, 6, gl.UNSIGNED_SHORT, 0);
// compute the last mip level
const miplevel = Math.log2(512);
// get the texture twgl created above
const texture = numbersFBI.attachments[0];
// create a framebuffer with the last mip from
// the texture
const readFBI = twgl.createFramebufferInfo(gl, [
{ attachment: texture, level: miplevel },
]);
gl.bindTexture(gl.TEXTURE_2D, texture);
// try each hint to see if there is a difference
['DONT_CARE', 'NICEST', 'FASTEST'].forEach((hint) => {
gl.hint(gl.GENERATE_MIPMAP_HINT, gl[hint]);
gl.generateMipmap(gl.TEXTURE_2D);
// read the result.
const result = new Float32Array(4);
gl.readPixels(0, 0, 1, 1, gl.RGBA, gl.FLOAT, result);
log('mip generation hint:', hint);
log('average:', result[0]);
log('average * count:', result[0] * 512 * 512);
log(' ');
});
function log(...args) {
const elem = document.createElement('pre');
elem.textContent = [...args].join(' ');
document.body.appendChild(elem);
}
}
main();
pre {margin: 0}
<script src="https://twgljs.org/dist/4.x/twgl-full.min.js"></script>
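To judge the "average * count" values the snippet logs, it helps to know the exact answer. The sum of 1 to 2^18 can be computed on the CPU as a baseline. This is a plain JavaScript sketch with made-up variable names. JavaScript sums in double precision, so it is exact here, whereas a mip chain built in 32 bit floats may round along the way.

```javascript
// CPU baseline: fill 1..2^18 and sum in double precision.
const count = 512 * 512;
const numbers = new Float32Array(count);
for (let i = 0; i < count; ++i) {
  numbers[i] = i + 1; // every integer up to 2^24 is exact in float32
}
let sum = 0;
for (const v of numbers) {
  sum += v;
}
const average = sum / count;
console.log('sum:', sum);         // 34359869440
console.log('average:', average); // 131072.5
```

Timing this loop against the GPU version (including the readPixels call) on your target devices is the only reliable way to answer the "is it faster" question.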
Note I used twgl.js to make the code less verbose. If you don't know how to make a framebuffer and attach textures or how to setup buffers and attributes, compile shaders, and set uniforms then you're asking way too broad a question and I suggest you go read some tutorials.
Let me point out that there's no guarantee this method is faster than others. First off, it's up to the driver. It's possible the driver does this in software (though unlikely).
One obvious speed up is to use RGBA32F and let the code do 4 values at a time, then read all 4 channels (R, G, B, A) at the end and sum those.
Also, since you only care about the last 1x1 pixel mip, you're asking the code to render a lot more pixels than a more direct method. Really you only need to render 1 pixel, the result. But for this example of 2^18 values, which is a 512x512 texture, that means a 256x256, a 128x128, a 64x64, a 32x32, a 16x16, an 8x8, a 4x4, and a 2x2 mip are all allocated and computed, which is arguably wasted time. In fact the spec says all mips are generated from the first mip. Of course a driver is free to take shortcuts and most likely generates mip N from mip N-1, as the result will be similar, but that's not how the spec is defined. But even generating one mip from the previous is 87380 values computed that you didn't care about.
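The 87380 figure is just the texel count of every mip level between the base level and the final 1x1 level. A small sketch of that arithmetic (intermediateMipTexels is a made-up helper name, and a square power-of-2 texture is assumed):

```javascript
// Counts the texels in the intermediate mip levels of a size x size texture,
// i.e. everything below the base level except the final 1x1 level.
function intermediateMipTexels(size) {
  let total = 0;
  for (let s = size >> 1; s > 1; s >>= 1) {
    total += s * s;
  }
  return total;
}

console.log(intermediateMipTexels(512)); // 87380
```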
I'm only guessing it would be faster to generate in larger chunks than 2x2. At the same time there are texture caches, and if I understand correctly they usually cache a rectangular part of a texture so that reading 4 values from a mip is fast. When you have a texture cache miss it can really kill your performance. So, if your chunks are too large it's possible you'd have lots of cache misses. You'd basically have to test, and each GPU would likely show different performance characteristics.
Yet another speed up would be to consider using multiple drawing buffers then you can write 16 to 32 values per fragment shader iteration instead of just 4.

WebGL shader z position not used in depth calculations

I've been trying out some WebGL but there's a bug I cannot seem to find out how to fix.
Currently I have the following setup:
I have around 100 triangles which all have a position and are being drawn by a single gl.drawArrays function. To have them drawn in the correct order I used gl.enable(gl.DEPTH_TEST); which gave the correct result.
The problem I have now is that if I update the gl_Position of the triangles in the vertex shader, the updated Z value is not being used in the depth test. The result is that a triangle with a gl_Position.z of 1 can be drawn on top of a triangle with a gl_Position.z of 10, which is not exactly what I want.
What have I tried?
gl.enable(gl.DEPTH_TEST);
gl.depthFunc(gl.GEQUAL);
with
gl.clear(gl.DEPTH_BUFFER_BIT);
gl.clearDepth(0);
gl.drawArrays(gl.TRIANGLES, 0, verticesCount);
in the render function.
The following code is used to create the buffer:
gl.bindBuffer(gl.ARRAY_BUFFER, dataBuffer);
gl.bufferData(gl.ARRAY_BUFFER, positionBufferData, gl.STATIC_DRAW);
const positionLocation = gl.getAttribLocation(program, 'position');
gl.enableVertexAttribArray(positionLocation);
gl.vertexAttribPointer(positionLocation, 3, gl.FLOAT, false, false, 0, 0);
The triangles with a higher z value are much bigger in size (due to the perspective) but small triangles still appear over it (due to the render order).
In the fragment shader I've used gl_FragCoord.z to see if that was correct and smaller triangles (further away) received a higher alpha than bigger ones (up close).
What could be the cause of the weird drawing behaviour?
Depth in clipspace goes from -1 to 1. Depth written to the depth buffer goes from 0 to 1. You're clearing to 1. There is no depth value > 1, so the only things you should see drawn are at gl_Position.z = 1. Anything less than 1 will fail the test gl.depthFunc(gl.GEQUAL);. Anything > 1 will be clipped. Only 1 is both in the depth range and greater than or equal to 1.
The example below draws smaller to larger rectangles with different z values. The red is standard gl.depthFunc(gl.LESS) with depth cleared to 1. The green is gl.depthFunc(gl.GEQUAL) with depth cleared to 0. The blue is gl.depthFunc(gl.GEQUAL) with depth cleared to 1. Notice blue only draws the single rectangle at gl_Position.z = 1 because all other rectangles fail the test since they are at Z < 1.
const m4 = twgl.m4;
const gl = document.querySelector("canvas").getContext("webgl");
const vs = `
attribute vec4 position;
varying vec4 v_position;
uniform mat4 matrix;
void main() {
gl_Position = matrix * position;
v_position = abs(position);
}
`;
const fs = `
precision mediump float;
varying vec4 v_position;
uniform vec4 color;
void main() {
gl_FragColor = vec4(1. - v_position.xxx, 1) * color;
}
`;
// compiles shaders, links program, looks up attributes
const programInfo = twgl.createProgramInfo(gl, [vs, fs]);
// calls gl.createBuffer, gl.bindBuffer, gl.bufferData for each array
const z0To1BufferInfo = twgl.createBufferInfoFromArrays(gl, {
position: [
...makeQuad( .2, 0.00),
...makeQuad( .4, .25),
...makeQuad( .6, .50),
...makeQuad( .8, .75),
...makeQuad(1.0, 1.00),
],
});
const z1To0BufferInfo = twgl.createBufferInfoFromArrays(gl, {
position: [
...makeQuad(.2, 1.00),
...makeQuad(.4, .75),
...makeQuad(.6, .50),
...makeQuad(.8, .25),
...makeQuad(1., 0.00),
],
});
function makeQuad(xy, z) {
return [
-xy, -xy, z,
xy, -xy, z,
-xy, xy, z,
-xy, xy, z,
xy, -xy, z,
xy, xy, z,
];
}
gl.useProgram(programInfo.program);
gl.enable(gl.DEPTH_TEST);
gl.clearDepth(1);
gl.clear(gl.DEPTH_BUFFER_BIT);
gl.depthFunc(gl.LESS);
drawRects(-0.66, z0To1BufferInfo, [1, 0, 0, 1]);
gl.clearDepth(0);
gl.clear(gl.DEPTH_BUFFER_BIT);
gl.depthFunc(gl.GEQUAL);
drawRects(0, z1To0BufferInfo, [0, 1, 0, 1]);
gl.clearDepth(1);
gl.clear(gl.DEPTH_BUFFER_BIT);
gl.depthFunc(gl.GEQUAL);
drawRects(0.66, z1To0BufferInfo, [0, 0, 1, 1]);
function drawRects(xoffset, bufferInfo, color) {
// calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
twgl.setBuffersAndAttributes(gl, programInfo, bufferInfo);
let mat = m4.translation([xoffset, 0, 0]);
mat = m4.scale(mat, [.3, .5, 1]);
// calls gl.uniformXXX
twgl.setUniforms(programInfo, {
color: color,
matrix: mat,
});
// calls gl.drawArrays or gl.drawElements
twgl.drawBufferInfo(gl, bufferInfo);
}
<script src="https://twgljs.org/dist/3.x/twgl-full.min.js"></script>
<canvas></canvas>
<pre>
red : depthFunc: LESS, clearDepth: 1
green: depthFunc: GEQUAL, clearDepth: 0
blue : depthFunc: GEQUAL, clearDepth: 1
</pre>
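The reasoning behind the three cases can also be modelled in a few lines of plain JavaScript. This is a simplified sketch, not how the GPU is implemented: it assumes w = 1, the default gl.depthRange(0, 1), and only tests a fragment against the cleared depth value (it ignores depth written by earlier fragments). The function names are made up.

```javascript
// Map clip space z (-1..1) to a depth buffer value (0..1).
// Returns null when the vertex would be clipped.
function clipZToDepth(clipZ) {
  if (clipZ < -1 || clipZ > 1) {
    return null;
  }
  return clipZ * 0.5 + 0.5;
}

const depthFuncs = {
  LESS: (incoming, stored) => incoming < stored,
  GEQUAL: (incoming, stored) => incoming >= stored,
};

// Would a fragment at clipZ pass against a freshly cleared depth buffer?
function isDrawn(clipZ, funcName, clearDepth) {
  const depth = clipZToDepth(clipZ);
  return depth !== null && depthFuncs[funcName](depth, clearDepth);
}

console.log(isDrawn(1, 'GEQUAL', 1));    // true: depth 1 >= 1
console.log(isDrawn(0.75, 'GEQUAL', 1)); // false: depth 0.875 < 1
console.log(isDrawn(10, 'GEQUAL', 1));   // false: clipped
console.log(isDrawn(0.75, 'GEQUAL', 0)); // true: depth 0.875 >= 0
```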

WebGL; Instanced rendering - setting up divisors

I'm trying to draw a lot of cubes in webgl using instanced rendering (ANGLE_instanced_arrays).
However I can't seem to wrap my head around how to setup the divisors. I have the following buffers;
36 vertices (6 faces made from 2 triangles using 3 vertices each).
6 colors per cube (1 for each face).
1 translate per cube.
To reuse the vertices for each cube, I've set its divisor to 0.
For color I've set the divisor to 2 (i.e. use the same color for two triangles, a face).
For translate I've set the divisor to 12 (i.e. same translate for 6 faces * 2 triangles per face).
For rendering I'm calling
ext_angle.drawArraysInstancedANGLE(gl.TRIANGLES, 0, 36, num_cubes);
This however does not seem to render my cubes.
Using translate divisor 1 does but the colors are way off then, with cubes being a single solid color.
I'm thinking it's because my instances are now the full cube, but if I limit the count (i.e. vertices per instance), I do not seem to get all the way through the vertices buffer, effectively I'm just rendering one triangle per cube then.
How would I go about rendering a lot of cubes like this; with varying colored faces?
Instancing works like this:
Eventually you are going to call
ext.drawArraysInstancedANGLE(mode, first, numVertices, numInstances);
So let's say you're drawing instances of a cube. One cube has 36 vertices (6 per face * 6 faces). So
numVertices = 36
And lets say you want to draw 100 cubes so
numInstances = 100
Let's say you have a vertex shader like this
attribute vec4 position;
uniform mat4 matrix;
void main() {
gl_Position = matrix * position;
}
If you did nothing else and just called
var mode = gl.TRIANGLES;
var first = 0;
var numVertices = 36
var numInstances = 100
ext.drawArraysInstancedANGLE(mode, first, numVertices, numInstances);
It would just draw the same cube in the same exact place 100 times
Next up you want to give each cube a different translation so you update your shader to this
attribute vec4 position;
attribute vec3 translation;
uniform mat4 matrix;
void main() {
gl_Position = matrix * (position + vec4(translation, 0));
}
You now make a buffer and put one translation per cube then you setup the attribute like normal
gl.vertexAttribPointer(translationLocation, 3, gl.FLOAT, false, 0, 0)
But you also set a divisor
ext.vertexAttribDivisorANGLE(translationLocation, 1);
That 1 says 'only advance to the next value in the translation buffer once per instance'
Now you want to have a different color per face per cube and you only want one color per face in the data (you don't want to repeat colors). There is no setting that would do that. Since your numVertices = 36, you can only choose to advance every vertex (divisor = 0) or once every multiple of 36 vertices (i.e., numVertices).
So you say, what if I instance faces instead of cubes? Well, now you've got the opposite problem. Put one color per face. numVertices = 6, numInstances = 600 (100 cubes * 6 faces per cube). You set color's divisor to 1 to advance the color once per face. You can set translation's divisor to 6 to advance the translation only once every 6 faces (every 6 instances). But now you no longer have a cube, you only have a single face. In other words, you're going to draw 600 faces all facing the same way, every 6 of them translated to the same spot.
To get a cube back you'd have to add something to orient the face instances in 6 directions.
Ok, you fill a buffer with 6 orientations. That won't work. You can't set the divisor to anything that will advance through those 6 orientations once per face and then reset after 6 faces for the next cube. There's only 1 divisor setting. You can set it to 6 to repeat per face or 36 to repeat per cube, but you want to advance per face and reset per cube. No such option exists.
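The divisor rule described above can be modelled in plain JavaScript to see why divisors of 2 or 12 don't do what the question hoped. This is an illustrative model of the lookup, not a real WebGL call, and attributeElementIndex is a made-up name.

```javascript
// Which element of an attribute's buffer is used for a given vertex of a
// given instance? divisor 0 advances per vertex, divisor N advances once
// every N instances. Vertices and triangles within an instance never count.
function attributeElementIndex(vertexIndex, instanceIndex, divisor) {
  return divisor === 0
    ? vertexIndex
    : Math.floor(instanceIndex / divisor);
}

// divisor 0: every vertex of every instance walks the buffer from the start
console.log(attributeElementIndex(35, 7, 0));  // 35
// divisor 1: one element per instance (per cube when numVertices = 36)
console.log(attributeElementIndex(35, 7, 1));  // 7
// divisor 12: one element per 12 instances, not per 12 triangles
console.log(attributeElementIndex(0, 11, 12)); // 0
console.log(attributeElementIndex(0, 12, 12)); // 1
```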
What you can do is draw it with 6 draw calls, one per face direction. In other words you're going to draw all the left faces, then all the right faces, then all the top faces, etc...
To do that we make just 1 face, 1 translation per cube, 1 color per face per cube. We set the divisor on the translation and the color to 1.
Then we draw 6 times, once for each face direction. The difference between each draw is that we pass in an orientation for the face, change the offset of the color attribute, and set its stride to 6 colors * 4 floats per color * 4 bytes per float (6 * 4 * 4 = 96 bytes).
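The stride and offset arithmetic for the color attribute can be sketched on its own before the full listing. The helper name colorAttribLayout and the constants are made up for illustration. The layout assumed is the one described above: 6 RGBA float colors stored back to back for each cube.

```javascript
const FLOAT_BYTES = 4;          // bytes in one 32 bit float
const COMPONENTS_PER_COLOR = 4; // r, g, b, a
const FACES_PER_CUBE = 6;

// Byte stride and offset to pass to gl.vertexAttribPointer for one face.
function colorAttribLayout(face) {
  return {
    // skip a whole cube's worth of colors to reach the next instance
    stride: FACES_PER_CUBE * COMPONENTS_PER_COLOR * FLOAT_BYTES,
    // start at this face's color within the cube
    offset: face * COMPONENTS_PER_COLOR * FLOAT_BYTES,
  };
}

for (let face = 0; face < FACES_PER_CUBE; ++face) {
  const layout = colorAttribLayout(face);
  console.log(face, layout.stride, layout.offset);
  // hypothetical use inside the per-face draw loop:
  // gl.vertexAttribPointer(colorLocation, 4, gl.FLOAT, false,
  //                        layout.stride, layout.offset);
}
```

With the color divisor set to 1, each instance then reads the color for the current face of its own cube.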
var vs = `
attribute vec4 position;
attribute vec3 translation;
attribute vec4 color;
uniform mat4 viewProjectionMatrix;
uniform mat4 localMatrix;
varying vec4 v_color;
void main() {
vec4 localPosition = localMatrix * position + vec4(translation, 0);
gl_Position = viewProjectionMatrix * localPosition;
v_color = color;
}
`;
var fs = `
precision mediump float;
varying vec4 v_color;
void main() {
gl_FragColor = v_color;
}
`;
var m4 = twgl.m4;
var gl = document.querySelector("canvas").getContext("webgl");
var ext = gl.getExtension("ANGLE_instanced_arrays");
if (!ext) {
alert("need ANGLE_instanced_arrays");
}
var program = twgl.createProgramFromSources(gl, [vs, fs]);
var positionLocation = gl.getAttribLocation(program, "position");
var translationLocation = gl.getAttribLocation(program, "translation");
var colorLocation = gl.getAttribLocation(program, "color");
var localMatrixLocation = gl.getUniformLocation(program, "localMatrix");
var viewProjectionMatrixLocation = gl.getUniformLocation(
program,
"viewProjectionMatrix");
function r(min, max) {
if (max === undefined) {
max = min;
min = 0;
}
return Math.random() * (max - min) + min;
}
function rp() {
return r(-20, 20);
}
// make translations and colors, colors are separated by face
var numCubes = 1000;
var colors = [];
var translations = [];
for (var cube = 0; cube < numCubes; ++cube) {
translations.push(rp(), rp(), rp());
// pick a random color;
var color = [r(1), r(1), r(1), 1];
// now pick 6 similar colors, one for each face of the cube
// that way we can tell if the colors are correctly assigned
// to each cube's faces.
var channel = r(3) | 0; // pick a channel 0 - 2 to randomly modify
for (var face = 0; face < 6; ++face) {
color[channel] = r(.7, 1);
colors.push.apply(colors, color);
}
}
var buffers = twgl.createBuffersFromArrays(gl, {
position: [ // one face
-1, -1, -1,
-1, 1, -1,
1, -1, -1,
1, -1, -1,
-1, 1, -1,
1, 1, -1,
],
color: colors,
translation: translations,
});
var faceMatrices = [
m4.identity(),
m4.rotationX(Math.PI / 2),
m4.rotationX(Math.PI / -2),
m4.rotationY(Math.PI / 2),
m4.rotationY(Math.PI / -2),
m4.rotationY(Math.PI),
];
function render(time) {
time *= 0.001;
twgl.resizeCanvasToDisplaySize(gl.canvas);
gl.viewport(0, 0, gl.canvas.width, gl.canvas.height);
gl.enable(gl.DEPTH_TEST);
gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);
gl.bindBuffer(gl.ARRAY_BUFFER, buffers.position);
gl.enableVertexAttribArray(positionLocation);
gl.vertexAttribPointer(positionLocation, 3, gl.FLOAT, false, 0, 0);
gl.bindBuffer(gl.ARRAY_BUFFER, buffers.translation);
gl.enableVertexAttribArray(translationLocation);
gl.vertexAttribPointer(translationLocation, 3, gl.FLOAT, false, 0, 0);
gl.bindBuffer(gl.ARRAY_BUFFER, buffers.color);
gl.enableVertexAttribArray(colorLocation);
ext.vertexAttribDivisorANGLE(positionLocation, 0);
ext.vertexAttribDivisorANGLE(translationLocation, 1);
ext.vertexAttribDivisorANGLE(colorLocation, 1);
gl.useProgram(program);
var fov = 60;
var aspect = gl.canvas.clientWidth / gl.canvas.clientHeight;
var projection = m4.perspective(fov * Math.PI / 180, aspect, 0.5, 100);
var radius = 30;
var eye = [
Math.cos(time) * radius,
Math.sin(time * 0.3) * radius,
Math.sin(time) * radius,
];
var target = [0, 0, 0];
var up = [0, 1, 0];
var camera = m4.lookAt(eye, target, up);
var view = m4.inverse(camera);
var viewProjection = m4.multiply(projection, view);
gl.uniformMatrix4fv(viewProjectionMatrixLocation, false, viewProjection);
// 6 faces * 4 floats per color * 4 bytes per float
var stride = 6 * 4 * 4;
var numVertices = 6;
faceMatrices.forEach(function(faceMatrix, ndx) {
var offset = ndx * 4 * 4; // 4 floats per color * 4 bytes per float
gl.vertexAttribPointer(
colorLocation, 4, gl.FLOAT, false, stride, offset);
gl.uniformMatrix4fv(localMatrixLocation, false, faceMatrix);
ext.drawArraysInstancedANGLE(gl.TRIANGLES, 0, numVertices, numCubes);
});
requestAnimationFrame(render);
}
requestAnimationFrame(render);
body { margin: 0; }
canvas { width: 100vw; height: 100vh; display: block; }
<script src="https://twgljs.org/dist/2.x/twgl-full.min.js"></script>
<canvas></canvas>
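The stride and offset used in the draw loop above can be checked without a GPU. The following is only a sketch in plain JavaScript: the helper names `colorByteOffset` and `colorFor` are invented here, and it assumes the same layout as the `colors` array in the snippet (6 RGBA colors per cube, stored cube after cube, divisor 1).

```javascript
var FLOATS_PER_COLOR = 4;
var BYTES_PER_FLOAT = 4;
var FACES_PER_CUBE = 6;

// Byte position in the color buffer that instance `cube` reads
// during the draw call for face number `face` (divisor of 1).
function colorByteOffset(face, cube) {
  var stride = FACES_PER_CUBE * FLOATS_PER_COLOR * BYTES_PER_FLOAT; // 96 bytes per cube
  var offset = face * FLOATS_PER_COLOR * BYTES_PER_FLOAT;           // 16 bytes per face
  return offset + cube * stride;
}

// Read the RGBA color from a flat JS array laid out like `colors` above.
function colorFor(colors, face, cube) {
  var start = colorByteOffset(face, cube) / BYTES_PER_FLOAT;
  return colors.slice(start, start + FLOATS_PER_COLOR);
}
```

So in the draw for face 3, cube 1 starts reading at byte 3 * 16 + 1 * 96 = 144, which is float index 36, the fourth color of the second cube.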

why is WebGL slower than Canvas 2D in my game?

I am adding WebGL support to my game, but I have a strange problem: it runs even slower than in Canvas 2D rendering mode, and I do not understand why.
I checked on Firefox PC, Chrome PC, and Chrome Android. They all run WebGL demos from the web with hundreds of sprites smoothly, so I definitely made an error in my code.
Firefox's profiler says the whole game uses only 7% of the resources, and the rendering part takes only 1.2%. It is just the title screen of the game and there are only five sprites to draw. It is slow though...
Update: Chrome's profiler says idle is only 4%, program is a huge 93%, and render 2.6%.
When using Canvas 2D things are very different: 76% idle, 16% program, 2.3% for the drawing function.
There is definitely a problem in my WebGL rendering code.
Update: Android Chrome's profiler (on JXD S5110) always says program is ~39%, drawArrays is ~8%, bufferData ~5%, and bindTexture is 3%. Everything else is quite negligible.
If one of my functions were wasting all the resources I would know what to do, but here the bottlenecks seem to be "program" (the browser itself?) and WebGL methods, two things I can't edit.
Could someone please have a look at my code and tell me what I did wrong?
Here are my shaders
<script id="2d-vertex-shader" type="x-shader/x-vertex">
attribute vec2 a_position;
attribute vec2 a_texCoord;
uniform vec2 u_resolution;
uniform vec2 u_translation;
uniform vec2 u_rotation;
varying vec2 v_texCoord;
void main()
{
// Rotate the position
vec2 rotatedPosition = vec2(
a_position.x * u_rotation.y + a_position.y * u_rotation.x,
a_position.y * u_rotation.y - a_position.x * u_rotation.x);
// Add in the translation.
vec2 position = rotatedPosition + u_translation;
// convert the rectangle from pixels to 0.0 to 1.0
vec2 zeroToOne = a_position / u_resolution;
// convert from 0->1 to 0->2
vec2 zeroToTwo = zeroToOne * 2.0;
// convert from 0->2 to -1->+1 (clipspace)
vec2 clipSpace = zeroToTwo - 1.0;
gl_Position = vec4(clipSpace * vec2(1, -1), 0, 1);
// pass the texCoord to the fragment shader
// The GPU will interpolate this value between points
v_texCoord = a_texCoord;
}
</script>
<script id="2d-fragment-shader" type="x-shader/x-fragment">
precision mediump float;
// our texture
uniform sampler2D u_image;
// the texCoords passed in from the vertex shader.
varying vec2 v_texCoord;
void main()
{
// Look up a color from the texture.
gl_FragColor = texture2D(u_image, vec2(v_texCoord.s, v_texCoord.t));
}
</script>
Here is the creation code for my canvases and their contexts when in WebGL mode.
(I usually use several layered canvases in order to avoid redrawing the backgrounds and foregrounds every frame when they never change; that is why the canvases and contexts are in arrays.)
// Get A WebGL context
liste_canvas[c] = document.createElement("canvas") ;
document.getElementById('game_div').appendChild(liste_canvas[c]);
liste_ctx[c] = liste_canvas[c].getContext('webgl',{premultipliedAlpha:false}) || liste_canvas[c].getContext('experimental-webgl',{premultipliedAlpha:false});
var gl = liste_ctx[c] ;
gl.viewport(0, 0, game.res_w, game.res_h);
// setup a GLSL program
gl.vertexShader = createShaderFromScriptElement(gl, "2d-vertex-shader");
gl.fragmentShader = createShaderFromScriptElement(gl, "2d-fragment-shader");
gl.program = createProgram(gl, [gl.vertexShader, gl.fragmentShader]);
gl.useProgram(gl.program);
// look up where the vertex data needs to go.
positionLocation = gl.getAttribLocation(gl.program, "a_position");
texCoordLocation = gl.getAttribLocation(gl.program, "a_texCoord");
// provide texture coordinates for the rectangle.
texCoordBuffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, texCoordBuffer);
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array([
0.0, 0.0,
1.0, 0.0,
0.0, 1.0,
0.0, 1.0,
1.0, 0.0,
1.0, 1.0]), gl.STATIC_DRAW);
gl.enableVertexAttribArray(texCoordLocation);
gl.vertexAttribPointer(texCoordLocation, 2, gl.FLOAT, false, 0, 0);
gl.blendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA);
gl.enable( gl.BLEND ) ;
gl.posBuffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, gl.posBuffer);
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array([
0.0, 0.0,
1.0, 0.0,
0.0, 1.0,
0.0, 1.0,
1.0, 0.0,
1.0, 1.0]), gl.STATIC_DRAW);
gl.enableVertexAttribArray(positionLocation);
gl.vertexAttribPointer(positionLocation, 2, gl.FLOAT, false, 0, 0);
In the .onload function of my images, I add
var gl = liste_ctx[c] ;
this.texture = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, this.texture);
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, this );
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.LINEAR);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
gl.bindTexture(gl.TEXTURE_2D, null);
And here is the WebGL part of my draw_sprite() function :
var gl = liste_ctx[c] ;
gl.bindTexture(gl.TEXTURE_2D, sprites[d_sprite].texture);
var resolutionLocation = gl.getUniformLocation(gl.program, "u_resolution");
gl.uniform2f(resolutionLocation, liste_canvas[c].width, liste_canvas[c].height);
gl.bindBuffer(gl.ARRAY_BUFFER, gl.posBuffer);
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array([
topleft_x , topleft_y ,
topright_x , topright_y ,
bottomleft_x , bottomleft_y ,
bottomleft_x , bottomleft_y ,
topright_x , topright_y ,
bottomright_x , bottomright_y ]), gl.STATIC_DRAW);
gl.drawArrays(gl.TRIANGLES, 0, 6);
What did I do wrong ?
This may help: What do the "Not optimized" warnings in the Chrome Profiler mean?
Relevant links:
https://groups.google.com/forum/#!topic/v8-users/_oZ4fUSitRY
https://github.com/petkaantonov/bluebird/wiki/Optimization-killers
For "optimized too many times", that means the function parameters / behavior change too much, so Chrome keeps having to re-optimize it.
What made it so slow was using several WebGL canvases. I use only one now and it works much better. It is still a bit slower than Canvas 2D though, and the profiler says 65% is idle while it lags badly, so I really don't understand...
Edit: I think I got it. Since my computer is running WinXP, hardware acceleration for WebGL can't be enabled, so the browsers use software rendering, and that explains why 'program' is huge in Chrome's profiler. However, hardware acceleration seems to work for the 2D context, which is why it is faster.
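For anyone who wants to check for this situation from code, here is a rough sketch. It relies on the `failIfMajorPerformanceCaveat` context attribute, which asks the browser to refuse context creation when WebGL would be dramatically slower than native (typically a software fallback). The function name `webglUnavailableOrSlow` is made up, and how reliably a particular browser reports the caveat is an assumption you should verify on your own targets.

```javascript
// Returns true when no WebGL context can be created without a "major
// performance caveat". Note that this is also true when WebGL is not
// supported at all. `canvas` is any object with a getContext method;
// pass a fresh canvas, because a canvas that already has a context
// keeps returning that same context.
function webglUnavailableOrSlow(canvas) {
  var attrs = { failIfMajorPerformanceCaveat: true };
  var gl = canvas.getContext("webgl", attrs) ||
           canvas.getContext("experimental-webgl", attrs);
  return !gl;
}

// Possible use in a page (not executed here):
//   if (webglUnavailableOrSlow(document.createElement("canvas"))) {
//     // fall back to the Canvas 2D renderer
//   }
```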

Drawing many shapes in WebGL

I was reading tutorials from here.
<script class = "WebGL">
var gl;
function initGL() {
// Get A WebGL context
var canvas = document.getElementById("canvas");
gl = getWebGLContext(canvas);
if (!gl) {
return;
}
}
var positionLocation;
var resolutionLocation;
var colorLocation;
var translationLocation;
var rotationLocation;
var translation = [50,50];
var rotation = [0, 1];
var angle = 0;
function initShaders() {
// setup GLSL program
vertexShader = createShaderFromScriptElement(gl, "2d-vertex-shader");
fragmentShader = createShaderFromScriptElement(gl, "2d-fragment-shader");
program = createProgram(gl, [vertexShader, fragmentShader]);
gl.useProgram(program);
// look up where the vertex data needs to go.
positionLocation = gl.getAttribLocation(program, "a_position");
// lookup uniforms
resolutionLocation = gl.getUniformLocation(program, "u_resolution");
colorLocation = gl.getUniformLocation(program, "u_color");
translationLocation = gl.getUniformLocation(program, "u_translation");
rotationLocation = gl.getUniformLocation(program, "u_rotation");
// set the resolution
gl.uniform2f(resolutionLocation, canvas.width, canvas.height);
}
function initBuffers() {
// Create a buffer.
var buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.enableVertexAttribArray(positionLocation);
gl.vertexAttribPointer(positionLocation, 2, gl.FLOAT, false, 0, 0);
// Set Geometry.
setGeometry(gl);
}
function setColor(red, green, blue) {
gl.uniform4f(colorLocation, red, green, blue, 1);
}
// Draw the scene.
function drawScene() {
// Clear the canvas.
gl.clear(gl.COLOR_BUFFER_BIT);
// Set the translation.
gl.uniform2fv(translationLocation, translation);
// Set the rotation.
gl.uniform2fv(rotationLocation, rotation);
// Draw the geometry.
gl.drawArrays(gl.TRIANGLES, 0, 6);
}
// Fill the buffer with the values that define a square.
function setGeometry(gl) {
/*Assume size1 is declared*/
var vertices = [
-size1/2, -size1/2,
-size1/2, size1/2,
size1/2, size1/2,
size1/2, size1/2,
size1/2, -size1/2,
-size1/2, -size1/2 ];
gl.bufferData(
gl.ARRAY_BUFFER,
new Float32Array(vertices),
gl.STATIC_DRAW);
}
function animate() {
translation[0] += 0.01;
translation[1] += 0.01;
angle += 0.01;
rotation[0] = Math.cos(angle);
rotation[1] = Math.sin(angle);
}
function tick() {
requestAnimFrame(tick);
drawScene();
animate();
}
function start() {
initGL();
initShaders();
initBuffers();
setColor(0.2, 0.5, 0.5);
tick();
}
</script>
<!-- vertex shader -->
<script id="2d-vertex-shader" type="x-shader/x-vertex">
attribute vec2 a_position;
uniform vec2 u_resolution;
uniform vec2 u_translation;
uniform vec2 u_rotation;
void main() {
vec2 rotatedPosition = vec2(
a_position.x * u_rotation.y + a_position.y * u_rotation.x,
a_position.y * u_rotation.y - a_position.x * u_rotation.x);
// Add in the translation.
vec2 position = rotatedPosition + u_translation;
// convert the position from pixels to 0.0 to 1.0
vec2 zeroToOne = position / u_resolution;
// convert from 0->1 to 0->2
vec2 zeroToTwo = zeroToOne * 2.0;
// convert from 0->2 to -1->+1 (clipspace)
vec2 clipSpace = zeroToTwo - 1.0;
gl_Position = vec4(clipSpace, 0, 1);
}
</script>
<!-- fragment shader -->
<script id="2d-fragment-shader" type="x-shader/x-fragment">
precision mediump float;
uniform vec4 u_color;
void main() {
gl_FragColor = u_color;
}
</script>
My WebGL program for 1 shape works something like this:
Get a context (gl) from the canvas element.
Initialize buffers with the shape of my object.
drawScene(): a call to gl.drawArrays().
If there is animation, an update function updates my shape's angles and positions,
and then drawScene() is called; both happen in tick(), so that they get called repeatedly.
Now when I need more than 1 shape, should I fill the single buffer at once with many objects and then use it later to call drawScene(), drawing all the objects at once,
[OR]
should I repeatedly call initBuffers() and drawScene() from requestAnimFrame()?
In pseudo code
At init time
  Get a context (gl) from the canvas element.
  for each shader
    create shader
    look up attribute and uniform locations
  for each shape
    initialize buffers with the shape
  for each texture
    create textures and/or fill them with data.
At draw time
  for each shape
    if the last shader used is different than the shader needed for this shape call gl.useProgram
    For each attribute needed by shader
      call gl.enableVertexAttribArray, gl.bindBuffer and gl.vertexAttribPointer for each attribute needed by shape with the attribute locations for the current shader.
    For each uniform needed by shader
      call gl.uniformXXX with the desired values using the locations for the current shader
    call gl.drawArrays or if the data is indexed call gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, bufferOfIndicesForCurrentShape) followed by gl.drawElements
Common Optimizations
1) Often you don't need to set every uniform. For example if you are drawing 10 shapes with the same shader and that shader takes a viewMatrix or cameraMatrix it's likely that viewMatrix uniform or cameraMatrix uniform is the same for every shape so just set it once.
2) You can often move the calls to gl.enableVertexAttribArray to initialization time.
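As a rough illustration, the draw-time part of the pseudo code could look something like the sketch below. The shape and program objects are hypothetical: `programInfo`, `attribLocations`, `uniformSetters`, `attribs`, `indices` and `count` are names invented for this example, not part of WebGL. It also assumes float attributes, tightly packed buffers, and 16-bit indices.

```javascript
// Each shape is assumed to look like:
//   { programInfo: { program, attribLocations: {name: location},
//                    uniformSetters: {name: function(gl, value)} },
//     attribs:  { name: { buffer, numComponents } },
//     uniforms: { name: value },
//     indices:  index buffer (optional),
//     count:    number of vertices, or of indices if indexed }
function drawShapes(gl, shapes) {
  var lastProgram = null;
  shapes.forEach(function (shape) {
    var info = shape.programInfo;
    // Only switch programs when the shader actually changes.
    if (info.program !== lastProgram) {
      gl.useProgram(info.program);
      lastProgram = info.program;
    }
    // Point every attribute at this shape's buffers.
    Object.keys(shape.attribs).forEach(function (name) {
      var attrib = shape.attribs[name];
      var location = info.attribLocations[name];
      gl.bindBuffer(gl.ARRAY_BUFFER, attrib.buffer);
      gl.enableVertexAttribArray(location);
      gl.vertexAttribPointer(location, attrib.numComponents, gl.FLOAT, false, 0, 0);
    });
    // Set this shape's uniforms.
    Object.keys(shape.uniforms).forEach(function (name) {
      info.uniformSetters[name](gl, shape.uniforms[name]);
    });
    // Indexed shapes use drawElements, everything else drawArrays.
    if (shape.indices) {
      gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, shape.indices);
      gl.drawElements(gl.TRIANGLES, shape.count, gl.UNSIGNED_SHORT, 0);
    } else {
      gl.drawArrays(gl.TRIANGLES, 0, shape.count);
    }
  });
}
```

Sorting `shapes` by program before calling this keeps the number of gl.useProgram calls low, and uniforms shared by every shape (a view matrix, for example) can be set once outside the loop instead of being listed per shape.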
Having multiple meshes in one buffer (and rendering them with a single gl.drawArrays() as you're suggesting) yields better performance in complex scenes but obviously at that point you're not able to change shader uniforms (such as transformations) per mesh.
If you want to have the meshes running around independently, you'll have to render each one separately. You could still keep all the meshes in one buffer to avoid some overhead from gl.bindBuffer() calls but imho that won't help that much, at least not in simple scenes.
Create your buffers separately for each object you want in the scene; otherwise they won't be able to move and use shader effects independently.
But that is in case your objects are different. From what I gather here, you just want to draw the same shape more than once at different positions, right?
The way to do that is to set the u_translation uniform (through translationLocation) to a different translation after drawing the shape the first time. That way, when you draw the shape again it will be located somewhere else and not on top of the other one, so you can see it. You can set each transformation differently and then just issue the draw call again (gl.drawArrays in your code, or gl.drawElements for indexed geometry), since you're drawing from the same buffers that are already in use.
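A minimal sketch of that idea, reusing the `translationLocation` uniform location from the question's code. The function name `drawShapeAt` and the `translations` list are made up for the example, and it uses gl.drawArrays because the question's geometry is not indexed.

```javascript
// Draw the same, already bound geometry once per entry in `translations`.
// Each entry is a 2-element array [x, y] in pixels, matching the
// vec2 u_translation uniform in the question's vertex shader.
function drawShapeAt(gl, translationLocation, translations, numVertices) {
  translations.forEach(function (t) {
    gl.uniform2fv(translationLocation, t);       // move the shape
    gl.drawArrays(gl.TRIANGLES, 0, numVertices); // draw it again from the same buffer
  });
}

// Possible use inside drawScene(), after a single gl.clear (not executed here):
//   drawShapeAt(gl, translationLocation, [[50, 50], [150, 50], [250, 50]], 6);
```

Note that gl.clear should be called only once per frame, before the loop, or each draw would wipe out the previous copies.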

Resources