VAO + VBOs logic for data visualization (boost graph) - iOS

I'm using the Boost Graph Library to organize points linked by edges in a graph, and now I'm working on their display.
I'm a newbie with OpenGL ES 2/GLKit and Vertex Array Objects / Vertex Buffer Objects. I followed this tutorial, which is really good, but by the end of it what I guess I should do is:
Create the vertices only once for a "model" instance of a Shape class (the "sprite" representing my Boost point position);
Use this model to feed the VBOs;
Bind the VBOs to a single VAO;
Draw everything in a single draw call, changing the matrix for each "sprite".
I've read that accessing VBOs is really bad for performance, and that I should use swapping VBOs instead.
My questions are:
Is the matrix translation/scaling/rotation possible in a single call?
Then, if it is: is my logic sound?
Finally: it would be great to have some code examples :-)

If you just want to draw charts, there are much easier libraries to use besides OpenGL ES. But assuming you have your reasons:
Just take a stab at what you've described and test it. If it's good enough then congratulations: you're done.
You don't mention how many graphs, how many points per graph, how often the points are modified, and the frame rate you desire.
If you're updating a few hundred vertices, and they don't change frequently, you might not even need VBOs. Recent hardware can render a lot of sprites even without them. Depends on how many verts and how often they change.
To start, try this:
// Bind the shader
glUseProgram(defaultShaderProgram);
// Set the projection (camera) matrix.
glUniformMatrix4fv(uProjectionMatrix, 1, GL_FALSE, (GLfloat*)projectionMatrix[0]);
for ( /* each chart */ )
{
    // Set the sprite (scale/rotate/translate) matrix.
    glUniformMatrix4fv(uModelViewMatrix, 1, GL_FALSE, (GLfloat*)spriteViewMatrix[0]);

    // Set the vertices.
    glVertexAttribPointer(ATTRIBUTE_VERTEX_POSITION, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), &pVertices->x);
    glVertexAttribPointer(ATTRIBUTE_VERTEX_DIFFUSE, 4, GL_UNSIGNED_BYTE, GL_TRUE, sizeof(Vertex), &pVertices->color);

    // Render. Assumes your shader does not use a texture,
    // since we did not set one.
    glDrawArrays(GL_TRIANGLES, 0, numVertices);
}
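To answer the first question directly: yes, you can bake translation, rotation, and scale into one matrix and set it with a single glUniformMatrix4fv call per sprite. Here is a minimal sketch using GLKit's math helpers; the sprite fields (x, angle, scale) are placeholders for wherever you keep per-node position/rotation/scale, and spriteTransform plays the role of spriteViewMatrix above:

#import <GLKit/GLKit.h>

// Compose translate * rotate * scale into one model-view matrix per sprite.
GLKMatrix4 spriteTransform = GLKMatrix4MakeTranslation(sprite.x, sprite.y, 0.0f);
spriteTransform = GLKMatrix4Rotate(spriteTransform, sprite.angle, 0.0f, 0.0f, 1.0f);  // rotate about Z
spriteTransform = GLKMatrix4Scale(spriteTransform, sprite.scale, sprite.scale, 1.0f);

// One uniform upload; the shader sees the combined transform.
glUniformMatrix4fv(uModelViewMatrix, 1, GL_FALSE, spriteTransform.m);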

Related

OpenGL slows down when rendering nearby objects on top of others

I am writing an iOS app using OpenGL ES 2.0 to render a number of objects to the screen.
Currently, those objects are simple shapes (squares, spheres, and cylinders).
When none of the objects overlap each other, the program runs smoothly at 30 fps.
My problem arises when I add objects that appear behind the rest of my models (a background rectangle, for example). When I attempt to draw a background rectangle, I can only draw objects in front of it that take up less than half the screen. Any larger than that and the frame rate drops to between 15 and 20 fps.
As it stands, all of my models, including the background, are drawn with the following code:
- (void)drawSingleModel:(Model *)model
{
    // Create a model transform matrix.
    CC3GLMatrix *modelView = [CC3GLMatrix matrix];

    // Transform model view
    // ...

    // Pass matrix to shader.
    glUniformMatrix4fv(_modelViewUniform, 1, 0, modelView.glMatrix);

    // Bind the correct buffers to OpenGL.
    glBindBuffer(GL_ARRAY_BUFFER, [model vertexBuffer]);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, [model indexBuffer]);

    glVertexAttribPointer(_positionSlot, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), 0);
    glVertexAttribPointer(_colorSlot, 4, GL_FLOAT, GL_FALSE, sizeof(Vertex), (GLvoid*) (sizeof(float) * 3));

    // Load vertex texture coordinate attributes into the texture buffer.
    glVertexAttribPointer(_texCoordSlot, 2, GL_FLOAT, GL_FALSE, sizeof(Vertex), (GLvoid*) (sizeof(float) * 7));

    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, [model textureIndex]);
    glUniform1i(_textureUniform, 0);

    glDrawElements([model drawMode], [model numIndices], GL_UNSIGNED_SHORT, 0);
}
This code is called from my draw method, which is defined as follows:
- (void)draw
{
    glUseProgram(_programHandle);

    // Perform OpenGL rendering here.
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    glEnable(GL_BLEND);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glEnable(GL_DEPTH_TEST);
    glEnable(GL_CULL_FACE);

    _camera = [CC3GLMatrix matrix];

    // Camera orientation code.
    // ...

    // Pass the camera matrix to the shader program.
    glUniformMatrix4fv(_projectionUniform, 1, 0, _camera.glMatrix);

    glViewport(0, 0, self.frame.size.width, self.frame.size.height);

    // Render the background.
    [self drawSingleModel:_background];

    // Render the objects.
    for (int x = 0; x < [_models count]; ++x)
    {
        [self drawSingleModel:[_models objectAtIndex:x]];
    }

    // Send the contents of the render buffer to the UI View.
    [_context presentRenderbuffer:GL_RENDERBUFFER];
}
I found that by changing the render order as follows:
for (int x = 0; x < [_models count]; ++x)
{
    [self drawSingleModel:[_models objectAtIndex:x]];
}
[self drawSingleModel:_background];
my frame rate when rendering on top of the background is 30 fps.
Of course, the slowdown still occurs if any objects in _models must render in front of each other. Additionally, rendering in this order causes translucent and transparent objects to be drawn black.
I'm still somewhat new to OpenGL, so I don't quite know where my problem lies. My assumption is that there is a slowdown in performing depth testing, and I also realize I'm working on a mobile device. But I can't believe that iOS devices are simply too slow to do this. The program is only rendering 5 models, with around 180 triangles each.
Is there something I'm not seeing, or some sort of workaround for this?
Any suggestions or pointers would be greatly appreciated.
You're running into one of the peculiarities of mobile GPUs: those things (except the NVidia Tegra) don't do depth testing for hidden surface removal. Most mobile GPUs, including the one in the iPad, are tile-based rasterizers. The reason for this is to save memory bandwidth, because memory access is actually a power-intensive operation. In the power-constrained environment of a mobile device, reducing the required memory bandwidth gains significant battery lifetime.
Tile-based renderers split up the viewport into a number of tiles. When geometry is sent in, it is split across the tiles, and for each tile it is intersected with the geometry already in the tile. Most of the time a tile is covered by only a single primitive. If the incoming primitive happens to be in front of the already-present geometry, it replaces it. If there's a cutting intersection, a new edge is added. Only if a certain threshold number of edges is reached will that single tile switch to depth-testing mode.
Only at synchronization points are the prepared tiles actually rasterized.
Now it's obvious why overlapping objects reduce rendering performance: The more primitives overlap, the more preprocessing has to be done to setup the tiles.
See "transparency sorting"/"alpha sorting".
I suspect the slowness you're seeing is largely due to "overdraw", i.e. framebuffer pixels being drawn more than once. This is worst when you draw the scene back-to-front, since the depth test always passes. While the iPhone 4/4S/5 may have a beefy GPU, last I checked the memory bandwidth was pretty terrible (I don't know how big the GPU cache is).
If you render front-to-back, the problem is that transparent pixels still write to the depth buffer, causing them to occlude polys behind them. You can reduce this slightly (but only slightly) using the alpha test.
The simple solution: Render opaque polys approximately front-to-back and then transparent polys back-to-front. This may mean making two passes through your scene, and ideally you want to sort the transparent polys which isn't that easy to do well.
I think it's also possible (in principle) to render everything front-to-back and perform alpha testing on the destination alpha, but I don't think OpenGL supports this.
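If you go the two-pass route, a rough sketch built on the drawSingleModel: method above might look like the following. The isOpaque flag and the depth-sorted _sortedFrontToBack array are assumptions you'd have to add yourself:

// Pass 1: opaque geometry, roughly front-to-back, depth writes on, no blending.
glDisable(GL_BLEND);
glDepthMask(GL_TRUE);
for (Model *model in _sortedFrontToBack) {
    if ([model isOpaque]) {
        [self drawSingleModel:model];
    }
}

// Pass 2: transparent geometry, back-to-front, depth test still on but depth writes off.
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
glDepthMask(GL_FALSE);
for (NSInteger x = (NSInteger)[_sortedFrontToBack count] - 1; x >= 0; --x) {
    Model *model = [_sortedFrontToBack objectAtIndex:x];
    if (![model isOpaque]) {
        [self drawSingleModel:model];
    }
}
glDepthMask(GL_TRUE);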

iOS OpenGL ES to draw a mesh wireframe

I have a human model in an .OBJ file I want to display as a mesh with triangles. No textures. I want also to be able to move, scale, and rotate in 3D.
The first and working option is to project the vertices to 2D using the maths manually and then draw them with Quartz 2D. This works, for I know the underlying math concepts for perspective projection.
However, I would like to use OpenGL ES for this instead, but I am not sure how to draw the triangles.
For example, the code in - (void)drawRect:(CGRect)rect is:
glClearColor(1,0,0,0);
glClear(GL_COLOR_BUFFER_BIT);
GLKBaseEffect *effect = [[GLKBaseEffect alloc] init];
[effect prepareToDraw];
glEnable(GL_DEPTH_TEST);
glEnable(GL_CULL_FACE);
Now what? I have an array of vertex positions (3 floats per vertex) and an array of triangle indices, so I tried this:
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, numVertices,pVertices);
glDrawElements(GL_TRIANGLES, numTriangles, GL_UNSIGNED_INT,pTriangles);
but this doesn't show anything. I saw from a sample the usage of glEnableVertexAttribArray(GLKVertexAttribPosition) and glDrawArrays, but I'm not sure how to use them.
I also understand that rendering a wireframe is not possible with ES? So I have to apply color attributes to the vertices. That's ok, but before that the triangles have to be displayed first.
The first thing I'd ask is: where are your vertices? OpenGL (ES) draws in a coordinate space that extends from (-1, -1, -1) to (1, 1, 1), so you probably want to transform your points with a projection matrix to get them into that space. To learn about projection matrices and more of the basics of OpenGL ES 2.0 on iOS, I'd suggest finding a book or a tutorial. This one's not bad, and here's another that's specific to GLKit.
Drawing with OpenGL in drawRect: is probably not something you want to be doing. If you're already using GLKit, why not use GLKView? There's good example code to get you started if you create a new Xcode project with the "OpenGL Game" template.
Once you get up to speed with GL you'll find that the function glPolygonMode typically used for wireframe drawing on desktop OpenGL doesn't exist in OpenGL ES. Depending on how your vertex data is organized, though, you might be able to get a decent wireframe with GL_LINES or GL_LINE_LOOP. Or since you're using GLKit, you can skip wireframe and set up some lights and shading pretty easily with GLKBaseEffect.
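For instance, here's a hedged sketch of drawing a wireframe from existing triangle indices with GLKBaseEffect: turn each triangle (a, b, c) into the three edges (a,b), (b,c), (c,a) and draw them with GL_LINES. Note that core ES 2.0 only guarantees 16-bit indices, so this assumes your indices fit in a GLushort; projectionMatrix and modelViewMatrix are placeholders for your own matrices:

// Build a line index list from the triangle index list (three edges per triangle).
GLushort *lineIndices = malloc(sizeof(GLushort) * numTriangles * 6);
for (int t = 0; t < numTriangles; t++) {
    GLushort a = (GLushort)pTriangles[t * 3 + 0];
    GLushort b = (GLushort)pTriangles[t * 3 + 1];
    GLushort c = (GLushort)pTriangles[t * 3 + 2];
    GLushort *edge = &lineIndices[t * 6];
    edge[0] = a; edge[1] = b;   // edge a-b
    edge[2] = b; edge[3] = c;   // edge b-c
    edge[4] = c; edge[5] = a;   // edge c-a
}

GLKBaseEffect *effect = [[GLKBaseEffect alloc] init];
effect.transform.projectionMatrix = projectionMatrix;   // your perspective matrix
effect.transform.modelviewMatrix  = modelViewMatrix;    // your model/camera matrix
[effect prepareToDraw];

glEnableVertexAttribArray(GLKVertexAttribPosition);
glVertexAttribPointer(GLKVertexAttribPosition, 3, GL_FLOAT, GL_FALSE, 0, pVertices);
glDrawElements(GL_LINES, numTriangles * 6, GL_UNSIGNED_SHORT, lineIndices);
free(lineIndices);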

optimizing openGL ES 2.0 2D texture output and framerate

I was hoping someone can help me make some progress in some texture benchmarks I'm doing in OpenGL ES 2.0 on an iPhone 4.
I have an array that contains sprite objects. The render loop cycles through all the sprites per texture and retrieves all their texture coords and vertex coords. It adds those to a giant interleaved array, using degenerate vertices and indices, and sends those to the GPU (I'm embedding code at the bottom). This is all being done per texture, so I'm binding the texture once, then creating my interleaved array, then drawing it. Everything works just great and the results on the screen are exactly what they should be.
So my benchmark test is done by adding 25 new sprites per touch at varying opacities and changing their vertices on the update so that they are bouncing around the screen while rotating, and running OpenGL ES Analyzer on the app.
Here's where I'm hoping for some help...
I can get to around 275 32x32 sprites with varying opacity bouncing around the screen at 60 fps. By 400 I'm down to 40 fps. When I run the OpenGL ES Performance Detective it tells me...
The app rendering is limited by triangle rasterization - the process of converting triangles into pixels. The total area in pixels of all of the triangles you are rendering is too large. To draw at a faster frame rate, simplify your scene by reducing either the number of triangles, their size, or both.
The thing is, I just whipped up a test in cocos2D using CCSpriteBatchNode with the same texture and created 800 transparent sprites, and the framerate is an easy 60 fps.
Here is some code that may be pertinent...
Shader.vsh (matrices are set up once at the beginning)
void main()
{
gl_Position = projectionMatrix * modelViewMatrix * position;
texCoordOut = texCoordIn;
colorOut = colorIn;
}
Shader.fsh (colorOut is used to calc opacity)
void main()
{
lowp vec4 fColor = texture2D(texture, texCoordOut);
gl_FragColor = vec4(fColor.xyz, fColor.w * colorOut.a);
}
VBO setup
glGenBuffers(1, &_vertexBuf);
glGenBuffers(1, &_indiciesBuf);
glGenVertexArraysOES(1, &_vertexArray);
glBindVertexArrayOES(_vertexArray);
glBindBuffer(GL_ARRAY_BUFFER, _vertexBuf);
glBufferData(GL_ARRAY_BUFFER, sizeof(TDSEVertex)*12000, &vertices[0].x, GL_DYNAMIC_DRAW);
glEnableVertexAttribArray(GLKVertexAttribPosition);
glVertexAttribPointer(GLKVertexAttribPosition, 2, GL_FLOAT, GL_FALSE, sizeof(TDSEVertex), BUFFER_OFFSET(0));
glEnableVertexAttribArray(GLKVertexAttribTexCoord0);
glVertexAttribPointer(GLKVertexAttribTexCoord0, 2, GL_FLOAT, GL_FALSE, sizeof(TDSEVertex), BUFFER_OFFSET(8));
glEnableVertexAttribArray(GLKVertexAttribColor);
glVertexAttribPointer(GLKVertexAttribColor, 4, GL_FLOAT, GL_FALSE, sizeof(TDSEVertex), BUFFER_OFFSET(16));
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, _indiciesBuf);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(ushort)*12000, indicies, GL_STATIC_DRAW);
glBindVertexArrayOES(0);
Update Code
/*
Here it cycles through all the sprites, gets their vert info (includes coords, texture coords, and color) and adds them to this giant array
The array is of...
typedef struct{
float x, y;
float tx, ty;
float r, g, b, a;
}TDSEVertex;
*/
glBindBuffer(GL_ARRAY_BUFFER, _vertexBuf);
//glBufferSubData(GL_ARRAY_BUFFER, sizeof(vertices[0])*(start), sizeof(TDSEVertex)*(indicesCount), &vertices[start]);
glBufferData(GL_ARRAY_BUFFER, sizeof(TDSEVertex)*indicesCount, &vertices[start].x, GL_DYNAMIC_DRAW);
glBindBuffer(GL_ARRAY_BUFFER, 0);
Render Code
GLKTextureInfo* textureInfo = [[TDSETextureManager sharedTextureManager].textures objectForKey:textureName];
glBindTexture(GL_TEXTURE_2D, textureInfo.name);
glBindVertexArrayOES(_vertexArray);
glDrawElements(GL_TRIANGLE_STRIP, indicesCount, GL_UNSIGNED_SHORT, BUFFER_OFFSET(start));
glBindVertexArrayOES(0);
Here's a screenshot at 400 sprites (800 triangles + 800 degenerate triangles) to give an idea of the opacity layering as the textures are moving...
Again, I should note that a VBO is being created and sent per texture, so I'm binding and then drawing only twice per frame (since there are only two textures).
Sorry if this is overwhelming, but it's my first post on here and I wanted to be thorough.
Any help would be much appreciated.
PS: I know that I could just use Cocos2D instead of writing everything from scratch, but where's the fun (and learning) in that?!
UPDATE #1
When I switch my fragment shader to only be
gl_FragColor = texture2D(texture, texCoordOut);
it gets to 802 sprites at 50 fps (4804 triangles including degenerate triangles), though setting sprite opacity is lost. Any suggestions as to how I can still handle opacity in my shader without running at 1/4 of the speed?
UPDATE #2
So I ditched GLKit's view and view controller and wrote a custom view loaded from the AppDelegate. 902 sprites with opacity and transparency at 60 fps.
Mostly miscellaneous thoughts...
If you're triangle limited, try switching from GL_TRIANGLE_STRIP to GL_TRIANGLES. You're still going to need to specify exactly the same number of indices — six per quad — but the GPU never has to spot that the connecting triangles between quads are degenerate (i.e., it never has to convert them into zero pixels). You'll need to profile to see whether you end up paying a cost for no longer implicitly sharing edges.
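For reference, here's a minimal sketch of building six GL_TRIANGLES indices per quad instead of a degenerate-stitched strip. It assumes your batcher emits four consecutive vertices per sprite quad and reuses the existing indicies array; the exact winding depends on how your quad vertices are ordered:

// Six indices per quad: two triangles sharing the quad's diagonal.
for (ushort q = 0; q < quadCount; q++) {
    ushort base = q * 4;                 // four vertices per sprite quad
    indicies[q * 6 + 0] = base + 0;
    indicies[q * 6 + 1] = base + 1;
    indicies[q * 6 + 2] = base + 2;
    indicies[q * 6 + 3] = base + 2;
    indicies[q * 6 + 4] = base + 1;
    indicies[q * 6 + 5] = base + 3;
}
// Then draw with six indices per sprite:
glDrawElements(GL_TRIANGLES, quadCount * 6, GL_UNSIGNED_SHORT, BUFFER_OFFSET(0));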
You should also shrink the footprint of your vertices. I would dare imagine you can specify x, y, tx and ty as 16-bit integers, and your colours as 8-bit integers without any noticeable change in rendering. That would reduce the footprint of each vertex from 32 bytes (eight components, each four bytes in size) to 12 bytes (four two-byte values plus four one-byte values, with no padding needed because everything is already aligned) — cutting almost 63% of the memory bandwidth costs there.
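As an illustration of that smaller footprint, here's a sketch of a packed vertex and the matching attribute pointers. The struct name, the fixed-point conventions, and the idea that you rescale positions via your matrices are all assumptions, not drop-in code:

typedef struct {
    GLshort x, y;          // position in integer units; scale via your projection/model-view matrix
    GLshort tx, ty;        // texture coords packed so 0..32767 maps to roughly 0..1 when normalized
    GLubyte r, g, b, a;    // color/opacity as normalized bytes
} TDSECompactVertex;       // 12 bytes per vertex instead of 32

glVertexAttribPointer(GLKVertexAttribPosition,  2, GL_SHORT,         GL_FALSE, sizeof(TDSECompactVertex), BUFFER_OFFSET(0));
glVertexAttribPointer(GLKVertexAttribTexCoord0, 2, GL_SHORT,         GL_TRUE,  sizeof(TDSECompactVertex), BUFFER_OFFSET(4));
glVertexAttribPointer(GLKVertexAttribColor,     4, GL_UNSIGNED_BYTE, GL_TRUE,  sizeof(TDSECompactVertex), BUFFER_OFFSET(8));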
As you actually seem to be fill-rate limited, you should consider your source texture too. Anything you can do to trim its byte size will directly help texel fetches and hence fill rate.
It looks like you're using art that is consciously about the pixels so switching to PVR probably isn't an option. That said, people sometimes don't realise the full benefit of PVR textures; if you switch to, say, the 4 bits per pixel mode then you can scale your image up to be twice as wide and twice as tall so as to reduce compression artefacts and still only be paying 16 bits on each source pixel but likely getting a better luminance range than a 16 bpp RGB texture.
Assuming you're currently using a 32 bpp texture, you should at least see whether an ordinary 16 bpp RGB texture is sufficient using any of the provided hardware modes (especially if the 1 bit of alpha plus 5 bits per colour channel is appropriate to your art, since that loses only 9 bits of colour information versus the original while reducing bandwidth costs by 50%).
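For example, here's a hedged sketch of uploading the same image as 16 bpp RGBA5551 instead of 32 bpp RGBA8888; convertTo5551 is a hypothetical helper (you'd pack each pixel down yourself, or pre-convert the asset offline):

// Pack the 8888 source down to 5551 (1-bit alpha) before upload: half the texel bandwidth.
uint16_t *pixels5551 = convertTo5551(pixels8888, width, height);  // hypothetical helper
glBindTexture(GL_TEXTURE_2D, textureInfo.name);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0,
             GL_RGBA, GL_UNSIGNED_SHORT_5_5_5_1, pixels5551);
free(pixels5551);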
It also looks like you're uploading indices every single frame. Upload only when you add extra objects to the scene or if the buffer as last uploaded is hugely larger than it needs to be. You can just limit the count passed to glDrawElements to cut back on objects without a reupload. You should also check whether you actually gain anything by uploading your vertices to a VBO and then reusing them if they're just changing every frame. It might be faster to provide them directly from client memory.
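A sketch of the "only re-upload indices when the buffer actually grows" idea, reusing the GL_TRIANGLES indexing from the earlier sketch; maxQuadsUploaded is a hypothetical counter you keep next to the VBO handle:

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, _indiciesBuf);
if (quadCount > maxQuadsUploaded) {
    // New sprites were added, so extend the index buffer once, not every frame.
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(ushort) * quadCount * 6, indicies, GL_STATIC_DRAW);
    maxQuadsUploaded = quadCount;
}
// Drawing fewer sprites needs no re-upload at all: just pass a smaller count.
glDrawElements(GL_TRIANGLES, quadCount * 6, GL_UNSIGNED_SHORT, BUFFER_OFFSET(0));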

OpenGL ES 2 (iOS) Morph / Animate between two sets of vertexes

I have two sets of vertexes used as a line strip:
Vertexes1
Vertexes2
It's important to know that these vertexes have previously unknown values, as they are dynamic.
I want to make an animated transition (morph) between these two. I have come up with two different ways of doing this:
Option 1:
Set a Time uniform in the vertex shader, that goes from 0 - 1, where I can do something like this:
// Inside main() in the vertex shader
float originX = Position.x;
float destinationX = DestinationVertexPosition.x;
float interpolatedX = originX + (destinationX - originX) * Time;
gl_Position.x = interpolatedX;
As you probably see, this has one problem: How do I get the "DestinationVertexPosition" in there?
Option 2:
Make the interpolation calculation outside the vertex shader, where I loop through each vertex and create a third vertex set for the interpolated values, and use that to render:
// Pre render
// Use this vertex set to render
InterpolatedVertexes
for (unsigned int i = 0; i < vertexCount; i++) {
    float originX = Vertexes1[i].x;
    float destinationX = Vertexes2[i].x;
    float interpolatedX = originX + (destinationX - originX) * Time;
    InterpolatedVertexes[i].x = interpolatedX;
}
I have highly simplified these two code snippets, just to make the idea clear.
Now, from the two options, I feel like the first one is definitely better in terms of performance, given that the work happens at the shader level, AND I don't have to create a new set of vertexes each time the "Time" is updated.
So, now that the introduction to the problem has been covered, I would appreciate any of the following three things:
A discussion of better ways of achieving the desired results in OpenGL ES 2 (iOS).
A discussion about how Option 1 could be implemented properly, either by providing the "DestinationVertexPosition" or by modifying the idea somehow, to better achieve the same result.
A discussion about how Option 2 could be implemented.
In ES 2 you specify such attributes as you like — there's therefore no problem with specifying attributes for both origin and destination, and doing the linear interpolation between them in the vertex shader. However, you really shouldn't do it component by component as your code suggests you want to, since GPUs are vector processors, and the mix GLSL function will do the linear blend you want. So e.g. (with obvious inefficiencies and assumptions)
int sourceAttribute = glGetAttribLocation(shader, "sourceVertex");
glVertexAttribPointer(sourceAttribute, 3, GL_FLOAT, GL_FALSE, 0, sourceLocations);
int destAttribute = glGetAttribLocation(shader, "destVertex");
glVertexAttribPointer(destAttribute, 3, GL_FLOAT, GL_FALSE, 0, destLocations);
And:
gl_Position = vec4(mix(sourceVertex, destVertex, Time), 1.0);
Your two options here have a trade off: supply twice the geometry once and interpolate between that, or supply only one set of geometry, but do so for each frame. You have to weigh geometry size vs. upload bandwidth.
Given my experience with iOS devices, I'd highly recommend option 1. Uploading new geometry on every frame can be extremely expensive on these devices.
If the vertices are constant, you can upload them once into one or two vertex buffer objects (VBOs) with the GL_STATIC_DRAW flag set. The PowerVR SGX series has hardware optimizations for dealing with static VBOs, so they are very fast to work with after the initial upload.
As far as how to upload two sets of vertices for use in a single shader, geometry is just another input attribute for your shader. You could have one, two, or more sets of vertices fed into a single vertex shader. You just define the attributes using code like
attribute vec3 startingPosition;
attribute vec3 endingPosition;
and interpolate between them using code like
vec3 finalPosition = startingPosition * (1.0 - fractionalProgress) + endingPosition * fractionalProgress;
Edit: Tommy points out the mix() operation, which I'd forgotten about and is a better way to do the above vertex interpolation.
In order to inform your shader program as to where to get the second set of vertices, you'd use pretty much the same glVertexAttribPointer() call for the second set of geometry as the first, only pointing to that VBO and attribute.
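Concretely, here's a sketch of what that setup might look like, with two static VBOs feeding the two attributes and a uniform for the blend factor. The buffer handles and attribute/uniform location variables are assumptions that mirror the names used above:

// One-time setup: upload both vertex sets as static VBOs.
GLuint vertexBuffers[2];
glGenBuffers(2, vertexBuffers);
glBindBuffer(GL_ARRAY_BUFFER, vertexBuffers[0]);
glBufferData(GL_ARRAY_BUFFER, sizeof(GLfloat) * 3 * vertexCount, Vertexes1, GL_STATIC_DRAW);
glBindBuffer(GL_ARRAY_BUFFER, vertexBuffers[1]);
glBufferData(GL_ARRAY_BUFFER, sizeof(GLfloat) * 3 * vertexCount, Vertexes2, GL_STATIC_DRAW);

// Per frame: point each attribute at its VBO and update the progress uniform.
glBindBuffer(GL_ARRAY_BUFFER, vertexBuffers[0]);
glEnableVertexAttribArray(startingPositionAttribute);
glVertexAttribPointer(startingPositionAttribute, 3, GL_FLOAT, GL_FALSE, 0, 0);

glBindBuffer(GL_ARRAY_BUFFER, vertexBuffers[1]);
glEnableVertexAttribArray(endingPositionAttribute);
glVertexAttribPointer(endingPositionAttribute, 3, GL_FLOAT, GL_FALSE, 0, 0);

glUniform1f(fractionalProgressUniform, fractionalProgress);  // 0.0 .. 1.0 over the animation
glDrawArrays(GL_LINE_STRIP, 0, vertexCount);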
Note that you can perform this calculation as a vector, rather than breaking out all three components individually. This doesn't get you much with a highp default precision on current PowerVR SGX chips, but could be faster on future ones than doing this one component at a time.
You might also want to look into other techniques used for vertex skinning, because there might be other ways of animating vertices that don't require two full sets of vertices to be uploaded.
The one case that I've heard where option 2 (uploading new geometry on each frame) might be preferable is in specific cases where using the Accelerate framework to do vector manipulation of the geometry ends up being faster than doing the skinning on-GPU. I remember the Unity folks were talking about this once, but I can't remember if it was for really small or really large sets of geometry. Option 1 has been faster in all the cases I've worked with myself.

Automatically calculate normals in GLKit/OpenGL-ES

I'm making some fairly basic shapes in OpenGL-ES based on sample code from Apple. They've used an array of points, with an array of indices into the first array and each set of three indices creates a polygon. That's all great, I can make the shapes I want. To shade the shapes correctly I believe I need to calculate normals for each vertex on each polygon. At first the shapes were cuboidal so it was very easy, but now I'm making (slightly) more advanced shapes I want to create those normals automatically. It seems easy enough if I get vectors for two edges of a polygon (all polys are triangles here) and use their cross product for every vertex on that polygon. After that I use code like below to draw the shape.
glEnableVertexAttribArray(GLKVertexAttribPosition);
glVertexAttribPointer(GLKVertexAttribPosition, 3, GL_FLOAT, GL_FALSE, 0, triangleVertices);
glEnableVertexAttribArray(GLKVertexAttribColor);
glVertexAttribPointer(GLKVertexAttribColor, 4, GL_FLOAT, GL_FALSE, 0, triangleColours);
glEnableVertexAttribArray(GLKVertexAttribNormal);
glVertexAttribPointer(GLKVertexAttribNormal, 3, GL_FLOAT, GL_FALSE, 0, triangleNormals);
glDrawArrays(GL_TRIANGLES, 0, 48);
glDisableVertexAttribArray(GLKVertexAttribPosition);
glDisableVertexAttribArray(GLKVertexAttribColor);
glDisableVertexAttribArray(GLKVertexAttribNormal);
What I'm having trouble understanding is why I have to do this manually. I'm sure there are cases when you'd want something other than just a vector perpendicular to the surface, but I'm also sure that this is the most popular use case by far, so shouldn't there be an easier way? Have I missed something obvious? glCalculateNormals() would be great.
And here is an answer:
Pass in a GLKVector3[] that you wish to be filled with your normals, another with the vertices (each three are grouped into polygons) and then the count of the vertices.
- (void)calculateSurfaceNormals:(GLKVector3 *)normals forVertices:(GLKVector3 *)incomingVertices count:(int)numOfVertices
{
    for (int i = 0; i < numOfVertices; i += 3)
    {
        GLKVector3 vector1 = GLKVector3Subtract(incomingVertices[i+1], incomingVertices[i]);
        GLKVector3 vector2 = GLKVector3Subtract(incomingVertices[i+2], incomingVertices[i]);
        GLKVector3 normal = GLKVector3Normalize(GLKVector3CrossProduct(vector1, vector2));
        normals[i] = normal;
        normals[i+1] = normal;
        normals[i+2] = normal;
    }
}
And again the answer is: OpenGL is neither a scene management library nor a geometry library, but just a drawing API that draws nice pictures to the screen. For lighting it needs normals, and you give it the normals. That's all. Why should it compute normals if this can just be done by the user and has nothing to do with the actual drawing?
Often you don't compute them at runtime anyway, but load them from a file. And there are many many ways to compute normals. Do you want per-face normals or per-vertex normals? Do you need any specific hard edges or any specific smooth patches? If you want to average face normals to get vertex normals, how do you want to average these?
And with the advent of shaders and the removal of the built-in normal attribute and lighting computations in newer OpenGL versions, this whole question becomes obsolete anyway, as you can do lighting any way you want and don't necessarily need traditional normals anymore.
By the way, it sounds like at the moment you are using per-face normals, which means every vertex of a face has the same normal. This creates a very faceted model with hard edges and also doesn't work very well together with indices. If you want a smooth model (I don't know, maybe you really want a faceted look), you should average the face normals of the adjacent faces for each vertex to compute per-vertex normals. That would actually be the more usual use-case and not per-face normals.
So you can do something like this pseudo-code:
for each vertex normal:
    initialize to zero vector
for each face:
    compute face normal using cross product
    add face normal to each vertex normal of this face
for each vertex normal:
    normalize
to generate smooth per-vertex normals. Even in actual code this should result in something between 10 and 20 lines of code, which isn't really complex.
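For example, here's a hedged translation of that pseudo-code into GLKit math for an indexed mesh (all names here are placeholders; because the face normals aren't normalized before accumulation, each face is implicitly weighted by its area):

- (void)calculateSmoothNormals:(GLKVector3 *)normals
                   forVertices:(const GLKVector3 *)vertices
                   vertexCount:(int)vertexCount
                       indices:(const GLushort *)indices
                    indexCount:(int)indexCount
{
    // 1. Zero out every vertex normal.
    for (int v = 0; v < vertexCount; v++) {
        normals[v] = GLKVector3Make(0.0f, 0.0f, 0.0f);
    }
    // 2. Add each face normal to the normals of its three vertices.
    for (int i = 0; i < indexCount; i += 3) {
        GLushort ia = indices[i], ib = indices[i + 1], ic = indices[i + 2];
        GLKVector3 edge1 = GLKVector3Subtract(vertices[ib], vertices[ia]);
        GLKVector3 edge2 = GLKVector3Subtract(vertices[ic], vertices[ia]);
        GLKVector3 faceNormal = GLKVector3CrossProduct(edge1, edge2);
        normals[ia] = GLKVector3Add(normals[ia], faceNormal);
        normals[ib] = GLKVector3Add(normals[ib], faceNormal);
        normals[ic] = GLKVector3Add(normals[ic], faceNormal);
    }
    // 3. Normalize, averaging the contributions of all adjacent faces.
    for (int v = 0; v < vertexCount; v++) {
        normals[v] = GLKVector3Normalize(normals[v]);
    }
}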
