I queried the point size range with gl.getParameter(gl.ALIASED_POINT_SIZE_RANGE) and got [1, 1024]. Does this mean that if I use a single point to cover a texture (so that the fragment shader runs for every pixel spanned by the point size), I can at best render images no larger than 1024x1024?
I guess I have to supply 2 triangles (6 vertices) that cover all of clip space, and then gl.viewport(x, y, width, height) will map that entire area to the output (a framebuffer object's texture or the canvas)?
Is there any other way (maybe something new in WebGL2) other than supplying vertices through an attribute?
Correct. The largest area you can render with a single point is whatever gl.getParameter(gl.ALIASED_POINT_SIZE_RANGE) returns.
The spec does not require any size larger than 1. The fact that your GPU/Driver/Browser returned 1024 does not mean that your users' machines will also return 1024.
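A minimal sketch of checking that limit at run time rather than assuming 1024. The helper name canCoverWithPoint is made up for this example; the only WebGL call it relies on is the gl.getParameter(gl.ALIASED_POINT_SIZE_RANGE) query from the question.

```javascript
// Hypothetical helper: true when one point can cover a width x height target
// on this particular context.
function canCoverWithPoint(gl, width, height) {
  // [min, max] supported point size; only a max of at least 1 is guaranteed.
  const range = gl.getParameter(gl.ALIASED_POINT_SIZE_RANGE);
  return Math.max(width, height) <= range[1];
}
```

If this returns false on a user's machine, the point-based approach cannot fill the target there and a quad made of two triangles is the fallback.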
note: Answering based on your history of questions
The normal thing to do in WebGL for 99% of all cases is to submit vertices. Want to draw a quad? Submit 4 vertices and 6 indices, or 6 vertices. Want to draw a triangle? Submit 3 vertices. Want to draw a circle? Submit the vertices for a circle. Want to draw a car? Submit the vertices for a car or, more likely, submit the vertices for a wheel, draw 4 wheels with those vertices, submit the vertices for the other parts of the car, and draw each part of the car.
You multiply those vertices by some matrices to move, scale, rotate, and project them into 2D or 3D space. All your favorite games do this. The canvas 2D api does this via OpenGL ES internally. Chrome itself does this to render all the parts of this webpage. That's the norm. Anything else is an exception and will likely lead to limitations.
For fun, in WebGL2, there are some other things you can do. They are not the normal thing to do and they are not recommended to actually solve real world problems. They can be fun though just for the challenge.
In WebGL2 there is a built-in variable in the vertex shader called gl_VertexID, which is the index of the vertex currently being processed. You can use that with clever math to generate vertices in the vertex shader with no other data.
Here's some code that draws a quad that covers the canvas:
function main() {
  const gl = document.querySelector('canvas').getContext('webgl2');

  const vs = `#version 300 es
  void main() {
    int x = gl_VertexID % 2;
    int y = (gl_VertexID / 2 + gl_VertexID / 3) % 2;
    gl_Position = vec4(ivec2(x, y) * 2 - 1, 0, 1);
  }
  `;

  const fs = `#version 300 es
  precision mediump float;
  out vec4 outColor;
  void main() {
    outColor = vec4(1, 0, 0, 1);
  }
  `;

  // compile shaders, link program
  const prg = twgl.createProgram(gl, [vs, fs]);
  gl.useProgram(prg);

  // no buffers or attributes needed: 6 vertex ids are enough
  const count = 6;
  gl.drawArrays(gl.TRIANGLES, 0, count);
}
main();
<canvas></canvas>
<script src="https://twgljs.org/dist/4.x/twgl.min.js"></script>
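To see why that integer math produces a quad, here is a small CPU-side sketch that repeats the shader's arithmetic for vertex ids 0 to 5. The function name quadVertex is made up for this check; Math.trunc stands in for GLSL integer division.

```javascript
// CPU-side copy of the vertex shader math, for checking only.
// GLSL integer division truncates, hence Math.trunc.
function quadVertex(vertexId) {
  const x = vertexId % 2;
  const y = (Math.trunc(vertexId / 2) + Math.trunc(vertexId / 3)) % 2;
  return [x * 2 - 1, y * 2 - 1]; // map 0..1 to clip space -1..+1
}

const quad = Array.from({ length: 6 }, (_, id) => quadVertex(id));
console.log(quad.map((v) => `(${v[0]}, ${v[1]})`).join(' '));
```

The six positions form two triangles that share the diagonal from (1, -1) to (-1, 1). Note that the two triangles come out with opposite winding, which is harmless as long as face culling stays disabled (the WebGL default).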
And one that draws a circle:
function main() {
  const gl = document.querySelector('canvas').getContext('webgl2');

  const vs = `#version 300 es
  #define PI radians(180.0)
  void main() {
    int triangleId = gl_VertexID / 3;
    int pointId = gl_VertexID % 3;
    int pointIdOffset = pointId % 2;
    // 100.0 is assumed here: the number of triangles around the circle (count / 3 below)
    float angle = float((triangleId + pointIdOffset) * 2) * PI / 100.0;
    // points 0 and 1 sit on the rim, point 2 is the center
    float radius = 1. - step(1.5, float(pointId));
    float x = sin(angle) * radius;
    float y = cos(angle) * radius;
    gl_Position = vec4(x, y, 0, 1);
  }
  `;

  const fs = `#version 300 es
  precision mediump float;
  out vec4 outColor;
  void main() {
    outColor = vec4(1, 0, 0, 1);
  }
  `;

  // compile shaders, link program
  const prg = twgl.createProgram(gl, [vs, fs]);
  gl.useProgram(prg);

  const count = 300; // 100 triangles, 3 points each
  gl.drawArrays(gl.TRIANGLES, 0, count);
}
main();
<canvas></canvas>
<script src="https://twgljs.org/dist/4.x/twgl.min.js"></script>
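The circle math can be checked the same way on the CPU. This is only a sketch: circleVertex is a made-up name, and the default of 100 triangles is an assumption taken from the draw count of 300 vertices.

```javascript
// CPU-side copy of the circle shader math, for checking only.
// numTriangles = 100 is an assumption matching the 300-vertex draw call.
function circleVertex(vertexId, numTriangles = 100) {
  const triangleId = Math.trunc(vertexId / 3);
  const pointId = vertexId % 3;
  const pointIdOffset = pointId % 2;
  const angle = ((triangleId + pointIdOffset) * 2 * Math.PI) / numTriangles;
  const radius = pointId < 1.5 ? 1 : 0; // same as 1. - step(1.5, pointId)
  return [Math.sin(angle) * radius, Math.cos(angle) * radius];
}

// First triangle: rim, next rim position, center.
console.log(circleVertex(0), circleVertex(1), circleVertex(2));
```

Each triangle uses two neighbouring rim points plus the center, and the second rim point of one triangle is the first rim point of the next, so the triangles tile the whole disc.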
There is an entire website based on this idea. The site is built around the puzzle of making pretty pictures given only an id for each vertex. It's the vertex shader equivalent of shadertoy.com. On Shadertoy the puzzle is basically: given only gl_FragCoord as input to a fragment shader, write a function that draws something interesting.
Both sites are toys/puzzles. Doing things this way is not recommended for solving real problems like drawing a 3D world in a game, doing image processing, rendering the contents of a browser window, etc. They are cute puzzles about drawing something interesting given only minimal inputs.
Why is this technique not advised? The most obvious reason is that it's hard coded and inflexible, whereas the standard techniques are super flexible. For example, above, drawing a fullscreen quad required one shader and drawing a circle required a different shader, whereas standard vertex-buffer-based attributes multiplied by matrices can be used for any shape provided, 2D or 3D. And not just any shape: with a single matrix multiply in the shader those shapes can be translated, rotated, scaled, and projected into 3D, their rotation centers and scale centers can be set independently, etc.
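As a rough illustration of that flexibility, here is a minimal CPU-side sketch of what a matrix multiply in the vertex shader does to one shared set of vertices. The names unitQuad, scaleThenTranslate and transformPoints are made up for this example, and the 3x3 column-major layout is simply the one gl.uniformMatrix3fv expects; none of this is code from the answer itself.

```javascript
// One unit quad (two triangles), reused for every rectangle that gets drawn.
const unitQuad = [0, 0,  1, 0,  0, 1,  0, 1,  1, 0,  1, 1];

// 3x3 matrix for 2D: scale by (sx, sy), then translate by (tx, ty).
// Stored column-major, the layout gl.uniformMatrix3fv expects.
function scaleThenTranslate(sx, sy, tx, ty) {
  return [sx, 0, 0,  0, sy, 0,  tx, ty, 1];
}

// What a vertex shader computes per vertex for matrix * vec3(position, 1).
function transformPoints(m, points) {
  const out = [];
  for (let i = 0; i < points.length; i += 2) {
    const x = points[i];
    const y = points[i + 1];
    out.push(m[0] * x + m[3] * y + m[6], m[1] * x + m[4] * y + m[7]);
  }
  return out;
}

// The same six vertices become a clip-space-filling quad...
console.log(transformPoints(scaleThenTranslate(2, 2, -1, -1), unitQuad));
// ...or a small rectangle somewhere else, with no change to the vertex data.
console.log(transformPoints(scaleThenTranslate(0.5, 0.25, 0.1, 0.2), unitQuad));
```

Only the matrix changes between the two calls, which is the point: the shader and the buffer stay the same for every shape placement.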
Note: you are free to do whatever you want. If you like these techniques then by all means use them. The reason I'm trying to steer you away from them is that, based on your previous questions, you're new to WebGL, and I feel like you'll end up making WebGL much harder for yourself if you use obscure and hard coded techniques like these instead of the traditional, more common, flexible techniques that experienced devs use to get real work done. But again, it's up to you; do whatever you want.
If I have rendered data into a R32F texture (of 2^18 (~250,000) texels) and I want to compute the sum of these values, is it possible to do this by asking the gpu to generate a mipmap?
(the idea being that the smallest mipmap level would have a single texel that contains the average of all the original texels)
What mipmap settings (clamp, etc) would I use to generate the correct average?
I'm not so good with webgl gymnastics, and would appreciate a snippet of how one would render into a R32F texture the numbers from 1 to 2^18 and then produce a sum over that texture.
For this number of texels, would this approach be faster than trying to transfer the texels back to the cpu and performing the sum in javascript?
There are no settings that define the algorithm used to generate mipmaps. Clamp settings and filter settings have no effect. There's only a hint you can set with gl.hint on whether to prefer quality over performance, but a driver has no obligation to even pay attention to that hint. Further, every driver is different. The results of generating mipmaps are one of the differences used to fingerprint WebGL.
In any case if you don't care about the algorithm used and you just want to read the result of generating mipmaps then you just need to attach the last mip to a framebuffer and read the pixel after calling gl.generateMipmap.
You likely wouldn't render into a texture all the numbers from 1 to 2^18 but that's not hard. You'd just draw a single quad 512x512. The fragment shader could look like this
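#version 300 es
precision highp float;
out vec4 fragColor;
void main() {
  // gl_FragCoord is at pixel centers (0.5, 1.5, ...), so floor it to get integer coordinates
  float i = 1. + floor(gl_FragCoord.x) + floor(gl_FragCoord.y) * 512.0;
  fragColor = vec4(i, 0, 0, 0);
}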
Of course you could pass in that 512.0 as a uniform if you wanted to work with other sizes.
Rendering to a floating point texture is an optional feature of WebGL2. Desktops support it but as of 2018 most mobile devices do not. Similarly being able to filter a floating point texture is also an optional feature which is also usually not supported on most mobile devices as of 2018 but is on desktop.
function main() {
  const gl = document.createElement("canvas").getContext("webgl2");
  if (!gl) {
    alert("need webgl2");
    return;
  }
  {
    const ext = gl.getExtension("EXT_color_buffer_float");
    if (!ext) {
      alert("can not render to floating point textures");
      return;
    }
  }
  {
    const ext = gl.getExtension("OES_texture_float_linear");
    if (!ext) {
      alert("can not filter floating point textures");
      return;
    }
  }

  // create a framebuffer and attach an R32F 512x512 texture
  const numbersFBI = twgl.createFramebufferInfo(gl, [
    { internalFormat: gl.R32F, minMag: gl.NEAREST },
  ], 512, 512);

  const vs = `#version 300 es
  in vec4 position;
  void main() {
    gl_Position = position;
  }
  `;

  const fillFS = `#version 300 es
  precision highp float;
  out vec4 fragColor;
  void main() {
    // gl_FragCoord is at pixel centers, so floor it to get integer coordinates
    float i = 1. + floor(gl_FragCoord.x) + floor(gl_FragCoord.y) * 512.0;
    fragColor = vec4(i, 0, 0, 0);
  }
  `;

  // creates a buffer with a single quad that goes from -1 to +1 in the XY plane
  // calls gl.createBuffer, gl.bindBuffer, gl.bufferData
  const quadBufferInfo = twgl.primitives.createXYQuadBufferInfo(gl);

  const fillProgramInfo = twgl.createProgramInfo(gl, [vs, fillFS]);
  gl.useProgram(fillProgramInfo.program);

  // calls gl.bindBuffer, gl.enableVertexAttribArray, gl.vertexAttribPointer
  twgl.setBuffersAndAttributes(gl, fillProgramInfo, quadBufferInfo);

  // tell webgl to render to our 512x512 texture
  // calls gl.bindFramebuffer and gl.viewport
  twgl.bindFramebufferInfo(gl, numbersFBI);

  // draw 2 triangles (6 vertices)
  gl.drawElements(gl.TRIANGLES, 6, gl.UNSIGNED_SHORT, 0);

  // compute the last mip level
  const miplevel = Math.log2(512);

  // get the texture twgl created above
  const texture = numbersFBI.attachments[0];

  // create a framebuffer with the last mip from the texture
  const readFBI = twgl.createFramebufferInfo(gl, [
    { attachment: texture, level: miplevel },
  ]);

  gl.bindTexture(gl.TEXTURE_2D, texture);

  // try each hint to see if there is a difference
  ['DONT_CARE', 'NICEST', 'FASTEST'].forEach((hint) => {
    gl.hint(gl.GENERATE_MIPMAP_HINT, gl[hint]);
    gl.generateMipmap(gl.TEXTURE_2D);

    // read the result from the 1x1 last mip attached to readFBI
    gl.bindFramebuffer(gl.FRAMEBUFFER, readFBI.framebuffer);
    const result = new Float32Array(4);
    gl.readPixels(0, 0, 1, 1, gl.RGBA, gl.FLOAT, result);

    log('mip generation hint:', hint);
    log('average:', result[0]);
    log('average * count:', result[0] * 512 * 512);
    log(' ');
  });

  function log(...args) {
    const elem = document.createElement('pre');
    elem.textContent = [...args].join(' ');
    document.body.appendChild(elem);
  }
}
main();
pre {margin: 0}
<script src="https://twgljs.org/dist/4.x/twgl-full.min.js"></script>
Note I used twgl.js to make the code less verbose. If you don't know how to make a framebuffer and attach textures, or how to set up buffers and attributes, compile shaders, and set uniforms, then you're asking way too broad a question and I suggest you go read some tutorials.
Let me point out that there's no guarantee this method is faster than others. First off, it's up to the driver. It's possible the driver does this in software (though unlikely).
One obvious speed up is to use RGBA32F and let the code do 4 values at a time, then read all 4 channels (R, G, B, A) at the end and sum those.
Also, since you only care about the last 1x1 pixel mip, you're asking the code to render a lot more pixels than a more direct method would. Really you only need to render 1 pixel, the result. But for this example of 2^18 values, which is a 512x512 texture, that means a 256x256, a 128x128, a 64x64, a 32x32, a 16x16, an 8x8, a 4x4, and a 2x2 mip are all allocated and computed, which is arguably wasted time. In fact the spec says all mips are generated from the first mip. Of course a driver is free to take shortcuts and most likely generates mip N from mip N-1, as the result will be similar, but that's not how the spec is defined. But even generating each mip from the previous one means 87380 values computed that you didn't care about.
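To make those numbers concrete, here is a CPU-side sketch of a chain of 2x2 box filters, which is one plausible way a driver might build mips (nothing guarantees it). The function name boxFilterChain is made up for this example. It uses 64-bit floats, so it shows the mathematically exact average; a GPU working in 32-bit floats may round differently.

```javascript
// Hypothetical reference: average via repeated 2x2 box filtering.
// `data` is a size x size grid in row-major order; size must be a power of 2.
function boxFilterChain(data, size) {
  let level = data;
  let discarded = 0; // texels computed in levels other than the final 1x1
  while (size > 1) {
    const half = size / 2;
    const next = new Float64Array(half * half);
    for (let y = 0; y < half; y++) {
      for (let x = 0; x < half; x++) {
        const i = 2 * y * size + 2 * x; // top-left texel of the 2x2 block
        next[y * half + x] =
          (level[i] + level[i + 1] + level[i + size] + level[i + size + 1]) / 4;
      }
    }
    if (half > 1) discarded += half * half;
    level = next;
    size = half;
  }
  return { average: level[0], discarded };
}

const size = 512;
const numbers = Float64Array.from({ length: size * size }, (_, i) => i + 1);
const { average, discarded } = boxFilterChain(numbers, size);
console.log('average:', average);
console.log('average * count:', average * size * size);
console.log('intermediate texels:', discarded);
```

For the numbers 1 to 2^18 the exact average is 131072.5, and the intermediate levels add up to the 87380 texels mentioned above.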
I'm only guessing it would be faster to generate in larger chunks than 2x2. At the same time, there are texture caches, and if I understand correctly they usually cache a rectangular part of a texture so that reading 4 values from a mip is fast. A texture cache miss can really kill your performance, so if your chunks are too large it's possible you'd have lots of cache misses. You'd basically have to test, and each GPU would likely show different performance characteristics.
Yet another speed up would be to consider using multiple draw buffers; then you can write 16 to 32 values per fragment shader iteration instead of just 4.
I'm making a drawing application using Swift (based on GLPaint) and OpenGL. Now I would like to improve the curve so that it varies with stroke speed (e.g. thicker when drawing fast).
However, since my knowledge of OpenGL is quite limited, I need some guidance. What I want to do is vary the size of my texture/point for each CGPoint I calculate and add to the screen. Is it possible?
func addQuadBezier(var from:CGPoint, var ctrl:CGPoint, var to:CGPoint, startTime:CGFloat, endTime:CGFloat) {
    scalePoints(from: from, ctrl: ctrl, to: to)
    let pointCount = calculatePointsNeeded(from: from, to: to, min: 16.0, max: 256.0)
    var vertexBuffer: [GLfloat] = [GLfloat](count: Int(pointCount), repeatedValue:0.0)
    var t : CGFloat = startTime + 0.0002
    for i in 0..<Int(pointCount) {
        let p = calculatePoint(from:from, ctrl: ctrl, to: to)
        vertexBuffer.insert(p.x.f, atIndex: i*2)
        vertexBuffer.insert(p.y.f, atIndex: i*2+1)
        t += (CGFloat(1)/CGFloat(pointCount))
    }
    glBufferData(GL_ARRAY_BUFFER.ui, Int(pointCount)*2*sizeof(GLfloat), vertexBuffer, GL_STATIC_DRAW.ui)
    glDrawArrays(GL_POINTS.ui, 0, Int(pointCount).i)
}
func render()
where render() is called every 1/60 s.
attribute vec4 inVertex;
uniform mat4 MVP;
uniform float pointSize;
uniform lowp vec4 vertexColor;
varying lowp vec4 color;
void main()
{
    gl_Position = MVP * inVertex;
    gl_PointSize = pointSize;
    color = vertexColor;
}
Thanks in advance!
In your vertex shader, set gl_PointSize to the width you want. That measurement is in framebuffer pixels, so if the size of your framebuffer changes with the device's scale factor, you'll need to adjust your point size accordingly.
If you find a way to control the line width in the vertex shader, it would most likely be the best solution. Not only could the lines have different widths, but even a single line could have an increasing width (interpolated) between the points. I am not sure you will be able to achieve this on your platform, though.
So if you do find a way you would add the point size to your buffer and use it with a new attribute in the vertex shader.
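A minimal sketch of what "adding the point size to your buffer" could look like, written in JavaScript; the interleaved layout carries over directly to Swift and glVertexAttribPointer. Everything named here is hypothetical: sizeForSpeed, buildPointBuffer, the input shape {x, y, time} and the speed-to-size mapping are assumptions for illustration, not part of GLPaint.

```javascript
// Hypothetical mapping from stroke speed (pixels per second) to point size.
function sizeForSpeed(speed, minSize = 2, maxSize = 24, maxSpeed = 2000) {
  const t = Math.min(Math.max(speed / maxSpeed, 0), 1);
  return minSize + (maxSize - minSize) * t;
}

// Interleave [x, y, size] per point. `points` is [{x, y, time}], time in seconds.
function buildPointBuffer(points) {
  const out = new Float32Array(points.length * 3);
  for (let i = 0; i < points.length; i++) {
    const p = points[i];
    const prev = points[Math.max(i - 1, 0)];
    const dt = p.time - prev.time;
    const speed = dt > 0 ? Math.hypot(p.x - prev.x, p.y - prev.y) / dt : 0;
    out.set([p.x, p.y, sizeForSpeed(speed)], i * 3);
  }
  return out;
}

const stroke = [
  { x: 0, y: 0, time: 0.0 },
  { x: 100, y: 0, time: 0.1 },
  { x: 100, y: 0, time: 0.2 },
];
console.log(Array.from(buildPointBuffer(stroke)));
```

With this layout the stride is 3 floats (12 bytes): the position attribute reads 2 floats at offset 0, and a new per-vertex size attribute reads 1 float at offset 8 and is assigned to gl_PointSize in place of the uniform.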
If not, you will need to draw the line with triangles, which is generally better practice anyway. To define the vertices between points A and B, take the direction W = (B-A).normalized() and the normal N = (W.y, -W.x). With k = lineWidth/2.0, the 4 positions are t1 = A + N*k, t2 = A - N*k, t3 = B + N*k, t4 = B - N*k. Add these to your buffer and draw them as a triangle strip or as triangles, depending on what you are looking for.
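The construction above can be sketched as follows, again in JavaScript for brevity. The function name segmentQuad and the array-based point representation are assumptions for this example.

```javascript
// Returns the 4 corners [t1, t2, t3, t4] of a quad around segment A-B,
// in an order that can be drawn directly as a triangle strip.
function segmentQuad(a, b, lineWidth) {
  const dx = b[0] - a[0];
  const dy = b[1] - a[1];
  const len = Math.hypot(dx, dy);
  if (len === 0) return null; // degenerate segment, no direction to offset from
  const w = [dx / len, dy / len]; // W = (B - A).normalized()
  const n = [w[1], -w[0]];        // N = (W.y, -W.x)
  const k = lineWidth / 2;
  return [
    [a[0] + n[0] * k, a[1] + n[1] * k], // t1 = A + N*k
    [a[0] - n[0] * k, a[1] - n[1] * k], // t2 = A - N*k
    [b[0] + n[0] * k, b[1] + n[1] * k], // t3 = B + N*k
    [b[0] - n[0] * k, b[1] - n[1] * k], // t4 = B - N*k
  ];
}

console.log(segmentQuad([0, 0], [10, 0], 2));
```

Using a different lineWidth at A and at B (two values of k) gives a segment whose thickness changes along its length, which is one way to get the speed-dependent stroke from the question.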
I have an OpenGL view that displays a set of 3D points with some basic shaders:
// Fragment Shader
static const char* PointFS = STRINGIFY
(
    void main(void)
    {
        gl_FragColor = vec4(0.8, 0.8, 0.8, 1.0);
    }
);
// Vertex Shader
static const char* PointVS = STRINGIFY
(
    uniform mediump mat4 uProjectionMatrix;
    attribute mediump vec4 position;
    void main( void )
    {
        gl_Position = uProjectionMatrix * position;
        gl_PointSize = 3.0;
    }
);
And the MVP matrix is calculated as:
- (void)setMatrices
{
    // ModelView Matrix
    GLKMatrix4 modelViewMatrix = GLKMatrix4Identity;
    modelViewMatrix = GLKMatrix4Scale(modelViewMatrix, 2, 2, 2);
    // Projection Matrix
    const GLfloat aspectRatio = (GLfloat)(self.view.bounds.size.width) / (GLfloat)(self.view.bounds.size.height);
    const GLfloat fieldView = GLKMathDegreesToRadians(90.0f);
    const GLKMatrix4 projectionMatrix = GLKMatrix4MakePerspective(fieldView, aspectRatio, 0.1f, 10.0f);
    glUniformMatrix4fv(self.pointShader.uProjectionMatrix, 1, 0, GLKMatrix4Multiply(projectionMatrix, modelViewMatrix).m);
}
This works fine, but I have a set of 500 points and I see only a few.
How do I scale/translate the MVP matrix to display all of them (they are a dynamic set)? Ideally the "centroid" should be at the origin, and all of the points visible. It should be able to adapt to rotations of the view (gestures are the next step I want to implement).
Seeing how you present this you might need quite a lot... I guess best approach might be using "look at", the point you are looking at is (0,0,0) as you stated, camera position should probably be (0,0,Z) and up (0,1,0). So the only issue here is the Z component of camera position.
If you start Z at, for instance, -.1 and then iterate through all the points, a point p is visible when tan(fieldView*.5f) * (p.z-Z) >= p.y. So for each point you can compute Z1 = p.z-(p.y/tan(fieldView*.5f)), and if Z1<Z then Z=Z1. This check covers only positive Y; you also need the same check for negative Y and for +-X. The equations are very similar, though when checking X you also need to take the screen ratio into account.
This procedure should give you the tightest view that shows all the points (within the given limitations, such as looking towards (0,0,0)), but it is far from the simplest. You also need to consider whether the equation still works if p.z<-Z.
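A minimal sketch of that iteration, in JavaScript for brevity. It uses the tangent of the half angle, since the visible half-height at distance d in front of the camera is d * tan(fieldView / 2). The function name fitCameraZ is made up; the sketch assumes the camera sits at (0, 0, Z) with Z negative and looks towards +Z, as in the description above, and that aspect is width / height.

```javascript
// Hypothetical helper: largest (closest to zero) camera Z such that every
// point is inside the view frustum, starting from startZ.
function fitCameraZ(points, fovY, aspect, startZ = -0.1) {
  const tanY = Math.tan(fovY / 2); // vertical half-angle
  const tanX = tanY * aspect;      // horizontal half-angle via screen ratio
  let z = startZ;
  for (const [px, py, pz] of points) {
    // Math.abs covers both the positive and the negative side in one check.
    z = Math.min(z, pz - Math.abs(py) / tanY, pz - Math.abs(px) / tanX);
  }
  return z;
}

const cloud = [[0, 1, 0], [4, 0, 1], [-0.5, -0.5, 0.5]];
console.log(fitCameraZ(cloud, Math.PI / 2, 2));
```

The result can then be used as the eye position (0, 0, Z) in a "look at" matrix. Points that end up very close to or behind the camera would need extra handling (and a suitable zNear), which this sketch does not attempt.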
Another, somewhat easier approach is to generate the smallest cube around the centre that holds all the points: iterate through the points and find the coordinate with the largest absolute value (any of X, Y or Z). Then use it with frustum instead of perspective, so that all the rect parameters (top, bottom, left and right) are set to +-largest. Next you need the distance Z to the near plane, which is largest/tan(fieldView*.5), so for a 90 degree field it is simply Z = largest. Z is the zNear for the frustum, and you then also translate the matrix by -(Z+largest). Again, one pair of the frustum coordinates must be multiplied by the screen ratio.
In any case do watch out what your zFar is, having it only 10.0f might be a bit too short in your case. Until you need the depth buffer you should not worry about that value being too large.
Code flow is as follows:
Is there any way for me to get that textured quad rendering code to show behind the scene in spite of the fact that the scene renders first? Assume that I cannot change that the rendering of the background textured quad will happen directly before I present the render buffer.
Rephrased: I can't change the rendering order. Essentially what I want is that every pixel that would've been colored only by glClearColor to instead be colored by this textured quad.
The easiest solution is to define the quad in normalized device coordinates directly and set the z-value to 1. You then don't need to project the quad and it will be screen-filling and behind anything else - except stuff that's also at z=1 after projection and perspective divide.
That's pretty much the standard procedure for screen-aligned quads, except there is usually no need to put the quad at z=1, not that it would matter. Usually, full-screen quads are simply used to process at least one fragment per pixel, normally with a 1:1 mapping of fragments and pixels. Deferred shading, post-processing effects and image processing in general are the usual suspects. Since in most cases you render only the quad (and nothing else), the depth value is irrelevant as long as it's inside the unit cube and the fragment isn't dropped by the depth test, as happens for instance when you put the quad at z=1 and your depth function is LESS.
EDIT: I made a little mistake. NDCs are defined in a left-handed coordinate system, meaning that the near plane is mapped to -1 and the far plane is mapped to 1. So, you need to define your quad in NDCs with a z value of 1 and set the DepthFunc to LEQUAL. Alternatively, you can leave the depth function untouched and simply subtract a very small value from 1.f:
float maxZ = 1.f - std::numeric_limits<float>::epsilon();
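Whether "one epsilon below 1" survives a LESS test depends on the precision of the depth buffer. The following JavaScript sketch models that under a loudly stated assumption: a fixed-point depth buffer that stores round(depth * (2^bits - 1)) and a default depth range that maps NDC z in [-1, 1] to [0, 1]. The function names are made up, and real drivers may round differently, so treat the result as an indication only.

```javascript
// Assumed model of a fixed-point depth buffer with the given bit depth.
function quantizeDepth(windowDepth, bits) {
  return Math.round(windowDepth * (2 ** bits - 1));
}

// Compares the quad's stored depth with the cleared depth (1.0).
function quadDepthVsClear(bits) {
  // float epsilon is 2^-23; Math.fround emulates 32-bit float storage.
  const maxZ = Math.fround(1 - 2 ** -23);
  // Default depth range: window depth = (ndcZ + 1) / 2.
  const windowDepth = Math.fround((maxZ + 1) / 2);
  const quad = quantizeDepth(windowDepth, bits);
  const cleared = quantizeDepth(1, bits);
  return { less: quad < cleared, lequal: quad <= cleared };
}

for (const bits of [16, 24]) {
  console.log(bits, quadDepthVsClear(bits));
}
```

Under this model the epsilon variant passes LESS on a 24-bit depth buffer but collapses onto the cleared value on a 16-bit one, where switching the depth function to LEQUAL is the safer of the two options.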
EDIT2: Let's assume you want to render a screen-aligned quad which is drawn behind everything else and with appropriate texture coordinates. Please note: I'm on a desktop here, so I'm writing core GL code which doesn't map to GLES 2.0 directly. However, there is nothing in my example you can't do with GLES and GLSL ES 2.0.
You may define the vertex attribs of the quad like this (without messing with the depth func):
GLfloat maxZ = 1.f - std::numeric_limits<GLfloat>::epsilon ();
// interleaved positions and tex coords
GLfloat quad[] = {-1.f, -1.f, maxZ, 1.f, // v0
0.f, 0.f, 0.f, 0.f, // t0
1.f, -1.f, maxZ, 1.f, // ...
1.f, 0.f, 0.f, 0.f,
1.f, 1.f, maxZ, 1.f,
1.f, 1.f, 0.f, 0.f,
-1.f, 1.f, maxZ, 1.f,
0.f, 1.f, 0.f, 0.f};
GLubyte indices[] = {0, 1, 2, 0, 2, 3};
The VAO and buffers are setup accordingly:
// generate and bind a VAO
gl::GenVertexArrays (1, &vao);
gl::BindVertexArray (vao);
// setup our VBO
gl::GenBuffers (1, &vbo);
gl::BindBuffer (gl::ARRAY_BUFFER, vbo);
gl::BufferData (gl::ARRAY_BUFFER, sizeof(quad), quad, gl::STATIC_DRAW);
// setup our index buffer
gl::GenBuffers (1, &ibo);
gl::BindBuffer (gl::ELEMENT_ARRAY_BUFFER, ibo);
gl::BufferData (gl::ELEMENT_ARRAY_BUFFER, sizeof(indices), indices, gl::STATIC_DRAW);
// setup our vertex arrays
gl::VertexAttribPointer (0, 4, gl::FLOAT, gl::FALSE_, 8 * sizeof(GLfloat), 0);
gl::VertexAttribPointer (1, 4, gl::FLOAT, gl::FALSE_, 8 * sizeof(GLfloat), (GLvoid*)(4 * sizeof(GLfloat)));
gl::EnableVertexAttribArray (0);
gl::EnableVertexAttribArray (1);
The shader code comes down to a very, very simple pass-through vertex shader and, for simplicity, a fragment shader which in my example simply outputs the interpolated tex coords:
// Vertex Shader
#version 430 core
layout (location = 0) in vec4 Position;
layout (location = 1) in vec4 TexCoord;
out vec2 vTexCoord;
void main()
{
    vTexCoord = TexCoord.xy;
    // you don't need to project, you're already in NDCs!
    gl_Position = Position;
}
// Fragment Shader
#version 430 core
in vec2 vTexCoord;
out vec4 FragColor;
void main()
{
    FragColor = vec4(vTexCoord, 0.0, 1.0);
}
As you can see, the values written to gl_Position are simply the vertex positions passed to the shader invocation. No projection takes place because the result of projection and perspective divide is nothing else than normalized device coordinates. Since we already are in NDCs, we don't need projection and perspective divide and so simply pass through the positions unaltered.
The final depth is very close to the maximum of the depth range and so the quad will appear to be behind anything else in your scene.
You can use the texcoords as usual.
I hope you get the idea. Except for the explicit attrib locations, which aren't supported by GLES 2.0 (replace them with BindAttribLocation() calls instead), you shouldn't have to change anything.
There is a way, but you have to put the quad behind the scene. If your quad is constructed correctly you can
enable DEPTH_TEST by using
and then by using
before rendering your background.
Your quad will be rendered behind the scene. But as I said, this only works, when your geometry is literally located behind the scene.
Is it possible to access the surface normal - the normal associated with the plane of a fragment - from within a fragment shader? Or perhaps this can be done in the vertex shader?
Is all knowledge of the associated geometry lost when we go down the shader pipeline or is there some clever way of recovering that information in either the vertex of fragment shader?
Thanks in advance.
twitter: #dugla
The surface normal vector can be calculated approximately from the partial derivatives of the view space position in the fragment shader. The partial derivatives can be obtained with the functions dFdx and dFdy. This requires OpenGL ES 3.0 or the OES_standard_derivatives extension:
in vec3 view_position;
void main()
{
    vec3 normalvector = cross(dFdx(view_position), dFdy(view_position));
    // flip the normal if necessary so that it faces the viewer
    vec3 nv = normalize(normalvector * sign(normalvector.z));
    // ...
}
In general, it is possible to calculate the normal vector of a surface in a geometry shader (since OpenGL ES 3.2). For example, if you draw triangles you get all three points in the geometry shader. Three points define a plane, from which the normal vector can be calculated. You just have to be careful whether the points are arranged clockwise or counterclockwise.
The normal vector of a triangle is the normalized cross product of 2 vectors defined by the corner points of the triangle. See the following example, which is for counterclockwise triangles:
Vertex shader
#version 400
layout (location = 0) in vec3 inPos;
out vec3 vertPos;
uniform mat4 u_projectionMat44;
uniform mat4 u_modelViewMat44;
void main()
{
    vec4 viewPos = u_modelViewMat44 * vec4( inPos, 1.0 );
    vertPos = viewPos.xyz;
    gl_Position = u_projectionMat44 * viewPos;
}
Geometry shader
#version 400
layout( triangles ) in;
layout( triangle_strip, max_vertices = 3 ) out;
in vec3 vertPos[];
out vec3 geoPos;
out vec3 geoNV;
void main()
{
    vec3 leg1 = vertPos[1] - vertPos[0];
    vec3 leg2 = vertPos[2] - vertPos[0];
    // one face normal, shared by all three emitted vertices
    geoNV = normalize( cross( leg1, leg2 ) );

    geoPos = vertPos[0];
    gl_Position = gl_in[0].gl_Position;
    EmitVertex();

    geoPos = vertPos[1];
    gl_Position = gl_in[1].gl_Position;
    EmitVertex();

    geoPos = vertPos[2];
    gl_Position = gl_in[2].gl_Position;
    EmitVertex();

    EndPrimitive();
}
Fragment shader
#version 400
in vec3 geoPos;
in vec3 geoNV;
void main()
{
    // ...
}
Of course you can also calculate the normal vector in the tessellation shaders (since OpenGL ES 3.2). But this makes sense only if you already need tessellation shaders for other reasons and calculate the normal vector of the face in addition:
Vertex shader
The vertex shader is the same as above.
Tessellation control shader
#version 400
layout( vertices=3 ) out;
in vec3 vertPos[];
out vec3 tctrlPos[];
void main()
{
    tctrlPos[gl_InvocationID] = vertPos[gl_InvocationID];
    if ( gl_InvocationID == 0 )
    {
        gl_TessLevelOuter[0] = ;
        gl_TessLevelOuter[1] = ;
        gl_TessLevelOuter[2] = ;
        gl_TessLevelInner[0] = ;
    }
}
Tessellation evaluation shader
#version 400
layout(triangles, ccw) in;
in vec3 tctrlPos[];
out vec3 tevalPos;
out vec3 tevalNV;
void main()
{
    vec3 leg1 = tctrlPos[1] - tctrlPos[0];
    vec3 leg2 = tctrlPos[2] - tctrlPos[0];
    tevalNV = normalize( cross( leg1, leg2 ) );
    tevalPos = tctrlPos[0] * gl_TessCoord.x + tctrlPos[1] * gl_TessCoord.y + tctrlPos[2] * gl_TessCoord.z;
}
Fragment shader
#version 400
in vec3 tevalPos;
in vec3 tevalNV;
void main()
{
    // ...
}
You can get per-pixel normals interpolated from the vertex normals by just using a "varying" (in newer OpenGL it is just in/out) variable. But do not forget to normalize this normal! Interpolated normals do not necessarily have a length of 1 any longer. These normals also give bad results on sharp edges.
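A small numeric sketch of why the renormalization is needed, using plain JavaScript arrays in place of shader vectors. The helper names lerp3, length3 and normalize3 are made up for this example; lerp3 mimics what the rasterizer does to a varying between two vertices.

```javascript
function lerp3(a, b, t) {
  return a.map((v, i) => v + (b[i] - v) * t);
}
function length3(v) {
  return Math.hypot(v[0], v[1], v[2]);
}
function normalize3(v) {
  const len = length3(v);
  return v.map((c) => c / len);
}

// Two unit-length vertex normals, 90 degrees apart.
const n0 = [1, 0, 0];
const n1 = [0, 1, 0];

// Halfway between the two vertices the interpolated normal is too short...
const halfway = lerp3(n0, n1, 0.5);
console.log(length3(halfway)); // about 0.707

// ...so the fragment shader has to normalize it before lighting.
console.log(length3(normalize3(halfway)));
```

The further apart the vertex normals are, the shorter the interpolated vector gets, which is why lighting looks too dark in the middle of a triangle when the normalize is forgotten.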
If you want to use custom normals with a higher resolution, a commonly used technique is normal mapping. You create a texture with baked normals for your object. Then you can access the normal in the fragment shader using a texture lookup.
If you pass the vertex normal through to the fragment shader in a "varying" then you will get an interpolated fragment normal.
EDIT: You will have to calculate the normals in your application, and pass them into your shader as an attribute for each vertex of your triangle.
The usual way to calculate the normal for a triangle is with a cross product.
Call the three points making up the triangle P1, P2, and P3.
Calculate V1, the vector from P1 to P2.
Calculate V2, the vector from P1 to P3.
Calculate the cross product of V1 and V2.
This will give you the normal to the plane of the triangle. V2 should be "to the left of" V1, or your normal will point "in" instead of "out". See the Wikipedia article on cross products for details.
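The steps above can be sketched as follows, in JavaScript with points as [x, y, z] arrays. The function name triangleNormal is made up for this example; the result is additionally normalized to unit length, which is what a shader usually expects.

```javascript
// Unit normal of the triangle P1, P2, P3 (counterclockwise order gives a
// normal that points towards the viewer in a right-handed system).
function triangleNormal(p1, p2, p3) {
  const v1 = [p2[0] - p1[0], p2[1] - p1[1], p2[2] - p1[2]]; // P1 -> P2
  const v2 = [p3[0] - p1[0], p3[1] - p1[1], p3[2] - p1[2]]; // P1 -> P3
  const n = [
    v1[1] * v2[2] - v1[2] * v2[1],
    v1[2] * v2[0] - v1[0] * v2[2],
    v1[0] * v2[1] - v1[1] * v2[0],
  ]; // cross(V1, V2)
  const len = Math.hypot(n[0], n[1], n[2]);
  if (len === 0) return null; // degenerate triangle has no normal
  return n.map((c) => c / len);
}

console.log(triangleNormal([0, 0, 0], [1, 0, 0], [0, 1, 0]));
console.log(triangleNormal([0, 0, 0], [0, 1, 0], [1, 0, 0]));
```

Swapping two of the points flips the sign of the result, which is the "in" versus "out" issue mentioned above. For flat shading, the same normal is then stored with each of the triangle's three (unshared) vertices.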
FURTHER EDIT: Right, I understand your problem now. Yes, it's true that with shared vertices you can't really have more than one normal per vertex.
The only other thing that I can think of is that maybe a geometry shader could help, because it gets passed all three vertices for a triangle. I don't have any experience with them though.