I'm reading a z-stack of 16-bit images into JavaScript (i.e. an array of images). I'm new to WebGL shaders and I'm having trouble working out how to loop over the array.
I want to put the whole variable-length list of images, up to hundreds, onto the GPU. Then, with each change of the z index, the GPU will render the corresponding image. Currently, I have the following:
<script id="shader-vertex" type="x-shader/x-vertex">
attribute float a_fluxes;
varying float v_flux;
attribute vec2 a_position;
uniform vec2 u_resolution;
void main() {
vec2 zeroToOne = a_position / u_resolution;
gl_Position = vec4((zeroToOne * 2.0 - 1.0) * vec2(1, -1), 0, 1);
v_flux = a_fluxes/256.;
}
</script>
<script id="shader-fragment" type="x-shader/x-fragment">
precision highp float;
varying float v_flux;
void main() {
gl_FragColor = vec4(v_flux, v_flux, v_flux, 1);
}
</script>
The 3-dimensional array of images (x, y, and z) is supposed to be flattened into a 1D list that goes into a_fluxes.
How do I iterate over the x and y dimensions of one of the z images? Do I use a loop to iterate in the vertex shader, or am I required to pass in an array of all the possible x,y coordinates of the pixels to the vertex shader? Should I really be doing these calculations in the fragment shader?
I think you would be well served to pack the images together into an atlas (a 2D grid of sub-images stored as one large image) if you plan to jump between them on the GPU. WebGL doesn't support volume textures; if it did, I'd recommend using one slice per image. Instead, you'll probably be best off slicing windows out of the UV space of the larger composite image, where each window corresponds to one of your sub-images.
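If it helps, here is a minimal GLSL ES sketch of sampling one slice out of such an atlas in the fragment shader. It assumes the slices are packed row-major into a grid; the uniform names (u_atlas, u_sliceIndex, u_slicesPerRow, u_slicesPerCol) are just illustrative:
precision highp float;
uniform sampler2D u_atlas;      // the composite image holding all z-slices
uniform float u_slicesPerRow;   // number of slices packed per atlas row
uniform float u_slicesPerCol;   // number of rows of slices in the atlas
uniform float u_sliceIndex;     // which z-slice to display (0, 1, 2, ...)
varying vec2 v_texCoord;        // 0..1 within the displayed image
void main() {
// locate the window of the requested slice inside the atlas
float col = mod(u_sliceIndex, u_slicesPerRow);
float row = floor(u_sliceIndex / u_slicesPerRow);
vec2 sliceSize = vec2(1.0 / u_slicesPerRow, 1.0 / u_slicesPerCol);
vec2 uv = (vec2(col, row) + v_texCoord) * sliceSize;
gl_FragColor = texture2D(u_atlas, uv);
}
Switching the z-slice then only means updating u_sliceIndex, with no re-upload of image data.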
Related
I need to access a buffer from my shader. The buffer is created from an array. (In the real scenario, the array has 10k+ (variable) numbers.)
var myBuffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, myBuffer);
gl.bufferData(gl.ARRAY_BUFFER, new Uint8Array([1,2,3,4,5,6,7]), gl.STATIC_DRAW);
How do I send it so it's usable by the shader?
precision mediump float;
uniform uint[] myBuffer;//???
void main() {
gl_FragColor = vec4(myBuffer[0],myBuffer[1],0,1);
}
Normally, if it were an attribute, it'd be
gl.vertexAttribPointer(myBuffer, 2, gl.UNSIGNED_BYTE, false, 4, 0);
but I need to be able to access the whole array from any shader pixel, so it's not a vertex attribute.
Use a texture if you want random access to lots of data in a shader.
If you have 10000 values you might make a texture that's 100x100 pixels. You can then get each value from the texture with something like
uniform sampler2D u_texture;
vec2 textureSize = vec2(100.0, 100.0);
vec4 getValueFromTexture(float index) {
float column = mod(index, textureSize.x);
float row = floor(index / textureSize.x);
vec2 uv = vec2(
(column + 0.5) / textureSize.x,
(row + 0.5) / textureSize.y);
return texture2D(u_texture, uv);
}
Make sure your texture filtering is set to gl.NEAREST.
Of course if you make textureSize a uniform you could pass in the size of the texture.
As for why the + 0.5 part, see this answer.
You can use normal gl.RGBA, gl.UNSIGNED_BYTE textures and add/multiply the channels together to get a large range of values. Or you could use floating point textures if you don't want to mess with that; you need to enable them first (in WebGL 1 that's the OES_texture_float extension).
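For completeness, here's a hedged GLSL sketch that combines those suggestions: the texture size passed as a uniform, and the four byte channels recombined into one larger value. It assumes the value was packed low-byte-first into a gl.RGBA / gl.UNSIGNED_BYTE texture on the JavaScript side; the names u_texture, u_textureSize, and unpackValue are illustrative:
precision highp float;
uniform sampler2D u_texture;
uniform vec2 u_textureSize;   // e.g. (100.0, 100.0), set from JavaScript
vec4 getValueFromTexture(float index) {
float column = mod(index, u_textureSize.x);
float row = floor(index / u_textureSize.x);
vec2 uv = vec2(
(column + 0.5) / u_textureSize.x,
(row + 0.5) / u_textureSize.y);
return texture2D(u_texture, uv);
}
// texture2D returns channels normalized to 0..1, so scale back by 255 and weight them.
// Beyond three channels a 32-bit float starts losing integer precision, so very large
// packed values come back approximate.
float unpackValue(vec4 texel) {
return dot(texel * 255.0, vec4(1.0, 256.0, 65536.0, 16777216.0));
}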
I'm trying to perform convolution on image with a 16X16 generated kernel. I used opencv filterengine class but it's only operating on the CPU and i'm trying to accelerate the app.
I know opencv also has filterengine_gpu but for my understanding it's not IOS supported.
GPUimage let you perform convolution with 3X3 generated filter. Is there any other way to accelerate the convolution? Different libary that operates on the GPU?
You can use Apple's Accelerate framework for this. It's available on both iOS and macOS, by the way, so you may be able to reuse your code later.
In order to achieve best performance, you may need to consider the following options:
if your convolution kernel is separable, use a separable implementation. This is the case for Gaussian kernels, for example. This will save you roughly an order of magnitude in computation time;
if your images have power-of-two sizes, consider using the FFT trick. Convolution in the spatial domain costs O(N^2 K^2) for an N x N image and K x K kernel, whereas in the Fourier domain it reduces to an element-wise multiplication plus FFTs costing O(N^2 log N). So you can 1) FFT your image and kernel, 2) multiply the results term by term, and 3) inverse-FFT the result. Since FFT implementations are fast (e.g., Apple's FFT in the Accelerate framework), this series of operations can result in a real performance boost.
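For reference, the trick is just the convolution theorem: conv(image, kernel) = IFFT( FFT(image) x FFT(kernel) ), where the product is taken element-wise in the frequency domain and the kernel is zero-padded to the image size beforehand (which makes the convolution circular, so mind the borders).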
You can find more insight into iOS image-processing optimization in this book, which I also reviewed here.
You can do a 16x16 convolution using GPUImage, but you'll need to write your own filter to do so. The 3x3 convolution in the framework samples from pixels in a 3x3 area around each pixel in the input image and applies the matrix of weights you feed in. The GPUImage3x3ConvolutionFilter.m source file within the framework should be reasonably easy to read, but I can provide a little context if you wish to step beyond what I have there.
The first thing I do is use the following vertex shader:
attribute vec4 position;
attribute vec4 inputTextureCoordinate;
uniform float texelWidth;
uniform float texelHeight;
varying vec2 textureCoordinate;
varying vec2 leftTextureCoordinate;
varying vec2 rightTextureCoordinate;
varying vec2 topTextureCoordinate;
varying vec2 topLeftTextureCoordinate;
varying vec2 topRightTextureCoordinate;
varying vec2 bottomTextureCoordinate;
varying vec2 bottomLeftTextureCoordinate;
varying vec2 bottomRightTextureCoordinate;
void main()
{
gl_Position = position;
vec2 widthStep = vec2(texelWidth, 0.0);
vec2 heightStep = vec2(0.0, texelHeight);
vec2 widthHeightStep = vec2(texelWidth, texelHeight);
vec2 widthNegativeHeightStep = vec2(texelWidth, -texelHeight);
textureCoordinate = inputTextureCoordinate.xy;
leftTextureCoordinate = inputTextureCoordinate.xy - widthStep;
rightTextureCoordinate = inputTextureCoordinate.xy + widthStep;
topTextureCoordinate = inputTextureCoordinate.xy - heightStep;
topLeftTextureCoordinate = inputTextureCoordinate.xy - widthHeightStep;
topRightTextureCoordinate = inputTextureCoordinate.xy + widthNegativeHeightStep;
bottomTextureCoordinate = inputTextureCoordinate.xy + heightStep;
bottomLeftTextureCoordinate = inputTextureCoordinate.xy - widthNegativeHeightStep;
bottomRightTextureCoordinate = inputTextureCoordinate.xy + widthHeightStep;
}
to calculate the positions from which to sample the pixel colors used in the convolution. Because normalized coordinates are used, the X and Y spacings between pixels are 1.0/[image width] and 1.0/[image height], respectively.
The texture coordinates for the pixels to be sampled are calculated in the vertex shader for two reasons: it's more efficient to do this calculation once per vertex (of which there are six in the two triangles that make up the rectangle of the image) than per each fragment (pixel), and to avoid dependent texture reads where possible. Dependent texture reads are where the texture coordinate to be read from is calculated in the fragment shader, not simply passed in from the vertex shader, and they are much slower on the iOS GPUs.
Once I have the texture locations calculated in the vertex shader, I pass them into the fragment shader as varyings and use the following code there:
uniform sampler2D inputImageTexture;
uniform mat3 convolutionMatrix;
varying vec2 textureCoordinate;
varying vec2 leftTextureCoordinate;
varying vec2 rightTextureCoordinate;
varying vec2 topTextureCoordinate;
varying vec2 topLeftTextureCoordinate;
varying vec2 topRightTextureCoordinate;
varying vec2 bottomTextureCoordinate;
varying vec2 bottomLeftTextureCoordinate;
varying vec2 bottomRightTextureCoordinate;
void main()
{
vec3 bottomColor = texture2D(inputImageTexture, bottomTextureCoordinate).rgb;
vec3 bottomLeftColor = texture2D(inputImageTexture, bottomLeftTextureCoordinate).rgb;
vec3 bottomRightColor = texture2D(inputImageTexture, bottomRightTextureCoordinate).rgb;
vec4 centerColor = texture2D(inputImageTexture, textureCoordinate);
vec3 leftColor = texture2D(inputImageTexture, leftTextureCoordinate).rgb;
vec3 rightColor = texture2D(inputImageTexture, rightTextureCoordinate).rgb;
vec3 topColor = texture2D(inputImageTexture, topTextureCoordinate).rgb;
vec3 topRightColor = texture2D(inputImageTexture, topRightTextureCoordinate).rgb;
vec3 topLeftColor = texture2D(inputImageTexture, topLeftTextureCoordinate).rgb;
vec3 resultColor = topLeftColor * convolutionMatrix[0][0] + topColor * convolutionMatrix[0][1] + topRightColor * convolutionMatrix[0][2];
resultColor += leftColor * convolutionMatrix[1][0] + centerColor.rgb * convolutionMatrix[1][1] + rightColor * convolutionMatrix[1][2];
resultColor += bottomLeftColor * convolutionMatrix[2][0] + bottomColor * convolutionMatrix[2][1] + bottomRightColor * convolutionMatrix[2][2];
gl_FragColor = vec4(resultColor, centerColor.a);
}
This reads each of the 9 colors and applies the weights from the 3x3 matrix that was supplied for convolution.
That said, a 16x16 convolution is a fairly expensive operation. You're looking at 256 texture reads per pixel. On older devices (iPhone 4 or so), you got around 8 texture reads per pixel for free if they were non-dependent reads. Once you went over that, performance started to drop dramatically. Later GPUs sped this up significantly, though. The iPhone 5S, for example, does well over 40 dependent texture reads per pixel pretty much for free. Even the heaviest shaders on 1080p video barely slow it down.
As sansuiso suggests, if you have a way of separating your kernel into horizontal and vertical passes (like can be done for a Gaussian blur kernel), you can get much better performance due to a dramatic reduction in texture reads. For your 16x16 kernel, you could drop from 256 reads to 32, and even those 32 would be much faster because they would be from passes that only sample 16 texels at a time.
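For a rough idea of what one of those separable passes could look like, here is a sketch (not GPUImage's actual shader; the uniform names and the fixed 16-tap horizontal pass are illustrative, and the vertical pass is identical with the offset applied to y instead of x):
precision highp float;
uniform sampler2D u_inputTexture;
uniform float u_texelWidth;     // 1.0 / image width
uniform float u_weights[16];    // one 1D factor of the separated kernel
varying vec2 v_textureCoordinate;
void main()
{
vec4 sum = vec4(0.0);
for (int i = 0; i < 16; i++) {
// 16 taps roughly centred on the current pixel
float offset = (float(i) - 7.5) * u_texelWidth;
sum += texture2D(u_inputTexture, v_textureCoordinate + vec2(offset, 0.0)) * u_weights[i];
}
gl_FragColor = sum;
}
As written these are still dependent texture reads; on older iOS GPUs you would want to hoist as many of the offsets as possible into the vertex shader as varyings, as described above.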
The crossover point for which doing an operation like this is faster in Accelerate on the CPU than in OpenGL ES will vary with the device you're running on. In general, GPUs on the iOS devices have outpaced CPUs in performance growth on each recent generation, so that bar has shifted farther to the GPU side over the last several iOS models.
NOTE: Right now I'm testing this in the simulator, but the idea is to get acceptable performance on, say, an iPhone 4S. (I know I should be testing on the device, but I won't have one for a few days.)
I was playing around with making a convolution shader that would allow convolving an image with a filter of support 3x3, 5x5 or 7x7 and the option of multiple passes. The shader itself works I guess. But I notice the following:
A simple box filter 3x3, single-pass, barely blurs the image. So to get a more noticeable blur, I have to do either 3x3 2-pass or 5x5.
The simplest case (the 3x3, 1-pass) is already slow enough that it couldn't be used at say, 30 fps.
I've tried two approaches so far (this is for some OpenGL ES 2.0-based plugins I'm writing for iPhone, hence the Objective-C methods):
- (NSString *)vertexShader
{
return SHADER_STRING
(
attribute vec4 aPosition;
attribute vec2 aTextureCoordinates0;
varying vec2 vTextureCoordinates0;
void main(void)
{
vTextureCoordinates0 = aTextureCoordinates0;
gl_Position = aPosition;
}
);
}
- (NSString *)fragmentShader
{
return SHADER_STRING
(
precision highp float;
uniform sampler2D uTextureUnit0;
uniform float uKernel[49];
uniform int uKernelSize;
uniform vec2 uTextureUnit0Offset[49];
uniform vec2 uTextureUnit0Step;
varying vec2 vTextureCoordinates0;
void main(void)
{
vec4 outputFragment = vec4(0.0); // start from zero so the first tap isn't added twice
for (int i = 0; i < uKernelSize; i++) {
outputFragment += texture2D(uTextureUnit0, vTextureCoordinates0 + uTextureUnit0Offset[i] * uTextureUnit0Step) * uKernel[i];
}
gl_FragColor = outputFragment;
}
);
}
The idea in this approach is that both the filter values and the offset coordinates used to fetch texels are precomputed once in client/app land and then set as uniforms, so the shader program always has them available whenever it is used. Mind you, the uniform arrays are sized 49 because I could potentially go up to a 7x7 kernel.
This approach takes 0.46s per pass.
Then I tried the following approach:
- (NSString *)vertexShader
{
return SHADER_STRING
(
// Default pass-thru vertex shader:
attribute vec4 aPosition;
attribute vec2 aTextureCoordinates0;
varying highp vec2 vTextureCoordinates0;
void main(void)
{
vTextureCoordinates0 = aTextureCoordinates0;
gl_Position = aPosition;
}
);
}
- (NSString *)fragmentShader
{
return SHADER_STRING
(
precision highp float;
uniform sampler2D uTextureUnit0;
uniform vec2 uTextureUnit0Step;
uniform float uKernel[49];
uniform float uKernelRadius;
varying vec2 vTextureCoordinates0;
void main(void)
{
vec4 outputFragment = vec4(0., 0., 0., 0.);
int kRadius = int(uKernelRadius);
int kSupport = 2 * kRadius + 1;
for (int t = -kRadius; t <= kRadius; t++) {
for (int s = -kRadius; s <= kRadius; s++) {
int kernelIndex = (s + kRadius) + ((t + kRadius) * kSupport);
outputFragment += texture2D(uTextureUnit0, vTextureCoordinates0 + (vec2(s,t) * uTextureUnit0Step)) * uKernel[kernelIndex];
}
}
gl_FragColor = outputFragment;
}
);
}
Here, I still pass the precomputed kernel into the fragment shader via a uniform. But I now compute the texel offsets and even the kernel indices in the shader. I'd expect this approach to be slower since I not only have 2 for loops but I'm also doing a bunch of extra computations for every single fragment.
Interestingly enough, this approach takes 0.42s. Actually faster...
At this point, the only other thing I can think of doing is breaking the convolution into 2 passes by treating the 2D kernel as two separable 1D kernels. I haven't tried it out yet.
Just for comparison, and aware that the following example is a specific implementation of box filtering that is (a) pretty much hardcoded and (b) doesn't really adhere to the theoretical definition of a classic nxn linear filter (it is not a matrix and doesn't add up to 1), I tried this approach from the OpenGL ES 2.0 Programming Guide:
- (NSString *)fragmentShader
{
return SHADER_STRING
(
// Default pass-thru fragment shader:
precision mediump float;
// Input texture:
uniform sampler2D uTextureUnit0;
// Texel step:
uniform vec2 uTextureUnit0Step;
varying vec2 vTextureCoordinates0;
void main() {
vec4 sample0;
vec4 sample1;
vec4 sample2;
vec4 sample3;
float step = uTextureUnit0Step.x;
sample0 = texture2D(uTextureUnit0, vec2(vTextureCoordinates0.x - step, vTextureCoordinates0.y - step));
sample1 = texture2D(uTextureUnit0, vec2(vTextureCoordinates0.x + step, vTextureCoordinates0.y + step));
sample2 = texture2D(uTextureUnit0, vec2(vTextureCoordinates0.x + step, vTextureCoordinates0.y - step));
sample3 = texture2D(uTextureUnit0, vec2(vTextureCoordinates0.x - step, vTextureCoordinates0.y + step));
gl_FragColor = (sample0 + sample1 + sample2 + sample3) / 4.0;
}
);
}
This approach takes 0.06s per pass.
Mind you, the above is my adaptation where I made the step pretty much the same texel offset I was using in my implementation. With this step, the result is very similar to my implementation, but the original shader in the OpenGL guide uses a larger step which blurs more.
So with all the above being said, my questions is really two-fold:
I'm computing the step / texel offset as vec2(1 / image width, 1 / image height). With this offset, like I said, a 3x3 box filter is barely noticeable. Is this correct, or am I misunderstanding the computation of the step, or something else?
Is there anything else I could do to try and get the "convolution in the general case" approach to run fast enough for real-time? Or do I necessarily need to go for a simplification like the OpenGL example?
If you run those through the OpenGL ES Analysis tool in Instruments or the Frame Debugger in Xcode, you'll probably see a note about dependent texture reads -- you're calculating texcoords in the fragment shader, which means the hardware can't fetch texel data until it gets to that point in evaluating the shader. If texel coordinates are known going into the fragment shader, the hardware can prefetch your texel data in parallel with other tasks, so it's ready to go by the time the fragment shader needs it.
You can speed things up greatly by precomputing the texel coordinates in the vertex shader. Brad Larson has a good example of doing so in this answer to a similar question.
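As a sketch of how that idea maps onto the "general case" shader above (illustrative names, 3x3 only): compute the sample coordinates once per vertex and hand them to the fragment shader as a varying array, so the fragment shader only does plain lookups. One caveat: a vec2[9] varying array occupies nine varying rows, which can exceed MAX_VARYING_VECTORS on some GPUs; in that case, fall back to individually named varyings the way GPUImage does.
attribute vec4 aPosition;
attribute vec2 aTextureCoordinates0;
uniform vec2 uTextureUnit0Step;       // (1/width, 1/height)
varying vec2 vSampleCoordinates[9];   // one precomputed coordinate per 3x3 tap
void main(void)
{
gl_Position = aPosition;
for (int t = -1; t <= 1; t++) {
for (int s = -1; s <= 1; s++) {
vSampleCoordinates[(t + 1) * 3 + (s + 1)] = aTextureCoordinates0 + vec2(float(s), float(t)) * uTextureUnit0Step;
}
}
}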
I don't have answers regarding your precise questions, but you should take a look at the GPUImage framework, which implements several box blur filters (see this SO question), among them a 2-pass 9x9 filter. You can also see this article for the real-time FPS of different approaches: vImage vs GPUImage vs CoreImage.
It seems this should be easy but I'm having a lot of difficulty using part of a texture with a point sprite. I have googled around extensively and turned up various answers but none of these deal with the specific issue I'm having.
What I've learned so far:
Basics of point sprite drawing
How to deal with point sprites rendering as solid squares
How to alter orientation of a point sprite
How to use multiple textures with a point sprite, getting closer here..
That point sprites + sprite sheets has been done before, but is only possible in OpenGL ES 2.0 (not 1.0)
Here is a diagram of what I'm trying to achieve
Where I'm at:
I have a set of working point sprites all using the same single square image. Eg: a 16x16 image of a circle works great.
I have an Objective-C method which generates a 600x600 image containing a sprite-sheet with multiple images. I have verified this is working by applying the entire sprite sheet image to a quad drawn with GL_TRIANGLES.
I have used the above method successfully to draw parts of a sprite sheet onto quads. I just can't get it to work with point sprites.
Currently I'm generating texture coordinates pointing to the center of the sprite on the sprite sheet I'm targeting. Eg: Using the image at the bottom; star: 0.166,0.5; cloud: 0.5,0.5; heart: 0.833,0.5.
Code:
Vertex Shader
uniform mat4 Projection;
uniform mat4 Modelview;
uniform float PointSize;
attribute vec4 Position;
attribute vec2 TextureCoordIn;
varying vec2 TextureCoord;
void main(void)
{
gl_Position = Projection * Modelview * Position;
TextureCoord = TextureCoordIn;
gl_PointSize = PointSize;
}
Fragment Shader
varying mediump vec2 TextureCoord;
uniform sampler2D Sampler;
void main(void)
{
// Using my TextureCoord just draws a grey square, so
// I'm likely generating texture coords that texture2D doesn't like.
gl_FragColor = texture2D(Sampler, TextureCoord);
// Using gl_PointCoord just draws my whole sprite map
// gl_FragColor = texture2D(Sampler, gl_PointCoord);
}
What I'm stuck on:
I don't understand how to use the gl_PointCoord variable in the fragment shader. What does gl_PointCoord contain initially? Why? Where does it get its data?
I don't understand what texture coordinates to pass in. For example, how does the point sprite choose what part of my sprite sheet to use based on the texture coordinates? I'm used to drawing quads which have effectively 4 sets of texture coordinates (one for each vertex), how is this different (clearly it is)?
A colleague of mine helped with the answer. It turns out the trick is to utilize both the size of the point (in OpenGL units) and the size of the sprite (in texture units, (0..1)) in combination with a little vector math to render only part of the sprite-sheet onto each point.
Vertex Shader
uniform mat4 Projection;
uniform mat4 Modelview;
// The radius of the point in OpenGL units, eg: "20.0"
uniform float PointSize;
// The size of the sprite being rendered. My sprites are square
// so I'm just passing in a float. For non-square sprites pass in
// the width and height as a vec2.
uniform float TextureCoordPointSize;
attribute vec4 Position;
attribute vec4 ObjectCenter;
// The top left corner of a given sprite in the sprite-sheet
attribute vec2 TextureCoordIn;
varying vec2 TextureCoord;
varying vec2 TextureSize;
void main(void)
{
gl_Position = Projection * Modelview * Position;
TextureCoord = TextureCoordIn;
TextureSize = vec2(TextureCoordPointSize, TextureCoordPointSize);
// This is optional, it is a quick and dirty way to make the points stay the same
// size on the screen regardless of distance.
gl_PointSize = PointSize / Position.w;
}
Fragment Shader
varying mediump vec2 TextureCoord;
varying mediump vec2 TextureSize;
uniform sampler2D Sampler;
void main(void)
{
// This is where the magic happens. Combine all three factors to render
// just a portion of the sprite-sheet for this point
mediump vec2 realTexCoord = TextureCoord + (gl_PointCoord * TextureSize);
mediump vec4 fragColor = texture2D(Sampler, realTexCoord);
// Optional, emulate GL_ALPHA_TEST to use transparent images with
// point sprites without worrying about z-order.
// see: http://stackoverflow.com/a/5985195/806988
if(fragColor.a == 0.0){
discard;
}
gl_FragColor = fragColor;
}
Point sprites are composed of a single position. Therefore any "varying" values will not actually vary, because there's nothing to interpolate between.
gl_PointCoord is a vec2 value whose X and Y components are in the range [0, 1]. They represent where the current fragment lies within the point: in OpenGL ES, (0, 0) is the top-left of the point and (1, 1) is the bottom-right.
So you want to map (0, 0) to the top-left of your sprite, and (1, 1) to the bottom-right. To do that, you need to know certain things: the size of the sprites (assuming they're all the same size), the size of the texture (because the texture fetch functions take normalized texture coordinates, not pixel locations), and which sprite is currently being rendered.
The latter can be set via a varying. It can just be a value that's passed as per-vertex data into the varying in the vertex shader.
You use that plus the size of the sprites to determine where in the texture you want to pull data for this sprite. Once you have the texel coordinates you want to use, you divide them by the texture size to produce normalized texture coordinates.
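In other words (a hedged sketch with the sprite's origin and size expressed in pixels; the uniform and varying names here are illustrative, not taken from the code above):
precision mediump float;
uniform sampler2D uSampler;
uniform vec2 uSheetSizePx;     // e.g. (600.0, 600.0)
uniform vec2 uSpriteSizePx;    // e.g. (200.0, 200.0)
varying vec2 vSpriteOriginPx;  // top-left corner of this point's sprite, in pixels
void main(void)
{
vec2 texCoord = (vSpriteOriginPx + gl_PointCoord * uSpriteSizePx) / uSheetSizePx;
gl_FragColor = texture2D(uSampler, texCoord);
}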
In any case, point sprites, despite the name, aren't really meant for sprite rendering. It would be easier to use quads/triangles for that, as you have more control over exactly what position everything has.
Is it possible to access the surface normal - the normal associated with the plane of a fragment - from within a fragment shader? Or perhaps this can be done in the vertex shader?
Is all knowledge of the associated geometry lost when we go down the shader pipeline or is there some clever way of recovering that information in either the vertex of fragment shader?
The surface normal vector can be calculated approximately from the partial derivatives of the view-space position in the fragment shader. The partial derivatives can be obtained with the functions dFdx and dFdy. This requires OpenGL ES 3.0, or in OpenGL ES 2.0 the OES_standard_derivatives extension:
in vec3 view_position;
void main()
{
vec3 normalvector = cross(dFdx(view_position), dFdy(view_position));
// flip the normal so that it points towards the viewer (positive z in view space)
vec3 nv = normalize(normalvector * sign(normalvector.z));
.....
}
In general it is possible to calculate the normal vector of a surface in a geometry shader (since OpenGL ES 3.2).
For example if you draw triangles you get all three points in the geometry shader.
Three points define a plane from which the normal vector can be calculated.
You just have to be careful about whether the points are wound clockwise or counterclockwise.
The normal vector of a triangle is the normalized cross product of 2 vectors defined
by the corner points of the triangle.
See the following example for counterclockwise triangles:
Vertex shader
#version 400
layout (location = 0) in vec3 inPos;
out vec3 vertPos;
uniform mat4 u_projectionMat44;
uniform mat4 u_modelViewMat44;
void main()
{
vec4 viewPos = u_modelViewMat44 * vec4( inPos, 1.0 );
vertPos = viewPos.xyz;
gl_Position = u_projectionMat44 * viewPos;
}
Geometry shader
#version 400
layout( triangles ) in;
layout( triangle_strip, max_vertices = 3 ) out;
in vec3 vertPos[];
out vec3 geoPos;
out vec3 geoNV;
void main()
{
vec3 leg1 = vertPos[1] - vertPos[0];
vec3 leg2 = vertPos[2] - vertPos[0];
geoNV = normalize( cross( leg1, leg2 ) );
geoPos = vertPos[0];
EmitVertex();
geoPos = vertPos[1];
EmitVertex();
geoPos = vertPos[2];
EmitVertex();
EndPrimitive();
}
Fragment shader
#version 400
in vec3 geoPos;
in vec3 geoNV;
void main()
{
// ...
}
Of course you can also calculate the normal vector in the tessellation shaders (since OpenGL ES 3.2).
But this only makes sense if you already need the tessellation shaders for other reasons and want to
calculate the normal vector of the face on top of that:
Vertex shader
The vertex shader is the same as above.
Tessellation control shader
#version 400
layout( vertices=3 ) out;
in vec3 vertPos[];
out vec3 tctrlPos[];
void main()
{
tctrlPos[gl_InvocationID] = vertPos[gl_InvocationID];
if ( gl_InvocationID == 0 )
{
gl_TessLevelOuter[0] = 1.0; // tessellation levels were left unspecified in the original;
gl_TessLevelOuter[1] = 1.0; // 1.0 simply passes the triangle through unsubdivided
gl_TessLevelOuter[2] = 1.0;
gl_TessLevelInner[0] = 1.0;
}
}
Tessellation evaluation shader
#version 400
layout(triangles, ccw) in;
in vec3 tctrlPos[];
out vec3 tevalPos;
out vec3 tevalNV;
void main()
{
vec3 leg1 = tctrlPos[1] - tctrlPos[0];
vec3 leg2 = tctrlPos[2] - tctrlPos[0];
tevalNV = normalize( cross( leg1, leg2 ) );
tevalPos = tctrlPos[0] * gl_TessCoord.x + tctrlPos[1] * gl_TessCoord.y + tctrlPos[2] * gl_TessCoord.z;
}
Fragment shader
#version 400
in vec3 tevalPos;
in vec3 tevalNV;
void main()
{
// ...
}
You can get per-pixel normals interpolated from the vertex normals by just using a "varying" (in newer OpenGL it is just in/out) variable. But do not forget to normalize this normal! Interpolated normals are generally no longer of length 1. These normals also give bad results on sharp edges.
If you want custom normals with a higher resolution, a commonly used technique is normal mapping. You create a texture with baked normals for your object. Then you can access the normal in the fragment shader using a texture look-up.
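A minimal GLSL sketch of both points, renormalizing the interpolated varying and decoding a normal map; the names are illustrative, and the tangent-space transform for the mapped normal is left out:
precision mediump float;
uniform sampler2D uNormalMap;
varying vec3 vNormal;     // interpolated vertex normal
varying vec2 vTexCoord;
void main()
{
// interpolation shortens the vector, so renormalize before using it
vec3 n = normalize(vNormal);
// the normal map stores components remapped from [-1,1] to [0,1], so undo that after the fetch
vec3 mappedNormal = normalize(texture2D(uNormalMap, vTexCoord).rgb * 2.0 - 1.0);
// ... use n or mappedNormal in your lighting calculation, e.g. visualize n:
gl_FragColor = vec4(n * 0.5 + 0.5, 1.0);
}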
If you pass the vertex normal through to the fragment shader in a "varying" then you will get an interpolated fragment normal.
EDIT: You will have to calculate the normals in your application, and pass them into your shader as an attribute for each vertex of your triangle.
The usual way to calculate the normal for a triangle is with a cross product.
Call the three points making up the triangle P1, P2, and P3.
Calculate V1, the vector from P1 to P2.
Calculate V2, the vector from P1 to P3.
Calculate the cross product of V1 and V2.
This will give you the normal to the plane of the triangle. V2 should be "to the left of" V1, or your normal will point "in" instead of "out". See the Wikipedia article on cross products for details.
FURTHER EDIT: Right, I understand your problem now. Yes, it's true that with shared vertices you can't really have more than one normal per vertex.
The only other thing that I can think of is that maybe a geometry shader could help, because it gets passed all three vertices for a triangle. I don't have any experience with them though.