Sampling a texture within the vertex shader? - directx

I'm using DirectX 11 targeting Shader Model 5 (well, actually using SharpDX for the DirectX 11.2 build) and I'm at a loss as to what is wrong with this simple shader I'm writing.
The effect is applied to a flat, high-poly plane (so there are plenty of vertices to displace). The texture is sampled without issues (the pixel shader displays it fine); however, the displacement in the vertex shader just doesn't work.
It's not an issue with the displacement itself: replacing the += height.SampleLevel with += 0.5 shows all vertices displaced. It's not an issue with the sampling either, since the very same code works in the pixel shader. And as far as I understand, it's not an API usage issue, since SampleLevel (unlike Sample) is perfectly usable in the VS, given that you provide the LOD level yourself.
Using a noise function to displace instead of a texture also works just fine, which leads me to think the issue is with the texture sampling, and only inside the VS, which is odd since I'm using a function that is supposedly VS-compatible.
I've been trying random things for the past hour and I'm really clueless as to where to look; I also find nearly no information about displacement mapping in HLSL, and even less for DX11+, to use as a reference.
struct VS_IN
{
    float4 pos : POSITION;
    float2 tex : TEXCOORD;
};

struct PS_IN
{
    float4 pos : SV_POSITION;
    float2 tex : TEXCOORD;
};

float4x4 worldViewProj;

Texture2D<float4> diffuse : register(t0);
Texture2D<float4> height : register(t1);
Texture2D<float4> lightmap : register(t2);
SamplerState pictureSampler;

PS_IN VS(VS_IN input)
{
    PS_IN output = (PS_IN) 0;
    input.pos.z += height.SampleLevel(pictureSampler, input.tex, 0).r;
    output.pos = mul(input.pos, worldViewProj);
    output.tex = input.tex;
    return output;
}

float4 PS(PS_IN input) : SV_Target
{
    return height.SampleLevel(pictureSampler, input.tex, 0).rrrr;
}
For reference, in case it matters:
Texture description:
new SharpDX.Direct3D11.Texture2DDescription()
{
    Width = bitmapSource.Size.Width,
    Height = bitmapSource.Size.Height,
    ArraySize = 1,
    BindFlags = SharpDX.Direct3D11.BindFlags.ShaderResource,
    Usage = SharpDX.Direct3D11.ResourceUsage.Immutable,
    CpuAccessFlags = SharpDX.Direct3D11.CpuAccessFlags.None,
    Format = SharpDX.DXGI.Format.R8G8B8A8_UNorm,
    MipLevels = 1,
    OptionFlags = SharpDX.Direct3D11.ResourceOptionFlags.None,
    SampleDescription = new SharpDX.DXGI.SampleDescription(1, 0),
};
Sampler description:
var sampler = new SamplerState(device, new SamplerStateDescription()
{
    Filter = Filter.MinMagMipLinear,
    AddressU = TextureAddressMode.Clamp,
    AddressV = TextureAddressMode.Clamp,
    AddressW = TextureAddressMode.Clamp,
    BorderColor = Color.Pink,
    ComparisonFunction = Comparison.Never,
    MaximumAnisotropy = 16,
    MipLodBias = 0,
    MinimumLod = 0,
    MaximumLod = 16
});

So the answer was that Ronan wasn't binding the shader resource view (and sampler) to the vertex shader stage, so the VS could never access the texture, even though the PS could.
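In SharpDX terms, the missing call looks something like the sketch below; context is the DeviceContext, and heightSRV / sampler stand in for whatever variables hold the ShaderResourceView and SamplerState created from the descriptions above:
// Binding resources to the pixel shader stage does not make them visible
// to the vertex shader; each pipeline stage has its own slots.
context.PixelShader.SetShaderResource(1, heightSRV);
context.PixelShader.SetSampler(0, sampler);

// The missing piece: bind the same SRV and sampler to the VS stage so that
// height.SampleLevel(...) has something to read in the vertex shader.
context.VertexShader.SetShaderResource(1, heightSRV);
context.VertexShader.SetSampler(0, sampler);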

Related

DirectX + GLM Depth Reconstruction issues

I'm trying to port my engine to DirectX and I'm currently having issues with depth reconstruction. It works perfectly in OpenGL (even though I use a bit of an expensive method); every part besides the depth reconstruction works so far. I use GLM because it's a good math library with no dependencies for the user to install.
So basically I get my GLM matrices:
struct DefferedUBO {
    glm::mat4 view;
    glm::mat4 invProj;
    glm::vec4 eyePos;
    glm::vec4 resolution;
};

DefferedUBO deffUBOBuffer;
// ...
glm::mat4 projection = glm::perspective(engine.settings.fov, aspectRatio, 0.1f, 100.0f);
// Get My Camera
CTransform *transform = &engine.transformSystem.components[engine.entities[entityID].components[COMPONENT_TRANSFORM]];
// Get the View Matrix
glm::mat4 view = glm::lookAt(
    transform->GetPosition(),
    transform->GetPosition() + transform->GetForward(),
    transform->GetUp()
);

deffUBOBuffer.invProj = glm::inverse(projection);
deffUBOBuffer.view = glm::inverse(view);

if (engine.settings.graphicsLanguage == GRAPHICS_DIRECTX) {
    deffUBOBuffer.invProj = glm::transpose(deffUBOBuffer.invProj);
    deffUBOBuffer.view = glm::transpose(deffUBOBuffer.view);
}

// Abstracted so I can use OGL, DX, VK, or even Metal when I get around to it.
deffUBO->UpdateUniformBuffer(&deffUBOBuffer);
deffUBO->Bind();
Then in HLSL, I simply use the following:
cbuffer MatrixInfoType {
    matrix invView;
    matrix invProj;
    float4 eyePos;
    float4 resolution;
};

float4 ViewPosFromDepth(float depth, float2 TexCoord) {
    float z = depth; // * 2.0 - 1.0;
    float4 clipSpacePosition = float4(TexCoord * 2.0 - 1.0, z, 1.0);
    float4 viewSpacePosition = mul(invProj, clipSpacePosition);
    viewSpacePosition /= viewSpacePosition.w;
    return viewSpacePosition;
}

float3 WorldPosFromViewPos(float4 view) {
    float4 worldSpacePosition = mul(invView, view);
    return worldSpacePosition.xyz;
}

float3 WorldPosFromDepth(float depth, float2 TexCoord) {
    return WorldPosFromViewPos(ViewPosFromDepth(depth, TexCoord));
}
// ...
// Sample the hardware depth buffer.
float depth = shaderTexture[3].Sample(SampleType[0], input.texCoord).r;
float3 position = WorldPosFromDepth(depth, input.texCoord).rgb;
Here's the result:
This just looks like random colors multiplied with the depth.
Ironically, when I remove the transposing, I get something closer to the truth, but still not quite right:
You're looking at Crytek Sponza. As you can see, the green area moves and rotates with the bottom of the camera. I have no idea at all why.
The correct version, along with Albedo, Specular, and Normals.
I fixed my problem at gamedev.net: there was a matrix major-order (row- vs. column-major) issue as well as a depth-handling issue.
https://www.gamedev.net/forums/topic/692095-d3d-glm-depth-reconstruction-issues
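For reference, the usual D3D-side shape of those two fixes, sketched against the ViewPosFromDepth above (this is the general pattern, not the exact code from the thread):
float4 ViewPosFromDepth(float depth, float2 TexCoord)
{
    // Depth handling: a D3D hardware depth buffer is already in [0, 1],
    // so z is used as-is; only xy are remapped to [-1, 1]. Note the V flip:
    // texture V grows downward while clip-space Y grows upward.
    float2 ndc = float2(TexCoord.x * 2.0 - 1.0, 1.0 - TexCoord.y * 2.0);
    float4 clipSpacePosition = float4(ndc, depth, 1.0);

    // Matrix majorness: mul(matrix, vector) and mul(vector, matrix) are
    // transposes of each other, so whichever you pick must match how the
    // matrices were uploaded (with or without glm::transpose).
    float4 viewSpacePosition = mul(invProj, clipSpacePosition);
    return viewSpacePosition / viewSpacePosition.w;
}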

Texture sampler in HLSL does not interpolate

I am currently working on a multi-textured terrain and I have problems with the Sample function of Texture2DArray.
In my example, I use a Texture2DArray to store a set of different terrain textures, e.g. grass, sand, asphalt, etc. Each of my vertices stores a texture coordinate (UV coordinate) and the index of the texture I want to use. So, if my index is 0, I use the first texture; if the index is 1, I use the second texture, and so on. This works fine as long as the index is a whole number (0, 1, ...). However, it fails if the index is fractional (like 1.5f).
To narrow down the problem, I reduced my entire pixel shader to this:
Texture2DArray DiffuseTextures : register(t0);
Texture2DArray NormalTextures : register(t1);
Texture2DArray EmissiveTextures : register(t2);
Texture2DArray SpecularTextures : register(t3);
SamplerState Sampler : register(s0);

struct PS_IN
{
    float4 pos : SV_POSITION;
    float3 nor : NORMAL;
    float3 tan : TANGENT;
    float3 bin : BINORMAL;
    float4 col : COLOR;
    float4 TextureIndices : COLOR1;
    float4 tra : COLOR2;
    float2 TextureUV : TEXCOORD0;
};

float4 PS(PS_IN input) : SV_Target
{
    float4 texCol = DiffuseTextures.Sample(Sampler, float3(input.TextureUV, input.TextureIndices.r));
    return texCol;
}
The following image shows the result of a sample scene on the left side. As you can see, there is a hard border between the used textures. There is no form of interpolation.
In order to check my texture indices, I changed my pixel shader from above by returning the texture indices as a color:
return float4(input.TextureIndices.r, input.TextureIndices.r, input.TextureIndices.r, 1.0f);
The result can be seen on the right side of the image. The texture indices are correct, since they range in the interval [0, 1] and you can clearly see the interpolation at the border of the area. However, my sampled texture does not show any form of interpolation.
Since my pixel shader is pretty simple, I wonder what causes this behaviour. Is there any setting in DirectX responsible for this?
I use DirectX 11, pixel shader ps_5_0 (I also tested with ps_4_0) and I use DDS textures (BC3 compression).
Edit
This is the sampler I am using:
SharpDX.Direct3D11.SamplerStateDescription samplerStateDescription = new SharpDX.Direct3D11.SamplerStateDescription()
{
    AddressU = SharpDX.Direct3D11.TextureAddressMode.Wrap,
    AddressV = SharpDX.Direct3D11.TextureAddressMode.Wrap,
    AddressW = SharpDX.Direct3D11.TextureAddressMode.Wrap,
    Filter = SharpDX.Direct3D11.Filter.MinMagMipLinear
};
SharpDX.Direct3D11.SamplerState samplerState = new SharpDX.Direct3D11.SamplerState(_device, samplerStateDescription);
_deviceContext.PixelShader.SetSampler(0, samplerState);
Solution
I made a function using the code presented by catflier for getting a texture color:
float4 GetTextureColor(Texture2DArray textureArray, float2 textureUV, float textureIndex)
{
    float tid = textureIndex;
    int id = (int)tid;
    float l = frac(tid);
    float4 texCol1 = textureArray.Sample(Sampler, float3(textureUV, id));
    float4 texCol2 = textureArray.Sample(Sampler, float3(textureUV, id + 1));
    return lerp(texCol1, texCol2, l);
}
This way, I can get the desired texture color for all texture types (diffuse, specular, emissive, ...) with a simple function call:
float4 texCol = GetTextureColor(DiffuseTextures, input.TextureUV, input.TextureIndices.r);
float4 bumpMap = GetTextureColor(NormalTextures, input.TextureUV, input.TextureIndices.g);
float4 emiCol = GetTextureColor(EmissiveTextures, input.TextureUV, input.TextureIndices.b);
float4 speCol = GetTextureColor(SpecularTextures, input.TextureUV, input.TextureIndices.a);
The result is as smooth as I wanted it to be. :-)
Texture arrays do not sample across slices, so technically this is the expected result.
If you want to interpolate between slices (e.g. 1.5f gives you "half" of the second texture and "half" of the third texture), you can use a Texture3D instead, which allows this (but it will cost a bit more, since it performs trilinear filtering).
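A rough sketch of that Texture3D alternative, assuming the slices get uploaded as the depth layers of a volume texture (DiffuseVolume and sliceCount are made-up names):
Texture3D DiffuseVolume; // volume version of the diffuse array
float sliceCount;        // number of layers, set from the application

float4 SampleBlended(float2 uv, float tid)
{
    // The w coordinate addresses the depth axis; sampling between layer
    // centers lets the hardware blend across slices for you.
    float w = (tid + 0.5) / sliceCount;
    return DiffuseVolume.Sample(Sampler, float3(uv, w));
}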
Otherwise, you can perform your sampling this way:
float4 PS(PS_IN input) : SV_Target
{
    float tid = input.TextureIndices.r;
    int id = (int)tid;
    float l = frac(tid); // lerp amount
    float4 texCol1 = DiffuseTextures.Sample(Sampler, float3(input.TextureUV, id));
    float4 texCol2 = DiffuseTextures.Sample(Sampler, float3(input.TextureUV, id + 1));
    return lerp(texCol1, texCol2, l);
}
Please note that this technique is quite a bit more flexible, since you can also provide non-adjacent slices as input (so you can lerp between slice 2 and slice 23, for example), and even use a different blend mode by replacing lerp with some other function.

Custom HLSL shader making weird patterns across icosphere

really hoping that someone can help me here - I can usually resolve bugs in C# since I have a fair amount of experience with it, but I don't have a lot to go on with HLSL.
The picture linked below shows the same model (programmatically generated on run) twice: the first (white) time using BasicEffect, and the second time using my custom shader, listed below. The fact that it works with BasicEffect makes me think it's not an issue with generating the normals for the model or anything like that.
I've included different levels of subdividing to better illustrate the issue. It's worth mentioning that both effects are using the same lighting direction.
https://imagizer.imageshack.us/v2/801x721q90/673/qvXyBk.png
Here's my shader code (feel free to pick it apart, any tips are very welcome):
float4x4 WorldViewProj;
float4x4 NormalRotation = float4x4(1, 0, 0, 0,
                                   0, 1, 0, 0,
                                   0, 0, 1, 0,
                                   0, 0, 0, 1);
float4 ModelColor = float4(1, 1, 1, 1);
bool TextureEnabled = false;

Texture ModelTexture;
sampler ColoredTextureSampler = sampler_state
{
    texture = <ModelTexture>;
    magfilter = LINEAR; minfilter = LINEAR; mipfilter = LINEAR;
    AddressU = mirror; AddressV = mirror;
};

float4 AmbientColor = float4(1, 1, 1, 1);
float AmbientIntensity = 0.1;
float3 DiffuseLightDirection = float3(1, 0, 0);
float4 DiffuseColor = float4(1, 1, 1, 1);
float DiffuseIntensity = 1.0;

struct VertexShaderInput
{
    float4 Position : POSITION0;
    float4 Normal : NORMAL0;
    float2 TextureCoordinates : TEXCOORD0;
};

struct VertexShaderOutput
{
    float4 Position : POSITION0;
    float4 Color : COLOR0;
    float2 TextureCoordinates : TEXCOORD0;
};

VertexShaderOutput VertexShaderFunction(VertexShaderInput input)
{
    VertexShaderOutput output = (VertexShaderOutput)0;
    output.Position = mul(input.Position, WorldViewProj);
    float4 normal = mul(input.Normal, NormalRotation);
    float lightIntensity = dot(normal, DiffuseLightDirection);
    output.Color = saturate(DiffuseColor * DiffuseIntensity * lightIntensity);
    output.TextureCoordinates = input.TextureCoordinates;
    return output;
}

float4 PixelShaderFunction(VertexShaderOutput input) : COLOR0
{
    float4 pixBaseColor = ModelColor;
    if (TextureEnabled == true)
    {
        pixBaseColor = tex2D(ColoredTextureSampler, input.TextureCoordinates);
    }
    float4 lighting = saturate((input.Color + AmbientColor * AmbientIntensity) * pixBaseColor);
    return lighting;
}

technique BestCurrent
{
    pass Pass1
    {
        VertexShader = compile vs_2_0 VertexShaderFunction();
        PixelShader = compile ps_2_0 PixelShaderFunction();
    }
}
In general, when implementing a lighting equation, there are a few things to ensure:
Normals, light directions, and other directional vectors should be normalized before using them in a dot product. In your case you could add something like:
normal = normalize(normal);
The same should be done for DiffuseLightDirection if it is not already normalized. It already is with your default value, but if your app changes it, it might not be normalized anymore. For that, it would be better to normalize in the application code, since it only needs to be done once when it changes, not per vertex.
Also remember that if you are multiplying the vector by a matrix that contains a scale, the vector will no longer be normalized, so it will need to be re-normalized.
The light direction and the normal must point in the same direction, which is out from the surface. Your default light direction is (1,0,0). If you want light to travel in the +x direction, then you must actually negate the vector before performing the dot product with the normal, so that it points out from the surface just like the normal does. If you already take this into account, then it's not a problem.
Vectors can't be translated, since they are just a direction, not a position. So when you transform them with a matrix, it is important to ensure that either the fourth component (w) of the vector is 0, or the matrix you are transforming it with has no translation; setting w to 0 zeroes out any translation from the matrix during the multiply. Since your matrix is called NormalRotation, I'm assuming it only contains a rotation, so this probably isn't an issue. A short sketch pulling these points together follows below.
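A minimal sketch of the vertex shader with the three points applied (assuming NormalRotation really is a pure rotation, and that DiffuseLightDirection stores the direction the light travels, so it gets negated):
VertexShaderOutput VertexShaderFunction(VertexShaderInput input)
{
    VertexShaderOutput output = (VertexShaderOutput)0;
    output.Position = mul(input.Position, WorldViewProj);

    // w = 0: direction vectors must ignore any translation in the matrix.
    float3 normal = mul(float4(input.Normal.xyz, 0), NormalRotation).xyz;
    // Re-normalize in case the matrix or the input normal isn't unit length.
    normal = normalize(normal);

    // Negate the light direction so it points out from the surface,
    // the same way the normal does.
    float lightIntensity = saturate(dot(normal, -normalize(DiffuseLightDirection)));

    output.Color = saturate(DiffuseColor * DiffuseIntensity * lightIntensity);
    output.TextureCoordinates = input.TextureCoordinates;
    return output;
}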

How can I repeat my texture in DX

There is a handy feature in the three.js 3D library: you can set the sampler to repeat mode and set the repeat attribute to values you like; for example, (3, 5) means the texture will repeat 3 times horizontally and 5 times vertically. But now I'm using DirectX, and I cannot find a good solution to this problem. Note that the UV coordinates of the vertices still range from 0 to 1, and I don't want to change my HLSL code, because I want a programmable solution. Thanks very much!
Edit: presume I have a cube model already, and the texture coordinates of its vertices are between 0 and 1. If I use wrap mode or clamp mode for sampling, it's all OK now. But I want to repeat a texture on one of its faces, and for that I first need to change to wrap mode; that much I already know. Then I would have to edit my model so that the texture coordinates range from 0 to 3. What if I don't change my model? So far I've come up with one way: add a variable to the pixel shader that represents how many times the map repeats, and multiply the texture coordinate by this factor when sampling. Not a graceful solution, I think…
Since you've edited your question, there is another answer to your problem:
From what I understood, you have a face with UVs like so:
0,1           1,1
  -------------
  |           |
  |           |
  |           |
  -------------
0,0           1,0
But you want the texture repeated 3 times (for example) instead of once, without changing the original model.
Multiple solutions here:
You could do it when updating your buffers (if you update them at all):
D3D11_MAPPED_SUBRESOURCE resource;
HRESULT hResult = D3DDeviceContext->Map(vertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &resource);
if (hResult != S_OK) return false;

YourVertexFormat *ptr = (YourVertexFormat*)resource.pData;
for (int i = 0; i < vertexCount; i++)
{
    ptr[i] = vertices[i];
    ptr[i].uv.x *= multiplyX; // in your case 3
    ptr[i].uv.y *= multiplyY; // in your case 5
}
D3DDeviceContext->Unmap(vertexBuffer, 0);
But if you don't need to update the buffer anyway, I wouldn't recommend it, because it is terribly slow.
A faster way is to use the vertex shader:
cbuffer MatrixBuffer
{
    matrix worldMatrix;
    matrix viewMatrix;
    matrix projectionMatrix;
};

struct VertexInputType
{
    float4 position : POSITION0;
    float2 uv : TEXCOORD0;
    // ...
};

struct PixelInputType
{
    float4 position : SV_POSITION;
    float2 uv : TEXCOORD0;
    // ...
};

PixelInputType main(VertexInputType input)
{
    input.position.w = 1.0f;
    PixelInputType output;
    output.position = mul(input.position, worldMatrix);
    output.position = mul(output.position, viewMatrix);
    output.position = mul(output.position, projectionMatrix);
This is what you basically need:
    output.uv = input.uv * 3; // 3x3
Or more advanced:
    output.uv = float2(input.uv.x * 3, input.uv.y * 5);
    // ...
    return output;
}
I would recommend the vertex shader solution, because it's fast, and in DirectX you use vertex shaders anyway, so it's not as expensive as the buffer-update solution...
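And if you want the repeat counts adjustable at run time, like the three.js repeat attribute from the question, the factor can come from a constant buffer instead of being hard-coded (TilingBuffer and uvRepeat are names made up for this sketch):
cbuffer TilingBuffer
{
    float2 uvRepeat; // e.g. (3, 5): 3 repeats horizontally, 5 vertically
    float2 padding;  // constant buffers round up to 16-byte multiples
};

// ... then, inside the vertex shader, instead of the hard-coded factor:
output.uv = input.uv * uvRepeat;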
Hope that helped solving your problems :)
You basically want to create a sampler state like so:
ID3D11SamplerState* m_sampleState;
D3D11_SAMPLER_DESC samplerDesc;
samplerDesc.Filter = D3D11_FILTER_MIN_MAG_MIP_LINEAR;
samplerDesc.AddressU = D3D11_TEXTURE_ADDRESS_WRAP;
samplerDesc.AddressV = D3D11_TEXTURE_ADDRESS_WRAP;
samplerDesc.AddressW = D3D11_TEXTURE_ADDRESS_WRAP;
samplerDesc.MipLODBias = 0.0f;
samplerDesc.MaxAnisotropy = 1;
samplerDesc.ComparisonFunc = D3D11_COMPARISON_ALWAYS;
samplerDesc.BorderColor[0] = 0;
samplerDesc.BorderColor[1] = 0;
samplerDesc.BorderColor[2] = 0;
samplerDesc.BorderColor[3] = 0;
samplerDesc.MinLOD = 0;
samplerDesc.MaxLOD = D3D11_FLOAT32_MAX;

// Create the texture sampler state.
result = ifDEVICE->ifDX11->getD3DDevice()->CreateSamplerState(&samplerDesc, &m_sampleState);
And when you are setting your shader constants, call this:
ifDEVICE->ifDX11->getD3DDeviceContext()->PSSetSamplers(0, 1, &m_sampleState);
Then you can write your pixel shaders like this:
Texture2D shaderTexture;
SamplerState SampleType;
...
float4 main(PixelInputType input) : SV_TARGET
{
    float4 textureColor = shaderTexture.Sample(SampleType, input.uv);
    ...
}
Hope that helps...

HLSL Pixel shader lighting performance (XNA)

I have a simple enough shader that supports multiple point lights.
Lights are stored as an array of Light structs (up to a max size) and I pass in the number of active lights when it changes.
The problem is in the PixelShader function:
It's basic stuff: get the base color from the texture, loop through the lights array from 0 to activeLights, and add each light's contribution. It works fine, but performance is terrible!
BUT if I replace the reference to the global variable activeLights with a constant of the same value, performance is fine.
I just can't fathom why referencing the variable makes a 30+ fps difference.
Can anyone please explain?
Full Shader code:
#define MAX_POINT_LIGHTS 16

struct PointLight
{
    float3 Position;
    float4 Color;
    float Radius;
};

float4x4 World;
float4x4 View;
float4x4 Projection;
float3 CameraPosition;
float4 SpecularColor;
float SpecularPower;
float SpecularIntensity;
float4 AmbientColor;
float AmbientIntensity;
float DiffuseIntensity;

int activeLights;
PointLight lights[MAX_POINT_LIGHTS];

bool IsLightingEnabled;
bool IsAmbientLightingEnabled;
bool IsDiffuseLightingEnabled;
bool IsSpecularLightingEnabled;

Texture Texture;
sampler TextureSampler = sampler_state
{
    Texture = <Texture>;
    Magfilter = POINT;
    Minfilter = POINT;
    Mipfilter = POINT;
    AddressU = WRAP;
    AddressV = WRAP;
};

struct VS_INPUT
{
    float4 Position : POSITION0;
    float2 TexCoord : TEXCOORD0;
    float3 Normal : NORMAL0;
};

struct VS_OUTPUT
{
    float3 WorldPosition : TEXCOORD0;
    float4 Position : POSITION0;
    float3 Normal : TEXCOORD1;
    float2 TexCoord : TEXCOORD2;
    float3 ViewDir : TEXCOORD3;
};

VS_OUTPUT VS_PointLighting(VS_INPUT input)
{
    VS_OUTPUT output;
    float4 worldPosition = mul(input.Position, World);
    output.WorldPosition = worldPosition.xyz;
    float4 viewPosition = mul(worldPosition, View);
    output.Position = mul(viewPosition, Projection);
    output.Normal = normalize(mul(input.Normal, (float3x3)World));
    output.TexCoord = input.TexCoord;
    output.ViewDir = normalize(CameraPosition - worldPosition.xyz);
    return output;
}

float4 PS_PointLighting(VS_OUTPUT IN) : COLOR
{
    if (!IsLightingEnabled) return tex2D(TextureSampler, IN.TexCoord);

    float4 color = float4(0.0f, 0.0f, 0.0f, 0.0f);
    float3 n = normalize(IN.Normal);
    float3 v = normalize(IN.ViewDir);
    float3 l = float3(0.0f, 0.0f, 0.0f);
    float3 h = float3(0.0f, 0.0f, 0.0f);
    float atten = 0.0f;
    float nDotL = 0.0f;
    float power = 0.0f;

    if (IsAmbientLightingEnabled) color += (AmbientColor * AmbientIntensity);

    if (IsDiffuseLightingEnabled || IsSpecularLightingEnabled)
    {
        //for (int i = 0; i < activeLights; ++i) // works, but performance is terrible
        for (int i = 0; i < 7; ++i) // performance is fine, but obviously isn't dynamic
        {
            l = (lights[i].Position - IN.WorldPosition) / lights[i].Radius;
            atten = saturate(1.0f - dot(l, l));
            l = normalize(l);
            nDotL = saturate(dot(n, l));
            if (IsDiffuseLightingEnabled) color += (lights[i].Color * nDotL * atten);
            if (IsSpecularLightingEnabled) color += (SpecularColor * SpecularPower * atten);
        }
    }

    return color * tex2D(TextureSampler, IN.TexCoord);
}

technique PerPixelPointLighting
{
    pass
    {
        VertexShader = compile vs_3_0 VS_PointLighting();
        PixelShader = compile ps_3_0 PS_PointLighting();
    }
}
My guess is that changing the loop constraint to be a compile-time constant is allowing the HLSL compiler to unroll the loop. That is, instead of this:
for (int i = 0; i < 7; i++)
    doLoopyStuff();
It's getting turned into this:
doLoopyStuff();
doLoopyStuff();
doLoopyStuff();
doLoopyStuff();
doLoopyStuff();
doLoopyStuff();
doLoopyStuff();
Loops and conditional branches can be a significant performance hit inside of shader code, and should be avoided wherever possible.
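If you do need the count to stay dynamic, one compromise is to unroll to the compile-time maximum and mask out the inactive lights; a sketch against the shader above, where ComputeLight stands in for the body of the existing loop:
// [unroll(n)] asks the compiler to expand the loop to its constant bound;
// the branch inside then costs a comparison rather than a dynamic loop.
[unroll(MAX_POINT_LIGHTS)]
for (int i = 0; i < MAX_POINT_LIGHTS; ++i)
{
    if (i < activeLights)
        color += ComputeLight(lights[i], n, IN.WorldPosition);
}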
EDIT
This is just off the top of my head, but maybe you could try something like this?
for (int i = 0; i < MAX_LIGHTS; i++)
{
    // step(i + 1, activeLights) is 1 while i < activeLights and 0 afterwards,
    // so inactive lights contribute nothing.
    color += step(i + 1, activeLights) * lightingFunction();
}
This way you calculate all possible lights, but always get a value of 0 for inactive lights. The benefit would depend on the complexity of the lighting function, of course; you would need to do more profiling.
Try using PIX to profile it. http://wtomandev.blogspot.com/2010/05/debugging-hlsl-shaders.html
Alternatively, read this rambling speculation:
Maybe because with a constant, the compiler can unroll and collapse your loop's instructions. When you replace it with a variable, the compiler can no longer make the same assumptions.
Though, somewhat unrelated to your actual question, I would push a lot of those conditions/calculations up to the application level.
if(IsDiffuseLightingEnabled || IsSpecularLightingEnabled)
^- Like that.
Also, I think you could precompute a few things before you call the shader program. Like l = (lights[i].Position - IN.WorldPosition) / lights[i].Radius; pass in a precomputed array of those rather than calculating it each time over every pixel.
I might be misinformed about the optimizations the HLSL compiler does, but I think each calculation like that in the pixel shader gets executed screen w*h times (though this is done insanely in parallel), and I vaguely remember there being a limit to the number of instructions you could have in a shader (like 72?), though I think that restriction was relaxed a lot in higher versions of HLSL. Maybe your shader generates so many instructions that it gets broken up into a multi-pass pixel shader at compilation; if that's the case, that probably adds significant overhead.
Actually, here's another idea that might be stupid: passing a variable to a shader transmits the data to the GPU, and that transmission happens over limited bandwidth. Perhaps the compiler is smart enough that when you're only statically indexing the first 7 elements of an array, it only transfers 7 elements. When the compiler doesn't make that optimization (because you aren't iterating with constants), it pushes the WHOLE array every frame, and you're flooding the bus. If that's the case, then my earlier suggestion of pushing calculations out, and passing more results in, would only make the problem worse, heh.
