Three.js, 256×256 png textures crash chrome tab - memory

Loading 256x256 textures into Three.js materials, which are then used for planegeometry deformation. Encountering a bottleneck at 15th texture. Chrome apparently crashes at the render call. When each mesh is added to scene, I call the renderer.render call, but the sequence is pretty tight, so I believe, the gpu bus may be overwhelmed. It is hard to believe that a small number of such small textures is enough to cause this. Cpu memory is not a problem, as textures are loaded into cpu and if meshes are not added to scene, there is no crash. Also, there is a significant delay while the textures are being copied from cpu to gpu.
function loadTexture(texture) {
var x = 512;
var y = 512;
var dx = 256;
var dy = 256;
var geometry = new THREE.PlaneGeometry(x, y, dx, dy);
var material = new THREE.ShaderMaterial({
side: THREE.DoubleSide,
uniforms: {
heightMap: {
type: "t",
value: texture
}
},
vertexShader: vertexShader,
fragmentShader: fragmentShader
});
var mesh = new THREE.Mesh(geometry, material);
scene.add(mesh);
this.renderer.render(scene, camera);
}

PlaneGeometries are turning out to be expensive in Three.js r76. I cannot create more than 14 of these (at mentioned resolution of 256x256 data points). I happen to need many but a limited amount of plane geometries, so I can continue to push against this limit but eventually I will need more memory for the PlaneGeometries. This bottleneck is GPU only, as this happens only on renderer.render call.

Related

how can I update dynamic vertex buffer fastly?

I'm trying to make a simple 3D modeling tool.
there is some work to move a vertex( or vertices ) for transform the model.
I used dynamic vertex buffer because thought it needs much update.
but performance is too low in high polygon model even though I change just one vertex.
is there other methods? or did I wrong way?
here is my D3D11_BUFFER_DESC
Usage = D3D11_USAGE_DYNAMIC;
CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
BindFlags = D3D11_BIND_VERTEX_BUFFER;
ByteWidth = sizeof(ST_Vertex) * _nVertexCount
D3D11_SUBRESOURCE_DATA d3dBufferData;
d3dBufferData.pSysMem = pVerticesInfo;
hr = pd3dDevice->CreateBuffer(&descBuffer, &d3dBufferData, &_pVertexBuffer);
and my update funtion
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
pImmediateContext->Map(_pVertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &d3dMappedResource);
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
for (int i = 0; i < vIndice.size(); ++i)
{
pBuffer[vIndice[i]].xfPosition.x = pVerticesInfo[vIndice[i]].xfPosition.x;
pBuffer[vIndice[i]].xfPosition.y = pVerticesInfo[vIndice[i]].xfPosition.y;
pBuffer[vIndice[i]].xfPosition.z = pVerticesInfo[vIndice[i]].xfPosition.z;
}
pImmediateContext->Unmap(_pVertexBuffer, 0);
As mentioned in the previous answer, you are updating your whole buffer every time, which will be slow depending on model size.
The solution is indeed to implement partial updates, there are two possibilities for it, you want to update a single vertex, or you want to update
arbitrary indices (for example, you want to move N vertices in one go, in different locations, like vertex 1,20,23 for example.
The first solution is rather simple, first create your buffer with the following description :
Usage = D3D11_USAGE_DEFAULT;
CPUAccessFlags = 0;
BindFlags = D3D11_BIND_VERTEX_BUFFER;
ByteWidth = sizeof(ST_Vertex) * _nVertexCount
D3D11_SUBRESOURCE_DATA d3dBufferData;
d3dBufferData.pSysMem = pVerticesInfo;
hr = pd3dDevice->CreateBuffer(&descBuffer, &d3dBufferData, &_pVertexBuffer);
This makes sure your vertex buffer is gpu visible only.
Next create a second dynamic buffer which has the size of a single vertex (you do not need any bind flags in that case, as it will be used only for copies)
_pCopyVertexBuffer
Usage = D3D11_USAGE_DYNAMIC; //Staging works as well
CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
BindFlags = 0;
ByteWidth = sizeof(ST_Vertex);
D3D11_SUBRESOURCE_DATA d3dBufferData;
d3dBufferData.pSysMem = NULL;
hr = pd3dDevice->CreateBuffer(&descBuffer, &d3dBufferData, &_pCopyVertexBuffer);
when you move a vertex, copy the changed vertex in the copy buffer :
ST_Vertex changedVertex;
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
pImmediateContext->Map(_pVertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &d3dMappedResource);
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
pBuffer->xfPosition.x = changedVertex.xfPosition.x;
pBuffer->.xfPosition.y = changedVertex.xfPosition.y;
pBuffer->.xfPosition.z = changedVertex.xfPosition.z;
pImmediateContext->Unmap(_pVertexBuffer, 0);
Since you use D3D11_MAP_WRITE_DISCARD, make sure to write all attributes there (not only position).
Now once you done, you can use ID3D11DeviceContext::CopySubresourceRegion to only copy the modified vertex in the current location :
I assume that vertexID is the index of the modified vertex :
pd3DeviceContext->CopySubresourceRegion(_pVertexBuffer,
0, //must be 0
vertexID * sizeof(ST_Vertex), //location of the vertex in you gpu vertex buffer
0, //must be 0
0, //must be 0
_pCopyVertexBuffer,
0, //must be 0
NULL //in this case we copy the full content of _pCopyVertexBuffer, so we can set to null
);
Now if you want to update a list of vertices, things get more complicated and you have several options :
-First you apply this single vertex technique in a loop, this will work quite well if your changeset is small.
-If your changeset is very big (close to almost full vertex size, you can probably rewrite the whole buffer instead).
-An intermediate technique is to use compute shader to perform the updates (thats the one I normally use as its the most flexible version).
Posting all c++ binding code would be way too long, but here is the concept :
your vertex buffer must have BindFlags = D3D11_BIND_VERTEX_BUFFER | D3D11_BIND_UNORDERED_ACCESS; //this allows to write wioth compute
you need to create an ID3D11UnorderedAccessView for this buffer (so shader can write to it)
you need the following misc flags : D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS //this allows to write as RWByteAddressBuffer
you then create two dynamic structured buffers (I prefer those over byteaddress, but vertex buffer and structured is not allowed in dx11, so for the write one you need raw instead)
first structured buffer has a stride of ST_Vertex (this is your changeset)
second structured buffer has a stride of 4 (uint, these are the indices)
both structured buffers get an arbitrary element count (normally i use 1024 or 2048), so that will be the maximum amount of vertices you can update in a single pass.
both structured buffers you need an ID3D11ShaderResourceView (shader visible, read only)
Then update process is the following :
write modified vertices and locations in structured buffers (using map discard, if you have to copy less its ok)
attach both structured buffers for read
attach ID3D11UnorderedAccessView for write
set your compute shader
call dispatch
detach ID3D11UnorderedAccessView for write (this is VERY important)
This is a sample compute shader code (I assume you vertex is position only, for simplicity)
cbuffer cbUpdateCount : register(b0)
{
uint updateCount;
};
RWByteAddressBuffer RWVertexPositionBuffer : register(u0);
StructuredBuffer<float3> ModifiedVertexBuffer : register(t0);
StructuredBuffer<uint> ModifiedVertexIndicesBuffer : register(t0);
//this is the stride of your vertex buffer, since here we use float3 it is 12 bytes
#define WRITE_STRIDE 12
[numthreads(64, 1, 1)]
void CS( uint3 tid : SV_DispatchThreadID )
{
//make sure you do not go part element count, as here we runs 64 threads at a time
if (tid.x >= updateCount) { return; }
uint readIndex = tid.x;
uint writeIndex = ModifiedVertexIndicesBuffer[readIndex];
float3 vertex = ModifiedVertexBuffer[readIndex];
//byte address buffers do not understand float, asuint is a binary cast.
RWVertexPositionBuffer.Store3(writeIndex * WRITE_STRIDE, asuint(vertex));
}
For the purposes of this question I'm going to assume you already have a mechanism for selecting a vertex from a list of vertices based upon ray casting or some other picking method and a mechanism for creating a displacement vector detailing how the vertex was moved in model space.
The method you have for updating the buffer is sufficient for anything less than a few hundred vertices, but on large scale models it becomes extremely slow. This is because you're updating everything, rather than the individual vertices you modified.
To fix this, you should only update the vertices you have changed, and to do that you need to create a change set.
In concept, a change set is nothing more than a set of changes made to the data - a list of the vertices that need to be updated. Since we already know which vertices were modified (otherwise we couldn't have manipulated them), we can map in the GPU buffer, go to that vertex specifically, and copy just those vertices into the GPU buffer.
In your vertex modification method, record the index of the vertex that was modified by the user:
//Modify the vertex coordinates based on mouse displacement
pVerticesInfo[SelectedVertexIndex].xfPosition.x += DisplacementVector.x;
pVerticesInfo[SelectedVertexIndex].xfPosition.y += DisplacementVector.y;
pVerticesInfo[SelectedVertexIndex].xfPosition.z += DisplacementVector.z;
//Add the changed vertex to the list of changes.
changedVertices.add(SelectedVertexIndex);
//And update the GPU buffer
UpdateD3DBuffer();
In UpdateD3DBuffer(), do the following:
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
pImmediateContext->Map(_pVertexBuffer, 0, D3D11_MAP_WRITE, 0, &d3dMappedResource);
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
for (int i = 0; i < changedVertices.size(); ++i)
{
pBuffer[changedVertices[i]].xfPosition.x = pVerticesInfo[changedVertices[i]].xfPosition.x;
pBuffer[changedVertices[i]].xfPosition.y = pVerticesInfo[changedVertices[i]].xfPosition.y;
pBuffer[changedVertices[i]].xfPosition.z = pVerticesInfo[changedVertices[i]].xfPosition.z;
}
pImmediateContext->Unmap(_pVertexBuffer, 0);
changedVertices.clear();
This has the effect of only updating the vertices that have changed, rather than all vertices in the model.
This also allows for some more complex manipulations. You can select multiple vertices and move them all as a group, select a whole face and move all the connected vertices, or move entire regions of the model relatively easily, assuming your picking method is capable of handling this.
In addition, if you record the change sets with enough information (the affected vertices and the displacement index), you can fairly easily implement an undo function by simply reversing the displacement vector and reapplying the selected change set.

WebGL - rendering a buffer from memory

I'm a newbie with webgl, is it possible to read a buffer from host memory (RAM) and render it?
Or perhaps to read a buffer from a CUDA buffer and render it. I can do the first in openGL but I'm wondering if I can do that in webGL
Edit: I'm trying to be more precise in what I want to do..
I have a RGBA buffer with an image which might immediately be rendered on a GL context with glDrawPixels. This was produced by a CUDA code (so the buffer was originally on the gpu memory). Is there any way to visualize this into a webGL window?
If you mean some JavaScript buffer by "host memory", then yes: You upload that into a WebGL buffer object and do whatever you want with it then. There's no CUDA interaction for WebGL. And you can't just "grab" any other process's memory for your things. Memory protection is a good thing.
Please be a bit more specific at what you want to do.
Update due to comment.
You think you may have a RGBA buffer, but you don't. Some other process has or had this buffer, but as far as WebGL is concerned, this buffer doesn't exist. It's part of a different process, which memory (and the system makes no difference between CPU and GPU memory there) is (or at least should be) protected from the access of other processes.
Think about it: If your browser (via WebGL) could access random stuff that resides in part of your system's memory that don't belong to it, this would hugely impair security. When it comes to regular OpenGL you are actually able to look into the uninitialized parts of memory to see what was there before. But for WebGL, being an Internet technology, a lot of checking is done, that you can't reach outside your sandbox. Heck, there were even new extensions introduced to make OpenGL more robust against data exfiltration and GPU based attack vectors.
It goes even so far, that in WebGL not only the memory is protected between processes, but also between WebGL instances, which may run in the same process.
Here's an excerpt from http://omino.com/experiments/webgl/OmGl.js which is the meat of a "drawtexture" method that renders a JavaScript array onto a webgl canvas.
A simple instance of its use it at http://omino.com/experiments/webgl/manualTexture.html. (It references some helpers in OmGl.js; hopefully clear enough, though.)
Hope that helps!
var _omglDrawTextureProgGl = null;
var _omglDrawTextureProg = null;
var _omglPosBuffer = null;
/*
* fill the current frame buffer with the texture.
* Right up to the edges.
* you can specify which texture unit, if you like.
* reduce thrashing maybe.
*
* pass in your own
*/
OmGl.drawTexture = function drawTexture(gl,texture,textureUnit,prog)
{
if(!textureUnit)
textureUnit = 0;
if(!prog)
{
if(gl != _omglDrawTextureProgGl || !_omglDrawTextureProg)
{
var dtvs = "attribute vec2 pos;"
+ "varying vec2 unitPos;"
+ "void main() {"
+ " unitPos = (pos.xy + 1.0) / 2.0;"
+ " gl_Position = vec4(pos,0,1);"
+ "}";
var dtfs = "precision mediump float;"
+ "uniform sampler2D uSampler;"
+ "varying vec2 unitPos;"
+ "void main() {"
+ " gl_FragColor = texture2D(uSampler, unitPos);"
+ "}";
_omglDrawTextureProg = OmGl.linkShaderProgram(gl,dtvs,dtfs);
_omglDrawTextureProgGl = gl;
}
prog = _omglDrawTextureProg;
}
gl.useProgram(prog);
// Two triangles which fill the entire canvas.
var posPoints = [
-1,-1,
1,-1,
1,1,
-1,-1,
1,1,
-1,1
];
gl.activeTexture(gl.TEXTURE0 + textureUnit);
gl.bindTexture(gl.TEXTURE_2D, texture);
var z = gl.getUniformLocation(prog, "uSampler");
gl.uniform1i(z, textureUnit);
_omglPosBuffer = OmGl.setAttributeFloats(gl, prog, "pos", 2, posPoints, _omglPosBuffer);
gl.clear(gl.DEPTH_BUFFER_BIT);
gl.drawArrays(gl.TRIANGLES, 0, posPoints.length / 2);
}
(edit -- updated the code at http://omino.com/experiments/webgl/manualTexture.html to be much cleaner.)

Displacement Map UV Mapping?

Summary
I'm trying to apply a displacement map (Height map) to a rather simple object (Hexagonal plane) and I'm having some unexpected results. I am using grayscale and as such, I was under the impression my height map should only be affecting the Z values of my mesh. However, the displacement map I've created stretches the mesh across the X and Y planes. Furthermore, it doesn't seem to use the UV mapping I've created that all other textures are successfully applied to.
Model and UV Map
Here are reference images of my hexagonal mesh and its corresponding UV map in Blender.
Diffuse and Displacement Textures
These are the diffuse and displacement map textures I am applying to my mesh through Three.JS.
Renders
When I render the plane without a displacement map, you can see that the hexagonal plane stays within the lines. However, when I add the displacement map it clearly affects the X and Y positions of the vertices rather than affecting only the Z, expanding the plane well over the lines.
Code
Here's the relevant Three.js code:
// Textures
var diffuseTexture = THREE.ImageUtils.loadTexture('diffuse.png', null, loaded);
var displacementTexture = THREE.ImageUtils.loadTexture('displacement.png', null, loaded);
// Terrain Uniform
var terrainShader = THREE.ShaderTerrain["terrain"];
var uniformsTerrain = THREE.UniformsUtils.clone(terrainShader.uniforms);
//uniformsTerrain["tNormal"].value = null;
//uniformsTerrain["uNormalScale"].value = 1;
uniformsTerrain["tDisplacement"].value = displacementTexture;
uniformsTerrain["uDisplacementScale"].value = 1;
uniformsTerrain[ "tDiffuse1" ].value = diffuseTexture;
//uniformsTerrain[ "tDetail" ].value = null;
uniformsTerrain[ "enableDiffuse1" ].value = true;
//uniformsTerrain[ "enableDiffuse2" ].value = true;
//uniformsTerrain[ "enableSpecular" ].value = true;
//uniformsTerrain[ "uDiffuseColor" ].value.setHex(0xcccccc);
//uniformsTerrain[ "uSpecularColor" ].value.setHex(0xff0000);
//uniformsTerrain[ "uAmbientColor" ].value.setHex(0x0000cc);
//uniformsTerrain[ "uShininess" ].value = 3;
//uniformsTerrain[ "uRepeatOverlay" ].value.set(6, 6);
// Terrain Material
var material = new THREE.ShaderMaterial({
uniforms:uniformsTerrain,
vertexShader:terrainShader.vertexShader,
fragmentShader:terrainShader.fragmentShader,
lights:true,
fog:true
});
// Load Tile
var loader = new THREE.JSONLoader();
loader.load('models/hextile.js', function(g) {
//g.computeFaceNormals();
//g.computeVertexNormals();
g.computeTangents();
g.materials[0] = material;
tile = new THREE.Mesh(g, new THREE.MeshFaceMaterial());
scene.add(tile);
});
Hypothesis
I'm currently juggling three possibilities as to why this could be going wrong:
The UV map is not applying to my displacement map.
I've made the displacement map incorrectly.
I've missed a crucial step in the process that would lock the displacement to Z-only.
And of course, secret option #4 which is none of the above and I just really have no idea what I'm doing. Or any mixture of the aforementioned.
Live Example
You can view a live example here.
If anybody with more knowledge on the subject could guide me I'd be very grateful!
Edit 1: As per suggestion, I've commented out computeFaceNormals() and computeVertexNormals(). While it did make a slight improvement, the mesh is still being warped.
In your terrain material, set wireframe = true, and you will be able to see what is happening.
Your code and textures are basically fine. The problem occurs when you compute vertex normals in the loader callback function.
The computed vertex normals for the outer ring of your geometry point somewhat outward. This is most likely because in computeVertexNormals() they are computed by averaging the face normals of each neighboring face, and the face normals of the "sides" of your model (the black part) are averaged into the vertex normal calculation for those vertices that make up the outer ring of the "cap".
As a result, the outer ring of the "cap" expands outward under the displacement map.
EDIT: Sure enough, straight from your model, the vertex normals of the outer ring point outward. The vertex normals for the inner rings are all parallel. Perhaps Blender is using the same logic to generate vertex normals as computeVertexNormals() does.
The problem is how your object is constructed becuase the displacement happens along the normal vector.
the code is here.
https://github.com/mrdoob/three.js/blob/master/examples/js/ShaderTerrain.js#L348-350
"vec3 dv = texture2D( tDisplacement, uvBase ).xyz;",
This takes a the rgb vector of the displacement texture.
"float df = uDisplacementScale * dv.x + uDisplacementBias;",
this takes only red value of the vector becuase uDisplacementScale is normally 1.0 and uDisplacementBias is 0.0.
"vec3 displacedPosition = normal * df + position;",
This displaces the postion along the normal vector.
so to solve you either update the normals or the shader.

Efficient way to render a bunch of layered textures?

What's the efficient way to render a bunch of layered textures? I have some semitransparent textured rectangles that I position randomly in 3D space and render them from back to front.
Currently I call d3dContext->PSSetShaderResources() to feed the pixel shader with a new texture before each call to d3dContext->DrawIndexed(). I have a feeling that I am copying the texture to the GPU memory before each draw. I might have 10-30 ARGB textures roughly 1024x1024 pixels each and they are associated across 100-200 rectangles that I render on screen. My FPS is OK at 100, but goes pretty bad around 200. I possibly have some inefficiencies elsewhere since this is my first semi-serious D3D code, but I strongly suspect this has to do with copying the textures back and forth. 30*1024*1024*4 is 120MB, which is a bit high for a Metro Style App that should target any Windows 8 device. So putting them all in there might be a stretch, but maybe I could at least cache a few somehow? Any ideas?
*EDIT - Some code snippets added
Constant Buffer
struct ModelViewProjectionConstantBuffer
{
DirectX::XMMATRIX model;
DirectX::XMMATRIX view;
DirectX::XMMATRIX projection;
float opacity;
float3 highlight;
float3 shadow;
float textureTransitionAmount;
};
The Render Method
void RectangleRenderer::Render()
{
// Clear background and depth stencil
const float backgroundColorRGBA[] = { 0.35f, 0.35f, 0.85f, 1.000f };
m_d3dContext->ClearRenderTargetView(
m_renderTargetView.Get(),
backgroundColorRGBA
);
m_d3dContext->ClearDepthStencilView(
m_depthStencilView.Get(),
D3D11_CLEAR_DEPTH,
1.0f,
0
);
// Don't draw anything else until all textures are loaded
if (!m_loadingComplete)
return;
m_d3dContext->OMSetRenderTargets(
1,
m_renderTargetView.GetAddressOf(),
m_depthStencilView.Get()
);
UINT stride = sizeof(BasicVertex);
UINT offset = 0;
// The vertext buffer only has 4 vertices of a rectangle
m_d3dContext->IASetVertexBuffers(
0,
1,
m_vertexBuffer.GetAddressOf(),
&stride,
&offset
);
// The index buffer only has 4 vertices
m_d3dContext->IASetIndexBuffer(
m_indexBuffer.Get(),
DXGI_FORMAT_R16_UINT,
0
);
m_d3dContext->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
m_d3dContext->IASetInputLayout(m_inputLayout.Get());
FLOAT blendFactors[4] = { 0, };
m_d3dContext->OMSetBlendState(m_blendState.Get(), blendFactors, 0xffffffff);
m_d3dContext->VSSetShader(
m_vertexShader.Get(),
nullptr,
0
);
m_d3dContext->PSSetShader(
m_pixelShader.Get(),
nullptr,
0
);
m_d3dContext->PSSetSamplers(
0, // starting at the first sampler slot
1, // set one sampler binding
m_sampler.GetAddressOf()
);
// number of rectangles is in the 100-200 range
for (int i = 0; i < m_rectangles.size(); i++)
{
// start rendering from the farthest rectangle
int j = (i + m_farthestRectangle) % m_rectangles.size();
m_vsConstantBufferData.model = m_rectangles[j].transform;
m_vsConstantBufferData.opacity = m_rectangles[j].Opacity;
m_vsConstantBufferData.highlight = m_rectangles[j].Highlight;
m_vsConstantBufferData.shadow = m_rectangles[j].Shadow;
m_vsConstantBufferData.textureTransitionAmount = m_rectangles[j].textureTransitionAmount;
m_d3dContext->UpdateSubresource(
m_vsConstantBuffer.Get(),
0,
NULL,
&m_vsConstantBufferData,
0,
0
);
m_d3dContext->VSSetConstantBuffers(
0,
1,
m_vsConstantBuffer.GetAddressOf()
);
m_d3dContext->PSSetConstantBuffers(
0,
1,
m_vsConstantBuffer.GetAddressOf()
);
auto a = m_rectangles[j].textureId;
auto b = m_rectangles[j].targetTextureId;
auto srv1 = m_textures[m_rectangles[j].textureId].textureSRV.GetAddressOf();
auto srv2 = m_textures[m_rectangles[j].targetTextureId].textureSRV.GetAddressOf();
ID3D11ShaderResourceView* srvs[2];
srvs[0] = *srv1;
srvs[1] = *srv2;
m_d3dContext->PSSetShaderResources(
0, // starting at the first shader resource slot
2, // set one shader resource binding
srvs
);
m_d3dContext->DrawIndexed(
m_indexCount,
0,
0
);
}
}
Pixel Shader
cbuffer ModelViewProjectionConstantBuffer : register(b0)
{
matrix model;
matrix view;
matrix projection;
float opacity;
float3 highlight;
float3 shadow;
float textureTransitionAmount;
};
Texture2D baseTexture : register(t0);
Texture2D targetTexture : register(t1);
SamplerState simpleSampler : register(s0);
struct PixelShaderInput
{
float4 pos : SV_POSITION;
float3 norm : NORMAL;
float2 tex : TEXCOORD0;
};
float4 main(PixelShaderInput input) : SV_TARGET
{
float3 lightDirection = normalize(float3(0, 0, -1));
float4 baseTexelColor = baseTexture.Sample(simpleSampler, input.tex);
float4 targetTexelColor = targetTexture.Sample(simpleSampler, input.tex);
float4 texelColor = lerp(baseTexelColor, targetTexelColor, textureTransitionAmount);
float4 shadedColor;
shadedColor.rgb = lerp(shadow.rgb, highlight.rgb, texelColor.r);
shadedColor.a = texelColor.a * opacity;
return shadedColor;
}
As Jeremiah has suggested, you are not probably moving texture from CPU to GPU for each frame as you would have to create new texture for each frame or using "UpdateSubresource" or "Map/UnMap" methods.
I don't think that instancing is going to help for this specific case, as the number of polygons is extremely low (I would start to worry with several millions of polygons). It is more likely that your application is going to be bandwidth/fillrate limited, as your are performing lots of texture sampling/blending (It depends on tecture fillrate, pixel fillrate and the nunber of ROP on your GPU).
In order to achieve better performance, It is highly recommended to:
Make sure that all your textures have all mipmaps generated. If they
don't have any mipmaps, It will hurt a lot the cache of the GPU. (I also assume that you are using texture.Sample method in HLSL, and not texture.SampleLevel or variants)
Use Direct3D11 Block Compressed texture on the GPU, by using a tool
like texconv.exe or preferably the sample from "Windows DirectX 11
Texture Converter".
On a side note, you will probably get more attention for this kind of question on https://gamedev.stackexchange.com/.
I don't think you are doing any copying back and forth from GPU to system memory. You usually have to explicitly do that a call to Map(...), or by blitting to a texture you created in system memory.
One issue, is you are making a DrawIndexed(...) call for each texture. GPUs work most efficiently if you send it a bunch of work to do by batching. One way to accomplish this is to set n-amount of textures to PSSetShaderResources(i, ...), and do a DrawIndexedInstanced(...). Your shader code would then read each of the shader resources and draw them. I do this in my C++ DirectCanvas code here (SpriteInstanced.cpp). This can make for a lot of code, but the result is very efficient (I even do the matrix ops in the shader for more speed).
One other, maybe a lot easier way, is to give the DirectXTK spritebatch a shot.
I used it here in this project...only for a simple blit but it may be a good start to see the minor amount of setup needed to use the spritebatch.
Also, if possible, try to "atlas" your texture. For instance, try to fit as many "images" in a texture as possible and blit from them vs having a single texture for each.

View GPU Memory / View Texture2D memory space for debugging

I've got a question about a PixelShader I am trying to implement, and what I currently do (this is just for debugging, and trying to figure stuff out):
int3 loc;
loc.x = (int)(In.TextureUV.x * resolution_XY.x);
loc.y = (int)(In.TextureUV.x * resolution_XY.x);
loc.z = 0;
float4 r = g_txDiffuse.Load(loc);
return float4(r.x, r.y, r.z, 1);
The point is, this is always 0,0,0,1
The texture buffer is created:
D3D11_TEXTURE2D_DESC tDesc;
tDesc.Height = 480;
tDesc.Width = 640;
tDesc.Usage = D3D11_USAGE_DYNAMIC;
tDesc.MipLevels = 1;
tDesc.ArraySize = 1;
tDesc.SampleDesc.Count = 1;
tDesc.SampleDesc.Quality = 0;
tDesc.Format = DXGI_FORMAT_R8_UINT;
tDesc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
tDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
tDesc.MiscFlags = 0;
V_RETURN(pd3dDevice->CreateTexture2D(&tDesc, NULL, &g_pCurrentImage));
I upload the texture (which should be a live display at the end) via:
D3D11_MAPPED_SUBRESOURCE resource;
pd3dImmediateContext->Map(g_pCurrentImage, 0, D3D11_MAP_WRITE_DISCARD, 0, &resource);
memcpy( resource.pData, g_Images.GetData(), g_Images.GetDataSize() );
pd3dImmediateContext->Unmap( g_pCurrentImage, 0 );
I've checked the resource.pData, the data in there is a valid 8bit monochrome image. I made sure the data coming from the camera is 8bit monochrome 640x480.
There's a few things I don't fully understand:
if I run the Map / memcpy / Unmap routine in every frame, the driver will ultimately crash, the system will be unresponsive. Is there a different way to update a complete texture every frame which should be done?
the texture I uploaded is 8bit, why is the Texture2D.load() a float4 return? Do I have to use a different method to access the texture data? I tried to .sample it, but that didn't work either. Would I have to use a int buffer or something instead?
is there a way to debug the GPU memory, to check if the memcpy worked in the first place?
The Map, memcpy, Unmap really ought not to crash unless2 you are trying to copy too much data into the texture. It would be interesting to know what "GetDataSize()" returns. Does it equal 307,200? If its more than that then there lies your problem.
Texture2D returns a float4 because thats what you've asked for. If you write float r = g_txDiffuse.Load( ... ). The 8-bits get extended to a normalised float as part of the load process. Are you sure, btw, that your calculation of "loc" is correct because as you have it now loc.x and loc.y will always be the same.
You can debug whats going on with DirectX using PIX. Its a great tool and I highly recommend you familiarise yourself with it.

Resources