Hardware Instancing for Primitives

I have read about hardware instancing and want to apply it to primitives instead of models.
My question is:
I want to draw circles built from vertices (6300 vertices in total for all circles).
Do I have to set a transform matrix for every vertex?
Thank you in advance.

This is what MSDN has to say on GraphicsDevice.DrawInstancedPrimitives
This method provides a mechanism for gathering vertex shader input data from different streams at different frequencies (typically by interpreting one data stream as a per-instance world transform); this approach requires a custom shader to interpret the input data. For more information, see DrawInstancedPrimitives in XNA Game Studio 4.0.
Returning to your question:
Do I have to set a transform matrix for every vertex?
No, you don't. The instance (not each vertex) will have a transform that you bind to your shader prior to rendering. A model/instance will have a transform describing its position in the world. Each vertex is essentially a relative coordinate describing an offset from the model's origin.
E.g., courtesy of Shawn Hargreaves' blog:
float4x4 WorldViewProj;
void InstancingVertexShader(inout float4 position : POSITION0,
                            in float4x4 world : TEXCOORD0)
{
    position = mul(mul(position, transpose(world)), WorldViewProj);
}
...and your XNA code, again courtesy of Shawn Hargreaves' blog:
instanceVertexBuffer.SetData(instanceTransformMatrices, 0, numInstances, SetDataOptions.Discard);
graphicsDevice.SetVertexBuffers(modelVertexBuffer, new VertexBufferBinding(instanceVertexBuffer, 0, 1));
graphicsDevice.Indices = indexBuffer;
instancingEffect.CurrentTechnique.Passes[0].Apply();
graphicsDevice.DrawInstancedPrimitives(PrimitiveType.TriangleList, 0, 0,
                                       modelVertexBuffer.VertexCount, 0,
                                       indexBuffer.IndexCount / 3,
                                       numInstances);
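To make the "one transform per instance, not per vertex" point concrete for the circle case, here is a minimal CPU-side sketch in C++ of what the instancing shader does for each (vertex, instance) pair. The names `CircleInstance`, `makeUnitCircle` and `placeVertex` are hypothetical, and a center plus radius stands in for the full per-instance matrix; it is an illustration, not the XNA API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Hypothetical per-instance data: one entry per circle, not per vertex.
struct CircleInstance { Vec2 center; float radius; };

// Shared geometry: points on a unit circle around the origin.
// This is stored once, however many circles are drawn.
std::vector<Vec2> makeUnitCircle(std::size_t segments) {
    assert(segments >= 3);
    const float twoPi = 6.28318530718f;
    std::vector<Vec2> verts;
    verts.reserve(segments);
    for (std::size_t i = 0; i < segments; ++i) {
        float a = twoPi * static_cast<float>(i) / static_cast<float>(segments);
        verts.push_back({std::cos(a), std::sin(a)});
    }
    return verts;
}

// What the vertex shader computes: a shared local vertex is placed
// in the world using the data of the instance being drawn.
Vec2 placeVertex(Vec2 local, const CircleInstance& inst) {
    return {inst.center.x + local.x * inst.radius,
            inst.center.y + local.y * inst.radius};
}
```

With this layout the vertex buffer holds one circle and the instance buffer holds one small record per circle, so the amount of transform data grows with the number of circles rather than with the number of vertices.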

Related

DirectX and DirectXTK translation limits

I use DirectX Toolkit to display a 3D model, following the 'Rendering the model' tutorial, and my pyramid is displayed.
When trying to transform the object, the scaling and rotation work well, but I'm not sure how to move the object (translate) around. Basically I'm looking for an algorithm that determines, given the current camera position, focus, viewport and the rendered model (for which the DirectX toolkit gives me the bounding box, so its "size"), the minimum and maximum values for XYZ translation so the object remains visible.
The bounding box is always the same, no matter the viewport size, so how do I compare its size against my viewport?
Please excuse my newbiness, I'm not a 3D developer, at least not yet.
The "Simple Rendering" example which draws a triangle:
Matrix proj = Matrix::CreateScale( 2.f/float(backBufferWidth),
-2.f/float(backBufferHeight), 1.f)
* Matrix::CreateTranslation( -1.f, 1.f, 0.f );
m_effect->SetProjection(proj);
says that the normalized triangle size is [1,1,1] but here normalized values do not work.

TL;DR: To move your model around the world, create a matrix for the translation and set it as the world matrix with SetWorld.
Matrix world = Matrix::CreateTranslation( 2.f, 1.f, 3.f);
m_effect->SetWorld(world);
// Also be sure you have called SetView and SetProjection for the 3D camera setup
//covered in the 3D shapes / Rendering a model tutorial
You should start with a basic review of 3D transformations, in particular the world -> view -> projection transformation pipeline.
The world transformation performs the affine transformation to get the model you are rendering into its 'world' position (a.k.a. 'local coordinates to world coordinates' transformation).
The view transformation performs the transformation to get world positions into the camera's point of view, i.e. its position and direction (a.k.a. 'world coordinates to view coordinates' transformation).
The projection transformation performs the transformation to get the view positions into the canonical "-1 to 1" range that the actual hardware uses (in Direct3D, depth uses 0 to 1), including any perspective projection (a.k.a. 'view coordinates to clip coordinates' transformation).
The hardware itself performs the final step of converting the "-1 to 1" to pixel locations in the render target based on the Direct3D SetViewport information (a.k.a. 'clip' coordinates to pixel coordinates transformation).
This Direct3D 9 era article is a bit dated, but it covers the overall idea well.
In the DirectX Tool Kit BasicEffect system, there are distinct methods for each of these matrices: SetWorld, SetView, and SetProjection. There is also a helper if you want to set all three at once: SetMatrices.
The simple rendering tutorial is concerned with the simplest form of rendering, 2D rendering, where you want the coordinates you provide to be in natural 'pixel coordinates':
Matrix proj = Matrix::CreateScale( 2.f/float(backBufferWidth),
-2.f/float(backBufferHeight), 1.f)
* Matrix::CreateTranslation( -1.f, 1.f, 0.f );
m_effect->SetProjection(proj);
The purpose of this matrix is to basically 'undo' what the SetViewport will do so that you can think in simple pixel coordinates. It's not suitable for 3D models.
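As a rough check of what that 2D matrix does, the scale followed by the translation can be written out for a single point. This is a plain C++ sketch with a hypothetical `pixelToClip` helper, not DirectX Tool Kit code.

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// Same effect as CreateScale(2/w, -2/h, 1) * CreateTranslation(-1, 1, 0)
// on the x and y of a point: scale first, then translate.
Vec2 pixelToClip(Vec2 pixel, float backBufferWidth, float backBufferHeight) {
    assert(backBufferWidth > 0.0f && backBufferHeight > 0.0f);
    return {pixel.x * (2.0f / backBufferWidth) - 1.0f,
            pixel.y * (-2.0f / backBufferHeight) + 1.0f};
}
```

The top-left pixel (0, 0) maps to (-1, 1) and the bottom-right corner (width, height) maps to (1, -1), which is exactly the range the viewport transform then maps back to pixels.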
In the 3D shapes tutorial I cover the basic camera model, but I leave the world matrix as the identity so the shape is sitting at the world origin.
m_view = Matrix::CreateLookAt(Vector3(2.f, 2.f, 2.f),
Vector3::Zero, Vector3::UnitY);
m_proj = Matrix::CreatePerspectiveFieldOfView(XM_PI / 4.f,
float(backBufferWidth) / float(backBufferHeight), 0.1f, 10.f);
In the Rendering a model tutorial, I also leave the world matrix as identity. I get into the basics of this in Basic game math tutorial.
One of the nice properties of affine transformations is that you can perform them all at once by transforming by the concatenation of the individual transforms. Point p transformed by matrix W, then transformed by matrix V, then transformed by matrix P is the same as point p transformed by matrix W * V * P.
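The concatenation property can be checked numerically. The following is a minimal, self-contained C++ sketch using the row-vector convention (v' = v * M); the types and helpers (`Vec4`, `Mat4`, `transform`, `multiply`) are hypothetical stand-ins for SimpleMath, and the three matrices are simple placeholders rather than a real camera setup.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;

Mat4 identity() {
    Mat4 m{};  // all zeros
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

// Row-vector convention: the translation lives in the last row.
Mat4 translation(float x, float y, float z) {
    Mat4 m = identity();
    m[3][0] = x; m[3][1] = y; m[3][2] = z;
    return m;
}

Mat4 uniformScale(float s) {
    Mat4 m = identity();
    m[0][0] = m[1][1] = m[2][2] = s;
    return m;
}

// v' = v * M
Vec4 transform(const Vec4& v, const Mat4& m) {
    Vec4 r{};
    for (int c = 0; c < 4; ++c)
        for (int k = 0; k < 4; ++k)
            r[c] += v[k] * m[k][c];
    return r;
}

// Matrix product a * b: applying the result equals applying a, then b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}
```

Transforming a point by W, then V, then P gives the same result as transforming it once by `multiply(multiply(W, V), P)`, which is why the combined matrix can be computed once per object instead of three transforms per vertex.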

HLSL vertex shader

I've been studying shaders in HLSL for an XNA project (so no DX10-DX11), but almost all the resources I found were tutorials on effects where most of the work is done in the pixel shader. For instance, in lighting the vertex shader is used only to pass normals and similar data to the pixel shader.
I'd like to make some effects based on the vertex shader rather than the pixel shader, like deformation for instance. Could someone suggest a book or a website? Even the bare effect names would be useful, since then I could google them.

A lot of lighting, etc. is done in the pixel shader because the resulting image quality will be much better.
Imagine a sphere that is created by subdividing a cube or icosahedron. If lighting calculations are done in the vertex shader, the results are computed only at the vertices and interpolated across each face, which can lead to a flat or faceted appearance.
Things like blending and morphing are done in the vertex shader because that's where you can manipulate the vertices.
For example:
matrix World;
matrix View;
matrix Projection;
float WindStrength;
float3 WindDirection;
struct VertexPositionColor
{
    float4 Position : POSITION0;
    float4 Color : COLOR0;
};
VertexPositionColor VS(VertexPositionColor input)
{
    VertexPositionColor output;
    // Move to world space first so the wind is applied in world units.
    float4 worldPosition = mul(input.Position, World);
    // The offset grows with height, so vertices at Y = 0 stay put.
    worldPosition.xyz += WindDirection * WindStrength * worldPosition.y;
    output.Position = mul(worldPosition, mul(View, Projection));
    output.Color = input.Color;
    return output;
}
(Pseudo-ish code since I'm writing this in the SO post editor.)
In this case, I'm offsetting vertices that are "high" on the Y axis with a wind direction and strength. If I use this when rendering grass, for instance, the tops of the blades will lean in the direction of the wind, while the vertices that are closer to the ground (ideally with a Y of zero) will not move at all. The math here should be tweaked a bit to take into account really tall things that would cause unacceptable large changes, and the wind should not be uniformly applied to all blades, but it should be clear that here the vertex shader is modifying the mesh in a non-uniform way to get an interesting effect.
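The same wind offset, written as a small CPU-side C++ function, makes the height dependence easy to check. `applyWind` is a hypothetical helper that mirrors the shader line `worldPosition += WindDirection * WindStrength * worldPosition.y`; it is a sketch of the math only.

```cpp
struct Vec3 { float x, y, z; };

// Offsets a world-space position along the wind direction,
// scaled by the wind strength and the height (Y) of the vertex.
Vec3 applyWind(Vec3 worldPosition, Vec3 windDirection, float windStrength) {
    float amount = windStrength * worldPosition.y;
    return {worldPosition.x + windDirection.x * amount,
            worldPosition.y + windDirection.y * amount,
            worldPosition.z + windDirection.z * amount};
}
```

A vertex at Y = 0 is returned unchanged, while a vertex at Y = 2 with a strength of 0.5 moves one full unit along the wind direction.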
No matter the effect you are trying to achieve - morphing, billboards (so the item you're drawing always faces the camera), etc., you're going to wind up passing some parameters into the VS that are then selectively applied to vertices as they pass through the pipeline.
A fairly trivial example would be "inflating" a model into a sphere, based on some parameter.
Pseudocode again,
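matrix World;
matrix View;
matrix Projection;
float LerpFactor;
VertexPositionColor VS(VertexPositionColor input)
{
    VertexPositionColor output;
    // Direction from the model origin to the vertex, i.e. the matching point on the unit sphere.
    float3 normal = normalize(input.Position.xyz);
    float3 position = lerp(input.Position.xyz, normal, LerpFactor);
    matrix wvp = mul(mul(World, View), Projection);
    output.Position = mul(float4(position, 1.0), wvp);
    ....
}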
By stepping the uniform LerpFactor from 0 to 1 across a number of frames, your mesh (ideally a convex polyhedron) will gradually morph from its original shape to a sphere. Of course, you could include more explicit morph targets in your vertex declaration and morph between two model shapes, collapse it to a less complex version of a model, open the lid on a box (or completely unfold it), etc. The possibilities are endless.
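For reference, the per-vertex math of the "inflate" effect looks like this on the CPU. This is a C++ sketch with hypothetical helpers `lerp3` and `inflate`, assuming the sphere is the unit sphere around the model origin as in the pseudocode above.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Linear blend, the same formula as HLSL lerp(a, b, t) = a + t * (b - a).
Vec3 lerp3(Vec3 a, Vec3 b, float t) {
    return {a.x + t * (b.x - a.x),
            a.y + t * (b.y - a.y),
            a.z + t * (b.z - a.z)};
}

// Moves a vertex toward the unit sphere centered on the model origin.
Vec3 inflate(Vec3 position, float lerpFactor) {
    float len = std::sqrt(position.x * position.x +
                          position.y * position.y +
                          position.z * position.z);
    if (len == 0.0f) return position;  // a vertex at the origin has no direction
    Vec3 onSphere{position.x / len, position.y / len, position.z / len};
    return lerp3(position, onSphere, lerpFactor);
}
```

With a factor of 0 the vertex stays where it is; with a factor of 1 it lands on the sphere, so (3, 0, 4) ends up at roughly (0.6, 0, 0.8).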
For more information, this page has some sample code on generating and using morph targets on the GPU.
If you need some good search terms, look for "xna bones," "blendweight" and "morph targets."

Drawing Multiple 2d shapes in DirectX

I completed a tutorial on rendering 2d triangles in directx. Now, I want to use my knowledge of rendering a single triangle to render multiple triangles, or for that matter multiple objects on screen.
Should I create a list/stack/vector of vertexbuffers and input layouts and then draw each object? Or is there a better approach to this?
My process would be:
Setup directx, including vertex and pixel shaders
Create vertex buffers for each shape that has to be drawn on the screen and store them in an array.
Draw them to the render target (each frame)
Present the render target(each frame)
Please assume very rudimentary knowledge of DirectX and graphics programming in general when answering.

You don't need to create a vertex buffer for each shape. You can create one vertex buffer that stores the vertices of all the triangles and one index buffer that stores the indices of all the shapes, then draw them using the index buffer.
I am not familiar with DX11, so I list the Direct3D 9 links for your reference; I think the concept is the same, just with some API changes.
Index Buffers(Direct3D 9)
Rendering from Vertex and Index buffers
If the triangles all have the same shape and differ only in position or color, consider geometry instancing, a powerful way to render multiple copies of the same geometry.
Geometry Instancing
Efficiently Drawing Multiple Instances of Geometry(D3D9)
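The "one vertex buffer, one index buffer" layout from this answer can be sketched on the CPU side as follows. This is plain C++ with hypothetical `Batch` and `appendShape` names; it only builds the arrays you would then upload, and it assumes 32-bit indices.

```cpp
#include <cstdint>
#include <vector>

struct Vertex { float x, y; };

struct Batch {
    std::vector<Vertex> vertices;        // contents of the single vertex buffer
    std::vector<std::uint32_t> indices;  // contents of the single index buffer
};

// Appends one shape. The shape's indices are local (starting at 0) and are
// rebased by the number of vertices already stored in the batch.
void appendShape(Batch& batch,
                 const std::vector<Vertex>& shapeVertices,
                 const std::vector<std::uint32_t>& shapeIndices) {
    const auto base = static_cast<std::uint32_t>(batch.vertices.size());
    batch.vertices.insert(batch.vertices.end(),
                          shapeVertices.begin(), shapeVertices.end());
    for (std::uint32_t index : shapeIndices)
        batch.indices.push_back(base + index);
}
```

After appending a triangle and a quad, the batch holds 7 vertices and 9 indices, and everything can be submitted with a single indexed draw instead of one draw per shape.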

I don't know much about DirectX, but the general rule when rendering on the GPU is to use separate vertex and index buffers for every mesh.
That said, nothing stops you from using a single vertex buffer with many index buffers; in fact you may get some performance gains, especially for small meshes...

You need just one vertex buffer for this, and then you batch the draws.
Here is what you can do: keep an array/vector holding the information for each triangle, let's say (pseudo-code)
struct TriangleInfo{
..... texture;
vect2 pos;
vect2 dimension;
float rot;
}
Then, in your draw method:
for (int i = 0; i < vector.size(); i++) {
    TriangleInfo tInfo = vector[i];
    // Scale, then rotate, then translate.
    matrix worldMatrix = Transpose(matrix(tInfo.dimension) * matrix(tInfo.rot) * matrix(tInfo.pos));
    shaderParameters.worldMatrix = worldMatrix; // goes into the constant buffer
    ..
    ..
    dctx->PSSetShaderResources(0, 1, &tInfo.texture);
    dctx->Draw(4, 0); // Draw(VertexCount, StartVertexLocation)
}
then in your vertex shader:
cbuffer cbParameters : register( b0 ) {
float4x4 worldMatrix;
};
VOut main(float4 position : POSITION, float4 texCoord : TEXCOORD)
{
....
output.position = mul(position,worldMatrix);
...
}
Remember this is all pseudo-code, but it should give you the idea. There is a problem, though, if you are planning to draw a lot of triangles, say 1000: issuing one Draw call per triangle becomes very expensive for large counts. In that case, either use DrawIndexed and write the transformed vertex positions of each triangle into the buffer yourself, or use DrawInstanced, which is simpler, so that all the information is sent in a single draw call.

iOS GLSL. Is There A Way To Create An Image Histogram Using a GLSL Shader?

Elsewhere on StackOverflow a question was asked regarding a depthbuffer histogram - Create depth buffer histogram texture with GLSL.
I am writing an iOS image-processing app and am intrigued by this question but unclear on the answer provided. So, is it possible to create an image histogram using the GPU via GLSL?

Yes, there is, although it's a little more challenging on iOS than you'd think. This is a red histogram generated and plotted entirely on the GPU, running against a live video feed:
Tommy's suggestion in the question you link is a great starting point, as is this paper by Scheuermann and Hensley. What's suggested there is to use scattering to build up a histogram for color channels in the image. Scattering is a process where you pass in a grid of points to your vertex shader, and then have that shader read the color at that point. The value of the desired color channel at that point is then written out as the X coordinate (with 0 for the Y and Z coordinates). Your fragment shader then draws out a translucent, 1-pixel-wide point at that coordinate in your target.
That target is a 1-pixel-tall, 256-pixel-wide image, with each width position representing one color bin. By writing out a point with a low alpha channel (or low RGB values) and then using additive blending, you can accumulate a higher value for each bin based on the number of times that specific color value occurs in the image. These histogram pixels can then be read for later processing.
The major problem with doing this in shaders on iOS is that, despite reports to the contrary, Apple clearly states that texture reads in a vertex shader will not work on iOS. I tried this with all of my iOS 5.0 devices, and none of them were able to perform texture reads in a vertex shader (the screen just goes black, with no GL errors being thrown).
To work around this, I found that I could read the raw pixels of my input image (via glReadPixels() or the faster texture caches) and pass those bytes in as vertex data with a GL_UNSIGNED_BYTE type. The following code accomplishes this:
glReadPixels(0, 0, inputTextureSize.width, inputTextureSize.height, GL_RGBA, GL_UNSIGNED_BYTE, vertexSamplingCoordinates);
[self setFilterFBO];
[filterProgram use];
glClearColor(0.0, 0.0, 0.0, 1.0);
glClear(GL_COLOR_BUFFER_BIT);
glBlendEquation(GL_FUNC_ADD);
glBlendFunc(GL_ONE, GL_ONE);
glEnable(GL_BLEND);
// The stride is the byte distance between sampled pixels: every _downsamplingFactor-th RGBA pixel.
glVertexAttribPointer(filterPositionAttribute, 4, GL_UNSIGNED_BYTE, 0, _downsamplingFactor * 4, vertexSamplingCoordinates);
glDrawArrays(GL_POINTS, 0, inputTextureSize.width * inputTextureSize.height / (CGFloat)_downsamplingFactor);
glDisable(GL_BLEND);
In the above code, you'll notice that I employ a stride to only sample a fraction of the image pixels. This is because the lowest opacity or greyscale level you can write out is 1/256, meaning that each bin becomes maxed out once more than 255 pixels in that image have that color value. Therefore, I had to reduce the number of pixels processed in order to bring the range of the histogram within this limited window. I'm looking for a way to extend this dynamic range.
The shaders used to do this are as follows, starting with the vertex shader:
attribute vec4 position;
void main()
{
gl_Position = vec4(-1.0 + (position.x * 0.0078125), 0.0, 0.0, 1.0);
gl_PointSize = 1.0;
}
and finishing with the fragment shader:
uniform highp float scalingFactor;
void main()
{
gl_FragColor = vec4(scalingFactor);
}
A working implementation of this can be found in my open source GPUImage framework. Grab and run the FilterShowcase example to see the histogram analysis and plotting for yourself.
There are some performance issues with this implementation, but it was the only way I could think of doing this on-GPU on iOS. I'm open to other suggestions.
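To see why the limited range forces downsampling, the scatter-and-blend pass can be emulated on the CPU. The C++ sketch below makes two assumptions that are not in the original code: each drawn point adds exactly one 8-bit level to its bin, and the 256x1 target stores 8 bits per channel. The names `binClipX` and `accumulate` are hypothetical; the constant 0.0078125 is the one used in the vertex shader above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clip-space X written by the vertex shader for a channel value 0..255.
// 0.0078125 is 1/128, so the 256 values spread over [-1, 1).
float binClipX(std::uint8_t value) {
    return -1.0f + static_cast<float>(value) * 0.0078125f;
}

// Emulates additive blending into an 8-bit target: each sampled pixel adds
// one level to its bin, and a bin clamps once it reaches full intensity.
std::array<std::uint8_t, 256> accumulate(const std::vector<std::uint8_t>& channel,
                                         std::size_t downsamplingFactor) {
    assert(downsamplingFactor >= 1);
    std::array<std::uint8_t, 256> bins{};
    for (std::size_t i = 0; i < channel.size(); i += downsamplingFactor) {
        std::uint8_t& bin = bins[channel[i]];
        if (bin < 255) ++bin;  // saturates instead of wrapping around
    }
    return bins;
}
```

With 1000 pixels of the same value, sampling every pixel saturates the bin at 255, while sampling every 8th pixel leaves it at 125, inside the representable range.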

Yes, it is. It's not obviously the best approach, but it is the best one available on iOS, since OpenCL is not supported. You'll lose elegance, and your code will probably not be as straightforward, but almost all OpenCL features can be achieved with shaders.
If it helps, DirectX11 comes with a FFT example for compute shaders. See DX11 August SDK Release Notes.

OpenGL ES 2 (iOS) Morph / Animate between two set of vertexes

I have two sets of vertexes used as a line strip:
Vertexes1
Vertexes2
It's important to know that these vertexes have previously unknown values, as they are dynamic.
I want to make an animated transition (morph) between these two. I have come up with two different ways of doing this:
Option 1:
Set a Time uniform in the vertex shader, that goes from 0 - 1, where I can do something like this:
// Inside main() in the vertex shader
float originX = Position.x;
float destinationX = DestinationVertexPosition.x;
float interpolatedX = originX + (destinationX - originX) * Time;
gl_Position.x = interpolatedX;
As you probably see, this has one problem: How do I get the "DestinationVertexPosition" in there?
Option 2:
Make the interpolation calculation outside the vertex shader, where I loop through each vertex and create a third vertex set for the interpolated values, and use that to render:
// Pre render
// Use this vertex set to render
InterpolatedVertexes
for (unsigned int i = 0; i < vertexCount; i++) {
float originX = Vertexes1[i].x;
float destinationX = Vertexes2[i].x;
float interpolatedX = originX + (destinationX - originX) * Time;
InterpolatedVertexes[i].x = interpolatedX;
}
I have highly simplified these two code snippets, just to make the idea clear.
Now, from the two options, I feel like the first one is definitely better in terms of performance, given stuff happens at the shader level, AND I don't have to create a new set of vertexes each time the "Time" is updated.
So, now that the introduction to the problem has been covered, I would appreciate any of the following three things:
A discussion of better ways of achieving the desired results in OpenGL ES 2 (iOS).
A discussion about how Option 1 could be implemented properly, either by providing the "DestinationVertexPosition" or by modifying the idea somehow, to better achieve the same result.
A discussion about how Option 2 could be implemented.

In ES 2 you specify whatever attributes you like, so there's no problem with specifying attributes for both origin and destination and doing the linear interpolation between them in the vertex shader. However, you really shouldn't do it component by component as your code suggests: GPUs are vector processors, and the GLSL mix function will do the linear blend you want. So e.g. (with obvious inefficiencies and assumptions)
int sourceAttribute = glGetAttribLocation(shader, "sourceVertex");
glEnableVertexAttribArray(sourceAttribute);
glVertexAttribPointer(sourceAttribute, 3, GL_FLOAT, GL_FALSE, 0, sourceLocations);
int destAttribute = glGetAttribLocation(shader, "destVertex");
glEnableVertexAttribArray(destAttribute);
glVertexAttribPointer(destAttribute, 3, GL_FLOAT, GL_FALSE, 0, destLocations);
And:
gl_Position = vec4(mix(sourceVertex, destVertex, Time), 1.0);

Your two options here have a trade off: supply twice the geometry once and interpolate between that, or supply only one set of geometry, but do so for each frame. You have to weigh geometry size vs. upload bandwidth.
Given my experience with iOS devices, I'd highly recommend option 1. Uploading new geometry on every frame can be extremely expensive on these devices.
If the vertices are constant, you can upload them once into one or two vertex buffer objects (VBOs) with the GL_STATIC_DRAW flag set. The PowerVR SGX series has hardware optimizations for dealing with static VBOs, so they are very fast to work with after the initial upload.
As far as how to upload two sets of vertices for use in a single shader, geometry is just another input attribute for your shader. You could have one, two, or more sets of vertices fed into a single vertex shader. You just define the attributes using code like
attribute vec3 startingPosition;
attribute vec3 endingPosition;
and interpolate between them using code like
vec3 finalPosition = startingPosition * (1.0 - fractionalProgress) + endingPosition * fractionalProgress;
Edit: Tommy points out the mix() operation, which I'd forgotten about and is a better way to do the above vertex interpolation.
In order to inform your shader program as to where to get the second set of vertices, you'd use pretty much the same glVertexAttribPointer() call for the second set of geometry as the first, only pointing to that VBO and attribute.
Note that you can perform this calculation as a vector, rather than breaking out all three components individually. This doesn't get you much with a highp default precision on current PowerVR SGX chips, but could be faster on future ones than doing this one component at a time.
You might also want to look into other techniques used for vertex skinning, because there might be other ways of animating vertices that don't require two full sets of vertices to be uploaded.
The one case that I've heard where option 2 (uploading new geometry on each frame) might be preferable is in specific cases where using the Accelerate framework to do vector manipulation of the geometry ends up being faster than doing the skinning on-GPU. I remember the Unity folks were talking about this once, but I can't remember if it was for really small or really large sets of geometry. Option 1 has been faster in all the cases I've worked with myself.
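For comparison, Option 2 from the question (interpolating on the CPU and re-uploading the result) could look roughly like this. It is a C++ sketch with hypothetical names; the `mix` helper uses the same formula as the GLSL function, and the output vector is reused between frames so that no new vertex set is allocated each time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Same formula as GLSL mix(a, b, t): a * (1 - t) + b * t.
Vec3 mix(Vec3 a, Vec3 b, float t) {
    return {a.x * (1.0f - t) + b.x * t,
            a.y * (1.0f - t) + b.y * t,
            a.z * (1.0f - t) + b.z * t};
}

// Fills 'out' with the blend of the two vertex sets for the given time (0..1).
// 'out' is resized once and then reused, ready to be uploaded for rendering.
void interpolateVertices(const std::vector<Vec3>& from,
                         const std::vector<Vec3>& to,
                         float time,
                         std::vector<Vec3>& out) {
    assert(from.size() == to.size());
    out.resize(from.size());
    for (std::size_t i = 0; i < from.size(); ++i)
        out[i] = mix(from[i], to[i], time);
}
```

The loop itself is cheap; the cost the answer warns about is uploading `out` to the GPU every frame, which Option 1 avoids by keeping both vertex sets on the GPU and changing only the Time uniform.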
