I'm trying to use XNA 4.0 to render a dense point cloud from a Kinect. The only way I know is to render each point as a triangle primitive. It works fine for a small set of points; however, the maximum number of primitives I can draw in one call is 65535, and I want to draw a dense 640*480 depth image. Any suggestions on how to do this? Thanks!
You are targeting the Reach profile. Change your project settings to HiDef instead; that way you will be able to draw 1048575 primitives per call.
Is there a reason you want to draw the entire point cloud in one call? Populate a dynamic buffer with as many points as you can fit, render it, then populate it with the next batch and render again, etc. It's not quite as efficient as a single draw call, but 640x480 points is still only 5 batches of 65535, which is by no means excessive.
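A rough sketch of that batching loop, written in Python rather than C# just to keep it short. `batch_ranges` is a made-up helper, not an XNA call; the assumption is that each (start, count) pair corresponds to one refill of the dynamic buffer followed by one draw call.

```python
MAX_PRIMITIVES_PER_CALL = 65535  # per-call limit quoted in the question


def batch_ranges(total_points, batch_size=MAX_PRIMITIVES_PER_CALL):
    """Yield (start, count) pairs that cover total_points in order."""
    start = 0
    while start < total_points:
        count = min(batch_size, total_points - start)
        yield start, count
        start += count


# One 640x480 depth image, one primitive per point.
batches = list(batch_ranges(640 * 480))
for start, count in batches:
    # Here you would copy points[start:start + count] into the buffer and draw.
    print(start, count)
```

For 307200 points this gives four full batches and one partial batch.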
You might also want to look into hardware instancing, which will still run into the same problem, but which is more efficient for rendering large numbers of identical objects.
I'm trying to create an implementation of seam carving which will run on the GPU using Metal. The dynamic programming part of the algorithm requires the image to be processed row-by-row (or column-by-column), so it's not perfectly suited to GPU I know, but I figure with images potentially thousands of pixels wide/tall, it will still benefit from parallelisation.
My two ideas were either
write a shader that uses 2D textures, but ensure that Metal computes over the image in the correct order, finishing one row before starting the next
write a shader using 1D textures, then manually pass in each row of the image to compute; ideally creating a view into the 2D textures rather than having to copy the data into separate 1D textures
I am still new to Metal and Swift, so my naïve attempts at both of these did not work. For option 1, I tried to dispatch threads using a threadgroup of Nx1x1, but the resulting texture just comes back all zeros – besides I am not convinced this is even right in theory, even if I tell it to use a threadgroup of height one, I'm not sure I can guarantee it will start on the first row. For option 2, I simply couldn't find a nice way to create a 1D view into a 2D texture by row/column – the documentation seems to suggest that Metal does not like giving you access to the underlying data, but I wouldn't be surprised if there was a way to do this.
Thanks for any ideas!
I'm currently learning how to use Metal and having some difficulty using the stencil buffer – possibly because it's the wrong solution for the problem I have.
The problem: I have a tree of 2D render nodes (quads) that I'm rendering with Metal. For some quads, I'd like to enable a 'clipping mask' that clips the rendering of all its sub-nodes to within its bounds.
I imagined that this might be a good use-case for the stencil attachment (Metal is my first foray into low-level graphics APIs) but am not 100% sure.
Having set up the depth attachment, however, I've got no idea what to actually do with it. Is it even possible to implement this idea of nested clipping masks with this method?
My rough idea for how it might work would be:
Set up a pipeline state for each quad as usual.
Set up a couple of depth stencil states: one for tree elements that will clip, and one for nodes that won't. (Write masks of 0xFF and 0x00 respectively.)
Begin a render pass as usual and begin traversing the tree.
If a node should clip, use the clipping depth stencil state; otherwise use the non-clipping one.
Any idea if this is the right approach?
Any thoughts as to the specifics of tackling this problem, i.e. the configuration of the MTLStencilDescriptor and its read/write masks and the various comparison operations and functions? How would I set the stencilReferenceValue on the render command encoder? Increment it at each level of the tree?
EDIT: A similar question attempts to tackle this problem in OpenGL (although the solution comes with its own compromises), so it appears that the above problem can be tackled using a stencil attachment.
The solution in the linked question notes "by giving each level in your scene graph a higher number than the last, you can assign each level its own stencil mask," but comes with the caveat: "Of course, if two siblings at the same depth level overlap, or simply are too close, then you've got a problem."
Is there a better way of achieving this with the capabilities that Metal provides? An idea of/pointers to a recommended algorithm/method to do this within the capabilities of Metal's API would be appreciated!
In 3D terrain that consists of thousands of cubes (i.e. Minecraft), what is a good way to handle each block in terms of location and rendering? More specifically, I know that drawing a cube primitive and world-transforming it to every position in DirectX 9 is probably a ridiculous way to accomplish this, given the performance problems, so I was wondering what a more reasonable method would be.
Should each cube be a mesh that's copied many times, or is there a way to create the appropriate meshes from the data in your vertex buffer?
I found this article that walks through some of the theory behind implementing what I want to implement, but I've never used octrees before so I wasn't able to take too much from the source code. If octrees are indeed the way to go, where is a good starting point to learn about them? Most of my google searches only turned up blog posts about theory with little or no implementation examples.
It seems like using voxels would be useful in doing this, but like with octrees, I'm coming from no experience here, so I don't really know what to study first.
Anyway, thanks for any advice/resources/book names you can spare. I'm sure it's obvious, but I'm still very new to 3D programming, so I appreciate your help.
First off, if you're using Minecraft as your reference, think of its use of chunks and relate it to Oct-trees. Minecraft divides its world into smaller chunks to handle the massive amount of information that needs to be stored, so use Oct-trees to organize that data. Goz has a very accurate description of how Oct-trees and Quad-trees work, so use his information as a reference.
Another thing to consider is that you don't actually want to draw every cube to the screen, as this will eat up your framerate. Use Object Culling to draw only the visible cubes. Again, thinking of Minecraft: have you ever encountered a glitch where you can see through the blocks and under the world? This is because Minecraft only draws the top layer of blocks. With this many objects on screen, it would be a worthwhile investment to look into Object Culling using both the camera frustum and occlusion queries.
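To make the "only draw visible cubes" idea concrete, here is a minimal Python sketch of the simplest form of it: skip any cube that is surrounded on all six sides. This is a CPU-side neighbour check, not the frustum or occlusion-query culling mentioned above, and `visible_cubes` and the set-of-coordinates layout are assumptions made up for the example.

```python
# Offsets to the six face-adjacent neighbours of a cube.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def visible_cubes(solid):
    """Return the cubes in `solid` (a set of (x, y, z)) that touch an empty cell."""
    return {
        (x, y, z)
        for (x, y, z) in solid
        if any((x + dx, y + dy, z + dz) not in solid for dx, dy, dz in NEIGHBOURS)
    }


# A solid 3x3x3 block: only the centre cube is fully enclosed.
solid = {(x, y, z) for x in range(3) for y in range(3) for z in range(3)}
exposed = visible_cubes(solid)
print(len(solid), len(exposed))
```

The saving grows quickly with size, since the number of enclosed cubes grows with the volume while the exposed ones grow with the surface.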
For information on using DirectX I would recommend any book by Frank Luna. I own this book myself and it never leaves my side when programming in DirectX. http://www.amazon.com/Introduction-Game-Programming-Direct-9-0c/dp/1598220160/ref=sr_1_3?ie=UTF8&qid=1332478780&sr=8-3
I highly recommend this book as I've learned almost everything I know about DirectX from it.
A Google search turned up this link on occlusion culling, which Luna doesn't cover (he only covers frustum culling). I hear the Programming Gems series mentioned a lot, but I can't vouch for it personally. http://http.developer.nvidia.com/GPUGems/gpugems_ch29.html
Hope this helps.
Oct-trees are fairly simple, especially axis-aligned ones like those in Minecraft.
It is basically just a 3D extension of the quad-tree. You may find it easier to learn about Quad-trees first.
To give you a quick overview of a quad-tree; basically you start off with a square. Now imagine placing a much smaller square in that square. If you wish to build a quad tree representing it you first divide the original square into 4 equal sized squares.
Next you check each quadrant, and if the smaller square is in that quadrant you split that quadrant into 4 smaller squares. Then you check those 4 quadrants, choose the one containing the square, and subdivide again. Eventually your smaller square will be wholly contained in one or more quadrants inside quadrants inside quadrants (etc). You have now built your quad tree.
Now if you imagine you are searching for a specific square inside the larger square you can quickly see the bonus of a quad-tree. Instead of searching every possible square in the quad tree (equivalent to searching every pixel in a texture) you can now check the first 4 quadrants to see if they contain it. If one does you can check its 4 sub quadrants and so on until you find the smallest quadrant wholly containing your square (or pixel). This way you end up doing many fewer tests to find your object.
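The descent described above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the root square has a power-of-two side length, the thing being searched for is a single cell, and `locate` is a made-up name.

```python
def locate(px, py, size, min_size=1):
    """Descend from a size x size root square to the min_size cell containing (px, py).

    Returns the list of (x, y, size) squares visited, root first.
    """
    x, y = 0, 0
    path = [(x, y, size)]
    while size > min_size:
        half = size // 2
        # Pick the quadrant that contains the point on each axis.
        if px >= x + half:
            x += half
        if py >= y + half:
            y += half
        size = half
        path.append((x, y, size))
    return path


# An 8x8 square has 64 cells, but the search only visits one square per level.
path = locate(5, 2, 8)
print(path)
```

Here the cell at (5, 2) is found after three subdivisions instead of a scan over all 64 cells.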
Now an oct-tree is basically the same thing but instead of encoding squares in squares you now encode cubes in cubes. Every cube can be split into 8 smaller octants (and hence the name oct-tree).
Oct-trees have the advantage that, by knowing which octant you are starting in, you can easily cast rays through the oct-tree to find collisions (as an octant is either full, partially full or empty). If an octant is empty then you pass right through it and check the octant on the other side. If it is partially full you check its sub-octants, and so on until you either find a full octant (i.e. you've hit a solid cube and you render it) or you pass through the octant entirely and hence there is no cube to render. This is how Minecraft works (I'm guessing anyway ;)). This is also a good way of quickly rendering voxel data, which more people are looking into these days as a possible future rendering mechanism.
Hope that's some help! :)
Oct-trees and quad-trees are useful for culling sections of your geometry to render. Minecraft uses 16x16x16 render blocks to break up the terrain into manageable pieces.
Another technique to consider is instancing. Instancing is where you tell the GPU to render an object multiple times in different locations. It's used for crowd rendering, trees, anything where the geometry is the same, but you have lots of them.
http://msdn.microsoft.com/en-us/library/windows/desktop/bb173349(v=vs.85).aspx
http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter03.html
Here is an article where the writer duplicates the Minecraft renderer in OpenGL 4. While the code won't apply to your case, the techniques (culling cubes that are surrounded, etc.) can be applied to a DirectX renderer.
http://codeflow.org/entries/2010/dec/09/minecraft-like-rendering-experiments-in-opengl-4/
Don't be fooled by the blocky graphics and the low quality textures. Minecraft is an extremely complex renderer and you'll need to come up with ways to handle the sheer number of items involved. For example even a "small" part of the world, say 100x100x100 blocks is 1 million blocks. To push each block to the GPU as a separate mesh would kill your GPU. The Minecraft renderer is far more complex than most first person shooters when you get down to the technology.
...or am I insane to even try?
As a novice to using bare vertices for 3d graphics, I haven't ever worked with vertex buffers and the like before. I am guessing that I should use a dynamic buffer because my game deals with manipulating, adding and deleting primitives. But how would I go about doing that?
So far I have stored my indices in a Triangle.cs class. Triangles are stored in Quads (which contain the vertices that correspond to their indices), quads are stored in blocks. In my draw method, I iterate through each block, each quad in each block, and finally each triangle, apply the appropriate texture to my effect, then call DrawUserIndexedPrimitives to draw the vertices stored in the triangle.
I'd like to use a vertex buffer because this method cannot support the scale I am going for. I am assuming it to be dynamic. Since my vertices and indices are stored in a collection of separate classes, though, can I still effectively use a buffer? Is using separate buffers for each quad silly (I'm guessing it is)? Is it feasible and effective for me to dump vertices into the buffer the first time a quad is drawn and then store where those vertices were so that I can apply that offset to that triangle's indices for successive draws? Is there a feasible way to handle removing vertices from the buffer in this scenario (perhaps event-based shifting of index offsets in triangles)?
I apologize if these questions are either far too novice or too confusing/vague. I'd be happy to provide clarification. But as I've said, I'm new to this and I may not even know what I'm talking about...
I can't exactly tell what you're trying to do, but using a separate buffer for every quad is very silly.
The golden rule in graphics programming is batch, batch, batch. This means packing as much stuff into a single DrawUserIndexedPrimitives call as possible. Your graphics card will love you for it.
In your case, put all of your vertices and indices into one vertex buffer and one index buffer (you might need to use more; I have no idea how many vertices we're talking about). Whenever the user changes one of the primitives, regenerate the entire buffer. If you really have a lot of primitives, split them up into multiple buffers and only regenerate the ones you need when the user changes something.
The most important thing is to minimize the number of 'DrawUserIndexedPrimitives' calls. Those things have a lot of overhead, and you could easily make your game on the order of 20x faster.
Graphics cards are pipelines; they like being given a big chunk of data to eat away at. What you're doing by giving it one triangle at a time is like forcing a large-scale car factory to make only one car at a time, where they can't start building the next car before the last one is finished.
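As a rough illustration of regenerating one shared buffer, here is a Python sketch (not XNA code) that merges several quads into a single vertex list and a single index list. `build_buffers` and the (vertices, indices) tuple layout are assumptions for the example; the point is that each quad's local indices get shifted by the position of its first vertex in the shared list.

```python
def build_buffers(quads):
    """Merge quads into one vertex list and one index list.

    Each quad is (vertices, indices), with indices local to that quad.
    """
    vertices, indices = [], []
    for quad_vertices, quad_indices in quads:
        base = len(vertices)  # where this quad's first vertex lands in the shared list
        vertices.extend(quad_vertices)
        indices.extend(base + i for i in quad_indices)
    return vertices, indices


# Two quads, each made of two triangles over four vertices.
quad_a = ([(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)], [0, 1, 2, 0, 2, 3])
quad_b = ([(2, 0, 0), (3, 0, 0), (3, 1, 0), (2, 1, 0)], [0, 1, 2, 0, 2, 3])
vertices, indices = build_buffers([quad_a, quad_b])
print(indices)
```

Rebuilding from scratch like this also sidesteps the bookkeeping of removing vertices: a deleted quad is simply left out of the next rebuild.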
Anyway good luck, and feel free to ask any questions.
Basically, I'm trying to cover a slot machine reel (white cylinder model) with multiple evenly spaced textures around the exterior. The program will be Windows only and the textures will be dynamically loaded at run-time instead of using the content pipeline. (Windows based multi-screen setup with XNA from the Microsoft example)
Most of the examples I can find online are for XNA3 and are still seemingly gibberish to me at this point.
So I'm looking for any help someone can provide on the subject of in-game texturing of objects like cylinders with multiple textures.
Maybe there is a good book out there that can properly describe how texturing works in XNA (4.0 specifically)?
Thanks
You have a few options. It depends on two things: whether the model is loaded or generated at runtime, and whether your multiple textures get combined into one or kept individual.
If you have art skills or know an artist, probably the easiest approach is to get them to texture map the cylinder with as many textures as you want (multiple materials). You'd want your Model to have one mesh (ModelMesh) and one material (ModelMeshPart) per texture required. This assumes the cylinders always have a fixed number of textures! Then, to swap the textures at runtime you'd iterate through the ModelMesh.Effects collection, cast each to a BasicEffect and set its Texture property.
If you can't modify the model, you'll have to generate it. There's an example of this on the AppHub site: http://create.msdn.com/en-US/education/catalog/sample/primitives_3d. It probably does not generate texture coordinates, so you'd need to add them. If you wanted 5 images per cylinder, you should make sure the number of segments is a multiple of 5, and the V coordinate should go from 0 to 1 five times as it wraps around the cylinder. To keep your textures individual with this technique, you'd need to draw the cylinder in 5 chunks, each time setting GraphicsDevice.Textures[0] to your current texture.
With both techniques it would be possible to draw the cylinder in a single draw call, but you'd need to merge your textures into a single one using Texture2D.GetData and Texture2D.SetData. This is going to be more efficient, but really isn't worth the trouble. Well, not unless you're making some kind of crazy slot machine particle system anyway.
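Here is a small Python sketch of the texture coordinate layout described above, assuming the cylinder is built from equal segments around its circumference. `segment_texcoords` is a made-up helper, not part of the AppHub sample; it returns, for each segment, which image it belongs to and the start and end value of the wrapping coordinate within that image.

```python
def segment_texcoords(segments, images):
    """For each segment around the cylinder, return (image_index, start, end).

    start and end are the wrapping texture coordinate at the segment's two
    edges, so the coordinate runs from 0 to 1 once per image.
    """
    if segments % images != 0:
        raise ValueError("segments must be a multiple of images")
    per_image = segments // images
    result = []
    for s in range(segments):
        image = s // per_image   # which texture this segment shows
        local = s % per_image    # position of the segment within that texture
        result.append((image, local / per_image, (local + 1) / per_image))
    return result


# 10 segments shared between 5 images: two segments per image.
coords = segment_texcoords(segments=10, images=5)
for entry in coords:
    print(entry)
```

Grouping the segments by `image_index` also gives the 5 chunks to draw when the textures are kept separate.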