I'm a fairly extreme newbie to the world of graphics programming, so forgive me if this question has an obvious answer, but I haven't managed to track it down.
When the depth test is being performed in DirectX 11, how do you identify which triangles are being tested at any given time? I'm trying to manually tell the depth test to fail unless specific polygons are being viewed through specific polygons (i.e., you can see one set of geometry through only one surface, and another set only through another surface).
I guess the real question is, when a pixel is being tested, how does the system reference the data for that pixel (position, color, etc.)? Is there just something I'm completely overlooking?
For the time being, I'm not interested in alternate solutions to the greater problem, just an answer to this particular question. Thanks for any help!
I'm trying to implement a moving light source in WebGL.
I understand that the normalMatrix is the key to managing the lighting equation in the fragment shader, but I'm not winning the battle to set it up correctly. The only tutorial I can find is the excellent "Introduction to Computer Graphics",
whose author says:
It turns out that you need to drop the fourth row and the fourth column and then take something called the "inverse transpose" of the resulting 3-by-3 matrix. You don't need to know what that means or why it works.
I think I do need to understand this to really master this baby.
So I'm looking for guidance on how and why to use mat3.normalFromMat4.
(PS. I have achieved the moving light source using three.js, but its handling of texture maps degrades the images too much for my application. In WebGL I can achieve the desired resolution.)
For me, from reading this discussion, the answer appears to be simple: mat3.normalFromMat4 is only required if you scale the object non-uniformly (i.e., more in one dimension than the others) after the normals have been computed.
Since I'm not doing that, it's a non-issue.
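For anyone who wants the "why" behind the inverse transpose, here is the standard derivation (general math, not specific to any library). A normal $n$ is perpendicular to every surface tangent $t$, so $n^T t = 0$. If positions are transformed by the upper-left $3\times3$ block $M$ of the model-view matrix, tangents become $Mt$, and we need a matrix $G$ such that the transformed normal $Gn$ stays perpendicular:

$$(Gn)^T (Mt) = n^T G^T M\, t = 0 \quad \text{for all } t \text{ with } n^T t = 0,$$

which holds if $G^T M = I$, i.e. $G = (M^{-1})^T$: the inverse transpose. And if $M = sR$ is just a rotation $R$ with a uniform scale $s$, then $(M^{-1})^T = s^{-1}R$, a rescaled copy of $M$, so after renormalizing the normals you get the same result as using $M$ directly. That is exactly why mat3.normalFromMat4 only matters under non-uniform scaling.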
I'm trying to do an application which, among other things, is able to recognize chess positions on a computer screen from screenshots. I have very limited experience with image processing techniques and don't wish to invest a great amount of time in studying this, as this is just a pet project of mine.
Can anyone recommend one or more image processing techniques that would yield a good result?
The conditions are:
The image is always crisp and clean: no noise, no poor lighting, etc. (since it's a screenshot)
Processing should have a very low impact on computer performance, at a rate of 1 image per second
I've thought of two modes to start the process:
Feed the piece shapes to the program (so that it knows what a queen, king etc. looks like)
Just feed the program an initial image containing the starting position, from which the program can (after it recognizes the position of the board) extract each chess piece
The process should be relatively easy to understand, as I don't have a very good grasp of image processing techniques (yet)
I'm not interested in using any specific technology, so technology-agnostic documentation would be ideal (C/C++, C#, Java examples would also be fine).
Thanks for taking the time to read this, and I hope to get some good answers.
It's an interesting problem, but you need to specify a lot more than you did in your original question in order to get an acceptable answer.
On the input images: "screenshots" is quite a vague category. Can you assume that the chessboard will always be entirely in view? Will you have multiple views of the same board? Can you assume that no pieces will be partially or completely occluded in all views?
On the imaged objects and the capture system: will the same chessboard and pieces be used, under very similar illumination? Will the same lens/camera/digitization pipeline be used?
Hi Andrei,
I have written a coin-counting algorithm that works on a picture, so the process should carry over.
The algorithm is called the generalized Hough transform:
Make the picture black and white; it is easier that way.
Take the image of one piece and "slide" it over the screenshot.
At each position, calculate the number of matching pixels in the two images.
Where that count is largest, that's where the piece is.
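A minimal sketch of that sliding comparison, written against OpenCV (my assumption; the answer names no library, and any per-pixel comparison would do):

    #include <opencv2/opencv.hpp>

    // Sketch: find the best match for one piece template in a screenshot.
    // Assumes both images are already binarized (black and white).
    cv::Point findPiece(const cv::Mat& screenshot, const cv::Mat& pieceTemplate)
    {
        cv::Mat scores;
        // Slides the template over the screenshot, scoring every position.
        cv::matchTemplate(screenshot, pieceTemplate, scores, cv::TM_CCOEFF_NORMED);

        double bestScore = 0.0;
        cv::Point bestLoc;
        cv::minMaxLoc(scores, nullptr, &bestScore, nullptr, &bestLoc);
        // bestLoc is the top-left corner of the strongest match; reject it
        // if bestScore falls below some confidence threshold of your choosing.
        return bestLoc;
    }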
Hope this helps.
Yeah, go with the approach above.
Convert the picture to greyscale.
Slice it into 64 squares and store them in an array.
Using MATLAB you can identify the pieces easily.
The piece colour can be obtained by calculating the proportion of black pixels:
value = no. of black pixels / (no. of black pixels + no. of white pixels)
If your value is above the threshold then it's WHITE, else BLACK.
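A minimal sketch of that slicing and colour test, again assuming OpenCV rather than MATLAB (the 128 binarization level and the 0.5 threshold are placeholders to tune):

    #include <iostream>
    #include <opencv2/opencv.hpp>

    // Sketch: slice a tightly cropped greyscale board into 64 squares and
    // classify each square's piece colour from its black-pixel proportion.
    void classifySquares(const cv::Mat& board)
    {
        int cell = board.rows / 8; // assumes a square, tightly cropped board
        for (int r = 0; r < 8; ++r) {
            for (int c = 0; c < 8; ++c) {
                cv::Mat bin;
                cv::threshold(board(cv::Rect(c * cell, r * cell, cell, cell)),
                              bin, 128, 255, cv::THRESH_BINARY);
                int whitePixels = cv::countNonZero(bin);
                double blackRatio = 1.0 - double(whitePixels) / (cell * cell);
                // Per the heuristic above: above the threshold => WHITE.
                std::cout << (blackRatio > 0.5 ? 'W' : 'B');
            }
            std::cout << '\n';
        }
    }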
I'm working on a similar project in C#; finding which piece is which isn't the hard part for me. The first step is to find a rectangle that shows just the board and cuts everything else out. I first hard-coded it to search for the colors of the squares, but I'd like to make it more robust and reliable regardless of the color scheme, so I'm trying to make it find squares of pixels that match within a certain threshold and extrapolate the board location from that.
In 3D terrain that consists of thousands of cubes (e.g., Minecraft), what is a good way to handle each block in terms of location and rendering? More specifically, I know that drawing a cube primitive and world-transforming it everywhere in DirectX 9 is probably a ridiculous way to accomplish this, since there are so many performance issues, so I was wondering what a more reasonable method would be.
Should each cube be a mesh that's copied many times, or is there a way to create the appropriate meshes from the data in your vertex buffer?
I found this article that walks through some of the theory behind implementing what I want to implement, but I've never used octrees before so I wasn't able to take too much from the source code. If octrees are indeed the way to go, where is a good starting point to learn about them? Most of my google searches only turned up blog posts about theory with little or no implementation examples.
It seems like using voxels would be useful in doing this, but like with octrees, I'm coming from no experience here, so I don't really know what to study first.
Anyway, thanks for any advice/resources/book names you can spare. I'm sure it's obvious, but I'm still very new to 3D programming, so I appreciate your help.
First off, since you're using Minecraft as your reference, think of its use of chunks and relate it to octrees. Minecraft divides its world into smaller chunks to handle the massive amount of information that needs to be stored; octrees are a good way to organize that data. Goz has a very accurate description of how octrees and quad-trees work, so use his answer as a reference.
Another thing to consider is that you don't actually want to draw every cube to the screen, as this will eat up your framerate. Use object culling to draw only the visible cubes. Again, think of Minecraft: have you ever encountered a glitch where you can see through the blocks and under the world? That's because Minecraft only draws the top layer of blocks. With this many objects on screen, it would be a worthwhile investment to look into object culling using both the camera frustum and occlusion queries.
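To give a flavour of the frustum half of that, here is a minimal sketch of testing a cube's bounding box against the six frustum planes (the plane-extraction step is not shown, and the inward-facing-normal convention is my assumption):

    // Sketch: cull a cube's axis-aligned bounding box against the frustum.
    struct Plane { float a, b, c, d; }; // ax + by + cz + d = 0, normal points inward

    bool boxVisible(const Plane frustum[6],
                    float minX, float minY, float minZ,
                    float maxX, float maxY, float maxZ)
    {
        for (int i = 0; i < 6; ++i) {
            const Plane& p = frustum[i];
            // Pick the box corner farthest along the plane normal.
            float x = p.a >= 0 ? maxX : minX;
            float y = p.b >= 0 ? maxY : minY;
            float z = p.c >= 0 ? maxZ : minZ;
            if (p.a * x + p.b * y + p.c * z + p.d < 0)
                return false; // even the farthest corner is outside: cull
        }
        return true; // inside or intersecting the frustum
    }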
For information on using DirectX I would recommend any book by Frank Luna. I own this book myself and it never leaves my side when programming in DirectX. http://www.amazon.com/Introduction-Game-Programming-Direct-9-0c/dp/1598220160/ref=sr_1_3?ie=UTF8&qid=1332478780&sr=8-3
I highly recommend this book as I've learned almost everything I know about DirectX from it.
Upon a Google search I found this link that discusses occlusion culling, because Luna covers only frustum culling, not occlusion culling. I hear the Programming Gems series mentioned a lot, but I can't personally attest to it. http://http.developer.nvidia.com/GPUGems/gpugems_ch29.html
Hope this helps.
Oct-trees are fairly simple, especially axis-aligned ones like those in Minecraft.
It is basically just a 3D extension of the quad-tree. You may find it easier to learn about Quad-trees first.
To give you a quick overview of a quad-tree: basically, you start off with a square. Now imagine placing a much smaller square inside that square. If you wish to build a quad-tree representing it, you first divide the original square into 4 equal-sized squares.
Next you check each quadrant, and if the smaller square is in that quadrant, you split that quadrant into 4 smaller squares. Then you check those 4 quadrants, choose the one containing the smaller square, and subdivide again. Eventually your smaller square will be wholly contained in one or more quadrants inside quadrants inside quadrants (etc.). You have now built your quad-tree.
Now if you imagine you are searching for a specific square inside the larger square you can quickly see the bonus of a quad-tree. Instead of searching every possible square in the quad tree (equivalent to searching every pixel in a texture) you can now check the first 4 quadrants to see if they contain it. If one does you can check its 4 sub quadrants and so on until you find the smallest quadrant wholly containing your square (or pixel). This way you end up doing many fewer tests to find your object.
Now an oct-tree is basically the same thing but instead of encoding squares in squares you now encode cubes in cubes. Every cube can be split into 8 smaller octants (and hence the name oct-tree).
Oct-trees have the advantage that, by knowing which octant you are starting in, you can easily cast rays through the oct-tree to find collisions (as an octant is either full, partially full, or empty). If an octant is empty then you pass right through it and check the octant on the other side. If it is partially full you check its sub-octants, and so on, until you either find a full octant (i.e., you've hit a solid cube and you render it) or you pass through the octant entirely and hence there is no cube to render. This is how Minecraft works (I'm guessing, anyway ;)). This is also a good way of quickly rendering voxel data, which more people are looking into these days as a possible future rendering mechanism.
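As a rough illustration of that full/partially-full/empty idea, here is a minimal octree-node sketch (the names and layout are my own invention, not Minecraft's actual implementation):

    // Sketch of an axis-aligned octree node over cube terrain.
    struct OctreeNode
    {
        enum State { Empty, Partial, Full };

        State state = Empty;
        // Axis-aligned bounds of this octant.
        float minX = 0, minY = 0, minZ = 0, maxX = 0, maxY = 0, maxZ = 0;
        // Eight children, allocated only while the node is Partial.
        OctreeNode* children[8] = {};

        // Is the point inside a solid cube? Walk down until we reach a
        // uniformly Full or Empty octant.
        bool isSolid(float x, float y, float z) const
        {
            if (state == Full)  return true;
            if (state == Empty) return false;
            float cx = (minX + maxX) * 0.5f;
            float cy = (minY + maxY) * 0.5f;
            float cz = (minZ + maxZ) * 0.5f;
            int index = (x >= cx ? 1 : 0) | (y >= cy ? 2 : 0) | (z >= cz ? 4 : 0);
            return children[index] && children[index]->isSolid(x, y, z);
        }
    };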
Hope that's some help! :)
Oct-trees and quad-trees are useful for culling sections of your geometry to render. Minecraft uses 16x16x16 render blocks to break up the terrain into manageable pieces.
Another technique to consider is instancing. Instancing is where you tell the GPU to render an object multiple times in different locations. It's used for crowd rendering, trees, anything where the geometry is the same, but you have lots of them.
http://msdn.microsoft.com/en-us/library/windows/desktop/bb173349(v=vs.85).aspx
http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter03.html
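As a taste of what the Direct3D 9 flavour of instancing looks like, here is a hedged sketch built around SetStreamSourceFreq (buffer creation, the vertex declaration, and all names are placeholders; see the MSDN link above for the full picture):

    #include <d3d9.h>

    // Sketch: draw one cube mesh many times with per-instance data.
    void drawCubesInstanced(IDirect3DDevice9* device,
                            IDirect3DVertexBuffer9* cubeVB,
                            IDirect3DIndexBuffer9* cubeIB,
                            IDirect3DVertexBuffer9* instanceVB,
                            UINT vertexStride, UINT instanceStride,
                            UINT cubeVertexCount, UINT cubeTriCount,
                            UINT instanceCount)
    {
        // Stream 0: the cube's geometry, replayed instanceCount times.
        device->SetStreamSourceFreq(0, D3DSTREAMSOURCE_INDEXEDDATA | instanceCount);
        device->SetStreamSource(0, cubeVB, 0, vertexStride);

        // Stream 1: per-instance data, e.g. one world position per cube.
        device->SetStreamSourceFreq(1, D3DSTREAMSOURCE_INSTANCEDATA | 1u);
        device->SetStreamSource(1, instanceVB, 0, instanceStride);

        device->SetIndices(cubeIB);
        device->DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0,
                                     cubeVertexCount, 0, cubeTriCount);

        // Restore the default (non-instanced) stream frequencies.
        device->SetStreamSourceFreq(0, 1);
        device->SetStreamSourceFreq(1, 1);
    }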
Here is an article where the writer duplicates the Minecraft renderer in OpenGL 4. While the code won't apply to your case, the techniques (culling cubes that are surrounded, etc.) can be applied to a DirectX renderer.
http://codeflow.org/entries/2010/dec/09/minecraft-like-rendering-experiments-in-opengl-4/
Don't be fooled by the blocky graphics and the low quality textures. Minecraft is an extremely complex renderer and you'll need to come up with ways to handle the sheer number of items involved. For example even a "small" part of the world, say 100x100x100 blocks is 1 million blocks. To push each block to the GPU as a separate mesh would kill your GPU. The Minecraft renderer is far more complex than most first person shooters when you get down to the technology.
I would like to ask whether it is possible, within OpenCV, to handle the registration of a virtual object into a real image (of a real-world object).
After detecting the region of interest in a captured frame, I would like to replace the pixels of the real image with the pixels of a virtual object, which should appear as a natural part of the newly generated image.
Please help!
For sure, (almost) everything is possible if you program it yourself. However, if you expect an off-the-shelf solution from OpenCV, it doesn't exist...
What you are talking about is called pose estimation.
Depending on the context of your problem, it can be very difficult to do (depending on your computer science skills as well).
Instead of a very, very long explanation, I think the best is to look at these:
Foundations about 2D-3D Pose Estimation
Posit tutorial with OpenGL and OpenCV
An excellent presentation to understand the context of Pose Estimation
You should try to look at what the field of virtual/augmented reality is about; I think it could answer some of your questions. I don't have better answers, as your question is very, very broad.
Moreover, a last tip would be to look at feature detection and extraction, as a lot of these techniques rely on a good detection of keypoints (to then place a 2D/3D model into the scene).
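If it helps, here is a minimal sketch of the core pose-estimation call in OpenCV, cv::solvePnP (the 2D-3D correspondences and the camera intrinsics are assumed to come from your own detection and calibration steps):

    #include <vector>
    #include <opencv2/opencv.hpp>

    // Sketch: recover the object's pose from known 2D-3D correspondences.
    void estimatePose(const std::vector<cv::Point3f>& objectPoints, // 3D model points
                      const std::vector<cv::Point2f>& imagePoints,  // matching 2D detections
                      const cv::Mat& cameraMatrix,                  // from calibration
                      const cv::Mat& distCoeffs)
    {
        cv::Mat rvec, tvec;
        cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
        // rvec/tvec place the virtual object in camera space; render it with
        // this pose and composite it over the captured frame.
    }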
Julien,
I've been using OpenGL for years, but after trying to use D3D for the first time, I wasted a significant amount of time trying to figure out how to make my scene lights stay fixed in the world rather than fixed on my objects.
In OpenGL, light positions get transformed just like everything else by the MODELVIEW matrix, so to get lights fixed in space, you set up your MODELVIEW the way you want for the lights and call glLightfv with GL_POSITION, then set it up for your geometry and make your geometry calls. In D3D that doesn't help.
(Comment -- I eventually figured out the answer to this one, but I couldn't find anything helpful on the web or in the MSDN. It would have saved me a few hours of head scratching if I could have found this answer then.)
The answer I discovered eventually was that while OpenGL only has its one amalgamated MODELVIEW matrix, in D3D the "world" and "view" transforms are kept separate, and placing lights seems to be the major reason for this. So the answer is you use D3DTS_VIEW to set up matrices that should apply to your lights, and D3DTS_WORLD to set up matrices that apply to the placement of your geometry in the world.
So actually the D3D system kinda makes more sense than the OpenGL way. It allows you to specify your light positions whenever and wherever the heck you feel like it once and for all, without having to constantly reposition them so that they get transformed by your current "view" transform. OpenGL has to work that way because it simply doesn't know what you think your "view" is vs your "model". It's all just a modelview to GL.
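Concretely, a sketch of that separation in fixed-function D3D9 (the light values here are placeholders):

    #include <d3d9.h>

    // Sketch: the light is specified once, in world space, while the view
    // and world transforms stay separate.
    void setupWorldFixedLight(IDirect3DDevice9* device,
                              const D3DMATRIX& viewMatrix,
                              const D3DMATRIX& objectWorldMatrix)
    {
        D3DLIGHT9 light = {};
        light.Type         = D3DLIGHT_POINT;
        light.Diffuse.r    = light.Diffuse.g = light.Diffuse.b = 1.0f;
        light.Diffuse.a    = 1.0f;
        light.Position.x   = 10.0f; // world space: no re-specifying per frame
        light.Position.y   = 20.0f;
        light.Position.z   = 5.0f;
        light.Range        = 100.0f;
        light.Attenuation0 = 1.0f;
        device->SetLight(0, &light);
        device->LightEnable(0, TRUE);

        device->SetTransform(D3DTS_VIEW, &viewMatrix);         // camera only
        device->SetTransform(D3DTS_WORLD, &objectWorldMatrix); // object placement only
        // ...then draw your geometry as usual.
    }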
(Comment - apologies if I'm not supposed to answer my own questions here, but this was a real question that I had a few weeks ago and thought it was worth posting here to help others making the shift from OpenGL to D3D. Basic overviews of the D3D lighting and rendering pipeline seem hard to come by.)
For the fixed-function pipeline, a light's position and direction are set in world space. The docs for the light structures do tell you that, but I'm not surprised that you missed it. There's not much information on the fixed-function pipeline anymore, as the focus has shifted to programmable shaders.