How to divide a runtime procedurally generated world into chunks - procedural-generation

I've been thinking of making a top-down 2D game with a pseudo-infinite, procedurally generated world created at runtime. I've read several articles about procedural generation and, though maybe I've misread or misunderstood them, I have yet to come across one explaining how to divide the world into chunks (like Minecraft apparently does).
Obviously, I need to generate only the part of the world that the player can currently see. If my game is tile-based, for example, I could divide the world into n*n chunks. If the player were at the border of such a chunk, I would also generate the adjacent chunk(s).
What I can't figure out is how exactly to take a procedural world generation algorithm and use it on only one chunk at a time. For example, if I have an algorithm that generates a big structure (e.g. a castle, forest, or river) that would spread across many chunks, how can I adjust it to generate only one chunk, and afterwards the adjacent chunks?
I apologize if I completely missed something obvious. Thank you in advance!

Study the Midpoint displacement algorithm. Note that the points all along one side are based on the starting values of the corners. You can calculate them without knowing the rest of the grid.
I used this approach to generate terrain. I needed the edges of each 'chunk' of terrain to line up with the adjacent chunks. Using a variation of the Midpoint displacement algorithm I made it so that the height of each point along the edge of a chunk was calculated based only on values at the two corners. If I needed to add randomness, I seeded a random number generator with data from the two corners. This way, any two adjacent chunks could be generated independently and the edges were sure to match.
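To make that concrete, here is a minimal Python sketch of the edge trick, assuming each chunk corner has a world coordinate and a height; the function and parameter names are illustrative, not a fixed recipe:

import random

def edge_heights(corner_a, corner_b, height_a, height_b, detail=4, roughness=8.0):
    # Deterministically generate heights along one chunk edge.
    # The RNG is seeded only from the two shared corners, so both chunks
    # bordering this edge compute exactly the same values and line up.
    rng = random.Random(hash((corner_a, corner_b)))
    heights = [height_a, height_b]
    spread = roughness
    for _ in range(detail):  # classic midpoint displacement passes
        next_heights = []
        for left, right in zip(heights, heights[1:]):
            mid = (left + right) / 2 + rng.uniform(-spread, spread)
            next_heights += [left, mid]
        next_heights.append(heights[-1])
        heights = next_heights
        spread /= 2  # halve the displacement at each subdivision
    return heights

The interior of a chunk can then be filled in with the usual displacement steps, seeded from the chunk's own coordinates, since only the edges have to agree with the neighbours.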
You can use height-map approaches for other things, too. Instead of height, the data could determine vegetation type, population density, etc. Instead of chunks of a height map where the hills and valleys match up, you can have a vegetation map where the forests match up.
It certainly takes some creative programming for any kind of complex world.

Fast way to find corresponding objects across stereo views

Thanks for taking the time to read this.
We have fixed stereo pairs of cameras looking into a closed volume. We know the dimensions of the volume and have the intrinsic and extrinsic calibration values for the camera pairs. The objective is to identify the 3D positions of multiple identical objects accurately.
This naturally leads to what the literature describes as the correspondence problem: we need a fast technique to match ball A from image 1 with ball A from image 2, and so on.
At the moment we use the properties of epipolar geometry (the fundamental matrix) to match the balls from different views in a crude way. It works OK when the objects are sparse, but gives a lot of false positives if the objects are densely scattered. Since ball A in image 1 can lie anywhere on its epipolar line across image 2, this leads to mismatches when multiple similar-looking objects lie on that line.
Is there a way to remodel this as a 3D line intersection problem? Since ball A in image 1 can only take a bounded range of 3D positions, is there a way to represent it as a line in 3D and do an intersection test to find the closest matching ball in image 2? Or is there a way to generate a sparse list of 3D values corresponding to each 2D grid of pixels in images 1 and 2, and intersect those values to find the matching objects across the two cameras?
Because the objects can be identical, OpenCV feature-matching algorithms like FLANN and ORB don't work.
Any ideas in the form of formulae or code are welcome.
Thanks!
Sak
You've set yourself quite a difficult task. Because one point can occlude another in a view, it's not generally possible even to count the number of points. If each view has two points, but those points fall on the same epipolar line on the other view, then you can count anywhere between 2 and 4 points.
Assuming you want to minimize the points, this starts to look like Minimum Vertex Cover in a dense bipartite graph, with each edge representing the association of a point from each view, and the weight of each edge taken from the registration error of associating the corresponding points (vertices) from each view. MVC is, of course, NP-hard, and if you treat the problem as a general MVC problem then you'll never do better than O(n^2) because that's how many edges there are to examine.
Your particular MVC problem might have structure that can be exploited to perform a more efficient approximation. In particular, I might suggest calculating the epipolar lines in one view, ordering them by angle from the epipole, and similarly sorting the points in that view from the epipole. You can then iterate over the two sorted lists roughly in parallel, greedily associating each point with a nearby epipolar line. Then you can do the same in the other view, but only looking at points in that view which had not yet been associated during the previous pass. I think that a more regimented and provably optimal approach might be possible with dynamic programming (particularly if you strictly bound the registration error) which wouldn't require the second pass, but I can't sketch it out offhand.
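A rough Python/NumPy sketch of that greedy pass (the angular tolerance, the convention x2^T F x1 = 0, and a finite epipole are assumptions; this illustrates the idea rather than a tuned implementation):

import numpy as np

def greedy_epipolar_match(pts1, pts2, F, angle_tol=0.02):
    # pts1: (N, 2) ball centres in image 1; pts2: (M, 2) in image 2.
    # Epipole in image 2 is the null vector of F^T (F^T e2 = 0).
    _, _, vt = np.linalg.svd(F.T)
    e2 = vt[-1]
    e2 = e2[:2] / e2[2]
    # Epipolar line in image 2 for every image-1 point: l = F [x, y, 1]^T.
    lines = (F @ np.c_[pts1, np.ones(len(pts1))].T).T  # rows (a, b, c)
    # All epipolar lines pass through the epipole, so each line is characterised
    # by its angle about the epipole (mod pi); direction of ax+by+c=0 is (b, -a).
    line_ang = np.arctan2(-lines[:, 0], lines[:, 1]) % np.pi
    pt_ang = np.arctan2(pts2[:, 1] - e2[1], pts2[:, 0] - e2[0]) % np.pi
    pt_order = np.argsort(pt_ang)
    matches, used = [], set()
    for i in np.argsort(line_ang):          # walk the lines in angular order
        best, best_d = None, angle_tol
        for j in pt_order:                  # pick the nearest unused point by angle
            if j in used:
                continue
            d = abs(pt_ang[j] - line_ang[i])
            d = min(d, np.pi - d)           # angles wrap around at pi
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            matches.append((int(i), int(best)))
            used.add(int(best))
    return matches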
For objects that look different from each other it's easy to find the match using sum of absolute differences. For identical objects, the idea(s) could lead to a good paper. Anyway, here's one quick algorithm:
1. Detect the two balls in the first image (using object detection methods).
2. Divide the image into two segments, each containing one ball.
3. Repeat steps 1 and 2 for the second image.
4. The direction of the segments in the two images should give the correspondence of the two balls.
Try this; it should work for two balls.
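For the sum-of-absolute-differences case mentioned in the first sentence, a minimal sketch (assuming the detected objects have already been cropped to equally sized grayscale patches):

import numpy as np

def sad(patch_a, patch_b):
    # Sum of absolute differences between two equally sized grayscale patches.
    return np.abs(patch_a.astype(np.int32) - patch_b.astype(np.int32)).sum()

def match_by_sad(patches_img1, patches_img2):
    # Pair each detected object in image 1 with the image-2 patch whose SAD
    # score is lowest. Only useful when the objects actually look different.
    pairs = []
    for i, pa in enumerate(patches_img1):
        scores = [sad(pa, pb) for pb in patches_img2]
        pairs.append((i, int(np.argmin(scores))))
    return pairs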

Image similarity of apartment photos

I want to design an algorithm that would find matches in images of the same apartment, when put up by different real estate agents.
The photos are taken at roughly the same time, so the interior of the rooms should not change much, but of course every agent takes different pictures from different angles, etc.
(TL;DR: an apartment goes up for sale, different real estate agents come in and take their own pictures, and I want to know whether the pictures from the various agents are of the same place.)
I know that image processing and recognition algorithm selection highly depends on the use case, so could you point me in the right direction given mine?
http://reality.bazos.sk/inzerat/56232813/Prenajom-1-izb-bytu-v-sirsom-centre.php
http://reality.bazos.sk/inzerat/56371292/-PRENAJOM-krasny-1i-byt-rekonstr-Kupeckeho-Ruzinov-BA-II.php
You can actually use Clarifai's Custom Training API endpoint, fairly simple and straightforward. All you would have to do is train the initial image and then compare the second to it. If the probability is high, it is likely the same apartment. For example:
In javascript, to declare a positive it is:
clarifai.positive('http://example.com/apartment1.jpg', 'firstapartment', callback);
And a negative is:
clarifai.negative('http://example.com/notapartment1.jpg', 'firstapartment', callback);
You don't necessarily have to provide a negative, but it can only help. Then, when you are comparing images to the first apartment, you do:
clarifai.predict('http://example.com/someotherapartment.jpg', 'firstapartment', callback);
This will give you a probability regarding the likeness of the photo to what you've trained ('firstapartment'). This API is basically doing machine learning without the hassle of the actual machine. Clarifai's API also has a tagging input that is extremely accurate with some basic tags. The API is free for a certain number of calls/month. Definitely worth it to check out for this case.
As user Shaked mentioned in a comment, this is a difficult problem. Even if you knew the position and orientation of each camera in space, and also the characteristics of each camera, it wouldn't be a trivial problem to match the images.
A "bag of words" (BoW) approach may be of use here. Rather than try to identify specific objects and/or deduce the original 3D scene, you determine what "feature descriptors" can distinguish objects from one another in your image sets.
https://en.wikipedia.org/wiki/Bag-of-words_model_in_computer_vision
Imagine you could describe the two images by the relative locations of textures and colors:
horizontal-ish line segments at far left
red blob near center left
green clumpy thing at bottom left
bright round object near top left
...
then for a reasonably constrained set of images (e.g. photos just within a certain zip code), you may be able to yield a good match between the two images above.
The Wikipedia article on BoW may look a bit daunting, but I think if you hunt around you'll find an article that describes "bag of words" for image processing clearly. I've seen a very good demo of a BoW approach used to identify objects such as boats and delivery vans in arbitrary video streams, and it worked impressively well. I wish I had a copy of the presentation to pass along.
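As a rough illustration of the BoW pipeline, here is a sketch using OpenCV ORB descriptors and scikit-learn k-means; the vocabulary size and feature choice are assumptions, not recommendations:

import cv2
import numpy as np
from sklearn.cluster import KMeans

def bow_histograms(image_paths, vocab_size=100):
    # Describe each photo as a normalised histogram of "visual words".
    orb = cv2.ORB_create()
    per_image = []
    for path in image_paths:
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, desc = orb.detectAndCompute(img, None)   # local feature descriptors
        per_image.append(desc.astype(np.float32))
    # Cluster all descriptors into a shared visual vocabulary.
    vocab = KMeans(n_clusters=vocab_size, n_init=5).fit(np.vstack(per_image))
    hists = []
    for desc in per_image:
        words = vocab.predict(desc)                 # assign each descriptor to a word
        hist = np.bincount(words, minlength=vocab_size).astype(np.float32)
        hists.append(hist / hist.sum())
    return hists

Two listings can then be compared by a histogram distance, e.g. np.linalg.norm(h1 - h2), with the match threshold picked empirically.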
If you don't expect the scene to change much, you could try the standard first step of any structure-from-motion pipeline to establish a notion of similarity between a pair of images. Two images are similar if they contain more than a threshold number of matching image features that also satisfy the geometric constraint of the scene. For a general scene, that geometric constraint is given by a fundamental matrix F computed from a subset of the matching features.
Here are the steps. I have inserted the opencv method for each step, but you could write your methods too:
Read the pair of images. Use img = cv2.imread(filename).
Use SIFT/SURF to detect image features/descriptors in both images.
sift = cv2.xfeatures2d.SIFT_create()
kp, des = sift.detectAndCompute(img,None)
Match features using the descriptors. SIFT descriptors are float vectors, so use L2 distance (NORM_HAMMING is for binary descriptors such as ORB).
bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = bf.match(des1,des2)
Use RANSAC to compute the fundamental matrix.
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3, 0.99)
mask contains all the inliers. Simply count them to determine if the number of matches satisfying geometrical constraint is large enough.
CAUTION: For a planar scene, we use a homography instead of a fundamental matrix, and the steps described above work out pretty nicely because a homography takes a point to the corresponding point in the other image. A fundamental matrix, however, takes a point to the corresponding epipolar line in the other image, which makes the entire process a bit less stable. So I would recommend running these steps a few more times with a little jitter added to the feature locations and collating the evidence over the trials before making the decision. You can also use more advanced steps to introduce robustness to this process, but only if the steps described above don't yield the results you need.
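Putting the steps above together, a minimal end-to-end sketch (the inlier threshold is an illustrative guess; cv2.SIFT_create() is the name in recent OpenCV builds, cv2.xfeatures2d.SIFT_create() in older ones):

import cv2
import numpy as np

def images_match(path1, path2, min_inliers=30):
    # Rough similarity test: count SIFT matches consistent with one fundamental matrix.
    img1 = cv2.imread(path1, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(path2, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    # SIFT descriptors are float vectors, so use L2 distance.
    bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = bf.match(des1, des2)
    if len(matches) < 8:                    # need at least 8 points for F
        return False
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3, 0.99)
    if mask is None:
        return False
    return int(mask.sum()) >= min_inliers   # enough geometrically consistent matches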

Homography and projective transformation

I'm trying to write code that performs a projective transformation, but with more than 4 key points. I found this helpful guide, but it uses 4 reference points:
https://math.stackexchange.com/questions/296794/finding-the-transform-matrix-from-4-projected-points-with-javascript
I know that MATLAB has a function, cp2tform, that handles this, but I haven't found a way to do it myself so far.
Can anyone give me some guidance on how to do this? I can solve the equations using least squares, but I'm stuck since I have a matrix larger than 3*3 and I can't multiply the homogeneous coordinates.
Thanks
If you have more than four control points, you have an overdetermined system of equations. There are two possible scenarios. Either your points are all compatible with the same transformation. In that case, any four points can be used, and the rest will match the transformation exactly. At least in theory. For the sake of numeric stability you'd probably want to choose your points so that they are far from being collinear.
Or your points are not all compatible with a single projective transformation. In this case, all you can hope for is an approximation. If you want the best approximation, you'll have to be more specific about what “best” means, i.e. some kind of error measure. Measuring things in a projective setup is inherently tricky, since there are usually a lot of arbitrary decisions involved.
What you can try is fixing one matrix entry (e.g. the lower right one to 1), then writing the conditions for the remaining 8 coordinates as a system of linear equations and performing a least squares approximation. But the choice of matrix representative (i.e. fixing one entry here) affects the least squares error measure while having no effect on the geometric meaning, so this is a pretty arbitrary choice. If the lower right entry of the desired matrix should happen to be zero, your computation will run into numeric problems due to overflow.
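For reference, a minimal sketch of that least-squares fit via the direct linear transform, which fixes the scale with ||h|| = 1 (through the SVD) instead of pinning one matrix entry, and so avoids the problem of that entry being near zero:

import numpy as np

def fit_homography(src, dst):
    # src, dst: (n, 2) arrays of corresponding points, n >= 4.
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two equations per correspondence from u ~ (h1 x + h2 y + h3) / (h7 x + h8 y + h9), etc.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    # The singular vector with the smallest singular value minimises ||A h||
    # subject to ||h|| = 1: exact for 4 compatible points, least squares otherwise.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2] if abs(H[2, 2]) > 1e-12 else H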

Digitize a Floor Map with Easy Image Processing Techniques?

I am currently working on a project where I need a huge number of trajectories as input data to test my algorithm. I have devised a draft of the algorithm and now wish to do some rough testing on it before turning the draft into the final version. For this purpose, I need a lot of trajectories.
In this trial-and-error phase, given the time limit, spending a lot of energy and time collecting a huge number of trajectories is not a wise option. Therefore, I am trying to generate some trajectories to simulate with first. Using a computer to generate them is easy, fast, and cheap.
Obviously, a series of random points is NOT acceptable. The trajectories can be random but have to be reasonable. e.g. one cannot penetrate the walls or just randomly spin around.
Let's say I wish to generate the trajectories within this indoor map.
Some basic ideas that I have are as follows:
Randomly choose a point as the start point and another as the end point. These two points must be chosen from among the reachable ones.
Generate a trajectory that does not penetrate the wall and roughly goes along the walls. In this case all the walls are either vertical or horizontal. So here the trajectories are all more or less straight.
I feel very reluctant to just manually "digitize" the .png map into a point representation; doing that is very time consuming. Can I use some easy image processing techniques to "digitize" the map for my trajectory generation purpose?
Can I just extract the information from the .png file, e.g. expressing all the inaccessible areas in terms of points? I know Python, by the way.
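As a rough sketch of what such a digitisation could look like in Python (assuming walls are drawn darker than the background; the threshold and cell size are guesses you would have to tune for your map):

import numpy as np
from PIL import Image

def load_occupancy_grid(png_path, cell_size=4, dark_threshold=128):
    # Turn a floor-plan PNG into a coarse walkable/blocked grid.
    gray = np.asarray(Image.open(png_path).convert("L"))
    wall = gray < dark_threshold                      # True where a pixel is wall-dark
    h, w = wall.shape
    gh, gw = h // cell_size, w // cell_size
    grid = wall[:gh * cell_size, :gw * cell_size]
    # A cell is blocked if any pixel inside it is dark.
    grid = grid.reshape(gh, cell_size, gw, cell_size).any(axis=(1, 3))
    return grid                                       # grid[r, c] == True means blocked

Trajectories could then be sampled between two unblocked cells with a grid path-finder (A*, BFS), so they never cross a blocked cell.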

Handling blocks in Minecraft-style terrain (d3d/c++)

In 3D terrain that consists of thousands of cubes (e.g. Minecraft), what is a good way to handle each block in terms of location and rendering? More specifically, I know that drawing a cube primitive and world-transforming it everywhere in DirectX 9 is probably a ridiculous way to accomplish this, since there are so many performance issues, so I was wondering what a more reasonable method would be.
Should each cube be a mesh that's copied many times, or is there a way to create the appropriate meshes from the data in your vertex buffer?
I found this article that walks through some of the theory behind implementing what I want to implement, but I've never used octrees before so I wasn't able to take too much from the source code. If octrees are indeed the way to go, where is a good starting point to learn about them? Most of my google searches only turned up blog posts about theory with little or no implementation examples.
It seems like using voxels would be useful in doing this, but like with octrees, I'm coming from no experience here, so I don't really know what to study first.
Anyway, thanks for any advice/resources/book names you can spare. I'm sure it's obvious, but I'm still very new to 3D programming, so I appreciate your help.
First off, if you're using Minecraft as your reference, think of its use of chunks and relate it to oct-trees. Minecraft divides its world into smaller chunks to handle the massive amount of information that needs to be stored, so use oct-trees to organize that data. Goz has a very accurate description of how oct-trees and quad-trees work, so use his information as a reference.
Another thing to consider is that you don't actually want to draw every cube to the screen, as this will eat up your framerate. Use object culling to draw only the visible cubes. Again, thinking of Minecraft: have you ever encountered a glitch where you can see through the blocks and under the world? That's because Minecraft only draws the top layer of blocks. With this many objects on screen, it would be a worthwhile investment to look into object culling using both the camera frustum and occlusion queries.
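As a language-agnostic sketch of the simplest form of this (shown in Python for brevity; the same logic ports directly to C++), only faces that touch air need to be drawn at all:

# `solid` is assumed to be a set of (x, y, z) coordinates that contain a block.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def visible_faces(solid):
    # Yield (block, face_normal) pairs for faces adjacent to empty space.
    # A cube surrounded by six solid neighbours contributes nothing, so the
    # vast majority of blocks in a filled chunk never reach the GPU.
    # Frustum and occlusion culling are applied on top of this per frame.
    for x, y, z in solid:
        for dx, dy, dz in NEIGHBOURS:
            if (x + dx, y + dy, z + dz) not in solid:
                yield (x, y, z), (dx, dy, dz)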
For information on using DirectX I would recommend any book by Frank Luna. I own this book myself and it never leaves my side when programming in DirectX. http://www.amazon.com/Introduction-Game-Programming-Direct-9-0c/dp/1598220160/ref=sr_1_3?ie=UTF8&qid=1332478780&sr=8-3
I highly recommend this book as I've learned almost everything I know about DirectX from it.
Upon a Google search I found this link that discusses occlusion culling, because Luna only covers frustum culling, not occlusion culling. I hear the Programming Gems series mentioned a lot, but I can't attest to it personally. http://http.developer.nvidia.com/GPUGems/gpugems_ch29.html
Hope this helps.
Oct-trees are fairly simple, especially axis-aligned ones like those in Minecraft.
It is basically just a 3D extension of the quad-tree. You may find it easier to learn about Quad-trees first.
To give you a quick overview of a quad-tree: basically you start off with a square. Now imagine placing a much smaller square inside that square. If you wish to build a quad tree representing it, you first divide the original square into 4 equal-sized squares.
Next you check each quadrant, and if the smaller square is in that quadrant, you split that quadrant into 4 smaller squares. Then you check those 4 quadrants, choose the one containing the smaller square, and subdivide again. Eventually your smaller square will be wholly contained in one or more quadrants inside quadrants inside quadrants (etc.). You have now built your quad tree.
Now if you imagine you are searching for a specific square inside the larger square you can quickly see the bonus of a quad-tree. Instead of searching every possible square in the quad tree (equivalent to searching every pixel in a texture) you can now check the first 4 quadrants to see if they contain it. If one does you can check its 4 sub quadrants and so on until you find the smallest quadrant wholly containing your square (or pixel). This way you end up doing many fewer tests to find your object.
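A minimal point-region quadtree sketch in Python, just to make the subdivision concrete (class and parameter names are illustrative; an oct-tree is the same idea with eight children and a z axis):

class QuadTree:
    def __init__(self, x, y, size, capacity=1):
        self.x, self.y, self.size = x, y, size   # top-left corner and side length
        self.capacity = capacity
        self.points = []
        self.children = None                     # four sub-quadrants once split

    def insert(self, px, py):
        if not (self.x <= px < self.x + self.size and self.y <= py < self.y + self.size):
            return False                         # point not inside this quadrant
        if self.children is None:
            self.points.append((px, py))
            if len(self.points) > self.capacity:
                self._split()
            return True
        return any(child.insert(px, py) for child in self.children)

    def _split(self):
        half = self.size / 2
        self.children = [QuadTree(self.x + dx * half, self.y + dy * half, half, self.capacity)
                         for dx in (0, 1) for dy in (0, 1)]
        for px, py in self.points:               # push existing points down a level
            any(child.insert(px, py) for child in self.children)
        self.points = []

    def query(self, px, py):
        # Descend to the smallest quadrant containing (px, py): only a handful
        # of tests instead of scanning every stored point.
        if self.children is None:
            return self
        for child in self.children:
            if child.x <= px < child.x + child.size and child.y <= py < child.y + child.size:
                return child.query(px, py)
        return self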
Now an oct-tree is basically the same thing but instead of encoding squares in squares you now encode cubes in cubes. Every cube can be split into 8 smaller octants (and hence the name oct-tree).
Oct-trees have the advantage that, by knowing which octant you are starting in, you can easily cast rays through the oct-tree to find collisions (as an octant is either full, partially full, or empty). If an octant is empty then you pass right through it and check the octant on the other side. If it is partially full you check its sub-octants, and so on, until you either find a full octant (i.e. you've hit a solid cube and you render it) or you pass through the octant entirely, and hence there is no cube to render. This is how Minecraft works (I'm guessing, anyway ;)). It is also a good way of quickly rendering voxel data, which more people are looking into these days as a possible future rendering mechanism.
Hope that's some help! :)
Oct-trees and quad-trees are useful for culling sections of your geometry to render. Minecraft uses 16x16x16 render blocks to break up the terrain into manageable pieces.
Another technique to consider is instancing. Instancing is where you tell the GPU to render an object multiple times in different locations. It's used for crowd rendering, trees, anything where the geometry is the same, but you have lots of them.
http://msdn.microsoft.com/en-us/library/windows/desktop/bb173349(v=vs.85).aspx
http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter03.html
Here is an article where the writer duplicates the Minecraft renderer in OpenGL 4. While the code won't apply directly to your case, the techniques (culling cubes that are surrounded, etc.) can be applied to a DirectX renderer.
http://codeflow.org/entries/2010/dec/09/minecraft-like-rendering-experiments-in-opengl-4/
Don't be fooled by the blocky graphics and the low-quality textures. Minecraft is an extremely complex renderer, and you'll need to come up with ways to handle the sheer number of items involved. For example, even a "small" part of the world, say 100x100x100 blocks, is 1 million blocks. Pushing each block to the GPU as a separate mesh would kill your GPU. The Minecraft renderer is far more complex than most first-person shooters when you get down to the technology.
