What I'm doing is GPGPU on WebGL, and I don't know whether the access pattern I'll be describing applies to general graphics and gaming programs. In our code we frequently come across data which needs to be summarized or reduced per output texel. A very simple example is matrix multiplication, where for every output texel you return a value which is the dot product of a row of one input and a column of the other input.
This has been the sore point of our performance, not so much because of the computation but because of the repeated data access. So I've been trying to find a pattern of reads or data layouts which would expedite this operation, and I have been completely unsuccessful.
I will be describing some assumptions and some schemes below. The sample code for all of these is under https://github.com/jeffsaremi/webgl-experiments
Unfortunately, due to size, I wasn't able to use StackOverflow's 'snippet' feature. NOTE: all examples write to the console, not the HTML page.
Base matmul implementation. Example: [2,3]x[3,4]->[2,4]. In a simplistic form this produces two textures of (w:3, h:2) and (w:4, h:3). For each output texel I will be reading along the X axis of the left texture but along the Y axis of the right texture. (see webgl-matmul.html)
Assuming that the GPU accesses data similarly to the CPU -- that is, block by block -- if I read along the width of the texture I should be hitting the cache fairly often.
For this, I lay out both textures in a way that I only do dot products of corresponding rows (along the texture width). Example: [2,3]x[4,3]->[2,4]. Note that the data for the right texture is now transposed, so that for each output texel I do a dot product of one row from the left and one row from the right. (see webgl-matmul-shared-alongX.html)
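For illustration, here is a rough NumPy sketch of the two layouts (this is just the access-pattern idea on the CPU, not the actual WebGL shader; the function and variable names are made up):

```python
import numpy as np

# Example shapes from above: [2,3] x [3,4] -> [2,4]
A = np.arange(6, dtype=np.float32).reshape(2, 3)   # left texture  (w:3, h:2)
B = np.arange(12, dtype=np.float32).reshape(3, 4)  # right texture (w:4, h:3)

# Baseline: each output texel reads a row of A (along X) and a column of B (along Y).
def matmul_row_col(A, B):
    out = np.zeros((A.shape[0], B.shape[1]), dtype=np.float32)
    for i in range(A.shape[0]):
        for j in range(B.shape[1]):
            out[i, j] = np.dot(A[i, :], B[:, j])   # row of A, column of B
    return out

# "Along X" variant: pre-transpose B once so that both reads walk along texture width.
def matmul_row_row(A, Bt):
    out = np.zeros((A.shape[0], Bt.shape[0]), dtype=np.float32)
    for i in range(A.shape[0]):
        for j in range(Bt.shape[0]):
            out[i, j] = np.dot(A[i, :], Bt[j, :])  # row of A, row of transposed B
    return out

Bt = B.T.copy()  # one-time re-layout, analogous to uploading the right texture transposed
assert np.allclose(matmul_row_col(A, B), matmul_row_row(A, Bt))
```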
To ensure that the above assumption actually holds, I also created a negative test. In this test I read along the Y axis of both the left and right textures, which should have the worst performance of all. The data is pre-transposed so that the results still make sense. Example: [3,2]x[3,4]->[2,4]. (see webgl-matmul-shared-alongY.html)
So I ran these -- and I hope you can too, to see for yourself -- and I found no evidence to support either the existence or non-existence of such caching behavior. You need to run each example a few times to get consistent results for comparison.
Then I came across this paper http://fileadmin.cs.lth.se/cs/Personal/Michael_Doggett/pubs/doggett12-tc.pdf which, in short, claims that the GPU caches data in blocks (or tiles, as I call them).
Based on this promising lead I created a version of matmul (or dot product) which uses 2x2 blocks for its calculation. Of course, before using it I had to rearrange my inputs into that layout. The cost of that rearrangement is not included in my comparison; let's say I could do it once and run my matmul many times afterwards. Even this scheme contributed nothing to the performance, if it didn't take something away. (see webgl-dotprod-tiled.html)
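And for completeness, a rough CPU-side sketch of what I mean by the 2x2-block rearrangement and block-wise accumulation (NumPy, illustrative only; shapes are assumed to be multiples of 2):

```python
import numpy as np

def to_tiles(M, t=2):
    """Re-lay a (h, w) matrix as a (h/t, w/t) grid of t x t tiles."""
    h, w = M.shape
    return M.reshape(h // t, t, w // t, t).swapaxes(1, 2)

def tiled_matmul(A, B, t=2):
    At, Bt = to_tiles(A, t), to_tiles(B, t)
    out = np.zeros((A.shape[0] // t, B.shape[1] // t, t, t), dtype=A.dtype)
    for i in range(At.shape[0]):
        for j in range(Bt.shape[1]):
            for k in range(At.shape[1]):
                out[i, j] += At[i, k] @ Bt[k, j]   # accumulate one t x t tile product
    return out.swapaxes(1, 2).reshape(A.shape[0], B.shape[1])

A = np.random.rand(2, 4).astype(np.float32)
B = np.random.rand(4, 4).astype(np.float32)
assert np.allclose(tiled_matmul(A, B), A @ B, atol=1e-5)
```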
At this point I am completely out of ideas and any hints would be appreciated.
thanks
I have location data from a large number of users (hundreds of thousands). I store the current position and a few historical data points (minute data going back one hour).
How would I go about detecting crowds that gather around natural events like birthday parties etc.? Even smaller crowds (let's say starting from 5 people) should be detected.
The algorithm needs to work in almost real time (or at least once a minute) to detect crowds as they happen.
I have looked into many cluster analysis algorithms, but most of them seem like a bad choice. They either take too long (I have seen O(n^3) and O(2^n)) or need to know how many clusters there are beforehand.
Can someone help me? Thank you!
Let each user be its own cluster. When she gets within distance R of another user, form a new cluster, and separate again when the person leaves. You have your event when:
Number of people is greater than N
They have been in the same place for a time greater than T
The group is not moving (movement might indicate public transport)
It's not located in a public service building (hospital, school, etc.)
(plus a good number of other conditions)
One minute is plenty of time to get this done even on hundreds of thousands of people. A naive implementation would be O(n^2), but note that there is no point in comparing the location of each individual against everyone else, only against those in the close neighbourhood. As a first approximation you can divide the "world" into sectors, which also makes it easy to parallelize the task - and in turn to scale easily. More users? Just add a few more nodes (and scale down at times of limited activity).
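A minimal sketch of the sector idea (the cell size and names here are made up for illustration): bucket users into grid cells of size R and only compare each user against the 3x3 neighbourhood of cells.

```python
from collections import defaultdict

def pairs_within(users, R):
    """users: dict user_id -> (x, y); returns all pairs closer than R."""
    cells = defaultdict(list)
    for uid, (x, y) in users.items():
        cells[(int(x // R), int(y // R))].append(uid)

    pairs = []
    for uid, (x, y) in users.items():
        cx, cy = int(x // R), int(y // R)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for other in cells.get((cx + dx, cy + dy), []):
                    if other <= uid:
                        continue  # skip self and avoid counting a pair twice
                    ox, oy = users[other]
                    if (x - ox) ** 2 + (y - oy) ** 2 <= R * R:
                        pairs.append((uid, other))
    return pairs

print(pairs_within({1: (0, 0), 2: (3, 4), 3: (100, 100)}, R=10))  # -> [(1, 2)]
```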
How to avoid "single-linkage problem" mentioned by author in comments? One idea would be to think in terms of 'mass' and centre of gravity. First of all, do not mark something as event until the mass is not greater than e.g. 15 units. Sure, location is imprecise, but in case of events it should average around centre of the event. If your cluster grows in any direction without adding substantial mass, then most likely it isn't right. Look at methods like DBSCAN (density-based clustering), good inspiration can be also taken from physical systems, even Ising model (here you think in terms of temperature and "flipping" someone to join the crowd). It is not a novel problem and I am sure there are papers that cover it (partially), e.g. Is There a Crowd? Experiences in Using Density-Based Clustering and Outlier Detection.
There is little use in doing a full clustering.
Just use a good database index.
Keep a database of the current positions.
Whenever you get a new coordinate, query the database with the desired radius, say 50 meters. A good index will do this in O(log n) for a small radius. If you get enough results, this may be an event, or someone joining an ongoing event.
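For illustration only, here is the same idea with an in-memory KD-tree standing in for the database index (a real deployment would use something like PostGIS or an R-tree index instead; the numbers below are made up):

```python
import numpy as np
from scipy.spatial import cKDTree

positions = np.random.rand(200_000, 2) * 50_000    # fake current positions, in metres
tree = cKDTree(positions)                          # the "index"

new_point = np.array([25_000.0, 25_000.0])
nearby = tree.query_ball_point(new_point, r=50.0)  # everyone within 50 metres
print(len(nearby), "people within 50 m")           # above a threshold -> possible event
```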
I'm trying to think of the best way to conduct some sort of analysis between two 3D models of the same object.
The first scan is of the original item and the second scan is after it has been put under some load x.
An example would be trying to find the difference between two types of metal.
I would like to be able to scan the initial metal cylinder, apply a measured load, scan it again, and then finally apply some sort of algorithm to compare the difference.
Is it possible to do this efficiently (maybe using Matlab) over, say, 50-100 items for an object around 5 inch^3?
I am assuming I will need to work out some sort of utility function as the total mass should be the same?
Would machine learning be beneficial in this case?
Any suggestions or direction would be amazing.
Thank you :)
EDIT: The scan files are coming through as '.stl'
I am using the ELKI MiniGUI to run LOF. I have found out how to normalize the data before running via -dbc.filter, but I would like to look at the original data records, not the normalized ones, in the output.
It seems that there is some flag called -normUndo, which can be set if using the command-line, but I cannot figure out how to use it in the MiniGUI.
This functionality used to exist in ELKI, but has effectively been removed (for now).
only a few normalizations ever supported this, most would fail.
there is no longer a well defined "end" with the visualization. Some users will want to visualize the normalized data, others not.
it requires carrying the normalization information along, which makes the data structures more complex (although the hierarchical approach we have now would allow this again)
due to numerical imprecision of floating point math, you would frequently not get out the exact same values as you put in
keeping the original data in memory may be too expensive for some use cases, so we would need to add another parameter, "keep non-normalized data"; furthermore, you would need to choose which version (normalized or non-normalized) to use for analysis, and which for visualization. This would not be hard with a full-blown GUI, but you are looking at a command line interface. (This is easy to do with Java, too...)
We would of course appreciate patches that contribute such functionality to ELKI.
The easiest way is this: Add a (non-numerical) label column, and you can identify the original objects, in your original data, by this label.
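A hedged sketch of what that could look like in practice (assuming a whitespace-separated input file and ELKI's default parser, which, as I understand it, treats non-numeric trailing columns as labels; file names are placeholders):

```python
import csv

# Append an identifier label to each row before handing the file to ELKI.
# The LOF output can then be matched back to the original records via this label.
with open("data.csv") as fin, open("data_labeled.txt", "w", newline="") as fout:
    writer = csv.writer(fout, delimiter=" ")
    for i, row in enumerate(csv.reader(fin)):
        writer.writerow([*row, f"id_{i}"])   # trailing non-numeric label column
```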
For example, in Fallout 3, a save game stores the state and location of every single object and NPC in the game, and only takes up a few MBs. How do they do that!?!?
And then, during game play, how is this data added/retrieved in/from memory such that it can be displayed to the player in real-time?
UPDATED: (I'm going to make you work for your answers :P)
Based on Kevin Crowell's answer...
So I guess you would have a rendering distance that applies to objects and NPCs, and you would "SELECT" the objects and NPCs within the given range. However, what type of data store would you use in order to get these objects?
In other words, would you have a gigantic array of every object in the game, and constantly update a smaller list that holds the visible objects to render?
Also, per Chaos' answer...
What would happen if you eventually touched every object in the game? Would your save game get bigger and bigger? In the case of Fallout 3, I'm pretty sure there aren't "stages" where past data could simply be dropped. Everything is persisted when you leave/return to a location. So how do you think this specific case is implemented?
With all the big hard disks nowadays, even developers seem to forget how many bytes there are in a megabyte. So to answer the question in the title: games store large amounts of data by creating savegames that are several megabytes large.
To illustrate how big a megabyte is: it's 8 million bits, which is sufficient to encode 2^8000000 ≈ 10^2400000 states. In comparison, there are only about 10^80 atoms in the universe. Now, in a (save)game there are multiple subsystems with distinct states; e.g. in an RPG each NPC has its own state. But how much state is there, really? Their position in a town might be saved as 16 bits (do you remember their exact position if they're walking around anyway?), their mood/disposition/etc. as another 8 bits, and that allows for more emotions than some people have.
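To make that concrete, a tiny sketch of packing one NPC's state along those lines (the layout is purely illustrative):

```python
import struct

position = 40_000   # coarse position within the town, 0..65535 (16 bits)
mood = 200          # disposition, 0..255 (8 bits)

packed = struct.pack("<HB", position, mood)
print(len(packed))                      # 3 bytes for this NPC
print(struct.unpack("<HB", packed))     # round-trips to (40000, 200)
```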
When it comes to storing this kind of data in-game, the typical data structure is a quadtree. This is a data structure that allows you to determine the objects in a certain X-Y region in O(log N). In some cases, game developers find it easier to pre-partition the world into zones, which reduces the amount of run-time calculation. A good example was Doom: its maps had visibility pre-calculated, so for each point one could quickly determine which zone it belonged to, and for each zone the set of visible objects was pre-calculated. This reduced the number of objects that needed runtime visibility checks.
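For illustration, a minimal quadtree sketch (not taken from any particular engine): insert objects by X-Y position and query everything inside an axis-aligned region.

```python
class QuadTree:
    def __init__(self, x, y, w, h, capacity=4):
        self.bounds = (x, y, w, h)       # top-left corner plus width/height
        self.capacity = capacity
        self.points = []
        self.children = None             # four sub-trees once this node splits

    def _contains(self, px, py):
        x, y, w, h = self.bounds
        return x <= px < x + w and y <= py < y + h

    def insert(self, px, py, payload=None):
        if not self._contains(px, py):
            return False
        if self.children is None:
            if len(self.points) < self.capacity:
                self.points.append((px, py, payload))
                return True
            self._split()
        return any(c.insert(px, py, payload) for c in self.children)

    def _split(self):
        x, y, w, h = self.bounds
        hw, hh = w / 2, h / 2
        self.children = [QuadTree(x, y, hw, hh), QuadTree(x + hw, y, hw, hh),
                         QuadTree(x, y + hh, hw, hh), QuadTree(x + hw, y + hh, hw, hh)]
        for px, py, payload in self.points:
            any(c.insert(px, py, payload) for c in self.children)
        self.points = []

    def query(self, qx, qy, qw, qh, found=None):
        found = [] if found is None else found
        x, y, w, h = self.bounds
        if qx > x + w or qx + qw < x or qy > y + h or qy + qh < y:
            return found                 # query region does not overlap this node
        for px, py, payload in self.points:
            if qx <= px <= qx + qw and qy <= py <= qy + qh:
                found.append((px, py, payload))
        if self.children:
            for c in self.children:
                c.query(qx, qy, qw, qh, found)
        return found

world = QuadTree(0, 0, 1000, 1000)
world.insert(10, 10, "barrel")
world.insert(400, 250, "npc_guard")
print(world.query(0, 0, 100, 100))       # -> [(10, 10, 'barrel')]
```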
It can simply be a mapping of objects, or NPCs, to an X,Y,Z coordinate plane. That information can be stored cheaply.
During gameplay, all of those objects are still mapped to a coordinate system at all times. They just need to read in the save information and start from there.
I think you're overestimating the complexity of what's being stored/retrieved. You don't need to store the 3D models for the objects, or their textures, or any of the things that make up large parts of a game's size-on-disk.
First of all, as chaos mentioned, it's only necessary to store information about things that have been moved. Even then, you probably only need to store, for each of those, the new position and orientation (assuming there are no other variables involved, like "damaged"). So that's two vectors for each object, which comes to a grand total of around 24 bytes per object. That means you can store the information for roughly 40,000 objects per megabyte. That's an awful lot of objects to have moved around.
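A quick back-of-the-envelope check of that figure (assuming three 32-bit floats for position and three for orientation):

```python
import struct

record = struct.pack("<6f", 1.0, 2.0, 3.0,     # position x, y, z
                            0.0, 90.0, 0.0)    # orientation, e.g. Euler angles
print(len(record))                  # 24 bytes per moved object
print(1_000_000 // len(record))     # ~41,000 such objects per megabyte
```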
Restoring this data is no more complex than placing the objects in the first place. Every object has to have a default position/orientation defined for the game to put it somewhere, so all you're doing is replacing the default with the stored value in the save file. This is not complex, and doesn't require any significant additional processing.
In Fallout 3 in particular, the map is divided in a grid fashion. You can only see your current square and the ones immediately next to it. The type of data store is not really important - it can be a SQLite database, a tree serialized to disk, or something else entirely.
...would you have a gigantic array of every object in the game, and constantly update a smaller list that holds the visible objects to render?
Generally yes, but the "gigantic array" doesn't need to be in memory. And there are more lists - objects in current and adjacent grid square (you can be attacked from behind - not in visible list), the visible list, the timer list...
What would happen if you eventually touched every object in the game? Would your save game get bigger and bigger?
It could - if there is a default state table for everything, the save can contain only the differences. The save will then grow as you progress.
Everything is persisted when you leave/return to a location.
Nope. Items you drop outside of your house will eventually disappear. Bodies too, probably. Random monsters are respawned every once in a while. This is both convenient to game designers and consistent with the real world.
If you think about the information you need to save it's really not that much;
E.g.
Position
Orientation
Inventory
Health
Objective-state
There are lots more of course, many of which depend on both the type of game and how the save structure is organized.
Some games like Resident Evil only allow saves when you enter a new zone meaning you don't have to store all the information for entities in both zones. When you "load" a save their attributes come from the disc.
As to how this data is retrieved/modified, I'm not quite sure I understand. It's just data in the console's memory. When the player saves, it's written to the save device, and when they load, it's restored.
One major technique is differential saves: only saving state that's something other than its default. Compare and contrast "saving the state and location of every object in the game world" with "saving the state and location of every object in the game world that the player has moved or altered".
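A hedged sketch of that idea (not Fallout's actual format; the object names and fields are made up):

```python
import json

default_state = {
    "barrel_001": {"pos": [10, 0, 5], "destroyed": False},
    "npc_guard":  {"pos": [40, 0, 2], "hostile": False},
}

current_state = {
    "barrel_001": {"pos": [10, 0, 5], "destroyed": False},  # untouched by the player
    "npc_guard":  {"pos": [12, 0, 7], "hostile": True},     # the player angered the guard
}

def make_save(defaults, current):
    # Store only the objects whose state differs from the default state table.
    return {oid: state for oid, state in current.items() if state != defaults[oid]}

def load_save(defaults, save):
    # Everything not mentioned in the save keeps its default state.
    return {oid: save.get(oid, state) for oid, state in defaults.items()}

save = make_save(default_state, current_state)
print(json.dumps(save))                            # only the guard is written out
assert load_save(default_state, save) == current_state
```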
Echoing the other answers, the biggest savings come from eliminating all unnecessary state data.
If you look at 8-bit side-scroller games, they will start discarding state as soon as things are offscreen, and oftentimes retain nothing, because their resources are too tight to keep around more than the minimum number of instances.
Doing it on the macro-level for a game like Fallout 3 is just a matter of increasing the scope of the problem. You start sectioning up the landscape by grid or other geometrical methods, and spawn/despawn stuff as the player moves from one section to the next. You ideally keep the size of each area small so that in-memory state is not high. You figure out the bare minimum of state needed to keep around NPC and item instances, and in the layout data you tag as much as possible to auto-respawn so that it doesn't need any state saved.
If you want to be pointed at a specific data structure, an example serialization format might be a linear stream indexed by a tree of pointers, where the organization of the tree corresponds to the map layout.
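A toy version of that serialization idea (purely illustrative; the section names and blob contents are placeholders):

```python
import json, struct

sections = {"vault_101": b"...state blob...", "megaton": b"...state blob..."}

def write_save(sections):
    # The body is one linear stream of per-section blobs; the header is an index of
    # (offset, length) pointers so the engine can seek straight to one section.
    index, body = {}, b""
    for name, blob in sections.items():
        index[name] = (len(body), len(blob))
        body += blob
    header = json.dumps(index).encode()
    return struct.pack("<I", len(header)) + header + body

def read_section(save, name):
    (header_len,) = struct.unpack_from("<I", save, 0)
    index = json.loads(save[4:4 + header_len])
    offset, length = index[name]
    start = 4 + header_len + offset
    return save[start:start + length]

save = write_save(sections)
assert read_section(save, "megaton") == sections["megaton"]
```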
On a related note, game engines often employ Zip compression, to keep the size of all that content down and also make some operations faster.
Besides what everybody else said, I would like to add that state doesn't necessarily imply just position and movement, but also the properties for the respective state. Usually a game engine has a feature which allows you to save the data of a certain class.
Say you have a Player class and you are well into the story; when you click save, the possible data that can be stored is:
Where is the player located in the level/map
What are his attributes: health, mana, strength, intelligence, etc.
What skills does he have.
What level is he.
Globally we can also have:
How many references (names that allow the engine to pick up an object from a list) to objects are being stored in that specific level; in other words, when you load, which objects should be loaded along with it.
Are we using physics, and if so, who uses it.
And many more. Fallout 3 has one type of save, another game will have another. It really depends on the genre and the engine in use.