This question has bothered me for a long time. A basic vehicle counting program involves two steps: 1. detect (recognize) a vehicle; 2. track the vehicle by its features.
However, if vehicle #1 is found at time t, then at t+1 the program starts to track it, but #1 can also be found again by the detection step, so by t+2 two vehicles are being tracked even though there is actually just one #1 in the frame. How can I avoid duplicate detections of a vehicle that is already being tracked?
Thanks in advance!
If I understood correctly, you are concerned about detecting the object that you are already tracking (lack of detector/tracker communication). In that case you can either:
Pre-check - during detection, exclude the areas where you already track objects, or
Post-check - discard detected objects that are near tracked ones (if "selective" detection is not possible for your approach for some reason)
There are several possible implementations.
Mask. Create a binary mask where areas near tracked objects are "marked" (e.g. ones near tracked objects and zeros everywhere else). Given such a mask, before running detection at a particular location you can quickly check whether something is already being tracked there and abort detection (Pre-check approach), or remove the detected object if you stick with the Post-check approach.
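For illustration, a minimal Python/NumPy sketch of the mask idea (the (x, y, width, height) box format and the margin value are assumptions, not part of any particular library):

import numpy as np

def build_tracked_mask(frame_shape, tracked_boxes, margin=10):
    # Mark areas near tracked objects with ones; everything else stays zero.
    mask = np.zeros(frame_shape[:2], dtype=np.uint8)
    h, w = frame_shape[:2]
    for (x, y, bw, bh) in tracked_boxes:
        x0, y0 = max(0, x - margin), max(0, y - margin)
        x1, y1 = min(w, x + bw + margin), min(h, y + bh + margin)
        mask[y0:y1, x0:x1] = 1
    return mask

def is_already_tracked(mask, box):
    # Post-check: does this detected box overlap any tracked area?
    x, y, bw, bh = box
    return bool(mask[y:y + bh, x:x + bw].any())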
Brute-force. Calculate distances between a particular location and each of the tracked ones (you can also check overlapping area and other characteristics). You can then discard detections that are too close and/or too similar to already tracked objects.
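And a corresponding brute-force sketch, comparing every detection against every tracked box (the IoU threshold of 0.3 is an arbitrary illustrative choice):

def iou(a, b):
    # Intersection-over-union of two (x, y, w, h) boxes.
    ax0, ay0, ax1, ay1 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx0, by0, bx1, by1 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    ix = max(0, min(ax1, bx1) - max(ax0, bx0))
    iy = max(0, min(ay1, by1) - max(ay0, by0))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def filter_new_detections(detections, tracked, iou_threshold=0.3):
    # Keep only detections that don't overlap any tracked object too much.
    return [d for d in detections
            if all(iou(d, t) < iou_threshold for t in tracked)]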
Let's consider which way is better (and when).
Mask needs O(N) operations to add all tracked objects to the mask and O(M) operations to check all locations of interest. That's O(N + M) = O(max(N, M)), where N is number of tracked objects and M is number of checked locations (detected objects, for example). Which number (N or M) will be bigger depends on your application. Additional memory is also needed to hold the binary mask (usually it is not very important, but again, it depends on the application).
Brute-force needs O(N * M) operations (each of M locations is checked against N candidates). It doesn't need additional memory, and allows doing more complex logic during checks. For example, if an object suddenly changes size/color/whatever within one frame - we should probably not track it (since it may be a completely different object occluding the original one) and do something else instead.
To sum up:
Mask is asymptotically better when you have a lot of objects. It is almost essential if you do something like a sliding window search during detection, and can exclude some areas (since in this case you will likely have a large M). You will likely use it with Pre-check.
Brute-force is OK when you have few objects and need to do checks that involve different properties. It makes most sense to use it with Post-check.
If you happen to need something in between - you'll have to be more creative and either encode object properties in the mask somehow (to achieve constant look-up time) or use more complex data structures (to speed up the "Brute-force" search).
I'm interested in replicating "hierarchies" of data, say, similar to addresses.
Area
District
Sector
Unit
but you may have different pieces of data associated with each layer, so you may know the area of sectors but not of units, and you may know the population of a unit; basically it's not a homogeneous tree.
I know little about replication of data except brushing against Brewer's theorem/CAP, and some naive intuition about what eventual consistency is.
I'm looking for SIMPLE mechanisms to replicate this data from an ACID RDB into other ACID RDBs. The system needs to eventually converge, and obviously each RDB will enforce its own locally consistent view, but any 2 nodes may not match at any given time (except 'eventually').
The simplest way to approach this is to simply store all the data in a single message from some designated leader and distribute it... like an overnight dump-and-load process, but that's too big.
So the next simplest thing (I thought) was that if something inside an area changes, I can export the complete set of data inside that area and load it into the nodes; that's still quite a coarse algorithm.
The next step was, if say an 'object' at any level changed, to send all the data in the path to that 'object', i.e. if something in a sector is amended, you would send the data associated with the sector, its parent the district, and its parent the area (with some sort of version stamp and, let's say, last update wins). What I wanted to do was ensure that any replication 'update' was guaranteed to succeed (so it needs the whole path, which would potentially be created if it didn't exist).
Then I stumbled on CRDTs and thought... ah, I'm reinventing the wheel here, and the algorithms are allegedly easy in principle but tricky to get correct in practice.
Are there standard, accepted patterns for doing this sort of thing?
In my use case the hierarchies are quite shallow, and there is only a single designated leader (at this time). I'm quite attracted to state-based CRDTs because then I can ignore ordering.
Simplicity is the key requirement.
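For what it's worth, here is a minimal Python sketch of the "send the whole path, last update wins" idea as a state-based merge (the path keys and version stamps are illustrative assumptions):

def merge(local, incoming):
    # State-based, last-update-wins merge of two replicas.
    # Each replica maps a node's full path, e.g. ('area1', 'district2', 'sector7'),
    # to a (version_stamp, payload) pair. Higher stamps win; a real system would
    # also need a deterministic tie-break (e.g. replica id) for equal stamps.
    merged = dict(local)
    for path, (version, payload) in incoming.items():
        if path not in merged or version > merged[path][0]:
            merged[path] = (version, payload)
    return merged

# Example: an update to a sector arrives together with its whole ancestor path,
# so missing parents are created and stale parents are overwritten.
replica_a = {('area1',): (3, {'population': 1200})}
update    = {('area1',): (1, {}),
             ('area1', 'district2'): (5, {'name': 'North'}),
             ('area1', 'district2', 'sector7'): (5, {'area_km2': 4.2})}
print(merge(replica_a, update))

Because the merge is computed over the whole incoming state, updates can arrive in any order and be re-applied safely, which is the property that makes state-based CRDTs attractive here.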
Actually it appears I've reinvented (in a very crude naive way) the SHELF algorithm.
I'll write some code and see if I can get it to work, and try to understand what's going on.
I am using OpenCV and OpenVINO. When I have a face detected, I draw it with cv2.rectangle and send the coordinates to move the motors, but I only want this for the first person bounded by a box, because when it sees multiple people it sends multiple coordinates, causing the servo and stepper motors to go crazy. Any help would be appreciated. Thank you
Generally, code runs line by line. You'll need to create a proper function for each scenario so that the data can be handled and processed properly. In short, you'll need to implement error handling and data handling (probably more than these, depending on your software/hardware design). If you are trying to run multiple threads of execution at the same time, it is better to use multithreading.
Besides, you are using 2 types of motors. Simply taking in all the data is inefficient and prone to cause missing data. You'll need to be clear about what the servo motor and stepper motor tasks are, the relations between coordinates, who triggers what, what to do (task X) if something fails or some sequence is missing, etc.
For example, the sequence of Data A should produce Result A, but it is halted halfway because Data B went into the buffer, interfered with Result A, and at the same time ruined Result B, which was anticipated to happen. (This is what happened in your program.)
It's good to review and design your whole process by creating a coding flowchart (a diagram that represents an algorithm). It will give you a clear idea of what should happen for each sequence of code. Then, design a proper handler for each situation.
Can you share more insight into your (pseudo-)code, please?
It sounds easy - you trigger a face-detection inference request and you get a list/vector with all detected faces (the region of interest for each detected face), including false positives, which require some consistency checks to filter out.
If you are interested in the first detected face only - then you could just process the first returned result from the list/vector.
However, you will see that the order of results sometimes changes, i.e. when 2 faces A and B were detected, the next run could still return both faces, but B first and then A.
You could add object-tracking on top of face-detection to make sure you always process the same face.
(But even that could fail sometimes)
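A minimal sketch of that idea (the detection box format and all names here are assumptions; adapt them to whatever your OpenVINO post-processing returns):

def pick_face_to_follow(detections, last_center=None):
    # detections: list of (xmin, ymin, xmax, ymax) boxes.
    # Returns the (cx, cy) center of the single face to follow, or None.
    if not detections:
        return None

    def center(box):
        xmin, ymin, xmax, ymax = box
        return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)

    if last_center is None:
        # No history yet: take the largest box (usually the nearest person).
        box = max(detections, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))
    else:
        # Otherwise stick with the face whose center moved the least since last frame.
        box = min(detections, key=lambda b: (center(b)[0] - last_center[0]) ** 2
                                          + (center(b)[1] - last_center[1]) ** 2)
    return center(box)

Only the returned center would then be sent to the servo/stepper code, so multiple detections in one frame no longer produce multiple motor commands.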
Is it preferable to store redundant information (which can otherwise be generated from existing data), or to instead convert the existing data each time you need access?
I've simplified my specific problem as best as I can below, hoping that the provided answers are useful as future-reference material.
Example:
Let's say we've developed a program that places data into Squares on a grid (like a super-descriptive game of Tic-Tac-Toe or something) and assigns various details, and a unique identification number to each:
Throughout our program, we often perform logic based on a square's X and/or Y coordinates (checking for 3 in a row) and other times we only need the ID (perhaps to access a string at "SquareName[ID]") - We aren't exactly certain which of these two is accessed more often, but it's a rather close competition.
Up until now we've simply stored the ID inside the square class, and converted it with some simple formulas whenever just the X or Y are needed. Say we want to get coordinates for one square in particular:
int CurrentX = ((this.Square.ID - 1) % 3) + 1; // X coordinate, 1 through 3
int CurrentY = ((this.Square.ID - 1) / 3) + 1; // Y coordinate, 1 through 3
Since the squares don't move around or change ID after setup, part of me believes it would be simpler just to store all 3 values inside the Square class, but my other part cringes at the redundancy since access to X and Y is already easy enough to calculate from the existing ID.
(Note, This program itself is not very memory or resource intensive, nor does the size of the grid get much larger, so it mostly comes down to which option is a better practice or rule of thumb.)
What would you do?
As a rule of thumb, for a system where the data is read/write, store your basic data without redundancy.
When performance or other considerations become a practical issue, then you should denormalize as necessary. (i.e. wait for it to be a problem, don't pre-optimize overly much).
Your goal should be the most maintainable code possible. That usually means writing the least code possible. Having extra code to maintain redundant copies of data points will make your code more brittle.
If those are values which can be determined at the moment of creation and then do not change anymore, I would go for variables populated in the constructor. It's not redundant info insofar as it isn't stored anywhere else, but that's not my main point. When reading code, I'd usually expect that whatever is computed at the time of request might change per request. It is easy to find the point in the source where a field is populated and where it is changed, especially if it never changes, but you might end up slightly confused when looking at a calculation which always returns the same result, since its inputs can't change, and wonder whether you're just missing a case or whether it is really static.
Also, by using a descriptive variable name, you can get rid of the comments. Not that I generally aim at not commenting, but source code which doesn't even need comments is a pretty safe signal of easy-to-understand code, which might (/should) be your aim.
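A minimal sketch of that suggestion (in Python for brevity; the class layout mirrors the question but the names are otherwise assumptions, and IDs are assumed to run 1..9 in row-major order):

class Square:
    GRID_WIDTH = 3

    def __init__(self, square_id):
        # Coordinates are derived exactly once, at construction, and never change.
        self.id = square_id
        self.x = (square_id - 1) % self.GRID_WIDTH + 1    # column, 1 through 3
        self.y = (square_id - 1) // self.GRID_WIDTH + 1   # row, 1 through 3

# Example: ID 5 is the centre square.
assert (Square(5).x, Square(5).y) == (2, 2)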
For example, in Fallout 3, a save game stores the state and location of every single object and NPC in the game, and only takes up a few MB. How do they do that!?!?
And then, during game play, how is this data added/retrieved in/from memory such that it can be displayed to the player in real-time?
UPDATED: (I'm going to make you work for your answers :P)
Based on Kevin Crowell's answer...
So I guess you would have a rendering distance that would apply to objects and NPCs, and you would "SELECT" the objects and NPCs within the given range. However, what type of data store would you use in order to get these objects?
In other words, would you have a gigantic array of every object in the game, and constantly update a smaller list that holds the visible objects to render?
Also, per Chaos' answer...
What would happen if you eventually touched every object in the game? Would your save game get bigger and bigger? In the case of Fallout 3, I'm pretty sure there aren't "stages" where the past data could just be dropped. Everything is persisted when you leave/return to a location. So how do you think this specific case is implemented?
With all the big hard disks nowadays, even developers seem to forget how many bytes there are in a megabyte. So to answer the question in the title: games store large amounts of data by creating savegames that are several megabytes large.
To illustrate how big a megabyte is: it's 8 million bits, which is enough to encode 2^8000000, or roughly 10^2400000, states. In comparison, there are only about 10^80 atoms in the observable universe. Now in a (save)game there are multiple subsystems with distinct states; e.g. in an RPG each NPC has its own state. But how much state is there, really? Their position in a town might be saved as 16 bits (do you remember their exact position if they're walking around anyway?). Their mood/disposition/etc. as another 8 bits, and that allows for more emotions than some people have.
When it comes to storing this kind of data in-game, the typical data structure is a quadtree. This is a data structure that allows you to find the objects in a certain X-Y region in O(log N). In some cases, game developers find it easier to pre-partition the world into zones. This reduces the amount of run-time calculation. A good example was Doom. Its maps had visibility pre-calculated: for each point one could quickly determine which zone it belonged to, and for each zone the set of visible objects was pre-calculated. This reduced the number of objects that needed runtime visibility checks.
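A minimal point-quadtree sketch in Python, just to illustrate the region-query idea (it is not how any particular engine implements it):

class QuadTree:
    # Stores (x, y, obj) points in a square region and answers rectangular queries.
    def __init__(self, x, y, size, capacity=4):
        self.x, self.y, self.size = x, y, size
        self.capacity, self.points, self.children = capacity, [], None

    def insert(self, px, py, obj):
        if not (self.x <= px < self.x + self.size and self.y <= py < self.y + self.size):
            return False                       # point lies outside this node
        if self.children is None and len(self.points) < self.capacity:
            self.points.append((px, py, obj))
            return True
        if self.children is None:
            self._split()
        return any(child.insert(px, py, obj) for child in self.children)

    def _split(self):
        half = self.size / 2.0
        self.children = [QuadTree(self.x + dx, self.y + dy, half, self.capacity)
                         for dx in (0, half) for dy in (0, half)]
        for px, py, obj in self.points:        # push existing points down a level
            any(child.insert(px, py, obj) for child in self.children)
        self.points = []

    def query(self, x0, y0, x1, y1, found=None):
        # Collect all objects whose point lies inside [x0, x1) x [y0, y1).
        found = [] if found is None else found
        if x1 <= self.x or x0 >= self.x + self.size or y1 <= self.y or y0 >= self.y + self.size:
            return found                       # query region misses this node entirely
        for px, py, obj in self.points:
            if x0 <= px < x1 and y0 <= py < y1:
                found.append(obj)
        for child in (self.children or []):
            child.query(x0, y0, x1, y1, found)
        return found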
It can simply be mapping objects, or NPCs, to an X,Y,Z coordinate plane. That information can be stored cheaply.
During gameplay, all of those objects are still mapped to a coordinate system at all times. They just need to read in the save information and start from there.
I think you're overestimating the complexity of what's being stored/retrieved. You don't need to store the 3D models for the objects, or their textures, or any of the things that make up large parts of a game's size-on-disk.
First of all, as chaos mentioned, it's only necessary to store information about things that have been moved. Even then, you probably only need to store, for each of those, the new position and orientation (assuming there aren't other variables involved, like "damaged"). So that's two vectors for each object, for a grand total of around 24 bytes per object. That means you can store the information for roughly 40,000 objects per megabyte. That's an awful lot of objects to have moved around.
Restoring this data is no more complex than placing the objects in the first place. Every object has to have a default position/orientation defined for the game to put it somewhere, so all you're doing is replacing the default with the stored value in the save file. This is not complex, and doesn't require any significant additional processing.
In Fallout 3 in particular, the map is divided in a grid fashion. You can only see your current square and the ones immediately adjacent. The type of data store is not really important - it can be a SQLite database, a tree serialized to disk, or something else entirely.
...would you have a gigantic array of every object in the game, and constantly update a smaller list that holds the visible objects to render?
Generally yes, but the "gigantic array" doesn't need to be in memory. And there are more lists - objects in current and adjacent grid square (you can be attacked from behind - not in visible list), the visible list, the timer list...
What would happen if you eventually touched every object in the game? Would your save game get bigger and bigger?
Could - if there is a default state table for everything, the save can contain only the differences. The save will then grow as you progress.
Everything is persisted when you leave/return to a location.
Nope. Items you drop outside of your house will eventually disappear. Bodies too, probably. Random monsters are respawned every once in a while. This is both convenient to game designers and consistent with the real world.
If you think about the information you need to save, it's really not that much:
E.g.
Position
Orientation
Inventory
Health
Objective-state
There are lots more of course, many of which depend on both the type of game and how the save structure is organized.
Some games like Resident Evil only allow saves when you enter a new zone meaning you don't have to store all the information for entities in both zones. When you "load" a save their attributes come from the disc.
As to how this data is retrieved/modified, I'm not quite sure I understand. It's just data in the console's memory. When the player saves, it's written to the save device, and when they load, it's restored.
One major technique is differential saves: only saving state that's something other than its default. Compare and contrast "saving the state and location of every object in the game world" with "saving the state and location of every object in the game world that the player has moved or altered".
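As a rough sketch of that contrast (object ids and fields are illustrative; a real engine would use a compact binary format rather than dicts):

def make_diff_save(default_world, current_world):
    # Keep only the objects whose state differs from the shipped defaults.
    return {obj_id: state
            for obj_id, state in current_world.items()
            if state != default_world.get(obj_id)}

def load_diff_save(default_world, diff):
    # Rebuild the full world: defaults first, then overlay the saved differences.
    world = dict(default_world)
    world.update(diff)
    return world

# Only the teacup the player knocked over ends up in the save file.
defaults = {'teacup_17': {'pos': (10, 4, 1)}, 'npc_moira': {'pos': (0, 0, 0)}}
current  = {'teacup_17': {'pos': (12, 4, 0)}, 'npc_moira': {'pos': (0, 0, 0)}}
assert make_diff_save(defaults, current) == {'teacup_17': {'pos': (12, 4, 0)}}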
Echoing the other answers, the biggest savings comes from eliminating all unnecessary state data.
If you look at 8-bit side-scroller games, they will start discarding state as soon as things are offscreen, and oftentimes retain nothing, because their resources are too tight to keep around more than the minimum number of instances.
Doing it on the macro-level for a game like Fallout 3 is just a matter of increasing the scope of the problem. You start sectioning up the landscape by grid or other geometrical methods, and spawn/despawn stuff as the player moves from one section to the next. You ideally keep the size of each area small so that in-memory state is not high. You figure out the bare minimum of state needed to keep around NPC and item instances, and in the layout data you tag as much as possible to auto-respawn so that it doesn't need any state saved.
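A minimal sketch of that spawn/despawn bookkeeping (the cell size and the load/unload hooks are assumptions):

CELL_SIZE = 256.0   # world units per grid cell (illustrative)

def cell_of(x, y):
    # Map a world position to the grid cell that owns it.
    return int(x // CELL_SIZE), int(y // CELL_SIZE)

def update_loaded_cells(player_x, player_y, loaded, load_cell, unload_cell, radius=1):
    # Keep only the player's cell and its neighbours in memory.
    cx, cy = cell_of(player_x, player_y)
    wanted = {(cx + dx, cy + dy)
              for dx in range(-radius, radius + 1)
              for dy in range(-radius, radius + 1)}
    for cell in wanted - loaded:
        load_cell(cell)     # spawn objects/NPCs for a cell entering range
    for cell in loaded - wanted:
        unload_cell(cell)   # persist the minimal state, then despawn the rest
    return wanted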
If you want to be pointed at a specific data structure, an example serialization format might be a linear stream indexed by a tree of pointers, where the organization of the tree corresponds to the map layout.
On a related note, game engines often employ Zip compression, to keep the size of all that content down and also make some operations faster.
Besides what everybody else said, I would like to add that state doesn't necessarily imply just position and movement, but also properties for the respective state. Usually a game engine has a feature which allows you to save the data of a certain class.
Say you have a Player class and you are well into the story; when you click save, the possible data that can be stored is:
Where the player is located in the level/map
What his attributes are: health, mana, strength, intelligence, etc.
What skills he has
What level he is
Globally we can also have:
How many references (names that allow the engine to pick up an object from a list) to objects are stored in that specific level - in other words, when you load it, which objects should be loaded along with it.
Whether we are using physics, and if so, who uses it.
And many more. Fallout 3 has one type of save, another game will have another. It really depends on the genre and the engine in use.
I am developing a game for the web. The map of this game will be a minimum of 2000km by 2000km. I want to be able to encode elevation and terrain type at some level of granularity - 100m X 100m for example.
For a 2000km by 2000km map, storing this information in 100m x 100m buckets would mean 20,000 by 20,000 elements, or a total of 400,000,000 records in a database.
Is there some other way of storing this type of information?
MORE INFORMATION
The map itself will not ever be displayed in its entirety. Units will be moved on the map in a turn based fashion and the players will get feedback on where they are located and what the local area looks like. Terrain will dictate speed and prohibition of movement.
I guess I am trying to say that the map will be used for the game and not necessarily for a graphical or display purposes.
It depends on how you want to generate your terrain.
For example, you could procedurally generate it all (using interpolation of a low-resolution terrain/height map - stored as two "bitmaps" - with random interpolation seeded from the x,y coords to ensure that the terrain doesn't morph), and use minimal storage.
If you wanted areas of terrain that were completely defined, you could store these separately and use them where appropriate, randomly generating the rest.
If you want completely defined terrain, then you're going to need to look into some kind of compression/streaming technique to only pull terrain you are currently interested in.
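A minimal sketch of the seeded-interpolation idea: heights are derived deterministically from the coordinates, so nothing needs to be stored per 100m cell (the hashing constants and lattice spacing are arbitrary):

import math

def lattice_height(ix, iy, seed=1234):
    # Deterministic pseudo-random height in [0, 1) at an integer lattice point.
    h = (ix * 374761393 + iy * 668265263 + seed * 982451653) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 1274126177) & 0xFFFFFFFF
    return (h % 10000) / 10000.0

def height_at(x, y, lattice_spacing=1000.0):
    # Bilinear interpolation between lattice points, so re-querying the same
    # coordinates always yields the same terrain (it never "morphs").
    gx, gy = x / lattice_spacing, y / lattice_spacing
    ix, iy = int(math.floor(gx)), int(math.floor(gy))
    fx, fy = gx - ix, gy - iy
    top = lattice_height(ix, iy) * (1 - fx) + lattice_height(ix + 1, iy) * fx
    bot = lattice_height(ix, iy + 1) * (1 - fx) + lattice_height(ix + 1, iy + 1) * fx
    return top * (1 - fy) + bot * fy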
I would treat it differently, by separating terrain type and elevation.
Terrain type, I assume, does not change as rapidly as elevation - there are probably sectors of the same type of terrain that stretch over much more than the lowest level of granularity. I would map those sectors into database records or some kind of hash table, depending on performance, memory and other requirements.
Elevation, I would assume, is semi-continuous, as it changes gradually for the most part. I would try to map the values into sets of continuous functions (different sets between parts that are not continuous, as in a sudden change in elevation). For any set of coordinates for which the terrain is the same elevation, or can be described by a simple function, you just need to define the range this function covers. This should greatly reduce the amount of information you need to record to describe the elevation at each point of the terrain.
So basically I would break down the map into different sectors composed of (x,y) ranges, once for terrain type and once for terrain elevation, and build a hash table for each which can return the appropriate value as needed.
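A minimal sketch of such a lookup, with terrain type and elevation kept separate (the sector size, defaults and example functions are assumptions):

SECTOR_SIZE = 10_000   # metres per sector side, i.e. 100 x 100 of the 100m cells

# Sparse tables: only sectors that differ from the defaults need an entry.
terrain_by_sector = {(0, 0): 'plains', (0, 1): 'forest'}
elevation_by_sector = {(0, 0): lambda x, y: 120.0,             # flat sector
                       (0, 1): lambda x, y: 120.0 + 0.01 * x}  # gentle slope

def sector_of(x, y):
    return int(x // SECTOR_SIZE), int(y // SECTOR_SIZE)

def terrain_at(x, y):
    return terrain_by_sector.get(sector_of(x, y), 'ocean')

def elevation_at(x, y):
    fn = elevation_by_sector.get(sector_of(x, y), lambda x, y: 0.0)
    return fn(x, y)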
If you want the kind of granularity that you are looking for, then there is no obvious way of doing it.
You could try a 2-dimensional wavelet transform, but that's pretty complex. Something like a Fourier transform would do quite nicely. Plus, you probably wouldn't go about storing the terrain in a one-record-per-piece-of-land way; it makes more sense to have some sort of database field which can store an encoded matrix.
I think the usual solution is to break your domain up into "tiles" of manageable sizes. You'll have to add a little bit of logic to load the appropriate tiles at any given time, but not too bad.
You shouldn't need to access all that info at once - even if each 100m x 100m bucket occupied a single pixel on the screen, no screen I know of could show 20k x 20k pixels at once.
Also, I wouldn't use a database--look into height mapping--effectively using a black & white image whose pixel values represent heights.
Good luck!
That will be an awful lot of information no matter which way you look at it. 400,000,000 grid cells will take their toll.
I see two ways of getting around this. Firstly, since it is a web-based game, you might be able to get a server with a decently sized HDD and store the 400M records in it just as you would normally. Or, more likely, create some sort of storage mechanism of your own for efficiency. Then you would only have to devise a way to access the data efficiently, which can be done by taking into account the fact that you will doubtfully need to use it all at once. ;)
The other way would be some kind of compression. You have to be careful with this, though. Most out-of-the-box compression algorithms won't allow you to decompress an arbitrary location in the stream. Perhaps your terrain data has some patterns in it you can use? I doubt it will be completely random; more likely I predict large areas with the same data. Perhaps those can be encoded as such?
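A minimal sketch of working around that limitation by compressing fixed-size chunks independently, so any single cell can be read without decompressing the whole map (the chunk size is arbitrary):

import zlib

CELLS_PER_CHUNK = 100_000   # one byte of terrain data per 100m cell (illustrative)

def compress_map(cells):
    # Split the flat cell array (a bytes object) into chunks, compress each on its own.
    return [zlib.compress(cells[i:i + CELLS_PER_CHUNK])
            for i in range(0, len(cells), CELLS_PER_CHUNK)]

def read_cell(chunks, index):
    # Decompress only the chunk that contains the requested cell.
    chunk = zlib.decompress(chunks[index // CELLS_PER_CHUNK])
    return chunk[index % CELLS_PER_CHUNK]

# Large runs of identical terrain compress extremely well.
world = bytes([3]) * 1_000_000
chunks = compress_map(world)
assert read_cell(chunks, 123_456) == 3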