Dynamic spatial data structures

suszterpatt · Feb 20, 2009

I'm about to start a programming project that involves tracking a virtual space in which objects are added, moved or removed dynamically (e.g. players running around a map, particle simulation, that sort of thing). I've spent the past couple hours reading up on different data structures to use, and I'm simply overwhelmed by the sheer amount of data types and algorithms out there (kd-trees, R-trees, Hilbert-trees, etc). Alas, I turn to AT:HT to guide my decision.

The data structure needs to meet the following criteria:

1) Easy to understand/implement in C/C++, OR existence of a lightweight, easy to use C/C++ library that implements it
2) Efficient inserting/altering/deleting of objects in a 2D space (3D is a plus but not necessary)
3) Efficient circular/rectangular range search (doesn't really matter which)

Any helpful pointers?

CycloWizard · Feb 20, 2009

You might want to try posting this in the Programming forum. I know next to nothing about data structures (don't even really know what a tree is in this sense), though I have done a few particle simulations. About the only insight I have is that the choice likely depends on the size of the space and number of objects you have to keep track of.

Cogman · Feb 20, 2009

A tree is the thing for you. And probably the best tree implementation would be the STL map. IIRC the STL map has a worst case log n insertion time and a worst case log n location time. If the amount of memory used is not a big problem, then a hash table might be a better option for you. A hash table with no collisions has a constant insertion and lookup time. However, if you don't make your hash table big enough, it can quickly degrade into linear (O(n)) search times.

For ease of implementation, I would say go with the STL. Its very flexible, doesn't have a lot of memory overhead, and is realitivly quick.

For pure speed. A large hash table will probably be unbeatable in any context. They aren't TOO difficult to implement (they are hard to implement right), offer phenomenal speed, but need a pretty large memory footprint to avoid collisions.

Btw, could you elaborate on your 3rd criteria? If you mean something like being able to say "what are the values between a and b" then you can do that with the maps iterator. You can't do anything like that with a hash table.

TuxDave · Feb 20, 2009

Hash!

<--- didn't quite finish reading your requirements

suszterpatt · Feb 20, 2009

By circular/rectangular range search, I mean the operation of finding all the objects within a circle/box of given size around a specific point in the virtual space. For instance, if we're doing collision detection in a highly populated space, it'd be inefficient to test every object against every other one in the entire space; instead, when testing a specific object, we find all the other objects within its vicinity and only test against those. By efficient range search, I mean that finding an object's "neighbors" is faster than iterating over every single object in the model and deciding whether or not they are neighbors. (Note that collision detection was just an arbitrary example here, it's range search I'm really looking for)

Cogman · Feb 20, 2009

Originally posted by: suszterpatt
By circular/rectangular range search, I mean the operation of finding all the objects within a circle/box of given size around a specific point in the virtual space. For instance, if we're doing collision detection in a highly populated space, it'd be inefficient to test every object against every other one in the entire space; instead, when testing a specific object, we find all the other objects within its vicinity and only test against those. By efficient range search, I mean that finding an object's "neighbors" is faster than iterating over every single object in the model and deciding whether or not they are neighbors. (Note that collision detection was just an arbitrary example here, it's range search I'm really looking for)

I would say go with a map then, or some other b-tree system (Red-black trees work well). As they will most likely have the best range searching capabilities.

pcy · Feb 22, 2009

Hi,

It's a long time since I did any programming, but if I were tackling this problem I think I'd build the data structure myself to exactly meet the requirements.

I would devide the space into rectalinear regions (i.e square in 2-D, cube in 3-D) and then hold an Object Location Table for each region.

The object location table would hold N+1 (where N is the number of dimesions of your space) values per object, namely:
Pointer to Object
Dimenstion 1 location
...
Dimension N location

Though on reflection you might want to add an "Existance" Flag so that you can maintain spare rows in the tables and avoid resizing them when adding or deleting an object from a particualr region. A null pointer might be the Non-Existance Flag.

Also... if the objects themselves have signifcant size you might need to store some size parameter (max distance from object referance point to any point in it's surface)

You can easily compute which regions are adjacent (or close to) any other, and a proximity search from a given point would be trivial:

all objects in any region entirely within the required range of specified point
range test on all objects in any region partially but not entirely within the required range of the specified point.

The issue then arises of how big to make the regions, and how to manage the Object Location Tables in terms of their size. Much depends on the total number of objects you want to manage

Lets say you subdivide your space so that the linear dimensions of the regions is 1/100th of the size of your space, then in 3D you have 1000000 regions. Much then depends on how many objects you plan to track. If this is a climate or weather-forcasting model, tracking billios of molecules, a mere 1000000 regions is nothing, and all will be heavily populated; but if you have a relitively small number of objects, then many of the regions will be empty. The obvious thing to do is then to (slightly) increase the region size and implment a quick reference to non-empty regions. If you do this correctly the total space could potentially be infinite (i.e. limited only by the range of intergers in you machine needed to address each dimension), provided the total number of objects does not run you out of virtual memory.

However, the speed of access/searching improves as you restrict the total size of your space relative to your region size. The is a break point at around 1000,000,000 regions (i.e. the space is 1000 regions long along each of three dimension) because you can then encode the region location (identity) into a 32bit integer, and another at around 1000,000 regions or so, because at that point a boolean string (1 bit per region) saying whether each region is empty or not becomes managable.

To answer your next question - No, I don't envisage the Object Location Tables as separate variables of dynamic size managed by C++, I think a single N+1 column table which you manage yourself as a set of dynamic stacks would be much more efficient.

If you could give more info on the number and size of the objects, relative to the size of the space it would make this easier to design.

Peter

Search

Dynamic spatial data structures

suszterpatt

Senior member

CycloWizard

Lifer

Cogman

Lifer

TuxDave

Lifer

suszterpatt

Senior member

Cogman

Lifer

pcy

Senior member

TRENDING THREADS