Category Archives: Simulation

Scene Collision Detection

Here you will find a description of the collision detection for a whole scene in Pathfinder. The post Collision Detection Basics covers the mathematically detection of object - object contacts. The algorithm shown here is designed to be small and efficient at the same time.

An outdoor scene in Pathfinder currently has more than a million polygons. Testing them all for collision would cost far too many operations. So the first thing is to think about a representation of objects for which collision detection can be faster. Easy to see is that the spatial resolution of objects must not as high as there visual representation. A cylinder with 10 edges would not look smooth but behaves well in physics. A closed ten-sided cylinder still consists of 36 triangles. Wouldn't it be better to use an analytic solution for a cylinder? Yes, but it would also make things much more complicated. So I decided to represent everything by spheres and triangles only. Beside the simplification in the polygonal representations one can wrap spheres around everything. Just using a sphere for each object and test against each of this bounding spheres was fast enough that it worked for months. But if you wanna have more than one dynamic object this is not enough.

Common Pruning Approaches

There are different techniques to exclude many objects even without testing against there bounding spheres. Most times a spatial partitioning tree is used. A very famous one is the octree where the space is subdivided into boxes with 8 subboxes each. The objects of the scene are contained in the leaf-boxes which are not further subdivided. Objects which lie in more than one box have to be stored more than once. This causes overhead in the maintainance of the tree which is why I discarded it. For the same reason I did not choose the kd-tree which splits the space along the 3 dimensions recursively.

Another nice approach is the Sweep & Prune technique where all objects are protected to an arbitrary number of axis. Each projection is an interval on one axis which may overlap with the intervals of other objects. Only if the intervals for two objects intersect on all axis they can collide. Due to temporal coherence the change in the projections between two frames is very small and can be computed fast. The drawback is that ray-scene intersections test cannot be improved much with this technique. E.g. a ray with a direction of (1,1,1) (diagonal) would have a maximal projection interval in x-, y- and z-axis direction. From that point of view it would collide with everything. Since ray-scene intersections is something I would like to have I discarded Sweep & Prune too.

My favourite is the hierarchy of bounding volumes. There close objects are grouped and each group has a bounding volume the same way objects already have them. An optimal grouping which creates a balanced binary tree is hard to get, but who says that it must be optimal if it is fast and easy.

The Sphere Tree in Detail

Of course - if I represent everything by triangles and spheres the bounding volumes must be spheres. Other things as boxes could have smaller volumes for the same group but a collision with a sphere is so much faster that ~6 test could be done instead. So lets build a binary tree of spheres.

Construction

One could wait for all objects and build the tree in one big step (offline approach). A good idea would be to search for the two objects where their bounding sphere is the smallest over all pairs. This would create an optimal tree to exclude approximating half of the space in each step in a tree-search. Such an algorithm would take lots of time and only static geometry can be considered at all.

The other way is to insert elements directly to the tree (online approach). Each node of the tree has either two children or is a leaf representing some object. To decide in which branch the new leaf should be inserted test which one would have the smaller bounding volume afterwards. This prefers close and small sub-branches. I will test later if this is a well performing condition (but seems to be good in current benchmarks). A possible alternative would be to use the sphere which growth less.

The MeldSpheres function computes the smallest sphere containing the initial two. If one is inside the other the result it the outer sphere. The resulting spheres could be larger than necessary to bound all contained objects. This overhead means that a search could test some spheres two much but statistically this will have no effect.

 Detection

How to search for collisions in sphere tree? That is really easy! Take any geometry your basic collision test functions support and test recursively. If the current object collides with the sphere of the tree search in both children if possible. Otherwise we have a collision with a leaf i.e. with the bounding sphere of a certain object. Now the collision test between sphere and object must be performed e.g. on triangle level. Therefore the reference to the object is stored in the leaf node.

Now one last optimization should be mentioned. I have always spoken of objects, but what happens with large objects as terrains? They would always cause a triangle-level check if used as whole object. One option would be to split the terrain into more small parts. I used a much simpler version - an object in a tree leaf is either a sphere or a constant amount (2) of triangles. Once in a leaf the rest of the collision detection are two further primitive tests!

The switch from the brute force object-object test to the sphere tree saved ~2 milliseconds per frame (on my laptop with Core i5 3rd gen) while having one dynamic object.

Example Rule: The Redwood Tree

In order to make the core of Voxel Seeds more understandable I would like to explain two generic rules which together create the oak tree.

Each tree is build upon two rules - one for the wood and one for the leaves/needles. Thinking of a real automaton there are much more rules, but considering the form explained in About the automaton behind Voxel Seeds allows branching inside a rule. This is why I refer to generic rules.

As mentioned in the last post about the automaton we added a small memory to each living voxel. The first one was a "Tick" counter. This is some stopwatch when to apply the rule next. Each species has its own intervall length determining the growth speed. Further this counter can be manipulated randomly to break the synchronous growth.

The redwood has the mightiest rule allowing a procedural fractal tree. Lets start with the needles first. The rule is called SphericalLeafRule and is used for oak trees and redwoods. The idea is to seed a center voxel of the needle/leaf type and do a flood-fill of a spherical region. How can the euclidean distance to the center be estimated? This question is answered later for the redwood itself. For needles it is sufficient to use some random manhatten like distance. Therefore the "Generation" memory is used. This is monotonously increasing and the growth stops if the number gets too large. Here is some pseudo code of the rule:

To compute the euclidean distance to some center voxel a little bit more work has to be done. It is possible to store a relative position into a single 32 bit integer value. So one of the two memories could be used to store the center's position relative to the current voxel. If a new voxel is created in the neighbourhood it is easy to add or subtract one from the position to compute the new relative position. With this information each voxel can compute its euclidean distance to some source or destination point.

Now we have the ingredents for a  RedWoodRule . First of all the "Generation" can be used to distinguish voxels of the same type but with a different growth behavior. The first important type is the kernel voxel c. The trunk and each branch consists of one line of these. When spawning a new kernel voxel it gets an encoded target in the "Resource" memory. Afterward these voxels try to reproduce into target direction. This is done randomly where the most probable chosen direction is that where the difference to the target is largest. During that growth new kernel voxels are created tangential to the chosen direction to build branches. If the growth reaches the target it creates a center voxel for the needles.
The second type is the raw wood r. These have a "Generation" counter larger than 1000 where the difference to 1000 describes the target width of the trunk/branch. For each kernel voxel the width is determined by the distance to its target. If this width is at least one r voxels are spawned in the neighbourhood. These remember there spawner position in the "Resource" field and growth further if they are not too far away. I.e. if the stored width from the "Generation" counter is larger than there actual euclidean distance. In general this creates a sphere for a single kernel voxel filled with wood. The sum of many such neighboured spheres creates a cylinder. The advantage in contrast to a real cylinder is that twists in the kernel line do not cause holes in the surrounding.

About the automaton behind Voxel Seeds

Cellular automaton rule with 6-neighbourhood

Cellular automaton rule with 6-neighbourhood

For a regular grid it is easy to do a simulation based on rules. Each rule should be applied to each grid cell to change the state of the cell. In our case the grid is a 3D voxel array. If there is a single output of which state the cell should have this whole thing would be called cellular automaton.

We changed this to multiple input multiple output rules to be faster and more flexible.

Automaton rule with multiple output

Automaton rule with multiple output

Why faster? In a normal cellular automation on each cell each rule has to be applied, because e.g. an empty cell could get anything in the next step. For 300 x 60 x 300 world with 10 species this causes 54,000,000 rule computations per simulation step. We decided to simulate 4 steps per second...
So we changed the point of view: Only a few living cells of a known type are simulated. These can  change their neighbourhoods too. Therefore an empty cell could be filled and the automaton can basically do the same thing as before. Now we are just simulating at most 5,000 cells and apply exactly one rule depending on the voxel type. This is less than before by a factor of >10,000.

Why should the multiple output structure be more flexible? In a standard automaton it would be necessary to create rules for any possible neighbourhood of different voxel types. For a 3 x 3 x 3 area this means to write rules for 27^10 different settings. Of course we could group them but it is stell not feasible. The new view allows to write rules per species. To fill the gab of the remaining 26^10 rules per species rules can be written generic and are allowed to return don't cares.

Example for a simple beetle rule: Iterate over all neighbours and count the empty cells. If there is at least one empty cell choose one of them randomly. Return EMPTY for the current cell and BEETLE for the chosen cell.

Rule for a flying beetle (most boxes on the right side are don't cares)

Rule for a flying beetle (most boxes on the right side are don't cares)

From a local rule to a more global decion

Until now a rule sees only a 3 x 3 x 3 region and has to make decisions on that. Using larger regions would only make things much more complicated. So we added more information as "resources" and "generation" to the living voxels. It is therefore possible to now the distance to the last branch in a tree or things like that. The usage and maintenance of these information depends on the type specific rules.

Summary

The Voxel Seeds automaton implementation maintains a 3D voxel grid with one byte type information per cell. Additionally it holds a list of living voxels with further information. Four times in a second all living voxels are simulated in parallel to the rendering, input handling, .... So a simulation could take 0.25s before the game would be slowed done. After each step an incremental update of all changed voxels is pushed to the GPU synchronously.

We are very happy with the performance and behavior of our small rule sets. Remember: all this was designed and implemented in 48 h on one weekend.