Trees in Games: Bounding Volume Hierarchy (Implementation)

Again, thanks to CMU 15662 Computer Graphics course, I have got a nice visualization tool that can clearly show the bounding boxes in the Bounding Volume Hierarchy (BVH). Then under Zoe’s help, I have successfully implemented a simple BVH and visualized it.

My BVH will firstly build the root bounding box, and find the longest axis, then recursively equally divide space along the longest axis, recalculate the child bounding boxes and put them into the tree. Here is the relevant code for it.

BVH(const std::vector<Primitive *> &_primitives, size_t max_leaf_size) {
    this->primitives = _primitives;
    
    // Construct a BVH from the given vector of primitives and maximum leaf
    // size configuration. The starter code build a BVH aggregate with a
    // single leaf node (which is also the root) that encloses all the
    // primitives.
    
    BBox bb;
    for (size_t i = 0; i < primitives.size(); ++i) {
        bb.expand(primitives[i]->get_bbox());
    }
    
    root = new BVHNode(bb, 0, primitives.size());

    // start

    // choose longest axis
    int longestAxisIndex = 0;
    int longestAxisDistance = -INF_D;
    for (int i = 0; i < 3; ++i)
    {
        if (bb.extent[i] > longestAxisDistance)
        {
            longestAxisDistance = bb.extent[i];
            longestAxisIndex = i;
        }
    }

    // sort according to longestAxisIndex
    multimap<float, int> centroidMap;
    for (size_t i = 0; i < primitives.size(); ++i)
    {
        float centroidPointOnLongestAxis = primitives[i]->get_bbox().centroid()[longestAxisIndex];
        centroidMap.insert(pair<float, int>(centroidPointOnLongestAxis, i));
    }

    std::vector<Primitive *> temp_primitives = primitives;
    int count = 0;
    for (multimap<float, int>::iterator it = centroidMap.begin(); it != centroidMap.end(); ++it)
    {
        primitives[count] = temp_primitives[it->second];
        count++;
    }

    PartitionEqually(primitives, bb, root, longestAxisIndex, max_leaf_size);

}

void PartitionEqually(std::vector<Primitive *> &primitives, BBox bb, BVHNode* node, int longestAxisIndex, size_t max_leaf_size) {
    
    if (node->range <= max_leaf_size)
    {
        return;
    }

    // part equally
    float middlePoint = bb.min[longestAxisIndex] + bb.extent[longestAxisIndex] / 2.0f;

    // left child
    BBox bbLeft;
    int leftCount = 0;

    // right child
    BBox bbRight;
    int rightCount = 0;

    for (size_t i = 0; i < node->range; ++i)
    {
        if (primitives[(node->start) + i]->get_bbox().centroid()[longestAxisIndex] >= middlePoint)
        {
            bbRight.expand(primitives[(node->start) + i]->get_bbox());
            rightCount++;
        }
        else
        {
            bbLeft.expand(primitives[(node->start) + i]->get_bbox());
            leftCount++;
        }
    }

    BVHNode* leftNode = new BVHNode(bbLeft, node->start, leftCount);
    BVHNode* rightNode = new BVHNode(bbRight, node->start + leftCount, rightCount);
    
    node->l = leftNode;
    node->r = rightNode;

    if (leftCount == 0 || rightCount == 0)
    {
        return;
    }

    PartitionEqually(primitives, bbLeft, leftNode, longestAxisIndex, max_leaf_size);
    PartitionEqually(primitives, bbRight, rightNode, longestAxisIndex, max_leaf_size);

}

And it works!

This slideshow requires JavaScript.

Advertisements

Trees in Games: Bounding Volume Hierarchy

A Bounding Volume Hierarchy (BVH) is a tree structure implements the Binary Space Partitioning Trees that I talked in my last post. It often applies to a set of geometric objects, and all of them are wrapped in bounding volumes that form the leaf nodes of the tree. These nodes are then grouped as small sets and enclosed with larger bounding volumes recursively, eventually resulting in a tree structure with a single bounding volume at the top of the tree. This structure is efficient when work on collision detection or ray tracing.

image.png

Thanks to CMU 15662 Computer Graphics Course, I have got a very clear example of ray-scene intersection using a BVH. Here we go:

WeChat Screenshot_20180418095249.png

struct BVHNode {
    bool leaf;  // am I a leaf node?
    BBox bbox;  // min/max coords of enclosed primitives
    BVHNode* child1; // “left” child (could be NULL)
    BVHNode* child2; // “right” child (could be NULL)
    Primitive* primList; // for leaves, stores primitives
};

struct HitInfo {
    Primitive* prim;  // which primitive did the ray hit?
    float t;   // at what t value? (Ray ray = o + td)
};

void find_closest_hit(Ray* ray, BVHNode* node, HitInfo* closest) {
    HitInfo hit = intersect(ray, node->bbox);  // test ray against node’s bounding box
    if (hit.prim == NULL || hit.t > closest.t)) {
        return; // don’t update the hit record
    }
    if (node->leaf) {
        for (each primitive p in node->primList) {
            hit = intersect(ray, p);
            if (hit.prim != NULL && hit.t < closest.t) {                 
                closest.prim = p;
                closest.t = t;
            }
        }
    } else {
        find_closest_hit(ray, node->child1, closest);
        find_closest_hit(ray, node->child2, closest);
    }
}

 

Trees in Games: Binary Space Partitioning Trees

Talking about data structures in game industry, we cannot ignore trees. For example, the Binary Space Partitioning (BSP) Tree is a very common data structure for processing three-dimensional virtual scenes, especially for rendering and collision detection. I have just learnt it from Data Structures and Algorithms for Game Developers by Allen Sherrod, and now it is a good time to write some notes as a review.

In a BSP tree, the child nodes are often called the front and back nodes instead of left and right, and I think this will help you understand ray tracing more easily. The main idea of BSP is choosing a plane, then recursively partition the scene (all polygons) into halves, until some condition is met, which could be a minimum number of polygons in a node, or a certain depth of levels.

The BSP tree is a binary search tree for polygons, and leaf nodes have no children but a list of polygons. Theoretically, the creation of a BSP tree is a generally simple process made up of the following steps:

  1. Set the polygon list to the node (root if this is the first node).
  2. Calculate a plane that will be used as the splitter plane.
  3. Loop through all polygons to choose side. If there is a polygon spans the splitter plane, split it into two new planes.
  4. Repeat steps 1 to 3 for the font and back notes recursively until the stop condition is met.

The pseudo code can look like this:

listOfPolygons = new PolygonList(...);
bspTree = CreateBSPNode(listOfPolygons);

CreateBSPNode(listOfPolygons)
{
    node = new BSPNode();

    // stop recursion if threshold is met
    if(listOfPolygon.count <= MIN_NODE_POLY_COUNT)
    {
        node.polygonList = listOfPolygons;
        return node;
    }
    // else continue
    splitPlane = GetBestSplitter(listOfPolygons);

    for each(element i in listOfPolygons)
    {
        if(splitPlane.Classify(listOfPolygons[i]) == FRONT)
        {
            // front polygons
            subPolyListFront.push(listOfPolygons[i]);
        }
        else if(splitPlane.Classify(listOfPolygons[i]) == BACK)
        {
            // back polygons
            subPolyListBack.push(listOfPolygons[i]);
        }
        else
        {
            // Return two split polygons in pf and pb
            splitPlane.ClipPolygon(listOfPolygons[i], &pf, &pb);
            subPolyListFront.push(pf);
            subPolyListBack.push(pb);
        }
    }

    node.frontNode = CreateBSPNode(subPolyListFront);
    node.backNode  = CreateBSPNode(subPolyListBack);

    return node;
}

One of the challenges of making a BSP tree is to pick up a good splitter plane. If the tree is severely unbalanced, performance traversing the tree can suffer. We may choose it by a simple scoring algorithm that performs these steps:

  1. Loop through all polygons and create a plane for each one.
  2. For every plane add up the number of polygons on each side.
  3. Whichever plane has the lowest absolute difference (Math.abs(frontTotal – backTotal) is the best splitting plane, also know as the score.

This is how to implement a BSP tree, I will take a further look on using it in some real conditions later.

Implement custom call() in JavaScript

In my last article, I used Function.prototype.call() to invoke parent constructor in child constructor, in order to inherit parent properties. The call() function is a very interesting feature in JavaScript, and I was asked to implement my custom call() function during a interview, unfortunately I didn’t make it through. Now with the help from mqyqingfeng’s blog, I finally made it.

According to MDN,

The call() method calls a function with a given this value and arguments provided individually.

Take an example:

var person = {
    name: "Bob",
    age: 28
}

var dog = {
    name: "woof",
    age: "woof-woof"
}

function identify() {
    console.log(this.name + ", " + this.age);
}

identify.call(person); // Bob, 28
identify.call(dog); // woof, woof-woof

we can see there are two steps in call(): change this in the function, then execute the function.

Change caller function’s this

So in my custom implementation, we can set the function as a method of the callee object, it will change this in the function. Then we invoke this method, and delete it when we are done.

var person = {
    name: "Bob",
    age: 28
}

var dog = {
    name: "woof",
    age: "woof-woof"
}

function identify() {
    console.log(this.name + ", " + this.age);
}

// identify.call(person); // Bob, 28
// identify.call(dog); // woof, woof-woof

Function.prototype.myCall = function (context) {
    context.fn = this;
    context.fn();
    delete context.fn;
}

identify.myCall(person); // Bob, 28
identify.myCall(dog); // woof, woof-woof

It works. However it has not been done yet, because the real call() can accept arguments.

Pass arguments

var person = {
    name: "Bob",
    age: 28
}

var dog = {
    name: "woof",
    age: "woof-woof"
}

function identify(isHuman) {
    console.log("is " + this.name + " human? " + isHuman);
}

identify.call(person, true); // is Bob human? true
identify.call(dog, false); // is woof human? false

In call() function, the first argument is the new this, and the rest are those you want to pass to the function. Luckily, JavaScript functions have a built-in object called the arguments object, contains an array of the arguments used when the function was called (invoked).

So in function call(this, args1, args2, ...argsN), argument[0] is the new this, and arguments[1] to arguments[arguments.length-1] are the real args1 to argsN. Then we can use eval() to assemble the function with arguments.

Function.prototype.myCall = function (context) {
    // get arguments
    var args = [];
    for (var i = 1; i < arguments.length; i++) {
        args.push("arguments[" + i + "]");
    }
    context.fn = this;
    // args[] will call toString() in eval()
    eval("context.fn(" + args + ")");
    delete context.fn;
}

Return value

Further, functions can have a return value, for example:

var person = {
    name: "Bob",
    age: 28
}

var dog = {
    name: "woof",
    age: "woof-woof"
}
        
function identify(isHuman) {
    console.log("is " + this.name + " human? " + isHuman);
    return {
        name: this.name,
        age: this.age,
        isHuman: isHuman,
    }
}

var info1 = identify.call(person, true); 
// is Bob human? true

console.log(info1); 
// {name: "Bob", age: 28, isHuman: true}

var info2 = identify.call(dog, false); 
// is woof human? false

console.log(info2); 
// {name: "woof", age: "woof-woof", isHuman: false}

That is not that difficult, we just need assign the assembled function to a variable and return it.

Function.prototype.myCall = function (context) {
    var args = [];
    for (var i = 1; i < arguments.length; i++) {
        args.push("arguments[" + i + "]");
    }  
    context.fn = this;
    var result = eval("context.fn(" + args + ")");
    delete context.fn;
    
    return result;
}

When this == null

One more corner case, in the real call() function, when the first argument is null, this will pointer to window object.

So add one more condition for context == null, then the final implementation looks like below:

var person = {
    name: "Bob",
    age: 28
}

var dog = {
    name: "woof",
    age: "woof-woof"
}

var name = "it";
        
function identify(isHuman) {
    console.log("is " + this.name + " human? " + isHuman);
    return {
        name: this.name,
        age: this.age,
        isHuman: isHuman,
    }
}
        
identify.call(null, "maybe"); // is it human? maybe

var info1 = identify.call(person, true); 
// is Bob human? true

console.log(info1); 
// {name: "Bob", age: 28, isHuman: true}

var info2 = identify.call(dog, false); 
// is woof human? false

console.log(info2); 
// {name: "woof", age: "woof-woof", isHuman: false}
        
Function.prototype.myCall = function (context) {
    var context = context || window;
    var args = [];
    for (var i = 1; i < arguments.length; i++) {
        args.push("arguments[" + i + "]");
    }  
    context.fn = this;
    var result = eval("context.fn(" + args + ")");
    delete context.fn;
    
    return result;
}


identify.myCall(null, "maybe"); // is it human? maybe

var info3 = identify.myCall(person, true); 
// is Bob human? true

console.log(info3); 
// {name: "Bob", age: 28, isHuman: true}
        
var info4 = identify.myCall(dog, false); 
// is woof human? false

console.log(info4); 
// {name: "woof", age: "woof-woof", isHuman: false}

JavaScript Inheritance

I had an interview for a web developer position several days ago. I was asked about some basic JavaScript questions, like inheritance. Unfortunately I didn’t make it through as I only know there is something about “prototype”, but did not fully understand it. Now it is time to make it clear.

Talking about inheritance, JavaScript is a little bit different as it is not a typical object-oriented programming language. It is all about “prototype” rather than “class”. For example, we have a Person constructor, or “class”:

function Person(fn, ln, a, g) {
    this.firstName = fn;
    this.lastName = ln;
    this.age = a;
    this.gender = g;
}

Except for all the properties, we want to have a sayName method to get the full name, so we add this method on Person.prototype, which will be shared for every instance. Notice that I cannot write this.prototype in the constructor, because the prototype property will be automatically created after creating the constructor (and every other function). I made this mistake during the interview. 😦

function Person(fn, ln, a, g) {
    this.firstName = fn;
    this.lastName = ln;
    this.age = a;
    this.gender = g;
 // this.prototype = ... // NO! prototype is undefined yet
}

// modify prototype after the constructor is created
Person.prototype.sayName = function() {
    console.log(this.firstName + " " + this.lastName);
}

Then we want one more Employee type inherit from the Person, and has its own office property.


function Employee(o) {
    // inherit from Person
    // ... How?

    // create its own property
    this.office = o;
}

How?

According to Professional JavaScript for Web Developers by Nicholas C. Zakas, and MDN web docs, we can implement the inheritance like below:

First, call parent’s constructor to inherit the properties.

function Person(fn, ln, a, g) {
    this.firstName = fn;
    this.lastName = ln;
    this.age = a;
    this.gender = g;
}

Person.prototype.sayName = function() {
    console.log(this.firstName + " " + this.lastName);
}

function Employee(fn, ln, a, g, o) {
    // inherit from Person
    Person.call(this, fn, ln, a, g);

    // create its own property
    this.office = o;
}      

We use the call function to call parent’s constructor inside child’s constructor, and the first parameter specifies the value of this when running the function, which now is the Employee constructor.

We have already inherited all parent’s properties in the constructor, then we need to inherit parent’s methods on its prototype, like the sayName function. The Object.create()function is exactly designed for this.

function Person(fn, ln, a, g) {
    this.firstName = fn;
    this.lastName = ln;
    this.age = a;
    this.gender = g;
}

Person.prototype.sayName = function() {
    console.log(this.firstName + " " + this.lastName);
}

function Employee(fn, ln, a, g, o) {
    // inherit properties from constructor
    Person.call(this, fn, ln, a, g);

    // create its own property
    this.office = o;
}     

// inherit methods from prototype
Employee.prototype = Object.create(Person.prototype); 

Now the child type Employee has inherited all the properties and methods from parent type Person, even include one thing we do not need to inherit: constructor. So we should set it back.

function Person(fn, ln, a, g) {
    this.firstName = fn;
    this.lastName = ln;
    this.age = a;
    this.gender = g;
}

Person.prototype.sayName = function() {
    console.log(this.firstName + " " + this.lastName);
}

function Employee(fn, ln, a, g, o) {
    // inherit properties from constructor
    Person.call(this, fn, ln, a, g);

    // create its own property
    this.office = o;
}     

// inherit methods from prototype
Employee.prototype = Object.create(Person.prototype);

// parent constructor is also inherited
console.log(Employee.prototype.constructor === Employee);
// false

// so set child constructor back
Employee.prototype.constructor = Employee;

console.log(Employee.prototype.constructor === Employee);
// true

Done! Let’s test it.

function Person(fn, ln, a, g) {
    this.firstName = fn;
    this.lastName = ln;
    this.age = a;
    this.gender = g;
}

Person.prototype.sayName = function() {
    console.log(this.firstName + " " + this.lastName);
}

function Employee(fn, ln, a, g, o) {
    // inherit properties from constructor
    Person.call(this, fn, ln, a, g);

    // create its own property
    this.office = o;
}     

// inherit methods from prototype
Employee.prototype = Object.create(Person.prototype);

// set child constructor back
Employee.prototype.constructor = Employee;

// test
var bob = new Employee("Bob", "Smith", 28, "male", "Web");
bob.sayName(); // Bob Smith 

It works! But too late.

Memory Profiler in Visual Studio 2017

Recently I have read some articles talking about memory optimization theoretically, and I am looking for someways to prove that. Then I was told Visual Studio has builtin memory profiler tools, so let’s take a look.

Visual Studio 2017 has “Diagnostic Tools” which will automatically show up when you start debugging.

memoryprofiler1

Using “Heap Profiling” function, we can set breakpoints and take snapshots to monitor the heap memory usage. I have written the code below to test it, with breakpoints on lines 6, 11, 12, 14.

#include "stdafx.h"

int main()
{
	// on stack
	int m = 100;
	int *array;
	int array1[100]; // not on heap

	// on heap
	array = new int[m];
	delete[] array;

	return 0;
}

When we start debugging, the program will pause on each breakpoint, then we can take snapshots accordingly.

memoryprofiler2

The first snapshot was took at the first breakpoint, before the program entered line 6. It shows the initial heap memory usage before we do any real tests.

Then the program declared some local variables, paused at line 11. We can see from the snapshot 2 that the heap memory usage didn’t change. It is because all local variables are on the stack memory, and their space will be automatically released. There is another “Locals” window can keep track local variables.

memoryprofiler3

The line 11 used the “new” operator to allocate some heap memory for the array. So we can see on the snapshot 3, there is one more allocation, and the heap size increased. We can click it to see more details.

memoryprofiler4

Yes we have allocated a int[100] array, and the size should be exactly 4 * 100 = 400 Bytes.

Finally the line 12 deleted the array, so the memory is freed. The heap memory change is exactly opposite to previous one.

memoryprofiler5

Stack Allocation

This is another gem from Game Programming Gems 8, describes a method of allocation that shave valuable cycles off, to optimize our code for speed and space.

The typical way of doing a rapid allocation is to use a linked list. Here we will use a simpler stack allocator. This system uses a pre-allocated list of items, with arbitrary-size indices for the pointers, INTs, WORDs, BYTEs or even BITs. But unlike a normal list, we use a simple stack concept and position the stack pointer at the end. Then we can pop the next free item off the stack, and return it with less code, while we push items onto the list to free them.

1601519543482_.pic_hd.jpg

The ability to easily use varying sizes of indices, bits, pointers, or even POD types directly without having to really change the implementation is very powerful. Here is a example with basic pointer allocation.

Pointer-Based Implementation

First, we need to create and initialize the array.

#define      MAX_OBJECT    10000

Particle*    pStack[MAX_OBJECT];          // Our object stack
int          SP;                          // The Stack Pointer
Particle     ParticleArray[MAX_OBJECT];   // The object pool

// Initialise the stack with object indexes from 0 to MAX_OBJECT-1
void InitStack (void)
{
    // Create our object list
    // Pre-Fill stack with indexes (or pointers)
    for (int i = 0; i < MAX_OBJECT; i++) {
        Stack[i] = &ParticleArray[i];
    }
    // Initialise the stack pointer
    SP = MAX_OBJECT;
}

Creation is simply a matter of filling the array with object pointers and then setting up a stack pointer. Then allocation and releasing of a particle object:

// Pop a particle from the free pool.
// -1 is returned if there are no free particles left.
Particle* Pop (void)
{
    assert(SP >= 0, "Error: Stack pointer has gone negative!");
    if (SP == 0) return -1;
    return pStack[--SP];
} 

// Push a used particle back onto the free pool
void Push (Particle* _pParticle) {
    assert (SP >= 0, "Error: Stack pointer has gone negative!");
    assert (SP != MAX_OBJECT, "Error: Not enough space in stack. Object freed twice?");
    pStack[SP++] = _pParticle;
}

Allocating a particle is simply a matter of calling Pop(), while releasing only requires to call Push(). To clarify when dealing with a stack method, we will stick with Push() and Pop().

Index-Based Implementation

Now let’s try to allocate an index as handle to an object rather than a pointer to the object. In this example, we have 256 sprites that we wish to allocate from, and rather than allocating an object and returning a whole pointer, we will simply allocate and return the BYTE index, thereby saving memory.

#define          MAX_SPRITES    256
unsigned char    Stack[MAX_SPRITES];
Sprite           SpritePool[MAX_SPRITES];

// Initialize the stack with object indexes from 0 to MAX_OBJECT-1
void InitStack (void)
{
    // Pre-Fill stack with ready to use particles.
    for (int i = 0; i < MAX_SPRITES; i++) {
        Stack[i] = (unsigned char) i;
    }

    // Initialize the stack pointer
    SP = MAX_SPRITES;
}

Here we changed the allocation type by simply replacing the Particle* with a sprite index. Because the total index number is 256, so we can use an 8 bit unsigned char to store them. The calling class can then store the index and use it as a handle directly, or at least take the address of &SpritePool[index] just before using it.

Taking an extreme case of 16 million objects (a full 24-bit number), normally we would choose a full INT index or a linked list. However, using the stack method, we can easily store 3 bytes per entry by storing two tables of a byte and a short. Thereby saving 1 byte, that is 16 MB for total. Taking this further, for example, a 18 bits of information, we can deal with multiple tables of one holding a short and another holding the leftover bits.


const int         MAX_OBJ = 0x40000;
unsigned short    Stack_short[MAX_OBJ];
unsigned int      Stack_bits[MAX_OBJ/16];
int               SP;
// Initialize the free list
void Init (void)
{
    // Create and fill our 18 bit table
    // Use Push to build the table as it is easier in this case
    for (int i = 0; i < MAX_OBJ; i++) {
        Push(i);
    }
}

There are some tricky bit manipulations in Push command for dividing 18 bits into a short and 2 bits, then push them onto two stacks.

// Push an 18-bit index onto the object stack
void Push (int _value)
{
    assert(_value >= 0);
    assert(_value < 0x400000);
    assert(SP < MAX_OBJ);

    Stack_short[SP] = (unsigned short) _value & 0xffff;

    // Clear the bits we're about to OR into.
    int shift = (SP & 0xf) << 1; // 0, 2, 4, 6, 8, ...30
    int index = SP >> 4; // every 16 entries have the same most significant 2 bits
    Stack_bits[index] &= ~(3 << shift) // keep the rest bits in the entry
    Stack_bits[index] |= ((_value >> 16) & 0x3) << shift;
    SP++;
}

It is written for storage efficiency, although much slower than a linked list or a pointer array. Then for the allocation:

// Return the next free index (an 18-bit number), or -1 for an error.
int Pop (void)
{
    if (SP == 0) return -1;

    // Get the main 16 bits
    int val = Stack_short[--SP];

    // OR in the extra bits we need to make up the 18bit index
    val |= ((Stack_bits[SP>>4]>>((SP&0xf)<<1))&0x3)<<16;

    return val;
}

The real strength is the ease and speed at which we can adapt the stack allocation system to our needs. Not only pointers, BYTEs, SHORTs, and INTs, even groups of BITs.

We can also use POD(Plain Old Data) type, which store a whole structure but can also fit inside a standard native type. For example, we need a POD to hold 64×64 tile coordinate inside multiple 4096×4096 texture pages.

struct STileCoordinate
{
    unsigned char    X;    // 64x64 X tile coordinate
    unsigned char    Y;    // 64x64 Y tile coordinate
    short            page; // Texture page to use. (also padding)
};

Since this data fits inside 4 bytes, we can actually allocate and return it directly without the need of pointers. We can also make an array of these that is the same amount of memory as an array of INTs. In fact this is copied around at the same speed as a standard integer number, as now modern compilers will recognize this and allow us to pass around this data inside a register.

// Return a free tile structure
STileCoordinate Pop(void)
{
    // If none left, then return an error
    if (SP == 0) return EmptyTile

    // Get the main 16 bits
    return TileStack[--SP];
}