Stack Allocation

This is another gem from Game Programming Gems 8, describes a method of allocation that shave valuable cycles off, to optimize our code for speed and space.

The typical way of doing a rapid allocation is to use a linked list. Here we will use a simpler stack allocator. This system uses a pre-allocated list of items, with arbitrary-size indices for the pointers, INTs, WORDs, BYTEs or even BITs. But unlike a normal list, we use a simple stack concept and position the stack pointer at the end. Then we can pop the next free item off the stack, and return it with less code, while we push items onto the list to free them.

1601519543482_.pic_hd.jpg

The ability to easily use varying sizes of indices, bits, pointers, or even POD types directly without having to really change the implementation is very powerful. Here is a example with basic pointer allocation.

Pointer-Based Implementation

First, we need to create and initialize the array.

#define      MAX_OBJECT    10000

Particle*    pStack[MAX_OBJECT];          // Our object stack
int          SP;                          // The Stack Pointer
Particle     ParticleArray[MAX_OBJECT];   // The object pool

// Initialise the stack with object indexes from 0 to MAX_OBJECT-1
void InitStack (void)
{
    // Create our object list
    // Pre-Fill stack with indexes (or pointers)
    for (int i = 0; i < MAX_OBJECT; i++) {
        Stack[i] = &ParticleArray[i];
    }
    // Initialise the stack pointer
    SP = MAX_OBJECT;
}

Creation is simply a matter of filling the array with object pointers and then setting up a stack pointer. Then allocation and releasing of a particle object:

// Pop a particle from the free pool.
// -1 is returned if there are no free particles left.
Particle* Pop (void)
{
    assert(SP >= 0, "Error: Stack pointer has gone negative!");
    if (SP == 0) return -1;
    return pStack[--SP];
} 

// Push a used particle back onto the free pool
void Push (Particle* _pParticle) {
    assert (SP >= 0, "Error: Stack pointer has gone negative!");
    assert (SP != MAX_OBJECT, "Error: Not enough space in stack. Object freed twice?");
    pStack[SP++] = _pParticle;
}

Allocating a particle is simply a matter of calling Pop(), while releasing only requires to call Push(). To clarify when dealing with a stack method, we will stick with Push() and Pop().

Index-Based Implementation

Now let’s try to allocate an index as handle to an object rather than a pointer to the object. In this example, we have 256 sprites that we wish to allocate from, and rather than allocating an object and returning a whole pointer, we will simply allocate and return the BYTE index, thereby saving memory.

#define          MAX_SPRITES    256
unsigned char    Stack[MAX_SPRITES];
Sprite           SpritePool[MAX_SPRITES];

// Initialize the stack with object indexes from 0 to MAX_OBJECT-1
void InitStack (void)
{
    // Pre-Fill stack with ready to use particles.
    for (int i = 0; i < MAX_SPRITES; i++) {
        Stack[i] = (unsigned char) i;
    }

    // Initialize the stack pointer
    SP = MAX_SPRITES;
}

Here we changed the allocation type by simply replacing the Particle* with a sprite index. Because the total index number is 256, so we can use an 8 bit unsigned char to store them. The calling class can then store the index and use it as a handle directly, or at least take the address of &SpritePool[index] just before using it.

Taking an extreme case of 16 million objects (a full 24-bit number), normally we would choose a full INT index or a linked list. However, using the stack method, we can easily store 3 bytes per entry by storing two tables of a byte and a short. Thereby saving 1 byte, that is 16 MB for total. Taking this further, for example, a 18 bits of information, we can deal with multiple tables of one holding a short and another holding the leftover bits.


const int         MAX_OBJ = 0x40000;
unsigned short    Stack_short[MAX_OBJ];
unsigned int      Stack_bits[MAX_OBJ/16];
int               SP;
// Initialize the free list
void Init (void)
{
    // Create and fill our 18 bit table
    // Use Push to build the table as it is easier in this case
    for (int i = 0; i < MAX_OBJ; i++) {
        Push(i);
    }
}

There are some tricky bit manipulations in Push command for dividing 18 bits into a short and 2 bits, then push them onto two stacks.

// Push an 18-bit index onto the object stack
void Push (int _value)
{
    assert(_value >= 0);
    assert(_value < 0x400000);
    assert(SP < MAX_OBJ);

    Stack_short[SP] = (unsigned short) _value & 0xffff;

    // Clear the bits we're about to OR into.
    int shift = (SP & 0xf) << 1; // 0, 2, 4, 6, 8, ...30
    int index = SP >> 4; // every 16 entries have the same most significant 2 bits
    Stack_bits[index] &= ~(3 << shift) // keep the rest bits in the entry
    Stack_bits[index] |= ((_value >> 16) & 0x3) << shift;
    SP++;
}

It is written for storage efficiency, although much slower than a linked list or a pointer array. Then for the allocation:

// Return the next free index (an 18-bit number), or -1 for an error.
int Pop (void)
{
    if (SP == 0) return -1;

    // Get the main 16 bits
    int val = Stack_short[--SP];

    // OR in the extra bits we need to make up the 18bit index
    val |= ((Stack_bits[SP>>4]>>((SP&0xf)<<1))&0x3)<<16;

    return val;
}

The real strength is the ease and speed at which we can adapt the stack allocation system to our needs. Not only pointers, BYTEs, SHORTs, and INTs, even groups of BITs.

We can also use POD(Plain Old Data) type, which store a whole structure but can also fit inside a standard native type. For example, we need a POD to hold 64×64 tile coordinate inside multiple 4096×4096 texture pages.

struct STileCoordinate
{
    unsigned char    X;    // 64x64 X tile coordinate
    unsigned char    Y;    // 64x64 Y tile coordinate
    short            page; // Texture page to use. (also padding)
};

Since this data fits inside 4 bytes, we can actually allocate and return it directly without the need of pointers. We can also make an array of these that is the same amount of memory as an array of INTs. In fact this is copied around at the same speed as a standard integer number, as now modern compilers will recognize this and allow us to pass around this data inside a register.

// Return a free tile structure
STileCoordinate Pop(void)
{
    // If none left, then return an error
    if (SP == 0) return EmptyTile

    // Get the main 16 bits
    return TileStack[--SP];
}
Advertisements

Social Networks in Games: Playing with Your Facebook Friends

This is another article in Game Programming Gems 8, talking about accessing web services of social networks from our own games. For example, log people in using their Facebook account.

RESTful Web Service

Representational State Transfer (REST) is the predominant architecture for offering programmatic access to data stored on the web. A RESTful service is composed of a collection of resources, which are identified by a web address, such as http://example.com/resource. It is based on stateless operations, which means any state information is held in the client, so a service can scale to a large number of clients–ideal for web services.

In practice, a RESTful service works with HTTP requests. We send HTTP GET to retrieve data, and the response are usually in JavaScript Object Notation (JSON) format, which looks like this:

{“trends”:{“2009-08-23 04:00:47”:[
  {“query”:”\”Best love song?\”“,”name”:”Best love song?”},
  {“query”:”#fact”,”name”:”#fact”},
  {“query”:”#shoutout”,”name”:”#shoutout”},
  {“query”:”#HappyBDayHowieD”,”name”:”#HappyBDayHowieD”},
  {“query”:”\”District 9\”“,”name”:”District 9”},
  {“query”:”\”Inglourious Basterds\”“,”name”:”Inglourious Basterds”},
  {“query”:”\”Hurricane Bill\”“,”name”:”Hurricane Bill”},
  {“query”:”#peacebetweenjbfans”,”name”:”#peacebetweenjbfans”},
  {“query”:”#Nascar”,”name”:”#Nascar”},
  {“query”:”Raiders”,”name”:”Raiders”}
]},”as_of”:1251000047}

Authenticating a User

Normally we have to confirm a user’s identity before we gain access to data. The most basic authentication mechanism requires users to enter a user name and password, which our application sends to the web service. It requires users to trust our application not to collect passwords and abuse them for other purposes. This fear might stop users from trying out new applications.

Applications on the web have answered this need by offering authentication mechanisms based on forwarding. When logging into an application, users will be forwarded to the login page of the account provider and enter user name and password there. The application will never see user’s credentials, but will only receive a confirmation of whether the login was successful.

Facebook Login

Let’s try the Facebook Login on a web page. There are 4 steps:

  1. Set Redirect URLs.
  2. Check the login statues.
  3. Log people in.
  4. Log people out.

The first step can be done in Facebook App Settings. It ensures Facebook login page only responses to calls from valid URLs.

When loading our webpage, the first thing to do is check if a user is already logged into our application with Facebook Login. We can start that process with a call to FB.getLoginStatus, this function will trigger a call to Facebook to get the login status and call our callback function with the results.

However definitely we should load and initialize Facebook JavaScript SDK. Here is the codes:

<script>
  // This Init function should be inserted
  // directly after the opening  tag
  window.fbAsyncInit = function() {
    FB.init({
      appId            : 'your-app-id',
      cookie           : true,  // enable cookies to allow the server to access
                                // the session
      autoLogAppEvents : true,
      xfbml            : true,  // parse social plugins on this page
      version          : 'v2.12'
    });
  };

  // Load the SDK asynchronously
  (function(d, s, id){
     var js, fjs = d.getElementsByTagName(s)[0];
     if (d.getElementById(id)) {return;}
     js = d.createElement(s); js.id = id;
     js.src = "https://connect.facebook.net/en_US/sdk.js";
     fjs.parentNode.insertBefore(js, fjs);
   }(document, 'script', 'facebook-jssdk'));
</script>

Now that we’ve initialized the JavaScript SDK, we call FB.getLoginStatus(). This function gets the state of the person visiting this page and can return one of three following states:

  • Logged into our app (‘connected’)
  • Logged into Facebook, but not our app yet (‘not_authorized’)
  • Not logged into Facebook, so we cannot tell if they are logged into our app or not (‘unknown’)

These three cases are handled in the callback funtion.

<script>
  // This Init function should be inserted
  // directly after the opening &amp;amp;lt;body&amp;amp;gt; tag
  window.fbAsyncInit = function() {
    FB.init({
      appId            : 'your-app-id',
      cookie           : true,  // enable cookies to allow the server to access
                                // the session
      autoLogAppEvents : true,
      xfbml            : true,  // parse social plugins on this page
      version          : 'v2.12'
    });

    FB.getLoginStatus(function(response) {
      statusChangeCallback(response);
    });

  };

  function statusChangeCallback(response) {
    console.log('statusChangeCallback');
    console.log(response.status);  // three states
  }

  // Load the SDK asynchronously
  (function(d, s, id){
     var js, fjs = d.getElementsByTagName(s)[0];
     if (d.getElementById(id)) {return;}
     js = d.createElement(s); js.id = id;
     js.src = "https://connect.facebook.net/en_US/sdk.js";
     fjs.parentNode.insertBefore(js, fjs);
   }(document, 'script', 'facebook-jssdk'));
</script>

Once our app knows the login status of the person using it, it can do one of the following:

  • If the person is logged into Facebook and our app, redirect them to our app’s logged in experience.
  • I the person isn’t logged into our app or isn’t logged into Facebook, prompt them with the Login dialog with FB.login() or show them the Login Button.

Facebook provides an easy way to generate a Login Button by one click:

WX20180214-004201

After user logged in, we can access authorized data using FB.api(). I have made a demo on http://chenglongyi.com/test/fbapi/, and the full code is below:

<!DOCTYPE html>
<html>
<head>
  <title>Facebook login</title>
</head>
<body>

<script>
  window.fbAsyncInit = function() {
    FB.init({
      appId      : 'your-app-id',
      cookie     : true,  // enable cookies to allow the server to access 
                          // the session
      xfbml      : true,  // parse social plugins on this page
      version    : 'v2.12' // use graph api version 2.8
    });

    // Now that we've initialized the JavaScript SDK, we call 
    // FB.getLoginStatus().  This function gets the state of the
    // person visiting this page and can return one of three states to
    // the callback you provide.  They can be:
    //
    // 1. Logged into your app ('connected')
    // 2. Logged into Facebook, but not your app ('not_authorized')
    // 3. Not logged into Facebook and can't tell if they are logged into
    //    your app or not.
    //
    // These three cases are handled in the callback function.

    FB.getLoginStatus(function(response) {
      statusChangeCallback(response);
    });
  };

  function checkLoginState() {
    FB.getLoginStatus(function(response) {
      statusChangeCallback(response);
    });
  }

  function statusChangeCallback(response) {
    console.log('statusChangeCallback');
    console.log(response);

    if (response.status === "connected") {
      welcome();
    } else {
      document.getElementById('welcome').innerHTML = "";
    }
  }

  function welcome() {
    FB.api(
      '/me',
      'GET',
      {"fields":"id,name"},
      function(response) {
        console.log(response);
        document.getElementById('welcome').innerHTML = "Welcome " + response.name + "!";
      }
    );
  }

  // Load the SDK asynchronously
  (function(d, s, id) {
    var js, fjs = d.getElementsByTagName(s)[0];
    if (d.getElementById(id)) return;
    js = d.createElement(s); js.id = id;
    js.src = "https://connect.facebook.net/en_US/sdk.js";
    fjs.parentNode.insertBefore(js, fjs);
  }(document, 'script', 'facebook-jssdk'));

</script>
<div class="fb-login-button" data-max-rows="1" data-size="large" data-button-type="continue_with" data-show-faces="false" data-auto-logout-link="true" data-use-continue-as="true" onlogin="checkLoginState"></div>

</body>
</html>

Fast-IsA

This is a GEM from Game Programming Gems 8, by Joshua Grass, provides a better method for processing class hierarchy data that can increase the efficiency of IsA check from O(N) to O(1). Here is my summary after reading.

Given a typical class hierarchy like below:

WeChat Image_20180130164516

A normal way to perform IsA check, determine whether Class A is a subclass of Class B would be:

bool IsA(Class *pA, Class *pB)
{
    while (*pA != NULL)
    {
        if (pA == pB)
        {
            return true;
        }
        pA = pA -> GetParentClass();
    }
    return false;
}

The worst case of this algorithm would perform a traversal from leaf to root, which can be very expensive.

So let’s simplify the problem. Imagine we are lucky and the class is in a perfectly balanced binary hierarchy, which means each node branched exactly twice, just like below:

WeChat Image_20180130174959

In this situation the class hierarchy can be put into a contiguous array.

WeChat Screenshot_20180130175525

We noticed that on each level we add 2^(N-1) new nodes to the array, where N is the new level. According to the index in the array(the second row in above table), we can easily find the relationship between each node and its parent:

int parentIndex(int nodeIndex)
{
    return nodeIndex >> 1;
}

So according to this algorithm, on a perfectly balanced tree, we can improve IsA() from O(N) to O(logN):

bool IsA_Balanced2Tree(Class *pA, Class *pB)
{
    int nodeAIndex = pA -> GetClassIndex();
    int nodeBIndex = pB -> GetClassIndex();

    while (nodeAIndex != 0)
    {
        if (nodeAIndex == nodeBIndex)
        {
            return true;
        }
        nodeAIndex = nodeAIndex >> 1;
    }
    return false;
}

Actually once the index for a parent of A is less than the index for B, there is no way that they can be equal. So we can reduce worst-case scenario.

bool IsA_Balanced2Tree_V2(Class *pA, Class *pB)
{
    int nodeAIndex = pA -> GetClassIndex();
    int nodeBIndex = pB -> GetClassIndex();

    while (nodeAIndex >= nodeBIndex)
    {
        if (nodeAIndex == nodeBIndex)
        {
            return true;
        }
        nodeAIndex = nodeAIndex >> 1;
    }
    return false;
}

According to the relationship between a node and its parent, here is the algorithm to find the index of a child, depends on its position in the sub-tree:

int childIndex(int nodeIndex, bool bRight)
{
    if (bRight)
    {
        return (nodeIndex << 1) + 1;
    }
    else
    {
        return (nodeIndex << 1);
    }
}

The binary representation looks like this:

WeChat Image_20180130190605

We can observe that: if Class A is a child of Class B, then the leftmost N bits of B will match A, where N is the highest bit set in A.

Using this rule we can remove the while loop in IsA():

bool IsA_Balanced2Tree_V3(Class *pA, Class *pB)
{
    int nodeAIndex = pA -> GetClassIndex();
    int nodeBIndex = pB -> GetClassIndex();

    if (nodeAIndex > (BSR(nodeAIndex) - BSR(nodeBIndex));

    return nodeAIndex == nodeBIndex;
}

The BSR() here is a wrapper for an inline assembly function that uses the BSR assembly instruction BitScanReverse, which returns the index of the leftmost set bit. If would be easy to pre-calculate the most significant bit index even if the platform does not support BSR().

However, all of the previous work has been built upon the notion that our class hierarchy is a perfectly balanced tree. Fortunately, our IsA() function does not care about the depth between nodes, only ancestry matters. So we can convert the unbalanced tree into balanced tree by adding phantom class, as long as we keep them in fact ancestors.

WechatIMG144

The following algorithm is the simplest implementation for building the class tree.

void BuildTree(Class *pA)
{
    int nCurrentClassIndex = pA -> GetClassIndex();
    int nNumChildClasses = pA -> GetNumberOfChildClasses();
    int nNumLeverls = BSR(nNumChildClasses() + 1;
    int nChildIndexStart = nCurrentClassIndex << nNumLevels;

    for (int i = 0; i  GetChildClass(i);
        pChild -> SetClassIndex(nChildIndexStart + i);
        BuildTree(pChild);
    }
}