Business Analytics with Regression

Original Author: Ted Spence

So you’ve got a nice group of players interested in your game. Maybe you’ve even got a possibility of profit coming up! Now the challenge is to keep your success rolling along. You need to identify ways to reach out to customers, and figure out which players could benefit from a promotional offering. It’s time to develop a regression model for our data!

An Introduction To Regression for Business Analytics

I won’t beat around the bush: there’s a lot to know about regression. It was a mathematical technique cooked up by some of the smartest mathematicians ever – including Gauss, who used it to forecast the locations of planets – so this is not an easy field. But for today’s article, I’m going to just get you started.

First off, most companies have no problem generating ratios. It’s very easy to say:

  • “23% of the people who visit our site launch the game.”
  • “5.6% of people who play our game purchase something.”
  • “The majority of our revenue comes from 5% of the paying players.”

Know what? In most cases, that kind of simple math is enough to get the job done.

Lesson #1 of Business Analytics: Use the simplest tool that works.

Why is this the first lesson? Because complex tools are easy to screw up. Feynman once said, “The first principle is that you must not fool yourself, and you are the easiest person to fool.” Using complex tools can create complex and subtle problems that are difficult to anticipate and detect.

When Do We Want Regression?

Most people get up to the point where they’re doing A/B testing – frankly, this is the best case scenario for “ratio” modelling. You do two tests, A and B. A results in a 5% increase in sales, and B results in a 6% increase in sales. Therefore B is better!

But ratios become very difficult when you have lots of interdependent variables. Let’s say you are trying to figure out why players cancel your game. You’ve got a hunch that you can predict when a player will cancel by looking at a few potential factors, but you don’t know for sure which one is the most relevant.

For example, let’s say we are looking at the number of times a player logged in, the duration of play, the number of guildmates who cancelled recently, the amount of EXP they gained, and the number of achievements they gained.

Modelling all those variables using ratios would take forever! Some of those variables are discrete, but most of them are continuous. You’d have to slot them into buckets (1-5 achievements, 6-10, 11-15, etc) and grade each bucket separately! You’d have to create ratios for every possible permutation of variables and compare them in a gigantic matrix. Blech! There’s got to be a better way.

Well, that’s where regression modelling comes in.

How Does Regression Work?

I won’t claim to be a math teacher, so let me describe this in a way that a casual user can appreciate. A regression model assumes that all your independent variables have some influence on the target (called the dependent variable).

But – and this is important – In order to get started, you first need to come up with a theory of how you think the variables work. Without a theory, your work will be blind and your results may not show anything at all!

This is the great part: if you come up with a theory that doesn’t work, regression modeling will help you find that out. The fact that regression models can also give negative results can prevent you from spending tons of time researching data that isn’t useful (or is misleading).

Getting back to our model: Let’s presume that each one of these variables has some influence on whether a player decides to cancel. Using the most common kind of regression, Ordinary Least Squares (OLS), we will assume that we can construct a basic algebraic equation that helps us decide if a player will cancel or not. Using OLS, our theory looks like this in algebra:

(user cancelled) = x + (y1 * logins) + (y2 * playtime) + (y3 * friends_cancelled) + (y4 * exp_gained) + (y5 * achievements)

This is exactly the kind of algebra that a computer can solve in no time.
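To make “least squares” a little less abstract, here is a minimal sketch of the one-variable case in C++. It finds the intercept and slope that minimize the squared error between the data and a straight line. The names LineFit and fit_line are made up for this illustration; a package like Gretl solves the multi-variable version of the same problem for you.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of fitting y = intercept + slope * x.
struct LineFit {
    double intercept;
    double slope;
};

// Ordinary least squares with a single independent variable.
// Picks the line that minimizes the sum of squared differences to y.
LineFit fit_line(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() >= 2);
    const double n = static_cast<double>(x.size());

    double mean_x = 0.0, mean_y = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        mean_x += x[i];
        mean_y += y[i];
    }
    mean_x /= n;
    mean_y /= n;

    double sxy = 0.0, sxx = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sxy += (x[i] - mean_x) * (y[i] - mean_y);
        sxx += (x[i] - mean_x) * (x[i] - mean_x);
    }
    assert(sxx > 0.0);  // x must not be constant

    LineFit fit;
    fit.slope = sxy / sxx;
    fit.intercept = mean_y - fit.slope * mean_x;
    return fit;
}
```

With several independent variables the idea is the same, but the arithmetic becomes a matrix problem, which is why we hand it to software.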

First, let’s go to your data team and ask them to provide a dump of information. But before we do, I need to remind you that it is vitally important that we get an unbiased sample of data. A common rookie mistake is to say “Gee, I want to figure out what makes people cancel, so let’s run a report on all the people who cancelled.” That’s bad because it leads to selection bias.

The way to avoid selection bias is to pretend you have no knowledge of the results of the study beforehand. Pretend you live behind the veil of ignorance. Ask your data guys,

“Could you run a report that gives me logins, playtime, friends cancelled, exp_gained, and achievements for all players for the month of July? The report should cover only users who began playing before July 1st, and should exclude anyone who cancelled in July. Oh – and add as the last column a 1 if the user cancelled during the first week of August or a 0 if they did not.”

The reason this query works is that it meets a few requirements:

  • Every user who is involved in this data set has had the same measurements applied to them. Every user who participated in this study supplied an entire month’s worth of data.
  • Our dependent variable, “cancelled in August”, is entirely separate from the independent variables.
  • Ideally, we’ll get lots and lots of rows of results. The more rows we get, the better our regression software can help us understand our variables.

This is good enough to get started. I’ll pretend that you got a report that looks something like this fake data I cooked up. Next, we need to get the software involved.

Regression Analysis Software

Let’s say you’re rich and your company overflows with money. In that case, buy SPSS, Stata, or Mathematica. Heck, get your company to send you to graduate school! But for the rest of us, there’s a fun and useful open source package called “Gretl”, which is a great place to start for someone who wants to learn. Go download my test dataset, and let’s get started.

First, save your report in CSV format and launch Gretl. Select File | Open Data | Import | text/CSV. Specify the data delimiter and choose the file.

Whoa! All of a sudden, Gretl asks a question: “Do you want to give the data a time-series or panel interpretation?” Let’s tell it no for now; time series and panels are topics for a different lesson, not an introduction. Besides, even when I have a potentially complex problem, I try modeling it as a simple one first and only move to a more complex model once the simple one fails.

You should now see a screen that looks like this. It lists seven variables, including an auto-generated constant (“const”, which is always 1 and gives the model its intercept term, the x in our equation).

So let’s get modelling! From the top menu, choose Model | Ordinary Least Squares. We now need to tell GRETL about our theory. For the dependent variable, select “Cancelled_”; and for the independent variables choose everything else, then click OK.

You’ll probably see a screen that looks a lot like this. That’s a lot of text and complicated numbers. How can we make sense of it all?

For a beginner, there are two things you should look for in this output. First, the little asterisks next to each row are a visual hint as to which variables are statistically significant – the more stars, the lower the p-value, and the more confident you can be that the variable matters (Gretl awards stars at p-values below 0.10, 0.05 and 0.01). Second, look at the bottom where it says “p-value was highest for playtime”. This is a suggestion to tell you which variables should be omitted from your model. In this case, the math says that playtime just doesn’t matter – we can’t tell if a customer is going to cancel by looking at their playtime.

In general, any variable with a large p-value (and therefore no stars) should probably be dropped from your model.

Why is that the case? I don’t know in advance; this is where your theory comes in handy. It may be that some people log on a lot when they’re trying to decide whether or not to cancel, whereas others just fade away and forget about the site. You won’t know until you start gathering some raw-in-the-field research. This is something to bring to your game designer or community manager! Eventually, you might discover something fun, like perhaps there are two different kinds of playtime and only one kind is a reliable indicator of cancelling. But for the moment, let’s omit this flawed variable and move on.

To re-run the model without the bad variables, select from the main menu Test | Omit Variables; then select both “playtime” and “experience gained” to be omitted. Click OK. You’ll see another screen like this:

Now you’ve got an awesome model with some really useful variables. Every one of your variables has a really low p-value. The actual algebraic formula you created is this:

(likelihood of cancelling) = 1.31132 – (0.0470642 * logins) + (0.0567763 * friends_cancelled) – (0.0795353 * achievements)

So how can we make this work in practice? Let’s graph the output of our formula and see how well it works. From the main menu, select Graphs | Fitted, actual plot | Actual vs Fitted. You’ll see this chart:

Your model basically scores people who actually cancel as 0.6 and higher; and people who don’t cancel as 0.4 and lower. Based on this model, you might want to start offering promotional discounts or freebies to people who score above 0.6 – maybe giving really good incentives if the customer has spent lots of money in the past!
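If you wanted to put the formula to work in your own tools, it could look something like the sketch below. The coefficients are the ones from the fitted model above; the function names and the idea of a single promotion cutoff are assumptions for illustration only. Keep in mind that a linear model like this produces a score, not a true probability, so values below 0 or above 1 are possible.

```cpp
#include <cassert>

// Coefficients copied from the fitted formula in the article.
const double INTERCEPT = 1.31132;
const double COEF_LOGINS = -0.0470642;
const double COEF_FRIENDS_CANCELLED = 0.0567763;
const double COEF_ACHIEVEMENTS = -0.0795353;

// Hypothetical cutoff, based on cancellers scoring "0.6 and higher".
const double PROMO_THRESHOLD = 0.6;

// Evaluates the fitted model for one player's monthly numbers.
double cancel_score(int logins, int friends_cancelled, int achievements) {
    assert(logins >= 0 && friends_cancelled >= 0 && achievements >= 0);
    return INTERCEPT
         + COEF_LOGINS * logins
         + COEF_FRIENDS_CANCELLED * friends_cancelled
         + COEF_ACHIEVEMENTS * achievements;
}

// True if the player scores above the cutoff and might deserve an offer.
bool should_offer_promo(int logins, int friends_cancelled, int achievements) {
    return cancel_score(logins, friends_cancelled, achievements) > PROMO_THRESHOLD;
}
```

A player with 2 logins, 3 cancelled friends and 1 achievement scores well above the cutoff, while a player with 20 logins, no cancelled friends and 5 achievements scores below it.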

Next Steps

There’s so much you can do with regression – I want to encourage you to learn, but frankly some parts of it can get daunting and aren’t easily taught. Gretl has a good manual too, so good luck and happy regressing!

Voodoo at Origin!

Original Author: Michael A. Carr-Robb-John

We all love bugs; they are like nuggets of entertainment for programmers that sit somewhere between the extremes of love and hate. Some bugs, however, still come to mind years after they have been squished; indeed, ask any engineer about bugs and he or she will undoubtedly have a story or two to tell.

One bug that I thought worth sharing relates to a streaming game where if you walked a specific route from room A to B to C to D and finally to E you would find a book lying on the floor in the centre of the room. Loading into room E directly would not generate the book and looking at room E in the editor showed that the room itself didn’t contain any books. After a few tests walking different routes it seemed obvious that it was only that specific route that generated the offending object.

With the exception of the Big Bang, things don’t generally pop into existence for no reason, and for me this bug falls into the “Love” category. I also tip my hat to those chaps and lasses in QA who not only managed to find this but also managed to work out how to reproduce it.

The clue to solving the bug came when I happened to notice that the object was sitting at the origin (0.0f, 0.0f, 0.0f) and, as I soon learned, Room E was at the centre of the streaming world. The world origin, in my opinion, is the Bermuda Triangle of games: strange things happen there, and everyone would be better off wearing protective eyewear when looking directly at it.

In a lot of systems, game objects very rarely pop into existence all initialized and ready to rock and roll. Usually there is a stage where an object’s data is undefined (random) or cleared (zeroed). Because of this, nearly every game object exists at the world origin at one point or another, however briefly.

The evidence: a streaming object was being created and its visuals were being initialized, but for some reason the last step of actually setting it to its correct position was failing. It didn’t take long to track down: the physics setup for this specific book was bad, which meant that the creation of the physics failed. This then led to the position and orientation never getting set, leaving the book at the world origin. The book was actually part of Room B, but because it failed to initialize it never got cleaned up when that room was unloaded (very dangerous in a streaming environment).

An hour later, the following changes were made:

  • The book’s physics data was fixed.
  • The position of an object is set regardless of the success or failure of other systems.
  • If a failure occurs in initializing a system (e.g. physics), a very loud assert fires!
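The second and third changes can be sketched roughly as below. This is not the code from the game in question; GameObject, init_physics, spawn and report_init_failure are all hypothetical names, and the failure is only logged here (a real engine would fire its assert at that point) so that the example keeps running.

```cpp
#include <cassert>
#include <cstdio>

struct Vec3 { float x, y, z; };

struct GameObject {
    Vec3 position = {0.0f, 0.0f, 0.0f};  // freshly cleared objects sit at the origin
    bool has_physics = false;
};

// Reports an initialization failure. A real engine would assert loudly here.
void report_init_failure(const char* system) {
    assert(system != nullptr);
    std::fprintf(stderr, "INIT FAILURE: %s\n", system);
}

// Stand-in for a physics setup step that fails when its data is bad.
bool init_physics(GameObject& obj, bool physics_data_ok) {
    obj.has_physics = physics_data_ok;
    return physics_data_ok;
}

// Returns true if every system initialized. The position is applied first,
// so a physics failure cannot leave the object stranded at the origin.
bool spawn(GameObject& obj, const Vec3& spawn_position, bool physics_data_ok) {
    obj.position = spawn_position;
    const bool physics_ok = init_physics(obj, physics_data_ok);
    if (!physics_ok)
        report_init_failure("physics");
    return physics_ok;
}
```

The point of the ordering is that a broken book now ends up where the level designer put it, with a loud complaint, instead of quietly appearing in the middle of Room E.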

As I said earlier, strange things happen at the origin. I and quite a few other developers used to store unused objects at the origin, the running joke being “don’t look under the floorboards”, which was generally where the origin was. I stopped using it as a storage point, however, when I started using physics engines and the stored objects started interacting with each other, pinging themselves through the floor into the game environment. Hilarious the first time it happened, but ultimately I lost my easy storage point.

Five Little Things, Episode 1: Triple Town

Original Author: Aaron San Filippo

(Note: This was originally posted on

In this series, I satisfy my urge to categorize and make lists, by breaking down various videogame related topics into five little nuggets of easily digestible wisdom.

If you haven’t tried Triple Town from Spry Fox yet, you owe it to yourself to check it out (it’s free to play on mobile and Facebook). It could be described as a “match 3 puzzle game,” but that would do it a great disservice, as it has considerably more depth than most games in that genre, and a theme that sets it apart.

The basic concept is simple: You play in a randomly-generated field, where each turn you place 3 or more items in a grid next to each other, which combine to become a more valuable item, which can then be combined in another set of 3 for an even more valuable item. 3 grass become a bush, 3 bushes become a tree, 3 trees become a house, etc. Each turn, you’re given a random new item to place in the grid, and the goal is to get the highest possible score before the board fills up.

For a game with such simplicity, it’s surprisingly addictive. It’s so hard to put down that I thought it was worth exploring exactly why. So without further ado, I present:

Five Reasons Triple Town Is Like Crack For Your Prefrontal Cortex.

One: Shiny things, sparkling often

Games often forget to reward players frequently enough for their gameplay feats. Not Triple Town. Every match you make is a reward – a little rush of endorphins as the bushes combine into a tree, the satisfying little reward sound plays, and the score ticks up another few hundred points. It seems trivial, and it’s routine once you get into the game, but don’t take these rewards for granted: they’re like candy for your brain!

Two: Easy to play, hard to master

 Triple Town distinguishes itself from many popular F2P games of today by actually offering some depth and challenge.

What’s brilliant about it though, is that this challenge scales naturally and perfectly as the player’s skill increases. Achieving valuable items requires planning because of the nature of the game and its limited space – the more valuable the item, the more planning is required, and so each time you figure out how to make a new type of structure, the next level presents itself at just one level deeper. In this sense, the game achieves a self-balancing difficulty level, which is rarely pulled off well in games.

Three: A perfect level of randomness

This game could have been built with no randomness. The designer could have just given the player the least-valuable item for each turn, and it would have been a game of pure planning. But the chance of getting a more valuable item has a certain draw to it that’s hard to describe. When you’re planning a pattern out and you know there’s a chance you could get a bush or a tree instead of a grass, this makes it feel just a bit more exciting to try and account for that possibility. And when you have just a few spaces left and you need one of these higher value items to succeed, it makes the end game much more exciting.

In addition, the bears serve as a sort of wildcard griefer – they move around in random patterns, but still exhibit consistency: They always move from the space you put them initially, and so there is an element of strategy. Additionally, when you trap three of them together, they form a church in the space of the youngest bear. This adds a gameplay element where the player must keep track of their movement, which adds another form of brainpower for the player to utilize. This helps keep the game from becoming boring or exhausting, which would both kill the appeal.

Lastly, there are the Ninja bears. They can only be destroyed by imperial robots, which can only be gained through random chance, or by purchasing them with virtual currency. These almost feel out of place in the game’s overall design, to be honest – but their randomness nonetheless adds to the intrigue and excitement, in the same way that Mario Kart messes with your mind with the possibility that someone behind you might pick up a red shell.

Four: A constant sense of exploration

Since each additional tier of item is increasingly hard to achieve, and the randomness of the game tends to foil your best laid plans, it’s actually fairly challenging to discover all of the possible items in the game. Until you do, there’s always a pull to keep exploring and trying for those more valuable combinations. It’s more than just the high score as well – it’s about figuring out what happens if you combine three cathedrals, or three castles, just for the sense of achievement and discovery.

Five: A constant goal

Lastly – Triple Town is very good at giving the player a constant high-level score goal, which is of course achieved through smaller player-created goals within the board. You think your 40,000 point city was great? Try for 100k! And by the way, here’s a sparkly little progress bar to show you how you’re doing!

Putting it all together

These elements are all good on their own – but when you combine them, you have a game that’s very accessible, yet optimally challenging at any skill level (how many games can you say this about?!), in a package that always presents you with a simple goal and constantly rewards your progress along the way. These excellent traits should not be ignored by game designers who want to make highly compelling games that players can’t put down!

The monetization strategy, on the other hand, can die in a fire.

A subject for another day 🙂

Like these posts? Follow me on Twitter for more.

A simpler design for asynchronous APIs

Original Author: Niklas Frykholm

Accessing Internet services, e.g. to fetch a web page or to store data on a leaderboard, requires an asynchronous API. You send a request and then, at some later point, you receive a reply.

Asynchronous APIs are trickier to design than synchronous ones. You can’t simply return the result of the operation, since it isn’t ready yet. Instead you have to wait until it is done and then send it to the caller through some other channel. This often results in designs that are needlessly complicated and cumbersome to work with.

Callbacks

The most common approach is perhaps to use callbacks. You make the asynchronous request and when it completes the callback is called. The callback can either be a global system-wide callback, or (which is nicer) a callback that you supply when you make the asynchronous call.

// Called by the leaderboard system when the request completes.
void set_score_cb(SetScoreResult *result, void *user_data);
leaderboard->set_score(100, set_score_cb, my_user_data);

I have already mentioned in a previous article that I’m not too fond of callbacks and that I prefer polling in most cases. Badly designed polling can be expensive, but in the case of asynchronous network operations we wouldn’t expect to have more than a dozen or so in-flight at any one time, which means the cost of polling is negligible.

Callbacks tend to make code worse. There are several reasons.

First, you usually have little control over when a callback happens. This means that it can happen at a time that isn’t very suitable to you. For cleanliness, you may want to do all your leaderboard processing in your update_leaderboard() function. But the callback might be called outside update_leaderboard(), messing up all your carefully laid plans.

Second, it can be tricky to know what you can and cannot do in a callback. The code that calls you might make some assumptions that you inadvertently violate. These things can sometimes be really tricky to spot. Consider something as simple as:

int n = _leaderboard_operations.size();
for (int i=0; i!=n; ++i) {
	if (done(_leaderboard_operations[i]))
		do_callback(_leaderboard_operations[i]);
}

This looks perfectly innocent. But if the callback happens to do something that changes the _leaderboard_operations vector, for example by posting a new request or removing an old one, the code can blow up with memory access errors. I have been bitten by things like this many times. By now, every time I see a callback a warning clock goes off in my head: “danger, danger — there is a callback here, remember that when you make a callback anything can happen”.

Sometimes it can be necessary to double buffer data to get rid of bugs like this.
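As a rough illustration of what such double buffering might look like, the sketch below first moves the finished operations into a separate vector and only then runs their callbacks. The Operation and Leaderboard types here are hypothetical and not taken from any real engine.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical in-flight operation with a completion callback.
struct Operation {
    bool done;
    std::function<void()> on_done;
};

struct Leaderboard {
    std::vector<Operation> operations;

    // Splits the operations into finished and pending ones before any
    // callback runs. Callbacks may then post new operations freely, because
    // the callback loop no longer iterates over `operations`.
    void update() {
        std::vector<Operation> finished;
        std::vector<Operation> pending;
        for (Operation& op : operations) {
            if (op.done)
                finished.push_back(std::move(op));
            else
                pending.push_back(std::move(op));
        }
        operations.swap(pending);

        for (Operation& op : finished) {
            if (op.on_done)
                op.on_done();
        }
    }
};
```

This fixes the crash, but note what it costs: extra copies, extra code, and a reader who still has to remember that anything can happen inside on_done().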

Third, callbacks always happen in the wrong context. You get the callback in some “global”, “top-level” context, and from there you have to drill down to the code that actually knows what to do with the information. (Typically by casting the user_data pointer to some class and calling a member function on it.) This makes the code hard to follow.

In other words, callbacks lead to hard-to-read code, hard-to-follow code flow, subtle bugs, redundant boilerplate forwarding stubs and instruction cache misses. Bleh!

Request objects

Another common approach is to have some sort of request object that represents the asynchronous operation. Something like:

SetScoreRequest *request = _leaderboard->set_score(100);
if (request->is_done()) {
	bool success = request->result();
	delete request;
}

Or perhaps, using the C++11 concepts of promises and futures (I have only a passing acquaintance with C++11, so forgive me if I mess something up):

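// Needs <future> and <chrono>.
std::promise<bool> *promise = new std::promise<bool>();
_leaderboard->set_score(100, promise);
std::future<bool> future = promise->get_future();
// valid() only tells us the future has shared state, so poll with wait_for().
if (future.wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
	bool success = future.get();
	delete promise;
}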

This is a lot better than the callback approach, but still in my view, overly complicated. It is clearly a design based on the object-oriented philosophy of — when in doubt, make more objects.

But these extra objects don’t really do much. They just act as pointless intermediaries that pass some information back and forth between our code and the _leaderboard object. And they are a hassle for the caller to keep track of. She must store them somewhere and make sure to delete them when she is done to avoid memory leaks.

Furthermore, if we want to expose this API to a scripting language, such as Lua, we have to expose these extra objects as well.

ID tokens

As readers of my previous articles know, I’m a big fan of using IDs. Instead of exposing internal system objects to the caller of an API, I prefer to give the caller IDs that uniquely identify the objects and provide functions for obtaining information about them.

That way, I am free to organize my internal data however I like. And it is easier to see when the state of my objects might mutate, since all calls go through a single API.

With this approach the interface would look something like this:

unsigned set_score(int value);
  SetScoreResult set_score_result(unsigned id);

Note that there are no objects that the user must maintain and release. The ID can easily be manipulated by a scripting layer. If the user doesn’t need to know if the operation succeeded, she can just throw away the returned ID.

In this API I don’t have any method for freeing tokens. I don’t want to force the user to do that, since it is both a hassle (the user must track all IDs and decide who owns them) and error prone (easy to forget to release an ID).

But obviously, we must free tokens somehow. We can’t store the results of the set_score() operations forever. If we did, we would eventually run out of memory.

There are several ways you could approach this problem. My preferred solution in this particular case is to just have a fixed limit on the number of operations that we remember. Since we don’t expect more than a dozen simultaneous operations, if we make room for 64, we have plenty of slack and still use only 64 bytes of memory. A modest amount by any standard.

We can keep the results in a round-robin buffer:

/// Maximum number of requests whose result we remember.
static const int MAX_IN_FLIGHT = 64;
/// The result of the last MAX_IN_FLIGHT requests.
char results[MAX_IN_FLIGHT];
/// Number of requests that have been made.
unsigned num_requests;
SetScoreResult set_score_result(unsigned id)
{
	// If more than MAX_IN_FLIGHT requests have been made after this one,
	// the information about it is lost.
	if (num_requests - id > MAX_IN_FLIGHT)
		return SSR_NO_INFORMATION; // assumed name for the "result unknown" value
	return static_cast<SetScoreResult>(results[id % MAX_IN_FLIGHT]);
}

This means that you can only ask about the result of the last 64 operations. On the other hand, this solution uses very little memory, does not allocate anything, has very quick lookups and doesn’t require the user to explicitly free tokens.

To me, this added simplicity and flexibility outweighs the disadvantage of having a limit on the maximum number of in-flight operations that we support.
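For completeness, here is one way the whole ID token scheme could be put together as a self-contained class. Treat it as a sketch under assumptions: the SetScoreResult values and the complete() function (a stand-in for the network layer delivering a reply) are made up for this example.

```cpp
#include <cassert>

enum SetScoreResult {
    SSR_NO_INFORMATION,  // assumed value: slot was reused or ID never issued
    SSR_IN_PROGRESS,
    SSR_SUCCESS,
    SSR_FAILURE
};

class Leaderboard {
public:
    static const unsigned MAX_IN_FLIGHT = 64;

    Leaderboard() : _num_requests(0) {}

    // Starts a request and returns its ID. The caller never frees anything.
    unsigned set_score(int /*value*/) {
        unsigned id = _num_requests++;
        _results[id % MAX_IN_FLIGHT] = SSR_IN_PROGRESS;
        return id;
    }

    // Stand-in for the network layer reporting that a reply arrived.
    void complete(unsigned id, bool success) {
        if (is_remembered(id))
            _results[id % MAX_IN_FLIGHT] = success ? SSR_SUCCESS : SSR_FAILURE;
    }

    SetScoreResult set_score_result(unsigned id) const {
        if (!is_remembered(id))
            return SSR_NO_INFORMATION;
        return static_cast<SetScoreResult>(_results[id % MAX_IN_FLIGHT]);
    }

private:
    // True while the slot for `id` has not been reused by a newer request.
    // Unsigned subtraction keeps this correct even if the counter wraps.
    bool is_remembered(unsigned id) const {
        unsigned age = _num_requests - id;
        return age >= 1 && age <= MAX_IN_FLIGHT;
    }

    char _results[MAX_IN_FLIGHT];
    unsigned _num_requests;
};
```

A caller that cares about the outcome keeps the returned ID and polls set_score_result() with it; a caller that doesn't care simply drops the ID.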

Implicit APIs

In many cases, the best solution to asynchronous conundrums is to redesign the API to abstract away the entire concept of asynchronous operations, so that the user doesn’t even have to bother with it.

This can require some creative rethinking in order to focus on what it is the user really wants to do. For our leaderboard example, we might come up with this:

/// Sets the score to the specified value. This is an asynchronous operation.
  /// You can use acknowledged_score() to find out when it has completed.
  void set_score(int score);
  /// Returns the last score that has been acknowledged by the server.
  int acknowledged_score();

This is probably all that the user needs to know.

Now we have really simplified the API. The user still needs to be aware that set_score() isn’t propagated to the server immediately, but she doesn’t at all have to get involved in what asynchronous operations are performed and how they progress.
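Behind such an interface, the implementation might be as small as the following sketch. The simulate_server_ack() function is a hypothetical stand-in for whatever the network layer does when the reply arrives, and starting the acknowledged score at 0 is an assumption.

```cpp
#include <cassert>

class Leaderboard {
public:
    Leaderboard() : _requested_score(0), _acknowledged_score(0) {}

    /// Sets the score to the specified value. Returns immediately; the
    /// value reaches the server at some later point.
    void set_score(int score) { _requested_score = score; }

    /// Returns the last score that has been acknowledged by the server.
    int acknowledged_score() const { return _acknowledged_score; }

    /// Stand-in for the network layer: the server confirmed the last request.
    void simulate_server_ack() { _acknowledged_score = _requested_score; }

private:
    int _requested_score;
    int _acknowledged_score;
};
```

The caller never sees a request object, an ID or a callback. If she needs to know whether a score of 100 has gone through, she checks whether acknowledged_score() returns 100.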

This kind of radical rewrite might not be possible (or even desirable) for all asynchronous systems. You have to balance the value of high-level abstractions and simplifications against the need for low-level control. But it is almost always worth exploring the possibility since it can lead to interesting ideas and dramatically simplified APIs.

For example, the interface for an asynchronous web fetcher could be as simple as:

const char *fetch(const char *url);

If called with a URL that hadn’t been fetched yet, the function would issue a request for the URL and return NULL. Once the data was available, the function would return it. On the next call, the data would be freed. To fetch a web page, you would just repeatedly call the function with a URL until you got a reply.
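One possible reading of that description is sketched below. The WebFetcher class and simulate_reply() (standing in for the network layer) are assumptions, and the sketch interprets "freed on the next call" as: the returned pointer stays valid only until fetch() is called again.

```cpp
#include <cassert>
#include <map>
#include <string>

class WebFetcher {
public:
    // Returns nullptr until the reply for `url` has arrived, then hands the
    // data out once. The pointer is valid until the next fetch() call.
    const char* fetch(const char* url) {
        _delivered.clear();  // drop whatever the previous call handed out

        std::map<std::string, Entry>::iterator it = _entries.find(url);
        if (it == _entries.end()) {
            _entries[url] = Entry();  // issue the request (simulated)
            return nullptr;
        }
        if (!it->second.ready)
            return nullptr;  // still waiting for the reply

        _delivered = it->second.data;
        _entries.erase(it);
        return _delivered.c_str();
    }

    // Stand-in for the network layer delivering a reply.
    void simulate_reply(const char* url, const char* data) {
        std::map<std::string, Entry>::iterator it = _entries.find(url);
        if (it != _entries.end()) {
            it->second.data = data;
            it->second.ready = true;
        }
    }

private:
    struct Entry {
        Entry() : ready(false) {}
        bool ready;
        std::string data;
    };
    std::map<std::string, Entry> _entries;
    std::string _delivered;
};
```

In this version, asking for the same URL again after the data has been handed out simply starts a fresh request.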

Quite fetching, wouldn’t you say?

This has also been posted to The Bitsquid blog.

Shader Generator

Original Author: Simon Yeung



In the last few weeks, I was busy rewriting my iPhone engine so that it can also run on the Windows platform (so that I can use Visual Studio instead of Xcode~) and, most importantly, so that I can play around with D3D11. During the rewrite, I also wanted to improve the process of writing shaders, so I wrote a shader generator that takes surface shaders written as code. I preferred this to a node-graph editor like the one in Unreal Engine because, being a programmer, I feel more comfortable (and faster) writing code than dragging tree nodes in a GUI. In the current implementation, the shader generator can only generate vertex and pixel shaders for the light pre pass renderer, which is the lighting model I used before.

Defining the surface

To generate the target vertex and pixel shaders with the shader generator, we need to define what the surface looks like by writing a surface shader. In my version of the surface shader, I need to define 3 functions: a vertex function, a surface function and a lighting function. The vertex function defines vertex properties like position and texture coordinates.

VTX_FUNC_OUTPUT vtxFunc(VTX_FUNC_INPUT input)
{
    VTX_FUNC_OUTPUT output;
    output.position = mul( float4(input.position, 1), worldViewProj );
    output.normal = mul( worldInv, float4(input.normal, 0) ).xyz;
    output.uv0 = input.uv0;
    return output;
}

The surface function describes what the surface looks like by defining the diffuse color of the surface, its glossiness and the surface normal.

{
    SUF_FUNC_OUTPUT output;
    output.normal = input.normal;
    output.diffuse = diffuseTex.Sample( samplerLinear, input.uv0 ).rgb;
    output.glossiness = glossiness;
    return output;
}

Finally the lighting function will decide which lighting model is used to calculate the reflected color of the surface.

{
    LIGHT_FUNC_OUTPUT output;
    float4 lightColor = lightBuffer.Sample(samplerLinear, input.pxPos.xy * renderTargetSizeInv.xy );
    output.color = float4(input.diffuse * lightColor.rgb, 1);
    return output;
}

By defining the above functions, the writer of the surface shader only needs to fill in the output structure of each function, using the input structure together with some auxiliary functions and shader constants provided by the engine.

Generating the shaders

As you can see in the above code snippets, my surface shader just defines normal HLSL functions with fixed input and output structures. So to generate the vertex and pixel shaders, we just need to copy these functions into the target shader code, which will invoke the functions defined in the surface shader. Taking the above vertex function as an example, the generated vertex shader would look like:

#include "include.h"
struct VS_INPUT
{
    float3 position : POSITION0;
    float3 normal : NORMAL0;
    float2 uv0 : UV0;
};
struct VS_OUTPUT
{
    float4 position : SV_POSITION0;
    float3 normal : NORMAL0;
    float2 uv0 : UV0;
};
typedef VS_INPUT VTX_FUNC_INPUT;
typedef VS_OUTPUT VTX_FUNC_OUTPUT;
/********************* User Defined Content ********************/
VTX_FUNC_OUTPUT vtxFunc(VTX_FUNC_INPUT input)
{
    VTX_FUNC_OUTPUT output;
    output.position = mul( float4(input.position, 1), worldViewProj );
    output.normal = mul( worldInv, float4(input.normal, 0) ).xyz;
    output.uv0 = input.uv0;
    return output;
}
/******************** End User Defined Content *****************/
VS_OUTPUT main(VS_INPUT input)
{
    return vtxFunc(input);
}

During code generation, the shader generator needs to figure out which input and output structures are needed to feed into the user-defined functions. This task is simple and can be accomplished with some string functions.

Simplifying the shader

As I mentioned before, my shader generator is used for generating shaders for the light pre pass renderer. There are 2 passes in the light pre pass renderer that need different shader inputs and outputs. For example, in the G-buffer pass the shaders are only interested in the surface normal data and not the diffuse color, while the data needed by the second geometry pass are the opposite. However, all the surface information (surface normal and diffuse color) is defined in the surface function inside the surface shader. If we simply generate shaders as in the last section, we will generate some redundant code that cannot be optimized away by the shader compiler. For example, the pixel shader in the G-buffer pass may sample the diffuse texture, which requires the texture coordinates as input from the vertex shader, even though the diffuse color is not actually needed in this pass; the compiler may not be able to figure out that we don’t need the texture coordinates output in the vertex shader. Of course we could force the writer to add some #if preprocessor blocks inside the surface function for the particular render pass to eliminate the useless output, but this would complicate the surface shader authoring process: writing a surface shader is about describing what the surface looks like, and ideally the writer shouldn’t need to worry about the output of a render pass.

So the problem is to figure out what the output data are actually need in a given pass and eliminate those outputs that are not needed. For example, given we are generating shaders for the G buffer pass and a surface function:

SUF_FUNC_OUTPUT sufFunc(SUF_FUNC_INPUT input)
{
    SUF_FUNC_OUTPUT output;
    output.normal = input.normal;
    output.diffuse = diffuseTex.Sample( samplerLinear, input.uv0 ).rgb;
    output.glossiness = glossiness;
    return output;
}

We only want to keep the variables output.normal and output.glossiness. The variable output.diffuse, and the other variables that output.diffuse references (diffuseTex, samplerLinear, input.uv0), are going to be eliminated. To find such variable dependencies, we need to teach the shader generator to understand HLSL grammar and to find all the assignment statements and branching conditions from which the dependencies can be derived.
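The dependency analysis itself can be sketched independently of the parser. The following C++ is a minimal illustration, not the generator's actual code: it assumes the parse tree has already been flattened into a list of assignments (the hypothetical `Assignment` struct) and it ignores branching conditions. Starting from the outputs a pass needs, it walks the statements backwards and keeps only the assignments that feed them.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical flattened form of one assignment statement:
// `target` is written, `reads` are the variables its right-hand side uses.
struct Assignment {
    std::string target;
    std::vector<std::string> reads;
};

// Returns, for each statement, whether it must be kept so that every
// variable in `required` still receives its value. Statements are scanned
// backwards so that a kept statement can mark its own inputs as live.
// This is conservative: a kept target stays live, so earlier writes to the
// same variable are kept as well.
std::vector<bool> findLiveStatements(const std::vector<Assignment>& stmts,
                                     const std::set<std::string>& required) {
    std::set<std::string> live = required;
    std::vector<bool> keep(stmts.size(), false);
    for (std::size_t i = stmts.size(); i-- > 0;) {
        if (live.count(stmts[i].target) == 0) continue;  // result never used
        keep[i] = true;
        live.insert(stmts[i].reads.begin(), stmts[i].reads.end());
    }
    return keep;
}
```

For the surface function above, requiring only output.normal and output.glossiness drops the output.diffuse statement, and diffuseTex, samplerLinear and input.uv0 never become live.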

To do this, we need to generate an abstract syntax tree from the shader source code. Of course we could write our own LALR parser to achieve this goal, but I chose to use lex&yacc, whose grammar rules are used to generate the parse tree. By traversing the parse tree, the variable dependencies can be obtained, so we know which variables need to be eliminated. We eliminate them by taking out their assignment statements, and the compiler will do the rest. Below is the simplified pixel shader generated for the previous example:

#include "include.h"
cbuffer _materialParam : register( MATERIAL_CONSTANT_BUFFER_SLOT_0 )
{
    float glossiness;
};
Texture2D diffuseTex: register( MATERIAL_SHADER_RESOURCE_SLOT_0 );
struct PS_INPUT
{
    float4 position : SV_POSITION0;
    float3 normal : NORMAL0;
};
struct PS_OUTPUT
{
    float4 gBuffer : SV_Target0;
};
struct SUF_FUNC_OUTPUT
{
    float3 normal;
    float glossiness;
};
typedef PS_INPUT SUF_FUNC_INPUT;
/********************* User Defined Content ********************/
SUF_FUNC_OUTPUT sufFunc(SUF_FUNC_INPUT input)
{
    SUF_FUNC_OUTPUT output;
    output.normal = input.normal;
                                                                     ;
    output.glossiness = glossiness;
    return output;
}
/******************** End User Defined Content *****************/
PS_OUTPUT main(PS_INPUT input)
{
    SUF_FUNC_OUTPUT sufOut = sufFunc(input);
    PS_OUTPUT output;
    output.gBuffer = normalToGBuffer(sufOut.normal, sufOut.glossiness);
    return output;
}

Extending the surface shader syntax

As I use lex&yacc to parse the surface shader, I can extend the surface shader syntax by adding more grammar rules, so that the writer of a surface shader can declare which shader constants and textures the surface function needs; these declarations are used to generate the constant buffer and shader resources in the source code. My surface shader syntax also permits users to define their own structs and functions besides the 3 main functions (vertex, surface and lighting function); these are copied into the generated source code as well. Here is a sample of what my surface shader looks like:

  1. RenderType{
  2.     opaque;
  3. };
  4. ShaderConstant
  5. {
  6.     float glossiness: ui_slider_0_255_Glossiness;
  7. };
  8. TextureResource
  9. {
  10.     Texture2D diffuseTex;
  11. };
  13. {
  14.     VTX_FUNC_OUTPUT output;
  15.     output.position = mul( float4(input.position, 1), worldViewProj  );
  16.     output.normal = mul( worldInv, float4(input.normal, 0) ).xyz;
  17.     output.uv0 = input.uv0;
  18.     return output;
  19. }
  21. {
  22.     SUF_FUNC_OUTPUT output;
  23.     output.normal = input.normal;
  24.     output.diffuse = diffuseTex.Sample( samplerLinear, input.uv0 ).rgb;
  25.     output.glossiness = glossiness;
  26.     return output;
  27. }
  29. {
  30.     LIGHT_FUNC_OUTPUT output;
  31.     float4 lightColor = lightBuffer.Sample(samplerLinear, input.pxPos.xy * renderTargetSizeInv.xy );
  32.     output.color = float4(input.diffuse * lightColor.rgb, 1);
  33.     return output;
  34. }


This post described how I generate vertex and pixel shader source code for different render passes from a single surface shader, which saves me from writing similar shaders multiple times and from worrying about the particular shader inputs and outputs of each render pass. Currently, the shader generator can only generate vertex and pixel shaders in HLSL for static meshes in the light pre pass renderer. The shader generator is still in progress: generating shader source code for the forward pass has not been done yet, and domain, hull and geometry shaders are not implemented. GLSL support is also missing, but it could be added (in theory…) by building a more sophisticated abstract syntax tree while parsing the surface shader grammar, or by defining some new grammar rules in the surface shader (using lex&yacc) that make it easier to generate both HLSL and GLSL source code. But these will be left for the future, as I still need to rewrite my engine and get it running again…


[1] Unity – Surface Shader Examples 

[3] ANSI C grammar, Lex specification 



Porting your game from iOS to Android

Original Author: Charilaos Kalogirou

So you created a C/C++ game for iOS that gives joy to iPhone and iPad gamers from around the world. How can you deny this joy to all the loyal Android users? I can’t, so I had to port my game. It was an experience full of gain, as I say, and I think it would be nice to share some information and knowledge on the subject.

The basics

First of all, if you are feeling comfortable in your Xcode environment and enjoying the feathery wings of mother Apple, get prepared for a rough landing in Android land. Be prepared to face lots of not-so-streamlined tools, and basically no documentation. The NDK (the toolchain and libraries you need to build native apps on Android) has no relation to the SDK. It is obvious that Google is working hard on bringing native code support to the platform, but we are not yet at the point where developing native code is as nice as it is with Java.

The first thing that you must get comfortable with is the build tools and process. Google exposes to NDK developers the same tools that are used for building the actual Android platform, and by tools I mean a set of shell scripts and makefiles. What you are actually asked to do in order to build your project is write a makefile fragment that gets included in the main makefiles provided by the NDK. This causes a steep learning curve at the beginning that might demoralize some. However, once you get the hang of it, it works fine and it is probably better than rolling your own makefiles.

In the end, what you are ultimately building is a dynamically linked library that Dalvik can load through JNI. That is right, your game will still be a Dalvik Java VM process that just calls out to your library.

Mixing Java and native code

So you will not be able to fully escape Java. Your code must get along with it, and this is actually something you want, as almost all of Android’s APIs are still Java only. Most libraries for Android that you might want to use are also written in Java. For example, if you want OpenFeint for leaderboards and achievements, or Flurry for analytics, you need to talk to Java.

This is accomplished with the Java Native Interface (JNI). This interface allows Java code running in the VM to call out to, and to be called back by, native code written in C/C++. This is an example of how to call out to Java from native code:

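  // Look up the Java class, then its static method by name and signature.
  jclass cls = javaEnv->FindClass("com/openfeint/api/ui/Dashboard");
  jmethodID open = javaEnv->GetStaticMethodID(cls, "open", "()V");
  // Invoke Dashboard.open() on the Java side.
  javaEnv->CallStaticVoidMethod(cls, open);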

The only dark point in the above code is the “()V”, which is the internal type signature of the method. This is the way the Java VM describes the parameters and return value of a method. The syntax is error prone, and I suggest you always use the “javap -s myclass” command, which prints out all the methods along with their signatures. Copy and paste from there. Keep in mind that if you misspell a signature you will only find out at runtime.
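For reference, the signature format is mechanical: the parameter types go inside the parentheses, followed by the return type, using one-letter codes for primitives (I for int, Z for boolean, V for void and so on), L followed by the class path and a semicolon for objects, and a leading [ for arrays. The helper below is a small illustrative sketch with hypothetical names (jniObject, jniSignature); it only builds the string and is no substitute for checking against javap -s.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Descriptor for an object type, e.g. "java.lang.String" -> "Ljava/lang/String;".
std::string jniObject(std::string className) {
    assert(!className.empty());
    for (char& c : className) {
        if (c == '.') c = '/';
    }
    return "L" + className + ";";
}

// Joins parameter descriptors and a return descriptor into a method signature.
std::string jniSignature(const std::vector<std::string>& params,
                         const std::string& returnType) {
    std::string sig = "(";
    for (const std::string& p : params) sig += p;
    return sig + ")" + returnType;
}
```

With no parameters and a V return type this produces “()V”, the signature used for the open method above.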

Even though the latest versions of the NDK allow you to write an activity fully in native code, I went with the traditional way of creating the activity in Java and then calling out to native code from that.


Handling touch input is slightly more complex on Android than on iOS, as the designers of Android thought it would be cool to have the system pass in an array of “historical” touch events instead of calling you for each one. Apart from that the story is the same. Just make sure you handle both ACTION_UP and ACTION_POINTER_UP events.

The major issue, however, which also applies to many other aspects of porting, is that these events come in on a different thread. This might surprise some iOS developers who are accustomed to almost everything happening on the main thread’s looper. It surprised me at least, but it turns out that Android is very generous with threads. So, depending on how your engine is coded, you might have to queue up the events and then pass them to your native code from the thread on which it expects them.
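One way to do the queuing on the native side is a small mutex-protected buffer that the receiving thread fills and the game thread empties once per frame. This is only a sketch under assumptions: the TouchEvent fields and the class name are hypothetical, and a real engine may prefer a lock-free queue or do the queuing in Java instead.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical engine-side touch event; the field names are illustrative.
struct TouchEvent {
    int pointerId;
    int action;   // e.g. down, move, up
    float x, y;
};

// Thread-safe queue: the thread receiving touches pushes, the game thread drains.
class TouchEventQueue {
public:
    // Called from the thread that receives the Android touch callbacks.
    void push(const TouchEvent& e) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(e);
    }

    // Called once per frame from the thread the engine expects events on.
    // Swapping under the lock keeps the critical section short.
    std::vector<TouchEvent> drain() {
        std::vector<TouchEvent> out;
        std::lock_guard<std::mutex> lock(mutex_);
        out.swap(pending_);
        return out;
    }

private:
    std::mutex mutex_;
    std::vector<TouchEvent> pending_;
};
```

The game loop would call drain() at the start of each frame and feed the returned events, in order, to the existing input code.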

Finally, there is the handling of buttons (real hardware buttons). You will want to handle at least the back and home buttons in the way Android users expect them to work.


This is where the Android platform took me by surprise… Brace… there is no OpenAL! It was one of those things that you can’t believe, and you keep looking desperately, denying the simple truth. So it is true: if you are hoping to easily port your OpenAL-based sound engine to Android, you are in for a big disappointment. I believe it had to do with some licensing rights or something. The choices you are left with are MediaPlayer, SoundPool and OpenSL ES. The first two are Java APIs while the third is native.

MediaPlayer is basically for playing music and sounds that don’t require low latency. I could have used that for playing music, but I decided to try OpenSL. I implemented the music playing part of the engine on OpenSL and decided that I don’t like the API. Had I known from the beginning, I would have gone straight for MediaPlayer, which is very straightforward.

The SoundPool class is very well suited for playing sound effects. It also does the sound decompression for you and stores the uncompressed, ready-to-play samples in memory. It has its drawbacks, as it couldn’t handle effects bigger than 1MB in most of my test cases. The SoundPool class also has a bad history behind it. Due to a race condition in the code, SoundPool caused every app that used it to crash on the first dual core phones! Mainly the Samsung Galaxy S2 with the vanilla Android version. Can you imagine? You have your nice game running on the store and one day a company releases a phone that causes your game to crash… and sells millions of it! The fix from Samsung came a year later. Until then game developers had to drop SoundPool and probably implement the same functionality in OpenSL ES, which I tell you is not fun. The worst part is that even now that Samsung has released newer versions of Android that don’t have the problem, most users don’t upgrade. So even last month, when I released Pop Corny, most S2s had a buggy SoundPool. I decided not to drop SoundPool, and instead to detect when the game is running on a buggy version and not play sound effects at all.


Thank god Android does support OpenGL! You will have no problem here. Just be a little careful with the multi-threaded nature of Android and you will be OK (all GL commands must come from the GL thread). But you must be prepared for the variety of resolutions that Android phones and tablets have. You are no longer in the iOS ecosystem, where you can get away with just two aspect ratios (iPhone and iPad). Pop Corny already supported the aspect ratios of iPhone and iPad, so I just generalized the code to accept a certain range of aspect ratios and to add black bars as necessary beyond that. For example, the exotic resolution of 480×854 pixels on some phones is so extreme that it can’t be handled without redesigning the whole game. Therefore it gets black bars.
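The range-plus-black-bars idea comes down to clamping the screen’s aspect ratio to the supported range and centering the resulting viewport. The sketch below is an illustration rather than the game’s actual code; the Viewport struct and fitViewport name are hypothetical, and the ratios are width divided by height for whichever orientation the game uses.

```cpp
#include <algorithm>
#include <cassert>

struct Viewport { int x, y, width, height; };

// Clamps the screen's aspect ratio to the range the game supports and
// returns a centered viewport; whatever lies outside it stays black.
// minAspect and maxAspect are width/height ratios, with minAspect <= maxAspect.
Viewport fitViewport(int screenW, int screenH, float minAspect, float maxAspect) {
    assert(screenW > 0 && screenH > 0 && minAspect <= maxAspect);
    float screenAspect = static_cast<float>(screenW) / screenH;
    float aspect = std::min(std::max(screenAspect, minAspect), maxAspect);

    int w = screenW, h = screenH;
    if (screenAspect > aspect) {
        w = static_cast<int>(screenH * aspect + 0.5f);   // bars left and right
    } else if (screenAspect < aspect) {
        h = static_cast<int>(screenW / aspect + 0.5f);   // bars top and bottom
    }
    return Viewport{(screenW - w) / 2, (screenH - h) / 2, w, h};
}
```

Assuming a supported range from 4:3 to 3:2 in landscape, an 854×480 screen gets a 720×480 viewport with a 67 pixel bar on each side, while a 960×640 screen is used in full.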

It is also useful to load only the appropriate texture mipmap level and below, depending on the screen resolution. This will save precious memory, especially on the low end devices that usually come with low resolution displays.
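How the starting mipmap level is picked depends on how the textures were authored. As one possible rule, assuming the full-size textures were made for a known reference screen width (a convention invented here for illustration, along with the function name), you can skip one level for every halving that still covers the actual screen:

```cpp
#include <cassert>

// Returns how many of the largest mip levels to skip so that the first level
// loaded is no bigger than the screen needs. `referenceWidth` is the screen
// width the full-size textures were authored for (an assumed convention).
int mipLevelsToSkip(int screenWidth, int referenceWidth, int mipCount) {
    assert(screenWidth > 0 && referenceWidth > 0 && mipCount > 0);
    int skip = 0;
    // Each mip level halves the resolution; skip a level while the halved
    // reference width still covers the screen. Always keep at least one level.
    while (skip < mipCount - 1 && (referenceWidth >> (skip + 1)) >= screenWidth) {
        ++skip;
    }
    return skip;
}
```

With textures authored for a 2048 pixel wide screen, a 1024 pixel wide device would start loading at level 1 and a 480 pixel wide device at level 2.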

The major problem that you are going to face with OpenGL when porting to Android is dealing with the activity life cycle. As you probably already know, everything on Android is an activity. Even a small dialog box that you bring up is an activity. The problem is that when the dialog comes up it pauses your current activity, and if that activity was your OpenGL view, Android will trash your OpenGL context! This means that when the dialog goes away, you will have to reload every resource that OpenGL had before you can get back to rendering. The same applies when the user puts your application in the background, or takes a call mid game. Loading all your textures again whenever something like that happens is unacceptable.

This one took me a while to sort out, possibly because I had no actual Android device to test on and was relying on the slow beta tester round trip. Anyhow, it turns out that this was fixed in version 3.0 of Android. The GLSurfaceView of that version adds a method named setPreserveEGLContextOnPause(boolean) that, when set to true, tries to preserve the GL context. But as you know, very few people upgrade in the Android ecosystem. So what I did was take the GLSurfaceView class from the latest Android sources, make some changes, and use that instead of the one on the user’s phone. Simple as that. However, even with that, many phones were losing the GL context. It turns out that GLSurfaceView did not preserve the context when the GPU was an Adreno, regardless of whether the GPU supported multiple contexts. Well, all the Adreno based devices I tried can preserve the context, and simply removing that test in GLSurfaceView’s source allows the game to continue instantly after an activity pause. Case closed.


The final thorn in the porting endeavor was asset management and loading. Those who come from iOS will be surprised to find out that Android does not decompress the application bundle when installing an application, like iOS does. The files remain in the .apk file, which is essentially a zip file. This causes a number of problems. You can’t just use your trusted system calls to open a file and read it. You have to open the apk file and poke around in it, find your file, decompress it, and then use it.

For some files you can skip the decompression part. There are some kinds of files that the build process stores uncompressed in the apk, mostly media files that are already compressed. If you use ant for building, you can actually add more file extensions to the no-compression list. Unfortunately, I didn’t manage to find a way to do it with Eclipse. These files can be loaded easily (with the usual file manipulation functions) using the file descriptor of the apk, an offset, and a length, all of which you can get from the Java AssetManager. In the case of a compressed file, however, you will have to load it completely in Java using the asset manager and copy the whole file data over to C using JNI, which is inefficient.
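On the native side, reading such a stored (uncompressed) file boils down to a positioned read. The sketch below assumes the descriptor, offset and length have already been passed down from Java (for instance from the AssetFileDescriptor that AssetManager.openFd returns). The function names are hypothetical, error handling is minimal, and the path-based wrapper is a variant for engines that are handed the apk path rather than a descriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Reads `length` bytes starting at `offset` from an already open descriptor.
// Returns an empty vector on error or if the range runs past the end of file.
std::vector<char> readStoredAsset(int fd, off_t offset, std::size_t length) {
    std::vector<char> data(length);
    std::size_t done = 0;
    while (done < length) {
        // pread does not move the descriptor's file position, so several
        // assets can be read through the same apk descriptor.
        ssize_t n = pread(fd, data.data() + done, length - done,
                          offset + static_cast<off_t>(done));
        if (n <= 0) return {};
        done += static_cast<std::size_t>(n);
    }
    return data;
}

// Variant for when only the path of the apk is known on the native side.
std::vector<char> readStoredAssetFromPath(const char* apkPath, off_t offset,
                                          std::size_t length) {
    int fd = open(apkPath, O_RDONLY);
    if (fd < 0) return {};
    std::vector<char> data = readStoredAsset(fd, offset, length);
    close(fd);
    return data;
}
```

This only works for entries that are stored without compression; a compressed entry read this way would yield the raw deflate stream, not the file contents.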

Thankfully, Google added native asset loading capabilities in version 2.3. So if you are only supporting 2.3 and up, you can forget all of the above and use the native API directly. It does all the work for you.

Closing words

As you can see, the Android platform has its quirks, most of the time because the NDK is still too young. It is getting better with every new version, though. If only Android users were quick to upgrade to the latest version…

To all of the above, add that you probably want to compile for three different CPUs: ARM, ARMv7 and x86. That’s right, x86. There are quite a few tablets out there that are based on x86 right now, and we are probably going to see even more with time. Endianness might also give you some trouble if you weren’t careful with it when originally developing the game for iOS. Not because of an actual difference in endianness, but mainly because of depending on iOS/OSX specific C defines for detecting it.
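A portable way around platform-specific defines is to detect the byte order at runtime, or better, to read multi-byte values in a way that does not depend on it at all. The two helpers below are a minimal sketch with hypothetical names, assuming the data files store 32-bit values in little-endian order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Detects byte order at runtime by looking at how a known 32-bit value is
// laid out in memory, so no platform-specific macro is needed.
bool isLittleEndian() {
    const std::uint32_t probe = 1;
    unsigned char firstByte;
    std::memcpy(&firstByte, &probe, 1);
    return firstByte == 1;
}

// Reads a little-endian 32-bit value from a byte buffer. Assembling it with
// shifts gives the same result on any host, so most loading code does not
// need to know the host byte order at all.
std::uint32_t readLE32(const unsigned char* p) {
    assert(p != nullptr);
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}
```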

It might be a little cumbersome at times, but the effort really pays off. In the end you have a completely new world for your game to explore. And the Android users are also very welcoming and warm. A lot more than iOS users, I think. So let’s give them our games!

Good luck!

[This was originally posted on Thoughts Serializer]