Why Audio Matters

Original Author: Jefferson Hobbs

There are two kinds of people reading this article. First are the people who saw the word “audio” in a game industry blog and got excited because that doesn’t happen very often. And then there are the people I am actually writing this post for. If you fall in the first group, don’t fret. You should still keep reading. I am sure you will learn something new. If you are in the second group, be brave and venture forth. This won’t hurt too much.

Current State of Audio

Audio is probably the most underrated element in a game. Granted, there are a few companies that value audio highly, but generally speaking it does not get the respect it deserves. More often than not, audio is treated like a checkbox that you have to tick before you ship. It’s not something that everyone goes over with a fine-tooth comb and tries to get 110% right.

However, things are changing…slowly but surely. It’s tough to quantify the amount of buzz audio is getting around the industry, but I’ve seen more people starting to get into it lately. It’s not overwhelming, but it is increasing from the bottom up. Hopefully this article will show you why audio is important and some simple steps you can take to improve it in your games.

The Impact of Audio

Audio has more of an impact on a game than most people think. For some reason people forget about the sense of hearing. It truly baffles me. How can someone think that a human sense doesn’t play much of a factor in a gamer’s experience? It doesn’t make sense (yep…that’s a bad pun and you just read it. Sorry). What does make sense (sorry again) is for game developers to take a step back and look at what audio is actually doing.

Below is a list of things about audio that you may not know. Hopefully, you are able to take away the point that audio is an important element in games.

WARNING: Some of the information below is not really referenced…unless you count the word on the street as a valid reference.

The ear is hard to trick
If you have bad audio, people are going to notice. They might not realize what it is, but they will feel something is wrong. One theory I have heard from Alan Kraemer (CTO of SRS Labs) is that audio has a relatively low bandwidth compared to other senses like vision. The brain is able to do a more thorough analysis of the sound waves as a result. This makes it tough to trick.

Audio makes things Look Better
There is a famous study in the consumer electronics industry where researchers tricked people into thinking a TV looked better by changing only the quality of the sound. They took two TVs that were exactly the same except for the speakers. When asked what they thought about the video quality, more people said that the TV with the better audio looked better. I am sure that this also applies to games. Who would have thought that one could improve a game’s graphics simply by improving the sound?

Movie Industry

The movie industry has known the value of audio for a long time now. A lot of the emotion and thrill is actually in the soundtrack. Imagine a scary movie without eerie sounds followed by a sudden screech. Or think about what Star Wars would be like if you replaced the sounds with a crappy free sound library and generic music. The movie industry has invested a lot into sound because they know it helps them make money.

The whole brain is involved

Music has the ability to activate the whole brain and even trigger the production of certain chemicals. If audio wasn’t important, then clearly the brain would just filter it out.

Audio is something that people need even if they don’t demand it

As with so many misconceptions in psychology, a person might not know what they actually need. Just because someone doesn’t ask for something doesn’t mean that it’s not important.

Good Ol’ Fashioned Shotgun
The shotgun is everyone’s favorite weapon. Why? Because it goes BOOM! QED.

How do you fix this?

First off, audio is not hard. It’s not. Really. The amount of energy and man-hours it takes to get good audio is far, far less than what it would take to get good graphics, gameplay or physics. Usually the biggest roadblock in adding audio features is realizing that there are other audio features to add besides playback. Below is a list of small steps a company can take to help out their audio:
  • Get good-sounding assets. You can’t make something out of nothing…Well, you can, but it’s crap.
  • Hire at least one engineer who knows audio. He/she doesn’t have to work on audio 100% of the time, but he/she should have knowledge of sound effects, music, or interactive audio.
  • Try to get an in-house sound designer. This might break the bank for some, but it’s worth it if you can afford it. An in-house designer will speed up integration time and can offer up ideas for new audio features.
  • Listen to your game’s audio and critique it like you would a new game mechanic. If it doesn’t sound right, send it back to the cook and have it remade.
  • Put your energy into the music first and then the sound effects. Music is played ALL the time and carries most of the emotion. Make sure that it resonates with the game and sets the environment that you are aiming for. Once you get that, you can then focus on the sound effects.

Audio Effects to Think About

To conclude my post, here is a quick list of simple audio effects that you can apply to enhance your game. These may be things that you have not thought of before, but they are trivial to set up with an API like OpenSL ES (a rough code sketch follows the list).
  • EQ of Death – Lower the 3D effect and apply a low pass filter (i.e. cut the treble) when the player is low on health. This will lower the clarity and disorient the player, just like blurring the screen does in the graphics world.
  • Cinematic Stereo Widening – Widen the sound during cinematic points. For instance, if the player is walking down a small hallway into a big room with a boss or dramatic cut scene, you can start off with a narrow sound stage and then widen it as the room opens up.
  • Bigger Explosions – Apply a bass boost effect to enhance the explosion’s boom (note that bass boost is different from just increasing the bass). Then apply a low pass filter for a short time afterwards to shock the player audibly.
  • Engulfing Sounds – Apply stereo widening on sounds that encompass the player like crowd noise, fire, rain, bees, etc.
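To make the explosion example concrete, here is a rough sketch of the bass boost step in OpenSL ES. BoostExplosion is my own illustrative wrapper, and it assumes the player object was created with SL_IID_BASSBOOST in its requested interface list:

#include <SLES/OpenSLES.h>

// Rough sketch: boost the low end on an explosion's audio player.
void BoostExplosion(SLObjectItf playerObject)
{
    SLBassBoostItf bassBoost;
    if ((*playerObject)->GetInterface(playerObject, SL_IID_BASSBOOST,
                                      &bassBoost) == SL_RESULT_SUCCESS)
    {
        (*bassBoost)->SetStrength(bassBoost, 800);   // strength in permille (0-1000)
        (*bassBoost)->SetEnabled(bassBoost, SL_BOOLEAN_TRUE);
    }
}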

Thanks for reading! If you know of other cool things about audio, please share them in the comments. I (and others too) would love to hear about them.


Should we be worried about Nintendo?

Original Author: Kyle-Kulyk

As the saying goes, where there’s smoke, there’s fire, and lately there seems to be a lot of smoke centered over Nintendo, with good reason.  Recently, Nintendo revised their annual forecast, predicting profits to plunge to 27-year lows on the news of losses related to the 3DS, waning Wii sales and foreign exchange concerns.  It’s difficult to discuss Nintendo’s future without having die-hard fans descend on you like a pack of furious monkeys, but I feel the games industry and gamers alike should be concerned about what this could mean for the iconic company and the impact on the industry.

There’s no denying that the Wii was a runaway hit, however with Wii sales beginning to fade Nintendo has readied its replacement in the Wii-U.  The Wii-U looks to match other current gen consoles graphically, but it will use both Wii controllers and a giant, expensive-looking, iPad-like controller.  Now, correct me if I’m wrong, but does anyone else see this as an issue?  Growing up with three other siblings, I can only imagine the chaos of presenting us with the choice of one giant, special controller with everyone else receiving regular controllers.  I’m all for teaching children the value of sharing, but let’s be realistic here.  Analysts have also expressed their concerns, and investors began to dump Nintendo shares en masse after the Wii-U’s reveal.  Does the Wii-U even have a shot at the Wii’s level of success, or will it be passed up as being nothing more than the Wii HD with an iPad attachment?

And what if the Wii-U doesn’t match the Wii’s success?  Nintendo has faced struggles with home consoles before and survived.  Many considered the GameCube a blunder, with complaints of it being too “toy-ish” and of not offering the technical advantages available from the competition, but Nintendo pulled through.  What’s different this time?

I think the main difference this time, and the real reason we need to be concerned for one of the industry giants, is changes in the portable gaming market.  Portable games have been Nintendo’s bread and butter for some time.  Even back when the GameCube was attempting to wrestle out a corner of the market, Nintendo leaned on excellent sales from their GameBoy and GameBoy Advance platforms, both for hardware and software sales.  The GameBoy Advance didn’t even begin to decline until its replacement, the Nintendo DS, became available, and the DS picked up the Nintendo revenue torch and ran with it despite what many considered a lacklustre launch.

Unlike the GameBoy Advance, however, the DS began its sales descent well ahead of the launch of the 3DS, which is a large part of why Nintendo found themselves forced to slash their forecasts.  The DS’s sales decline coincidentally (or not) coincided with the surge of another device: the Apple iPhone.  That’s why, as a long-term industry watcher, I think Nintendo is in trouble.  The DS’s replacement, the 3DS, has already received a massive price drop to spur sales, while agencies like Reuters are describing the device as a “flop” amidst complaints of a lacklustre software line-up and confusion regarding the safety of 3D displays and children’s eye development.  Industry analysts are already predicting far fewer 3DS units sold compared to the DS’s accomplishments.  If the 3DS is unable to find a foothold in a new market where casual gamers can find their gaming fix via their cell phones, and if Nintendo is hit with the double whammy of the Wii-U failing to capture consumers – where does that leave Nintendo?

It was heresy in the past to even suggest that Sega might be forced to withdraw from the console market, but we all know what happened there.  Could we see Nintendo become a software only company with Mario and Link making appearances on Sony’s Playstation?  Might a partnership with another company to create a unified console take some of the burden off Nintendo when it comes to the enormous costs associated with new hardware development?  Might we see Nintendo apply their portable gaming expertise to the Android and iPhone markets or will Nintendo merely weather this storm?  No matter what happens, my gut tells me we may witness a shift within the games industry sooner rather than later.

“How did I crash in that function?”

Original Author: Chad Bramwell

void main()
{
	int* p = 0;
	*p = 4919;
}

*CRASH*

First-chance exception at 0x00161015 in test.exe: 0xC0000005: Access violation writing location 0x00000000.

Unhandled exception at 0x00161015 in test.exe: 0xC0000005: Access violation writing location 0x00000000.

Have you ever stopped to wonder what those 2 lines mean?

  • What is a “First-chance exception?”
  • What is an “Access violation?”
  • What is an “Unhandled exception?”
  • What are those 3 hex values?

Exceptions

To answer the “First-chance exception” and “Unhandled exception” questions, I’ll link you to a nice little article written by David Kline that happens to be at the top of the Google search hits. For the lazy: when your program encounters an “exceptional” circumstance, a “first-chance” exception is thrown. This allows your debugger to do something (i.e. stop execution) before the exception is passed off to any exception handling routine you have set up (for ex: try/catch blocks). Breaking on a first-chance exception is disabled by default but can be turned on in Visual Studio (2005/2008) through Debug->Exceptions… by checking all the “Thrown” boxes. If the debugger decides to do nothing with the exception, it allows the program to handle it; if the program is unable to handle the exception, a second “unhandled” exception is thrown and your program will stop.
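To see the handled/unhandled distinction in action, here is a minimal MSVC-specific sketch using structured exception handling; the debugger still reports the first-chance exception before the __except filter runs, but no unhandled exception follows:

#include <cstdio>
#include <windows.h>

int main()
{
	__try
	{
		int* p = 0;
		*p = 4919;	// raises the access violation from above
	}
	__except (GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION
	              ? EXCEPTION_EXECUTE_HANDLER
	              : EXCEPTION_CONTINUE_SEARCH)
	{
		printf("caught the access violation, so no unhandled exception\n");
	}
	return 0;
}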

Violations

An “Access violation” is any action involving memory that cannot be completed by the processor. Meaning, you just tried to read/write memory that does not exist, or you tried to read/write memory that the operating system has decided you may not touch.

HEX?

Finally we get to the strangest and, in my humble opinion, the most enlightening question of all: what do those hex values mean? Let’s enter hex’s favorite playland: assembly. (Note: to see this for yourself, right-click in your crashed cpp file and select “Go To Disassembly”.)

void main()
{
	int* p = 0;
0016100B  mov         dword ptr [p],0
	*p = 4919;
00161012  mov         eax,dword ptr [p]
00161015  mov         dword ptr [eax],1337h
}

Take a look at the line you expect to crash. Notice anything about the numbers on the left around it compared to our 2 exception messages?

That’s right! Next to the line that crashed we have 00161015, the same number as the 0x00161015 from our exception messages! The debugger is telling us that we crashed attempting to execute that line of assembly. Now that we know where it crashed, let’s try to figure out why we got an access violation.

mov

Let’s take a quick look at line 00161015 and the line above it. On 00161012 we are setting the value of register eax to the value of our pointer. For work to be done in a processor (like add, multiply, etc.) it must be done on registers. Normally a mov occurs from multiple variables into registers, then some operation is performed, and finally the end result of that operation is written back out to the variable so the processor can free up the register to do more work. On the line we crashed on, we are attempting to write the literal value 1337h (aka 4919 in decimal) to the location pointed to by eax; “[eax]” means the value pointed to by eax. And that’s why we got the “Access violation writing location 0x00000000”: eax is 0, so we are attempting to write data to location 0x00000000!

That’s a lot of stuff that I just threw at you. Let’s go through some more examples to see how helpful understanding these 2 lines can be!

Structures

struct test
{
	int foo;
	int bar;
};
void main()
{
	test* t = 0;
	t->bar = 4919;
}

*CRASH*

First-chance exception at 0x00101015 in test.exe: 0xC0000005: Access violation writing location 0x00000004.

Unhandled exception at 0x00101015 in test.exe: 0xC0000005: Access violation writing location 0x00000004.

void main()
{
	test* t = 0;
0010100B  mov         dword ptr [t],0
	t->bar = 4919;
00101012  mov         eax,dword ptr [t]
00101015  mov         dword ptr [eax+4],1337h
}

Once again we are dereferencing a NULL pointer and once again our program is crashing, but this time we are trying to set a value in a structure. So what’s changed between our exceptions? Obviously the assembly line we crashed on has changed; what else?

Do you see some 4s showing up in some places?

  • …writing location 0x00000004
  • dword ptr [eax+4],1337h

Where is that coming from? Well, we are trying to access the variable bar in our test structure. foo and bar are ints, and sizeof(int) in 32-bit programs = 4 bytes = 32 bits. So our program is taking the address of our structure in memory and offsetting that address to get to our variable inside the structure. 0x00000000 + sizeof(int) = 0x00000004. As you can see from the assembly instruction that we crashed on, “[eax+4]”, we are attempting to assign some data to the variable in our structure, which is at (pointer to our structure) + (offset to our variable in the structure). 0 + 4. Cool.

Basics Complete. Now onto struct/class functions.

#include <cstdio>
struct test
{
	void print()
	{
		printf("hello!\n");
	}
};
void main()
{
	test* p = 0;
	p->print();
}

Now what do you expect to happen? A crash, right? …

hello!

Wait, what!?

Let’s take a look at the assembly.

struct test
{
	void print()
	{
		printf("hello!\n");
012D103E  push        offset string "hello!\n" (12E81CCh)
012D1043  call        printf (12D11F7h)
	}
};
void main()
{
	test* p = 0;
012D100B  mov         dword ptr [p],0
	p->print();
012D1012  mov         ecx,dword ptr [p]
012D1015  call        test::print (12D1030h)
}

Take a close look at ecx and follow its use. Did you notice ecx is only ever set to 0? We never use ecx anywhere! Nowhere in our generated code do we have “[ecx]”!! Let that sink in a bit. Try to understand that the compiler did everything right; it did everything you asked it to.

Ever wonder why sometimes your code crashes multiple levels deep in functions in your class when the real problem was that your pointer was NULL? Here is the reason why. The compiler has no reason, whatsoever, to touch any data in your class unless you tell it to.

Let’s cement this

#include <cstdio>
struct test
{
	int foo;
	int bar;
	void crash()
	{
		printf("hello!\n");
		bar = 4919;
	}
};
void main()
{
	test* p = 0;
	p->crash();
}

hello!

First-chance exception at 0x0003104e in test.exe: 0xC0000005: Access violation writing location 0x00000004.

Unhandled exception at 0x0003104e in test.exe: 0xC0000005: Access violation writing location 0x00000004.

#include <cstdio>
struct test
{
	int foo;
	int bar;
	void crash()
	{
		printf("hello!\n");
0003103E  push        offset string "hello!\n" (481CCh)
00031043  call        printf (31201h)
		bar = 4919;
0003104B  mov         eax,dword ptr [this]
0003104E  mov         dword ptr [eax+4],1337h
	}
};
void main()
{
	test* p = 0;
0003100B  mov         dword ptr [p],0
	p->crash();
00031012  mov         ecx,dword ptr [p]
00031015  call        test::crash (31030h)
}

Now, a little “gotcha” before I continue. You may notice that we are using this instead of ecx on line 0003104B. Time for me to be honest: I’m hiding some assembly from you. My goal in this article is not to teach you assembly but to help you understand how and why programs crash and, more importantly, what you can glean from crashes to aid you in debugging and bug-fixing! this is a renaming by Visual Studio of the register containing our struct/class pointer. In our case that happens to be ecx.

*WHEW*

Hopefully, from the previous examples you now understand 2 things:

  • Why “hello!\n” is printed
  • Why we crashed on 0003104E

VIRTUAL FUNCTIONS

#include <cstdio>
struct test
{
	virtual void crash()
	{
		printf("hello!\n");
	}
};
void main()
{
	test* p = 0;
	p->crash();
}

Now what should happen? You may be surprised by the answer…

First-chance exception at 0x00261016 in test.exe: 0xC0000005: Access violation reading location 0x00000000.

Unhandled exception at 0x00261016 in test.exe: 0xC0000005: Access violation reading location 0x00000000.

Some of you might be thinking: “So wait, if I don’t make that function virtual then it doesn’t crash? But when I make it virtual it crashes!? WTF is going on? It’s not like I’m touching any data inside of my structure!”

“It’s not like I’m touching any data inside of my structure!” I’m sorry to break it to you, good sir/ma’am, but you are touching data. To answer why that’s the case, let’s take a look at the assembly!

//... removed for clarity
{
	test* p = 0;
0026100C  mov         dword ptr [p],0
	p->crash();
00261013  mov         eax,dword ptr [p]
00261016  mov         edx,dword ptr [eax]
00261018  mov         esi,esp
0026101A  mov         ecx,dword ptr [p]
0026101D  mov         eax,dword ptr [edx]
0026101F  call        eax
}

Our exception says “Access violation reading location 0x00000000”. Notice the difference? We are reading data here instead of writing it: 00261016 mov edx,dword ptr [eax]. Simply put, the value of eax is 0 and that is not a valid location to read from. But how did eax get set to 0? Take a look at the line above! eax is 0 because our pointer p is zero. Why are we trying to read the value from eax at all? Because we are calling a virtual function. Whenever code calls a virtual function, the compiler must do 2 things. First: it must make room in your struct/class for a virtual function table, or vftbl for short. Second: it must generate code to figure out which function to call at runtime. So the line we crashed on is just doing the work to figure out which function to call. It has to read the first variable of our structure (because that is where the vftbl pointer is stored) to figure that out.
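If it helps, here is roughly what the compiled call expands to, written as C-style pseudo-source (an illustration of the single-inheritance, 32-bit case, not exact compiler output):

// Roughly what p->crash() expands to (illustrative):
void** vftbl = *(void***)p;                     // read the vftbl pointer stored at offset 0 of *p
                                                // *CRASH* when p == 0: reading location 0x00000000
void (*fn)(test*) = (void (*)(test*))vftbl[0];  // fetch the entry for test::crash
fn(p);                                          // call it, passing p as 'this' (via ecx)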

POP QUIZ

What do you expect the code below to do? (Hint: the code will crash, try to guess what line it will crash on and what location the Access Violation will occur at)

#include <cstdio>
struct test
{
	int foo, bar;
	void print()
	{
		printf("print\n");
	}
	void touch()
	{
		printf("touch\n");
		bar = 4919;
	}
	virtual void crash()
	{
		printf("crash\n");
	}
};
void main()
{
	test* p = 0;
	p->print();
	p->touch();
	p->crash();
}
#include <cstdio>
struct test
{
	int foo, bar;
	void print()
	{
		printf("print\n");
0035105E  push        offset string "print\n" (3681CCh)
00351063  call        printf (351251h)
	}
	void touch()
	{
		printf("touch\n");
0035108E  push        offset string "touch\n" (3681D4h)
00351093  call        printf (351251h)
		bar = 4919;
0035109B  mov         eax,dword ptr [this]
0035109E  mov         dword ptr [eax+8],1337h
	}
	virtual void crash()
	{
		printf("crash\n");
	}
};
void main()
{
	test* p = 0;
0035100C  mov         dword ptr [p],0
	p->print();
00351013  mov         ecx,dword ptr [p]
00351016  call        test::print (351050h)
	p->touch();
0035101B  mov         ecx,dword ptr [p]
0035101E  call        test::touch (351080h)
	p->crash();
00351023  mov         eax,dword ptr [p]
00351026  mov         edx,dword ptr [eax]
00351028  mov         esi,esp
0035102A  mov         ecx,dword ptr [p]
0035102D  mov         eax,dword ptr [edx]
0035102F  call        eax
}

print

touch

First-chance exception at 0x0035109e in test.exe: 0xC0000005: Access violation writing location 0x00000008.

Unhandled exception at 0x0035109e in test.exe: 0xC0000005: Access violation writing location 0x00000008.

We crash on 0x0035109e because that is the first place in our program’s execution that touches invalid memory. Here’s the ordering laid out:

0035100C  mov         dword ptr [p],0           ;p = 0
00351013  mov         ecx,dword ptr [p]         ;store p (in case the function call needs a this pointer)
00351016  call        test::print (351050h)
;enter test::print
0035105E  push        offset string "print\n" (3681CCh)
00351063  call        printf (351251h)
;leave test::print and return to main
0035101B  mov         ecx,dword ptr [p]         ;store p yet again (code was compiled without optimizations)
0035101E  call        test::touch (351080h)
;enter test::touch
0035108E  push        offset string "touch\n" (3681D4h)
00351093  call        printf (351251h)
0035109B  mov         eax,dword ptr [this]      ;move ecx (aka this) into eax
0035109E  mov         dword ptr [eax+8],1337h   ;*CRASH* writing to [0 + 8] = 0x00000008

Why did the compiler output [eax+8]? sizeof(vftbl pointer) + sizeof(test::foo) = 4 + 4 = 8.

If you wish to see the value of 8 for yourself you can use offsetof.

Also in the Visual Studio watch window you could write:

&((test*)0)->bar
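Or, as a quick sketch you can compile yourself (note that offsetof on a type with virtual functions is technically conditionally-supported C++, though MSVC accepts it):

#include <cstdio>
#include <cstddef>

struct test
{
	int foo, bar;
	virtual void crash() {}
};

int main()
{
	// vftbl pointer (4 bytes in a 32-bit build) + foo (4 bytes) = 8
	printf("%u\n", (unsigned)offsetof(test, bar));
	return 0;
}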

So there ya go. A whole lot of null dereferences, a whole lot of assembly and, hopefully, a better understanding of what happens when your program crashes and what initial steps you can take to understand and solve the problem. Happy Debugging!

—–UPDATE September 28, 2011—–

Stefan Reinalter posted a link to an excellent article by Elan Ruskin that goes much, much more in-depth into this stuff. I highly recommend working through the “forensic debugging” slide-show!

http://assemblyrequired.crashworks.org/2011/03/08/annotated-slides-for-gdc11-forensic-debugging/

GDC 2011: Crash Analysis and Forensic Debugging


Reflection in C++, Part 1: Introduction

Original Author: Don Williamson

If there was one job I’d love to do other than writing games it’d be writing compilers. This probably explains my obsession with the subject of reflection, a topic I’ve been hammering away at for almost 10 years now. Having written a few compilers in the past, it became glaringly obvious to me that reflection would be quite simple to add to C++ (if you’re willing to place some limits on it) and that the language has suffered from its absence.

Adding reflection to C++ via a library or other means can be a simple task, a very hard task, or a down-right impossible task. You can’t reflect all aspects of your C++ program and it’s highly unlikely that you will ever want to do so.

Over the coming weeks I’ll be writing some tutorials on reflection from the point of view of a game programmer. This first post is very high-level in the hope that it’s a gentle introduction to the subject that can provide you with some key reasons why you might want to add reflective features to your game engine.

Subsequent posts will provide in-depth case studies of methods I’ve developed in the past that are either out there in shipped games, buried in code bases never to be seen again or the result of frenzied late night coding sessions:

  • I’ll cover a very simple method of reflection that can be very powerful, developed in my spare time as Reflectabit, with a similar implementation written for Splinter Cell: Conviction. The main selling points of the implementation are the ease with which you can replicate anything over a network connection and the extra bonus of being able to live-edit your C++ code while the game is running.
  • This will be followed by an approach that required the development of an IDL compiler and some crazy template programming for performing binding to arbitrary programming languages. Even though it worked on a PSP, it wasn’t the ideal method of achieving a solution for that platform and a subset of its implementation could prove a good match for others out there.
  • Another spare time project of mine I’ll cover is something I informally call Reflectalot. It works by scanning a PDB file and is surprisingly thorough at providing you with most of the information you need, albeit not really cross-platform (PC & Xbox 360 only). One of its cunning little features is its ability to provide you with a constant-time typeof operator.
  • Finally I’ll cover my latest development, which uses the clang C++ frontend to build a reflection database. This to me is as close to ideal as I’m going to get for C++ on Windows, however it can be taken to its logical conclusion on other platforms such as MacOS or Linux where the LLVM backend is more stable. Please check out its webpage because I’d really love some help developing it further!

What is Reflection?

A reflection API is a very basic, powerful tool that every game studio should have at their disposal. It normally contains some or all of the following features:

  • A database of types and their inheritance relationship with each other.
  • A means of creating objects of a specific type by name.
  • A list of data member descriptions for each type, with name/type/offset tuples.
  • A database of enumeration types and their associated key/value pairs.
  • A database of functions/methods with their return types and parameter lists.
  • A means of calling functions/methods by name at runtime with an arbitrarily constructed parameter list.
  • A database of properties represented as Get/Set method pairs that externally look like a named value.
  • A database of attributes that can be attached to any of the above, describing how they should be used.

Each language has varying levels of support for reflection; what C++ has is RTTI. You can do various things with RTTI but it’s an incredibly limited system that only gives you:

  • The ability to discover an object’s type at runtime through the typeid operator.
  • A typeid operator that can also be applied to types themselves.
  • A type’s name, its hash code and some comparison functions.
  • Runtime downcasting and similar operations through dynamic_cast.

This is not nearly enough! RTTI also has varying levels of support between compilers and type names are implementation specific.

So why would you want reflection? Perhaps it’s best to list a few things that it can enable:

  • Serialisation of any game type.
  • Transparent implementations of various backend data formats with one point of serialisation for any given format.
  • Versionable serialisation of any data.
  • Inspect game state of any object at runtime for debugging.
  • Dependency tracking with the pointer graph (ever wanted to know what objects are dependent on another before deleting?).
  • Reloadable resource (mesh, texture, script, etc) reference updating.
  • Automatically populate and describe user interfaces for editing tools.
  • Binding to arbitrary programming languages (Lua, C#, Python, etc.) through minimal translation layers.
  • Network communication/replication through serialisation and RPC.
  • Memory mapping of data formats with post-load pointer patching.
  • Live C++ code editing.
  • Garbage collection or defragmentable memory heaps (useful on systems where the GPU uses physical addressing).

You can of course build individual systems for each of these but they all share the same need to register type data and access it offline or at runtime. Using reflection for these systems can either make everything easier to understand and maintain or obfuscate intent and lead to a brittle code base. As such, a clean and simple reflection API is absolutely vital if you intend to adopt one.

Generating a reflection database can be done in any number of different ways with C++, including:

  • Using macros to simultaneously annotate your code and generate registration calls (a sketch follows this list).
  • Using templates and meta-programming techniques to achieve the same goal.
  • Using a hybrid of the above or even doing it non-intrusively. Collectively these are runtime databases with no offline representation.
  • Using an IDL/DDL compiler to generate cpp/h files containing C++ equivalents and registration code. This can also generate an offline representation of your database that can be used in tools.
  • Using an existing language that already has reflection to describe your data/interfaces to achieve the same as the previous method (C# is a good candidate for this).
  • Performing a pre/post process on your C++ code using a custom parser that picks up interesting information.
  • Inspecting debug information emitted by the compiler.

There are many tradeoffs with each technique and covering each is beyond the scope of these posts. However, the use-cases should be broad enough to show how varied implementations can be.
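To make the first of those options concrete, here is a minimal sketch of macro-based registration. The database layout and macro names are illustrative, not taken from any particular library:

#include <cstddef>
#include <string>
#include <vector>

struct FieldDesc { std::string name; size_t offset; };
struct TypeDesc  { std::string name; size_t size; std::vector<FieldDesc> fields; };

static std::vector<TypeDesc> g_types;

// One macro invocation both annotates the code and emits the registration call
#define REFLECT_TYPE(T)     g_types.push_back({#T, sizeof(T), {}})
#define REFLECT_FIELD(T, f) g_types.back().fields.push_back({#f, offsetof(T, f)})

struct Vector { float x, y, z; };

void RegisterTypes()
{
    REFLECT_TYPE(Vector);
    REFLECT_FIELD(Vector, x);
    REFLECT_FIELD(Vector, y);
    REFLECT_FIELD(Vector, z);
}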

Basic C++ Reflection API

To introduce the above concepts we’ll need a quick API we can talk about:

struct Name { int hash; string text; }
struct Primitive { Name name; }
struct Type : Primitive { int size; }
struct EnumConstant : Primitive { int value; }
struct Enum : Type { EnumConstant constants[]; }
struct Field : Primitive { Type type; int offset; }
struct Function : Primitive { Field return_parameter; Field parameters[]; }
struct Class : Type { Field fields[]; Function functions[]; }
struct Namespace : Primitive { Enum enums[]; Class classes[]; Function functions[]; }

The base type for any entry in the reflection database is a Primitive and will be used below to describe any such entry.

Serialisation

The cross-over between serialisation and reflection APIs is quite large and subtle. When you have game objects that you want to load and save from disk, a natural response is to develop a dedicated serialisation API that reads and writes data from within your game types. Reflection can be considered a generalisation of such a serialisation API by presenting a runtime description of all your types and their memory layout. This allows you to write serialisation code separate from your types that can be adapted to suit multiple file formats.

Let’s start with a very basic set of game types:

struct Vector
{
    float x, y, z;
};

struct PhysicsComponent
{
    Vector position;
    Vector velocity;
    Vector acceleration;
};

struct GameObject : public Object
{
    PhysicsComponent physics;
};

The reflection database can tell you:

  • Vector has 3 floating point data members at offsets 0, 4 and 8.
  • PhysicsComponent has 3 data members of type Vector at offsets 0, 12 and 24.
  • GameObject has one PhysicsComponent at offset 0.

Object is a type introduced by the reflection API that all objects must inherit from if they intend to be the root of any serialisation requests. In the code above, Vector and PhysicsComponent do not inherit from Object, representing any of your lightweight game types. This means that you can only serialise objects of type GameObject – however, as long as the reflection database contains a description of the Vector/PhysicsComponent types, they can be serialised as part of any objects that contain them. This should become apparent when we introduce what Object actually looks like:

struct Object
{
    Type* type;
};

So far that’s all we need. Object simply stores a pointer to the reflection database’s description of whatever type that object is. Some pseudo-code for a save function would be:

SaveObject(Object* object)
{
    // Types that inherit from Object already know their type so can call
    // the overloaded SaveObject directly
    SaveObject(object, object->type);
}

SaveObject(char* data, Type* type)
{
    // Using the description of the type, iterate over all fields in
    // the object
    for (Field* field in type->fields)
    {
        // Each field knows its offset so add that to the base address of the
        // object being saved to get at the individual field data
        char* field_data = data + field->offset;

        // If the field type is a known built-in type then we're at leaf nodes of
        // our object field hierarchies. These can be saved with explicit save
        // functions that know their type. If not, then we need to further
        // recurse until we reach a leaf node.
        Type* field_type = field->type;
        if (field_type is builtin)
            SaveBuiltin(field_data, field_type);
        else
            SaveObject(field_data, field_type);
    }
}

SaveBuiltin(char* data, Type* type)
{
    switch (type)
    {
        case (char): SaveChar(data);
        case (short): SaveShort(data);
        case (int): SaveInt(data);
        case (float): SaveFloat(data);
        // ... etc ...
    }
}

Whether your file format is text XML or binary, the algorithm is the same. The difference is in how you write your known built-in types and how you annotate your output data along the way (e.g. text tags for XML). A nice side-effect of writing your serialisation this way is that for a given file format, your serialisation code is written in one place and can handle any object that can be described by your reflection API – you just have different files for each format implementation.
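As an illustration, the text-XML flavour of one of those leaf functions might look like this in the same pseudo-code (the Write calls and the Field argument are my assumptions; the real signature depends on how much context you pass down):

SaveFloat_XML(char* data, Field* field)
{
    // Annotate the raw value with tags named after the field
    Write("<" + field->name.text + ">");
    Write(*(float*)data);
    Write("</" + field->name.text + ">");
}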

Containers

Game objects are more complicated than those specified above and will contain containers. This is an umbrella term for any of these:

  • C-style Arrays
  • Vectors
  • Linked Lists
  • Sets
  • Key/Value Maps and Hash Maps

For the moment if we assume that our primary goal is to make these serialisable, a simple means of doing so is to extend SaveObject:

SaveObject(char* data, Type* type)
{
    // ... start of function ...

    if (field_type is builtin)
        SaveBuiltin(field_data, field_type);
    else if (field_type is container)
        SaveContainer(field_data, field_type);
    else
        SaveObject(field_data, field_type);

    // ... rest of function ...
}

SaveContainer(char* data, Type* type)
{
    switch (type)
    {
        case (vector): SaveVector(data, type);
        case (list): SaveList(data, type);
        // ... etc ...
    }
}

SaveVector(char* data, Type* type)
{
    // Cast the data to your vector type
    vector& vec = data cast as vector;

    Type* stored_type = type->container_value_type;

    for (int i in vec.count)
    {
        char* value_data = data + i * stored_type->size;
        SaveObject(value_data, stored_type);
    }
}

The first problem we come up against is that, taking SaveVector as an example, the type of the vector changes based on the data it stores. So, std::vector<int> is a different type to std::vector<char> and can’t be cast at compile-time. There are two ways of dealing with this that will be covered in more detail later in the use-case studies. They are:

  • The reflection API is entirely runtime-based and when you register a field that is a container, code gets generated using templates that will be used to serialise when needed. This has the benefit that any container becomes easily serialisable without you having to know the memory layout of the container type itself. It has the drawback that it can generate quite a substantial amount of code that can have a negative impact on your memory budget.
  • If you can rely on knowing the memory layout of your container independent of its type, you can use that to iterate over all elements using the type information stored in the reflection database, as above. This has the benefit that there is only one section of your code that is used to serialise all containers of that type. It has the drawback that you may not want to rely on knowing the internal layout of your container because it’s not part of an API that you own/control, e.g. STL.

The second problem you encounter is that if you have N file formats and M types of container, you’re going to have to write M*N functions that handle all your serialisation possibilities. Later discussion covers how to use the reflection database for other purposes, such as walking a pointer graph, and in such cases you’d also have to write specific implementations for each container type.

Obviously that won’t do and you can add a layer of indirection to get around this. The way I deal with this is by introducing the container interface to report basic information about a container, such as its entry count, and read/write iterator interfaces for reading and modifying the containers:

interface IContainer
{
    Type* GetKeyType() const;
    Type* GetValueType() const;

    ReadIterator GetReadIterator();
    WriteIterator GetWriteIterator();
};

interface IReadIterator
{
    char* GetKey() const;
    char* GetValue() const;
    int GetCount() const;

    void MoveNext();
    bool IsValid();
};

interface IWriteIterator
{
    void SetKey(char* data);
    void SetValue(char* data);

    void MoveNext();
    bool IsValid();
};

If you want to skip ahead, Reflectabit contains a very good example of this.
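To give a flavour of what implementing these looks like, here is a sketch of a read iterator over a simple contiguous vector; the member layout is an assumption for illustration:

struct VectorReadIterator : IReadIterator
{
    char* data;        // start of the vector's contiguous storage
    int count;         // number of entries
    int value_size;    // size of the stored type
    int index;

    char* GetKey() const   { return 0; }    // vectors have no keys
    char* GetValue() const { return data + index * value_size; }
    int GetCount() const   { return count; }

    void MoveNext() { index++; }
    bool IsValid()  { return index < count; }
};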

All container types you support implement these interfaces. Notice that they account for both the key and value of an item in a container, which can be safely ignored for those containers that don’t conceptually have keys. Use is then a simple case of:

SaveContainer(char* data, Type* type)
{
    IContainer* container = type->GetContainerInterface(data);
    Type* key_type = container->GetKeyType();
    Type* value_type = container->GetValueType();

    ReadIterator* iterator = container->GetReadIterator();
    Serialise iterator->GetCount();

    while (iterator->IsValid())
    {
        if (key_type)
            SaveObject(iterator->GetKey(), key_type);

        SaveObject(iterator->GetValue(), value_type);

        iterator->MoveNext();
    }
}

Like the serialisation code that we started with, this algorithm is independent of file format and only differs in how the data is finally written. This also requires you to write only one container save per file format, cleanly solving the implementation explosion.

Pointers and the Object Database

Serialising pointers can be a tricky subject and any grizzled console programmer will tell you that a good way to handle the problem is to not serialise them at all! If you can get away with using indices and handles you may find them more comfortable than pointers. With a reflection API and object database, however, serialising pointers is remarkably easy. Not only that, it opens up a whole host of possibilities for future use.

To start you need some means of creating objects from a central source and assigning them a unique ID, so let’s redefine Object and introduce the object database:

struct Name
{
    u32 id;
    const char* text;
};

struct Object
{
    Name name;
    Type* type;
};

class ObjectDatabase
{
    Object* CreateObject(const char* type_name);
};

Here, the Name type represents the full name of your object, assigned offline by your tools/editor or generated at runtime. It contains a pointer to the text of the name that can be used for debugging and a unique ID that maps to that name – usually a hash of the name. The text can be removed in your release builds or, preferably, not stored at all: it’s pretty simple to create a Visual Studio debugger plugin that can map the ID to a locally stored text database, or to write network logging tools that only require the ID to print the name. The important point is that your means of generating the ID from the name must be consistent and there must be no collisions.
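Any stable hash with a good distribution will do; as an illustration, here is a minimal FNV-1a implementation (checking for collisions across the database is still your job):

typedef unsigned int u32;

u32 HashName(const char* text)
{
    // FNV-1a: consistent across runs and platforms, cheap to compute
    u32 hash = 2166136261u;
    while (*text)
    {
        hash ^= (u32)(unsigned char)*text++;
        hash *= 16777619u;
    }
    return hash;
}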

Given such properties, serialising pointers is a straight-forward case of serialising their ID in place of their pointer:

if (field_type is pointer)
{
    Object* object = *(Object**)field_data;
    Serialise object->name.id as u32
}
else if (field_type is builtin) ...

Generally you will need a top-level collection of all objects in a level, package or whatever abstraction you choose. The classic example of this is the Unreal Package. When loading these IDs, you generally won’t create them on-demand, but assume they exist and look them up/point to them. For this reason you need to be careful about loading order.

Several solutions I’ve used in the past are:

  • If the referenced object doesn’t exist, create an uninitialised proxy object for it.
  • Use scoped tree-referencing where pointers can only go in one direction.
  • The package being loaded contains a list of packages it depends upon that need to be loaded first.

Custom loading functions

Game objects can be even more complicated than this – sometimes you have fields which can’t directly be serialised to disk. A good example of this is a D3D vertex buffer, which is represented as a D3D resource interface pointer. Other times there are types which may not be reflection-registered due to their complexity that you still want to save – std::string is a nice example of this.

With each type or field you can associate a means of loading and saving data of that type via a function pointer. The serialisation code first checks to see if the field type has an associated set of load/save functions before trying to serialise another way. It can be a little more complicated than that if you’re worried about performance and support for multiple file formats; take the simplest example of this:

// Serialisation code for the XML file format
if (SaveFunc f = field_type->save_funcs.find(FORMAT_XML))
{
    f(field_data, field_type);
}

Before you get to checking what other properties the field may have, you’re doing some form of map lookup.

The simplest/fastest way of doing this I’ve found is by assigning your file format types indexed enums and having an array of function pointers inside your type/field:

enum Format
{
    FORMAT_BINARY,
    FORMAT_TEXT_XML,
    FORMAT_BINARY_XML,
    FORMAT_COUNT
};

struct Type
{
    SaveFunc save_funcs[FORMAT_COUNT];
    LoadFunc load_funcs[FORMAT_COUNT];
};

Serialisation becomes quick and simple but there is a loose-coupling of concepts between reflection API and serialisation code which you may not like. A happy medium of the two is storing a sorted, dynamic array in the type that can be binary searched – the general case would be an empty array that is quickly skipped.

The serialisation code with custom save array lookup now looks like this:

SaveObject(char* data, Type* type)
{
    // Using the description of the type, iterate over all fields in
    // the object
    for (Field* field in type->fields)
    {
        // Each field knows its offset so add that to the base address of the
        // object being saved to get at the individual field data
        char* field_data = data + field->offset;

        // Branch on field type
        Type* field_type = field->type;
        if (field_type is pointer)
            Serialise (*(Object**)field_data)->name.id as u32
        else if (SaveFunc f = field_type->save_funcs[FORMAT_XML])
            f(field_data, field_type);
        else if (field_type is builtin)
            SaveBuiltin(field_data, field_type);
        else if (field_type is container)
            SaveContainer(field_data, field_type);
        else
            SaveObject(field_data, field_type);
    }
}

It’s worth mentioning that another means of achieving this is to have a single Load/Save function per object that handles the serialisation of all fields that are too complicated to reflect in one place. Unreal Engine (UE) is a good example of this and it’s one of the main reasons I prefer the above solution. There are no marked boundaries between serialised fields so it’s very easy to damage an entire object by messing up one field – you can’t temporarily skip it and keep everybody working while you solve the problem at hand. It gets more unwieldy when you get into versioning, which is covered below.

Versioned file formats

So far we haven’t taken a look at any loading code. Some pseudo-code for loading anything saved with the features we’ve covered above could look like this:

void LoadObject(char* data, const Type* type)
{
    for (Field* field in type->fields)
    {
        char* field_data = data + field->offset;
        Type* field_type = field->type;

        if (field_type is pointer)
            // Load u32 hash, lookup in Object Database, point to it (or create, or proxy object, impl defined...)
        else if (LoadFunc f = field_type->load_funcs[FORMAT_XML])
            f(field_data, field_type);
        else if (field_type is container)
            LoadContainer(field_data, field_type);
        else if (field_type is builtin)
            LoadBuiltin(field_data, field_type);
        else
            LoadObject(field_data, field_type);
    }
}

This code expects the data to be saved in the order the fields are specified in the type. If you add or remove fields or change the implementation of your custom loading function then catastrophe awaits. A versionable file format is one which can adapt to these changes gracefully.

Versionable file formats can be an incredibly important tool for development files in game asset pipelines. A good example here would be a mesh file format, as loaded by your game:

  • Edit the mesh in your DCC.
  • Export the mesh to an intermediate file format – this is custom or 3rd party (e.g. COLLADA, FBX or XSI).
  • A custom tool “compiles” the mesh to its game-loadable file (per platform).
  • Editor loads the output to use as level edit placement.
  • Game loads the output.

Discussion of the merits of different build and development strategies goes far beyond the scope of this post, considering the variety of approaches developers take. However, if you’re using a build system that caches the compiled mesh contents so that other developers don’t have to build them locally to run the game, you’ll need to have a system in place to handle changes to the formats of those cached files.

A common approach taken in many studios, including some I’ve worked at is to store a version number at the start of each mesh file and refuse to load the file (or assert) if there’s a version mismatch. When a programmer wants to change the file format, they do the following:

  • Make the change locally on their machine and iterate on a small subset of the assets.
  • Kick off a process that recompiles every mesh in the game. This can be overnight on your machine or offloaded to a worker machine and distributed in some way.
  • Submit new compilation tools, game and compiled assets.
  • Content creators sync to new tools and new assets – potentially gigabytes of data.

On one project it was not unknown for a complete rebuild of all textures to take up to a week. This put the programmer offline for a considerable amount of time, requiring multiple client-specs to maintain productivity. It completely killed any enthusiasm to change the file formats. Your mileage may vary but I’ve found that the ease at which I can optimise a game is greatly influenced by the ease at which I can modify the format of the files it loads.

If your file format is amenable to change, you can do the following:

  • Make the change locally and iterate on a small subset of the assets.
  • You can integrate these assets into larger levels with older assets during testing.
  • Submit new compilation tools and game.
  • Content creators get latest and can still play/edit the game.
  • Any assets created or modified use the latest file format.
  • Programmer schedules an offline build process to gradually go through all cached meshes and convert them to the new format.
  • Content creators slowly sync over time to the updated assets.

This forms the backbone of UE-based development and scales gracefully to 150-200 man teams with outsourced developers added on top. It’s also how we built the Splinter Cell: Conviction engine, allowing us to rewrite the renderer on the main branch while around 50-80 content creators continued to work with daily tool/game updates.

I’m straying a little too far from the point of this post but this is a worthy discussion to have. The reality is, each developer views the issue differently and it’s possible to take any of the above solutions and create an environment in which it works wonderfully well or is a constant production risk.

So, back to the point! If your output format is XML, you can simply change your loading code to:

void LoadObject(char* data, const Type* type)
{
    for (string tag in xml_nodes)
    {
        // Skip any fields that have been removed
        Field* field = type->find_field(tag);
        if (field == 0)
            continue;

        // Normal loading
        char* field_data = data + field->offset;
        Type* field_type = field->type;
        if (field_type is pointer)
            // Load u32 hash, lookup in Object Database, point to/create it
        else if (LoadFunc f = field_type->load_funcs[FORMAT_XML])
            f(field_data, field_type);
        else if (field_type is builtin)
            LoadBuiltin(field_data, field_type);
        else if (field_type is container)
            LoadContainer(field_data, field_type);
        else
            LoadObject(field_data, field_type);
    }

    // Any added fields in the type won't be present in the data so are
    // naturally handled if you provide them with a default value
}

If your output format is binary you can use a solution similar to IFF files: each field is prefixed with a chunk descriptor that specifies a tag ID and chunk size. In our case, the tag ID can be the hash of the field name:

void LoadObject(char* data, const Type* type)
{
    int nb_fields = read from file;

    for (i in nb_fields)
    {
        // Read the chunk header
        u32 field_hash = read from file;
        u32 data_size = read from file;

        // Skip any fields that have been removed
        Field* field = type->find_field(field_hash);
        if (field == 0)
        {
            // seek from current position over the data_size
            continue;
        }

        // Normal loading
        char* field_data = data + field->offset;
        Type* field_type = field->type;
        // ...

        // You can insert an extra check here to verify that the loading code has
        // consumed the number of bytes equal to data_size. Very useful for tracking
        // errors in custom loading functions.
    }
}

This has two issues you need to solve:

  • If somebody loads a new file version with an older version of your editor, it will discard data when resaving. One way of solving this is to store the data of any skipped fields in a dictionary assigned to that object that gets saved later. A simpler way is to force everybody to update to any new tools versions!
  • If data for a newly added field is not present in the file, a nice default value needs to come from somewhere. An easy solution is to initialise your default value in the object constructor. The downside to this is that the default value is not visible to external tools and you need to recompile source each time you change a default value. Another approach would be to specify the default value as some reflection attribute that gets saved offline. You would need to change the code above to then manually assign these defaults to any missing fields.

Custom load/save functions need special attention with respect to versioning. As mentioned above, UE has a custom Serialize function per object, within which multiple version checks are made to see what needs to be serialised. This can get very complicated to manage and is easy to break.

When you have the ability to associate custom load/save functions per type or per field, this becomes easier to manage. If you serialise version numbers with each function then the job of deciding what’s valid and skipping invalid chunks is handled automatically for you, leading to more maintainable and fault tolerant code.
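As a sketch of what that can look like, in the same pseudo-code (the vertex buffer fields are illustrative):

LoadVertexBuffer(char* data, Type* type)
{
    // The custom loader owns its own version number
    u32 version = read from file;
    u32 nb_vertices = read from file;
    // ... read the fields common to all versions ...

    if (version >= 2)
        u32 stride = read from file;    // field added in version 2
    else
        // older data: fall back to a default stride
}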

Both methods can suffer from lack of old version pruning. When we started Splinter Cell: Conviction, there was still loading code for the mesh format in the original Splinter Cell. Nobody knew whether this worked as it hadn’t been tested in years. Updating the code was fraught with problems and a reboot was required.

Enumerations

Enumerations can quite easily be serialised as integers but this is quite brittle. If a programmer changes the order of enumerations, changes their value or adds/removes any, all existing data that uses that enum type will likely be invalidated. I have worked on projects that would require rebuilding the entire asset database if you were ever bold enough to try such a move!

A very simple way to avoid this problem is to serialise enumerations as the hash of their name:

void SaveEnum(char* data, Type* type)
{
    // Cast the type to an enum and retrieve the value
    Enum* enum_type = type->AsEnum();
    int enum_value = *(int*)data;

    // Lookup the constant and save its hash
    EnumConstant* constant = enum_type->find_constant(enum_value);
    SaveU32(constant->name.hash);
}

void LoadEnum(char* data, Type* type)
{
    // Cast the type to an enum and read the constant hash
    Enum* enum_type = type->AsEnum();
    u32 hash = ReadU32();

    // Lookup the constant; if the hash refers to a removed/renamed constant,
    // leave the destination untouched at its default value
    EnumConstant* constant = enum_type->find_constant(hash);
    if (constant)
        *(int*)data = constant->value;
}

You will also have to account for data that stores old enum values, typically handled by leaving the destination untouched at its default value (as the guard above does).

Performance: Baked serialisation functions & PODs

This may all seem a little slow but the reality is that you are likely to be I/O bound; even on hard drives, none of this code has a measurable effect on load performance.

However, it can be given a little speed boost with a technique that you may find easier to read/maintain: give each field a custom serialisation function. Instead of branching on the field type in SaveObject’s inner loop, you can figure out ahead of time which function each field needs and record it:

BakeSerialisationFunctions(Field* field, Format format)
{
    // Don't bake anything for transient fields
    if ("transient" in field->attributes)
        return;

    // Bake the save function based on the type
    if (field_type is pointer)
        field->save_funcs[FORMAT_XML] = SavePointer;
    else if (field_type->save_funcs[FORMAT_XML])
        field->save_funcs[FORMAT_XML] = field_type->save_funcs[FORMAT_XML];
    else if (field_type is builtin)
        field->save_funcs[FORMAT_XML] = SaveBuiltin;
    else if (field_type is enum)
        field->save_funcs[FORMAT_XML] = SaveEnum;
    else if (field_type is container)
        field->save_funcs[FORMAT_XML] = SaveContainer;
    else
        field->save_funcs[FORMAT_XML] = SaveObject;
}

SaveObject(char* data, Type* type)
{
    // Call the baked save function for each field
    for (Field* field in type->fields)
    {
        char* field_data = data + field->offset;
        Type* field_type = field->type;
        field->save_funcs[FORMAT_XML](field_data, field_type);
    }
}

Furthermore, if your reflection API deems that an object is of a POD type, you don’t have to recurse into the children and can instead write a binary blob for the entire object (un-versioned, binary only).
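A sketch of that fast path, in the same pseudocode style (the is_pod flag and WriteBlob are assumed names):

SaveObject(char* data, Type* type)
{
    // PODs contain no pointers or custom-serialised fields, so the whole
    // object can be written with one un-versioned binary write
    if (type->is_pod)
    {
        WriteBlob(data, type->size);
        return;
    }

    // ... otherwise fall through to the per-field loop ...
}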

Field offsets and inheritance

The basic implementation of a Class type will store only the fields that were declared within the class. Field layout is ABI-specific and you will need a database per compiler/platform when using field offsets. Access to fields of its base class requires following the base class pointer in Class:

SaveObject(char* data, Type* type)
{
    // ... end of the function ...

    // Recurse into base types
    if (type is class && type->base_class)
        SaveObject(data, type->base_class);
}

This has some subtle side-effects. If your class contains virtual methods then it’s up to the compiler where it stores the virtual function table pointer. Typically this has no effect on the validity of recording field offsets that are used at runtime but there are some simple cases where it breaks. Take this piece of code:

struct PodBase { int x; };
struct NonPodDerived : public PodBase { virtual void f(); };

NonPodDerived obj;
NonPodDerived* a = &obj;
PodBase* b = a;

The addresses of a and b will be different because NonPodDerived needs to store an extra virtual function table pointer. This means that the address of PodBase::x will be different to NonPodDerived::x!

If you want to use multiple inheritance, things get a little trickier:

struct B0 { int x; };
struct B1 { int y; };
struct C : public B0, public B1 { int z; };

The field offsets for both x & y in their class descriptions will be 0. When serialising B0 and B1 on their own this will be fine, but when serialising C, both x & y can’t live at offset 0! The compiler may layout C like this:

 
    x: 0
    y: 4
    z: 8
 
  

Serialising this kind of object using the class descriptions of C, B0 and B1 will not work. This simple case can be solved by calculating the offsets of x & y when contained in C and storing them in the class description of C itself. No longer will your serialisation code walk up the inheritance hierarchy finding members, and given that reflection databases are usually quite small, you may actually find this kind of setup preferable. It will also fix the first issue.

But what if B0 and B1 themselves inherit from the same base class? This is the dastardly diamond inheritance issue:

struct A { int w; };
struct B0 : public A { int x; };
struct B1 : public A { int y; };
struct C : public B0, public B1 { int z; };

In this case C will contain two copies of A, each with different offsets for their own w. There’s really no clean solution to this in a reflection API unless you add more complexity. One way to force the compiler to only embed one copy of A in C is to use virtual inheritance:

struct A { int w; };
struct B0 : virtual public A { int x; };
struct B1 : virtual public A { int y; };
struct C : public B0, public B1 { int z; };

We’re into highly implementation specific territory here but the compiler might offset like this:

 
    vptr B0: 0
    x: 4
    vptr B1: 8
    y: 12
    z: 16
    w: 20
 
  

Notice that w is right at the end and there’s only one copy. There’s also a couple of virtual table pointers in C that help the compiler cast between the various classes at runtime. At first sight, it appears that using the initial multiple inheritance solution might work here, however the representation of a member offset for non-POD types is implementation defined and not guaranteed to work on any compiler. Indeed, the following code crashes at runtime in MSVC2005:

struct VirtualBase { };
struct Derived : virtual public VirtualBase { int x; };

int offset = offsetof(Derived, x);

You will hit this issue if you decide to allow multiple inheritance of root serialisation types as they all need to inherit from Object.

There are a few ways of recording the offset of a field, including:

  • With runtime, templated registration, you can create “visitor functions” that wrap access to the field (see the sketch after this list). This is by far the most portable/standards-compliant way of doing this but you’re adding complexity to your API & runtime, increasing compile times and generated code size.
  • At runtime you can use the C++ offsetof macro. This is standards-compliant for POD types. It’s practically compliant for a variety of non-POD configurations for the platforms game developers use but will break down with pure virtual inheritance.
  • Offline, you can use a layout generator that knows the target ABI and can calculate the field offsets for you. A good example of this is the one that ships as part of clang: RecordLayoutBuilder.cpp.
  • Get your compiler to report field offsets after a compile step.
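As an illustration of the first option, here is a small standards-compliant C++ sketch; the registration line at the end is hypothetical:

typedef char* (*GetFieldAddress)(void* object);

template <typename Class, typename FieldType, FieldType Class::*Member>
char* GetField(void* object)
{
    // The member pointer is baked into this template instantiation, so no
    // byte offset is ever stored - this survives vtables and virtual bases
    return reinterpret_cast<char*>(&(static_cast<Class*>(object)->*Member));
}

// Registration might then record, for example:
//   field->get_address = &GetField<MyType, int, &MyType::health>;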

In later posts I’ll explain how offsetof works and how you can work around its limitations with pure virtual inheritance. However, it’s hairy territory and I’d advise avoiding the problem altogether – personal experience has shown that the added complexity required to deal with such cases does not justify the limited use it sees.

It’s worth reading Memory Layout for Multiple and Virtual Inheritance to get more background information on this problem.

Attributes

Attributes are a means of annotating your primitives, adding extra data that can be used to control how your program performs at runtime. C# attributes are the closest existing analogue, but they are a little too powerful/complicated for this purpose.

A simpler attribute system would allow:

  • Flags: Named flags that represent a boolean state. A classic example is “transient”, which allows you to mark fields which you don’t want to serialise.
  • Values: These are name/value pairs, such as “min_value”, “max_value”, “default_value”, that can be used to drive user interface widgets. Integer or floating point value types can be used.
  • Strings: These are name/value pairs, such as “description” and “group”, that allow you to attach descriptive/grouping data to a primitive for user interfaces.
  • Functions: Name/function name pairs that allow you to more conveniently specify custom load/save functions (e.g. load=FunctionName).

Later posts will describe ways in which you can annotate primitives and retrieve them at runtime. There are, as you would guess, many tradeoffs with each approach.

Network Serialisation and Visualisation of Game State

It’s probably obvious by now how this can be achieved: use serialisation to a byte buffer that is optionally compressed and send that to your endpoint – most likely binary and versionable. With the addition of an attribute that describes network transient fields, this allows you to make some very powerful editing tools.

The main class of tool is an editor that connects to a live game, edits an intermediate data representation and broadcasts changes to a live game. There are many benefits to this approach:

  • You get live updates of any changes on PC or console.
  • The design is inherently multi-threaded due to the network communication, making for more graceful UI tools.
  • You can iterate on your tool code without bringing the live game down. If the tool crashes, it doesn’t bring down the game.
  • If your game gets into a state that is deemed incorrect, you can connect and visually see the state of all your objects.

If you want to write your tool code in C# or embrace the era of the Internets and write in a combination of Javascript, HTML, CSS, etc. you only need to write the equivalent of the above serialisation code in that language to allow editing and communication. Each C++ container type you support will need to map to an equivalent in the tool language.

This is how a stand-alone material editor, realtime PIX debugging tool and post-process editor were developed for Splinter Cell: Conviction, to be discussed in a later post.

This is more than likely not good enough for communicating real-time network updates for game code as you’ll want to do things like:

  • Use context-specific knowledge to compress data (e.g. movement updates).
  • Use smaller situation-specific packet structures to communicate small changes to objects.
  • Compress the ID representation of any objects that are referenced by packets.

You can use a reflection API to describe your packet structures and binary serialise them or generate C++ code from the offline representation. If your packet structures end up being PODs then you can write code which performs a memcpy, exchanging any pointers for the unique hash of the object pointed to. However you choose to do it, the basic description of types and their layout that a reflection API can provide you with can give you a good head start.
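As a hedged sketch of that memcpy-and-exchange idea, reusing the pseudocode conventions above (pointer_offsets and unique_hash are assumed names):

// Copy the POD packet wholesale, then rewrite each pointer field as the
// unique hash of the object it points to
void SerialisePodPacket(char* dest, char* src, Type* type)
{
    memcpy(dest, src, type->size);
    for (offset in type->pointer_offsets)
    {
        Object* object = *(Object**)(dest + offset);
        *(u32*)(dest + offset) = object ? object->unique_hash : 0;
    }
}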

Walking the Object Graph and the UI

With the tools developed above you can visit all data members within an object and perform arbitrary operations, such as printing their value to a console, displaying them in a widget in-game or recording them using some form of programmable logging system.

If you’re generating a UI for your tools, you might be best off using an offline description of your types stored in some easily loadable format (e.g. json or xml). If none exists then you need to somehow send the runtime database to your tool (something I’ve achieved in the past by sending the entire database over the network on tool connect). With this you can:

  • Use attributes to specify default values, descriptions, field grouping and ranges.
  • Use the type of a field to determine what kind of widget to use, inspecting optional attributes to refine the choice.
  • Restrict the assignment of object references based on type.
  • Automatically populate enumeration list boxes.

When you can walk the object graph you can also record any pointers an object contains: what other objects does it reference? We used this technique in Splinter Cell: Conviction to accelerate deletes in UE. UE used to serialise all objects in a level, discarding everything but pointers when it needed to check for dependencies. This was incredibly slow in levels which contained 10s of thousands of actors – I believe we got delete operations from minutes down to a couple of seconds. More recent versions of UE have made significant performance improvements in this area, however.

On systems where the GPU can only use physical addressing to reference data, runtime defragmentation of specific memory heaps for vertices and textures becomes a very useful technique. Of course, you need a system that informs any referencing assets where the memory has moved to. The ability to inspect pointers and relocate them makes this quite trivial. You can also solve this issue with handles or an extra level of indirection – you may or may not be willing to accept the runtime performance this costs you based on your overall engine design.
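A sketch of the relocation step under those assumptions (the pointer_offsets list is an assumed part of the type description, as above):

// After a heap block moves, visit each object that references it and patch
// any pointers that fall inside the old range
void RelocatePointers(char* data, Type* type, char* old_base, char* new_base, u32 size)
{
    for (offset in type->pointer_offsets)
    {
        char*& ptr = *(char**)(data + offset);
        if (ptr >= old_base && ptr < old_base + size)
            ptr = new_base + (ptr - old_base);
    }
}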

Generalising the issue, you can also achieve the same goal by serialising all objects and checking for references in a mark and sweep operation.

Calling Functions, Script Binding and RPC

You can build a reflection API for your game without needing to worry about adding function call support. By this I mean the ability to do something similar to the following:

// Retrieve the function by name
Function* function = db.GetFunction("FunctionName");

// Build a set of parameters to pass to the function
ParameterStack params;
params.Add(1);
params.Add("string");

// Call the function and inspect any return value
function->Call(params);
int ret = params.GetReturnValue();

This is an incredibly useful tool to have at your disposal for binding to scripting languages. A very simple way to bind to a scripting language is to use its API directly and manually register each function you want to expose:

void NativeFunctionExample(BindLanguageContext* ctx)
{
    // Pop some parameters off the script language stack
    int param0 = ctx->PopInt();
    string param1 = ctx->PopString();

    // ...do some work with the parameters...

    // Push a return value as the result of the work done
    ctx->PushInt(1);
}

void RegisterFunctions(BindLanguageContext* ctx)
{
    ctx->RegisterFunction("NativeFunctionExample", NativeFunctionExample);
}

This of course means your function can only be called from script. If you want to call it from C++ code as well the classic solution is to instead create wrapper functions and register them:

int NativeFunctionExample(int param0, string param1)
{
    // ...do some work with the parameters...
}

void NativeFunctionExample_Wrapper(BindLanguageContext* ctx)
{
    int param0 = ctx->PopInt();
    string param1 = ctx->PopString();
    int ret = NativeFunctionExample(param0, param1);
    ctx->PushInt(ret);
}

Now you can call NativeFunctionExample from C++ and register NativeFunctionExample_Wrapper with the script environment. Of course this is highly error-prone and downright tedious. It also gets worse when you try to bind to multiple languages, which is why many solutions have been developed to address these shortcomings.

Examples of automated binding approaches include:

  • SWIG: This scans your C/C++ header files and automatically generates wrapper code for anything you want to bind to other languages.
  • Boost.Python: Uses template meta-programming to generate the required wrappers at compile-time.
  • LuaBind: Uses template meta-programming for binding C++ to Lua.
  • Gem (FuBi): Uses knowledge of the platform ABI and a description of parameters to populate the native stack.

Template meta-programming approaches suffer from increased compile-times and along with code generation, result in larger than necessary executables. However, the approaches are cross-platform. On the other hand, if you have knowledge of the platform ABI you can write one function that takes a function signature and places parameters on the native stack before calling it. This requires highly platform-specific code but is remarkably concise and has a tiny footprint.

In each of these binding libraries, however, you’ll find very similar function registration, parameter description and code generation techniques. This typically takes up a large majority of the implementation and can be quite complicated – the amount of code that deals with the specifics of the script language is not that great. If instead you took one of the above techniques and used it to populate an intermediate stack representation, you can write very simple code for each language variant you need to use:

void MakeParameterStack(ParameterStack& params, BindLanguageContext* ctx)
{
    // Iterate over every value pushed onto the script stack
    for (int i = 0; i < ctx->ParametersOnStack(); i++)
    {
        BindLanguageVal* val = ctx->GetStackRef(i);

        switch (val->type)
        {
            // If required, convert the script value to a native equivalent
            // In the case of ints, floats, etc, nothing may need to be done
            // In the case of object references, you need a means of preserving
            // the reference in the target language
        }

        // Add to the intermediate stack
        params.Add(val);
    }
}

void CallFunctionFromScript(BindLanguageContext* ctx, string function_name)
{
    // Retrieve the function from the reflection database
    Function* function = db.GetFunction(function_name);

    // Build the parameter stack
    ParameterStack params;
    MakeParameterStack(params, ctx);

    // Call the native function
    function->Call(params);
}

The complicated part of the problem is now holed up in the Call function and whatever techniques you use to generate the reflection description of your functions. It’s comparatively easy to add new languages with this.

If you’re uncomfortable with the overhead this introduces then an offline reflection database can be used to generate C++ wrappers for each function you want to call from script. Naturally, this increases the size of your executable but that may be a trade-off you can handle.

I’ll be discussing techniques I’ve used to bind to Lua, C#, Python and my own custom game language in future posts. This will also include coverage of how container binding was handled.

The final piece of the puzzle, RPC, should now be evident. All you need to do is serialise the parameter stack to a byte buffer and send that over the network. Any return values are serialised and returned. The details of how you wait/poll/interrupt on results are all you need to worry about.
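A minimal sketch of that path; ByteBuffer, SerialiseToBuffer and Network::Send are assumed names, not an actual API:

void CallRemote(string function_name, ParameterStack& params)
{
    ByteBuffer buffer;
    buffer.WriteU32(HashName(function_name));

    // Binary-serialise each parameter into the buffer
    SerialiseToBuffer(buffer, params);

    // Any return values come back the same way
    Network::Send(endpoint, buffer);
}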

Live C++ code editing

Edit-and-continue is OK when it works, but what if you had the ability to edit large sections of your C++ code without having to shut down the game, reload the compiled executable, load your levels, navigate to your testing location and resume what you were doing? What if you could come to work, load up the game and sit there all day *in the game*, coding away until the day ended? Iteration is essential for creating great games and most studios will already have dynamic reloading of in-game assets such as scripts, textures, meshes, sounds or even entire level layouts. Some may even have a live connection between their editor and the game running on the console. C++ programmers are missing the boat!

Before I cover this, there are a number of ways you can minimise the impact of this problem:

  • If you find a scripting language which doesn’t sacrifice expressiveness, safety and stability, you can swap it in for coding your game in C++. Correct use of scripting languages can allow you to build the majority of your game logic without also sacrificing performance.
  • Compile shaders directly to object files that can be dynamically loaded and reloaded by your engine. Irrespective of how you represent shaders (HLSL files, shader graphs, DCC plugins, etc.) this is easy enough to achieve and should be first on your list.
  • Try to make your rendering in some way scriptable without affecting performance – Direct3D effect files are a good example of this.
  • Have a level editor that allows easy construction of test levels by programmers so that they can create libraries of levels that test specific features in the game for iterating on them. It should be no substitute for testing said features in final game levels before check-in, however.
  • Have save games that can be triggered from any point that can save as much of the game state as possible.
  • Work on your loading times before it’s too late. Long load times will reduce the amount of time you can spend iterating on your work and make the game harder to test within whatever time frame you have allocated.
  • Work on the boot times and stability of your game and editors.
  • Work on your compile times.

Done all that? Great! It’s not good enough, is it? 🙂 As an engine programmer, the biggest problem I’ve always had with just having reloadable shaders is that at some point you have to edit your render code. When you start adding scriptability to your rendering pipeline you increase its complexity, make the performance more opaque, and never quite reach the flexibility you need, requiring endless hours moving back and forth or adapting the system to your requirements; time that could be spent writing your engine!

During early development of the Splinter Cell: Conviction engine we had such a system. It was a custom rendering engine hosted within the UE framework that allowed the engine programmers to iterate on the engine C++ code while UnrealEd was running. Most times code changes would take a couple of seconds to build and reload within the editor and at times we could code for a few hours without bringing the editor down. It was a little brittle and would break now and again because somebody would check in engine changes that were incompatible with the object model – whether the object model had flaws or it required too much contextual knowledge to keep working, I unfortunately never got the chance to find out.

However, you can achieve a similar system with a reflection API that has the following requirements:

  • Embed your code in reasonably partitioned DLLs.
  • Cross-DLL communication occurs with interfaces (abstract base classes in C++, structures of function pointers in C).
  • All dynamically created objects in your game are created from the same source.
  • All objects have a unique ID through which they can be serialised – usually a 32-bit CRC of the object name.

If you write a file system watcher that continuously waits for changes to your DLL (or polls for it) then you can react to any change as follows:

  • Identify all objects of any types in the changed DLL and store them.
  • Iterate over the properties of objects that point to the collected objects. Replace the pointers with the unique ID of those objects.
  • Serialise the collected objects to memory.
  • Release the collected objects.
  • Reload the DLL.
  • Deserialise the collected objects from memory.
  • Iterate over the properties of objects that point to the collected objects. Replace the unique ID with the newly allocated pointers to the objects.

The key to this is that you’re serialising everything that changes and you have the ability to walk the pointer graph and patch up the objects that are kept alive with the location of the newly created objects. The serialisation can be done to RAM on PC, letting the OS take care of the paging. On consoles you can’t really do that (with the exception of some debug kits) so you may have to implement a slower path that uses your network connection. Debugging is achieved simply by attaching/detaching to your process whenever necessary.
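Compressed into pseudocode, the reload sequence above might look like this (every name here is a placeholder for whatever your engine provides):

void OnDllChanged(const char* dll_path)
{
    // 1-2: collect affected objects and swap pointers to them for unique IDs
    ObjectList affected = object_db.CollectObjectsFromModule(dll_path);
    PointersToIds(affected);

    // 3-5: save the objects, destroy them and swap in the new code
    MemoryBuffer saved = SerialiseObjects(affected);
    DestroyObjects(affected);
    ReloadModule(dll_path);

    // 6-7: recreate the objects and patch referencing pointers back up
    affected = DeserialiseObjects(saved);
    IdsToPointers(affected);
}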

I can’t stress enough how good this setup felt and how strongly I feel that every game should have it. This setup was literally serialising everything in a level within a few seconds (models, textures, data structures, etc.) without breaking a sweat.

Memory Mapping

Memory mapping is an old technique that in its most basic form involves loading a file from whatever storage medium you are using with one single read, not touching the result: it’s usable as it is. It naturally evolved from the ROM programming model where both your code and data are defined in your code files, making limited use of slow-access RAM. There are notable cases of old Playstation games, for example, serialising the entire contents of RAM to a memory card for save games!

Of course, in these days of heavy dynamic memory allocation and far more complicated data structures, this practice is rarely used. Especially when you consider that your load times are likely dominated by seek latency, volume of data and the complexity of any compression/recompression steps you perform.

In cases where you really do need to-the-metal memory-mapped loading of specific data types you can do this:

  • Walk over every pointer in the object and replace the object pointer with the hash of the object pointed to.
  • Save with a single write to disk.

The loading code then becomes:

// Read the memory map header for this object, which determines its type
u32 type_hash = ReadU32();
Type* type = db.GetType(type_hash);

// Read the entire object into memory
void* object = AllocateAndConstruct();
ReadBlob(object, type->size);

// Lookup each pointer hash
for (ptr in type->pointers)
{
    void* ref = object_db.GetObject((u32)ptr);
    ptr = ref;
}

Of course, you’re now bound by the object lookups being performed per pointer. This can be accelerated by using bank/package files that store collections of objects with link tables. These link tables specify what objects are exported and imported, while object references now store indices into the link table. When a bank/package file is initially loaded, its import link table is updated by searching other bank/package files around it. The resolving of pointers then becomes:

for (ptr in type->pointers)
{
    // The ptr now represents an index into the link table and a single bit
    // representing which link table to use
    u32 ptr_id = (u32)ptr;
    bool is_import = ptr_id & 0x80000000;

    // Pointer lookup is a simple array index
    if (is_import)
        ptr = import_table[ptr_id & 0x7FFFFFFF];
    else
        ptr = export_table[ptr_id & 0x7FFFFFFF];
}

Conclusion

I hope nothing in the post above implies that specific techniques are the only options available to you. I’ve spent many years researching and implementing alternatives to the above and these are a sampling of the techniques I feel most comfortable with.

To give you a sense of how diverse the implementations can be, I’ve collected some articles/implementations you may find interesting.

Implementations:

  • Nocturnal Initiative – An intrusive, registration-based reflection API.
  • CINT – Can parse header files and auto-generate reflection dictionaries.
  • Mirror – Template-based, built using Boost and intended for submission to the Boost project. Offers both compile-time and runtime databases. Contains a separate tool to automatically generate C++ registration code.
  • Qt Meta-Object System – The ubiquitous Qt has its own reflection system that uses a “Meta-Object Compiler” to scan C++ files for custom-marked properties that need reflecting.
  • Xrtti – Uses GCCXML to generate C++ files that register reflection information for you.
  • Galaxy 3 Reflection – Intrusive, manual registration with some added template help.
  • Galaxy 4 Auto-Reflection – Uses a custom pre-processor to scan C++ files for markup, auto-generating the required C++ registration code.
  • CRD – Template-based with heavy STL influences. Implementation mainly in one header file.
  • cppreflect – Intrusive, macro-based manual registration.
  • CAMP – Template-based, manual registration with a minimal DSL-in-C++ approach.


This is a bit of a big post and I’d like to thank everybody who’s reviewed it for me. Special mentions go to Stephen Hill who made several tireless passes and Patrick Duquette, who cleared the road at Ubisoft for me to talk about the SC5 stuff.


How I discovered my favourite templated list class

Original Author: Alex Darby

Templated list class?!?

Really? This isn’t 1996 dude. We have the STL and the fancy new ISO approved C++11 standard. We don’t need your stupid list class.

That’s fine. Sorry. How stupid of me – please feel free to ignore the rest of this blog.

I can only apologise for having wasted your time.

My favourite templated list class

If you’re still reading, well done – you have passed the test.

You receive the +1 Spectacles of Second-Hand Perspective, and the +2 Underpants of Questionable Enlightenment.

Please continue reading.

Prelude

As other ADBAD posts about data structures have already said, there’s no such thing as a one-size-fits-all data structure.

Obviously you should avoid spending time writing new data structures unless you genuinely need to, and you shouldn’t optimise until you find a bottleneck; but sometimes there are cases where you don’t strictly need to – and yet, if you don’t, you will maybe sleep a little less soundly at night wondering what that generic class you used is up to when you’re not looking at it.

The templated list class I’m eventually going to give you the code for definitely doesn’t fit all, and I guess the absolute need for it is questionable, but it’s still my favourite templated list class.

Why, dear reader?  That’s what I’m about to tell you.

The Backstory

It was the last console hardware transition. Our middleware provider had been bought by a large publisher and had essentially gone out of business. New consoles were appearing daily. The company decided to do a couple of work-for-hire type projects whilst we planned how to hit the next-gen.

We looked at the other middleware available at the time. We decided that we didn’t want to use Unreal and that most other extant providers were no more likely to be around forever than our previous one. After much deliberation it was decided that a small group of us should roll our own in-house engine.

What we were doing

Going from mature, bleeding-edge middleware to a self-built engine is a big shock to the system.

Our initial platforms were PS3 / X360, but we made a conscious choice to leave the door open for Wii because even the early indications were that it might end up having the biggest install base. This meant that the multi-platform architecture had to be very flexible and modular to leave room for the various platforms to be different but for our libs to work the same.

Several of the guys involved were old hands at engine code and we knew what we were doing, but even working as fast as we could, we knew that the most we’d be able to get done in any sensible timescale was the basics; and that any features we added above that would need to be directly applicable to the target game in order to make it worth our while.

What was on our mind

A lot of our core concerns in writing this engine were to mitigate against problems we’d experienced on our previous projects.

There was big picture stuff – like making sure the tool chain enabled assets to be added without needing code to be written so that the art and design teams could work effectively and iterate content without code issues slowing them up.

There was also nitty gritty stuff – for example, our previous game had had no end of problems with OOM caused by heap fragmentation. It was no-one’s fault directly – more a symptom of the fact that team sizes were growing, games were getting more complex, and tools hadn’t quite caught up. Everyone was doing their own thing to get the game done, and because the game was fine until it had been running for some time, none of the programmers had noticed it.

Must… Allocate… Memory… Fragmenting.

At the time we were relying on a slightly tweaked version of the basic memory manager that our middleware provider had given us. It didn’t have the ability to do any sort of checks for fragmentation.

Heap fragmentation sucks, especially if you don’t know it’s happening. We only really found out that it was happening between alpha and beta – when QA started hammering it.

A couple of us spent a very long time tidying it all up. It was a mess. So much code existed that was blithely doing temporary allocations in amongst long term ones. Some stuff was just ill considered decisions made at 2am when under time pressure, but it turned out that a lot of it was to do with either:

  • Unexpected interactions between the asset management and graphics subsystems of the middleware when used in the way that our game was using them.
  • Insanely naive / inappropriate use of STL and other library code in the front-end code that wrote the persistent data that the gameplay relied on for settings etc. This was the killer.

It was a fairly big job to sort out, and involved several large architectural changes to the loading system of the game, plus tearing out all the STL use in our game side code.

Just to be clear, the problem wasn’t directly caused by STL but the way in which it was being used; it was quicker to tear it out than fix it – imagine a situation where someone had used STL whenever possible as opposed to where it made sense to, like having a std::vector< std::tuple< int, float > > instead of an array of a struct with an int and a float in it. And not pre-allocating the size of the vector even when it was fixed.

I’m honestly not trying to start an STL fight. This code was so insane that it managed to read the data out of a 15kb XML file and take up 750kb in memory. That was nothing to do with STL itself, but the way it had been (ab)used really didn’t help.

The upshot of all this? Our Technical Director ended up as an anti-STL extremist, and he and several others (myself included) ended up borderline paranoid about memory fragmentation.

What we did about it

When we started writing our new in-house engine, the first thing that was done to address these issues was that the technical director banned STL and we wrote our own templated container classes. Sure, there were other things that could have been done, but that’s what was done.

Another thing was the creation of a memory management subsystem that had all sorts of nice features including fragmentation detection, multiple heaps for different sizes and types of allocs etc. etc.

We also set one of our prime directives: do whatever we could to prevent fragmentation at the architectural level, to make it harder for people to accidentally cause fragmentation.

So, about this template list class you mentioned…

I’ve not forgotten. Honestly.

The asset management system was on my schedule. Sure it’s pretty dry stuff, not everyone’s cup of tea, and definitely not a glory area like implementing a deferred bipolar thrumble edging render pass or whatever; but someone had to do it.

I’ll be honest, I like this sort of stuff. You get to stroke your chin and consider the “best way to do it ™”, and also have a good chance of getting the rare treat of writing code once and never having to change it again – other than the odd bug fix for a real life edge case you missed in your test runs.

With a prime directive of “prevent memory fragmentation”, I found it became very interesting. I stroked my chin and contemplatively nommed the arm of my glasses many times, discussed it over many cups of tea with other programmers, and eventually I came up with a plan that would work.

It did work, and it’s possibly the only bit of code I’ve ever written that I’ve been 100% happy with.

Simple, elegant, and completely bulletproof. Anti-fragmentation asset handling. The future.

There were a couple of teeny weeny little caveats on usage – much of the underlying architectural fragmentation resistance came from the fact it was sort of stack like; so it always unloaded assets in the opposite order to loading – but it was a small price to pay for being awesome and fragmentation proof.

That’s another story though. The important thing is that my main architectural concern was preventing memory fragmentation, and the best way that I know of to entirely remove the risk of fragmenting memory is not to allocate or deallocate anything.

So, in an ideal world, this asset manager needed to store the managed assets in a data structure that didn’t have a fixed size, and which also didn’t allocate or deallocate memory.

Just one more tangent.

After I had been made paranoid about memory fragmentation and unintended allocations, I started to worry about the STL container classes.

Once I found out that you can mostly handle that sort of thing with an appropriate STL allocator I got over it (it still bothers me that so few programmers I’ve known seem to worry about this stuff, but that’s largely out of my control).

However, I eventually came up against a use-case with generic containers that still bothered me, and it was essentially a property of the way generic containers have to work in order to be generic – so AFAIK there’s no workaround other than a different approach.

Consider a common situation where you’ve got a known number of pre-allocated objects and you’re using a dead list and an active list to manage which are being used and which are not.

What happens when you take something out of one list and put it into another?

The link element used to store the object in the list doesn’t move with the object, but it still has to go somewhere.

I reasoned that what’s actually going on in any given templated list implementation is likely to be somewhere between the following two cases:

Worst case: a list element is newed every time I insert something into a list, and deleted every time I take it out. This is potentially one free store alloc + constructor and one destructor + free store dealloc per object moved between lists.

Best case: list elements are pre-allocated for each list so elements are used when an object is inserted and recycled when they’re removed. This still has an overhead for internal list element management, and also has to pre-allocate the maximum number of elements in both lists which is wasteful.

I honestly doubt there are many situations where this has become a bottleneck, but I just couldn’t bring myself to feel happy about the list element not moving with the object as it moves from one list to the other – it just seems wasteful and sloppy.

The way to work around this is bleedingly obvious, and is how most – if not all – heaps keep track of the memory they manage. You just put the list element into the object you’re going to store in the list.
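To make the idea concrete, here is a minimal C++ sketch of an intrusive link – just the principle, not the pastebin’d implementation:

// The link lives inside the object, so moving an object from a free list to
// an active list never touches the allocator (illustrative names only)
struct TListLink
{
    TListLink* prev;
    TListLink* next;
};

struct Particle
{
    TListLink link;    // owned by whichever list currently holds this object
    float x, y, z;
};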

This, as it happens, is also where the allocationless data structure comes in.

Ladies and Gentlemen…

It is my profound pleasure to present to you my favourite template list class.

(Note: 12/10/2011 – I found a bug in the code and have put the fixed code into the pastebin…)

In fact, it’s in a pastebin. So, why is it my favourite? Two reasons:

1) It doesn’t allocate.

2) It only has one case for inserting links and one case for deleting links. This is possibly my overall favourite thing.  The same approach to storing the head & tail that gives this property could be used in any list class.

Oh wait there’s three reasons:

3) The links move about between lists with the objects.

It’s not really important in the “generic list” case, but if you’re using a free list and an active list this saves all of the overhead of managing the list elements that the lists are built of.

Sorry, four reasons (“hold on, I’ll come in again…”):

4) The only real tradeoff for this awesomeness is that you can’t put a given object in more than one list simultaneously.

This might be fixable, but I’d be surprised if it didn’t need a significantly different implementation.

Parting thoughts

As I mentioned at the start, TNoAllocList is not a one-size-fits-all data structure.

I think it’s probably most suited to the use case that inspired me to write it in the first place – managing free / active lists of objects.

I hope this wasn’t a waste of your time, and ideally I hope it was useful – or at least interesting.

The final thing to remember is, just because your list can’t fragment memory it doesn’t mean you won’t.


Make Your Own Level Editor – Part II

Original Author: Amos Laber

In part 1 of this series, I outlined the basic principles of handling the content pipeline and how the process should work. To continue, we move on to building the level. As mentioned in part 1, I use the Flash IDE (CS4 or CS5) because of its support for automation and custom commands. Basic familiarity with Flash is required.

Mapping Objects and Assets

Before actually making the first level, we need to create a manifest of all game objects. These can be anything from background elements to pick-ups or characters. Most are represented by static bitmaps or sprite-sheets, where the game object is displayed using a single asset.

The idea is to map a logical game object (class) to a physical asset (bitmap or simple shape). For simplicity, we assume the size of the asset defines the bounding box of the logical game object. Each entry in the manifest will pair the class name with the asset file name and a class ID. It’s recommended to use some kind of naming convention with prefixes etc.

Building the Basic Level

Next we create a level template. This is a project with an empty level that would serve as a base for all future levels. Create a new Flash project (fla) and import all your assets into the library and save the file.

Flash keeps library objects either as the source assets or movie-clips, with the latter used as a container for a display object that can be manipulated.

For each of the source assets, create a movie-clip for it (using F8), and name it to match the class ID in the manifest. Save the file and start dragging objects to the stage. You can define a layer for graphic elements that are not exported, and place reference images, guides and bitmaps from the mock-up there. Make sure to have a rectangle that represents the screen bounds. Use the first level as a template for the rest, reusing the same library objects.

Exporting the Level

The Flash project file (fla) is how a level is saved. This is an intermediate file that will only be used to modify the level at design time. To actually use it in the game, we need to export the level data to a format that can be loaded at runtime.

Exporting the level data means serializing it to XML, which will contain entries for all instances of game objects, each with its metadata (like class name, position, rotation and size). For that, we make use of JSFL – the JavaScript Flash API that’s included with the Flash IDE.

The following JSFL script creates an XML file and saves it in the current folder. To run, use Commands -> ‘Run Command’ and then point to the script file (should be saved with jsfl extension). After the first run, the script name will appear under the ‘Commands’ menu for quick access.

//==========================================
//
// Export all stage instances to xml

var doc = fl.getDocumentDOM();
doc.selectAll();

var sel = fl.getDocumentDOM().selection;
var myXml = "<level>\r\n";

var iLen = sel.length;
for (var i = 0; i < iLen; i++) {
	var s = sel[i];
	if (s.instanceType != "symbol") continue;
	myXml += "    <child type='" + s.libraryItem.name + "'";
	myXml += " x='" + s.x + "' y='" + s.y + "'";
	myXml += " rotation='" + s.rotation + "'";
	myXml += "/>\n";
}

doc.selectNone();
myXml += "</level>";

flash.outputPanel.clear();
flash.trace(myXml);
var exportFileName = doc.pathURI.replace(/%20/g, " ").replace(doc.name,
                        doc.name.replace(/\.fla$/, "") + ".xml");

flash.outputPanel.save(exportFileName);
flash.trace("File saved to " + exportFileName);

Here is a sample of the resulting file (reconstructed from the script’s output format – the symbol names are invented):
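<level>
    <child type='mc_platform' x='120' y='340' rotation='0'/>
    <child type='mc_coin' x='210' y='280' rotation='0'/>
    <child type='mc_spawn_point' x='40' y='300' rotation='0'/>
</level>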

A major advantage of this export is the ability to easily customize the XML data by changing the script to add different properties for each entry (the lines inside the loop that build each <child> tag).

 

Loading the Level

Loading the level into the game is a matter of de-serializing the XML data into game objects. This is truly platform agnostic: you could read the XML from any language or platform and parse the results however you like. The most common targets are Flash, iOS and XNA.
The process for loading should be something like:
  1. Read list of game objects from XML into an array.
  2. Iterate on the array. For each item:
    1. Pass the class name to a factory method to create a new instance of the object.
    2. Locate the asset (resource) associated with the object, by looking it up on the object manifest map.
    3. Call the init function on the newly created instance, to populate it with the asset and metadata.
    4. Place the object on the stage.
  3. End

Your mileage may vary, but that should do the trick. Here I assume all game entity classes are based on one base class, and make use of polymorphism to construct and populate them.
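As an illustration of the factory step (2.1), here is a small C++ sketch; the class names and registration scheme are invented for the example:

#include <map>
#include <string>

struct GameObject
{
    virtual ~GameObject() {}
    virtual void Init(/* asset + metadata from the XML entry */) {}
};

typedef GameObject* (*CreateFunc)();
std::map<std::string, CreateFunc> g_factory;   // class name -> constructor

GameObject* CreateByClassName(const std::string& class_name)
{
    std::map<std::string, CreateFunc>::const_iterator it = g_factory.find(class_name);
    return it != g_factory.end() ? it->second() : 0;
}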

 

Round Trip and Edit Cycle

As mentioned before, the level is also saved as a native ‘fla’ file, as an intermediate, so it can be loaded and exported again at any later time. The most effective way to edit a level is by instant round-tripping between the game (preview) and the editor. Round-tripping is the process of moving from runtime back to design time in order to edit or modify the level data.

Round Tripping

With a build of the game ready to run and the export script in place, this is made possible. Typically, the designer will have the level editor still open after exporting the data. Running the game will load the data and serve as a preview. If the game runs in a browser, it’s only a matter of refreshing it to reload new data. Now all that is left to do is follow the sequence: Edit – Export – Preview, rinse and repeat.

Extending the Loader

The described method works well for simple game objects that are based on a single asset. However, not all objects are simple. Some objects are logic-only, with no assets, but we still want to be able to place them and get a visual indication – trigger boxes, waypoints and spawn points, for example. For these we can create simple shapes (like a semi-transparent box or some icon) and use them as proxy assets.

There is also the option of putting more data into each object. As long as there is a convenient way of entering meta-data in the editor, it can be exported with the object. An example of that would be a ‘fail zone’ – an invisible trigger box that triggers a level fail when touched by the player object. So a generic trigger box can be extended with a message code (integer or string) that will be specified in the editor – opening the possibility for the designer to control simple game logic in different scenarios.

These types of ‘programmable’ objects not only save time, but also free the designer to experiment and quickly try out different layouts and configurations without requiring further coding.

Conclusion

Sophisticated level editing should not be an exclusive perk of heavyweight platforms like Unreal 3. With a little effort, even small indie developers can come up with decent tools to help take control of the content pipeline and speed up game content creation.

Don’t use global state to manage a local problem

Original Author: Rob-Galanakis

I’ve ripped off this title from a common trend on Raymond Chen of MSFT’s blog.  Here are a bunch of posts about it.

I can scream it to the heavens but it doesn’t mean people understand. Globals are bad. Well, no shit, Sherlock. I don’t need to write another blog post to say that. What I want to talk about is: what is a global?

It’s very easy to see this code and face-palm:

spam = list()      # module-level, i.e. global
eggs = dict()
lastIndex = -1

But I’m going to talk about much more sinister types of globals, ones that mingle with the rest of your code possibly unnoticed. Globals living amongst us. No longer! Read on to find out how to spot these nefarious criminals of the software industry.

Environment Variables

There are two classes of environment variable mutation: acceptable and condemning.  There is no ‘slightly wrong’, there’s only ‘meh, I guess that’s OK’, and ‘you are a terrible human being for doing this.’

  1. Acceptable use would be at the application level, where environment variables can be get or set with care, as something needs to configure global environment.  Acceptable would also be setting persistent environment variables in cases where that is very clearly the intent and it is documented.  Don’t go setting environment variables willy-nilly, most especially persistent ones!
  2. Condemning would be the access of custom environment variables at the library level.  Never, ever access environment variables within a module of library code (except, perhaps, to provide defaults).  Always allow those values to be passed in.  Accessing system environment variables in a library is, sometimes, an Acceptable Use.  No library code should set an environment variable, ever.

Commandline Args

See everything about Environment Variables and multiply by 2.  Then apply the following:
  1. Commandline use is only acceptable at the entry point of an application.  Nothing anywhere else should access the commandline args (except, perhaps to provide defaults).
  2. Nothing should ever mutate the commandline arguments.  Ever!

Singletons

I get slightly (or more than slightly) offended when people call the Singleton a ‘pattern.’  Patterns are generally useful for discussing and analyzing code, and have a positive connotation.  Singletons are awful and should be avoided at all costs.  They’re just a global by another name- if you wouldn’t use a global, don’t use a singleton!  Singletons should only exist:
  1. at the application level (as a global), and only when absolutely necessary, such as an expensive-to-create object that does not have state.  Or:
  2. in extremely performance-critical areas where there is absolutely no other way.  Oh, there’s also:
  3. where you want to write code that is unrefactorable and untestable.
So, if you decide you do need to use a global, remember, treat it as if it weren’t a global and pass it around instead (ie, through dependency injection).  But don’t forget: singletons are globals too!
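A tiny C++ sketch of that advice – create the object once at the application level and pass it down, rather than fetching it through a global accessor (names are illustrative):

#include <cstdio>

struct Renderer
{
    // imagine something expensive to create and effectively stateless
    void Draw() { std::printf("drawing\n"); }
};

void RunFrame(Renderer& renderer)   // dependency is passed in, not fetched
{
    renderer.Draw();
}

int main()
{
    Renderer renderer;              // the only scope that knows it's "global"
    RunFrame(renderer);
    return 0;
}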

Module-level/static state

Module-level to you pythonistas, static to your C++/.NET’ers. It’s true – if you’re modifying state on a static class or module, you’re using globals. The only place this ever belongs is generally for caching (and even then, I’d urge you to reconsider). If you’re modifying a module’s state – and acknowledging what you’re doing by, say, having to call ‘reload’ to ‘fix’ the state – you’re committing a sin against your fellow man. Remember, this includes stuff like ‘monkeypatching’ class or module-level methods in python.

The Golden Rule

The golden rule that I’ve come up with for globals is: if I can’t predict the implications of modifying state, find a way not to modify state. If something else you don’t definitely know about is potentially relying on a certain state or value, don’t change it. Even better, get rid of the situation. This means you keep all globals, and anything that could be considered a global (access to env vars, singletons, static state, commandline args), out of your libraries entirely. The only place you want globals is at the highest level application logic. This is the only way you can design something where you know all the implications of the globals, and rigorously sticking to this design will improve the portability of your code greatly.

Agree?  Disagree?  Did I miss any pseudonymous globals that you’ve had to wrangle?

Building your tools as a webapp – Part 1

Original Author: Tom Gaulton

Wow, where to start? I’ve had this topic in mind since before I joined #AltDevBlogADay but I’ve struggled to get it down on paper. It’s not that I haven’t got anything to write on the subject, more that I’ve got too much to write. To explain why, here’s some back-story:

I work as a tools developer on the BlitzTech middleware. The toolchain has been in development for 12 years, and I’ve personally been involved in writing it for 9 years. For the majority of that time the tools have been developed in C++, using the classic Win32 API for the user interface. More recently C++ started to give way to C#, the old Win32 functions were replaced with the .Net framework classes, and scripting languages began to find a home in the pipeline – but fundamentally the tried and tested method of building dialog based tools hadn’t changed in a decade.

Then, in the middle of last year, we decided to try something a little different. In fact, a lot different. Our user interface would no longer be a series of Windows dialogs, but instead a shiny web interface. The UI wouldn’t drive the back-end code directly, but would instead communicate via an HTTP interface. In short, our tools development would be radically different to everything we’d done before.

At first we weren’t sure it would work, but quickly we started to see the potential and over a year later this approach is really paying dividends. When Mike Acton published a post entitled New generation of @insomniacgames tools as webapp back in March, I was excited to see that ours wasn’t the only team going in the web direction, and subsequent enthusiastic posts from other developers made me determined to share my own experiences on the subject.

Now, how to distil all those thousands of hours I’ve spent developing tools into a single blog post that explains a) why building your production tools as a webapp is such an awesome idea, and b) how to actually go about doing it?

Well, on the goal of condensing it into a single post I’ve failed, so I’ve decided to split this into a series of posts. In fact I only finally decided to write this intro a few hours before my posting deadline so I’m afraid it’s merely a tease for what is to come. My next post will focus on all the reasons why a webapp is a Good Thing (TM). Mike’s previous post on the subject gives a good summary of the key goals, but there are a lot of other subtle benefits to explore.

After that, I’ll do a series of posts describing (in as much detail as I’m contractually allowed to share) how we’ve plugged together all the components to make a working webapp-based toolchain – and how we’ve gone about bolting this onto a decade old codebase in a way that gives us all the cool new stuff, while maintaining the legacy of feature rich tools.

If you have an interest in this subject, particularly if there’s anything specific you want to know, please drop a comment below and I’ll do my best to answer any queries.


Robust Inside and Outside Solid Voxelization

Original Author: Nick Darnell

While wrapping up my post on generating simplified occluders, I came across the paper Complete Polygonal Scene Voxelization. Afterwards I found time to read it thoroughly and implement it as a replacement for my existing ray-casting-based solid voxelization method.

The problem with the solid voxelization technique I was using previously was that it relied on ray casting, which makes solid voxelization impossible unless the mesh is watertight and free of anomalies like intersecting geometry.

However, that restriction makes it an unrealistic solution in the real world, because game art typically has holes in locations players never see, such as the bottom cap on a building, which is rarely modeled.

The New Solution

The Complete Polygonal Scene Voxelization paper’s solution to voxelizing a scene is pretty clever: it applies a heuristic model to the problem of determining the inside/outside status of each voxel or octree cell, allowing it to overcome holes and intersecting geometry and making it suitable as a real-world solution.

How It Works

Figure – Octree built around the bunny mesh

You can download the paper and read it for yourself, but let me summarize the algorithm here so that the rest of the article makes sense.

The algorithm takes place in 3 stages:

  1. Create Octree
  2. Find Seed Cell
  3. Propagate Seed Cell

Create Octree

First you create an octree around the mesh, continuing to subdivide each cell until either the cell no longer intersects any triangle or a maximum depth is reached. A maximum octree depth of 5 will work for most meshes; if the mesh has some exceptionally thin walls that you want the cells to be small enough to fill, you may need to go as high as 7 or 8.
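Here’s a minimal sketch of the subdivision step in C#. The OctreeCell, Triangle and CellStatus types and the TriangleOverlapsBox test are hypothetical placeholders, the latter standing in for the Möller overlap test mentioned below:

using System.Collections.Generic;
using System.Linq;

// Recursively subdivide a cell until it is empty or maxDepth is reached.
void Subdivide(OctreeCell cell, List<Triangle> triangles, int depth, int maxDepth)
{
    // Only triangles that touch this cell can touch its children.
    var overlapping = triangles.Where(t => TriangleOverlapsBox(t, cell.Bounds)).ToList();

    if (overlapping.Count == 0)
    {
        cell.Status = CellStatus.Unknown; // empty: inside/outside decided later
        return;
    }

    if (depth == maxDepth)
    {
        cell.Status = CellStatus.Intersecting; // leaf that still touches geometry
        return;
    }

    foreach (OctreeCell child in cell.Split()) // split into 8 equal octants
        Subdivide(child, overlapping, depth + 1, maxDepth);
}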

I was having some problems with the GPU AABB/Triangle overlap test I used for voxelization previously, so I ported the Möller implementation of the AABB/Triangle overlap test to C# and just used it instead.

Also, if you ever need to look up how an intersection test is performed, I highly recommend the gigantic matrix of intersections over at realtimerendering.com. It was a handy resource, since I don’t keep the algorithm for AABB/Triangle overlap stored in my brain.

After you’ve created the octree, you need to process each cell that doesn’t intersect any triangle to determine whether it is inside or outside.

Find Seed Cell

Before we can determine if a cell is inside or outside the mesh we need to find a seed cell. The seed cell is sort of the ground truth example cell that we use to propagate its status to the other cells that it can see. The seed cell’s status is determined by rendering a cube map centered inside the cell with the near plane placed at the cell edge.

When rendering each side of the cube map, you render the scene such that all front facing polygons are blue and all back facing polygons are red. You then read back each cube map surface from the GPU and determine the percentage of red and blue pixels seen at each face.

If at least 4 sides of the cube map contain red pixels, the cell is determined to be inside the mesh.

The paper says that NO red pixels can be seen for a seed cell to be classified as outside. However, I found this problematic, since occasionally a red pixel can show up through tiny rendering artifacts.

So I feel a better solution is one like the following:

const int MIN_INSIDE_FACES = 4;
const float MIN_INSIDE_PERCENTAGE = 0.03f;

int cubemap_sides_seeing_inside = 0;

for (int i = 0; i < 6; i++)
{
    RenderCubeMapSide(i);

    // Fraction of this face's red+blue pixels that are red (backfaces).
    float backfacePercentage = CalculateBackfacePercentage(i);

    if (backfacePercentage > MIN_INSIDE_PERCENTAGE)
        cubemap_sides_seeing_inside++;
}

if (cubemap_sides_seeing_inside >= MIN_INSIDE_FACES || cubemap_sides_seeing_inside == 0)
{
    if (cubemap_sides_seeing_inside >= MIN_INSIDE_FACES)
        cell.Status = CellStatus.Inside;
    else // cubemap_sides_seeing_inside == 0
        cell.Status = CellStatus.Outside;

    // Propagate cell status...
}
else
{
    // Unable to solve status exactly.
}

Here a face isn’t counted as inside unless at least 3% of its total red and blue pixels are red. The percentage is just something I picked out of thin air; it feels like a number small enough to be easily exceeded by any truly inside cube face, but high enough to let me ignore tiny artifacts.

Propagate Seed Cell

The last step is to propagate the seed cell’s status to other cells. After classifying a seed cell, you need to test every unknown cell against the depth map and frustum of each cube map surface.

You’re testing whether any of the 8 corners of the octree cell, when projected into the camera space of each cube face, is closer than the depth value at that pixel. If any corner is, then the entire octree cell is likely visible.
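As a rough sketch of that corner test (hypothetical CubeMapFace and OctreeCell types; face.ViewProjection and face.Depth are assumed to hold the camera matrix and the read-back depth buffer of one cube map surface):

using System.Numerics;

// True if any corner of the cell is closer to the seed's camera than the
// geometry rendered into this cube map face.
bool IsCellVisibleFromFace(CubeMapFace face, OctreeCell cell)
{
    foreach (Vector3 corner in cell.Corners) // the 8 corners of the cell's AABB
    {
        Vector4 clip = Vector4.Transform(new Vector4(corner, 1.0f), face.ViewProjection);
        if (clip.W <= 0.0f)
            continue; // corner is behind the camera

        // Perspective divide into normalized device coordinates.
        float x = clip.X / clip.W, y = clip.Y / clip.W, z = clip.Z / clip.W;

        if (x < -1 || x > 1 || y < -1 || y > 1)
            continue; // outside this face's frustum

        // Map NDC to pixel coordinates and compare against the stored depth.
        int px = (int)((x * 0.5f + 0.5f) * (face.Size - 1));
        int py = (int)((y * 0.5f + 0.5f) * (face.Size - 1));
        if (z < face.Depth[py, px])
            return true; // corner is in front of the rendered geometry
    }
    return false;
}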

If the cell is visible from the seed cell, then in all likelihood it has the same status as the seed cell. However, because holes mean two seed cells (one inside and one outside) can potentially see the same cell, you want several seed cells to confirm a cell’s status before committing to it.

So once you’ve determined a cell is visible from the seed cell, you increment a counter on that cell for the seed’s status. Once one of the counters reaches a threshold, for example 16, you change the cell’s status from Unknown to whichever status crossed the threshold and stop processing that cell.

It should be noted that only seed cells propagate their status. Cells that you propagate to do not propagate their own status.
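Here’s a minimal sketch of that voting scheme (hypothetical names; IsCellVisibleFromSeed is assumed to apply the corner test above across all six faces of the seed’s cube map):

using System.Collections.Generic;

void PropagateSeed(OctreeCell seed, IEnumerable<OctreeCell> unknownCells)
{
    const int STATUS_CONFIRMATIONS = 16; // example threshold from the text

    foreach (OctreeCell cell in unknownCells)
    {
        if (cell.Status != CellStatus.Unknown || !IsCellVisibleFromSeed(seed, cell))
            continue;

        // Vote rather than commit immediately: holes can let inside and
        // outside seeds see the same cell.
        if (seed.Status == CellStatus.Inside)
            cell.InsideVotes++;
        else
            cell.OutsideVotes++;

        if (cell.InsideVotes >= STATUS_CONFIRMATIONS)
            cell.Status = CellStatus.Inside;   // committed; no longer processed
        else if (cell.OutsideVotes >= STATUS_CONFIRMATIONS)
            cell.Status = CellStatus.Outside;
    }
}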

Repeat

After you’ve found a seed cell and propagated its status, keep repeating the process of finding a seed cell and propagating it until all cells have a status of inside, outside or intersecting. In rare cases you can end up with some cells whose status simply can’t be determined, so make sure your code handles that scenario and doesn’t loop forever.

Improvements

While implementing the paper I made some additional improvements to the proposed solution. I sped up the process by taking advantage of hardware improvements to render the scene in a single pass. I also made the algorithm more conservative in situations where you’re using square voxels: when a mesh is wider than it is tall, there will be padding below the mesh, and if the bottom of the mesh is uncapped it can lead to inside cells ‘leaking’ their status outside.

Single Pass Rendering

The paper was published back in 2002, and due to hardware limitations at the time, the simplest method of rendering front faces one color and back faces another was to render the scene twice, flipping the winding order and the color of the triangles between passes. However, this method is slower than just using a simple pixel shader to color front and back faces.

In GLSL you can use the built-in gl_FrontFacing (the Direct3D equivalent is the SV_IsFrontFace semantic):

void main()
{
    // Blue for front faces, red for back faces.
    gl_FragColor = gl_FrontFacing ? vec4(0,0,1,1) : vec4(1,0,0,1);
}

Intersect Mesh Bounds and Clip To Bounds

One problem I found is that when a mesh (like a building) is uncapped at its base but is wider than it is tall, there will be several cells below the base of the mesh. These cells will have the inside status spread to them, even though a human could easily see they are outside.

So one improvement I ended up adding: when testing cells for triangle intersection, also test them against the mesh bounds. If a cell is outside the mesh bounds, immediately mark it as Outside, since it simply is not possible for that cell to be inside the mesh. But don’t treat that cell like a seed cell; just mark it as outside and move on.
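A sketch of that pre-pass (hypothetical names again; meshBounds is assumed to be the AABB of the source mesh, with an Intersects test on it):

// Classify cells against the mesh bounds before any seed finding happens.
foreach (OctreeCell cell in allCells)
{
    if (!meshBounds.Intersects(cell.Bounds))
    {
        // This cell cannot possibly be inside the mesh. Mark it immediately,
        // but never use it as a seed cell and never propagate from it.
        cell.Status = CellStatus.Outside;
    }
}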

I needed some real game art to properly test the solution so I exported a roof structure from the Necropolis map from UDK’s UTGame sample. Here you can really see the difference it makes to clip to the bounds of the mesh. Note how many additional voxel/octree cells (purple lines) are determined to be ‘inside’ because of how many backfaces (red triangles) they can see.

Figure 1 – Roof Inner Voxelization Not Bounded (Before)

Figure 2 – Roof Inner Voxelization Bounded (After)

Future Improvements

When the cube map for each seed cell is processed, it’s read back from the GPU and each pixel is checked on the CPU. This is wildly inefficient when all we care about is the ratio of red to blue pixels on each face.

That processing could be moved to OpenCL to improve performance significantly. I would also prefer to have all cells be seed cells, since that makes it easier to define the rules for which cells are inside vs. outside. I suspect letting cells propagate their status has a higher potential to cause problems on meshes with very nasty artifacts; giving each cell the ability to determine its status individually would be more stable and more predictable.

Currently my cube map rendering is performed in 6 passes. Shifting over to a single-pass method using the geometry shader would likely add further speed improvements, but I don’t know for sure.

If I move enough of the processing to the GPU it may allow me to make more cells seed cells (perhaps all?) and still maintain an acceptable performance footprint for tool time usage. For offline usage though this method is already very acceptable (a few seconds for an average mesh and a maximum depth of 5) even with all the CPU read-backs I’m performing.

Sample Code

I’m still working on an improved version of the Generating Occluders for Hierarchical Z-Buffer Occlusion Culling sample, so the code is a bit tied up at the moment. However, my next post on generating occluders will contain it, and that should be soon.

Robust Inside and Outside Solid Voxelization @ nickdarnell.com


Education: The Importance of Teams

Original Author: Heather M Decker-Davis

As you may be aware, there are numerous education programs out there for game development. What bothers me about many of these programs is the scarcity, or sometimes complete absence, of group projects. The game industry is currently experiencing a wonderful renaissance in which it is once again possible for solo developers to forge successful careers creating indie and mobile titles, but a lot of students don’t necessarily sign up for game development programs with the dream of working alone. Therefore, if you’re an aspiring student who dreams of working on teams large or small, you owe it to yourself to be sure you’re getting appropriate learning experiences that exercise your social skills, including cooperation and compromise. Likewise, if you’re an educator responsible for curriculum, you owe it to your students to provide as many real-world development situations as possible.

Dear Students,

If you find yourself in a great program that simply doesn’t offer you as many teamwork opportunities as you see fit, there is still hope for expanding your horizons. You may need to work with other students outside of classes. I assure you this is not a preposterous suggestion. Many prominent game developers have side projects, and in fact, some studios even actively encourage side or pet projects. It makes perfect sense! Each project you undertake is an opportunity to learn new insights you can apply to future projects. This is invaluable experience you won’t gain from simply learning software programs and reading about general development processes.

Starting your own group project can be challenging, but also highly rewarding if you stick with it. The first step is not to be shy. Reach out to your fellow classmates, check with relevant groups or clubs, ask your faculty if they know of other students interested in collaborating, and hit your school’s media outlets (newsletters, bulletin boards, social media, etc.) with ads seeking teammates. It may take some time, but it’s totally possible to build a team. It often helps to set goals for your project, such as entering IGF or showcasing your game at a local event. These types of goals also come with hard deadlines, ensuring your project has a definite ship date. Learning to cope with project scope and task scheduling in a group setting is important hands-on experience you are unlikely to find in your regular classes!

Dear Educators,

I realize there are a variety of reasons team projects aren’t frequently offered in your programs. In fact, many game development programs are fairly new at this point in time, meaning a lot of curriculum is still being pioneered and polished. Let’s push the envelope. One fairly straightforward way we can improve our offerings to students is by providing more opportunities for teamwork. This presents new challenges in planning and grading, but ultimately means we’re teaching students more relevant real-world skills! It’s not enough to simply teach students to use 3D modeling software and 3D game engines. We need to simulate the workflow involved in collaboration. Optimally, basic classes should be offered to give students a good grounding in a particular software set, followed by group project classes in which students are tasked with applying their software knowledge to a team production effort. (Example: 3DS Max and UDK classes lead up to team-based level design classes.) In addition to cultivating crucial social skills needed for effective work on a team, students would also have the opportunity to create larger, more polished projects than they could alone, meaning better portfolio pieces in the long run.

Overall, I think grading is one of the biggest hurdles when it comes to offering team projects. In many groups, there tends to be a predictable mix of students who work their hardest and students who just skim by. My first suggestion is to stick to individual grades. This forces each student to be accountable for their own efforts, rather than falling back on a blanket grade. Additionally, upon completion, have students write a brief summary of their duties on the project and what they learned. Comparing team member accounts, as well as your own observations during class time, can be helpful in gauging specific participation.

 

In closing, both students and educators should be interested in academic project teams. From the student perspective, you gain social and workflow training that is absolutely critical in a collaborative field. You’ll start seeing common pitfalls and learn to overcome them. From an educator’s standpoint, you’re training more qualified students for the field, which reflects positively on your own teaching ability and your program at large. From either viewpoint, the benefits are clear. I hope to see more collaborative student projects!