Software Rasterizer Part 2

Original Author: Simon Yeung

Introduction

Continuing from the previous part: perspective projection is not an affine transformation (i.e. after the transformation, the mid-point of a line segment is no longer the mid-point), so interpolating vertex attributes linearly in screen space results in distortion. This artifact is even more noticeable when the triangle is large:

[Figure: interpolation in screen space]
[Figure: perspective-correct interpolation]

Condition for linear interpolation

When interpolating the attributes in a linear way, we are saying that given a set of vertices, vi (where i is any integer>=0) with a set of attributes ai (such as texture coordinates), we have a function mapping a vertex to the corresponding attributes, i.e.

f(vi)= ai

Say, to interpolate a vertex inside a triangle in a linear way, the function f needs to have the following property:

f(t0*v0 + t1*v1 + t2*v2) = t0*f(v0) + t1*f(v1) + t2*f(v2)
, for any t0, t1, t2 where t0 + t1 + t2 = 1

which means that we can calculate the interpolated attributes using the same weights ti that are used to interpolate the vertex position. A function with this property is an affine function of the following form:

f(x)= Ax + b
, where A is a matrix, and x and b are vectors
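As a quick check (a short derivation added here for illustration, using only the definitions above), an affine function does satisfy the interpolation property. The requirement that the weights sum to 1 is exactly what makes the constant term b work out:

```latex
\begin{aligned}
f(t_0 v_0 + t_1 v_1 + t_2 v_2)
  &= A(t_0 v_0 + t_1 v_1 + t_2 v_2) + b \\
  &= t_0 A v_0 + t_1 A v_1 + t_2 A v_2 + (t_0 + t_1 + t_2)\, b
     \qquad \text{since } t_0 + t_1 + t_2 = 1 \\
  &= t_0 (A v_0 + b) + t_1 (A v_1 + b) + t_2 (A v_2 + b) \\
  &= t_0 f(v_0) + t_1 f(v_1) + t_2 f(v_2)
\end{aligned}
```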

Depth interpolation

When a vertex is projected from view space to normalized device coordinates (NDC), we will have the following relation (ratio of the triangles) between the view space and NDC space:

The Difficulties of an Infinite Video Game World

Original Author: Alex Norton

The Premise

Procedural Generation is definitely in vogue, and I personally have believed that it is the way forward in video gaming for many years now. Using procedural generation in games is nothing new of course, as fans of games such as Elite or The Sentinel will know that we’ve been seeing it in games for a good 25 years.

Older titles made good use of it due to the memory constraints of the hardware of the time. It was simply more efficient to have generated levels rather than hand crafted ones, but that is no excuse for games not to make better use of it now that we have better specced hardware.

Fans of the RPG genre will no doubt remember The Elder Scrolls II: Daggerfall, which had one of the largest in-game worlds ever seen, and still to this day tramples almost every RPG made in terms of world size. I recall reading somewhere that the in-game world of Daggerfall was equal to twice the landmass of the British Isles.

That is a heck of a lot of world to explore, and – from a game design perspective – a nightmare to recreate by hand. Through clever use of procedural generation, however, it is easily possible, which is what Bethesda Softworks did with Daggerfall. The settlements and towns were hand-crafted, with the wilderness in between being generated by the game.

But why stop there? Why have world borders at all? Procedural generation code hasn’t changed much in the last 25 years. People are still stuck using fractals and diamonds and blobs to do everything, which becomes repetitive and quite simply looks like procedurally generated content. To any programmer looking at it, it virtually smells of procedural generation. On top of all this, if you get it wrong, it will end up VERY wrong. The indie crowd seems to do it best, with titles like Dwarf Fortress generating MASSIVE worlds with lush histories and more world than you could ever hope to explore. But still, they aren’t pushing the envelope. My aim was to fix that by making it work. An infinite game world should be possible, and indeed it is.

 

The Idea

Just over two years ago I began assembling a team to make the first truly infinite, fully 3D fantasy RPG, entitled Malevolence: The Sword of Ahkranox. It was to be played in a style similar to the classic grid-based, first person RPGs of the late 80s and early 90s such as Might & Magic, Eye of the Beholder and Dungeon Master, but set in a literally infinite world. We had originally thought to make it a planet-sized world, but in the end decided on the story being that the game’s world was being created within the imagination of a sentient sword, which would act as a way to “explain” the infinity of it.

After much experimentation and very complex math, we got it working, but all in raw data. Nothing really playable. But we had in front of us an infinite world filled with infinite dungeons and infinite cities filled with infinite NPCs. We then worked to get a game working in such a world (some of the efforts of which you may have read about in my last post).

Now, just to confirm, this world wasn’t being randomly generated. It was both infinite AND persistent. Without going into too much detail, this is achieved by making the world dynamically affected by the passing of time. Every part of the world is identified as either affected by time or timeless. The lay of the land with its hills and caverns… That’s all timeless, and never changes. Because those parts never change and cannot be affected by the player, they only need to be loaded into memory when the player can see them (or if they are needed to generate quest information, etc). However, if an object is affected by time (for example, the contents of a chest), then it has a time coefficient applied to the procedural algorithm that generates it. This means that a chest in a dungeon, for example, will have different items in it depending on WHEN the player opens the chest. If the player were to clear out that chest, that act is stored in a database of player changes, but then reset when a certain amount of time has passed. This ensures that the database of player changes to the world never exceeds a certain size (which is estimated to be around 250 MB at the very most, but more realistically around 50 MB).
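To make the timeless versus time-affected split a little more concrete, here is a minimal C++ sketch. It is not Malevolence's actual algorithm, which isn't described in detail here. The names mix64, chestSeed, terrainSeed and kResetPeriod are all hypothetical, and the "time coefficient" is assumed to be a simple time bucket folded into the seed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit mixer (the SplitMix64 finalizer). Any good
// deterministic integer hash would serve for this sketch.
std::uint64_t mix64(std::uint64_t x) {
    x += 0x9E3779B97F4A7C15ULL;
    x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
    x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
    return x ^ (x >> 31);
}

// Assumed length of one "time bucket" in game seconds (a made-up value).
constexpr std::uint64_t kResetPeriod = 3600;

// Time-affected content: the seed depends on WHICH chest it is and WHEN
// it is opened, so nothing has to be stored for chests nobody touched.
std::uint64_t chestSeed(std::uint64_t chestId, std::uint64_t gameTime) {
    const std::uint64_t bucket = gameTime / kResetPeriod;
    return mix64(mix64(chestId) ^ bucket);
}

// Timeless content: the seed depends on location only.
std::uint64_t terrainSeed(std::uint64_t cellId) {
    return mix64(cellId);
}

// Usage example: the same chest gives the same seed within one time
// bucket and a different one after the bucket rolls over.
void chestSeedExample() {
    assert(chestSeed(42, 100) == chestSeed(42, 200));   // both in bucket 0
    assert(chestSeed(42, 100) != chestSeed(42, 3600));  // bucket 0 vs 1
    assert(terrainSeed(7) == terrainSeed(7));           // never changes
}
```

In a scheme like this, only the player's changes (such as an emptied chest) need to be stored, and each stored change can be dropped once its time bucket has passed.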

This generation accounts for almost everything in the game. Spell creation, item creation, weapon creation, potion creation, NPC dialogue system, even the spell effects that happen on the screen. Due to this, the world that the player explores will be ever-changing and infinite. They won’t keep finding the same old weapons or items, there will be no end to the number of spells they can find or use, they won’t even keep having the same conversations with NPCs. This is necessary to keep a player interested for long enough in an infinite world.

 

Public Acceptance

Back when the game Elite was first being worked on, it was planned to have around 282 trillion galaxies with around 256 star systems in each one, but their publisher, Firebird, were worried that such a large in-game universe would be intimidating to players and put them off. I have to say I had wondered at that, and was interested to see how the public would react to an even bigger in-game world.

I was surprised at the results.

We’ve been quite public with our development process for the game so far and generated a small cult following on communities such as IndieDB, but very few people seem to quite grasp the scale of an infinite world, despite our thorough descriptions of it. We had put up renders of the world generation data, showing just a tiny fraction of the world:

And then, we showed them this:

That inland sea is around the size of the entire in-game world of Skyrim. Funnily enough, the largest response we got from this information was disbelief. Many called us liars and said that it simply wasn’t possible. Others began to believe that the world size of Malevolence was the entire above image, rather than infinite. Only about 20% of people really understood.

So, from a marketing perspective, it’s been a bit of a nightmare to have an infinite world. We’ve even had many suggest that Malevolence is just a rip-off of Legend of Grimrock, despite the fact that Malevolence was started about a year before. But that’s always going to happen, no matter what the game. What happens upon release will happen, and that’s just how the cookie crumbles with game development. Funnily enough, that hasn’t been the hardest bit. The hardest bit has been the math involved in making a world like this one.

 

The Math

Being infinite, procedural AND persistent, most of the mathematics behind Malevolence is theoretical math – that is, mathematics with few or no fixed/known values acting in a volatile space. But we’ve broken the world creation down into multiple layers.

The first layer is the one you saw above. A large world segment is generated which covers an area of about 400x400km. This is the only layer of the game that uses a standardised procedural generation system (Perlin noise).

That is then broken down into chunks that are around 3x3km, calculating the biome information within that area, like so:

In the end, all of these steps need to be completed when each new world segment is generated in order to turn the raw data into this:

That is just for the overworld. Every world segment that is VISIBLE to the player (as in the view above) is given a unique code, generated by the procedural algorithm. If there is a dungeon entrance in that segment, the dungeon is generated using this unique code, ensuring that every time the player returns to that spot, the same dungeon will be there:
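The idea of regenerating the same dungeon from a segment's unique code can be sketched as a seeded generator. This is only an illustration under assumptions: the Dungeon type, generateDungeon and the room-size ranges are invented here, and the game's real generator is not described.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical dungeon description: just a list of room sizes here.
struct Dungeon {
    std::vector<int> roomSizes;
};

// Rebuilds the dungeon purely from the segment's code. std::mt19937_64
// produces a standard-specified sequence for a given seed, so the result
// is reproducible. The raw engine output is reduced with % rather than a
// std:: distribution, whose results may differ between library versions.
Dungeon generateDungeon(std::uint64_t segmentCode) {
    std::mt19937_64 rng(segmentCode);
    Dungeon d;
    const int roomCount = 3 + static_cast<int>(rng() % 6);        // 3..8 rooms
    for (int i = 0; i < roomCount; ++i) {
        d.roomSizes.push_back(4 + static_cast<int>(rng() % 13));  // sizes 4..16
    }
    return d;
}

// Usage example: returning to the same segment rebuilds the same dungeon.
void dungeonExample() {
    const Dungeon first = generateDungeon(123456789ULL);
    const Dungeon again = generateDungeon(123456789ULL);
    assert(first.roomSizes == again.roomSizes);
}
```

Because the dungeon is a function of the code alone, nothing about it has to be saved between visits.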

This same method is used for town generation, graveyards, ruins or anything else that the player may encounter. And this goes on forever. If a player were to turn off collision and hold down the ‘move forward’ button, it would take them just under three weeks to walk from one end of a world segment to another, and then they would simply move to a new world segment seamlessly, and then another, forever.

The biggest question we have been given is how we have dealt with the data type limitations on player co-ordinates, but unfortunately we can’t give away all our secrets 🙂 But I can tell you that Malevolence doesn’t suffer from the Minecraft world-edge issue, it just keeps going on and on.

 

Conclusion

Using procedural generation in your game can be a rewarding experience, but definitely don’t rush into it. It takes good planning, clever usage and most of all it needs to feel seamless, otherwise the public simply won’t accept it.


Once Upon a Time…

Original Author: Poya Manouchehri

I have a theory: everyone has or will have, at some point, an idea for a story they want to write. Or tell. And I don’t mean a real life story, but a story that is a creation of one’s imagination. Now it might be a passing thought… Maybe it’s a person, a news report, a real life event, a book, or a game that suddenly triggers an idea for a story. The process of turning that idea into something complete and finished is a whole other…well, story.

Currently I’m writing the story for the game Connectorium. It’ll be the second story I’m writing in full, after co-writing the Revival short film (I’m not counting the one or two short stories here and there, and a failed attempt at writing a fantasy novel after watching the first Lord of the Rings film. Who didn’t do that, right?). Here is just a collection of random thoughts, observations, and experiences about the process. Obviously these are not the opinions of an expert; I’m merely hoping it opens up the way for a conversation and invites thoughts from you.

From Abstraction to Realization

This is something that is universal to the creative process. You begin with an empty canvas. Maybe a concept that is completely abstract and vague. Then with every sentence, with every stroke of a brush, with every added note, or with every line of code, you bring that abstraction one step closer to existence (and the number of possibilities for what that end product will be shrinks with every step). But there is a key thing I have realized: this is a two way process. The original idea or concept affects what you create. But what you create also affects the idea over time. To a point where the final product may in no way resemble the original idea. I think this is a very important part of the creative process: the organic nature of it.

As far as a story goes, that initial concept and idea can be many different things. Maybe it’s a particular character, or a specific plot point. Maybe it’s a particular setting. Maybe it’s a mechanic in the game you are designing. Either way, it’s important to keep in mind that your completed story may be nothing like what you had initially conceived. And that’s OK. In fact it’s more than OK. It’s usually a good thing.

Working Backwards

When I first started working on Connectorium, I had a general idea for the story. The game is about systems and connections, so the story was going to be about a little girl who wakes up one morning to a world where all connections have gone missing. Her adventure would be about her meeting various characters, helping them restore the missing connections, and solving the mystery. For some time though, I stalled on fleshing out the story more. Eventually I asked myself, why am I wasting time? Why don’t I just write the story? And it occurred to me: it’s because I didn’t know how it was going to end.

So one morning I decided to take my iPad, go to a quiet park, and not come back home until I had figured out how the story would end. It took a couple of hours, but eventually I came up with an idea, quite suddenly really. I had a big smile on my face right at that moment, because I knew I could start writing the story now.

Maybe this is more a function of the kinds of story that I enjoy and like to write, but I find that I really need to know the ending early on. Everything in the plot, the characters, the gameplay in the case of a game, is pushing the audience towards that ending. It’s what keeps the story coherent to me.

Characters or Plot

One of my favorite writers, Isaac Asimov, is often criticized for having somewhat uninteresting and 2D characters. Nevertheless he is an amazing story teller.

But one can’t argue that the best of stories combine a great plot, with believable and great characters. What I have noticed is that personally I’m much more interested and focused on the plot. So I always need to be conscious of the “flatness” of my characters. For that reason, after I have written the initial draft of the story, I’ll do an iteration where I’ll focus specifically on each character, writing more back story, fixing the dialog, descriptions, and so on, of course adjusting the plot where necessary. I can imagine the reverse can work just as well: building a detailed and interesting character, and developing the story around that character (or characters).

Dialog, Dialog, Dialog

For me, probably the hardest part of writing a story is the dialog. Not only is it really hard to write a believable, natural, and flowing conversation between two or more characters, it’s even harder to have all your characters not sound exactly the same! Exactly like…you!

More than anything, it just requires time, and rewrites to improve this. It is also important to have back stories for characters, even if none of it is ever revealed to the audience. Where do they come from? What do they do? What do they eat? What was their childhood like? What are their relationships like? What is their motivation? All of these impact how a character speaks, how they would react to a situation, and how they’d express themselves.

Another thing that has helped me is trying to picture a real life person acting out that character. Maybe someone you know, or an actor. Putting a face and voice to a line of dialog goes a long way to help you see if it’s the right fit. Sometimes reading it out loud in the voice that you think the character would be speaking in also helps here.

On the Subject of Games

I’ve been talking a lot about stories, and haven’t really talked much about games. Here is a point I want to make, which I expect at least some to disagree with.

I feel that the gameplay must reinforce the story as much as possible. At the very least it shouldn’t contradict it, because that takes you out of the immersion that you might otherwise have. How often do you run around in a game, killing various things, and collecting numerous items, stats, etc, just to be reminded by a cut scene that you’re actually trying to resolve a much greater conflict?

“Alright guys, just a few more crates. Then Lord what’s-his-face is gonna get it…”

And here is another (potentially less popular) thought. Given that there are practically infinite possible stories, why is it that a good percentage of games, especially those with plots and characters, include combat in some form as their core mechanic? Is it that we are simply avoiding stories where combat isn’t an integral component? Or are we throwing combat into the mix, regardless of whether or not it reinforces the story?

Just a thought. Would love to hear yours.

Functional Programming in C++

Original Author: John Carmack

Probably everyone reading this has heard “functional programming” put forth as something that is supposed to bring benefits to software development, or even heard it touted as a silver bullet.  However, a trip to Wikipedia for some more information can be initially off-putting, with early references to lambda calculus and formal systems.  It isn’t immediately clear what that has to do with writing better software.

My pragmatic summary:  A large fraction of the flaws in software development are due to programmers not fully understanding all the possible states their code may execute in.  In a multithreaded environment, the lack of understanding and the resulting problems are greatly amplified, almost to the point of panic if you are paying attention.  Programming in a functional style makes the state presented to your code explicit, which makes it much easier to reason about, and, in a completely pure system, makes thread race conditions impossible.

I do believe that there is real value in pursuing functional programming, but it would be irresponsible to exhort everyone to abandon their C++ compilers and start coding in Lisp, Haskell, or, to be blunt, any other fringe language.  To the eternal chagrin of language designers, there are plenty of externalities that can overwhelm the benefits of a language, and game development has more than most fields.  We have cross platform issues, proprietary tool chains, certification gates, licensed technologies, and stringent performance requirements on top of the issues with legacy codebases and workforce availability that everyone faces.

If you are in circumstances where you can undertake significant development work in a non-mainstream language, I’ll cheer you on, but be prepared to take some hits in the name of progress.  For everyone else: No matter what language you work in, programming in a functional style provides benefits.  You should do it whenever it is convenient, and you should think hard about the decision when it isn’t convenient.  You can learn about lambdas, monads, currying, composing lazily evaluated functions on infinite sets, and all the other aspects of explicitly functionally oriented languages later if you choose.

C++ doesn’t encourage functional programming, but it doesn’t prevent you from doing it, and you retain the power to drop down and apply SIMD intrinsics to hand laid out data backed by memory mapped files, or whatever other nitty-gritty goodness you find the need for.

 

Pure Functions

A pure function only looks at the parameters passed in to it, and all it does is return one or more computed values based on the parameters.  It has no logical side effects.  This is an abstraction of course; every function has side effects at the CPU level, and most at the heap level, but the abstraction is still valuable.

It doesn’t look at or update global state.  It doesn’t maintain internal state.  It doesn’t perform any IO.  It doesn’t mutate any of the input parameters.  Ideally, it isn’t passed any extraneous data – getting an allMyGlobals pointer passed in defeats much of the purpose.
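A small side-by-side sketch may help here. The function names and the damage-counter scenario are invented for illustration and are not taken from any real codebase.

```cpp
#include <cassert>

// Impure: reads and updates global state, so the result depends on the
// call history rather than on the arguments alone.
int g_totalDamage = 0;

int applyDamageImpure(int damage) {
    g_totalDamage += damage;
    return g_totalDamage;
}

// Pure: everything it needs comes in as parameters, and its only effect
// is the value it returns.
int applyDamagePure(int totalDamage, int damage) {
    return totalDamage + damage;
}

// Usage example: the pure version gives the same answer every time,
// while the impure one gives a different answer on each call.
void purityExample() {
    assert(applyDamagePure(10, 5) == 15);
    assert(applyDamagePure(10, 5) == 15);
    const int first = applyDamageImpure(5);
    const int second = applyDamageImpure(5);
    assert(second == first + 5);
}
```

The caller of the pure version has to carry the running total itself, which is the extra parameter passing that a functional style tends to involve.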

Pure functions have a lot of nice properties.

Thread safety.  A pure function with value parameters is completely thread safe.  With reference or pointer parameters, even if they are const, you do need to be aware of the danger that another thread doing non-pure operations might mutate or free the data, but it is still one of the most powerful tools for writing safe multithreaded code.

You can trivially switch them out for parallel implementations, or run multiple implementations to compare the results.  This makes it much safer to experiment and evolve.

Reusability.  It is much easier to transplant a pure function to a new environment.  You still need to deal with type definitions and any called pure functions, but there is no snowball effect.  How many times have you known there was some code that does what you need in another system, but extricating it from all of its environmental assumptions was more work than just writing it over?

Testability.  A pure function has referential transparency, which means that it will always give the same result for a set of parameters no matter when it is called, which makes it much easier to exercise than something interwoven with other systems.   I have never been very responsible about writing test code;  a lot of code interacts with enough systems that it can require elaborate harnesses to exercise, and I could often convince myself (probably incorrectly) that it wasn’t worth the effort.  Pure functions are trivial to test; the tests look like something right out of a textbook, where you build some inputs and look at the output.  Whenever I come across a finicky looking bit of code now, I split it out into a separate pure function and write tests for it.  Frighteningly, I often find something wrong in these cases, which means I’m probably not casting a wide enough net.

Understandability and maintainability.  The bounding of both input and output makes pure functions easier to re-learn when needed, and there are fewer places for undocumented requirements regarding external state to hide.

Formal systems and automated reasoning about software will be increasingly important in the future.  Static code analysis is important today, and transforming your code into a more functional style aids analysis tools, or at least lets the faster local tools cover the same ground as the slower and more expensive global tools.  We are a “Get ‘er done” sort of industry, and I do not see formal proofs of whole program “correctness” becoming a relevant goal, but being able to prove that certain classes of flaws are not present in certain parts of a codebase will still be very valuable.  We could use some more science and math in our process.

Someone taking an introductory programming class might be scratching their head and thinking “aren’t all programs supposed to be written like this?”  The reality is that far more programs are Big Balls of Mud than not.  Traditional imperative programming languages give you escape hatches, and they get used all the time.  If you are just writing throwaway code, do whatever is most convenient, which often involves global state.  If you are writing code that may still be in use a year later, balance the convenience factor against the difficulties you will inevitably suffer later.  Most developers are not very good at predicting the future time integrated suffering their changes will result in.

 

Purity In Practice

Not everything can be pure; unless the program is only operating on its own source code, at some point you need to interact with the outside world.  It can be fun in a puzzly sort of way to try to push purity to great lengths, but the pragmatic break point acknowledges that side effects are necessary at some point, and manages them effectively.

It doesn’t even have to be all-or-nothing in a particular function.  There is a continuum of value in how pure a function is, and the value step from almost-pure to completely-pure is smaller than that from spaghetti-state to mostly-pure.  Moving a function towards purity improves the code, even if it doesn’t reach full purity.  A function that bumps a global counter or checks a global debug flag is not pure, but if that is its only detraction, it is still going to reap most of the benefits.

Avoiding the worst in a broader context is generally more important than achieving perfection in limited cases.  If you consider the most toxic functions or systems you have had to deal with, the ones that you know have to be handled with tongs and a face shield, it is an almost sure bet that they have a complex web of state and assumptions that their behavior relies on, and it isn’t confined to their parameters.  Imposing some discipline in these areas, or at least fighting to prevent more code from turning into similar messes, is going to have more impact than tightening up some low level math functions.

The process of refactoring towards purity generally involves disentangling computation from the environment it operates in, which almost invariably means more parameter passing.  This seems a bit curious – greater verbosity in programming languages is broadly reviled, and functional programming is often associated with code size reduction.  The factors that allow programs in functional languages to sometimes be more concise than imperative implementations are pretty much orthogonal to the use of pure functions — garbage collection, powerful built in types, pattern matching, list comprehensions, function composition, various bits of syntactic sugar, etc.  For the most part, these size reducers don’t have much to do with being functional, and can also be found in some imperative languages.

You should be getting irritated if you have to pass a dozen parameters into a function; you may be able to refactor the code in a manner that reduces the parameter complexity.

The lack of any language support in C++ for maintaining purity is not ideal.  If someone modifies a widely used foundation function to be non-pure in some evil way, everything that uses the function also loses its purity.  This sounds disastrous from a formal systems point of view, but again, it isn’t an all-or-nothing proposition where you fall from grace with the first sin.  Large scale software development is unfortunately statistical.

It seems like there is a sound case for a pure keyword in future C/C++ standards.  There are close parallels with const – an optional qualifier that allows compile time checking of programmer intention and will never hurt, and could often help, code generation.  The D programming language does offer a pure keyword:  http://www.d-programming-language.org/function.html  Note their distinction between weak and strong purity – you need to also have const input references and pointers to be strongly pure.

In some ways, a language keyword is over-restrictive — a function can still be pure even if it calls impure functions, as long as the side effects don’t escape the outer function.  Entire programs can be considered pure functional units if they only deal with command line parameters instead of random file system state.
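As an illustration of side effects that don’t escape, consider a median function built on std::sort. This is a sketch written for this point. The median function is a made-up example, not something from the original discussion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// std::sort mutates its range in place, so it is not a pure function.
// Because it only touches a local copy here, median() still behaves as a
// pure function from the caller's point of view.
double median(std::vector<double> values) {  // by value: caller's data is untouched
    assert(!values.empty());
    std::sort(values.begin(), values.end());
    const std::size_t mid = values.size() / 2;
    return (values.size() % 2 == 1)
               ? values[mid]
               : (values[mid - 1] + values[mid]) / 2.0;
}

// Usage example: the input vector is left exactly as it was.
void medianExample() {
    const std::vector<double> samples{3.0, 1.0, 2.0};
    assert(median(samples) == 2.0);
    assert(samples[0] == 3.0);
}
```

A strict pure keyword that banned every call to a mutating function would reject this, even though no caller can observe the mutation.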

Object Oriented Programming

“OO makes code understandable by encapsulating moving parts. FP makes code understandable by minimizing moving parts.” (Michael Feathers, @mfeathers)

The “moving parts” are mutating states.  Telling an object to change itself is lesson one in a basic object oriented programming book, and it is deeply ingrained in most programmers, but it is anti-functional behavior.  Clearly there is some value in the basic OOP idea of grouping functions with the data structures they operate on, but if you want to reap the benefits of functional programming in parts of your code, you have to back away from some object oriented behaviors in those areas.

Class methods that can’t be const are not pure by definition, because they mutate some or all of the potentially large set of state in the object.  They are not thread safe, and the ability to incrementally poke and prod objects into unexpected states is indeed a significant source of bugs.

Const object methods can still be technically pure if you don’t count the implicit const this pointer against them, but many objects are large enough to constitute a sort of global state all their own, blunting some of the clarity benefits of pure functions.  Constructors can be pure functions, and generally should strive to be – they take arguments and return an object.

At the tactical programming level, you can often work with objects in a more functional manner, but it may require changing the interfaces a bit.  At id we went over a decade with an idVec3 class that had a self-mutating void Normalize() method, but no corresponding idVec3 Normalized() const method.  Many string methods were similarly defined as working on themselves, rather than returning a new copy with the operation performed on it – ToLowerCase(), StripFileExtension(), etc.
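The difference between the two interface styles can be sketched with a small stand-in class. Vec3 below is hypothetical and written for this example. It is not id Software’s idVec3.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a vector class such as idVec3.
struct Vec3 {
    float x, y, z;

    float Length() const { return std::sqrt(x * x + y * y + z * z); }

    // Self-mutating form: changes the object it is called on.
    void Normalize() {
        const float len = Length();
        if (len > 0.0f) {
            x /= len;
            y /= len;
            z /= len;
        }
    }

    // Functional form: leaves *this alone and returns a new value.
    Vec3 Normalized() const {
        Vec3 result = *this;
        result.Normalize();
        return result;
    }
};

// Usage example: the functional form works on const objects and lets the
// result be const as well.
void normalizedExample() {
    const Vec3 direction{3.0f, 0.0f, 4.0f};
    const Vec3 unit = direction.Normalized();
    assert(std::fabs(unit.Length() - 1.0f) < 1e-5f);
    assert(direction.x == 3.0f);  // the original is unchanged
}
```

Note that Normalized() is implemented with the mutating Normalize() on a local copy, so the two forms can share one implementation.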

Performance Implications

In almost all cases, directly mutating blocks of memory is the speed-of-light optimal case, and avoiding this is spending some performance.  Most of the time this is of only theoretical interest; we trade performance for productivity all the time.

Programming with pure functions will involve more copying of data, and in some cases this clearly makes it the incorrect implementation strategy due to performance considerations.  As an extreme example, you can write a pure DrawTriangle() function that takes a framebuffer as a parameter and returns a completely new framebuffer with the triangle drawn into it as a result.  Don’t do that.

Returning everything by value is the natural functional programming style, but relying on compilers to always perform return value optimization can be hazardous to performance, so passing a reference parameter for output of complex data structures is often justifiable.  It does have the unfortunate effect of preventing you from declaring the returned value as const to enforce single assignment.

There will be a strong urge in many cases to just update a value in a complex structure passed in rather than making a copy of it and returning the modified version, but doing so throws away the thread safety guarantee and should not be done lightly.  List generation is often a case where it is justified.  The pure functional way to append something to a list is to return a completely new copy of the list with the new element at the end, leaving the original list unchanged.  Actual functional languages are implemented in ways that make this not as disastrous as it sounds, but if you do this with typical C++ containers you will die.

A significant mitigating factor is that performance today means parallel programming, which usually requires more copying and combining than in a single threaded environment even in the optimal performance case, so the penalty is smaller, while the complexity reduction and correctness benefits are correspondingly larger.  When you start thinking about running, say, all the characters in a game world in parallel, it starts sinking in that the object oriented approach of updating objects has some deep difficulties in parallel environments.  Maybe if all of the objects just referenced a read only version of the world state, and we copied over the updated version at the end of the frame…  Hey, wait a minute…
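That closing thought can be sketched as a pure frame-step function. This is only an illustration: Character, WorldState, updateCharacter and stepWorld are invented names, and the movement rule is a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal world: each character has a position and a speed.
struct Character {
    float x;
    float speed;
};
using WorldState = std::vector<Character>;

// Pure per-character update: reads only the previous frame's world and
// returns the character's new state.
Character updateCharacter(const WorldState& previous, std::size_t index, float dt) {
    Character next = previous[index];
    next.x += next.speed * dt;
    return next;
}

// Builds the next frame's world without modifying the previous one. Each
// iteration reads only `previous` and writes only its own slot in `next`,
// so the loop could be split across threads without locks.
WorldState stepWorld(const WorldState& previous, float dt) {
    WorldState next(previous.size());
    for (std::size_t i = 0; i < previous.size(); ++i) {
        next[i] = updateCharacter(previous, i, dt);
    }
    return next;
}

// Usage example: the old world stays valid while the new one is built.
void stepWorldExample() {
    const WorldState frame0{{0.0f, 2.0f}, {1.0f, -1.0f}};
    const WorldState frame1 = stepWorld(frame0, 0.5f);
    assert(frame1[0].x == 1.0f);
    assert(frame1[1].x == 0.5f);
    assert(frame0[0].x == 0.0f);
}
```

The cost is one extra copy of the world state per frame, which is the kind of copying the paragraph above argues parallel code usually needs anyway.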

 

Action Items

Survey some non-trivial functions in your codebase and track down every bit of external state they can reach, and all possible modifications they can make.  This makes great documentation to stick in a comment block, even if you don’t do anything with it.  If the function can trigger, say, a screen update through your render system, you can just throw your hands up in the air and declare the set of all effects beyond human understanding.

The next task you undertake, try from the beginning to think about it in terms of the real computation that is going on.  Gather up your input, pass it to a pure function, then take the results and do something with it.

As you are debugging code, make yourself more aware of the part mutating state and hidden parameters play in obscuring what is going on.

Modify some of your utility object code to return new copies instead of self-mutating, and try throwing const in front of practically every non-iterator variable you use.
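As a rough sketch of what that last item might look like, here is a small vector type whose operations return new values instead of modifying the object. Vec3 and its member names are placeholders for this example, and normalized() assumes a non-zero length.

```cpp
#include <cmath>

struct Vec3
{
    float x, y, z;

    // Each member is const: it reads *this and returns a new value.
    Vec3 scaled(float s) const { return Vec3{x * s, y * s, z * s}; }
    float length() const { return std::sqrt(x * x + y * y + z * z); }

    // Assumes length() is not zero.
    Vec3 normalized() const
    {
        const float inverseLength = 1.0f / length();
        return scaled(inverseLength);
    }
};
```

A caller would then write something like const Vec3 direction = offset.normalized(); and offset is guaranteed to be unchanged afterwards.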

 

Additional references:

http://www.haskell.org/haskellwiki/Introduction

http://lisperati.com/

http://www.johndcook.com/blog/tag/functional-programming/

http://www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf

http://channel9.msdn.com/Shows/Going+Deep/Lecture-Series-Erik-Meijer-Functional-Programming-Fundamentals-Chapter-1

http://www.cs.utah.edu/~hal/docs/daume02yaht.pdf

http://www.cs.cmu.edu/~crary/819-f09/Backus78.pdf

http://fpcomplete.com/the-downfall-of-imperative-programming/

Kevin Bacon in Video Gaming

Original Author: Alex Norton

The Premise

My project team have the distinct honor of listing actor Kevin Bacon in the ‘Special Thanks’ portion of our credits, but I doubt he has any idea that he’s in there. The fact is, he’s had a very large – albeit unknown – influence on the game’s development.

The game is Malevolence: The Sword of Ahkranox, which I have been project lead on for over two years now, and its engine is quite different to most games due to the fact that the entire game world is procedurally generated, infinite and also persistent.

 

The Challenge

Being infinite AND persistent, the team’s main challenge was to keep the game interesting so long as the player kept playing (not an easy task). We, of course, took the path of procedural item/weapon creation, even going so far as to make the game procedurally generate the graphics for the weapons, to ensure plenty of new gear to find. That, however, can only last for so long, and procedurally generated countryside, dungeons and towns can only entertain a player for so long before they all start to look the same. So we put our heads together and came up with a solution. We all agreed that we couldn’t keep the players interested infinitely, but we can take steps to ensure they get maximum enjoyment and re-playability out of the game while they do play it.

 

The Solution

What we came up with in the end was the quest and dialog system, and the way they interacted. Both were to be procedurally generated and intricate enough to ensure long-time interest from the player. But how does Kevin Bacon get involved with this? The answer lies in his namesake header file:

Some readers will be familiar with the Six Degrees of Kevin Bacon game from 1994. The basic concept was that you could take the name of anyone involved in the Hollywood film industry, whether they be an A-List actor or an isolated gaffer somewhere on an obscure film, and within no more than 6 steps, be able to link them to Kevin Bacon. For example, one of the voice cast in my game Malevolence, Karen Kahler, was in the short film “The Magician” with actress Jackie Zane, who was in the film “Burning Palms” with actor Nick Stahl who acted in the movie “My One and Only” with Kevin Bacon. Thus, her “Bacon Index” is 3 (and through Karen, mine is 4!). Make sense?

Well, it’s that system (or a modification of it) which the NPCs in Malevolence work with.

 

So How Does it Do it?

Let’s say that the player enters a town and talks to an NPC. The game determines that this NPC will have a quest for them, and so the game spreads its feelers out and works out what is relatively close to the town, and how far away each location is. Since the game world is generated procedurally, it does this process dynamically:

First the game procedurally generates the quest. The engine first selects what type of quest to generate and settles on an item centric quest. It then generates an item, and an incident and comes up with a backpack which was lost. Once this is done, the NPC tells the player that they need help recovering their backpack from a dungeon that they were exploring. Only the catch is, they fled the dungeon so quickly that they don’t remember where it was.

While this is happening, the engine consults the memory map shown above and looks at the area around the town for a few kilometers, then chooses a dungeon that is close enough to not be too far away, and far enough away that the player will have to search for it. However, once it has found a dungeon, it doesn’t let the player know where it is like most games. It is now the player’s mission to search for it.
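The selection step can be sketched roughly as follows. This is not the engine's actual code: Location, pickQuestDungeon and the two distance limits are invented for the illustration, and the real game generates its candidates procedurally rather than reading them from a list.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Location { float x; float y; };

// Returns the index of the first dungeon whose distance from the town lies
// within [minDistance, maxDistance], or -1 if no dungeon qualifies.
int pickQuestDungeon(const Location& town,
                     const std::vector<Location>& dungeons,
                     float minDistance, float maxDistance)
{
    for (std::size_t i = 0; i < dungeons.size(); ++i)
    {
        const float dx = dungeons[i].x - town.x;
        const float dy = dungeons[i].y - town.y;
        const float distance = std::sqrt(dx * dx + dy * dy);
        if (distance >= minDistance && distance <= maxDistance)
            return static_cast<int>(i);
    }
    return -1;
}
```

The lower bound is what forces the player to search, and the upper bound keeps the quest from sending them across the world.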

 

Finding the Unfindable

Now that you know that there is a dungeon out there somewhere with a backpack in it, you can ask around to get more info. The engine, however, is processing the dungeon’s “Kevin Bacon Index” in the background. The main difference, however, is that not everyone knows everyone else perfectly well, so if one person gives you information, they may not be 100% sure about the information. So when you get a map of people like this:

You can see the percentages between them all. That shows the familiarity of the characters between them. So, if you speak to NPC ‘A’ and ask them about nearby dungeons, they will tell you that they have no idea about that sort of thing, but their friends Steve and Kyle might. If you speak to Steve, he may refer you to Kyle or pass you on to Bob the Blacksmith, who he’s fairly sure knows a mapmaker and a woodsman, who would probably have a better idea about dungeons in the area. When you speak to Bob the Blacksmith, he’ll tell you about Keith the Woodsman, who is familiar with the local wilderness, but will more likely put you on to Kevin the Mapmaker, who knows Keith the Woodsman quite well and may be able to help you himself (with his maps). These NPCs may all be in the same town, or they may be spread between multiple towns. It’s all generated by the procedural engine, but it’s how the player FOLLOWS the path that defines how well the quest will work out.

If a player is clever and good at deduction, they may have an easy time of it – for example, if they followed the path ABDEF, while it may not be the most direct route to Keith, they will get some accurate maps out of it, and maybe even a new weapon to help clear out the dungeon to find the backpack. But if they don’t follow good advice, they may have a far less fortuitous way. And keep in mind that due to the lack of familiarity between certain NPCs, sometimes the player will get false information in their searching, which can slow them down quite a bit.
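One way to picture the bookkeeping behind this is a graph of NPCs with a familiarity value on each link. The sketch below is only an illustration under stated assumptions: the type and function names are invented, the "index" is computed with a plain breadth-first search, and treating the reliability of a path as the product of the familiarity values along it is my assumption, not something the engine is known to do.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

struct Acquaintance { int npc; float familiarity; };  // familiarity in [0, 1]
using SocialGraph = std::vector<std::vector<Acquaintance>>;

// Breadth-first search: number of introductions needed to get from 'start'
// to 'target', or -1 if no chain of acquaintances connects them.
int degreesOfSeparation(const SocialGraph& graph, int start, int target)
{
    std::vector<int> hops(graph.size(), -1);
    std::queue<int> pending;
    hops[start] = 0;
    pending.push(start);
    while (!pending.empty())
    {
        const int current = pending.front();
        pending.pop();
        if (current == target)
            return hops[current];
        for (const Acquaintance& link : graph[current])
        {
            if (hops[link.npc] == -1)
            {
                hops[link.npc] = hops[current] + 1;
                pending.push(link.npc);
            }
        }
    }
    return -1;
}

// Assumed model: information survives each hop with probability equal to the
// familiarity of that link, so a path's reliability is the product.
float pathReliability(const SocialGraph& graph, const std::vector<int>& path)
{
    float reliability = 1.0f;
    for (std::size_t i = 0; i + 1 < path.size(); ++i)
    {
        float familiarity = 0.0f;  // stays 0 if the two NPCs are not linked
        for (const Acquaintance& link : graph[path[i]])
        {
            if (link.npc == path[i + 1])
                familiarity = link.familiarity;
        }
        reliability *= familiarity;
    }
    return reliability;
}
```

Under this model a longer path through NPCs who know each other well can be more trustworthy than a short path through near strangers, which matches the idea that the most direct route is not always the best one.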

 

That’s Very Long Winded

Very true, but not all quests in the game will work like this. The quests are divided into two quest types. There are the multi-tiered quests, as mentioned above, and “b*tch quests” which are your standard “there’s a dungeon, clean it out” or “there are rats in my basement, kill them”. On top of that, even when a player is assigned a multi-tiered quest, sometimes they will have 3 steps to complete, sometimes they will have 20, it all depends on how the cookie crumbles in the procedural generation.

 

Conclusion

Making a game which is infinite AND persistent has provided countless challenges to us as a dev team, but the solutions to intricate problems are often the most unique. I hope you enjoyed reading about this little section of our game, and if you ever end up playing it and you see Kevin Bacon in the credits, you’ll now know why.

Pushing the Button More Carefully

Original Author: Alex Norton

Hi all, first post on here, but it’s a topic that I feel particularly strongly about and I decided I would share my thoughts. Please keep in mind that all views expressed here are purely my opinion and I in no way intend any offense.

Game Design

Game design is always about looking back before looking forward. Sometimes this is done consciously, other times it is done unconsciously, but it always happens. Every great new idea is built on improving one or more old ideas, and the best game designers are well aware of this. One of my biggest pet peeves is when someone says to me “don’t reinvent the wheel”, which they often quickly regret saying as I begin to lecture them on how if no-one ever reinvented the wheel we would never have tyres, suspension, alignments, treading, etc. All things which have made the wheel more efficient, smoother and just generally better.

A great design for a reinvented wheel

Inheriting Velocity in Ragdolls

Original Author: Niklas Frykholm

After a slew of abstract articles about C++ and code structuring I’d like to get back to some more meaty game engine stuff. So today I’ll talk about ragdolls. In particular, how to preserve the momentum of animated objects, so that when you switch over to the ragdoll it continues to stumble forward in the same direction that the animation was moving, before crashing to a gruesome death.

So this is a small, but important problem. We want to somehow get the velocities of the animated objects and then apply them to the bodies in the ragdoll. The only snag is that animated objects typically don’t know anything about velocities. Also, we need some way of matching up the physics bodies with the animated objects.

First, some background information. In the Bitsquid engine, physics, scene graph and animation are completely separate systems. We strongly believe in minimizing the couplings between different systems since that makes the engine easier to understand, reason about, modify, optimize and rewrite.

  • The physics system simulates a number of bodies, possibly connected by joints.

  • The scene graph handles local-to-world transforms for a collection of nodes in a hierarchy.

  • The animation system evaluates and blends animation curves for bones.

Bones and bodies hold references (just integer indices, really) to nodes in the scene graph and this is how the systems communicate. After the animation has been evaluated, the resulting local transforms are written to the bones’ nodes in the scene graph.

For keyframed physics (animated hit bodies), the animation drives the physics, which means the physics’ bodies will read their world transforms from the corresponding nodes in the scene graph. For ragdolled physics, the world transforms of the bodies are written to the scene graph after the simulation has completed.

For partial ragdolls (such as a non-functioning, but still attached limb) or powered ragdolls (ragdolls driven by motors to achieve animation poses) it gets a little more involved (perhaps a topic for a future post), but the basic setup is the same.

Given this setup there are two ways of calculating the animation velocities:

  • We can calculate the velocities directly by differentiating the animation curves.

  • We can record a node’s transform at two different time steps and compute the velocity from the difference.

The first approach is doable, but not very practical. Not only do we have to differentiate all the animation curves, we must also take into account how those velocities are affected by the blend tree and local-to-world transforms. And even if we do all that, we still don’t account for movements from other sources than animation, such as scripted movements, IK or interactions with the character controller.

The second option is the more reasonable one. Now all we need is a way of obtaining the transforms from two different time steps. There are a number of possible options:

  • We could add an array of Matrix4x4:s, last_world, to our scene graph, where we store the last world transform of every object. So whenever we want to go to ragdoll we always have a last_world transform to calculate velocities from.

  • We could simulate the character backwards in time when we want to go to ragdoll and obtain a last_world transform that way.

  • We could delay the transition to ragdoll one frame, so that we have enough time to gather two world transforms for computing the velocity.

The first approach is conceptually simple, but costly. We are increasing the size of all our scene graphs by about 50 % (previously they contained local and world transforms, now they will also need last_world). In addition we must memcpy(last_world, world) before we compute new world transforms. That’s a significant cost to pay all the time for something that happens very seldom (transition to ragdoll).

The second approach sounds a bit crazy, but some games actually already have this functionality. Servers in competitive multi-player FPS games often need to rewind players in time in order to accurately determine if they were able to hit each other. Still, I find the approach to be a bit too complicated and involved just to get a velocity.

The third approach seems simple and cheap, but it violates one of our Bitsquid principles: Thou Shalt Not Have Any Frame Delays. Delaying something a frame can be a quick fix to many hairy problems, but it puts your game in a very weird transitional state where it at the same time both is and isn’t (yet) something. The character isn’t really a ragdoll yet, but it will be the next frame, whether I want it to or not.

This new slightly self-contradictory state invites a host of bugs and before you know it, little logic pieces will start to seep into the code base: “do this unless you are in the special transition-to-ragdoll state”. Congratulations, you have just made your codebase a lot more complicated and bug prone.

If this is not enough, consider the poor sucker who just wants to write a routine that does A, B, C and D, when A, B and C require frame delays. Suddenly what was supposed to be a simple function got turned into a state machine that needs to run for four frames to produce its result.

The simple rule that actions should take place immediately protects against such insanity.

So three options, none of them especially palatable.

I actually went with the first one, to always compute and store last_world in the scene graph, but with a flag so that this is only used for units that actually need it (characters that can go to ragdoll). When there is no clear winner, I always pick the simplest solution, because it is a lot easier to optimize later if the need should arise. (We could for example track last_world only for the nodes which have a corresponding ragdoll actor. Also we could store last_world as (p,q) instead of as a matrix.)
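A minimal sketch of that bookkeeping, with the flag, might look like the following. The names SceneGraph, track_last_world and begin_world_update are placeholders chosen for this example and are not the engine's real interface.

```cpp
#include <cstring>
#include <vector>

struct Matrix4x4 { float m[16]; };

struct SceneGraph
{
    bool track_last_world;              // only set for units that can go to ragdoll
    std::vector<Matrix4x4> local;
    std::vector<Matrix4x4> world;
    std::vector<Matrix4x4> last_world;  // stays empty when tracking is off
};

// Call once per frame, before the new world transforms are computed.
void begin_world_update(SceneGraph& sg)
{
    if (!sg.track_last_world || sg.world.empty())
        return;
    sg.last_world.resize(sg.world.size());
    std::memcpy(sg.last_world.data(), sg.world.data(),
                sg.world.size() * sizeof(Matrix4x4));
}
```

Scene graphs that leave the flag off pay neither the extra memory nor the per-frame copy.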

For completeness, given the two transforms, the code for computing the velocities will look something like this:

Vector3 p0 = translation(tm_0);
Vector3 p1 = translation(tm_1);
Vector3 velocity = (p1 - p0) / dt;

Quaternion q0 = rotation(tm_0);
Quaternion q1 = rotation(tm_1);
Quaternion q = q1 * inverse(q0);
AxisAngle aa = q.decompose();
Vector3 angular_velocity = aa.axis * aa.angle / dt;
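For readers who want to try this outside an engine, here is a self-contained sketch of the same computation. The small Vector3 and Quaternion structs and the helper functions are stand-ins written for this example. It assumes unit quaternions (so the conjugate serves as the inverse), and it adds one step that the snippet above does not show: flipping the sign of the delta quaternion so that the rotation is measured the short way around.

```cpp
#include <cmath>

struct Vector3 { float x, y, z; };
struct Quaternion { float x, y, z, w; };

Vector3 linearVelocity(const Vector3& p0, const Vector3& p1, float dt)
{
    return Vector3{(p1.x - p0.x) / dt, (p1.y - p0.y) / dt, (p1.z - p0.z) / dt};
}

// For a unit quaternion the conjugate is the inverse.
Quaternion conjugate(const Quaternion& q)
{
    return Quaternion{-q.x, -q.y, -q.z, q.w};
}

// Hamilton product a * b.
Quaternion multiply(const Quaternion& a, const Quaternion& b)
{
    return Quaternion{
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w,
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z};
}

Vector3 angularVelocity(const Quaternion& q0, const Quaternion& q1, float dt)
{
    Quaternion q = multiply(q1, conjugate(q0));
    // q and -q describe the same rotation; pick the one with the smaller angle.
    if (q.w < 0.0f)
        q = Quaternion{-q.x, -q.y, -q.z, -q.w};

    const float sinHalfAngle = std::sqrt(q.x * q.x + q.y * q.y + q.z * q.z);
    if (sinHalfAngle < 1e-6f)
        return Vector3{0.0f, 0.0f, 0.0f};  // no measurable rotation

    // axis = (x, y, z) / sinHalfAngle, so axis * angle / dt collapses to:
    const float angle = 2.0f * std::atan2(sinHalfAngle, q.w);
    const float scale = angle / (sinHalfAngle * dt);
    return Vector3{q.x * scale, q.y * scale, q.z * scale};
}
```

As a sanity check, a quarter turn about the z axis over half a second gives an angular velocity of roughly pi radians per second about z.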

This has also been posted to The Bitsquid Blog.


Exceptional Floating Point

Original Author: Bruce Dawson

Floating-point math has an answer for everything, but sometimes that’s not what you want. Sometimes instead of getting an answer to the question sqrt(-1.0) (it’s NaN) it’s better to know that your software is asking imaginary questions.

The IEEE standard for floating-point math defines five exceptions that shall be signaled when certain conditions are detected. Normally the flags for these exceptions are raised (set), a default result is delivered, and execution continues. This default behavior is often desirable, especially in a shipping game, but during development it can be useful to halt when an exception is signaled.

Halting on exceptions can be like adding an assert to every floating-point operation in your program, and can therefore be a great way to improve code reliability, and find mysterious behavior at its root cause.

This article is part of a series on floating-point. The complete list of articles in the series is:

Let’s get it started again

The five exceptions mandated by the IEEE floating-point standard are:

  1. Invalid operation: this is signaled if there is no usefully definable result, such as zero divided by zero, infinity minus infinity, or sqrt(-1). The default result is a NaN (Not a Number)
  2. Division by zero: this is signaled when dividing a non-zero number by zero. The result is a correctly signed infinity.
  3. Overflow: this is signaled when the rounded result won’t fit. The default result is a correctly signed infinity.
  4. Underflow: this is signaled when the result is non-zero and between -FLT_MIN and FLT_MIN. The default result is the rounded result.
  5. Inexact: this is signaled any time the result of an operation is not exact. The default result is the rounded result.

The underflow exception is usually not of interest to game developers – it happens rarely, and usually doesn’t detect anything of interest. The inexact result is also usually not of interest to game developers – it happens frequently (although not always, and it can be useful to understand what operations are exact) and usually doesn’t detect anything of interest.

That leaves invalid operation, division by zero, and overflow. In the context of game development these are usually truly exceptional. They are rarely done intentionally, so they usually indicate a bug. In many cases these bugs are benign, but occasionally these bugs indicate real problems. From now on I’ll refer to these first three exceptions as being the ‘bad’ exceptions and assume that game developers would like to avoid them, if only so that the exceptions can be enabled without causing crashes during normal game play.

When can divide by zero be useful?

While the ‘bad’ exceptions typically represent invalid operations in the context of games, this is not necessarily true in all contexts. The default result (infinity) of division by zero can allow a calculation to continue and produce a valid result, and the default result (NaN) of invalid operation can sometimes allow a fast algorithm to be used and, if a NaN result is produced, a slower and more robust algorithm to be used instead.

The classic example of the value of the division by zero behavior is calculation of parallel resistance. The formula for this for two resistors with resistance R1 and R2 is:

R = 1 / (1/R1 + 1/R2)

Because division by zero gives a result of infinity, and because infinity plus another number gives infinity, and because a finite number divided by infinity gives zero, this calculation calculates the correct parallel resistance of zero when either R1 or R2 is zero. Without this behavior the code would need to check for both R1 and R2 being zero and handle that case specially.

In addition, this calculation will give a result of zero if R1 or R2 are very small – smaller than the reciprocal of FLT_MAX or DBL_MAX. This zero result is not technically correct. If a programmer needs to distinguish between these scenarios then monitoring of the overflow and division by zero flags will be needed.
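Written as code, the calculation is a one-liner. The function name is made up for this sketch, and it relies on the default IEEE behavior with the division by zero exception masked.

```cpp
// Parallel resistance of two resistors. With default IEEE behavior a zero
// resistance produces 1/0 = infinity, and 1/infinity = 0, so no special
// case is needed for a short circuit.
float parallelResistance(float r1, float r2)
{
    return 1.0f / (1.0f / r1 + 1.0f / r2);
}
```

If the ‘bad’ exceptions are enabled, the same call with a zero argument will halt on the division instead of quietly returning zero, which is exactly the trade-off discussed above.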

The interpretation of divide-by-zero as infinity bothers some as can be seen in this official interpretation request/response, which explains the decision quite well.

Resistance is futile

Assuming that we are not trying to make use of the divide-by-zero behavior we need a convenient way of turning on the ‘bad’ floating-point exceptions. And, since we have to coexist with other code (calling out to physics libraries, D3D, and other code that may not be ‘exception clean’) we also need a way of temporarily turning off all floating-point exceptions.

The appropriate way to do this is with a pair of classes whose constructors and destructors do the necessary magic. Here are some classes that do that, for VC++:

#include <float.h> // _controlfp_s, _clearfp, _MCW_EM, _EM_*

// Declare an object of this type in a scope in order to suppress
// all floating-point exceptions temporarily. The old exception
// state will be reset at the end.
class FPExceptionDisabler
{
public:
    FPExceptionDisabler()
    {
        // Retrieve the current state of the exception flags. This
        // must be done before changing them. Passing a mask of zero
        // reads the control word without modifying it.
        _controlfp_s(&mOldValues, 0, 0);
        // Set all of the exception flags, which suppresses FP
        // exceptions on the x87 and SSE units. _MCW_EM is a bit
        // mask representing all available exception masks.
        _controlfp_s(0, _MCW_EM, _MCW_EM);
    }
    ~FPExceptionDisabler()
    {
        // Clear any pending FP exceptions. This must be done
        // prior to enabling FP exceptions since otherwise there
        // may be a 'deferred crash' as soon the exceptions are
        // enabled.
        _clearfp();

        // Reset (possibly enabling) the exception status.
        _controlfp_s(0, mOldValues, _MCW_EM);
    }

private:
    unsigned int mOldValues;

    // Make the copy constructor and assignment operator private
    // and unimplemented to prohibit copying.
    FPExceptionDisabler(const FPExceptionDisabler&);
    FPExceptionDisabler& operator=(const FPExceptionDisabler&);
};

// Declare an object of this type in a scope in order to enable a
// specified set of floating-point exceptions temporarily. The old
// exception state will be reset at the end.
// This class can be nested.
class FPExceptionEnabler
{
public:
    // Overflow, divide-by-zero, and invalid-operation are the FP
    // exceptions most frequently associated with bugs.
    FPExceptionEnabler(unsigned int enableBits = _EM_OVERFLOW |
                _EM_ZERODIVIDE | _EM_INVALID)
    {
        // Retrieve the current state of the exception flags. This
        // must be done before changing them. Passing a mask of zero
        // reads the control word without modifying it.
        _controlfp_s(&mOldValues, 0, 0);

        // Make sure no non-exception flags have been specified,
        // to avoid accidental changing of rounding modes, etc.
        // _MCW_EM is a bit mask representing all available
        // exception masks.
        enableBits &= _MCW_EM;

        // Clear any pending FP exceptions. This must be done
        // prior to enabling FP exceptions since otherwise there
        // may be a 'deferred crash' as soon the exceptions are
        // enabled.
        _clearfp();

        // Zero out the specified bits, leaving other bits alone.
        _controlfp_s(0, ~enableBits, enableBits);
    }
    ~FPExceptionEnabler()
    {
        // Reset the exception state.
        _controlfp_s(0, mOldValues, _MCW_EM);
    }

private:
    unsigned int mOldValues;

    // Make the copy constructor and assignment operator private
    // and unimplemented to prohibit copying.
    FPExceptionEnabler(const FPExceptionEnabler&);
    FPExceptionEnabler& operator=(const FPExceptionEnabler&);
};

The comments explain a lot of the details, but I’ll mention a few here as well.

_controlfp_s is the secure version of the portable version of the old _control87 function. _controlfp_s controls exception settings for both the x87 and SSE FPUs. It can also be used to control rounding directions on both FPUs, and on the x87 FPU it can be used to control the precision settings. These classes use the mask parameter to ensure that only the exception settings are altered.

The floating-point exception flags are sticky, so when an exception flag is raised it will stay set until explicitly cleared. This means that if you choose not to enable floating-point exceptions you can still detect whether any have happened. And – not so obviously – if the exception associated with a flag is enabled after the flag is raised then an exception will be triggered on the next FPU instruction, even if that is several weeks after the operation that raised the flag. Therefore it is critical that the exception flags be cleared each time before exceptions are enabled.
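The stickiness is easy to observe with the standard <cfenv> functions, which read and clear the same status flags without enabling any traps. This is a portable sketch rather than the VC++ calls used in the classes above, and the function name is invented for the example.

```cpp
#include <cfenv>

// Returns true if the divide-by-zero flag is still set after a later,
// well-behaved operation, and is gone once it has been cleared explicitly.
bool flagSurvivesLaterArithmetic()
{
    volatile float zeroSource = 0.0f;
    volatile float sink = 0.0f;

    std::feclearexcept(FE_ALL_EXCEPT);
    const float zero = zeroSource;  // read after the clear so the divide cannot move above it
    sink = 1.0f / zero;             // raises FE_DIVBYZERO
    sink = zero + 2.0f;             // exact operation, raises nothing

    const bool stillSet = std::fetestexcept(FE_DIVBYZERO) != 0;
    std::feclearexcept(FE_DIVBYZERO);
    const bool cleared = std::fetestexcept(FE_DIVBYZERO) == 0;
    return stillSet && cleared;
}
```

The same property is what makes the 'deferred crash' possible: a flag raised long ago is still sitting there when the corresponding exception is finally unmasked.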

Typical usage

The floating-point exception flags are part of the processor state which means that they are per-thread settings. Therefore, if you want exceptions enabled everywhere you need to do it in each thread, typically in main/WinMain and in your thread start function, by dropping an FPExceptionEnabler object at the top of these functions.

When calling out to D3D or any code that may use floating-point in a way that triggers these exceptions you need to drop in an FPExceptionDisabler object.

Alternately, if most of your code is not FP exception clean then it may make more sense to leave FP exceptions disabled most of the time and then enable them in particular areas, such as particle systems.

Because there is some cost associated with changing the exception state (the FPU pipelines will be flushed at the very least) and because making your code more crashy is probably not what you want for your shipping game you should put #ifdefs in the constructors and destructors so that these objects become NOPs in your retail builds.
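One possible shape for those #ifdefs is sketched below. RETAIL_BUILD is a hypothetical macro standing in for whatever your build system defines for shipping builds, and ScopedFPExceptionEnabler and FP_EXCEPTION_CHECKS are names invented for this example. On compilers other than VC++ the class also compiles down to nothing.

```cpp
// RETAIL_BUILD is a hypothetical macro; substitute your own.
#if defined(_MSC_VER) && !defined(RETAIL_BUILD)
#include <float.h>
#define FP_EXCEPTION_CHECKS 1
#else
#define FP_EXCEPTION_CHECKS 0
#endif

class ScopedFPExceptionEnabler
{
public:
    ScopedFPExceptionEnabler()
    {
#if FP_EXCEPTION_CHECKS
        // Read the control word (a zero mask changes nothing), clear any
        // stale flags, then unmask the three 'bad' exceptions.
        _controlfp_s(&mOldValues, 0, 0);
        _clearfp();
        const unsigned int enableBits =
            _EM_OVERFLOW | _EM_ZERODIVIDE | _EM_INVALID;
        _controlfp_s(0, ~enableBits, enableBits);
#endif
    }

    ~ScopedFPExceptionEnabler()
    {
#if FP_EXCEPTION_CHECKS
        // Restore the exception masks that were active on entry.
        _controlfp_s(0, mOldValues, _MCW_EM);
#endif
    }

private:
#if FP_EXCEPTION_CHECKS
    unsigned int mOldValues;
#endif

    // Private and unimplemented to prohibit copying.
    ScopedFPExceptionEnabler(const ScopedFPExceptionEnabler&);
    ScopedFPExceptionEnabler& operator=(const ScopedFPExceptionEnabler&);
};
```

Call sites stay identical in every configuration, so nothing outside the class needs its own conditional compilation.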

There have been various instances in the past (printer drivers from a manufacturer who shall not be named) that would enable floating-point exceptions and leave them enabled, meaning that some perfectly legitimate software would start crashing after calling into third-party code (such as after printing). Having somebody’s hapless code crash after calling a function in your code is a horrible experience, so be particularly careful if your code may end up injected into other processes. In that situation you definitely need to not leave floating-point exceptions enabled when you return, and you may need to be tolerant of being called with floating-point exceptions enabled.

Performance implications of exceptions

Raising the exception flags (triggering a floating-point exception) should have no performance implications. These flags are raised frequently enough that any CPU designer will make sure that doing so is free. For example, the inexact flag is raised on virtually every floating-point instruction.

However, having exceptions enabled can be expensive. Delivering precise exceptions on super-scalar CPUs can be challenging and some CPUs choose to implement this by disabling FPU parallelism when floating-point exceptions are enabled. This hurts performance. The PowerPC CPU used in the Xbox 360 (and presumably the one used in the PS3) slows down significantly when any floating-point exceptions are enabled. This means that when using this technique on these processors you should only enable FPU exceptions on an as-needed basis.

Sample code

The sample code below calls TryDivByZero() three times – once in the default environment, once with the three ‘bad’ floating-point exceptions enabled, and once with them suppressed again. TryDivByZero does a floating-point divide-by-zero inside a Win32 __try/__except block in order to catch exceptions, print a message, and allow the tests to continue. This type of structured exception handling block should not (repeat not) be used in production code, except possibly to record crashes and then exit. I hesitate to demonstrate this technique because I fear it will be misused. Continuing after unexpected structured exceptions is pure evil.

With that said, here is the code:

// Requires the FPExceptionEnabler and FPExceptionDisabler classes
// shown earlier.
#include <windows.h>
#include <stdio.h>
#include <float.h>

int __cdecl DescribeException(PEXCEPTION_POINTERS pData, const char *pFunction)
{
    // Clear the exception or else every FP instruction will
    // trigger it again.
    _clearfp();

    DWORD exceptionCode = pData->ExceptionRecord->ExceptionCode;
    const char* pDescription = NULL;
    switch (exceptionCode)
    {
    case STATUS_FLOAT_INVALID_OPERATION:
        pDescription = "float invalid operation";
        break;
    case STATUS_FLOAT_DIVIDE_BY_ZERO:
        pDescription = "float divide by zero";
        break;
    case STATUS_FLOAT_OVERFLOW:
        pDescription = "float overflow";
        break;
    case STATUS_FLOAT_UNDERFLOW:
        pDescription = "float underflow";
        break;
    case STATUS_FLOAT_INEXACT_RESULT:
        pDescription = "float inexact result";
        break;
    case STATUS_FLOAT_MULTIPLE_TRAPS:
        // This seems to occur with SSE code.
        pDescription = "float multiple traps";
        break;
    default:
        pDescription = "unknown exception";
        break;
    }

    void* pErrorOffset = 0;
#if defined(_M_IX86)
    void* pIP = (void*)pData->ContextRecord->Eip;
    pErrorOffset = (void*)pData->ContextRecord->FloatSave.ErrorOffset;
#elif defined(_M_X64)
    void* pIP = (void*)pData->ContextRecord->Rip;
#else
    #error Unknown processor
#endif

    printf("Crash with exception %x (%s) in %s at %p!\n",
            exceptionCode, pDescription, pFunction, pIP);

    if (pErrorOffset)
    {
        // Float exceptions may be reported in a delayed manner -- report the
        // actual instruction as well.
        printf("Faulting instruction may actually be at %p.\n", pErrorOffset);
    }

    // Return this value to execute the __except block and continue as if
    // all was fine, which is a terrible idea in shipping code.
    return EXCEPTION_EXECUTE_HANDLER;
    // Return this value to let the normal exception handling process
    // continue after printing diagnostics/saving crash dumps/etc.
    //return EXCEPTION_CONTINUE_SEARCH;
}

static float g_zero = 0;

void TryDivByZero()
{
    __try
    {
        float inf = 1.0f / g_zero;
        printf("No crash encountered, we successfully calculated %f.\n", inf);
    }
    __except(DescribeException(GetExceptionInformation(), __FUNCTION__))
    {
        // Do nothing here - DescribeException() has already done
        // everything that is needed.
    }
}

int main(int argc, char* argv[])
{
#if _M_IX86_FP == 0
    const char* pArch = "with the default FPU architecture";
#elif _M_IX86_FP == 1
    const char* pArch = "/arch:sse";
#elif _M_IX86_FP == 2
    const char* pArch = "/arch:sse2";
#else
#error Unknown FP architecture
#endif
    // sizeof yields a size_t, so cast it to match the %d format.
    printf("Code is compiled for %d bits, %s.\n", (int)(sizeof(void*) * 8), pArch);

    // Do an initial divide-by-zero.
    // In the registers window if display of Floating Point
    // is enabled then the STAT register will have 4 ORed
    // into it, and the floating-point section's EIP register
    // will be set to the address of the instruction after
    // the fdiv.
    printf("\nDo a divide-by-zero in the default mode.\n");
    TryDivByZero();
    {
        // Now enable the default set of exceptions. If the
        // enabler object doesn't call _clearfp() then we
        // will crash at this point.
        FPExceptionEnabler enabled;
        printf("\nDo a divide-by-zero with FP exceptions enabled.\n");
        TryDivByZero();
        {
            // Now let's disable exceptions and do another
            // divide-by-zero.
            FPExceptionDisabler disabled;
            printf("\nDo a divide-by-zero with FP exceptions disabled.\n");
            TryDivByZero();
        }
    }

    return 0;
}

Typical output is:

[Image: console output of the sample program]

When generating SSE code I sometimes see STATUS_FLOAT_MULTIPLE_TRAPS instead of STATUS_FLOAT_DIVIDE_BY_ZERO. This is slightly less helpful, but the root cause should be straightforward to determine.

That said, determining the root cause can be slightly tricky. On the x87 FPU, floating-point exception reporting is delayed. Your program won’t actually crash until the next floating-point instruction after the problematic one. In the example below the fdiv does the divide by zero, but the crash doesn’t happen until the fstp after.

011A10DD fdiv        dword ptr [__fmode+4 (11A3374h)] 
011A10E3 fstp        dword ptr [ebp-1Ch]

Normally it is easy enough to look back one instruction to find the culprit, but sometimes the gap can be long enough to cause confusion. Luckily the CPU records the address of the actual faulting instruction and this can be retrieved from the exception record. This value is printed out when applicable in my exception handler, or you can see it in the Visual Studio registers window.

The sample code can be downloaded as a Visual C++ 2010 project (32-bit and 64-bit) from here:

ftp://ftp.cygnus-software.com/pub/FloatExceptions.zip

Handle and continue

If you want to get really crazy/sophisticated then it is possible to catch a floating-point exception with __try/__except, handle it in some domain-specific way (for example, handling overflow by scaling down the result and recording that you did so) and then resume. This is sufficiently esoteric that I have no more to say about it; consult the documentation for _fpieee_flt if this sounds interesting.

SIMD

SSE and its SIMD instructions throw a few wrinkles into the mix. One thing to be aware of is that instructions like reciprocal estimate (rcpps) never trigger divide-by-zero exceptions – they just silently generate infinity. Therefore they are a way that infinity can be generated even when the ‘bad’ exceptions are enabled.
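As a rough illustration of the rcpps behavior, the sketch below calls the SSE intrinsic _mm_rcp_ps on zero and then inspects the standard <cfenv> status flags. It assumes an x86 target with SSE. The helper names reciprocalEstimate and demonstrateSilentInfinity are made up for this example and are not part of the sample project. It checks status flags rather than enabling traps, so it only shows that no divide-by-zero is signaled.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <xmmintrin.h>

// Hypothetical helper: runs rcpps on a register filled with x and returns lane 0.
inline float reciprocalEstimate(float x) {
    return _mm_cvtss_f32(_mm_rcp_ps(_mm_set1_ps(x)));
}

// Hypothetical demo: rcpps(0) yields infinity without raising divide-by-zero.
inline void demonstrateSilentInfinity() {
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float zero = 0.0f;  // volatile so the value is read at run time
    const float r = reciprocalEstimate(zero);
    assert(std::isinf(r));                         // infinity was generated
    assert(std::fetestexcept(FE_DIVBYZERO) == 0);  // but nothing was signaled
}
```

A regular divps or divss on the same input would set the divide-by-zero flag, which is why the estimate instructions need separate attention when hunting for the source of an infinity.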

Additionally, many common patterns for SIMD instructions only use some components of the four-wide registers. This could be because the code is operating on a three-float vector, or it could be because the code is operating on an array of floats whose length is not a multiple of four. Either way, the ‘unused’ component or components in the registers may end up triggering floating-point exceptions. These exceptions are false positives (they don’t indicate a bug), but they must be dealt with in order to allow floating-point exceptions to be enabled. The best way to deal with this is to ensure that the unused components are filled with valid data, at least in the development builds where floating-point exceptions are enabled. Filling them with one or zero is generally good enough.

Filling the unused components with valid values may also improve performance. Some CPUs drop to microcode when they encounter some ‘special’ numbers (NaNs, infinities, and/or denormals) and using well-behaved values avoids that risk.
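One way to fill the unused lanes is to pad the array itself, sketched below. The helper name padToMultipleOfFour and the choice of std::vector<float> are assumptions for illustration; real SIMD code would usually also care about alignment, which this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: grows `values` to the next multiple of four elements,
// filling the new slots with `fill` so that four-wide loads never pick up
// garbage. Returns the number of elements that hold real data.
inline std::size_t padToMultipleOfFour(std::vector<float>& values, float fill = 1.0f) {
    const std::size_t used = values.size();
    const std::size_t padded = (used + 3) / 4 * 4;  // round up to a multiple of four
    values.resize(padded, fill);
    assert(values.size() % 4 == 0);
    return used;
}
```

A fill value of one is a reasonable default when the padded lanes might be used as divisors; zero is fine when they are only added or multiplied.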

Practical experience

On some projects I have been able to enable these three floating-point exceptions, fix all of the accidental-but-unimportant exceptions, and then find a few crucial bugs hidden in the weeds. On these projects, enabling floating-point exceptions during development was crucial. On other projects – big messy projects with a lot of history and large teams – I was unable to get the team to buy off on the concept, so it ultimately didn’t work.

Your mileage may vary, but as with asserts of any type, enabling them early, and ensuring that violations get fixed promptly, is the trick to getting value from floating-point exceptions. Adding them to a large existing codebase is trickier, but can be dealt with by only enabling them in particular parts of the code where their value exceeds their cost.
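For codebases where trapping is not yet an option, a lighter-weight variation on the same scoping idea is to clear the floating-point status flags on entry to a region and test them afterwards. The sketch below is not the approach used by the classes in this article: it uses only the standard <cfenv> functions, it never crashes the program, and the class name FPFlagScope is hypothetical. It assumes the three exceptions of interest are divide-by-zero, invalid operation and overflow.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: clears the status flags when constructed so that a
// later query only reports exceptions raised inside this scope.
class FPFlagScope {
public:
    FPFlagScope() { std::feclearexcept(FE_ALL_EXCEPT); }

    // True if divide-by-zero, invalid operation or overflow was raised
    // since construction.
    bool sawBadException() const {
        return std::fetestexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW) != 0;
    }
};

// Hypothetical demo: a divide-by-zero inside the scope is detected.
inline void demonstrateFlagScope() {
    FPFlagScope scope;
    volatile float zero = 0.0f;  // volatile so the division happens at run time
    volatile float result = 1.0f / zero;
    (void)result;
    assert(scope.sawBadException());
}
```

This only tells you that something went wrong somewhere in the region, not which instruction did it, so it is best seen as a way to narrow down where real exceptions should be enabled.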

Practical experience, hot off the presses

I’ve been trying to improve the usability of debug builds on my current project and one persistent problem was a NaN that would show up in the particle system early on, triggering many different asserts. I couldn’t tell where this NaN was being generated so I knew I had to enable floating-point exceptions, using the classes described above. This project was not designed to have floating-point exceptions enabled so there were several challenges. The process was:

  • Enable floating-point exceptions in three key functions that called out to all of the particle system code
  • Disable floating-point exceptions in one child function that did floating-point overflow by design
  • Pad one of our arrays of particle system data with valid data to the next multiple of four so that the unused SIMD lanes wouldn’t trigger spurious exceptions
  • Find and fix five bugs that were causing floating-point exceptions

It worked. All of the bugs were worth fixing, and one of them was the source of the NaNs. After most of a day of investigation the crucial fix was to change one letter – from ‘e’ to ‘t’ – and this was enough to prevent us from dividing zero by zero. Now our debug builds are significantly more usable, and a genuine bug that was causing (apparently unnoticed) glitches is gone.

Homework

The summary is that while floating-point exceptions, even the ‘bad’ ones, aren’t necessarily bad, you can often find bugs in your code by treating them as errors. By using the classes shown above, with appropriate #ifdefs so that they go away in retail builds, you can enable floating-point exceptions in most parts of your code, and thereby improve reliability and avoid unexpected behavior.

But please, don’t use the __try/__except block, in debug or in retail code. It is an ugly and dangerous hack that should only be used in specialized demonstration code.

Localization Notes

Original Author: Michael A. Carr-Robb-John

In a few days I am going to jump on a plane and move to a new country; this time I am off to Seattle. This pending adventure got me thinking about some of the wonderful fun I have had localizing games for foreign markets, and I thought some of my notes might make an interesting post for anyone thinking of doing localization.

First though, some reader interaction… dig out the schedule for your current project and look through it. Is there a task anywhere on it titled “localization”? If it is on the schedule, I would put good money on it being close to the end of the project, usually within five tasks of the Alpha build.

If you have a task titled “localization setup” in the first half of your project’s development cycle then you or someone on your team has probably already been through the pain of localization and is already preparing for it.

Location, Location, Location

What most people focus on with localization is the language, and although it is important, it is only half the equation. The “Location” is the other half, and this is where things get subtle and you need to become a little culturally aware.

Consider this post: you might initially be unaware that I have localized it for American English. I am from England and I am British, which means that I learnt to spell color as colour and localization as localisation, and I pronounce ‘Z’ as “Zed” rather than “Zee”. These differences come about not because of the language but because of the differences in culture.

Keeping localization in mind during development means that when issues arise they can be fixed quickly, rather than waiting until the end of the project and trying to fix them with a hack.

I do not profess to be an expert on these various countries and cultures; what follows are issues that I have experienced during the localization process. By sharing them I hope to give you a clearer idea of what to watch out for, and I would also encourage you to share your own experiences in the comments section below.

Hand Gestures

Animators use hand gestures to breathe life into the characters. Unfortunately, there are quite a few hand gestures that can cause problems when dealing with localization. The “Okay” gesture, the “Thumbs up” gesture and the extended hand with palm outward are just three gestures that I have seen removed from various games due to how different cultures react to them.

You could go down the route of generating a different animation tree localized for each country, but that does increase the amount of data to be maintained and also the likelihood of a bug being introduced in a specific country.

A quick internet search can produce some good descriptions of various gestures that can be problematic. This one even showed gestures I was unaware of before writing this post: Telegraph.

In game visuals

Imagine that in your game there is a wall upon which, scribbled in marker pen, is a message that is important for putting the player on the right track. Now imagine that Portuguese is your language: unless that texture has been translated, you have just made it extremely difficult for those players to progress.

Apart from doing a texture swap based upon the selected language, you might also consider adding a language trigger: if the player gets close to the wall, it pops up a translation.
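A texture swap of this kind can be as simple as a lookup with a fallback, sketched below. The table layout, the function name resolveLocalizedTexture, the language codes and the default fallback of "en" are all hypothetical and not taken from any particular engine.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical asset table: texture name -> (language code -> localized file).
using LocalizedTextureTable =
    std::unordered_map<std::string, std::unordered_map<std::string, std::string>>;

// Picks the texture for the selected language, falling back first to a
// default language and finally to the untranslated texture name.
inline std::string resolveLocalizedTexture(const LocalizedTextureTable& table,
                                           const std::string& textureName,
                                           const std::string& language,
                                           const std::string& fallbackLanguage = "en") {
    assert(!textureName.empty());
    const auto entry = table.find(textureName);
    if (entry == table.end()) {
        return textureName;  // no localized variants registered for this texture
    }
    const auto& variants = entry->second;
    auto match = variants.find(language);
    if (match == variants.end()) {
        match = variants.find(fallbackLanguage);
    }
    return match != variants.end() ? match->second : textureName;
}
```

Falling back silently keeps the game running, but a development build would probably also want to log the missing translation so that it gets noticed before the end of the project.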

The Language of Color

Color plays a huge part in our emotional and physiological state, even though we are completely unaware of it most of the time. What is really interesting is that the meaning of specific colors changes from one culture to another. For example, in Western countries black is used to represent mourning; in South America, however, the more appropriate color would be purple.

I am mentioning this here more for completeness, since I can remember only once changing a game color for localization. The specific issue related to the color that a map of Germany had been painted.

This is quite a useful and interesting chart: informationisbeautiful.

Violence

Germany has very strict laws about violence, especially where humans are concerned. As a consequence, there have been a few games where the red blood has been changed to (robot) green. In the case of one of my projects, the characters were to be tortured for information; in the German version we simply covered the animations up, hiding the offending visuals from players in Germany.

Adult Content

It is worth remembering that not every country believes that games are for adults. One example is Australia, which at present does not allow a rating of 18+ for games.

Have a look at this list on Wikipedia of banned games in different countries. It is useful to know why a game was banned, since it isn’t always about violence and adult content.

Audio

Another facet of the localization process is making sure that the correct spoken dialog is triggered in game in the correct language. It is easy to look through an Excel document and see when translation text is missing; it is a lot harder to check the audio, which is why I try to encourage artists, engineers and designers on different platforms to spend time working in the different languages and locations as early in a project’s development as possible.

Hopefully you have finished reading this and there wasn’t anything you didn’t already know. If anything here was surprising or new, then you should probably kick localization up your schedule’s to-do list.