(Audio Post) Our model of the world and games

Original Author: Matt Yaney

Hey everyone,

Mike Acton here. For Matt’s post today, we decided to team up and try something different.

One of my goals with #AltDevBlogADay is to get a little more personal view of developers. Get our (collective) voices out there. Well, I got to talking with Matt about some of those ideas and that’s what we’re sharing with you today. Enjoy!

(Mike and Matt) Introduction

Our model of the world and games

Minecraft and LittleBigPlanet and stuff

Let us know what you think!


The Path to Good Scripting is Relative.

Original Author: Jake Simpson

Scripting languages are hard to get right. It’s so easy to just implement Lua or Stackless Python, throw a few bindings at it so the in-script state can be read by the game engine, tell the scripters “Be careful”, and leave it at that. It’s even easier to build something specific to your game engine and throw that over the wall.

The natural result of a lot of these approaches is very CPU-expensive scripts: badly written, with lots of stack issues, filled with magic numbers that tie them *extremely* tightly to specific level geometry, lots of experimentation by the scripters, and basically a lot of extra code being run for no really good reason.

In my experience with scripting languages (and I should point out my credentials – I was responsible for the scripting language The Sims 2 sits on), it’s usually less about the *actual* scripting language used – this won’t be a “home grown” vs “off the shelf” blog post – and more about the feature set exposed to scripts, which makes them either easier or harder to write well.

The first thing to understand is that relativity is key. What I mean by that is that scripts should NEVER let the scripter write what is basically trig, or hard-code magic numbers that relate to something that can be changed externally to the script. For example, a script that says “If the NPC is within range A of coordinates X, Y and Z” is so specific to the environment it’s running in that if that environment changes – someone modifies the level’s terrain, for example – it needs re-writing every time. I’m sure we can all agree this is bad news.

A script that says “If the NPC is within range A of Object B’s coordinates X, Y and Z” is marginally better, since the values for X, Y and Z are now coming dynamically from object B, which means if you move Object B in the world, the script doesn’t need to be re-written.

However one of the impacts of this approach is that the scripter still has access, at run time, to absolute coordinates (in fact it doesn’t matter if the coordinates they have access to are absolute, relative or whatever – it’s the fact they have access to ‘real’ numbers that matters). Why would you need this? Why to do trig of course! Wait, what?

Seriously, 95% of the time that real coordinate values are made available inside of a scripting language the scripter WILL end up doing some degree of trig math on them, in order to work out stuff. Now, as I’m sure most script supporting tech guys are already saying – “that’s bad news”. And it is. Doing trig math in a parsed language means a) parsing it in the first place, b) doing range check verification on the values coming in – a divide by 0 at run time is a bad thing for example, but most scripters won’t check for that themselves, c) doing the trig (and range checking the results) and then d) returning the result to the script. This is expensive, time consuming and will often be done in some kind of loop. So what to do to avoid this?

Well, take a script that says “If the NPC is within range A of Object B” – where ‘within range’ is an actual engine function being called, so the math is done in native code without the scripting system ever seeing any coordinates. All of a sudden the ability to do expensive trig is removed. If the script doesn’t have access to world coordinates, there’s very little trig it can do!
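To make that concrete, here is a minimal sketch of what such an engine-side helper could look like. The names (IsWithinRange, Entity, the registration comment) are hypothetical – the only point is that the distance math lives in native code and the script just gets a yes/no answer.

    // Hypothetical engine-side helper; the entity type and the binding layer are assumptions.
    struct Vec3 { float x, y, z; };

    struct Entity {
        Vec3 position;
    };

    // Native range check: the script only ever sees the boolean result,
    // never the raw coordinates, so it can't start doing trig on them.
    bool IsWithinRange(const Entity& npc, const Entity& target, float range)
    {
        const float dx = npc.position.x - target.position.x;
        const float dy = npc.position.y - target.position.y;
        const float dz = npc.position.z - target.position.z;
        // Compare squared distances to avoid the sqrt entirely.
        return (dx * dx + dy * dy + dz * dz) <= range * range;
    }

    // This function would then be registered with the scripting layer (Lua, Python,
    // or a home-grown language) so a script can write something like:
    //   if is_within_range(npc, door, grab_distance) then ...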

Basically you’ve made your language relative. Everything that happens becomes relative to objects in the world, wherever they may be, and your scripting code never needs to know where that is. Want to route to a door to open it? An NPC script with the line “Route to Object A, in front of, distance X, facing” means the routing system will find Object A and plot a route to a point in front of the object, at distance X, with your NPC ending up facing Object A. At no point is the script doing any trig, nor is it handling any magic numbers except for passing in the distance, which really should be defined by the animation you expect to run once the NPC arrives at the door (e.g. reach out and grasp the handle) anyway.

Another aspect of relativity is that now we are offloading functionality into the native engine (where it should be, for speed), we should also be offloading object state too. Part of what makes a scripting language attractive is the idea of holding state in the script and not having to ask the game engine for obscure stuff. While this is nice, there’s a balance to be struck: if the script crashes (and it will – there are ALWAYS edge cases), you can actually recover from that if your object state is held in the game engine. All you do is reset certain elements of state and restart scripting (and hope that the conditions where the script crashed were environmental and the environment has changed sufficiently to avoid it happening again). However, if all your state is held in script, when the script crashes, so does all your state. This is why holding essential state externally in the native engine rather than in script is a good idea. It also comes with the added bonus that when you need to save game state, it’s all in the C++ implementation of Save – you don’t need to add extra code in the actual script to save out state, since it’s already being saved in the native engine, which is where your essential object state is held anyway.

Of course the downside to this approach is that going back and forth between bindings to get a state variable that is being checked or written to often can also be expensive, but like dereferencing pointers, in any given function you should be doing that once and holding the result in a local variable, and then re-writing the new state back out again once the function is complete.

One last thing every engine should have is a built-in profiler – something that can be run in your nightly process (or whenever anyone has time) that watches script durations and keeps track of loops, total duration, individual instruction counts/durations, etc. This kind of instrumentation (handled in the engine, not in the script) is essential for catching scripts that work in isolation but get out of control in real-world situations. Without it, script optimisation is effectively guesswork.
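As a rough illustration of the idea (not any particular engine’s profiler), the engine-side wrapper around a script call might time and count each named script entry point; the RunScript hook assumed here is hypothetical:

    // Hypothetical sketch: engine-side timing of script entry points.
    #include <chrono>
    #include <cstdint>
    #include <string>
    #include <unordered_map>

    struct ScriptStats {
        std::uint64_t calls = 0;
        double totalMilliseconds = 0.0;
    };

    std::unordered_map<std::string, ScriptStats> g_scriptStats;

    // Wrap whatever actually runs a script entry point.
    template <typename RunScriptFn>
    void ProfiledRunScript(const std::string& scriptName, RunScriptFn runScript)
    {
        const auto start = std::chrono::steady_clock::now();
        runScript();
        const auto end = std::chrono::steady_clock::now();

        ScriptStats& stats = g_scriptStats[scriptName];
        stats.calls += 1;
        stats.totalMilliseconds +=
            std::chrono::duration<double, std::milli>(end - start).count();
        // A nightly job can then dump g_scriptStats sorted by totalMilliseconds.
    }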

Good scripting implementation is often less about which language you use and more about the features you expose (or don’t, as the case may be) and the constraints you put on your scripters. The more they get used to this, the more they will come to you asking for extra functionality to be put in the engine, which is where expensive functionality really should be.


My Cat Is Better Than Your Dog

Original Author: Mike Jungbluth

Ever got into a heated discussion with another dev in your discipline about which software package or toolset was better than another? This is certainly so among animators, but I know I’ve heard the same discussion from other disciplines. I myself have many times been locked in battle as to which is better, Maya or Max. Or why Motionbuilder is so terrible/awesome. And while caught up in the moment, it seemed like the most important, possibly life and studio changing argument of all time. Nay, it could change the entire INDUSTRY if I could only sway the person I am talking to into believing my software package and tools were the best. The world would become a utopia if only I could convert them to my process.

But to everyone else listening, all they hear is, “My cat is better than your dog.”

I’m not saying these discussions are pointless. Far from it. They point out the flaws and shortcomings in different workflows, which ultimately makes you fix them or re-examine how you work. And that is amazing. But the chances of getting someone to switch sides are about as likely as convincing a dog person to give up their hound for a cat. It won’t happen. There is a personal attachment to workflows and a level of investment that is almost impossible to breach. It starts with learning how to do what it is you love in that program, and then builds through years of learning how to deal with its quirks and shortcuts until it becomes a part of your family. You grow comfortable, knowing what it consistently can and can’t do, to the point that if it craps on your foot you can just bear it and grin. And if anyone dares to point out that something else might be better, it’s only natural to get defensive. Or point out the flaws in THEIR method of choice to make yours look better. Maybe animators get worked up over this more than others. It could be that software packages are one of the few things animators have control over in game development, and as such they take them so seriously. But when you step back and see that all programs do the same thing, just with different pros and cons, you begin to realize that what you both really want is the same: the ability to create something you love.

And that got me thinking. When was the last time I got as heated up with another animator or developer about something in the game we were making? I know I have done it, but was I as critical and honest as when I was ranting about a random piece of licensed software? Many a time I’ve gone on about how not having a graph editor is akin to making me animate with one hand tied behind my back. I know I have certainly used that critical and brutally honest eye when playing OTHER people’s games, getting angry about why a camera randomly switches from 1st person to 3rd person between cinematics and how that breaks the player’s narrative. But I can’t honestly say I have always gone to the same lengths on a feature in my own games as often. I will point something out or go on small diatribes, sometimes to someone who can fix it, often times just to another animator, and then just assume the person working on it will take care of it. Because SURELY they must see its shortcomings, and it isn’t a finished product yet, so there is time to fix it. I’m always of the mindset that everyone I work with is more qualified at their job than I am, because I can’t do what they do. And that means it MUST be harder than what I do, so they must be smarter. And there are still months left to fix it, so it’s going to be fine. Software packages and other games are finished products, after all, so it’s easier to say, “WHOOPS! That is just WRONG?!” Beyond all that, as an animator in games, my voice is silenced very quickly by programmers or designers saying something can’t be done for a variety of reasons, be it technical or gameplay related. Animation is a service industry in games, and the chance to be a DIVA as you might be afforded in films is not going to happen. Nor should it. Film is about story, and animated performances are what drive it. That means you get the spotlight. Games are about gameplay, which means animation is there to serve it, not direct it.

And that is the point where I have to slap myself. All of those are just excuses that allow me to pull my punches at work, because I know the people around me personally and have to interact with them on a daily basis. And while I should be MORE honest with them, I end up giving them a lot of leeway because of all those factors I listed above. But worst of all, I’m giving myself leeway to cop out. And as far as I am concerned, that is a cardinal sin of game development.

So how do I fix this? Well, first step is to always attach that critical eye I love using on other people’s products onto my own teams. Whenever I see something that looks wrong, I’ll make a note of it. Whenever I work on anything, I will make sure I understand the core of what is needed. Any task I am given I will talk one on one to the person who requested it and make sure we are working together to make the most unique and compelling animation possible. Any character I am animating I will make sure I know their unique voice or find it if none already exists. And for everything, not just assume that someone else sees a problem. But the biggest help is instead of just complaining to another animator that the story doesn’t make sense at this point, that the characters personalities are interchangeable, or that the moment to moment gameplay in this certain area is boring, I will tell it to someone who can fix it.

And you know what, when you start doing it, and it works, it feels great. When you can point to moments in your game that are now successful when before they were weak, well that is just about one of the greatest feelings you can have. It is something you can take real pride in.

Yes, these were all things we already do to varying degrees and should come rather naturally to anyone in this industry. But it changes from project to project and studio to studio. Some team cultures have this naturally built into their workflow, and others require you to push a little harder to be heard. But just make sure it exists in some form, because if it doesn’t, then you probably need to get out.

Ultimately, the one thing I will ALWAYS do from now on, when I feel myself getting locked into a debate about whether my cat is better than another animator’s dog, is redirect that focus and conversation towards something that can actually be seen and felt by the player. Because all that matters to them is what we are making, not what we are making it with. So I’m just going to do what we can ALL benefit from: making something as memorable and cherished as my cat. Or your dog.


Introduction to Behavior Trees

Original Author: Bjoern Knafla

What is a behavior tree? How does it work and what is its role in game AI?

Welcome to a series of blog articles about my experiment (read: stumbling around) of marrying data-oriented, memory-streamlined behavior trees with a second representation to ease creation and modification during development. I write it to document my findings and decisions and to ask for your invaluable feedback to build a BSD licensed BT toolkit that is truly useful.

Article Updates

  • March 10, 2011 – Added a reference to the second article in my behavior tree experiment blog series.
  • March 05, 2011 – Posted on bjoernknafla.com, too.
  • March 02, 2011 – Reader eric wrote a fantastic behavior tree feature analysis in the comments section. Don’t miss it!
  • February 24, 2011 – added a reference section to the end of the article with additional references not found in the text.
  • February 24, 2011 – added a section to show the advantages of behavior trees over finite state machines based on a question by snake5.

Background

Behavior trees (BTs) are deployed by more and more game AI programmers to implement reactive decision making and control of the virtual creatures entrusted to them, as the various game AI retrospectives for 2010 and outlooks for 2011 show.

What is a behavior tree and how does it tick

My view and understanding of behavior trees has been fundamentally shaped by Alex J. Champandard’s work on the subject (see the references at the end of this article).

Strictly speaking, behavior trees aren’t just trees but directed acyclic graphs (DAGs). A node can have multiple parents, which allows reuse of sub behavior trees and therefore modularization. I’ll stick with the not-quite-correct tree term as it is the one most widely used.

Behavior trees replace the often unmanageable, growing mess of state transitions in finite state machines (FSMs) with a more restrictive but also more structured, traversal-defining approach. Behavior trees are formed by hierarchically organizing behavior sub-trees consisting of nodes. Visiting a node – or rather the sub-tree it roots – means running it according to its semantics. Execution of a node or a sub behavior tree results in an (aggregated) return state, e.g.:

  • the node is ready to run before it is called,
  • it finished with a success state,
  • the behavior has not finished running yet and should be called again during the next simulation step to eventually finish,
  • it failed cleanly without side effects, or
  • an error occurred that had side effects to explicitly deal with.
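Expressed as a code sketch (my own naming, not the toolkit this series will build), such a return state can simply be an enum that every node update returns:

    // Hypothetical sketch of the return states listed above; names are my own.
    enum class BehaviorState {
        Ready,    // node hasn't been run yet this traversal
        Success,  // finished successfully
        Running,  // not finished yet - tick it again next simulation step
        Fail,     // failed cleanly, no side effects to undo
        Error     // failed with side effects that must be handled explicitly
    };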

To update a behavior tree it is always traversed beginning with its root node in depth-first order. Each traversed node affects traversal. Selector nodes select one of their child nodes (if available) to traverse next. After the first child node and its associated sub-tree have been traversed, the selector is re-run to react to the child’s return status, so it can decide whether traversal should go up to its own parent node or whether it should select another child node to traverse next.

Inner nodes explicitly control traversal semantics, e.g.:

  • On each traversal priority selectors check which child to run in priority order until the first one succeeds or returns that it is running. One option is to call the last still running node again during the next behavior tree update. The other option is to always restart traversal from the highest priority child and implicitly cancel the last running child behavior if it isn’t chosen immediately again.
  • Sequence selectors run one child to completion after the other. If one of them fails, the whole sequence fails, too. Unless it is reset or its last child finishes, a sequence stores the last running child so it can return to it immediately on the next update (see the sketch after this list).
  • Loops are like sequences but they loop around (hah, who would have thought!) when reaching their last child during traversal instead of returning to their parent node like sequence nodes do.
  • Random selectors randomly (hah again) select which of their child nodes to visit. A running node is visited again until it finishes.
  • Concurrent nodes visit all of their children during each traversal. A pre-specified number of children needs to fail to make the concurrent node fail, too. Instead of running its child nodes truly in parallel to each other there might be a specific traversal order which can be exploited when adding conditions (see below) to a concurrent node because an early failing condition prevents its following concurrent siblings from running.
  • Decorator nodes typically have only one child and are used to enforce a certain return state or to implement timers to restrict how often the child will run in a given amount of time or how often it can be executed without a pause.
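As a concrete illustration of the sequence semantics referenced above, here is a minimal, hypothetical sketch (my own naming, with the BehaviorState enum from the earlier sketch repeated so the snippet stands alone) of a sequence node that remembers its last running child:

    // Hypothetical sequence-selector sketch; not the data-oriented representation
    // this series will eventually build.
    #include <cstddef>
    #include <vector>

    enum class BehaviorState { Ready, Success, Running, Fail, Error };

    struct BehaviorNode {
        virtual ~BehaviorNode() = default;
        virtual BehaviorState update() = 0;
        virtual void reset() {}
    };

    struct SequenceNode : BehaviorNode {
        std::vector<BehaviorNode*> children;
        std::size_t current = 0; // remembers the last running child between updates

        BehaviorState update() override {
            while (current < children.size()) {
                const BehaviorState s = children[current]->update();
                if (s == BehaviorState::Running)
                    return BehaviorState::Running;   // resume here on the next update
                if (s == BehaviorState::Fail || s == BehaviorState::Error) {
                    reset();                         // one failing child fails the sequence
                    return s;
                }
                ++current;                           // success: advance to the next child
            }
            reset();                                 // all children succeeded
            return BehaviorState::Success;
        }

        void reset() override { current = 0; }
    };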

Leaf nodes are classified as:

  • Actions, which actually implement changes to an actor’s or the game world’s state, for example to plan a path and move along it, to sense for the nearest enemies, to show certain animations, switch weapons, or play a specified sound. Actions will typically coordinate and call into different game systems. They might run for one simulation tick – one frame – or might need to be ticked over multiple frames to finish their work.
  • Conditions check that certain actor or game world states hold true. If a sequence node has a condition as one of its children, then the failure of the condition will prevent the following nodes from being traversed during the update. When placed below a concurrent node, conditions become a kind of invariant check that prevents the condition’s sibling nodes from running if a necessary state becomes invalid.

Actions and conditions get filtered access to the actor and the world state through an actor specific blackboard of collected or pre-computed values passed to them during an update step.
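To make that a bit more concrete, a blackboard might simply be a plain struct of values the AI update has already gathered for this actor – a hypothetical sketch, not the representation this series will settle on:

    // Hypothetical blackboard sketch: a per-actor bundle of pre-computed values
    // that actions and conditions read instead of touching game systems directly.
    struct Vec3 { float x, y, z; };

    struct Blackboard {
        float        deltaTime        = 0.0f;   // simulation step for this update
        Vec3         actorPosition    = {};
        Vec3         nearestEnemyPos  = {};
        float        nearestEnemyDist = 1e30f;  // "infinite" if no enemy was sensed
        bool         enemyVisible     = false;
        unsigned int actorId          = 0;
    };

    // A condition leaf then only needs the blackboard to answer yes/no.
    inline bool IsThiefNearTreasure(const Blackboard& bb, float range)
    {
        return bb.enemyVisible && bb.nearestEnemyDist <= range;
    }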

There be dragons – an example behavior tree and its traversal

I didn’t truly grasp all the details of the behavior tree update traversal until I started to implement it myself. To shorten this process for you, let’s take a look at an example of how the traversal of a behavior tree for a dragon might work. Here is the simple example behavior tree:

  • 0. Root priority selector
    • 1. Guard treasure concurrent selector
      • 1.1. Is thief near treasure detectable? condition
      • 1.2. Make thief flee (or eat him/her) behavior
    • 2. Rob more treasures sequence selector
      • 2.1. Choose a castle to fly to! action
      • 2.2. Fly to castle! action
      • 2.3. Fight (and eat) guards
      • 2.4. Is still strong enough to carry treasure home? condition
      • 2.5. Take gold! action
      • 2.6. Fly home! action
      • 2.7. Put newly robbed treasure to possessed treasure! action
    • 3. Post pictures of hoard to Facebook behavior

Each of the child nodes (and of the child nodes’ child nodes, and of the… you know the drill) might be another behavior-defining sub-tree. For this example I only examine child nodes 1. and 2. in detail.

The root of the example tree is a priority node (with id 0).

Guard treasure is a concurrent node with two children. The Is thief near treasure detectable? condition only returns success if a trespasser can be sensed near the treasure; otherwise it fails. Take care to not only test whether a thief is near the hoard but also whether the dragon is in a position to detect the robber. If the dragon is currently away organizing more gold, it can’t see thieves, so it won’t abandon its raid. Recall that a priority selector might always check its highest-prioritized child behaviors first instead of sticking to a running child.

The Make thief flee behavior might be a sub-tree or might be an action to activate the game’s combat system.

Back to the children of the root priority node – its second child Rob more treasures is a sequence node. Choose a castle to fly to! is an action that selects a castle to raid and searches for a path to it. Via Fly to castle! the dragon flies along the calculated path until it reaches the castle. These are the nodes I cover in the example traversal. The others should be pretty self explanatory by their names.

On beginning the traversal all nodes are ready for execution.

Let the first behavior tree update begin – it visits and runs the ready to run root priority node 0.

The priority node checks its child nodes from the first to the last until a child node returns a success or running state. It fails if no succeeding or running child node can be found.

The first priority child behavior with id 1. is a ready concurrent selector.

To traverse it all of its child nodes are visited concurrently.

Its Is thief near treasure detectable? condition node child 1.1. fails because no trespasser is visible.

Therefore the whole concurrent Guard treasure node 1. fails.

Priority node 0. tries to run the next child in its priority order. Child 2. Rob more treasures is ready and therefore visited.

The sequence node starts by steering traversal into its child node 2.1. Choose a castle to fly to!.

A target castle has been chosen successfully.

The next child in the sequence is executed – node 2.2. Fly to castle!.

As the castle is far, far away it can’t be reached in this update step so it returns running.

The sequence selector 2. can’t proceed further and also returns running to its parent node.

The priority selector root node bails out with a running state, too: it has found a running child node, so it doesn’t need to check lower-priority child nodes. This update traversal is done.

All non-running nodes are set to be ready before the next traversal.

During the next simulation step the behavior tree is updated again and traversal restarts at the root node.

It checks its first child node, which still fails because the dragon can’t see a thief near its treasure.

Therefore traversal reaches the next priority-node child, Rob more treasures, which hasn’t reset its running state and still stores which of its children was the last one executing – no higher-priority sibling has run, so it wasn’t reset.

The Rob more treasures sequence node has stored that it last visited its child action 2.2. Fly to castle! – which might succeed, keep running, or fail during this update step.

I’ll stop bombarding you with behavior tree traversal diagrams for now.

Behavior trees vs. finite state machines – fight!

Why replace finite state machines (FSMs) with behavior trees?

The transitions between FSM states give a finite state machine creator great freedom – and enough rope for the machine to grow into an unmanageable mess as transitions proliferate and become harder and harder to follow and understand. Hierarchical finite state machines (HFSMs) help a bit in this regard, though.

However, the main point where behavior trees shine is in their clear – even if a bit restricted – vocabulary in comparison to FSMs. For me it is easier to focus on what I want to express, while with complex FSMs I often feel a bit lost and unsure where to start. Constraints are good in this case – they function like guidelines for my thinking.

Selectors force me to think and view problems in a very specific way. Ad-hoc transition structures of finite state machines become explicit.

In addition, conditions and actions as atomic building blocks, pluggable into selectors, guide me to make them more reusable. An action or even a whole behavior sub-tree can appear at multiple places in a tree. Special conditions can be grouped with them to guard behaviors so they fit the specific place in the tree. If there are actions to fight enemies with impressive combat styles, then by combining them with conditions, the actions can “rely” on the actor being near the enemies and on having enough room to use the specific combat style.

Alex Champandard often makes the point that one should think about actions, behavior trees and sub-trees as goals to reach. The tree is formed around these goals to check that they are reachable, to run through the right steps to approach them, and to ensure that there aren’t other, higher-priority goals that need the actor’s immediate attention.

Because of their explicit structure, behavior trees also enable the memory organization I am working on, and therefore allow very efficient traversal – something I can’t envision with FSMs right now (though I haven’t thought about it in depth).

That’s it – what’s next?

So much for this post. I hope I could give you a good introduction to what behavior trees are and how they conceptually work. In my next blog post I will motivate why a memory-streamlined behavior tree representation makes sense, describe how I envision such a data-oriented behavior tree looking, and start with the first test-driven implementation of behavior tree functionality.

It would be a blast to see you again in two weeks!

Cheers and bye,

Bjoern

References

Paper and book references

  • Alex J. Champandard’s Getting Started with Decision Making and Control Systems, AI Game Programming Wisdom 4, section 3.4, pp. 257–264, Course Technology, 2008

Behavior tree series

  1. Introduction to Behavior Trees (this text)
  2. Shocker: Naive Object-Oriented Behavior Tree Isn’t Data-Oriented
  3. Data-Oriented Streams Spring Behavior Trees
  4. Data-Oriented Behavior Tree Overview
  5. Behavior Tree Entrails

Collection of series related links on pinboard.in

You don’t have a clue and you should stop pretending that you do

Original Author: Jonathan Adamczewski

When it comes to making improvements to your code, you repeatedly demonstrate an inability to pick the right targets. Your intuition is without connection to reality, and you waste time trying to solve problems that don’t need solving. You don’t even try to gather evidence before sallying forth on a pointless quest to fix something that isn’t broken.

You don’t have a clue and you should stop pretending that you do.

(It should be made clear that instances of “you” or “your” should be replaced with “I” or “my”. This article is written to the author. There is no suggestion that the reader has anything to learn from this article. Not at all. There is no hyperbole and no sarcasm.

Now, where was I before this interlude began? Oh, yes…)

You don’t have a clue and you should stop pretending that you do.

Preparation

You try to improve the wrong things. Not every piece of code must have a minimal instruction or cycle count. Nor are minimal memory use, maximum generality, or even sufficient reconfigurability essential to every project.

So, let us start at the beginning. What actual problem is this code trying to solve?

No, not “what are the inputs and outputs” – what is the problem?

You don’t know, do you?

You think you do know? Really? You’ve considered the needs of the users? The amount of precision needed? The allowable memory use? For code and data? The amount of time you have to write the code? How about the time you have to find and fix the bugs? How will you even know if there are bugs? How will the testing and validation be done? What algorithms will you use? What alternatives are there? How will you evaluate the performance? What about the API – is your user interface going to encourage more bugs later on? Does the program need to be portable? To what other architectures? Who will be maintaining it? How long is it going to take to recompile each time you change one of the many hard-coded variables you’ll thoughtlessly distribute through the codebase?

You might want to think it through.

For longer.

Maybe you should try explaining your choices to someone else – do you believe it yourself when you articulate it to a third party?

I didn’t think so. Try again.

OK, that will have to do. Thanks for trying.

Iteration

So you’ve written some code and discovered that it doesn’t actually meet some essential requirement of the problem description.

Is that essential requirement actually essential? Or just something that you’ve convinced yourself is essential? I thought so. Focus on the things that actually matter – leave the novelties and interesting distractions for your spare time.

What now? You’ve found a shortcoming? Another one? And this one actually matters? Really? Oh – you’re right. Well then – you should fix it.

Wait! What!? STOP! What do you think you’re doing?

Fixing the problem? That’s not the problem!

It’s really not.

Prove that what you’re working on is actually the problem.

You can’t, can you.

Oh, you have a hunch, you say. It’s intuition, you say. When was the last time you had an accurate hunch about a problem like this?

When? Never? Because you’ve never seen a problem quite like this one before?

That explains a lot.

Measure. Dissect. Examine. Understand the problem. Test your understanding. Go deeper. Deeper. Find the root cause.

Do you know what the problem is now? Are you sure? You may have found the cause, but aren’t 100% certain. You can do better than that.

Find what is causing the problem, not just something that looks like it might be. Did you find the cause, or just another symptom?

Do you know what the compiler is doing? Do you understand how the hardware works? Can you trace what the supporting libraries and operating system are doing? Do you have other source code and documentation you can read and learn from?

Do you actually understand the code you wrote?

Oh, you found the problem and fixed it? Excellent! Where was it? In the only place you were certain that it couldn’t be. That’s not surprising. Not at all.

Reflection

Let’s face facts: you aren’t very good at designing or implementing code to meet the actual requirements. When you start before you know the requirements, what chance do you have of meeting them?

And you aren’t very good at making improvements because you start in the wrong place. How can you make something better when you don’t understand its deficiencies?

The good thing is this: you can be better at design and iteration. Work on your habits. Look for evidence first. Don’t trust your instincts. Don’t trust the compiler too much (don’t distrust it too much, either – don’t kid yourself, it almost always makes better inlining decisions than you do). Know your platform – the software and hardware. Solve the right problems and measure. Without measurements, your heuristics are just guesses, and probably not very good ones.

Stumbling around in the dark is a terrible way to get work done – give yourself as much light and opportunity as you can to succeed.

[Photographs by Karen Adamczewski]


Team Building (Mini Game Week)

Original Author: Justin Liew

In this post I want to bring up one of the highlights of my time in the industry and how it enhanced our company, teams and processes.  While I know some of you are hoping it will involve camping, drinking and embarrassment of colleagues you may know, this was not your typical team building event.  For Mini Game week, the studio was divided up into 18 teams of 7-8 people each.  On Monday morning we were given the teams, assigned to an area, and our task was simple: deliver a game on Friday morning.  There would be prizes and bragging rights.  Everyone was involved: executives, programmers, artists, designers, admin, and managers.  There were no restrictions on theme, platform, or technology.  And there was no time to waste!

Our team comprised 3 programmers, 2 artists, 1 animator and 1 designer.  Our first two tasks were to come up with an idea, and settle on the tech we wanted to use.  We brainstormed until lunch and it involved lots of whiteboard discussion, diving into ideas, and ranking different designs.  Our main considerations were ensuring we weren’t overscoping ourselves, that we were playing to our strengths, and that we weren’t complicating things.  On Friday we’d be presenting our game to the studio in the form of a science-fair-type walkaround, so we had to draw people in without losing them with an elaborate explanation of concept or controls.  We did up a simple paragraph describing the game, and while the designer and artists brainstormed deeper into that, the programmers experimented with tech.  Each of us took an engine and tried to create a prototype by the end of the day, and would argue for that platform.  I took XNA, and the other two considerations were Unreal and Flash.  An hour later, I had taken one of the XNA samples and modified it into the beginnings of what would be our final game.  The Flash SE (software engineer) struggled with how to handle controller input, while the Unreal demo struggled with the sheer overhead of building assets and an over-qualified toolset.  By the end of the first day we had our design, our tech, and had made our first version control checkin!

The rest of the week was a blur of quick decision making and implementing.  Because we were using XNA and our artists and designers didn’t have Visual Studio, we ended up with a simple method of getting our content into our game.  The SEs would work in Perforce, and when we deemed enough changes or fixes were complete, we’d push a build to the network for the rest of the team.  Due to the way XNA loads content, they could edit XML files, drop art assets into the build directory, and load them into the game without programmer intervention.  This made for a fairly efficient workflow, and shielded them from our inevitable breakages early on.  Once they were happy with the art they’d check it in and we’d sync to make sure nothing went wrong.  We worked this way throughout the week.  Through an organic process, team members found their roles.  People stepped out of their comfort zone to take on tasks that needed doing, and even the rote tasks were snapped up, as the adrenalin rush of quick iteration and results motivated us.  By Thursday morning we had essentially an “alpha” version of our game, and breathed a bit lighter, knowing we could tune and fix bugs for the rest of the day and not push risky work into demo day.  As the deadline loomed, rumours abounded.  Throughout the week people had been keeping ideas to themselves, but nearing the end it became less likely someone would steal a design.  During the week, we felt confident our game would hold up to the competition, but in the end, we underestimated the talent and ideas brewing in the rest of the studio.

Demo Day was probably the best day I’ve ever had as a professional game developer.  On Thursday people were at the office late into the evening, fine tuning their games, making signs, coming up with props, and the next morning the studio was transformed.  The energy in the air was crackling as people wandered from room to room playing the other 17 games that had emerged from the chaos.  I think there was general amazement at what had been accomplished in 4 days.  No team worked an insane amount of OT, and almost everything was done from scratch.  So to see network multiplayer, complex 3D models, intricate levels, and amazing designs was remarkable.  Some highlights were the use of the Rock Band drums to control fighting apes, the use of voice pitch to control the player, and a fully polished game, written in plain C/DirectX, that eventually won best overall game.  There were games done in XNA, Flash, Phyre Engine, PopCap, and plain old C/C++.  There were PC games, 360 games, and PS3 games.  It was a diverse collection of ideas and decisions, birthed from groups of people who may have only worked together for the first time at the start of the week.

So what was the point?  Was this just a week off from the grind; a chance to slack off and work on a pet project?  I believe the value of this week cannot be overstated.  As an individual team we learned many things:

  • quick iteration is possible, and beneficial.  This was apparent early on when I had the XNA demo up and running before Unreal was even done cooking assets for the first time!  Don’t settle for a bad process, as it can make all the difference.
  • decision making under pressure.  Sometimes decisions just HAVE TO BE MADE.  During this week, there was no alternative, and some of our best ideas came late in the day when a little bit of panic was bubbling.  Put creative people together, give them a deadline and a goal, and they’ll come up with great things.  A choice doesn’t have to be analyzed for months to be right.
  • scaffolding is essential for prototyping.  There were many systems we hacked in, but they worked.  If we ended up shipping our game we could easily replace them, but getting systems built up quickly to support the end result was the only way to do things.
  • technology isn’t the focus of games.  Programmers obsess on technology and will debate their language of choice to the death, and rightly so, as it is our job for the short and long term health of the studio.  We’ll play competitors’ games and analyze the anti-aliasing, physics system, and network performance.  Don’t get me wrong, bad technology can kill a game and a studio.  But, at the end of the day, design and gameplay rules.  Of the 18 games, the technology didn’t give any one an advantage over another.  XNA didn’t make our game more fun.  We set aside those glasses for the day and looked purely at mechanics, and what made the games fun, and it felt good!

From a studio point of view, there were also benefits:

  • camaraderie.  Our studio was growing, so at the start of the week there were many introductions to be made. By the end of demo day, the 7 of us had formed bonds that would help us down the road when working together.  The importance of team composition cannot be overstated, and I think if anything the week showed that we had hired the right people at our studio.  There were no reports of ego trips, slackers, or major conflicts.  People were having fun, taking on different roles, and working well as teams.
  • the importance of prototyping.  Why don’t we do this every time we create a new project?  Why do we restrict this decision making to a small group of people?  It was clear that everyone had good ideas.  With polish, I think we could have easily shipped three or four of the demos.  In only a week, we generated and pushed more ideas to some level of completion than we could have any other way.
  • evaluation of technology.  Sure, this goes against my last point above, but while technology didn’t impact the final games, there were pros and cons everyone ran into.  The team that tried PhyreEngine was the first group in the studio to evaluate it, and I think they learned more during this week than they could have with a more formal evaluation of the product.  I hadn’t written any C# before this week, and while the code I wrote was fairly horrible, I learned what was easy to do and what was difficult to do in a short amount of time.
  • expansion of skill sets.  We had designers using Photoshop, world modelers tuning animations, and production coordinators designing the UI.  This sort of activity can prevent a studio from stagnating.
  • everyone saw every stage of game development.  For some of the greener employees, they saw how ideas were generated, how people interacted, and how to “final” something.  While it was very compressed and obviously doesn’t represent a true cycle, it was nonetheless valuable experience.

So overall, it was a fabulous exercise, relevant to our studio, and a whole lot of fun.  We never did it again, but if I ran my own studio, I’d do it every year.  I know it is difficult to find a point where essentially shutting down studio operations for a week wouldn’t negatively impact a game team, but perhaps a smaller scale operation done by a single game team post-finaling, for example, might work.  What do you think?  Do you see the benefit of such a thing?  What other events have other studios done that felt beneficial?


Tips for High Performance Art

Original Author: Richard Sim

While performance is often thought of as a programmer’s problem, the truth is that even some simple changes to how art is authored can have a drastic effect on a game’s performance. Unfortunately, programmers suck at divulging much of this information – something I hope to remedy somewhat in this post. As with anything performance related, this isn’t an absolute list of rules to live by – some will depend on the architecture of the engine you’re working with, others on the particular scene, and yet others on the artistic look and style of the game. It’s also by no means comprehensive; I think this will likely spill over into another blog post. It’s best to discuss with your rendering team how (and if) each of these impacts your game.

Pixel Quads and Pixie Dust

A few days ago an artist asked me for information on specular behaviour for materials, to which I sent him a link to the post Technofumbles.

Spatial Partitioning. Part 1: Survey of spatial partitioning solutions

Original Author: Jason Swearingen

A quick note:

Being my second post here at #AltDevBlogADay, I was going to write about the commercial viability of XNA.  However I’m just finishing up the architecture of our new spatial partitioning system, so I’ll be doing this series first, while it’s still in my head 🙂

Spatial Partitioning Series!

I’ll be spending the next two (or more) posts on spatial partitioning.  Starting this post is a quick survey of the spatial partitioning solutions I know most about.  If you have a favorite that I don’t mention, please do let me know in the comments!

What’s spatial partitioning?

For those of you who are not familiar with the subject, you are forgiven.  Spatial partitioning (quad trees and such) is not really known for being the sexiest part of game development; that honor goes to rendering or perhaps AI.  However, without a proper spatial partitioning choice, that awesome “moar bloom” being thrown at the screen runs into a brick wall trying to figure out which game objects are in the camera’s view and which are not.  That’s ultimately what spatial partitioning is all about:  reducing the cost of finding objects you care about by storing objects in some sort of database that is easy to query by location.   Instead of going on and on about it, I’d refer beginners to the “Real-Time Collision Detection” book on the market (chapters 2 and 7 specifically).

A Survey of the solutions

Here’s a list of the various spatial partitioning systems that I happen to know about.  I consider myself fairly well versed in the subject, so if some gurus happen to know of a gaping hole in my knowledge, please fill me in via the comments!

Grid

Perhaps the simplest solution, this is usually implemented via a 2-dimensional array.  3d grids are of course possible, but given that most 3d games have a generally 2d distribution, and the fact that 3d grids quickly eat your system’s RAM, 2d is much, much more common.

Usage Characteristics

There are generally 2 ways of using grids: either each cell contains the objects whose bounding volume (usually a sphere or box) center positions are located in the cell, or each cell contains all objects that touch the cell.  For the former case, the actual objects (and/or metadata) may be stored in the cell directly (allocated inline), or reference pointers are stored that point to the objects’ actual locations.  For the latter case, where each cell stores all intersections, you are restricted to the reference-pointer approach.
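As a rough sketch of the first variant (my own minimal example, not any particular engine’s code), mapping an object’s center position to a cell is just a couple of divisions:

    // Hypothetical uniform-grid sketch: objects are bucketed by the cell their
    // center position falls into.
    #include <cmath>
    #include <vector>

    struct GridCell { std::vector<int> objectIds; };  // store object ids as references

    struct UniformGrid {
        float cellSize;
        int   width, height;                  // cells along x and z (2d layout)
        std::vector<GridCell> cells;          // width * height cells, row-major

        UniformGrid(float size, int w, int h)
            : cellSize(size), width(w), height(h), cells(w * h) {}

        // Clamp so objects outside the world still land in a border cell.
        int cellIndex(float x, float z) const {
            int cx = static_cast<int>(std::floor(x / cellSize));
            int cz = static_cast<int>(std::floor(z / cellSize));
            cx = cx < 0 ? 0 : (cx >= width  ? width  - 1 : cx);
            cz = cz < 0 ? 0 : (cz >= height ? height - 1 : cz);
            return cz * width + cx;
        }

        void insert(int objectId, float x, float z) {
            cells[cellIndex(x, z)].objectIds.push_back(objectId);
        }
    };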

Benefits / Drawbacks

The greatest benefit of grids is their simplicity, and their virtually 1:1 mapping to simple 2d arrays.  If you have a homogeneous or static world, grids may be an excellent choice.

The greatest drawback is the inability to handle heterogeneous data: objects of various sizes and/or clumped distributions are huge problems that quickly ruin the elegance and performance of grids.

The benefit of inline allocation of objects in cells is cache coherency, but since you have fixed cells allocated for the entire size of your world, you run the risk of each cell’s allocation being either too big or too small if your world’s objects are heterogeneously distributed.

More exotic grid solutions include virtual grids, whose backing store is a hash table.  This reduces memory use but greatly increases the cost of queries, as multiple hash lookups need to be made to find neighboring cells, and it adds memory fragmentation.

Grids also give good performance with dynamic objects: due to their fixed-space nature, there is no re-partitioning needed (which is not true of most other solutions!).

Quad Trees

Quad trees are one of the most natural extensions of grids, and are primarily designed to overcome grid’s biggest problems: heterogeneous sizes and distributions.   In the classical case, cells are replaced with “Nodes” (cells, but reference based, not fixed allocations/alignments), and nodes with two or more objects are sub-divided until only 1 object is in each of the “leaf” nodes.

Usage Characteristics

To enable elegant node sub-division, the center position of each object is used when determining ownership.   The simple case keeps dividing down to 1 object per node, but this is not strictly necessary and is easily customizable – though whether it is more efficient for you to query against 1000 objects in 1 node or 1000 nodes is an exercise for the reader, or perhaps someone else’s #AltDevBlogADay entry =D
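A minimal sketch of that classical subdivision rule (my own simplification – fixed depth limit, center-point ownership, plain pointers rather than anything cache-friendly) might look like this:

    // Hypothetical quad-tree sketch: subdivide a node once it holds more than one
    // object, bucketing objects by which quadrant their center falls into.
    #include <cstddef>
    #include <memory>
    #include <vector>

    struct Point { float x, z; };

    struct QuadNode {
        float cx, cz, halfSize;                  // square region this node covers
        std::vector<int> objectIds;              // objects owned by this node
        std::vector<Point> centers;              // matching object centers
        std::unique_ptr<QuadNode> child[4];      // null until subdivided
        int depth;

        QuadNode(float x, float z, float hs, int d) : cx(x), cz(z), halfSize(hs), depth(d) {}

        int quadrant(const Point& p) const { return (p.x >= cx ? 1 : 0) + (p.z >= cz ? 2 : 0); }

        void insert(int id, const Point& p, int maxDepth = 8) {
            if (!child[0] && (objectIds.empty() || depth >= maxDepth)) {
                objectIds.push_back(id);         // leaf with room (or at max depth)
                centers.push_back(p);
                return;
            }
            if (!child[0]) {                     // second object arrived: subdivide
                const float q = halfSize * 0.5f;
                child[0] = std::make_unique<QuadNode>(cx - q, cz - q, q, depth + 1);
                child[1] = std::make_unique<QuadNode>(cx + q, cz - q, q, depth + 1);
                child[2] = std::make_unique<QuadNode>(cx - q, cz + q, q, depth + 1);
                child[3] = std::make_unique<QuadNode>(cx + q, cz + q, q, depth + 1);
                for (std::size_t i = 0; i < objectIds.size(); ++i)  // push existing objects down
                    child[quadrant(centers[i])]->insert(objectIds[i], centers[i], maxDepth);
                objectIds.clear();
                centers.clear();
            }
            child[quadrant(p)]->insert(id, p, maxDepth);
        }
    };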

Benefits / Drawbacks

Narrowly targeted queries are quite elegant: walking from the root, you may reject large branches of the tree when the parent fails to meet the query parameters.   Broad queries, however, offer fairly poor performance, requiring deep traversals of nodes.  For programming environments with a garbage collector, like Java, the reference-class overhead of the nodes is substantial and can be problematic (GC stalls while running a game are not good!).

As mentioned, the nodes are partitioned based on object centers, which is quite a large problem due to “external” objects that may potentially overlap your node-of-interest when performing collision detection.  Some exotic solutions, such as storing pointers to neighboring nodes, exist, but generally it’s a pain in the ass and gives quite weak performance for dynamic objects.  Another big weakness of trees is the need to repartition when dynamic objects move; the overhead of this can cripple highly dynamic simulations.

One potential benefit of trees is using the varied node sizes (nodes are cut in 4 at each level) to bucket objects based on size.  This also helps reduce the complexity of handling overlapping nodes, but it’s still not a simple decision how to handle this problem.  Some choices push the objects up to the first node that entirely contains them, but unfortunately this tends to push a large number of objects all the way up to the root, due to objects overlapping the partition boundaries (as an example, consider an object sitting right on a partition boundary at (0,0)).

Often cited as a solution are “loose” quad trees, but I dislike these for the same reason I dislike the seemingly cool sphere-tree: maintenance is a complicated mess and it’s not at all obvious how to tune the ‘looseness’ to optimize performance in any given situation.

Another big caveat for quad trees is that they should not be used for homogeneous distributions, as you will effectively end up with a cache-poor, pointer-based grid at the lowest level plus a required tree traversal on top.  Generally, for homogeneous (evenly spread) distributions you are better served by a grid.

Though I mention many drawbacks, trees aren’t that bad.  Their memory savings compared to grids under “clustered distribution” situations are considerable, they offer fast coarse-granularity partitioning, and they’re darn simple to implement.

Hierarchy Grids

A cross between grids and quad trees: think of it as pre-allocating all the nodes of a tree (whether they have objects in them or not).  With respect to the two, hierarchy grids share the strengths and weaknesses of grids/trees depending on these choices.  I won’t repeat them – you can simply read them above 🙂

Benefits / Drawbacks

Overall, I much prefer hierarchy grids over grids/trees, because the platforms I work on can afford the extra memory overhead, and it’s actually quite advantageous to pre-allocate for my platform of choice (XNA/.NET) which reduces garbage collection pressure.

The biggest drawback to this is the cost of querying empty cells.

Kd Trees

kd-trees are kind of the uber-spatial-optimization of quad trees: instead of sub-dividing the world into fixed-size cells, the spaces are sized based on the objects located inside (at least where they were at time 0).

Usage Characteristics

Much the same as quad trees, except with “faster” queries.  However, special care has to be taken with modifications, as fragmentation occurs and the tree will need to be rebalanced to maintain high performance.   Generally these are useful for offline pre-computation of static environments.

Benefits / Drawbacks

Good for static environments (better than quad trees for localized ray queries), but not the most trivial to implement, and not a good choice for highly dynamic environments due to the additional cost of repartitioning.

Overall, the performance (and logic) is fairly close to quad trees. I dislike this choice because of the high repartitioning cost, and, unlike a grid or quad tree, it is a bit more difficult to debug.

Zones/Rooms

Somewhat of a ‘super partition’ strategy is to subdivide the world by “room”, where the dimensions of each room are often left up to content creators, and rooms are connected by portals, in addition to the potential for coordinate-system locality. A fairly well known example of this is the game Dungeon Siege.

Usage Characteristics

While I have not personally experimented with this approach, it seems interesting.  Perhaps not the way described in Dungeon Siege, but I could see this approach being useful for “zones” in the World of Warcraft sense of the word: allowing each zone to be subdivided by its own partitioning system.

Also noted by the Dungeon Siege article is that there is a certain benefit to allowing artists to author the rooms, but also that an incredible amount of complex infrastructure and toolsets needs to be developed to undertake their approach.    I would consider this to be fairly risky and costly, and it puts an emphasis on hand-tailored content, which is a luxury none but the most AAA projects can afford.

Offset Hierarchy Grid (OHG)

The OHG is what I just finished architecting for Novaleaf’s use.  It’s a hierarchy grid with each level offset from the previous, to allow the benefits of “loose” grids/trees without their unpredictable nature and maintenance overhead.

Usage-wise it’s “just a grid”, so it gives the same benefits and tradeoffs.  However, obviously, since I just architected Novaleaf’s solution around the OHG, there must be some reason we went with this approach, right?   Well, if you are reading this far, I’m sorry to disappoint, but this will be the subject of my next blog post(s).    Complain in the comments!!!

About the author:

Novaleaf Game Studios.

You can reach Jason via email (jasons aat novaleaf doot coom), or twitter (@jasons_novaleaf).

PS: images courtesy of wikipedia

PPS: anything you want to know about in detail?  again, let me know in the comments!!!

Sorting by branchiness

Original Author: Joe Valenzuela

This President’s day I’m happy to ford ahead and write this second article in a three part series. Generally I’m aiming to illustrate some of the times I made code less readable as a result of optimization. Last time I posted a case where we changed a run-time conditional into a templatized (thus compile-time) test to allow the compiler to remove it painlessly. Now I’m going to fillmore details in and talk about a more extreme case. As in my last post, this example comes from code running on the SPU.

PathNodes

A previous game I worked on had a lot of entities which fall under the description of PathFollowers. These are entities which basically follow a fixed path at a predictable (but not constant) speed. Although there are many varieties of path followers, their core functionality is provided by an element updated on the SPU called a PathNode.

While the PathFollower is a complex C++ virtual class, a PathNode is a simple (144 byte) struct. They are updated en masse and their owning C++ object is responsible for applying the transformed data at the appropriate point in the update loop. Once a frame, the PathNodes advance themselves along a piecewise linear path. Their behavior is mostly parametric – the distance moved, the path smoothing applied, and rotation about the roll axis. However, there are a few boolean parameters on each instance which can alter their update in a discontinuous fashion.

    kPathNodeLoop = 0x04,     // if not set, movement will stop at end-of-path
    kPathNodeSegments = 0x10, // speed, accel, etc are measured in segments/s, not m/s

Looping PathNodes, at the end of their path, will cycle back around (assuming the path is closed). PathNodes can also advance at a multiple of segments per second (segments being of non-uniform length) as opposed to meters per second. Both of these flags are tested in many of the functions used during a PathNode instance update.

Adapting this system to the SPU happened in the hayes of late project development. I wasn’t optimistic about replacing the entire system; I’d have to make do with simply porting it over and optimizing the largest existing code.

Code

Although PathNodes could theoretically transition from looping to non-looping or from travel-by-segment to non-travel-by-segment, the lion’s share of nodes were added when a path was created and remain in the same state until they’re destroyed. I had to grant nodes the ability to change state, but could optimize assuming it would not happen.

    // 0: no meter, no loop
    // 1: no meter, loop
    // 2: meter, no loop
    // 3: meter, loop
    #define NOMETER_NOLOOP     0
    #define NOMETER_LOOP       1
    #define METER_NOLOOP       2
    #define METER_LOOP         3

Like the problem I discussed in my previous article, I’d be building separate versions of a routine, each taylored for a combination of the potential dynamic settings. The PathNode system partitioned the set of updating PathNodes into one of four groups. More specifically, it used a simple counting sort to populate four sets of indices into the complete set (see the sketch below). This time I’d also be changing the instance virtual function update to one aggregated routine per partitioned group.
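Purely as an illustration of that partitioning (my own sketch, done as a single bucketing pass rather than a true two-pass counting sort – the flag names match the snippet above, everything else is made up):

    // Hypothetical sketch of partitioning PathNode indices into four groups by the
    // two flags; group order matches the #defines shown further down.
    #include <cstdint>
    #include <vector>

    enum { kPathNodeLoop = 0x04, kPathNodeSegments = 0x10 };

    struct PathNode { std::uint32_t m_flags; /* ...rest of the 144-byte struct... */ };

    // Fills four index lists: [0] no meter/no loop, [1] no meter/loop,
    // [2] meter/no loop, [3] meter/loop.
    void PartitionPathNodes(const std::vector<PathNode>& nodes,
                            std::vector<std::uint32_t> outIndices[4])
    {
        for (std::uint32_t i = 0; i < nodes.size(); ++i) {
            const std::uint32_t flags = nodes[i].m_flags;
            const int group = ((flags & kPathNodeSegments) ? 2 : 0)
                            + ((flags & kPathNodeLoop)     ? 1 : 0);
            outIndices[group].push_back(i);
        }
    }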

    // body of pathnode_meter_loop.cpp, an SPU program
    #define PATH_METER_TOGGLE 1
    #define PATH_LOOP_TOGGLE  1

    #include "code_fragment_path.inl"

    extern "C" void async_pathnode_meter_loop(global_funcs_t* gt, update_set_info_t* info,
                                              common_blocks_t* common_blocks,
                                              instance_streams_t* instance_streams,
                                              u8* work_buffer, u32 buf_size, u32 tags[4])
    #include "pathnode_raw.inc"
    #undef PATH_METER_TOGGLE
    #undef PATH_LOOP_TOGGLE

For my meta-programming fix this time, I relied on C preprocessor defines instead of template parameters. Many functions didn’t need to be changed to account for static branches. Those that did went through a simple if tedious set of transformations. For example, C++ if statements testing the relevant flags were changed to be always true or always false depending on the set of flags defined at compile time.

f32 GetEndT(Path* path)
{
    f32 end_t = (f32)(i64)path->m_point_count;
    if (m_flags & kPathNodeLoop)
        end_t -= 1.0f;

    return end_t;
}

#if PATH_LOOP_TOGGLE
f32 GetEndT_LOOP(Path* path)
#else
f32 GetEndT_NOLOOP(Path* path)
#endif
{
    f32 end_t = (f32)(i64)path->m_point_count;

#if (PATH_LOOP_TOGGLE == 0)
    end_t -= 1.0f;
#endif

    return end_t;
}

Affected function names were changed as well to help ensure the right version was being compiled and called under the appropriate circumstances (some preprocessor redirection was used to help prevent invoking the wrong function through inconsistent C preprocessor #defines and file inclusion). There was also an explicitly typed constant to avoid having to completely rewrite compound or complicated expressions. This didn’t totally solve the readability issue, but it did help ameliorate some of the problems that arose from introducing C preprocessor metaprogramming.

#if PATH_LOOP_TOGGLE
    const bool LOOP_CONSTANT = true;

    #define PATH_GetEndT(p) PATH::GetEndT_LOOP(p)
    // ...
#endif

Conclusion

The changes I’d made had an unexpected shelf life. In the process of porting it, I’d made the system functionally the same on the PPU as on the SPU. The partitioning of instances, the aggregate update, and the removed branches were identical on both processors. After the project in which it was initially used, the SPU version fell into disuse and eventually the PPU version was used exclusively. This reflects a couple of things. First, the total number of path followers in subsequent projects has decreased from what in retrospect appears to be a high water mark. Second, the steps of prepping a system to run on the SPU also contribute to performance improvements on other platforms. (There are a third and fourth which I’m omitting unless polked).

Compile time polymorphism

Next time, I’ll talk about how my metaprogramming mania reaches a fever pitch with a scheme to move virtual dispatch to compile time. Also, I’ll be lincoln the points from all the articles together and talking about the larger lessons.


Building a Network Library for a Video Game: Part 2

Original Author: Joe Hegarty

Originally published on my personal blog. In my last post I covered the basics of how TCP/IP works and explained why I decided to use UDP. In this post I’ll be covering how to create a socket and send/receive data using it. Let’s dive right in, shall we?

Working with Winsock

As I described in my last post, we’ll be working with Winsock in this series. The code should port relatively easily to other platforms.

The very first thing we need to do when working with Winsock is to include the correct header.  The header is WinSock2.h.

#include <WinSock2.h>

We also need to link against the Winsock 2 library: specify ws2_32.lib as an additional dependency in your project’s linker input settings.


Starting up and Shutting down Winsock

When working with Winsock you are required to initialise the Winsock DLL for use in your process. This is achieved with a call to WSAStartup. We will be using Winsock 2.2, so we need to tell Windows that’s the version of Winsock we’d like to load.

bool Initialise()
{
    ::WSADATA wsaData = { 0 };
    int startupResult = ::WSAStartup(MAKEWORD(2,2), &wsaData);
    return startupResult == ERROR_SUCCESS;
}

Once you are finished with Winsock you are also required to shut it down; this is achieved with a call to WSACleanup.

bool Shutdown()
{
    int shutdownResult = ::WSACleanup();
    return shutdownResult == ERROR_SUCCESS;
}

Initialising a Socket

Now that Winsock is ready to be used, we can create a socket. We’ve decided to use UDP, so we’ll be creating a UDP socket with a call to the socket function.

SOCKET socketHandle = ::socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);

if(socketHandle != INVALID_SOCKET)
{
    // Socket was created successfully
}
else
{
    // Socket creation failed
}

As you can see, we are passing three arguments. The first argument represents the protocol family we wish to use; we’ll be using IPv4, which is represented by PF_INET (we could also use PF_INET6 if we were targeting IPv6). The second argument is the type of socket we wish to create; SOCK_DGRAM describes a datagram socket, the type required for UDP. The final argument represents the protocol we’ll be using, which is UDP.
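
For comparison, an IPv6 datagram socket would only differ in the protocol family (a sketch; the rest of this series sticks with IPv4):

// IPv6 variant of the same call (not used in the rest of this series).
SOCKET socketHandle6 = ::socket(PF_INET6, SOCK_DGRAM, IPPROTO_UDP);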

Binding a Socket

Now that we have a UDP socket, before we can use it we need to bind it to a port with a call to the bind function.

We’ll take our previous example of creating a socket and add the binding functionality.

SOCKET socketHandle = ::socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);

if(socketHandle != INVALID_SOCKET)
{
    ::sockaddr_in inAddress = { 0 };
    inAddress.sin_family = AF_INET;
    inAddress.sin_addr.s_addr = INADDR_ANY;
    inAddress.sin_port = htons(portNumber);

    int bindResult = ::bind(socketHandle, (SOCKADDR*) &inAddress, sizeof(inAddress));
    if(bindResult != SOCKET_ERROR)
    {
        // Socket bound to port
    }
    else
    {
        // Failed to bind the socket to a port.
        // We should close the socket here usually.
    }
}
else
{
    // Socket creation failed
}

As you can see, once the socket has been created the first thing we need to do is create a sockaddr_in structure; this tells Winsock which port and address we’d like to bind to.

The sin_family variable describes which address family we’d like to use. sin_addr.s_addr tells Winsock which address we would like to bind to; in our case we don’t really mind, so we let Winsock pick for us with INADDR_ANY. Next we specify the port number we’d like to bind to in the sin_port variable. The call to htons means Host to Network Short: network byte order is always big-endian, so any integers passed to Winsock need to be in big-endian format. htons detects the host byte order and converts to network order if required.
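
As a concrete illustration (assuming a little-endian host such as x86), htons just swaps the two bytes of the value:

// 4567 is 0x11D7; a little-endian host stores its bytes in memory as D7 11.
// htons swaps them to 11 D7 -- most significant byte first -- which is
// the order the network (and therefore Winsock) expects.
unsigned short hostPort    = 4567;
unsigned short networkPort = htons(hostPort);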

Finally we call the bind function and check that we bound successfully; only one socket can be bound to a given port at any one time.

Sending Packets

Now that we have a bound socket, we can send a packet. As there is no connection in UDP, you need to specify the destination every time you want to send a packet. IPv4 addresses are typically represented as a group of four unsigned 8-bit integers, such as 72.14.204.104 (a google.com address, if you were wondering). Winsock represents IP addresses as an unsigned 32-bit integer, so we need to do a little fiddling to change the standard representation into something Winsock can understand. You can do this with the following operation:

(72 << 24) | (14 << 16) | (204 << 8) | 104

Now that we know how to do it, we’ll wrap it up into a helper function.

unsigned int CreateAddress(unsigned char a, unsigned char b, unsigned char c, unsigned char d)
{
    unsigned int newAddress = (a << 24) | (b << 16) | (c << 8) | d;
    return newAddress;
}

To actually send data, we make a call to the sendto function.

bool Send(SOCKET socketHandle, unsigned int destinationAddress, unsigned short destinationPort, const char* p_data, u32 dataSize)
{
    if(socketHandle != INVALID_SOCKET)
    {
        ::sockaddr_in destination = { 0 };
        destination.sin_family = AF_INET;
        destination.sin_port = htons(destinationPort);
        destination.sin_addr.s_addr = htonl(destinationAddress);

        int sentLength = ::sendto(socketHandle, p_data, dataSize, 0, (sockaddr*)&destination, sizeof(destination));

        return sentLength != SOCKET_ERROR;
    }

    return false;
}

We need to construct another sockaddr_in, but this time we’re specifying the destination of the packet. You’ll notice a call to htonl, which does the same thing as htons but for longs, so it’s used for the 32-bit address. Once this is done, we can make the call to sendto; the arguments are the handle of the socket we wish to use to send the packet, a char* pointing to the data we wish to send, the length of that data, the flags to use for sending, the destination address and the length of the destination address structure.

sendto returns the amount of data sent (which is always the full amount when using UDP) or SOCKET_ERROR when the data fails to be sent. It’s important to remember that it doesn’t tell you whether the data actually reaches its destination; UDP offers no guaranteed delivery.
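
Putting the helpers together, sending a small payload might look something like this (a sketch; the address, port and message are arbitrary, and socketHandle is the bound socket from earlier):

// Sketch: send a short message to 127.0.0.1 on port 4567 using the helpers above.
unsigned int destinationAddress = CreateAddress(127, 0, 0, 1);
const char* message = "hello";

if(!Send(socketHandle, destinationAddress, 4567, message, 5))
{
    // sendto failed outright; even on success, delivery is not guaranteed.
}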

Receiving Packets

Next we’ll take a look at receiving packets. To read a packet we use the recvfrom function. You can receive packets from multiple different remote sockets, and the recvfrom call will give you the remote address.

bool Receive(SOCKET socketHandle, unsigned int& outAddress, unsigned short& outPort, char* p_outData, u32& outSize, u32 dataBufferSize)
{
    if(socketHandle != INVALID_SOCKET)
    {
        ::sockaddr_in fromAddress = { 0 };
        int fromAddressLength = sizeof(fromAddress);

        int receivedLength = ::recvfrom(socketHandle, p_outData, dataBufferSize, 0, (sockaddr*) &fromAddress, &fromAddressLength);

        if(receivedLength != SOCKET_ERROR && receivedLength > 0)
        {
            outSize = receivedLength;
            outAddress = ntohl(fromAddress.sin_addr.s_addr);
            outPort = ntohs(fromAddress.sin_port);
            return true;
        }
    }

    return false;
}

First we create another sockaddr_in structure, which will be filled with the details of the remote peer. We also need to supply a buffer we’d like recvfrom to fill. recvfrom returns the amount of data received, or SOCKET_ERROR when an error is encountered. You shouldn’t rely on the remote address to identify a connection; we’ll cover that in a future post.

There is one problem though: as the code currently stands it will block on the recvfrom call until some data is received. As we’re currently running this on our main thread, that’s not particularly suitable for our needs.

Blocking vs Non-Blocking

By default a socket is created in blocking mode; as discussed in the last section, this means the code will block in the recvfrom call. Luckily, Winsock provides a way to stop the socket blocking and make it return immediately. The function we will be using is ioctlsocket, which allows us to set modes on the socket; in this case the mode we are interested in is FIONBIO.

You can enable/disable blocking as follows:

bool SetBlocking(SOCKET socketHandle, bool blocking)
{
    if(socketHandle != INVALID_SOCKET)
    {
        DWORD nonBlocking = blocking ? 0 : 1;
        int setBlockingResult = ::ioctlsocket(socketHandle, FIONBIO, &nonBlocking);

        if(setBlockingResult == ERROR_SUCCESS)
        {
            return true;
        }
    }

    return false;
}

Closing a Socket

Finally, we need to clean up after ourselves and close the socket. We can do this with the closesocket function.

void Close(SOCKET& socketHandle)
{
    if(socketHandle != INVALID_SOCKET)
    {
        ::closesocket(socketHandle);
        socketHandle = INVALID_SOCKET;
    }
}

Wrapping all of this up

Now that we’ve covered the basics of sockets, we want to wrap all of this up in an easy-to-use form. I’ve created a library, packaged as a Visual Studio 2010 project, that does exactly that. It’s licensed under the Microsoft Public Licence, so it should be suitable for most uses.

Example:

#include "NetLib.hpp"
 
  #include "socket/NetSocketWrapper.hpp"
 
  #include "socket/NetSocketAddress.hpp"
 
  
 
  int _tmain(int argc, _TCHAR* argv[])
 
  {
 
      NNetLib::u16 localPort = 4566;
 
      NNetLib::u16 remotePort = 4567;
 
  
 
      if(argc > 2)
 
      {
 
          localPort = _wtoi(argv[1]);
 
          remotePort = _wtoi(argv[2]);
 
      }
 
  
 
      std::cout << "##########################" << std::endl;
 
      std::cout << "Local Port: " << localPort << std::endl;
 
      std::cout << "Remote Port: " << remotePort << std::endl;
 
      std::cout << "##########################" << std::endl << std::endl;  
 
  
 
      // Initialise our network library, including Winsock
 
      if(NNetLib::StartNetLib())
 
      {
 
          std::cout << "Network library initialised..." << std::endl;
 
          NNetLib::CSocketWrapper socket;
 
          // Create and bind our socket
 
          if(socket.Create(localPort))
 
          {
 
              // Set the socket to non-blocking for this example
 
              socket.SetBlocking(false);
 
              std::cout << "Socket created..." << std::endl;
 
              NNetLib::u32 remoteAddress = NNetLib::CSocketAddress::CreateAddress(127, 0, 0, 1);
 
              NNetLib::CSocketAddress remoteSocket(remoteAddress, remotePort);
 
  
 
              while(true)
 
              {
 
                  // Send some data
 
                  const std::string dataToSend = "Welcome to the world of tomorrow!";
 
                  if(socket.Send(remoteSocket, dataToSend.c_str(), dataToSend.length()))
 
                  {
 
                      std::cout << "Sent " << dataToSend.length() << " bytes of data to port " << remoteSocket.GetPort() <<  std::endl;
 
                  }
 
  
 
                  // Receive data
 
                  bool moreData = true;
 
                  do
 
                  {
 
                      char dataOut[1024];
 
                      NNetLib::CSocketAddress receiverSocket;
 
                      NNetLib::u32 receiveSize = 0;
 
  
 
                      moreData = socket.Receive(receiverSocket, dataOut, receiveSize, sizeof(dataOut));
 
  
 
                      if(moreData)
 
                      {
 
                          std::cout << "Received " << receiveSize << " bytes of data from port " << receiverSocket.GetPort() << std::endl;
 
                      }
 
  
 
                  } while (moreData);
 
  
 
                  // Dirty, but let's sleep in this example
 
                  ::Sleep(1000);
 
              }
 
  
 
              // Close the socket
 
              socket.Close();
 
  
 
          }
 
      }
 
  
 
      // Shutdown the network library, including winsock
 
      NNetLib::ShutdownNetLib();
 
  
 
      return 0;
 
  }


In my next post we’ll be looking at how to serialise data.

Further Reading