Speaker Highlight – Bobby Anguelov

Original Author: AltDevConf

Our range of speakers runs the gamut from indie right through to AAA, and today’s highlight is drawn from the AAA end of the spectrum – Bobby Anguelov of IO Interactive will be speaking about “AAA Game Programming, A Day in the Life…”

So how do you get NPCs to behave the way they do? How do you go from a set of raw animations to a character running around in the game? AI programmers are focused on building the behavior and animation systems for NPCs. This short talk will discuss the various domains that this job title covers, as well as some of the responsibilities and realities of working in the AAA industry.

Here’s a bit more about Bobby.

Bobby is an AI programmer at IO Interactive, where he worked on Hitman: Absolution. Prior to this he spent some time working as a software consultant in various industrial and commercial fields. There was also a dark period in his life where he taught graphics programming at a university…

Bobby and all the other speakers are working hard on their sessions right now. You can learn more at the conference mini-site, and we really hope that you will join us November 10th and 11th to learn more from these excellent speakers.

Cover your Assets: The Cost of Distributing Your Game’s Digital Content

Original Author: Colt McAnlis

Today’s game developers have a wide choice of digital distribution platforms from which to sell their products. When a game is very large it’s often necessary to deliver some of its content from another source, separate from the distribution service. This can be new territory for those of you who are unfamiliar with web development. In this article, we’ll discuss some of the concepts and strategies that will help you decide how to distribute your content on the web. It is well worth the effort to distribute your digital assets yourself, because it gives you the chance to make your users happier.

When thinking about digital delivery the two main factors to consider are time and money: The time it takes the user to download content and start playing, and the dollar cost to you, the developer, to deliver the bits. As you might suspect, the two are related.

Measuring the Cost of Time: Bandwidth and Latency

We all know that users hate long load times; load time is already known to be a large factor in the success of websites. To better understand the issues, let’s look at some results from running Speedtest to measure download speeds from various servers around the world to a public computer in a San Jose library. Using the measured download speeds, here are the times it would take to download 1GB from various locations:

Server (City, Country)   Download Speed   Time to download 1GB (minutes)
San Jose, USA            20 Mbps            6.8
Maine, USA               15 Mbps            9.1
Lancaster, UK            2.25 Mbps         60.7
Sydney, AU               2.22 Mbps         61.5
Madrid, ES               1.93 Mbps         70.7
Beijing, CN              0.8 Mbps         170.6

Depending on the location of the server it can take from about 7 minutes to almost 3 hours to transfer the same amount of data. Why is there such a variation in download time? To answer the question you must consider two terms: bandwidth and latency.
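Those times are just size divided by bandwidth; here is a quick Python sanity check of the table (the function name is mine, not from any real API):

```python
# Rough download-time model behind the table: size / bandwidth,
# ignoring latency, packet loss, and protocol overhead.

def download_minutes(size_gb: float, speed_mbps: float) -> float:
    """Minutes to move size_gb gigabytes over a speed_mbps link."""
    size_megabits = size_gb * 1024 * 8   # 1 GB = 1024 MB = 8192 megabits
    return size_megabits / speed_mbps / 60

# Reproduces the table: San Jose at 20 Mbps vs Beijing at 0.8 Mbps.
print(round(download_minutes(1, 20), 1))   # 6.8
print(round(download_minutes(1, 0.8), 1))  # 170.7 (the table truncates)
```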

Bandwidth is the amount of data that can be transmitted in a fixed amount of time, typically measured in bits per second. Developers have some control over how much bandwidth they consume (for example by reducing packet size or the number of packets transmitted), but the available bandwidth is ultimately controlled by the infrastructure between the user and the distribution service and the service agreements that each has with their respective ISPs. For instance, an ISP may offer different tiers of bandwidth service to users, or may throttle bandwidth based upon daily usage limits, and the “last mile” of copper – or whatever the connection is to the user’s platform – can also cause bandwidth to degrade. (A chain is only as strong as its weakest link.)

Latency measures the time delay experienced in a system. In networking parlance latency is often discussed as round-trip time (RTT), which is easier to measure since it can be done from one point. You can imagine RTT as the time it takes for a sonar ping to bounce off a target and return to the transmitter. The familiar Unix ping program does just that. As a developer, you have little control over latency. Your algorithms might contribute some overhead that will accrue to latency, but the physical realities of the distance between the transmitter and receiver and the speed of light impose a hard lower bound on RTT. That is, the physical distance between two points, across a specific medium (like copper wire or fiber cable), caps how fast data can be transmitted between them.
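To see how hard that lower bound is, we can work it out for the San Jose–Beijing pair from the table (the ~9,500 km distance is my rough great-circle estimate; signals in fiber travel at roughly 2/3 of the vacuum speed of light):

```python
# Physics-only lower bound on RTT: no routing, queuing, or server time.
C_VACUUM_KM_S = 299_792              # speed of light in vacuum, km/s
C_FIBER_KM_S = C_VACUUM_KM_S * 2 / 3 # typical propagation speed in fiber

def min_rtt_ms(distance_km: float) -> float:
    """Best-case round-trip time in milliseconds over fiber."""
    one_way_s = distance_km / C_FIBER_KM_S
    return one_way_s * 2 * 1000

# San Jose to Beijing is on the order of 9,500 km:
print(round(min_rtt_ms(9500)))  # ~95 ms before any real-world delays
```

Real-world RTTs are higher still, since packets rarely travel the great-circle path and every hop adds processing time.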

At first glance, you might think that you can buy your way to faster content delivery by purchasing higher bandwidth connections. This might work if you knew exactly where your users were located and they never moved; however, as Ilya Grigorik points out, incremental improvements in latency can have a much greater effect on download times than bandwidth improvements. (His argument relies on some interesting work by Mike Belshe, who makes the same case.)

The simplest and most popular way to reduce latency is to minimize the distance between the user and the content server using a Content Delivery Network, or CDN.


Attacking latency with locality

Content Delivery Networks duplicate and store your data in multiple data centers around the world, reducing the amount of time it takes to fetch a file from an arbitrary point on the globe. For instance, I may have originally uploaded a file to a CDN server that happened to be in San Jose, but a user downloading the file from Beijing would most likely receive the file from a server in China.

When you use a Content Delivery Network you take advantage of one of the basic features of internet architecture: The internet is, at its core, a hierarchy of cached data. YouTube is an excellent example. Once a video is uploaded, it is distributed to the primary YouTube data centers around the world, avoiding the higher cost of sending the file from the originating data center no matter where the request is coming from. Google Cloud Storage uses a similar policy. In high-demand areas multiple intermediate caches can exist. For example, there may be two additional data centers in Paris.

You may not be aware of another hidden efficiency of the net: Client machines usually cache data for faster retrieval later, but data can be cached by an Internet Service Provider (ISP) as well, before sending it on to the end user – all in an attempt to reduce the cost of data transfer by keeping the bits closer to the users.


Attacking latency with technology

Some CDNs provide advanced transfer protocols that can speed up delivery even more. Google App Engine supports the SPDY protocol, which was designed to minimize latency and overcome the bottleneck that can occur when the number of concurrent connections from a client is limited.

A CDN can also offer flexibility in controlling access to your data. Google Cloud Storage supports Cross-origin resource sharing (CORS), and Access Control Lists, which can be scripted. These tools can help you tailor the granularity of your content, pairing specific assets with specific kinds of users for example. Google App Engine is scriptable. Scripting can help you increase the security of your online resources, for example by writing code that detects suspicious behavior such as an unusual barrage of requests for an asset coming from multiple clients.

Using a CDN allows you to scale and deliver data to your users around the world more efficiently and safely.


A note on mobile content delivery

Mobile networks have additional problems involving latency and download speed. Ilya Grigorik does a great job of explaining this:

The mobile web is a whole different game, and not one for the better. If you are lucky, your radio is on, and depending on your network, quality of signal, and time of day, then just traversing your way to the internet backbone can take anywhere from 50 to 200ms+. From there, add backbone time and multiply by two: we are looking at 100-1000ms RTT range on mobile. Here’s some fine print from the Virgin Mobile (owned by Sprint) networking FAQ:

“Users of the Sprint 4G network can expect to experience average speeds of 3Mbps to 6Mbps download and up to 1.5Mbps upload with an average latency of 150ms. On the Sprint 3G network, users can expect to experience average speeds of 600Kbps – 1.4Mbps download and 350Kbps – 500Kbps upload with an average latency of 400ms.”

To add insult to injury, if your phone has been idle and the radio is off, then you have to add another 1000-2000ms to negotiate the radio link.

So for you mobile developers, be aware of these issues when trying to get data to the user fast. Be sure your streaming and compression systems are designed to compensate for these extra burdens.

The Cash Cost of Content Delivery

You must spend some money to use a CDN. (Sadly, the free lunch comes next Tuesday.) For example, Google Cloud Storage charges around $0.12 per gig for the first 1 terabyte of data transferred each month.

To put that in perspective, let’s say your game sees 3.4 million unique users monthly. Assuming your in-game content is 1GB in size, and Google Cloud Storage charges about $0.085 per gig to transfer 0.66 petabytes a month (the pricing tier at that usage level), your cost would be about $9,633 per day.

In other words, to break even you’d need to earn about $0.002 per user per day to distribute that much content. If you’ve got 3.4 million monthly users, you should easily be able to do that.
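The arithmetic behind those figures is worth making explicit; here is the back-of-the-envelope version (variable names are mine, the prices and counts are from the text):

```python
# Back-of-the-envelope CDN cost for the scenario in the text.
users = 3_400_000        # monthly unique users
gb_per_user = 1.0        # content size each user downloads
price_per_gb = 0.085     # $/GB at that usage tier

monthly_cost = users * gb_per_user * price_per_gb
daily_cost = monthly_cost / 30
break_even_per_user_day = daily_cost / users

print(round(daily_cost))                  # 9633 — the $9,633/day figure
print(round(break_even_per_user_day, 4))  # 0.0028 — a fraction of a cent per user
```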

Admittedly these are worst-case numbers; the chance that you’re serving 1GB of content to 3.4 million unique users, month after month, is a bit far-fetched. Within a few months everyone on the planet would have your content, so this scenario doesn’t represent a long-term estimate.

Knowing is Half the Battle

Once you’re aware of the time and financial costs involved with distributing assets, you can plan a path to do something about it.

Only send to users what they need (right now)

Until now, we’ve assumed that all 1GB of content is required for each user to start the game, but that’s a great, horrible lie. In reality, the user may only need a subset of the data to begin play. With some analysis you will often find that this initial data can be delivered quickly, so the user can start to experience the content immediately, while the rest of the data streams in the background.

For instance, if a user downloads the first 20MB from a website or digital software store, can they start playing right away and stream in the rest later? How long until they need the next 20MB? What about next 400MB?  Would a CDN be able to deliver the follow-on content in a faster, or more flexible manner? Optimizing for this sort of usage can decrease the perceived load time and the overall transfer cost, enhancing the accessibility and affordability of your product.
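To see what this buys you, compare the wait before play for a full download versus a small initial chunk, using the Lancaster bandwidth from the earlier table (the helper function is mine):

```python
# Perceived load time: ship a small "play now" chunk first and stream
# the rest in the background, versus downloading everything up front.

def seconds_to_download(size_mb: float, speed_mbps: float) -> float:
    return size_mb * 8 / speed_mbps   # megabytes -> megabits / bandwidth

full_game_mb = 1024
first_chunk_mb = 20
speed = 2.25  # Mbps, the Lancaster, UK figure from the earlier table

print(round(seconds_to_download(full_game_mb, speed)))   # ~3641 s before play
print(round(seconds_to_download(first_chunk_mb, speed))) # ~71 s before play
```

The same total number of bytes moves either way; only the moment the user can start playing changes.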

Own your content updates

In the current world of game development, it’s common to run on many different platforms. When an update is available, it takes time to be sure that the new content has been received by all your users. For instance if you’ve pushed a new build of your game server, some of your players can be out of sync for an extended period, which can generate lots of “OMG th1s g4m3 duznt werk!” bugs for your QA testers. Controlling how and when your app performs updates can be highly beneficial – though it must be noted that in some cases the updating logic is managed by the operating system and is out of your hands. Whenever possible, your app should be aware of new updates and capable of fetching them.

Most applications will contain some number of platform-specific assets. For instance, the types of hardware-supported texture compression formats can vary by platform, you may need a separate tier of lower-resolution models to run on mobile, or some of your content can vary by region. When any of these assets change, the associated update need not be universal. If you can segment at least a part of your content by platform and location you can better control when you need to update, who needs to update, and what you need to send.
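One simple way to realize this segmentation is a manifest that tags each asset with the platforms and regions it applies to. The schema below is entirely made up, but shows the idea: an update only touches the entries whose keys match.

```python
# Hypothetical per-platform/per-region asset manifest.
MANIFEST = [
    {"asset": "textures_pvrtc.pak", "platform": "ios",     "region": "all"},
    {"asset": "textures_dxt.pak",   "platform": "desktop", "region": "all"},
    {"asset": "strings_fr.pak",     "platform": "all",     "region": "fr"},
    {"asset": "core_logic.pak",     "platform": "all",     "region": "all"},
]

def assets_for(platform: str, region: str) -> list:
    """Only the assets this particular user actually needs."""
    return [e["asset"] for e in MANIFEST
            if e["platform"] in (platform, "all")
            and e["region"] in (region, "all")]

print(assets_for("ios", "us"))  # ['textures_pvrtc.pak', 'core_logic.pak']
```

When only the French strings change, only users matching `region == "fr"` need to download anything.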

Moving forward

To be fair, the distribution strategies we discussed here are not for everyone. If your entire app is under 10MB, there’s little need to segment assets or distribute content outside of the primary distribution point.

But if your app is hefty, it pays to understand the costs involved with distributing your game’s digital assets, and how you can reduce those costs – and the users’ costs as well. It’s also wise to consider how a distribution strategy can decrease load times and reduce headaches caused by updates. By taking control of distribution you have the ability to save money and increase the quality of the end-user experience.

Which is the point, right?

Fix your Pebbles

Original Author: Ted Spence

Good programming stories don’t quite grow on trees, but it’s quite fun to take random bits of advice and make them apply to technology. One of my favorite witty aphorisms came from – if you can believe it – an inspirational saying on a tabletop.

I came across this phrase at lunch across the street from Vivendi Universal’s headquarters. The restaurant had decorated their furniture with little sayings, the kind that probably came from a fortune cookie. I found this one underneath my fish tacos:

“It isn’t the mountain in front of you that slows your pace – it’s the pebble in your shoe.”

That’s quite a snappy turn of phrase. It’s short, intelligible, pithy, and it quickly found its way into my collection of “Wise Programmer Stories”. This phrase is especially useful when I’m teaching a junior employee who would otherwise blindly use whatever tools and techniques I gave them.

I tell them the phrase, and ask them, “What is it that slows down your work? What can you do to improve upon your situation?”

Empowerment, Proactivity, and $GENERIC_BUZZWORD_3

It’s a cliche of management that you want your employees to think on their feet and solve problems without having to be told. But we, as managers, often make it impossible for employees to deliver the kind of improvements we wish they would produce!

Modern development teams saddle our programmers with painful ticket tracking systems, arbitrary reporting requirements, continuous time management, and “priority-order” deliverables. We often tell our employees that “only critical tasks matter” and they should leave the little things for another day.

By teaching my team to spot, and fix, problems without asking for permission, I am teaching them to use their intelligence to make the company better at every moment. Even more importantly, I’m telling them to improve on things that may not yet be problems, because often the person closest to the task sees a problem long before everyone else.

When I tell my employees to fix their pebbles, I ask them to “sneak” improvements. I tell them to take a big task that’s really annoying, pad its time estimate, and use the extra time to deliver some core improvement that matters to them. I certainly understand that sometimes this means I will lose an hour or two of an employee’s time to scratching their own itch. But doing this has two key benefits:

  • I am teaching the employee to operate on their own initiative. I am challenging them to surprise me with some key improvement that I may not have known about. I reward someone richly when they surprise me like this.
  • I am encouraging the employee to love their job, and to take ownership of it. An employee who shows up at 9am, pulls tickets off the pile, and executes until 5pm is just a machine. My employees love their work and they deliver better results because of it.

Bad Tools Are Often Invisible

It’s very easy for a manager to keep track of the “top level tasks.” Nobody ever loses sight of the core deliverables, the top priority projects, the contractual obligations. But managers often miss the details. It’s easy to develop a blind spot for little things that sap a team’s time and attention. I’ve seen the following happen repeatedly:

  • Developers who have to hunt down the latest specifications for a project, because the authors write files in MS Word using revision tracking and share them via email (or … shudder … paper).
  • Artists who have to cut and copy texture data from their artworks before saving new work into the content pipeline.
  • Operations team members who have to back up and migrate databases manually to set up test environments.
  • Complex “setup” or “go-live” tasks that nobody bothers to automate.
  • Fragile “build scripts” that require constant attention.
  • Non-automated disaster recovery tests that take people days off their normal schedule.
  • Slow code that encourages team members to get up and walk away rather than staying focused on a task.
  • Incomplete documentation that keeps team members calling each other for “final” information.
  • Infrastructure health reports that take time to generate each day.

In many cases, these (and innumerable other) tool problems are invisible. Because each problem is so small, and because the effort required to report little problems in a time-management system is annoying, employees don’t bother reporting little nuisances. Instead, large tasks simply slow down because employees are busy doing and redoing these little tasks when they think they’re working on the big ones.

The manager who observes the team’s progress data may not even notice the delay – after all, it’s only a small percentage of a larger task. Isn’t the large task the only critical thing? The customer doesn’t care if your backup process is fast or slow; the customer only cares that they get their deliverable on time, right?

Our Development “Pace” is Constantly Measured

It’s become a standard part of business practice – something they teach you every semester in business school – that you have to measure your progress. Many Scrum and agile processes even explicitly ask you to cite your pace, or velocity, as a standard daily metric.

Yet a simple measurement, “pace” often obscures critical information. If you have five “tasks” that are estimated to take 100 hours, and you complete four of them in a week, is your pace “4”? Is your pace 80%? Is your pace the ratio between the hours you estimated and the hours you spent? Does any of that information help the company?

Setting aside these statistics, let’s consider another way to look at our productivity. Every day, my team does some tasks that are mundane, boilerplate work – recording facts, submitting reports, updating statistics. But then there’s the rich, unique stuff that is our company’s value proposition: writing new code, building an improved system, delivering new features.

What percentage of our time is taken up by the ordinary stuff?

If you take an ordinary task – say, backing up the database and putting it on the disaster recovery server – and write a script to automate it, plus another script to verify that it worked, you’ve just taken a task that added no value to the company and converted it into one that does.

Pebble Spotting

Alright, let’s put on our manager caps. We have two engineers and five critical projects that have to be delivered regularly over the next month. While you’re dropping by to visit the team, you notice that Alice has her feet up on the table and is staring at the ceiling.

“What’s going on?” you ask.

“I’ve built my code and now I have to test on the staging environment. Just waiting for Bob to set it up.”

“Why can’t you do it?”

“Bob is the only one with the credentials necessary to make the changes to the staging server. I pinged him a few minutes ago and he’s just getting started now.”

Congratulations! You’ve just discovered a pebble. By itself, it’s not a big deal. Many managers just shrug and walk away, and complain later that their programmers aren’t as good as the programmers at Google.

So what do you do next? Your team is already hard at work on the critical projects you’ve given them. You can tell them to work harder, or put in longer hours, or just focus more closely. But you might be better off tasking Alice and Bob with teaming up to develop a script to fix the staging problem.

Without constant attention, these little time sinks proliferate. You may find that many of the delays in your top-level dates are actually an accumulation of little delays, because nobody bothered to fix these pebbles back when you had the time to do so. Here are some good places to look for pebbles that you can fix:

  • Check your continuous integration system. Does it need constant fiddling to keep it running?
  • How good is your content pipeline? Is it constantly sucking in new artwork and pushing it out to the nightly builds?
  • Talk to your operations team. Can they deploy new code by hitting a single button? Does the button work every time?
  • Look at the nightly automated tasks. Do they run unattended? Do critical managers just get reports from them like clockwork?
  • How long does it take to set up a new hire’s computer?
  • Is your documentation written up and published online in a central intranet? Can your team find the information they need?

Good Tools Multiply Your Velocity

Does anyone think the original StarCraft would have been as good a game without its fantastic level editor? Blizzard achieved incredible, persistent success in part because of that toolset: the StarCraft level editor was a breath of fresh air. It was used both internally and externally, and it took a game that was already at the peak of its genre and elevated it into something completely different: a platform that redefined competitive gaming.

Yet many of us suffer in silence because we have to continuously redo our manual work every day. We know our tools aren’t quite perfect, but there are always other pressures – deliverables, contracts, milestones. Sometimes we blind ourselves by constantly worrying about other things.

If we do see the problems, we don’t always allow ourselves the time necessary to fix them. But when you encourage your team to build incredible tools every day, you rapidly find that your speed on the big tasks increases. It’s not just because your team eliminates a few minutes of hassle – it’s because you’re helping your team think about their work.

When you give the team permission to fix the little nuisances, you teach them to value every part of their contribution. They aren’t just working on a deliverable, they’re working on improving your company as a whole.

Give It To The Lazy Guy

The rumor I’ve heard is that Bill Gates once said, “I will always choose a lazy person to do a difficult job – because he will find an easy way to do it.” Whether or not he actually said it (the quote mostly lives on cheesy “meme” pages), it shows directly the consequence of the “pebbles” aphorism. If you take the things about your job that are difficult, frustrating, or hard, and make them easy, you’ll enable yourself to deliver the big projects faster. Your company will move faster.

And, most importantly, your employees will know that they should solve problems rather than just deal with them.

Being a negative developer

Original Author: Rob Galanakis

I want to respond to the AltDevBlogADay post Negative Developers and Team Stability, which hit home. It’s not that I think the advice was particularly interesting (it’s all good, standard stuff); it’s that it reminded me that I’ve been a negative developer.

I don’t know what I could have done differently. I just wasn’t happy at work, and there was little I could do to change it. The quality of my work was apparently very good, I was just terrible for morale, because I was either 1) pissing people off or 2) encouraging people to be pissed off at the problems I/we saw. Eventually I got the best advice I’ve ever gotten (which deserves its own blog post), and left the company. I went to the right place and became a positive developer.

And that’s sort of what struck me about the article, and about how we typically deal with negative developers. Some developers are just not a good fit, regardless of how amazing their work is. If someone is negative because she is “culturally incompatible”, there’s nothing you or your manager can do to fix it. It is worth having a frank discussion about whether that person can ever be happy without changes to the studio, and if the answer is ‘no’, you should discuss plans to part with mutual respect at a mutually agreed date.

I had to put in my two weeks at my last job before this advice was given to me by the President (GM? Can’t remember) at the time. It convinced me to un-quit, and to stay on another year. It ended up being a miserable year in many ways, but it was the right thing to do and worked out for the best. As managers – and as friends and team members of negative developers – we need to keep this advice in mind (and apply it to ourselves as well).

A Data-Oriented, Data-Driven System for Vector Fields — Part 3

Original Author: Niklas Frykholm

In this post, I’ll finish my series on vector fields (see part 1 and part 2) by tying up some loose ends.

Quick recap of what has happened so far:

  • I’ve decided to represent my vector fields in functional form, as a superposition of individual effect functions G_i(p).
  • I represent these functions in bytecode format, as a piece of bytecode that given an input position p computes a vector field strength F_i.
  • By running each step of the virtual machine over thousands of input points, the cost of decoding and interpreting the bytecode instructions is amortized over all those points.
  • This means that we get the bytecode decoding “for free” — the bytecode can run at nearly native speed.

Bytecode format

In the last article I didn’t say much about what format I used for the bytecode. Generally speaking, designing a bytecode format can be tricky, because you have to balance the compactness (keeping programs short) against the decoding cost (keeping bytecode fast).

Lucky for us, we don’t care about either of these things. Compactness doesn’t matter, because our programs will be very short anyway (just a few instructions). Decoding cost doesn’t matter (much), because it is amortized.

When it doesn’t really matter I always pick the simplest thing I can think of. In this case it is something like:

(instruction) (result) (argument-1) (argument-2)

Here, instruction is a 4-byte instruction identifier. result is a 4-byte channel identifier that tells us which channel the result should be written to. argument-1 and argument-2 are either channel identifiers or Vector4’s with constant arguments. (Instructions of higher arity would have more arguments.)

Note that using 4 bytes for instructions and registers is beyond overkill, but it is the simplest option.
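A sketch of this encoding in Python may help make it concrete (the opcode value matches the disassembly shown later; note that a decoder would need to know, per instruction variant, which arguments are constants — the issue discussed next):

```python
# Minimal encoder for the format described above: 4-byte instruction id,
# 4-byte result channel, then arguments that are either a 4-byte channel
# id or a 16-byte float4 constant. Little-endian throughout.
import struct

def encode(opcode, result, *args):
    out = struct.pack("<II", opcode, result)
    for a in args:
        if isinstance(a, tuple):            # float4 constant argument
            out += struct.pack("<4f", *a)
        else:                               # channel/register id argument
            out += struct.pack("<I", a)
    return out

# r2 = sub r0 (0,10,0,0) — the same bytes as the first raw instruction
# in the disassembly section below.
code = encode(0x05, 2, 0, (0.0, 10.0, 0.0, 0.0))
print(code.hex())
```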

One annoyance with this representation is that I need different instructions depending on whether argument-1 or argument-2 is constant. For a 2-arity instruction, I need four variants to cover all cases. For a 4-arity instruction (such as select), I would need 16 variants.

There are two ways of dealing with this. First, I could make the code that executes each instruction a bit more complex, so that it can handle both constant and register arguments. Second, I could make all instructions operate only on registers and have a single instruction for loading constants into registers.

Unfortunately, both of these options result in significantly slower bytecode. In the first case, the extra logic in each bytecode executor makes it slower. In the second case, we need extra instructions for loading constants, which increases the execution time.

So at least for two argument functions, the best option seems to be to have separate code for handling each argument combination. For four argument functions, it might be better to use one of the other options.

Just to give you some example of how the bytecode works, here is some raw byte code and the corresponding disassembled bytecode instructions:

05000000 02000000 00000000 00000000000020410000000000000000
r2 = sub          r0       (0,10,0,0)

16000000 03000000 00000000000000000000803f00000000 02000000
r3 = cross        (0,0,1,0)                        r2

0a000000 04000000 00002041000020410000204100002041 03000000
r4 = mul          (10,10,10,10)                    r3

10000000 03000000 02000000 02000000
r3 = dot          r2       r2

0c000000 05000000 04000000 03000000
r5 = div          r4       r3

09000000 03000000 05000000 0000a0400000a0400000a0400000a040
r3 = mul          r5       (5,5,5,5)

00000000 01000000 01000000 03000000
r1 = add          r1       r3
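To make the execution model concrete, here is a toy Python re-implementation of the interpreter running that exact program. The author's real implementation is C++ and operates on whole channels per instruction for speed; this sketch only mirrors the semantics, with r0 = position and r1 = wind.

```python
# Toy channel-based interpreter: each channel holds one float4 per input
# point, and each instruction loops over all points before the next one
# is decoded -- the amortization trick from the recap above.

def vec_sub(a, b): return tuple(x - y for x, y in zip(a, b))
def vec_add(a, b): return tuple(x + y for x, y in zip(a, b))
def vec_mul(a, b): return tuple(x * y for x, y in zip(a, b))
def vec_div(a, b): return tuple(x / y for x, y in zip(a, b))
def vec_dot(a, b):
    d = sum(x * y for x, y in zip(a, b))
    return (d, d, d, d)                    # dot broadcast to all lanes
def vec_cross(a, b):                       # cross product on xyz, w = 0
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0], 0.0)

OPS = {"sub": vec_sub, "add": vec_add, "mul": vec_mul,
       "div": vec_div, "dot": vec_dot, "cross": vec_cross}

def run(program, channels, n):
    for op, dst, a, b in program:
        fa = (lambda i, a=a: channels[a][i]) if isinstance(a, int) else (lambda i, a=a: a)
        fb = (lambda i, b=b: channels[b][i]) if isinstance(b, int) else (lambda i, b=b: b)
        channels.setdefault(dst, [None] * n)
        for i in range(n):                 # inner loop over all points
            channels[dst][i] = OPS[op](fa(i), fb(i))

# The whirl program from the disassembly; constants appear inline.
whirl = [
    ("sub",   2, 0, (0.0, 10.0, 0.0, 0.0)),
    ("cross", 3, (0.0, 0.0, 1.0, 0.0), 2),
    ("mul",   4, (10.0, 10.0, 10.0, 10.0), 3),
    ("dot",   3, 2, 2),
    ("div",   5, 4, 3),
    ("mul",   3, 5, (5.0, 5.0, 5.0, 5.0)),
    ("add",   1, 1, 3),
]

ch = {0: [(1.0, 10.0, 0.0, 0.0)], 1: [(0.0, 0.0, 0.0, 0.0)]}
run(whirl, ch, 1)
print(ch[1][0])   # (0.0, 50.0, 0.0, 0.0) -- the whirl wind at (1,10,0,0)
```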

High-level language

You can’t really expect people to author their effects in raw bytecode, or even in our “bytecode assembly language”. Effect authors will be a lot more productive if they can use a more comfortable language.

I decided to create such a language and model it after HLSL, since it serves a similar purpose (fast processing of vectorized data). Programmers interested in writing vector field effects are probably already used to working with HLSL. Plus, if at some point we want to move some of this work to the GPU we can reuse the code.

To show what the high level language looks like, here is an implementation of a whirl effect:

const float4 center = float4(0,10,0,0);
const float4 up = float4(0,0,1,0);
const float4 speed = float4(10,10,10,10);
const float4 radius = float4(5,5,5,5);

struct vf_in {
    float4 position : CHANNEL0;
    float4 wind : CHANNEL1;
};

struct vf_out {
    float4 wind : CHANNEL1;
};

void whirl(in vf_in in, out vf_out out)
{
    float4 r = in.position - center;
    out.wind = in.wind + speed * cross(up, r) / dot(r,r) * radius;
}

If you squint, you may notice that this high level code exactly corresponds to the low level bytecode in the previous section.

Just as with HLSL, although this looks like C it actually isn’t C. Things that work in C may not work in this language and vice versa. I’m quite strict when I parse this. I figure it is better to start by being strict rather than permissive. That gives you more leeway to extend or modify the language later while keeping backwards compatibility: a strict syntax can always be loosened, but if you design the language with too permissive a syntax you can paint yourself into a corner (case in point: Ruby).

I usually don’t bother with Lex or Yacc when I write a parser. They are OK tools, I guess, but if I can get by without them I prefer not to have the extra precompile step and to have code that is a bit more straightforward to read and debug.

Instead I tend to use a recursive descent parser (a predictive variant, with no backtracking) or some variation of Dijkstra’s shunting yard algorithm. Or sometimes a combination of both.

For this language I parse the overall structure with recursive descent, and then use Dijkstra’s algorithm to process each statement in the function body.

I generate the bytecode directly from the shunting yard algorithm. When I pop an operator from the operator stack I generate the bytecode for computing that operator and storing the result in a temporary register. I then push that register to the value stack so that the result can be used in other computations. Temporary channels are recycled after they are popped off the value stack, to minimize the channel count.
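A minimal Python sketch of that emit-on-pop scheme (tokenizer, typing, and register recycling are stripped down to the bare minimum; the three-address tuples stand in for real bytecode):

```python
# Shunting-yard pass that emits an instruction whenever an operator is
# popped, pushing the temporary register that holds the result.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def compile_expr(tokens):
    out, values, ops = [], [], []
    next_reg = [0]

    def pop_op():
        op = ops.pop()
        b, a = values.pop(), values.pop()
        dst = "t%d" % next_reg[0]; next_reg[0] += 1
        out.append((op, dst, a, b))   # emit: dst = a op b
        values.append(dst)            # result usable by later operators

    for tok in tokens:
        if tok in PRECEDENCE:
            while ops and PRECEDENCE[ops[-1]] >= PRECEDENCE[tok]:
                pop_op()
            ops.append(tok)
        else:
            values.append(tok)        # operand: register or constant
    while ops:
        pop_op()
    return out

# a + b * c: mul is emitted first, then add consumes its temporary.
print(compile_expr(["a", "+", "b", "*", "c"]))
# [('*', 't0', 'b', 'c'), ('+', 't1', 'a', 't0')]
```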

Constant patching

Constants in the bytecode can be changed when an effect is played. I do this by directly patching the bytecode with the new constant values.

When I generate the bytecode I keep track of where in the bytecode different global constants can be found. This patch list is a simple array of entries like:

(hashed constant name) (offset in bytecode)

When playing a vector field effect, the gameplay programmer specifies the constant values with a table:

VectorField.add(vf, "whirl", {radius = 10})

I look through the patch list, find all the offsets of constants named “radius” and replace them with the value(s) supplied by the gameplay programmer.
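A minimal sketch of what that patching could look like. The `PatchEntry` layout and the raw-byte overwrite are my assumptions for illustration, not the engine’s actual format:

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Each entry maps a hashed constant name to a byte offset in the bytecode
// where that constant's value is stored.
struct PatchEntry {
	unsigned name_hash;   // e.g. hash("radius")
	unsigned offset;      // byte offset of the constant in the bytecode
};

void patch_constant(std::vector<unsigned char> &bytecode,
	const std::vector<PatchEntry> &patch_list,
	unsigned name_hash, float value)
{
	// A name may occur at several offsets; patch every occurrence in place.
	for (const PatchEntry &e : patch_list) {
		if (e.name_hash == name_hash)
			std::memcpy(&bytecode[e.offset], &value, sizeof(float));
	}
}
```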

Since globals can be patched later, I can’t do constant folding when I generate the bytecode. (Without global patching, I could just check if both arguments were constants when I popped an operator, and in that case, compute the constant result and push that directly to the value stack, instead of generating a bytecode instruction.)

I could reduce the instruction count somewhat and improve performance by doing a constant folding pass on the bytecode after the globals have been patched, but I haven’t implemented that yet.

Physics integration

In my physics system I maintain a list of all awake (non-sleeping) actors. I apply wind from a vector field with an explicit call:

void apply_wind(const VectorField &field, const CollisionFilter &filter);

This extracts the position of every awake actor that matches the collision filter and sends that list to the vector field for evaluation. It then does a second loop through the actors to apply wind forces from the returned wind velocities.
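The two-loop structure might be sketched like this. All types here are stand-ins of mine (the filter is a boolean flag, the field evaluation is a constant wind, and the force application just records the wind velocity), not the engine’s API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vector3 { float x, y, z; };

struct Actor {
	Vector3 position;
	bool awake;
	bool matches_filter;   // stand-in for a real CollisionFilter test
	Vector3 applied_wind;  // recorded instead of a real force for this sketch
};

// Stand-in batch field evaluation: a constant 5 m/s wind along x.
void evaluate_field(const std::vector<Vector3> &positions, std::vector<Vector3> &wind)
{
	wind.resize(positions.size());
	for (size_t i = 0; i < positions.size(); ++i)
		wind[i] = Vector3{5.0f, 0.0f, 0.0f};
}

void apply_wind(std::vector<Actor> &actors)
{
	// Pass 1: gather query positions for awake actors matching the filter.
	std::vector<Vector3> positions;
	std::vector<size_t> indices;
	for (size_t i = 0; i < actors.size(); ++i) {
		if (actors[i].awake && actors[i].matches_filter) {
			positions.push_back(actors[i].position);
			indices.push_back(i);
		}
	}

	// Evaluate the vector field once for the whole batch.
	std::vector<Vector3> wind;
	evaluate_field(positions, wind);

	// Pass 2: apply the returned wind velocities back to the actors.
	for (size_t k = 0; k < indices.size(); ++k)
		actors[indices[k]].applied_wind = wind[k];
}
```

Sleeping actors never appear in the batch, which is exactly the behavior discussed below.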

I’ve chosen to have an explicit step for applying wind, so that you don’t have to pay anything for the wind support unless you actually use it. Having an explicit step also opens up the possibility to have other types of vector fields. For example, there could be a vector field representing gravity forces and a corresponding function:

void apply_acceleration(const VectorField &field, const CollisionFilter &filter);

The fact that the wind is only applied to awake actors is important. Without that check, the wind forces would keep every actor in the world awake all the time, which would be really expensive for the physics engine. Just as with gravity, we want physics objects to come to rest and go to “sleep” when the wind forces are in balance with other forces on the actor.

This of course creates a problem when the wind forces are varying. An actor may be in balance now, but a change in the wind direction could change that. A leaf that is resting on the ground may be lifted by a sudden updraft. Since we don’t apply the wind forces to sleeping objects we can’t get that behavior. Once a leaf has come to rest, it will stay put.

This problem is most noticeable when you have drastic effects like explosions in the vector field. It looks really strange when actors are completely immobile and “sleep through” a big explosion.

I deal with this by having a function for explicitly waking actors in an AABB:

void wake_actors(const Vector3 &min, const Vector3 &max, const CollisionFilter &filter)

If you want to play a drastic wind effect (like an explosion), you should first wake the nearby actors with a call to wake_actors(). This ensures that all nearby actors will get the wind forces from the explosion (since they are now awake).

I apply the wind force with the standard formula:

F = 1/2 r v^2 C A

Where r is the density of air, v is the relative velocity of the air with respect to the object (so v = v_wind – v_object, where v_wind is the wind speed and v_object is the object’s speed). C is a drag coefficient that depends on the object’s shape and A is the object’s reference area.
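As code, the formula might look like the sketch below, with the force directed along the relative air velocity. The types and signature are mine; C and A would come from the shape-based estimation described next:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// F = 1/2 * r * |v|^2 * C * A, where v = v_wind - v_object.
Vec3 drag_force(Vec3 v_wind, Vec3 v_object, float air_density, float C, float A)
{
	Vec3 v = {v_wind.x - v_object.x, v_wind.y - v_object.y, v_wind.z - v_object.z};
	float speed = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
	if (speed == 0.0f)
		return Vec3{0.0f, 0.0f, 0.0f};
	float magnitude = 0.5f * air_density * speed * speed * C * A;
	// Scale the unit vector along the relative velocity by the magnitude.
	return Vec3{magnitude * v.x / speed, magnitude * v.y / speed, magnitude * v.z / speed};
}
```

For example, a 10 m/s wind against a resting object with air density 1.2 kg/m³, C = 0.5 and A = 2 m² gives a force of 0.5 · 1.2 · 100 · 0.5 · 2 = 60 N along the wind direction.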

For C and A, I actually loop through all the physics shapes in the actor and estimate C and A based on those shapes. This is by no means a perfect approach. There are many situations where C might be really different from what such an estimation gives. For example, an object that is heavily perforated would receive much less wind force.

However, I want to have something in place that gives decent behavior in most cases, so that it only very rarely has to be changed. The less artists have to mess around with physical parameters, the smaller the chance that anything gets messed up.

Note that the wind force is just air resistance with a velocity for the air. So by implementing wind you get the “air resistance” behavior “for free”.


If you compute the drag force using the formula above and apply it to a physics actor, it won’t add any rotation to the actor. This is actually correct. The drag force, as we compute it here, has no rotational component.

Yet it feels counter-intuitive. We expect objects to rotate when they are blown about by the wind. Leaves and papers certainly swirl around a lot when the wind blows.

What happens in that case is actually a second order effect. When the wind blows around an object you get zones of high and low pressure as well as turbulence, and it is the forces from these interactions that affect the object’s rotation.

These interactions are tricky to model accurately and they depend a lot on the object’s shape. Right now, I’m not even trying. Instead I use a much simpler approach: I apply the drag force a bit above the object’s actual center of mass so that it produces a torque and makes the object rotate. This is a complete hack that has no basis at all in physical reality, but it does add some rotation. At least it looks a lot better than applying the wind force without any rotation.
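A sketch of that hack: apply the drag force at a point slightly above the center of mass, so the offset produces a torque T = offset × F. The 0.1 m offset and all names here are my own invention for illustration:

```cpp
#include <cassert>

struct V3 { float x, y, z; };

V3 cross(V3 a, V3 b)
{
	return V3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

void apply_drag_with_fake_rotation(V3 force, V3 up, V3 *out_force, V3 *out_torque)
{
	float lift = 0.1f;  // arbitrary: how far above the center of mass to apply the force
	V3 offset = {up.x * lift, up.y * lift, up.z * lift};
	*out_force = force;                  // the linear part is unchanged
	*out_torque = cross(offset, force);  // the offset adds a rotational part
}
```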

It should be possible to do better — to make some kind of estimate of what rotational forces wind induces when it blows against typical physics shapes: boxes, spheres, capsules, etc. Just give me a couple of days in a wind tunnel and I’ll try to come up with something.

This has also been posted to the Bitsquid blog.

Angle based SSAO

Original Author: Simon Yeung


The presentation “The Technology behind the Unreal Engine 4 Elemental Demo” describes how they implement SSAO. Their technique can use either the depth buffer alone or the depth buffer together with per-pixel normals. I tried to implement both versions with a slight modification:

Using only the depth buffer

Ambient occlusion is defined as the visibility integral over the hemisphere of a given surface point:

4 Simple Tips for Combating Game Piracy

Original Author: Tyler York

Game piracy is a huge problem. All app developers suffer from it, but game developers are particularly vulnerable. One acclaimed indie developer said in an interview that they saw a 60% piracy rate across iOS and Android.

For solo developers or small teams, the best way to combat piracy is to do their best to prevent piracy in the first place. Here are some anti-piracy tips to make life harder for those swashbuckling app thieves.

Tip 1: Switch to Freemium

It’s sad but true: the majority of pirated games on Android are Paid titles, not free ones. By making your game free, you are removing a significant amount of the incentive for others to pirate your game. There have been numerous guest posts on our blog on the topic.

But freemium is not a silver bullet: IAP piracy exists

Yet while switching to freemium definitely deters pirates, freemium piracy still exists. On Android, programs like “IAP Free” and “IAP Cracker” give pirates the ability to spoof transactions locally to fool your game, or send fraudulent transactions to your game server. These tools are most commonly used on Android, but jailbroken iPhones are vulnerable as well. Hackers have even figured out how to unlock “free” IAPs on Apple’s own App Store, though using that method is universally considered a bad idea.

The good news is that you can fight in-app purchase (IAP) piracy too, with methods that aren’t obnoxiously difficult for players to wade through. After sifting through countless forum posts, articles, and email threads on mobile piracy, here’s a short list of the simplest and most effective methods for making a pirate’s life difficult.


Tip 2) Remote communication

The baseline for any anti-piracy effort on mobile is to have the app communicate with your game server remotely. Even if it’s a single-player game, recording player IDs and actions on your server is the only way to fight piracy. You need to be able to detect whether piracy is occurring and, hopefully, take action. This can mean shutting down account creation from a particular country if the majority of purchases from that country are fraudulent, removing the ill-gotten purchases from users’ accounts, or shutting the offending game account down entirely. This is especially effective when used in conjunction with our next tip: online registration.


Tip 3) Online registration

Hero Academy is a great example of an online registration system.

The most commonly cited elegant solution for preventing piracy is online registration with an email address. Whenever the user goes to play, they need to authenticate with their email address (automatic sign-in is fine after the first visit). Doing so creates a gate that you can use to keep pirates out of your game. Then, when you catch a user pirating IAP in your game, you can shut out that email address, and the pirate will lose everything they gained illicitly along with the account.

If you want to go the extra mile, you can ask for email verification from players as well. You should choose the right place to do this within your game, but it should always be before in-app purchases can be bought. This ensures that pirates can’t use a fake email, and makes the piracy process more difficult because you’re adding more work to each piracy cycle. Each time a pirate gets caught, they now have to create a new email address in addition to creating a new account in your game.
Tip 4) Encryption

Lastly, to make a pirate’s life much more difficult, use encryption for all communications related to in-app purchases, especially IAP purchase confirmations sent to your server. This adds an extra layer of protection around your purchases that will deter many hackers, even if it doesn’t stop all of them. And if you want to take your app’s encryption to the next level, you can use a third-party encryption service like Molebox to make your app extra difficult to crack.
An ongoing battle

The sad truth is that no matter what platform you’re on or how well you build your system, you’re never completely piracy-proof. That said, you can make your game so difficult or time-consuming to pirate that the majority of would-be offenders will give up. No one should let their lives be consumed by fighting pirates, so hopefully you can “set it and forget it” with these tips and enjoy much lower piracy rates with minimal maintenance.


If you liked this post, check out our educational developer newsletter.

Negative Developers and Team Stability

Original Author: Lee Winder

It doesn’t take much for negative feelings to start to seep into a team, but it takes a lot more to turn a team around and start to raise morale and motivation. The following isn’t based on an in-depth study of development teams across the world but on my own personal experience of managing and observing a number of teams over the last 10 years.

Take of that what you will…

When you look at the make up of a team it will always be staffed by people who raise the game and by some who only bring it down. It’s the nature of taking a group of individual people and asking them to work together for a period of time towards a common goal. It’s the individuality of these people that can take a project and make it fly or cause it to crash and burn.

One thing that’s clear is that it’s much easier for a single individual to bring a team down than it is for an individual to improve the team in any significant way. Negativity will spread like wildfire through a team, whilst positivity acts more like treacle and can be much harder to spread around.

But why?

A negative attitude to work is a whole lot easier. Doing less, talking badly about the team or rubbishing the game is much easier than creating excellent content, taking responsibility for your work or stepping outside your defined role and doing something great.


What Defines a Negative Developer?

There are many ways in which a developer might have a negative effect on a team. The most obvious is through their general attitude to their current project, be that general low level complaining, pushing back against work requests for no good reason or general slacking off during the day.

It could be a lack of skill development or even a backsliding in the quality of the work they are producing.

But it could also be an attitude that doesn’t gel with the general ethos the team is aiming for. Maybe you want your developers to take more responsibility for their work and how it’s designed and implemented and one or two developers will only work when they are told exactly what they need to do.

Maybe it’s a developer who doesn’t get involved with the daily meetings, mumbling through and obviously not interested in what other people are doing.

At the end of the day, identifying a developer generating a negative effect on a team is usually pretty easy. They’re the ones who are difficult to deal with in many aspects of the development process…


Team Development

Let’s have a look at a few situations, where a green developer is a ‘positive’ developer and red a ‘negative’ one.

In the first situation we have two developers working side by side, one working well and another not doing so great. Maybe one of them has a bad attitude, maybe they don’t want to really push what they are doing. Either way, their contribution to the team is much less than that of the positive developer.

In most cases, this will go only one way. The good developer, seeing their partner being allowed to get away with not working so hard and not having to put in as much effort, will eventually start to slow down and equalise with the poorer developer.

It’s much less likely that the poorer developer who is getting away with poor work or a bad attitude will see the better developer and decide to put in that extra work. As a result, you now have two bad developers rather than one.

When does it go the other way? When does the poor developer look around and start to raise their game? The answer isn’t very encouraging.

Take the following situation

There’s a tight balance here, but since it’s much easier for a developer to reduce the quality of their work than to improve it, it’s easier to slide the wrong way, and at that point it’s very easy to see where this will go.

Based on a number of observations, it seems as though while a 3:1 ratio might get you some good results, it still brings risks, because should one developer start to slip it then becomes 1:1, which puts us right back at the start.

In most cases you can only really guarantee that other people will not slip if you have a 4+:1 ratio between positive and negative developers. In a number of cases the negative developer didn’t change their attitude without help but other developers didn’t slip due to the peer pressure of the other better developers.


Positive Developers

But in all these situations I’m not giving these positive developers enough credit. A good developer won’t always slack; they’ll continue working hard, producing great content and generally continuing to fly high.

But take the following situation…

These developers are good for a reason, be that personal pride, ambition or sheer enjoyment of the work they are doing. And if a good developer finds themselves in the minority for a long period of time, the outcome is inevitable.

Great developers won’t stick around if those around them are not working to their potential or failing to create an environment in which the better developers feel themselves being pushed. And once your great developers leave you have a much higher chance of those left realising they don’t need to actually work that hard to get through the day.

Solving the Problem

There are two ways to deal with poor developers on a team. The first is the most drastic, but initially not an option if you’re working in a region with sane labour laws.

Just drop them.

To be honest I wouldn’t recommend this anyway. Simply letting someone go generally removes the problem, but it can leave a lot of holes in the team, and you hired this person for a reason, so why not try to get that spark back?

Performance Management structures (you do have a performance management process don’t you?) within an organisation can, if done correctly, not only resolve the problem but allow the poor developer to raise their game and become a star on the team.

Identify the source of the problem.  Does the developer just not like the game, are they having a difficult time outside of work, do they disagree with how work is being allocated or do they just not want to be there?

Depending on what their answers are, you’ll have a good idea of where to go next.

Make sure goals are set.  Define goals designed to turn the situation around, but don’t just set them and forget about them (which happens far too often).  Monitor them on a weekly or bi-weekly basis, setting very short-term goals to complement the longer-term ones.

Define a fixed period of time.  Don’t just let things drag on with only small improvements here or there, have a deadline at which point things will get more serious.

Make it clear what the end results will be.  Whether they are the chance to work on something different or whether it’s a termination of the contract, make it clear so everyone knows what will happen when the goals are reached or missed.

Keep constant records.  Make sure every meeting is documented and the progress or results of all the goals are recorded daily.

Let them go.  While it is drastic, if improvements are not being made given all the opportunities you’ve given them then there really is no other option.  If you’ve bent over backwards to try and solve the problem and the developer hasn’t taken you up on the offer then there really is nowhere else to go.

And even with those sane labour laws, the documentation you’ve been keeping over the Performance Management period means you can release the developer from their contract knowing you tried your best and they didn’t want the help.


So negative developers, whatever is defined as negative based on the goals of your team, are almost guaranteed to have a bad effect on a group of developers.  Negative attitudes to work and development can spread much faster than you might think, and will cause people on your team either to normalise at a level far below where they need to be or to simply leave.

It’s vital that these developers are tackled fast, rather than waiting until their effects start to be felt.


This article was originally posted on Engineering Game Development on Sunday the 30th September 2012

Why Do You Create?

Original Author: Claire Blackshaw

My friends know I’ve been struggling to write this article for months, struggling with the tone and the question: so I ask you, Why? It’s the most important question you can ask yourself, though asking it is never easy.

A recent Edge piece asked some creative directors and leads the question. Some give honest, insightful answers and are worth a read. Though this question extends to every member of a studio and every aspiring developer, not just the creative directors of the world.

Please bear with me for two paragraphs of personal anecdote to help me discuss this issue.

From a young age I was making games, programming BASIC on C64 before the age of ten. At the same time my brother introduced me to roleplaying games. From those early days until adulthood I was passionately creating games, roleplaying systems, writing/directing plays to stage and drawing a web-comic in the early internet days all while earning money doing freelance photography, websites and just so much stuff.

An important personal question was raised in my late teens: if I wasn’t doing what I was passionate about and being my true self, what was the point? This led to separation from my family over disagreements, hard life choices, and being broke for 8 years while working full time and putting myself through university, twice in two countries. Always struggling to break into the industry without compromise. Always broke, often living rough and using my holidays to sit exams or go to interviews. I was in debt when I broke into the industry, and quickly got a Lead Programmer credit and since then a Lead Designer credit. I now work as a Designer / Programmer. Though my time in the industry has been far from ideal, with the last three years in three companies in three cities.

Now the amount I’ve created in the last few years as a full-time employed developer is less than in the years when I was not. I use this personal story because my hunger to be in the industry diverted me away from why I wanted to be part of it. In my struggle to become part of the industry and do well, most of my energy was focused on the industry and not my creations within it, and my situation is not unique. The realities of business can often pull us away from games; while they are necessary, they are a means to an end, not an end in themselves.

Too often, in the grind of the daily job, the crunch of a project or just an eye on the next thing in the industry, we lose sight of why we are pushing bits. Yet field leaders, one after another, espouse the virtue of direction, putting the why before the how. Just sample a few TED talks# or look at the creatives you admire: they have purpose and drive beyond the daily grind.

Now why you make games could be to pay the bills, have difficult challenges, work with fun people, self expression, the desire to create or a million other motivations. The most important thing is that you know what you want to do and why.

Please take a moment to answer these anonymous questions, click here, about why you make games. I have two follow-ups I want to write: one dealing with unlocking the power of your team and other people’s motivations, and another about the responses I hope to get.

It’s not an easy question and for me it’s all about what sort of games do I want to spend my life making. Personally it all comes back to the stage for me, I want my audience to share an experience with me, be it political satire, dark comedy or whimsy. To make things that friends talk about over cocktails and coffee. To make things that matter.

# Look at a few of the top talks and publications across the field and the theme of motivation or why is core.

A Data-Oriented, Data-Driven System for Vector Fields — Part 2

Original Author: Niklas Frykholm

In Part 1 we decided to represent a vector field as a superposition of individual effects:

G(p) = G_0(p) + G_1(p) + ... + G_n(p)

Here, each G_i(p) is a function that represents some effect, such as wind, an explosion or the updraft from an air vent.

The next step is to find a way of quickly evaluating the function G(p), a general function that could be almost anything, for lots of different positions p_i. This is quite tricky to do well in C++.

Of course, evaluating specific functions is not hard. If we want to evaluate a specific function, such as:

Vector3(sin(p.x), sin(p.y), 0);

we can just type it up:

inline Vector3 f(const Vector3 &p)
{
	return vector3(sin(p.x), sin(p.y), 0);
}

But if we don’t know beforehand what G(p) will be we don’t have that option.

We could write our system so that it supported a limited set of specific effects, with hardcoded C++ implementations. For example, there could be an “explosion” effect with some parameters (radius, strength, etc), an “updraft” effect, a “whirl” effect, etc. Similarly we could have support for a variety of standard shapes, such as “sphere”, “cylinder”, “capsule”, etc. And perhaps some different types of falloffs (“linear”, “quadratic”). Perhaps also some temporal effects (“attack-sustain-release”, “ease-in-ease-out”).

But it is hard to know where to draw the limit with this approach. Exactly what effects and shapes and falloffs and time curves should the system support? The more things we add, the more cluttered the system becomes. And the system is still not completely general. No matter how much we add, there will still be some things that the user just can’t do without disturbing a programmer and getting her to add a new effect to the system. This means that the system is not truly data-driven.

Whether this is a problem or not depends a lot on your development style. If you are a single artist-programmer working on a single game you may not even care. To you, code and data are the same thing. Who cares if you have to add something to the code to make a special effect? That is what the code is for.

At Bitsquid, however, we are in a different position. We are making a general purpose engine to be used on multiple platforms for all kinds of tasks. We can’t put game specific code in the engine or everything will end up a total mess. Sure, our licensees could modify their cloned copy of the source to add their own effects. But that is not an ideal solution. It forces them to learn our code, it makes it harder for us to reproduce their bugs, since our code bases have now diverged and it makes it harder for us to modify and optimize the source code without putting our licensees in merge hell.

So our aim is always to be completely data-driven.

But how can we represent a general function as data? There are really only two possibilities:

  • As a piece of executable machine code.
  • As a piece of bytecode that gets executed by a virtual machine.

The first approach is the fastest of course, but it has two drawbacks. First, machine code is platform dependent. Writing a system that can dynamically generate machine code for a lot of different targets is no small undertaking (though it could be simplified by using LLVM). Second, and more serious, many systems simply don’t allow us to execute dynamically generated machine code.

The inevitable conclusion is that we have to use bytecode (perhaps coupled with a machine code compiler on the platforms where that is feasible).

Unfortunately, as everybody who has used a dynamic language without a JIT compiler knows, bytecode is slow: usually at least a factor of 10 slower than machine code. And remember that one of our design goals for this system was that it should be fast. We said in the beginning that it should be able to handle at least 10 000 queries per frame.

So what can we do?

The Massively Vectorized Virtual Machine

At this point it makes sense to stop and think a bit about why bytecode is slow. If you look at the code of a virtual machine, it is essentially a tight loop that repeatedly does three things:

  • Decode the next bytecode instruction into operation + arguments.
  • Jump to the code that performs the operation.
  • Execute the operation.

The third step is usually just as fast as handwritten machine code would be. Computing a+b is not more expensive because it was triggered by an OP_ADD bytecode instruction.

So all the overhead of bytecode, the thing that makes it “slow”, is found in the first two steps.

Well then here is an idea: what if we could reuse the computations that we make in those two steps?

Remember that our goal is to compute G(p) for a lot of points p_i. We want to evaluate the same function, the same bytecode instructions, for a lot of different data points. In that case, why repeat the expensive operation of decoding the bytecode instructions again and again for each point? Why not just decode the instruction once and then execute it for all data points?

So, with that change, our virtual machine loop now becomes:

  • Decode the next bytecode instruction.
  • Jump to the code that executes it.
  • Execute that single instruction for all the input data.

With this change, the cost of decoding the bytecode is now amortized over all the query points. The more query points we have, the less time (proportionally) we will spend on decoding bytecode. With enough points (>1024) that time should be nearly negligible. In other words, our bytecode should be able to run at nearly the same speed as native machine code.

In a quick test I made, the overhead of a bytecode implementation compared to native code was just 16 % — a far cry from the 10x slowdown we have come to expect.

Fleshing out the Details

Since we are computing a vector function on vector input and we want it to run as fast as possible, it makes sense to use SSE (or its equivalent on other platforms) and represent all our data as vector4 intrinsics.

Virtual machines can be stack-based or register-based. Stack-based machines produce more compact bytecode since the arguments are implicit. Register-based machines need fewer instructions to accomplish a task, since they don’t have to juggle things around on the stack. In our case, compact bytecode doesn’t buy us much, since our programs are short and the decoding cost is amortized. On the other hand, accomplishing the same thing with fewer instructions means less code to execute for each query point. So a register-based virtual machine seems to be a clear win.

Here is what the code for an explosion effect could look like in a made-up intermediate language for our virtual machine. The effect produces a wind of 50 m/s outwards from the center of a sphere of radius 5 m located at (2,4,0):

direction = sub position, (2,4,0,0)
lensqr = dot direction, direction
direction = normalize direction
direction = mul direction, (50,50,50,50)
direction = select_lt lensqr, (25,25,25,25), direction, (0,0,0,0)
output = add output, direction

Here position is the input query position and output is the output result of the function. direction and lensqr are temporary variables.

Note that the final operation adds the result to the output register instead of overwriting it. This allows us to merge multiple effects by simply concatenating their bytecode. So to evaluate G(p) for a large number of points, we can first intersect the AABB of the points with the AABB of each individual effect G_i(p). Then we merge the bytecodes of each intersecting effect into a single bytecode function G'(p) that we finally evaluate for each point.
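Since every effect ends by adding into the output register, the merge step really is just concatenation. A sketch, with a hypothetical AABB overlap test per effect (all types here are mine, not the engine’s):

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

struct AABB { float min[3], max[3]; };

bool overlaps(const AABB &a, const AABB &b)
{
	// Standard axis-by-axis separation test.
	for (int i = 0; i < 3; ++i)
		if (a.max[i] < b.min[i] || b.max[i] < a.min[i])
			return false;
	return true;
}

// Concatenate the bytecode of every effect whose AABB intersects the
// bounds of the query points, producing the merged function G'(p).
std::vector<uint8_t> merge_effects(const AABB &query_bounds,
	const std::vector<std::pair<AABB, std::vector<uint8_t>>> &effects)
{
	std::vector<uint8_t> merged;
	for (const auto &e : effects)
		if (overlaps(query_bounds, e.first))
			merged.insert(merged.end(), e.second.begin(), e.second.end());
	return merged;
}
```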

We can feed position and output to the virtual machine as arrays of intrinsics:

void evaluate(void *bytecode, unsigned n, Vector4I *positions, Vector4I *output)

Note that since we are running the bytecode one instruction at a time for all the data, the local variables (direction and lensqr) need to be arrays too, since we need to remember their value for each of the input positions.

We could allocate arrays for these local variables and pass them to evaluate just as we do for positions and output. But that seems a bit wasteful. A complicated function could have twenty local variables or more, meaning that with 10 000 particles we would need to allocate 3.2 MB of temporary memory. The amount needed will vary widely, depending on how complicated the function is, which is driven by the data. This makes it hard to do a memory budget for the system.

So let’s use an alternative approach. We allocate all local variable buffers from a “scratch space” which is provided by the caller:

void evaluate(void *bytecode, unsigned n, Vector4I *positions, Vector4I *output, unsigned scratch_bytes, void *scratch_space)

Now the caller has complete control over the amount of temporary memory the system uses. It is predictable and can be made to fit any desired memory budget.

To make this work, we need to chop this scratch memory up into areas for each local variable. The size of those buffers then determines how many input positions we can process at a time.

For example, suppose we have 256 K of scratch memory and 8 local variables. Each local variable then gets 32 K of memory, which can hold 2 K Vector4I’s. So this means that instead of processing all 10 000 particles at the same time when we execute an opcode, we process the particles in 5 chunks, handling 2 048 particles each time. The cost of decoding the bytecode gets amortized over 2 048 particles, instead of over 10 000, but it is still negligible.
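The arithmetic in that example can be sketched as follows, assuming a 16-byte `Vector4I` (the function names are mine):

```cpp
#include <cassert>

// Divide the scratch space evenly between the local variables; each
// variable's buffer size, in 16-byte Vector4I's, is the number of query
// points we can process per chunk.
unsigned points_per_chunk(unsigned scratch_bytes, unsigned num_locals)
{
	unsigned bytes_per_local = scratch_bytes / num_locals;
	return bytes_per_local / 16;  // sizeof(Vector4I) == 16
}

// Number of chunks needed to cover all query points, rounding up.
unsigned num_chunks(unsigned total_points, unsigned chunk)
{
	return (total_points + chunk - 1) / chunk;
}
```

With 256 K of scratch and 8 locals this gives 2 048 points per chunk, so 10 000 particles are processed in 5 chunks, matching the figures above.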

The nice thing about this approach is that we always use a constant, predictable amount of scratch space, regardless of how many query points we process and how complicated the function is. Instead we scale down how many particles we process at a time.

Since both input data and local variables are now Vector4I buffers, the inner loop of the virtual machine is simple to write. It will look something like this:

void run_vm(const void *bytecode, unsigned n, Vector4I **registers)
{
	const void *pc = bytecode;
	while (true) {
		unsigned op = DECODE_OP(pc);
		switch (op) {
			case OP_ADD: {
				Vector4I *a = registers[DECODE_REGISTER(pc)];
				const Vector4I *b = registers[DECODE_REGISTER(pc)];
				const Vector4I *c = registers[DECODE_REGISTER(pc)];
				Vector4I *ae = a + n;
				while (a != ae)
					*a++ = addi(*b++, *c++);
				break;
			}
			// ... cases for the other opcodes ...
			case OP_END:
				return;
		}
	}
}

An Example

Here is a YouTube video that shows a vector field implemented using this method. Unfortunately, the YouTube compression is not very nice to a video that contains this much high-frequency information. But at least it gives some idea of the effect.

The video shows 20 000 particles being animated by the vector field at a query cost of about 0.4 ms on a single thread (of course, parallelization is trivial, so you can divide that by the number of available cores).

This has also been posted to the Bitsquid blog.