Promoting your indie game company with a podcast

Original Author: Tyler York

This is a guest post by Matt Hackett of LostCast.

Introduction

A few months before jumping fulltime into our independent game startup, my co-founder and I started a podcast. We did this largely on the advice of other successful independent game companies that have used guerrilla marketing techniques with remarkable results. It also sounded like a fun excuse to talk shop!

Our podcast, called Lostcast, recently landed a sponsor, which I believe is a significant milestone for any show. So at this point I thought I’d share some insights about podcast production and how other independents could use podcasting to help their companies thrive.

Is it right for you?

At Lost Decade Games, we often run audits on our time to ensure it’s being used effectively. A few weeks ago, we noticed that Lostcast was taking a significant chunk of my time away from game development, so we needed to take a close look at the podcast to see if it was a worthwhile time investment.

Though it’s difficult to measure its exact impact, Lostcast has definitely opened some doors for us that would otherwise never have been opened. While it might not make sense for your company, I’ll describe some of the reasons that we think it’s proven valuable.

STAND OUT FROM THE CROWD

There are droves of independent game companies out there, spanning across hobbyists and professionals alike. Maintaining blogs and Twitter accounts is practically mandatory for any company dealing with consumers these days, but how does one not get lost in the sea of similar offerings? We receive regular comments that our podcast is a welcome break from the norm and helps people to remember us.

ESTABLISH TIGHT BONDS

The thing about works of audio is that they can be taken anywhere and played anytime. It’s a drastically different medium than the text-based tools in an independent’s marketing arsenal. Many listeners say they listen to Lostcast during their work commute; I personally listen to podcasts while jogging or even grocery shopping. Essentially your show becomes a part of people’s lives. You’re with them when they’re bored or lonely, talking about things that interest them. This establishes a bond.

Not convinced? A while back we were complaining about our poor recording hardware on an episode of the show, and a listener commented that we should start a wishlist for the gear we needed. We made the wishlist and — to our complete amazement — he bought us everything on it. (Thanks Joe!) I believe that this remarkable occurrence just wouldn’t have happened in a medium where the fan couldn’t hear our voices and our excitement about making games. As a similarly-minded person, he felt a connection to us and felt compelled to be very generous.

TAP INTO THE PODCAST MARKET

When independent game companies try to get news coverage from the primary outlets, they’re competing against the biggest game companies in the industry. A games journalist can only write so much in any given day, so it’s usually up to the indies themselves to find their own audiences.

Why not the podcast market? There are millions of podcast listeners who don’t use Twitter and won’t find your company there. And maybe they don’t enjoy reading blogs either, so how else would they know about your games? Indies rarely have the luxuries of substantial advertising budgets or featured spots. The difference between having a podcast and not having one could be the difference between many fans knowing your company intimately versus never having heard of it at all.

OTHER OPPORTUNITIES MAY ARISE

Part of success is being able to take advantage of opportunity when it arises. By having a podcast and beginning to cultivate a niche audience, you’re casting a wider net. Lostcast began as what we thought was simply a fun, unique way to reach a gaming audience. But because we had the show, we were able to have conversations about sponsorship, an opportunity that wouldn’t have existed without it.

 

What do you need to podcast?

The bare minimum you need to podcast is simply a computer with a recording device (even a built-in laptop microphone will do). Free software such as Audacity will work fine for recording and editing the audio. The bar can be very low if you’re working on a shoestring budget, but better quality equipment means better recordings. If you want to produce a high quality podcast, it’s worth it to invest in great audio gear.

MICROPHONE(S)

We already had two decent microphones on hand. Cables are essentially all the same, so just buy cheap ones from a decent brand.

AUDIO INTERFACE

If you’ll be recording multiple podcasters at once, you’ll also need to solve the problem of multiple inputs. We went with an audio interface that provides multiple XLR/TRS connectors.

SOFTWARE

Lastly, I recommend acquiring some high quality software to ease your recording and editing process. Lower-end software can certainly be used, but you’ll save yourself time and headaches by using a premier digital audio workstation.

As a hardcore Mac user, I went with Logic Pro ($200), which is Apple’s own offering. I find it to be exceptionally powerful, versatile, and intuitive. Though its large feature set may be overkill if you’re only wanting to record audio, I’ve found its niceties like compressors and gating modules extremely useful.

Offerings similar to Logic Pro (and available on other platforms) include Avid Pro Tools, which is also an excellent choice. There may also be less expensive software intended solely to record and edit podcasts.

 

Record!

Once you’ve got all your gear in place, you’re ready to start recording! Before you begin, you’d do well to listen to several other popular podcasts to get a vibe for what people may be expecting. But the beauty of podcasts is that they can be whatever you want them to be!

To help your show stand out in the sea of podcasts, consider picking a niche topic to talk about. For example, we feel that being the only HTML5 games podcast out there (at least to our knowledge) gives us a unique edge and helps us find listeners.

While it’s beyond the scope of this article, I also highly recommend finding tutorials on editing, compression, and gating techniques to apply to your podcast after recording. A little post-production love can drastically improve the quality of your show and make your listeners much happier! I also spend a few hours editing out long pauses and filler words such as “uh” and “um” that nobody wants to hear.

 

Publish

The publishing step involves bouncing the recording to disk and making it available on the Internet. The bare minimum is probably to upload an mp3 file to a web server and link to it from a blog or Twitter account. Again, that’s the beauty of podcasts: if that’s all the effort you care to put in, it’s adequate and listeners can still enjoy your show.

However, you really should take the extra effort to get your podcast into popular distribution outlets. It should be no surprise that iTunes is the premier platform for podcasts. While you could simply upload your podcast to your website and link to it from there, you should seriously consider also including your podcast on iTunes. Fortunately, Apple has provided a useful “Making a Podcast” document that will walk you through the relatively simple submission process.

 

Summary

A podcast is a great way to get your company name and your games out there. It may not be a good fit for every team, but it can establish tight bonds with an audience and help get exposure for your games.

Podcasting can be a lot of work, but the prerequisites are minimal (a laptop microphone and free software will suffice!) and the payoffs can be tremendous. If you’re on the fence about it, I urge you to record a rough episode, even if you do nothing with it. You might just like it!


Photon Mapping Part 2

Original Author: Simon Yeung

Introduction

Continuing from the previous post, the indirect lighting is baked into a light map using a Spherical Harmonics (SH) basis. Four SH coefficients are used for each color channel, so three textures are used for the RGB channels (12 coefficients in total).

Baking the light map

To bake the light map, the scene must have a set of unique, non-overlapping texture coordinates (UVs) that correspond to unique world space positions, so that the incoming radiance at a world position can be represented. This set of UVs can be generated inside a modeling package or using UVAtlas. In my simple case, the UVs are mapped manually.

To generate the light map, given a mesh with unique UVs and the light map resolution, we rasterize the mesh (using scan-line or half-space rasterization) into texture space, interpolating the world space position across the triangles. This associates a world space position with each light map texel. Then, for each texel, we sample the photon map at the corresponding world space position by performing a final gather step, just like the offline rendering in the previous post. This gives us the incoming radiance at that world space position, and hence at the texel in the light map. The data is then projected into SH coefficients and stored in three 16-bit floating point textures. Below is a light map showing the dominant light color extracted from the SH coefficients:

The baked light map showing the dominant light color from SH coefficients
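To make the projection step concrete, here is a minimal C++ sketch of my own (not the author’s code) that projects final-gather radiance samples into a 2-band, 4-coefficient SH basis per color channel; the photon-map final gather itself and the texture write are assumed to happen elsewhere:

    struct SH4      { float c[4]; };            // 2-band SH: 4 coefficients
    struct SH4Color { SH4 r, g, b; };           // one set per color channel -> 3 textures

    // Evaluate the four 2-band SH basis functions for a unit direction.
    static void EvalSHBasis(const float dir[3], float out[4])
    {
        out[0] = 0.282095f;                     // Y(0, 0)
        out[1] = 0.488603f * dir[1];            // Y(1,-1)
        out[2] = 0.488603f * dir[2];            // Y(1, 0)
        out[3] = 0.488603f * dir[0];            // Y(1, 1)
    }

    // Project final-gather samples (unit direction + incoming RGB radiance) into SH.
    SH4Color ProjectRadianceToSH(const float (*dirs)[3], const float (*radiance)[3], int numSamples)
    {
        SH4Color sh = {};
        for (int i = 0; i < numSamples; ++i)
        {
            float basis[4];
            EvalSHBasis(dirs[i], basis);
            for (int k = 0; k < 4; ++k)
            {
                sh.r.c[k] += radiance[i][0] * basis[k];
                sh.g.c[k] += radiance[i][1] * basis[k];
                sh.b.c[k] += radiance[i][2] * basis[k];
            }
        }
        // Monte Carlo weight for uniformly distributed sample directions over the
        // sphere (use 2*pi/N instead for uniform hemisphere sampling).
        const float weight = 4.0f * 3.14159265f / numSamples;
        for (int k = 0; k < 4; ++k)
        {
            sh.r.c[k] *= weight;
            sh.g.c[k] *= weight;
            sh.b.c[k] *= weight;
        }
        return sh;
    }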

Using the light map

After baking the light map, the direct lighting is rendered in the usual way at run-time. A point light is used to approximate the area light from the ray traced version, so the difference is more noticeable at the shadow edges.

direct lighting only, real time version
direct lighting only, ray traced version

Then we sample the SH coefficients from the light map to calculate the indirect lighting:

indirect lighting only, real time version
indirect lighting only, ray traced version
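As a rough illustration of that sampling step (my own sketch rather than the author’s shader; in practice this would run in the pixel shader with the coefficients fetched from the three light map textures), evaluating the stored SH in the direction of the surface normal might look like this:

    struct Float3 { float x, y, z; };

    // shR/shG/shB are the four coefficients sampled from the three light map textures.
    Float3 EvaluateSHIrradiance(const float shR[4], const float shG[4],
                                const float shB[4], const Float3& n)
    {
        // Cosine-lobe convolution constants for 2-band SH (Ramamoorthi & Hanrahan):
        // pi * 0.282095 and (2*pi/3) * 0.488603.
        const float c0 = 0.886227f;
        const float c1 = 1.023328f;

        Float3 e;
        e.x = c0 * shR[0] + c1 * (shR[1] * n.y + shR[2] * n.z + shR[3] * n.x);
        e.y = c0 * shG[0] + c1 * (shG[1] * n.y + shG[2] * n.z + shG[3] * n.x);
        e.z = c0 * shB[0] + c1 * (shB[1] * n.y + shB[2] * n.z + shB[3] * n.x);
        return e;   // irradiance; multiply by albedo / pi for the diffuse response
    }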

Combining the direct and indirect lighting, the final result becomes:

direct + indirect lighting, real time version
direct + indirect lighting, ray traced version

 

As we store the light map in SH, we can apply a normal map to the mesh to change the reflected radiance.
Rendered with normal map
Indirect lighting with normal map

We can also apply some tessellation and add some ambient occlusion (AO) to make the result more interesting:

Rendered with light map, normal map, tessellation and AO
Rendered with light map, normal map, tessellation and AO

Conclusion

This post gives an overview of how to bake a light map of indirect lighting data by sampling the photon map. I use SH to store the incoming radiance, but other data can be stored, such as the reflected diffuse radiance of the surface, which would reduce texture storage and doesn’t require a floating point texture. Besides, the SH coefficients can be stored per vertex in the static mesh instead of in a light map. Lastly, by sampling the photon map with final gather rays, light probes for dynamic objects can also be baked using similar methods.


Localization Pipeline

Original Author: Michael A. Carr-Robb-John

In my previous post on localization I talked about some of my experiences localizing games for different languages / regions. This time I wanted to expand upon those notes a little and talk more about the technical aspects of localization and walk through a pipeline.

The Language and Locale Encoding

In the early days I used to simply have an enumeration in a header file that was very similar to this:

 

enum ELanguage
{
    eLanguage_English,
    eLanguage_French,
    eLanguage_German,
    eLanguage_Spanish,
    eLanguage_Amount
};


20 years ago this was fine: I was developing on a cartridge that had all the languages essentially loaded at once, and there was really no need to support regions beyond the specific languages. These days, however, we need something a little more robust, and as you should have picked up from my last post, the locale is very important. So let’s start by looking at how we identify each translation. Thankfully, two very useful standards have been defined by people who know a lot more about languages and regions than I do. These standards allow us to specify each language and each region as a short two-letter code.

Language Code www.loc.gov/standards/iso639-2/php/code_list.php

Region Code www.iso.org/iso/country_codes/iso_3166_code_lists/country_names_and_code_elements.htm

Using these we can create a short code for every possible supported language and region we are likely to encounter, for example:

    en-US     English America
    en-GB     English Great Britain
    es-MX     Spanish Mexico
    nl-NL     Dutch Netherlands
    en-CA     English Canada
    fr-CA     French Canada
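As a small illustration (the struct and function names here are hypothetical, not part of any standard API), such a code can be split into its language and region parts like so:

    #include <string>

    struct LocaleCode
    {
        std::string language;   // ISO 639 code, e.g. "en"
        std::string region;     // ISO 3166 code, e.g. "US"
    };

    LocaleCode ParseLocaleCode(const std::string& code)   // e.g. "en-US"
    {
        LocaleCode result;
        const std::string::size_type dash = code.find('-');
        if (dash != std::string::npos)
        {
            result.language = code.substr(0, dash);
            result.region   = code.substr(dash + 1);
        }
        else
        {
            result.language = code;   // language-only fallback, e.g. "en"
        }
        return result;
    }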

 

A Pipeline

This is by no means the only pipeline that can be used for localization; they all have different benefits and issues, and this one just happens to be my preference, probably because I like offline tools.

 

 

The storage and manipulation of localized strings I have seen done in every way possible, from databases to proprietary editing tools. My personal choice is to use Excel for editing and manipulation, but this comes with two issues that you should be aware of:

  • Although version control software is generally fairly good at merging XML files, the XML generated from Excel always seems to make merging difficult (especially for designers), to the point that it is safest to simply lock the file while it is being edited.
  • Not all translators like to work in Excel, so you will probably need someone or a tool (probably both) to convert whatever format the translators are working in into the Excel format.

An example of the strings in Excel:

Column A contains the identifier string, and each subsequent column contains one translation. Notice the encoding ID at the top of the sheet: this not only tells us the language / region but is also used by the exporter tool to know which files to generate. The export tool exports the data into whatever binary or compressed format you prefer to use in-game. Since I have not worked on a game with massive amounts of text, I have generally stuck to a text format, with each language being written out as a separate text file, like so:

en-US.lang

    PRESS_START=Press [Start]
    OPTIONS=Options
    MUSIC=Music
    …

fr-FR.lang

    PRESS_START=Appuie sur [START]
    OPTIONS=Options
    MUSIC=Musique
    …

 

Depending on which language / locale is required at run-time just that single translation file is loaded into memory.

The exporter tool can also be useful in other ways:

  • Automatically detect and report missing strings.
  • Build fonts based upon the characters that are actually used. This is very important if you are doing Chinese, which has thousands of characters; this method alone has been known to save megabytes of texture space.
  • Detect formatting mistakes and illegal / reserved characters.

Strings in Code / Scripts

In order to provide a framework for localization the first thing that needs to be cracked down upon is the use of strings themselves.

Previously you might have written the code or script:

     DrawString(“Hello World, My name is Mr Flibble.”);

instead it should now be written passing a String Identifier like so,

     DrawString( eStringId_HelloMessage );

The string enumerations can be auto-generated by the export tool; however, I did this for a couple of projects and decided that it was more hassle than it was worth. My recommendation is to avoid this if possible. A better way is to pass the string identifier as a string itself:

     DrawString( “Hello_Message” );

Either way, both methods end up looking into a table to find the specific string to be displayed.
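For illustration, here is a minimal sketch of such a table, assuming the KEY=Value text format shown earlier; the class and function names are mine, not the author’s actual API:

    #include <fstream>
    #include <string>
    #include <unordered_map>

    class StringTable
    {
    public:
        bool Load(const std::string& path)          // e.g. "en-US.lang"
        {
            std::ifstream file(path);
            if (!file)
                return false;
            std::string line;
            while (std::getline(file, line))
            {
                const std::string::size_type eq = line.find('=');
                if (eq == std::string::npos)
                    continue;                        // skip malformed lines
                m_strings[line.substr(0, eq)] = line.substr(eq + 1);
            }
            return true;
        }

        // Returns the identifier itself when missing, which makes gaps easy to spot in-game.
        std::string Get(const std::string& id) const
        {
            const auto it = m_strings.find(id);
            return (it != m_strings.end()) ? it->second : id;
        }

    private:
        std::unordered_map<std::string, std::string> m_strings;
    };

A call like DrawString( “Hello_Message” ) would then resolve its identifier through something like table.Get(“Hello_Message”) before rendering.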

 

Encoding

There are quite a few text encoding systems out there. Since this ground has been walked quite a few times in a lot of other posts, I’ll skip it here with only a note that, for game development, my take on the subject is: if you are working with limited memory, use UTF-8; otherwise use UTF-16.

 

Icons

More often than not it is far simpler to insert an icon into a string than it is to use a long, drawn-out explanation to describe something. In the text string I indicate where an icon is to be displayed, and which one, by using the [] markers, for example:

    Press [START] to continue.
    Activate [GEM] by pulling string.

Part of my text rendering manager loads a setup file (text again) at startup that contains a list of all the codes and textures to use when that icon is encountered. Very similar to this:

    START, 0, X360_StartButton.tga
    MOVESTICK, 0, X360_LS.tga

I can add additional textures on the line if I want to animate the icon, for example:

     DODGE, 4, Wii_RemoteWave_1.tga, Wii_RemoteWave_2.tga, Wii_RemoteWave_3.tga

The number after the code is the animation speed (FPS).
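A hypothetical parser for those setup lines (names and error handling here are illustrative only, not the shipped code) could be as simple as:

    #include <sstream>
    #include <string>
    #include <vector>

    struct IconDef
    {
        std::string              code;        // e.g. "START", "DODGE"
        float                    fps = 0.0f;  // animation speed; 0 = static icon
        std::vector<std::string> textures;    // one or more texture frames
    };

    IconDef ParseIconLine(const std::string& line)
    {
        IconDef icon;
        std::stringstream stream(line);
        std::string token;
        int index = 0;
        while (std::getline(stream, token, ','))
        {
            // Trim surrounding whitespace from the token.
            const auto first = token.find_first_not_of(" \t");
            const auto last  = token.find_last_not_of(" \t");
            token = (first == std::string::npos) ? std::string() : token.substr(first, last - first + 1);

            if (index == 0)
                icon.code = token;
            else if (index == 1)
                icon.fps = token.empty() ? 0.0f : std::stof(token);
            else
                icon.textures.push_back(token);
            ++index;
        }
        return icon;
    }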

On the subject of icons, consider this:

     "Use [RS] to aim and [RT] to mark enemy before pressing [A] to fire."

Imagine that your project is multi-platform, [A] should really be [X] on the PS3 and [B] on the Wii. An additional issue is that the Wii doesn’t generally have a [RS]! You could create a string unique to each platform but that really would just double or triple the amount of data that needs to be maintained as and when things change.

My solution in the past to this little nightmare has been to ban platform specific icon names, which includes identifiers like [D-PadLeft], [A], [LeftStick], [X], [Y], [Z], [RT], [L1], etc. Instead I encourage game descriptive text:

     "Use [TARGETTING] to aim and [TARGETREGISTER] to mark enemy before pressing [FIRE] to fire."

Then I have a different icon setup file for each platform and everything works between platforms without any major headaches.

 

Parameters

It’s quite common to construct a string for display on screen; however, it can cause issues for the translators if they don’t know the context. Consider this:

     DrawString(“%s! Get rid of them!”, m_PlayersName );

Now you can see straight away that the %s will be replaced with the player’s name; however, what the translators see is:

     “%s! Get rid of them!”

Their best guess might be that it is going to be the name of a character, but it might also be something else, i.e.:

    “Chairs! Get rid of them!”
    “Michael! Get rid of them!”

In order to help the translators I use {} to mark parameters:

    “{s-Name}! Get rid of them!”
    “{s-Object}! Get rid of them!”

The text after the ‘-’ is ignored when rendering the text; it is purely descriptive, there to help the translators.
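Here is a hedged sketch of how a renderer might expand these markers at runtime (my own illustration; only string parameters are handled and the function name is made up):

    #include <string>
    #include <vector>

    std::string ExpandParameters(const std::string& text,
                                 const std::vector<std::string>& args)
    {
        std::string result;
        std::size_t argIndex = 0;

        for (std::size_t i = 0; i < text.size(); ++i)
        {
            if (text[i] != '{')
            {
                result += text[i];
                continue;
            }
            const std::size_t close = text.find('}', i);
            if (close == std::string::npos)
            {
                result += text.substr(i);   // unterminated marker; emit it as-is
                break;
            }
            // text[i + 1] is the type character ('s' here); everything after the
            // '-' is descriptive only, so the whole marker is simply replaced.
            if (argIndex < args.size())
                result += args[argIndex++];
            i = close;                      // skip to the end of the marker
        }
        return result;
    }

With that, ExpandParameters("{s-Name}! Get rid of them!", { "Michael" }) would produce "Michael! Get rid of them!".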

 

Formatting

As I’m sure we are all aware by now, not everyone writes the date in the same way.

Consider the date 3/4/2012: to me personally this is 3rd April 2012, but to some it is 4th March 2012. Obviously, once you get past halfway through the month it becomes a lot easier to spot, but it does mean that your region needs to know which date format to use.
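One data-driven way to handle this (the format strings and function below are illustrative, not from my engine) is to store a strftime-style date format per locale alongside the rest of the localization data:

    #include <ctime>
    #include <string>
    #include <unordered_map>

    // strftime-style formats per locale; in a real pipeline these would be loaded from data.
    static const std::unordered_map<std::string, std::string> kDateFormats =
    {
        { "en-US", "%m/%d/%Y" },   // 3 April 2012 -> "04/03/2012"
        { "en-GB", "%d/%m/%Y" },   // 3 April 2012 -> "03/04/2012"
        { "de-DE", "%d.%m.%Y" },
    };

    std::string FormatDate(const std::tm& date, const std::string& locale)
    {
        const auto it = kDateFormats.find(locale);
        const std::string format = (it != kDateFormats.end()) ? it->second : "%Y-%m-%d";

        char buffer[32] = {};
        std::strftime(buffer, sizeof(buffer), format.c_str(), &date);
        return buffer;
    }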

 

Translators

Good translators should produce strings in the new language that are roughly the same length as the original string. I usually estimate a rough 20% difference between English and the other languages. This is another useful feature I have built into my tool; it can detect excessive differences between the lengths of the various translations.
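That check could be as simple as the following sketch (my own illustration; note that comparing byte lengths is only a rough proxy once you move beyond ASCII, and the threshold is a judgment call):

    #include <cmath>
    #include <string>

    bool IsSuspiciousLength(const std::string& english,
                            const std::string& translation,
                            float allowedRatio = 0.5f)   // 50% gives some extra slack over the ~20% norm
    {
        if (english.empty())
            return !translation.empty();

        // With UTF-8 this compares byte counts, which only approximates on-screen length.
        const float difference =
            std::fabs(static_cast<float>(translation.size()) -
                      static_cast<float>(english.size())) /
            static_cast<float>(english.size());
        return difference > allowedRatio;
    }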

Translators MUST NOT change the order of parameters in a translation. This is obvious from a programming standpoint, but I have in the past had translations that not only re-ordered the parameters but added additional ones as well!

Keep the layers of communication between you and the translators to a minimum; there have been times when a 5-10 minute email or phone call could have solved a problem, but because it had to be filtered through channels it ended up taking days or even weeks to sort out.

 

Assets

Asset management for localization has the potential to touch so many different moving parts of an engine that it very quickly stops being funny. The solution I describe here is tailored to the way my engine works and may not be applicable to how your tech works; still, you might find it useful.

When an asset is requested my manager has a list of directories that it scans for the requested asset, the first instance it finds is the file that gets loaded. By controlling which directories are in the list and their order I can in effect override assets according to the language and region.

This is my directory structure for localization:

If the requested audio file exists in the language-specific directory, that file will be loaded; if it doesn’t, the manager will carry on searching the other directories until it finds the asset. Obviously I don’t allow the player the ability to change languages half-way through a game.

It’s simple but it works.
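For illustration, the lookup itself can be a few lines (directory names and the function name here are made up, not from my engine):

    #include <filesystem>
    #include <string>
    #include <vector>

    // The first directory containing the requested file wins, so a locale-specific
    // folder placed ahead of the default one silently overrides the asset.
    std::string ResolveAssetPath(const std::string& assetName,
                                 const std::vector<std::string>& searchDirs)
    {
        for (const std::string& dir : searchDirs)
        {
            const std::filesystem::path candidate = std::filesystem::path(dir) / assetName;
            if (std::filesystem::exists(candidate))
                return candidate.string();
        }
        return {};   // not found in any directory
    }

    // Example search order for a French (France) build:
    //   { "Data/Audio/fr-FR", "Data/Audio/fr", "Data/Audio" }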

 

Finally

That’s everything I wanted to talk about in relation to localization; I hope you find it useful.

 

Fixing a Slow Server

Original Author: Ted Spence

A colleague of mine has a server performance problem. It’s a regularly scheduled task that has to work through tons of raw data in a database. The task is reasonably fast except in rare cases. Naturally, over time the volume of data will only increase; so what can we do to make it more reliable and predictable?

Okay, first off, I cheated a bit. This problem hasn’t just happened to one colleague, it’s happened to virtually all of them at one point or another. Slow tasks occur naturally due to the inevitable deterioration of software systems over time – more data accumulates, other features get added, the operating system gets patched, but the task still runs on the same old hardware.

When you first write the task it runs great: Compile, build, tune, and it’s done. Then you forget about it for a few months and it gets slower and slower every time it runs. It’s like the frog in the saucepan – because the change is so gradual, you don’t think about it until it reaches the point where it causes pain to everyone.

If you have a chance to design these tasks fresh from the start, there are lots of great ideas about how to build them so they can scale. But let’s spend today’s article talking about what you can do to diagnose and fix a program – one that isn’t yours – without radical surgery or rewriting it from the ground up.

Isolate your Environment

Before beginning a performance tuning project, you need to isolate your program. It’s generally not a good idea to do performance tuning directly on a live server, but most development environments need to be carefully configured in order to provide useful performance tuning work. This may actually be the toughest step in your work: the programs that get ignored (and gradually accumulate performance problems) tend to be the ones with lots of hidden dependencies.

So, let’s take a few first steps:

  • Set up an isolated development environment. If you can run the entire program on a single desktop or laptop, great; if not, let’s restore to a new machine or VM. If the program requires multiple machines, go get your disaster recovery plans and use them to restore a working environment. If the plans don’t work, this is a great opportunity to fix them and get them right!
  • Use virtualization liberally. Hopefully you have a big VMWare or Xen cluster in your office; if not, just pay Amazon or Rackspace to host an instance for a few days. Most importantly, write scripts to set up these servers. Write a script to do all the manual fixups that the environment requires (i.e. installing programs, changing configuration settings, and copying files). Eventually, the goal is to hit a button and have a test / development environment magically appear.
  • Once the environment is restored, get your unit tests running. Verify that all your unit tests work. You’d be surprised how often these unit tests fail! Keep at it until your isolated environment passes all the tests. If you don’t have tests, well … now would be a great time to write a few.
  • Refactor the program and break out the task you want to improve. I virtually guarantee you that your task is wrapped up in lots of layers of unnecessary code. Try to redo the task so it can be run all by itself, without triggering any other systems. Ideally, it could be a command line program run with a small selection of options.

With this in place, you have everything you need to start breaking down the problem.

Monitoring Is People

Next, you need to know how the task is working in order to know what to improve. You need some statistics; meaningful ones, something better than CPU/Memory/Disk utilization. Here are some useful statistics you might want to gather:

  • Duration – How long does the task take to run? How long did it take back when everyone used to be happy with it?
  • Errors & Warnings – How many errors, if any, does the task encounter when it runs? Do you have a full log output from each time the task runs?
  • How often is it supposed to run? Is it supposed to work daily but they only run it weekly? Do they wish it was continuous?
  • Size of the task queue. How many items are waiting to be processed?
  • Average wait time. How long does an average item take to process?

Monitoring can go great with a command line task. I like to build long running server tasks as command line executables which record their progress in the database when they launch and when they finish. I can then use a clever performance monitoring package to check how long the task took to run. I take the task’s console output logs and write them to disk and cycle them out after a reasonable period of time – say 90 days.

Fixing Performance

So now that you’ve got a working environment, try drawing up a flowchart of the steps the application goes through. Break it down to meaningful levels and explain it to the guy or girl sitting next to you, even if that person doesn’t know anything about the program (in fact, if you have to explain it to somebody new, it’s often better – the challenge forces you to clarify your thoughts).

With this flowchart in hand, let’s start trying to figure out what kind of a performance problem we have.

  • Sequential tasks that could be run in parallel – Check the dependencies in each stage; can you fire off three steps at once and then just wait for them all to finish?
  • Data that can be pre-computed and cached – Is there any work that can be moved outside of the task? For example, let’s imagine my program builds a taxi price chart for London and updates it each week. It doesn’t make sense to have my program query Google Maps for distance information each week; instead, I should gather that data once, cache it, and only update it on demand. Perhaps you should set an “age” limit on each cached data element and have a secondary task go through and parse the old data periodically?
  • Work that can be divided – Sooner or later, in every big complex task, you find the core: the gigantic loop that hits every data element in your project. When you finally find this part you’ll know you’re getting close to the end. This is the component that you want to “Map/Reduce”: you have tons of data and a small element of code that must be run on each item of data.

What happens if you can’t map/reduce your data? Don’t worry: even if you’re not writing for a Hadoop environment you can still split up the work. Consider writing an inherently parallel version of your program that runs multiple items at once. What prevents you from running two threads at once? Ten? Twenty? What prevents you from having dozens of servers doing this work at the same time?

The answer is likely to be well understood by anyone who does parallel work: synchronization and bottlenecks. You synchronize your tasks by making sure each server doesn’t interfere with the next one, and you identify the bottlenecks by noticing where the pain occurs. Let’s start with some simple pseudocode that could be used as a wrapper around your project:

function MyGiganticLoop()
{
    // Retrieve all the cached information I need in order to work
    Setup();
    while (true) {

        // Contact a centralized server to obtain one work item
        var WorkItem = RequestWorkFrom(DISPATCHER_HOST_NAME);
        if (WorkItem != null) {

            // This is my "high intensity" function that takes a lot of processing time
            var WorkResults = DoMyTask(WorkItem);

            // Update the work item with my results
            MarkWorkItemComplete(WorkItem, WorkResults, DISPATCHER_HOST_NAME);
        } else {

            // The queue must be empty - I'll sleep for a reasonable time and check again
            Sleep(1000);
        }
    }
}

This code is useful because it uses a centralized dispatcher server to ensure that each task client continues to work as fast as possible. Ideally, the centralized server keeps a list of all the work items that need to be executed. Each time a client contacts it, the dispatcher marks that item as “in progress” and hands it to a client. If the client crashes or fails to complete a work item in time, the dispatcher can return it into the queue and allow another client to try.
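To make that concrete, here is a hedged sketch of what such a dispatcher might track internally (the names are illustrative and the network layer that RequestWorkFrom talks to is omitted):

    #include <chrono>
    #include <deque>
    #include <unordered_map>

    using Clock = std::chrono::steady_clock;

    struct WorkItem { int id = 0; /* ...payload... */ };

    class Dispatcher
    {
    public:
        explicit Dispatcher(std::chrono::seconds timeout) : m_timeout(timeout) {}

        void Enqueue(const WorkItem& item) { m_pending.push_back(item); }

        // Called when a client asks for work; returns false if nothing is pending.
        bool RequestWork(WorkItem& out)
        {
            RequeueExpired();
            if (m_pending.empty())
                return false;
            out = m_pending.front();
            m_pending.pop_front();
            m_inProgress[out.id] = { out, Clock::now() };
            return true;
        }

        void MarkComplete(int itemId) { m_inProgress.erase(itemId); }

    private:
        struct InProgress { WorkItem item; Clock::time_point startedAt; };

        // Anything that has been out too long goes back on the queue for another client.
        void RequeueExpired()
        {
            const auto now = Clock::now();
            for (auto it = m_inProgress.begin(); it != m_inProgress.end(); )
            {
                if (now - it->second.startedAt > m_timeout)
                {
                    m_pending.push_back(it->second.item);
                    it = m_inProgress.erase(it);
                }
                else
                {
                    ++it;
                }
            }
        }

        std::chrono::seconds m_timeout;
        std::deque<WorkItem> m_pending;
        std::unordered_map<int, InProgress> m_inProgress;
    };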

More importantly, in this pattern, you have the option to scale pretty much linearly (provided your central dispatcher isn’t a performance hog itself). You can simply monitor the work queue and spin up new instances when it falls behind, and shut them down when you catch up.

But what if this doesn’t resolve the issue? You may find, when you get down to this level, that the function DoMyTask(WorkItem) isn’t actually where all the time is being spent. Perhaps there’s a database call that is the culprit. But now you have isolated your environment enough to be sure you’re on the right track.

Oh, and when you’re done – Finish up your work with a code review of the change. Walk through every line of code you had to modify and explain it. It will take a long time, but it will be worthwhile. Happy tuning!


A Big Jumbled Blog About Joining Team Audio

Original Author: Ariel Gross

I keep writing and rewriting this blog. First it was going to be about our hiring process at Volition. Then it was going to be about what it takes to join the Volition audio team. Then it was going to be about a few things that I’m looking for in audio design candidates. Then it was going to be about some of the resumes that I’ve seen and explain how certain things do not qualify people to be in-house game audio designers. Then it was going to be about jerky things that I think developers do to applicants.

I realized that I was doing that thing that I always do, which is spend an hour writing different titles for my blog, fantasize about the content, try to define the blog and what its importance was, and not get anywhere. This is one of my curses when it comes to blogging. And it’s silly. So, now I’m just gonna write this thing. Screw it! I’m writing it! And it’s just going to be a big jumbled blog about all that stuff.

The Volition Audio Hiring Process

I think it all really started with Anne. Anne is the project manager for audio at Volition. She’s an innovator. I’d like to think that I am, too, but I’d rather someone else say it about me than to say it about myself. “I’m an innovator!” Sure ya are, buddy (wink-mouthclick-point). At the very least, I am an early adopter. So, with Anne and I working in the same department, experimentation can sometimes be the path of least resistance. (wink-mouthclick-point-jump-heelclick-belch) I don’t know why I just wrote that.

Anyway, we decided to put the audio hiring process up for discussion and change.

We kept a central person to review all incoming applicants. That would be me. I’d scrap a bunch of incoming applicants because I could tell by reading the cover letter and resume that a person did not have the stuff. I will talk more about that later. If someone piqued my interest, I would pass their cover letter, resume, and demo materials along to the rest of the audio team. I’d get feedback and then decide if we wanted to proceed with the candidate to the next step.

The next step would be some kind of test. Previously, we had sent out a written test that had a bunch of questions on it. Stuff like, what do you consider to be the three most important areas of sounds in an open world game? What do you think would be difficult about working on audio in an open world game? And if you had to design a beam weapon, how would you put it together both creatively and technically? And a bunch of other riddles and puzzles and noodle-ticklers that usually had no specific correct answer but plenty of potential incorrect or awkward answers.

We decided to kill the written test. We had all taken similar written tests and decided that they were annoying and time consuming. Additionally, those are the types of things that we discuss on a daily basis within the group. If someone in the Volition audio team were to say, “I’d design a beam weapon as 330 one-shot sound effects of varying lengths,” then one of us would say, “That seems like an odd approach,” while cleaning up all the barf. Also, we could ask questions like that over the phone or in person and it would allow for some back-and-forth.

Previously, if the applicant had gotten past the written test stage, they would go on to the video test. This is where we would send out a video capture of Red Faction: Guerrilla with the sound stripped out and would ask the applicant to replace the sounds with their own creations. This method is decent for exposing a candidate’s sound design skill. They would need to design some weapons, some impacts, a vehicle engine, ambience… it would give us back a pretty good variety of sounds that would be relevant to their jobs.

But there are some problems with that method. We do some linear work within the Volition audio team, but the vast majority of our work is non-linear. Also, it’s easy to get caught up in the little gotchas, like, did they get every footstep? Did they notice that piece of metal in the distance falling over? Did they notice that the player is low on health in this section? And if we weren’t really careful about it, we might be mentally dinging an otherwise awesome candidate because they missed that little visual cue, which again, would be something that could be addressed in a feedback session if they were working here. We also tended to get a lot of very similar results back. Also, it doesn’t really give the candidate much of a chance to show us if they can get a point across or tell a story with sound, although that was partly because of the footage we would send.

Byron had heard that some other companies were instead sending out a scenario that is written out in text. The candidate is asked to read the scenario and to then send back an audio file. That is, there is no video component to sync the audio to. There are some limitations, like how long they have to do it (two weeks) and the duration of the .wav file that they send back (60-120 seconds), but other than that, it’s really up to the candidate to tell the story with their sounds. Not only does the candidate have more creative liberty, but they also get to completely control the pacing. They can tell a better story this way. So we did it. After getting a couple of these tests back, we decided that we were able to tell a lot more about our candidates than what we were able to tell from the video method.

So, the new process to this point would be to look at the applicant’s materials that they’ve sent, then pass along the good ones to the team for discussion, and then to send out this new test. If we liked their test, we’d schedule a phone interview.

In the past, we had tried a couple different methods for the phone interview. The first method involved getting the entire audio team in a conference room and calling the poor applicant as a team. We would all go around the room asking questions off of a piece of paper. Lots of standard questions like, “Do you have any weaknesses? No? Okay, next question,” and, “What’s a game you think had cool sound? Saints Row? +10 points.” It was too rigid and it didn’t really give us a sense of who this person was.

So, the pendulum swung to the complete other side and we went paperless and very spontaneous. One might say unprepared. But we were all still in the conference room. Like seven or eight people asking all sorts of disconnected questions, like, “What kind of music do you listen to,” followed by, “How would you design a tool to implement ambience,” followed by, “What’s your favorite plug-in?” My opinion was that it was a complete mess, and although we sometimes would get a better sense of who this person was than with the worksheet full of standard interview questions, it was spotty at best.

This time around, the phone interview was two people. I was in all the phone interviews and we rotated the other members of the audio team. We would meet for 15 minutes before the phone interview and toss around a few questions that we’d like to ask. Usually we would each have around five questions that we wanted to ask, and the rest of the time was left open for banter and rambling. Banter and rambling actually means a lot to me. I want to know how this person banters and rambles. In the end, I’d say that it went the best that it has ever gone. There are probably ways to improve the process, but it worked out better than anything before.

At this point, after the application process, the test, and the phone interview, we had a pretty good sense of this person. There was just one last test to go. The on-site interview.

To me, the on-site interview has a primary purpose, which is that I want to see how well this person is going to fit in with the team. I already know that this person is qualified or they wouldn’t get to this point. So, for the on-site, I just want to be able to predict whether or not I want to work with this person day in and day out. But there are lots of other things that we can find out during the on-site.

When the candidate shows up for their on-site, the first thing we do is gather up the audio team and a few other relevant people and listen to the candidate’s test a couple times with the candidate in the room. Then we start the critique.

I find that it’s actually pretty tricky, because the candidate wouldn’t be sitting in a Volition conference room if their test was bad. My favorite question that I heard asked was, “If the tables were turned and you had to critique this test, what criticism would you have?” This was followed by, “Now respond to your own criticism.” We would also pick a section and say, “What would you do to make this section more realistic?” This would be followed by, “Okay, same section, but how would you make it funnier?”

The responses to the questions aren’t really the point to me, anyway. So, it doesn’t really matter what we criticize. We do this because we want to know how the person reacts to feedback. Do they get defensive? Do they struggle with coming up with new approaches? Do they clam up and seem defeated? I personally give the candidate lots of slack, too, since they’re in the hot seat for a job that they presumably want really badly.

After the critique, they get a little face time with me and our audio programmer and we tend to ask more technical questions. I want to get a better sense of whether this person has serious technical chops or if they are more of a content creator. Or maybe both.

This is followed by some show n’ tell of what we’re working on. We take them into one of our offices and just play the game. We talk about what we’re working on, what gets us excited about the project, we play the game in front of them, and we check out their reaction. We usually get some good questions from the candidate at this point in the process. The questions that they ask during the show n’ tell of the game give me an indication of where their head is at, what they’re most interested in, stuff like that.

Then it’s lunch with the audio team. Lunch is important. This is the first chance that we have to see how the candidate behaves with the audio team in a more social setting. We’re not in the office. We’re not sitting in front of a computer. We’re not grilling the person. We continue to ask questions, but they’re social ones. Anne likes to ask questions like, “What would you be leaving behind if you were to move here?” And I tend to ask stuff like, “What kind of music do you listen to?” More personal questions. It tells me a lot about the candidate. It’s also a chance to reset the candidate and get them ready for what’s coming next.

After lunch, we have an interview gauntlet. Three hours of interviews with people from the audio team as well as people from production, studio management, writing, design, and whoever else we think would be able to give us an interesting perspective on this person. This is probably the most stressful part of the on-site. After this, it’s usually around 4:30pm, and the candidate is probably like, “I need a drink.” Which is exactly what we do.

It was Anne’s idea and it has proved to be another great one. We have formalized drinks as the way that we end our on-site interview process. We promptly head out to a local bar and have beers for two hours. I find this to be the most interesting part of the on-site because once you get a beer or two into someone, especially after an extremely stressful day, they tend to open up. To me, this is an essential part of the interview process because we start to see who this person really is.

After that, they go home and we make a decision in the following days. If you’ve read this far, well, I’ve just barely gotten started! Sorry boutcha! I haven’t blogged in a couple months and I have all this crap rattling around in the ol’ fleshy hat rack.

What It Takes To Work With Us

If you’re applying to join Team Audio at Volition, your odds of actually joining us are very low. It doesn’t matter if you’re straight out of school or if you’ve been in the industry for 20 years, the odds are still very low. It doesn’t matter if you’ve never worked on a game before or if you have 20 games under your belt. Still low.

For this round of hiring, we had 64 applicants that made it through HR and landed on the network for me to check out. That’s the lowest that I’ve seen since I’ve been here, and it’s probably because we didn’t post the job to Gamasutra or other job sites. Typically we have well over 100, but for what I’m about to say, let’s go with 64. Out of these 64, we had on-sites for three of them. Of those three, we hired two. So, purely looking at the numbers, without taking anything else into account, if you applied, you had around a 4% chance of getting an on-site interview, and around a 3% chance of getting hired. So, I would call that a pretty small chance. I’m assuming that I did the math right, there. I think I did.

I want to add that the people that we hired didn’t really have experience as in-house game audio designers. We were able to hire senior guys if we wanted to, but we didn’t. And I also want to add that we had senior candidates apply. Lots of them. Full on industry vets that were more than qualified for the job requirements that we posted. But we went with guys that had way less comparable experience. And I’ll also mention that we had that same criteria of experience and shipped titles on our job listing. So, why would we hire guys that didn’t have a ton of experience?

It’s because that stuff isn’t all that matters to us. And I think it matters a lot less to people like me who are in the position to hire other people than many might think. In fact, if you’re out there publicly complaining about the catch-22 of needing experience before getting an entry level job, I think you’re looking at it from the wrong angle. Also, I might see you doing that, and as someone who gives people a chance, that might annoy me. Just sayin’! And believe me, I understand how you could see things this way. I was in your shoes not too long ago. I remember sending my resume and demo materials to over 100 developers before Volition saw my potential and hired me. And I had experience and titles! And even then, they didn’t really know what I was capable of until I had been working there for a while.

And that’s kinda the point. We can research you and check out your previous projects and watch your demo and talk to you on the phone and even meet you in person at the on-site and we really don’t know what we’re getting until you’ve been working here for a while. And I realize this. I’d like to think that most people in my position realize this. Which is why many of us are looking for something that can’t really be articulated very easily. Yes, we want to see an awesome demo reel. Yep, it would be great if you’ve got some experience. But there are things that mean way more to me than that stuff. I’m going to give you three things that I think are more important than all that other stuff.

First of all, I want to see that you have a purpose for wanting this job. Not what you’re doing (e.g. your resume and demo). Not how you’re doing it (e.g. your web site or blogs that show how you do what you do). But why you’re doing it. That’s tougher to show me. I realize this. But people have managed to do it. It’s in the tone of your presence on the Internet and in person. It’s written between the lines in the e-mails that you send me. It’s hidden in something you wrote on Twitter or on a comment on a Gamasutra article. It’s the sound of your voice and the look in your eyes when we’re talking in person. And if you’re starting to think this is unfair because it requires you to be active in some community that I’m a part of, then there’s one glorious place where you’re assured that I will see it no matter what. That is your cover letter. You should see the look on peoples’ faces when I tell them that I often get more out of a cover letter than a resume or a demo. But sometimes I do. That’s your shot at showing me that you have a purpose.

Secondly, I want to see the potential for growth. It doesn’t matter if you’ve been around the block, either. There are as many titans of the game audio industry as there are newbies who realize that this field changes so quickly that you still need to be able to grow and change. If you think you have figured this whole game audio thing out, well, there might still be somewhere out there for you, but it’s not Team Audio at Volition. I see this as going hand-in-hand with some other important qualities, like humility, a good sense of humor about yourself and your work, and the ability to take feedback. I wrap it all up in this thing that I call potential for growth.

And thirdly, I want to see that there’s more to you than audio. Team Audio at Volition tries to look at each other as complete human beings. We can all design and implement audio. But that’s not all there is to it, not by a long shot. There’s so much more to being part of this game development team than being a good audio designer or than knowing how to make things sound right in Wwise. Maybe you have some game design sensibilities. Maybe you’re able to make people laugh. Maybe you play an instrument. Maybe you know how to read a schematic. Maybe you’re good at making a point. Maybe you’re a futurist. Maybe you like to take random online classes. Maybe you like to put together puzzles. It doesn’t matter. I’m not just trying to look at you as an audio designer; I’m trying to look at you as a complete human being. The bigger the picture I can see, the more interested I may become.

One Swallow Does Not A Summer Make

After looking at so many audio applicants since being at Volition, even the most recent time, when there is more information about what it takes to get an in-house audio job out there than ever before, it has become apparent that some people still don’t understand the breadth of knowledge and skill that it takes to get an in-house audio gig at Volition.

I see a lot of applicants applying who have experience in some form of broadcast media. This is relevant, don’t get me wrong. There are things that you could learn at these places that might give you some skills that would apply to the work that we do. But if this is all you got, then it’s probable that you will be outgunned even for an entry level position.

I see a lot of applicants applying who emphasize that they are musicians or composers. This is also relevant. But if that’s all you got, you’ll be outgunned.

I see a lot of applicants coming from advertising. Again, relevant. All you got? Outgunned.

Lots of applicants coming from good schools. That’s it? Outgunned.

Theater audio? Outgunned.

Live sound? Outgunned.

Worked on a mod? Outgunned.

Helped engineer at a recording studio? Outgunned.

VO recording and editorial? Outgunned.

However, if you’ve done several of these things… I’m interested. You may or may not have noticed that this person hasn’t shipped a game or held an in-house audio position at a game developer. But what this person at least appears to have done is, well, a lot. They’ve done a lot. Even though they haven’t shipped a game, they’re interesting to me. If their cover letter and demo is good, they’re on the path to an interview. People like this definitely exist and they’re itching to get into game audio.

Now, if someone has shipped a game or two, has a solid demo reel, has some knowledge of how audio works in games, and also has a good cover letter and resume, they are definitely going to give this other person a run for their money. But that doesn’t automatically mean that they’ll get the job. If you read everything else I wrote above, there’s a lot of stuff that means a lot to me other than what someone has done in their past.

The reason I started this section with one swallow does not a summer make is because there are people out there who have dedicated huge amounts of effort to getting a job in game audio. Huge amounts of effort. Can’t overstate that. That’s what it takes. And like I said earlier, if you’ve got other skills that make you valuable, like knowing how to script, or knowing how to solve complex problems, or knowing how to build a synthesizer, or knowing how memory and streaming work, or knowing how to build a level in Hammer, or knowing how to recount something that happened to you in a compelling way, well, that is awesome, because that sounds like someone I might want to work with.

So, be honest about where you’re at and try to keep things in perspective. If you’ve applied to 100 game audio jobs and haven’t found one yet, take some time to think critically about yourself and what you bring to the table. Think about what you could do, what you could learn, who you could learn from, what it might take to make you someone that a company must hire. Then go do that stuff. There are a lot of people out there already doing it. But don’t let that discourage you. None of them are you and they can never be you.

Jerk Move, Potential Employer. Jerk Move.

Okay, this is the last thing I want to write about.

Why are developers being so jerky to their applicants? Do these people not realize or remember what it’s like to be an applicant?

My friend Dave Samuel, a kick ass VFX artist, put it this way, and I’ll never forget it. When you’re applying for jobs, a minute is like an hour, an hour is like a day, a day is like a week, a week is like a month, and a month is like ten years. These people are in agony, waiting with bated breath to hear back from you. It’s way better to get a rejection quickly than to be strung along for who-knows-how-long. Stringing people along is lame. I’m guilty of it, too. Nobody’s perfect. But I’m trying to get better. Try to tell your applicants how long it’s really going to take to get them moving to the next step.

If you’re rejecting someone, you can leave the door open. Sometimes your rejection letters can make people think that they will never have another shot at working there again. If that’s the case, well, okay then, I guess. But is it the case? Not as far as I’m concerned. The door is always open to reapply. The door is always open to talk to me. Even after I’ve rejected you. In fact, if I reject someone and they keep in touch, I see that as a good thing. It seems like a mature and smart thing to do.

Also, give your applicants your direct work e-mail address if you can. If you can’t give them that for some dumb reason, give them your home e-mail. Let them contact you. Encourage them to stay in touch. Build that relationship. It might turn into something amazing. To avoid screwing things up with HR, talk to your HR department about it first, but you can probably keep in touch with these people.

Before you hang up the phone from the phone interview, or after the on-site interview, tell the applicants that they should not hesitate to contact you if they want an update or for any reason. Remember, they’re going to be biting their fingernails off and dreaming about your response. They probably have diarrhea from all the stress. Even if they know first hand that this process can take a really long time, it doesn’t make it any less nerve wracking. There’s no reason to leave these people thinking that they’ll botch everything if they ask you what’s going on or if there’s any news. And if they do ask you for an update, be straight up with them. It’s okay to say that there are other applicants and that you can’t decide yet, or that the team has been too busy to make a decision, although if that’s the case, then that’s kind of annoying and you should probably consider addressing that.

Just remember that you’re dealing with a human being. Someone who you could potentially be working with, or someday this person might be looking at your application. Who knows? There are all the reasons in the world to treat these people like you’d like to be treated. If you have the power to make or break someone’s dreams, then wield that power like a kind and honorable king. I believe it’s the right thing to do.

The End

Alright, that’s a bunch of stuff. I feel like I’ve said my piece a few times over. If you actually read all of this, you should leave a comment or send me an e-mail or something. I’m impressed that you, or anyone, would read these ramblings. And that’s just what these are. Ramblings. Try not to take them too seriously. I’m just some schmuck.

On that note, I’d also like to point out that these are my opinions and mine alone. These opinions do not reflect Volition’s official positions, or the Volition audio team’s official positions, or THQ’s official positions, or the FLOTUS’s official opinions, or any other silly ideas that you might get in your head. Honestly they probably won’t even reflect my own opinions in a few months.

Also: no bologna this time. Sorry. Except for that one that I just wrote. And this one: bologna.

Never build upon closed-source frameworks

Original Author: Rob-Galanakis

A poster on tech-artists.org who is using Autodesk’s CAT asks:

 The problem I’m having: I can control the ears now manually, after setting up the reactions, but apparently the CAT system isn’t respecting any keyframes I set on the facial controls, neither in setup nor animation mode.

He already hinted at the answer in the same post:

Even though I’ve heard a couple of times not to rely on CAT in production…

So there’s your answer.

Never depend upon closed-source frameworks that aren’t ubiquitous and proven.

It’s true of CAT, it’s true of everything. And in fact it is one reason I’ll never go back to developing with .NET on Windows (the 3rd party ecosystem, not the framework itself). If you don’t have the source for something, you 1) will never fully understand it, and 2) never be able to sustain your use of it. When I was at BioWare Austin, and the Edmonton guys decided to switch to CAT, I was up in arms for exactly this reason. We had an aging framework- PuppetShop- but it worked well enough, we had the source, and acceptable levels of support (Kees, the author, is/was readily available). Instead, they switched to a closed-source product (CAT) from a vendor that has repeatedly showcased its awful support (Autodesk), and headache ensued. Fortunately I never had to deal with it and left some months after that. Without the source, you’re dependent upon someone whose job it is to provide support just good enough so you don’t leave their product (which is difficult since they own everything), and bad enough that you have to hire them to fix problems (they pride themselves on this level of support, which I call extortion).

As for the ubiquitous and proven part: unless this closed source solution is incredibly widespread (think Max/Maya, some engines or middleware, etc.), and has a lot of involved power users (i.e., Biped is widespread but difficult to get good community support for because it attracts so many novices), it means you have to figure out every single workaround, because you can’t actually fix the problem without the source. Imagine working with no Google: that’s what you give up when you use a closed-source framework that isn’t an industry standard.

So don’t do it. Don’t let others do it. If you’re currently doing it, think about getting the source and if you can’t do that, start making a backup plan.

Addendum: I should clarify that there is a difference between “using” and “building upon.” Using a physics or sound library is usually not “building upon”; using Unreal 3 would be. How much of your work directly deals with and extends the framework (building upon it), and how much of it is just calling library functions and putting your own interface around them (using the framework)? I have no problem “using” closed-source frameworks, even ones that are novel and a bit unproven, if they can be replaced relatively easily. However, if you’d basically have to redo all the work done so far to change it (as you would in this example), then my advice holds.


WPA–Xperf Trace Analysis Reimagined

Original Author: Bruce-Dawson


For many years xperfview.exe has been the main tool for analyzing xperf/ETW traces, but recent versions of the Windows Performance Toolkit have started including wpa.exe as an alternative. While the early versions had some significant rough edges, the latest version (6.2.8400.0, released in tandem with Windows 8 RC) is now superior to xperfview in most ways.

In this post I’ll briefly explain how to switch from using xperfview to WPA, and why this is worthwhile.

Before proceeding you should be sure to get the latest version of WPA, to avoid being frustrated by bugs that are already fixed. The latest version as of June 19, 2012, is available as part of the Windows Software Development Kit (SDK) for Windows 8 Release Preview. Gotta love those Microsoft names.

This post assumes some familiarity with xperf/ETW and xperfview. In particular, you should already know how to record a trace.

It may be useful to have read my xperf wait analysis post, but much of the UI flow is quite different with WPA, so some of the details will be different.

The two articles in the xperf documentation series covering CPU sampling and CPU scheduling are still relevant – understanding the column names is as important as ever. Those articles are timeless classics. The UI for selecting columns has changed, but that is easy enough to adjust to.

Getting started

When you first launch WPA it is dauntingly austere and blank. All of the graphs are collapsed and hidden:

The first step is to start dragging some graphs into the analysis area. The exact set of graphs depends on what type of analysis you are planning to do, and I don’t claim that my recommendations are one-size-fits-all, but they should be a reasonable starting point.

Generic Events

App-specific generic events can be crucial for navigating a trace. I use them to identify frame boundaries, user input, and other key events. With them I can see when the frame-rate drops, and I can see what events the drop correlates to. If you have followed the instructions at “Xperf Basics: Recording a Trace” then you should be calling functions like ETWMark() and ETWRenderFrameMark() – or whatever alternative functions you created – and the time-stamped data from these functions shows up in Generic Events.
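
As a rough illustration (this is not the ETWMark()/ETWRenderFrameMark() implementation referenced above, and the provider GUID below is a made-up placeholder), a helper like these can boil down to a couple of raw ETW calls; events emitted this way show up under their provider in the Generic Events table, though without a manifest the payload decoding is limited:

#include <windows.h>
#include <evntprov.h>  // EventRegister / EventWriteString

// Placeholder GUID for illustration only; a real provider uses its own GUID,
// enabled in the xperf/WPR recording session.
static const GUID kMarkProvider =
    { 0x01234567, 0x89ab, 0xcdef, { 0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef } };

static REGHANDLE g_markProvider = 0;

void InitMarkProvider()
{
    EventRegister(&kMarkProvider, nullptr, nullptr, &g_markProvider);
}

// Emits a time-stamped string event that appears in the trace.
void EmitMark(const wchar_t* text)
{
    if (g_markProvider)
        EventWriteString(g_markProvider, /*Level*/ 0, /*Keyword*/ 0, text);
}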

The display of generic events in xperfview, the old trace viewing tool, left much to be desired:

If I looked in the ProviderIds drop-down then I could see that these blue diamonds corresponded to my Valve-FrameRate provider, but that provider emits events for simulation ticks as well as render ticks, and the distinction is invisible in xperfview.

The display of generic events in WPA is much improved. The events are displayed hierarchically so that I can drill down and see the simulation and render ticks on separate lines. The layout is configurable so if I want to I can save a bit of space by removing “Task Name” from the hierarchy:

If Generic Events are relevant to your trace analysis then open up the System Activity section and drag Generic Events onto the analysis area. On the other hand, if you haven’t created and registered your own providers and put in calls to their functions then there won’t be much of interest in this graph and you should probably not bother with it.

Window in Focus

This graph shows which process’s window is active, which can help you decide where to analyze. Most games lower their frame rate when they are not active, and investigating a poor frame rate when the game is intentionally idling is pointless.

In the graph shown below the user switched from explorer to WLXPhotoGallery, and then DWM owned the active window for a while.

Aside: DWM becomes active whenever Windows detects that a program has hung – DWM takes over window management so that the user can still move the hung window around.

The Windows in Focus graph can be found in the System Activity section. Drag it over if you think it will be useful.

CPU Usage (Precise)

CPU Usage (Precise) is the graph formerly known as CPU Usage (there were three variants for grouping by thread/process/CPU). This graph is constructed from context switch data (plus interrupt and DPC data), which means it is a sub-microsecond-accurate record of which thread is running when and why, as well as exactly how much CPU time each thread and process is consuming. That seems pretty important, so you probably want this one available. If you open up the Computation section and then, within that, the CPU Usage (Precise) section, you will find several graphs. I normally use the “Utilization by Process, Thread” graph, but the “Timeline by Process, Thread” graph also looks interesting and makes some patterns more obvious. In particular, if two threads or processes are ping-ponging (taking turns executing), this behavior is much easier to follow on the timeline graph. However, the timeline graph devolves into solid bars of color when zoomed out – it really only works when examining fine details.

It is unfortunate that the color coding for the utilization and timeline graphs is not consistent.

The screen shot to the right shows the utilization and timeline graphs over a 20 ms time period.

CPU Usage (Sampled)

CPU Usage (Sampled) is the graph formerly known as CPU Sampling (there were three variants for grouping by thread/process/CPU). This data comes from the profile provider, which periodically interrupts all CPUs to see what they are doing, at a default rate of 1 kHz. Because this data is sampled it cannot tell you what is going on between samples. However, with enough samples and with well-behaved code, the sampling data can be extremely useful. In particular, if you have call stack collection enabled for the profile provider then each sample includes a call stack.

In general I would say that the main purpose of the sampled CPU data is looking at the call stacks, because this tells you (subject to sampling artifacts) what your code is actually doing. Therefore it seems passing strange that WPA defaults to not showing you call stacks.

Since most programs are CPU bound at least part of the time I think this data should always be enabled, so open up the Computation section, then the CPU Usage (Sampled) section, and grab one of the graphs. I’d recommend “Utilization by Process”.

When you drag this graph over you now have two graphs showing CPU usage – one precise and one sampled. That’s a waste of space. The precise CPU usage is, well, more precise, so we want its graph, and all we really want from the sampled data is call stacks, and those are in the summary table. That means we now make our first use of the buttons on the right side of the graph’s title bar. The three left buttons let you control whether to display “graph and table”, “graph only”, or “table only”. The default is graph only, but for CPU Usage (Sampled) I think we want table only, as shown above. This gives us a graph of precise CPU usage and a table of sampled CPU usage. Perfect.

The default set of columns for CPU Usage (Sampled) is similar to what it was for xperfview, which means that I think it is wrong. In particular, the stack column is off by default, which is peculiar since the call stacks are the whole raison d’être of this data. Read the CPU sampling documentation (http://randomascii.wordpress.com/2012/05/08/the-lost-xperf-documentationcpu-sampling/), or just trust me and use this set of columns as a starting point:

  • Process
  • Stack
  • Orange bar
  • Weight

To set the columns click on the View Editor (Ctrl+E) button that is circled in red in the screen shot below:

When you drill in to the sampled data call stacks the graphs are highlighted to show where the samples came from, which can be invaluable data once you learn how to interpret it.

Recap

At this point you should have something beautiful like this:

What about wait analysis?

Wait analysis is done with the CPU Usage (Precise) summary table. You can pick the columns yourself after reading the CPU scheduling documentation, or just go with this wait-analysis-appropriate set:

  • NewProcess
  • NewThreadId
  • NewThreadStack
  • ReadyingProcess
  • ReadyingThreadId
  • ReadyThreadStack
  • Orange bar
  • Count
  • TimeSinceLast (us) – Sum (sort by descending)
  • Ready (us) – Sum
  • Waits (us) – Sum
  • Freeze bar
  • CPU Usage (ms)

The freeze bar means that CPU Usage (ms) will continue to be displayed as the right-most column even when the window is too narrow for all columns, which makes the summary table useful for an accurate per-process/thread measure of CPU consumption.

I normally have the CPU usage (precise) summary table turned off and I make it visible when I need it. However I always have my favorite column orderings set up, ready and waiting for me.

Profiles -> Save Startup Profile

Once you get everything perfectly arranged, don’t forget to save your startup profile. If you create a profile that is well suited to the type of analysis that your company does then you can share it with your coworkers – the data is stored in an XML file in “Documents\WPA Files\Startup.wpaProfile”.

To save you some time, and to make creating tutorials easier, I’ve created a startup profile that you can install. I don’t claim that it is the one-true-startup-profile, but I think it’s a good place to begin and it contains everything I’ve discussed above. You can find this startup profile at ftp://ftp.cygnus-software.com/pub/WPAStartup.zip. Just download it and unzip the file into your “Documents\WPA Files” directory.

Basic navigation

As with xperfview you can zoom in/out with ctrl+mouse-wheel. You can also select a region and then zoom to that region, either in the main window (right-click, Zoom), in a new tab (right-click, Zoom all in new view), or just the current graph in a new tab (right-click, Zoom graph in new view). The workflow is a bit different from xperfview but ultimately the ability to do deep analysis in a single window makes it worthwhile.

In addition to mouse and mouse-wheel based zooming you can use Ctrl+"+" and Ctrl+"-" for keyboard-based zooming. They don’t zoom around the mouse, but at least I can finally zoom in/out a bit when I don’t have a mouse-wheel available.

Hidden data

WPA continues the fine xperfview tradition of hiding some of the disk analysis graphs. In particular the cool chart showing disk-head movements is now found by opening up the Storage section, right-clicking on Disk Usage, and selecting Disk Offset Graph. There’s something amazing and terrifying about seeing how far the physical read/write head on your poor disk sometimes ends up moving.

WPA Advantages

WPA has improved the trace analysis experience in several ways. Some of these include:

  • Asynchronous symbol loading – keep working during the (sometimes slow) symbol loading process
  • Easy saving and sharing of startup profiles
  • Deep analysis in one window
  • Better display of Generic events
  • Ctrl+"+" and Ctrl+"-" for keyboard-based zooming
  • Highlighting of relevant graph sections – makes viewing profile sample locations easier (more on this later)

WPA Bugs

WPA hasn’t quite reached its first full release and it does have some frustrating bugs and glitches, particularly around the use of Generic Events. Some of these are serious enough that they force me to use xperfview to extract some information, but I still use WPA for most day-to-day work. Your mileage may vary. The main glitches I have seen are:

  • Sorting of numeric fields in Generic Events is incorrect – the sorting is done alphabetically instead of numerically
  • Some Generic Events payloads aren’t decoded
  • Snapping to events when selecting a time range doesn’t work
  • Some parts of the UI still hang or are anomalously unresponsive during symbol loading

The xperf team has been responsive to bug reports and I assume that most of these bugs will be fixed in the final Windows 8 version. Fingers crossed.

Closing time…

I keep finding that ETW/xperf/WPA give me access to information that most other developers don’t have. This lets me find and fix performance problems that are otherwise invisible or intractable. I continue to enjoy having x-ray glasses that actually work.

Hack Day Report

Original Author: Niklas Frykholm

Last Friday, we had our second hack day (aka do-what-you-want day, aka google day) at the office.

Different companies seem to take different approaches to hack days. At some places it just means that you can spend a certain percentage of your working week on your own projects. We wanted something that was a bit more focused and felt more like a special event, so we used the following approach:

  • People were encouraged to pick tasks that could be completed, or taken to a “proof-of-concept” level in a single day. The goal was that at the end of the day you should have something interesting to show/tell your colleagues.

  • It is ok to fail of course. Failure is often interesting. Trying crazy ideas with a significant risk of spectacular failure is part of the charm of a hack day.

  • A couple of days before the event, everybody presented their projects. The idea was to get everybody to start thinking about the topics, so that we could help each other with ideas and suggestions.

  • We ate breakfast together in the morning to start the discussions and get everybody in the spirit of the event. At the end of the day, we rounded off with a couple of beers.

  • We avoided Skype, email and meetings during the day, so that we could focus 100 % on the projects.

  • A couple of days after the event we had a small show & tell, where everybody could present what they had learned.

Results

A number of interesting projects came out of this hack day:

  • Tobias and Mats created an improved highlighting system for indicating selected objects in the level editor. (Highlighting the OOBB works well for small objects, but for big things like landscapes and sub-levels, it is just confusing.)

  • Jim looked into a cross-platform solution for capturing screen shots and videos on target machines and transmitting them over the network.

  • Andreas created a Lua profiling tool that can dynamically enable and disable profiling for any Lua function by hot-patching the code with profiler calls.

  • Finally, I rewrote the collision algorithm for our particle systems.

Being an egotistical bastard, I will focus on my own project.

Particle collision is one of those annoying problems for which it is difficult to find a good general solution, for two reasons:

  • It ties together two completely different systems (particles and physics), creating an ugly coupling between them. Since the solution must have decent performance, the coupling must be done at a fairly low level, which makes it even worse.

  • Particles can have very different collision requirements. Some effects need a massive amount of particles (e.g., sparks), but don’t care that much about collision quality. As long as most of them bounce somewhat accurately, it is OK. Other effects may have just a single particle (e.g., a bullet casing). Performance doesn’t matter at all, but if it doesn’t bounce right you will surely notice. Handling both effects in the same system is a challenge. Having different systems for different effects is another kind of challenge.

My previous attempts at implementing particle collision have all been based on first cutting out a slice of the physics world around the particle effect and then trying to find a fast representation of the collision shapes in that world slice.

The problem with this approach is that there are a lot of variables to tweak and tune:

  • How big should the world slice be?

  • How much detail should there be in the simplified representation? More detail is slower, but gives better collision results.

  • What kind of representation should we use?

  • How should we handle dynamic/moving objects? How often should the world slice be updated?

I’ve tried a lot of different representations: a triangle soup, a collection of half-spheres, a height field, but none of them has given completely satisfactory results. Often, parameters that work for one effect at one location fail for a different effect at a different location. Both performance and behavior are hard to predict.

The main idea for the new approach came from a Naughty Dog presentation at GDC. Instead of trying to create a shared collision model for all particles, we give each particle its own collision model, and we store it inside the particle itself, together with the other particle data.

Of course, it would be expensive to store a complicated collision model inside every particle, so we use the simplest model possible: a plane. We can represent it with a normal and an offset from the origin. So with this approach, the data for a particle might look something like this:

struct Particle {
    Vector3 position;
    Vector3 velocity;
    Color8 color;
    Vector3 collision_plane_normal;
    float collision_plane_offset;
};

(Side note: Our particle data doesn’t actually look like this. We use a “structure-of-arrays” approach rather than an “array-of-structures”, and we don’t have a fixed set of fields; each effect has its own set.)

Note that we don’t bother with any flag indicating whether there is a plane or not. If there is no collision, we just put the collision plane far enough below the origin that it can never be hit.

With this approach the collision test is super fast — just a dot product and a compare. It is also really easy to parallelize the test or run it off-CPU, since it just uses local particle data and doesn’t need to access any shared memory.
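
To make that concrete, here is a minimal sketch of the test; Vector3 and Particle are simplified stand-ins for the engine’s own types, not its actual implementation:

struct Vector3 { float x, y, z; };

static float dot(const Vector3 &a, const Vector3 &b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

struct Particle {
    Vector3 position;
    Vector3 velocity;
    Vector3 collision_plane_normal;
    float collision_plane_offset;
};

// The particle has reached (or passed) its plane when its signed distance to
// the plane goes negative: dot(normal, position) < offset. "No plane" is just
// an offset far below anything reachable, so the test never fires.
inline bool hit_collision_plane(const Particle &p)
{
    return dot(p.collision_plane_normal, p.position) < p.collision_plane_offset;
}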

With this method we have divided the original collision problem into two simpler ones:

  • Collision test against a plane. (Trivial.)

  • Finding a suitable collision plane for each particle.

This means that if we want to, we can use different approaches for finding the collision planes for different effects. E.g., for static effects we could hard code the collision plane and avoid collision queries completely.

Generally, we can find a suitable collision plane for a particle by raycasting along its trajectory. If we didn’t have any performance constraints, we could do a raycast for every particle every frame. That way we would always know what surface the particle would hit next, which means that we would get perfect collision behavior.

Of course, we can’t actually do that. Raycasts are comparatively expensive and we want to be able to support large numbers of particles.

To control the performance, I exposed a parameter that lets the effect designer control how many raycasts per frame an effect is allowed to make. A typical value of 1.0 means that every frame, one particle in the effect is picked at random, a raycast is performed along that particle’s trajectory, and its collision plane is updated with the result.
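
A sketch of what that budgeting could look like, reusing Vector3, Particle and dot() from the sketch above; the raycast callback is a placeholder for the engine’s physics query, and keeping the accumulated budget in a function-local static is only to keep the example short (in practice it would live per effect):

#include <cstdlib>
#include <functional>

struct RaycastHit { Vector3 position; Vector3 normal; };

// Spends this frame's raycast budget on randomly chosen particles and
// refreshes their collision planes with the results.
void update_collision_planes(Particle *particles, unsigned count,
                             float raycasts_per_frame,
                             const std::function<bool (const Vector3 &, const Vector3 &,
                                                       RaycastHit &)> &raycast)
{
    static float budget = 0.0f;      // per-effect state in a real implementation
    budget += raycasts_per_frame;    // fractional values spread work across frames
    while (budget >= 1.0f && count > 0) {
        budget -= 1.0f;
        Particle &p = particles[rand() % count];
        RaycastHit hit;
        if (raycast(p.position, p.velocity, hit)) {
            p.collision_plane_normal = hit.normal;
            p.collision_plane_offset = dot(hit.normal, hit.position);
        } else {
            p.collision_plane_offset = -1e30f;   // "no plane": far below anything reachable
        }
    }
}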

Note that with this solution, the work is always evenly distributed over the duration of the effect. That is a lot nicer than what you typically get with the “world slice” approach where there is a big chunk of work in the beginning when you cut out the world slice.

Astute readers will have noticed a fatal flaw with the design as it has been presented so far: it can’t possibly work for very many particles. If we have an effect with 1 000 particles and do one raycast per frame, at 30 frames per second it will take 33 seconds before every particle has found its collision normal. By then, they will long since have fallen through the floor.

So, if we want to use this approach for large numbers of particles we must be able to somehow reuse the collision results. Typically, an effect will have bundles of particles traveling in approximately the same direction. When one such particle has done a raycast and found a collision, we want to be able to share the result with its neighbors somehow.

I wanted to find a solution to this without having to create a complicated collision representation, because that would bring back many of the problems I had with the “world slice” approach. Eventually, I decided that since what we want to do is to cache a collision query of the form:

(position, direction) -> collision_plane

The simplest possible thing would be to store the results in a hash. Hashes are nice, predictable data structures with well known performance characteristics.

To be able to hash on position and direction we must quantize them to integer values. We can quantize the position by dividing the world into cells of a certain width and height:

const float cell_side = 0.5f;
const float cell_height = 2.0f;

int ix = position.x / cell_side;
int iy = position.y / cell_side;
int iz = position.z / cell_height;

uint64 key = HASH_3(ix, iy, iz);

In this example, I use a higher resolution along the x- and y-axes than along the z-axis, because typically that is where the more interesting features are. HASH_3() is a macro that performs the first three rounds of the murmur_hash algorithm.
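
HASH_3() belongs to the engine, so purely as an illustrative stand-in (not the actual macro), one way to turn three cell coordinates into a 64-bit key is to mix them with arbitrary odd constants and run the result through MurmurHash3’s 64-bit finalizer:

#include <stdint.h>

// Stand-in for HASH_3(): mix the coordinates, then scramble with Murmur3's
// 64-bit finalizer (fmix64) so nearby cells get well-spread keys.
static uint64_t hash_3(int ix, int iy, int iz)
{
    uint64_t h = (uint64_t)(uint32_t)ix * 0x9e3779b97f4a7c15ULL
               ^ (uint64_t)(uint32_t)iy * 0xc2b2ae3d27d4eb4fULL
               ^ (uint64_t)(uint32_t)iz * 0x165667b19e3779f9ULL;
    h ^= h >> 33;
    h *= 0xff51afd7ed558ccdULL;
    h ^= h >> 33;
    h *= 0xc4ceb9fe1a85ec53ULL;
    h ^= h >> 33;
    return h;
}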

To quantize the direction we can use a similar approach. I decided to quantize the direction to just six different values, depending on along which principal axis the particle is mostly traveling:

unsigned id;
if (fabsf(dir.x) >= fabsf(dir.y) && fabsf(dir.x) >= fabsf(dir.z))
    id = dir.x > 0 ? 0 : 1;
else if (fabsf(dir.y) >= fabsf(dir.z))
    id = dir.y > 0 ? 2 : 3;
else
    id = dir.z > 0 ? 4 : 5;
key = key ^ id;

Now that we have computed a quantized representation of (position, direction), we can use that as lookup value into our hash, both for storing and fetching values:

struct CollisionPlane {
    Vector3 normal;
    float offset;
};

HashMap<uint64, CollisionPlane> _cache;

(Side note: Unless I’m worried about hash function collisions, I prefer to hash my keys before I insert them in the HashMap and just use a HashMap<uint64,T> instead of HashMap<MyComplicatedKeyStruct,T>. That way the hash map uses less memory and lookups can be done with a simple modulo operation.)

Whenever I do a particle raycast I store the result in the cache. When particles are spawned they look up their collision plane in the cache. Particles also query the cache every time they bounce, since that typically means they will be traveling in a new direction.

I have a maximum size that the cache is allowed to use. When the cache reaches the maximum size, older entries are thrown out.
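
Sketched out with std::unordered_map standing in for the engine’s HashMap (and with the bluntest possible eviction, since all that matters here is that old entries eventually get thrown out), the cache is little more than this; CollisionPlane is the struct shown earlier:

#include <cstdint>
#include <unordered_map>

class PlaneCache {
public:
    explicit PlaneCache(size_t max_entries) : _max_entries(max_entries) {}

    // Called with every raycast result.
    void store(uint64_t key, const CollisionPlane &plane)
    {
        if (_entries.size() >= _max_entries)
            _entries.clear();            // crude stand-in for "throw out older entries"
        _entries[key] = plane;
    }

    // Called when a particle spawns or bounces.
    bool fetch(uint64_t key, CollisionPlane &out) const
    {
        auto it = _entries.find(key);
        if (it == _entries.end())
            return false;
        out = it->second;
        return true;
    }

private:
    std::unordered_map<uint64_t, CollisionPlane> _entries;
    size_t _max_entries;
};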

Results

The system gives high quality results for effects with few particles (because you get lots of raycasts per particle) and is still able to handle massive amounts of particles. The performance load is evenly distributed and it doesn’t need any special cases for dynamic objects.

There are some drawbacks. The cache requires some tweaking. Since it can only store one collision plane for each quantization cell it will miss important features if the cells are too big. On the other hand, if the cells are too small, we need lots of entries in the cache to represent the world, which means more memory and slower lookups.

Since we only have one collision normal per particle, there are some things that the particles just can’t do. For example, they can never come to rest at the bottom of a V-shape, because they will always only be colliding with one of the planes in the V. Overall, they will behave pretty badly in corners, where several collision planes with different normals meet. Some of these issues could be fixed by storing more than one collision plane in the particle, but I don’t think it is worth it. I prefer the simpler approach and having particles that in some tricky situations can fall through the ground.

Compared to the old collision code, the new code is simpler, runs faster and looks better.

All in all, I would say that the hack day was a success. We had great fun and produced some useful stuff. We will definitely do more days like this in the future. Not too often, though. I think it is important that these days feel like a special treat. If they become too mundane, something important is lost. Once a month or so would be ideal, I think.

This has also been posted to The Bitsquid blog.

Writing portable code: A process full of gain

Original Author: Charilaos Kalogirou

Lately, I have been spending some of my time porting my game engine to the Android platform. It is a refreshing, interesting, rewarding and also frustrating experience, all at the same time. The process has helped me learn new lessons and remember some old ones I had forgotten.

Getting comfortable

First of all, I realized once again that we people get comfortable. And oh boy, do we get comfortable! I remember myself a few weeks back being frustrated with XCode 4: how it was slow and sluggish compared to XCode 3, how I didn’t like the new environment, and so on. Well, no more! All it took was a few days in Eclipse. Dialog windows popping up behind the main window, >500 ms on most clicks on files, a kitchen-sink user interface (I could go on forever), and all you really get at the end of the day is a smart editor and a debugger that only works for the Java part. Compare that to XCode with its memory profilers, time profilers, filesystem profilers, network profilers, battery optimizers, and the very helpful OpenGL state inspector and logger, and there is really no comparison. I had forgotten what it was like to develop on other platforms, and how amazed I was initially by the special treatment that Apple gives developers with its tools. What amazed me more is that I don’t come from such a “comfy” background. The initial version of Sylphis3D was developed in parallel on Linux and Windows, mostly using no IDE at all, and I never found the tools a problem. As it turns out, hardship builds character, while comfort erodes it.

Portable software is good for you

Building portable software is highly valued in my mind, because it helps you become a better software engineer while producing better-quality software at the same time. You get to learn many different development environments, understand their design decisions, work around platform differences, think further ahead, and so on. All of this requires a deeper understanding of your code and your dependencies. It always pushes you towards a more organized code structure, and it reveals bugs that would otherwise go unnoticed until Murphy’s law decides it is the worst possible time for them to trigger.

So if you are a software engineer, don’t get too comfortable with your development and target environment, no matter how attractive that environment makes it. Make your code portable, to keep yourself and the code in better shape. After all, wouldn’t it be cool to run your code on a future toaster?!

This post was also posted on Thoughts Serializer

Bringing SIMD Accelerated Vector Math to the Web Through Dart

Original Author: John McCutchan

Recently, I have been working on a vector math library for Dart. Boringly, I named it Dart Vector Math. The latest version can be found on github. My two biggest goals for Dart Vector Math are the following:

  • Near 100% GLSL compatible syntax. This includes the awesome vector shuffle syntax, and flexible construction of vectors and matrices.
  • Performance in terms of both CPU time and memory usage / garbage collection load.

Aside from a couple of quirks, Dart Vector Math is GLSL syntax compatible. It is possible to copy and paste GLSL code into Dart and, after making a couple of tweaks, have it compile with Dart Vector Math. This makes debugging shader code easy.

Since Dart is a garbage collected language, to be optimal in terms of space you want to avoid creating lots of objects. In order to facilitate that, Dart Vector Math offers many functions that work directly on already allocated vectors and matrices.

This weekend I started to look at CPU performance of Dart Vector Math versus glMatrix.dart (a port of glMatrix from JavaScript to Dart, the current champ of JavaScript vector math libraries). The initial results are heavily in favour of Dart Vector Math:

Matrix Multiplication
  Dart Vector Math: Avg: 14.59 ms    Min: 10.161 ms   Max: 22.927 ms
  glmatrix.dart:    Avg: 283.353 ms  Min: 272.062 ms  Max: 287.988 ms

mat4x4 inverse
  Dart Vector Math: Avg: 28.289 ms   Min: 21.019 ms   Max: 34.891 ms
  glmatrix.dart:    Avg: 318.909 ms  Min: 315.435 ms  Max: 325.831 ms

vector transform
  Dart Vector Math: Avg: 4.324 ms    Min: 2.811 ms    Max: 14.859 ms
  glmatrix.dart:    Avg: 144.431 ms  Min: 138.263 ms  Max: 153.798 ms

The code for 4×4 matrix multiplication in Dart Vector Math and glMatrix is practically identical, so on closer inspection the above numbers didn’t make much sense. There is one key difference: Dart Vector Math uses a native Dart object to store the matrix, while glMatrix uses a Float32Array as storage. Digging into the disassembly I discovered that indexing into a Float32Array is a slow path for the VM right now, skewing the results against glMatrix.dart. Not that big of a deal: Dart is a new language and the VM needs time to mature.

Once the performance issue with Float32Arrays is fixed I want to have Dart Vector Math use them, for two reasons. First, they take up 50% less space (single- vs. double-precision floats). Second, WebGL needs Float32Arrays for uniform data, which means the matrix is eventually going to end up inside a Float32Array anyway, so we might as well keep it in one the whole time. There is no CPU performance benefit from using a Float32Array as storage, because all operations result in the floats being promoted to doubles, operated on, and then stored back as floats.

My intention to move to Float32Array got me thinking and I ended up asking myself: Why doesn’t the browser offer an API for common vector math operations on Float32Array implemented efficiently with SIMD instruction sets? Well, I’m not sure why it is not offered, but I ended up spending the weekend implementing it for the Dart VM.

The API follows:

class SimdFloat32Array {
  static matrix4Inv(Float32List dst, int dstIndex, Float32List src, int srcIndex, int count);
  static matrix4Mult(Float32List dst, int dstIndex, Float32List a, int aIndex, Float32List b, int bIndex, int count);
  static transform(Float32List M, int Mindex, Float32List v, int vIndex, int vCount);
}

I do not want anyone to get hung up on the specific API or naming convention (let’s avoid bikeshedding). My three biggest goals for this API are the following:

  • Offer the important operations used by vector math libraries
  • Operate directly on floats instead of promoting to doubles
  • Design for bulk processing

So far I have exposed three of the important operations, but there are many more. Each of those functions is backed by an SSE implementation that operates directly on the Float32Array data. Notice that each of the methods takes a count parameter, which allows a single call to do bulk work.
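
To give a flavour of what sits behind those entry points (this is not the actual Dart VM patch, just a plain SSE kernel of the kind each call loops over), a single column-major 4x4 multiply on raw floats looks roughly like this:

#include <xmmintrin.h>

// dst = a * b for 4x4 column-major matrices stored as 16 consecutive floats,
// the same layout a Float32List/Float32Array view of a matrix would use.
void mat4_mult_sse(float *dst, const float *a, const float *b)
{
    __m128 a0 = _mm_loadu_ps(a + 0);    // columns of a
    __m128 a1 = _mm_loadu_ps(a + 4);
    __m128 a2 = _mm_loadu_ps(a + 8);
    __m128 a3 = _mm_loadu_ps(a + 12);
    for (int i = 0; i < 4; ++i) {
        // Column i of dst = a * (column i of b).
        __m128 b0 = _mm_set1_ps(b[i * 4 + 0]);
        __m128 b1 = _mm_set1_ps(b[i * 4 + 1]);
        __m128 b2 = _mm_set1_ps(b[i * 4 + 2]);
        __m128 b3 = _mm_set1_ps(b[i * 4 + 3]);
        __m128 col = _mm_add_ps(_mm_add_ps(_mm_mul_ps(a0, b0), _mm_mul_ps(a1, b1)),
                                _mm_add_ps(_mm_mul_ps(a2, b2), _mm_mul_ps(a3, b3)));
        _mm_storeu_ps(dst + i * 4, col);
    }
}

A bulk matrix4Mult-style call would simply run a loop like this count times, stepping dst, a, and b forward by 16 floats (plus the supplied start indices) on each iteration.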

The results of my implementation were very encouraging:

Matrix Multiplication SIMD: Avg: 8.702 ms  Min: 8.475 ms  Max: 9.217 ms
mat4x4 inverse SIMD:        Avg: 7.107 ms  Min: 6.89 ms   Max: 7.754 ms
vector transform SIMD:      Avg: 6.415 ms  Min: 6.204 ms  Max: 7.006 ms

Aside from the vector transformation operation (I think my SSE vector transform code is just slow), I got speedups between 2x and 4x.

Does this have legs? I hope so, but it’s not my call. If you see value in exposing this acceleration architecture into the browser, speak up.

Anticipating some questions:

What about JavaScript? The API would be easy to expose in JavaScript.

What about hardware without SIMD instruction sets? Probably not an issue since ARM, x86, and PPC have excellent SIMD instruction sets. Other platforms can implement the API using scalar floating point instructions.

What about other browsers? Again, this API would be easy to expose if it gained support.

Fast vector math operations are a requirement if we are going to start writing amazing games in the browser, and I hope my proposal can help make that possible.