So they made you a lead; now what? (Part 2)

Original Author: Oliver Franzke

The first part of this article took a closer look at why people with outstanding art, design or programming skills sometimes struggle or even fail as team leads. Part one also identified the core values of leadership as trust, direction and support.

The goal of this part is to provide newly minted leads with practical advice on how to get started in their new role. It also describes different ways to develop the necessary leadership skills.

Learning leadership skills

Now that we have a better understanding of what leadership is (and isn’t) it’s time to look at different ways of developing leadership skills. Despite the claims of some books or websites there is no easy 5-step program that will make you the best team lead in 30 days. As with most soft skills it is important to identify what works for you and then to improve your strategies over time. Thankfully there are different ways to find your (unique) leadership style.

The best way to develop your skills is by learning them directly from a mentor that you respect for his or her leadership abilities. This person doesn’t necessarily have to be your supervisor, but ideally it should be someone in the studio where you work. Leadership depends on the organizational structure of a company and it is therefore much harder for someone from the outside to offer practical advice.

Make sure to meet on a regular basis (at least once a month) in order to discuss your progress. A great mentor will be able to suggest different strategies to experiment with and can help you figure out what does and doesn’t work. These meetings also give you the opportunity to learn from his or her career by asking questions like these:

  • How would you approach this situation?
  • What is leadership?
  • Which leader do you look up to and why?
  • How did you learn your leadership skills?
  • What challenges did you face and how did you overcome them?

But even if you aren’t fortunate enough to have access to a mentor you can (and should) still learn from other game developers by observing how they interact with people and how they approach and overcome challenges. The trick is to identify and assimilate effective leadership strategies from colleagues in your company or from developers in other studios.

While mentoring is certainly the most effective way to develop your leadership skills you can also learn a lot by reading books, articles and blog posts about the topic. It’s difficult to find good material that is tailored to the games industry, but thankfully most of the general advice also applies in the context of games. The following two books helped me to learn more about leadership:

  • “Team Leadership in the Games Industry” by Seth Spaulding takes a closer look at the typical responsibilities of a team lead. The book also covers topics like the different organizational structures of games studios and how to deal with difficult situations.
  • “How to Lead” by Jo Owen explores what leadership is and why it’s hard to come up with a simple definition. Even though the book is aimed at leads in the business world it contains a lot of practical tips that apply to the games industry as well.

Talks and round-table discussions are another great way to learn from experienced leaders. If you are fortunate enough to visit GDC (or other conferences) keep your eyes open for sessions about leadership. It’s a great way to connect with fellow game developers and has the advantage that you can get advice on how to overcome some of the challenges you might be facing at the moment.

But even if you can’t make it to conferences there are quite a few recorded presentations available online. I highly recommend the following two talks:

  • “Concrete Practices to be a Better Leader” by Brian Sharp is a fantastic presentation about various ways to improve your leadership skills. This talk is very inspirational and contains lots of helpful techniques that can be used right away.
  • “You’re Responsible” by Mike Acton is essentially a gigantic round-table discussion about the responsibilities of a team lead. As usual Mike does a great job offering practical advice along the way.

Lastly there are a lot of talks about leadership outside of the games industry available on the internet (just search for ‘leadership’ on YouTube). Personally I find some of these presentations quite interesting since they help me to develop a broader understanding of leadership by offering different ways to look at the role. For example the TED playlist “How leaders inspire” discusses leadership styles in the context of the business world, military, college sports and even symphonic orchestras. In typical TED fashion the talks don’t contain a lot of practical advice, but they are interesting nonetheless.

Leadership starter kit

So you’ve just been promoted (or hired) and the title of your new role now contains the word ‘lead’. First of all, congratulations and well done! This is an exciting step in your career, but it’s important to realize that your day to day job will be quite different from what it used to be and that you’ll have to learn a lot of new skills.

I would like to help you get started in your new role by offering some specific and practical advice that I found useful during this transitional period. My hope is that this ‘starter kit’ will get you going while you investigate additional ways to develop your leadership skills (see section above). The remainder of the section will therefore cover the following topics:

  • One-on-one meetings
  • Delegation
  • Responsibility
  • Mike Acton’s quick start guide

As a lead your main responsibility is to support your team, so that they can achieve the current set of goals. For that it’s crucial that you get to know the members of your team quite well, which means you should have answers to questions like these:

  • What is she good at?
  • What is he struggling with?
  • Where does she want to be in a year?
  • Is he invested in the project or would he prefer to work on something else?
  • Are there people in the company she doesn’t want to work with?
  • Does he feel properly informed about what is going on with the project / company?

You might not get sincere replies to these questions unless people are comfortable enough with you to trust you with honest answers. Sincere feedback is absolutely critical for the success of your team, though, especially in difficult times. I would therefore argue that developing mutual trust between you and your team should be your main priority.

Building trust takes a lot of time and effort and an essential part of this process is to have a private chat with each member of your team on a regular basis (at least once a month). These one-on-one meetings can take place in a meeting room or even a nearby coffee shop. The important thing is that both of you feel comfortable having an open and honest conversation, so make sure to pick the location accordingly.

These meetings don’t necessarily have to be long. If there is nothing to talk about then you might be done after 10 minutes. At other times it may take an hour (or more) to discuss a difficult situation. Make sure to avoid possible distractions (e.g. mobile phone) during these meetings, so you can give the other person your full attention.

One-on-one meetings raise morale because the team will realize that they can rely on you to keep them in the loop and to represent their concerns and interests. Personally I find that these conversations help me to do my job better, since I am much more likely to hear about a (potential) problem when the team feels comfortable telling me about it.

At this point you might be concerned that these meetings take time away from your ‘actual job’, but that’s not true because they are your job now. Whether you like it or not you’ll probably spend more time in meetings and less time contributing directly to the current project. Depending on the size of your company it’s safe to assume that leadership and management will take up between 20% and 50% of your time. This means that you won’t be able to take on the same amount of production tasks as before and you’ll therefore have to learn how to delegate work. I know from personal experience that this can be a tough lesson to learn in the beginning.

In addition to balancing your own workload delegation is also about helping your team to develop new skills and to improve existing ones. Just because you can complete a task more efficiently than any other person on your team doesn’t necessarily mean that you are the best choice for this particular task. Try to take the professional interest of the individual members of your team into account as much as possible when assigning tasks, because people will be more motivated to work on something they are passionate about.

Beyond these practical considerations it is important to note that delegation also has an impact on the mutual trust between you and your team. By routinely taking on ‘tough’ tasks yourself you indicate that you don’t trust your teammates to do a good job, which will ruin morale very quickly. Keep in mind that your colleagues are trained professionals just like yourself, so treat them that way!

Experiencing your entire team working together and producing great results is very empowering and it is your job to make it happen even if nobody tells you this explicitly. In an ideal world it would be obvious what your company expects from you, but in reality that will probably not be the case. It is important to understand that while you have more influence over the direction of the project, your team and even the company you also have more responsibilities now.

First and foremost you are responsible for the success (or failure) of your team and any problem preventing success should be fixed right away. This could be as simple as making sure that your team has the necessary hardware and software, but it could also involve negotiations with another department in order to resolve a conflict of interest.

One responsibility that is often overlooked by new leads is the professional development of the team. It is your job to make sure that the people on your team get the opportunities to improve their skillset. In order to do that you’ll first have to identify the short- and long-term career goals of each team member. In addition to delegating work with the right amount of challenge (as described above) it is also important to provide general career mentorship.

A video game is a complicated piece of software and making one isn’t easy. Mistakes happen and your team might cause a problem that affects another department or even the production schedule. This can be a difficult situation especially when other people are upset and emotions run high. I know it’s easier said than done, but don’t let the stress get the best of you. Rather than identifying and blaming a team member for the mistake you should accept the responsibility and figure out a way to fix the problem. You can still analyze what happened after the dust has settled, so that this issue can be prevented in the future.

It is very unfortunate that a lot of newly minted team leads have to identify additional responsibilities themselves. Thankfully some companies are the exception to the rule. At Insomniac Games, for example, new leads have access to a ‘quick start guide’ that helps them to get adjusted to their new role. This helpful document is publicly available and was written by Mike Acton who has been doing an exceptional job educating the games industry about leadership. I highly recommend that you read the guide: http://www.altdev.co/2013/11/05/gamedev-lead-quick-start-guide/

Leadership is hard (but not impossible)

Truth be told becoming a great team lead isn’t easy. In fact it might be one of the toughest challenges you’ll have to face in your career. The good news is that you are obviously interested in leadership (why else would you have read all this stuff) and want to learn more about how to become a good lead. In other words you are doing great so far!

I hope you found this article helpful and that it’ll make your transition into your new role a bit easier.

Good luck and thank you for reading!

PS.: Whether you just got promoted or have been leading a team for a long time I would love to hear from you, so please feel free to leave a comment.

PPS: I would like to thank everybody who helped me with this article. You guys rock!

Custom Vector Allocation

Original Author: Thomas Young

(First posted to upcoder.com, number 6 in a series of posts about Vectors and Vector based containers.)

A few posts back I talked about the idea of ‘rolling your own’ STL-style vector class, based on my experiences with this at PathEngine.

In that original post and these two follow-ups I talked about the general approach and also some specific performance tweaks that actually helped in practice for our vector use cases.

I haven’t talked about custom memory allocation yet, however. This is something that’s been cited in a number of places as a key reason for switching away from std::vector so I’ll come back now and look at the approach we took for this (which is pretty simple, but nonstandard, and also pre C++11), and assess some of the implications of using this kind of non-standard approach.

I approach this from the point of view of a custom vector implementation, but I’ll be talking about some issues with memory customisation that also apply more generally.

Why custom allocation?

In many situations it’s fine for vectors (and other containers) to just use the same default memory allocation method as the rest of your code, and this is definitely the simplest approach.

(The example vector code I posted previously used malloc() and free(), but works equally well with global operator new and delete.)

But vectors can do a lot of memory allocation, and memory allocation can be expensive, and it’s not uncommon for memory allocation operations to turn up in profiling as the most significant cost of vector based code. Custom memory allocation approaches can help resolve this.

And some other good reasons for hooking into and customising allocations can be the need to avoid memory fragmentation or to track memory statistics.

For these reasons generalised memory customisation is an important customer requirement for our SDK code in general, and then by extension for the vector containers used by this code.

Custom allocation in std::vector

The STL provides a mechanism for hooking into the container allocation calls (such as vector buffer allocations) through allocators, with vector constructors accepting an allocator argument for this purpose.

I won’t attempt a general introduction to STL allocators, but there’s a load of material about this on the web. See, for example, this article on Dr Dobbs, which includes some example use cases for allocators. (Bear in mind that this is pre C++11, however. I didn’t see any similarly targeted overview posts for using allocators post C++11.)

A non-standard approach

We actually added the possibility to customise memory allocation in our vectors some time after switching to a custom vector implementation. (This was around mid-2012. Before that PathEngine’s memory customisation hooks worked by overriding global new and delete, and required dll linkage if you wanted to manage PathEngine memory allocations separately from allocations in the main game code.)

We’ve generally tried to keep our custom vector as similar as possible to std::vector, in order to avoid issues with unexpected behaviour (since a lot of people know how std::vector works), and to ensure that code can be easily switched between std::vector and our custom vector. When it came to memory allocation, however, we chose a significantly different (and definitely non-standard) approach, because in practice a lot of vector code doesn’t actually use allocators (or else just sets allocators in a constructor), because we already had a custom vector class in place, and because I just don’t like STL allocators!

Other game developers

A lot of other game developers have a similar opinion of STL allocators, and for many this is actually then also a key factor in a decision to switch to custom container classes.

For example, issues with the design of STL allocators are quoted as one of the main reasons for the creation of the EASTL, a set of STL replacement classes, by Electronic Arts. From the EASTL paper:

Among game developers the most fundamental weakness is the std allocator design, and it is this weakness that was the largest contributing factor to the creation of EASTL.

And I’ve heard similar things from other developers. For example, in this blog post about the Bitsquid approach to allocators Niklas Frykholm says:

If it weren’t for the allocator interface I could almost use STL. Almost.

Let’s have a look at some of the reasons for this distaste!

Problems with STL allocators

We’ll look at the situation prior to C++11, first of all, and the historical basis for switching to an alternative mechanism.

A lot of problems with STL allocators come out of confusion in the initial design. According to Alexander Stepanov (primary designer and implementer of the STL) the custom allocator mechanism was invented to deal with a specific issue with Intel memory architecture. (Do you remember near and far pointers? If not, consider yourself lucky I guess!) From this interview with Alexander:

Question: How did allocators come into STL? What do you think of them?

Answer: I invented allocators to deal with Intel’s memory architecture. They are not such a bad ideas in theory – having a layer that encapsulates all memory stuff: pointers, references, ptrdiff_t, size_t. Unfortunately they cannot work in practice.

And it seems like this original design intention was also only partially executed. From the wikipedia entry for allocators:

They were originally intended as a means to make the library more flexible and independent of the underlying memory model, allowing programmers to utilize custom pointer and reference types with the library. However, in the process of adopting STL into the C++ standard, the C++ standardization committee realized that a complete abstraction of the memory model would incur unacceptable performance penalties. To remedy this, the requirements of allocators were made more restrictive. As a result, the level of customization provided by allocators is more limited than was originally envisioned by Stepanov.

and, further down:

While Stepanov had originally intended allocators to completely encapsulate the memory model, the standards committee realized that this approach would lead to unacceptable efficiency degradations. To remedy this, additional wording was added to the allocator requirements. In particular, container implementations may assume that the allocator’s type definitions for pointers and related integral types are equivalent to those provided by the default allocator, and that all instances of a given allocator type always compare equal, effectively contradicting the original design goals for allocators and limiting the usefulness of allocators that carry state.

Some of the key problems with STL allocators (historically) are then:

  • Unnecessary complexity, with some boiler plate stuff required for features that are not actually used
  • A limitation that allocators cannot have internal state (‘all instances of a given allocator type are required to be interchangeable and always compare equal to each other’)
  • The fact the allocator type is included in container type (with changes to allocator type changing the type of the container)

There are some changes to this situation with C++11, as we’ll see below, but this certainly helps explain why a lot of people have chosen to avoid the STL allocator mechanism, historically!
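To make the last point in the list concrete, here is a minimal sketch, assuming a C++11 compiler. The names cExampleAllocator, SumValues() and AllocatorTypeDemo() are hypothetical and not part of PathEngine or the STL. It shows that a std::vector with a custom allocator is a different type from a plain std::vector, so code written against one does not accept the other:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical minimal allocator (C++11 style), for illustration only.
template <class T>
struct cExampleAllocator
{
    typedef T value_type;
    cExampleAllocator() {}
    template <class U>
    cExampleAllocator(const cExampleAllocator<U>&) {}
    T* allocate(std::size_t count)
    {
        return static_cast<T*>(::operator new(count * sizeof(T)));
    }
    void deallocate(T* ptr, std::size_t)
    {
        ::operator delete(ptr);
    }
};
// This allocator has no state, so all instances compare equal.
template <class T, class U> bool
operator==(const cExampleAllocator<T>&, const cExampleAllocator<U>&)
{
    return true;
}
template <class T, class U> bool
operator!=(const cExampleAllocator<T>&, const cExampleAllocator<U>&)
{
    return false;
}

typedef std::vector<int> tDefaultVector;
typedef std::vector<int, cExampleAllocator<int> > tCustomVector;

// The allocator is a template parameter, so these are unrelated types.
static_assert(!std::is_same<tDefaultVector, tCustomVector>::value,
    "changing the allocator type changes the container type");

// Written against the default vector type only.
inline int
SumValues(const tDefaultVector& v)
{
    int total = 0;
    for(std::size_t i = 0; i != v.size(); ++i)
    {
        total += v[i];
    }
    return total;
}

inline int
AllocatorTypeDemo()
{
    tCustomVector custom(3, 7);
    // SumValues(custom); // would not compile, the two types don't convert
    tDefaultVector copy(custom.begin(), custom.end());
    return SumValues(copy);
}
```

In this sketch the only way to get the values into SumValues() is to copy them into a vector of the expected type, or to turn SumValues() into a template.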

Virtual allocator interface

So we decided to avoid STL allocators, and use a non-standard approach.

The approach we use is based on a virtual allocator interface, and avoids the need to specify allocator type as a template parameter.

This is quite similar to the setup for allocators in the BitSquid engine, as described by Niklas here (as linked above, it’s probably worth reading that post if you didn’t see this already, as I’ll try to avoid repeating the various points he discussed there).

A basic allocator interface can then be defined as follows:

class iAllocator
{
public:
    virtual ~iAllocator() {}
    virtual void* allocate(tUnsigned32 size) = 0;
    virtual void deallocate(void* ptr) = 0;
    // helper
    template <class T> void
    allocate_Array(tUnsigned32 arraySize, T*& result)
    {
        result = static_cast<T*>(allocate(sizeof(T) * arraySize));
    }
};

The allocate_Array() method is just for convenience. Concrete allocator objects only need to implement allocate() and deallocate().

We can store a pointer to iAllocator in our vector, and replace the direct calls to malloc() and free() with virtual function calls, as follows:

    // non-static, since this reads the _allocator member
    T*
    allocate(size_type size)
    {
        T* allocated;
        _allocator->allocate_Array(size, allocated);
        return allocated;
    }
    void
    reallocate(size_type newCapacity)
    {
        T* newData;
        _allocator->allocate_Array(newCapacity, newData);
        copyRange(_data, _data + _size, newData);
        deleteRange(_data, _data + _size);
        _allocator->deallocate(_data);
        _data = newData;
        _capacity = newCapacity;
    }

These virtual function calls potentially add some overhead to allocation and deallocation. It’s worth being quite careful about this kind of virtual function call overhead, but in practice it seems that the overhead is not significant here. Virtual function call overhead is often all about cache misses and, perhaps because there are often just a small number of actual allocator instances active, with allocations tending to be grouped by allocator, this just isn’t such an issue here.

We use a simple raw pointer for the allocator reference. Maybe a smart pointer type could be used (for better modern C++ style and to increase safety), but we usually want to control allocator lifetime quite explicitly, so we’re basically just careful about this.

Allocators can be passed in to each vector constructor, or if omitted will default to a ‘global allocator’ (which adds a bit of extra linkage to our vector header):

    cVector(size_type size, const T& fillWith,
        iAllocator& allocator = GlobalAllocator()
        )
    {
        _data = 0;
        _allocator = &allocator;
        _size = size;
        _capacity = size;
        if(size)
        {
            _allocator->allocate_Array(_capacity, _data);
            constructRange(_data, _data + size, fillWith);
        }
    }

Here’s an example concrete allocator implementation:

class cMallocAllocator : public iAllocator
{
public:
    void*
    allocate(tUnsigned32 size)
    {
        assert(size);
        return malloc(static_cast<size_t>(size));
    }
    void
    deallocate(void* ptr)
    {
        free(ptr);
    }
};

(Note that you normally can call malloc() with zero size, but this is something that we disallow for PathEngine allocators.)

And this can be passed in to vector construction as follows:

    cMallocAllocator allocator;
    cVector<int> v(10, 0, allocator);
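
The GlobalAllocator() function used as the default constructor argument is not shown in this post. One way it might be provided is sketched below, purely as an assumption: a function-local static malloc-based allocator, with tUnsigned32 assumed to be a typedef for a 32 bit unsigned integer, and GlobalAllocatorDemo() a hypothetical helper that just exercises it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

typedef std::uint32_t tUnsigned32; // assumed definition

class iAllocator
{
public:
    virtual ~iAllocator() {}
    virtual void* allocate(tUnsigned32 size) = 0;
    virtual void deallocate(void* ptr) = 0;
    // helper
    template <class T> void
    allocate_Array(tUnsigned32 arraySize, T*& result)
    {
        tUnsigned32 byteSize = static_cast<tUnsigned32>(sizeof(T) * arraySize);
        result = static_cast<T*>(allocate(byteSize));
    }
};

class cMallocAllocator : public iAllocator
{
public:
    void* allocate(tUnsigned32 size)
    {
        assert(size);
        return std::malloc(static_cast<std::size_t>(size));
    }
    void deallocate(void* ptr)
    {
        std::free(ptr);
    }
};

// Hypothetical implementation: one shared instance, created on first use.
inline iAllocator&
GlobalAllocator()
{
    static cMallocAllocator allocator;
    return allocator;
}

inline bool
GlobalAllocatorDemo()
{
    iAllocator& allocator = GlobalAllocator();
    int* values;
    allocator.allocate_Array(4, values);
    for(int i = 0; i != 4; ++i)
    {
        values[i] = i * 10;
    }
    // every call returns the same allocator object
    bool ok = (values[3] == 30) && (&allocator == &GlobalAllocator());
    allocator.deallocate(values);
    return ok;
}
```

A real SDK would probably let the client replace this global allocator, which this sketch does not attempt.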

Swapping vectors

That’s pretty much it, but there’s one tricky case to look out for.

Specifically, what should happen in our vector swap() method? Let’s take a small diversion to see why there might be a problem.

Consider some code that takes a non-const reference to vector, and ‘swaps a vector out’ as a way of returning a set of values in the vector without the need to heap allocate the vector object itself:

class cVectorBuilder
{
    cVector<int> _v;
public:
    //.... construction and other building methods
    void takeResult(cVector<int>& result); // swaps _v into result
};

So this code doesn’t care about allocators, and just wants to work with a vector of a given type. And maybe there is some other code that uses this, as follows:

void BuildData(/*some input params*/, cVector<int>& result)
{
    //.... construct a cVectorBuilder and call a bunch of build methods
    builder.takeResult(result);
}

Now there’s no indication that there’s going to be a swap() involved, but the result vector will end up using the global allocator, and this can potentially cause some surprises in the calling code:

   cVector<int> v(someSpecialAllocator);
   BuildData(/*input params*/, v);
   // lost our allocator assignment!
   // v now uses the global allocator

Nobody’s really doing anything wrong here (although this isn’t really the modern C++ way to do things). This is really a fundamental problem arising from the possibility to swap vectors with different allocators, and there are other situations where this can come up.

You can find some discussion about the possibilities for implementing vector swap with ‘unequal allocators’ here. We basically choose option 1, which is to simply declare it illegal to call swap with vectors with different allocators. So we just add an assert in our vector swap method that the two allocator pointers are equal.

In our case this works out fine, since this doesn’t happen so much in practice, because cases where this does happen are caught directly by the assertion, and because it’s generally straightforward to modify the relevant code paths to resolve the issue.
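
A swap method with this kind of assertion could look something like the following minimal sketch. This is a cut-down, hypothetical vector (cSketchVector), not the actual PathEngine class. Sizes are std::size_t here for brevity, and copying and growth are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Stand-in for the allocator interface, with std::size_t sizes for brevity.
class iAllocator
{
public:
    virtual ~iAllocator() {}
    virtual void* allocate(std::size_t size) = 0;
    virtual void deallocate(void* ptr) = 0;
};

class cMallocAllocator : public iAllocator
{
public:
    void* allocate(std::size_t size) { assert(size); return std::malloc(size); }
    void deallocate(void* ptr) { std::free(ptr); }
};

template <class T>
class cSketchVector
{
    T* _data;
    std::size_t _size;
    std::size_t _capacity;
    iAllocator* _allocator;
    // copying is omitted from this sketch
    cSketchVector(const cSketchVector&);
    cSketchVector& operator=(const cSketchVector&);
public:
    cSketchVector(std::size_t size, const T& fillWith, iAllocator& allocator) :
        _data(0), _size(size), _capacity(size), _allocator(&allocator)
    {
        if(size)
        {
            _data = static_cast<T*>(_allocator->allocate(sizeof(T) * size));
            for(std::size_t i = 0; i != size; ++i)
            {
                new(_data + i) T(fillWith);
            }
        }
    }
    ~cSketchVector()
    {
        for(std::size_t i = 0; i != _size; ++i)
        {
            _data[i].~T();
        }
        if(_data)
        {
            _allocator->deallocate(_data);
        }
    }
    std::size_t size() const { return _size; }
    const T& operator[](std::size_t i) const { return _data[i]; }
    void swap(cSketchVector& rhs)
    {
        // swapping between different allocators is declared illegal
        assert(_allocator == rhs._allocator);
        std::swap(_data, rhs._data);
        std::swap(_size, rhs._size);
        std::swap(_capacity, rhs._capacity);
    }
};

inline bool
SwapDemo()
{
    cMallocAllocator allocator;
    cSketchVector<int> a(2, 5, allocator);
    cSketchVector<int> b(3, 9, allocator);
    a.swap(b); // fine, both vectors share one allocator
    return a.size() == 3 && a[0] == 9 && b.size() == 2 && b[1] == 5;
}
```

Because the allocator pointers are known to be equal, the swap only needs to exchange the buffer pointer, size and capacity, and each buffer is still freed through the allocator that created it.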

Comparison with std::vector, is this necessary/better??

Ok, so I’ve outlined the approach we take for custom allocation in our vector class.

This all works out quite nicely for us. It’s straightforward to implement and to use, and consistent with the custom allocators we use more generally in PathEngine. And we already had our custom vector in place when we came to implement this, so this wasn’t part of the decision about whether or not to switch to a custom vector implementation. But it’s interesting, nevertheless, to compare this approach with the standard allocator mechanism provided by std::vector.

My original ‘roll-your-own vector’ blog post was quite controversial. There were a lot of responses strongly against the idea of implementing a custom vector, but a lot of other responses (often from the game development industry side) saying something like ‘yes, we do that, but we do some detail differently’, and I know that this kind of customisation is not uncommon in the industry.

These two different viewpoints make it worthwhile to explore this question in a bit more detail, then, I think.

I already discussed the potential pitfalls of switching to a custom vector implementation in the original ‘roll-your-own vector’ blog post, so let’s look at the potential benefits of switching to a custom allocator mechanism.

Broadly speaking, this comes down to three key points:

  • Interface complexity
  • Stateful allocator support
  • Possibilities for further customisation and memory optimisation

Interface complexity

If we look at an example allocator implementation for each setup we can see that there’s a significant difference in the amount of code required. The following code is taken from my previous post, and was used to fill allocated memory with non zero values, to check for zero initialisation:

// STL allocator version
template <class T>
class cNonZeroedAllocator
{
public:
    typedef T value_type;
    typedef value_type* pointer;
    typedef const value_type* const_pointer;
    typedef value_type& reference;
    typedef const value_type& const_reference;
    typedef typename std::size_t size_type;
    typedef std::ptrdiff_t difference_type;
    template <class tTarget>
    struct rebind
    {
        typedef cNonZeroedAllocator<tTarget> other;
    };
    cNonZeroedAllocator() {}
    ~cNonZeroedAllocator() {}
    template <class T2>
    cNonZeroedAllocator(cNonZeroedAllocator<T2> const&)
    {
    }
    pointer
    address(reference ref)
    {
        return &ref;
    }
    const_pointer
    address(const_reference ref)
    {
        return &ref;
    }
    pointer
    allocate(size_type count, const void* = 0)
    {
        size_type byteSize = count * sizeof(T);
        void* result = malloc(byteSize);
        signed char* asCharPtr;
        asCharPtr = reinterpret_cast<signed char*>(result);
        for(size_type i = 0; i != byteSize; ++i)
        {
            asCharPtr[i] = -1;
        }
        return reinterpret_cast<pointer>(result);
    }
    void deallocate(pointer ptr, size_type)
    {
        free(ptr);
    }

    size_type
    max_size() const
    {
        return 0xffffffffUL / sizeof(T);
    }
    void
    construct(pointer ptr, const T& t)
    {
        new(ptr) T(t);
    }
    void
    destroy(pointer ptr)
    {
        ptr->~T();
    }
    template <class T2> bool
    operator==(cNonZeroedAllocator<T2> const&) const
    {
        return true;
    }
    template <class T2> bool
    operator!=(cNonZeroedAllocator<T2> const&) const
    {
        return false;
    }
};

But with our custom allocator interface this can now be implemented as follows:

// custom allocator version
class cNonZeroedAllocator : public iAllocator
{
public:
    void*
    allocate(tUnsigned32 size)
    {
        void* result = malloc(static_cast<size_t>(size));
        signed char* asCharPtr;
        asCharPtr = reinterpret_cast<signed char*>(result);
        for(tUnsigned32 i = 0; i != size; ++i)
        {
            asCharPtr[i] = -1;
        }
        return result;
    }
    void
    deallocate(void* ptr)
    {
        free(ptr);
    }
};

As we saw previously, a lot of stuff in the STL allocator relates to some obsolete design decisions, and is unlikely to actually be used in practice. The custom allocator interface also completely abstracts out the concept of constructed object type, and works only in terms of actual memory sizes and pointers, which seems more natural, whilst still doing everything we need for the allocator use cases in PathEngine.

For me this is one advantage of the custom allocation setup, then, although probably not something that would by itself justify switching to a custom vector.

If you use allocators that depend on customisation of the other parts of the STL allocator interface (other than for data alignment) please let me know in the comments thread. I’m quite interested to hear about this! (There’s some discussion about data alignment customisation below.)

Stateful allocator requirement

Stateful allocator support is a specific customer requirement for PathEngine.

Clients need to be able to set custom allocation hooks and have all allocations made by the SDK (including vector buffer allocations) routed to custom client-side allocation code. Furthermore, multiple allocation hooks can be supplied, with the actual allocation strategy selected depending on the actual local execution context.

It’s not feasible to supply allocation context to all of our vector based code as a template parameter, and so we need our vector objects to support stateful allocators.

Stateful allocators with the virtual allocator interface

Stateful allocators are straightforward with our custom allocator setup. Vectors can be assigned different concrete allocator implementations and these concrete allocator implementations can include internal state, without code that works on the vectors needing to know anything about these details.
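
As an illustration, here is a minimal sketch of a stateful allocator built on the virtual interface. cCountingAllocator and CountingDemo() are hypothetical names, and tUnsigned32 is assumed to be a 32 bit unsigned typedef. The allocator keeps counters as internal state and forwards the actual work to another allocator:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

typedef std::uint32_t tUnsigned32; // assumed definition

class iAllocator
{
public:
    virtual ~iAllocator() {}
    virtual void* allocate(tUnsigned32 size) = 0;
    virtual void deallocate(void* ptr) = 0;
};

class cMallocAllocator : public iAllocator
{
public:
    void* allocate(tUnsigned32 size)
    {
        assert(size);
        return std::malloc(static_cast<std::size_t>(size));
    }
    void deallocate(void* ptr)
    {
        std::free(ptr);
    }
};

// Hypothetical stateful allocator: tracks statistics, then forwards
// the actual allocation to a base allocator.
class cCountingAllocator : public iAllocator
{
    iAllocator& _base;
    tUnsigned32 _liveAllocations;
    tUnsigned32 _totalBytesRequested;
public:
    explicit cCountingAllocator(iAllocator& base) :
        _base(base), _liveAllocations(0), _totalBytesRequested(0)
    {
    }
    void* allocate(tUnsigned32 size)
    {
        ++_liveAllocations;
        _totalBytesRequested += size;
        return _base.allocate(size);
    }
    void deallocate(void* ptr)
    {
        if(ptr) // vectors may pass in a null buffer pointer
        {
            --_liveAllocations;
        }
        _base.deallocate(ptr);
    }
    tUnsigned32 liveAllocations() const { return _liveAllocations; }
    tUnsigned32 totalBytesRequested() const { return _totalBytesRequested; }
};

inline bool
CountingDemo()
{
    cMallocAllocator base;
    cCountingAllocator counting(base);
    void* first = counting.allocate(16);
    void* second = counting.allocate(32);
    bool ok = counting.liveAllocations() == 2;
    counting.deallocate(first);
    counting.deallocate(second);
    return ok
        && counting.liveAllocations() == 0
        && counting.totalBytesRequested() == 48;
}
```

Code that works with a vector only ever sees an iAllocator pointer, so this kind of state can be added without changing the vector type or any code that uses the vector.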

Stateful allocators with the STL

As discussed earlier, internal allocator state is something that was specifically forbidden by the original STL allocator specification. This is something that has been revisited in C++11, however, and stateful allocators are now explicitly supported, but it also looks like it’s possible to use stateful allocators in practice with many pre-C++11 compile environments.

The reasons for disallowing stateful allocators relate to two specific problem situations:

  • Splicing nodes between linked lists with different allocation strategies
  • Swapping vectors with different allocation strategies

C++11 addresses these issues with allocator traits, which specify what to do with allocators in problem cases, with stateful allocators then explicitly supported. This stackoverflow answer discusses what happens, specifically, with C++11, in the vector swap case.

With PathEngine we want to be able to support clients with different compilation environments, and it’s an advantage not to require C++11 support. But according to this stackoverflow answer, you can also actually get away with using stateful allocators in most cases, without explicit C++11 support, as long as you avoid these problem cases.

Since we already prohibit the vector problem case (swap with unequal allocators), that means that we probably can actually implement our stateful allocator requirement with std::vector and STL allocators in practice, without requiring C++11 support.

There’s just one proviso, with or without C++11 support, due to allowances for legacy compiler behaviour in allocator traits. Specifically, it doesn’t look like we can get the same assertion behaviour in vector swap. If propagate_on_container_swap::value is false and the two allocators compare unequal then the result is ‘undefined behaviour’, so this could just swap the allocators silently, and we’d have to be quite careful about these kinds of problem cases!

Building on stateful allocators to address other issues

If you can use stateful allocators with the STL then this changes things a bit. A lot of things become possible just by adding suitable internal state to standard STL allocator implementations. But you can also now use this allocator internal state as a kind of bootstrap to work around other issues with STL allocators.

The trick is to wrap up the same kind of virtual allocator interface setup we use in PathEngine in an STL allocator wrapper class. You could do this (for example) by putting a pointer to our iAllocator interface inside an STL allocator class (as internal state), and then forward the actual allocation and deallocation calls as virtual function calls through this pointer.

So, at the cost of another layer of complexity (which can be mostly hidden from the main application code), it should now be possible to:

  • remove unnecessary boilerplate from concrete allocator implementations (since these now just implement iAllocator), and
  • use different concrete allocator types without changing the actual vector type.
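As a rough illustration of this wrapper idea, here is a minimal sketch, assuming a C++11 compiler. The iAllocator shown is a cut-down stand-in for the real PathEngine interface, and cCountingMallocAllocator, cSTLAllocatorWrapper and SwapChecked are hypothetical names made up for this example. It also assumes that malloc() alignment is sufficient for the element type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Cut-down stand-in for the virtual allocator interface (an assumption,
// the real iAllocator has more methods than this).
class iAllocator
{
public:
    virtual ~iAllocator() {}
    virtual void* allocate(std::size_t size) = 0;
    virtual void deallocate(void* ptr) = 0;
};

// Hypothetical concrete allocator with some internal state.
class cCountingMallocAllocator : public iAllocator
{
public:
    int liveBlocks = 0;
    void* allocate(std::size_t size) override
    {
        void* result = std::malloc(size);
        if(!result)
        {
            throw std::bad_alloc();
        }
        ++liveBlocks;
        return result;
    }
    void deallocate(void* ptr) override
    {
        if(ptr)
        {
            --liveBlocks;
        }
        std::free(ptr);
    }
};

// Minimal C++11 style allocator. The only state is a pointer to the
// virtual interface, and all calls are forwarded through this pointer.
template <class T>
class cSTLAllocatorWrapper
{
public:
    typedef T value_type;
    iAllocator* _target;

    explicit cSTLAllocatorWrapper(iAllocator& target) : _target(&target) {}
    template <class U>
    cSTLAllocatorWrapper(const cSTLAllocatorWrapper<U>& rhs) : _target(rhs._target) {}

    T* allocate(std::size_t n)
    {
        return static_cast<T*>(_target->allocate(n * sizeof(T)));
    }
    void deallocate(T* ptr, std::size_t)
    {
        _target->deallocate(ptr);
    }
};

// Two wrappers are interchangeable only if they forward to the same target.
template <class T, class U> bool
operator==(const cSTLAllocatorWrapper<T>& lhs, const cSTLAllocatorWrapper<U>& rhs)
{
    return lhs._target == rhs._target;
}
template <class T, class U> bool
operator!=(const cSTLAllocatorWrapper<T>& lhs, const cSTLAllocatorWrapper<U>& rhs)
{
    return !(lhs == rhs);
}

// std::vector will not assert on swap with unequal allocators, so the
// check has to be made by the calling code.
template <class T> void
SwapChecked(
        std::vector<T, cSTLAllocatorWrapper<T> >& lhs,
        std::vector<T, cSTLAllocatorWrapper<T> >& rhs
        )
{
    assert(lhs.get_allocator() == rhs.get_allocator());
    lhs.swap(rhs);
}
```

With this in place, a vector is declared as std::vector<int, cSTLAllocatorWrapper<int> > and constructed from a wrapper instance. The vector type stays the same whichever concrete iAllocator the wrapper points at.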

Although I’m still not keen on STL allocators, and prefer the direct simplicity of our custom allocator setup as opposed to covering up the mess of the STL allocator interface in this way, I have to admit that this does effectively remove two of the key benefits of our custom allocator setup. Let’s move on to the third point, then!

Refer to the Bloomberg allocator model for one example of this kind of setup in practice (and see also this presentation about Bloomberg allocators in the context of C++11 allocator changes).

Memory optimisation

The other potential benefit of custom allocation over STL allocators is basically the possibility to mess around with the allocation interface.

With STL allocators we’re restricted to using the allocate() and deallocate() methods exactly as defined in the original allocator specification. But with our custom allocator we’re basically free to mess with these method definitions (in consultation with our clients!), or to add additional methods, and generally change the interface to better suit our clients’ needs.

There is some discussion of this issue in this proposal for improving STL allocators, which talks about ways in which the memory allocation interface provided by STL allocators can be sub-optimal.

Some customisations implemented in the Bitsquid allocators are:

  • an ‘align’ parameter for the allocation method, and
  • a query for the size of allocated blocks

PathEngine allocators don’t include either of these customisations, although this is stuff that we can add quite easily if required by our clients. Our allocator does include the following extra methods:

    virtual void*
    expand(
            void* oldPtr,
            tUnsigned32 oldSize,
            tUnsigned32 oldSize_Used,
            tUnsigned32 newSize
            ) = 0;
    // helper
    template <class T> void
    expand_Array(
            T*& ptr,
            tUnsigned32 oldArraySize,
            tUnsigned32 oldArraySize_Used,
            tUnsigned32 newArraySize
            )
    {
        ptr = static_cast<T*>(expand(
            ptr,
            sizeof(T) * oldArraySize,
            sizeof(T) * oldArraySize_Used,
            sizeof(T) * newArraySize
            ));
    }

What this does, essentially, is to provide a way for concrete allocator classes to use the realloc() library function, or similar memory allocation functionality in a custom heap, if this is desired.

As before, the expand_Array() method is there for convenience, and concrete classes only need to implement the expand() method. This takes a pointer to an existing memory block, and can either add space to the end of this existing block (if possible), or allocate a larger block somewhere else and move existing data to that new location (based on the oldSize_Used parameter).

Implementing expand()

A couple of example implementations for expand() are as follows:

    // in cMallocAllocator, using realloc()
    void*
    expand(
        void* oldPtr,
        tUnsigned32 oldSize,
        tUnsigned32 oldSize_Used,
        tUnsigned32 newSize
        )
    {
        assert(oldPtr);
        assert(oldSize);
        assert(oldSize_Used <= oldSize);
        assert(newSize > oldSize);
        return realloc(oldPtr, static_cast<size_t>(newSize));
    }
    // as allocate and move
    void*
    expand(
        void* oldPtr,
        tUnsigned32 oldSize,
        tUnsigned32 oldSize_Used,
        tUnsigned32 newSize
        )
    {
        assert(oldPtr);
        assert(oldSize);
        assert(oldSize_Used <= oldSize);
        assert(newSize > oldSize);
        void* newPtr = allocate(newSize);
        memcpy(newPtr, oldPtr, static_cast<size_t>(oldSize_Used));
        deallocate(oldPtr);
        return newPtr;
    }

So this can either call through directly to something like realloc(), or emulate realloc() with a sequence of allocation, memory copy and deallocation operations.
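To show how a vector might call into expand() when it runs out of capacity, here is a minimal sketch. It is not the actual PathEngine cVector: cSimpleVector, the initial capacity of 8 and the doubling growth factor are all assumptions made for the example, and the iAllocator shown is reduced to the three methods needed. Because expand() may move raw bytes, the sketch restricts itself to trivially copyable element types and does not handle out-of-memory.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <type_traits>

typedef std::uint32_t tUnsigned32; // assumed to match the typedef used above

// Reduced stand-in for the allocator interface (an assumption).
class iAllocator
{
public:
    virtual ~iAllocator() {}
    virtual void* allocate(tUnsigned32 size) = 0;
    virtual void deallocate(void* ptr) = 0;
    virtual void* expand(
            void* oldPtr,
            tUnsigned32 oldSize,
            tUnsigned32 oldSize_Used,
            tUnsigned32 newSize
            ) = 0;
};

class cMallocAllocator : public iAllocator
{
public:
    void* allocate(tUnsigned32 size) override { return std::malloc(size); }
    void deallocate(void* ptr) override { std::free(ptr); }
    void* expand(
            void* oldPtr,
            tUnsigned32 oldSize,
            tUnsigned32 oldSize_Used,
            tUnsigned32 newSize
            ) override
    {
        assert(oldPtr);
        assert(oldSize_Used <= oldSize);
        assert(newSize > oldSize);
        return std::realloc(oldPtr, newSize);
    }
};

// Hypothetical minimal vector, for trivially copyable types only.
template <class T>
class cSimpleVector
{
    static_assert(std::is_trivially_copyable<T>::value, "expand() moves raw bytes");
    iAllocator& _allocator;
    T* _data;
    tUnsigned32 _size;
    tUnsigned32 _capacity;

    static tUnsigned32 Bytes(tUnsigned32 count)
    {
        return static_cast<tUnsigned32>(sizeof(T) * count);
    }
    void grow()
    {
        if(_capacity == 0)
        {
            _capacity = 8;
            _data = static_cast<T*>(_allocator.allocate(Bytes(_capacity)));
        }
        else
        {
            // The allocator decides whether this extends the block in place
            // or moves the used part of the buffer somewhere else.
            const tUnsigned32 newCapacity = _capacity * 2;
            _data = static_cast<T*>(_allocator.expand(
                    _data, Bytes(_capacity), Bytes(_size), Bytes(newCapacity)));
            _capacity = newCapacity;
        }
        assert(_data); // no out-of-memory handling in this sketch
    }
public:
    explicit cSimpleVector(iAllocator& allocator) :
     _allocator(allocator), _data(nullptr), _size(0), _capacity(0)
    {
    }
    ~cSimpleVector()
    {
        if(_data)
        {
            _allocator.deallocate(_data);
        }
    }
    cSimpleVector(const cSimpleVector&) = delete;
    cSimpleVector& operator=(const cSimpleVector&) = delete;

    // Takes the value by copy, so pushing an element of the vector is safe.
    void push_back(T value)
    {
        if(_size == _capacity)
        {
            grow();
        }
        _data[_size++] = value;
    }
    tUnsigned32 size() const { return _size; }
    const T& operator[](tUnsigned32 i) const { assert(i < _size); return _data[i]; }
};
```

The vector code itself never calls realloc() directly, so swapping in the allocate-and-move version of expand() needs no changes on the vector side.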

Benchmarking with realloc()

With this expand() method included in our allocator it’s pretty straightforward to update our custom vector to use realloc(), and it’s easy to see how this can potentially optimise memory use, but does this actually make a difference in practice?

I tried some benchmarking and it turns out that this depends very much on the actual memory heap implementation in use.

I tested this first of all with the following simple benchmark:

template <class tVector> static void
PushBackBenchmark(tVector& target)
{
    const int pattern[] = {0,1,2,3,4,5,6,7};
    const int patternLength = sizeof(pattern) / sizeof(*pattern);
    const int iterations = 10000000;
    tSigned32 patternI = 0;
    for(tSigned32 i = 0; i != iterations; ++i)
    {
        target.push_back(pattern[patternI]);
        ++patternI;
        if(patternI == patternLength)
        {
            patternI = 0;
        }
    }
}

(Wrapped up in some code for timing over a bunch of iterations, with result checking to avoid the push_back being optimised out.)
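The timing wrapper itself isn't shown in the post. One possible shape for it is sketched below, assuming C++11 and std::chrono. TimePushBack and cTimingResult are hypothetical names, the iteration count has been turned into a parameter so that small runs are possible, and keeping the fastest of the repeats is an assumption (averaging would work just as well).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Same loop as the benchmark above, but with the iteration count passed in.
template <class tVector> static void
PushBackBenchmark(tVector& target, int iterations)
{
    const int pattern[] = {0,1,2,3,4,5,6,7};
    const int patternLength = sizeof(pattern) / sizeof(*pattern);
    int patternI = 0;
    for(int i = 0; i != iterations; ++i)
    {
        target.push_back(pattern[patternI]);
        ++patternI;
        if(patternI == patternLength)
        {
            patternI = 0;
        }
    }
}

struct cTimingResult
{
    double bestSeconds;  // fastest of the repeats
    long long checksum;  // sum of the elements pushed in the last repeat
};

template <class tVector> static cTimingResult
TimePushBack(int repeats, int iterations)
{
    cTimingResult result = {0.0, 0};
    for(int r = 0; r != repeats; ++r)
    {
        tVector target;
        const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
        PushBackBenchmark(target, iterations);
        const std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
        const double seconds = std::chrono::duration<double>(end - start).count();
        if(r == 0 || seconds < result.bestSeconds)
        {
            result.bestSeconds = seconds;
        }
        // Reading the contents back gives the compiler a reason to keep the pushes.
        long long sum = 0;
        for(std::size_t i = 0; i != target.size(); ++i)
        {
            sum += target[i];
        }
        result.checksum = sum;
    }
    return result;
}
```

A call such as TimePushBack<std::vector<int> >(10, 10000000) would then time the standard vector, and the same call with a custom vector type gives the comparison.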

This is obviously very far from a real usage situation, but the results were quite interesting:

OS       container type           time
Linux    std::vector              0.0579 seconds
Linux    cVector without realloc  0.0280 seconds
Linux    cVector with realloc     0.0236 seconds
Windows  std::vector              0.0583 seconds
Windows  cVector without realloc  0.0367 seconds
Windows  cVector with realloc     0.0367 seconds

So the first thing that stands out from these results is that using realloc() doesn’t make any significant difference on Windows. I double checked this, and while expand() is definitely avoiding memory copies a significant proportion of the time, this is either not significant in the timings, or memory copy savings are being outweighed by some extra costs in the realloc() call. Maybe realloc() is implemented badly on Windows, or maybe the memory heap on Windows is optimised for more common allocation scenarios at the expense of realloc(), I don’t know. A quick Google search shows that other people have seen similar issues.

Apart from that it looks like realloc() can make a significant performance difference, on some platforms (or depending on the memory heap being used). I did some extra testing, and it looks like we’re getting diminishing returns after some of the other performance tweaks we made in our custom vector, specifically the tweaks to increase capacity after the first push_back, and the capacity multiplier tweak. With these tweaks backed out:

OS     container type                      time
Linux  cVector without realloc, no tweaks  0.0532 seconds
Linux  cVector with realloc, no tweaks     0.0235 seconds

So, for this specific benchmark, using realloc() is very significant, and even avoids the need for those other performance tweaks.

Slightly more involved benchmark

The benchmark above is really basic, however, and certainly isn’t a good general benchmark for vector memory use. In fact, with realloc(), there is only actually ever one single allocation made, which is then naturally free to expand through the available memory space!
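One way to check how often a single growing buffer really is extended in place is to compare block addresses before and after each realloc() call. The sketch below does this. CountReallocMoves and cReallocStats are hypothetical names, and the counts it returns depend entirely on the heap implementation in use, so no particular result should be expected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

struct cReallocStats
{
    int inPlace; // realloc() returned the same address
    int moved;   // realloc() returned a different address
};

// Doubles a single buffer 'doublings' times and counts how often it moves.
inline cReallocStats
CountReallocMoves(std::size_t startBytes, int doublings)
{
    cReallocStats stats = {0, 0};
    std::size_t size = startBytes;
    void* ptr = std::malloc(size);
    assert(ptr);
    for(int i = 0; i != doublings; ++i)
    {
        // Record the address as an integer before the call, since the old
        // pointer value should not be used once realloc() has returned.
        const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(ptr);
        size *= 2;
        void* grown = std::realloc(ptr, size);
        assert(grown);
        ptr = grown;
        if(reinterpret_cast<std::uintptr_t>(ptr) == before)
        {
            ++stats.inPlace;
        }
        else
        {
            ++stats.moved;
        }
    }
    std::free(ptr);
    return stats;
}
```

Calling CountReallocMoves(64, 20), for example, gives the split for twenty doublings starting from a 64 byte block.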

A similar benchmark is discussed in this stackoverflow question, and in that case the benefits seemed to reduce significantly with more than one vector in use. I hacked the benchmark a bit to see what this does for us:

template <class tVector> static void
PushBackBenchmark_TwoVectors(tVector& target1, tVector& target2)
{
    const int pattern[] = {0,1,2,3,4,5,6,7};
    const int patternLength = sizeof(pattern) / sizeof(*pattern);
    const int iterations = 10000000;
    tSigned32 patternI = 0;
    for(tSigned32 i = 0; i != iterations; ++i)
    {
        target1.push_back(pattern[patternI]);
        target2.push_back(pattern[patternI]);
        ++patternI;
        if(patternI == patternLength)
        {
            patternI = 0;
        }
    }
}
template <class tVector> static void
PushBackBenchmark_ThreeVectors(tVector& target1, tVector& target2, tVector& target3)
{
    const int pattern[] = {0,1,2,3,4,5,6,7};
    const int patternLength = sizeof(pattern) / sizeof(*pattern);
    const int iterations = 10000000;
    tSigned32 patternI = 0;
    for(tSigned32 i = 0; i != iterations; ++i)
    {
        target1.push_back(pattern[patternI]);
        target2.push_back(pattern[patternI]);
        target3.push_back(pattern[patternI]);
        ++patternI;
        if(patternI == patternLength)
        {
            patternI = 0;
        }
    }
}

With PushBackBenchmark_TwoVectors():

OS     container type           time
Linux  std::vector              0.0860 seconds
Linux  cVector without realloc  0.0721 seconds
Linux  cVector with realloc     0.0495 seconds

With PushBackBenchmark_ThreeVectors():

OS     container type           time
Linux  std::vector              0.1291 seconds
Linux  cVector without realloc  0.0856 seconds
Linux  cVector with realloc     0.0618 seconds

That’s kind of unexpected.

If we think about what’s going to happen with the vector buffer allocations in this benchmark, on the assumption of sequential allocations into a simple contiguous memory region, it seems like the separate vector allocations in the modified benchmark versions should actually prevent each other from expanding. And I expected that to reduce the benefits of using realloc. But the speedup is actually a lot more significant for these benchmark versions.

I stepped through the benchmark and the vector buffer allocations are being placed sequentially in a single contiguous memory region, and do initially prevent each other from expanding, but after a while the ‘hole’ at the start of the memory region gets large enough to be reused, and then reallocation becomes possible, and somehow turns out to be an even more significant benefit. Maybe these benchmark versions pushed the memory use into a new segment and incurred some kind of segment setup costs?

With virtual memory and different layers of memory allocation in modern operating systems, and different approaches to heap implementations, it all works out as quite a complicated issue, but it does seem fairly clear, at least, that using realloc() is something that can potentially make a significant difference to vector performance, in at least some cases!

Realloc() in PathEngine

Those are all still very arbitrary benchmarks and it’s interesting to see how much this actually makes a difference for some real use cases. So I had a look at what difference the realloc() support makes for the vector use in PathEngine.

I tried our standard set of SDK benchmarks (with common queries in some ‘normal’ situations), both with and without realloc() support, and compared the timings for these two cases. It turns out that for this set of benchmarks, using realloc() doesn’t make a significant difference to the benchmark timings. There are some slight improvements in some timings, but nothing very noticeable.

The queries in these benchmarks have already had quite a lot of attention for performance optimisation, of course, and there are a bunch of other performance optimisations already in the SDK that are designed to avoid the need for vector capacity increases in these situations (reuse of vectors for runtime queries, for example). Nevertheless, if we’re asking whether custom allocation with realloc() is ‘necessary or better’ in the specific case of PathEngine vector use (and these specific benchmarks) the answer appears to be that no, this doesn’t really seem to make any concrete difference!

Memory customisation and STL allocators

As I’ve said above, this kind of customisation of the allocator interface (to add stuff like realloc() support) is something that we can’t do with the standard allocator setup (even with C++11).

For completeness it’s worth noting the approach suggested by Alexandrescu in this article where he shows how you can effectively shoehorn stuff like realloc() calls into STL allocators.

But this does still depend on using some custom container code to detect special allocator types, and won’t work with std::vector.

Conclusion

This has ended up a lot longer than I originally intended so I’ll go ahead and wrap up here!

To conclude:

  • It’s not so hard to implement your own allocator setup, and integrate this with a custom vector (I hope this post gives you a good idea about what can be involved in this)
  • There are ways to do similar things with the STL, however, and overall this wouldn’t really work out as a strong argument for switching to a custom vector in our case
  • A custom allocator setup will let you do some funky things with memory allocation, if your memory heap will dance the dance, but it’s not always clear that this will translate into actual concrete performance benefits

A couple of things I haven’t talked about:

Memory fragmentation: custom memory interfaces can also be important for avoiding memory fragmentation, and this can be an important issue. We don’t have a system in place for actually measuring memory fragmentation, though, and I’d be interested to hear how other people in the industry actually quantify or benchmark this.

Memory relocation: the concept of ‘relocatable allocators’ is quite interesting, I think, although this has more significant implications for higher level vector based code, and requires moving further away from standard vector usage. This is something I’ll maybe talk about in more depth later on.


So they made you a lead; now what? (Part 1)

Original Author: Oliver Franzke

What to do after you get promoted into a leadership position should be a trivial question to answer, but in my experience the opposite is true. In fact sometimes it seems to me that leadership is some kind of taboo topic in the games industry. Making games is supposed to be creative and fun and people would rather not talk about a ‘boring’ topic like leadership, but everyone who has had a bad supervisor at some point will agree that lack of leadership skills can be incredibly harmful to team morale and therefore to the game development process. That’s why, when I was first promoted into a leadership position, I set myself the goal to be just like the awesome supervisors I had in the past. But what made these people great bosses? I had no idea, but I assumed I would figure it out myself along the way. Looking back at it now I have to admit I was quite naïve.

After learning more about the theory and practice of leadership I realized that I was unprepared for this role and I’m not the only one with this experience. Before I started writing this article I talked to several leads (or ex-leads) and none of them had ever received any kind of leadership training. Some people were lucky enough to have a mentor, but even that doesn’t seem to be the standard. To me the most troubling fact is that none of the leads were ever told what was expected of them in their new role.

Given how important this role is you would think that game studios would invest some time and money to train their leads, but that doesn’t seem to be the case. The optimistic interpretation is that the companies trust their employees enough to quickly pick up the required skills themselves. The pessimistic interpretation on the other hand is that management simply doesn’t care or know any better. The real reason is probably located somewhere in between these extremes, but it doesn’t change the fact that most new leaders are simply thrown in at the deep end.

For example when I was first promoted into a leadership role I really had no clue what I was doing or what I was supposed to do. I was a good programmer and a responsible team player (which is why I was promoted I guess) and I figured I should simply continue coding until some kind of big revelation would turn me into an awesome team lead. Obviously I never had this magical epiphany and after a while I realized I should probably start investigating leadership in a more methodical way.

My goal for this two-part article is to share some of the lessons I learned myself while adjusting to my role as a lead programmer. If you were recently promoted into a leadership position hopefully you’ll find some of the content in this post helpful. If you had different experiences or have additional advice you’d like to share, then please leave a comment or contact me directly.

I want to emphasize the fact that leadership isn’t magic nor do you have to be born for it. Leadership is simply a set of skills that can be learned and in my experience it’s worth the time investment!

What is leadership anyway?

At the heart of a leadership position are people skills which make this role different from a regular production job. Being a great programmer, designer or artist doesn’t necessarily mean you are also an awesome team lead. In fact your production skills are merely the foundation on which you’ll have to build your leadership role.

But what exactly are these necessary people skills and what makes an effective team lead? Depending on who you talk to you’ll get different answers, but I think that the core values of leadership are about developing trust, setting directions and supporting the team in order to make the best possible product (e.g. game, tool, engine) with the given resources and constraints.

In order to be an effective lead you’ll first have to earn your colleagues’ trust. If your team feels like they can’t come to you with questions, problems or suggestions, then you (and the company) have a big problem. Gaining the trust of your team doesn’t happen automatically and requires a lot of effort. You can find some practical advice on how to work on this in the ‘leadership starter kit’ in part 2 of this article.

Similarly if your supervisor (e.g. project lead) doesn’t trust you, then he or she will probably manage around you which is a bad situation for everyone involved. In my experience transparency is crucial when managing up especially when things don’t go as planned. Let your supervisor know if there is a problem and take responsibility by working on a solution.

Making games is complicated and it would be unrealistic to assume that there won’t be problems along the way. Dealing with difficult situations is much easier if everyone on your team is on the same page about what has to get done. Setting a clear direction for your team is therefore a crucial part of your role.

A great mission statement is concise so that it’s easy to remember and explain. For an environment art team this could be “We want to create a photorealistic setting for our game” whereas a tools lead might come up with “Every change to the level should be visible right away”. Of course it is important that your team’s direction is aligned with the vision of the project, because creating a photorealistic environment for a game with a painterly art style doesn’t make sense.

In addition to defining a clear direction for your team one of your main responsibilities as a lead is to provide support for your team, so that they can be successful. This might seem very obvious, but the shift from being accountable only for your own work to being responsible for the success of a group of people can be a hard lesson to learn in the beginning.

Almost all leads I talked to mentioned that they were surprised by how little time they had for their ‘actual job’ after being promoted. It is essential to realize that the support of your team is your actual job now, which means that you’ll have to balance your workload differently. Some practical advice for this specific issue can be found in the second part of this article in the ‘leadership starter kit’.

Support can be provided in many different ways: Discussing the advantages and disadvantages of a proposed solution to a problem is one example. Being a mentor and helping the individual team members with their career progression is another form of support. A third example is to make sure that the team has everything it needs (e.g. dev-kits, access to documentation, tools …) to achieve the goals.

As a lead you might also have to support your team by letting someone know that his or her work doesn’t meet your expectations. A conversation like this isn’t easy, but it is important to let the person know that there is a problem and to offer advice and assistance to resolve the situation.

What leadership isn’t

In order to avoid misconceptions and common mistakes it can be quite useful to define what leadership (in the games industry) is not. This topic is somewhat shrouded in mystery and there are many incorrect or outdated assumptions.

For example I thought for the longest time that leadership and management are the same thing. This is not the case though and when I talked to other leads about what they dislike about their role I found that most aspects mentioned were in fact related to management rather than to leadership. Of course it would be unrealistic to assume that you will be able to avoid management tasks altogether, but getting help from a producer can reduce the amount of administrative work significantly.

Another misconception that is often popularized by movies is that you have to demonstrate your power as a leader by barking out orders all day. This might work well in the army, but making video games requires collaboration and creativity and an authoritarian leadership style has no place in this environment. An inspired team is a productive team and autonomy is crucial for high morale.

Equally bad is to ignore the team by using a hands-off leadership approach. This mistake is quite common since most team leads started their career with a production job. It can be tough for a new lead to accept the changed responsibilities, but in my opinion this is one of the most important lessons to learn. Rather than contributing to the production directly your primary responsibility is to support your team. Having time for design, art or programming in addition to that is great, but the team should always come first.

As a lead you are responsible for your team, which means that you’ll also have to deal with complications and it’s inevitable that things will go wrong during the production of a game. Your team might introduce a new crash bug or maybe you run into an unexpected problem that causes the milestone to slip. Whatever the issue may be you are responsible for what your team does and playing the blame game is the worst thing you can do, because it’ll ruin trust and team morale. Instead of shifting your responsibility to a team member you should concentrate on figuring out how to solve the problem.

End of Part 1

The second part of this article focuses on practical advice for newly minted team leads and it also discusses effective strategies to develop leadership skills, so please check it out once it is online (very soon).

I hope you enjoyed this post and thank you for reading.

PS: Whether you just got promoted or have been leading a team for a long time I would love to hear from you, so please feel free to leave a comment.

PPS: I would like to thank everybody who helped me with this article. You guys rock!

Data flow in a property centric game engine

Original Author: Fredrik Alströmer

Introduction

I stopped taking on contracts about a year ago to focus on building my own indie game. I decided to build my own engine, and I know, I know, indies shouldn’t build their own engine, but let’s ignore reason for now and instead focus on something else. I wanted to build a property centric engine, using composition for pretty much everything rather than inheritance.

So what does it mean when you say you have a property centric model? In contrast to using inheritance, a game object might no longer be a renderable, physics-simulated, camera-trackable, weapon-carrying, and enemy-discoverable object. Instead it has the corresponding properties.

So what? Potato, pot-ah-to, right?

Well, not really. When a player is playing the game, they see distinct objects, perhaps a couple of monsters, wielding broadswords, strolling down Sunrise Lane or, who knows, maybe a heavyset man in a suit and a hard-hat blocking off the George Washington Bridge. Basically, what the player sees is this:

[Figure: Fruit Salad]

However, the advantage of having a property centric engine, is that instead of dealing with it this way, we have the option of organizing our things somewhat differently. What if we instead dealt with our entities like this?

[Figure: Sorted Fruit Salad]

That is, we have monsters, we have broadswords, and we have Sunrise Lane. (As a side note, these images make me think of scatter/gather I/O, am I alone here?) Of course, the analogy is greatly simplified, and, as I hinted at above, we’d have a great number of different properties of which only a few are directly related to actual visible objects and their shape. I like this layout and the way it allows us to deal with all instances of a property at once.

The bridge

So how do we create the bridge between the engine’s sorted and ordered view of the world, and the player’s view? An object is bound to have dependent properties, for example, a rendering geometry property which depends on the physical simulation property, so we need to somehow combine them to give the illusion of being independent objects, rather than independent properties.

Data exchange

My research into a property centric design was driven by a fascination for data oriented design, and specifically the “where there’s one there are many” mantra. I got obsessed with arrays of raw data, of lean structures. I wanted this to be the core of my engine.

Each property is handled by a separate manager, which takes care of both memory allocation and ‘ticking’ or updating the properties each frame, and it’s free to move data around and sort it as it sees fit. The data itself is dumb, straight arrays of floats or perhaps structured in groups of vectors or quaternions, or similar smaller groups of data which is generally accessed together.

Raw data in this structure-of-arrays layout can’t really do inheritance (in the traditional OOP sense) even if I wanted to, so the property centric model became the natural choice.
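To make the manager idea a bit more concrete, here is a minimal sketch of what such a property manager could look like. cLinearMotionManager and its methods are hypothetical names invented for the example, and std::vector is used for storage purely for brevity (the engine described here does its own memory allocation).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical manager for a simple 'linear motion' property. The data is
// just two flat float arrays, with three floats per instance in each.
class cLinearMotionManager
{
    std::vector<float> _origins;
    std::vector<float> _velocities;
public:
    // Creates one instance and returns its index.
    std::size_t create(
            float originX, float originY, float originZ,
            float velocityX, float velocityY, float velocityZ
            )
    {
        const std::size_t instance = _origins.size() / 3;
        _origins.push_back(originX);
        _origins.push_back(originY);
        _origins.push_back(originZ);
        _velocities.push_back(velocityX);
        _velocities.push_back(velocityY);
        _velocities.push_back(velocityZ);
        return instance;
    }
    // One tick updates every instance in a single pass over the arrays.
    void tick(float dt)
    {
        for(std::size_t i = 0; i != _origins.size(); ++i)
        {
            _origins[i] += _velocities[i] * dt;
        }
    }
    const float* origin(std::size_t instance) const
    {
        assert(instance * 3 < _origins.size());
        return &_origins[instance * 3];
    }
};
```

There is no per-object update function here. The game code only creates instances, and a single tick() call per frame moves all of them.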

Read and write data

With arrays of raw data, the easiest solution to fetch information from, or pass information to, a different component is to simply read or write the data directly using pointers. This also has the advantage that there can be no immediate side-effects as we’re not calling any functions, and we’re certainly not calling any virtual functions. This approach has several drawbacks, but what it lacks in flexibility, it makes up for in a lack of complexity, so we’ll need to be aware of the limitations and work accordingly. I’ve often found that having to adapt to a simple mechanism has a tendency to make it more robust too, so there’s also that aspect.

We’re effectively passing messages back and forth, quite similar to a data bus. We don’t need to know who receives the data, or who sent it, all we care about is that it is formatted correctly, and how to slap a destination address on it, which is a nice characteristic.

Semi-standardized memory blocks

If we define a set of standard memory blocks, and try to use these as often as possible in our properties, chances are we don’t need to worry about data formatting very often. If you look closely, you’ll notice that data that you typically would want to pass around is probably already using a small defined set of data types. For example, I have a type called f3 which is simply an array of three floats (a not all too uncommon type when dealing with three-dimensional space), and I use fixed time steps interpolating between the previous and the current step of the simulation, thus the most common type in my engine is f3[2]. I also place the ‘current’ value first so I can use the same reference to read both f3 and f3[2], where I need it. I guess this could technically be considered a very naive implementation of (multiple) inheritance, but let’s not go there. It does allow us to interact with objects we know very little about though, so I guess you could call it something similar to polymorphism if you wanted to, and you’re an ad-man with an affinity for buzz-words.
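The ‘current value first’ layout can be sketched as follows. The f3 typedef follows the description above, but ReadCurrent and ReadInterpolated are hypothetical helper names, and the linear interpolation formula is an assumption about how the two steps might be blended.

```cpp
#include <cassert>

typedef float f3[3]; // an array of three floats, as described above

// Reads only the current value. This works for a lone f3 and for an f3[2],
// because the current value is stored first in both cases.
inline void
ReadCurrent(const f3* block, f3 out)
{
    for(int i = 0; i != 3; ++i)
    {
        out[i] = block[0][i];
    }
}

// Reads both steps, so this must only be given a real f3[2]:
// block[0] is the current step and block[1] is the previous one.
// t == 0 gives the previous step and t == 1 gives the current step.
inline void
ReadInterpolated(const f3* block, float t, f3 out)
{
    assert(t >= 0.0f && t <= 1.0f);
    for(int i = 0; i != 3; ++i)
    {
        out[i] = block[1][i] + (block[0][i] - block[1][i]) * t;
    }
}
```

A consumer that only cares about the latest position can then be wired to either kind of block through the same reference, without knowing which one it is.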

We still need to know the ‘interface’ or data layout of our components. Assuming we’re using standardized data-blocks, this can be extracted out of the manager logic, and into a wiring phase which is done by a higher level game object. The game object code creates the properties, passing references to the appropriate data blocks as input, and all managers can remain completely oblivious to how they’re connected.

Wiring it up

It was important to me that each manager would be free to reorganize or sort its data arrays as it saw fit, so I couldn’t use raw pointers without forcing every manager to keep track of who’s referencing which of the properties it’s managing (in order to keep them up to date on the new address). As the number of references can be pretty arbitrary depending on where a specific property is used, I chose to insert an indirection instead, i.e. a lookup table.

Furthermore, I elected to go with a handle scheme instead of straight up pointers into the lookup table. The lookup table still stores pointers into the property data though. Using handles for the lookup has a couple of advantages, the first thing that comes to mind is that we can eliminate the problem of dangling pointers by keeping a version counter in the handle, which is nice. Second, I cut my handles to 32 bits, which is half the size of a pointer on a 64 bit system. And third, I reserve 8 bits of the handle for offsets into the data being pointed to, which lets me store one pointer per property structure, while still allowing a handle to ‘point’ somewhere within that structure. This is useful when I’m storing, for example, both origin and orientation together, and for a particular case I’m only interested in orientation. The offset handling is hidden in the lookup table functions, so as long as we set it correctly during wiring, the user of the handle doesn’t need to worry about it, giving us a bit of flexibility and reducing the sheer number of properties that otherwise would’ve needed to be registered and kept in sync. The observant reader might notice that this limits the maximum size of the data structure to 256 bytes, but as mentioned earlier, I want these to be as small as possible and only contain the data which is generally accessed together. So really, 256 bytes ought to be enough for everybody…
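A handle scheme along these lines could be sketched as below. Only the 32 bit handle size and the 8 bit offset come from the description above. The split of the remaining bits (16 for the slot index, 8 for the version), and all of the names (tHandle, cLookupTable, MakeHandle, WithOffset, cPlacement) are assumptions made for the example. Slots are never reused in this sketch, and the version counter wraps after 256 removals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::uint32_t tHandle;

// Assumed bit layout: [31..16] slot index, [15..8] version, [7..0] byte offset.
inline tHandle
MakeHandle(std::uint32_t index, std::uint32_t version, std::uint32_t offset)
{
    assert(index < 0x10000u && version < 0x100u && offset < 0x100u);
    return (index << 16) | (version << 8) | offset;
}

// Returns a handle to the same slot, but 'pointing' offset bytes into it.
inline tHandle
WithOffset(tHandle handle, std::uint32_t offset)
{
    assert(offset < 0x100u);
    return (handle & ~0xFFu) | offset;
}

class cLookupTable
{
    struct cSlot
    {
        unsigned char* data;  // one pointer per property structure
        std::uint8_t version; // bumped when the property is removed
    };
    std::vector<cSlot> _slots;
public:
    // Registers a property structure and returns a handle to its start.
    tHandle add(void* data)
    {
        const cSlot slot = {static_cast<unsigned char*>(data), 0};
        _slots.push_back(slot);
        return MakeHandle(static_cast<std::uint32_t>(_slots.size() - 1), 0, 0);
    }
    // Called by a manager after it has moved or sorted its data.
    void relocate(tHandle handle, void* newData)
    {
        _slots[handle >> 16].data = static_cast<unsigned char*>(newData);
    }
    // Invalidates every outstanding handle to this slot.
    void remove(tHandle handle)
    {
        cSlot& slot = _slots[handle >> 16];
        slot.data = nullptr;
        ++slot.version;
    }
    // Returns nullptr for stale handles, otherwise the data plus the offset.
    void* resolve(tHandle handle) const
    {
        const cSlot& slot = _slots[handle >> 16];
        if(slot.version != ((handle >> 8) & 0xFFu))
        {
            return nullptr;
        }
        return slot.data + (handle & 0xFFu);
    }
};

// Hypothetical property structure storing origin and orientation together.
struct cPlacement
{
    float origin[3];
    float orientation[4];
};
```

A handle built with WithOffset(handle, offsetof(cPlacement, orientation)) then resolves straight to the orientation, and keeps doing so after relocate() has moved the structure.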

I don’t have a graphical UI with little squiggly lines representing wires connecting an input of some property to the output of some other, but in my mind that’s what’s going on: I arbitrarily connect data ‘slots’ to one another and the property managers are none the wiser. As an example, this is how I picture what the wiring looks like when setting up the player camera.

Player camera wiring

The ID of the network entity to track is provided by the server. The camera logic doesn’t know what kind of object we’re tracking; it is only given the origin property. During gameplay, it’s not even a server object directly, but a client-side prediction property. The rest of the wiring applies an offset (the transformation) to the entity origin before feeding it into a PID controller, controlling a linear momentum property. The output is fed to rendering views as well as ground synthesis (only generate ground mesh where we can actually see it) and directional lighting (the shadow map rendering needs to follow the camera around). Each of these boxes represents a separate property manager which, when called, updates all the instances of that property. For example, the PID manager updates all PID controller instances; it doesn’t matter if they’re being used to control the player camera or to move UI elements around.
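
As a rough illustration of a manager that is oblivious to its wiring, here is a small Python sketch. The Slots class is a stand-in for the lookup table, and every name in it (Slots, PIDManager, create, update, the gain parameters) is hypothetical; the real engine is not written like this.

```python
class Slots:
    """Stand-in for the lookup table: an integer handle maps to a value."""

    def __init__(self):
        self._values = []

    def create(self, value=0.0):
        self._values.append(value)
        return len(self._values) - 1

    def read(self, handle):
        return self._values[handle]

    def write(self, handle, value):
        self._values[handle] = value


class PIDManager:
    """Updates every PID controller instance, whatever it is wired to."""

    def __init__(self, slots):
        self._slots = slots
        self._instances = []

    def create(self, target, measured, output, kp, ki=0.0, kd=0.0):
        # target, measured and output are handles supplied during wiring.
        self._instances.append({
            "target": target, "measured": measured, "output": output,
            "kp": kp, "ki": ki, "kd": kd,
            "integral": 0.0, "prev_error": 0.0,
        })

    def update(self, dt):
        for pid in self._instances:
            error = self._slots.read(pid["target"]) - self._slots.read(pid["measured"])
            pid["integral"] += error * dt
            derivative = (error - pid["prev_error"]) / dt
            pid["prev_error"] = error
            self._slots.write(
                pid["output"],
                pid["kp"] * error + pid["ki"] * pid["integral"] + pid["kd"] * derivative,
            )
```

The game object code would create the slots and pass their handles to create; the manager's update never learns whether an instance drives the camera or a UI element.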

Where’d the game logic go?

You’ll notice I haven’t really talked about game logic. It hasn’t magically disappeared just because the game engine has a particular architecture, it’s still there. The fact is still that the player will see compound game objects on screen, such as a soldier, and you will need something to keep track of the relevant properties. I still have a soldier game object. The difference is, there’s no soldier-update (or -tick, or -think, whatever you want to call it), there’s pretty much only a create and destroy for client and server side respectively. The create function initializes the appropriate properties, after that the property managers take over and deal with the frame-to-frame work. I’m sure you can think of other functions you’d want for special game logic stuff, but the important notion is that there’s no frame-by-frame update function.

Actually, I lied. I have a soldier-update function. But it is a specialized soldier property which handles animations on the client side, with dependencies on the server object property. If the object starts moving it’ll trigger walk cycle animations, do some simple forward kinematics to aim the gun in the right direction, and so on. It does not move the soldier around on screen; that’s a redraw property wired to a server object property via a client-side prediction property.

Wrap up

I find this design has a certain charm, it keeps the implementation of each property manager focused on doing a single thing, and doing it efficiently.

Additionally, if you focus on keeping your data in a raw format like this, you’ll end up with very lean data. It’ll make you think about that data and how to organize it. You won’t end up with objects where, alongside your couple of vectors’ worth of data, you have a virtual table pointer, references to a couple of engine subsystems, and so on and so forth. Don’t underestimate how objects may balloon thanks to a couple of references, especially if you’re on a 64-bit architecture and you’re creating maybe a million instances.

Two problems stand out as particularly affected by this design choice. The first is concurrent access to lookup tables and shared data, if we run updates of separate property managers simultaneously. Note that this doesn’t stop us from dividing the list of properties to update over several threads. The second, related problem is ordering: if we run property manager updates sequentially, and cannot wait for the next frame to let a value propagate, we’ll need to be clever about the order in which we run our updates.

All in all, it’s a neat set up, and it really appeals to how my brain works. So should I have built my own engine? Probably not. It’s been a cool ride though, and I’ve learned a tremendous amount. It hasn’t been without its share of dark moments, but perhaps that’s a topic for a different post.

Feel free to let me know what you think, either in the comments below or poke me on Twitter, I’d love to hear it.

Post scriptum – In the works

Concurrency and multi-threading are close to my heart, and so is having a setup that works reliably in parallel without slapping a lock on everything (mind you, that approach to concurrency will bite you in the behind soon enough). I have not yet gotten around to implementing this part of the system, but this is what I have planned.

I’ve elected to go for a semi-static directed acyclic graph (DAG) model, where property updates are carried out in a breadth-first manner. Each property can only reference the layers before it, thus all properties in a layer can be updated simultaneously without risking interfering with other properties. I don’t want to have a fixed, compile-time ordering, but rather I want it to sort itself appropriately during wiring. This solves both ordering (properties depending on other properties) and concurrent access.

To achieve this, I’ll reserve a few more bits in the handle to denote the ‘graph depth’ of the referenced property. Thus when I create a property, passing the appropriate dependencies to it, it’ll examine all handles and set its own depth to the maximum depth among them plus one. I’ll keep all instances of the property sorted by graph depth, and during update, process a single depth at a time. During each frame, I’ll step through each populated layer of the graph and fire off multithreaded jobs for each property type, synchronizing after each layer. As each layer only depends on the layers before it, there will only be concurrent reads, which is fine. In the PID-controlled camera example above, the feedback to the momentum property would have to be queued and applied at the end of the frame (thus explicitly delaying the input by one frame). This doesn’t change the current behavior though as, depending on how the updates are ordered, one of the properties is already reading one-frame-old data.

This implies I cannot let a property change depth without rebuilding the graph, which is non-trivial as we do not keep track of who depends on us, only who we depend on. It doesn’t stop me from rewiring the dependencies of a property, but it does stop me from having it depend on something in its own layer or something further down the line.
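
The planned layering can be sketched as follows. This is Python for illustration only, and it keeps the depth in a dictionary rather than in spare handle bits as described above; PropertyGraph and the property names used with it are hypothetical.

```python
from collections import defaultdict


class PropertyGraph:
    """Assigns each property a depth of (deepest dependency + 1)."""

    def __init__(self):
        self._depth = {}
        self._layers = defaultdict(list)

    def create(self, name, dependencies=()):
        # A property with no dependencies lands in layer 0.
        depth = 1 + max((self._depth[d] for d in dependencies), default=-1)
        self._depth[name] = depth
        self._layers[depth].append(name)
        return depth

    def layers(self):
        """Populated layers, shallowest first."""
        return [self._layers[d] for d in sorted(self._layers)]

    def update(self, update_fn):
        for layer in self.layers():
            # Everything in one layer only reads from earlier layers, so this
            # inner loop is what could be handed out as parallel jobs.
            for name in layer:
                update_fn(name)
            # A real implementation would synchronize here before moving on.
```

Because a dependency must already exist when it is passed to create, a property can never end up depending on its own layer or a later one, which matches the restriction on rewiring.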

Crafting Madness #1: Volufaketric Fog in Asylum

Original Author: Francisco Tufró

If you’ve been following Asylum’s development, you already know that the game’s graphics are based on pre-rendered textures projected on top of a cube with inverted normals. Since Gouraud shading is avoided on these faces, you have the illusion of a panoramic view. Using this technique instead of 3D rendering allows the game to have great-looking graphics without the need for costly (in development and computational time) real-time rendering techniques. A clever trade-off if you ask me.

The downside of this technique is that you lose all the benefits of dynamic lighting and depth provided by the third dimension, and with it, a lot of realism. I had been thinking that with some shader-level magic we could re-create some stuff needed for a few visual effects we wanted to implement. This post is about one of those visual effects…

A horror game without fog? No way!

This was basically the feeling of the whole team. We needed to have animated fog and it should be realistic, it couldn’t be just an overlay on top of each cube’s face, that wouldn’t look good enough.

Pablo and I started discussing the idea of using a Z-Depth mask exported from 3D Studio Max in some way to simulate depth in each face.
To illustrate, Pablo exported an image where he defined white as far and black as near and all the distances in between as shades of grey. You can see an example here:

original_vs_zdepth

The idea is that I could use the depth information inside a shader to control the fog’s opacity. To give you an idea, if fog is represented by a white square the result would look something like this:

volumetric_fog_white

This was the first test we did, and it already looked promising!
The second step was to use a cloud texture instead of a white square, which ended up looking fine. But if we were going to use a cloudy fog, we couldn’t use a static one; it needed to be animated.
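
The arithmetic behind the depth mask can be sketched per pixel. This is plain Python rather than the Unity shader, which is not shown in this post, and the additive blend clamped to 1.0 is an assumption about how the fog is combined with the pre-rendered image; fog_pixel and apply_fog are hypothetical names.

```python
def fog_pixel(base, fog, depth):
    """Blend one pixel.

    base, fog: RGB tuples with channels in 0..1.
    depth: Z-depth mask value in 0..1 (0 = black = near, 1 = white = far).
    """
    # Assumed blend: add the fog colour scaled by depth, clamped to 1.0.
    return tuple(min(1.0, b + f * depth) for b, f in zip(base, fog))


def apply_fog(base_image, fog_image, depth_mask):
    """Apply fog_pixel over images stored as lists of rows of pixels."""
    return [
        [fog_pixel(b, f, d) for b, f, d in zip(base_row, fog_row, depth_row)]
        for base_row, fog_row, depth_row in zip(base_image, fog_image, depth_mask)
    ]
```

With a white fog pixel, near surfaces (depth 0) are left untouched while far ones are pushed towards white. A black fog pixel adds nothing at any depth.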

Let’s move!

I’ve done some post-processing effects in the past, so I was already familiar with using framebuffers as textures. This approach worked like a charm.

In Unity we solved this using Render Textures in a few steps (sorry, but this is a Pro-Only feature):

Create a fog particle system

The particles move really slowly and their alpha changes gradually over their lifespan; they also rotate a little bit over time.
We added a layer called “Fog” and selected it in the particle system.

Create a camera

We created another camera (without the main camera tag), and positioned it to look at the particle system, and limited its Culling Mask to ‘Fog’ only.
The background of the camera was also set to black, so that we could sum the fog texture on top of the original one.

Create and assign a render texture

We created a Render Texture (in the project browser Create->Render Texture) and assigned it as the camera’s Render Texture.
We then also assigned it as the Fog texture to send to our custom shader and…

Enjoy fog

In this video you can see the result of the technique, it added a really nice looking and animated fog that is consistent in terms of depth.

Conclusion

This was our first experiment in a set of ideas we have for re-creating some 3D visual effects on our 2D projected space in Asylum.
Using a texture with z-depth as an opacity mask for an animated fog texture proved to work really well and the results looked good enough.
In the future we’re planning to use similar techniques to simulate dust particles and dynamic lights. We started to evaluate using normals to recreate specular highlights and other interesting stuff like that. We’ll see how far we’ll be able to take this!

Has this post brought any interesting ideas to your mind? What do you think about the technique? Can you think of any optimizations? It would be great to read some comments!

The Indie Content Problem

Original Author: Alistair Doulin

Now that we’re wrapping up work on Battle Group 2 we’ve begun planning out our next major project. I’ve briefly spoken about this previously and today I’m going to share some further discussions that have come out of our planning. The main theme revolves around creating enough content for a game with a small development team. With three main developers (a programmer, a designer and an artist) and a project timeframe of 12 months we need to make smart decisions about how we will create enough content for our game. I see the same problem crop up with a lot of other indie friends and I thought I’d give my thoughts on the subject.

The Problem – Not Enough Content

The underlying problem is the creation of enough quality content to keep players engaged for a set period of time. For Battle Group 2 this was a handful of hours, however for our next project, we are aiming for something people can play for months without running out of content. The problem is that a small team is limited in what it can produce in a given period of time. For us, 3 (hu)man-years of work. So what are our options to solve this problem?

Solution 1 – Reduce Scope

The first solution is to reduce the scope of the game. Instead of providing x-months of content for the player, cut this back to weeks or hours. This is the usual advice I give to game developers when they are concerned with the amount of time/budget required to develop their game. It’s a common trap to overscope a project and have the development go on for years, abandoning it entirely or releasing something that doesn’t live up to the original vision of the game. Reducing scope has the advantage of focussing the design back on the core “5 minutes of fun” and making sure the game being built is the tightest play experience possible.

Solution 2 – Change Design

The second solution is to change the design of the game to cater to limited resources. From a business point of view, this can involve changing the monetization for the product. Free to play games often require a large amount of content to keep people engaged/playing and therefore draw out more money. Switching to a paid model allows developers to “get paid” up front and focus on quality over quantity. The game is then more about providing an enjoyable experience than keeping people playing and extracting as much money for as long as possible. From a game design perspective, this involves changing the underlying design of the game to cater to reduced resources. The difficult part of this is keeping the original vision of the game at the same time.

Solution 3 – Roadblocks

The current trend for free to play games (Boom Beach, Candy Crush) is to stop the player from racing through the content by placing artificial blocks on their progress. Players continually run into roadblocks that require them to wait, ask a friend for help or pay cash. This is not something we want to do for our future projects. While it has become the norm for a particular set of games, I’m glad to see it hasn’t made its way into more mainstream games outside of the F2P mobile domain.

Solution 4 – Procedural Content

The solution I am leaning towards on our future project is to use procedural content generation for the majority of our content. This changes the problem from one of time/resources to one of solving complex problems and tweaking algorithms to make quality content. This in itself can sometimes be as time-consuming as simply creating the content and therefore needs to be handled carefully. The major advantage of this solution is that it frees the team up to make the building blocks for the game and have players explore the space in the direction they enjoy. One risk of this approach is creating content that all feels the same. Players quickly see through procedural generation when all that changes is simple stats or superficial details of the content. However, games that are built upon procedural content from their core (Minecraft, No Man’s Sky) can give deep experiences that allow almost unlimited play time.

Our Decision

We are in the middle of making this decision for our next project at the moment. We have not decided on the best option and this blog post is a way for me to think through our options as clearly as possible. Have you encountered a similar problem and what was your solution? Are there any other solutions you would suggest?