Original Author: Colt McAnlis
This article explains why it’s important to have your own patching system, and describes how to implement a simple patching system modeled after the Quake3 file-based patching process.
Since the rise of PC games in the early 90s, game developers have needed ways to quickly issue fixes, updated builds, and new content to existing users – hence the rise of ‘game patch’ systems. Over time, this method of updating games has made its way from PCs to consoles, and is now trickling into mobile development. It may take some effort to build and use a patching system for your game content, but once you’ve got such a system up and running, it’s a very powerful tool for your development studio.
Why have your own patching system?
For modern game developers, the most popular avenue to sell games is through one of many digital distribution services like Google Play, Steam, XBLA, and the Chrome Web Store. Besides marketing games to their users, these distribution services generally handle the lion’s share of transferring game content to customers on developers’ behalf.
For games that need to update frequently, however, the content hosting process from such distribution services can be problematic. For example, some of the services can introduce significant costs in patch creation, or delays in issuing updated builds to users. For multiplayer games, delays can let client builds get out of sync with server builds, with no way of triggering an update directly.
One way to work around such hurdles is to make your content available outside of distribution services. Building your game technology on top of your own patching system gives you a great deal of control as a developer, and gives your company and products the flexibility to move between platforms.
Reach your users directly
In addition to solving the problems listed above, a patching system gives you the opportunity to reach your users directly. Nowadays every game platform is constantly connected to the interwebs, and keeping a long tail of customers happy means constantly listening to the community, fixing their issues, and furnishing new content to them. A patching system lets you market to existing users with new content, as well as news, updates, and notices relating to your game.
So, be your name Buxum or Bixby or Bray, your mountain of users is waiting, patch them to happiness, and be on your way!
Patching system overview
Patching systems generally have 3 components:
- A build server that generates builds and patches (this server resides with the developer)
- A content server from which to distribute builds and patches
- A user client that can detect differences between the local and server versions of a game, retrieve assets, and update the local version
These three components are the pillars of a patching system. You can build fancier versions once you start getting into details, but such details tend to be game-specific and are beyond the scope of this article.
A simple patch-aware file system
The Quake3 source code contains an elementary example of a successful patching system. This simple system allows a patch to append new archives to the file system. The new archives contain all the assets that differ from previous patches/builds. Another way to describe this system is that new archives are overlaid on top of existing archives. When an asset is to be read in from disk, the file system traverses the archives and selects the newest version of the asset.
In this simple system, a user who installed the original game would accumulate multiple archive files over time, one from each successive patch, each containing an updated set of assets. In contrast, a user who acquired the game much later in its life span would not have a plethora of archive files on their disk, but rather a collapsed archive representing the proper state of the world as of the time of their installation.
The Quake3 model is hard to beat for simplicity, and offers a good starting point to address more complex topics as your patching system gets more sophisticated. The example patching system that we will implement is thus based on the Quake3 model, and has the following rules:
- The majority of the content is archived.
- Content in newer archives takes precedence over content in older archives.
- Archived content is generally not patched, but rather replaced entirely.
- We ignore binary patching altogether and instead include loose assets that are replaced entirely.
To restate, we assume that as far as assets go, you’ll have the lion’s share in a small number of archives, and that new content will be shipped out in the form of additional archives, the contents of which will wholly replace older content.
We thus assume that there will be a series of archive files on disk. At load time, we open the archives and merge their file lists into a global dictionary. When it’s time to read an asset, we consult the dictionary to determine what the newest version of the asset is, and which archive to pull the asset from.
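The load-time merge described above can be sketched in a few lines of Python. This is only a minimal illustration under some assumptions of mine: the archives are zip files, their names sort in release order (pak000.zip, pak001.zip, and so on), and the class name OverlayFileSystem is made up for this example.

```python
import zipfile
from pathlib import Path


class OverlayFileSystem:
    """Merge the file lists of several archives; newer archives win."""

    def __init__(self, archive_dir):
        # Maps asset name -> path of the archive holding the newest copy.
        self.index = {}
        # Assumption: sorting archive names puts them in release order.
        for archive_path in sorted(Path(archive_dir).glob("*.zip")):
            with zipfile.ZipFile(archive_path) as archive:
                for name in archive.namelist():
                    # A later archive overwrites the entry of an earlier one.
                    self.index[name] = archive_path

    def read(self, name):
        """Return the bytes of the newest version of an asset."""
        archive_path = self.index[name]
        with zipfile.ZipFile(archive_path) as archive:
            return archive.read(name)
```

With this in place, a call like OverlayFileSystem("data").read("textures/wall.png") pulls the asset from whichever archive shipped it most recently. A real engine would likely keep the archives open rather than reopening one per read.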
Updating your build system for patching
Build systems are a bit like sacred rituals – each company tends to have its own flavor and guards its process heavily. I’m not going to cover the concepts of a build system (or tell you how to write one); rather I assume that you’ve got that under wraps.
To generate a patch for a build, your build system needs to generate a list of the files that are different from the prior build (for example, between build 299 and 300, 27 textures may have been updated, 3 models may have been deleted, and the rendering DLL may have been updated). Once you have the ability to generate this type of delta list, you need to combine the information into a patch definition, which is described below.
For our example patching system, any content that has changed or that has been added is included in the archive for a new patch. Finding new files is generally easy: Simply compare the file name listings between two folders to find what didn’t exist before. Finding existing files that have been modified can be trickier. For instance, simply testing the last-modified time of files may not work because of how your build system touches content. The fool-proof approach is to use a brute-force comparison between all the binary data in two build folders.
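The ease with which your build system can compute these types of file set differences depends greatly on the language and tools of the build system. For example, if your build system is driven by C++, a binary data compare of a 40GB build would be a gnarly and less-than-ideal process. In contrast, if your build system is driven by Python, you can use filecmp.dircmp from the standard library, which reports the new, missing, and differing files between two directories. Note that dircmp compares files shallowly by default (by file type, size, and modification time, not content), so pair it with filecmp.cmp(..., shallow=False) when you need a true byte-for-byte comparison.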
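As a rough sketch of the Python route, the function below walks two build folders with filecmp.dircmp and does a byte-for-byte compare on every file that exists in both. The function names and the idea of returning three sorted lists are my own choices for illustration, not part of any existing build system.

```python
import filecmp
from pathlib import Path


def _files_under(root, rel):
    """List relative paths of all files at or under root/rel."""
    path = Path(root) / rel
    if path.is_dir():
        return [p.relative_to(root).as_posix() for p in path.rglob("*") if p.is_file()]
    return [Path(rel).as_posix()]


def build_delta(old_dir, new_dir):
    """Return (added, removed, modified) relative paths between two builds."""
    added, removed, modified = [], [], []

    def walk(cmp, rel):
        for name in cmp.left_only:       # only in the old build
            removed.extend(_files_under(old_dir, rel / name))
        for name in cmp.right_only:      # only in the new build
            added.extend(_files_under(new_dir, rel / name))
        for name in cmp.common_files:    # in both: compare the actual bytes
            old_file = Path(old_dir) / rel / name
            new_file = Path(new_dir) / rel / name
            if not filecmp.cmp(old_file, new_file, shallow=False):
                modified.append((rel / name).as_posix())
        for name, sub in cmp.subdirs.items():
            walk(sub, rel / name)        # recurse into shared subfolders

    walk(filecmp.dircmp(old_dir, new_dir), Path("."))
    return sorted(added), sorted(removed), sorted(modified)
```

The added and modified lists are what would go into the archive for the new patch; the removed list feeds the patch definition.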
Figure 2 below shows an example of build deltas. Blue indicates modified files, red indicates deleted files, and green indicates new files. In our example patching system, the new and modified files are included in archives, shown on the right.
Creating a patch definition
Once your build system can calculate what’s changed between two builds, the next step is to merge that data into some form that allows the client to consume it. Patch definitions list the key changes in builds over time, so that we can update the client to the latest build with as little work and data transfer as possible.
We need a few pieces of information in each patch definition to help guide a client to the required actions to update itself. Here are a few examples of the type of information that should go into a patch definition:
- build number – What build is this? A simple integer is easiest to track.
- target region – If you distribute your game internationally, there may be restrictions on the types of patches/content you can ship to a specific region.
- files to add – This list should include new archives, as well as specific loose files that need to be added to the local build.
- files to remove – At times, in the course of builds, you’ll succeed in completely replacing an old build, or there may be some security/privacy risk with old data existing on the client disk. Having the ability to remove files from disk in these situations is useful.
- files to binary update – For files that need direct, in-place binary patching, this is a list of tuples, each pairing the content to be patched with the patch file to use.
- patch importance – Is this patch required before the user can play? Or can it be streamed in the background?
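To make the list above concrete, here is one way a patch definition might look as a small Python data class serialized to JSON. The field names, the "all" region value, and the sample file names are assumptions made up for this sketch; use whatever layout suits your client.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PatchDefinition:
    build_number: int
    target_region: str = "all"  # hypothetical value meaning "every region"
    files_to_add: list = field(default_factory=list)
    files_to_remove: list = field(default_factory=list)
    # Each entry pairs a file on disk with the patch file to apply to it.
    files_to_binary_update: list = field(default_factory=list)
    required: bool = True  # False means it can stream in the background

    def to_json(self):
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Hypothetical definition for build 300.
patch_300 = PatchDefinition(
    build_number=300,
    files_to_add=["pak300.zip", "bin/render.dll"],
    files_to_remove=["models/old_tree.mdl"],
)
print(patch_300.to_json())
```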
Determining what files to download
Once you can create per-build patch definitions, the next step is to allow the client to consume this information. The process is generally as follows:
- The client queries the patch server, sending the local version and other metadata.
- The patch server responds with some form of file information.
- The client processes the information and begins requesting new patch data to update the local copy.
There are two primary ways (with lots of variants) to make the determination of what files to download – at the client level or at the server level. Your choice of implementation is highly dependent on the engineering resources that are available to you. The simplest approach in terms of server technology is to keep a text-based manifest file on the server that lists all the patches, their versions etc. This entire manifest file is passed to the client upon request, and the client is responsible for building up the series of file requests to update the local copy.
While simple to implement, this approach quickly runs into limitations. Significantly, this approach requires some advanced logic built into the client to parse the manifest file and generate the request list properly. If a large content-shift occurs (for example, you change the manifest file format), the client will likely need special processing to handle the changes, and may require a patch of the client itself before it’s able to update the content.
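For a feel of what the client-side logic involves, here is a minimal sketch of a client turning a manifest into a download list and a delete list. The JSON manifest layout (a "patches" list with "build", "region", "add", and "remove" keys) is invented for this example.

```python
import json


def plan_update(manifest_text, local_build, region):
    """Return (downloads, deletions) needed to reach the newest build."""
    manifest = json.loads(manifest_text)
    downloads, deletions = [], []
    # Walk the patches oldest to newest so later patches override earlier ones.
    for patch in sorted(manifest["patches"], key=lambda p: p["build"]):
        if patch["build"] <= local_build:
            continue  # already applied locally
        if patch.get("region", "all") not in ("all", region):
            continue  # not meant for this client
        for name in patch.get("remove", []):
            if name in downloads:
                downloads.remove(name)  # never fetch a file a later patch deletes
            if name not in deletions:
                deletions.append(name)
        for name in patch.get("add", []):
            if name in deletions:
                deletions.remove(name)  # a later patch brings the file back
            if name not in downloads:
                downloads.append(name)
    return downloads, deletions
```

Even this toy version has to reason about ordering, regions, and files that are added and later removed, which is the kind of logic that ends up frozen into shipped clients.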
A much more complex but scalable solution is to keep all the version data on the patch server, listed as entries in a database. The client provides some simple metadata about the state of the local data (easily encodable in a URL) to the server. The server then computes the proper series of actions the client needs to perform in order to update itself, and transfers an ‘update script’ to the client for direct execution.
The main benefit of this server-based approach is that the computation of what the client needs to do in order to update the local state is all handled on the server. Thus, as the update logic changes, the client can remain neutral to those issues and simply react accordingly. This also allows the client to generally store less data needed for the update process (for instance, the client may only need to store its region and build number). The server can store the rest of the information needed to complete the update process, as well as provide the client with more advanced functionality, like grouping multiple patches into a single request update action.
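A sketch of the server-based approach might look like the following. The in-memory PATCHES list stands in for the database, and the URL shape (/update?build=299&region=us) and the operation names in the update script are all hypothetical.

```python
from urllib.parse import parse_qs, urlparse

# Stand-in for the database table of patches (made-up data).
PATCHES = [
    {"build": 300, "region": "all", "add": ["pak300.zip"], "remove": []},
    {"build": 301, "region": "all", "add": ["pak301.zip"], "remove": ["pak299.zip"]},
]


def update_script_for(url):
    """Turn a client request URL into an ordered list of update actions."""
    query = parse_qs(urlparse(url).query)
    build = int(query["build"][0])
    region = query.get("region", ["all"])[0]
    script = []
    for patch in sorted(PATCHES, key=lambda p: p["build"]):
        if patch["build"] <= build:
            continue  # the client already has this patch
        if patch["region"] not in ("all", region):
            continue  # not meant for this client's region
        script += [{"op": "download", "file": f} for f in patch["add"]]
        script += [{"op": "delete", "file": f} for f in patch["remove"]]
        # Record progress so an interrupted update can resume.
        script.append({"op": "set_build", "build": patch["build"]})
    return script
```

The client only needs to know how to execute the three operations. How patches are selected, ordered, or grouped can change on the server without touching shipped clients.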
Once the client has a clear roadmap of what’s required to update the local build, the next step is to actually update the data. For our simple system of downloading new archives, updating content is easy – we download the bits and write them to disk. Done. Let’s get tacos.
Eventually you will encounter a situation where you need to update the game client itself. This can be tricky if the game is running. To solve this problem, most PC games distribute a separate application that checks for patches and updates the local state, including the executable code. Typically these applications are easiest to generate as standalone applications that can patch and then launch the game itself.
For embedded environments, applying patches is a bit trickier. For example, on consoles the base data is shipped on DVD, so you generally have to write patches to the hard drive and check for content there first. Mobile platforms have a whole separate set of requirements that I won’t get into. Thankfully, most of those platforms contain APIs to help out with this process of applying patches, which makes things a bit easier.
Determining what files to delete
Over time a client can accumulate many patches, and in some cases, it may have lots of data that is no longer needed. For example, if all of the files in a given patch have been replaced by newer versions in subsequent patches, then the files in the original patch will never be used and are not needed. To keep the user’s machine from consuming disk space unnecessarily, it’s helpful to identify such instances and allow a patch to delete files from disk.
An archive that’s no longer needed is one whose assets are entirely replaced by newer archives. To test for this, your build system needs to be modified such that a target patch-archive has the ability to query newer builds to determine if it’s still relevant. Depending on how your build system is set up, adding this processing can be easy or take months of work, so make sure you take full stock of your system before trudging down the path of enlightenment.
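If your build system can list the assets in each archive, the test itself is short. The sketch below assumes you can hand it a list of (archive name, set of asset names) pairs ordered oldest to newest; the function name is made up.

```python
def obsolete_archives(archives):
    """Return names of archives whose assets are all replaced by newer ones.

    archives: list of (name, set_of_asset_names), ordered oldest to newest.
    """
    obsolete = []
    covered = set()  # assets provided by archives newer than the current one
    for name, assets in reversed(archives):
        if assets <= covered:   # every asset has a newer copy elsewhere
            obsolete.append(name)
        covered |= assets
    return list(reversed(obsolete))
```

Any archive this returns can go into the "files to remove" list of the next patch definition.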
Binary-level file patches
If you do a search for ‘patching game content’, your results will include numerous articles that describe how to minimally modify the on-disk contents of a file at a binary level. Typically this is done by computing the difference for a file and shipping only the difference to the client, resulting in a smaller transfer and a faster patching process. Unfortunately most of the research on patch generation revolves around how to patch executable files (see, for example, Courgette, bsdiff, and DCA). Very little (if any) research has been focused on patching binary assets like textures, models, and sounds.
Consequently the patching system described in this article focuses less on the traditional notion of a ‘patch’ and more on a process that allows us to distribute specific assets to clients so that we can easily update the clients to the latest build. With modern compression technologies this process can result in the transfer of fairly small files to users, and the mental overhead of maintaining this type of system is significantly lower. If however you’re one of the brave souls who needs to do the more traditional version of patching, here are some quick notes.
As a starter, I suggest taking a look at XDelta, which is a fairly straightforward command-line tool: You can run a simple command to create a patch, and another command to apply a patch to a file. The app is open-source so that you can build it into a custom part of your client executable. I haven’t seen XDelta produce amazing compression results, but it does the job fairly well in terms of patching. It’s also worth noting that XDelta doesn’t produce very small deltas for archive files, mainly because the predictive look-ahead model that it uses does not fare well with the contiguous file data laid out in archive files. You’ll quickly find that simply running XDelta on two archive files won’t produce the net savings that you’re looking to gain. As such, it may be more beneficial to generate patch files for each of your assets, and archive/compress those for transfer to the client.
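For reference, the two commands look roughly like this when driven from Python. This sketch assumes the xdelta3 command-line tool is installed and on your PATH; the helper function names are mine.

```python
import subprocess


def xdelta_encode_cmd(old_file, new_file, patch_file):
    """Command line that creates patch_file from old_file -> new_file."""
    return ["xdelta3", "-e", "-s", old_file, new_file, patch_file]


def xdelta_decode_cmd(old_file, patch_file, out_file):
    """Command line that rebuilds the new file from old_file + patch_file."""
    return ["xdelta3", "-d", "-s", old_file, patch_file, out_file]


def run(cmd):
    # Assumption: xdelta3 is installed; check=True raises if the tool fails.
    subprocess.run(cmd, check=True)
```

To patch per asset rather than per archive, you would call run(xdelta_encode_cmd(...)) once for each modified file from your build delta, then archive the resulting patch files.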
If you’d like to roll your own patching system for arbitrary binary content, be warned that this is a tricky problem. While it’s easy to come up with a naive solution, you’ll soon realize that there can be multiple patch points to a single file, which usually throws simple solutions out the window. You’ll also realize that the delta between two files may not be linear – it may include deletions, insertions, and replacements, which is difficult to track. Make sure you’ve got manager approval before going down this route 😉
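To see why the delta is not linear, here is a toy delta encoder built on Python's difflib. It is purely an illustration of the copy/insert/delete/replace bookkeeping, not a practical patcher: SequenceMatcher is far too slow for real game assets, and the operation format is invented for this sketch.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Describe how to turn old bytes into new bytes as a list of operations."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))      # reuse bytes the client already has
        elif j2 > j1:
            ops.append(("data", new[j1:j2]))  # inserted or replacement bytes
        # A pure deletion needs no op: the old bytes are simply never copied.
    return ops


def apply_delta(old, ops):
    """Rebuild the new bytes from the old bytes plus the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

Even in this tiny form, a single file change produces several patch points scattered through the file, each of a different kind.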
File processing and restricted systems
Newer operating systems place restrictions on what content can be executed, mostly requiring that applications be signed before they can be executed. Once an application is signed, the system can detect any change to the application, whether the change is introduced accidentally or by malicious code. How to sign your assets and properly distribute/patch them to the client on a given platform is left as an exercise to you. Each OS seems to have its own policies on that process, and your actions will depend on the systems to which you ship your game.
It takes some effort to implement a patching system, but the benefits are well worth it. A patching system gives you increased flexibility and control over your game, lets you provide a better user experience, and gives you the opportunity to market new content to your users.
My github account has some very simple source code that implements the patching system described above. (Be warned, it uses Python.) The code includes a mock build system, a content server, and a client. You can use this system with the following commands:
1. python build/gen_patch.py
2. python server/httpd.py
3. python client/client.py
Command (1) generates a patch from two builds.
Command (2) starts a server from which to distribute patches.
Command (3) calculates differences in the local version (assuming the client already has the initial build) and pulls down a delta patch from the content server.