Simplex Update #1: Extreme modularization with CMake and Git

Original Author: Francisco Tufró

This is the first update on my Simplex Engine Journey. The first topic I want to talk about is modularization.

Motivations

One of the most important aspects of my engine’s design is modularization. Engines are usually big in terms of classes, dependencies and systems; that’s why a modular approach made sense for me.

From my previous experience working on Moai SDK, I learned that a monolithic git repository that includes all the different systems (modular or not) and third-party libs becomes really huge, really soon.

That’s why I wanted to solve that problem for this engine.

There are two different problems to solve:

  1. How can I sort the code in a suitable way to work with modules?
  2. How to build the modules consistently and be able to include the necessary ones in the final build of a program that uses the engine?

This article will guide you through how I solved these problems and share some of the knowledge I gained working on Moai SDK and other personal projects.

Git for Extreme modularization

While working on Moai SDK, I never felt really happy about the monolithic git repository.

It was a huge git repo that took hours to download completely; for me that’s EXTREMELY anti-agile (Rule #5).

The solution that came to my mind was something I proposed to Zipline, but it didn’t make it because of the $$$ factor, which I don’t care about for my project: having one git repository per module and one ring repository to rule them all (a.k.a. a central git repository using submodules).

The pros for having a single repository per module include:

  • Single history of git commits for each module. (This is pretty interesting, since you avoid having a lot of noise when searching the git log for something specific).
  • Easier to find files, since modules generally include between one and ten source files.
  • Bundled third-party libraries (if a library is ONLY used by this module, it can be bundled inside the repository and added to its build script).

The cons are:

  • If you have a third-party lib that is used by more than one module, you need to be careful with versions. Having a common third-party module might be the way to solve this.
  • More complex directory structure, since instead of having all the code in a single place you have it split among many folders. (This shouldn’t be a huge problem though, since modern IDEs and text editors let you search inside a directory tree pretty easily.)

Please comment below if you want to discuss the pros and cons of this!

Directory structure

To have a unified way to build and organize the code across different modules, I came up with the following directory structure:

  • simplex-[module-name]/src (Actual source code for the module)
  • simplex-[module-name]/include (include files for the module’s classes)
  • simplex-[module-name]/tests (unit tests for the module’s classes)
  • simplex-[module-name]/samples (samples that show the module’s functionality)
  • simplex-[module-name]/docs (documentation on how to use the module or other stuff)
  • simplex-[module-name]/third-party (third-party libraries that are used by this module alone, if I want to include them instead of using them from the system)

Having this standard structure allows me to create a build system that knows what will be found inside each module.
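Because the build relies on this convention, it can help to verify it. The following is a minimal sketch, not part of the engine's actual tooling: `check_module_layout` and the throwaway demo directory are hypothetical names introduced only for illustration.

```shell
# Hypothetical helper: verify that a module follows the standard layout.
# Prints each missing directory and returns non-zero if any is absent.
check_module_layout() {
  module_dir=$1
  missing=0
  for dir in src include tests samples docs third-party; do
    if [ ! -d "$module_dir/$dir" ]; then
      echo "missing: $module_dir/$dir"
      missing=1
    fi
  done
  return "$missing"
}

# Demo in a throwaway directory so nothing real is touched.
DEMO_ROOT=$(mktemp -d)
mkdir -p "$DEMO_ROOT/simplex-core/src" "$DEMO_ROOT/simplex-core/include"

if check_module_layout "$DEMO_ROOT/simplex-core"; then
  echo "layout ok"
else
  echo "layout incomplete"
fi
```

In this demo only src and include exist, so the helper reports the other four directories and the script ends with "layout incomplete".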

So, when I want to create a module, I just create the repository on BitBucket and run my create-module script:

#!/bin/bash
#
# This script is used to create a module for simplex engine
# use:
# ./bin/create-module [module-name]
#
# NOTE: the script will add the 'simplex-' prefix to the module name
# so you should not add it or you will end up having a simplex-simplex- prefix.
#
NAME=simplex-$1

#
# We have one repository per module, so after creating the repository in
# BitBucket, I add it as a submodule of my main repository.
# In this way, we can rebuild the whole directory structure for the engine
# using git submodule update --init, instead of having to download each module by hand.
# This is also useful if you want to avoid using some module and customize what
# gets built into your final application. Just remove or add the modules you want
# to your root repository.
git submodule add "git@bitbucket.org:franciscotufro/$NAME.git" "modules/$NAME"
cd "modules/$NAME"
git checkout -b master

#
# In order to work consistently with the module, we need to create
# a standard directory structure.
DIRECTORIES="src include tests samples docs third-party"
for DIRECTORY in $DIRECTORIES
do
  mkdir "$DIRECTORY"

  # Git doesn't include directories if they don't have content inside.
  # So we create a file called .gitkeep in order to create the directories
  # in the remote repository.
  touch "$DIRECTORY/.gitkeep"
done

#
# Just to have something show up on BitBucket, we create a minimal
# README.md
echo -e "Module $NAME\n===" > README.md

#
# This is a nearly empty CMakeLists.txt for the module.
# All the specifics for creating the library and building its
# tests should go here.
echo -e "cmake_minimum_required(VERSION 2.8.5)\nproject($NAME)" > CMakeLists.txt

#
# Now we need to push these default changes as an initial commit.
# After this we can start working on the module.
git add .
git commit -m "[Module] Initial commit for $NAME" -a
git push origin master

# Go back to the root of the main repository.
cd ../..

After running the script (for example, ./bin/create-module core), you’ll end up with the module created (simplex-core), the basic directory structure, a README.md, and the CMakeLists.txt that will be used to build the module.
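The other side of this workflow is fetching the engine with all its modules. Here is a network-free sketch of that round trip, under some assumptions: the local paths stand in for the BitBucket repositories, and the `g` wrapper, the demo identity, and the `protocol.file.allow` setting (which recent Git versions need for local-path submodules) exist only to make the demo self-contained.

```shell
set -eu

# Wrapper that supplies a throwaway identity and allows local-path submodules
# (recent Git versions block the "file" transport for submodules by default).
g() {
  git -c user.name="Demo" -c user.email="demo@example.com" \
      -c protocol.file.allow=always "$@"
}

WORK=$(mktemp -d)

# 1. A stand-in for one module repository (hypothetical local path).
g init -q "$WORK/simplex-core"
echo "Module simplex-core" > "$WORK/simplex-core/README.md"
g -C "$WORK/simplex-core" add README.md
g -C "$WORK/simplex-core" commit -q -m "Initial commit"

# 2. The root repository that references the module as a submodule.
g init -q "$WORK/engine"
g -C "$WORK/engine" submodule --quiet add "$WORK/simplex-core" modules/simplex-core
g -C "$WORK/engine" commit -q -m "Add simplex-core module"

# 3. What a collaborator does: clone the root, then fetch every module.
g clone -q "$WORK/engine" "$WORK/checkout"
g -C "$WORK/checkout" submodule --quiet update --init --recursive

# The module's files are now present inside the fresh checkout.
ls "$WORK/checkout/modules/simplex-core"
```

With real remotes, step 3 is the only part a collaborator runs; `git clone --recursive` on the root repository would do both commands in one go.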

This is how I solved the first problem: “How can I sort the code in a suitable way to work with modules?”.

Building modules with CMake

Having solved the core problem of the module’s directory structure, I had to figure out how I was going to compile all that stuff and put it together.

The approach I found most appealing is building each module as a static library, and linking to the necessary ones from the final application.

Since the engine is being built in C++ and needs to be cross-platform (Rule #2), I decided to use a single build system: CMake.

I had the opportunity to learn about using CMake as a cross-platform build system (which I really enjoyed) while working on Moai SDK.

From my point of view, CMake is really well thought out. It lets you write build scripts that are simple and well organized, and it is well suited to modularized code.

For it to work, you need to create a file called CMakeLists.txt, which includes the directives on how to build your piece of software.

I was sure I was not going to use another build system, so I included one CMakeLists.txt per module, in order to keep all the build instructions for a specific module inside its own repository. Another approach could have been a separate directory structure holding all the CMakeLists.txt files, like in Moai SDK, but that would make it harder to keep each CMakeLists.txt inside its project’s repository, something that I think is a must for my engine.

The main pattern I followed for creating the modules is that each module builds a static library with its name (simplex-core, simplex-math, simplex-graphics).

This static library includes all the code for the namespace Simplex::Module. If a third-party lib is included in the repository, it will be added to this static lib as well.

You need to be very careful with third-party libs: including two different versions of the same library as static libs could cause problems. My proposed solution, as I stated previously, is to have a common third-party libs module that includes the dependencies shared among modules.

Since I’m a TDD addict, I decided to include all the mechanics for unit testing directly in the build system. For this I create a specific executable in each module, and then tell CMake that the executable is a test suite. I’m using Google C++ Testing Framework for writing unit tests; I’ll write more about this in the next article.

So, each module includes a CMakeLists.txt that looks close to this:

cmake_minimum_required(VERSION 2.8.5)
project(simplex-core)

#
# src/ includes all the necessary cpp files for this module, so we use
# CMake's GLOB instruction to find all the .cpp files and load them into
# the CODE variable.
file ( GLOB CODE
  ${CMAKE_CURRENT_SOURCE_DIR}/src/*.cpp
)

#
# To use the module in our app we create a static library to be linked against.
# This library includes all the source files in our src/ directory, so we use
# the CODE variable as the list of source files to include.
add_library ( simplex-core STATIC
  ${CODE}
)

#
# tests/ includes all the unit tests for the module.
# Again, we use CMake's GLOB instruction to get all of them
# into the TESTS_CODE variable.
file ( GLOB TESTS_CODE ${CMAKE_CURRENT_SOURCE_DIR}/tests/*.cpp )

#
# In order to run the tests, we need to create an executable file (our test runner).
# You'll understand more about this in the next article, which will focus on testing. For now
# it's enough to understand that we include all the tests from TESTS_CODE in a single executable.
add_executable ( SimplexCoreTests ${TESTS_CODE} )

#
# Then we need to link the test executable to the modules that are used.
# In this case, we're linking against simplex-test (the library needed to run the tests) and simplex-core,
# the module we're testing.
# This is usually the case: you link only to the test library and the module you're testing, but there may be
# exceptions where mocking is not an option.
target_link_libraries ( SimplexCoreTests simplex-test simplex-core )

#
# Finally, in order to be able to run "make test" and run all the tests for the different modules,
# we have to add the executable file we just created to CMake's test list.
# We do that using add_test.
# Note: add_test only registers tests once enable_testing() has been called
# (for example in the root CMakeLists.txt).
add_test ( SimplexCoreTests SimplexCoreTests )

Careful, non-caffeine-influenced readers will have noticed that I haven’t set the include path for the module. There is a reason for this.

Each module has some information on how to build itself, but there is a piece missing: the main CMakeLists.txt.

I’ve created a generic CMakeLists.txt (outside of the modules structure) in the main simplex-engine repository. It basically iterates through all the modules, adding each module’s include dir to the include path and loading each module’s CMakeLists.txt into the mix.

I’ll omit some CMake and platform-specific stuff (like finding OpenGL and Cg, and the Mac OS X flags) and focus on the custom part. You can mail me and ask for the whole file if you want:

#
# Every module has a CMakeLists.txt in its root. As we saw, those files provide the necessary
# directions to build each module.
# To be able to iterate on the modules' names we use the GLOB instruction again.
# This will generate a list of all the modules present in the "modules/" directory.
file ( GLOB MODULES ${CMAKE_SOURCE_DIR}/modules/* )

#
# Each module has an include directory where all the headers go.
# To make it easier to find header files, I decided to make all the headers
# available to all modules globally.
# You can achieve this using include_directories in the root CMakeLists.txt
# before loading the modules.
foreach ( MODULE ${MODULES} )
  include_directories ( "${MODULE}/include/" )
endforeach ()

#
# After adding the include paths, the only remaining thing is to
# include each CMakeLists.txt for each module.
# add_subdirectory searches for a CMakeLists.txt file in the given
# directory, so that's what we use to include each module's build script.
foreach ( MODULE ${MODULES} )
  add_subdirectory ( ${MODULE} )
endforeach ()

Note that this main script doesn’t know anything about which modules are being built. It just iterates through the contents of the modules/ directory and includes all its subdirectories.
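One consequence of globbing modules/* is that every entry in that directory gets passed to add_subdirectory, which expects a directory containing a CMakeLists.txt. As a rough sketch, a pre-flight check along these lines could catch stray entries before CMake does; `list_modules` and the demo directory are hypothetical and not part of the engine.

```shell
# Hypothetical pre-flight check: report every entry under modules/ and whether
# it has the CMakeLists.txt that add_subdirectory expects.
list_modules() {
  modules_dir=$1
  for entry in "$modules_dir"/*; do
    # If the directory is empty the glob matches nothing; skip that case.
    [ -e "$entry" ] || continue
    if [ -f "$entry/CMakeLists.txt" ]; then
      echo "ok      $(basename "$entry")"
    else
      echo "broken  $(basename "$entry")"
    fi
  done
}

# Demo: one well-formed module and one without a build script.
DEMO=$(mktemp -d)
mkdir -p "$DEMO/modules/simplex-core" "$DEMO/modules/simplex-math"
touch "$DEMO/modules/simplex-core/CMakeLists.txt"
list_modules "$DEMO/modules"
```

Here simplex-core is reported as ok and simplex-math as broken, because only the former has a CMakeLists.txt.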

This way of adding modules provides a level of abstraction that doesn’t care about what you’re building; it just builds whatever is there.

Also note that making a custom build (with some modules and not others) is just a matter of removing the unwanted modules from the directory tree.

This whole CMake configuration solved the second problem I was facing: How to build the modules consistently and be able to include the necessary ones in the final build of a program that uses the engine?
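For completeness, this is roughly how such a tree could be configured, built, and tested from the command line. It is a sketch under assumptions: `build_engine` is a hypothetical helper, the default Makefile generator is used, and the root CMakeLists.txt is assumed to call enable_testing(), which the excerpt above does not show.

```shell
# Hypothetical helper: configure, build, and test the engine out of source.
build_engine() {
  root_dir=$1
  build_dir=$root_dir/build
  mkdir -p "$build_dir"
  (
    # 1. cmake ..  generates Makefiles for the root project and every module
    # 2. make      builds each module's static library and its test runner
    # 3. ctest     runs the executables that were registered with add_test
    cd "$build_dir" &&
    cmake .. &&
    make &&
    ctest --output-on-failure
  )
}

# Only build when cmake is installed and we are inside a CMake project.
if command -v cmake >/dev/null 2>&1 && [ -f ./CMakeLists.txt ]; then
  build_engine .
else
  echo "cmake or CMakeLists.txt not found; skipping build"
fi
```

Building in a separate build/ directory keeps generated files out of the module repositories, so the submodules stay clean.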

You may ask here, “Where is the actual final program?” Well, it’s just another module. :)
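To make that concrete, here is a sketch of what such an application module might contain, generated from the shell in the same spirit as the create-module script. The name simplex-demo-app and the choice to link only against simplex-core are assumptions for illustration, not the engine's actual application module.

```shell
# Hypothetical application module; "simplex-demo-app" is an illustrative name.
APP_DIR=$(mktemp -d)/modules/simplex-demo-app
mkdir -p "$APP_DIR/src"

# Quoted heredoc delimiter so the shell leaves ${...} for CMake to expand.
cat > "$APP_DIR/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 2.8.5)
project(simplex-demo-app)

# Collect the application's sources, as the library modules do.
file ( GLOB APP_CODE ${CMAKE_CURRENT_SOURCE_DIR}/src/*.cpp )

# An executable instead of a static library...
add_executable ( simplex-demo-app ${APP_CODE} )

# ...linked against only the engine modules it needs.
target_link_libraries ( simplex-demo-app simplex-core )
EOF

cat "$APP_DIR/CMakeLists.txt"
```

Since the root script adds every include directory globally, the application can include headers from any module without extra configuration.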

Conclusion

Using CMake and Git submodules, you can set up a really comfortable structure for a heavily modularized C++ project. The use of good conventions (the module directory tree, for example) allows you to avoid complex build scripts, and makes it easy to find and organize code.

In the next article, I’ll talk about how I implemented my testing environment on top of this structure.

I hope you enjoyed the read! Feel free to contact me and comment below!

Till next time!