About 2 months ago, I released a modified chunk of code from one of my personal projects under the MIT license and put the repository up on GitHub. Even then, the project itself was much larger than the repo (orders of magnitude larger if you compare SLOC), and it is larger still now, but I thought that there was value in releasing an abstracted, bare-bones version of my main program loop. This was mostly because I often benefit from such efforts myself (usually when I’m trying to learn something new) and thought of it as some form of giving back, and partly because I believe that releasing code under flexible licenses helps foster a sense of community among developers.
The project was born out of a desire to learn a bit more about Control Theory, in particular its applications in multi-agent systems, and having decided (in my infinite wisdom, and largely unemployed state) to brush up on my C++, I wanted to make a simulation so I could test a few core concepts from the multitude of papers I got off Springer and IEEE and whatnot using my girlfriend’s library account. Since it was a simulation, and since it was going to require both some form of artificial intelligence and some amount of user input, I decided to build it like a game from the outset. In many ways the idea was basically a real-time strategy game anyway, since the tests I had in mind all had at least some of the elements you’d expect from a strategy game: customizable input schemes, some quantifiable amount of resources on a finite map, deterministic results and playback, agent competition, differences in agent abilities, and so forth. As such, it needed a loop, and needing a loop meant that I had to revisit some old reference material from way back when.
It had been a while since I had written games from scratch (having started back when Flash 5 introduced AS, and later moved to mods instead), so I spent a day or two looking at a few /r/gamedev posts and a few Wikipedia pages before finding an old post from deWiTTERS where he outlines incrementally more adaptable solutions to the game loop problem. I thought he had the right idea, and got to implementing a version of his loop for my own project. Having played games back when CPU speeds were in the MHz range and then having revisited them more than a decade later, I understood his desire for a loop that somewhat decoupled the system clock from the game clock, and relied instead on variable FPS even if you needed constant calls per second for your game logic. Some older games used to rely on the system clock for their timekeeping and looping code, and if you haven’t tried it, playing them on a machine that has a clock that’s three orders of magnitude faster can be quite an experience.
Before I go into too much detail about my own implementation, I’d like to point out that the problem I tackled here is largely a solved problem and has been since the mid-90s. Especially with the prevalence of commercial engines (from Unreal to Unity), I don’t think what I’m about to write will be useful for too many people. In fact, if you’re in game development because you want to express yourself or at least finish a game, I’d advise against rolling your own engine, as has been mentioned countless times elsewhere. There are, of course, innumerable reasons you might want to roll your own, or at least extend an open source one, but for the type of game most would be comfortable writing alone (i.e. no tech or genre breakthroughs), I often find that there is at least one engine that fits the bill – if not more. I personally needed something specific, was on a budget, and thought I had the necessary skills, so I went ahead with Ogre and built an engine on top of that, but I’d probably opt for a commercial engine if I were serious about finishing and publishing a game and living off its profits.
First off, let’s talk about what I needed the game to do, and what that meant for the source code I shared.
- I wanted to have many independent agents that planned and carried out complicated AI tasks.
- I wanted to have a nice way to input commands, and debug in-game if possible.
- I wanted to be able to introduce network/communications lag into my system.
- I wanted the AI to be extendable, that is, I wanted run-time scripting.
So, what do my 4 wishes mean for the loop?
1 & 2. Handling Independent Agents, Inputs and UI updates (Debug or not) from the Loop
Without going into too much detail myself, I used an ECS, or Entity Component System. For the most part, this doesn’t change the loop, and the majority of the change happens to whatever code is to be run once the loop calls whatever function corresponds to, say, input collection, AI updates or physics, but it is useful to note because you’ll see lines in the code that call Systems that aren’t actually included in the repo save for a lackluster definition in the header and an empty implementation in the source. Things such as gameworld updates and AI decision-making are all relegated to these Systems, and have little place in what hopes to be a generalized game loop.
I therefore needed the loop to signal to each of these Systems that it needed to update, and then move on to the next one. I’ll touch on multi-threading toward the end, but I tried to write the loop in such a way that it would function well in both single- and multi-threaded applications. In fact, you might find that a single-thread RTS with networking would work just as well with this loop as would one with a thread dedicated to networking, or that you would rather use thread pooling and call each system sequentially anyway, making the distinction moot as far as the main loop is concerned. Either way, the loop should be going through each system, triggering it, and moving on to the next. How often each system gets triggered, and what triggering does exactly are up to the implementer, of course, so I wanted the loop to accommodate that as well.
3. Induced Lag
This one was very case-specific, so I removed much of the code I had used in the earlier builds from both my own codebase and the git repo. The essence of the problem was that communications between mobile agents in a test environment are generally under the same physical constraints as any other real-world radio communications, i.e. distance, but my software agents could communicate in fractions of a millisecond. As such, problems such as incomplete information, information relaying and cooperative control at high speeds were greatly reduced. High speeds are relative, after all, and coordinating between two agents that can instantly readjust course is not as fulfilling.
As I said, I did have the workings of a system for this kind of simulation in place, but probably didn’t keep it for more than a few commits, so what you’ll see instead is a single-thread timer system. If you’re planning on converting the loop to a multi-threaded approach, you could probably just separate all code pertaining to the Timer class into its own thread and have the loop poll that thread at the start of each pass, or have that thread directly interact with the Systems. As it stands, the timer system will be behind by at most two loops (approx. 40-50 ms) if the timer expires just as the check is about to complete, and by 10-15 ms on average otherwise, which should roughly correspond to the time it takes a ~60 fps program to actually get to the timer calls from the beginning of the loop. That being said, feel free to rip out any Timer code and substitute your own Boost ASIO callback timers or whatnot. I don’t trust myself enough to incorporate that level of threading (or rather, that level of mutex management) into my program, and would rather have each system utilize a thread pool in sequence, so I opted to include a Timer class instead.
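To make the single-thread timer idea concrete, here is a minimal sketch of a polled timer list. It is not the repository's Timer class: the name TimerQueue, its add() and poll() methods, and the choice to pass the current time in as an argument are all assumptions made for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical polled timer list: nothing fires on its own, the loop has to
// call poll() once per pass, so a timer can be late by up to one pass.
class TimerQueue {
public:
    using Clock = std::chrono::steady_clock;
    using Callback = std::function<void()>;

    // Schedule `cb` to fire once `delay` has elapsed, counted from `now`.
    void add(Clock::time_point now, std::chrono::milliseconds delay, Callback cb) {
        timers_.push_back(Entry{now + delay, std::move(cb)});
    }

    // Fire and remove every timer whose deadline has passed.
    // Returns the number of timers that fired.
    std::size_t poll(Clock::time_point now) {
        std::size_t fired = 0;
        for (std::size_t i = 0; i < timers_.size();) {
            if (timers_[i].deadline <= now) {
                Callback cb = std::move(timers_[i].callback);
                timers_.erase(timers_.begin() + static_cast<std::ptrdiff_t>(i));
                cb();  // may safely add new timers; we index, not iterate
                ++fired;
            } else {
                ++i;
            }
        }
        return fired;
    }

private:
    struct Entry {
        Clock::time_point deadline;
        Callback callback;
    };
    std::vector<Entry> timers_;
};
```

In a loop, this would amount to calling poll(TimerQueue::Clock::now()) near the top of every pass, which is why the delay depends on how long one pass takes.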
As far as the induced lag goes, I decided somewhere along the way (but before I actually had a solid AI System down, because you should always plan everything else around Networking rather than the other way around) that having actual Networking capability would be nice, and that the project would be better for it overall, so I implemented a networking system using Boost’s async tools. That’s probably the only reason there’s a call to the network system in the loop and not one to the sound system (as I didn’t have a use for a sound system at that time, even though both deserve their own threads with framerate independent polling and updating, probably), so feel free to extend the loop to include sound as well. It will be pretty simple to do once I explain how the loop actually works. (I’m getting there! Sorry!)
4. Extendable AI
To be honest, this didn’t affect my game loop at all, as everything that needed extension was handled internally by the systems through some System-level (i.e. AI, UI, etc.) binds to a language like Lua (though I did experiment with Python as well)… But if you were so inclined, you could instead move those calls to the loop and have a script run on every loop and modify some global variables. I just ended up doing it this way instead.
OK. On to some actual code and explanation.
Breaking it down: The Game Loop and its helper functions/classes
I won’t include any implementations (.cpp) past the stripped down main loop here, but I thought I’d add some of the headers and go over each function and why I think it deserves a place in your loop as well. Some of them may seem particularly out of place to you, and that’s fine – I am definitely not as well versed in game engine design or loop structures as some, but this worked for me, and it might work for you. Provided as is, no guarantee of merchantability, etc. etc. Also, before we start, take a look at the readme at the bottom of the repository page on GitHub. That’ll give you a general idea of what we’re looking at. For posterity’s sake, here’s the general usage from within an application:
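The exact snippet lives in the repository's readme. As a rough sketch of its shape only, assume a GameLoop class whose public entry points are run() and requestShutdown(). The stub body, the passes() accessor, the three-pass cutoff and the runApplication() wrapper are placeholders for illustration, not the repo's code.

```cpp
// Stand-in for the repository's loop class. Only run() and requestShutdown()
// are names taken from the post; everything inside them is a placeholder.
class GameLoop {
public:
    // Ask the loop to stop once the current pass has finished.
    void requestShutdown() { shutdownRequested_ = true; }

    // Blocks until something calls requestShutdown().
    void run() {
        while (!shutdownRequested_) {
            ++passes_;
            // Placeholder: pretend one of the Systems asks to quit on pass 3.
            if (passes_ == 3) requestShutdown();
        }
    }

    unsigned long passes() const { return passes_; }

private:
    bool shutdownRequested_ = false;
    unsigned long passes_ = 0;
};

// What an application's entry point would do: build the loop, hand over control,
// and carry on with cleanup once run() returns.
unsigned long runApplication() {
    GameLoop loop;
    loop.run();
    return loop.passes();
}
```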
And here is the general form of what run() does:
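As a hedged reconstruction of that general form, pieced together from the description in the rest of this post rather than copied from the repo: each pass walks the systems in a fixed order and updates only those whose time budget has run out. The addSystem() registration, the injected clock function and the Entry struct are assumptions made to keep the sketch short and testable.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

class GameLoop {
public:
    using Clock  = std::function<double()>;      // returns the current time in ms
    using Update = std::function<void(double)>;  // receives ms since its last call

    explicit GameLoop(Clock clock) : clock_(std::move(clock)) {}

    // desiredFps < 0 means "uncapped": the system is updated on every pass.
    // desiredFps must never be 0.
    void addSystem(std::string name, double desiredFps, Update update) {
        systems_.push_back(Entry{std::move(name), 1000.0 / desiredFps,
                                 clock_(), std::move(update)});
    }

    void requestShutdown() { shutdownRequested_ = true; }

    void run() {
        while (!shutdownRequested_) {
            for (Entry& system : systems_) {  // always in registration order
                const double now = clock_();
                const double elapsed = now - system.lastCallMs;
                if (elapsed > system.maxTicks) {
                    system.update(elapsed);
                    system.lastCallMs = now;
                }
            }
        }
    }

private:
    struct Entry {
        std::string name;
        double maxTicks;    // ms allowed between two calls
        double lastCallMs;  // time of the previous call
        Update update;
    };

    Clock clock_;
    std::vector<Entry> systems_;
    bool shutdownRequested_ = false;
};
```

In a real program the clock would wrap something like std::chrono::steady_clock; injecting it here simply makes the loop easy to drive with a fake clock.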
So, let’s go ahead and look at the MainLoop class.
Of the first group of functions, requestShutdown() and run() are the only ones that are meant to be called directly by your program after you create an instance of GameLoop. If you’re going to dedicate a separate thread to the loop (say you’re running multiple loops from the same program), you’ll want to bind GameLoop::run as that is the main function that starts the loop, as the name suggests. The rest of the functions in that group are used to call the Systems from within the loop. They should be private, but apparently I didn’t fix that when I made the repo. It’s not a catastrophe for the test program anyhow, as the only class with access to the GameLoop header is the actual _tmain(int argc, …) loop for the console.
The next set of functions, startGraphicsEngine() and getInputManager(), initialize the graphics engine / renderer and the input system respectively. These two systems generally need to be running before you attempt anything else with the AI, Gameworld or Graphics anyhow, so they are called in the constructor and their results are saved for later. For the example application, I slacked on memory management and didn’t follow the rule of 3, 4 and 5, but you should for an actual application.
The next three functions are defined in Timers.h and are used to keep track of timers, notify timer listeners and do general timer management. As I mentioned, they can be removed without hurting the loop.
The final group of functions are what I use to inform any APIs (that is, if you wish to provide an API for any extensions or script binds). They too can be discarded without harming the loop, but I find them useful for debugging, and they can be used in game mods if you’d like to allow access to data such as uptime, number of loops, raw fps, etc. There are possibilities, and they are endless (sort of).
There are also many private member variables that keep track of the current fps, uptime, and what have you, and you can see them in the real header file for the loop.
Let’s move on to the implementation…
Humble Beginnings
The source is on the long side, and doesn’t make the intent of the code immediately obvious, so I think it’ll be useful to run through the building blocks line by line.
For a very simple loop, you’d simply want to run until the call to shutdown has been received, like so:
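A minimal sketch of that idea, with the loop written as a free function so it stands alone. The names runUntilShutdown and body are made up for this example.

```cpp
#include <functional>

// Keeps calling `body` until `body` sets the shutdown flag it is handed.
// Returns the number of passes made.
unsigned long runUntilShutdown(const std::function<void(bool&)>& body) {
    bool shutdownRequested = false;
    unsigned long passes = 0;
    while (!shutdownRequested) {
        body(shutdownRequested);  // everything the game does happens in here
        ++passes;
    }
    return passes;
}
```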
This will run until some function inside the loop tells it not to. If you’re doing stuff with threads, this will block unless you yield.
Once we’re in the loop, we want to start actually doing AI calculations, game world updates and all that. For a single thread, it makes sense to do these calculations in the following order: check for input, do networking, do AI calculations, update the game world, (update any sounds,) and finally, when everything’s ready to render, render one frame. This allows us to make use of any system that has anything to offer that loop. The input can be sent to a server or be used to modify anything the AI, game world or graphics might make use of; the network might receive data that changes the AI’s decisions or entities in the game world, or make a difference in the rendering pipeline; the AI might make changes to the game world; and finally, the renderer will gather all of this and display the ultimate state of the world. For a busy loop (where there are calls to each system), all of these might happen and therefore need to happen in sequence to gain the most benefit (unless you have some weird multi-thread system set up). This would leave us with something like this:
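One pass of such a loop can be sketched as below. The Systems struct and its method names are stand-ins that only record the call order; the real Systems in the repo are separate classes with their own update functions.

```cpp
#include <string>
#include <vector>

// Stand-in for the real Systems: each call just records that it happened.
struct Systems {
    std::vector<std::string> callOrder;

    void collectInput()    { callOrder.push_back("input"); }
    void updateNetwork()   { callOrder.push_back("network"); }
    void updateAI()        { callOrder.push_back("ai"); }
    void updateGameWorld() { callOrder.push_back("gameworld"); }
    void renderFrame()     { callOrder.push_back("graphics"); }
};

// One pass through the loop body: every system, every time, in this order.
void runOnePass(Systems& systems) {
    systems.collectInput();     // may feed the network, AI, world and renderer
    systems.updateNetwork();    // may change AI decisions or world entities
    systems.updateAI();         // may change the game world
    systems.updateGameWorld();
    // a sound update would slot in here
    systems.renderFrame();      // draws the final state of this pass
}
```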
The problem with this loop is that it’ll call all of these functions, sequentially, every single time. Your graphics might appreciate 120 fps, but you really don’t need that degree of accuracy for your network or AI code unless you’re interfacing with a robot that needs very low reaction times or something. For a simulation, you can get by with a lot fewer AI and Network calls per second… And therein lies the problem. You might want to fix your AI to a call every 100 ms or so, but try doing that with your graphics and it’ll look like you’re playing a DOS game. Therefore, we need a way to make sure all the systems still always run sequentially, but at independent rates.
Here’s where it starts getting fun. We add desired frames per second (or calls per second) targets to our code, and allow each system to be governed by the number of clock ticks (milliseconds) that have passed since the last call to that system.
For any given system, the number of ticks required before it begins falling behind its fps goal can be found like so:
So simply:
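As a sketch, assuming the clock counts in milliseconds as stated above: the general form is ticks per second divided by the desired calls per second, which with a millisecond clock reduces to 1000 / desiredFps. The names kTicksPerSecond and maxTicksFor are chosen for this example.

```cpp
// With a millisecond clock there are 1000 ticks in a second.
constexpr double kTicksPerSecond = 1000.0;

// Ticks that may pass between two calls before a system falls behind its
// target rate. desiredFps must not be 0.
constexpr double maxTicksFor(double desiredFps) {
    return kTicksPerSecond / desiredFps;
}
```

For example, a target of 10 calls per second leaves 100 ms between calls, and 60 fps leaves roughly 16.7 ms.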
If we keep track of this for each system, and also keep track of when we last called that system, we can determine each system’s desired fps/cps in the constructor or on the fly during runtime without incurring any sort of penalty for machines with different speeds. (see: deWiTTERS game loop above)
In the loop itself, we can then check each system against this maxTicks value to see if we need to update the system that loop:
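A minimal version of that check might look as follows. SystemSchedule and updateIfDue are hypothetical names, and the times are plain millisecond counts passed in by the caller.

```cpp
// Per-system bookkeeping: how long it may wait, and when it last ran.
struct SystemSchedule {
    double maxTicks;    // ms allowed between calls (1000 / desired fps)
    double lastCallMs;  // clock reading at the previous call
};

// Calls `update` only if more than maxTicks have passed since the last call.
// Returns true when the system was updated on this pass.
template <typename Update>
bool updateIfDue(SystemSchedule& schedule, double nowMs, Update&& update) {
    const double timeSinceLastSystemCall = nowMs - schedule.lastCallMs;
    if (timeSinceLastSystemCall > schedule.maxTicks) {
        update();
        schedule.lastCallMs = nowMs;  // restart the wait from this call
        return true;
    }
    return false;  // not due yet; leave the time to the other systems
}
```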
This makes sure that update calls to that system happen once maxTicks have elapsed or later, but never before, allowing any other systems to eat up the clock cycles we theoretically allotted that loop (in our minds – we didn’t actually set a limit on each loop’s length).
Using such a control flow also has the added benefit of allowing us to essentially uncap any system’s fps by setting the desired fps to a negative number (but never zero – you can’t divide by zero). In that case, maxTicks is negative, so the expression timeSinceLastSystemCall > maxTicks will always evaluate to true (since any non-negative number is greater than a negative number) and the loop will therefore always enter the if statement. Using this knowledge, we can cap the AI, Network and Gameworld to independent calls per second while removing any such caps from our input and graphics:
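A sketch of such a configuration is below. The specific rates (20, 10 and 30 calls per second) are illustrative values rather than the ones used in the repo, and LoopCaps and isDue are names invented for this example.

```cpp
// Desired calls per second for each system. Negative means uncapped.
struct LoopCaps {
    double inputFps     = -1.0;  // uncapped: polled on every pass
    double networkFps   = 20.0;  // illustrative value
    double aiFps        = 10.0;  // one call every 100 ms
    double gameworldFps = 30.0;  // illustrative value
    double graphicsFps  = -1.0;  // uncapped: render as often as possible
};

// Same comparison the loop makes. A negative desiredFps yields a negative
// maxTicks, so any elapsed time (even 0 ms) passes the test.
inline bool isDue(double timeSinceLastSystemCall, double desiredFps) {
    const double maxTicks = 1000.0 / desiredFps;
    return timeSinceLastSystemCall > maxTicks;
}
```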
This will also guarantee that the loop will never *not* do anything – that is, the loop will always have a system ready to be called on the next run through, and will therefore always do at least something on each pass. The problem with this is that the loop might call the input and graphics systems twice in a row without even touching any of the systems that would change anything that needs to be drawn on the screen, which would be as bad as having skipped a frame (since frame 2 will look exactly the same as frame 1). To counteract this, we can use our variables that keep track of how long it’s been since the last time the system was updated and pass that along with the update call so some form of interpolation can be done during the update. Since the graphics are generally derived from actual (desired) entity positions or velocities, this means that our update function to the graphics system needs to receive a delta time variable in the form of:
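One possible shape for that call is an update function that takes the elapsed milliseconds. The sketch below is an interpretation, not the repo's renderer: it assumes the graphics system keeps its own drawn position and nudges it along the entity's last known velocity by the delta time, so two frames in a row no longer look identical. RenderedEntity and GraphicsSystem's members are hypothetical.

```cpp
// What the renderer knows about an entity between game world updates.
struct RenderedEntity {
    double x;               // position as last drawn
    double velocityPerSec;  // units per second, as last reported by the world
};

class GraphicsSystem {
public:
    explicit GraphicsSystem(RenderedEntity entity) : entity_(entity) {}

    // deltaTimeMs: milliseconds since this system was last updated.
    void update(double deltaTimeMs) {
        // Advance the drawn position along the known velocity.
        entity_.x += entity_.velocityPerSec * (deltaTimeMs / 1000.0);
        // ... the actual draw call for the frame would go here ...
    }

    double drawnX() const { return entity_.x; }

private:
    RenderedEntity entity_;
};
```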
You can also provide this information to other systems such as sound, API and even UI – although I generally just call the UI update function when I call the graphics renderer, since I’d rather they both update every frame. Regardless, you can pass the information because you already have it handy for the if statement.
Some notes on threading
Generally speaking, very few of my projects have so far required threading for performance. I usually do not make use of threading geared toward specific systems (but I do use threads for convenience to decouple asynchronous systems, e.g. Network code, or allow for runtime graphic setting changes). Instead, I use thread pools and batching if I need speed or concurrency. This ensures that systems don’t try to access data that isn’t ready yet, and it also makes my life easier since I deal with fewer issues relating to concurrency. If you need to use per-system threads, however, you should remember a common mantra from UI coding: Stay off the UI thread. Or, in this case, stay off the MainLoop thread. A call to a system in that case should make a note of the call and return immediately. The call can then be handled internally once the system’s own loop has returned to its start, without hanging the main loop and therefore blocking it.
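The "make a note and return immediately" pattern can be sketched with a mutex-guarded queue. AsyncSystem, requestUpdate() and processPending() are hypothetical names; in a real per-system thread, processPending() would be called at the top of that thread's own loop, while the main loop only ever calls requestUpdate().

```cpp
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

class AsyncSystem {
public:
    // Called from the main loop thread: record the request and return.
    // The lock is held only for the push, so the main loop barely waits.
    void requestUpdate(double deltaTimeMs) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push(deltaTimeMs);
    }

    // Called from the system's own thread at the start of its loop.
    // Returns how many queued requests were handled.
    std::size_t processPending() {
        std::queue<double> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(batch, pending_);  // grab everything, release the lock
        }
        std::size_t handled = 0;
        while (!batch.empty()) {
            processedTimeMs_ += batch.front();  // stand-in for the real work
            batch.pop();
            ++handled;
        }
        return handled;
    }

    double processedTimeMs() const { return processedTimeMs_; }

private:
    std::mutex mutex_;
    std::queue<double> pending_;
    double processedTimeMs_ = 0.0;  // touched only by the system's thread
};
```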
Anyhow, I encourage you to fork the project and poke around the source to see how you can improve it. I can’t promise to keep it maintained, but I will at least try to keep it available. Happy coding!