I’m almost half a decade late, but I recently read xathis’ postmortem for his entry into the Google-sponsored AI Challenge 2011, where contestants pitted their AI ants against other AI ants in a bid to dominate a small, grid-based map with static obstacles (water tiles). He had a very clear approach to the whole thing, and I enjoyed reading the postmortem, but as I watched the replay, I noticed he underutilized his ants, even though he eventually won hands down. Especially in some early skirmishes, he loses ants or halts progress in stalemates because of poor tactics (or rather, a failure to prioritize engagements in his strategy): some ants decide to wait for food spawns around the hill instead of pushing outward or securing a frontline.
Here’s an example of what I mean:

Red spends close to 200 turns holding a line when it could have pushed with backups.

White sacrifices ants over and over when it could have gained an upper hand by joining forces.
Anyway, let’s analyze this situation a little.
For an exercise like this, terrain knowledge, or map awareness, is extremely important. In strategy games, this typically means that as a player, you need to do some fog-busting: a term you might know from the Civilization series that describes keeping a unit at a strategic location so you can remove as much of the fog of war as possible. Fog-busting prevents the risks associated with being surrounded by the unknown. In this particular example, unknown terrain means the risk of ambush and lost opportunities in the form of missed food (which means fewer ants spawned at your hill) and unrazed hills (which means enemy ants continue to spawn). xathis attempts to solve this problem (and another one I’ll mention later) by making sure that as much of the map as possible is visible to him every turn, using a clever algorithm:
Every tile on the map has an exploreValue that is increased by one at the beginning of each turn, unless it is reachable within 10 steps by one of my ants (determined via bfs), in that case the exploreValue is resetted to zero. So the exploreValue is an indication of how long the tile has been out of my reach. [Read more]
An idle ant then moves, in either one or two steps, to a nearby tile with the highest exploreValue (above 10). This behavior is only invoked if the ant has nothing else to do that turn (like collect food, defend, or attack), so it has low enough priority that it doesn’t interfere with any immediate fighting behavior, from what I can tell. Still, many ants end up chasing tiles with high exploreValues as a result. The emergent behavior is very similar to the aforementioned fog-busting a human player might do, with ants spread throughout the player’s “territory.” This creates a blanket of ping-ponging ants across the whole map. As xathis says:
Often single ants keep moving back and fourth for a long time. [An ant in a dead-end cave] has to move one step into the cave because it has to explore a tile … but every other second turn the ant has nothing to explore thus … [it moves] one step in direction of the cave exit only to discard this mission the following turn moving back in again.
This approach gives the player maximal, but not optimal, vision of the map. The reason I distinguish between the two is simple: if your only goal is to prevent ambushes, there is no use in keeping track of “dead tiles” such as dead ends or tunnels. The rules of the game are clear and constant; thus, there is a guaranteed limit to the enemy’s movements, i.e., no enemy unit can move into the tunnel (or out of it) without going through its mouth. It is therefore a waste of an ant to keep the tunnel clear of fog. This might not seem like a big deal initially, but imagine there is a very long tunnel that has been cleared by a group of 6 ants. With xathis’ naive algorithm, the army is suddenly down 6 ants: they get stuck in a loop, stepping toward the mouth of the cave and back into the action one turn, only to move back in the next and fade into a marginally useful greedy fog-busting mission.
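To make this concrete, here’s a minimal sketch of the exploreValue bookkeeping with a pre-pass that prunes dead tiles so no ant gets lured into a dead-end cave. This is my own reconstruction, not xathis’ code: the grid representation, the LAND marker, the function names, and the 10-step limit are all assumptions based on his description.

```python
from collections import deque

LAND = '.'  # assumed passable-tile marker

def find_dead_tiles(grid):
    """Iteratively mark tiles with at most one passable neighbor as dead.

    This peels away dead ends (and the corridors behind them) layer by
    layer; what survives is terrain an enemy could actually move through.
    """
    h, w = len(grid), len(grid[0])
    passable = {(r, c) for r in range(h) for c in range(w)
                if grid[r][c] == LAND}
    dead = set()
    changed = True
    while changed:
        changed = False
        for r, c in list(passable):
            # the Ants map wraps around at the edges
            neighbors = [((r + dr) % h, (c + dc) % w)
                         for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))]
            if sum(n in passable for n in neighbors) <= 1:
                passable.discard((r, c))
                dead.add((r, c))
                changed = True
    return dead

def update_explore_values(explore, grid, my_ants, dead, max_steps=10):
    """Reset exploreValue for tiles within max_steps of any ant
    (multi-source BFS) and increment it everywhere else, skipping
    dead tiles entirely so they never attract an explorer."""
    h, w = len(grid), len(grid[0])
    dist = {ant: 0 for ant in my_ants}
    frontier = deque(my_ants)
    while frontier:
        r, c = frontier.popleft()
        if dist[(r, c)] == max_steps:
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n = ((r + dr) % h, (c + dc) % w)
            if n not in dist and grid[n[0]][n[1]] == LAND:
                dist[n] = dist[(r, c)] + 1
                frontier.append(n)
    for r in range(h):
        for c in range(w):
            if grid[r][c] == LAND and (r, c) not in dead:
                explore[r][c] = 0 if (r, c) in dist else explore[r][c] + 1
```

One caveat: food can still spawn on dead tiles, so a real bot would want an occasional sweep through them rather than ignoring them forever; the pruning only makes sense for the ambush-prevention side of fog-busting.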
Which brings me to the second problem xathis solved by blanketing the map with ants. I think he wanted to keep this fog-busting behavior as is, because the Challenge required him to collect food to spawn units. According to his postmortem, this apparently meant that seeing as much of the map as possible was the best strategy and allowed for faster army growth. He’s got a point: One thing maximal vision of the map would do is trivialize food collection. If you keep the entirety of your territory free from fog, any food that appears within your territory will be seen and gathered and thus turned into another ant as quickly as possible… And quick is good.
So why do I think this fog-busting strategy is sub-optimal? It’s sub-optimal because it doesn’t make use of our knowledge of the game’s rules. The game specifications have a few notes on food spawning:
- Each map is supposed to have some hidden rate for spawning food.
- Food spawning rotates through sets of symmetric tiles, such that a set isn’t visited again until every other set has already been spawned.
- Some specific tiles in mirror-symmetric maps are hardcoded to never spawn food.
The Challenge page even has a nice hint: “The best way to gather the most food is to explore the map and cover the most area.” But I think this hint is unintentionally misleading. From what I can tell, depending on the rate of food generation, blanketing the map with ants might not be optimal. For very high rates, you’d indeed want to keep an eye on every tile inside your territory so you can pick up any food before the next spawn wave. However, if food spawns slowly (and from observation, in some games/maps it sometimes only spawns a few times in hundreds of turns inside a 10×10 area), your ant may have time to patrol a larger area for food at the cost of losing the immediacy of acquisition. I made a quick animation to illustrate my point:

Depending on the food spawn rate, it might be suboptimal to keep ants on fog-busting missions.
As you can see, patrolling frees up more ants than blanketing for other tasks, such as fighting, but comes with the tradeoff of delayed ant spawns. So how important is this tradeoff? Well, let’s look at the source code for the Challenge.
Here’s the function that handles what happens at the end of a turn:
```
1458  def finish_turn(self):
1459      """ Called by engine at the end of the turn """
1460      self.do_orders()
1461      self.do_attack()
1462      self.do_raze_hills()
1463      self.do_spawn()
1464      self.do_gather()
1465      self.food_extra += Fraction(self.food_rate * self.num_players, self.food_turn)
1466      food_now = int(self.food_extra)
1467      left_over = self.do_food(food_now)
1468      self.food_extra -= (food_now - left_over)
1469
1470      ...
```
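The Fraction bookkeeping on lines 1465–1468 is easy to gloss over, so here’s a tiny standalone illustration of how food_rate regulates spawning. The numbers are invented for the example (real values come from each map’s configuration), and I’m assuming do_food places everything it’s given (left_over == 0):

```python
from fractions import Fraction

# invented example values; each map defines its own rates
food_rate, food_turn, num_players = 5, 20, 3

food_extra = Fraction(0)
for turn in range(1, 9):
    # same accumulation as finish_turn(): 5 * 3 / 20 == 3/4 food per turn
    food_extra += Fraction(food_rate * num_players, food_turn)
    food_now = int(food_extra)   # whole food items available this turn
    food_extra -= food_now       # assume all of it gets placed
    print(f"turn {turn}: spawn {food_now}, carry over {food_extra}")
```

With these numbers the engine spawns food on a 0, 1, 1, 1 cycle: the fractional remainder carries over until it adds up to a whole item. Halving the increment would stretch the same pattern over twice as many turns, which is the slowdown discussed further below.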
The default function that handles food spawns is do_food_symmetric(self, amount=1), as called from line 1467 above through do_food (which selects the right do_food_* function based on the game settings):
```
1106  def do_food_symmetric(self, amount=1):
1107      """ Place food in the same relation player start positions.
1108
1109          Food that can't be placed is put into a queue and is placed
1110          as soon as the location becomes available.
1111          Positions are randomly ordered and cycled to evenly
1112          distribute food.
1113      """
1114      # if this is the first time calling this function then
1115      # create the food sets
1116      if not hasattr(self, 'food_sets'):
1117          self.food_sets = deque(self.get_symmetric_food_sets())
1118          # add a sentinal so we know when to shuffle
1119          self.food_sets.append(None)
1120
1121      # place food while next food set is <= left over amount
1122      while True:
1123          s = self.food_sets.pop()
1124          # if we finished one rotation, shuffle for the next
1125          if s is None:
1126              shuffle(self.food_sets)
1127              self.food_sets.appendleft(None)
1128              s = self.food_sets.pop()
1129
1130          if len(s) > amount:
1131              self.food_sets.append(s)
1132              self.place_food()
1133              return amount
1134          else:
1135              amount -= len(s)
1136              self.food_sets.appendleft(s)
1137              for loc in s:
1138                  self.pending_food[loc] += 1
```
The function pops a set of food spawn locations from one end of a deque and, when done with it, recycles it to the other end, so every set gets used before any repeats. When it pops the None sentinel instead of an actual set of tiles, it shuffles the sets and puts the sentinel back, starting a fresh rotation in a new random order. If the food sets haven’t been built yet, it first calls get_symmetric_food_sets() to create them. As long as amount (the whole-number part of food_extra, which finish_turn() grows each turn via self.food_extra += Fraction(self.food_rate * self.num_players, self.food_turn)) is at least as large as the next set, the function adds a pending_food entry for each location in the set and decreases amount by the set’s size. Eventually, either the queued food is placed immediately through place_food(), or no food is placed at all until the game accumulates a large enough amount. That’s how food_rate regulates food generation.
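The deque-plus-sentinel pattern is the piece that guarantees the “no repeats until everything has spawned” behavior, so it’s worth isolating. Here’s a stripped-down demo; the single-letter food sets are placeholders standing in for lists of symmetric tiles:

```python
from collections import deque
from random import shuffle

food_sets = deque([['A'], ['B'], ['C']])
food_sets.append(None)  # sentinel: marks the end of one full rotation

def next_food_set():
    """Pop the next set and recycle it to the far end; on hitting the
    sentinel, shuffle so the next rotation comes in a fresh order."""
    s = food_sets.pop()
    if s is None:
        shuffle(food_sets)          # random.shuffle works on a deque
        food_sets.appendleft(None)
        s = food_sets.pop()
    food_sets.appendleft(s)
    return s

print([next_food_set() for _ in range(9)])
# each of A, B, and C appears exactly once per block of three draws
```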
For reference, here’s place_food, which goes through all the pending food items and places them if the tile is free:
```
1140  def place_food(self):
1141      """ Place food in scheduled locations if they are free """
1142      for loc in list(self.pending_food.keys()):
1143          if self.map[loc[0]][loc[1]] == LAND:
1144              self.add_food(loc)
1145              self.pending_food[loc] -= 1
1146
1147              # remove from queue if the count reaches 0
1148              if not self.pending_food[loc]:
1149                  del self.pending_food[loc]
1150
```
The generation method for the food_sets used for the default case is get_symmetric_food_sets(), as found in line 1272 of ants.py (and called on line 1117 above in do_food_symmetric):
```
1272  def get_symmetric_food_sets(self, starting=False):
1273      """ Split map into sets of squares
1274
1275          Each set contains self.num_players points where each point
1276          is at a consistent offset from each player's starting
1277          position.
1278          A square may be the same distance to more than one player
1279          which will cause the set to be smaller than the number
1280          of players
1281          Assumes map is symmetric.
1282      """
1283
1284      if not hasattr(self, 'map_symmetry'):
1285          # randomly choose one symmetry
1286          # get_map_symmetry will raise an exception for non-symmetric maps
1287          self.map_symmetry = choice(self.get_map_symmetry())
1288
1289      food_sets = []
1290      # start with only land squares
1291      visited = [[False for col in range(self.width)]
1292                 for row in range(self.height)]
1293      # aim for ahill0 will always be 0
1294      ant0 = self.map_symmetry[0][0]
1295
1296      if starting:
1297          vision_squares = self.get_initial_vision_squares()
1298
1299      for row, squares in enumerate(visited):
1300          for col, square in enumerate(squares):
1301              # if this square has been visited then we don't need to process
1302              if square:
1303                  continue
1304
1305              # skip locations of hills
1306              if (row, col) in self.hills:
1307                  continue
1308
1309              if starting:
1310                  # skip locations outside of initial ants' view radius
1311                  if (row, col) not in vision_squares:
1312                      continue
1313
1314              # offset to ant 0
1315              o_row, o_col = row - ant0[0], col - ant0[1]
1316              # set of unique food locations based on offsets from each starting ant
1317              locations = list(set([
1318                  self.destination(loc, self.offset_aim((o_row, o_col), aim))
1319                  for loc, aim, _ in self.map_symmetry
1320              ]))
1321              # duplicates can happen if 2 ants are the same distance from 1 square
1322              # the food set will be smaller and food spawning takes this into account
1323
1324              # check for spawn location next to each other
1325              # create food dead zone along symmetric lines too
1326              too_close = False
1327              loc1 = locations[0]
1328              for loc2 in locations[1:]:
1329                  if self.distance(loc1, loc2) == 1:
1330                      # spawn locations too close
1331                      too_close = True
1332                      break
1333              if too_close:
1334                  continue
1335
1336              # prevent starting food from being equidistant to ants
1337              if not starting or len(locations) == self.num_players:
1338                  # set locations to visited
1339                  for loc in locations:
1340                      visited[loc[0]][loc[1]] = True
1341                  food_sets.append(locations)
1342
1343      return food_sets
```
As the docstring at the start of the snippet states, this function generates sets of symmetric food spawn locations. These sets cannot contain locations that touch; that is, the game will not spawn food at locations that are a distance of 1 apart. They exclude hills and water, and starting food sets additionally can’t be equidistant from multiple ants. As far as I can tell, the sets cover all land that isn’t a hill, so food will spawn on the same set of tiles twice only after food has already spawned on every other set of tiles. This means that on a large map, you probably won’t see food spawn at the same location twice, given the game ends at 1000 turns. You could use this information to min-max your movement, but that’s a bit overkill in my opinion, especially given the cost of storing and processing it.
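For what it’s worth, the bookkeeping wouldn’t have to be heavy. Here’s a speculative sketch (my own idea, not anything from xathis’ bot) of how you might deprioritize patrol targets once food has been seen spawning there, exploiting the no-repeats rotation; the class, its thresholds, and the weighting convention are all invented for illustration:

```python
class SpawnTracker:
    """Remember where food has already appeared during this rotation.

    Since the engine cycles through every symmetric food set before
    repeating any, a tile that already produced food is a poor patrol
    target until the rotation is nearly complete.
    """
    def __init__(self, num_land_tiles):
        self.num_land_tiles = num_land_tiles
        self.seen = set()

    def record_spawn(self, loc):
        self.seen.add(loc)
        # heuristic: once most land has spawned, the rotation is about
        # to restart, so old observations lose their value
        if len(self.seen) > 0.9 * self.num_land_tiles:
            self.seen.clear()

    def patrol_weight(self, loc):
        # assumed convention: callers scale a tile's patrol priority
        # by this factor (lower == less worth visiting)
        return 0.1 if loc in self.seen else 1.0
```

In practice you only ever observe spawns inside your own vision, so the reset would rarely trigger; the cheap version of the idea is simply “don’t hurry back to a tile that just fed you.”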
The main question I asked above was whether the trade-off between blanketing and patrolling is worth it. Well, there are two things about the code worth mentioning. Firstly, as you might have seen above, any food that can’t be placed because its tile is blocked (i.e., not marked as LAND) gets queued in pending_food and is spawned the next time the location is free. Secondly, a similar system exists for ant spawning: ants are queued to spawn when food is collected, and only spawn once the hill is free. Effectively, this means that the only penalty for not collecting food is a delayed spawn. There is no loss of food. There is no loss of ants. The only thing lost is the opportunity to attack and defend with a greater force because you let the food sit. So as long as you can keep up with the number of enemies approaching the frontlines, there is no penalty to patrolling instead of blanketing. And given that holding a frontline is easier than forming one at the start of the game, having spare ants to form frontlines early would greatly improve a bot’s defensive position, and possibly allow it to rush an attack.
Again, this depends on the map’s food spawn rate, which is in turn affected by things like the number of players still in the game (vide finish_turn()). Accordingly, it might be nice to track a rough spawn rate within a given (somewhat sizable) box and keep just enough ants patrolling your territory to, on average, keep up with the rate at which food is generated, while sending whatever ants you have left to fight and expand your territory. This probably means you’d have more ants patrolling toward the start of the match than at the end, since the number of players still in the game is sure to decrease, effectively slowing food spawns. You’d slowly shift from blanketing to patrolling as the game goes on, which might be better in the long run than blanketing alone (and might let you win matches that would otherwise end with multiple survivors).
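As a rough cut, that bookkeeping might look like the sketch below. Everything here is an assumption of mine for illustration: the class name, the sliding window, the vision_tiles default (which you’d really derive from the map’s viewradius2), and the scaling heuristic.

```python
from collections import deque

class PatrolBudget:
    """Estimate the local food spawn rate from recent observations and
    convert it into a number of ants to keep on patrol duty."""

    def __init__(self, territory_tiles, vision_tiles=200, window=50):
        # vision_tiles: roughly how many tiles one ant sees at once;
        # compute the real value from the map's viewradius2
        self.territory_tiles = territory_tiles
        self.vision_tiles = vision_tiles
        self.spawns = deque(maxlen=window)  # food sightings per turn

    def record_turn(self, new_food_seen):
        self.spawns.append(new_food_seen)

    def patrollers_needed(self, pickup_delay=10):
        """Scale between patrolling and blanketing: fast food demands
        close to one ant per vision disc; a trickle needs far fewer."""
        blanket = max(1, self.territory_tiles // self.vision_tiles)
        if not self.spawns:
            return 1
        rate = sum(self.spawns) / len(self.spawns)  # food per turn
        # expected food appearing before a patroller swings back around;
        # near (or above) 1, we want the full blanket
        demand = rate * pickup_delay
        return max(1, min(blanket, round(blanket * demand)))
```

Whatever ants this budget doesn’t claim go to the frontline. The exact scaling hardly matters; the point is that the patrol count should track the observed spawn rate instead of the map size alone.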
All of this is, of course, rather pedantic, and xathis proved that a simpler, solid approach is sometimes all you need, but it got me thinking. For games like this, uncertainty is pretty much a given: Will I have enough resources at a given time to counter something that may or may not happen? What’s my opponent’s build or unit composition? Etc. Random number generators also add some intrinsic uncertainty, making player choices anything from calculated risks to pure gambles. Perhaps that shines a light on something lacking in a lot of AIs, namely the willingness to take stupid risks. And I’m not just talking about “I might be able to win a 3 vs. 5 engagement” risk taking. I’m talking about planning for the unknown; I’m talking about meta-gaming; I’m talking about human vs. human behavior.
A not-so-insignificant portion of human behavior is filling in gaps, guessing, and going with a gut feeling, mostly using knowledge already obtained. Some AIs already do this to some extent and switch tactics when they see you countering their setup; SC2’s AI does this, for example. By and large, though, AI is dumb. Not dumb in the sense that it executes bad strategies or tactics, but dumb in the sense that it’s some form of state machine that can only look at what it has been programmed to look at and do only what it has been programmed to do. It doesn’t create new tactics; it doesn’t improvise out of desperation. An AI might imitate human behavior to some extent if trained to do so (e.g., through machine learning), but a) most games don’t do this and b) the AI still has similar bounds on its knowledge and actions. Perhaps machine learning isn’t incorporated into games much because preparing a good training set is hard (really hard), or because it turns out it’s not really fun that way. I don’t know. There is also the fact that we’re so far away from strong AI that perhaps it is naive of me to want some semblance of it in my games…
I’m again somewhat falling back to talking about RTS games, of course (as I’m wont to do); keep in mind there are some great FPS, racing, and TCG AIs out there. Still, no matter what RTS/TBS I’ve played, it has seemed to me like there’s always been some way to trick the AI consistently. Some shortcut that only a human player can utilize, some knowledge that can let you see a few steps ahead: “cheesing.” It might be as simple as knowing the enemy’s timing. Not to say that I can beat AIs consistently (I can’t), but most AIs behave predictably. I know Gandhi’s going to nuke me eventually in Civ. I know the Wolf’s going to bring lots of swordsmen in SH:Crusader. I know that Castile/Spain won’t be able to pick up Quest for the New World or Reconquista till they reach a certain tech level in EUIV. Perhaps it’s because the rulesets are so complicated and the movesets so varied that the clarity of creating an FPS AI doesn’t translate to what is necessary to create an RTS AI. There always seems to be something that gets missed or doesn’t feel natural; it’s a symptom of the rigidity of the code, and you can capitalize on that as a player. It seems to me that all of this detracts from the “fun” in the game. Well, maybe not detracts: it changes the mode of fun. Kind of like the difference between playing a game for the first time and playing it for the 100th time. The thrill of the unknown becomes the gratification of outclassing the AI. It’s still fun, but in a different way.
Is it even possible to program AI so that it behaves human? Would that even be fun? You tell me.
Tags: AI, AI Challenge 2011, ants, Strategy, xathis