Building a believable computer opponent for a 4X strategy game is one of those problems that turns out to be bottomless. I’d use the cliché that it looks simple from the outside, but I don’t think that’s true: I thought this would be a tough nut from the outset. I’ve built a chess engine before, and getting a strong opponent there was far simpler, though it helps that chess is such a well-understood and well-documented problem. The player wants an opponent that explores, expands, exploits and exterminates with apparent intent: one that musters an army over several turns, marches it across a continent, lands it on your shore and takes your city, all while you watched it coming and couldn’t quite stop it. They do not want an opponent that teleports units, reads your mind, or sits inert in its starting cities until you wander into range.
This post is a tour through the Annhexation AI — explaining how it makes decisions, what it remembers between turns, and how the same core machinery produces eight distinct civilizations and four difficulty levels. Annhexation isn’t open source, so rather than quote the implementation I’ll describe the design and illustrate the interesting bits with pseudocode.
I should note that the AI is still under development, but after a lot of bashing with a hammer it’s feeling like it’s in a pretty decent place.
The core idea
The single most important design decision in the Annhexation AI is that strategy, planning and execution are decoupled. They are separated on purpose, and an AI turn flows through three layers:
- Strategic layer — What am I trying to achieve? Peace or war, expansion or consolidation, science race or turtle with wonders. This layer thinks in goals that last tens of turns.
- Operational layer — How do I achieve my strategic goals? Resource allocation, unit quotas, attack plans, city production, research direction. This is the planning.
- Tactical / execution layer — What do I actually do this turn? Move this unit here, attack that stack, fortify this garrison, embark these troops. Turn to turn execution.
The payoff of this separation is, hopefully, coherence over time. A greedy turn-by-turn AI looks twitchy: it builds an army, gets distracted, disbands it, builds another. By contrast, an Annhexation AI that adopts a militaryPush goal will hold that goal for twenty-plus turns, funnelling production, research and unit movement toward a single objective until the city falls, the campaign demonstrably fails, or something seismic interrupts it. Strategy should be sticky while execution is flexible.
A complete turn runs as an ordered sequence of discrete phases — from threat assessment and diplomacy through combat, movement, production and fortification:
    function runTurn(player, world, aiState):
        detectEvents(aiState, world)  # diff against last turn → fire interrupts
        aiState.goals = evaluateStrategy(player, world, aiState)
        plans = buildOperationalPlans(aiState.goals, player, world)
        executeTactics(plans, player, world)  # the phase sequence (see below)
        aiState.snapshot = snapshot(world)  # remember this turn for next time
        return aiState
Strategy
At the heart of the strategic layer is a prioritized goal stack. Each turn the AI either keeps its current goals or re-evaluates them, and the menu of things it can want is rich:
- earlyExpand: plant N cities before consolidating
- earlyRush: exploit the opening with an aggressive early attack
- infrastructureConsolidation: buildings, population, growth
- militaryPush: sustained warfare against a chosen player
- defensiveWar / counterattack: react to aggression, retake what was lost
- navalInvasion: assault a distant landmass
- wonderRace, scienceVictoryPush, scoreOptimisation: the peaceful victory paths
- raidWar, asymmetricWar: economic harassment instead of conquest
- warPreparation, nuclearFirstStrike, recovery: the situational specials
Goals don’t fire on rigid rules; rather, they’re scored against each other and the highest-utility ones win. The scoring blends several signals:
- Proximity. How far is the nearest enemy city? Distant neighbours (≥14 hexes) push the AI toward peaceful expansion; close ones (≤4 hexes) pull it toward military goals. Geography shapes temperament.
- Force balance. Am I winning the simulated battles? Losing exchanges suppress military goals and inflate defensive ones.
- Catch-up. Falling behind on city count inflates expansion scores so a boxed-in AI tries harder to grow.
- Opportunity. Multipliers derived from how every met rival is currently behaving (more on that below).
Every score is then multiplied by a personality weight. Roughly:
    function scoreGoals(player, world, personality):
        scores = {}
        for goal in CANDIDATE_GOALS:
            base = goal.baseValue(player, world)
            world_factors = proximity × forceBalance × catchUp × opportunity
            scores[goal] = base × world_factors × personality.weightFor(goal)
        return sortDescending(scores)
    # e.g. early-expand ≈ base × siteRatio × proximityAdj × catchUp × personality.expansion
Two of those terms are about the world; one is about who this civ is. That’s how the same evaluation function produces a cautious turtle and a rampaging horde.
The top goal (priority 0) drives the turn. Secondary goals queue behind it, ready to take over the moment an interrupt fires.
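To make the “priority 0 drives the turn, the rest wait for an interrupt” idea concrete, here is a minimal Python sketch. The `Goal` and `GoalStack` classes, their method names and the renumbering scheme are assumptions made for illustration, not Annhexation’s real data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    type: str
    priority: int  # 0 is the goal that drives the turn


@dataclass
class GoalStack:
    goals: list[Goal] = field(default_factory=list)

    def set_goals(self, ranked_types: list[str]) -> None:
        """Replace the stack with goals ranked best-first."""
        self.goals = [Goal(t, i) for i, t in enumerate(ranked_types)]

    def top(self) -> Goal:
        return self.goals[0]

    def interrupt(self, goal_type: str) -> None:
        """Move (or insert) a goal to priority 0 and renumber the rest."""
        rest = [g for g in self.goals if g.type != goal_type]
        self.goals = [Goal(goal_type, 0)] + [
            Goal(g.type, i + 1) for i, g in enumerate(rest)
        ]


stack = GoalStack()
stack.set_goals(["militaryPush", "infrastructureConsolidation", "counterattack"])
stack.interrupt("counterattack")  # e.g. a city was just lost
print([(g.type, g.priority) for g in stack.goals])
```

The point of the sketch is that an interrupt only reorders the stack: the displaced goal is still there at priority 1, so the AI can pick the campaign back up once the emergency is dealt with.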
Reading the opponents
A 4X AI that only looks at its own empire plays in a vacuum. Annhexation’s AI explicitly models every player it has met before deciding who to fight.
The AI profiles each known rival across roughly eleven dimensions, each normalised to [0, 1]:
- militarisation, development, expansionism, techPace
- exposure and coastalExposure (undefended or weakly-garrisoned cities)
- borderTension and aggression (forces massed near our borders, active wars)
- wonderFocus, scienceFocus, and the all-important isRunawayLeader flag
It also tracks trends — rising, flat or falling over the last five turns — so the AI reacts to a rival who is accelerating, not just one who is currently strong. Those snapshots are kept in persistent state so trend detection survives across turns.
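One simple way to classify a five-turn window is to compare its newest and oldest samples, sketched below in Python. The `classify_trend` function, the `epsilon` dead zone and the sample values are assumptions for illustration; the real classifier may well use a different rule.

```python
from collections import deque

WINDOW = 5  # turns of history, as described above


def classify_trend(history, epsilon=0.05):
    """Compare the newest and oldest samples in the window.

    `epsilon` is a made-up dead zone so small wobbles count as flat.
    """
    if len(history) < 2:
        return "flat"
    delta = history[-1] - history[0]
    if delta > epsilon:
        return "rising"
    if delta < -epsilon:
        return "falling"
    return "flat"


# A bounded deque drops the oldest sample automatically.
militarisation = deque(maxlen=WINDOW)
for value in [0.30, 0.32, 0.41, 0.48, 0.55, 0.61]:
    militarisation.append(value)

print(classify_trend(militarisation))
```

Because the deque is part of the persisted state, the window survives between turns and the oldest sample falls off without any bookkeeping.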
A second pass turns those profiles into a war-target ranking. For each rival it weighs:
- Aggression affinity — does attacking this player suit my personality?
- Strength — can I actually win?
- Accessibility — can I even reach them?
- Stability — are they conveniently distracted by another war?
    function scoreWarTargets(rivals, me, personality):
        for r in rivals:
            affinity = personality.aggression × r.borderTension
            winnable = clamp(myStrength / r.militarisation)
            reachable = 1 / (1 + travelCost(me, r))
            distracted = r.aggression_elsewhere
            r.score = affinity × winnable × reachable × (1 + distracted)
        return sortDescending(rivals)
The winner of that scoring becomes the target of a militaryPush, and the magnitude feeds back as an opportunity multiplier into goal evaluation. An exposed, accessible, distracted neighbour is a temptation the AI is built to notice and exploit.
Personalities and doctrine
Personality in Annhexation isn’t a single “aggression” slider — it’s a vector of about twenty weights (military production, attack appetite, expansion, wonder-building, research, naval production, raid preference, plus early-game tuning like second-city urgency and first-build preference).
On top of that sits the doctrine system — eight civ-specific playbooks that override those weights and the AI’s unit-composition preferences:
| Civ | Doctrine | Signature |
|---|---|---|
| Mongolia | HORSE_RUSH | +50% military production, +50% attack, double raid preference, cavalry-heavy armies |
| Aztecs | WARRIOR_RUSH | +40% military & attack, −20% expansion, melee-heavy early aggression |
| Russia | EXPAND_WIDE | +40% expansion, +30% garrison commitment |
| Rome | INFRA_FIRST | +40% infrastructure, +30% expansion |
| France | WAR_FOR_SCIENCE | +40% research, +30% science-victory focus |
| Greece | STRATEGIST | balanced militarisation across all domains |
| Egypt | TURTLE_WONDERS | +50% wonders & culture, −20% military |
| England | COASTAL_ONLY | +40% naval, +50% coastal-site preference, harbour priority |
Because the doctrine only modulates shared machinery, Egypt and Mongolia run the identical goal-evaluation and combat code — they simply weight it toward completely different ends. Mongolia drowns you in cavalry; Egypt hides behind wonders and culture; England fights for the coastline.
Combined with unique per-civ units, this gives each civ a distinctive personality.
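A doctrine that “only modulates shared machinery” can be pictured as a set of multipliers laid over a common weight vector. The Python sketch below uses percentages from the table above, but the weight names, the neutral baseline of 1.0 and the mapping of “−20% military” onto a single `militaryProduction` weight are all assumptions.

```python
BASE_WEIGHTS = {  # hypothetical neutral personality
    "militaryProduction": 1.0,
    "attack": 1.0,
    "expansion": 1.0,
    "wonders": 1.0,
    "raidPreference": 1.0,
}

# Multipliers taken from the table above; key names are illustrative.
DOCTRINES = {
    "HORSE_RUSH": {"militaryProduction": 1.5, "attack": 1.5, "raidPreference": 2.0},
    "WARRIOR_RUSH": {"militaryProduction": 1.4, "attack": 1.4, "expansion": 0.8},
    "TURTLE_WONDERS": {"wonders": 1.5, "militaryProduction": 0.8},
}


def apply_doctrine(base, doctrine):
    """Return a new weight dict with the doctrine's multipliers applied."""
    modifiers = DOCTRINES[doctrine]
    return {name: value * modifiers.get(name, 1.0) for name, value in base.items()}


mongolia = apply_doctrine(BASE_WEIGHTS, "HORSE_RUSH")
egypt = apply_doctrine(BASE_WEIGHTS, "TURTLE_WONDERS")
print(mongolia["attack"], egypt["attack"])
```

Nothing downstream needs to know which civ it is serving: goal scoring just reads the resulting weights.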
Operational planning: from intent to orders
Once a goal is chosen, the operational layer turns intent into concrete plans.
Unit quotas compute empire-wide demand for each unit class — settlers, workers, garrison, field army, reserve, naval, raiders — each scaled by goals, threat levels, personality and difficulty. During a militaryPush against a walled city, for instance, the garrison quota rises with threat level, melee demand jumps, and siege units become mandatory — you cannot crack walls without them, and the AI knows it.
Unit composition picks the melee/ranged/siege/mounted ratio for an army. Against an unwalled city it loads up on ranged units (free damage); against walls it must bring siege. Doctrine tilts the mix, and resource gating caps it — no horses means no cavalry, no iron means no siege, full stop:
    function targetComposition(target, doctrine, resources):
        if target.walled: mix = {melee: 0.4, siege: 0.4, ranged: 0.2}
        else:             mix = {melee: 0.4, ranged: 0.5, mounted: 0.1}
        mix = applyDoctrineBias(mix, doctrine)  # HORSE_RUSH → more mounted, etc.
        if not resources.horses: mix.mounted = 0
        if not resources.iron: mix.siege = 0
        return normalise(mix)
Attack plans are first-class, multi-turn objects with an explicit lifecycle:
    mustering → gathering → advancing → besieging → assaulting
    ↘ (naval) awaitingTransport → embarking → sailing → landing ↗
Target selection scores enemy cities by proximity (−5 per hex of distance), with bonuses for being unwalled (+15), being a capital (+10), and sitting near iron or horses the AI needs (a big multiplier gated on personality and urgency). It goes for the weakest reachable target first — and it commits.
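The additive part of that scoring is easy to show as runnable Python. This is a sketch under assumptions: the `CityTarget` type and the example cities are invented, the reference point for distance is not specified in the text, and the strategic-resource multiplier is left out because its size is not given.

```python
from dataclasses import dataclass


@dataclass
class CityTarget:
    name: str
    distance: int      # hexes to the target; the reference point is an assumption
    walled: bool
    is_capital: bool


def target_score(city: CityTarget) -> int:
    """Score a city using the additive terms quoted above.

    The strategic-resource multiplier is left out because its size
    is not specified.
    """
    score = -5 * city.distance
    if not city.walled:
        score += 15
    if city.is_capital:
        score += 10
    return score


candidates = [
    CityTarget("Walled capital", distance=8, walled=True, is_capital=True),
    CityTarget("Open border town", distance=7, walled=False, is_capital=False),
    CityTarget("Distant outpost", distance=12, walled=False, is_capital=False),
]
best = max(candidates, key=target_score)
print(best.name, target_score(best))
```

With these numbers the unwalled border town wins: the capital bonus does not make up for the walls and the extra hex of distance.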
City production is a distributed priority queue: high-output cities feed global military needs first, low-output cities backfill settlers and workers. The priority cascade runs upgrades → settlers → garrison → military → naval → workers/roads → buildings → wonders, gated by the active goal.
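The cascade itself can be sketched in a few lines of Python. The category order is the one quoted above; the `demand` and `allowed` inputs, and the idea that goal gating is a simple allow-list, are assumptions. The sketch also ignores the distribution of work between high-output and low-output cities.

```python
CASCADE = ["upgrades", "settlers", "garrison", "military",
           "naval", "workers", "buildings", "wonders"]


def next_build(demand, allowed):
    """Return the first category in cascade order with unmet demand.

    `demand` maps category -> units still wanted; `allowed` is the set of
    categories the active goal permits (the gating step).
    """
    for category in CASCADE:
        if category in allowed and demand.get(category, 0) > 0:
            return category
    return None


demand = {"settlers": 1, "military": 3, "wonders": 1}
# Hypothetical gating: a military push switches off settlers and wonders.
military_push = {"upgrades", "garrison", "military", "naval", "workers", "buildings"}
print(next_build(demand, military_push))
print(next_build(demand, set(CASCADE)))
```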
Research follows the goal: an expanding AI beelines the wheel and animal husbandry. A science-victory AI walks a hardcoded path toward rocketry while a warring AI weights military techs. It searches the prerequisite tree but abandons paths longer than three techs — no hundred-turn detours. In theory!
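The “abandon long paths” rule might look something like the Python sketch below. The tech tree slice is invented (only the wheel and animal husbandry are named in the text), and counting every missing tech toward the limit of three is one interpretation of “paths longer than three techs”. It also assumes the tree has no cycles.

```python
MAX_PATH = 3  # give up on targets needing more than three techs

# Hypothetical slice of a tech tree: tech -> prerequisites.
PREREQS = {
    "pottery": [],
    "animalHusbandry": [],
    "theWheel": ["animalHusbandry"],
    "horsebackRiding": ["theWheel"],
    "chivalry": ["horsebackRiding", "feudalism"],
    "feudalism": ["pottery"],
}


def research_path(target, known):
    """Return missing techs in a researchable order, or None if too long."""
    path = []

    def visit(tech):
        if tech in known or tech in path:
            return
        for prereq in PREREQS[tech]:
            visit(prereq)
        path.append(tech)  # prerequisites first, then the tech itself

    visit(target)
    return path if len(path) <= MAX_PATH else None


print(research_path("horsebackRiding", known={"pottery"}))
print(research_path("chivalry", known={"pottery"}))
```

The first target needs exactly three techs and is accepted; the second needs five and is dropped, which is the “no hundred-turn detours” behaviour.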
Worker management plans and caches road routes between cities and strategic resources, invalidating them when borders flip. Bottleneck detection explicitly diagnoses why military modernisation is stalled — waiting on a tech, lacking road access to iron, missing currency for trade — and escalates urgency the longer the bottleneck persists.
Tactical execution: a turn, phase by phase
When the planning is done, the AI executes the turn as an ordered sequence of phases. Roughly:
- Event detection & city-loss response (compare against last turn's snapshot)
- Emergency garrison fill (enemy standing on a city tile)
- Unit upgrades & recalls
- Retreats (pull damaged units that aren't committed)
- Combat (city defence first, then general)
- Naval invasion lifecycle (drive the beachhead state machines)
- Settler escorts & transport convergence
- Army movement (via the movement planner)
- Build orders (worker tasks, roads)
- Diplomacy (trade, war declarations)
- City Defence Commander (per-city garrison assignment)
- Government & tech completion
- Fortification & hidden-unit setup
A few pieces deserve a closer look.
- Combat simulation estimates each attack before committing: attack strength (scaled by a difficulty-dependent effectiveness multiplier) versus defence strength (garrison, terrain and fortify bonuses), turned into a win probability and an expected HP loss.
- On higher difficulties, combat phasing models ranged-fires-first, melee-counterattacks, melee-finishes — so the AI understands the value of softening a target with archers before the melee goes in. On Easy, that phasing is switched off, dumbing the AI down on purpose.
    function shouldAttack(attacker, defender, difficulty):
        atk = attacker.strength × difficulty.combatEffectiveness
        def = defender.strength × terrainBonus × fortifyBonus × garrisonBonus
        winProb = clamp(0.5 + (atk - def) × 0.1, 0, 1)
        return winProb ≥ attacker.riskTolerance
- Movement shares a context across all units so two units never plan into the same tile (no accidental stacking). It uses strategic pathing with an A* fallback, plus anti-oscillation rules: it won’t step back onto a tile it occupied in the last couple of turns unless it’s hurt or there’s an enemy adjacent, which kills the classic “AI unit jitters back and forth forever” bug.
- Retreat pulls units below an HP threshold (50% on Easy, down to 20% on Deity) or when outnumbered 2:1 nearby, but garrisons never retreat, assault-committed units only break below 15%, and loaded transports never run. Commitment is respected.
- The City Defence Commander automates each threatened city’s garrison through its own little state machine (reinforcing → defending → critical → secure), tracking the local force balance and issuing movement orders to defenders. Cities defend themselves intelligently without the strategic layer micromanaging every hex.
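The two movement rules, a shared reservation set and a ban on recent backtracking, fit into one small check. This Python sketch is an illustration only: the dict-based unit, the `memory` window of two turns and the `hurt_below` threshold are assumed values, not the game’s.

```python
def can_step(unit, tile, reserved, enemies_adjacent, turn, memory=2, hurt_below=0.5):
    """Decide whether `unit` may plan a move onto `tile` this turn.

    `reserved` is the set of tiles already claimed by other units this turn.
    `unit["visited"]` maps tile -> last turn the unit stood there.
    `memory` and `hurt_below` are illustrative values.
    """
    if tile in reserved:
        return False  # another unit already planned into this tile
    last_seen = unit["visited"].get(tile)
    recently_left = last_seen is not None and turn - last_seen <= memory
    if recently_left:
        hurt = unit["hp"] < hurt_below
        # Backtracking is only allowed when hurt or threatened.
        return hurt or enemies_adjacent
    return True


reserved = set()
scout = {"hp": 1.0, "visited": {(4, 4): 9}}

for tile in [(4, 4), (5, 4)]:
    if can_step(scout, tile, reserved, enemies_adjacent=False, turn=10):
        reserved.add(tile)  # claim it so later units plan elsewhere
        break

print(reserved)
```

The healthy scout is refused the tile it left last turn and takes the next candidate instead; that tile is then reserved, so no other unit can plan into it.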
Memory: what the AI carries between turns
None of this multi-turn coherence works without persistence. The AI’s state object is serialised between turns and carries, among other things:
- the goal stack and all live attack plans with their lifecycle state
- unit assignments — which unit is a garrison, a field-army member, a raider, a scout — and what it’s committed to
- the border model, classifying cities as capital / frontier / critical / interior and tracking tension per neighbour
- posture snapshots (five turns of history), grievances, pending attacks and city-defence commands
- cached road routes, resource-access graphs, and settler journey state
- the IDs of cities we’ve lost, so a counterattack knows what to retake
- a full snapshot of last turn for event detection
That last point drives the AI’s reactivity. Each turn it diffs the current world against last turn’s snapshot to spot captured or lost cities, fresh war declarations, lost wonders, completed techs, detected nukes, and pillaged tiles. Any of these can fire an interrupt that pre-empts the current goal — lose a city and the AI drops what it was doing to respond; lose your capital and counterattack jumps the stack.
    function detectEvents(aiState, world):
        prev = aiState.snapshot
        for change in diff(prev, world):
            if change is CITY_LOST: raise Interrupt(counterattack, change.city)
            if change is WAR_DECLARED: raise Interrupt(defensiveWar, change.by)
            if change is NUKE_DETECTED: raise Interrupt(recovery, change.where)
            ...  # wonders lost, tiles pillaged, techs done
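For one event type, the diff step can be written out as runnable Python. The snapshot shape here (a plain map of city ID to owner) and the `CITY_CAPTURED` event name are assumptions; `CITY_LOST` matches the pseudocode above.

```python
def diff_city_ownership(prev, curr, me):
    """Compare two {city_id: owner_id} maps from consecutive turns."""
    events = []
    for city_id, old_owner in prev.items():
        new_owner = curr.get(city_id)  # None if the city no longer exists
        if new_owner == old_owner:
            continue
        if old_owner == me:
            events.append(("CITY_LOST", city_id))
        elif new_owner == me:
            events.append(("CITY_CAPTURED", city_id))
    return events


last_turn = {"city_1": "player_4", "city_2": "player_4", "city_42": "player_2"}
this_turn = {"city_1": "player_4", "city_2": "player_2", "city_42": "player_4"}
print(diff_city_ownership(last_turn, this_turn, me="player_4"))
```

Because the comparison only needs last turn’s snapshot, the AI does not have to subscribe to engine events; it can reconstruct what happened from state alone.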
Difficulty: honest tuning plus a few sanctioned cheats
Difficulty in Annhexation is partly competence and partly bonus — and the line between them is deliberate.
| | Easy | Normal | Hard | Deity |
|---|---|---|---|---|
| Production / Research / Gold | 0.8× | 1.0× | 1.15× / 1.1× / 1.1× | 1.3× / 1.25× / 1.2× |
| Combat phasing & focus fire | off | on | on | on |
| Will retreat | no | yes | yes | yes |
| Combat effectiveness | 0.95× | 1.0× | 1.08× | 1.15× |
| Decision accuracy | ~60% | 100% | 100% | 100% |
| Strategy re-evaluation | every 20 turns | 12 | 10 | 8 |
So an Easy AI isn’t just weaker — it genuinely plays worse: it makes suboptimal choices more often, doesn’t phase its combat, doesn’t retreat damaged units, and reconsiders its strategy only sluggishly. A Deity AI plays the engine to its full ability and gets economic bonuses on top.
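One plausible reading of “decision accuracy” is that the AI picks its best-scored option with that probability and a worse one otherwise. The Python sketch below takes that interpretation; the `choose` function, the fallback to a random lesser option and the example scores are assumptions, and only the accuracy figures come from the table.

```python
import random

DECISION_ACCURACY = {"easy": 0.6, "normal": 1.0, "hard": 1.0, "deity": 1.0}


def choose(options, score, difficulty, rng):
    """Pick the top-scoring option, or deliberately misjudge on lower accuracy."""
    ranked = sorted(options, key=score, reverse=True)
    if len(ranked) == 1 or rng.random() < DECISION_ACCURACY[difficulty]:
        return ranked[0]
    return rng.choice(ranked[1:])  # a plausible but suboptimal choice


rng = random.Random(42)  # seeded so test runs are repeatable
options = ["attack", "fortify", "retreat"]
value = {"attack": 0.9, "fortify": 0.5, "retreat": 0.1}.get

picks = [choose(options, value, "easy", rng) for _ in range(10_000)]
print(picks.count("attack") / len(picks))
```

Passing the random generator in, rather than using the global one, keeps headless test games reproducible.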
The higher difficulties also unlock a small, clearly-scoped set of adaptive cheats: a fog-of-war peek at rival posture, conditional production boosts while pursuing a goal, completion boosts on the home stretch of a wonder or spaceship, and an increased chance of coordinating a joint attack with another AI. These are bonuses with a purpose rather than omniscience.
What it’s optimised for
The Annhexation AI deliberately trades short-term tactical perfection for long-term strategic coherence. Its unit movement is somewhat greedy; it will occasionally make a locally-suboptimal step. But it musters real armies, plans amphibious invasions across several turns, reads which neighbour is weak and accessible, holds a campaign together through a dozen turns of grinding siege, and reacts when you take one of its cities.
The architecture is what makes that possible: a sticky goal stack on top, multi-turn plans in the middle, flexible greedy execution at the bottom, and a persistent memory threading it all together — with personality and difficulty as multipliers reaching into every layer. The result is eight civilizations that feel different, four difficulty levels that genuinely play differently, and an opponent whose intentions you can usually see coming. Stopping them is the game.
Testing and tools
It doesn’t take long to realise that working on the AI means analysing a lot of games and a lot of data. You need to see why it did something. As the AI grew in complexity I kept ending up with units sitting idle, units oscillating between two positions, hopeless attacks, and settlers refusing to found cities. Any of these can be caused by the interplay between the complex set of rules the AI follows and the situations that develop on the map.
So you need instrumentation, a way to interrogate the AI, and a way to play more games than you humanly can. At least as a solo developer!
A big chunk of the work therefore turned out to be not the AI itself but the tools that let me run it and interrogate it.
A headless CLI for batch simulation
Playing the game by hand to test the AI is hopeless — turns are slow, and you need hundreds of them across many games to spot patterns. So there’s a command-line testbed that runs all-AI games with no rendering and no human in the loop:
    testbed new --map continent --difficulty deity --players 6   # create an all-AI game
    testbed run <gameId> --turns 250 --snapshot-every 10         # advance it, headless
    testbed inspect <gameId>                                     # one-shot state summary
    testbed list                                                 # all games + winners
run advances a game by N turns as fast as the machine will go, printing per-turn progress and bailing early if someone wins. inspect dumps a per-player table — civ, city count, unit count, gold, current research, alive or dead — and list shows every game in the diagnostics directory with its current turn and winner. This is what turns “I think the Mongolian AI rushes too hard” into “I ran forty games and Mongolia wins by turn 90 in thirty of them” — the difference between a hunch and a regression test. Everything is stored in a per-game directory (state.json, ai-states.json, a run.log of notable events like cities founded and wars declared) ready for inspection.
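Turning a batch of results into a number like “wins by turn 90 in thirty of forty games” is a small aggregation. The Python sketch below works on in-memory records whose field names (`gameId`, `winner`, `winTurn`) are hypothetical; it does not reflect the real layout of state.json.

```python
from collections import Counter

# Hypothetical per-game summaries, e.g. collected from many `testbed run`s.
games = [
    {"gameId": "g1", "winner": "mongolia", "winTurn": 84},
    {"gameId": "g2", "winner": "egypt", "winTurn": 231},
    {"gameId": "g3", "winner": "mongolia", "winTurn": 88},
    {"gameId": "g4", "winner": None, "winTurn": None},  # no winner by the turn cap
]


def early_win_rate(games, civ, by_turn):
    """Fraction of all games that `civ` won on or before `by_turn`."""
    early = sum(
        1 for g in games
        if g["winner"] == civ and g["winTurn"] <= by_turn
    )
    return early / len(games)


print(Counter(g["winner"] for g in games if g["winner"]))
print(early_win_rate(games, "mongolia", by_turn=90))
```

A check like this can run after every tuning change, so a doctrine that suddenly starts winning too early shows up as a number rather than a feeling.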
An in-browser testbed and AI inspector
The CLI is great for volume but blind to space — it can’t show you that the army is stuck because a single enemy scout is sitting on the only bridge. For that I run all-AI games inside the actual client. When a game has no human player the normal “End Turn” button is replaced by a testbed panel: buttons to advance 1, 5, 10, 20, 50 or 250 turns, and a “view as” dropdown that swaps the map’s fog-of-war filter so you can watch the game unfold from any AI’s perspective.
Layered on top of that is an AI inspector: select any AI unit or city and it surfaces the same internal state that the JSON logs hold, but anchored to what you’re looking at on the map:
- the player’s goal stack: each goal’s type, priority, whether it’s active or blocked, the turn it was created, and goal-specific detail (militaryPush vs player_2 → city_42, scienceVictory: 4/4 parts, 5 techs left)
- live attack plans: target city, lifecycle state (gathering → besieging → assault), unit fill (5/8 units, siege needed) and rally point
- the selected unit’s assignment: role, commitment, target, the turn it was assigned, and the plan it belongs to
- the selected city’s classification (interior / border / coastal), the goals that involve it, and its garrison strength
- the border model: per-rival tension, culture pressure with turns-to-flip, chokepoint counts
- the personality weights that are notably high or low
Turn-by-turn decision logs
Underneath both of those is the thing I lean on most: every AI writes a complete, structured record of its reasoning every single turn. Point an environment variable at a directory and each turn produces a pretty-printed JSON file per AI player — turn-014-mongolia.json and a companion full-state ai-state-014-mongolia.json.
These aren’t log lines; they’re a forensic snapshot of the entire decision. A single turn file captures the goal stack with its scores, the posture and opportunity score it assigned every rival, every city’s production and classification, every unit’s assignment (role, target, commitment, position, HP), the active attack plans — and, crucially, a command trace: an ordered list of every command the AI issued that turn, tagged with the phase that issued it, and success: true or a blocked reason straight from the engine. So when a move silently does nothing, the log tells you the engine rejected it and why.
There are dedicated traces for the gnarly subsystems too: a combat trace of every simulated fight, a naval lifecycle narrative for debugging amphibious invasions (the single most fiddly thing in the whole AI), and a citySiteDecisions list recording every settle attempt and its outcome: accepted, too-close-to-foreign-city, food-tiles-short, on-foreign-landmass-blocked. That last one is the cure for the maddening “why won’t this settler settle?” bug: the answer is right there in the file. Here’s a heavily trimmed example of the JSON from one turn:
    {
      "turn": 18, "playerId": "player_4", "civilisation": "greece",
      "doctrine": "STRATEGIST", "difficulty": "hard",
      "goals": [
        { "type": "earlyExpand", "priority": 0, "status": "active", "createdOnTurn": 11,
          "targetCityCount": 4, "settlerCount": 0,
          "bestSites": [
            { "q": 23, "r": 20, "totalScore": 111.4, "penalties": 0 },
            { "q": 25, "r": 19, "totalScore": 109.6, "penalties": 0 }
            /* … 277 more, descending … */
          ] },
        { "type": "infrastructureConsolidation", "priority": 1, "status": "active" },
        { "type": "warPreparation", "priority": 2, "status": "active",
          "targetPlayerId": "player_1", "targetForceSize": 4, "currentForceSize": 3 }
      ],
      "postures": {
        "player_2": { "militarisation": 0.69, "isRunawayLeader": true, "borderTension": 0.27 }
      },
      "cities": [
        { "name": "Athens", "population": 2, "production": "library", "classification": "border" }
      ],
      "commandTrace": [
        { "step": "10", "command": "moveUnit", "unitId": "unit_14", "role": "worker",
          "from": "25,23", "to": "26,23", "success": true },
        { "step": "10", "command": "buildImprovement", "unitId": "unit_14", "success": true },
        { "step": "16", "command": "endTurn", "success": true }
      ]
    }
The workflow ties together neatly. Run a few hundred turns headless with the CLI; spot a game that went wrong in the list output; either replay it in the browser with the F3 inspector or crack open the turn-N JSON and read, in order, exactly what the AI was thinking and what the engine let it do. Most of the “the AI is being dumb” moments turn out to be one specific, fixable thing — and these tools are how you find it instead of guessing.
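As an example of “cracking open” a turn file, this Python sketch pulls every rejected command out of a command trace. The `commandTrace`, `step`, `command`, `unitId` and `success` fields follow the sample above; representing a rejection as `"success": false` with a `blockedReason` field is a guess at the format, and the second trace entry is made up.

```python
import json

# A cut-down turn record; `blockedReason` is a guessed field name.
turn_json = """
{
  "turn": 18,
  "commandTrace": [
    {"step": "10", "command": "moveUnit", "unitId": "unit_14", "success": true},
    {"step": "10", "command": "moveUnit", "unitId": "unit_9", "success": false,
     "blockedReason": "tile occupied"},
    {"step": "16", "command": "endTurn", "success": true}
  ]
}
"""


def blocked_commands(record):
    """Return (step, command, unitId, reason) for every rejected command."""
    return [
        (c["step"], c["command"], c.get("unitId"), c.get("blockedReason", "unknown"))
        for c in record["commandTrace"]
        if not c["success"]
    ]


record = json.loads(turn_json)
for row in blocked_commands(record):
    print(row)
```

Run over a whole directory of turn files, a filter like this quickly shows which phase keeps issuing commands the engine refuses.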
Conclusions
Creating an AI for a 4X is definitely quite an undertaking. It’s pretty easy to get units moving around, but getting the AI to act in ways that are both interesting and credible takes a lot of effort. It’s not that the code is complicated, but that there is so much interacting that small changes can have hard-to-predict second- and third-order effects.
I spent countless hours on things that seem simple on the face of it (“stop a unit from oscillating between A and B”) but turn out to be really rather complex. Yes, you can put in guards (“don’t do this”), but the guards themselves can have unforeseen effects and don’t fix root problems.
You also can’t automate all this away. Yes, you can create test cases, and yes, you can have the AI play countless games against itself, but an AI isn’t a human, and it’s the human the AI has to respond interestingly to.
I’ve released Annhexation into early access now, and the primary reason for that is the AI. I need more people to play it so that I can resolve the issues that will inevitably emerge.
If you’d like to give it a go you can play it online, for free, now.