Civ Evo is a turn-based war game. I will compare it to chess. See Starting from a general-purpose AI for a more technical description. For precise figures see also the Civ Evo in-game manual.
If you are already familiar with Civilisation 2 you might like to skip this and view some of the differences.
For an AI player the turn consists of a sequence of two or more segments. Each segment has one irreversible operator (applied automatically or by the AI) and zero or more reversible operators (applied by the AI).
Like chess, players take turns to move their pieces. Unlike chess, in Civ Evo you can usually move all of your pieces every turn. Friendly pieces may occupy the same square. Some types of pieces have special moves. City pieces cannot move (but have other operations).
The normal move operator changes the location of a unit to one of the eight adjacent tiles and reduces the movement points of the unit depending on the terrain type and tile improvements of the source and destination tiles. Movement can reveal information. Movement is irreversible (unlike many operators).
A unit which did not move in the previous turn fortifies (takes up a defensive position). Fortified units have higher effective defense.
Because there is a significant attacking advantage in moving all of your pieces before opponents can respond, there is a game rule to reduce this. The zone of control for a particular player includes squares with their city and unit pieces and adjacent squares (ZOCs may overlap). It is illegal to move a piece from a square in an opponent's ZOC to another square in that opponent's ZOC unless you have a unit or city piece in the target square. Some types of units ignore ZOC rules.
In traditional chess each square of the board is black or white but this does not affect movement - in Civ Evo the squares have terrain types which affect the cost of moving into them and of attacking ("capturing") a unit located in them.
Each unit piece has a movement rating - a move is illegal unless the remaining movement points are sufficient. Some types of units are not affected by terrain movement cost and some may not move onto some types of terrain.
Unlike chess, capturing can fail. For a successful attack the attacker remains in it's original square. It may attack again if it has remaining movement points (for most units an attack costs 100 movement points). Each type of unit has an attack rating and a defense rating. The following factors affect this:
(Attacker or defender)
The net bonus is the sum of each bonus. Effective strength is calculated as the product of rating, net bonus, health and movement.
Movement = min( 100, MovementPoints ) EStrength = Rating * Bonus * Health * Movement
Movement is always 100 for the defender irrespective of movement points.
If the effective strengh of the attacker is greater than that of the defender then the attacker is the victor otherwise the defender is the victor.
An attack results in the removal of the loser and damage proportional to the ratio of effective strengths on the victor. Some special capabilities reduce or increase damage.
Damage% = int( Loser.Health% * Loser.EStrength / Victor.EStrength )
Damage is subtracted from the victor's health. Damage may not be greater than or equal to the victor's health; damage must be at least 1 (unless the victor's health is 1).
Damage is measured as a percentage. A unit can repair some damage by resting (ie not moving for a turn). Damage is repaired more quickly on a square with a city piece which has an appropriate barracks improvement.
There are restrictions on the type of pieces each type of piece may attack.
One piece defends, usually the one with the highest effective defense. Some special abilities make a unit preferred for defense.
If there is not a city piece on the defender's square or a fort or military base then if the combat is successful (ie the defender is destroyed) all units on the same square will be destroyed.
In Civ Evo the board is called the map - a diamond or quadrilateral rotated through forty-five degrees with East and West edges joined (like a map of the Earth). By convention, location 0 is at the North Pole; Location 1 is East of location 0. The coordinates use a nonlinear scale (see the AI Developer Manual).
The standard (100%) Civ Evo map is 50x60. Unlike Chess, squares which have not been observed are hidden - the terrain and contents is unknown (there is a finite number of possible states).
Squares within a city radius or adjacent to a unit are observed - you can see their current state. The terrain of a square is "remembered" after an observation (however it can change by player or game action). If a player possesses the MIR space station wonder the whole map is observed.
City pieces cannot move. They have special options: set/buy production, sell improvement, set tiles...
You typically begin with no city pieces. City pieces are created by transforming settler pieces (the 'Found City' move).
Each turn the state of a City changes. Food, trade and resources are produced.
New unit pieces are created by cities producing them. A city can produce a unit piece (or an improvement or Wonder or National Project) by producing (or buying) enough resources to complete construction of it. Buying production is irreversible and may only be applied once per city per turn.
Buildings (improvements, Wonders, National Projects) affect the attributes of the city. Some improvements have a maintenance cost which must be paid from the player's treasury at the start of the turn.
Only one player can possess each Wonder.
Each improvement a city has built may be "sold". The treasury is increased by a fixed amount. This operator is irreversible and may only be applied once per city per turn.
The production of a city depends on it's tile selection. A city utilises it's tile and upto one tile per worker from a set of twenty adjacent tiles. Only one city (of all players) may use each tile.
Tile production depends on terrain type and terrain improvements (unit pieces with the settler ability have special moves to make terrain improvements).
A city produces food, trade and resources depending on the tile selection and it's current improvements (and any Wonders or National Projects the player possesses).
Food is used immediately to feed the citizens (each turn) at a rate of 2 units per citizen. If there is excess food it is stored and when there is enough (citizens*20), the number of citizens increases (and the food store is emptied unless there is a granary improvement in which case it is only half emptied).
If there is not enough food the store is depleted. If it is exhausted then the number of citizens reduces.
Trade is allocated between taxation, research and luxury production. A player can specify the proportion to allocate to each as a multiple of 10% (the total for all three must sum to 100 obviously). If the player is in debt (money is less than 0) tax is automatically set to 100% and may not be changed.
Taxation increases your treasury (shared between all cities). It is illegal to buy production for a city unless you have sufficient money in your treasury. Some improvements have a maintenance cost per turn which is deducted from your treasury at the start of the turn for each instance. If your treasury runs too low during maintenance then the improvement is sold.
Research allows you to discover new technologies or design new unit classes. When sufficient research accumulates the new technology or unit class becomes available and a new target can be selected. The cost is proportional to the number of technologies and unit classes the player possesses. Excess research is carried over to the next research project.
Civ Evo 0.81 Advance Chart (130kb PNG) by Steffen Gerlach.
Luxuries reduce the number of unhappy citizens in each city which can increase the number of workers.
Resources are required to build city projects. Each turn, resource production accumulates until there is enough for the project. When the current project completes upto 10 excess resources can be carried over to the next project.
You need to have some programming skills and preferably experience with medium to large software projects and AI development.
You need Delphi (Object Pascal) skills as the main example code is written in Delphi. As more developers create AIs more languages might be supported (porting suggestions). Civ Evo is still in development so the AI interface and the game rules are subject to change.
Suggested development sequence:
The AI interface used by the main example code is protocol.pas and AIClasses.pas written in Delphi.
If you would prefer to use a language other than Delphi and you would like to be able to reference the example code (strongly recommended) then you need to port these files. And maintain them for each Civ Evo update (ie if protocol.pas or AIClasses.pas change). Some AIs are written in C++ - there is a C++ version of protocol and AIClasses available from the Civ Evo files section.
The C template (in the AI developer subdirectory) is minimal (and possibly complete but very reduced compared to the Delphi interface with AIClasses) so you could do a fast port of that to test if you can get the server interface working (ie if your compiled DLL is loaded correctly and can execute server commands).
The basic client/server interface depends on the ability to pass a pointer to an unspecified type and cast this to a specific type depending on the command. A possible Windows type for this would be LPVOID. You could also implement this as a pointer to a variable type (C:Union, Delphi:Variant) which is one of a,b,c... (Where a,b,c... are the types for each client and server command).
Also your client requires the ability to call a function (the server) whose address is defined at runtime by program code (eg by using a functional variable or a pointer to a function or a function reference etc).
The protocol is supposed to consist of only constant data and no code. (This is not true for 0.7.1! But I don't think the code is essential - it just leaked out of AIClasses for some reason!)
The constant data declarations use: (simple types are integer types of various byte sizes [1,2,4])
The type specifications use:
AIClasses declares:
The methods in AIClasses (of the objects and class) use:
With no expert AIs available it might be less effort to adapt a predesigned AI than to write one from scratch or develop the existing AIs. Also it's interesting to compare the strength of different AI designs.
As an AI problem Civ Evo has these properties:
A Total War victory occurs when all the cities and units of all opponents are captured or destroyed.
A Starship Victory occurs when one player successfully builds a spacecraft.
There could be other types of victories. There could be a time limit in turns. The default turn limit is 800 for Civ Evo 0.8.
The number of turn permutations (sequences of actions) can be more than 1000! (1000*999*998...*1) There are fewer local states.
As a strategy war game Civ Evo is (essentially) two-dimensional with two basic terrain types (land, ocean), discrete movement, instantaneous action (no delay between orders and actions for military units), zones of control, damage to victor, defensive and rough terrain (some specific types of terrain improve defense and/or affect movement), deterministic combat, (usually) limited sensing (fog of war), melee weapons (no long-range fire), unit design, unit and base (factory) production, fortifications (forts, city walls), terraforming.
You could make an interface to the Civ Evo client interface. If the source AI is still in development you might want the option to add updates to the Civ Evo version. Civ Evo AIs are not meant to cheat - the interface doesn't support AIs having information not available to human players. Calling Civ Evo server functions is slightly more expensive compared to an AI embedded in the main program.
Civ Evo doesn't presently implement espionage.
There are a few other differences from Civ2 Total War games:
The main source of suitable AIs is Civ(2) clones. There are several open source clones.
It's improbable that you could take a general purpose games-playing AI and make a reasonably fast Civ Evo AI without significant customisation. (But even an AI too slow for competition could be used to test developing AIs if it was competent in some areas.)
With Civ games still in commercial development it seems unlikely that a polite approach to MicroProse or Firaxis would produce anything substantial.
Repeatability means that an AI will make the same moves every time if the circumstances are the same. Obviously being predictable would be a disadvantage. But random number generators can produce a repeatable sequence (which can be unique for each game) so repeatability does not require predictable behaviour.
Fortunately it's easy to save the random number seed (which determines the sequence) at the start of the game.
Repeatability is valuable for debugging and competitions.
Suppose that you find a bug which you think is in Civ Evo. You send your AI and the save file to the Civ Evo administrator but if the AI is not repeatable it might be impossible for the administrator to reproduce the error. (This problem also applies if someone sends a save file to you pointing out a possible error in your AI - you might not be able to reproduce the error unless your AI is repeatable.)
In a competition you probably want to see how your AI did in each game. But if your AI (or a competing AI) is not repeatable you will not be able to watch the competition games because when you rerun the save file the game will not be the same. It's important for transparency that a competition host can offer proof of game results. How would it look if the host's AI won and they couldn't offer proof?!
Persistent data is data which is saved when a game is stopped and restored when a game is loaded. You can determine which data is persistent by checking if the AI's behaviour would be different if the data were re-initialised at the start of each turn. (Data which is not persistent should be re-initialised before it is used eg each time a game is started or loaded.)
Civ Evo has the more stringent requirement that persistent data should be restorable to any turn in the game. Which means that it should be saved every turn it changes in a form which can be reconstructed practically (in a reasonable time and using a reasonable amount of memory).
A possible solution is to save all the persistent data every turn. However this could lead to very large save files. I think that this might be the only practical solution for complex data - Civ Evo needs to compress save files if they are to be restorable to an arbitrary turn.
Characteristically in the field of Artificial Intelligence these algorithms are optimal but are hard (or impossible) to make a reasonable implementation for - although approximate (heuristic) methods might give adequate results.
One algorithm which can find the best move in any circumstance is the minimax algorithm.
Briefly, this involves looking ahead for each possible move by each player until a terminal state (eg someone wins) is reached. You can then compare each possible current move depending on the game result(s) it leads to.
Minimax is feasible to implement for simple games such as noughts-and-crosses (tictactoe).
A more constrained version (popular in chess AIs) is a limited horizon version. Instead of scanning the possible moves until the end of the game (which would take longer than the expected life of the universe for chess or Civ Evo), only the possible moves for (say 5) moves ahead (by each player) are examined. A heuristic evaluation function is used to estimate the value of the result states (because for most of the game a terminal state is unlikely to result after only five moves).
Civ Evo has many more possible moves per turn than chess and uncertainty. For this reason it would be a great achievement to implement non-trivial one turn look-ahead. Unit movement (ie combat) is where the most benefit from lookahead could be realised. It might be feasible to implement several turns combat look-ahead if the scope (eg the number of units and the area) could be constrained.
By definition an imperfect playing algorithm is an approximation of minimax.
Here is an example high-level categorisation of unit activities:
| Activity | Priority |
|---|---|
| Improve/terraform | Value of increased production |
| Found | Value of new city |
| Observe | Threat/opportunity from unobserved regions |
| Defend | Threat, cost of losing control over a region or city |
| Attack | value of new city or region; cost to opponent |
| Transport | Priority of activity which requires transport |
If the categorisation is complete it will allow optimal unit management.
Make a unit when the benefit of the highest priority activity outweighs the cost of the unit. (Conversely destroy units when the cost outweighs the maximum benefit.)
The best unit to execute an activity will vary (ie the cost of an activity is different for different units).
Apart from the problem of constructing a linear scale of value for production, city possession and threat, the problem implementing this is that there can be many units and many (potential) activities. Assigning each unit the best choice of activity each turn could be an expensive calculation (a parallel travelling salesperson problem if the units need to move to the job sites). Optimal assignment could require looking ahead to future turns (for future activities) - the minimax problem.
A candidate for a uniform value scale is (potential) tile production.
It is important to choose the optimal sequence of improvements for a city in order to maximise production.
This can be achieved by scheduling the completion time of each improvement based on long-term value. (Selling of improvements can also be scheduled like this).
The scheduling problem becomes the minimax problem for a long enough horizon - what to build in one hundred turns depends on the activities of opponents (in fact the long-term benefit of any improvement depends on opponents).
Trade can increase your chances of victory. In games with a time or turn limit you can win more quickly by trading (reducing the risk that you will run out of time/turns). When two players exchange trade items each can increase their chances of victory. Even if there are only two players in the game they can both gain if there is a risk of a draw.
Trade proceeds by the current player making an offer to exchange X for Y (X or Y can be nothing or a list of trade items). How should you decide what to request (Y)? How should you decide what price to offer (X)?
For a profitable trade you should request trade items which increase your chances of winning at least as much as the price you offer. When there is already a large difference in chances of victory (eg you are most likely to win) you could be prepared to offer a higher price.
The precise value of each trade item will depend on the victory conditions. For example, if the victory condition is that you hold a specific city at the end of the turn limit then you should pay almost any price for it as the turn limit approaches. (Conversely if you already hold it, you should pay almost any price for Peace).
For Total War Victory conditions, larger army, more cities, more advances, more Wonders, (alliances in games with more than two players) tend to increase your chances of victory.
For Spaceship Victory conditions, being first in research, controlling special resources and having high production are important.
Here are several ways to measure values and costs.
| Trade Item | Benefits | Costs | Cost to obtain |
|---|---|---|---|
| Knowledge of Advance | Reduced research cost. | (Reveals research preferences.) | Half the science necessary to research the advance; Half the turns necessary for research; Delaying of other advances. OR cost of stealing. |
| Map | Knowledge of enemy cities. More efficient pathfinding, city siting, attack, defense. | - | Resources and turns to make (or allocate or bribe) explorers/attackers and explore. |
| Model | Ability to make the unit class. | (Increased research cost) | Science necessary to research the upgrades and features; (Resources to build/capture Wonders for Wonder models/upgrades/features); Science necessary to research the model; Turns necessary for research. |
| City | The production of the city; A base for units and improvements; Increased observation range; Supported units; Increased efficiency of some Wonders. | Maintenance for improvements; Additional defense requirements | Cost of a settler; Cost of moving settler to site; Cost of building improvements and supported units; Cost of maintaining improvements; -Gain of production; Turns to develop equivalent city. OR Cost of capturing/bribing |
| Unit | (value eg extra defender, attacker, settler, transporter); Utilization value. | Support, (Happiness cost) | Cost of the unit; Turns to make it; Turns to move it to the location; (Cost of researching the unit class.) OR cost of bribing |
For physical items (vs information eg advances) a benefit for the buyer is a "cost" for the seller because they lose the item. While a cost for the buyer is a "benefit" for the seller. The location of the item can affect it's value eg if obtaining a city reduces the perimeter:area ratio then defense requirements are reduced.
For information items eg Map, seller and buyer know that they possess the information.
Effective learning means changing your behaviour given new information. Information can be observed or deduced or traded during a game or by post-mortem analysis of sequences of turns from complete games.
The two perspectives on the game world are the player view (what a specific player can see) and the superviser view (everything visible).
If there is a communication protocol the player view might be extended by information from other players (eg shared maps) although a complete communication protocol could allow lies!
Usually competition games will not allow the superviser view during the game (although this is available to human players it should properly be regarded as cheating).
It is efficacious for learning to examine the complete game in the superviser view - it would be convenient for Civ Evo to provide this facility as the save file format changes frequently. It is desirable for training and testing to do this but it might be disallowed during the current round of a competition. It would be funny to disallow it on the grounds of cheating because the AI author could manually update the AI from watching games in the superviser view achieving the same effect. The reason to disallow learning from competition games is to prevent an AI from changing during a round (if this is a rule of the competition).
Because an optimal algorithm for play is known it is not necessary to use learning for AIs designed to play against AIs unless learning is a more convenient or practical way to approximate it than other methods.
Some authors believe that AI designed to play against humans will not be interesting (in the long-term) unless it learns. It might be possible to design a mixed strategy AI which adapts to opponents and approximates minimax optimally - in which case (depending on the quality of the minimax approximation) it might remain interesting for most human players until better strategies are known (unless humans use a better algorithm than minimax or a better approximation or knowledge not available to the optimal minimax approximation).
The possibility of training an AI to be a good player by playing against a human expert using machine learning techniques is impractical because general learning techniques require very many (1000s) of examples to produce adequate results and it would take too long to play so many games sequentially (although parallelisation is promising if you have hundreds of experts available). However learning in a more constrained area eg attacking/defending a single city with a small number of units might be feasible as the optimal algorithm could be used as the opponent with a learning algorithm attempting to find the best approximation which is reasonably fast.
Due to uncertainty some in-game reasoning will be probabilistic estimates eg the size and models of an opponent's army, the production of an opponent's cities.
Using knowledge gained from previous games implies that the AI maintains a database of information (ie an external file which contains the knowledge). This implies that each user of the AI could have a slightly different version (due to each instance of the AI having different knowledge depending on the history of games played). This is a problem for competitions because it's desirable that the strength of an AI remains constant throughout a round to give clear results. Learning might be disabled in competition games. Although presumably manual updates would be allowed with the usual policy (eg an update will be applied at the start of the next round).
It might be useful to have a facility to collate several databases so that you can rapidly accumulate knowledge by parallel development (eg you could invite users of your AI to e-mail their database to you each week and you could make a collated database available for them to download).
For knowledge about an opponent it is important that you can identify the opponent because generalisations might not apply to all opponents. The server should provide the facility to identify AI opponents (as human players can). It is difficult to identify a specific human opponent. You could raise a dialogue box requesting the name of a user or you could use operating system facilities to get an identifying feature such as the user's login name.
Information should be stored in a separate compartment for each set of opponents. (Each compartment has an entry for each player which is the knowledge about that player given the opponents). For a game with two players the set of opponents is just the opponent. For more than two players the set of opponents is all of the opponents in the game.
If your AI is designed for games with more than two opponents you could have very many possible compartments. For example there are more than one million sets of opponents for games with upto nine opponents. Because no-one can play a million games(!) you should store the compartments as a list of sets which you have seen in a game (not a table of the possible compartments). The specific slot (index) for a player is insignificant but the relative index (ie which player moves first, which moves second etc) may be significant.
It is also useful to dimension the compartments according to the starting conditions (eg the size of the map and landmass). Conditions which depend on terrain will not be available in the player view because the map is not visible initially.
If your AI is designed for more than one type of victory condition it could be useful to dimension compartments for each victory condition.
A problem with this approach is that you can end up with few records in each compartment (generalisation requires more than one example). It might be worthwile to average the knowledge about an opponent over all the relevant opponent sets (eg close starting conditions, similar number of players etc).
When you have knowledge about several sets of opponents you could try to generalise to abstract high level principles (for example you could try to find a function which relates the starting conditions to an opponent's performance so that you can predict it for arbitrary starting conditions). When you encounter a new opponent set which contains opponents you have knowledge about from different opponent sets you can generalise from the existing knowledge to fill-in the new opponent set with default values.
When you encounter the same opponent set again you can update the knowledge using a weighted averaging function. The weighting should bias the value towards more recent results in case the opponent is changing their strategy. In principle you could keep each set of information from each game but it would take a lot of space and you would probably just average it anyway (if you feel that there should be more discrimination add more dimensions to the compartmentalisation). During the game you should initialise the live knowledge with the values from the database so that if no new information is obtained the knowledge will not change.
There are numerous types of information which would be useful to know and can be deduced during a game or by analysis of a save file.
Models
Units
Cities
Some of these (eg How large is a typical army?) could be specified as a table (or function) of expected values for each turn. Information which implies research preferences can be analysed for sequence.
Because all of this information can be constructed from old save files at the start of each game (assuming they are not deleted) it is not necessary in principle (assuming that you can analyse the save files quickly) to maintain a database.
Games which have an unambiguous result (two players; win, lose) can be analysed to deduce the causes of the result.
Games with an ambiguous result (draw, more than two players) are more difficult to analyse. Probabilities can be assigned to each player for their chances of victory and the causes can be analysed.
Suppose you have N strategies and you don't know which is best. You can use learning to select the best strategy.
Play several games. At the start of each game select a strategy. Record the number of times each strategy has been played and the number of times it won. The ratio wins divided by plays for each strategy can be used as an estimate for the probability of victory.
E(Win given use strategy S) = S.Wins / S.Games
Given that you have only one opponent who always uses the same strategy eventually (after many games) the estimated probabilities will accurately reflect the chances of victory for each of your strategies. If the opponent is deterministic and the initial game conditions are always the same, picking the strategy with the highest estimated chance of winning will give you the best chance of winning.
Note that during learning the best simple way to select strategies is to pick strategies randomly because this tends to bring the estimates into the correct rank relationships fastest (you don't need perfect estimates you only need to know which strategy is expected to have the highest probability of winning). Why is random selection best for learning? The optimal selection (for learning) is to choose the strategy for which a result (win or not-win) would reduce uncertainty the most. With this form of estimates that is to choose the estimate with the least games because one more game tends to change it the most. Random selection achieves this because it tends to select each strategy an equal number of times. A more sophisticated selection could be biased towards high estimates which are close together also (because you are mainly interested in the highest ranked estimate).
If you wanted to use learning in competition games (eg a learning competition) you could use random strategy selection but with a bias towards selecting strategies with the highest estimated chance of winning. There should be no bias initially because you have no information about the estimates (assuming the opponents are unknown). As the number of games increases the probability that the estimates are correct increases so you can increase the bias.
The number of games you need to play to learn rank relationships is proportional to the number of strategies. (Obviously you need to play every strategy at least once to get any information about it.)
Suppose that you have more than one opponent (still two-player games). If you can identify each opponent uniquely (assume with 0 delay eg the server interface makes the names of the AIs available as for human players) you can maintain an estimated probability table (the estimated probability of victory for each of your strategies) for each opponent.
Suppose that some games have more than two players. This can be handled with probability tables for each combination of players. (The "database" file ie the estimated probability tables could be getting quite large now if you have more than a handful of opponents. You can use sparse tables [perhaps most possible combinations of opponents don't occur] and compression to reduce it.)
Suppose you find an opponent which always wins against all N strategies (If each has been tried ten times the probability of victory for each could be less than 10%). The concept of a preset list of strategies can be generalised to a strategy generator which can analyse an opponent's play (eg from a save file) and produce new strategies designed to defeat them. The generator could have a risk parameter which determines what degree of risk is acceptable in a strategy (you could increase this until an adequate probability of victory is achieved). You could apply evolution to the set of strategies (eg delete the ones with low probability of victory) to minimise the size of the set. Strictly speaking you should keep separate sets of strategies for each opponent (because the generator optimises given a specific opponent or set of opponents - it could take too long to optimise for more than one set of opponents) but this might take an impractical amount of storage space (10Mb isn't much on a hard drive but think about e-mailing it to a competition host).
Opponents with more than one strategy. For mixed strategy opponents I think it is simplest to have (or generate) mixed strategies (ie strategies which adapt to the opponent) as it could be difficult to identify enemy strategies quickly. If you could identify the enemy's strategy quickly you could learn a table of the expected probability of winning for each of your strategies given the opponent's strategy. The number of games required to learn the table would be proportional to the product of the number of strategies for each player.
Noise consists of factors which affect the probability of winning which are not determined by the choice of strategy.
Noise prevents learning from converging on precise estimates (and makes it slower) although rank estimates can be produced if the noise is insignificant compared to the differences between estimates.
Nondeterministic strategies can have a variable chance of winning. The opponent(s) could also use nondeterministic strategies.
The initial conditions are random (for a random map). To the extent that this affects each player's chance of winning this creates noise.
This example demonstrates the drawbacks of learning from complete games.
Both of these problems could be reduced if there was a way to play a complete game quickly (eg in less than one minute). But that would depend on having fast versions of third-party AIs.
Some features of the example could be improved:
The aim of an AI competition is to measure the relative strength of several AI designs in a particular area. Because the competition is conducted on a real machine in real time, implementation considerations (eg that the AIs are reasonably fast) also apply.
These are the main prerequisites for an AI competition:
For an ongoing competition you need to attract AI designers. Here are some things you can do (and some things to avoid):
For an interesting competition there should be uncertainty about the result. This condition can be met if the AIs are in active development.
The rules must specify the competition result for every possible game result. Example rules:
Notice how a victory is defined unambiguously. A victory should be significantly more valuable than a draw to encourage risk-taking. Also a turn limit and a time limit are specified - games can run a long time so (unless you could leave them running over night!) these need to be specified to make it practical to run the games. The turn limit allows you to detect a draw (setting a specific turn makes the result more consistent - it's more valid to compare a specific turn across different games assuming that the starting conditions are similar) whereas the time limit penalises AIs which enter infinite loops or which are too slow (for the competition). The time limit is specified per player so that the faster AI can receive a bonus. This incentive is to discourage forcing a draw by taking long turns but it tends to favour simpler implementations. Implicitly games with more players have a longer time limit - for a fixed time limit you could divide the total time by the number of players.
To avoid problems like AIs failing to load you should have some minimum qualification rules eg:
It might seem that cheating could be prevented by adding more checks to the Civ Evo server to prevent it. However these will never be complete so you should always be vigilant for suspicious results.
Some game crashes are not reproducible on different computers (eg running out of memory). You should specify the target platform (OS, memory etc) for the competition. It would be convenient if you had an independent authority who could confirm crashes on the target platform.
The penalty for a crash is quite low in the example. As the knowledge of how to develop a Civ Evo AI increases the penalty should be increased to encourage debugging (the best possible AI should never crash fatally provided the game rules are consistent). There is a slight incentive for AIs to crash the game when they think the result will be unfavourable (eg losing is worse than a crashed game if the penalty is low). I don't think it's reasonable to disqualify an AI for crashing in a competition game (assuming that it doesn't crash in most games) as it's hard to identify the cause of a crash and it's not proven that the game rules are consistent or that the server can't cause crashes.
If a player can enter more than one AI in a competition (where the AIs could influence each other's rank) and they are not explicitly allied in all games, it is (practically) impossible to prove that the AIs do not collaborate unfairly (by accident or by design).
In some circumstances it is rational (intelligent) to play for a draw. (In general to play for another result than a win.)
If a player can force a draw (or other result) without using reason (ie without demonstrating skill in the area of the competition) they can subvert the aim of the competition. For example a player who thinks that they are about to lose could take all of their remaining time in one turn - forcing a draw. You can use game rules (eg bonuses) to prevent or discourage players form obtaining results by other means than good play.
If an AI has (or can deduce) information about it's rank (or the rank of opponents) in the competition it could play for a tactical draw knowing that it doesn't need to win.
There is a wide scope for judging bonuses from a nominal draw - make sure the judgement is objective so that anyone who does it will get the same result. The purpose of bonuses is to discriminate between AIs which are too close to get a victory. The AI which is more likely to win (due to demonstrating better play or due to having better potential) should gain at the expense of the opponent(s). The bonuses should reflect the area of interest in the competition - if the competition is played on large maps with the aim of finding the best AI player (on large maps) it would make sense to have a bonus for rapid early growth (because other things being equal growing faster than your opponent(s) tends to increase your chances of victory especially for time-constrained games).
A draw is the expected result between expert players on a level playing field - if the competition result is decided by bonuses it is crucial that the entrants are happy with the bonus rules. If the competition is prestigious and has stable rules, developers might design their AIs to maximise benefits from bonus rules - provided that the bonus rules accurately reflect the chances of winning this is appropriate (because effectively they are maximising their chances of winning which is the proper aim of designing AIs for the competition).
An alternative to having bonuses affect the points score of an AI is to have a separate score for good play - like goal difference in a football league - so that bonuses only decide the rank if the points score is equal. This places more emphasis on winning games - it could make the competition more open eg new developers who have a new strategy but not enough time to maximise for every bonus rule can win the competition if their strategy is much better. If an AI wins the competition with very low bonus score this suggests that the bonuses should be revised as they do not reflect the chances of winning.
Too many seemingly arbitrary bonuses will antagonise potential entrants.
Too few bonuses might mean that the competition is inconclusive if most AIs have close scores at the end of a round.
Sophisticated AI designs have an unlimited demand for processing time.
Tournaments must restrict the time for each player so that games can be scheduled. Human players become frustrated by a delay longer than a few seconds (waiting for an AI player to move).
To meet time constraints while maximising performance requires the ability to adjust behaviour given the remaining time. When you have little to do and lots of time you can do everything precisely and perfectly. When you have lots to do and little time you need to prioritise what must be done this turn and how precisely or perfectly you can afford given the remaining time.
You can use OS calls such as GetTickCount() to measure how much time you have left.
Repeatable AIs cannot vary processing depending on the time remaining if it could affect behaviour in the current game.
AIs designed to play against humans don't need to be repeatable (but it is useful for debugging and could be a feature for human players). Remaining time is nondeterministic because the time to execute an activity depends on the state of the computer. Also different computers run at different speeds. Repeatable AIs could vary processing depending on the probability of victory (which can be estimated repeatably). The problem then is to find which moves (activities) are essential for winning and which are unnecessary (and therefore can be omitted).
You could divide your activities into several groups and measure (or estimate) the expected times.
| Activities | Category | Max time/s | Min time/s |
|---|---|---|---|
| City management, rates | Cities | >10 | 0.2 |
| Attacking, defense, exploration... | Units | >100 | 0.2 |
| Everything else | Misc | 4 | 0.1 |
Because the time for units and cities depends on the number of units or cities we assume a reasonable maximum but indicate it could be higher.
In this example the Units category takes the most time followed by the Cities. In most games there will be more units than cities and for many AIs moving units takes the most time on average.
From this it's clear that reducing the time for Units will produce the most benefit. In your AI, if another category has high maximum or average time you should also look at reducing it.
Analyse each of the Unit activities and try to make faster versions which produce approximate results. The results must be good enough to use but don't need to be optimal.
If you know there isn't enough time to use the optimal (slow) versions, use the faster approximate versions. Alternatively you could generate approximate results then check if there is enough time to improve them (by generating the optimal results).
Many expensive calculations (eg paths) don't need to be updated every turn. There is a tradeoff reducing processing time in future turns for increasing the save file. If the data could be recalculated differently (in future turns) they must be saved (or you must update them for changes).
If you have spare time it could be worthwile to use it to generate data for subsequent turns. Eg frequently used paths between cities.
In each category of activities decide which specific activities must be accomplished in this turn and which could be delayed until a later turn.
| Activity | Priority |
|---|---|
| Restoring civil order | High |
| Moving threatened unit | High |
| Defending a threatened city | High (depends on value of city) |
| Attacking enemy city | Medium-High (depends on value of city) |
| Exploration | Medium-Low |
The optimal priority setting is the value of an activity divided by the time it will take.
After doing the high priority activities, do lower priority activities until you have done everything or you run out of time.
For delayed activities increase the priority each turn so that they will eventually be accomplished. (You need to save the priorities for repeatability.)
You could combine this with a trade off between speed and optimality by allocating more time for high priority activities (to generate more optimal results).
For static (non-persistent) data there is a trade off between cost of allocating/freeing and free memory.
When free (physical) memory is exhausted, dynamically allocated memory is obtained by saving (swapping) some physical memory to disk (typically memory for another player's data). When the saved memory next needs to be accessed it must be loaded into physical memory first.
If you minimise the scope of static data by allocating memory for it at the start of each turn (generating and using the data) then freeing it at the end of each turn, free memory for all players is maximised but you pay the cost of allocation/deallocation every turn. (And you can't use generated data from previous turns.)
From a selfish point of view you would gain if you maximised the scope by allocating the memory at the start of the game and only freeing it at the end of the game. You would only have the cost of one allocation (the deallocation doesn't count towards the game time limit) and you could reduce the cost of generating the data (ie you don't need to regenerate it unless it changes). But if every player did this then free memory for all players would be minimised so every player would be slower because there would be more disk swapping.
If the sum of static data (plus persistent data) for all players (and the server) plus the maximum dynamic data requirement for any player (including server calls) is more than available physical memory there will be disk swapping - making each player slower. If each player reduces their static data by allocating/freeing it more often (each turn) there will be less disk swapping but more allocation overhead. The equilibrium occurs when the cost of disk swapping equals the cost of allocation/deallocation (and data generation).
As a general guideline, small amounts of data (kb) which are expensive to generate or cheap to regenerate should have long scope whereas large amounts of data (Mb) which can be generated cheaply should have short scope.
Some practical worst-case calculations: Suppose the target machine has 8Mb physical memory available (remember you should take into account low-end machines and there will be other processes such as the operating system using memory). The maximum number of players is 15. We also count the server as a process so there are 16 processes (count server calls as part of the player process - the server process runs the game and draws graphics etc).
If none of the 16 processes use persistent data or static data with long scope, each process has 8Mb available for static data and dynamic data. If each process uses an average of 256kb of persistent data there is 4Mb available for static and dynamic data.
Suppose there is an average of 1Mb of static data per process. How much static data can each process give a long scope? Suppose the maximum dynamic data required is 1Mb. Then the free physical memory available per process (after taking 256kb*16 for persistent data) is 4Mb-1Mb-16x-(1Mb-x) ie free memory minus maximum-dynamic-memory minus long-scope-static-data minus short-scope-static-data. In this case x (the maximum amount of long scope static data) is 136kb.
In this example if the sum of persistent and long scope static data is greater than 392kb (256+136) you should try to reduce the scope of the static data unless it is very expensive to generate.
A more optimistic example: Suppose the target machine has 32Mb free physical memory available. There are two players so (counting the server) there are three processes.
If none of the three processes use persistent data or static data with long scope, each process has 32Mb available for static data and dynamic data. Suppose the maximum dynamic data requirement is 16Mb. There is 16Mb (32-16) available for persistent data and long scope static data. If this is divided evenly, each player can use 16/3=5.3Mb.
In some cases you could need to do processing during other player's turns (or concurently) eg enemy moves, responding to diplomacy/trade. You should try to minimise memory use for this. (The maximum dynamic data requirement is then the sum of the maximum for any player plus the maximum any other player could require for out-of-turn processing.)
Unfortunately the 0.8 server has unbounded requirement for persistent data because the entire save file is stored in memory. This means that when the save file exhausts available RAM there will be disk swapping each time it grows. Also each 256kb of save data the memory buffer is reallocated - potentially causing Megabytes of disk swapping.
The constraints for an AI designed to play against humans are different:
For it to be practical to make a good AI player which takes less time than a human would notice to move (say a few seconds) additional "thinking" time must be found. (Ideally AIs designed to play humans should play at least as well as AIs designed to play AIs do in competition games.)
You could precalculate moves before a game and store them in a database. You could generate it by consulting experts or automatically from saved games or by recording the moves of an AI designed to play AIs. A practical database is likely to be rather large however. (If using precalculated moves is fast and effective AIs designed to play AIs should do - but this would mean that a database for an AI designed to play humans of equivalent strength would be much larger.) In contrast to Chess the terrain is different for each game and each opponent may have different types of units - this makes it difficult to construct a database which would be useful for most games.
You could also use the time during the human player's turn to do processing. When there are many units and cities most humans will not complete every turn in a few seconds(!) so there will be enough time to do significant processing against most humans.
Use of this borrowed time (and space) is constrained because there must be enough time and memory for the human to issue server commands concurrently (via the player interface). If there is more than one AI they compete for the time and space. In the worst case you must assume that every player except one is an AI and divide resources between players equally (use 1/nPlayers proportion of resources). Eg with two players use 1/2 (50%) of resources, with three players use 1/3 (33%) etc. During lengthy calculations (which would tie-up the processor) use OS calls such as Sleep(0) to release time for other players. Of course if all the AI players are controlled by one module, you can detect if the current player is a human or an AI.
You only have data from the previous turn - you must calculate predictable changes and estimate unpredictable changes (ie other player's moves). You can access RO during enemy move messages - you should not access it when it is not explicitly your time because it is changing. If it seems like cheating to use an opponent's time remember that the human player benefits.
Optimal AIs designed to play against AIs will not do processing in other (AI) player's turns because this makes all AIs run slower which reduces the (effective) time for the game to complete.
See AI page for download links.