The first Iterated Prisoner's Dilemma competition was held under the direction of the political scientist Robert Axelrod and, as the above writeups describe, was won by the tit for tat approach submitted by Anatol Rapoport. That strategy defended its title against a potentially stronger field in Axelrod's follow-up event, with 62 participants compared to the initial fourteen, and then held its own over the following years: it seemed to have an unassailable combination of co-operating where possible but punishing where not. Yet in a twentieth-anniversary competition, this time run by the computer scientist Graham Kendall and featuring 223 entries, tit for tat couldn't claim even one of the top three places. All of those positions went instead to programs developed at Southampton University, which were able to recognise each other and then assume master/slave roles that maximised the payoff to the master program at the expense of the slaves, which sank rapidly to the bottom of the performance table through their heroic sacrifice.
Some see this as simply cheating, others as ingenious "meta-gaming": beating the rules rather than the opponents. Certainly it has its limitations and could fail spectacularly in rival configurations of the Dilemma; but with a little thought it becomes apparent that tit for tat could equally be accused of gaming the system, and that the true point of running Prisoner's Dilemma competitions is to find insights into other tasks. The Southampton group was motivated by an interest in the development of co-operative behaviour within multi-agent systems, and tweaks to their approach should shed light on the extent to which self-sacrificing agents can help the system as a whole, or perhaps just a favoured group within it.
The remarkable feature of tit for tat was that, over the course of a competition, it would defeat the "always defect" approach, even though it always loses to it in a one-on-one encounter! Always defecting is an evolutionarily stable strategy, meaning that in a population of such programs, any deviation from the approach (by co-operating at some point) can only disadvantage you and further the ends of the others. Tit for tat is indeed guaranteed to lose points in the first round against a consistent defector, but from round two onwards against such a foe it adopts the same strategy itself, so the damage is minimised. In the initial 1984 competition, each program competed against all the others in round-robin fashion, with the twist that at some point it would also play against a copy of itself. With sufficiently long iterations, this allows tit for tat to win even as the sole member of an otherwise constantly defecting group: when it plays itself, it settles into a mutually beneficial pattern of co-operation that brings more points per round than mutual defection, and hence counteracts the few points lost in the opening plays against defectors. Should other programs themselves be tit for tat, or any other strain which throws out the occasional run of co-operation, then the effect is magnified still further. So whilst the defectors never give away points, they miss opportunities to gain them, opportunities which tit for tat seized to climb the scoreboard to success. As LincolnQ observes above, "your only goal is to get the highest possible score, not to 'beat' your opponent", and this tit for tat does admirably.
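The effect described above can be sketched in a few lines of Python. This is not the original tournament code; it assumes the standard Axelrod payoffs (3 apiece for mutual co-operation, 1 apiece for mutual defection, 5 for a lone defector, 0 for the sucker), and the player names and round count are invented for illustration. Even as the only non-defector in the field, tit for tat comes out on top thanks to its self-play match:

```python
# Minimal round-robin sketch: tit for tat vs a field of constant defectors.
# Payoffs and population are illustrative assumptions, not tournament data.

def payoff(me, them):
    # Points for the first player; 'C' = co-operate, 'D' = defect.
    return {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}[(me, them)]

def tit_for_tat(opp_history):
    # Open with co-operation, then mirror the opponent's last move.
    return 'C' if not opp_history else opp_history[-1]

def always_defect(opp_history):
    return 'D'

def match(p1, p2, rounds=200):
    h1, h2, s1, s2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = p1(h2), p2(h1)
        s1 += payoff(m1, m2)
        s2 += payoff(m2, m1)
        h1.append(m1)
        h2.append(m2)
    return s1, s2

players = {'tft': tit_for_tat, 'd1': always_defect,
           'd2': always_defect, 'd3': always_defect}
totals = {name: 0 for name in players}
names = list(players)
for i, a in enumerate(names):
    for b in names[i:]:                     # includes a copy of itself
        sa, sb = match(players[a], players[b])
        if a == b:
            totals[a] += sa                 # self-play counted once
        else:
            totals[a] += sa
            totals[b] += sb
```

With these numbers, tit for tat loses every head-to-head against a defector (199 points to 204) but its self-play match of mutual co-operation (600 points) lifts its total to 1197 against each defector's 804.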
How then did the Southampton team out-play this decades-old strategy? Tit for tat sets up a virtuous circle of co-operation to score more points than the more vicious approach of mutual defection allows. But the Southampton programs made use of what makes the dilemma a dilemma in the first place: you can score even more than you would for co-operation, if only you can get the other player to co-operate whilst you defect. The 2004 rules allowed more than one program to be submitted, so you could enter just such a helper, the suicidal "always co-operate" method, to exploit; except that all the other always-defect programs would get to feed upon it too. The solution was to have the programs use pre-determined sequences of co-operation and defection as a signature. On recognising an appropriate 5-10 turn sequence, the programs would assume their roles, with the slaves switching to constant co-operation to offer up bonus points to the always-defecting masters. If, however, the trigger sequence wasn't received from the opponent, the always-defect approach was activated, to minimise the points available to the rival (which may have picked up some by defecting whilst the Southampton code had to co-operate as part of its opening dance). In this way, the masters would rack up more points than tit for tat, which would often be stuck with the paltry rewards of mutual defection.
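A handshake of that kind might look something like the following sketch. The signature sequence, role names and round counts here are all made up for illustration; the actual Southampton entries used their own 5-10 move sequences and rather richer logic:

```python
# Hypothetical colluding strategy in the Southampton style: perform a fixed
# opening "dance", then adopt master or slave behaviour if the opponent's
# opening matched the signature, and plain always-defect if it didn't.

SIGNATURE = ['C', 'D', 'C', 'C', 'D']      # invented recognition sequence

def make_colluder(role):
    """role is 'master' or 'slave'; returns a strategy function."""
    def strategy(my_history, opp_history):
        turn = len(my_history)
        if turn < len(SIGNATURE):
            return SIGNATURE[turn]          # the opening dance
        if opp_history[:len(SIGNATURE)] == SIGNATURE:
            return 'D' if role == 'master' else 'C'   # roles assumed
        return 'D'                          # a stranger: starve them of points
    return strategy

def play(strat_a, strat_b, rounds=20):
    # Standard payoffs assumed: T=5, R=3, P=1, S=0.
    pay = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}
    ha, hb, sa, sb = [], [], 0, 0
    for _ in range(rounds):
        ma, mb = strat_a(ha, hb), strat_b(hb, ha)
        sa += pay[(ma, mb)]
        sb += pay[(mb, ma)]
        ha.append(ma)
        hb.append(mb)
    return sa, sb

master = make_colluder('master')
slave = make_colluder('slave')
m_score, s_score = play(master, slave)     # handshake, then master feeds on slave
```

Over 20 rounds the pair split the handshake evenly (11 points each), after which the master takes 5 points per round off the now-unconditionally-co-operating slave, finishing well above the 60 points that steady mutual co-operation would have paid.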
The most interesting aspects of this approach are the question of just what density of colluders is required in the population (it turned out that there were probably too many in the 2004 event), and whether it would be possible for other programs to spoof or manipulate the trigger sequences. In evolutionary games where code is mixed or programs are eliminated, it's possible for the slaves to be removed from the game, stranding the masters: they risk being exploited during their now-useless setup phase, and thereafter perform no better than always-defect. Nor can the fate of those slaves be entirely ignored, as their sacrifice could reduce the average performance per colluding program to below that of tit for tat, depending on the ratios present in the population.
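That dependence on the ratios can be roughed out with some back-of-envelope arithmetic. The sketch below assumes the standard payoffs, ignores the handshake overhead, and treats every pairing as settling immediately into its steady state (masters exploit slaves at 5 per round, slaves co-operate among themselves at 3, everyone else ends up in mutual defection at 1); the population sizes in the examples are invented purely for illustration:

```python
# Per-round steady-state approximation of average scores in a round-robin
# population of masters, slaves and tit-for-tat players. All figures are
# illustrative assumptions, not results from the 2004 competition.

def average_scores(masters, slaves, tft):
    """Return (mean colluder score, mean tit-for-tat score) per round."""
    master_total = (slaves * 5              # exploit each slave
                    + (masters - 1) * 1     # mutual defection with other masters
                    + tft * 1)              # tit for tat soon mirrors defection
    slave_total = (masters * 0              # sacrificed to each master
                   + (slaves - 1) * 3       # slaves co-operate together
                   + tft * 1)
    tft_total = (masters + slaves) * 1 + (tft - 1) * 3
    opponents = masters + slaves + tft - 1  # round-robin, self-play ignored
    colluder_avg = ((masters * master_total + slaves * slave_total)
                    / (masters + slaves) / opponents)
    return colluder_avg, tft_total / opponents

# With few slaves and many tit-for-tat players, the sacrifice drags the
# colluders' average below tit for tat; flood the field with slaves and
# the collusion pays off on average as well.
few_slaves = average_scores(masters=1, slaves=3, tft=10)
many_slaves = average_scores(masters=1, slaves=9, tft=5)
```

Under these assumptions the first population gives the colluders roughly 1.4 points per round to tit for tat's 2.4, while the second reverses the ordering, which is the "depending on the ratios" caveat in numbers.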
The second round of Kendall's competition is scheduled for April 2005, and perhaps yet another level of refinement will emerge then, alongside answers to some of these questions. But even if not, this latest twist shows that even after two decades, the Prisoner's Dilemma remains both beautiful in its simplicity and intriguing in its implications.
- I first heard about the Southampton result through a Slashdot article, and the discussion generated there doubtless had some influence on this writeup:
- Wired article as linked by Slashdot:
- Press release from the University:
- Background reading: chapter 1 of Five Golden Rules: Great Theories of 20th-Century Mathematics, and Why They Matter, by John L. Casti.