Play it safe solution to the Prisoner's Dilemma.

Suppose A and B are prisoners in the situation described above, and that neither is confident of being able to predict, from the other's previous decisions, whether the other will confess (A and B are unprofessional, so they did not get their stories straight beforehand). Assume, then, that each prisoner confesses at random (with a weighted coin, say), so that s/he confesses a fixed percentage of the time; over a large number of "games", that percentage is the prisoner's "strategy". We wish to find a strategy for A that gives him/her an expected outcome that does not depend on B's decisions.

First, some definitions:

  • Let a := the outcome if A confesses and B does not. ( 0 in the above writeup by Jennifer).
  • Let b := the outcome if A does not confess and B does. ( -9 above).
  • Let c := the outcome if both confess. ( -6 above).
  • Let d := the outcome if both do not confess. ( -1 above).
  • Note that the outcomes are all zero or negative (a is 0; b, c and d are negative) because I like to think of positive as good (who knows why :).
  • Let p := the probability that A confesses (we want to find this... ).
  • Let q := the probability that B confesses (...such that this doesn't matter).

Then, the function E:[0,1]X[0,1] -> R defined by

E(p,q) := p(1 - q) a + q(1 - p) b + pq c + (1 - p)(1 - q) d
        = (a - d) p + (b - d) q + (c - b + d - a) pq + d
        = (a - d) p + ((b - d) + (c - b + d - a) p) q + d

returns A's expected outcome.
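
To make the definition concrete, here is a minimal Python sketch of E. The function name `expected_outcome` is my own, and the default payoffs are simply the four values quoted in the definitions above; treat it as an illustration rather than anything definitive.

```python
def expected_outcome(p, q, a=0, b=-9, c=-6, d=-1):
    """A's expected outcome E(p, q); defaults are the payoffs quoted above."""
    return (p * (1 - q) * a             # A confesses, B does not
            + q * (1 - p) * b           # B confesses, A does not
            + p * q * c                 # both confess
            + (1 - p) * (1 - q) * d)    # neither confesses

# The pure-strategy corners reproduce the four defined outcomes a, b, c, d.
for p, q in [(1, 0), (0, 1), (1, 1), (0, 0)]:
    print(p, q, expected_outcome(p, q))

# For a fixed p the result still moves with q; removing that
# dependence is the whole point of the play-safe strategy.
print(expected_outcome(0.5, 0.0), expected_outcome(0.5, 1.0))
```

With these payoffs, p = 0.5 gives -0.5 when B never confesses and -7.5 when B always confesses, so an arbitrary p leaves A very much at B's mercy.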

Now we observe that if

(b - d) + (c - b + d - a) p = 0,
then the q term above vanishes, which is exactly what we wanted. Solving for p:
         (d - b)
p = ----------------- 
    (c - a) + (d - b)

            1
  = ----------------- .
    (c - a)     
    ------- +    1 
    (d - b)

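Since p must lie in [0,1], we need (c-a)/(d-b) >= 0; otherwise we would have no play-safe solution. (We don't worry about d=b, where this ratio is undefined: the first form of the fraction then gives p=0, which is a valid solution as long as c is not equal to a, and if c=a as well, every p works.)
Thus, for d not equal to b, we have a solution iff ((c>=a) and (d>b)) or ((c<=a) and (d<b)).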
Note: An alternative approach that leads to the same result is to take the partial derivative of E with respect to q, set it equal to zero and solve for p. This approach works because E is linear in q, so that dE/dq is constant with respect to q.
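
The condition can be checked mechanically. The sketch below is only an illustration: the names `play_safe_p` and `expected_outcome` are mine, it assumes integer payoffs so that `fractions.Fraction` keeps the arithmetic exact, and the second set of payoffs is made up purely to satisfy (c>=a) and (d>b).

```python
from fractions import Fraction

def play_safe_p(a, b, c, d):
    """Solve (b - d) + (c - b + d - a) * p = 0 for p.

    Returns None when no solution lies in [0, 1]. Assumes integer payoffs.
    """
    denom = (c - a) + (d - b)
    if denom == 0:
        # The p coefficient vanishes: any p works if d == b, none otherwise.
        return Fraction(0) if d == b else None
    p = Fraction(d - b, denom)
    return p if 0 <= p <= 1 else None

def expected_outcome(p, q, a, b, c, d):
    """A's expected outcome E(p, q) as defined above."""
    return p*(1 - q)*a + q*(1 - p)*b + p*q*c + (1 - p)*(1 - q)*d

# Payoffs from the writeup: c < a while d > b, so the formula gives p = 4.
print(play_safe_p(0, -9, -6, -1))   # None, no play-safe solution

# Hypothetical payoffs with c >= a and d > b (not from the writeup).
payoffs = (-3, -9, -2, -1)
p = play_safe_p(*payoffs)
print(p)                            # 8/9
for k in range(5):
    q = Fraction(k, 4)
    print(q, expected_outcome(p, q, *payoffs))   # -25/9 for every q
```

Note that the payoffs quoted in the definitions fail the condition, which fits the practical discussion that follows: a genuine dilemma has no play-safe mix.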

Some practical discussion:
Practically speaking, we can strengthen this condition:
d<b would mean that A gets a heavier sentence if neither confesses (when there is not as much proof that they did it) than if A does not confess and B turns him/her in. That is implausible, so the second "or" term above cannot be satisfied, and we must have:
(c>=a) and (d>b)
We have just seen that d>b. Now we can assume that if A confesses, his/her outcome will be no worse if B does not confess than if B does, so c<=a (it could be strictly better, because A's sentence may be shortened since s/he was more cooperative than B). Together with c>=a this forces c=a, so the only practical case with a solution is where A has nothing to lose by confessing. In this case, p=1, as our intuition tells us. The expected outcome is then:
E(1,q) = (a - d) 1 + ((b - d) + (c - b + d - a) 1) q + d = a - d + (0)q + d = a
Of course, this case is pretty trivial, since c=a removes the "dilemma". But isn't it nice to know that the math supports it?
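
For the skeptical, a quick numerical check of the c=a case. The payoffs here are hypothetical (I just set a equal to c and kept b and d from the definitions above), so this is a sanity check under that assumption and nothing more.

```python
# Hypothetical payoffs with c == a: confessing costs A the same either way.
a, b, c, d = -6, -9, -6, -1

# Play-safe probability from the formula derived earlier.
p = (d - b) / ((c - a) + (d - b))
print(p)  # 1.0

# E(p, q) for several choices of q; every entry should equal a.
outcomes = [p*(1 - q)*a + q*(1 - p)*b + p*q*c + (1 - p)*(1 - q)*d
            for q in (0.0, 0.25, 0.5, 1.0)]
print(outcomes)
```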