Emil Post's work on undecidability is often overshadowed by the contributions of his more famous contemporary, Alan Turing. The undecidable nature of the halting problem has an elegant proof itself, but emulating Turing machines in other problems to prove their undecidability is a delicate and messy business. Post's Correspondence Problem, introduced in 1946, is a more natural fit for this role, and since it can itself mimic the halting problem, is of equal power in proving undecidability. Thus it found application to formal language theory; it can also be elegantly applied to the analysis of communication protocols, which I wish to cover here.

# Post's Correspondence Problem

Firstly, here's an easy way to understand Post's Correspondence Problem. Imagine a particularly fiendish variant of Scrabble, whereby the tiles are split, like dominoes, into two halves, each bearing a string of letters. The question is, can we line up some (non-zero) number of tiles so that the top and botttom halves both spell out the same 'word' (simply a string of letters)? The types of tiles are fixed, but you may use as many of each type as you wish.

For instance, if our tile set is the following:

_______ _______ _______
| | | | | |
| a | | abaaa | | ab |
| - | | - | | - |
| aaa | | ab | | b |
|_______| |_______| |_______|
(1) (2) (3)

Then the arrangement (2)(1)(1)(3) gives the 'word' abaaa a a ab = abaaaaaab on the top, and ab aaa aaa b = abaaaaaab on the bottom: that is, we have a match.

As observed above, the general form of this innocent-looking problem is undecidable: no program can take an arbitrary set of tiles A_{i}= g_{i}/h_{i} and decide whether there are matching sequences g_{(i1)}...g_{(in)}=g=h=h_{(i1)}...h_{(in)}.

# The Secrecy problem

The secrecy problem is the obvious question in the analysis of cryptographic protocols: given a protocol and some intended secret S, can an intruder intercept S? Answering this question depends of course on the powers attributed to the intruder, and the restrictions placed on the protocol. But I'll demonstrate here how even a fairly restricted definition of protocol - exploiting neither unbounded parallel sessions nor unbounded numbers of nonces (in fact, no nonces will be required at all!) - reduces to PCP. This means of course that secrecy is undecidable, else we'd be able to determine the answer to arbitrary PCP instances by encoding them into secrecy problems and answering those.

## Reduction of Secrecy to PCP

The trick to proving undecidability of secrecy is to convert the correspondence problem into a communication protocol, in such a way that the secret is disclosed if and only if the encoded PCP has a solution. We give each tile to a participant; the holder of tile i will be called Appender_{i}. There will also be two other legitimate participants, the Initiator and the compromiser. The Appenders, Initiator and compromiser are all assumed to share a secret key which is unknown to any intruder.

To start a run of the protocol, the Initiator encrypts a pair of messages, both consisting of the empty string, into a packet and sends it to any other participant. Any Appender receiving a packet decrpyts it, and changes the strings by appending the top half of their tile to the first string and the bottom half to the second string. They then pair these two new strings together, encrpyt into a packet, and send to anyone other than the initiator. In this way, candidates for matching strings are generated, under encryption.

Should the compromiser receive a packet containing two identical non-empty strings, they send back the secret message S "in the clear" - that is, without any encryption. So an intruder will discover the secret iff the compromiser receives a solution to the PCP problem, and thus only iff such a solution exists.

For the PCP instance above, the secret would be disclosed as follows. Firstly, the initiator sends a pair of empty strings to Appender_{2}. He examines his tile, and creates a pair (abaaa,ab) which is sent (encrypted) to Appender_{1}. She adds the contents of her tile, producing (abaaaa,abaaa) then sends this to herself for another round, creating (abaaaaa,abaaaaaa). This time she sends it to Appender_{3}, who generates (abaaaaaab,abaaaaaab) and forwards this to the compromiser. He discovers the two strings match, so sends S back to Appender_{3}, which, bereft of encryption, is then known to any eavesdropper.

We can formalise all this as follows, firstly fixing some notation: Let the i^{th} tile have g_{i} written on the top half, and h_{i} written on the bottom half: an empty string of no letters will be denoted by e, an arbitrary string by a capital letter, an arbitrary letter by s and the concatenation of strings simply by writing them next to each other. Let K be a key shared between each of the Appender_{i}s, the initiator and the compromiser; P be any participant other than the initiator, and S the secret. This gives roles for the legitimate participants as follows:

**Initiator role**

Init-->P: {(e,e)}_{K}

**Appender roles** (one for each tile)

(P or Init)-->Appender_{i}: {(X,Y)}_{K}

Appender_{i}-->P: {(Xg_{i},Yh_{i})}_{K}

**Compromiser role**

P-->Comp: {(Xs,Xs)}_{K}

Comp-->P: S

Which would give a secret-disclosing run of our example as follows:

Init-->Appender_{2}: {(e,e)}_{K}

Appender_{2}-->Appender_{1}: {(abaaa,ab)}_{K}

Appender_{1}-->Appender_{1}: {(abaaaa,abaaa)}_{K}

Appender_{1}-->Appender_{3}: {(abaaaaa,abaaaaaa)}_{K}

Appender_{3}-->Comp: {(abaaaaaab,abaaaaaab)}_{K}

Comp-->Appender_{3}: S

Thus, since such a run is one of the many possibilities for this protocol, secrecy is not guaranteed. Since the existence of the compromising run depends on the existence of a PCP solution, the secrecy problem is hence undecidable.

### Criticisms and variations

The analysis of cryptographic protocols is made difficult by various sources of infinity- a priori, the number of sessions, the number of nonces, and the length of messages could all be unbounded. The example above is particularly devastating since it does not require the first two of these. However, since Post's Correspondence Problem does not restrict the number of tiles used the messages involved are of unbounded length. This, coupled with the 'blind copying' ability such as that of the Appender roles, means we can encode computational models.

However, there is a strong criticism that these protocols are pathological examples: they serves no useful communication purpose, having been crafted with the benefit of an intruder in mind and to prove the existence of a reduction. But practical interest is not in the class of all protocols, but of all that we might wish to use- those with an 'honest run' whereby all the roles introduced are necessary components for some legitimate purpose such as authentication. We cannot exclude blind copying completely, however, since such manipulations are necessary in, for instance, key exchange via a trusted server.

Decidability of secrecy remains an open problem for protocols with honest runs restricted to either bounded message size; or such that participants may make only a single blind copy. Sadly, even with the 'honest run' requirement, unbounded message size gives rise to an undecidable secrecy problem, but there is a decidable case: bounded nonces with participants allowed only a single blind copy.

# References

CM30071:

*Logic and its Applications* Undergraduate module, Computer Science Dept., University of Bath.

*Formal analysis of Cryptographic Protocols* Taught Postgraduate course, Laboratory for Foundations of Computer Science, University of Edinburgh.

*Wikipedia*: Emil Post, Post's Correspondence Problem.