In a given first order language, there may be many ways to translate a statement into a valid formula- thus given a set of seemingly different formula, it is desirable to be able to transform them into a standard form of notation which preserves logical equivalence. Then, if this form is the same for each formula, then they are in fact different formulations of the same idea. This standardisation is achieved through the generation of the normal form, of which there are many types.
However, it is also desirable to have a normal form which is in some sense the "simplest" rendering of a formula. Clausal form is the closest current approach to this; although ingenuity may offer up a more elegant version of a particular formula, there is no systematic, mechanical method for this (assuming a common agreement on simple can even be made).
The simplicity of clausal form is due to it's composition- a clausal form, rather than being a single long expression such as with CNF, consists of a number of clauses, which have an extremely specific and easily understood structure-
Definition: A formula is said to be a clause if it is of the form-
(A1 ^ A2 ^ .... ^ Am) --> (B1 V B2 V ... V Bn)
Where ^ denotes AND, V denotes OR, --> denotes IMPLIES; A1,...,Bn are atomic formulae
Hence, a clause states that if a list of conditions A are met, then at least one other condition from list B must also hold. It is very important to note that A's and B's are atomic formulae, not formulae- thus they contain no logical connectives and in particular no negations. For instance, if you wanted a clause to encapsulate "I work on weekdays other than tuesdays" then weekday ^ (NOT)Tuesday --> I work" wouldn't do. Instead, you'd need to move the negated atomic formulae to the other side of the implication: "weekday --> I work V Tuesday" is a correct clause which, if you think about it, has the same meaning. This approach will always work due to the following theorem-
A disjunction of literals (an atomic formula or its negation) is logically equivalent to a clause.
To prove this, consider that the disjunction can be written as (with appropriate labelling A's and B's and possibly some reordering, which does not alter the meaning of a disjunction)
¬A1 V ¬A2 V ... V Am V B1 V B2 V ... V Bn
Where ¬ denotes negation.
Then this is the same as ¬(A1 ^ A2 ^ .... ^ Am) V (B1 V B2 V ... V Bn) by DeMorgan's Laws. But if you have ¬C V D, then this is precisely the definition of C --> D. So we have the clause structure (A1 ^ A2 ^ .... ^ Am) --> (B1 V B2 V ... V Bn) as desired.
There are some particularly useful types of clause:
- Clauses with m=0 i.e. of the form --> (B1 V B2 V ... V Bn)
This means that there are no preconditions for one of the B's to hold- so we can conclude that at least one of them must hold true. If furthermore n is 1, then we can accept B1 as a fact (Its truth is always implied).
- Clauses with n=0 i.e. of the form (A1 ^ A2 ^ ... ^ Am) -->
This means that A1,...,Am cannot all hold at the same time (Consider the disjunction of literal form of this clause; with no B's, we have ¬A1 V ¬A2 V ... V Am i.e. the negation of at least one is true- they cannot all be asserted at once).
- Clauses with n=1 i.e. of the form (A1 ^ A2 ^ ... ^ Am) --> B1
These are known as headed horn clauses or just horn clauses and are central to logic programming systems such as PROLOG or Cyc. They are particularly desirable for such reasoning systems as you can assert the truth of B1 if you have the Ai's all true. Hence if the A's are simpler predicates to test, more complex ideas can be constructed from simpler building blocks; and complex goals can be matched with a probably larger but generally simpler set of sub-goals. Sadly, as the example of working on Tuesdays described above shows, you may not always be able to get a horn clause. (You may be able to work around it by defining a "not tuesday" predicate, but negation of existing predicates can get messy in logic programming- certainly in my experience with prolog!)
So a clausal form breaks down a formula into a list of the rules it gives for the universe or domain described by the first order language. Testing individual clauses can give insight into predicates contained within the original formula, and this will generally be a lot simpler than trying to infer their truth value direct from that formula. It is also handy when faced with trying to say something useful if presented with incomplete data, as some predicates may have no bearing on the values of others. How then to find a clausal form?
This requires a sentence
A- that is, there can be no free variable
s. It may therefore be necessary to use the universal closure
of A (as if that holds then the original statement A does; although not the converse
- Find the Prenex normal form of A.
- Find Skolem form. This has some effect on logical equivalence which carries through to the clausal form (see below).
- We now have something of the form (FORALL X1)...(FORALL Xn)M(X1,...,Xn) where M is quantifier free. Remove the universal quantifiers- literally, this is a typographical operation, just delete them and keep M.
- Obtain the CNF of M. This gives C1 ^ C2 ^ ... ^ Cj with the C's being disjunctions of literals. This is computationally intensive, being of exponential order relative to the number of predicates.
- Rewrite the conjunction as a list Γ = [ C1, ..., Cj ] .
- Rewrite each Ci as a clause (as discussed in the above theorem). The resultant list Γ of clauses is the clausal form of sentence A.
As noted, when A is skolemized we alter the equivalence. The end result is that any interpretation for which the clausal form is true makes A true; if we have an interpretation I in which A holds then by appropriate definition of skolem functions we can extend I to I* in which the clausal form is true. Thus, the clausal form is satisfiable if and only if the original sentence is satisfiable.
A worked example of clausal form
Some people like other people who do not like anyone
- looks quite a mess! We can put this into a first order language using predicates p(A) to mean A is a person, likes(A,B) to mean A likes B, our domain is all the objects in the universe. This yields:
(EXISTS X)( p(X) ^ (EXISTS Y)( p(Y) ^ likes(X,Y) ^ (FORALL Z)( p(Z) --> ¬likes(Y,Z) ) ) )
That is, Y is a person who doesn't like any things that happen to be people; and X likes Y. We can immediately pull quantifiers through for a prenex form:
(EXISTS X)(EXISTS Y)(FORALL Z) ( p(X) ^ p(Y) ^ likes (X,Y) ^ (p(Z)--> ¬likes(Y,Z)) )
Now we introduce skolem constants- use α for the person in our universe who satisfies X (a person liking Y) and β for the person who likes no other people (i.e. the person who satisfies Y). We'll also drop the universal quantifier on Z as directed:
( p(α) ^ p(β) ^ likes(α , β) ^ ( p(Z) --> ¬likes(β, Z) )
Rewriting the implication gives us
( p(α) ^ p(β) ^ likes(α , β) ^ ( ¬p(Z) V ¬likes(β, Z) )
This is conjunctive normal form; we get a list of clauses as follows:
- --> p(α)
- --> p(β)
- --> likes(α, β)
- p(Z) ^ likes(β , Z) -->
Which is a very simple formulation- it tells us that α is a person; so is β , α likes β ; and it's impossible to find a Z which is both a person and liked by β .