Diagonalisation
& Paradox
Diagonalisation
Diagonalisation
is a method of indirect proof (via reductio) developed by Georg
Cantor. It is difficult to overestimate the importance of the method. It is the
basis for transfinite number theory, set theory, the set theoretic paradoxes,
the semantic paradoxes, Gödel’s incompleteness theorem, Tarski’s
indefinability theorem, the Church-Turing thesis,
decidability results,
etc.
The
method is called ‘diagonalisation’ after the intuitive
geometrical presentation of Cantor’s results on the cardinality (size) of sets,
and real numbers in particular. Informally, the method consists in the
demonstration that the assumption that a given ‘realm of objects’(let
us keep to sets of numbers, although the procedure applies also to formulae;
see appendix I) satisfying some condition is denumerable - in 1:1
correspondence with the natural numbers - leads to contradiction, i.e., one can
construct a set that satisfies the condition but which can’t be on one’s ‘list’
on pain of contradiction. The set is formed by a kind of accidental
self-reference. By the enumeration, each set has a number, a numerical name, as
it were. Therefore, one can define a set in terms of a condition pertaining to
the membership relation between sets and their ‘names’. In itself, this is
trivial and does not necessarily lead to paradox or contradiction. It was
Cantor’s genius to realise that certain conditions
can be specified which entail that their satisfying sets can’t be ‘named’.
Formally,
the typical technique is a construction of a set numbers defined by an anti-diag formula:
(D)
("x)[x ε
D(E) ↔ x Ï
Sx]
D(E)
is here the defined set formed from the enumeration E; it is defined such that
it consists of all those numbers x
that are not members of the set S which they ‘name’ in E. For example, 5
is in D(E) just if 5 is not in the fifth set of E.
Cantor’s
Theorem: The power set of set S (the set of subsets of S) is greater than S
Hypothesis:
P(S) £
S
(The
power set of S is less than or equal to S.)
Assume
P(S) < S.
(The
power set of S is less than S.)
But,
("x)($y)[x ε
S ↔ y f
S & x ε
y], i.e., y = the unit set {x}.
(Anything
is a member of S just if it is a member of a subset of S.)
Therefore, P(S)
Û
S.
(The
power set of S is not less than S.)
Assume
P(S) = S.
(The
power set of S is equal to S.)
Therefore,
("x)($y)[x ε
S → y f
S & f(x) = y];
(Anything
is a member of S only if there is a subset of S to which it corresponds. Let ‘f(x)
= y’ mean ‘the set to which x corresponds is y’.)
It
follows that x ε
f(x)
v x Ï
f(x).
(x is either a
member of the set it corresponds to or isn’t.)
Take
the set: {x: x Ï
f(x)}.
(The
set of members of S which are not members of the sets to which they
correspond.)
So,
($z)[f(z) = {x: x Ï
f(x)}]
(There
is some member of S which corresponds to the set of members of S which are not
members of the sets to which they correspond.)
Assume
f(a)
= {x: x Ï
f(x)}.
(Let a
be the member of S that corresponds to the set of members of S which are not
members of the sets to which they correspond.)
Assume
a
ε
{x: x Ï
f(x)}.
(Let a
be a member of the set of members of S which are not members of the sets to
which they correspond.)
Contradiction:
a
corresponds to and is a member of the
set of objects which are not members
of the sets to which they correspond.
Assume
a
Ï
{x: x Ï
f(x)}.
(Let
a
not be a member of the set of members of S which are not members of the sets to
which they correspond.)
Contradiction:
a corresponds to and is not a member of the set of objects which are not members of the sets to which they
correspond.
Therefore,
¬($z)[f(z)
= {x: x Ï
f(x)}]
Therefore,
¬("x)($y)[x ε
S → y f
S & f(x) = y].
Therefore,
¬(P(S) = S)
Therefore,
P(S) > S
Russell’s
Paradox
Sainsbury’s
Reading
Sainsbury
seeks to support Russell’s reasoning via an appeal to Cantor’s Theorem; that
is, the generation of paradox in Russell’s case is legitimate, for the same
diagonal reasoning is employed in Cantor’s Theorem. This is a total misreading of the situation.
For
Russell, Cantor’s ‘Theorem’ was a paradox. Consider:
There
is a greatest of all infinite numbers, which is the number of all things
altogether, of every sort and kind. It is obvious that there cannot be a
greater number than this, because if everything has been taken, there is
nothing left to add. Cantor has a proof that there is no greatest number… But
in this point, the master has been guilty of a very subtle fallacy, which I
hope to explain in some future work.
¾
Russell, ‘Mathematics and Metaphysicians’, 1901
In
other words, whereas one can ‘diagonalise out’ new
sets, certainly up to and beyond ±0,
one can’t diagonalise the putative set of everything.
In a letter to Frege (June 1902), Russell said he had
an argument for the class of all things being of the same cardinality as the
class of all classes. (Presumably, Russell’s argument traded on the thought
that classes are ‘things’; if there were more classes than things, then classes
wouldn’t be things.) Cantor, on the
other hand, was utterly sanguine about an open-ended sequence of alephs.
Russell
was “led to [the Russell paradox] in the endeavour to
reconcile Cantor’s proof that there can be no greatest cardinal number with the
very plausible supposition that the class of all terms has necessarily the
greatest possible number of members” (Principles,
§100, 1903).
By
the time of Principles, Russell had
softened: there was nothing wrong with Cantor’s reasoning, but it leads to contradiction
when confronted with “large classes”. Should we dump such classes? “[I]t is
undesirable to adopt so desperate a measure as long as hope remains of some
less heroic solution” (ibid. §344).
Russell’s
Paradox for Cantor, 1903
(1)
Let S be Russell’s class of
everything, i.e. the set of all sets.
(2)
Let f: U µ P(U)
such that f(x) = x, if x is a set; f(x) = {x} otherwise.
(3)
Applying Cantor’s anti-diag construction to S gives us R, i.e., the class of all classes that are not members of
themselves.
(4)
Is there an x such that f(x)
= R?
(5)
By Cantor’s reasoning there can be no such x,
i.e., the lack of such an x shows
that there are more subsets than members - R
is ‘diagonalised out’.
(6)
But, transparently, by (2), f(R) = R.
(7)
So, where x = R, if x ε
f(x),
then R ε
R.
(8)
If R is a member of R, then it is a member of itself, but R
consists of all and only those sets which are not members of themselves. So R
can’t be a member of R.
(9)
But, if x Ï
f(x),
then R Ï
R.
(10)
If R is not a member of R, then it is not a member of itself, but R
consists of all and only those sets which are
not members of themselves. So R
must be a member of R.
(11)
Therefore, there is no R.
(12)
Therefore, there is no diagonalisation from S.
Russell only later realised that the paradox applies not just to Cantor’s
reasoning in particular, but to his own assumptions about sets.
Hence we have the paradox in its familiar form (expressed in a letter to Frege (June 1902) and published in Appendix A of Principles).
Frege’s
‘Comprehension Axiom’ (Axiom V of The
Basic Laws): ("F)($y)[Fx ↔ x ε
y]
Instantiating:
x
ε
{x: R Ï
R}↔ x ε
R,
This
gives us the contradiction:
R
ε
R & R
Ï
R
Therefore,
¬("F)($y)[Fx ↔ x ε
y]
Zermelo-Fraenkel
Set Theory
Russell’s
paradox is not a refutation of Cantor, for Cantor simply didn’t accept a set of
everything. Russell requires an independent argument for the coherence of such
a notion.
In
ZF, Russell’s paradox is avoided by two axioms.
Axiom of Foundation
("x)($y) ¬($z)[x ¹
{x: x ¹
x}→ z ε
y & y ε
x & z ε
x]
(Every
non-empty set x has a minimal member y such that there is no member z
of y that is a member of x.)
Axiom of Replacement
(schema)
f
is a function → ("z)($y)("x)[x ε
y ↔ ($w)[w ε
z & f(w) = x]]
(For
every set z, there is a set y such that, for all x, x is a member of y just
if there is a member w of z and x is a value of a function on w.
In other words: every set can be specified in terms of a function defined over
the subsets of a set.)
These
axioms suffice to block Russell’s Paradox and, with the rest of ZF+C, are not
known to be inconsistent.
Sainsbury
suggests that we need a philosophical resolution to the paradoxes, not ad hoc stipulations (axioms?), and Russell,
as opposed to Cantor, presumably, is to be credited with seeking such a
resolution. This is spurious on three grounds.
(1)
The ‘paradoxes’ are formal in the first place insofar as we are concerned with
set theory. The only problem is: if sets cannot be defined as intelligible
aggregates, i.e., if Cantor/Frege naïve set theory is
inconsistent, then what should be our conception of a set? The air of paradox
is removed by so little as saying that the Russell paradox was a reductio of naïve
set theory.
(2)
AF and AR are neither ad hoc, nor
merely stipulative. They express the idea that a set
is a cumulative object, i.e., the
members of a set are formed/specifiable independently of the set itself.
(3)
Cantor was well aware that his own diagonalisation
method could be applied to his sets. He corresponded with Hilbert about the ‘Burali-Forti paradox’ in 1896 - a year before Burali-Forti’s publication. Besides, it is well appreciated
in the literature that Cantor presumed well-founded sets insofar as he took all
sets to be reducible to ur-elements.
The same can’t be said for Frege.
The
Liar
Solutions?
(i) Gaps
Strengthened
liar
(ii)
Hierarchy
Artificial.
No evidence for the hierarchy. The cure is worse than the disease.
(iii)
Syntax
(a)
Too inclusive
(b) Syntactic self-reference is a legitimate technique in
logic/maths
(c)
The liar doesn’t depend on syntactic self-reference anyhow.
Comments
on Hierarchy and Tarski
Sainsbury
reads Tarski as arguing that “our actual concept of
truth is incoherent”; it “needs to be replaced by a series of concepts of
truth, hierarchically arranged, and each expressed in a language different from
any natural language” - “This is the famous “Tarski
hierarchy””. All of this is nonsense.
(1)
Tarski didn’t think that colloquial truth is
incoherent, or that natural languages are inconsistent.
Consider:
At
first blush it would seem that [“everyday”] language satisfies [the conditions
for the generation of paradoxes], that that therefore it must be inconsistent.
But actually the case is not so simple. Our everyday language is certainly not
one with a specified structure. We do not know precisely which expressions are
sentences, and we know to an even smaller degree which sentences are to be taken as assertible [Tarski means
‘provable’]. Thus the problem of consistency has no exact meaning with respect
to this language. We may at best only risk the guess that a language whose
structure has been exactly specified and which resembles our everyday language
as closely as possible would be inconsistent.
¾
Tarski, The Semantic
Conception of Truth, 1944.
(2)
Tarski had no interest in offering the correct
‘theory’ of truth. Consider:
I
do not have the slightest intention to contribute in any way to those endless,
often violent discussions on the subject ‘What is the right conception of
truth?’ I must confess that I do not understand what is at stake in such
disputes; for the problem itself is so vague that no definite
solution is possible. In fact, it seems to me that the sense in which the
phrase ‘the right conception’ is used has never been made clear. In most cases
one gets the impression that the phrase is used in an almost mystical sense
based upon the belief that every word has one ‘real’ meaning (a kind of
Platonic or Aristotelian idea), and that all the competing conceptions really
attempt to catch hold of this one meaning; since, however, they contradict each
other, only one attempt can be successful, and hence only one conception is the
‘right’ one.
(op. Cit.)
(3)
There is no Tarski hierarchy.
A
Tarski definition of a truth predicate (for objectlanguage L
in metalanguage L+)
is, in effect, a relative consistency proof for L. That is, the definition shows that L is consistent relative to L+.
Contrapositively: if we couldn’t consistently
introduce a truth predicate into L+,
then L would be inconsistent.
Tarski
wasn’t the least bit interested in hierarchies. He was concerned to offer rational reconstructions of semantic notions (truth, definition,
consequence, etc.) in terms of which we can consistently theorise
the properties of specific formal systems (especially, for Tarski,
arithmetic, algebra, and set theory). Thus, the import of a definition of a truth
predicate is exhausted by the specifiable consequences for the objectlanguage. The lack of a general definition of truth
is no lack at all for Tarski.
Mind
the Gap
(1)
A ‘gap’ account faces perhaps a graver problem than the strengthened liar. A gap
account is necessarily one which says
that it’s not the case that the liar is true or it’s
negation is true:
(G)
¬[Tr(k)
v Tr(¬k)].
But
G must fall into any gap into which the liar falls. Hence, a gap theory can’t
cover the putative truth of its own thesis.
Kripke recognised this deficiency of gaps and conceded that our
intuitive truth concept is bivalent: “[W]e still cannot avoid the need for a metalanguage … the goal of a universal language seems
elusive… [a Tarskian predicate] express[es] the genuine concept of truth [for L]” (‘Outline of a Theory of Truth’, 1975)
(2)
Sainsbury is quite wrong to treat the truth teller - This sentence is true - as on a par with the liar. The truth teller
falls into no gap; rather, it’s truth value is
arbitrary (in Kripke’s language, it can be in the
extension of TRUE or FALSE at any fixed point). The liar tells us that the
union of truth and our logic is potentially inconsistent.
The truth teller tells us that the union of truth and our logic is incomplete.
A
Solution
Whereas
gap accounts say that the liar is neither true nor false, it might be more
felicitous to say that the liar is true and
false. This just means that we accept that any ‘global’ truth concept admits diagonalisation. The price we pay is that ‘true’ has no fixable extension.
Appendix
I: Godel’s Diagonalisation
Gödel’s
‘diagonal argument’ for the incompleteness of arithmetic proceeds along a
different course. Gödel proves a ‘diagonal lemma’ that effects
‘self-reference’ within the theory T
(a first-order axiomatisation of PA).
Gödel
first constructs a diagonalisation of an expression v:
(D-v)
($x)[x = évù
& v],
where,
if v
is a formula of arithmetic with only x
free, then (D-v)
says that v
is true of its gödel-number, i.e., (D-v)
is true iff v is true of its gödel-number.
Gödel
further constructs a function diag(x) such that if n is the gödel-number of
v,
then diag(n)
is the gödel-number of the diagonalisation
of v.
Gödel
proves that, for a theory T in which diag(x) is representable, for all formulae F(y), with just y free, there is a sentence
G such that
├
T G ↔ F( é
G ù
)
In
other words, for every open formula of arithmetic F(y), one can construct a sentence, via the substitution of a numeral
for y, that is provably equivalent to
the sentence whose gödel-number has the numeral for
which y was substituted.
In
particular, one can construct a formula - ¬($x)[B(x, y)]
- which says that there is no x such
that x is a proof of y. The above results show that, where
the numeral for the gödel-number g of the formula is
substituted for ‘y’, there will be a
provably equivalent sentence G such that, if G were provable, then there would
be a proof with gödel-number k, hence B(g, k),
and ‘($x)[B(x, k)]’
would be provable, contradicting the consistency of T. If ¬G were provable, then either B(0, k) or B(1, k) or…, in which case ‘($x)[B(x, k)]’
would be provable, contradicting the consistency of T. Therefore, there is a formula of T such that, on the assumption that T is consistent, neither it nor its negation is provable.
Appendix
II: Löb’s Paradox
How
to ‘prove’ that God exists by just using propositional logic with identity.
(D)
If D
is true, then God exists.
1
(1) D
is true A
1
(2) ‘If D
is true, then God exists’ is true
1 substitutivity =
1
(3) If D
is true, then God exists 2 T↔
1
(4) God exists 1,3 MPP
(5) If D
is true, then God exists 1,4
CP
(6) ‘If D
is true, then God exists’ is true
5 T↔
(7) D
is true
6 substitutivity =
(8) God exists
5,7 MPP
It
would appear that we can prove anything we like! But…
If
we read the conditional materially (i.e. false just if the antecedent is true
and the consequent is false), then:
(a)
If God exists, then D
is true, for the consequent of D
is true, which suffices for the truth of D.
(b)
If God doesn’t exist, then D
is paradoxical.
(i) Assume D
is true. Since D’s
consequent is false, it follows that its antecedent must also be false if D
is to be true. But the antecedent says that D
is true. Therefore, D
can be true only if D
is false.
(ii)
Assume D
is false. Since D’s
consequent is false, it follows that its antecedent must be true if D
is to be false. But the antecedent says that D
is true. Therefore, D
can be false only if D
is true.