When doing mathematics, it’s generally a good thing if we can only prove things that are true. A key requirement for this is that mathematics is consistent, meaning it contains no contradictions.
While Gödel’s second incompleteness theorem shows that no consistent mathematical system strong enough to express basic arithmetic can prove its own consistency, we can get close to the ideal by choosing a meaningful, intuitively consistent foundation for mathematics and building more complex theories up from there. That’s where axiomatic set theory comes in.
Zermelo-Fraenkel set theory with the axiom of Choice (ZFC for short) was first conceived of in 1908 and reached its final form by the 1930s. Its development was piecemeal, with different mathematicians finding limitations in the theory and suggesting new axioms to fill in the gaps.
ZFC is made up of 9 axioms that we can break down into three parts: saying what counts as a set and how to manipulate them, stating the existence of at least one set, and rules for building new sets from ones we already have.
ZFC also uses the rules of predicate logic which I’ll be omitting here for brevity.
What is a set?
A set is a collection of objects that are grouped together. We write the set containing $1$, $2$, and $3$ as
$$A = \{1, 2, 3\}$$
where $A$ is a set, and $1$, $2$, and $3$ are all ‘elements’ or ‘members’ of $A$. We can write $1 \in A$ to say ‘$1$ is an element of $A$’, which we normally read as ‘$1$ is in $A$’ or ‘$A$ contains $1$’.
Sets are unordered, so $\{1, 2, 3\}$, $\{2, 3, 1\}$, and $\{3, 1, 2\}$ are all the same set.
The elements of sets can also be sets themselves. If we define a new set $B$ as $\{A, 4\}$, then $B$ would look like $\{\{1, 2, 3\}, 4\}$ where we’ve ‘unfolded’ the elements of $A$ for clarity.
Crucially, there’s a difference between ‘the set containing the set containing $1$’ and ‘the set containing $1$’, i.e. $\{\{1\}\} \neq \{1\}$.
We can now introduce our first axiom:
1 - Axiom of extensionality
Two sets are the same set if and only if they have all the same elements.
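For example, if $A = \{1, 2, 3\}$, $B = \{3, 1, 2\}$, and $C = \{1, 2\}$, then $A = B$ because every element of $A$ is an element of $B$ and vice versa (remember that sets are unordered by default), but $A \neq C$ and $B \neq C$ because $3 \in A$ and $3 \notin C$.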
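As a rough sketch (Python is only an analogy here, not part of ZFC), a `frozenset` behaves enough like a finite set to show extensionality in action. The helper name `same_set` is made up for illustration.

```python
# frozenset ignores order, much like the sets described above.
A = frozenset({1, 2, 3})
B = frozenset({3, 1, 2})   # same elements, written in a different order
C = frozenset({1, 2})

def same_set(x, y):
    """Extensionality: every element of x is in y, and vice versa."""
    return all(e in y for e in x) and all(e in x for e in y)

print(same_set(A, B))  # True
print(same_set(A, C))  # False, since 3 is in A but not in C
```

Python's built-in `==` on frozensets makes the same comparison, so `A == B` agrees with `same_set(A, B)`.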
Let’s introduce some rules for making new sets out of ones we already have.
2 - Axiom of pairing
For any two sets, there exists a new set with only those two sets as elements.
This means that if we have any two sets, $A$ and $B$, there exists a new set $C$ (the name doesn’t matter) such that $C = \{A, B\}$, with no other elements.
3 - Axiom of union
For any given set $A$, there exists the set whose elements are exactly the elements of the elements of $A$.
This one’s kind of confusing at first, so let’s go through some examples.
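If we have a set $A = \{\{1, 2\}, \{2, 3\}\}$, then ‘the union of $A$’ according to the axiom of union is $\{1, 2, 3\}$, written $\bigcup A$.
Some more examples: $\bigcup \{\{1\}, \{2\}\} = \{1, 2\}$ and $\bigcup \{\{1, 2\}, \{1, 2, 3\}\} = \{1, 2, 3\}$.
If we have two sets $A$ and $B$ we can use the axiom of pairing and then the axiom of union to create a new set $A \cup B$ which contains all the elements of $A$ and $B$ combined. For example, if $A = \{1, 2\}$ and $B = \{3, 4\}$, pairing gives $\{A, B\} = \{\{1, 2\}, \{3, 4\}\}$ and union then gives $A \cup B = \bigcup \{A, B\} = \{1, 2, 3, 4\}$.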
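The pairing-then-union recipe can be sketched with Python frozensets. This is an informal model for finite sets only, and the function names `pair` and `big_union` are my own, not standard terminology.

```python
def pair(x, y):
    """Axiom of pairing: the set whose only elements are x and y."""
    return frozenset({x, y})

def big_union(family):
    """Axiom of union: collect the elements of the elements of `family`."""
    return frozenset(e for member in family for e in member)

A = frozenset({1, 2})
B = frozenset({2, 3})

paired = pair(A, B)           # the set {A, B}
combined = big_union(paired)  # the elements of A and B together
print(sorted(combined))       # [1, 2, 3]
```

Note that `pair(A, A)` has just one element, since a set cannot contain the same element twice.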
Before we give the next axiom, we should go through the concept of a ‘subset’. A set $A$ is a subset of $B$ if every element of $A$ is also an element of $B$. We write $A \subseteq B$ to say ‘$A$ is a subset of $B$’.
For example, $A$ is technically a subset of $A$ as every element of $A$ is also an element of $A$. In general, every set is a subset of itself. If $A \subseteq B$ and $A \neq B$, then we say that $A$ is a ‘proper subset’ of $B$.
At the other end, if we have a set that has no elements, that is also a subset of $A$ since every element of this set (of which there are none) is also an element of $A$. This ‘empty set’ plays a special role in set theory and has the symbol $\emptyset$.
In-between these two extremes are all the interesting cases. If $A = \{1, 2, 3\}$, then $\{1\}$, $\{1, 2\}$, and $\{2, 3\}$ are all subsets of $A$. Similarly, if $B = \{1, \{2, 3\}\}$, then $\{1\}$, $\{\{2, 3\}\}$, and $\{1, \{2, 3\}\}$ are all subsets of $B$.
This notion of a subset means we can introduce another axiom for building new sets out of ones we’ve already got.
4 - Axiom of power set
For any given set $A$, there exists the set of all subsets of $A$.
This set of all subsets is called the ‘power set’ of $A$, written $\mathcal{P}(A)$.
For example, let $A = \{1, 2\}$. Then $\mathcal{P}(A) = \{\emptyset, \{1\}, \{2\}, \{1, 2\}\}$. In general, if a set has $n$ elements, then its power set has $2^n$ elements. The power set grows very quickly as the sets become large.
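To see the $2^n$ growth concretely, here is a small sketch that builds the power set of a finite set. It only works for finite sets, and `power_set` is a name chosen for this example.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of a finite set s."""
    items = list(s)
    return frozenset(
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    )

A = frozenset({1, 2, 3})
P = power_set(A)
print(len(P))  # 8, which is 2**3
```

Both extremes from the subset discussion show up in the result: the empty set and $A$ itself are elements of `P`.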
Making this real
So far we’ve gone through axioms for knowing when sets are the same, joining them together, ‘unpacking’ them into the elements of their elements, and creating the set of all their subsets. The issue is that we haven’t yet got an axiom that says any specific sets exist — all the rules we have require a starting set to build off of.
Let’s change that.
5 - Axiom of infinity
There exists a set with infinitely many members.
Specifically, there exists a set $I$ that we build inductively from two facts:
- $\emptyset \in I$ ($I$ contains the empty set)
- For every element $x \in I$, $x \cup \{x\}$ is also in $I$
This means that $\emptyset$ is in $I$, and so is $\{\emptyset\}$, and so is $\{\emptyset, \{\emptyset\}\}$, and $\{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}$, and so on and so on without limit. Each successive element in $I$ contains all the elements before it.
We now have a set we can use! You could take the power set of $I$, for example, and get an extremely large set. We also know that the empty set exists, as it forms the foundation of this infinite set $I$. Unfortunately we have no way to access specific sets within $I$ as none of the axioms so far allow us to do so. In order to get at any of these sets inside $I$, we need still more axiomatic machinery.
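The inductive rule is easy to play with in code. The sketch below generates the first few elements of such a set, assuming the usual successor step (take a set and add the set itself as one extra element). The names `successor` and `first_elements` are invented for this example, and Python can obviously only produce finitely many stages.

```python
EMPTY = frozenset()

def successor(x):
    """Build the next element: everything in x, plus x itself."""
    return x | frozenset({x})

def first_elements(n):
    """Return the first n sets produced by starting from the empty set."""
    out, current = [], EMPTY
    for _ in range(n):
        out.append(current)
        current = successor(current)
    return out

stages = first_elements(4)
print([len(s) for s in stages])  # [0, 1, 2, 3]
```

Each stage has exactly one more element than the last, and every earlier stage is an element of every later one.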
Needles in a haystack
We need some way to get at the sets inside $I$ if we want a useful theory of sets. To do this, we bring in the power of predicate logic to specify which sets we want using logical statements.
6 - Axiom schema of specification
Any definable subset of a set is itself a set.
Definable here means ‘definable in predicate logic’. Predicate logic deals with statements about objects. These statements can claim that every object in a given collection satisfies some property, or claim that there exists at least one object that does.
I’m not going to go over the details of predicate logic here to keep things simple. Wikipedia’s page on ZFC has a good formulation of every axiom in formal logic. The most important idea is that we can use these statements to extract specific subsets from existing sets.
For example, let’s say we want to get the set containing the empty set out of our infinite set $I$. The property that makes the empty set special is that it has no elements. To specify this property in predicate logic, we can write
$$\forall y\,(y \notin x)$$
which reads as “for all $y$, $y$ is not in $x$”. You could also write this as $\neg\exists y\,(y \in x)$ to say “there does not exist $y$ where $y$ is in $x$”. These statements are equivalent and both imply that $x$ has no elements.
To pick out the empty set in particular, we also need to say that this set that we’re aiming for is in $I$:
$$x \in I \land \forall y\,(y \notin x)$$
The wedge symbol in the middle means ‘and’, so this reads as “$x$ is in $I$ and for all $y$, $y$ is not in $x$”. This statement now uniquely defines the empty set but we need a little more syntax to go from just a description of the empty set to the set containing it. That syntax, called ‘set-builder notation’, looks like
$$\{x \mid x \in I \land \forall y\,(y \notin x)\}$$
which reads as “the set with elements $x$ such that $x$ is in $I$ and for all $y$, $y$ is not in $x$”, i.e. the set containing the empty set.
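Set-builder notation maps quite naturally onto filtering. In this sketch a three-element `fragment` stands in for the infinite set (which Python cannot hold), and `specify` and `has_no_elements` are hypothetical helper names.

```python
EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})

# A finite stand-in for the infinite set: just its first three elements.
fragment = frozenset({EMPTY, ONE, TWO})

def specify(source, predicate):
    """Specification: keep the elements of `source` satisfying `predicate`."""
    return frozenset(x for x in source if predicate(x))

def has_no_elements(x):
    # For a finite set, "no y is in x" is the same as having size zero.
    return len(x) == 0

result = specify(fragment, has_no_elements)
print(result == frozenset({EMPTY}))  # True
```

The key design point is that `specify` always takes an existing set as its first argument, so it can only ever carve out a subset of something we already have.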
7 - Axiom schema of replacement
The image of a set under any definable function will fall inside a set.
Some terminology: we say that the set of all inputs to a function is the function’s domain, and the set of all possible outputs is the function’s range. This axiom schema then means that all the possible outputs of any definable function will fall inside a set.
We haven’t explicitly defined functions yet which makes this a bit of sleight of hand. I’ll leave out the full definition as it’s not essential to explaining ZFC, but trust that it can be done by constructing a few more structures. The semi-rigorous understanding of functions as structures that take in a single input and spit out a single output is enough at this level.
(For the curious, a function is a total and functional binary relation, which is itself a subset of the cartesian product of two sets, which is a set of ordered pairs — paste this into ChatGPT or Claude for a full explanation. That sentence is not particularly helpful unless you know what all those words mean, but serves as a good example of the trade-off between brevity and intelligibility when using terminology.)
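For a feel of what replacement gives us, here is a loose sketch for finite sets: apply a function to every element of a set and collect the outputs. An ordinary Python function stands in for a ‘definable function’, and the names `image` and `wrap` are made up for this example.

```python
EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})

source = frozenset({EMPTY, ONE, TWO})

def image(domain, f):
    """Collect the outputs of f over every element of `domain`."""
    return frozenset(f(x) for x in domain)

def wrap(x):
    """An example function: send each set x to the set containing only x."""
    return frozenset({x})

outputs = image(source, wrap)
print(len(outputs))  # 3
```

Unlike specification, the outputs do not have to be elements of the starting set, which is what lets replacement reach sets that filtering alone cannot.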
Note that the axiom schema of specification and the axiom schema of replacement are called schemas because there’s one axiom for every valid logical statement / function, as opposed to just one axiom like the others.
These are incredibly powerful axiom schemas that do most of the work in ZFC. I’ve only given one example of a logical statement, but the interaction of the seven axioms we’ve gone through so far produces huge syntactical richness.
Avoiding paradox
To motivate the necessity for a formal system like ZFC, let’s go into the history of set theory.
In the past, mathematicians used ‘naive set theory’, in which any set that you can specify with predicate logic is valid. This is a much stronger assumption than the axiom schemas of specification and replacement, which only allow for definable subsets and images of sets that already exist. This was how set theory was done for many years after its invention by Georg Cantor in 1874 with the paper “On a Property of the Collection of All Real Algebraic Numbers”.
Naive set theory ran into issues around the turn of the century, however. Because nothing restricted which sets could be specified, you could construct objects like ‘the set of all cardinals’, ‘the set of all ordinals’, or, most well-known, ‘the set of all sets which do not contain themselves’. Constructing these sets produces Cantor’s paradox, the Burali-Forti paradox, and Russell’s paradox respectively. Crucially, these are not just counterintuitive results that seem confusing at first. They are logical contradictions that invalidate the entirety of naive set theory by their existence. I’ll explain Russell’s paradox to convey the magnitude of the problem.
Under naive set theory, we can construct the set of all sets that do not contain themselves in two steps:
- Define the set of all sets: $V = \{x \mid x = x\}$
- From that set, take the subset of sets that do not contain themselves: $R = \{x \in V \mid x \notin x\}$
Now consider $R$: does $R$ contain itself? If $R$ contains itself, then it does not contain itself by its own definition which is a contradiction. If $R$ does not contain itself, then it satisfies the statement in its definition and must contain itself, another contradiction. The existence of the set $R$ causes a paradox within naive set theory and invalidates the theory.
One part of the issue here is the unrestricted use of predicate logic. ZFC overcomes these issues with naive set theory by restricting predicate logic to specifying subsets of existing sets using the axiom schema of specification. This means problematic sets like ‘the set of all cardinals’ or ‘the set of all ordinals’ are simply not allowed.
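The other part is that naive set theory allows sets to contain themselves, e.g. the set of all sets must be an element of itself. Restricting specification is already enough to block Russell’s paradox, but ZFC also rules out self-containing sets by introducing another axiom.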
8 - Axiom of foundation
Every non-empty set must contain an element that is $\in$-minimal.
A $\in$-minimal (read ‘in-minimal’) element is an element that does not contain any of the elements in the set, including itself. For example, in the set $\{\emptyset, \{\emptyset\}\}$, $\emptyset$ is $\in$-minimal as it does not contain $\{\emptyset\}$, but $\{\emptyset\}$ is not $\in$-minimal as it contains $\emptyset$ which is in the outer set.
This bans sets from containing themselves, so ‘the set of all sets’ cannot exist. ‘The set of all sets that do not contain themselves’ would then be the same thing as ‘the set of all sets’ (as self-containment is banned), so it cannot exist either.
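Checking for in-minimal elements is mechanical for finite sets, so here is a small sketch. It assumes every element is itself a frozenset, and `in_minimal_elements` is a name invented for this example.

```python
EMPTY = frozenset()
ONE = frozenset({EMPTY})

def in_minimal_elements(s):
    """Return the elements of s that share no element with s."""
    # x & s is the set of things that are both in x and in s.
    return frozenset(x for x in s if not (x & s))

outer = frozenset({EMPTY, ONE})
print(in_minimal_elements(outer) == frozenset({EMPTY}))  # True
```

Python's frozensets cannot be built to contain themselves, so every non-empty example you try here has at least one in-minimal element, matching what the axiom demands.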
There’s one more axiom left that makes the ‘C’ in ZFC.
Extending ZF
9 - Axiom of choice
For any collection of non-empty sets, we can ‘choose’ one element from each set in the collection.
This statement is provably true using the other axioms for finite collections of sets, but unprovable for infinite collections.
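The finite case is simple enough to write down directly. This sketch picks the smallest number from each set, which is an assumption that only works because the elements here are integers. The name `choose` is hypothetical.

```python
def choose(collection):
    """Map each non-empty set in a finite collection to one of its elements."""
    # Taking the minimum is an explicit rule, so no axiom is needed here.
    return {s: min(s) for s in collection}

collection = {frozenset({3, 1}), frozenset({5}), frozenset({2, 9})}
picks = choose(collection)
print(all(picks[s] in s for s in collection))  # True
```

The axiom of choice earns its keep in situations where there is no rule like `min` available to describe the selection, and the collection is infinite.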
Being unprovable does not make a statement false. The axiom of choice is in fact independent of the other eight axioms, meaning that neither it nor its negation can be proved from them (assuming they are consistent), so it can be taken as either true or false without introducing a contradiction that wasn’t already there. Hence mathematicians treat it as safe to assume and it’s widely used across different fields.
More disconcerting are the results that can be proved using the axiom of choice, such as the Banach-Tarski paradox. I’m personally okay with the axiom of choice as it makes sense and allows us to prove meaningful results about the structure of mathematical objects.
Taking stock
We now have an axiomatic system of set theory that is powerful enough to express all of standard mathematics. To recap:
- Axiom of extensionality: sets are defined by their elements
- Axiom of pairing: sets can be paired up
- Axiom of union: sets can be unpacked
- Axiom of power set: the set of all subsets exists
- Axiom of infinity: there exists an infinite set
- Axiom schema of specification: we can specify subsets of any set
- Axiom schema of replacement: the outputs of any function fall inside a set
- Axiom of foundation (also called the axiom of regularity): sets cannot contain themselves
- Axiom of choice: we can always pick one element from each set in a collection
Using these nine axioms, we can build up to number theory, calculus, algebraic geometry, and beyond. They provide a solid foundation that’s free from the known paradoxes of naive set theory, ensuring mathematical results are meaningful even when they get far, far away from the axioms of ZFC.