\[\newcommand{\N}{\mathbb{N}} \newcommand{\Q}{\mathbb{Q}} \newcommand{\R}{\mathbb{R}}\]

You may already know the punchline of Georg Cantor’s work on infinity, which is that there are different sizes of infinity. I’ve also “known” this for a while, but it was one of the many mathematical curiosities that I could recite without really understanding at any deep level. Recently, while learning about Gödel’s theorems and computability, I ran headfirst into a practical consequence of this result that I’d like to discuss in this post.

What does it mean for two infinite sets to have different sizes?

For the record, this has always been a very counterintuitive idea to me. Growing up, I heard your typical grade-school examples about infinity: infinity + 1 = infinity, infinity * 2 = infinity, or even infinity * infinity = infinity. From these examples, I drew the natural conclusion that infinity was a sort of black hole from which you cannot escape. Once something was infinite, it didn’t really matter what you did to it; it just stayed infinite. A perhaps less well-founded extrapolation was that there was only one infinity. Given its black-hole-like nature, it seemed impossible to distinguish between two things that were both infinite - so maybe they were the same size.

So it struck me as very odd when I learned that, in fact, there are different sizes of infinities, with the typical examples being “countable” vs. “uncountable”. What does that even mean?

Here is a rigorous definition of what it means for two sets (infinite or not) to be the same size:

Two sets are the same size if there is an invertible function that maps between them (i.e. a one-to-one correspondence, or bijection).

With that in mind, let’s work through our grade-school examples. In each case, I will pick two sets whose sizes correspond to the number on each side of the equation. For example, if the right-hand side of one equation is infinity, I might pick the set of all natural numbers, \(\N\), to represent that number. Then, we will check whether the sets representing each side of the equation are the same size (according to the definition stated above). If they are, then we can argue that the equals sign in the equation is justified.

  • infinity + 1 = infinity
    • Set for left hand side (infinity + 1): the set of all natural numbers with the addition of the single negative number \(-1\) (i.e. \(\{-1\} \cup \N\)).
    • Set for right hand side (infinity): the set of all natural numbers, \(\N\).
    • Invertible function that maps elements of the left set to elements of the right set: \(f(x) = x + 1\).
    • So, indeed, those sets are the same size.
  • infinity * 2 = infinity
    • Set for LHS (infinity * 2): The set of all natural numbers, \(\N\).
    • Set for RHS (infinity): The set of all even natural numbers.
    • Invertible function that maps each natural number to a unique even number: \(f(x) = 2x\).
    • So, again, according to our definition of what it means for two sets to have the same size, the set of even natural numbers and the set of all natural numbers have the same size.
  • infinity * infinity = infinity:
    • Set for LHS (infinity * infinity): The set of all rational numbers (fractions), \(\Q\).
    • Set for RHS (infinity): The set of all natural numbers, \(\N\).
    • Why am I saying that the set of all rational numbers has a size of infinity * infinity? Many of you will have seen this before, but look at the table below, which shows one way of arranging all rational numbers into a 2-D grid. Each side of the grid has all the natural numbers (except for 0 in the denominator), so the total number of squares in the grid (and therefore rational numbers) is infinity * infinity.
    • Invertible function that maps each rational number to a unique natural number: It turns out, this function exists, but it’s a bit more complicated than the last two. The trick is to iterate through the 2-D grid in a zig-zag fashion as opposed to the more intuitive way of counting row by row. I recognize that’s not nearly enough to intuit the solution if you haven’t already seen it, so rather than reproduce it in full here, I will refer you to this short paper (and to the small sketch after the table below).
Denominator ↓ \ Numerator →    0      1      2      3
1                              0/1    1/1    2/1    3/1
2                              0/2    1/2    2/2    3/2
3                              0/3    1/3    2/3    3/3

One way of arranging all rational numbers in a 2-D grid.
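
To make the zig-zag idea a bit more concrete, here is a minimal Python sketch (my own illustration, not the construction from the paper) that walks the grid diagonal by diagonal. It covers only the positive rationals, and it skips duplicates like 2/4, so every positive rational shows up at exactly one finite index:

from math import gcd
from itertools import islice

def rationals():
    # Walk the grid diagonal by diagonal; along one diagonal,
    # numerator + denominator is constant.
    total = 2
    while True:
        for num in range(1, total):
            den = total - num
            if gcd(num, den) == 1:  # skip duplicates such as 2/4 == 1/2
                yield f"{num}/{den}"
        total += 1

print(list(islice(rationals(), 10)))
# ['1/1', '1/2', '2/1', '1/3', '3/1', '1/4', '2/3', '3/2', '4/1', '1/5']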

From the three examples above, you can see how we can use our definition (of what it means for two sets to have the same size) to answer the question of whether two particular sets are the same size. It may not be easy to find the invertible function between the two sets, as in the case of the rational numbers, but at least the criterion is clear.

Before moving on, we should point out at least one example of two infinite sets that are not the same size, and I will use the classic example of the real numbers \(\R\) and the natural numbers \(\N\). Instead of actually proving that you cannot find an invertible mapping between these two sets (which is not an easy thing to prove), I will suggest two things:

  1. Try it! Try to construct an invertible function that takes a real number and produces a natural number (or vice versa) and you will get an intuitive sense for why it’s impossible.

  2. Read Cantor’s diagonal argument for a proof of why it cannot be done.
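
For a taste of how the diagonal argument works, here is a tiny Python sketch on a finite, made-up table (the real argument uses an infinite table of digit sequences):

# A hypothetical list that claims to enumerate all binary sequences
# (truncated to 4 digits here for illustration).
table = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]

# Flip the diagonal: the new sequence differs from row i at position i,
# so it cannot equal any row in the table.
missing = [1 - table[i][i] for i in range(len(table))]
print(missing)  # [1, 0, 1, 0]

Applied to a genuinely infinite table, the flipped diagonal is a digit sequence (i.e. a real number) that the claimed enumeration necessarily missed.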

To sum up, two sets are the same size if it’s possible to construct an invertible function that maps between them. A set being countably infinite means that it is the same size as the natural numbers (e.g. \(\Q\)). A set being uncountably infinite means that it is bigger (e.g. \(\R\)). Uncountably infinite is a blanket term that encompasses infinitely many different sizes.

Why does this matter in real life?

I’m sure there are many good answers to this question, but I’m going to use the one that I stumbled across while learning about Gödel’s theorems and computability.

How do computers represent things? With sequences of bits. In other words, computers represent things in binary.

This is almost too obvious to say, but there’s an invertible mapping between sequences of bits and the natural numbers - just consider the bits to be the base-2 representation of the number (that’s basically what binary means).
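
In Python, for instance, that mapping is essentially two built-ins (using bit strings without leading zeros):

def nat_to_bits(n):
    return bin(n)[2:]  # bin(5) == '0b101'; drop the '0b' prefix

def bits_to_nat(bits):
    return int(bits, 2)

# Round trip: the two functions invert each other.
for n in range(100):
    assert bits_to_nat(nat_to_bits(n)) == n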

So, if an arbitrary set \(X\) is countably infinite then, by the definition of countably infinite, there’s an invertible mapping between \(X\) and \(\N\). If you have that mapping, then it’s trivial to construct a mapping between \(X\) and sequences of bits, the practical consequence of which is that you can represent elements of the set \(X\) on a computer!
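
As a concrete sketch of that composition, here is one way (of many) to do it for pairs of naturals, which gets you most of the way to \(\Q\): the Cantor pairing function takes a pair to a unique natural number, and base 2 takes it the rest of the way to bits.

from math import isqrt

def pair(a, b):
    # Cantor pairing: an invertible map from pairs of naturals to naturals.
    return (a + b) * (a + b + 1) // 2 + b

def unpair(z):
    # Inverse of pair().
    w = (isqrt(8 * z + 1) - 1) // 2
    b = z - w * (w + 1) // 2
    return (w - b, b)

# Represent the pair (3, 7) as a sequence of bits, then recover it.
bits = bin(pair(3, 7))[2:]
assert unpair(int(bits, 2)) == (3, 7)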

So, what goes wrong when the set in question is uncountably infinite, like \(\R\)? Well, you cannot come up with a unique natural number - and therefore a unique sequence of bits - for each real number. At some point, multiple real numbers will need to share the same representation on the computer. And this leads to well-known issues such as:

>>> 0.1 + 0.2
0.30000000000000004
>>> 0.1 + 0.2 - 0.3 == 0
False

The code snippet above was produced using Python, but most programming languages will suffer the same fate. And if they don’t for this example, then they will for some other example.

In fact, infinitely many real numbers will end up sharing the same representation. No matter which countably infinite set of real numbers you pick to represent perfectly, an uncountably infinite number of real numbers will be “left over”. In practice, we squash an uncountably infinite number of real numbers into every unique floating point number and hope the loss of precision isn’t too important.
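
You can see this squashing directly in Python: floats form a discrete grid, and every real number strictly between two adjacent grid points collapses onto one of them (math.nextafter requires Python 3.9+):

import math

# The next representable float after 0.1; no float exists in between.
print(math.nextafter(0.1, 1.0))  # 0.10000000000000002

# Two different decimal numbers, one shared float representation:
print(0.1 == 0.1000000000000000055511151231257827)  # True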

Are real numbers forever doomed to be misrepresented on computers?

This is highly speculative (in the sense that I’m not sure I know what I’m talking about), but I think that the way quantum computers represent things may make it possible for them to accurately represent real numbers. Of course, even if true, that only buys us one additional level of infinity over classical computers. Any sets that are “bigger” than \(\R\) will fail to be represented even on quantum computers.

Why do I think that quantum computers can represent real numbers? Well, as opposed to a classical bit, which is either 0 or 1, a qubit (a bit on a quantum computer) is in some superposition of 0 and 1. One way to think about that is that there is some probability \(p\) that, when measured, the qubit will be a 0 and probability \((1-p)\) that it will be a 1. So, in some sense, you need a real number, \(p\), for each qubit to represent the state of the computer at any given time. In fact, you need more than that due to entanglement (correlation) between qubits, but that just strengthens my argument.