There’s a lot to say about Bayes theorem. And most of it’s been said. In my little echo chamber of the internet, just about every YouTuber that I watch or blogger that I read has talked about Bayes theorem. So the critical question is: what do I have to add?

Just this for now. Almost every introduction to Bayes theorem introduces the formula like so:

$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$

I think the reason that this formulation is so common is that when you want to use Bayes theorem, you’re probably trying to compute $$P(A|B)$$. The fact that this formula has isolated $$P(A|B)$$ on one side makes it practical to use.

However, I’d argue that an introduction to a topic should prioritize giving the reader an intuition for why the formula is true. If possible, the reader should come away thinking that it’s so obvious that it’s almost uninteresting. I think the following formulation does a better job of that.

$P(A|B) P(B) = P(A \& B) = P(B \& A) = P(B|A) P(A)$

### Derivation

One way to compute the probability of two events $$A$$ and $$B$$ happening is as follows:

$P(A \& B) = P(A|B) P(B)$

To make this concrete, let’s say $$A$$ is “I go to the park tomorrow” and $$B$$ is “it rains tomorrow”. The probability that it rains tomorrow and I go to the park is [the probability that it rains] times [the probability that I go to the park given it rains]. Stated that way, it seems kind of obvious.

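From there it’s a small leap to notice that $$P(A\&B) = P(B \& A)$$, since “$$A$$ and $$B$$” describes the same event as “$$B$$ and $$A$$”. Applying the same rule with the roles of $$A$$ and $$B$$ swapped, we could also write the probability like so: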

$P(A \& B) = P(B|A) P(A)$
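If you’d rather see numbers than symbols, here is a minimal sketch that checks both factorizations by counting. The 20 days and how they split between park and rain are made up purely for illustration, and `prob` is a hypothetical helper, not something from a library.

```python
from fractions import Fraction

# 20 hypothetical days, each recorded as (went_to_park, rained).
# The counts are invented for illustration only.
days = (
    [(True, True)] * 1      # park and rain
    + [(True, False)] * 9   # park, no rain
    + [(False, True)] * 5   # no park, rain
    + [(False, False)] * 5  # no park, no rain
)


def prob(event, given=lambda day: True):
    """Fraction of days satisfying `event` among the days satisfying `given`."""
    pool = [day for day in days if given(day)]
    return Fraction(sum(1 for day in pool if event(day)), len(pool))


def park(day):
    return day[0]


def rain(day):
    return day[1]


def both(day):
    return day[0] and day[1]


p_both = prob(both)                             # P(A & B), counted directly
via_rain = prob(park, given=rain) * prob(rain)  # P(A|B) P(B)
via_park = prob(rain, given=park) * prob(park)  # P(B|A) P(A)

# Both routes land on the same number as counting the joint event directly.
assert p_both == via_rain == via_park
print(p_both, via_rain, via_park)
```

Using `Fraction` instead of floats keeps the comparison exact, so the equality holds without any rounding tolerance.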

And the very last step is to combine the two equations:

\begin{align} P(A|B) P(B) &= P(A \& B) = P(B|A) P(A) \\ P(A|B) P(B) &= P(B|A) P(A) \end{align}

Divide each side by $$P(B)$$ (which requires $$P(B) > 0$$, something conditioning on $$B$$ already assumes) and you get the standard formulation of Bayes theorem.
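As a rough sketch of that last division step, the function below turns $$P(B|A)$$, $$P(A)$$, and $$P(B)$$ into $$P(A|B)$$. The name `bayes` and the park/rain probabilities are assumptions chosen for illustration, not real data.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) P(A) / P(B). Requires P(B) > 0."""
    if p_b == 0:
        raise ValueError("P(B) must be nonzero to condition on B")
    return p_b_given_a * p_a / p_b


# Made-up numbers: A = "I go to the park", B = "it rains".
p_park = Fraction(1, 2)              # P(A)
p_rain = Fraction(3, 10)             # P(B)
p_rain_given_park = Fraction(1, 10)  # P(B|A)

p_park_given_rain = bayes(p_rain_given_park, p_park, p_rain)  # P(A|B)

# Sanity check: both sides of P(A|B) P(B) = P(B|A) P(A) agree.
assert p_park_given_rain * p_rain == p_rain_given_park * p_park
print(p_park_given_rain)
```

The final assertion is just the symmetric form of the theorem again: multiplying the result back by $$P(B)$$ recovers $$P(B|A) P(A)$$.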

I think this route to deriving Bayes theorem makes the connection to $$P(A \& B)$$ clearer. At least in my opinion, the fact that you can rewrite $$P(A\&B)$$ in two ways is a nice intuition for why Bayes theorem works.