Bayesian reasoning: How does my email inbox know I haven’t really won a million pounds?

Bayes’ Theorem is a maths tool that most of us encounter every day without realising. It underpins much of the modern world, from hospital diagnoses to modelling global warming, banking and, probably most relevant to you at this very moment, email spam filters.


The theorem shows that the probability of a hypothesis being true depends on two factors:


  1. Whether it is rational, given what we already know about it.

  2. Whether it remains rational in light of new evidence.


This is different to the traditional way of proving something which we would find in, say, a science lesson at school. At school, we use new pieces of evidence to confirm whether a hypothesis we have made is true. For example, if I wanted to prove that the earth is flat, I could place a ball on the ground and show that it doesn’t roll in any particular direction. One could conclude from this that the new evidence is consistent with my original hypothesis, and that therefore it is that much more likely that the earth is flat. 


Bayes’ theorem allows us to see that while the original statement that the earth is flat fits in with the new evidence that we have found, the original statement is nonsensical when considered with the weight of scientific research that has come before it. 


The overall likelihood of the earth being flat, therefore, is outweighed by copious amounts of research into physics and geography.


How do we use Bayes’ theorem?


Bayes’ theorem is described by this elegant equation:

P(H|E) = P(E|H) × P(H) / [ P(E|H) × P(H) + P(E|not H) × P(not H) ]

Let’s apply this to an example. Imagine that we have 100 coins, of which 99 are fair and 1 has a head on both sides. If we pick a coin at random, flip it three times and get heads each time, what is the probability that we picked the biased coin?



In this example: 

  • P(H|E) = Chance of picking biased coin (H) given we flipped three heads (E). This is what we want to know.

  • P(E|H) = Chance of flipping three heads (E) given that we picked the biased two-headed coin (H). In this case, the probability is 100%.

  • P(H) = Chance of picking biased coin. In this case, the probability is one out of a hundred - 1%.

  • P(not H) = Chance of picking a fair coin, ninety-nine out of a hundred - 99%.

  • P(E|not H) = Chance of flipping three heads (E) given that we picked a fair coin (not H). The probability of heads for a fair coin is 50%, therefore three heads in a row is 0.50 × 0.50 × 0.50 = 12.5%.




From these values, we can calculate P(H|E) = 0.01 / (0.01 + 0.99 × 0.125), which is approximately 7.5%. If we repeat the same scenario but flip the coin 6 times and get 6 heads, P(E|not H) falls to 0.50 multiplied by itself six times, about 1.6%, and the probability that we picked the biased coin rises to about 39.3%. The information from the 3 extra flips has significantly increased the likelihood that we picked the biased coin.
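For readers who like to check the arithmetic, here is a minimal Python sketch of the coin calculation. The function names `posterior` and `biased_coin_posterior` are made up for this illustration; the numbers are the ones from the example above.

```python
def posterior(prior, likelihood_h, likelihood_not_h):
    """Return P(H|E) from P(H), P(E|H) and P(E|not H) using Bayes' theorem."""
    evidence = likelihood_h * prior + likelihood_not_h * (1 - prior)
    return likelihood_h * prior / evidence


def biased_coin_posterior(n_heads, n_coins=100):
    """P(two-headed coin | n_heads heads in a row), one two-headed coin among n_coins."""
    prior = 1 / n_coins                # P(H): chance of picking the two-headed coin
    likelihood_h = 1.0                 # P(E|H): a two-headed coin always lands heads
    likelihood_not_h = 0.5 ** n_heads  # P(E|not H): a fair coin lands heads every time
    return posterior(prior, likelihood_h, likelihood_not_h)


for flips in (3, 6):
    print(f"{flips} heads in a row: {biased_coin_posterior(flips):.1%}")
# 3 heads in a row: 7.5%
# 6 heads in a row: 39.3%
```

Trying larger values of `n_heads` shows how each extra head makes the fair-coin explanation less plausible.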



Bayes’ Theorem and Spam Emails


An application of Bayes’ Theorem that we interact with every day is in email spam filtering. Take the following:

  • Event: The message is spam.

  • Test ‘words’: The message contains certain words (‘words’).

P(spam|words) = P(words|spam) × P(spam) / [ P(words|spam) × P(spam) + P(words|not spam) × P(not spam) ]

Bayesian filtering lets us estimate the probability that an email is actually spam, given that certain words appear in it. Unsurprisingly, words such as “casino” are more likely to appear in spam emails than in regular emails.



Spam filtering that simply blocks these words is too blunt and would catch too many legitimate emails. With Bayesian filtering, however, we can evaluate the words in an email and compute the probability that it is actually spam. If a message has a very high chance of being spam, it probably is. The word probabilities are specific to each user of the filter and can evolve over time, with further training whenever the filter classifies an email incorrectly.
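To make this concrete, here is a rough Python sketch of how such a filter might score a message. It assumes a "naive Bayes" approach, which treats each word as independent evidence; that is a common simplification, not necessarily what your email provider does. The word probabilities, the prior `P_SPAM` and the function name `spam_probability` are all invented for illustration.

```python
import math

# Hypothetical per-word probabilities, invented for illustration only:
# word -> (P(word | spam), P(word | not spam))
WORD_PROBS = {
    "casino": (0.20, 0.001),
    "winner": (0.15, 0.005),
    "meeting": (0.01, 0.10),
}
P_SPAM = 0.5  # assumed prior P(spam); a real filter would learn this from the user's mail


def spam_probability(message, word_probs=WORD_PROBS, prior=P_SPAM):
    """Estimate P(spam | words), treating each known word as independent evidence."""
    log_spam = math.log(prior)
    log_not_spam = math.log(1 - prior)
    # Splitting on whitespace is a simplification; punctuation is not stripped.
    for word in set(message.lower().split()):
        if word in word_probs:
            p_word_spam, p_word_not_spam = word_probs[word]
            log_spam += math.log(p_word_spam)
            log_not_spam += math.log(p_word_not_spam)
    # Bayes' theorem: normalise over the two hypotheses (spam, not spam).
    spam_score = math.exp(log_spam)
    not_spam_score = math.exp(log_not_spam)
    return spam_score / (spam_score + not_spam_score)


print(spam_probability("you are a casino winner"))  # very close to 1
print(spam_probability("agenda for the meeting"))   # below 0.1
```

A single suspicious word only shifts the probability; it is the combined weight of all the words that pushes a message over whatever threshold the filter uses.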



So the next time someone is having a cancer screening or riding in a self-driving car, spare a thought for Bayes’ Theorem. It works away in the background, considering alternative explanations and how probable each one is given the evidence. This means that one cannot just come up with a theory and assume it is definitely true, even if it agrees with the evidence. You have to examine alternative theories and check whether they explain the evidence better.
