How to Develop an Intuition for Probability With Worked Examples

Author: Jason Brownlee

Probability calculations are frustratingly unintuitive.

Our brains are too eager to take shortcuts and get the wrong answer, instead of thinking through a problem and calculating the probability correctly.

To make this issue obvious and aid in developing intuition, it can be useful to work through classical problems from applied probability. These problems, such as the birthday problem, boy or girl problem, and the Monty Hall problem trick us with the incorrect intuitive answer and require a careful application of the rules of marginal, conditional, and joint probability in order to arrive at the correct solution.

In this post, you will discover how to develop an intuition for probability by working through classical thought-provoking problems.

After reading this post, you will know:

How to solve the birthday problem by multiplying probabilities together.
How to solve the boy or girl problem using conditional probability.
How to solve the Monty Hall problem using joint probability.

Discover Bayesian optimization, naive Bayes, maximum likelihood, distributions, cross entropy, and much more in my new book, with 28 step-by-step tutorials and full Python source code.

Let’s get started.

Overview

This tutorial is divided into three parts; they are:

Birthday Problem
Boy or Girl Problem
Monty Hall Problem

Birthday Problem

A classic example of applied probability involves calculating the probability of two people having the same birthday.

It is a classic example because the result does not match our intuition. As such, it is sometimes called the birthday paradox.

The problem can be generally stated as:

Problem: How many people are required so that there is at least a 50% chance that two people in the group have the same birthday?

There are no tricks to this problem; it involves simply calculating the marginal probability.

It is assumed that the probability of a randomly selected person having a birthday on any given day of the year (excluding leap years) is uniformly distributed across the days of the year, e.g. 1/365 or about 0.274%.

Our intuition might leap to an answer and assume that we might need at least as many people as there are days in the year, e.g. 365. Our intuition likely fails because we are thinking about ourselves and other people matching our own birthday. That is, we are thinking about how many people are needed for another person born on the same day as you. That is a different question.

Instead, to calculate the solution, we can think about comparing pairs of people within a group and the probability of a given pair being born on the same day. This unlocks the calculation required.

The number of pairwise comparisons within a group (excluding comparing each person with themselves) is calculated as follows:

comparisons = n * (n – 1) / 2

For example, if we have a group of five people, we would be doing 10 pairwise comparisons among the group to check if they have the same birthday, which is more opportunity for a hit than we might expect. Importantly, the number of comparisons within the group increases quadratically with the size of the group, i.e. much faster than the group size itself.
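As a quick check of the formula, the sketch below counts the comparisons for a few group sizes. The function name pairwise_comparisons() is only illustrative and is not part of the original example; math.comb() from the standard library is used as a cross-check.

```python
# sketch: count pairwise comparisons for a few group sizes
from math import comb

def pairwise_comparisons(n):
    # n * (n - 1) / 2, the number of unordered pairs in a group of n
    return n * (n - 1) // 2

for n in [5, 10, 23, 30]:
    # math.comb(n, 2) counts the same pairs and serves as a cross-check
    assert pairwise_comparisons(n) == comb(n, 2)
    print('n=%d, comparisons=%d' % (n, pairwise_comparisons(n)))
```

A group of 23 people already involves 253 comparisons, which hints at why a match is more likely than we might expect.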

One more step is required. It is easier to calculate the complement of the problem. That is, the probability that no two people in a group have the same birthday. We can then subtract the final result from one to give the desired probability, for example:

p(2 in n people have the same birthday) = 1 – p(2 in n people do not have the same birthday)

We can see why calculating the probability of non-matching birthdays is easy with an example with a small group, in this case, three people.

People can be added to the group one-by-one. Each time a person is added to the group, the number of days in the year without a birthday decreases by one, e.g. 365 days, 364 days, etc.

Additionally, the probability of a non-match for a given additional person added to the group must be combined with the prior calculated probabilities before it. For example P(n=2) * P(n=3), etc.

This gives the following, calculating the probability of no matching birthdays with a group size of three:

P(n=3) = 365/365 * 364/365 * 363/365
P(n=3) = 99.18%

Taking the complement gives about a 0.820% probability of a matching birthday among a group of three people.

Stepping through this, the first person has a birthday, which reduces the number of candidate days for the rest of the group from 365 to 364 unused days (i.e. days without a birthday). For the second person, we calculate the probability of a non-conflicting birthday as 364 safe days out of 365 days in the year, or about a 99.72% (364/365) probability of not having the same birthday. We now subtract the second person’s birthday from the number of available days to give 363. The probability of the third person not having a matching birthday is then given as 363/365, multiplied by the prior probability to give about 99.18%.

This calculation can get tedious for large groups, therefore we might want to automate it.

The example below calculates the probabilities for group sizes from two to 30.

# example of the birthday problem
# define group size
n = 30
# number of days in the year
days = 365
# calculate probability for different group sizes
p = 1.0
for i in range(1, n):
	av = days - i
	p *= av / days
	print('n=%d, %d/%d, p=%.3f 1-p=%.3f' % (i+1, av, days, p*100, (1-p)*100))

Running the example first prints the group size, then the available days divided by the total days in the year, then the probability (as a percentage) of no matching birthdays in the group, followed by the complement, i.e. the probability of at least two people in the group sharing a birthday.

n=2, 364/365, p=99.726 1-p=0.274
n=3, 363/365, p=99.180 1-p=0.820
n=4, 362/365, p=98.364 1-p=1.636
n=5, 361/365, p=97.286 1-p=2.714
n=6, 360/365, p=95.954 1-p=4.046
n=7, 359/365, p=94.376 1-p=5.624
n=8, 358/365, p=92.566 1-p=7.434
n=9, 357/365, p=90.538 1-p=9.462
n=10, 356/365, p=88.305 1-p=11.695
n=11, 355/365, p=85.886 1-p=14.114
n=12, 354/365, p=83.298 1-p=16.702
n=13, 353/365, p=80.559 1-p=19.441
n=14, 352/365, p=77.690 1-p=22.310
n=15, 351/365, p=74.710 1-p=25.290
n=16, 350/365, p=71.640 1-p=28.360
n=17, 349/365, p=68.499 1-p=31.501
n=18, 348/365, p=65.309 1-p=34.691
n=19, 347/365, p=62.088 1-p=37.912
n=20, 346/365, p=58.856 1-p=41.144
n=21, 345/365, p=55.631 1-p=44.369
n=22, 344/365, p=52.430 1-p=47.570
n=23, 343/365, p=49.270 1-p=50.730
n=24, 342/365, p=46.166 1-p=53.834
n=25, 341/365, p=43.130 1-p=56.870
n=26, 340/365, p=40.176 1-p=59.824
n=27, 339/365, p=37.314 1-p=62.686
n=28, 338/365, p=34.554 1-p=65.446
n=29, 337/365, p=31.903 1-p=68.097
n=30, 336/365, p=29.368 1-p=70.632

The result is surprising, showing that only 23 people are required to give more than a 50% chance of two people having a birthday on the same day.

More surprising is that with 30 people, this increases to a 70% probability. It’s surprising because 20 to 30 people is about the average class size in school, a number of people for which we all have an intuition (if we attended school).

If the group size is increased to around 60 people, then the probability of two people in the group having the same birthday is above 99%!
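We can sanity-check these figures in two ways: by searching for the smallest group size that reaches a given probability, and by simulating random groups. The sketch below is one way to do this. The function names, the number of trials, and the random seed are assumptions chosen for illustration, and the simulation assumes birthdays are uniformly distributed over 365 days, as above.

```python
# sketch: exact threshold search and a Monte Carlo check of the birthday problem
from random import Random

def prob_shared_birthday(n, days=365):
    # exact probability that at least two of n people share a birthday
    p_no_match = 1.0
    for i in range(n):
        p_no_match *= (days - i) / days
    return 1.0 - p_no_match

def smallest_group(threshold, days=365):
    # smallest group size whose match probability reaches the threshold
    n = 1
    while prob_shared_birthday(n, days) < threshold:
        n += 1
    return n

def simulate_shared_birthday(n, trials=20000, days=365, seed=1):
    # fraction of random groups of size n that contain a repeated birthday
    rng = Random(seed)
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(n)]
        if len(set(birthdays)) < n:
            hits += 1
    return hits / trials

print('smallest n for 50%%: %d' % smallest_group(0.5))
print('exact p(n=23): %.3f' % prob_shared_birthday(23))
print('simulated p(n=23): %.3f' % simulate_shared_birthday(23))
```

The simulated estimate will vary a little with the seed and the number of trials, but it should land close to the exact value of about 0.507 for a group of 23.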

Boy or Girl Problem

Another classic example of applied probability is the case of calculating the probability of whether a baby is a boy or girl.

The probability of whether a given baby is a boy or a girl with no additional information is 50%. This may or may not be true in reality, but let’s assume it for the case of this useful illustration of probability.

As soon as more information is included, the probability calculation changes, and this trips up even people versed in math and probability.

A popular example is called the “two-child problem” that involves being given information about a family with two children and estimating the sex of one child. If the problem is not stated precisely, it can lead to misunderstanding, and in turn, two different ways of calculating the probability. This is the challenge of using natural language instead of notation, and in this case is referred to as the “boy or girl paradox.”

Let’s look at two precisely stated examples.

Case 1: A woman has two children and the older child is a boy. What is the probability of this woman having two sons?

Our intuition suggests that the probability that the other child is a boy is 0.5 or 50%. Alternately, our intuition might suggest the probability of a family with two boys is 1/4 (e.g. a probability of 0.25) for the four possible combinations of boys and girls for a two-child family.

We can explore this by enumerating all possible combinations that include the information given:

Younger Child | Older Child | Conditional Probability
Girl            Boy           1/2
Boy             Boy           1/2 (*)
Girl            Girl          0 (impossible)
Boy             Girl          0 (impossible)

There would be four outcomes, but the information given reduces the domain to 2 possible outcomes (older child is a boy).

Indeed, only one of the two outcomes can be boy-boy, therefore the probability is 1/2 or (0.5) 50%.

Let’s look at a second very similar case.

Case 2: A woman has two children and one of them is a boy. What is the probability of this woman having two sons?

Our intuition leaps to the same conclusion, e.g. 1/2 for the other child being a boy. At least mine did. Another leap might be 1/4 for the case of boy-boy out of all possible cases of having two children.

Both answers would be incorrect.

To find out why, again, let’s enumerate all possible combinations:

Younger Child | Older Child | Conditional Probability
Girl            Boy           1/3
Boy             Boy           1/3 (*)
Boy             Girl          1/3
Girl            Girl          0 (impossible)

There would be four outcomes, but the information given reduces the domain to three possible outcomes (one child is a boy). One of the three cases is boy-boy, therefore the probability is 1/3 or about 33.33%.

We have more information in Case 1, which allows us to narrow down the domain of possible outcomes and give a result that matches our intuition.

Case 2 looks very similar, but in fact, it includes less information. We have no idea as to whether the older or younger child is a boy, therefore the domain of possible outcomes is larger, resulting in a non-intuitive answer.
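If the difference between the two cases still feels wrong, one way to check it empirically is to simulate many two-child families and count. The sketch below assumes each child is independently a boy or a girl with probability 0.5, as stated above. The function name, number of trials, and seed are illustrative choices.

```python
# sketch: simulate two-child families to check both cases
from random import Random

def simulate_two_child(trials=100000, seed=1):
    rng = Random(seed)
    older_boy = older_boy_both = 0
    any_boy = any_boy_both = 0
    for _ in range(trials):
        # each child is independently a boy or a girl with probability 0.5
        younger = rng.choice(['boy', 'girl'])
        older = rng.choice(['boy', 'girl'])
        both = younger == 'boy' and older == 'boy'
        # case 1: condition on the older child being a boy
        if older == 'boy':
            older_boy += 1
            older_boy_both += both
        # case 2: condition on at least one child being a boy
        if younger == 'boy' or older == 'boy':
            any_boy += 1
            any_boy_both += both
    return older_boy_both / older_boy, any_boy_both / any_boy

case1, case2 = simulate_two_child()
print('case 1: %.3f' % case1)
print('case 2: %.3f' % case2)
```

The estimates should be close to 0.5 for Case 1 and close to 0.333 for Case 2, with small differences from run to run if the seed is changed.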

These are both problems in conditional probability and we can solve them using the conditional probability formula, rather than enumerating examples.

P(A | B) = P(A and B) / P(B)

The trick is in how the problem is stated.

The outcomes that we are interested in are a sequence, not a single birth event. We are interested in a boy-boy outcome given some information.

First, let’s state a table of all possible sequences regardless of what information is given, e.g. the unconditional probabilities:

Younger Child | Older Child | Unconditional Probability
Girl            Boy           1/4
Boy             Boy           1/4
Girl            Girl          1/4
Boy             Girl          1/4

We can calculate the conditional probabilities using the table of unconditional probabilities.

In case 1, we know that the older child, or second part of the outcome, is a boy, therefore we can state the problem as follows:

P(boy-boy | {boy-boy or girl-boy})

We can calculate the conditional probability as follows:

= P(boy-boy and {boy-boy or girl-boy}) / P({boy-boy or girl-boy})
= P(boy-boy) / P({boy-boy or girl-boy})
= (1/4) / (2/4)
= 0.25 / 0.5
= 0.5

In case 2, we know one child is a boy, but not whether it is the older or younger child; therefore, we can state the problem as follows:

P(boy-boy | {boy-boy or girl-boy or boy-girl})

We can calculate the conditional probability as follows:

= P(boy-boy and {boy-boy or girl-boy or boy-girl}) / P({boy-boy or girl-boy or boy-girl})
= (1/4) / (3/4)
= 0.25 / 0.75
= 0.333
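The same two calculations can be written directly from the table of unconditional probabilities. The sketch below uses exact fractions. The helper names prob() and conditional() and the event functions are assumptions made for illustration, with each outcome stored as a (younger, older) pair to match the tables above.

```python
# sketch: conditional probability from the table of unconditional probabilities
from fractions import Fraction

# unconditional probabilities for (younger, older) sequences
outcomes = {
    ('girl', 'boy'): Fraction(1, 4),
    ('boy', 'boy'): Fraction(1, 4),
    ('girl', 'girl'): Fraction(1, 4),
    ('boy', 'girl'): Fraction(1, 4),
}

def prob(event):
    # total probability of all outcomes for which the event holds
    return sum(p for outcome, p in outcomes.items() if event(outcome))

def conditional(a, b):
    # P(A | B) = P(A and B) / P(B)
    return prob(lambda o: a(o) and b(o)) / prob(b)

def two_boys(o):
    return o == ('boy', 'boy')

def older_is_boy(o):
    return o[1] == 'boy'

def at_least_one_boy(o):
    return 'boy' in o

print('case 1:', conditional(two_boys, older_is_boy))
print('case 2:', conditional(two_boys, at_least_one_boy))
```

Using Fraction keeps the results exact (1/2 and 1/3) rather than rounded decimals.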

This is a useful illustration of how we might overcome our incorrect intuitions and achieve the correct answer by first enumerating the possible cases, and second by calculating the conditional probability directly.

Monty Hall Problem

A final classical problem in applied probability is called the game show problem, or the Monty Hall problem.

It is based on a real game show called “Let’s Make a Deal” and named for the host of the show.

The problem can be described generally as follows:

Problem: The contestant is given a choice of three doors. Behind one is a car, behind the other two are goats. Once a door is chosen, the host, who knows where the car is, opens another door, which has a goat, and asks the contestant if they wish to keep their choice or change to the other unopened door. Should the contestant stay or switch?

It is another classical problem because the solution is not intuitive and in the past has caused great confusion and debate.

Intuition for the problem says that there is a 1 in 3 or 33% chance of picking the car initially, and this becomes 1/2 or 50% once the host opens a door to reveal a goat.

This is incorrect.

We can start by enumerating all combinations and listing the unconditional probabilities. Assume there are three doors and the contestant randomly selects one, e.g. door 1.

Door 1 | Door 2 | Door 3 | Unconditional Probability
Goat     Goat     Car      1/3
Goat     Car      Goat     1/3
Car      Goat     Goat     1/3

At this stage, there is a 1/3 probability of a car, matching our intuition so far.

Then, the host opens another door with a goat, in this case, door 2.

The opened door was not selected randomly; instead, it was selected with information about where the car is not.

Our intuition suggests we remove the second case from the table and update the probability to 1/2 for each remaining case.

This is incorrect and is the cause of the error.

We can summarize our intuitive conditional probabilities for this scenario as follows:

Door 1 | Door 2 | Door 3 | Uncon. | Cond.
Goat     Goat     Car      1/3      1/2
Goat     Car      Goat     1/3      0
Car      Goat     Goat     1/3      1/2

This would be correct if the contestant did not make a choice before the host opened a door, e.g. if the host opening a door was independent.

The trick comes because the contestant made a choice before the host opened a door and this is useful information. It means the host could not open the chosen door (door 1) or open a door with a car behind it. The host’s choice was dependent upon the first choice of the contestant and was therefore constrained.

Instead, we must calculate the probability of switching or not switching, regardless of which door the host opens.

Let’s look at a table of outcomes given the choice of door 1 and staying or switching.

Door 1 | Door 2 | Door 3 | Stay | Switch
Goat     Goat     Car      Goat   Car
Goat     Car      Goat     Goat   Car
Car      Goat     Goat     Car    Goat

We can see that 2 of the 3 cases result in winning the car if we switch (first two rows), and that 1 of the 3 cases gives the car if we stay (final row).

The contestant has a 2/3 or about 66.67% probability of winning the car if they switch.

They should always switch.

We have solved it by enumerating and counting.
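We can also check the counting argument by playing the game many times. The sketch below simulates the game under the rules described above: the host never opens the chosen door or the door hiding the car, and (as an added assumption) chooses at random when two goat doors are available. The function names, number of trials, and seed are illustrative.

```python
# sketch: simulate the Monty Hall game for the stay and switch strategies
from random import Random

def play(switch, rng):
    doors = [1, 2, 3]
    car = rng.choice(doors)
    choice = rng.choice(doors)
    # host opens a door that is neither the contestant's choice nor the car
    opened = rng.choice([d for d in doors if d != choice and d != car])
    if switch:
        # move to the one door that is neither chosen nor opened
        choice = next(d for d in doors if d != choice and d != opened)
    return choice == car

def win_rate(switch, trials=100000, seed=1):
    # fraction of simulated games won with the given strategy
    rng = Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials

print('stay: %.3f' % win_rate(switch=False))
print('switch: %.3f' % win_rate(switch=True))
```

The win rate should be close to 1/3 when staying and close to 2/3 when switching.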

Another approach to solving this problem is to calculate the joint probability of the host opening doors to test the stay-versus-switch decision under both cases, in order to maximize the probability of the desired outcome.

For example, given that the contestant has chosen door 1, we can calculate the joint probability of door 1 having the car and the host opening door 3. If door 1 has the car, the host can open either door 2 or door 3, each with a probability of 1/2:

P(door1=car and door3=open) = 1/3 * 1/2
= 0.333 * 0.5
= 0.167

We can then calculate the joint probability of door 2 having the car and the host opening door 3. This is different because if door 2 contains the car, the host can only open door 3; it has a probability of 1.0, a certainty.

P(door2=car and door3=open) = 1/3 * 1
= 0.333 * 1.0
= 0.333

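Having chosen door 1 and the host opening door 3, the joint probability is higher for the car being behind door 2 (about 33%) than door 1 (about 17%). Dividing each by their sum, P(door3=open) = 1/2, gives conditional probabilities of 2/3 for door 2 and 1/3 for door 1.

In this case, we should switch to door 2.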

Alternately, we can model the choice of the host opening door 2, which has the same structure of probabilities:

P(door1=car and door2=open) = 0.167
P(door3=car and door2=open) = 0.333

Again, having chosen door 1 and the host opening door 2, the joint probability is higher for the car being behind door 3 (about 33%) than door 1 (about 17%). We should switch.

If we are seeking to maximize these probabilities, then the best strategy is to switch.
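The joint probabilities can also be computed exactly and then normalized to give the conditional probability of the car being behind each door. The sketch below does this for the case where the contestant has chosen door 1 and the host opens door 3. The variable names are illustrative, and it assumes the host picks between doors 2 and 3 with equal probability when door 1 hides the car.

```python
# sketch: joint and conditional probabilities when door 1 is chosen and door 3 is opened
from fractions import Fraction

# prior probability of the car being behind each door
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
# probability the host opens door 3 given where the car is
# (the host never opens the chosen door 1 or the door hiding the car)
p_open3_given_car = {1: Fraction(1, 2), 2: Fraction(1, 1), 3: Fraction(0, 1)}

# joint probability P(car=door and door3=open)
joint = {door: prior[door] * p_open3_given_car[door] for door in prior}
# marginal probability that the host opens door 3
p_open3 = sum(joint.values())
# conditional probability P(car=door | door3=open)
posterior = {door: joint[door] / p_open3 for door in joint}

for door in sorted(posterior):
    print('door %d: joint=%s, conditional=%s' % (door, joint[door], posterior[door]))
```

The joint values of 1/6 and 1/3 match the calculation above, and after normalizing, door 2 holds the car with probability 2/3 compared to 1/3 for door 1.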

Again, in this example, we have seen how we can overcome our faulty intuitions and solve the problem both by enumerating the cases and by using joint probability.

Summary

In this post, you discovered how to develop an intuition for probability by working through classical thought-provoking problems.

Specifically, you learned:

How to solve the birthday problem by multiplying probabilities together.
How to solve the boy or girl problem using conditional probability.
How to solve the Monty Hall problem using joint probability.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

The post How to Develop an Intuition for Probability With Worked Examples appeared first on Machine Learning Mastery.