
PR-2: Conditional Probability

Module 2 · Probability Foundations

Section 1: Introduction

A rapid COVID test shows a positive result. The test has a 95% accuracy rate. Does that mean there’s a 95% chance you actually have COVID?

This is a conditional probability question — and the answer might surprise you. When COVID was rare in the population (say, 1% prevalence), a positive result on a 95%-accurate test meant the chance of actually having the disease was far lower than 95%. The mathematics behind this apparent paradox is the subject of today’s lesson.

In PR-1, you learned to compute $P(A \cap B)$, the probability that both events occur. That joint probability is a building block. In PR-2, we use it as the starting point for a more nuanced question: given that we already know one thing has happened, how does that change the probability of something else?

After this lesson, you will be able to:

  • Compute conditional probabilities using the formula and directly from two-way frequency tables.
  • Apply the General Multiplication Rule to find joint probabilities.
  • Construct and use tree diagrams to compute probabilities in multi-stage experiments.
  • Test whether two events are independent using both equivalent definitions.

In PR-1, you also saw a preview: mutually exclusive events and independent events are not the same thing. By the end of this lesson, you will have the tools to state precisely why — and to prove it algebraically.

Section 2: Prerequisites

What you need from PR-1

  • Reading joint and marginal frequencies from two-way tables (PR-1, Core Concepts C7–C9): Today’s conditional probability formula has a numerator and a denominator. Both come from table cells. You need to identify the joint count (intersection cell) and the marginal total (row or column total) quickly and reliably.
  • Computing $P(A \cap B)$ (PR-1, Core Concepts C8): The joint probability is the numerator of the conditional probability formula. If you can read it from a table, you already have half the formula.
  • Identifying mutually exclusive events (PR-1, Core Concepts C10): PR-1 ended with a preview: mutually exclusive events and independent events are distinct concepts. Today we resolve that preview formally — you need to know what “mutually exclusive” means to appreciate why it implies dependence.

Retrieval Checkpoint

A school surveyed 80 students about study habits and exam results. The two-way table below shows the data.

| | Passed exam | Did not pass | Total |
|---|---|---|---|
| Studies in a group | 36 | 9 | 45 |
| Studies alone | 24 | 11 | 35 |
| Total | 60 | 20 | 80 |

What is $P(\text{studies in a group} \cap \text{passed exam})$?

Also confirm you can recall:

Success Factor:

From endpoint to numerator. In PR-1, $P(A \cap B)$ was an endpoint: the answer to “what is the probability of both?” In PR-2, it becomes the numerator of a more nuanced calculation. We also resolve the mystery from PR-1 C10: what does “independent” mean precisely, and how is it different from “mutually exclusive”?

Section 3: Core Concepts

Navigation Guide — 8 Concepts

  • C1–C2: The core idea — restricting the sample space — and the formula that captures it. Use these every time you see “given.”
  • C3–C4: Two tools for computing joint probabilities: rearranging the formula (Multiplication Rule) and drawing tree diagrams.
  • C5–C6: What independence means precisely, and how to test it — two equivalent ways.
  • C7: Resolves the PR-1 preview: why mutually exclusive events are never independent.
  • C8: The most dangerous mistake in applied probability — reversing the conditioning direction.

C1 — Conditional Probability: The Concept

Imagine a bag with 20 marbles: 8 red, 6 blue, 6 green. If you draw a marble at random, $P(\text{red}) = 8/20 = 0.40$.

Now suppose someone peeks and tells you: “The marble is not green.” That information restricts the sample space. You no longer consider all 20 marbles, only the 14 that are not green. Of those 14, how many are red? Still 8. So the probability of red, given not green, is $P(\text{red} \mid \text{not green}) = 8/14 \approx 0.57$.

This is the idea of conditional probability. Conditioning on an event $B$ means:

  1. Throw away all outcomes not in $B$; they are impossible given what we know.
  2. Renormalize: the remaining outcomes now form the new universe, and their probabilities must sum to 1.

We write $P(A \mid B)$, read “the probability of $A$ given $B$,” for the probability of $A$ once the sample space has been restricted to $B$.

When conditioning on $B$, the denominator is the number of outcomes in $B$, not the number of outcomes in the full sample space. Using the full sample space as denominator gives $P(A \cap B)$, not $P(A \mid B)$. Every time you see the word “given,” ask: what is the restricting event? That is always the denominator.
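The restrict-then-renormalize idea can be sketched in a few lines of Python. This is only an illustration of the marble example; the helper names `probability` and `conditional_probability` are made up for this sketch and are not part of the lesson.

```python
from fractions import Fraction

# The bag from the example: 8 red, 6 blue, 6 green marbles (20 total).
bag = ["red"] * 8 + ["blue"] * 6 + ["green"] * 6

def probability(event, outcomes):
    """Fraction of equally likely outcomes for which event(outcome) is True."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def conditional_probability(event, given, outcomes):
    """Step 1: keep only outcomes where `given` holds. Step 2: renormalize."""
    restricted = [o for o in outcomes if given(o)]
    return probability(event, restricted)

def is_red(marble):
    return marble == "red"

def not_green(marble):
    return marble != "green"

p_red = probability(is_red, bag)                                  # 8/20
p_red_given_not_green = conditional_probability(is_red, not_green, bag)  # 8/14

print(p_red, float(p_red))
print(p_red_given_not_green, float(p_red_given_not_green))
```

Note that the conditional version never divides by 20: once the green marbles are discarded, the denominator is the size of the restricted list.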


C2 — Conditional Probability: The Formula

Conditional Probability Formula

For any events $A$ and $B$ with $P(B) > 0$:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

The numerator is the joint probability of both events. The denominator is the probability of the conditioning event (the event we know has occurred).

Reading conditional probabilities directly from a two-way table:

When data is presented as counts in a table, you don’t need to compute $P(A \cap B)$ and $P(B)$ separately. Work directly with counts:

$$P(A \mid B) = \frac{\text{count}(A \cap B)}{\text{count}(B)}$$

The numerator is the cell where row $A$ and column $B$ (or row $B$ and column $A$) intersect. The denominator is the row or column total for $B$, not the grand total.

Mini-example — two-way table method:

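In the retrieval checkpoint table above (80 students): what is $P(\text{passed} \mid \text{studies in a group})$?

Numerator: 36 students who both study in a group and passed. Denominator: 45 students who study in a group (row total for “studies in a group”).

$$P(\text{passed} \mid \text{studies in a group}) = \frac{36}{45} = 0.80$$

Note: this is NOT the same as $P(\text{studies in a group} \mid \text{passed}) = 36/60 = 0.60$. Direction matters.

$P(A \mid B)$ and $P(B \mid A)$ are computed with the same numerator ($P(A \cap B)$) but different denominators ($P(B)$ vs. $P(A)$). They are equal only in special cases. Never swap the conditioning direction; it changes the meaning of the calculation entirely.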
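As a quick check of the table method, the sketch below stores the retrieval-checkpoint counts in a dictionary and computes both conditioning directions plus the joint probability. The dictionary layout and helper names are assumptions chosen for illustration.

```python
# Counts from the retrieval checkpoint table (80 students).
# Keys are (study habit, exam result); this layout is just one convenient choice.
table = {
    ("group", "passed"): 36,
    ("group", "not_passed"): 9,
    ("alone", "passed"): 24,
    ("alone", "not_passed"): 11,
}

def row_total(counts, row):
    """Marginal total for one study habit."""
    return sum(n for (r, _), n in counts.items() if r == row)

def col_total(counts, col):
    """Marginal total for one exam result."""
    return sum(n for (_, c), n in counts.items() if c == col)

joint_count = table[("group", "passed")]   # shared numerator: 36
grand_total = sum(table.values())          # 80

p_passed_given_group = joint_count / row_total(table, "group")   # 36 / 45
p_group_given_passed = joint_count / col_total(table, "passed")  # 36 / 60
p_group_and_passed = joint_count / grand_total                   # 36 / 80

print(f"P(passed | group) = {p_passed_given_group:.2f}")
print(f"P(group | passed) = {p_group_given_passed:.2f}")
print(f"P(group and passed) = {p_group_and_passed:.2f}")
```

All three results use the same numerator; only the denominator (row total, column total, or grand total) changes.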

See the denominator switch in action:

Click “Condition on B” to collapse the non-B row. Watch the denominator move from the grand total to B’s row total; that shift is the entire difference between $P(A \cap B)$ and $P(A \mid B)$.

| | A (Passed exam) | A′ (Did not pass) | Row total |
|---|---|---|---|
| B (Studies in group) | 36 | 9 | 45 |
| B′ (Studies alone) | 24 | 11 | 35 |
| Column total | 60 | 20 | 80 |

Before conditioning: count(A ∩ B) = 36 and denominator = 80, so 36/80 = 0.45. If you divide by n = 80 you get P(A∩B) = 0.45, not P(A|B). Condition on B to restrict the universe.

After conditioning: the non-B row collapses and B becomes the new universe.

Explore the formula interactively:

A 10 by 10 grid of dots representing a sample space of 100 outcomes. Dots are colored by region: A only, B only, A intersect B, or neither.
P(A) = 0.30 P(B) = 0.25 P(A∩B) = 0.10
P(A | B) = P(A∩B) / P(B) = 0.10 / 0.25 = 0.40

Drag the sliders to adjust the sizes of A, B, and their overlap. Then click “Condition on B” to restrict the sample space to B. Notice how $P(A \mid B)$ changes when B is large vs. small, even for the same overlap. The independence indicator lights up green when conditioning on B tells you nothing new about A.


C3 — General Multiplication Rule

Rearranging the conditional probability formula gives a powerful tool for computing joint probabilities:

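General Multiplication Rule

$$P(A \cap B) = P(B) \cdot P(A \mid B) = P(A) \cdot P(B \mid A)$$

This works for any two events, dependent or independent. Use it whenever you know a conditional probability and need the corresponding joint probability.

Derivation: Start from $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$. Multiply both sides by $P(B)$:

$$P(A \cap B) = P(B) \cdot P(A \mid B)$$

The same rearrangement starting from $P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$ gives the symmetric form $P(A \cap B) = P(A) \cdot P(B \mid A)$.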

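Mini-example: A bag contains 5 red and 3 blue marbles. Two marbles are drawn without replacement. Find $P(\text{1st red} \cap \text{2nd blue})$.

$P(\text{1st red}) = 5/8$. Given the first is red, $P(\text{2nd blue} \mid \text{1st red}) = 3/7$ (only 7 remain, 3 blue).

$$P(\text{1st red} \cap \text{2nd blue}) = \frac{5}{8} \cdot \frac{3}{7} = \frac{15}{56} \approx 0.268$$

The simplified multiplication rule $P(A \cap B) = P(A) \cdot P(B)$ applies only to independent events. Do not use it unless you have verified independence first. For dependent events (like drawing without replacement), the General Multiplication Rule with the correct conditional probability is required.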
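One way to convince yourself that the General Multiplication Rule gives the right answer for the marble draw is to enumerate every ordered pair of draws and count. The sketch below does that and also shows the value the simplified rule would give if the draws were (wrongly) treated as independent. Variable names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

# 5 red and 3 blue marbles, two drawn without replacement.
marbles = ["R"] * 5 + ["B"] * 3

# General Multiplication Rule:
# P(1st red and 2nd blue) = P(1st red) * P(2nd blue | 1st red)
p_first_red = Fraction(5, 8)
p_second_blue_given_first_red = Fraction(3, 7)
p_rule = p_first_red * p_second_blue_given_first_red

# Brute-force check: every ordered pair of distinct marbles is equally likely.
pairs = list(permutations(range(len(marbles)), 2))
hits = sum(1 for i, j in pairs if marbles[i] == "R" and marbles[j] == "B")
p_enumerated = Fraction(hits, len(pairs))

# What the simplified rule gives if independence is assumed without checking.
p_if_independent = Fraction(5, 8) * Fraction(3, 8)

print("general rule:      ", p_rule)
print("enumeration:       ", p_enumerated)
print("simplified (wrong):", p_if_independent)
```

The rule and the enumeration agree, while the simplified rule gives a different value because the second draw depends on the first.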


C4 — Tree Diagrams for Multi-Stage Experiments

A tree diagram is a visual tool for applying the Multiplication Rule across sequential stages of an experiment.

Labeling convention (use this in every problem):

Three rules for tree diagrams:

  1. Probabilities on branches from any single node must sum to 1.
  2. The joint probability of a complete path = product of its branch probabilities.
  3. The total probability of an event across multiple paths = sum of the joint probabilities of all paths that lead to it.

[Figure: tree diagram. The root node splits into Line 1 (P = 0.60) and Line 2 (P = 0.40). From Line 1: Defective (P(D|L1) = 0.02, joint 0.60 × 0.02 = 0.012) and Good (P(G|L1) = 0.98, joint 0.60 × 0.98 = 0.588). From Line 2: Defective (P(D|L2) = 0.05, joint 0.40 × 0.05 = 0.020) and Good (P(G|L2) = 0.95, joint 0.40 × 0.95 = 0.380).]

Stage 1 branches carry unconditional probabilities. Stage 2 branches carry conditional probabilities. Joint probabilities appear at the end of each path.

Try it yourself — complete the tree:

Enter your own branch probabilities, or click “Load Example” to pre-fill the factory scenario above. The path products compute automatically; each pair of branches from one node must sum to 1.


Full example — factory with two assembly lines (tabular form):

A factory has two assembly lines. Line 1 produces 60% of output with a 2% defect rate. Line 2 produces 40% with a 5% defect rate.

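| Stage 1 (Line) | P(Line) | Stage 2 (Defect) | $P(\text{Defect} \mid \text{Line})$ | Joint Probability |
|---|---|---|---|---|
| Line 1 | 0.60 | Defective | 0.02 | 0.60 × 0.02 = 0.012 |
| Line 1 | 0.60 | Good | 0.98 | 0.60 × 0.98 = 0.588 |
| Line 2 | 0.40 | Defective | 0.05 | 0.40 × 0.05 = 0.020 |
| Line 2 | 0.40 | Good | 0.95 | 0.40 × 0.95 = 0.380 |

$P(\text{Defective}) = 0.012 + 0.020 = 0.032$ (sum of defective paths)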

Check: all four joint probabilities sum to 1.000 ✓
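The three tree rules translate directly into code. The following sketch represents the factory tree as nested dictionaries (a representation assumed here for illustration), checks rule 1 at every node, multiplies along each path for rule 2, and sums the defective paths for rule 3.

```python
# Stage 1: unconditional probabilities of each assembly line.
p_line = {"Line 1": 0.60, "Line 2": 0.40}

# Stage 2: conditional probabilities P(outcome | line).
p_outcome_given_line = {
    "Line 1": {"Defective": 0.02, "Good": 0.98},
    "Line 2": {"Defective": 0.05, "Good": 0.95},
}

# Rule 1: branches leaving any single node must sum to 1.
assert abs(sum(p_line.values()) - 1.0) < 1e-9
for branches in p_outcome_given_line.values():
    assert abs(sum(branches.values()) - 1.0) < 1e-9

# Rule 2: joint probability of a path = product of its branch probabilities.
joint = {
    (line, outcome): p_line[line] * p
    for line, branches in p_outcome_given_line.items()
    for outcome, p in branches.items()
}

# Rule 3: total probability of an event = sum over all paths that reach it.
p_defective = sum(p for (_, outcome), p in joint.items() if outcome == "Defective")

for path, p in joint.items():
    print(path, round(p, 3))
print("P(Defective) =", round(p_defective, 3))
print("sum of all paths =", round(sum(joint.values()), 3))
```

Floating-point arithmetic is used here, so comparisons are made with a small tolerance rather than exact equality.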


C5 — Independent Events: Definition

Independent Events

Events $A$ and $B$ are independent if:

$$P(A \mid B) = P(A)$$

Knowing that $B$ occurred does not change the probability of $A$. Equivalently, $P(B \mid A) = P(B)$. Independence is a symmetric property: if $A$ is independent of $B$, then $B$ is independent of $A$.

Intuition: The marble-bag example in C1 showed that conditioning changed the probability (0.40 → 0.57). That means the events were dependent. For independent events, the conditional probability equals the unconditional probability — conditioning gives no new information.

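Mini-example: Suppose you know $P(\text{left-handed})$ and $P(\text{prefers coffee})$. If these events are independent, then $P(\text{prefers coffee} \mid \text{left-handed}) = P(\text{prefers coffee})$: knowing someone is left-handed tells us nothing about their coffee preference.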


C6 — Independence Test and the Simplified Multiplication Rule

Independence Test and Simplified Rule

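To test independence: compute $P(A) \cdot P(B)$ and compare to $P(A \cap B)$.

  • If $P(A \cap B) = P(A) \cdot P(B)$: events are independent
  • If $P(A \cap B) \neq P(A) \cdot P(B)$: events are dependent

Simplified Multiplication Rule (for independent events only):

$$P(A \cap B) = P(A) \cdot P(B)$$

Equivalent check: Compare $P(A \mid B)$ to $P(A)$. Both methods give the same conclusion; use whichever is easier with the given information.

Mini-example: In a group of 200 people: 80 own a car (P = 0.40), 70 have a gym membership (P = 0.35), and 28 have both (P = 0.14).

Test: $P(\text{car}) \cdot P(\text{gym}) = 0.40 \times 0.35 = 0.14 = P(\text{car} \cap \text{gym})$.

Conclusion: the events are independent.

Confirm: $P(\text{car} \mid \text{gym}) = 28/70 = 0.40 = P(\text{car})$ ✓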
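Both versions of the test can be run side by side on the car and gym counts. The function below is a hypothetical helper written for this sketch; it uses exact fractions so that equality can be checked without rounding issues.

```python
from fractions import Fraction

def independence_report(n_a, n_b, n_both, n_total):
    """Run the multiplication test and the conditional test on raw counts."""
    p_a = Fraction(n_a, n_total)
    p_b = Fraction(n_b, n_total)
    p_both = Fraction(n_both, n_total)
    p_a_given_b = Fraction(n_both, n_b)  # restrict to B, then count A
    return {
        "product": p_a * p_b,            # expected joint under independence
        "joint": p_both,                 # observed joint
        "p_a": p_a,
        "p_a_given_b": p_a_given_b,
        "multiplication_test": p_both == p_a * p_b,
        "conditional_test": p_a_given_b == p_a,
    }

# A = owns a car (80), B = has a gym membership (70), both = 28, total = 200.
report = independence_report(n_a=80, n_b=70, n_both=28, n_total=200)
for name, value in report.items():
    print(name, value)
```

The two tests always agree, since one is an algebraic rearrangement of the other. With real survey data, exact equality is rare, so a sample that is only approximately equal calls for judgment rather than a strict equality check.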

See the test visually: Drag P(A∩B) until the actual bar matches the expected bar — that is independence. Pull it higher or lower to see positive/negative dependence.

Bar chart comparing expected P(A)·P(B) to actual P(A∩B).
Expected under independence: P(A)·P(B) = 0.180
P(A∩B) = 0.18 = P(A)·P(B) = 0.180 → Events are INDEPENDENT

C7 — Independence vs. Mutual Exclusivity

In PR-1, we noted that mutually exclusive events and independent events are not the same thing. Now we have the tools to be precise about why.

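| Property | Mutually Exclusive (ME) | Independent |
|---|---|---|
| Definition | $P(A \cap B) = 0$: cannot co-occur | $P(A \mid B) = P(A)$: $B$ gives no info about $A$ |
| Joint probability | Always 0 | $P(A) \cdot P(B)$ (when independent) |
| Conditional probability | $P(A \mid B) = 0$ (if $P(B) > 0$) | $P(A \mid B) = P(A)$ |
| Relationship | ME events with $P(A) > 0$, $P(B) > 0$ are always dependent | Independent events with $P(A) > 0$, $P(B) > 0$ can never be ME |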

Proof that ME implies dependence: If $A$ and $B$ are mutually exclusive with $P(A) > 0$ and $P(B) > 0$:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{0}{P(B)} = 0$$

Since $P(A \mid B) = 0 \neq P(A)$, the events are dependent. Knowing $B$ occurred makes $A$ impossible, which is the most extreme form of dependence.

“These events can’t happen at the same time, so they must be independent.” This is backwards. Mutually exclusive events (with positive probability) are the most extreme form of dependence: knowing one occurred makes the other impossible. Independence requires that $B$ gives no information about $A$, which is the opposite of ME.

Both panels respond to “Condition on B” simultaneously. See what each type of relationship looks like when the sample space is restricted to $B$:

Two Venn diagrams side by side. Left diagram: mutually exclusive events A and B — circles do not overlap. Right diagram: independent events A and B — circles overlap proportionally to P(A)·P(B).
Expected P(A∩B) if independent = P(A)·P(B) = 0.140
Mutually Exclusive → always dependent (knowing B ⇒ A is impossible; P(A|B) = 0 ≠ P(A))
Independent → never mutually exclusive (when P(A),P(B) > 0) (knowing B tells you nothing about A; P(A|B) = P(A))

C8 — Asymmetry of Conditioning

In general, $P(A \mid B) \neq P(B \mid A)$. The direction of conditioning carries meaning.

Medical testing example:

  • $P(\text{positive} \mid \text{disease})$ is the test’s sensitivity: how often the test flags someone who has the disease.
  • $P(\text{disease} \mid \text{positive})$ is what a patient who has just tested positive wants to know.

If the disease affects only 1% of the population, a person with a positive test may still have only a ~16% probability of actually having the disease (computed via tree diagram, assuming 95% sensitivity and a 5% false-positive rate; see Boss Fight Path A). The 95% sensitivity is real, but it answers a different question than the one a patient cares about.

The Prosecutor’s Fallacy: Reversing the conditioning direction, that is, treating $P(A \mid B)$ as if it were $P(B \mid A)$, is called the Prosecutor’s Fallacy. It has led to wrongful convictions when expert witnesses stated that “the probability of this DNA match given innocence is 0.0001” and the jury interpreted it as “the probability of innocence given this match is 0.0001.” These are very different quantities.

“The test is 90% accurate, so if I test positive, I have a 90% chance of having the condition.” This confuses $P(\text{positive} \mid \text{condition})$ with $P(\text{condition} \mid \text{positive})$. The actual probability of disease given a positive test depends on prevalence and cannot be read directly from the accuracy rate. For rare conditions, the two values can differ dramatically.

See the numbers come alive: The grid below shows 400 people. Orange dots have the disease; blue dots are false alarms. Drag the prevalence slider down toward 1% — watch the orange (true positive) dots become vastly outnumbered by blue (false positive) ones, even as test accuracy stays fixed. Then click “Condition on Positive Test” to see only who tests positive.

A 20 by 20 grid of person icons color-coded by test outcome and disease status.
True Positive — disease & test+
False Negative — disease & test−
False Positive — no disease & test+
True Negative — no disease & test−
P(disease) = 2.0%
P(+ | disease) = 90%
P(+ | no disease) = 5.0%
P(+) — Law of Total Prob = 6.70%
Of 400 people: 7 TP + 20 FP = 27 positive tests
P(disease | positive) = 0.018 / 0.067 ≈ 26.9%
P(positive | disease) = 90% but P(disease | positive) ≈ 26.9%, a difference of about 63 pp
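The dependence on prevalence can be explored numerically. The sketch below applies the two-path tree calculation (true positives divided by all positives) for several prevalence values; the function name and the chosen prevalence list are assumptions made for illustration.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive) from the two tree paths that end in a positive test."""
    true_positive = prevalence * sensitivity                  # disease and test+
    false_positive = (1 - prevalence) * false_positive_rate   # no disease and test+
    p_positive = true_positive + false_positive               # total P(+)
    return true_positive / p_positive

# Same test (90% sensitivity, 5% false-positive rate), different populations.
for prevalence in (0.01, 0.02, 0.10, 0.50):
    p = posterior_given_positive(prevalence, sensitivity=0.90, false_positive_rate=0.05)
    print(f"prevalence {prevalence:.0%}: P(disease | positive) = {p:.1%}")
```

The sensitivity is fixed in every row of the output; only the base rate changes, and that alone moves $P(\text{disease} \mid \text{positive})$ across a wide range.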

Section 4: Worked Examples

Example 1 — Fully Worked: Conditional Probability from a Two-Way Table

A company surveyed 200 employees on remote work preference and tenure length. The results:

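| | Prefers remote | Prefers in-office | Total |
|---|---|---|---|
| Tenure > 5 years | 54 | 36 | 90 |
| Tenure ≤ 5 years | 66 | 44 | 110 |
| Total | 120 | 80 | 200 |

Find $P(\text{remote} \mid \text{tenure} > 5)$ and $P(\text{tenure} > 5 \mid \text{remote})$.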

I notice the first question asks “given tenure > 5 yr” — that tells me the conditioning event. The denominator will be the row total for long-tenure employees.

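Part 1: $P(\text{remote} \mid \text{tenure} > 5)$

  • Numerator: cell count for (remote AND tenure > 5) = 54
  • Denominator: row total for tenure > 5 years = 90

$$P(\text{remote} \mid \text{tenure} > 5) = \frac{54}{90} = 0.60$$

Part 2: $P(\text{tenure} > 5 \mid \text{remote})$

  • Numerator: same cell, 54 (intersection is symmetric)
  • Denominator: column total for remote = 120

$$P(\text{tenure} > 5 \mid \text{remote}) = \frac{54}{120} = 0.45$$

I pause to check: are these the same? $0.60 \neq 0.45$. They share the same numerator but different denominators. The direction of conditioning changed the denominator, and therefore the answer. This is C8 in action.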


Example 2 — Prediction Checkpoint: Independence Test

A survey of 300 people found that 180 have a gym membership, 120 report eating healthily most days, and 72 have both.

Before computing: Based on context, do you think gym membership and healthy eating are independent or dependent? Write your prediction and reasoning, then check below.

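Test using multiplication:

$$P(\text{gym}) \cdot P(\text{healthy}) = \frac{180}{300} \cdot \frac{120}{300} = 0.60 \times 0.40 = 0.24, \qquad P(\text{gym} \cap \text{healthy}) = \frac{72}{300} = 0.24$$

Since $P(\text{gym} \cap \text{healthy}) = P(\text{gym}) \cdot P(\text{healthy})$, the events are independent in this sample.

Confirm using conditional probability:

$$P(\text{healthy} \mid \text{gym}) = \frac{72}{180} = 0.40 = P(\text{healthy})$$

The independence is confirmed by both tests. Note that this result does not mean gym membership and diet are causally unrelated; independence in a sample is a statistical finding, not a causal claim.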


Example 3 — Details/Summary: Tree Diagram

A factory has two assembly lines. Line 1 produces 60% of parts with a 2% defect rate; Line 2 produces 40% with a 5% defect rate.

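(a) Find $P(\text{Defective})$. (b) Given a part is defective, find $P(\text{Line 1} \mid \text{Defective})$.

Show Solution

Tree diagram paths and joint probabilities:

| Path | Joint Probability |
|---|---|
| Line 1 → Defective | 0.60 × 0.02 = 0.012 |
| Line 1 → Good | 0.60 × 0.98 = 0.588 |
| Line 2 → Defective | 0.40 × 0.05 = 0.020 |
| Line 2 → Good | 0.40 × 0.95 = 0.380 |

(a) $P(\text{Defective}) = 0.012 + 0.020 = 0.032$

(b) $P(\text{Line 1} \mid \text{Defective}) = \dfrac{P(\text{Line 1} \cap \text{Defective})}{P(\text{Defective})} = \dfrac{0.012}{0.032} = 0.375$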

Even though Line 1 produces 60% of all parts, it accounts for only 37.5% of defective parts — because its defect rate (2%) is much lower than Line 2’s (5%).
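Part (b) reverses the direction of the tree: the branches give $P(\text{Defective} \mid \text{Line})$, but the question asks for $P(\text{Line} \mid \text{Defective})$. A small sketch of that reversal, using dictionary names chosen only for this illustration:

```python
# Given by the problem: line shares and per-line defect rates.
p_line = {"Line 1": 0.60, "Line 2": 0.40}
p_defect_given_line = {"Line 1": 0.02, "Line 2": 0.05}

# Joint probability of each defective path: P(Line and Defective).
p_line_and_defect = {line: p_line[line] * p_defect_given_line[line] for line in p_line}

# Total probability of a defective part (sum of the defective paths).
p_defective = sum(p_line_and_defect.values())

# Reverse the conditioning: P(Line | Defective) = P(Line and Defective) / P(Defective).
p_line_given_defect = {line: p / p_defective for line, p in p_line_and_defect.items()}

for line in p_line:
    print(f"P({line}) = {p_line[line]:.3f}   P({line} | Defective) = {p_line_given_defect[line]:.3f}")
```

The reversed probabilities must themselves sum to 1, because once a part is known to be defective it came from exactly one of the two lines.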


Example 4 — Find the Error

A student reads: “Among people who exercised regularly, 80% also reported good sleep.” The student writes:

Student’s work:

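“The problem states that 80% of regular exercisers have good sleep, so:

$$P(\text{exercises} \mid \text{good sleep}) = 0.80$$

I can now use this to find the probability that a person who has good sleep also exercises.”

Identify and correct the error:

The statement “80% of exercisers reported good sleep” translates to $P(\text{good sleep} \mid \text{exercises}) = 0.80$. The student has reversed the conditioning direction, writing $P(\text{exercises} \mid \text{good sleep}) = 0.80$ instead.

These are different quantities:

$$P(\text{good sleep} \mid \text{exercises}) = \frac{P(\text{exercises} \cap \text{good sleep})}{P(\text{exercises})} \qquad P(\text{exercises} \mid \text{good sleep}) = \frac{P(\text{exercises} \cap \text{good sleep})}{P(\text{good sleep})}$$

The error is an instance of the Prosecutor’s Fallacy (C8): treating $P(\text{good sleep} \mid \text{exercises})$ as if it were $P(\text{exercises} \mid \text{good sleep})$. The 0.80 cannot be assigned to the reversed conditional without knowing $P(\text{exercises})$ and $P(\text{good sleep})$.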

Section 5: Guided Practice

Problem 1 — Reading Conditional Probability from a Two-Way Table

Context: 120 students were surveyed about study location and quiz performance.

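| | Passes quiz | Does not pass | Total |
|---|---|---|---|
| Studies in library | 48 | 12 | 60 |
| Studies elsewhere | 36 | 24 | 60 |
| Total | 84 | 36 | 120 |

Let $A$ = “passes quiz” and $B$ = “studies in library.” Find $P(A \mid B)$.

(a) Which value is the denominator of $P(A \mid B)$?

(b) Compute $P(A \mid B)$.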

Context: 200 commuters were surveyed about transportation mode and on-time arrival.

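| | Arrives on time | Late | Total |
|---|---|---|---|
| Takes public transit | 72 | 48 | 120 |
| Drives | 56 | 24 | 80 |
| Total | 128 | 72 | 200 |

Let $A$ = “arrives on time” and $B$ = “takes public transit.” Find $P(A \mid B)$.

(a) Which value is the denominator of $P(A \mid B)$?

(b) Compute $P(A \mid B)$.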

Context: 150 gym members were surveyed about membership type and weekly frequency.

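| | Uses gym 3+ times/week | Less frequent | Total |
|---|---|---|---|
| Student member | 54 | 36 | 90 |
| Non-student member | 24 | 36 | 60 |
| Total | 78 | 72 | 150 |

Let $A$ = “uses gym 3+ times/week” and $B$ = “student member.” Find $P(A \mid B)$.

(a) Which value is the denominator of $P(A \mid B)$?

(b) Compute $P(A \mid B)$.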

Context: 100 adults were surveyed about diet type and self-reported energy levels.

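| | High energy | Low energy | Total |
|---|---|---|---|
| Vegetarian | 28 | 12 | 40 |
| Non-vegetarian | 36 | 24 | 60 |
| Total | 64 | 36 | 100 |

Let $A$ = “high energy” and $B$ = “vegetarian.” Find $P(A \mid B)$.

(a) Which value is the denominator of $P(A \mid B)$?

(b) Compute $P(A \mid B)$.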

Context: 200 students were surveyed about work status and course completion.

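| | Completes on time | Does not complete | Total |
|---|---|---|---|
| Works full-time | 48 | 72 | 120 |
| Does not work full-time | 56 | 24 | 80 |
| Total | 104 | 96 | 200 |

Let $A$ = “completes on time” and $B$ = “works full-time.” Find $P(A \mid B)$.

(a) Which value is the denominator of $P(A \mid B)$?

(b) Compute $P(A \mid B)$.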


Problem 2 — General Multiplication Rule

Given $P(B)$ and $P(A \mid B)$, apply the General Multiplication Rule to find $P(A \cap B)$.


Problem 3 — Tree Diagram Path Probability

A two-stage experiment is described. Find the joint probability of a specified path by multiplying along the branches.


Problem 4 — Independence Analysis

A group of 200 students is surveyed: 90 play video games regularly, 80 exercise regularly, and 36 do both.

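(a) What is $P(\text{video games} \cap \text{exercise})$?

(b) Test independence using the multiplication test. Compute $P(\text{video games}) \cdot P(\text{exercise})$ and compare to $P(\text{video games} \cap \text{exercise})$.

(c) Confirm using the conditional probability definition. Compute $P(\text{video games} \mid \text{exercise})$ and compare to $P(\text{video games})$.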

(d) A classmate says: “Since there are students who play video games AND exercise, the events can’t be mutually exclusive. And since they can co-occur, they must be independent.” Evaluate both claims.

Section 6: Independent Practice

Problem 1 — Two Conditional Probabilities from a Table

Context: 200 adults were surveyed about vitamin use and cold frequency.

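| | Fewer colds | Normal/more colds | Total |
|---|---|---|---|
| Takes multivitamins | 72 | 48 | 120 |
| Does not take vitamins | 36 | 44 | 80 |
| Total | 108 | 92 | 200 |

Let $A$ = “fewer colds” and $B$ = “takes vitamins.”

(a) Compute $P(A \mid B)$.

(b) Compute $P(B \mid A)$.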

(c) A classmate says: “The probability of fewer colds given you take vitamins equals the probability of taking vitamins given fewer colds.” Is this correct?

Context: 150 students were surveyed about fiction reading and stress levels.

| | Lower stress | Normal/high stress | Total |
|---|---|---|---|
| Reads fiction regularly | 45 | 30 | 75 |
| Does not read fiction | 30 | 45 | 75 |
| Total | 75 | 75 | 150 |

Let $A$ = “lower stress” and $B$ = “reads fiction regularly.”

(a) Compute $P(A \mid B)$.

(b) Compute $P(B \mid A)$.

(c) A classmate says: “$P(A \mid B) = P(B \mid A)$ for this data, so direction doesn’t matter.” Is this a valid general conclusion?

Context: 200 commuters were surveyed about bike commuting and on-time arrival.

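| | On time (within 10 min) | Late | Total |
|---|---|---|---|
| Commutes by bike | 56 | 24 | 80 |
| Does not bike | 72 | 48 | 120 |
| Total | 128 | 72 | 200 |

Let $A$ = “on time” and $B$ = “bikes.”

(a) Compute $P(A \mid B)$.

(b) Compute $P(B \mid A)$.

(c) Are $P(A \mid B)$ and $P(B \mid A)$ equal?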

Context: 150 adults were surveyed about part-time work and car ownership.

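| | Owns a car | No car | Total |
|---|---|---|---|
| Works part-time | 36 | 24 | 60 |
| Does not work part-time | 54 | 36 | 90 |
| Total | 90 | 60 | 150 |

Let $A$ = “owns a car” and $B$ = “works part-time.”

(a) Compute $P(A \mid B)$.

(b) Compute $P(B \mid A)$.

(c) Are these events independent? (Hint: compare $P(A \mid B)$ to $P(A)$, and $P(B \mid A)$ to $P(B)$.)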

Context: 200 students were surveyed about planner app use and deadline adherence.

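| | Meets all deadlines | Misses at least one | Total |
|---|---|---|---|
| Uses planner app | 60 | 40 | 100 |
| Does not use planner | 40 | 60 | 100 |
| Total | 100 | 100 | 200 |

Let $A$ = “meets all deadlines” and $B$ = “uses planner app.”

(a) Compute $P(A \mid B)$.

(b) Compute $P(B \mid A)$.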

(c) A classmate says: “The probability of meeting deadlines given you use a planner equals the probability of using a planner given you meet deadlines.” Is this true?


Problem 2 — Total Probability from a Tree Diagram

A three-branch experiment with different conditional probabilities at stage 2. Find a joint probability and the total probability of a stage-2 outcome.


Problem 3 — Independence Test from Given Probabilities

Given: , , .

(a) Are “buys premium coffee” and “owns a car” independent?

(b) What is ?

Given: , , .

(a) Are “reads news daily” and “votes in every election” independent?

(b) Compute .

Given: , , .

(a) Are “subscribes to streaming” and “watches live sports” independent?

(b) Compute .

Given: , , .

(a) Are “exercises 3+ times/week” and “sleeps 7+ hours/night” independent?

(b) Compute .

Given: , , .

(a) Are “uses budgeting app” and “has no credit card debt” independent?

(b) Compute .


Problem 4 — Find the Error

A doctor’s claim: “Our test detects 90% of true positives. So if your test came back positive, there’s a 90% chance you have the disease.”

Patient’s interpretation: “$P(\text{disease} \mid \text{positive}) = 0.90$”

What is the specific error?

Show Correction

The stated 90% is sensitivity: $P(\text{positive} \mid \text{disease}) = 0.90$. This answers “given you have the disease, how likely is a positive test?”

The patient’s question is: “given I tested positive, how likely is it that I have the disease?” That is $P(\text{disease} \mid \text{positive})$.

These are different. Without knowing the prevalence (base rate of disease in the population), $P(\text{disease} \mid \text{positive})$ cannot be determined. For rare diseases (say 1% prevalence), $P(\text{disease} \mid \text{positive})$ can be dramatically lower than 90%, even with a 90%-sensitive test.

Student’s work:

Given , , and .

What is the specific error?

Show Correction

The conditional probability formula is $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$. The denominator is always $P(B)$, the probability of the conditioning event.

Correct calculation:

Using as the denominator is equivalent to computing — the joint probability — not the conditional.

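Student’s claim: “Taking the bus and being on time are completely different things, so they’re independent. Therefore: $P(\text{bus} \cap \text{on time}) = P(\text{bus}) \cdot P(\text{on time}) = 0.195$.”

The actual table gives $P(\text{bus} \cap \text{on time}) = 0.25$.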

What is the specific error?

Show Correction

The simplified rule requires verified independence. “Being different things” does not imply independence — independence is a mathematical property, not a conceptual one.

The test: $P(\text{bus}) \cdot P(\text{on time}) = 0.195 \neq 0.25 = P(\text{bus} \cap \text{on time})$.

Since the values are unequal, the events are dependent. The correct joint probability is 0.25 (from the table), not 0.195.

Student’s work: From a two-way table, the student computes $P(A \mid B)$. Then writes:

“By symmetry, $P(B \mid A)$ has the same value as well.”

The actual table gives a different value for $P(B \mid A)$.

What is the specific error?

Show Correction

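$P(A \mid B)$ and $P(B \mid A)$ share the same numerator $P(A \cap B)$, but they have different denominators:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \qquad P(B \mid A) = \frac{P(A \cap B)}{P(A)}$$

They are equal only when $P(A) = P(B)$. In general they differ; here 0.28 ≠ 0.42. Always compute each direction separately using its own denominator.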

Student’s reasoning: “I computed $P(A \cap B) = 0$, so events $A$ and $B$ are mutually exclusive. Since they don’t overlap, they don’t affect each other, and therefore they’re independent.”

What is the specific error?

Show Correction

Mutually exclusive events (with $P(A) > 0$ and $P(B) > 0$) are always dependent, never independent.

Proof: $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{0}{P(B)} = 0$. But $P(A) > 0$. So $P(A \mid B) \neq P(A)$, and the events are dependent.

Intuition: if $A$ and $B$ cannot co-occur, knowing $B$ happened makes $A$ impossible. That is the most extreme form of dependence: knowing $B$ completely eliminates $A$.


Problem 5 — Multi-Step Synthesis

A company’s customer database contains 500 customers. The data cross-tabulates loyalty card status and repeat purchase behavior.

(a) Complete the 2×2 frequency table. What is the count of customers who have a loyalty card but did NOT make a repeat purchase?

(b) Compute and .

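(c) Test independence: compute $P(\text{loyalty card}) \cdot P(\text{repeat purchase})$ and compare to $P(\text{loyalty card} \cap \text{repeat purchase})$.

(d) Compute $P(\text{repeat purchase} \mid \text{loyalty card})$ and $P(\text{loyalty card} \mid \text{repeat purchase})$. Are they equal?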

(e) A marketing manager says: “Since 50% of loyalty card holders made a repeat purchase, our loyalty program is working — half our loyal customers come back.” A skeptical analyst replies: “That number doesn’t tell us if the loyalty card caused the repeat purchase, and it doesn’t account for base rates.” Who is correct?

Section 7: Mastery Check

No hints. No scaffolds. Model answers are behind the disclosure triangle.

Question 1 — Feynman Test

Explain to a classmate who missed today’s lesson why the probability of having a disease given a positive test result is NOT the same as the probability of testing positive given you have the disease. Use the words “conditioning direction” and give a concrete example where the two values are at least 0.50 apart from each other.

Model Answer

The conditioning direction specifies which event we already know happened. $P(\text{disease} \mid \text{positive})$ answers “given I got a positive test, what’s the chance I’m actually sick?” while $P(\text{positive} \mid \text{disease})$ answers “given I’m sick, what’s the chance the test catches it?” These are two different questions with two different denominators: one restricts the universe to positive-test people, the other to sick people.

Concrete example: A disease affects 2% of the population. The test has 90% sensitivity ($P(\text{positive} \mid \text{disease}) = 0.90$) and 95% specificity ($P(\text{negative} \mid \text{no disease}) = 0.95$).

Using a tree diagram on 10,000 people:

  • 200 have the disease; 180 test positive (90% of 200)
  • 9,800 don’t have it; 490 still test positive (5% false positive rate)
  • Total positive tests: 670

$$P(\text{disease} \mid \text{positive}) = \frac{180}{670} \approx 0.27$$

The two values: $P(\text{positive} \mid \text{disease}) = 0.90$ vs. $P(\text{disease} \mid \text{positive}) \approx 0.27$, a difference of 0.63. Reversing the conditioning direction without accounting for base rates leads to dramatic overestimation of disease probability after a positive test.


Question 2 — Two-Stage Quality Control

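A manufacturing plant uses a two-stage quality check. Stage 1 inspection catches 85% of defective items ($P(\text{caught at S1} \mid \text{defective}) = 0.85$). Items that pass stage 1 go to stage 2; stage 2 catches 70% of the defectives that reached it ($P(\text{caught at S2} \mid \text{defective and passed S1}) = 0.70$). Suppose 4% of all items are defective.

(a) Which tool is most appropriate for organizing this problem?

(b) What is $P(\text{defective} \cap \text{passes both stages})$?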

(c) Are “caught at stage 1” and “caught at stage 2” independent events for a defective item?

Show Solution

Tree diagram for a defective item:

| Stage 1 | P | Stage 2 | P (given passed S1) | Joint |
|---|---|---|---|---|
| Caught | 0.85 | (removed) | | 0.85 |
| Passes | 0.15 | Caught | 0.70 | 0.15 × 0.70 = 0.105 |
| Passes | 0.15 | Passes | 0.30 | 0.15 × 0.30 = 0.045 |

This 0.045 is the probability that a defective item passes both stages, given it’s defective.

Joint probability (defective AND passes both): $P(\text{defective} \cap \text{passes both}) = 0.04 \times 0.045 = 0.0018$

The stages are dependent because stage 2 only sees items that passed stage 1. $P(\text{caught at S2} \mid \text{passed S1}) = 0.70$ is a conditional probability on a restricted population; it cannot equal the unconditional $P(\text{caught at S2})$ because stage 2 never sees items caught at stage 1.
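The solution’s arithmetic can be re-traced with a short script. This is only a sketch of the tree above; the variable names are assumptions introduced for illustration.

```python
# Given in the problem statement.
p_defective = 0.04              # P(item is defective)
p_caught_s1 = 0.85              # P(caught at stage 1 | defective)
p_caught_s2_given_pass = 0.70   # P(caught at stage 2 | defective, passed stage 1)

# Branches for a defective item.
p_pass_s1 = 1 - p_caught_s1                              # 0.15
p_caught_at_s2 = p_pass_s1 * p_caught_s2_given_pass      # path: passes S1, caught at S2
p_escapes_both = p_pass_s1 * (1 - p_caught_s2_given_pass)  # path: passes S1, passes S2

# The three paths for a defective item must cover every possibility.
total = p_caught_s1 + p_caught_at_s2 + p_escapes_both

# Joint probability over all items: defective AND escapes both stages.
p_defective_and_escapes = p_defective * p_escapes_both

print(f"caught at S1:        {p_caught_s1:.3f}")
print(f"caught at S2:        {p_caught_at_s2:.3f}")
print(f"escapes both:        {p_escapes_both:.3f}")
print(f"defective & escapes: {p_defective_and_escapes:.4f}")
```

Comparing `p_caught_at_s2` (the unconditional chance that a defective item is caught at stage 2) with `p_caught_s2_given_pass` shows the dependence numerically: the two values differ because stage 2 only ever sees items that passed stage 1.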


Question 3 — Error Analysis

A student’s work:

A study found that and . The student needs .

”Since there’s no reason these would be related, I’ll apply the multiplication rule: .”

The actual data shows $P(A \cap B) = 0.20$.

Which statement best identifies the error?

Show Analysis

The simplified rule $P(A \cap B) = P(A) \cdot P(B)$ is valid only for independent events. The student reasoned from context (“no reason these would be related”) rather than running the test.

Correct procedure:

  1. Compute $P(A) \cdot P(B)$
  2. Compare to the actual $P(A \cap B)$
  3. Since $P(A) \cdot P(B) \neq P(A \cap B)$, the events are dependent

The correct joint probability is 0.20, taken from the data, not from the formula. The formula cannot substitute for actual data; it can only confirm independence after the fact.

What makes this valid: $P(A \cap B) = P(A) \cdot P(B)$ is a consequence of independence. It is a fact you verify, not an assumption you make from context.


Self-Assessment

How confident are you with this material?

Still confused ↔ Ready for the Boss Fight

Section 8: Boss Fight

Choose your path. Both require integrating all eight concepts from this lesson.

🔬 Path A — The Medical Analyst

A clinical lab tests patients for a rare condition. Sensitivity and specificity are known. Compute the probability that someone with a positive test actually has the disease — and explain why the answer surprises.

📊 Path B — The Policy Analyst

An insurance company claims that age and accident involvement are independent events. Given a 2×2 frequency table, test the claim mathematically, identify the error in their reasoning, and correct their calculation.

Path A — The Medical Analyst

A clinical lab tests patients for a rare condition affecting 2% of the population. The test has:

  • Sensitivity = 0.90: $P(\text{positive} \mid \text{disease}) = 0.90$
  • Specificity = 0.95: $P(\text{negative} \mid \text{no disease}) = 0.95$

Task 1. Set up and label the complete tree diagram for a randomly selected patient.

Include: Stage 1 branches (disease / no disease) with unconditional probabilities. Stage 2 branches from each node (positive / negative test) with conditional probabilities. Joint probabilities at all four path ends.


Task 2. Using your tree, compute $P(\text{positive})$, the probability that a randomly selected person from this population tests positive.

Show the full calculation: identify the two paths that lead to a positive result, compute each joint probability, then sum them.


Task 3. Compute $P(\text{disease} \mid \text{positive})$ using the conditional probability formula and your tree results.

Then compare it to the 90% sensitivity. Write 2–3 sentences explaining why the answer might surprise someone who knows the test is “90% accurate.”


Task 4. Compute $P(\text{no disease} \mid \text{negative})$. What does this number tell a patient who tests negative?


Reflection:


Path B — The Policy Analyst

An insurance company claims: “Our data show that being a young driver and being involved in an accident are independent events, so we multiply P(young driver) × P(accident) to find the probability that a randomly selected driver is both young and has had an accident.”

The company’s database contains 5,000 records:

| | Accident occurred | No accident | Total |
|---|---|---|---|
| Young driver (≤25) | 180 | 1,320 | 1,500 |
| Older driver (>25) | 140 | 3,360 | 3,500 |
| Total | 320 | 4,680 | 5,000 |

Task 1. Test the independence claim using the multiplication test.

Compute $P(\text{young driver}) \cdot P(\text{accident})$ and compare to $P(\text{young driver} \cap \text{accident})$. State your conclusion.


Task 2. Confirm using the conditional probability comparison.

Compute $P(\text{accident} \mid \text{young driver})$ and $P(\text{accident})$. Are they equal? What does the comparison tell you?


Task 3. The company computed $P(\text{young driver}) \cdot P(\text{accident})$ to set premiums for young drivers. Explain the consequence of applying the simplified multiplication rule when the events are actually dependent.


Task 4. Compute the correct $P(\text{young driver} \cap \text{accident})$ from the table. Compare it to the company’s incorrect calculation. By what factor does the company underestimate (or overestimate) the true joint probability?


Reflection:


Section 9: Challenge Problems

Ready for more? These go beyond the lesson objectives.

Challenge 1 — Algebraic Derivation

Start from the conditional probability formula $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$.

(a) Rearrange the formula algebraically to derive the General Multiplication Rule. Show each step.

(b) If $A$ and $B$ are independent, substitute the definition of independence ($P(A \mid B) = P(A)$) into the General Multiplication Rule. What do you get?

(c) In a tree diagram, if stage 1 and stage 2 are independent experiments, what is special about the stage-2 branch probabilities?


Challenge 2 — Redundant Inspection Systems

Three independent quality inspectors each miss a defect with probability 0.15. (A “miss” means the inspector fails to catch an existing defect.)

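(a) Compute $P(\text{all three inspectors miss the defect})$.

(b) Compute $P(\text{at least one inspector catches the defect})$.

(c) How many inspectors are needed to reduce $P(\text{all inspectors miss})$ below 0.001?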


Challenge 3 — Conditional Probability Across Three Groups

A three-group conditional probability problem. Compute conditional probabilities within each group and observe how they vary.

Section 10: Solutions Reference

View Full Solutions Page

The solutions page contains complete worked answers for all practice problems in Sections 5–9, including all five variant bank variants, generator problem examples, and Boss Fight both paths.

Quick-Reference Formulas — PR-2

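| Formula | Purpose | When to use |
|---|---|---|
| $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$ | Conditional probability | Any time you see “given”; requires $P(B) > 0$ |
| $P(A \cap B) = P(B) \cdot P(A \mid B)$ | General Multiplication Rule | Find joint probability from conditional; works for dependent or independent events |
| $P(A \cap B) = P(A) \cdot P(B)$ | Simplified Multiplication Rule | Independent events only; verify independence first |
| $P(A \mid B) = P(A)$ | Definition of independence | Test independence: if this holds, events are independent |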

Key distinctions to remember:

  • $P(A \mid B)$ vs. $P(B \mid A)$: Same numerator, different denominators. Not equal in general. Direction matters.
  • Mutually exclusive vs. independent: ME events (with positive probability) are always dependent. They are opposite concepts, not related ones.
  • Simplified vs. General Multiplication Rule: The simplified rule is a special case of the general rule, valid only when independence is confirmed.
  • $P(A \mid B)$ from a table: Numerator = intersection cell; denominator = marginal total for $B$ (row or column total for the conditioning event, not the grand total).