
INF-4: Confidence Intervals for a Proportion

Module 4 · Statistical Inference

Section 1: Introduction

Every election season, pollsters publish results like this: “Candidate A leads with 54% support — margin of error ±3 percentage points, 19 times out of 20.” That last phrase — “19 times out of 20” — is not a throwaway disclaimer. It is the heart of a 95% confidence interval. But where does the ±3 come from? How do pollsters know what sample size they need to hit that precision? And what exactly are they claiming when they publish an interval?

In INF-2, you built confidence intervals for a population mean using the formula $\bar{x} \pm z^* \cdot \sigma/\sqrt{n}$. This lesson keeps the same skeleton (point estimate ± margin of error) but swaps in a new statistic: the sample proportion $\hat{p}$. The central challenge is that proportions have their own standard error formula, and that formula introduces a subtlety that trips up almost everyone the first time.

Here is the real-world question we will be able to answer by the end of this lesson: A Léger Research poll contacts 1,000 Canadians and finds that 540 support a proposed climate policy. How confident can we be in the true level of support across all Canadians, and how many people would we need to survey to narrow that uncertainty further?

After this lesson, you will be able to:

  • Compute the sample proportion $\hat{p}$ and identify it as a point estimate for the population proportion $p$
  • Verify the conditions required for the z-interval to be valid ($n\hat{p} \geq 5$ and $n(1-\hat{p}) \geq 5$)
  • Construct a two-sided confidence interval for $p$ using $\hat{p} \pm z^*\sqrt{\hat{p}(1-\hat{p})/n}$
  • Interpret a confidence interval correctly: in terms of the method’s long-run reliability, not the probability that $p$ falls in any single interval
  • Determine the minimum sample size needed to achieve a target margin of error, both with and without a prior estimate of $p$
  • Construct one-sided confidence bounds and recognize when they are more appropriate than a two-sided interval

If INF-2 gave you the blueprint, this lesson gives you a new set of tools for a different kind of question. The math is closely parallel — which means most of what you already know transfers directly.

Section 2: Prerequisites

Confidence intervals for proportions are built on the same “Point Estimate ± Margin of Error” framework you mastered in INF-2.

  • From INF-2: Critical Z-Values. The same $z^*$ values ($z^* = 1.96$ for 95%) apply whenever we use the normal distribution as our model.
  • From INF-2: The Quadruple Rule. To cut the margin of error in half, you must multiply the sample size by four.
  • Percentages to Proportions: 54% is the same as 0.54. You must use the decimal form in all formulas.
  • Square Root Properties: $\sqrt{a/b} = \sqrt{a}/\sqrt{b}$. You will be working with $\sqrt{\hat{p}(1-\hat{p})/n}$, so ensure you are taking the square root of the entire result.

Retrieval Checkpoint

A poll reports a 95% confidence interval for a proportion as . What is the margin of error () in this study?

Success Factor:

  • In this lesson, we use $z^*$ only if the sample size is “large enough.” If $n\hat{p} < 5$ or $n(1-\hat{p}) < 5$, the normal distribution is a poor model and these methods cannot be used. Always check your conditions first.

Retrieval Warm-up — from earlier lessons

A random sample of measurements is taken from a population with and . Which of the following is the correct statement about the sampling distribution of ?

You read a news article: “A new 95% CI shows the government approval rating is between 42% and 50%.” A researcher says this is the sample size determination problem — she wants to cut the margin of error in half before the next election. By what factor must she multiply the sample size?

Section 3: Core Concepts

Navigation tip: Eight concepts live in this section. They build on each other in order — C1 through C4 give you the interval formula, C5 tells you when to use it, C6 tells you how to talk about it, C7 adds the sample-size tool, and C8 covers one-sided variants. If you already know INF-2 well, C1–C4 will go quickly.

C1 — The Sample Proportion

Suppose we want to know what fraction of Montréal adults have a library card. We can’t ask all 1.8 million, so we sample $n$ people and count how many ($x$) have one. The sample proportion is simply that fraction:

Sample Proportion

If $x$ individuals in a random sample of size $n$ have a characteristic of interest, the sample proportion is:

$$\hat{p} = \frac{x}{n}$$

$\hat{p}$ (read “p-hat”) estimates the unknown population proportion $p$. $x$ must be a count (a whole number); $n$ is the sample size.

Notice the notation carefully: $p$ is the true proportion in the whole population: unknown, fixed, a parameter. $\hat{p}$ is what we compute from our data: known, varies from sample to sample, a statistic. This distinction is going to matter a great deal when we write the standard error formula.

$\hat{p}$ and $p$ are not the same thing. $p$ is the true population proportion; it exists, but we don’t know its value (that’s why we’re building an interval). $\hat{p}$ is our best guess from the data. We will always use $\hat{p}$ in our calculations, never $p$, because we don’t have $p$.


C2 — The Standard Error of $\hat{p}$

In INF-2, the standard error of $\bar{x}$ was $\sigma/\sqrt{n}$. Where did that come from? From the variance of a sum of random variables. The same idea applies here, but for a proportion.

Think of each sampled person as a Bernoulli trial: success (has the characteristic) with probability $p$, failure with probability $1 - p$. The sample count $x$ is binomial. Recall from PR-4 that a binomial random variable has variance $np(1-p)$. Dividing the count by $n$ to get the proportion divides the variance by $n^2$, giving variance $p(1-p)/n$. Taking the square root gives the standard error:

Standard Error of a Proportion

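The theoretical standard error of $\hat{p}$ is:

$$SE(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}$$

Since $p$ is unknown, we substitute $\hat{p}$ to get the estimated standard error:

$$\widehat{SE}(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$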

Notice: we estimate an estimate. We use $\hat{p}$ to estimate $p$ for the SE formula itself. This works fine in practice, but it’s one more reason the conditions check (C5) matters: the approximation is only reliable when $n$ is large enough.

Never write $\sqrt{p(1-p)/n}$ in a calculation. You do not know $p$. The formula that goes into actual computations always uses $\hat{p}$: $\sqrt{\hat{p}(1-\hat{p})/n}$. Only the theoretical expression uses $p$.
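
As a minimal sketch of the estimated standard error in Python: the function name `estimated_se` and its arguments are illustrative choices, not part of the lesson, and the counts are the transit-poll numbers used later in Example 1.

```python
import math

def estimated_se(x: int, n: int) -> float:
    """Estimated standard error of p-hat: sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n  # the sample proportion stands in for the unknown p
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Example 1 counts: 312 supporters out of 600 respondents
print(round(estimated_se(312, 600), 4))  # 0.0204
```

Note that the function never takes $p$ as an input; only the observed count and the sample size are available.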


C3 — Margin of Error

The margin of error is the “±” part of the interval. It answers: “How far from $\hat{p}$ do we need to reach to be confident we’ve captured $p$?”

Margin of Error for a Proportion

$$E = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

where $z^*$ is the critical value for the desired confidence level (1.645 for 90%, 1.96 for 95%, 2.576 for 99%).

The margin of error is the half-width of the interval, not the full width. A 95% CI with $E = 0.03$ spans 0.06 units in total. When a poll reports “margin of error ±3 points,” that ±3 is exactly $E$, expressed in percentage points.

The margin of error is half the interval width. If $E = 0.04$, the interval is $(\hat{p} - 0.04,\ \hat{p} + 0.04)$, which has total width 0.08, not 0.04. The “±” already signals this, but it’s easy to report $E$ as the width when writing up results.


C4 — The Confidence Interval Formula

Two-Sided Confidence Interval for a Proportion

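A confidence interval for the population proportion $p$ at a chosen confidence level is:

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

which gives the interval $(\hat{p} - E,\ \hat{p} + E)$.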

Conditions must be checked before applying this formula (see C5).

The structure is identical to INF-2: point estimate ± (critical value × standard error). Only the point estimate and SE formula have changed. Everything else — the z* values, the interpretation logic, the sample size approach — carries over directly.
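
The pieces above can be put together in a short Python sketch. The function name `proportion_ci` is a hypothetical choice, and the critical value is taken from the standard library's `statistics.NormalDist` rather than a rounded table value, so results may differ from hand calculations in the last decimal place. The counts are the poll from the introduction.

```python
import math
from statistics import NormalDist

def proportion_ci(x: int, n: int, confidence: float = 0.95) -> tuple:
    """Two-sided z-interval for a proportion (conditions are checked separately)."""
    p_hat = x / n
    # Two-sided critical value: split alpha evenly between the two tails
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Poll from the introduction: 540 supporters among 1,000 respondents
lower, upper = proportion_ci(540, 1000)
print(f"({lower:.3f}, {upper:.3f})")  # (0.509, 0.571)
```

The half-width here is about 0.031, which is where a reported “±3 points” for a poll of 1,000 comes from.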


C5 — Conditions for the z-Interval

The CI formula works because $\hat{p}$ is approximately normally distributed when the sample is large enough (this follows from the CLT applied to a proportion). “Large enough” has a specific meaning here:

Conditions for the z-Interval for a Proportion

Before computing a CI for $p$, verify both:

  1. $n\hat{p} \geq 5$: at least 5 successes in the sample
  2. $n(1-\hat{p}) \geq 5$: at least 5 failures in the sample

Also assume the data come from a random sample (or can be treated as one), and that $n$ is a small fraction of the population size (so observations are approximately independent).

Intuitively: if almost everyone in your sample is a “success” (say, $\hat{p}$ very close to 1 in a small sample), the distribution of $\hat{p}$ is strongly skewed and the normal approximation breaks down. The conditions $n\hat{p} \geq 5$ and $n(1-\hat{p}) \geq 5$ ensure there are enough of both outcomes for the CLT to kick in symmetrically.

Check conditions with $\hat{p}$, not $p$. You’re checking whether your observed sample has enough successes and failures. Use $n\hat{p}$ and $n(1-\hat{p})$; these are literally the observed counts of successes and failures in your sample (i.e., $x$ and $n - x$).
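
Because the two conditions reduce to raw counts, the check is a one-liner. The sketch below uses a hypothetical helper name, `z_interval_conditions_met`, and the threshold of 5 stated in this lesson; the counts come from Example 1 and from a small-sample case in Independent Practice.

```python
def z_interval_conditions_met(x: int, n: int, minimum: int = 5) -> bool:
    """Check the success/failure counts needed for the z-interval."""
    successes = x       # equals n * p_hat
    failures = n - x    # equals n * (1 - p_hat)
    return successes >= minimum and failures >= minimum

print(z_interval_conditions_met(312, 600))  # True
print(z_interval_conditions_met(2, 30))     # False
```

The check covers only the count conditions; random sampling and independence still have to be judged from how the data were collected.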


C6 — Correct Interpretation

This is the concept most students get wrong — even after getting the arithmetic right. The issue is subtle but important.

Imagine running 1,000 polls, each on a fresh random sample of the same size. Each poll gives a different $\hat{p}$, and therefore a different interval. With a 95% CI procedure, we expect about 950 of those 1,000 intervals to contain the true $p$. The other 50 or so will miss.

Now you run your poll. You get one specific interval, say (0.48, 0.56). Either $p$ is in that interval or it isn’t. There is no randomness left: $p$ is a fixed (unknown) number, and your interval is a fixed pair of numbers.

Correct CI Interpretation

“We are 95% confident that the true population proportion lies between [lower] and [upper].”

This means: the method we used to build this interval captures the true $p$ in 95% of all possible samples of this size.

“There is a 95% probability that $p$ is between [lower] and [upper].” This statement is incorrect. $p$ is not random; it is a fixed population parameter. It doesn’t have a probability of being in a range; it either is or isn’t. The 95% refers to the procedure, not to the probability that $p$ is in any one specific interval.
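
The “many polls” thought experiment can be simulated. The sketch below is illustrative only: the true proportion (0.52), the sample size (600), the number of polls (1,000) and the seed are all assumptions chosen for the demonstration, and `coverage_rate` is a hypothetical function name.

```python
import math
import random

def coverage_rate(p_true: float, n: int, num_polls: int, seed: int = 0) -> float:
    """Fraction of simulated 95% z-intervals that contain p_true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(num_polls):
        # Draw one poll: count successes among n independent respondents
        x = sum(rng.random() < p_true for _ in range(n))
        p_hat = x / n
        margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p_true <= p_hat + margin:
            hits += 1
    return hits / num_polls

rate = coverage_rate(p_true=0.52, n=600, num_polls=1000)
print(f"Share of intervals containing p: {rate:.3f}")
```

The printed share should land near 0.95 but will rarely equal it exactly, which is the point: the 95% describes the long-run behaviour of the procedure, and each individual interval either contains $p$ or does not.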


C7 — Sample Size Determination

Suppose you want to design a poll. You want your margin of error to be at most $E$ at a given confidence level. How large does your sample need to be? Solve the margin-of-error formula for $n$:

Sample Size Formula for a Proportion

To achieve a margin of error of at most $E$:

$$n = \left(\frac{z^*}{E}\right)^2 p^*(1 - p^*)$$

where $p^*$ is your best prior estimate for $p$. If no prior estimate is available, use $p^* = 0.5$; this maximizes $p^*(1 - p^*)$ and gives the largest (most conservative) sample size.

Always round up to the next whole number.

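Why does $p^* = 0.5$ give the largest sample? Because $p^*(1 - p^*)$ is maximized at $p^* = 0.5$ (you can verify: $0.5 \times 0.5 = 0.25$, while $0.3 \times 0.7 = 0.21$ and $0.1 \times 0.9 = 0.09$). Using $p^* = 0.5$ ensures you won’t undershoot the required sample size, no matter what the true $p$ turns out to be.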
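
For readers who want the algebra behind “solve the margin-of-error formula for $n$,” here is a sketch of the steps. The symbol $p^*$ for the prior estimate is a notational assumption made here.

$$
\begin{aligned}
E &= z^* \sqrt{\frac{p^*(1-p^*)}{n}} \\
E^2 &= (z^*)^2 \, \frac{p^*(1-p^*)}{n} \\
n &= \left(\frac{z^*}{E}\right)^2 p^*(1-p^*)
\end{aligned}
$$

With $p^* = 0.5$ the last line reduces to $n = (z^*)^2 / (4E^2)$, which makes the quadruple rule visible: halving $E$ multiplies $n$ by four.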

Always round sample size UP. If the formula gives 384.16, you need 385 people — not 384. Rounding down means you’ve committed to a margin of error larger than the target, which defeats the purpose of the calculation.
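
The round-up rule is easy to get wrong in software, where a plain `round` would round to the nearest integer. A minimal sketch, assuming a hypothetical helper `required_sample_size` that takes the table value of the critical value as an argument:

```python
import math

def required_sample_size(margin: float, z_star: float = 1.96, p_prior: float = 0.5) -> int:
    """Smallest n whose planned margin of error is at most `margin`."""
    n_raw = (z_star / margin) ** 2 * p_prior * (1 - p_prior)
    return math.ceil(n_raw)  # always round up, never to the nearest integer

print(required_sample_size(0.05))                # 385  (raw value 384.16)
print(required_sample_size(0.03, p_prior=0.40))  # 1025 (prior estimate of 40%)
```

Leaving `p_prior` at its default of 0.5 gives the conservative worst-case sample size.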


C8 — One-Sided Confidence Bounds

Sometimes you only care about one direction. A hospital might want to demonstrate that their patient satisfaction rate exceeds 80%. A regulator might want to show that a defect rate is below 5%. For these questions, a one-sided bound is more appropriate than a two-sided interval.

One-Sided Confidence Bounds

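One-sided upper bound (claim: $p$ is at most this):

$$\hat{p} + z_{\alpha} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

One-sided lower bound (claim: $p$ is at least this):

$$\hat{p} - z_{\alpha} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Note: use $z_{\alpha}$ (not $z_{\alpha/2}$), which is smaller. For 95% one-sided: $z_{\alpha} = 1.645$ (not 1.96).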

The key difference: a one-sided 95% bound puts all 5% of uncertainty into one tail, so it uses $z^* = 1.645$. A two-sided 95% interval splits the 5% equally, 2.5% in each tail, giving $z^* = 1.96$. One-sided bounds are tighter than two-sided intervals at the same confidence level.
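
The difference between the two critical values can be seen directly from the normal quantile function. This is a sketch using the standard library's `statistics.NormalDist`; the function names are hypothetical, and the counts (168 of 200) are the hospital data from Example 4.

```python
import math
from statistics import NormalDist

def one_sided_lower_bound(x: int, n: int, confidence: float = 0.95) -> float:
    """Lower confidence bound: all of alpha sits in the lower tail."""
    p_hat = x / n
    z_one_sided = NormalDist().inv_cdf(confidence)
    return p_hat - z_one_sided * math.sqrt(p_hat * (1 - p_hat) / n)

def one_sided_upper_bound(x: int, n: int, confidence: float = 0.95) -> float:
    """Upper confidence bound: all of alpha sits in the upper tail."""
    p_hat = x / n
    z_one_sided = NormalDist().inv_cdf(confidence)
    return p_hat + z_one_sided * math.sqrt(p_hat * (1 - p_hat) / n)

z_one = NormalDist().inv_cdf(0.95)   # 5% in one tail
z_two = NormalDist().inv_cdf(0.975)  # 2.5% in each of two tails
print(round(z_one, 3), round(z_two, 3))            # 1.645 1.96
print(round(one_sided_lower_bound(168, 200), 3))   # 0.797
```

Only one of the two bounds is reported for a given question; together they do not form a 95% interval.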

Looking ahead: In INF-6, you will use the proportion $p_0$, a specific hypothesized value, to test a claim. The notation $\hat{p}$ (observed) vs. $p_0$ (hypothesized) will become critical. Start noticing this distinction now.

Section 4: Worked Examples

Four examples with progressively less scaffolding. Work through each one at your own pace — the first is fully narrated so you can see the complete thought process.

Example 1 — A Transit Policy Poll (Fully Worked)

Problem: A Léger poll contacts 600 Quebecers at random. Of those, 312 say they support a proposed new transit line. Build a 95% confidence interval for the true proportion of Quebec residents who support the policy.

Step 1: Identify what we know.

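  • $n = 600$ (sample size)
  • $x = 312$ (number of supporters in the sample)
  • Confidence level = 95%, so $z^* = 1.96$

Step 2: Compute the sample proportion.

$$\hat{p} = \frac{x}{n} = \frac{312}{600} = 0.52$$

So 52% of our sample supports the policy. This is our point estimate for $p$.

Step 3: Check the conditions.

We need $n\hat{p} \geq 5$ and $n(1-\hat{p}) \geq 5$:

$$n\hat{p} = 600(0.52) = 312 \geq 5, \qquad n(1-\hat{p}) = 600(0.48) = 288 \geq 5$$

Conditions are met. (Notice: $n\hat{p} = x = 312$ and $n(1-\hat{p}) = n - x = 288$, the raw counts themselves.)

Step 4: Compute the standard error.

$$SE = \sqrt{\frac{0.52 \times 0.48}{600}} = \sqrt{0.000416} \approx 0.0204$$

Step 5: Compute the margin of error.

$$E = 1.96 \times 0.0204 \approx 0.040$$

Step 6: Build the interval.

Lower bound: $0.52 - 0.040 = 0.480$   Upper bound: $0.52 + 0.040 = 0.560$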

Step 7: Interpret in context.

We are 95% confident that the true proportion of Quebec residents who support the transit policy is between 48.0% and 56.0%.

Reality check: The margin of error is ±4.0 percentage points, consistent with what you’d see in a real Léger poll of this size. A sample of 600 gives reasonably precise estimates for proportions near 0.5 — but the interval still spans 8 full percentage points. To cut that in half, you’d need to quadruple the sample size.


Example 2 — Quality Control (Partially Scaffolded)

Problem: A quality-control inspector randomly samples 80 juice bottles from a production run. She finds 12 with a filling defect. Build a 90% confidence interval for the true defect rate.

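Setup: $n = 80$, $x = 12$, confidence level = 90%.

The defect rate in our sample is $\hat{p} = 12/80 = 0.15$. At 90% confidence, we use $z^* = 1.645$. Do you expect the 90% interval to be wider or narrower than a 95% interval on the same data?

Step 1: Sample proportion.

$$\hat{p} = \frac{12}{80} = 0.15$$

Step 2: Check conditions.

$$n\hat{p} = 80(0.15) = 12 \geq 5, \qquad n(1-\hat{p}) = 80(0.85) = 68 \geq 5$$

Step 3: Standard error and margin of error.

$$SE = \sqrt{\frac{0.15 \times 0.85}{80}} \approx 0.03992, \qquad E = 1.645 \times 0.03992 \approx 0.066$$

Step 4: Interval.

$$0.15 \pm 0.066 = (0.084, 0.216)$$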

Interpretation: We are 90% confident that between 8.4% and 21.6% of all bottles produced in this run have a defect.

Answer to the prediction: Narrower. Lower confidence → smaller $z^*$ → smaller $E$. The 95% interval on the same data would be $0.15 \pm 1.96 \times 0.03992 \approx 0.15 \pm 0.078 = (0.072, 0.228)$, noticeably wider.


Example 3 — Designing a CEGEP Survey (Minimally Scaffolded)

Problem: A researcher wants to estimate the proportion of CEGEP students in Quebec who work more than 15 hours per week during the semester. She wants a margin of error of at most 4 percentage points at 95% confidence. No prior estimate of the proportion is available. How many students must be surveyed?

Hint: Use the sample size formula with $p^* = 0.5$ (worst case), $z^* = 1.96$, and $E = 0.04$. Solve for $n$ and round up.

Show Solution

$$n = \left(\frac{1.96}{0.04}\right)^2 (0.5)(0.5) = 49^2 \times 0.25 = 600.25$$

Round up: $n = 601$ students.

Interpretation: To guarantee a margin of error of at most 4 percentage points at 95% confidence — regardless of the true proportion — we must survey at least 601 students.

Common mistake: Rounding 600.25 down to 600. Always round up — 601 guarantees the target precision; 600 does not.


Example 4 — Hospital Patient Satisfaction (Application Twist)

Problem: A Montréal hospital tracks patient satisfaction. Out of 200 randomly selected discharged patients, 168 rate the discharge process as satisfactory. Hospital administration wants to demonstrate that the true satisfaction rate exceeds 80%. Compute a 95% one-sided lower bound and interpret it.

Why one-sided? The administration only cares about the lower end: can they claim the rate is at least some value? A one-sided lower bound is exactly right. For 95% one-sided, $z^* = 1.645$ (not 1.96, because all the uncertainty goes into one tail).

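Step 1: $\hat{p} = 168/200 = 0.84$

Step 2: Conditions: $n\hat{p} = 168 \geq 5$ ✓; $n(1-\hat{p}) = 32 \geq 5$ ✓

Step 3: $SE = \sqrt{0.84 \times 0.16 / 200} \approx 0.02592$

Step 4: One-sided lower bound: $0.84 - 1.645 \times 0.02592 \approx 0.84 - 0.0426 = 0.797$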

Interpretation: We are 95% confident that at least 79.7% of patients are satisfied with the discharge process.

Does this support the claim that the rate exceeds 80%? Our lower bound is 79.7%, just barely below 80%. The bound does not provide 95% confidence that the rate exceeds 80% (79.7% < 80%). The administration’s claim is not supported at this confidence level. If they used 90% confidence ($z^* = 1.282$), the lower bound would be $0.84 - 1.282 \times 0.02592 \approx 0.807$, which would support the claim, but at lower confidence.

Section 5: Guided Practice

Work through each problem step-by-step. Use the dropdowns to make key decisions — each one targets a place where students commonly go wrong.

Problem 1 — Sample Proportion and Condition Verification (C1)

A random sample of 400 university students was asked whether they use a budget app. Of those, 148 said yes.

(a) What is $\hat{p}$, the sample proportion who use a budget app?

(b) Do the conditions for the z-interval hold?

A survey of 250 households found that 75 have a composting bin.

(a) What is $\hat{p}$?

(b) Do the conditions for the z-interval hold?

In a clinical trial, 18 of 40 participants reported side effects.

(a) What is $\hat{p}$?

(b) Do the conditions for the z-interval hold?


Problem 2 — Standard Error and Interval Construction (C2)

Using the budget-app survey from Problem 1: $n = 400$, $\hat{p} = 0.37$, 95% confidence.

(a) Which formula gives the standard error of $\hat{p}$?

(b) What is the 95% CI for $p$? (SE = √(0.37 × 0.63 / 400) = √(0.000583) ≈ 0.02414; E = 1.96 × 0.02414 ≈ 0.047)

From the composting survey: $n = 250$, $\hat{p} = 0.30$, 90% confidence.

(a) Which formula gives the standard error?

(b) What is the 90% CI? (SE = √(0.30 × 0.70 / 250) = √(0.00084) ≈ 0.02898; E = 1.645 × 0.02898 ≈ 0.0477)

From the clinical trial: $n = 40$, $\hat{p} = 0.45$, 99% confidence.

(a) Which formula gives the standard error?

(b) What is the 99% CI? (SE = √(0.45 × 0.55 / 40) = √(0.006188) ≈ 0.07866; E = 2.576 × 0.07866 ≈ 0.2026)


Problem 3 — Practical Interpretation of Confidence (C3)

A researcher surveys 500 adults and finds that 310 support stricter food labelling laws. The resulting 95% CI is (0.577, 0.663).

Which of the following is the correct interpretation?


Problem 4 — Determining Required Sample Size (C4)

A public health agency wants to estimate the proportion of adults in a city who have received this year’s flu vaccine. They want a margin of error of at most 3 percentage points ($E = 0.03$) at 95% confidence. A previous survey found about 40% vaccination coverage.

(a) Which value of the prior estimate $p^*$ should be used in the sample size formula?

(b) What is the required sample size? (Formula: $n = (z^*/E)^2 \, p^*(1 - p^*)$)

Section 6: Independent Practice

These problems have no step-by-step guidance — work through them on your own, then check the solution. Concepts are interleaved across the five problems.

Problem 1 — Build a Confidence Interval (Generative)


Problem 2 — Margin of Error and Interpretation

A national poll of 900 randomly selected Canadians finds that 513 support a proposed national pharmacare program. The resulting 95% CI is (0.537, 0.603).

(a) What is the margin of error, in percentage points?

(b) A news headline says: “Majority of Canadians back pharmacare (probability 95%).” What is wrong with this headline?

(c) What sample size would be needed to cut the margin of error in half, using this $\hat{p}$ as a prior estimate?

Show Solution

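(a) $\hat{p} = 513/900 = 0.57$. The interval is $0.57 \pm 0.033$, so $E = 0.033$. Margin of error: 3.3 percentage points.

(b) The headline says “probability 95%” as if $p$ is random. But $p$ is a fixed (unknown) parameter; there is no probability that it falls in any given interval. The correct language is: “We are 95% confident that the true proportion is between 53.7% and 60.3%.” The 95% refers to the procedure’s long-run reliability, not the probability for this specific interval.

(c) To halve $E$: new $E = 0.033 / 2 = 0.0165$. Using $\hat{p} = 0.57$:

$$n = \left(\frac{1.96}{0.0165}\right)^2 (0.57)(0.43) \approx 3458.5$$

Round up: n = 3,459. (To halve $E$, you must roughly quadruple the sample size; the result falls a little short of 4 × 900 = 3,600 because the starting margin was rounded up to 0.033.)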

A university surveys 500 students and finds 185 report food insecurity. The 90% CI is (0.335, 0.405).

(a) What is the margin of error, in percentage points?

(b) A report states: “There is a 90% chance that between 33.5% and 40.5% of all students are food insecure.” Identify the error.

(c) What sample size would reduce the margin of error to 2 percentage points at 90% confidence, using this $\hat{p}$ as a prior?

Show Solution

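(a) $\hat{p} = 185/500 = 0.37$. $E = (0.405 - 0.335)/2 = 0.035$. Margin of error: 3.5 percentage points.

(b) The error: treating the CI as a probability statement about $p$. The true proportion is fixed; saying “90% chance” implies $p$ is random. Correct: “We are 90% confident the true proportion lies between 33.5% and 40.5%.”

(c) $E = 0.02$, $z^* = 1.645$, $\hat{p} = 0.37$:

$$n = \left(\frac{1.645}{0.02}\right)^2 (0.37)(0.63) \approx 1576.9$$

Round up: n = 1,577.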

A sample of 300 employees at a large company finds 81 experiencing burnout symptoms. The 99% CI is (0.204, 0.336).

(a) What is the margin of error?

(b) HR states: “We’re 99% sure that between 20.4% and 33.6% of employees are burned out.” Is this a valid interpretation?

(c) Using $\hat{p} = 0.27$ as a prior, find the sample size for $E = 0.03$ at 99% confidence.

Show Solution

(a) $\hat{p} = 81/300 = 0.27$. $E = 2.576 \times \sqrt{0.27 \times 0.73 / 300} \approx 0.066$, which is also half the interval width: $(0.336 - 0.204)/2 = 0.066$. Margin of error: 6.6 percentage points.

(b) The phrase “99% sure” is colloquially acceptable if it means “the procedure captures the true proportion 99% of the time.” However, saying “we’re 99% sure” about this specific interval implies $p$ is random. More precise: “We are 99% confident the true proportion lies in this interval.”

(c) $E = 0.03$, $z^* = 2.576$, $\hat{p} = 0.27$:

$$n = \left(\frac{2.576}{0.03}\right)^2 (0.27)(0.73) \approx 1453.2$$

Round up: n = 1,454.


Problem 3 — Determine the Required Sample Size (Generative)


Problem 4 — When the Conditions Are Not Met

A researcher samples 30 rare-book collectors and finds 2 who own a first edition of a specific novel. She wants to build a 95% CI for the proportion of all rare-book collectors who own this edition.

(a) Check whether the z-interval conditions are met.

(b) What should the researcher do, given the outcome of the conditions check?

Show Solution

(a) $\hat{p} = 2/30 \approx 0.067$.

  • $n\hat{p} = 2$: fails (< 5)
  • $n(1-\hat{p}) = 28 \geq 5$: passes

The first condition is not met. The z-interval is not appropriate.

(b) Options: (1) Collect a larger sample until $n\hat{p} \geq 5$. (2) Use an exact method (e.g., Clopper-Pearson interval) designed for small counts. (3) Use the Wilson score interval, which handles extreme $\hat{p}$ better. The standard z-interval should not be reported; it would be unreliable.

A quality inspector samples 20 ultra-high-precision components and finds 1 defective. She wants a 95% CI for the defect rate.

(a) Check the z-interval conditions.

(b) What should she do given the result?

Show Solution

(a) $\hat{p} = 1/20 = 0.05$.

  • $n\hat{p} = 1$: fails (< 5)
  • $n(1-\hat{p}) = 19 \geq 5$: passes

Condition fails — z-interval is not valid.

(b) The inspector should inspect a larger sample (e.g., 100+ components) or use an exact binomial confidence interval. With n = 20 and only 1 defective, the normal approximation is too rough to be reliable.

A biologist samples 15 nesting sites and finds 14 occupied by the target species. She wants a CI for the occupancy rate.

(a) Check the z-interval conditions.

(b) What is the conclusion, and what should she do?

Show Solution

(a) $\hat{p} = 14/15 \approx 0.933$.

  • $n\hat{p} = 14 \geq 5$: passes
  • $n(1-\hat{p}) = 1$: fails (< 5)

The second condition fails. The z-interval is not appropriate.

(b) The distribution of $\hat{p}$ is strongly left-skewed here (nearly all sites occupied, very few “failures”). The biologist needs more sites in the sample, or should use an exact method. The z-interval would be unreliable and likely produce an upper bound above 1, which is impossible for a proportion.
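
To see the failure concretely, the sketch below applies the z-interval formula to the nesting-site counts anyway. The helper name `z_interval` is hypothetical, and 95% confidence is assumed since the problem does not state a level.

```python
import math

def z_interval(x: int, n: int, z_star: float = 1.96) -> tuple:
    """Standard z-interval, computed without checking the conditions."""
    p_hat = x / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lower, upper = z_interval(14, 15)  # 14 occupied sites out of 15
print(f"({lower:.3f}, {upper:.3f})")  # (0.807, 1.060)
```

An upper limit of about 1.06 is not a possible proportion, which is the practical symptom of using the normal approximation with only one observed failure.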


Problem 5 — One-Sided Confidence Bound

A consumer-protection organization tests 150 randomly selected food products and finds 27 with undisclosed allergens. They want to make a public statement: “We are 95% confident that at least ___% of all products of this type contain undisclosed allergens.”

(a) Identify: should this be a one-sided lower bound, one-sided upper bound, or a two-sided interval? Why?

(b) Compute the appropriate bound and complete the statement.

(c) What z-value do you use, and why does it differ from a two-sided 95% interval?

Show Solution

(a) One-sided lower bound. The organization wants to assert a minimum level of contamination — they care about the lower end. “At least ___%” is a lower bound statement.

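(b) $\hat{p} = 27/150 = 0.18$.

Conditions: $n\hat{p} = 27 \geq 5$ ✓; $n(1-\hat{p}) = 123 \geq 5$ ✓.

One-sided 95% lower bound ($z^* = 1.645$):

$$0.18 - 1.645\sqrt{\frac{0.18 \times 0.82}{150}} \approx 0.18 - 1.645(0.03137) \approx 0.128$$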

Statement: “We are 95% confident that at least 12.8% of all products of this type contain undisclosed allergens.”

(c) We use $z^* = 1.645$ (not 1.96). For a two-sided 95% interval, each tail has 2.5% → $z^* = 1.96$. For a one-sided 95% bound, all 5% goes into one tail → $z^* = 1.645$. The one-sided bound is tighter (uses a smaller z), so the lower bound is higher than the lower end of a two-sided interval would be.


Mixed Review — Retrieval from Earlier Lessons

These problems draw on concepts from earlier in the course. Attempting them without re-reading prior lessons is the point — retrieval practice strengthens long-term memory more than re-reading.

Review Problem 1 — Sampling Distribution Concept (INF-1)

A city-wide standardized reading test has points and points. An educational researcher draws random samples of students from a large school.

(a) What is the mean and standard error of the sampling distribution of ? (b) Using , find .

Show Solution

(a) points. points.

Since $n = 36 \geq 30$, the CLT implies that $\bar{x}$ is approximately normally distributed.

(b)

About 15.9% of random samples of 36 students would average above 76 points — this is within normal sampling variability.


Review Problem 2 — z-CI Construction and Sample Size (INF-2)

A food safety agency surveys 80 randomly selected restaurants and finds out of 100 on a hygiene checklist, with . Population SD unknown.

(a) Construct a 90% CI for the true mean hygiene score. (b) How many restaurants must be sampled to reduce the margin of error to at most 1.5 points at 90% confidence, using as an estimate of ?

Show Solution

(a) ; use large-sample approximation with .

(b)

At least 107 restaurants must be sampled to achieve a margin of error of at most 1.5 points at 90% confidence.

Section 7: Mastery Check

No hints. No guided steps. These questions measure whether the core ideas have actually landed. Take your time with each one — especially the Feynman test.

Question 1 — Feynman Test

A friend who missed this lesson asks you: “The formula for the standard error of a proportion involves $p$, but you said $p$ is what we’re trying to estimate. So aren’t we using an unknown to find an unknown? Doesn’t that break the whole thing?”

Explain your answer below in plain language, as if talking to your friend. Address both why we use $\hat{p}$ and what limits that introduces.

Show a model answer

You’re right that we don’t know $p$; that’s exactly why we’re building the interval. The SE formula theoretically requires $p$, but since $p$ is unknown, we substitute our best estimate: $\hat{p}$. This works because when the sample is large enough, $\hat{p}$ is close to $p$, so the estimated SE is close to the true SE.

The catch: this substitution introduces extra uncertainty, especially when $\hat{p}$ is far from $p$ (which tends to happen with small samples or extreme proportions). That’s exactly why we check the conditions $n\hat{p} \geq 5$ and $n(1-\hat{p}) \geq 5$ first. When those hold, the approximation is good enough for practical purposes. When they don’t, the interval can be meaningfully misleading, and we need a different method.


Question 2 — Vegetable Intake CI

The National Institute of Nutrition surveys 450 adults and finds that 180 consume fewer than two servings of vegetables per day. They want to estimate the true proportion with 99% confidence.

(a) Which formula should be used for the standard error?

(b) Verify conditions, then compute the 99% CI.

Show Solution

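$\hat{p} = 180/450 = 0.40$, $n = 450$, $z^* = 2.576$.

Conditions: $n\hat{p} = 180 \geq 5$ ✓; $n(1-\hat{p}) = 270 \geq 5$ ✓.

$$SE = \sqrt{\frac{0.40 \times 0.60}{450}} \approx 0.02309, \qquad E = 2.576 \times 0.02309 \approx 0.059$$

$$0.40 \pm 0.059 = (0.341, 0.459)$$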

Interpretation: We are 99% confident that between 34.1% and 45.9% of all adults consume fewer than two servings of vegetables per day.


Question 3 — Error Analysis

A student computes a 95% CI for a proportion as $0.50 \pm 0.08$ from a sample of 200 people. They write the following conclusion:

“There is a 95% probability that the true population proportion is between 0.42 and 0.58. Since 0.50 is inside this interval, we can say the population is evenly split with 95% certainty.”

Identify all errors in this conclusion and restate it correctly.

Show Solution

Error 1 — Probability statement: “There is a 95% probability that $p$ is between 0.42 and 0.58.” This is wrong. $p$ is a fixed number; it either is or isn’t in the interval. The 95% refers to the method: in 95% of all random samples of this size, the constructed interval will contain $p$. For any specific interval, there is no probability to assign.

Error 2 — Inferring the value of p: “0.50 is inside the interval, so the population is evenly split.” The CI tells us $p$ could plausibly be anywhere in (0.42, 0.58). It does not mean $p = 0.50$, or that the population is evenly split; it means only that 0.50 is a plausible value that we cannot rule out.

Correct restatement: “We are 95% confident that the true population proportion lies between 0.42 and 0.58. This interval includes 0.50, meaning we cannot rule out an even split — but we also cannot conclude that one exists.”


Self-Assessment

How confident are you with the concepts from this lesson?


If your confidence is below 60%, focus on revisiting Section 3 (Core Concepts) and re-doing Examples 1–2 before the Boss Fight. The Boss Fight requires all eight concepts working together.

Section 8: Boss Fight

Two paths. Same difficulty. Different thinking style. Choose the one that feels more natural to you — there is no wrong answer. Both paths use every concept from this lesson.

🔬 The Analyst

You have data in hand. Work through it to compute and interpret intervals, and advise the city based on what the numbers say.

🏗️ The Architect

No data yet. Design the study, determine what you need, and give the CEGEP a research plan with real statistical justification.

🔬 Path A: The Analyst — Montréal Restaurant Inspections

The City of Montréal’s food inspection team has completed a round of surprise inspections. Out of 120 randomly selected restaurants, 47 had at least one critical health violation. City officials want to use this data to make public statements and plan future inspections. Your job is to advise them — with numbers.

Task 1 — Verify the Conditions

Before computing any interval, confirm that the z-interval is valid. Show your work and state your conclusion clearly.

Show Solution — Task 1

$$n\hat{p} = 47 \geq 5, \qquad n(1-\hat{p}) = 120 - 47 = 73 \geq 5$$

Both conditions hold. The z-interval is appropriate.


Task 2 — Construct a 95% CI for the Violation Rate

Compute the 95% confidence interval for the true proportion of Montréal restaurants with at least one critical violation. Round to 4 decimal places.

Show Solution — Task 2

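$\hat{p} = 47/120 \approx 0.3917$, $z^* = 1.96$

$$SE = \sqrt{\frac{0.3917 \times 0.6083}{120}} \approx 0.04456, \qquad E = 1.96 \times 0.04456 \approx 0.08734$$

$$\hat{p} \pm E = 0.39167 \pm 0.08734 \approx (0.3043, 0.4790)$$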

We are 95% confident that between 30.4% and 47.9% of Montréal restaurants have at least one critical violation.


Task 3 — Evaluate the City’s Claim

The city communications team wants to issue a press release stating: “Fewer than 40% of Montréal restaurants have critical violations.” Does your 95% CI support this claim? Explain.

Show Solution — Task 3

The 95% CI is (0.304, 0.479). The upper bound of 47.9% is well above 40%. Since 40% falls inside the confidence interval, we cannot rule out that the true violation rate is 40% or higher. The data do not support the city’s claim at 95% confidence.

In fact, the point estimate itself ($\hat{p} \approx 0.392$) is barely below 40%, and given the interval, values above 40% are entirely plausible. Issuing the press release as stated would be misleading.


Task 4 — Planning Future Inspections

The inspection department wants to re-survey next year with a margin of error of at most 2 percentage points at 95% confidence, using this year’s $\hat{p}$ as a prior estimate. How many restaurants must be inspected?

Show Solution — Task 4

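$\hat{p} \approx 0.3917$, $E = 0.02$, $z^* = 1.96$:

$$n = \left(\frac{1.96}{0.02}\right)^2 (0.3917)(0.6083) = 9604 \times 0.23827 \approx 2288.3$$

Round up: n = 2,289 restaurants.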

Note: the current sample of 120 restaurants gave a margin of error of ~8.7 percentage points. Getting to ±2 points requires roughly 20× more inspections — a significant resource commitment.

Reflection: What would you tell a journalist asking about restaurant safety in Montréal? Was the city’s claim supported? What additional data would make your advice more reliable? Write 2–3 sentences in the space below.


🏗️ Path B: The Architect — CEGEP Tutoring Program Study

Collège de Rosemont is considering launching a free peer-tutoring program in mathematics. Before committing the budget, administration wants to know what proportion of students would actually use it. They have no prior data and a budget for at most 300 interviews. Your job is to design the study and advise the administration.

Task 1 — Worst-Case Margin of Error

With no prior estimate available, use $p^* = 0.5$. What is the margin of error achievable with exactly 300 interviews at 95% confidence?

Show Solution — Task 1

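$p^* = 0.5$, $n = 300$, $z^* = 1.96$:

$$E = 1.96\sqrt{\frac{0.5 \times 0.5}{300}} \approx 1.96 \times 0.02887 \approx 0.0566$$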

With 300 interviews, the margin of error is approximately ±5.7 percentage points.


Task 2 — Using a Pilot Study

Before the main survey, a small pilot of 40 students found 22 who said they would use the program ($\hat{p} = 22/40 = 0.55$). Recalculate the margin of error for $n = 300$ using this prior estimate. Is it better or worse than the worst-case estimate? Why?

Show Solution — Task 2

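$p^* = 0.55$, $n = 300$, $z^* = 1.96$:

$$E = 1.96\sqrt{\frac{0.55 \times 0.45}{300}} \approx 1.96 \times 0.02872 \approx 0.0563$$

Using $p^* = 0.50$: $E \approx 0.0566$. Using $p^* = 0.55$: $E \approx 0.0563$.

The improvement is tiny, because $0.55 \times 0.45 = 0.2475$ is very close to $0.5 \times 0.5 = 0.25$. The pilot estimate is near 0.5, where the $p(1-p)$ curve is flat. When the prior estimate is much closer to 0 or 1, using it saves significantly more sample size.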
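
The flatness of the curve near 0.5 is easy to check numerically. In the sketch below, `planned_margin` is a hypothetical helper, and the third value (0.90) is an extra illustrative assumption that is not part of the task.

```python
import math

def planned_margin(p_assumed: float, n: int, z_star: float = 1.96) -> float:
    """Margin of error expected for sample size n if the proportion is p_assumed."""
    return z_star * math.sqrt(p_assumed * (1 - p_assumed) / n)

# 0.50 = worst case, 0.55 = pilot estimate, 0.90 = illustrative extreme value
for p_assumed in (0.50, 0.55, 0.90):
    margin = planned_margin(p_assumed, 300)
    print(f"p = {p_assumed:.2f}: E = {margin:.4f}")
```

Moving from 0.50 to 0.55 changes the planned margin only in the fourth decimal place, while an assumed proportion of 0.90 would shrink it to roughly 0.034.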


Task 3 — Hitting a Target Precision

Administration decides they need the margin of error to be at most 3 percentage points at 95% confidence. Using the pilot estimate $\hat{p} = 0.55$, how many interviews are needed? Is this within the budget of 300?

Show Solution — Task 3

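$\hat{p} = 0.55$, $ME = 0.03$, $z^* = 1.96$:

$$n = \left(\frac{1.96}{0.03}\right)^2 (0.55)(0.45) = 4268.44 \times 0.2475 \approx 1056.4$$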

Round up: n = 1,057 interviews.

This is well above the budget of 300. To achieve ±3 points at 95% confidence, the college would need to more than triple its budget. Administration faces a choice: accept the wider ±5.7-point margin with 300 interviews, or increase the budget.
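The sample-size calculation can be sketched in a few lines. This assumes the usual formula $n = (z^*/ME)^2\,\hat{p}(1-\hat{p})$ with $z^* = 1.96$, rounded up to the next whole number; the function name `required_sample_size` is illustrative.

```python
import math

def required_sample_size(p_hat: float, me: float, z_star: float = 1.96) -> int:
    """Smallest whole n whose margin of error is at most `me`."""
    raw = (z_star / me) ** 2 * p_hat * (1 - p_hat)
    return math.ceil(raw)  # always round up, never to the nearest integer

n_pilot = required_sample_size(0.55, 0.03)  # using the pilot estimate
n_worst = required_sample_size(0.5, 0.03)   # conservative worst case
budget = 300

print(f"Pilot estimate: n = {n_pilot}")
print(f"Worst case:     n = {n_worst}")
print(f"Within budget?  {n_pilot <= budget}")
```

Both versions come out well over three times the 300-interview budget, so the choice of prior estimate does not change the advice here.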


Task 4 — One-Sided Lower Bound

Suppose the survey of 300 students is conducted and 162 say they would use the program ($\hat{p} = 162/300 = 0.54$). Administration wants to claim: “At least ___% of students would use the program (95% confidence, one-sided).” Compute this lower bound.

Show Solution — Task 4

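$\hat{p} = 0.54$, $n = 300$, $z^* = 1.645$ (one-sided 95%).

Conditions: $n\hat{p} = 162$ ✓; $n(1-\hat{p}) = 138$ ✓.

$$\hat{p} - z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.54 - 1.645\sqrt{\frac{0.54 \times 0.46}{300}} = 0.54 - 1.645 \times 0.02877 \approx 0.4927$$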

Statement: “We are 95% confident that at least 49.3% of students would use the program.”

Note: since 49.3% is just below 50%, the administration cannot claim “a majority” at 95% one-sided confidence — but they’re very close.
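A short sketch of the one-sided bound, assuming the usual one-sided 95% critical value $z^* = 1.645$ and the same standard error as the two-sided interval. The function name `one_sided_lower_bound` is illustrative.

```python
import math

def one_sided_lower_bound(successes: int, n: int, z_star: float = 1.645) -> float:
    """Lower confidence bound p_hat - z* * SE for a proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * se

lower = one_sided_lower_bound(162, 300)

print(f"Lower bound: {lower:.3f}")
print(f"Supports a majority claim? {lower > 0.5}")
```

The bound comes out at about 0.493, matching the 49.3% in the statement, and the majority check is false.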

Reflection: What recommendation would you make to the administration? Should they launch the program, gather more data, or adjust the confidence threshold? Write 2–3 sentences below.


Section 9: Challenge Problems

Optional stretch problems — these go beyond the lesson objectives. They’re here for students who want to push further. C1 previews a more robust method; C2 builds deep intuition; C3 requires creative multi-step reasoning.

Challenge 1 — The Wilson Score Interval

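The standard z-interval ($\hat{p} \pm z^*\sqrt{\hat{p}(1-\hat{p})/n}$) has a known weakness: when $p$ is close to 0 or 1, or when $n$ is small, it can produce intervals outside [0, 1] and performs poorly even when conditions are technically met. The Wilson score interval is more robust. It is defined as:

$$\frac{\hat{p} + \dfrac{(z^*)^2}{2n} \pm z^*\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n} + \dfrac{(z^*)^2}{4n^2}}}{1 + \dfrac{(z^*)^2}{n}}$$

A sample of 20 patients finds 2 with a rare drug reaction ($\hat{p} = 2/20 = 0.10$). Using 95% confidence ($z^* = 1.96$):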

(a) Compute the standard z-interval. Does it stay within [0, 1]?

(b) Compute the Wilson interval. Compare the two.

Show Solution
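Standard z-interval: $ME = 1.96\sqrt{\dfrac{0.10 \times 0.90}{20}} = 1.96 \times 0.0671 \approx 0.1315$

Interval: $0.10 \pm 0.1315 = (-0.0315,\ 0.2315)$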

The lower bound is negative — impossible for a proportion. This confirms the standard interval fails here (the conditions were not met).

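Wilson interval:

With $\hat{p} = 0.10$, $n = 20$, $z^* = 1.96$, $(z^*)^2 = 3.8416$:

Numerator center: $\hat{p} + \dfrac{(z^*)^2}{2n} = 0.10 + \dfrac{3.8416}{40} = 0.19604$

Denominator: $1 + \dfrac{(z^*)^2}{n} = 1 + \dfrac{3.8416}{20} = 1.19208$

SE term: $z^*\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n} + \dfrac{(z^*)^2}{4n^2}} = 1.96\sqrt{0.0045 + 0.002401} = 1.96 \times 0.08307 \approx 0.16282$

Wilson interval: $\dfrac{0.19604 \pm 0.16282}{1.19208}$

Lower: $\dfrac{0.03322}{1.19208} \approx 0.028$; Upper: $\dfrac{0.35886}{1.19208} \approx 0.301$

Wilson: $(0.028,\ 0.301)$, which stays within [0,1] and is more meaningful than the standard interval.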

A test of 15 items finds 14 conforming ($\hat{p} = 14/15 \approx 0.933$). Using 95% confidence:

(a) Compute the standard z-interval. Does it stay within [0, 1]?

(b) Compute the Wilson interval. Compare the two.

Show Solution
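Standard z-interval: $ME = 1.96\sqrt{\dfrac{0.9333 \times 0.0667}{15}} = 1.96 \times 0.0644 \approx 0.1262$

Interval: $0.9333 \pm 0.1262 \approx (0.807,\ 1.060)$

Upper bound exceeds 1, which is impossible. Conditions failed ($n(1-\hat{p}) = 1$).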

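Wilson interval:

$\hat{p} = 14/15 \approx 0.9333$; $n = 15$, $(z^*)^2 = 3.8416$

Center: $0.9333 + \dfrac{3.8416}{30} = 0.9333 + 0.1281 = 1.0614$

Denom: $1 + \dfrac{3.8416}{15} = 1.2561$

SE term: $1.96\sqrt{0.004148 + 0.004268} = 1.96 \times 0.09174 \approx 0.1798$

Wilson: $\dfrac{1.0614 \pm 0.1798}{1.2561}$ → Lower: $\dfrac{0.8816}{1.2561} \approx 0.702$; Upper: $\dfrac{1.2412}{1.2561} \approx 0.988$

Wilson: $(0.702,\ 0.988)$, entirely within [0,1] and far more informative.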

A poll of 8 people finds 1 planning to vote in a by-election ($\hat{p} = 1/8 = 0.125$). Using 95% confidence:

(a) Compute the standard z-interval. Does it stay within [0, 1]?

(b) Compute the Wilson interval. Compare the two.

Show Solution
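Standard z-interval: $ME = 1.96\sqrt{\dfrac{0.125 \times 0.875}{8}} = 1.96 \times 0.1169 \approx 0.2292$

Interval: $0.125 \pm 0.2292 = (-0.1042,\ 0.3542)$

Lower bound is negative, so the interval is invalid. Conditions fail ($n\hat{p} = 1$).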

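Wilson interval:

$\hat{p} = 0.125$, $n = 8$

Center: $0.125 + \dfrac{3.8416}{16} = 0.125 + 0.2401 = 0.3651$

Denom: $1 + \dfrac{3.8416}{8} = 1.4802$

SE term: $1.96\sqrt{0.013672 + 0.015006} = 1.96 \times 0.16935 \approx 0.3319$

Wilson: $\dfrac{0.3651 \pm 0.3319}{1.4802}$ → Lower: $\dfrac{0.0332}{1.4802} \approx 0.022$; Upper: $\dfrac{0.6970}{1.4802} \approx 0.471$

Wilson: $(0.022,\ 0.471)$, bounded within [0,1] and usable despite the tiny sample.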
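The three comparisons above can be reproduced with a small sketch. It assumes the usual form of the Wilson score interval (center $\hat{p} + z^2/(2n)$, spread $z\sqrt{\hat{p}(1-\hat{p})/n + z^2/(4n^2)}$, both divided by $1 + z^2/n$) and $z = 1.96$; the function names are illustrative.

```python
import math

def standard_interval(x: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Standard z-interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    me = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me, p_hat + me

def wilson_interval(x: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion."""
    p_hat = x / n
    center = p_hat + z**2 / (2 * n)
    spread = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    denom = 1 + z**2 / n
    return (center - spread) / denom, (center + spread) / denom

# (successes, sample size) for the three problem variants above
cases = [(2, 20), (14, 15), (1, 8)]

for x, n in cases:
    s_lo, s_hi = standard_interval(x, n)
    w_lo, w_hi = wilson_interval(x, n)
    inside = 0 <= s_lo and s_hi <= 1
    print(f"x={x}, n={n}: standard=({s_lo:.3f}, {s_hi:.3f}) "
          f"within [0,1]? {inside}; Wilson=({w_lo:.3f}, {w_hi:.3f})")
```

In all three cases the standard interval spills outside [0, 1] while the Wilson interval does not.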


Challenge 2 — The Shape of the Margin of Error

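For a fixed $n = 400$ and confidence level of 95%, the margin of error is:

$$ME = 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{400}}$$

This is a function of $\hat{p}$.

(a) Compute $ME$ for $\hat{p} = 0.1, 0.2, \ldots, 0.9$.

(b) At what value of $\hat{p}$ is $ME$ maximized? Why does this make sense?

(c) A researcher claims: “I used $\hat{p} = 0.7$ in the sample size formula, so I’m safe for any $p$.” Is this correct? Explain.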

Show Solution

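(a) Using $ME = 1.96\sqrt{\hat{p}(1-\hat{p})/400}$:

| $\hat{p}$ | $\hat{p}(1-\hat{p})$ | $ME$ |
|---|---|---|
| 0.1 | 0.09 | 0.0294 |
| 0.2 | 0.16 | 0.0392 |
| 0.3 | 0.21 | 0.0449 |
| 0.4 | 0.24 | 0.0480 |
| 0.5 | 0.25 | 0.0490 |
| 0.6 | 0.24 | 0.0480 |
| 0.7 | 0.21 | 0.0449 |
| 0.8 | 0.16 | 0.0392 |
| 0.9 | 0.09 | 0.0294 |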

(b) $ME$ is maximized at $\hat{p} = 0.5$, where $\hat{p}(1-\hat{p}) = 0.25$. This makes sense: $p(1-p)$ is maximized at 0.5 because that’s where a Bernoulli random variable has maximum variance. A proportion near 0.5 means successes and failures are equally unpredictable, which is the point of maximum uncertainty.

(c) Incorrect. Using $\hat{p} = 0.7$ gives $\hat{p}(1-\hat{p}) = 0.21$, which is not the worst case. The worst case is always $\hat{p} = 0.5$ (giving 0.25). For $p$ between 0.5 and 0.7 (and, by symmetry, between 0.3 and 0.5), the product $p(1-p)$ ranges from 0.21 to 0.25, so using $\hat{p} = 0.7$ would underestimate the required sample size for any true $p$ in that range. Safe means conservative: use $\hat{p} = 0.5$ unless you’re very confident the true proportion is far from 0.5.
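The table in part (a) and the maximum in part (b) can be checked with a brief sketch. It assumes $n = 400$, the sample size consistent with the tabulated margins (for example 0.0490 at 0.5), and $z^* = 1.96$; the variable names are illustrative.

```python
import math

n = 400
z_star = 1.96
grid = [k / 10 for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9

# Each row holds (p_hat, p_hat * (1 - p_hat), margin of error)
rows = [(p, p * (1 - p), z_star * math.sqrt(p * (1 - p) / n)) for p in grid]

for p, product, me in rows:
    print(f"{p:.1f}  {product:.2f}  {me:.4f}")

# The grid value with the largest margin of error
best_p = max(rows, key=lambda row: row[2])[0]
print(f"ME is largest at p_hat = {best_p}")
```

The printed rows are symmetric around 0.5, which is why an assumed value of 0.7 is no safer than an assumed value of 0.3.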


Challenge 3 — Two Polls, One Question (Generative)

Section 10: Solutions Reference

Complete, step-by-step solutions for all problems in Sections 5–9 are available on the solutions page. Solutions include worked arithmetic, common mistakes to watch for, and interpretation guidance.

View Full Solutions →

If you’re stuck: Re-read the relevant Core Concept in Section 3, then find the Worked Example that maps to that concept (e.g., Example 1 maps to Concept 1). The solutions page shows the reasoning behind every step, not just the final answer.

Quick-Reference Formulas

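Sample Proportion: $\hat{p} = \dfrac{x}{n}$ (number of successes divided by sample size)

Standard Error of $\hat{p}$: $SE(\hat{p}) = \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

Confidence Interval for $p$: $\hat{p} \pm z^*\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

Required Sample Size: $n = \left(\dfrac{z^*}{ME}\right)^2 \hat{p}(1-\hat{p})$ (Use $\hat{p} = 0.5$ for a conservative estimate if no prior estimate is available. Always round up to the next whole number.)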

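| Condition | Rule to check |
|---|---|
| Randomness | Was the sample randomly selected? |
| Independence | Is $n$ a small enough fraction of the population (the lesson’s independence threshold)? |
| Success/Failure | Are both $n\hat{p}$ and $n(1-\hat{p})$ at or above the lesson’s success/failure threshold? |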