Solutions — Confidence Intervals for a Population Mean (Small Sample)

How to use this page: Try each problem in the lesson before checking solutions here. If your answer doesn't match, read the solution carefully — especially the part that explains why common wrong answers are wrong. Understanding the error matters more than getting the right answer the first time.

Section 5: Guided Practice Solutions

Problem 1 — t-Interval Construction (Variants A–E)

Each variant follows the same four-step workflow: compute $df = n - 1$ → look up $t^*$ → compute $SE = s/\sqrt{n}$ and $ME = t^* \cdot SE$ → state the interval $\bar{x} \pm ME$.

Variant A (sleep duration, hrs, , , 95%): , . , → 95% CI: hrs.
Variant B (plant growth, cm, , , 90%): , . , → 90% CI: cm.
Variant C (blood glucose, mg/dL, , , 99%): , . , → 99% CI: mg/dL.
Variant D (reaction time, ms, , , 95%): , . , → 95% CI: ms.
Variant E (protein content, g, , , 90%): , . , → 90% CI: g.

Common error (df = n): using $df = n$ instead of $df = n - 1$ yields a $t^*$ that is too small (closer to $z^*$), producing an interval that is too narrow. Always subtract 1, because one degree of freedom is spent estimating $\mu$ with $\bar{x}$.

Problem 2 — When to Use t vs. z (Variants A–E)

Decision rule: if $\sigma$ is known, use z regardless of n. If $\sigma$ is unknown and $n < 30$, use t. If $\sigma$ is unknown but $n \ge 30$, both are acceptable (z is most common).

Variant A (nutritionist, n = 20, $\sigma$ unknown): Use t ($\sigma$ unknown, $n < 30$). “We are 95% confident the true mean sodium content lies between [L] and [U] mg” describes the procedure’s reliability, not the probability that $\mu$ is in this particular interval.
Variant B (factory QC, $\sigma$ (in g) known, n = 15): Use z: $\sigma$ is known, so use $z^* = 1.96$ (95%), not a t-value.
Variant C (researcher, n = 45, $\sigma$ unknown): Use z (or t; both are acceptable). With $n \ge 30$ the CLT justifies z, but t with df = 44 is equally valid; z is most commonly reported.
Variant D (pharmacist, n = 8, $\sigma$ unknown, population normal): Use t: n is small and $\sigma$ is unknown; the normality assumption makes the t-interval valid even at n = 8.
Variant E (historian, n = 12, $\sigma$ unknown): Use t (df = 11). “If we repeated this sampling procedure, about 90% of the resulting intervals would capture the true mean; this one may or may not.”
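The decision rule can be written as a small function. This is a minimal sketch; the function name `choose_distribution` and the variant labels are illustrative, not part of the lesson.

```python
def choose_distribution(sigma_known: bool, n: int) -> str:
    """Return which critical value the decision rule calls for."""
    if sigma_known:
        return "z"        # sigma known: z regardless of n
    if n < 30:
        return "t"        # sigma unknown, small sample: t with df = n - 1
    return "z or t"       # sigma unknown, n >= 30: both acceptable


# The five variants above, as (label, sigma_known, n)
variants = [
    ("A nutritionist", False, 20),
    ("B factory QC", True, 15),
    ("C researcher", False, 45),
    ("D pharmacist", False, 8),
    ("E historian", False, 12),
]
for label, sigma_known, n in variants:
    print(label, "->", choose_distribution(sigma_known, n))
```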

Problem 3 — z vs. t Width Comparison

Data: , , , 95%. .

z-based (incorrect): → .
t-based (correct): , , → .

(a) The t-interval is wider (6.180 vs. 5.684). (b) Because $t^* > z^*$: the heavier tails of the t-distribution pull the critical values farther out to capture the same probability. (c) Report the t-interval: using $z^*$ with $\sigma$ unknown understates uncertainty and produces false precision.
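The two widths in part (a) can be checked numerically. In this sketch, SE = 1.45 and df = 15 are assumptions: they are the values that reproduce the quoted widths, inferred here rather than read from the problem statement. The t critical value is the standard table entry for df = 15 at 95%.

```python
from statistics import NormalDist

se = 1.45        # assumed: inferred from the quoted widths, not given in the problem
t_star = 2.131   # standard t-table value for df = 15 at 95%
z_star = NormalDist().inv_cdf(0.975)   # about 1.96

z_width = 2 * z_star * se
t_width = 2 * t_star * se
print(f"z width: {z_width:.3f}")
print(f"t width: {t_width:.3f}")
print(f"ratio:   {t_width / z_width:.3f}")
```

Because both intervals share the same SE, the width ratio is simply the ratio of the critical values.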

Problem 4 — Generated Problems

Solutions are embedded in the generator output. The steps are always:

Compute $df = n - 1$.
Use the provided $t^*$ (from the T_CRIT table for that df and confidence level).
Compute $SE = s/\sqrt{n}$, then $ME = t^* \cdot SE$.
State the interval $\bar{x} \pm ME$.
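The four steps translate directly into code. This is a sketch with hypothetical data (mean 50, s = 8, n = 16), not one of the generated problems; the helper name `t_interval` is illustrative, and the caller supplies the critical value, as the generator does.

```python
import math


def t_interval(xbar: float, s: float, n: int, t_star: float):
    """Four-step t-interval; t_star must match df = n - 1 and the confidence level."""
    df = n - 1                    # step 1: degrees of freedom
    se = s / math.sqrt(n)         # step 3a: standard error
    me = t_star * se              # step 3b: margin of error
    return df, se, me, (xbar - me, xbar + me)   # step 4: the interval


# Hypothetical data; t* = 2.131 is the table value for df = 15 at 95%
df, se, me, (low, high) = t_interval(50.0, 8.0, 16, 2.131)
print(df, se, round(me, 3), round(low, 3), round(high, 3))
```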

Section 6: Independent Practice Solutions

Problem 1 — Full t-Interval Construction (Variants A–E)

For each variant, check conditions first: (1) population approximately normal or n large enough, (2) $\sigma$ unknown, (3) independent random sample.

Variant A (ER wait times, min, , , 95%): , . , → 95% CI: min.
Variant B (calcium content, mg/L, , , 90%): , . , → 90% CI: mg/L.
Variant C (sleep hours, , , , 99%): , . , → 99% CI: hrs.
Variant D (weight loss, kg, , , 95%): , . , → 95% CI: kg.
Variant E (test scores, , , , 90%): , . , → 90% CI: .

Problem 2 — Generated Problems (IP2 & IP3)

Solutions are embedded in each generator’s output. Key rule: if $\sigma$ is known, use z regardless of n; if $\sigma$ is unknown and $n < 30$, use t with $df = n - 1$.

Problem 3 — Find the Error (Variants A–E)

Variant A ($z^*$ used with n = 12, unknown $\sigma$): when $\sigma$ is unknown and $n < 30$, the correct distribution is t, not z. Using $z^*$ ignores the extra variability from estimating $\sigma$ with $s$ → interval too narrow.
Variant B (df = n = 18 instead of 17): df should be $n - 1 = 17$. Using 18 gives a slightly too small $t^*$ → overconfident (too-narrow) interval.
Variant C (a probability statement about $\mu$ for one computed interval): probability trap. Once computed, the interval is fixed; the probability that it contains $\mu$ is 0 or 1. The 95% is the long-run frequency of the procedure.
Variant D (“the t-interval is wrong because it’s wider”): the t-interval is supposed to be wider, because its heavier tails honestly account for estimating $\sigma$ with $s$. A narrower interval would be overconfident.
Variant E (t-interval with n = 50, $\sigma$ known): when $\sigma$ is known, always use z. The t-distribution is only needed when $\sigma$ must be estimated.
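Variant C's point about long-run frequency can be seen in a small simulation. This is a sketch under stated assumptions: the population (normal, mean 100, SD 15), the sample size, and the trial count are all hypothetical; 2.201 is the standard table value for df = 11 at 95%.

```python
import math
import random
import statistics

random.seed(0)                    # fixed seed so reruns give the same count
MU, SIGMA, N = 100.0, 15.0, 12    # hypothetical population and sample size
T_STAR = 2.201                    # t-table value for df = 11 at 95%
TRIALS = 2000

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)   # stdev uses the n - 1 divisor
    me = T_STAR * se
    if xbar - me <= MU <= xbar + me:   # each fixed interval either contains mu or not
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.3f} of intervals captured mu")
```

Each individual interval is a hit or a miss; only the proportion across many repetitions should land near 0.95.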

Problem 4 — Synthesis: Lead Concentration

Data: n = 22 water samples, ppb, ppb, unknown.

(a) 95% t-interval: , . , → 95% CI: ppb.

(b) z-based comparison (incorrect): → ppb. The t-interval is wider — 0.080 ppb further in each direction.

(c) Why the distinction matters: with n = 22, $t^* = 2.080$ vs. $z^* = 1.96$, a 6% gap that grows dramatically for smaller samples (at n = 6, $t^* = 2.571$ vs. 1.96, a 31% difference). Reporting the z-interval understates uncertainty. If comparing lead levels against a safety threshold, the incorrectly narrow z-interval could falsely conclude concentrations are safe when the honest t-interval suggests otherwise.

Section 7: Mastery Check Solutions

Problem 1 — Feynman Test: t vs. z and Why df = n − 1

The z-distribution assumes we know $\sigma$ exactly. In most studies $\sigma$ is unknown and estimated from the sample as $s$. This adds a second source of variability: not only does $\bar{x}$ vary from sample to sample, but so does $s$. The t-distribution accounts for this with heavier tails (wider critical values), producing wider, more honest confidence intervals.

We use $df = n - 1$ rather than $n$ because, once $\bar{x}$ is computed, the n deviations $(x_i - \bar{x})$ must sum to zero, so only n − 1 can vary freely. One degree of freedom is “spent” computing $\bar{x}$, leaving n − 1 independent pieces of information about the spread. As $n \to \infty$, the constraint becomes negligible and the t-distribution converges to the standard normal.

Problem 2 — Apply: Energy Bars

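Data: n = 9, $\bar{x} = 3.2$ g protein, $s = 0.6$ g, 99%, $\sigma$ unknown.

Distribution: t ($\sigma$ unknown, $n < 30$).
$df = 8$, $t^* = 3.355$ (99%).
$SE = 0.6/\sqrt{9} = 0.2$, $ME = 3.355 \times 0.2 = 0.671$.
99% CI: $3.2 \pm 0.671 = (2.529, 3.871)$ g.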

We are 99% confident the true mean protein content is between 2.529 g and 3.871 g per bar.
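The stated interval can be reproduced in a few lines. In this sketch, the mean 3.2 and s = 0.6 are assumptions inferred from the interval endpoints (2.529, 3.871) and the table value for df = 8 at 99%; they are the inputs that make the arithmetic come out to the reported bounds.

```python
import math

n, xbar, s = 9, 3.2, 0.6   # xbar and s are assumed: inferred from the stated interval
t_star = 3.355             # t-table value for df = 8 at 99%

se = s / math.sqrt(n)
me = t_star * se
low, high = xbar - me, xbar + me
print(f"SE = {se:.3f}, ME = {me:.3f}, CI = ({low:.3f}, {high:.3f})")
```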

Problem 3 — Error Analysis: Wrong df

The researcher used df = 15, but the sample has n = 15, so the correct df is $n - 1 = 14$. With df = 14 and a 90% CI, $t^* = 1.761$ (vs. the 1.753 they used with df = 15). The SE is unchanged, so the corrected $ME$ is slightly larger, giving a slightly wider interval. The error makes the interval marginally too narrow: using df = n overestimates df and so underestimates $t^*$.
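Because the SE cancels, the size of the error depends only on the two critical values. A short worked ratio, using the standard 90% table entries for df = 14 and df = 15 (the subscripts on $t^*$ and $ME$ are notation introduced here for illustration):

```latex
\[
\frac{ME_{\text{correct}}}{ME_{\text{used}}}
  = \frac{t^*_{14} \cdot SE}{t^*_{15} \cdot SE}
  = \frac{1.761}{1.753}
  \approx 1.0046
\]
```

The corrected interval is therefore roughly 0.5% wider, whatever the sample's SE happens to be.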

Section 8: Boss Fight Solutions

Path A — The Analyst: Lake Champlain pH Study

Data: , , , 95%, unknown.

Step 1 — Conditions: random sample ✓; $\sigma$ unknown ✓ → use t; pH readings tend to be approximately normal ✓ (n = 15 is borderline, so state the assumption).

Step 2 — Compute the CI: , . , → 95% CI: .

Step 3 — Does the interval suggest pH < 4.5? No. The entire interval (4.607, 4.973) lies above 4.5, so there is no evidence the lake pH is critically low.

Step 4 — Interpretation: “We are 95% confident the true mean pH at this location is between 4.607 and 4.973. The interval lies entirely above the acidic threshold of 4.5, so the lake is not in the critical range — though the lower bound of 4.61 is still quite acidic relative to neutral pH 7.”
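The threshold comparison in Step 3 amounts to checking which side of 4.5 the whole interval falls on. A minimal sketch; the helper name `interval_vs_threshold` is hypothetical.

```python
def interval_vs_threshold(low: float, high: float, threshold: float) -> str:
    """Classify a confidence interval relative to a threshold value."""
    if low > threshold:
        return "entirely above"
    if high < threshold:
        return "entirely below"
    return "contains threshold"


# Interval and threshold from the pH study above
print(interval_vs_threshold(4.607, 4.973, 4.5))
```

Only the "entirely above" and "entirely below" cases support a directional conclusion; an interval that contains the threshold leaves the question open.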

Path B — The Architect: Critique and Redesign

The junior researcher used $z^*$ with n = 20 and unknown $\sigma$.

Error: when $\sigma$ is unknown and $n < 30$, the t-distribution must be used. Using $z^*$ ignores the uncertainty of estimating $\sigma$ with $s$, producing an interval that is too narrow.

Corrected CI: Data days, , , 90%. , . , → 90% CI: days.

Comparison: the researcher’s z-based CI is (7.901, 10.699). The corrected t-interval is wider by 0.071 days on each side (0.142 days wider overall), which is meaningful for clinical recovery times. Why it matters: for medical recovery data, understating uncertainty could affect treatment-protocol decisions; the t-interval honestly reflects that with only 20 patients and no known $\sigma$ there is more uncertainty than the z-interval admits.

Section 9: Challenge Problem Solutions

Challenge 1 — Why df = n − 1 (Intuitive Derivation)

Suppose you have n = 4 data values and have computed $\bar{x} = 10$. The deviations from the mean must sum to zero: $\sum_{i=1}^{4} (x_i - \bar{x}) = 0$.

If you know $x_1$, $x_2$, $x_3$, and the mean is 10, then $x_4$ is forced: $x_4 = 4 \times 10 - (x_1 + x_2 + x_3)$. You have no freedom to choose it.

In general, once you fix $\bar{x}$ and the first n − 1 values, the last is completely determined. Only n − 1 of the n deviations vary independently, so the sample variance, built from those deviations, has n − 1 degrees of freedom. When we plug $s$ into the CI formula to substitute for $\sigma$, the t-distribution with $df = n - 1$ correctly accounts for this reduced independence.
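The constraint is easy to verify numerically. In this sketch the three known values (8, 11, 9) are hypothetical, chosen only to illustrate the n = 4, mean-10 setup.

```python
n, xbar = 4, 10.0
known = [8.0, 11.0, 9.0]          # hypothetical first three values

x4 = n * xbar - sum(known)        # the only value consistent with the mean
data = known + [x4]
deviations = [x - xbar for x in data]

print(x4)                # 12.0
print(deviations)        # [-2.0, 1.0, -1.0, 2.0]
print(sum(deviations))   # 0.0
```

Changing any of the three free values changes the fourth, but the deviations always sum to zero.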

Challenge 2 — t-Table Convergence

$t^*$ values at 95% confidence:

df	$t^*$ (95%)	Difference from $z^* = 1.960$
1	12.706	+10.746
5	2.571	+0.611
10	2.228	+0.268
20	2.086	+0.126
29	2.045	+0.085
∞	1.960	0.000

Key observation: even at df = 29 (n = 30), $t^*$ remains meaningfully larger than $z^*$, a 4.3% difference; at df = 5 the gap is 31%. The convergence is slow, which is why the t vs. z distinction stays practically important even at moderate sample sizes.
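The difference column and the percentage gaps follow directly from the table entries. A short sketch; the dictionary name `T_CRIT_95` is illustrative and holds only the finite-df rows listed above.

```python
Z_STAR = 1.960
T_CRIT_95 = {1: 12.706, 5: 2.571, 10: 2.228, 20: 2.086, 29: 2.045}  # from the table above

rows = []
for df, t_star in T_CRIT_95.items():
    diff = t_star - Z_STAR
    pct = 100 * diff / Z_STAR      # gap as a percentage of z*
    rows.append((df, round(diff, 3), round(pct, 1)))
    print(f"df={df:>2}  t*={t_star:.3f}  diff=+{diff:.3f}  gap={pct:.1f}%")
```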

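At what df does $t^*$ first drop below 2.05? df = 25 gives $t^* = 2.060$ and df = 29 gives 2.045; the standard table values in between are 2.056 (df = 26), 2.052 (df = 27), and 2.048 (df = 28), so the first value below 2.05 occurs at df = 28. Exact equality $t^* = z^*$ is reached only in the limit $df \to \infty$.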

Challenge 3 — Overlapping CIs Preview (Variants A–E)

This previews INF-6 (two-sample inference). Key intuition: non-overlapping CIs are strong evidence the two population means differ; overlapping CIs suggest the means may be similar, but overlap alone does not confirm equivalence. A formal two-sample t-test is needed for definitive conclusions.

For each variant:

Intervals don’t overlap: strong evidence the two means differ (unlikely to be equal if each interval excludes the other).
Intervals overlap substantially: cannot conclude the means differ — a range of values is consistent with both groups sharing a mean.
Intervals barely overlap: ambiguous — a formal two-sample t-test is needed (intervals can overlap while the means still differ significantly).

This is a conceptual preview — no arithmetic required. The logic: CIs describe plausible values for each mean; comparison requires reasoning about whether those ranges share common ground.
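Checking whether two intervals share common ground is a one-line comparison. This sketch uses hypothetical intervals, and the helper name `intervals_overlap` is illustrative; it reports overlap only and is not a substitute for the two-sample t-test.

```python
def intervals_overlap(a, b) -> bool:
    """True if closed intervals a = (low, high) and b = (low, high) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical confidence intervals for two groups
print(intervals_overlap((10.0, 14.0), (15.0, 19.0)))   # False: no common ground
print(intervals_overlap((10.0, 14.0), (12.0, 16.0)))   # True: ranges share 12 to 14
```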
