GP-1 — Compute \( \hat{p} \) and Check Conditions (Variants 0–2)
Variant 0 (budget app survey, n = 400, x = 148):
- (a) \( \hat{p} = 148/400 = 0.37 \)
- (b) \( n\hat{p} = 148 \geq 5 \) ✓; \( n(1-\hat{p}) = 252 \geq 5 \) ✓. Both conditions met — z-interval is valid.
Variant 1 (composting survey, n = 250, x = 75):
- (a) \( \hat{p} = 75/250 = 0.30 \)
- (b) \( n\hat{p} = 75 \geq 5 \) ✓; \( n(1-\hat{p}) = 175 \geq 5 \) ✓. Conditions met.
Variant 2 (clinical trial, n = 40, x = 18):
- (a) \( \hat{p} = 18/40 = 0.45 \)
- (b) \( n\hat{p} = 18 \geq 5 \) ✓; \( n(1-\hat{p}) = 22 \geq 5 \) ✓. Conditions met.
Common mistake: Writing \( \hat{p} = 148 \) without dividing by n, or checking "n ≥ 30" instead of the actual conditions \( n\hat{p} \geq 5 \) and \( n(1-\hat{p}) \geq 5 \).
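The check in GP-1 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original key: the function name `check_conditions` and the `threshold` parameter are hypothetical, and the threshold of 5 is taken from the conditions stated above.

```python
def check_conditions(x, n, threshold=5):
    """Return p-hat, the success and failure counts, and whether both meet the threshold.

    Hypothetical helper: threshold=5 mirrors the conditions used in this key.
    """
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p_hat = x / n          # divide by n; the raw count x is not a proportion
    successes = x          # equals n * p_hat
    failures = n - x       # equals n * (1 - p_hat)
    ok = successes >= threshold and failures >= threshold
    return p_hat, successes, failures, ok


# The three variants from GP-1.
for label, x, n in [("Variant 0", 148, 400),
                    ("Variant 1", 75, 250),
                    ("Variant 2", 18, 40)]:
    p_hat, s, f, ok = check_conditions(x, n)
    print(f"{label}: p_hat = {p_hat:.2f}, successes = {s}, failures = {f}, met = {ok}")
```

Working from the counts x and n - x avoids rounding issues that can appear when multiplying n by a rounded \( \hat{p} \).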
GP-2 — Compute the Standard Error and CI Endpoints (Variants 0–2)
Variant 0 (\( \hat{p} = 0.37 \), n = 400, 95% confidence):
\[ \text{SE} = \sqrt{0.37 \times 0.63 / 400} = \sqrt{0.000583} \approx 0.02414, \quad E = 1.96 \times 0.02414 \approx 0.0473 \]
\[ \text{CI} = 0.37 \pm 0.047 = (0.323,\; 0.417) \]
Variant 1 (\( \hat{p} = 0.30 \), n = 250, 90% confidence):
\[ \text{SE} = \sqrt{0.30 \times 0.70 / 250} \approx 0.02898, \quad E = 1.645 \times 0.02898 \approx 0.0477 \]
\[ \text{CI} = 0.30 \pm 0.048 = (0.252,\; 0.348) \]
Variant 2 (\( \hat{p} = 0.45 \), n = 40, 99% confidence):
\[ \text{SE} = \sqrt{0.45 \times 0.55 / 40} \approx 0.07866, \quad E = 2.576 \times 0.07866 \approx 0.2026 \]
\[ \text{CI} = 0.45 \pm 0.203 = (0.247,\; 0.653) \]
Note how wide this interval is: a small sample (n = 40) combined with high confidence (99%) produces a very imprecise estimate, spanning about 41 percentage points.
Common mistakes: (1) Leaving the unknown \( p \) in \( \sqrt{p(1-p)/n} \); the standard error is computed from \( \hat{p} \). (2) Adding and subtracting the SE directly instead of the margin of error \( E = z^* \times \text{SE} \). (3) Using the wrong \( z^* \) for the stated confidence level (1.645 for 90%, 1.96 for 95%, 2.576 for 99%).
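The GP-2 arithmetic can be reproduced with a short Python sketch. The names `Z_STAR` and `proportion_ci` are hypothetical, and the critical values are the rounded table values used in the solutions above, so results may differ in later decimals from software that uses unrounded critical values.

```python
import math

# Rounded table critical values, as used in the worked solutions (assumption:
# only these three confidence levels are needed).
Z_STAR = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}


def proportion_ci(p_hat, n, confidence):
    """Return (SE, margin of error, (lower, upper)) for a one-proportion z-interval."""
    z = Z_STAR[confidence]
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # uses p-hat, not the unknown p
    margin = z * se                           # E = z* x SE
    return se, margin, (p_hat - margin, p_hat + margin)


# The three variants from GP-2.
for label, p_hat, n, conf in [("Variant 0", 0.37, 400, 0.95),
                              ("Variant 1", 0.30, 250, 0.90),
                              ("Variant 2", 0.45, 40, 0.99)]:
    se, e, (lo, hi) = proportion_ci(p_hat, n, conf)
    print(f"{label}: SE = {se:.5f}, E = {e:.4f}, CI = ({lo:.3f}, {hi:.3f})")
```

Returning the SE and margin separately makes it easy to spot mistake (2), where the SE is added and subtracted without first multiplying by \( z^* \).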
GP-3 — Interpretation (n = 500, x = 310, 95% CI = (0.577, 0.663))
Correct answer: "We are 95% confident that the true proportion of adults who support stricter labelling lies between 57.7% and 66.3%."
Why the other options are wrong:
- Option A ("95% probability that p is in this interval") treats the fixed parameter p as if it were random.
- Option C ("95% of the 500 adults") confuses sample members with the parameter.
- Option D ("future sample proportions fall in this range") describes the sampling distribution of \( \hat{p} \), not a CI for p.
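The "95% confident" wording refers to the long-run behaviour of the method, and a small simulation can illustrate that idea. The sketch below is illustrative only: the function name `coverage`, the seed, the number of repetitions, and the assumed true proportion of 0.62 (chosen to match \( \hat{p} = 310/500 \)) are assumptions, not part of the original problem.

```python
import math
import random


def coverage(p_true=0.62, n=500, reps=1000, z=1.96, seed=1):
    """Fraction of simulated 95% z-intervals that contain the fixed value p_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and count the successes.
        x = sum(rng.random() < p_true for _ in range(n))
        p_hat = x / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        # p_true never changes; only the interval varies from sample to sample.
        if p_hat - margin <= p_true <= p_hat + margin:
            hits += 1
    return hits / reps


print(f"Share of intervals covering p: {coverage():.3f}")
```

The printed share should be close to 0.95, which is what the confidence level describes: the parameter stays fixed while the intervals vary across samples.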
GP-4 — Sample Size (vaccination coverage, E = 0.03, 95%, p* = 0.40)
(a) Use \( p^* = 0.40 \), the prior estimate. Using \( p^* = 0.50 \) would be more conservative, but it inflates the required sample (to 1,068 here, since \( (1.96/0.03)^2 \times 0.25 \approx 1{,}067.1 \)) when a good prior estimate is available.
(b)
\[ n = \left(\frac{1.96}{0.03}\right)^2 \times 0.40 \times 0.60 = (65.33)^2 \times 0.24 \approx 1{,}024.4 \]
Round up: \( n = \mathbf{1{,}025} \).
Common mistakes: (1) Using \( p^* = 0.5 \) when a prior estimate is given. (2) Rounding 1024.4 down to 1024 — always round up. (3) Forgetting to square \( z^*/E \) before multiplying by \( p^*(1-p^*) \).
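The sample-size calculation in GP-4 can be sketched as follows. The function name `sample_size` and its parameter names are hypothetical; the default `p_star=0.5` reflects the conservative choice mentioned in part (a).

```python
import math


def sample_size(margin, z_star, p_star=0.5):
    """Smallest n with z* x sqrt(p*(1 - p*) / n) <= margin, rounded up."""
    raw = (z_star / margin) ** 2 * p_star * (1 - p_star)  # square z*/E first
    return math.ceil(raw)                                 # always round up


print(sample_size(0.03, 1.96, p_star=0.40))  # prior estimate from GP-4
print(sample_size(0.03, 1.96))               # conservative p* = 0.5
```

Using `math.ceil` rather than `round` encodes the "always round up" rule directly. If the raw value happened to be a whole number, floating-point error could push the result up by one, so borderline cases are worth checking by hand.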