
REG-5 Solutions: Applications of Bivariate Analysis


Section 5 — Guided Practice Solutions

Problem 1 — Variable Classification and Method Selection (Variants 0–4)

Variant 0 (weekly study hours → final exam score, n = 25):

Variant 1 (smoking status × exercise frequency, n = 100):

Variant 2 (training hours → injury count, n = 40):

Variant 3 (smartphone brand × age group, n = 150):

Variant 4 (caloric intake → body weight, n = 45):


Problem 2 — Interpreting Given Results (Variants 0–4)

Variant 0 (study hours / exam score: r = 0.72, r² = 0.518, reject H₀):

Variant 1 (smoking status × exercise frequency: χ²(1) = 4.167, V = 0.204, reject H₀):

Variant 2 (TV hours / GPA: r = −0.38, r² = 0.144, reject H₀):

Variant 3 (education level × newspaper reading: χ²(1) = 2.197, fail to reject H₀):

Variant 4 (commute time / work satisfaction: r = −0.22, r² = 0.048, fail to reject H₀):


Problem 3 — Study A / Study B Error Analysis

(a) Study A error:

The researcher concludes that because the test failed to reject H₀, students in the two programs "must have the same GPA." This is the accept-H₀ fallacy. Failing to reject H₀ does not prove the null hypothesis; it only means the data do not provide sufficient evidence against it. The two populations may still have different GPAs, and the study may simply have lacked the power to detect the difference. We never "accept" H₀; we only "fail to reject" it.

(b) Study B error:

Work-life balance measured on a 1–10 numerical scale is a quantitative variable (numerical ratings where arithmetic differences, e.g. 7 vs. 4, are meaningful). Chi-square requires two qualitative (categorical) variables. Applying chi-square to a numerical scale requires artificially binning the data, which discards information and introduces arbitrary cut-points. Provided the second variable is also quantitative, the appropriate method is Pearson correlation.


Problem 4 — Generator Problems

Solutions for Problem 4 are generated dynamically by the generateBivariateScenario() function. Each generated problem includes a complete solution in the solution field, covering all four steps: (1) variable type classification, (2) method identification, (3) full conclusion using correct template language, and (4) effect size interpretation with threshold label.

Section 6 — Independent Practice Solutions

Problem 1 — Full Analysis Walkthrough (Variants 0–4)

Variant 0 (QQ1 — study hours / exam score: r = 0.72, r² = 0.518, t(23) = 4.975, t*(23) = 2.069):

Variant 1 (CC4 — study method × grade category: χ²(2) = 16.333, V = 0.369, χ²*(2) = 5.991):

Variant 2 (QQ3 — monthly spending / savings: r = −0.48, r² = 0.230, t(38) = −3.373, t*(38) = 2.024):

Variant 3 (CC2 — age group × vaccine uptake: χ²(1) = 8.081, V = 0.201, χ²*(1) = 3.841):

Variant 4 (QQ5 — TV hours / GPA: r = −0.38, r² = 0.144, t(58) = −3.128, t*(58) = 2.002):


Problem 2 — Statistical vs. Practical Significance (Generator)

Solutions for Problem 2 are generated dynamically by the generateSignificanceProblem() function. Each generated problem includes a complete solution covering: (1) comparing |t| to t* and stating the decision, (2) using r² thresholds to classify practical significance, and (3) explaining why "significant ≠ practically important" (or, for moderate/strong effects, why the relationship is meaningful).


Problem 3 — Find the Error (Variant Bank, Variants 0–4)

Variant 0 — Study method encoded as 1 / 2, Pearson r applied:

Error: Study method (self-study / group study) is a qualitative (nominal) variable. Assigning the numbers 1 and 2 to categories does not create quantitative meaning — "group study minus self-study = 1" has no arithmetic interpretation. Pearson correlation applied to a nominal variable produces a meaningless coefficient. The correct method is chi-square, comparing grade categories across study method groups in a contingency table.

Variant 1 — Ice cream sales and drowning deaths: r = 0.88, "causes drowning":

Error: Correlation ≠ causation. A lurking variable, warm summer weather, drives both ice cream sales (more people buy ice cream when it's hot) and drowning deaths (more people swim in summer). The correlation is real and strong, but it arises from a common underlying factor, not from ice cream causing drowning. A statistically significant r provides evidence only that \(\rho \neq 0\) in the population; it reveals nothing about causal mechanism or direction.

Variant 2 — Caloric intake (kcal) and resting metabolic rate (kcal/day), chi-square applied:

Error: Both caloric intake and resting metabolic rate are quantitative continuous measurements (expressed in kcal). Chi-square requires two qualitative (categorical) variables. Applying chi-square to continuous data requires artificially binning into categories, which discards information and creates results that depend arbitrarily on the chosen cut-points. Pearson correlation is the correct method for two quantitative variables.

Variant 3 — χ² = 48.2 with n = 5000, "confirms very strong association":

Error: The researcher uses the magnitude of χ² to judge association strength — but χ² depends on n. With n = 5000, even a negligible association produces a large test statistic. The correct measure of association strength is Cramér's V.

\[V = \sqrt{\frac{\chi^2}{n \cdot k}} = \sqrt{\frac{48.2}{5000 \times 1}} = \sqrt{0.00964} = 0.098\]

V = 0.098 is a negligible association (V < 0.1) — not "very strong." Always compute Cramér's V to assess effect size independently of sample size.
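The Cramér's V calculation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lesson's own tooling: the function name `cramers_v` is hypothetical, k = 1 is assumed because the worked computation divides by n alone, and the second call uses a made-up n = 100 purely to show how the same χ² maps to a very different V.

```python
import math


def cramers_v(chi2: float, n: int, k: int) -> float:
    """Cramér's V = sqrt(chi2 / (n * k)), with k = min(rows - 1, cols - 1)."""
    if n <= 0 or k <= 0:
        raise ValueError("n and k must be positive")
    return math.sqrt(chi2 / (n * k))


# Variant 3 values; k = 1 is an assumption (the table dimensions are not restated).
v_large_n = cramers_v(chi2=48.2, n=5000, k=1)
print(round(v_large_n, 3))  # 0.098

# Hypothetical comparison: the same chi-square statistic with only n = 100.
v_small_n = cramers_v(chi2=48.2, n=100, k=1)
print(round(v_small_n, 3))  # 0.694
```

The statistic is identical in both calls; only the sample size changes, which is why χ² alone cannot be read as a measure of strength.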

Variant 4 — r = 0.05 with n = 10,000, "substantially boosts productivity":

Error: \(r^2 = 0.05^2 = 0.0025\) — work-from-home days explain only 0.25% of the variance in productivity. With n = 10,000, even an almost-zero correlation achieves statistical significance (\(t \approx 5.0 > 1.960\)). "Substantially boosts" requires a large practical effect; r² = 0.0025 is a negligible effect. The VP is confusing statistical significance with practical importance.
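To see how sample size alone drives the test statistic, the sketch below holds r fixed at 0.05 and varies n. It assumes the same formula used elsewhere in these solutions, \(t = r\sqrt{n-2}/\sqrt{1-r^2}\); the helper name `t_from_r` and the smaller sample sizes are illustrative choices, not values from the problem.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing rho = 0 from a sample correlation r and size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r = 0.05
r_squared = r * r  # stays at 0.0025 no matter how large n gets

# n = 10,000 is the value from the problem; 100 and 1,000 are illustrative.
for n in (100, 1_000, 10_000):
    t = t_from_r(r, n)
    print(f"n = {n:>6}: t = {t:.2f}, r^2 = {r_squared:.4f}")
```

Only the n = 10,000 row clears the large-sample critical value of 1.960, while r² is the same tiny number in every row.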


Problem 4 — Communication Task (Generator)

Solutions for Problem 4 are generated dynamically by the generateCommunicationProblem() function. Each generated problem presents four research summary options; the solution identifies what makes each correct or flawed, confirming whether the C10 checklist (correct test name, statistics, p-value, decision, plain-language conclusion, effect size with label, association language, observational limitation) is satisfied.


Problem 5 — Synthesis: Diabetes and Health Outcomes

(a) Why Pearson correlation for Question A?

Both fasting blood glucose (mg/dL) and BMI (kg/m²) are quantitative continuous variables — numerical measurements where arithmetic differences are meaningful (e.g., a BMI of 30 vs. 25 represents a 5-unit difference with physical meaning). Chi-square requires two qualitative (categorical) variables and cannot be applied to continuous measurements.

(b) Compute t and interpret r²:

\[t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.58 \times \sqrt{498}}{\sqrt{1-0.336}} = \frac{0.58 \times 22.316}{\sqrt{0.664}} = \frac{12.943}{0.8149} = 15.883\]

Since \(|t| = 15.883 > t^*(498) \approx 1.965\), we reject H₀ (p < 0.001). \(r^2 = 0.336\) — BMI explains 33.6% of the variance in fasting blood glucose. This is a moderate-to-strong practical effect (0.15–0.35 is moderate; approaching strong at ≥ 0.35).
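The computation and decision in part (b) can be checked with a short Python sketch. The function name `correlation_t_test` and its return layout are hypothetical; the critical value is passed in by hand (1.965 from the solution above) because the standard library has no t-distribution table.

```python
import math


def correlation_t_test(r: float, n: int, t_crit: float) -> dict:
    """Return t, df, r^2 and the reject/fail-to-reject decision for H0: rho = 0."""
    r2 = r * r
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r2)
    return {"t": t, "df": n - 2, "r2": r2, "reject_h0": abs(t) > t_crit}


# Question A values: r = 0.58, n = 500, t*(498) ≈ 1.965
result = correlation_t_test(r=0.58, n=500, t_crit=1.965)
print(result)
```

Using the unrounded r² = 0.3364 gives t ≈ 15.89, slightly above the hand value of 15.883, which rounds r² to 0.336 before taking the square root. The decision is unaffected.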

(c) Why chi-square for Question B?

Both diabetes diagnosis (Non-diabetic / Pre-diabetic / Diabetic) and physical activity level (Active / Sedentary) are qualitative variables — named categories with no arithmetic meaning. Two qualitative variables → chi-square test of independence.

(d) Verify conditions, compute χ², state decision, compute V:

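Expected frequencies (E = row total × column total / n, with row totals of 250, 150, and 100 and column totals of 250 each): E = 125 for both cells of the first row, 75 for both cells of the second row, and 50 for both cells of the third row.

All E ≥ 5 ✓. df = (3 − 1)(2 − 1) = 2; k = min(3 − 1, 2 − 1) = 1.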

\[\chi^2 = \frac{(165-125)^2}{125} + \frac{(85-125)^2}{125} + \frac{(55-75)^2}{75} + \frac{(95-75)^2}{75} + \frac{(30-50)^2}{50} + \frac{(70-50)^2}{50}\]

\[= 12.800 + 12.800 + 5.333 + 5.333 + 8.000 + 8.000 = 52.267\]

Since 52.267 ≫ χ²*(2) = 5.991, reject H₀ (p < 0.001).

\[V = \sqrt{\frac{52.267}{500 \times 1}} = \sqrt{0.10453} = 0.323\]

\(V = 0.323\) — medium effect (0.3–0.5).
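The whole of part (d) can be reproduced from the observed counts with a short Python sketch. The function name `chi_square_independence` is hypothetical, and the observed table is read off the numerators of the χ² sum above; the mapping of rows to diagnosis categories and columns to activity levels is an assumption and does not affect the arithmetic.

```python
import math


def chi_square_independence(observed):
    """Return (expected, chi2, df, cramers_v) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # E = row total * column total / n for every cell
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    k = min(len(row_totals) - 1, len(col_totals) - 1)
    v = math.sqrt(chi2 / (n * k))
    return expected, chi2, df, v


# Rows in the order they appear in the chi-square sum (category labels assumed).
observed = [[165, 85], [55, 95], [30, 70]]
expected, chi2, df, v = chi_square_independence(observed)

assert all(e >= 5 for row in expected for e in row)  # expected-count condition
print(expected, round(chi2, 3), df, round(v, 3))
```

Comparing `chi2` with the critical value 5.991 for df = 2 is left as a manual step, since the standard library does not include chi-square quantiles.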

(e) Two statistical reasoning errors in the journalist's claim:

(f) Evaluate the administrator's claim that r² = 0.336 is "useless":

The administrator is wrong. \(r^2 = 0.336\) is a moderate-to-strong effect — BMI explains 33.6% of fasting blood glucose variance. In health research involving complex physiological systems where dozens of factors contribute, a single variable explaining one-third of variance is clinically meaningful. The administrator may be applying an unreasonably strict standard (expecting r² ≥ 0.80, appropriate for precision measurement, not health research), or confusing effect size with individual-level prediction precision. The remaining 66.4% unexplained variance reflects the complexity of blood glucose regulation (diet, genetics, physical activity, medications) — not a failure of BMI as a risk indicator.

Section 7 — Mastery Check Solutions

Question 1 — Feynman Test: Statistical vs. Practical Significance

Statistical significance means the sample data are unlikely to have occurred by chance if there were truly no effect in the population — we have reliable evidence that the effect is non-zero (\(p < \alpha\)). Practical significance (effect size) measures how large that effect is, independent of sample size.

The key example: with \(n = 10{,}000\) employees, a researcher finds \(r = 0.05\) between work-from-home days and productivity (p = 0.001). \(r^2 = 0.0025\), so work-from-home days explain only 0.25% of productivity variance. The effect is real (reliably non-zero) but negligibly small. No rational organization would restructure workplace policy based on a relationship that explains 0.25% of the variance.

Both pieces of information are needed: the p-value confirms the signal is real; \(r^2\) or Cramér's V tells you whether it matters. A result can be statistically significant but practically negligible (large n, tiny effect). A result can also be practically important but not statistically significant (small n, effect that might be real but uncertain). Always report both.


Question 2 — Apply Question: Educational Program Study

(a) Variable types and method:

Reading score is quantitative (numerical measurement). Program completion (yes/no) is qualitative (binary categorical) — this is a mixed variable type situation. The researcher has coded completion as 0/1 and computed Pearson r, which is technically a point-biserial correlation. However, with one quantitative and one binary categorical variable, a more appropriate analysis is an independent-samples t-test comparing mean reading scores across the two groups (completed vs. not completed).
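The two analyses mentioned in part (a) are closely linked: with a 0/1 coded group variable, the t statistic derived from Pearson r matches the equal-variance (pooled) independent-samples t statistic. The Python sketch below illustrates this. The reading scores are made up for illustration, and the helper names `pearson_r` and `pooled_t` are hypothetical.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross-products."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pooled_t(group1, group0):
    """Equal-variance independent-samples t statistic (group1 minus group0)."""
    n1, n0 = len(group1), len(group0)
    m1, m0 = mean(group1), mean(group0)
    ss1 = sum((v - m1) ** 2 for v in group1)
    ss0 = sum((v - m0) ** 2 for v in group0)
    pooled_var = (ss1 + ss0) / (n1 + n0 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n1 + 1 / n0))


# Hypothetical reading scores, invented for this illustration only.
completed = [78, 85, 90, 72, 88]
not_completed = [70, 65, 80, 75, 68]

# Code completion as 1 / 0 and compute the point-biserial r.
codes = [1] * len(completed) + [0] * len(not_completed)
scores = completed + not_completed
r = pearson_r(codes, scores)
n = len(scores)

t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
t_groups = pooled_t(completed, not_completed)
print(round(t_from_r, 6), round(t_groups, 6))  # the two values agree
```

Framing the analysis as a comparison of group means does not change the test result; it makes the research question (do the groups differ in mean reading score?) easier to state and interpret.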

(b) Two problems with "Proven — the program works":


Question 3 — Error Analysis: Sleep and Academic Performance

Error 1 — Wrong method: Hours of sleep per night is a quantitative continuous variable (measured in hours, e.g., 6.5, 7.0, 8.5 — arithmetic differences are meaningful). Academic performance score is also likely quantitative. Chi-square requires two qualitative (categorical) variables. To run chi-square, the researcher must have artificially binned the continuous sleep data into categories — which discards information and produces results that depend on arbitrary cut-points. Pearson correlation is the appropriate method for two quantitative variables.

Error 2 — Causal language: "Causes poor academic performance" is unjustified from any bivariate analysis, including correlation. This is observational data — no random assignment was used. We can only conclude that sleep hours and academic performance are associated. Causation requires controlled experimental design.

Section 8 — Boss Fight Solutions

Path A — The Analyst: Reading Speed and Comprehension (n = 45, r = 0.66, r² = 0.436)

Task 1 — Variable classification and method:

Reading speed (words per minute) and comprehension score (out of 100) are both quantitative variables — numerical measurements where arithmetic differences are meaningful. Reading speed in wpm and comprehension on a 0–100 scale both have meaningful averages and differences. → Pearson correlation / linear regression is appropriate.

Task 2 — Compute t and make the decision:

\[t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.66 \times \sqrt{43}}{\sqrt{1-0.4356}} = \frac{0.66 \times 6.5574}{\sqrt{0.5644}} = \frac{4.3279}{0.7513} = 5.761\]

Since \(|t| = 5.761 > t^*(43) = 2.017\), reject H₀ (p < 0.001).

Task 3 — Interpret r²:

\(r^2 = 0.436 \geq 0.35\) → Strong practical significance. Reading speed explains 43.6% of the variance in reading comprehension score. This is a strong practical effect — reading speed is an important predictor of comprehension, though 56.4% of variance remains unexplained by speed alone.

Task 4 — Model research conclusion:

"There is sufficient evidence of a positive linear relationship between reading speed and reading comprehension score in the population (\(r = 0.66\), \(t(43) = 5.761\), p < 0.001). Reading speed explains 43.6% of the variance in comprehension scores (\(r^2 = 0.436\)), indicating a strong practical association; because this is observational data, we cannot conclude that faster reading causes higher comprehension — other factors (vocabulary, prior knowledge, engagement) likely contribute."


Path B — The Communicator: Three Flawed Research Reports

Report 1 — Nutrition Study (diet quality → athletic performance, r = 0.61, p < 0.001):

Report 2 — Social Media Study (r = 0.14, p = 0.002, n = 500):

Report 3 — Brand Preference Study (brand × age group, Pearson r = 0.28):

Section 9 — Challenge Problem Solutions

Challenge 1 — The ANOVA Gap: Mixed Variable Types

(a) Can chi-square be applied? No. Happiness on a 1–10 numerical scale is quantitative (ratings where arithmetic differences are meaningful). Chi-square requires both variables to be qualitative (categorical). Applying chi-square here would require artificially binning the happiness scores — discarding information.

(b) Can Pearson correlation be applied? No. Political affiliation (Liberal / Conservative / Independent) is qualitative (nominal). There is no numeric meaning to political affiliation labels; "(Conservative + Liberal) / 2" has no interpretation. Pearson correlation requires both variables to be quantitative.

(c) Appropriate method: A group-comparison method — specifically one-way ANOVA (Analysis of Variance) if normality and equal-variance assumptions hold, or the Kruskal-Wallis test (non-parametric alternative). The research question becomes: "Does mean happiness differ across the three political affiliation groups?" This compares group means on the quantitative variable rather than testing association between two same-type variables.

(d) General principle: Always classify both variable types before choosing a method. When one variable is qualitative and one is quantitative (mixed types), neither chi-square nor Pearson correlation applies. A group-comparison method is needed — or the research question must be reframed. Method selection is driven entirely by variable types, not by research topic or domain.
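The selection rule in (d) can be written as a small lookup. The Python sketch below is illustrative only: the function name `choose_method` and the string labels are hypothetical, and it encodes just the three cases discussed in this challenge.

```python
def choose_method(type_x: str, type_y: str) -> str:
    """Pick a bivariate method from the two variable types."""
    allowed = {"quantitative", "qualitative"}
    kinds = {type_x, type_y}
    if not kinds <= allowed:
        raise ValueError(f"variable types must be one of {sorted(allowed)}")

    if kinds == {"quantitative"}:
        return "Pearson correlation"
    if kinds == {"qualitative"}:
        return "chi-square test of independence"
    # Mixed types: one qualitative variable, one quantitative variable
    return "group comparison (e.g., one-way ANOVA or Kruskal-Wallis)"


# Challenge 1: happiness (quantitative) and political affiliation (qualitative)
print(choose_method("quantitative", "qualitative"))
```

The order of the arguments does not matter, because only the set of variable types is inspected.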


Challenge 2 — Sample Size and the p-value Trap

(a) Are both studies statistically significant?

Yes — both are statistically significant at α = 0.05.

(b) Which study shows stronger practical evidence?

Study A — \(r^2 = 0.270\) (diet quality explains 27% of variance in CRP; moderate effect) versus Study B's \(r^2 = 0.032\) (negligible — only 3.2% explained). Despite Study B having a slightly smaller p-value (0.011 vs. 0.019), its practical effect is far weaker. The p-value comparison is misleading here.

(c) What this illustrates about using p-values alone:

p-values are heavily influenced by sample size. With \(n = 200\), even negligible associations produce significant t-statistics. Two studies can have nearly identical p-values (0.019 vs. 0.011) while having vastly different practical significance (\(r^2 = 0.270\) vs. \(r^2 = 0.032\)). Never use p-values to compare study importance — always report and compare effect sizes. The p-value answers "is this effect reliably non-zero?" Effect size answers "how big is it?"

(d) Which study better supports individual diet coaching?

Study A — \(r^2 = 0.270\) indicates a moderate effect with clinical plausibility; diet quality explains a meaningful share of CRP variance. Study B's \(r^2 = 0.032\) (3.2% of variance) provides very weak basis for individual-level interventions. A nutritionist recommending diet coaching based on Study B would be acting on a real but negligibly small association.


Challenge 3 — Full Research Report (Diabetes Study from Section 6, Problem 5)

Using the results: Question A: r = 0.58, r² = 0.336, t(498) = 15.883, reject H₀, moderate-to-strong effect. Question B: χ²(2) = 52.267, V = 0.323, reject H₀, medium effect.

Paragraph 1 — BMI and Fasting Blood Glucose (Pearson Correlation):

"A Pearson correlation was computed to assess the linear relationship between BMI (kg/m²) and fasting blood glucose (mg/dL) in a random sample of 500 adults. The analysis revealed a significant positive relationship, \(r = 0.58\), \(t(498) = 15.883\), p < 0.001. BMI explains 33.6% of the variance in fasting blood glucose (\(r^2 = 0.336\)), indicating a moderate-to-strong practical association. Adults with higher BMI tend to show higher fasting blood glucose in this sample. Because this is an observational study, causal conclusions cannot be drawn; other factors (diet, physical activity, genetics) likely contribute to blood glucose levels."

Paragraph 2 — Diabetes Status and Physical Activity Level (Chi-Square):

"A chi-square test of independence was used to determine whether diabetes diagnosis (Non-diabetic / Pre-diabetic / Diabetic) and physical activity level (Active / Sedentary) are independent in the population. The test was significant, \(\chi^2(2, n = 500) = 52.267\), p < 0.001. Cramér's \(V = 0.323\) indicates a medium association — sedentary adults are proportionally more represented among pre-diabetic and diabetic groups. Because this is observational data, the direction of any causal relationship cannot be determined: whether inactivity is associated with diabetes risk, or whether health declines from diabetes reduce physical activity, cannot be established from these analyses alone."