Central Tendency Measures

In 2021, the Canadian Real Estate Association reported that the average home price in the Greater Toronto Area exceeded $1.1 million. Headlines ran everywhere: “Average Toronto home now costs over a million dollars.”

But here’s the thing — most people in Toronto were not living in a million-dollar home. A small number of ultra-expensive luxury properties — mansions, penthouses, waterfront estates — pulled the “average” price up dramatically. The majority of homes traded hands at significantly lower prices. The single number reported in the headline was technically correct, yet it gave almost everyone reading it a misleading picture of the market.

This is the central tension in this lesson: we want to summarize an entire dataset with a single number that represents the “typical” value. Three measures compete for that role — the mean, the median, and the mode — and each tells a different story. Choosing the wrong one is one of the most common errors in data analysis, in journalism, and in everyday decision-making.

By the end of this lesson, you’ll know exactly when each measure is appropriate, why they diverge from each other, and what that divergence tells you about the shape of a distribution.

After this lesson, you will be able to:

Compute the sample mean using the summation formula
Find the median for both odd-\(n\) and even-\(n\) datasets
Identify the mode(s) of a dataset, including unimodal, bimodal, and multimodal cases
Explain why the mean is sensitive to outliers while the median is resistant
Select the most appropriate measure of central tendency based on the distribution’s shape and skewness

This lesson builds on DS-2 (Data Visualization). The most important concept you’ll need is distribution shape — because the shape of a histogram is what guides your choice of measure. Let’s do a quick refresher.

From DS-2: Distribution Shape

From DS-2: Histogram Shapes. The direction of skew is the direction of the longer tail. (You will need this to select the correct measure of central tendency for a given dataset).
From basic math: Summation (\(\sum\)). \(\sum x_i\) means “add up all \(x_i\) values”. (You will need this to compute the sample mean).
From basic math: Locating the middle. If \(n\) is odd, the middle is position \((n+1)/2\). (You will need this to locate the median).

Prerequisite self-check

Retrieval Checkpoint: Before continuing, test your foundational knowledge from DS-2.

A frequency table records the number of hours 30 students spent on homework last week: 0–2 h (4 students), 3–5 h (11 students), 6–8 h (9 students), 9–11 h (5 students), 12+ h (1 student). Which is the modal class — the class with the highest frequency?

Check your comfort level with the remaining foundations:

I know what an outlier looks like on a histogram or dot plot
I can compute \(\sum x_i\) for a short list of numbers
I understand what a “typical” value means informally

Retrieval Warm-up — from DS-1 and DS-2

A health researcher collects the number of cigarettes smoked per day by each participant in a study, and also records each participant’s smoking status (Never / Former / Current). Which of the following statements about these two variables is correct?

A frequency histogram shows the number of hours per week 50 university students spent studying. The five bars show: 0–4 h (5 students), 5–9 h (12 students), 10–14 h (20 students), 15–19 h (10 students), 20–24 h (3 students). How many students studied fewer than 10 hours per week?

Where DS-3 diverges from DS-2: In DS-2, we drew pictures of data distributions. In DS-3, we are going to summarize those entire pictures into a single “typical” number, and we will see how the shape of the picture dictates which number we should use.

C1 — The Sample Mean

The mean is the most familiar measure of centre — it’s what most people mean when they say “the average.” Intuitively, it’s the value each observation would take if we redistributed the total evenly across all observations.

Sample Mean

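The sample mean \(\bar{x}\) (read “x-bar”) is the arithmetic average of \(n\) observations:

\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n} \]

where \(x_i\) denotes the \(i\)-th observation and \(n\) is the sample size.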

Properties:

Uses every value in the dataset
Is the algebraic centre of mass: \(\sum_{i=1}^{n}(x_i - \bar{x}) = 0\) (deviations cancel exactly)
Minimises the sum of squared deviations

Notation note: \(\bar{x}\) is the sample mean. The population mean uses the Greek letter \(\mu\) (mu). We’ll work with \(\bar{x}\) throughout this module; \(\mu\) appears in Module 3 when we move from describing samples to making inferences about populations.

Unpacking: Think of placing weights on a number line — one weight per data value. The mean is exactly the balance point. If you add a very heavy weight far to the right, the balance point shifts right — no matter how many lighter weights are clustered on the left.
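The balance-point idea can be checked numerically. The following is a minimal Python sketch, not part of the lesson's own materials: the weight positions are hypothetical, and `net_turning_effect` is an illustrative helper name that treats every data value as a unit weight.

```python
def net_turning_effect(values, fulcrum):
    """Sum of signed distances from the fulcrum to each unit weight."""
    return sum(x - fulcrum for x in values)

weights = [1, 2, 3, 10]                 # hypothetical positions on the plank
x_bar = sum(weights) / len(weights)     # sample mean of the positions

# Try a fulcrum left of the mean, at the mean, and right of the mean.
for fulcrum in (3, x_bar, 5):
    print(fulcrum, net_turning_effect(weights, fulcrum))
```

Under these assumptions the net effect is positive when the fulcrum sits left of the mean, negative when it sits right of it, and zero only at the mean.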

“The mean is always a data value.” Not so. For instance, the mean of 1 and 2 is 1.5, which is not a value in the dataset. The mean is a calculated summary, not necessarily an observation. You can have 2.4 children on average even though no individual family has 2.4 children.

The Balance Scale — Find the Mean by Balancing the Beam

The visualization below makes the balance-point property concrete. Each blue weight is one data value resting on a plank. Drag the fulcrum (▲) until the beam is perfectly level — you’ll discover it balances at exactly one place: the mean. Notice the “net turning effect” reading hits zero precisely when the beam is level. Drag the weights to change the data and the balance point moves with them.

Interactive: Drag the fulcrum to balance the beam. It levels out only when the fulcrum sits exactly at the mean (x̄) — the point where the turning effects on both sides cancel to zero.

C2 — Mean Sensitivity to Outliers

Here is the key insight the housing-price example from Section 1 was driving at: the mean is pulled toward every outlier in the dataset, proportionally to how extreme that outlier is. One extreme value can move the mean far from where most of the data sits.

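Illustration: Consider seven recent home sales in a neighbourhood (in thousands of dollars).

The first six are clustered around $320K; the seventh sold for $1.85M.

The mean says the “average” home costs about $538K, yet six of the seven homes sold for roughly $350K or less. The mean misrepresents what a typical buyer actually paid.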

The Outlier Puller — See It Happen

The visualization below makes this concrete. Drag the orange outlier dot left and right. Watch how the mean (orange line) chases the outlier while the median (blue line) barely moves.

Interactive: Drag the outlier to see how mean and median respond differently. The mean is pulled toward the outlier; the median is resistant.

“A larger dataset protects the mean from outliers.” Only partially. With a large number of values and one extreme outlier, the mean is less distorted than with only a handful of values, but a sufficiently extreme outlier still pulls the mean away from the bulk of the data. The median is structurally resistant regardless of sample size.
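A small Python sketch can illustrate this partial protection. Everything here is an assumption made for illustration: the helper `with_outlier`, the typical value of 300, and the outlier of 2000 are hypothetical and do not come from the lesson's datasets.

```python
from statistics import mean, median

def with_outlier(n, typical=300, outlier=2000):
    """Return n values: n - 1 copies of `typical` plus one `outlier`."""
    return [typical] * (n - 1) + [outlier]

# Compare a small sample with a larger one that contains the same outlier.
for n in (7, 100):
    data = with_outlier(n)
    print(n, mean(data), median(data))
```

In this sketch the mean sits closer to the bulk of the data when n is larger, but it is still above the typical value, while the median equals the typical value in both cases.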

C3 — The Median

The median is the middle value of a sorted dataset. Because it depends only on position (not magnitude), a single extreme value cannot move it far.

Median

The median is the value that divides a sorted dataset into two equal halves.

Algorithm:

Sort the data in ascending order.
Count the \(n\) observations.
If \(n\) is odd: the median is the value at position \((n+1)/2\).
If \(n\) is even: the median is the average of the values at positions \(n/2\) and \(n/2 + 1\).

The median is resistant to outliers: changing the most extreme value (without crossing the median) does not change the median at all.
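The sort-then-locate steps can be sketched in Python as follows. This is one possible implementation under simple assumptions (numeric values, a non-empty list); `median_of` is an illustrative name, and the sample lists are hypothetical.

```python
def median_of(values):
    """Median via the sort-then-locate steps described above."""
    ordered = sorted(values)            # Step 1: sort ascending
    n = len(ordered)                    # Step 2: count observations
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[mid]
    # Even n: average the values at 1-based positions n/2 and n/2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([5, 1, 9]))           # 5
print(median_of([5, 1, 9, 1000]))     # 7.0
print(median_of([5, 1, 90000]))       # 5 (the extreme value does not move it)
```

The last call shows the resistance property: replacing the largest value with a far more extreme one leaves the median unchanged.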

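Unpacking: For the seven home-sale prices above, first sort the values in ascending order.

\(n = 7\) (odd) → median is at position \((7+1)/2 = 4\) → Median = 320 (thousands).

The median says the typical home sold for about $320K, far below the $538K mean.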

The Sort Animator — See Why Sorting is Non-Negotiable

Work through the four steps below. Toggle between odd-\(n\) and even-\(n\) datasets to see how the median algorithm changes.

Step through: raw data → sorted → middle position found → median identified. The #1 median error — skipping the sort — becomes visible in Step 1 vs Step 2.

“You can find the median without sorting first.” This is the #1 median error. The median is defined as the middle of the sorted list. Finding the middle value of the unsorted list gives a wrong answer. Always sort before locating the median position.
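To make the sorting error concrete, here is a short Python sketch. It reuses the seven study-hour values from Example 1 of this lesson; the variable names are illustrative.

```python
from statistics import median

hours = [4, 7, 3, 9, 5, 6, 8]        # study hours, in the order recorded

middle_index = len(hours) // 2        # index 3, i.e. the 4th position
wrong = hours[middle_index]           # middle of the UNSORTED list
right = sorted(hours)[middle_index]   # middle of the sorted list

print(wrong, right, median(hours))    # 9 6 6
```

Reading the 4th position of the unsorted list gives 9, whereas the median of the sorted list is 6.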

C4 — The Mode

The mode is the value (or values) that appear most frequently. It’s the only measure of centre that works for qualitative (categorical) data — you can’t compute a mean or median for colours or political parties, but you can find which occurs most often.

Mode

The mode of a dataset is the value(s) that occur with the highest frequency.

Unimodal: exactly one mode (one value appears more often than all others)
Bimodal: two modes (two values tie for highest frequency)
Multimodal: more than two modes
No mode: all values appear equally often (or each exactly once)

The mode is the only measure of central tendency applicable to qualitative (nominal or ordinal) data.

Example: Shoe sizes recorded for a sample of 12 customers.

Size 8 appears 4 times; size 9 appears 4 times; sizes 7, 10, 11 appear once each.
Mode = 8 and 9 (bimodal). A shoe store should stock both sizes most heavily.

Mode for qualitative data: A survey of 200 students asks their favourite social media platform: Instagram (88), TikTok (71), YouTube (31), X (10). Mode = Instagram — it occurs most frequently. No mean or median is possible here because platform names are not numbers.
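A minimal Python sketch of mode-finding, assuming the data arrive as a plain list of values or labels. The helper name `modes` is illustrative; the survey counts are the ones quoted above, and the short numeric list is hypothetical. Note that this sketch does not special-case the “no mode” situation: if every value appears equally often, it returns all of them.

```python
from collections import Counter

def modes(values):
    """Return every value that ties for the highest frequency."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

# Survey counts quoted above, expanded into one label per respondent.
survey = (["Instagram"] * 88 + ["TikTok"] * 71
          + ["YouTube"] * 31 + ["X"] * 10)
print(modes(survey))              # ['Instagram']

# Hypothetical numeric list with a tie, to show the bimodal case.
print(modes([7, 8, 8, 9, 9]))     # [8, 9]
```

Because the function only counts occurrences, it works the same way for category labels and for numbers.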

The Frequency Chart Builder — See the Mode Emerge

Click Build frequency chart to watch the bars grow from the raw data. Toggle to the bimodal example to see what happens when two values tie.

Interactive: build the frequency chart to reveal the mode. The bimodal case often signals two distinct sub-groups in the data.

C5 — Skewness and Measure Selection

The three measures — mean, median, mode — coincide in a perfectly symmetric unimodal distribution. As soon as the distribution becomes skewed, they separate, and the direction of that separation tells you something important.

Skewness and Central Tendency

For a unimodal distribution:

Shape	Relationship	Use
Symmetric	Mean ≈ Median ≈ Mode	Either mean or median
Right-skewed (longer right tail)	Mode < Median < Mean	Median
Left-skewed (longer left tail)	Mean < Median < Mode	Median

Rule: The mean is pulled toward the tail. In a right-skewed distribution the mean exceeds the median; in a left-skewed distribution the mean falls below the median. The median sits between them and better represents the “typical” value when skewness is present.

Note: This ordering is a reliable heuristic for unimodal continuous distributions, not a mathematical law — some irregular distributions can violate it. Use it as a guide, not a guarantee.
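The right-skewed ordering can be spot-checked on a small sample. The sketch below uses a hypothetical list chosen to have a long right tail; it illustrates the heuristic on one dataset and is not evidence that the ordering always holds.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are small, a few are large.
sample = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

print(mode(sample), median(sample), mean(sample))
```

For this sample the three values come out in the order mode, median, mean, matching the right-skewed row of the table.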

Three Canonical Shapes — See the Ordering Directly

The visualization below shows why the table above is true. Switch tabs to compare. Notice how the tail always drags the mean past the median.

Three canonical distribution shapes. In each, the mean is pulled toward the tail; the median sits between the mean and mode.

Why the mean is pulled toward the tail: Outliers and extreme values cluster in the tail. Because the mean uses every value’s magnitude, those extreme values drag it in the tail’s direction. The median is immune — it only counts positions.

The practical decision rule:

Distribution approximately symmetric, no extreme outliers → use the mean (it uses more information)
Distribution skewed or has outliers → use the median (it’s resistant)
Data is qualitative (categories) → use the mode (only option)

“The mean is always the best average because it uses all the data.” Using all values is only an advantage when those values all legitimately represent the typical phenomenon. When one value is $1.85M in a neighbourhood of $320K homes, including it fully misleads the summary. The mean’s strength (sensitivity to all values) becomes a weakness when outliers are present.

The Income Histogram — Mean vs. Median as the Right Tail Grows

The slider below adds high earners one at a time. Watch the mean (orange) drift right while the median (blue) barely moves — exactly the divergence that makes skewed income data require the median.

Add high earners with the slider. The mean is pulled rightward by each addition; the median, which tracks position not magnitude, stays anchored near the bulk of the data.

Let’s walk through four examples with fading scaffolds. Example 1 is fully narrated; by Example 4 you’re doing most of the thinking yourself.

Example 1 — Computing the Sample Mean using the Summation Formula

Scenario: A statistics instructor records the number of hours seven students spent studying for a midterm exam:

4, 7, 3, 9, 5, 6, 8

Find the sample mean \(\bar{x}\).

🤔 Your prediction: The seven study hours are 4, 7, 3, 9, 5, 6, 8. Before computing, do you expect the mean to be below 5, around 6, or above 7? Think first, then reveal the solution.

Show Solution

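Step 1: Identify \(n\) and list all values. \(n = 7\). Values: 4, 7, 3, 9, 5, 6, 8.

Step 2: Compute the sum.

\[ \sum_{i=1}^{7} x_i = 4 + 7 + 3 + 9 + 5 + 6 + 8 = 42 \]

Step 3: Divide by \(n\).

\[ \bar{x} = \frac{42}{7} = 6 \]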

Step 4: Interpret in context. On average, the students in this sample studied 6 hours. This is a whole number here — a coincidence. Means are frequently non-integer.

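Sanity check: Is 6 a reasonable “middle” for 3, 4, 5, 6, 7, 8, 9? Yes: it’s the centre of the sorted list, which makes sense for a symmetric distribution.

Algebraic centre property: Check that deviations from the mean sum to zero:

\[ (4-6) + (7-6) + (3-6) + (9-6) + (5-6) + (6-6) + (8-6) = -2 + 1 - 3 + 3 - 1 + 0 + 2 = 0 \]

✓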
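The arithmetic in this example can be double-checked with a few lines of Python. The data are the seven study-hour values given in the scenario; the variable names are illustrative.

```python
hours = [4, 7, 3, 9, 5, 6, 8]

n = len(hours)                                   # sample size
total = sum(hours)                               # sum of all values
x_bar = total / n                                # sample mean
deviation_sum = sum(x - x_bar for x in hours)    # should cancel to zero

print(n, total, x_bar, deviation_sum)            # 7 42 6.0 0.0
```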

Example 2 — Finding the Median for Odd and Even Sample Sizes

Part A — Odd \(n\): Seven patients’ resting heart rates (bpm) are recorded.

Before revealing the steps, predict: after sorting, which position will the median occupy?

🤔 Your prediction: \(n = 7\) is odd. Which position holds the median? Think first, then reveal the solution.

Show Solution — Part A

Step 1: Sort ascending.

Step 2: Locate the middle position. Position \((7+1)/2 = 4\).

Step 3: Read off the 4th value of the sorted list. That value is the median (in bpm).

Part B — Even \(n\): A new patient is added, so \(n = 8\).

Show Solution — Part B

Step 1: Sort ascending.

Step 2: Locate the two middle positions. Positions \(8/2 = 4\) and \(8/2 + 1 = 5\).

Step 3: Average the 4th and 5th values. Median = 73 bpm.

Note that 73 bpm is not in the dataset — perfectly valid. The median is a summary statistic, not a required data point.

Forgetting to sort: The unsorted list has 72 in position 1 and 85 in position 3 — neither is the median. The position formula works only on sorted data. This is the most common median error in exams.

Example 3 — Identifying the Mode in Unimodal and Bimodal Data

Scenario: A clothing retailer records the shoe size purchased by each of 15 customers:

9, 8, 10, 8, 9, 11, 8, 7, 9, 10, 8, 9, 8, 10, 9

🤔 Your prediction: Scan the list briefly. Which size do you expect to appear most often — 8 or 9? Commit to a prediction before revealing the solution.

Show Solution

Step 1: Tally each value.

Size	Count
7	1
8	5
9	5
10	3
11	1

Step 2: Identify the highest frequency. Sizes 8 and 9 each appear 5 times, more than any other size.

Conclusion: Modes = 8 and 9 (bimodal). Stock sizes 8 and 9 most heavily.

Bimodal distributions: Because sizes 8 and 9 tie for the highest frequency, the dataset is bimodal: both 8 and 9 are modes. Had size 9 appeared one more time, it would have been the single mode and the dataset would be unimodal. A bimodal distribution often signals two distinct subgroups in the data (for example, men’s and women’s shoe sizes recorded together). Spotting bimodality is a prompt to ask: are these really one population or two?

Example 4 — Robust Central Measure and Frequency Distributions

Part A — Robust Central Measure for Right-Skewed Salaries

This example tests transfer — the same reasoning in a new context. The CEO’s salary pulls the mean far above where most salaries sit:

Which measure best characterizes what a typical employee earns?

Show Solution — Part A

Compute the mean of the eight salaries.

Find the median (\(n = 8\), even: average the values at positions 4 and 5).

Interpret:

Mean: inflated by the CEO’s salary. Six of eight employees earn below this figure.
Median: accurately captures typical compensation.

Decision: Report the median. This is a right-skewed distribution — the CEO’s salary is an outlier in the right tail.

The pattern to remember: Salary, income, housing prices, and net worth are almost always right-skewed. Statistics Canada and the U.S. Bureau of Labor Statistics consistently report median household income for precisely this reason.

Part B — Computing the Mean from a Frequency Distribution

When data are presented as a frequency table rather than a raw list, we use a compact version of the mean formula:

\[ \bar{x} = \frac{\sum f_i x_i}{\sum f_i} \]

Clubs (\(x_i\))	Frequency (\(f_i\))
0	8
1	15
2	18
3	7
4	2

Find the mean number of clubs per student.

Show Solution — Part B

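Step 1: Identify the formula for frequency data.

\[ \bar{x} = \frac{\sum f_i x_i}{\sum f_i} \]

Each value \(x_i\) appears \(f_i\) times, so the total sum equals \(\sum f_i x_i\), and the sample size is \(n = \sum f_i\).

Step 2: Compute each product \(f_i x_i\).

Clubs (\(x_i\))	Frequency (\(f_i\))	Product (\(f_i x_i\))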
0	8	0
1	15	15
2	18	36
3	7	21
4	2	8

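Step 3: Sum the products and divide.

\[ \bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{0 + 15 + 36 + 21 + 8}{8 + 15 + 18 + 7 + 2} = \frac{80}{50} = 1.60 \]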

Step 4: Interpret in context. On average, students in this sample participate in 1.60 extracurricular clubs.

Why the formula works: \(f_i x_i\) adds value \(x_i\) exactly \(f_i\) times, which is identical to summing a full raw list. You will see this formula again in the Mastery Check and in later modules whenever data arrive pre-aggregated as a frequency distribution.
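The frequency-table calculation can be sketched in Python and compared against the expanded raw list. The values and frequencies are those in the clubs table; the variable names are illustrative.

```python
clubs = [0, 1, 2, 3, 4]          # the distinct values
freqs = [8, 15, 18, 7, 2]        # how often each value occurs

n = sum(freqs)                                            # sample size
weighted_total = sum(f * x for x, f in zip(clubs, freqs)) # sum of products
x_bar = weighted_total / n

# Same answer as expanding the table into a raw list of 50 values.
raw = [x for x, f in zip(clubs, freqs) for _ in range(f)]
print(n, weighted_total, x_bar)          # 50 80 1.6
print(sum(raw) / len(raw) == x_bar)      # True
```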

Time to try it yourself — with immediate feedback. Work through each problem before selecting an answer. If you choose incorrectly, read the rationale carefully before moving on.

Problem 1 — Computation of the Arithmetic Mean (C1)

Work the mean step by step — commit to each step before moving on. A fresh dataset every attempt.

Problem 2 — Median Localization and Even/Odd Sample Sizes (C3)

Practice finding the median — sort first, then locate. A fresh scenario each attempt; try the whole problem solo once the steps feel routine.

Problem 3 — Influence of Skewness on Central Measures (C4, C5)

Apply the decision rule — no computation needed. A new scenario each time.

A city reports household incomes. The histogram is strongly right-skewed: most households earn roughly $80K or less, but a small number of very high earners extend the right tail past $500K.

Which measure best represents the typical household income?

A botanist measures petal lengths (mm) of 200 flowers from the same species. The histogram is approximately bell-shaped and symmetric, with no extreme outliers.

Which measure is most appropriate for the typical petal length?

A university asks 500 students to identify their preferred study location (Library, Café, Home, Dorm Room, Other).

Which measure of central tendency can be computed?

Ten marathon finish times are collected. Nine runners finish between 3.5 and 5.0 hours; one elite runner finishes in 2.1 hours — a clear outlier in the left tail.

Which measure best represents the typical recreational runner’s finish time?

A teacher records absences per student over a semester. The histogram is approximately symmetric, ranging 0–8 absences, with most students at 3–5. No extreme outliers.

Which measure is most appropriate?

Work through these problems without guided hints. Commit to an answer before revealing the solution. Problems are interleaved across C1–C5.

Problem 1 — Diagnosing Skewness via Central Measure Discrepancies

A real-estate report shows: mean sale price = $485,000; median sale price = $342,000.

What does the gap between mean and median reveal about the distribution?

Problem 2 — Defending Measure Choices for Skewed Environments

No computation needed — apply the decision rule.

Hospital length-of-stay after knee surgery is strongly right-skewed: most patients leave in 2–4 days; a few with complications stay 3–4 weeks.

Which measure represents the typical length of stay?

Lifetimes of 500 light bulbs are approximately bell-shaped and symmetric, centred near 1,000 hours, with no outliers.

Which measure best describes typical lifetime?

A streaming platform surveys 1,000 users: “Which genre do you watch most?” Options: Drama, Comedy, Documentary, Thriller, Other.

Which measure of central tendency is appropriate?

Mercury levels (ppb) in lake fish are right-skewed — a few fish near industrial sites carry very high concentrations.

For a public health report, which measure of “typical” mercury level is most appropriate?

A gym tracks members’ weekly workout frequency (sessions/week). The histogram is approximately symmetric, 0–7 sessions, most at 3–4. No outliers.

Which measure best summarises the typical member?

Problem 3 — Algorithmic Computation of the Arithmetic Mean

A fresh dataset every time. Apply \(\bar{x} = \frac{1}{n}\sum x_i\) and check your answer.

Problem 4 — Algorithmic Sorting and Location of the Median

Sort the generated list, locate the middle (or average the two middle values for even \(n\)).

Problem 5 — Algorithmic Identification of the Mode

Review the scenario and identify the most frequent value (or values, if bimodal).

Mixed Review — Retrieval from Earlier Lessons

These problems draw on concepts from DS-1 and DS-2. Attempting them without re-reading prior lessons is the point — retrieval practice strengthens long-term memory more than re-reading.

Review Problem 1 — Reading a Frequency Distribution (DS-2)

A fitness centre tracks the number of visits per member over one month. The frequency distribution is:

Visits	Frequency
1–4	12
5–8	28
9–12	35
13–16	18
17–20	7
Total	100

(a) What is the relative frequency of the 9–12 class? (b) What is the cumulative frequency through the 9–12 class (i.e., the number of members who visited 12 times or fewer)? (c) Based on the distribution alone, would you expect the mean number of visits to be greater than, less than, or approximately equal to the median? Explain your reasoning without computing either value.

Show Solution

(a) Relative frequency for the 9–12 class:

\[ \frac{35}{100} = 0.35 = 35\% \]

(b) Cumulative frequency through the 9–12 class (including all classes up to and including 9–12):

\[ 12 + 28 + 35 = 75 \]

(c) Shape and central measures: The distribution rises from the 1–4 class to a peak at 9–12, then falls off — and the right side (13–20) falls more gradually than the left side (1–8). This suggests a slightly right-skewed distribution. In a right-skewed distribution, the mean is pulled upward by the higher values in the right tail, so we would expect the mean to be somewhat greater than the median. The difference will be modest because the skew is mild.
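Parts (a) and (b) can be reproduced with a short Python sketch. The class labels and frequencies are taken from the table in this problem; the variable names are illustrative.

```python
from itertools import accumulate

classes = ["1–4", "5–8", "9–12", "13–16", "17–20"]
freqs = [12, 28, 35, 18, 7]

total = sum(freqs)                          # 100 members in all
relative = [f / total for f in freqs]       # relative frequency per class
cumulative = list(accumulate(freqs))        # running totals

i = classes.index("9–12")
print(relative[i], cumulative[i])           # 0.35 75
```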

Review Problem 2 — Variable Type and Appropriate Measure (DS-1)

A school conducts a survey of its 850 students. For each student, it records: (i) the student’s favourite academic subject (Math, Science, French, History, Art, Other), and (ii) the student’s GPA on a 0–4.0 scale.

(a) For which of these two variables can you compute a mean? Explain why the mean is not applicable to the other variable.

(b) A counsellor summarizes the favourite-subject data by reporting: “The most popular subject is Science.” Which measure of central tendency is she implicitly using? Name it and explain why it is the only measure of central tendency applicable here.

(c) GPAs across the school are slightly left-skewed — a cluster of very low GPAs pulls the distribution’s left tail. Which measure — mean or median — better represents the typical student GPA, and why?

Show Solution

(a) Which variable allows a mean: GPA is quantitative continuous (it can take any value between 0 and 4.0, and arithmetic on GPAs is meaningful — you can add and divide them). The mean GPA is well-defined.

Favourite subject is qualitative nominal — the categories (Math, Science, etc.) are labels. Adding “Math” and “Science” has no meaning. There is no numerical value to sum or divide, so the mean cannot be computed.

(b) Measure of central tendency: The counsellor is using the mode — the most frequently occurring value. The mode is the only measure of central tendency applicable to qualitative nominal data, because it requires no arithmetic: it simply identifies which category appears most often.

(c) Skewed distribution: With a left-skewed distribution (the tail extends left, toward very low GPAs), the mean is pulled downward by the small cluster of very low values. The mean will be somewhat lower than the median. The median better represents the typical student because it is resistant to the influence of those extreme low values — it simply locates the middle position in the sorted list, ignoring how extreme the outlying values are.

No hints. No guidance. These questions test whether you can recall and apply what you’ve learned without support — the clearest signal that you’ve actually internalized it.

Question 1 — Feynman Test (C1 + C2 + C5)

Imagine a friend who missed this entire lesson. They just asked you: “I keep hearing about mean, median, and mode — what’s actually the difference, and when should I use each one?”

Write your explanation below (aim for 200–400 characters). No formulas required — plain language only.


Show a Model Answer

Model answer: “The mean adds up all values and divides by how many there are — it’s like spreading the total equally. The median is literally the middle value when you sort the list. The mode is just the most common value. The catch is that one really extreme value (like a billionaire’s income) pulls the mean far from where most data sits, but it barely moves the median. So for skewed data — incomes, home prices, wait times — use the median. For roughly symmetric data with no outliers, the mean is fine. And for categories (like favourite colour), only the mode works.”

Question 2 — Applying Measures in Frequency Data (C5)

A city planner collects the number of cars per household for 200 homes in a neighbourhood. The data are:

Cars	Frequency
0	18
1	74
2	82
3	21
4	5

Step 1: Without computing, what shape do you expect the distribution to have?

Step 2: Compute the mean for a frequency distribution like this one. A fresh table each attempt.

Question 3 — Median Retrieval (C3)

One more cold retrieval, mixed in from the same concept cluster: sort, locate, and state the median.

Confidence Check

How confident are you with the material in this lesson?

(Scale: Still shaky to Very confident)

If you’re below 60%: Return to Section 3 (Core Concepts) and Section 4 (Worked Examples). Focus especially on C2 (outlier sensitivity) and C5 (skewness → measure selection) — those are the concepts most tested in the Boss Fight.

Boss Fight — Pick Your Path. Both paths are equally demanding. The difference is in how you engage with the material: the Analyst works with real data; the Architect designs a study. Choose the one that fits your thinking style — you can attempt the other afterward for extra practice.

🔬 Path A: The Analyst — Tech Startup Salary Analysis

You’re given a salary dataset. Compute, compare, and defend your choice of measure with precision.

🏗️ Path B: The Architect — Commute Time Study Design

You’re designing a survey to study commute times. Choose your approach and justify every decision.

🔬 Path A: The Analyst

A tech startup’s HR department shares the following annual salaries for all 10 employees:

Employee	Role	Salary (CAD)
E1	Junior Developer	$52,000
E2	Junior Developer	$54,000
E3	Junior Developer	$55,000
E4	Intermediate Developer	$72,000
E5	Intermediate Developer	$75,000
E6	Senior Developer	$98,000
E7	Senior Developer	$102,000
E8	Engineering Manager	$130,000
E9	VP of Engineering	$195,000
E10	CEO	$450,000

Task A1 — Compute the mean.

Show Answer

\[ \bar{x} = \frac{1{,}283{,}000}{10} = 128{,}300 \]

Task A2 — Find the median. (\(n = 10\), even: average positions 5 and 6 of the sorted list.)

Show Answer

The salaries are already sorted. Positions 5 and 6 are $75,000 and $98,000, so the median is ($75,000 + $98,000) / 2 = $86,500.

Task A3 — Interpret the gap. The mean ($128,300) is $41,800 higher than the median ($86,500). Which measure better represents what a typical employee earns? Explain your reasoning.

Task A4 — Reflection. If the startup posted a job ad saying “Average salary: $128,300”, would that be misleading? Write 2–3 sentences.

Show a Model Response

Yes, it would be misleading. Seven of ten employees earn below $128,300, and six earn under $100,000. The figure is technically correct (it’s the arithmetic mean) but it’s dominated by the CEO’s $450,000 salary. A more honest figure would be the median ($86,500), and salaries reported by role would give an even better picture.
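The Path A figures can be verified with a brief Python sketch. The salaries are the ten values in the table above; the variable names are illustrative.

```python
from statistics import mean, median

salaries = [52_000, 54_000, 55_000, 72_000, 75_000,
            98_000, 102_000, 130_000, 195_000, 450_000]

x_bar = mean(salaries)       # arithmetic mean of all ten salaries
med = median(salaries)       # average of the 5th and 6th sorted values
gap = x_bar - med            # how far the mean sits above the median

print(x_bar, med, gap)
```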

🏗️ Path B: The Architect

Your city council wants to understand typical commute times for residents in order to plan public transit routes. You have been hired as the data analyst to design the study.

Task B1 — What will you measure? Which of the following best describes the variable “commute time”?

Task B2 — Which measure will you report? Before collecting data, you check Statistics Canada: commute times in Canadian cities are typically right-skewed (most people commute 10–40 min; a few commute 90–120+ min). Which measure should your report lead with?

Task B3 — Sample size tradeoff. You can survey 50 residents (cheap, fast) or 500 residents (expensive, slow). You discover that commute times vary widely — some people commute 10 minutes, others 2 hours. Which consideration most strongly favours the larger sample?

Task B4 — Reflection. The council chair asks: “Why not just report the mean? Everyone understands average.” Write 2–3 sentences defending your choice of the median.

Show a Model Response

The mean is familiar, but it’s not robust when the distribution is skewed. In a city where most residents commute 20–35 minutes but a handful drive 90+ minutes from suburban areas, those extreme commutes inflate the mean well above what a typical resident experiences. Reporting the median gives council members an accurate picture of the commute the median resident faces — which is the right input for transit route planning.

Optional — stretch beyond the lesson objectives. These problems require combining concepts or applying mathematical reasoning. They are not required for lesson completion.

Challenge 1 — Balancing Robustness and Data Retention with a Trimmed Mean

The trimmed mean is a compromise between the mean and the median. A \(p\%\) trimmed mean removes the lowest \(p\%\) and the highest \(p\%\) of values, then computes the mean of the remaining observations.

Scenario: Nine exam scores: .

The regular mean is .

Question 1: Compute the 10% trimmed mean (remove the lowest 10% and highest 10% of observations).

Show Solution

\(10\%\) of \(9\) is \(0.9\), which rounds to 1 observation removed from each end.

Remove the lowest (42) and the highest (95).

Remaining 7 values:

Here the trimmed mean is very close to the regular mean because the extreme values (42 and 95) largely cancel out. Try a dataset where both extremes pull the same direction to see a larger effect.
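A trimmed mean can be sketched in Python as follows. Several things here are assumptions: the helper name `trimmed_mean` is illustrative, the number of values dropped per end is obtained with Python's `round` (which matches the "rounds to 1" step above, though other conventions truncate instead), and the score list is hypothetical apart from its extremes 42 and 95.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping round(proportion * n) values from each end."""
    ordered = sorted(values)
    n = len(ordered)
    k = round(proportion * n)        # observations removed per end
    if 2 * k >= n:
        raise ValueError("trim proportion removes every observation")
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)

# Hypothetical scores: only the extremes 42 and 95 come from the lesson.
scores = [42, 68, 70, 72, 75, 78, 80, 85, 95]
print(sum(scores) / len(scores))     # regular mean
print(trimmed_mean(scores, 0.10))    # mean of the middle seven values
```

With a trim proportion of 0 the function returns the regular mean, which gives a quick consistency check.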

Question 2: Consider the same 9 scores. As the trim percentage increases from 0% toward 50%, the trimmed mean approaches which value?

Question 3: A dataset has a single massive outlier in the right tail. Which trimmed mean best controls for this outlier?

Question 4: Olympic judging trims scores (removes the highest and lowest judge’s score) before averaging. This is an application of which statistical concept?

Challenge 2 — Proving the Least-Squares Minimizer Property

Calculus required. This challenge uses derivatives to prove a property of the mean. Skip it if you have not yet studied differentiation.

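You learned in Section 3 that the mean minimises \(\sum_{i=1}^{n}(x_i - c)^2\) over all possible constants \(c\). Here is the proof sketch.

Claim: The value of \(c\) that minimises \(\mathrm{SSE}(c) = \sum_{i=1}^{n}(x_i - c)^2\) is \(c = \bar{x}\).

Proof sketch:

Treat \(\mathrm{SSE}\) as a function of \(c\) and differentiate with respect to \(c\):

\[ \frac{d}{dc}\,\mathrm{SSE}(c) = -2\sum_{i=1}^{n}(x_i - c) \]

Set the derivative equal to zero:

\[ -2\sum_{i=1}^{n}(x_i - c) = 0 \;\Longrightarrow\; \sum_{i=1}^{n} x_i = nc \;\Longrightarrow\; c = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x} \]

The second derivative is \(\frac{d^2}{dc^2}\,\mathrm{SSE}(c) = 2n > 0\), confirming this is a minimum.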

Implication: \(\bar{x}\) is the unique “least-squares centre” of the data. This is why the mean appears everywhere in regression and inference: it is the constant prediction that minimises squared error. The median, by contrast, minimises the sum of absolute deviations, a less familiar but equally principled criterion.

Verify: Compute SSE at \(c = \bar{x}\) and at a nearby value of \(c\) for a small dataset.

Any shift from \(\bar{x}\) increases SSE. The mean is the unique minimiser.
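A numerical check of the least-squares property, as a minimal Python sketch. The dataset is hypothetical (chosen only because the arithmetic is easy to follow), and `sse` is an illustrative helper name.

```python
def sse(values, c):
    """Sum of squared deviations of the values from the constant c."""
    return sum((x - c) ** 2 for x in values)

data = [2, 4, 6, 8]                  # hypothetical dataset
x_bar = sum(data) / len(data)        # 5.0

# Evaluate SSE one unit below the mean, at the mean, and one unit above.
for c in (4, x_bar, 6):
    print(c, sse(data, c))
```

For this data, moving one unit away from the mean in either direction raises SSE by the same amount, consistent with the identity \(\mathrm{SSE}(c) = \mathrm{SSE}(\bar{x}) + n(c - \bar{x})^2\).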

Challenge 3 — Anticipating the Need for Measures of Spread

You now know how to find the centre of a distribution. But two datasets can have the same mean and yet look very different:

Dataset	Values	Mean
A		50
B		50

Both have \(\bar{x} = 50\), but Dataset B is far more spread out. The centre alone doesn’t tell the whole story.

In DS-5 (Variability and Spread), you’ll meet the standard deviation — a measure of how far the typical observation sits from the mean — and the interquartile range (IQR) — a robust spread measure analogous to the median.

Warm-up question: For each dataset, list the deviations of its values from the mean. What do these deviations have in common for both datasets?

Show Answer

In both cases, \(\sum (x_i - \bar{x}) = 0\). This is the algebraic centre property: deviations from the mean always sum to zero, for any dataset. That’s why the mean is the balance point, and also why we square the deviations when measuring spread (to prevent cancellation). You’ll use this fact extensively in DS-5.

Complete, step-by-step solutions for all problems in Sections 5–9 are available on the solutions page. Solutions include worked arithmetic, common mistakes to watch for, and interpretation guidance.


If you’re stuck: Re-read the relevant Core Concept in Section 3, then find the Worked Example that maps to that concept (e.g., Example 1 maps to Concept 1). The solutions page shows the reasoning behind every step, not just the final answer.

Quick-Reference Formulas

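Sample Mean: \(\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i\)

Median: sort the data; if \(n\) is odd, take the value at position \((n+1)/2\); if \(n\) is even, average the values at positions \(n/2\) and \(n/2 + 1\).

Mode: the value(s) with the highest frequency.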

Shape	Relationship	Recommended measure
Symmetric, no outliers	Mean ≈ Median ≈ Mode	Mean
Right-skewed	Mode < Median < Mean*	Median
Left-skewed	Mean < Median < Mode*	Median
Qualitative (nominal)	—	Mode

*The ordering is a heuristic that holds for most unimodal continuous distributions, not an exact mathematical rule.
