Mean, median, mode: picking the right summary statistic
“Average salary”, “median rent”, “most common answer” — there’s more than one way to summarize data into a single number. Each captures something different and works for different distributions.
Definitions
Mean (arithmetic average)
Sum divided by count:
mean = (x₁ + x₂ + ... + xₙ) / n
Example: [2, 4, 6, 8, 10] → (2 + 4 + 6 + 8 + 10) / 5 = 6
Median
The middle value when sorted (average of the two middle ones if even-sized):
[2, 4, 6, 8, 10] → 6
[2, 4, 6, 8] → (4 + 6) / 2 = 5
Mode
The most frequent value:
[1, 2, 2, 3, 4, 4, 4, 5] → 4
A dataset with two modes is bimodal.
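As a rough illustration, the three definitions above can be reproduced with Python's standard `statistics` module. The variable names and the small bimodal list are illustrative choices, not part of the original examples.

```python
import statistics

values = [2, 4, 6, 8, 10]

# Mean: sum divided by count.
mean_value = statistics.mean(values)

# Median: the middle value for an odd-sized list,
# the average of the two middle values for an even-sized one.
median_odd = statistics.median(values)
median_even = statistics.median([2, 4, 6, 8])

# Mode: the most frequent value.
mode_value = statistics.mode([1, 2, 2, 3, 4, 4, 4, 5])

# multimode returns every value tied for most frequent,
# which makes a bimodal dataset visible (illustrative data).
bimodal_modes = statistics.multimode([1, 1, 2, 3, 3])

print(mean_value, median_odd, median_even, mode_value, bimodal_modes)
```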
Sensitivity to outliers
The biggest practical difference between the three is how far a single outlier moves each one.
Salaries [300, 400, 500, 600, 9999] (in some currency unit, last person way above):
- Mean: (300 + 400 + 500 + 600 + 9999) / 5 = 2,359.8 ≈ 2,360
- Median: 500
- Mode: no value repeats, so there is no mode (or, by some conventions, every value is a mode)
“Average is 2,360” sounds wealthy, but everyone except one person is at or below 600. The median tells the more honest story of a typical individual.
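A minimal sketch of the salary comparison, again assuming Python's `statistics` module. The "without the outlier" comparison is an added illustration of how much a single value moves each statistic.

```python
import statistics

salaries = [300, 400, 500, 600, 9999]
without_outlier = salaries[:-1]  # drop the 9999 entry for comparison

mean_all = statistics.mean(salaries)
median_all = statistics.median(salaries)

mean_trimmed = statistics.mean(without_outlier)
median_trimmed = statistics.median(without_outlier)

# Every value appears once, so multimode reports all of them as tied.
modes = statistics.multimode(salaries)

print(f"with outlier:    mean={mean_all}, median={median_all}")
print(f"without outlier: mean={mean_trimmed}, median={median_trimmed}")
print(f"values tied for mode: {len(modes)}")
```

Removing one salary moves the mean from about 2,360 to 450, while the median only moves from 500 to 450.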
When to use mean
- Roughly normal distribution (symmetric, bell-shaped).
- No or few outliers.
- The total matters: the mean is the value each item would have if the sum were redistributed evenly.
- Examples: test scores, height, weight in healthy populations.
Mean is mathematically tractable (linear), so it underpins much of statistics. Standard deviation and variance are defined relative to the mean.
When to use median
- Skewed distributions (income, housing prices — no real upper bound).
- You want to suppress outlier influence.
- You’re communicating “what’s typical”.
- Examples: income, rent, real estate, response times.
If the median is much lower than the mean, the distribution has a long right tail (a few high values pulling the mean up).
When to use mode
- Categorical data (not numeric): “the most common answer”.
- Discrete data when you want one “typical value”.
- Examples: most common survey answer, most-needed shoe size.
Mode is rarely useful on continuous data — exact repeats are too unusual.
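For categorical data, a frequency count is usually all that is needed. The sketch below uses `collections.Counter` on a made-up list of survey answers (the data is hypothetical).

```python
from collections import Counter

# Hypothetical survey responses; the mode is the most frequent label.
answers = ["yes", "no", "yes", "maybe", "yes", "no"]

counts = Counter(answers)
most_common_answer, frequency = counts.most_common(1)[0]

print(most_common_answer, frequency)
```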
Response times: medians and percentiles, not means
For service performance, mean is the wrong default:
- One tail-end slow request (near-timeout) shifts the mean a lot.
- User experience is captured by median or P95 / P99 (95th / 99th percentile).
Example: response times [100, 110, 120, 130, 5000] ms
- Mean: 1,092 ms
- Median: 120 ms
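- P95 (the value that 95% of requests are at or below): 5000 ms by the nearest-rank method, since with only five samples the slowest request is the 95th percentile
“Average 1 second” describes no actual request: four of the five finished in 130 ms or less, and one took 5 seconds. SLOs are typically written as “P95 under 500ms” rather than “mean under X”.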
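The figures above can be checked with a small nearest-rank percentile helper. This is a sketch under one assumption: `nearest_rank_percentile` is a hypothetical helper written for this example, and nearest-rank is only one of several percentile definitions. Interpolating methods give a different P95 on a sample this small.

```python
import math
import statistics

def nearest_rank_percentile(samples, p):
    """Return the p-th percentile (0 < p <= 100) by the nearest-rank method.

    Hypothetical helper for illustration: picks the smallest sample such
    that at least p percent of the samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]

response_times_ms = [100, 110, 120, 130, 5000]

mean_ms = statistics.mean(response_times_ms)
median_ms = statistics.median(response_times_ms)
p95_ms = nearest_rank_percentile(response_times_ms, 95)
p99_ms = nearest_rank_percentile(response_times_ms, 99)

print(f"mean={mean_ms} ms, median={median_ms} ms, P95={p95_ms} ms, P99={p99_ms} ms")
```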
Weighted mean
When values shouldn’t be treated equally:
weighted_mean = Σ(wᵢ × xᵢ) / Σwᵢ
Grade calculation with “final exam 60%, assignments 40%” is the textbook case. National averages weighted by population follow the same idea.
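A minimal sketch of the weighted-mean formula applied to the grade case. The scores (80 on the final, 90 on assignments) are made-up numbers, and `weighted_mean` is a helper defined here for illustration.

```python
def weighted_mean(values, weights):
    """Compute sum(w_i * x_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Hypothetical scores: 80 on the final exam, 90 on assignments.
scores = [80, 90]
weights = [60, 40]  # final exam 60%, assignments 40%

final_grade = weighted_mean(scores, weights)
unweighted = sum(scores) / len(scores)

print(final_grade, unweighted)
```

With these numbers the weighted grade is 84, compared with 85 for the plain mean, because the lower final-exam score carries more weight.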
Geometric and harmonic means
For data where multiplication matters:
- Geometric mean — nth root of the product. Use for compound growth rates (annualized investment returns).
- Harmonic mean — reciprocal of the mean of reciprocals. Use for averaging rates (round-trip speed).
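Round-trip speed over the same distance each way: 60 km/h there, 40 km/h back.
- Arithmetic mean: 50 km/h (wrong, because the slower leg takes more time and therefore counts for more)
- Harmonic mean: 2 / (1/60 + 1/40) = 48 km/h (correct)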
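Both means are available in Python's `statistics` module (`geometric_mean` requires Python 3.8 or later). The sketch below checks the round-trip figure and adds a hypothetical growth example (+10% one year, then -10% the next) that is not in the text above.

```python
import statistics

# Round trip over equal distances: the harmonic mean gives the true average speed.
speeds_kmh = [60, 40]
harmonic = statistics.harmonic_mean(speeds_kmh)
arithmetic = statistics.mean(speeds_kmh)

# Cross-check with total distance / total time, using a hypothetical 120 km leg.
leg_km = 120
total_time_h = leg_km / 60 + leg_km / 40
actual_average = 2 * leg_km / total_time_h

# Hypothetical growth factors: +10% then -10%.
# The geometric mean is the constant per-year factor with the same overall result.
growth_factors = [1.10, 0.90]
geometric = statistics.geometric_mean(growth_factors)

print(f"harmonic={harmonic:.1f}, arithmetic={arithmetic}, actual={actual_average:.1f}")
print(f"per-year growth factor={geometric:.4f}")
```

The geometric mean comes out slightly below 1, reflecting that a 10% gain followed by a 10% loss leaves you below the starting value.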
Variance and standard deviation
Beyond a central value, you also want spread:
- Variance — mean of squared deviations from the mean.
- Standard deviation — square root of variance, in the same units as the data.
[100, 100, 100] and [50, 100, 150] share a mean of 100, but the second has a larger standard deviation (about 40.8 versus 0) because its values are more spread out.
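A short sketch of the spread comparison using Python's `statistics` module. It uses the population forms (`pvariance`, `pstdev`), which match the "mean of squared deviations" definition above. The sample form (`stdev`, which divides by n - 1) is shown only for contrast.

```python
import statistics

flat = [100, 100, 100]
spread = [50, 100, 150]

# Population variance: mean of squared deviations from the mean.
variance_flat = statistics.pvariance(flat)
variance_spread = statistics.pvariance(spread)

# Population standard deviation: square root of the variance.
stdev_flat = statistics.pstdev(flat)
stdev_spread = statistics.pstdev(spread)

# Sample standard deviation divides by n - 1 instead of n.
sample_stdev_spread = statistics.stdev(spread)

print(f"flat:   variance={variance_flat}, stdev={stdev_flat}")
print(f"spread: variance={variance_spread:.1f}, stdev={stdev_spread:.2f}")
print(f"spread (sample stdev): {sample_stdev_spread:.2f}")
```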
Summary
| Situation | Good choice |
|---|---|
| Normal distribution, no outliers | Mean |
| Skewed distribution / outliers | Median |
| Categorical data | Mode |
| Response times | Median + P95, P99 |
| Growth rates, returns | Geometric mean |
| Round-trip rates | Harmonic mean |
Reporting more than one statistic side by side is the analyst’s basic discipline.
To compute mean, median, mode, and standard deviation for a dataset, the statistics calculator on this site processes pasted values and shows everything at once.