Section 1.7

Sampling Distributions

Here's an idea that quietly powers all of inference: any number you calculate from a sample — its mean, its median, its standard deviation — is itself a random quantity. Take a different sample and you'd get a slightly different value. The pattern those values make, over many samples, is a sampling distribution.

One sample gives one estimate

Suppose the truth is a fixed but unknown number, say the average reaction time of an entire population. You can't measure everyone, so you take a random sample of n people and compute their mean. That single number is your estimate. But it came from a random draw, so it's a little high or a little low by luck.

The crucial move is to imagine repeating the study thousands of times. Each repetition gives one estimate; collect them all, and you get the sampling distribution of that statistic. It tells you how much your estimate bounces around — and that's exactly what you need to judge how trustworthy a single estimate is.
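The "repeat the study thousands of times" idea can be sketched in a few lines of Python. This is a minimal simulation, not the code behind the demo on this page: the population (reaction times built from a 200 ms floor plus an exponential tail), the function name `sampling_distribution`, and the choice of 5,000 repetitions are all assumptions made for illustration.

```python
import random
import statistics

def sampling_distribution(population, n, statistic, draws=5000, seed=0):
    """Repeat the 'study' `draws` times and collect one estimate per repetition."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(draws):
        sample = rng.sample(population, n)   # n distinct people, chosen at random
        estimates.append(statistic(sample))  # one study -> one estimate
    return estimates

# Hypothetical skewed population of reaction times in milliseconds.
pop_rng = random.Random(42)
population = [200 + pop_rng.expovariate(1 / 80) for _ in range(20_000)]

estimates = sampling_distribution(population, n=10, statistic=statistics.mean)

print("true population mean:", round(statistics.mean(population), 1))
print("average of estimates:", round(statistics.mean(estimates), 1))
print("SD of estimates     :", round(statistics.stdev(estimates), 1))
```

The list `estimates` is the simulated sampling distribution of the mean for n = 10. Its average should sit close to the true population mean, and its standard deviation is how much a single study's estimate bounces around.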

🎮 Build a Sampling Distribution

Each draw grabs n values from the population (top) and drops the sample's statistic into the collection (bottom).

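[Interactive demo. Controls: sample size n (default 10) and choice of statistic. Top panel: the population (fixed, and deliberately skewed). Bottom panel: the sampling distribution of your chosen statistic. Readouts: number of draws, average estimate, and standard error (SD of estimates).]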

Two things to take away

1. The spread of the sampling distribution is the standard error. It measures how much your estimate would vary from study to study. A small standard error means repeated studies would give nearly the same answer, so a single estimate is precise; a large one means "take this with a grain of salt."

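2. Bigger samples give tighter sampling distributions. Slide n up and the bottom distribution narrows: more data per sample means each estimate lands closer to the truth, so the standard error shrinks. For the sample mean, the standard error is the population standard deviation divided by √n, so quadrupling n halves it.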
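Both takeaways can be checked numerically. The sketch below, which assumes a made-up skewed population and a hypothetical helper `standard_error`, simulates the sampling distribution of the mean at three sample sizes and compares its spread with the textbook value σ/√n for the sample mean. The sample sizes 10, 40, and 160 are chosen so that each step quadruples n.

```python
import math
import random
import statistics

rng = random.Random(1)

# Hypothetical skewed population (exponential shape), fixed for all draws.
population = [rng.expovariate(1.0) for _ in range(20_000)]
sigma = statistics.pstdev(population)  # population standard deviation

def standard_error(n, draws=4000):
    """SD of the sample mean across `draws` simulated studies of size n."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.stdev(means)

results = {}
for n in (10, 40, 160):
    results[n] = standard_error(n)
    theory = sigma / math.sqrt(n)
    print(f"n={n:>3}  simulated SE={results[n]:.3f}  sigma/sqrt(n)={theory:.3f}")
```

Each time n is quadrupled, the simulated standard error should drop by roughly half, in line with the σ/√n column. The match is approximate because the simulation uses a finite number of draws.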

Different statistics, different distributions

Switch the statistic from mean to median and the sampling distribution changes shape and width. Every statistic has its own sampling distribution. That's a big deal: it means we can reason about the reliability of any estimate, not just the mean.
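To see this outside the demo, the same simulation loop can be run with two different statistics. This is an illustrative sketch under assumed settings (an exponential-shaped population, n = 10, 4,000 draws, and a hypothetical helper `simulate`). In a skewed population the mean and the median are different quantities, so the two collections of estimates need not centre on the same value or have the same spread.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical skewed population; its mean is larger than its median.
population = [rng.expovariate(1.0) for _ in range(20_000)]

def simulate(statistic, n=10, draws=4000):
    """Collect one value of `statistic` from each of `draws` random samples."""
    return [statistic(rng.sample(population, n)) for _ in range(draws)]

summary = {}
for name, stat in (("mean", statistics.fmean), ("median", statistics.median)):
    estimates = simulate(stat)
    centre = statistics.fmean(estimates)  # where the estimates pile up
    se = statistics.stdev(estimates)      # how widely they scatter
    summary[name] = (centre, se)
    print(f"{name:>6}: centre={centre:.3f}  standard error={se:.3f}")
```

Any function that maps a sample to a number can be passed as `statistic`, for example `statistics.stdev`, which is the sense in which every statistic has its own sampling distribution.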

Why it matters: the sampling distribution is the bridge between one sample and a claim about the whole population. Next, the Central Limit Theorem tells us that, for statistics like the mean, its shape is predictable, and confidence intervals turn its width into an honest margin of error.