What is the Central Limit Theorem?
The Central Limit Theorem (CLT) is one of the most important results in statistics. It tells us that:
If you take a sufficiently large random sample from any population with a finite mean and finite variance, the distribution of the sample mean will be approximately normal (bell-shaped), even if the original population is not normal.
This is why normal-distribution methods work for many real-world problems, even when the underlying data are not normal.
CLT Statement (Simple Version)
Suppose X₁, X₂, ..., Xₙ are independent, identically distributed random variables with:
– Mean (μ)
– Variance (σ²)
Then the sample mean X̄ = (X₁ + X₂ + ... + Xₙ) / n will be approximately normally distributed with:
– Mean: μ
– Standard Deviation: σ / √n
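The approximation improves as n grows: as n → ∞, the standardized mean (X̄ − μ) / (σ / √n) converges in distribution to the standard normal.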
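The statement above can be checked with a small simulation. The sketch below is illustrative only: it assumes an Exponential(1) population (chosen because it is clearly non-normal, with μ = 1 and σ = 1), and the function name, sample size, trial count, and seed are all arbitrary choices, not part of the theorem.

```python
import math
import random
import statistics


def simulate_sample_means(n, trials, seed=0):
    """Draw `trials` samples of size n from Exponential(1); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]


n = 50  # assumed sample size
means = simulate_sample_means(n=n, trials=5000)

observed_mean = statistics.fmean(means)
observed_sd = statistics.stdev(means)
predicted_sd = 1.0 / math.sqrt(n)  # sigma / sqrt(n), with sigma = 1

print(f"mean of sample means: {observed_mean:.3f} (CLT predicts mu = 1)")
print(f"sd of sample means:   {observed_sd:.3f} (CLT predicts {predicted_sd:.3f})")
```

If the CLT holds for this setup, the two observed values should land close to μ and σ / √n respectively.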
Why is CLT Useful?
- It allows us to use normal distribution tools even when data isn't normal.
- It helps in calculating probabilities for sample means.
- It's used in hypothesis testing and confidence intervals.
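As a rough illustration of the last two points, the sketch below uses Python's statistics.NormalDist to compute a probability for a sample mean and a 95% confidence interval. All numbers (μ = 100, σ = 15, n = 36, and an observed mean of 103) are hypothetical, and σ is assumed to be known.

```python
import math
from statistics import NormalDist

# Hypothetical population parameters and sample size.
mu, sigma, n = 100.0, 15.0, 36

# Standard deviation of the sample mean (the standard error).
standard_error = sigma / math.sqrt(n)

# By the CLT, the sample mean is approximately Normal(mu, standard_error).
sampling_dist = NormalDist(mu=mu, sigma=standard_error)

# Probability that the sample mean exceeds 105.
p_above_105 = 1.0 - sampling_dist.cdf(105.0)

# 95% confidence interval around a hypothetical observed sample mean.
x_bar = 103.0
z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
ci = (x_bar - z * standard_error, x_bar + z * standard_error)

print(f"standard error: {standard_error:.2f}")
print(f"P(sample mean > 105): {p_above_105:.4f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

When σ is unknown and estimated from the sample, a t-distribution is the usual replacement for the normal critical value; that case is not shown here.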
Real-Life Example
Suppose you measure the heights of students in a school. Individual heights might vary a lot. But if you take many random samples of the same size and compute the average height each time, the averages will be approximately normally distributed.
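The height example can be simulated along these lines. The population here is invented for illustration: 10,000 heights in centimetres drawn from two groups, so individual heights are bimodal rather than bell-shaped. The sketch then checks whether the sample averages behave like a normal distribution, for which about 68% of values fall within one standard error of the mean and about 95% within two.

```python
import math
import random
import statistics

rng = random.Random(42)

# Hypothetical population of heights (cm): two groups, so the overall
# distribution is bimodal, not normal.
population = (
    [rng.gauss(158, 6) for _ in range(5000)]
    + [rng.gauss(176, 6) for _ in range(5000)]
)

mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 40  # assumed sample size
standard_error = sigma / math.sqrt(n)

# Sample with replacement so that draws are independent.
averages = [
    statistics.fmean(rng.choices(population, k=n))
    for _ in range(4000)
]

# Share of averages within 1 and 2 standard errors of the population mean.
within_1 = sum(abs(a - mu) <= standard_error for a in averages) / len(averages)
within_2 = sum(abs(a - mu) <= 2 * standard_error for a in averages) / len(averages)

print(f"within 1 SE: {within_1:.3f} (normal: about 0.683)")
print(f"within 2 SE: {within_2:.3f} (normal: about 0.954)")
```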
When Can You Use CLT?
- Sample size n is large (n ≥ 30 is a common rule of thumb; strongly skewed or heavy-tailed populations may need more)
- Observations are independent and identically distributed
- The population has a finite variance
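To see why the sample-size condition matters, one option is to track how the skewness of the sample mean shrinks as n grows. The sketch below again assumes an Exponential(1) population, which is strongly right-skewed; the helper names and the particular values of n are illustrative choices. A normal distribution has skewness 0, so values closer to 0 suggest a better approximation.

```python
import random
import statistics


def skewness(values):
    """Moment-based skewness: third central moment over variance^1.5."""
    m = statistics.fmean(values)
    m2 = statistics.fmean([(v - m) ** 2 for v in values])
    m3 = statistics.fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5


def sample_mean_skewness(n, trials=5000, seed=1):
    """Skewness of the means of `trials` Exponential(1) samples of size n."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]
    return skewness(means)


skew_by_n = {n: sample_mean_skewness(n) for n in (2, 30, 300)}

for n, s in skew_by_n.items():
    print(f"n = {n:>3}: skewness of sample mean = {s:.3f}")
```

For this population the skewness at n = 30 is noticeably smaller than at n = 2 but still not zero, which is one reason n ≥ 30 is best treated as a guideline rather than a guarantee.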