Variance
IT 위키
Variance is a statistical measure that quantifies the dispersion of a dataset relative to its mean. It indicates how much the values in a dataset deviate from the average value.
Definition[편집 | 원본 편집]
The variance (σ²) of a dataset with n elements is calculated as:
- Population Variance (σ²):
- σ² = (1/n) * Σ (xᵢ - μ)²
 
- Sample Variance (s²):
- s² = (1/(n-1)) * Σ (xᵢ - x̄)²
 
where:
- xᵢ – Each data point.
- μ – Population mean.
- x̄ – Sample mean.
- n – Number of data points.
- Σ – Summation notation.
Interpretation[편집 | 원본 편집]
- Low Variance – Data points are close to the mean.
- High Variance – Data points are spread out from the mean.
- Zero Variance – All values are identical.
Example Calculation[편집 | 원본 편집]
Given dataset: [5, 7, 9, 10, 14]
- Calculate the mean:
- x̄ = (5 + 7 + 9 + 10 + 14) / 5 = 9
 
- Compute squared differences:
- (5 - 9)² = 16
- (7 - 9)² = 4
- (9 - 9)² = 0
- (10 - 9)² = 1
- (14 - 9)² = 25
 
- Population variance:
- σ² = (16 + 4 + 0 + 1 + 25) / 5 = 9.2
 
- Sample variance:
- s² = (16 + 4 + 0 + 1 + 25) / (5-1) = 11.5
 
Properties[편집 | 원본 편집]
- Always non-negative – Squaring deviations ensures variance is ≥ 0.
- Measured in squared units – Different from the original data scale.
- Used to compute standard deviation – Standard deviation (σ) is the square root of variance.
Applications[편집 | 원본 편집]
- Finance – Measuring risk (volatility of asset returns).
- Machine Learning – Evaluating feature distribution in datasets.
- Quality Control – Assessing variability in production processes.
Variance vs. Standard Deviation[편집 | 원본 편집]
| Measure | Formula | Interpretation | 
|---|---|---|
| Variance (σ²) | (1/n) * Σ (xᵢ - μ)² | Average squared deviation from mean. | 
| Standard Deviation (σ) | √Variance | Measures dispersion in original units. | 

