Standard Deviation

IT 위키

Standard Deviation (σ) is a statistical measure that quantifies the amount of variation or dispersion in a dataset. It indicates how spread out the data points are from the mean.

1 Definition[편집 | 원본 편집]

The standard deviation is the square root of variance and is given by:

  • Population Standard Deviation (σ):
    • σ = √[(1/n) * Σ (xᵢ - μ)²]
  • Sample Standard Deviation (s):
    • s = √[(1/(n-1)) * Σ (xᵢ - x̄)²]

where:

  • xᵢ – Each data point.
  • μ – Population mean.
  • – Sample mean.
  • n – Number of data points.
  • Σ – Summation notation.

2 Interpretation[편집 | 원본 편집]

  • Low Standard Deviation – Data points are close to the mean.
  • High Standard Deviation – Data points are widely spread.
  • Zero Standard Deviation – All values are identical.

3 Example Calculation[편집 | 원본 편집]

Given dataset: [5, 7, 9, 10, 14]

  1. Calculate the mean:
    • x̄ = (5 + 7 + 9 + 10 + 14) / 5 = 9
  2. Compute squared differences:
    • (5 - 9)² = 16
    • (7 - 9)² = 4
    • (9 - 9)² = 0
    • (10 - 9)² = 1
    • (14 - 9)² = 25
  3. Population variance:
    • σ² = (16 + 4 + 0 + 1 + 25) / 5 = 9.2
  4. Population standard deviation:
    • σ = √9.2 ≈ 3.03
  5. Sample standard deviation:
    • s² = (16 + 4 + 0 + 1 + 25) / (5-1) = 11.5
    • s = √11.5 ≈ 3.39

4 Properties[편집 | 원본 편집]

  • Always non-negative – Standard deviation is always ≥ 0.
  • Same units as the data – Unlike variance, which is squared.
  • Sensitive to outliers – Extreme values can significantly impact standard deviation.

5 Applications[편집 | 원본 편집]

  • Finance – Measures risk (volatility of asset returns).
  • Machine Learning – Evaluates feature distribution and normalization.
  • Quality Control – Assesses consistency in production processes.

6 Standard Deviation vs. Variance[편집 | 원본 편집]

Measure Formula Interpretation
Variance (σ²) (1/n) * Σ (xᵢ - μ)² Average squared deviation from mean.
Standard Deviation (σ) √Variance Dispersion in original units.

7 See Also[편집 | 원본 편집]