Given $\sigma_a$ and $\sigma_b$, find $\sigma$

1. Simplify $\sigma$

  • Expand $\sigma$
    • $\sigma = \sqrt{\frac{\sum (x_i-\bar x)^2}{n}}$
    • $\sigma = \sqrt{\frac{\sum x_i^2-2\bar x\sum x_i+n\bar x^2}{n}}$
  • The mean is the sum divided by the count, $\frac{\sum x_i}{n}=\bar x$, so
    • $\sigma = \sqrt{\frac{\sum x_i^2}{n}-\frac{2\bar x\sum x_i}{n}+\frac{n\bar x^2}{n}}$
    • $\sigma = \sqrt{\frac{\sum x_i^2}{n}-2\bar x^2+\bar x^2}$
    • $\boxed{\sigma = \sqrt{\frac{\sum x_i^2}{n}-\bar x^2}}$ (1)
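As a quick numerical check of the shortcut form, here is a minimal Python sketch. The function name `std_from_moments` and the sample values are made up for illustration; the comparison uses the standard library's `statistics.pstdev`, which computes the population standard deviation from the definition.

```python
import math
import statistics


def std_from_moments(values):
    """Population standard deviation as sqrt(mean of squares - square of the mean)."""
    n = len(values)
    mean = sum(values) / n
    mean_of_squares = sum(v * v for v in values) / n
    # Clamp at 0 to absorb tiny negative results caused by floating-point rounding.
    return math.sqrt(max(mean_of_squares - mean * mean, 0.0))


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data
print(std_from_moments(sample))   # shortcut form
print(statistics.pstdev(sample))  # definition-based population std
```

The two printed values should agree up to floating-point rounding.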

2. Find each group's sum of squares

  • From (1), the standard deviation of group $a$ is (and likewise for group $b$)
    • $\boxed{\sigma_a = \sqrt{\frac{\sum x_{ai}^2}{n_a}-\bar x_a^2}}$ (2)
  • The counts and the sums of squares of the two groups add up
    • $\boxed{n = n_a+n_b}$ (3)
    • $\boxed{\sum x_i^2=\sum x_{ai}^2+\sum x_{bi}^2}$ (4)
  • To get $\sum x_{ai}^2$, square both sides of (2) and rearrange
    • $\sigma_a^2 = \frac{\sum x_{ai}^2}{n_a}-\bar x_a^2$
    • $\sigma_a^2+\bar x_a^2= \frac{\sum x_{ai}^2}{n_a}$
    • $\boxed{\sum x_{ai}^2=n_a(\sigma_a^2+\bar x_a^2)}$ (5)
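The rearrangement above means a group's sum of squares can be recovered from its count, mean, and standard deviation alone. A minimal Python sketch, assuming population (divide by $n$) statistics; `sum_of_squares` and the data are hypothetical names and values chosen for illustration:

```python
def sum_of_squares(n, mean, std):
    """Recover sum(x_i^2) from a group's count, mean and population std."""
    return n * (std ** 2 + mean ** 2)


values = [2.0, 4.0, 4.0, 4.0]  # made-up group a
n_a = len(values)
mean_a = sum(values) / n_a
var_a = sum((v - mean_a) ** 2 for v in values) / n_a  # population variance
std_a = var_a ** 0.5

recovered = sum_of_squares(n_a, mean_a, std_a)  # from summary statistics only
direct = sum(v * v for v in values)             # from the raw data
print(recovered, direct)
```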

3. Find the overall standard deviation

  • Split the sum of squares in (1) using (4)
    • $\boxed{\sigma = \sqrt{\frac{\sum x_{ai}^2+\sum x_{bi}^2}{n}-\bar x^2}}$ (6)
  • Substitute (5) into (6)
    • $\boxed{\sigma=\sqrt{\frac{n_a(\sigma_a^2+\bar x_a^2)+n_b(\sigma_b^2+\bar x_b^2)}{n}-\bar x^2}}$ (7)
  • where $\boxed{\bar x=\frac{n_a\bar x_a + n_b\bar x_b}{n}}$ (8)
  • Extending the same argument to any number of groups gives the general formula:
    • $\boxed{\sigma=\sqrt{\frac{\sum n_i(\sigma_i^2+\bar x_i^2)}{n}-\bar x^2}}$ (9)
    • or, substituting $\bar x=\frac{\sum n_i\bar x_i}{n}$ so that $n\bar x^2=\frac{(\sum n_i\bar x_i)^2}{n}$, as a single fraction
    • $\boxed{\sigma=\sqrt{\frac{\sum n_i(\sigma_i^2+\bar x_i^2)-\frac{(\sum n_i\bar x_i)^2}{n}}{n}}}$ (10)
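The general formula can be sketched in Python as follows. This is an illustrative sketch, not a reference implementation: `combine_groups` and `summarize` are hypothetical helper names, the data are made up, and all standard deviations are population (divide by $n$) values.

```python
import math
import statistics


def combine_groups(groups):
    """Combine (count, mean, population_std) triples into overall (count, mean, std)."""
    n = sum(n_i for n_i, _, _ in groups)
    weighted_sum = sum(n_i * mean_i for n_i, mean_i, _ in groups)
    total_sq = sum(n_i * (std_i ** 2 + mean_i ** 2) for n_i, mean_i, std_i in groups)
    mean = weighted_sum / n
    variance = total_sq / n - mean ** 2
    # Clamp at 0 to absorb tiny negative results caused by floating-point rounding.
    return n, mean, math.sqrt(max(variance, 0.0))


def summarize(values):
    """Per-group summary: count, mean, population standard deviation."""
    return len(values), sum(values) / len(values), statistics.pstdev(values)


group_a = [2.0, 4.0, 4.0, 4.0]  # made-up data
group_b = [5.0, 5.0, 7.0, 9.0]
n, mean, std = combine_groups([summarize(group_a), summarize(group_b)])
print(n, mean, std)
print(statistics.pstdev(group_a + group_b))  # std computed from the raw data
```

The combined result should match the standard deviation computed directly on the concatenated data, up to floating-point rounding.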

Summary

  • Count
    • $\boxed{n=n_a+n_b=\sum n_i}$
  • Mean
    • $\boxed{\bar x=\frac{n_a\bar x_a+n_b\bar x_b}{n_a+n_b}=\frac{\sum{n_i\bar x_i}}{\sum{n_i}}}$
  • Standard deviation
    • $\boxed{\sigma=\sqrt{\frac{n_a(\sigma_a^2+\bar x_a^2)+n_b(\sigma_b^2+\bar x_b^2)-\frac{(n_a\bar x_a+n_b\bar x_b)^2}{n_a+n_b}}{n_a+n_b}}=\sqrt{\frac{\sum n_i(\sigma_i^2+\bar x_i^2)-\frac{(\sum n_i\bar x_i)^2}{\sum n_i}}{\sum n_i}}}$
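A small worked example with made-up numbers may help. Take group $a$ with $n_a=4$, $\bar x_a=3.5$, $\sigma_a^2=0.75$ and group $b$ with $n_b=4$, $\bar x_b=6.5$, $\sigma_b^2=2.75$ (these correspond to the data $\{2,4,4,4\}$ and $\{5,5,7,9\}$):

$$
\begin{aligned}
n &= 4 + 4 = 8 \\
\bar x &= \frac{4(3.5)+4(6.5)}{8} = \frac{40}{8} = 5 \\
\sum n_i(\sigma_i^2+\bar x_i^2) &= 4(0.75+12.25)+4(2.75+42.25) = 52 + 180 = 232 \\
\sigma &= \sqrt{\frac{232 - \frac{40^2}{8}}{8}} = \sqrt{\frac{232-200}{8}} = \sqrt{4} = 2
\end{aligned}
$$

Computing the standard deviation directly on the eight values $\{2,4,4,4,5,5,7,9\}$ also gives $2$.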

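4. SQL

  • Suppose a table stores, for each row:
    • avg_value (the group mean $\bar x_i$)
    • std_value (the group standard deviation $\sigma_i$)
    • site_count (the group size $n_i$)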
with stats as (
    select
        ...
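        -- weighted mean: sum(n_i * mean_i) / sum(n_i)
        sum(site_count*avg_value)/sum(site_count) as avg_value,
        -- combined std: sqrt((sum(n_i*(std_i^2 + mean_i^2)) - (sum(n_i*mean_i))^2 / sum(n_i)) / sum(n_i))
        sqrt((sum(site_count*(square(std_value)+square(avg_value)))-square(sum(site_count*avg_value))/sum(site_count))/sum(site_count)) as std_value,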
        sum(site_count) as site_count
    from data
    where ...
    group by ...
)
select * from stats
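To sanity-check the aggregation outside the database, the same computation can be sketched in Python over the three columns. This is a sketch under assumptions: the grouping key (`"north"`, `"south"`), the row values, and the `aggregate` helper are all hypothetical, and `std_value` is taken to be a population standard deviation. The subtracted term is the squared weighted sum divided by the total count.

```python
import math
from collections import defaultdict

# Hypothetical rows: (group_key, avg_value, std_value, site_count).
rows = [
    ("north", 3.5, math.sqrt(0.75), 4),
    ("north", 6.5, math.sqrt(2.75), 4),
    ("south", 10.0, 0.0, 3),
]


def aggregate(rows):
    """Group rows by key and combine their means and standard deviations."""
    # Per key: [sum(n_i), sum(n_i * mean_i), sum(n_i * (std_i^2 + mean_i^2))]
    acc = defaultdict(lambda: [0, 0.0, 0.0])
    for key, avg_value, std_value, site_count in rows:
        sums = acc[key]
        sums[0] += site_count
        sums[1] += site_count * avg_value
        sums[2] += site_count * (std_value ** 2 + avg_value ** 2)

    result = {}
    for key, (n, weighted_sum, total_sq) in acc.items():
        variance = (total_sq - weighted_sum ** 2 / n) / n
        result[key] = {
            "avg_value": weighted_sum / n,
            # Clamp at 0 to absorb floating-point rounding.
            "std_value": math.sqrt(max(variance, 0.0)),
            "site_count": n,
        }
    return result


stats = aggregate(rows)
for key, row in stats.items():
    print(key, row)
```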