Simple and fast approximations to statistical functions

Task: there is a calculator at hand, but no statistical tables. For example, you need the critical points of Student's t-distribution to compute a confidence interval. Go fetch a computer with Excel? Not sporting.

Great accuracy is not needed, so approximate formulas will do. The idea behind the formulas below is that, by a suitable transformation of the argument, every distribution can be reduced, one way or another, to the normal one. The approximations must support both the cumulative distribution function and its inverse.



Let's start with the normal distribution.



$$\Phi(z)=P=\frac{1}{2}\left[1+\operatorname{erf}\!\left(\frac{z}{\sqrt{2}}\right)\right]$$

$$z=\Phi^{-1}(P)=\sqrt{2}\cdot\operatorname{erf}^{-1}(2P-1)$$



This requires the error function erf(x) and its inverse. I used the approximation from [1]:



$$\operatorname{erf}(x)\approx\operatorname{sign}(x)\cdot\sqrt{1-\exp\left(-x^2\cdot\frac{4/\pi+ax^2}{1+ax^2}\right)}$$

$$\operatorname{erf}^{-1}(x)\approx\operatorname{sign}(x)\cdot\sqrt{-t_2+\sqrt{t_2^2-\frac{1}{a}\ln t_1}}$$



where $t_1$ and $t_2$ are auxiliary variables:



$$t_1=1-x^2,\qquad t_2=\frac{2}{\pi a}+\frac{\ln t_1}{2}$$



and the constant $a \approx 0.147$. Below is the code in Octave.



function y = erfa(x)
  a  = 0.147;
  x2 = x**2; t = x2*(4/pi + a*x2)/(1 + a*x2);
  y  = sign(x)*sqrt(1 - exp(-t));
endfunction

function y = erfinva(x)
  a  = 0.147; 
  t1 = 1 - x**2; t2 = 2/pi/a + log(t1)/2;
  y  = sign(x)*sqrt(-t2 + sqrt(t2**2 - log(t1)/a));
endfunction

function y = normcdfa(x)
  y = 1/2*(1 + erfa(x/sqrt(2)));
endfunction

function y = norminva(x)
  y = sqrt(2)*erfinva(2*x - 1);
endfunction
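As a quick sanity check, the same Winitzki approximation can be sketched in Python (standard library only; the function names `erf_approx` and `erfinv_approx` are mine) and compared against the built-in `math.erf`:

```python
import math

A = 0.147  # Winitzki's constant

def erf_approx(x):
    """Winitzki's approximation to erf(x)."""
    x2 = x * x
    t = x2 * (4/math.pi + A*x2) / (1 + A*x2)
    return math.copysign(math.sqrt(1 - math.exp(-t)), x)

def erfinv_approx(x):
    """Algebraic inverse of erf_approx (valid for |x| < 1)."""
    t1 = 1 - x*x
    t2 = 2/(math.pi*A) + math.log(t1)/2
    return math.copysign(math.sqrt(-t2 + math.sqrt(t2*t2 - math.log(t1)/A)), x)

# worst-case deviation from the library erf on [-3, 3]
worst = max(abs(erf_approx(i/10) - math.erf(i/10)) for i in range(-30, 31))
print(f"max |error| = {worst:.1e}")  # on the order of 1e-4

# erfinv_approx is the exact algebraic inverse of erf_approx
roundtrip = max(abs(erfinv_approx(erf_approx(i/10)) - i/10) for i in range(-25, 26))
print(f"max round-trip error = {roundtrip:.1e}")
```

Note that the inverse formula is not an independent approximation: it is obtained by solving the forward formula exactly, so the round-trip error is at the level of floating-point noise.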


Now that the normal-distribution functions are in place, we transform the argument and obtain Student's t-distribution [2]:



$$F_t(x,n)\approx\Phi\left(\operatorname{sign}(x)\cdot\sqrt{\frac{1}{t_1}\ln\left(1+\frac{x^2}{n}\right)}\right)$$

$$t=F_t^{-1}(P,n)=\sqrt{n\cdot\exp\left(\Phi^{-1}(P)^2\cdot t_1\right)-n}$$



where the auxiliary variable $t_1$ is



$$t_1=\frac{n-1.5}{(n-1)^2}$$



function y = tcdfa(x,n)
  t1 = (n - 1.5)/(n - 1)**2;
  y  = normcdfa(sign(x)*sqrt(1/t1*log(1 + x**2/n)));
endfunction

function y = tinva(x,n)
  % returns the quantile for x >= 0.5; for x < 0.5 use symmetry: t = -tinva(1-x,n)
  t1 = (n - 1.5)/(n - 1)**2;
  y  = sqrt(n*exp(t1*norminva(x)**2) - n);
endfunction
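The same transform is easy to try in Python. Here $\Phi^{-1}$ comes from the standard library's `statistics.NormalDist`, so only the argument transform itself is being tested; the function name `t_inv` is my own. Reference values are the usual tabulated critical points:

```python
import math
from statistics import NormalDist

def t_inv(p, n):
    """Approximate Student's t quantile via the log transform, for p >= 0.5."""
    t1 = (n - 1.5) / (n - 1)**2
    z = NormalDist().inv_cdf(p)          # exact Phi^{-1}(p)
    return math.sqrt(n * math.exp(t1 * z*z) - n)

# tabulated two-sided 95% critical point for n = 10: t = 2.228
print(round(t_inv(0.975, 10), 3))
# tabulated one-sided 95% critical point for n = 5: t = 2.015
print(round(t_inv(0.95, 5), 3))
```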


The idea behind the approximate calculation of the $\chi^2$ distribution is captured by the Wilson–Hilferty formulas [3]:



$$\sigma^2=\frac{2}{9n},\qquad \mu=1-\sigma^2$$



$$F_{\chi^2}(x,n)\approx\Phi\left(\frac{(x/n)^{1/3}-\mu}{\sigma}\right)$$

$$\chi^2=F_{\chi^2}^{-1}(P,n)=n\cdot\left(\Phi^{-1}(P)\cdot\sigma+\mu\right)^3$$



function y = chi2cdfa(x,n)
  s2 = 2/9/n; mu = 1 - s2;
  y  = normcdfa(((x/n)**(1/3) - mu)/sqrt(s2));
endfunction

function y = chi2inva(x,n)
  s2 = 2/9/n; mu = 1 - s2;
  y  = n*(norminva(x)*sqrt(s2) + mu)**3;
endfunction
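A quick check of the Wilson–Hilferty cube-root transform in Python (again with the exact $\Phi^{-1}$ from `statistics.NormalDist`; `chi2_inv` is my name for the sketch), against tabulated $\chi^2$ critical points:

```python
from statistics import NormalDist

def chi2_inv(p, n):
    """Approximate chi-square quantile via the Wilson-Hilferty transform."""
    s2 = 2 / (9*n)
    mu = 1 - s2
    z = NormalDist().inv_cdf(p)
    return n * (z * s2**0.5 + mu)**3

# tabulated values: chi2(0.95, 10) = 18.307, chi2(0.95, 20) = 31.410
print(round(chi2_inv(0.95, 10), 2), round(chi2_inv(0.95, 20), 2))
```

The agreement is within a few hundredths already at $n = 10$; the approximation degrades for very small $n$.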


The Fisher distribution (for $n/k \ge 3$ and $n \ge 3$) is reduced to the $\chi^2$ approximation above [4], with $k$ taking the role of $n$:



$$\sigma^2=\frac{2}{9k},\qquad \mu=1-\sigma^2$$



$$\lambda=\frac{2n+k\cdot x/3+k-2}{2n+4k\cdot x/3}$$

$$F_F(x;k,n)\approx\Phi\left(\frac{(\lambda\cdot x)^{1/3}-\mu}{\sigma}\right)$$



The inverse function is found by solving for $x$, via the auxiliary quantities $q$, $b$ and $D$:



$$q=\left(\Phi^{-1}(P)\cdot\sigma+\mu\right)^3$$

$$b=2n+k-2-\frac{4}{3}\,kq$$

$$D=b^2+\frac{8}{3}\,knq$$

$$x=F_F^{-1}(P;k,n)=\frac{-b+\sqrt{D}}{2k/3}$$



function y = fcdfa(x,k,n)
  mu = 1 - 2/9/k; s = sqrt(2/9/k);
  lambda = (2*n + k*x/3 + k - 2)/(2*n + 4*k*x/3);
  y  = normcdfa(((lambda*x)**(1/3) - mu)/s);
endfunction

function y = finva(x,k,n)
  mu = 1 - 2/9/k; s = sqrt(2/9/k);
  q  = (norminva(x)*s + mu)**3;
  b  = 2*n + k - 2 - 4/3*k*q;
  d  = b**2 + 8/3*k*n*q;
  y  = (sqrt(d) - b)/(2*k/3);
endfunction
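And a check of the inverse F formula in Python (exact $\Phi^{-1}$ from `statistics.NormalDist`; `f_inv` is my naming). This approximation is coarser than the previous ones, so agreement with the tables is only to a couple of percent:

```python
from statistics import NormalDist

def f_inv(p, k, n):
    """Approximate F quantile via chi-square [4], for n/k >= 3 and n >= 3."""
    s2 = 2 / (9*k)
    mu = 1 - s2
    q = (NormalDist().inv_cdf(p) * s2**0.5 + mu)**3
    b = 2*n + k - 2 - 4/3*k*q
    d = b*b + 8/3*k*n*q
    return (d**0.5 - b) / (2*k/3)

# tabulated: F(0.95; 3, 10) = 3.708, F(0.95; 2, 10) = 4.103
print(round(f_inv(0.95, 3, 10), 2), round(f_inv(0.95, 2, 10), 2))
```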




  1. Winitzki S. A handy approximation for the error function and its inverse. February 6, 2008.
  2. Gleason J.R. A note on a proposed Student t approximation // Computational Statistics & Data Analysis. – 2000. – Vol. 34. – No. 1. – Pp. 63–66.
  3. Wilson E.B., Hilferty M.M. The distribution of chi-square // Proceedings of the National Academy of Sciences. – 1931. – Vol. 17. – No. 12. – Pp. 684–688.
  4. Li B., Martin E.B. An approximation to the F-distribution using the chi-square distribution // Computational Statistics & Data Analysis. – 2002. – Vol. 40. – No. 1. – Pp. 21–26.


