Simple and fast approximations to statistical functions

Task: there is a calculator at hand, but no statistical tables. For example, you need the critical points of Student's t-distribution to compute a confidence interval. Go fetch a computer with Excel? Not sporting.

Great accuracy is not needed, so approximate formulas will do. The idea behind the formulas below is that, by a suitable transformation of the argument, every distribution can be reduced, one way or another, to the normal one. The approximations must support both the cumulative distribution function and its inverse.



Let's start with the normal distribution.



$$\Phi(z)=P=\frac{1}{2}\left[1+\operatorname{erf}\!\left(\frac{z}{\sqrt{2}}\right)\right]$$

$$z=\Phi^{-1}(P)=\sqrt{2}\cdot\operatorname{erf}^{-1}(2P-1)$$



This requires the error function erf(x) and its inverse. I used the approximation from [1]:



$$\operatorname{erf}(x)\approx\operatorname{sign}(x)\cdot\sqrt{1-\exp\left(-x^2\cdot\frac{4/\pi+ax^2}{1+ax^2}\right)}$$

$$\operatorname{erf}^{-1}(x)\approx\operatorname{sign}(x)\cdot\sqrt{-t_2+\sqrt{t_2^2-\frac{1}{a}\ln t_1}}$$



where $t_1$ and $t_2$ are auxiliary variables:



$$t_1=1-x^2,\qquad t_2=\frac{2}{\pi a}+\frac{\ln t_1}{2}$$



and the constant $a \approx 0.147$. Below is the code in Octave.



function y = erfa(x)
  a  = 0.147;
  x2 = x**2; t = x2*(4/pi + a*x2)/(1 + a*x2);
  y  = sign(x)*sqrt(1 - exp(-t));
endfunction

function y = erfinva(x)
  a  = 0.147; 
  t1 = 1 - x**2; t2 = 2/pi/a + log(t1)/2;
  y  = sign(x)*sqrt(-t2 + sqrt(t2**2 - log(t1)/a));
endfunction

function y = normcdfa(x)
  y = 1/2*(1 + erfa(x/sqrt(2)));
endfunction

function y = norminva(x)
  y = sqrt(2)*erfinva(2*x - 1);
endfunction
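As a quick sanity check, the same Winitzki approximation can be sketched in Python (standard library only; the function names `erf_approx` and `erfinv_approx` are mine) and compared against the built-in `math.erf`:

```python
import math

A = 0.147  # Winitzki's constant

def erf_approx(x):
    """Winitzki's approximation to erf(x)."""
    x2 = x * x
    t = x2 * (4/math.pi + A*x2) / (1 + A*x2)
    return math.copysign(math.sqrt(1 - math.exp(-t)), x)

def erfinv_approx(x):
    """Algebraic inverse of erf_approx (valid for |x| < 1)."""
    t1 = 1 - x*x
    t2 = 2/(math.pi*A) + math.log(t1)/2
    return math.copysign(math.sqrt(-t2 + math.sqrt(t2*t2 - math.log(t1)/A)), x)

# worst-case deviation from the library erf on [-3, 3]
worst = max(abs(erf_approx(i/10) - math.erf(i/10)) for i in range(-30, 31))
print(f"max |error| = {worst:.1e}")  # on the order of 1e-4

# erfinv_approx is the exact algebraic inverse of erf_approx
roundtrip = max(abs(erfinv_approx(erf_approx(i/10)) - i/10) for i in range(-25, 26))
print(f"max round-trip error = {roundtrip:.1e}")
```

Note that the inverse formula is not an independent approximation: it is obtained by solving the forward formula exactly, so the round-trip error is at the level of floating-point noise.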


Now that the normal-distribution functions are in place, we transform the argument and obtain Student's t-distribution [2]:



$$F_t(x,n)\approx\Phi\left(\operatorname{sign}(x)\cdot\sqrt{\frac{1}{t_1}\ln\left(1+\frac{x^2}{n}\right)}\right)$$

$$t=F_t^{-1}(P,n)=\sqrt{n\cdot\exp\left(\Phi^{-1}(P)^2\cdot t_1\right)-n}$$



where the auxiliary variable $t_1$ is



$$t_1=\frac{n-1.5}{(n-1)^2}$$



function y = tcdfa(x,n)
  t1 = (n - 1.5)/(n - 1)**2;
  y  = normcdfa(sign(x)*sqrt(1/t1*log(1 + x**2/n)));
endfunction

function y = tinva(x,n)
  % returns the quantile for x >= 0.5; for x < 0.5 use symmetry: t = -tinva(1-x,n)
  t1 = (n - 1.5)/(n - 1)**2;
  y  = sqrt(n*exp(t1*norminva(x)**2) - n);
endfunction
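The same transform is easy to try in Python. Here $\Phi^{-1}$ comes from the standard library's `statistics.NormalDist`, so only the argument transform itself is being tested; the function name `t_inv` is my own. Reference values are the usual tabulated critical points:

```python
import math
from statistics import NormalDist

def t_inv(p, n):
    """Approximate Student's t quantile via the log transform, for p >= 0.5."""
    t1 = (n - 1.5) / (n - 1)**2
    z = NormalDist().inv_cdf(p)          # exact Phi^{-1}(p)
    return math.sqrt(n * math.exp(t1 * z*z) - n)

# tabulated two-sided 95% critical point for n = 10: t = 2.228
print(round(t_inv(0.975, 10), 3))
# tabulated one-sided 95% critical point for n = 5: t = 2.015
print(round(t_inv(0.95, 5), 3))
```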


The idea behind the approximate calculation of the $\chi^2$ distribution is captured by the Wilson–Hilferty formulas [3]:



$$\sigma^2=\frac{2}{9n},\qquad \mu=1-\sigma^2$$



$$F_{\chi^2}(x,n)\approx\Phi\left(\frac{(x/n)^{1/3}-\mu}{\sigma}\right)$$

$$\chi^2=F_{\chi^2}^{-1}(P,n)=n\cdot\left(\Phi^{-1}(P)\cdot\sigma+\mu\right)^3$$



function y = chi2cdfa(x,n)
  s2 = 2/9/n; mu = 1 - s2;
  y  = normcdfa(((x/n)**(1/3) - mu)/sqrt(s2));
endfunction

function y = chi2inva(x,n)
  s2 = 2/9/n; mu = 1 - s2;
  y  = n*(norminva(x)*sqrt(s2) + mu)**3;
endfunction
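A quick check of the Wilson–Hilferty cube-root transform in Python (again with the exact $\Phi^{-1}$ from `statistics.NormalDist`; `chi2_inv` is my name for the sketch), against tabulated $\chi^2$ critical points:

```python
from statistics import NormalDist

def chi2_inv(p, n):
    """Approximate chi-square quantile via the Wilson-Hilferty transform."""
    s2 = 2 / (9*n)
    mu = 1 - s2
    z = NormalDist().inv_cdf(p)
    return n * (z * s2**0.5 + mu)**3

# tabulated values: chi2(0.95, 10) = 18.307, chi2(0.95, 20) = 31.410
print(round(chi2_inv(0.95, 10), 2), round(chi2_inv(0.95, 20), 2))
```

The agreement is within a few hundredths already at $n = 10$; the approximation degrades for very small $n$.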


The Fisher distribution (for $n/k \ge 3$ and $n \ge 3$) is reduced to the $\chi^2$ approximation above [4], with $k$ taking the role of $n$:



$$\sigma^2=\frac{2}{9k},\qquad \mu=1-\sigma^2$$



$$\lambda=\frac{2n+k\cdot x/3+k-2}{2n+4k\cdot x/3}$$

$$F_F(x;k,n)\approx\Phi\left(\frac{(\lambda\cdot x)^{1/3}-\mu}{\sigma}\right)$$



The inverse function is found by solving for $x$, via the auxiliary quantities $q$, $b$ and $D$:



$$q=\left(\Phi^{-1}(P)\cdot\sigma+\mu\right)^3$$

$$b=2n+k-2-\frac{4}{3}\,kq$$

$$D=b^2+\frac{8}{3}\,knq$$

$$x=F_F^{-1}(P;k,n)=\frac{-b+\sqrt{D}}{2k/3}$$



function y = fcdfa(x,k,n)
  mu = 1 - 2/9/k; s = sqrt(2/9/k);
  lambda = (2*n + k*x/3 + k - 2)/(2*n + 4*k*x/3);
  y  = normcdfa(((lambda*x)**(1/3) - mu)/s);
endfunction

function y = finva(x,k,n)
  mu = 1 - 2/9/k; s = sqrt(2/9/k);
  q  = (norminva(x)*s + mu)**3;
  b  = 2*n + k - 2 - 4/3*k*q;
  d  = b**2 + 8/3*k*n*q;
  y  = (sqrt(d) - b)/(2*k/3);
endfunction
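And a check of the inverse F formula in Python (exact $\Phi^{-1}$ from `statistics.NormalDist`; `f_inv` is my naming). This approximation is coarser than the previous ones, so agreement with the tables is only to a couple of percent:

```python
from statistics import NormalDist

def f_inv(p, k, n):
    """Approximate F quantile via chi-square [4], for n/k >= 3 and n >= 3."""
    s2 = 2 / (9*k)
    mu = 1 - s2
    q = (NormalDist().inv_cdf(p) * s2**0.5 + mu)**3
    b = 2*n + k - 2 - 4/3*k*q
    d = b*b + 8/3*k*n*q
    return (d**0.5 - b) / (2*k/3)

# tabulated: F(0.95; 3, 10) = 3.708, F(0.95; 2, 10) = 4.103
print(round(f_inv(0.95, 3, 10), 2), round(f_inv(0.95, 2, 10), 2))
```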




  1. Winitzki S. A handy approximation for the error function and its inverse. February 6, 2008.
  2. Gleason J.R. A note on a proposed Student t approximation // Computational Statistics & Data Analysis. – 2000. – Vol. 34. – No. 1. – Pp. 63–66.
  3. Wilson E.B., Hilferty M.M. The distribution of chi-square // Proceedings of the National Academy of Sciences. – 1931. – Vol. 17. – No. 12. – Pp. 684–688.
  4. Li B., Martin E.B. An approximation to the F-distribution using the chi-square distribution // Computational Statistics & Data Analysis. – 2002. – Vol. 40. – No. 1. – Pp. 21–26.


