Wilcoxon test: a sweet spot for practitioners

In practice, the distribution of the population from which observations are drawn is often unknown or (for continuous random variables) differs from the normal distribution, so applying classical statistical methods is unjustified and can lead to errors. In such cases one uses methods that do not depend on the population distribution (distribution-free methods): nonparametric methods.

This article considers, from a unified point of view, three one-sample tests that are frequently encountered in practice: the sign test, the t-test, and the signed-rank Wilcoxon test. The last is a nonparametric procedure whose power is comparable to that of the t-test when the sample is normally distributed, and exceeds it when the sample distribution has "heavier tails" than the normal distribution.

1. Define the location model as follows. Let $X_1, X_2, \ldots, X_n$ denote a random sample that follows the model

$X_i = \theta + e_i,$

where the random errors $e_1, e_2, \ldots, e_n$ are assumed to be independent and identically distributed random variables with a continuous density $f(t)$ that is symmetric about zero.

2. Under the symmetry assumption, every location parameter of $X_i$, including the mean and the median, equals $\theta$. Consider the hypotheses

$H_0: \theta = 0, ~~~ H_a: \theta > 0.$

3. To test this hypothesis, consider three tests that are often used in practice: the sign test, the t-test, and the Wilcoxon test.

3.1. The classical sign test is based on the statistic

$S=\sum_{i=1}^nsign(X_i),$

where $sign(t)=-1,0,1$ for $t<0$, $t=0$, $t>0$, respectively. Let

$S^+=\#_i\{X_i>0\}.$

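Then $S=2S^+-n$. This assumes that none of the $X_i$ equals zero (in practice, zero observations are dropped and the sample size is reduced accordingly). Under $H_0$, $S^+$ has a binomial distribution with $n$ trials and success probability $1/2$. Let $s^+$ be the observed value of $S^+$. The p-value of the sign test is $P_{H_0}(S^+\geq s^+)=1-F_B(s^+-1;n;0.5)$, where $F_B(t;n;p)$ is the cdf of the binomial distribution with $n$ trials and success probability $p$ (the R function pbinom returns this cdf).

Note that under $H_0$ the distribution of $S^+$ does not depend on the (symmetric) density $f(t)$, so the sign test is distribution-free.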
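The p-value formula above can be checked by hand in R. The following is a minimal sketch, assuming the sample is the vector of differences from the example in section 5 (Store_A - Store_B); the variable names `x`, `s_plus` and `p_sign` are illustrative and not part of the article.

```r
# Differences Store_A - Store_B from the example in section 5.
x <- c(19, 27, -1, 6, 7, 13, -4, 3)
x <- x[x != 0]                 # drop zero observations, if any (none here)
n <- length(x)

s_plus <- sum(x > 0)           # S^+: number of positive observations
S <- sum(sign(x))              # S: sum of the signs
stopifnot(S == 2 * s_plus - n) # identity S = 2 S^+ - n

# One-sided p-value P(S^+ >= s^+) under H_0, using the binomial cdf.
p_sign <- 1 - pbinom(s_plus - 1, n, 0.5)
p_sign
```

For these data $S^+ = 6$ out of $n = 8$, so the p-value equals $37/256 \approx 0.145$. The same value can be obtained with `binom.test(s_plus, n, p = 0.5, alternative = "greater")$p.value`.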

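3.2. The classical t-test is based on the sum of the observations, which can be written as

$T=\sum_{i=1}^nsign(X_i)\cdot|X_i|.$

The distribution of $T$ under $H_0$ depends on the density $f(t)$, so the t-test is not distribution-free. The usual form of the t-test statistic is the t-ratio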

$t=\frac{\bar{X}}{s/\sqrt{n}},$

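where $\bar{X}$ and $s$ are the sample mean and the sample standard deviation. If the population is normal, $t$ has a Student t-distribution with $n-1$ degrees of freedom. Let $t_0$ be the observed value of $t$. The p-value of the t-test is $P_{H_0}(t\geq t_0)=1-F_T(t_0;n-1)$, where $F_T(t;\nu)$ is the cdf of the Student t-distribution with $\nu$ degrees of freedom (the R function pt returns this cdf). This p-value is exact if the population is normal; otherwise it is an approximation.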
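As a rough illustration, the t-ratio and its p-value can be computed directly from the definitions. This sketch assumes the sample is the vector of differences from the example in section 5; the names `t0` and `p_t` are illustrative.

```r
# Differences Store_A - Store_B from the example in section 5.
x <- c(19, 27, -1, 6, 7, 13, -4, 3)
n <- length(x)

# t-ratio: sample mean divided by its estimated standard error.
t0 <- mean(x) / (sd(x) / sqrt(n))

# One-sided p-value P(t >= t0) from the Student t cdf with n - 1 df.
p_t <- 1 - pt(t0, df = n - 1)

c(t0 = t0, p_value = p_t)
```

The two values should agree with the `t = 2.3791` and `p-value = 0.02447` reported by `t.test()` in section 5.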

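3.3. The t-test differs from the sign test in that the t-test statistic depends on the distances of the observations from zero in addition to their signs.

The signed-rank Wilcoxon test statistic, by contrast, uses only the ranks of these distances. Let $R|X_i|$ denote the rank of $|X_i|$ among $|X_1|,\ldots,|X_n|$, ordered from smallest to largest. The signed-rank Wilcoxon statistic is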

$W=\sum_{i=1}^nsign(X_i)\cdot R|X_i|.$

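Unlike the t-test statistic, the distribution of $W$ under $H_0$ does not depend on $f(t)$, so the test is distribution-free.

In practice, the statistic usually computed is the sum of the ranks of the positive observations, $W^+$,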

$W^+=\sum_{X_i>0}R|X_i|=\frac{1}{2}W+\frac{n(n+1)}{4}.$

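The p-value of the signed-rank Wilcoxon test is $P_{H_0}(W^+\geq w^+)=1-F_{W^+}(w^+-1;n)$, where $w^+$ is the observed value of $W^+$ and $F_{W^+}(x;n)$ is the cdf of $W^+$ under $H_0$ (the R function psignrank returns this cdf).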
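The definitions of $W$ and $W^+$ and the p-value formula can be traced step by step in R. This is a minimal sketch, assuming the sample is the vector of differences from the example in section 5 and that there are no zeros or tied absolute values (true for these data); the variable names are illustrative.

```r
# Differences Store_A - Store_B from the example in section 5.
x <- c(19, 27, -1, 6, 7, 13, -4, 3)
n <- length(x)

r <- rank(abs(x))            # R|X_i|: ranks of the absolute values
W <- sum(sign(x) * r)        # signed-rank statistic W
w_plus <- sum(r[x > 0])      # W^+: sum of ranks of the positive observations

# Identity W^+ = W / 2 + n (n + 1) / 4.
stopifnot(w_plus == W / 2 + n * (n + 1) / 4)

# One-sided p-value P(W^+ >= w^+) under H_0.
p_w <- 1 - psignrank(w_plus - 1, n)
c(W = W, W_plus = w_plus, p_value = p_w)
```

Here $W = 28$ and $W^+ = 32$, which matches the `V = 32` printed by `wilcox.test()` in section 5, and the p-value equals $7/256 \approx 0.02734$.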

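4. Each of the three tests (the sign test, the t-test, and the Wilcoxon test) has an associated point estimate and confidence interval for $\theta$.

4.1. The estimate of $\theta$ associated with the sign test is the sample median,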

$\hat{\theta}=med\{X_1,X_2,\ldots,X_n\}.$

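For $0<\alpha<1$, the corresponding confidence interval for $\theta$ with confidence level $(1-\alpha)100\%$ is $\left(X_{(c_1+1)},X_{(n-c_1)}\right)$, where $X_{(i)}$ is the $i$-th order statistic of the sample and $c_1$ is the $\alpha/2$ quantile of the binomial distribution with $n$ trials and $p=1/2$. This interval is distribution-free: its confidence level does not depend on the distribution of the errors $e_i$. Because the binomial distribution is discrete, however, only a limited set of values of $\alpha$ is attainable for a given $n$.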
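The discreteness issue is easiest to see by tabulating the exact confidence level of $\left(X_{(c_1+1)},X_{(n-c_1)}\right)$ for a few values of $c_1$, using the fact that this level equals $1-2F_B(c_1;n;0.5)$. The sketch below assumes the sample is the vector of differences from the example in section 5; the candidate cut-offs `0:2` and all variable names are illustrative choices.

```r
# Differences Store_A - Store_B from the example in section 5.
x <- c(19, 27, -1, 6, 7, 13, -4, 3)
n <- length(x)
xs <- sort(x)                      # order statistics X_(1) <= ... <= X_(n)

theta_hat <- median(x)             # estimate associated with the sign test

c1 <- 0:2                          # candidate cut-offs
conf <- 1 - 2 * pbinom(c1, n, 0.5) # exact confidence of (X_(c1+1), X_(n-c1))

ci_sign <- data.frame(c1 = c1,
                      lower = xs[c1 + 1],
                      upper = xs[n - c1],
                      conf = conf)
ci_sign
```

For $n = 8$ the attainable levels are about 99.2%, 93.0% and 71.1%, so an interval with exactly 95% confidence does not exist; one has to choose between $(-4, 27)$ and $(-1, 19)$.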

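4.2. The estimate of $\theta$ associated with the t-test is the sample mean $\bar{X}$. The usual confidence interval is $\bar{X}\pm t_{\alpha/2,n-1}\cdot[s/\sqrt{n}]$, where $t_{\alpha/2,n-1}$ is the upper $\alpha/2$ critical value of the Student t-distribution with $n-1$ degrees of freedom. Its confidence level is exact only when the errors $e_i$ are normally distributed.

4.3. The estimate of $\theta$ associated with the signed-rank Wilcoxon test is the Hodges-Lehmann estimate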

$\hat{\theta}_W=med_{i\leq j}\left\{\frac{X_i+X_j}{2}\right\}.$

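The pairwise averages $A_{ij}=(X_i+X_j)/2$, $i\leq j$, are called the Walsh averages of the sample. Let $A_{(1)}<\cdots<A_{(n(n+1)/2)}$ denote the ordered Walsh averages. A $(1-\alpha)100\%$ confidence interval for $\theta$ is $\left(A_{(c_2+1)}, A_{(n(n+1)/2-c_2)}\right)$, where $c_2$ is the $\alpha/2$ quantile of the signed-rank Wilcoxon distribution. This interval is distribution-free provided the density of the errors $e_i$ is symmetric. Note that the range of $W^+$ is the set $\left\{0,1,\ldots,n(n+1)/2\right\}$, whose size is of order $n^2$. Consequently, for moderate sample sizes the Wilcoxon procedure suffers far less from discreteness than the sign test, and the attained level is close to the nominal $\alpha$.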
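The Walsh averages, the Hodges-Lehmann estimate and the interval can be computed directly. The sketch below assumes the sample is the vector of differences from the example in section 5. Because the distribution of $W^+$ is discrete, $c_2$ is taken here as the largest integer with $P_{H_0}(W^+\leq c_2)\leq\alpha/2$; this conservative convention is an assumption of the sketch, as are all variable names.

```r
# Differences Store_A - Store_B from the example in section 5.
x <- c(19, 27, -1, 6, 7, 13, -4, 3)
n <- length(x)
N <- n * (n + 1) / 2                     # number of Walsh averages

# Walsh averages A_ij = (X_i + X_j) / 2 for i <= j, sorted.
pair_means <- outer(x, x, "+") / 2
walsh <- sort(pair_means[upper.tri(pair_means, diag = TRUE)])

theta_hl <- median(walsh)                # Hodges-Lehmann estimate

alpha <- 0.05
k <- 0:N
# Assumed convention: largest c2 with P(W^+ <= c2) <= alpha / 2.
c2 <- max(k[psignrank(k, n) <= alpha / 2])
ci_w <- c(walsh[c2 + 1], walsh[N - c2])  # (A_(c2+1), A_(N-c2))
conf_w <- 1 - 2 * psignrank(c2, n)       # exact confidence attained

# One-sided analogue: lower bound with P(W^+ <= c) <= alpha.
c_one <- max(k[psignrank(k, n) <= alpha])
lower_one_sided <- walsh[c_one + 1]
```

For these data the estimate is 7.75 and the one-sided lower bound is 1, the same values that `wilcox.test()` reports in section 5. The two-sided interval is $(-0.5, 19)$ with an attained confidence of about 96.1%, which is much closer to the nominal 95% than the sign test can get with $n = 8$.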

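5. Example. Paired observations are available for two stores, A and B. Do the values for A tend to be larger than those for B?

Consider the differences between A and B, and let $\theta$ denote their location parameter. In R, the Wilcoxon test and the t-test of $H_0: \theta = 0, H_a: \theta > 0$ are computed as follows.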

> Store_A <- c(82, 69, 73, 43, 58, 56, 76, 65)
> Store_B <- c(63, 42, 74, 37, 51, 43, 80, 62)
> response <- Store_A - Store_B

> wilcox.test(response, alternative = "greater", conf.int = TRUE)

	Wilcoxon signed rank exact test

data:  response
V = 32, p-value = 0.02734
alternative hypothesis: true location is greater than 0
95 percent confidence interval:
   1 Inf
sample estimates:
(pseudo)median 
          7.75 

> t.test(response, alternative = "greater")

	One Sample t-test

data:  response
t = 2.3791, df = 7, p-value = 0.02447
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
 1.781971      Inf
sample estimates:
mean of x 
     8.75

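The function wilcox.test() returns the statistic $W^+$ (reported as V), the p-value of the test, the Hodges-Lehmann estimate of $\theta$, and a $95\%$ confidence interval for $\theta$. The function t.test() returns the analogous results for the t-test.

Both p-values are below 0.05, so at this significance level both tests reject $H_0$ in favor of A.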

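In summary, the signed-rank Wilcoxon test is a practical compromise: its power is comparable to that of the t-test for normal data and exceeds that of the t-test for distributions with "heavier tails".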
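The power comparison between the t-test and the Wilcoxon test can be explored with a small Monte Carlo sketch. Everything in it is an illustrative assumption rather than part of the article: the helper `power_sim`, the shift `theta = 0.5`, the sample size, the number of replications, and the choice of a Student t-distribution with 3 degrees of freedom as the heavy-tailed case.

```r
set.seed(1)  # fixed seed so that repeated runs give the same estimates

# Estimated power of the one-sided t-test and signed-rank Wilcoxon test.
# rgen(n) must draw n errors from a density symmetric about zero.
power_sim <- function(rgen, theta = 0.5, n = 20, nsim = 500, alpha = 0.05) {
  reject <- replicate(nsim, {
    x <- theta + rgen(n)   # sample from the location model X_i = theta + e_i
    c(t_test   = t.test(x, alternative = "greater")$p.value < alpha,
      wilcoxon = wilcox.test(x, alternative = "greater")$p.value < alpha)
  })
  rowMeans(reject)         # share of rejections for each test
}

power_normal <- power_sim(rnorm)                       # normal errors
power_heavy  <- power_sim(function(n) rt(n, df = 3))   # heavier tails

rbind(normal = power_normal, heavy_tails = power_heavy)
```

The estimates are subject to simulation error, so small differences between the two tests should not be over-interpreted; increasing `nsim` gives more stable values.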
