The Wilcoxon test (Wilcoxon signed-rank test) checks whether the central tendencies of two dependent samples differ significantly from each other.
The Wilcoxon test is a non-parametric test and is therefore subject to considerably fewer assumptions than its parametric counterpart, the t-test for dependent samples. Whenever the assumptions of the t-test for dependent samples are no longer fulfilled, the Wilcoxon test is used instead.
Typical research questions for a Wilcoxon test are, for example:

- You want to find out whether your memory performance is better in the morning or in the evening.
- A V-belt producer has very high downtimes on its 5 production lines. You want to find out whether a system setting has an influence on the downtimes.
Since the Wilcoxon test is a nonparametric test, the data need not be normally distributed. However, to calculate a Wilcoxon test, the samples must be dependent. Dependent samples are present, for example, when data is obtained from repeated measurements or when so-called natural pairs are involved.
Furthermore, the distribution shape of the differences of the two dependent samples should be approximately symmetrical. If the data are not available in pairs, the Mann-Whitney U test is used instead of the Wilcoxon test.
The hypotheses of the Wilcoxon test are very similar to those of the dependent t-test. However, the Wilcoxon test checks whether there is a difference in the central tendency, whereas the t-test checks whether there is a difference in the mean. Thus, the hypotheses of the Wilcoxon test are:

- Null hypothesis: The central tendencies of the two dependent samples are equal.
- Alternative hypothesis: The central tendencies of the two dependent samples differ.
Now, of course, the question may arise: why not simply always use the Wilcoxon test instead of the t-test for dependent samples? Then there would be no need to test for normal distribution! The answer: parametric tests such as the t-test are usually more powerful. With a parametric test, a smaller difference or a smaller sample is usually sufficient to reject the null hypothesis. Both are, of course, very convenient. Therefore, whenever the assumptions are met, use the parametric test.
To calculate the Wilcoxon test for two dependent samples, the difference between the paired values is calculated first. The absolute values of these differences are then used to form the ranks; it is important to keep track of the original sign of each difference (an example with tied ranks follows below). In the last step, the rank sums of the positive and the negative differences, T⁺ and T⁻, are formed. The test statistic W is then the smaller of the two rank sums, W = min(T⁺, T⁻). In this example, the test statistic W is 8.
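The steps above can be sketched in a few lines of Python. The paired measurements below are made up for illustration and chosen so that no two absolute differences are equal (tied ranks are handled further below in the article):

```python
# Hypothetical paired measurements (made-up numbers, no tied ranks)
before = [11, 10, 13, 8, 15]
after  = [10, 12, 10, 12, 10]

# 1. Differences between the dependent values
diffs = [b - a for b, a in zip(before, after)]        # [1, -2, 3, -4, 5]

# 2. Rank the absolute differences (rank 1 = smallest), remembering the signs
ranks = {v: i + 1 for i, v in enumerate(sorted(abs(d) for d in diffs))}

# 3. Rank sums of the positive and the negative differences
t_plus  = sum(ranks[abs(d)] for d in diffs if d > 0)  # 1 + 3 + 5 = 9
t_minus = sum(ranks[abs(d)] for d in diffs if d < 0)  # 2 + 4 = 6

# 4. Test statistic W = smaller of the two rank sums
W = min(t_plus, t_minus)
print(t_plus, t_minus, W)   # 9 6 6
```

As a sanity check, T⁺ + T⁻ must always equal n(n + 1)/2, the sum of all ranks.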
If there is no difference between the two dependent samples, the expected value of W is E(W) = n(n + 1) / 4, where n is the number of pairs. In this example with n = 6 pairs, the expected value is 6 · 7 / 4 = 10.5. The calculated test statistic must now be tested for significance.
If the sample is sufficiently large, i.e. there are more than 25 pairs, W is approximately normally distributed. In that case, the z-value can be calculated as z = (W − n(n + 1)/4) / √(n(n + 1)(2n + 1)/24). With fewer than 25 pairs, the critical value is instead read from a table of critical W-values; in this example, the table would therefore be used.
The calculated z-value from the Wilcoxon test can now be checked for significance by comparing it with the critical value of the standard normal distribution (e.g. 1.96 for a two-tailed test at the 5% significance level).
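As a quick sketch, the expected value and the normal approximation can be computed as follows; the W and n passed in the larger-sample call are hypothetical values, not the article's example:

```python
import math

def wilcoxon_z(W, n):
    """Normal approximation of the Wilcoxon signed-rank statistic
    (appropriate for roughly more than 25 pairs)."""
    expected = n * (n + 1) / 4                       # E(W) under H0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)   # standard deviation of W
    return (W - expected) / sd

# The article's expected value of 10.5 corresponds to n = 6 pairs:
print(6 * 7 / 4)    # 10.5

# Hypothetical larger sample: W = 150 with n = 30 pairs
z = wilcoxon_z(150, 30)
print(round(z, 3))  # -1.697; |z| < 1.96, so not significant at the 5% level
```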
If several cases share the same rank, tied ranks (also called connected ranks) are present. In this case, the calculation of the rank sums and of the standard deviation of W changes. We will now go through both using an example.
In the example it can be seen that some absolute differences occur more than once, resulting in tied ranks.
To account for these tied ranks, the mean of the shared rank positions is assigned in each case. In the first case, this results in a "new" rank of 3 and in the second case in a "new" rank of 6.5. Now we can calculate the rank sums of the positive and negative ranks.
Because of the tied ranks visible in the table above, a correction term is calculated that is needed later for the standard deviation of W in the presence of tied ranks: for each group of tied ranks, t is the number of tied values, and the correction term is Σ(t³ − t) / 48.
Now all values are available to calculate the z-value taking the tied ranks into account.
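A sketch of the tie-corrected z-value: the term Σ(t³ − t)/48, with t the size of each group of tied ranks, is the standard adjustment to the variance of W. The W, n, and tie groups used in the call are hypothetical:

```python
import math

def wilcoxon_z_ties(W, n, tie_groups):
    """z-value with correction for tied ranks.
    tie_groups: number of values in each group of tied ranks."""
    expected = n * (n + 1) / 4
    correction = sum(t**3 - t for t in tie_groups) / 48   # tie correction
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24 - correction)
    return (W - expected) / sd

# Hypothetical: n = 20 pairs, W = 52, one triple tie and one double tie
z = wilcoxon_z_ties(52, 20, tie_groups=[3, 2])
print(round(z, 3))
```

Note that the correction only shrinks the standard deviation slightly, so it matters most when many ranks are tied.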
Again, note that you need about 20 or more cases to assume that the W values are normally distributed.
The effect size indicates how large the observed effect is compared to the random noise. There are several measures for calculating the effect size of the Wilcoxon test. A common method is to use r, defined as:

r = z / √n
Where z is the standardized test statistic value from the Wilcoxon test and n is the total number of observations (i.e., the sum of the sizes of both groups).
The value of r can range from -1 to 1, with values near 0 indicating that there is no effect and values near -1 or 1 indicating a strong effect. The sign of r indicates the direction of the effect.
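The effect size can be computed directly from the z-value; as a sketch, with hypothetical values for z and n:

```python
import math

def effect_size_r(z, n):
    """Effect size r = z / sqrt(n), with n the total number of observations."""
    return z / math.sqrt(n)

# Hypothetical: z = -2.1 from a test on 20 pairs (40 observations in total)
r = effect_size_r(-2.1, 40)
print(round(r, 3))   # -0.332 -> medium effect according to Cohen (1988)
```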
The following table can be used to interpret the effect size (effect size r according to Cohen, 1988):

| Effect size | Interpretation |
|---|---|
| \|r\| < 0.1 | no effect / very small effect |
| \|r\| = 0.1 | small effect |
| \|r\| = 0.3 | medium effect |
| \|r\| = 0.5 | large effect |
A Wilcoxon test can easily be calculated with DATAtab. Simply copy the table below or your own data into the Statistical Calculator and click on Hypothesis tests. Then click on the two variables and select Non-Parametric Test.
| Reaction time morning | Reaction time evening |
|---|---|
| 34 | 45 |
| 36 | 33 |
| 41 | 35 |
| 39 | 43 |
| 44 | 42 |
| 37 | 42 |
| 39 | 43 |
| 39 | 43 |
| 45 | 42 |
DATAtab then gives you the following result.
If you have more than two dependent samples, you can also easily calculate a Friedman test online. To do this, simply select more than two metric variables.
"Super simple written"
"It could not be simpler"
"So many helpful examples"