Overview
The Wilcoxon test is a nonparametric alternative to the ttest for comparing averages in one or two (dependent or independent) samples. There are different types of wilcoxon tests, depending on the type of sample: Onesample Wilcoxon signed rank test, Wilcoxon matchedpairs test for a dependent sample and Wilcoxon rank sum test (MannWhitney U test) for an independent sample. In this building block, you will learn how to perform these tests on R using wilcox.test()
.
Onesample Wilcoxon signedrank test
The onesample Wilcoxon signedrank test is used to assess whether the median of the sample is equal to a known standard or theoretical value. It is a nonparametric test equivalent to a onesample ttest and is appropriate for continuous and ordinal data. To perform this test on R, you can use the inbuilt function wilcox.test()
, in the following way:
where:
 x denotes a numeric vector containing your data values.
 mu is the theoretical mean/median value. The default is 0 but you can change it.
 alternative defines the alternative hypothesis. It can be â€śtwo.sidedâ€ť (default), â€śgreaterâ€ť or â€ślessâ€ť.
 exact determines whether to compute an exact pvalue for small sample sizes. When exact = TRUE, R will compute an exact pvalue using the exact distribution of the test statistic rather than relying on an asymptotic approximation.
 correct is a boolean that indicates whether to apply continuity correction in the normal approximation for the pvalue. It is TRUE by default.
Suppose you recorded the heights of 100 trees in a dataset called tree_heights
to check if the median height is less than 85ft.
tree_heights < c(90.4, 93, 62.3, 72.9, 88.7, 81.2, 72.5, 67, 58, 66.8,
99.5, 95.6, 92.4, 62.7, 61.2, 103.7, 76.8, 78.2, 63.3, 84.2,
68.5, 66.5, 89.6, 69.1, 95.5, 59.7, 96.1, 76.4, 92.8, 88.1,
77.2, 86.4, 55, 65.9, 90.2, 65.8, 95.7, 70.4, 89.4, 101.6 )
Let’s first check if this data set follows a normal distribution via a histogram and the ShapiroWilk test.
shapiro.test(tree_heights)
#ShapiroWilk normality test
#data: tree_heights
#W = 0.94126, pvalue = 0.03814
The shape of the histogram shows that the data does not follow a normal distribution and the pvalue of the ShapiroWilk tests confirms it (since we reject the null hypothesis of normality for both distributions at the 5% significance level). Since the normality assumption is violated we cannot use parametric tests such as Studentâ€™s ttest and should instead utilise the Wilcoxon test.
This test can be performed on R using wilcox.test()
. Since you are interested in analysing if the median height is less than 85ft you should set mu = 85 and alternative = “less”.
The hypothesis for this test will be set as:
 H0: Median height = 85
 H1: Median height < 85
The output of the code gives a pvalue of 0.008 which is less than 0.01. Thus, you can reject the null hypothesis at a 1% significance level which is highly suggestive of the median height of the trees being less than 85ft.
If your sample size is large, an asymptotic approximation is often used because otherwise, the exact calculation might be computationally expensive. Thus, given the large sample size of the dataset in this example, we set exact = FALSE
.
It is important to note that in the case of large samples, normality is not required. By the central limit theorem, sample means of large samples are often wellapproximated by a normal distribution even if the data are not normally distributed. Therefore, we typically use the wilcoxon test for smaller samples where normality is unlikely to hold.
For the sake of this building block, in the following two examples, we will assume that normality does not hold even with a large sample.
Wilcoxon matchedpairs test
The Wilcoxon matchedpairs signedrank test (or simply the Wilcoxon matchedpairs test) is a nonparametric method to compare beforeafter or matched subjects. For performing this test, wilcox.test()
has the following syntax:
where:
 y is another (optional) numeric vector with data values.
 paired is a logical value specifying whether you want a paired test. It is FALSE by default, so you need to set it to TRUE for performing the Wilcoxon matchedpairs test.
To understand how to use this test, let us look at the dataset mice2
from the package datarium
. It contains the weights of ten mice before and after a treatment. Suppose you want to assess if there is a significant difference between the weights of the mice before and after the treatment (paired samples) you can use the Wilcoxon matchedpairs test. The hypothesis will be set as:
 H0: Weights before = Weights after
 H1: Weights before â‰ Weights after
library(datarium)
wilcox.test(mice2$before, mice2$after,
paired = TRUE,
alternative = â€śtwo.sidedâ€ť,
exact = TRUE)
The output of this test gives a pvalue equal to 0.002 which is less than 0.01. This suggests that at a 1% significance level, the difference in the weights of the mice before and after the treatment is highly significant.
If your dataset does not have the ‘Before’ and ‘After’ values in separate columns like so:
id  before  after 

1  187.2  429.5 
but instead in the form of, say
id  group  weight 

1  before  187.2 
1  after  429.5 
you can specify your data values as
wilcox.test(weight ~ group)
For an independent sample, it is sufficient to check for normality individually on the two samples. However, for paired samples (when in the presence of a small sample), normality must be checked on the differences between the two paired samples.
Wilcoxon rank sum test
The Wilcoxon rank sum test is a nonparametric test to compare two unmatched (independent) groups. It is also known as the MannWhitney U test. The syntax of wilcox.test()
for performing this test is the same as that for the matchedpairs test.
There is no need to explicitly specify the exact argument for this test. That argument is more relevant when you are dealing with paired samples. For the MannWhitney U test (which is essentially a Wilcoxon ranksum test for independent samples), the default behaviour is to use an asymptotic approximation.
For this let us look at another dataset from the datarium
package, called genderweight
. It contains the weights by sex (M for male; F for female). The genders are specified under the column ‘group’. Suppose you want to answer the question of whether the average womanâ€™s weight significantly differs from the average manâ€™s weight you can use the MannWhitney U test. The hypothesis will be set as:
 H0: Female weight = Male weight
 H1: Female weight â‰ Male weight
The output of this test gives a pvalue less than 0.0001 so you can reject the null hypothesis. This suggests that the difference in the average weights of the two genders is significant.
It is not always ideal to default to nonparametric tests to avoid normality checks. The reason is that nonparametric tests are usually less powerful than corresponding parametric tests when the normality assumption holds. This means that ceteris paribus, with a nonparametric test you are less likely to reject the null hypothesis when it is false if the data follows a normal distribution. It is thus preferred to use the parametric version of a statistical test when the assumptions are met.

The Wilcoxon test family provides versatile options for comparing samples without making strong assumptions about the underlying distribution.
 Onesample Wilcoxon SignedRank test is used when you have one sample and want to test if its distribution is symmetrically centred around a hypothesized median.
 Wilcoxon matchedpairs test is used when you want to compare averages of two paired (dependent) samples.
 Wilcoxon rank sum test is used when you have two independent samples.

These tests can be performed on R using its inbuilt test
wilcox.test()