How to Perform the Tukey HSD Test in R for Parametric Data

Learn how to effectively use the Tukey test in R and Tukey HSD in RStudio to compare multiple groups.

Introduction

The Tukey test is a statistical method used to compare multiple groups and determine whether there are significant differences between them. This article provides a technical overview of the Tukey test and its implementation in the R programming language, specifically in RStudio, with a focus on the Tukey HSD (honest significant difference) procedure. By understanding and applying these techniques, researchers and data analysts can gain valuable insights from their data.

Tukey test, Tukey test in R, Tukey HSD in RStudio.

What is the Tukey Test?

The Tukey test, also known as Tukey's range test or Tukey's honest significant difference test (Tukey HSD), is a statistical procedure used to compare the means of multiple groups. It is particularly useful when an experiment or dataset involves more than two groups, and it is typically run after a one-way ANOVA. The Tukey test examines all possible pairwise comparisons of means and identifies which pairs differ significantly, while controlling the family-wise error rate across those comparisons.
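
As a rough illustration of what "all possible pairwise comparisons" means, the sketch below lists the pairs for three groups. The group labels are illustrative assumptions (borrowed from the PlantGrowth example used later in this article); with k groups there are k(k - 1)/2 pairs.

```r
# Hypothetical group labels, used only for illustration
groups <- c("ctrl", "trt1", "trt2")

# Number of pairwise comparisons: k * (k - 1) / 2
n_pairs <- choose(length(groups), 2)

# Enumerate each pair; every column of the result is one comparison
pair_matrix <- combn(groups, 2)

n_pairs
pair_matrix
```

The number of pairs grows quickly with the number of groups, which is why a procedure that adjusts for multiple comparisons is preferred over running separate t-tests.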

Implementation of the Tukey Test in R

To utilize the Tukey test in R, it is necessary to have R and RStudio installed on your computer. The following steps outline the implementation process:

Installing R and RStudio

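Begin by visiting the R website and downloading the latest version of R compatible with your operating system. Install R by following the provided instructions. Then download and install RStudio, which is distributed by Posit, so that you have an editor and console for running the code in this article.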

Loading the Required Packages

In R, packages extend the software's functionality by providing additional functions and datasets. The TukeyHSD() function is part of base R, while HSD.test() comes from the agricolae package, which must be installed and loaded. Open RStudio and execute the following code:

install.packages("agricolae")
library(agricolae)

Conducting the Tukey Test

With the required packages loaded, proceed to perform the Tukey test. This example uses R's built-in PlantGrowth dataset, which records plant weight for three groups (ctrl, trt1, and trt2). Execute the following code to fit a one-way ANOVA and run the Tukey test with both TukeyHSD() and agricolae's HSD.test():

# One-way ANOVA: weight by treatment group
anova <- aov(weight ~ group, data = PlantGrowth)
summary(anova)

# Tukey HSD with base R
THSD <- TukeyHSD(anova)
THSD

# Tukey HSD with agricolae
THSDG <- HSD.test(anova, trt = "group")
THSDG

Visualizing the Tukey Test

plot(THSD , las=1 , col="brown")
plot(THSDG, las = 2)

Interpreting the Results of the Tukey Test

Upon executing the Tukey test, the output reports the mean difference for each pair of groups, a confidence interval for that difference, and an adjusted p-value. The p-value is crucial for determining statistical significance: an adjusted p-value below the chosen significance level (typically 0.05) indicates a significant difference between that pair of means.

The results shown here are obtained through a statistical test known as Tukey multiple comparisons of means. The test is utilized to compare the weights of plants across different groups. Specifically, in this analysis, we are examining three distinct groups: control, trt1, and trt2.

The outcome of this test describes how the average weights of these groups differ. Each row in the results corresponds to one comparison between two groups. Let's delve into the details of one such row:

  • "trt1-ctrl": This label signifies that we are comparing the group labeled "trt1" with the group labeled "ctrl" or control.

  • "diff": The "diff" value represents the weight difference between trt1 and ctrl, measured at -0.371. In other words, on average, the plants in the trt1 group weigh approximately 0.371 units less than those in the control group.

  • "lwr" and "upr": These values are the lower and upper limits of the confidence interval (95% by default), which gives a range within which the actual weight difference is likely to fall. In the trt1-ctrl comparison, the lower limit is -1.0622161 and the upper limit is 0.3202161. Because this interval contains zero, a true difference of zero cannot be ruled out.

  • "p adj": The "p adj" value denotes the adjusted p-value, which assists in determining whether the disparity in weights between the two groups is statistically significant. When the p-value is below a predetermined threshold, typically 0.05, we can assert that the observed difference holds statistical significance. 


In this case, the adjusted p-value is 0.3908711, which is greater than 0.05. As a result, the difference in weight between the trt1 and ctrl groups is not statistically significant.

The remaining rows present the same information for the other comparisons (trt2-ctrl and trt2-trt1). Only trt2-trt1 is statistically significant: its adjusted p-value is 0.012 and its confidence interval (0.174 to 1.556) excludes zero. When examining these results, consider the difference in weights, the associated confidence interval, and the adjusted p-value together.


Comparison   Difference   Lower Limit   Upper Limit   Adjusted p-value
trt1-ctrl        -0.371        -1.062         0.320              0.391
trt2-ctrl         0.494        -0.197         1.185              0.198
trt2-trt1         0.865         0.174         1.556              0.012
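
To pull the significant rows out of this output programmatically, something like the following sketch could be used. It assumes the built-in PlantGrowth data used in this article and a 0.05 threshold; the object names fit, tukey_tab, and significant are illustrative.

```r
# Refit the one-way ANOVA on the built-in PlantGrowth data
fit <- aov(weight ~ group, data = PlantGrowth)

# TukeyHSD() returns a list with one matrix per factor; "group" is our factor
tukey_tab <- TukeyHSD(fit)$group

# Keep only comparisons whose adjusted p-value is below the assumed threshold
alpha <- 0.05
significant <- tukey_tab[tukey_tab[, "p adj"] < alpha, , drop = FALSE]
significant
```

Using drop = FALSE keeps the result a matrix even when only one comparison passes the threshold, so the row name identifying the pair is preserved.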

Tukey's Honest Significant Difference 

The HSD.test() function from agricolae, whose result we stored in THSDG, runs the same Tukey Honest Significant Difference test on the plant weights but reports the results in a different layout. Let's take a closer look at each section of its output.
In the "statistics" section, we find several summary statistics. The "MSerror" value of 0.3885959 is the mean square error from the ANOVA, which measures the variability of weights within the groups. The "Df" value of 27 is the error degrees of freedom (30 plants minus 3 groups).
The "Mean" value of 5.073 is the average weight across all the plants in our study. The "CV" value of 12.28809 is the coefficient of variation, expressed as a percentage, which measures how much the weights vary relative to the mean. This value suggests a moderate amount of variability. Lastly, the "MSD" value of 0.6912161 is the minimum significant difference: two group means must differ by more than this amount to be considered statistically significant.
Moving on to the "parameters" section, we learn more about the test settings. The test used is Tukey's test, the factor being compared is "group", and there are three treatments. The "StudentizedRange" value of 3.506426 is the critical value of the studentized range distribution for three groups and 27 degrees of freedom, and it is used to compute the MSD.
The "alpha" value of 0.05 is the significance level, the probability threshold used to decide whether differences are statistically significant.
The "means" section provides the average weight and additional statistics for each group. Each row corresponds to a different group, and the columns give the average weight ("weight"), standard deviation ("std"), sample size ("r"), minimum and maximum weights ("Min" and "Max"), and quartiles ("Q25", "Q50", "Q75"). These statistics describe the distribution of weights within each group.
Notably, the "comparison" section is empty. This does not mean that no comparisons were carried out: by default HSD.test() uses group = TRUE and summarizes the pairwise results as letter groups instead of a pairwise table.
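
The MSD and CV values can be reproduced by hand from the other statistics, which may help clarify how they relate. The sketch below assumes equal group sizes (10 plants per group, as in PlantGrowth) and uses the standard formulas MSD = q * sqrt(MSerror / r) and CV = 100 * sqrt(MSerror) / mean; the object names are illustrative.

```r
fit <- aov(weight ~ group, data = PlantGrowth)

mse    <- deviance(fit) / df.residual(fit)  # mean square error (MSerror)
df_err <- df.residual(fit)                  # error degrees of freedom (Df)
k      <- nlevels(PlantGrowth$group)        # number of groups (ntr)
r      <- 10                                # plants per group (balanced design)

# Critical value of the studentized range (StudentizedRange) at alpha = 0.05
q_crit <- qtukey(1 - 0.05, nmeans = k, df = df_err)

msd <- q_crit * sqrt(mse / r)                      # minimum significant difference
cv  <- 100 * sqrt(mse) / mean(PlantGrowth$weight)  # coefficient of variation (%)

round(c(MSerror = mse, q = q_crit, MSD = msd, CV = cv), 4)
```

Any pair of group means that differs by more than the MSD is flagged as significant, which is why only the trt2-trt1 difference of 0.865 clears the bar.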


Lastly, the "groups" section assigns letters to the treatments based on their mean weights: trt2 is labeled "a", ctrl is labeled "ab", and trt1 is labeled "b". Treatments that share a letter are not significantly different from each other. Here trt2 and trt1 share no letter, so they differ significantly, while ctrl shares a letter with both and does not differ significantly from either.
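
If a pairwise table is wanted from agricolae instead of the letter groups, HSD.test() accepts group = FALSE. The sketch below assumes the agricolae package is installed and skips the call otherwise; the object names fit and pairwise are illustrative.

```r
fit <- aov(weight ~ group, data = PlantGrowth)

# Only run when agricolae is available (assumption: installed as shown earlier)
if (requireNamespace("agricolae", quietly = TRUE)) {
  # group = FALSE fills the "comparison" element instead of "groups"
  pairwise <- agricolae::HSD.test(fit, trt = "group", group = FALSE)
  print(pairwise$comparison)
}
```

With three treatments the comparison table should contain one row per pair, mirroring the rows reported by TukeyHSD().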


Statistics:

MSerror    Df   Mean    CV         MSD
0.388596   27   5.073   12.28809   0.691216

Parameters:

Test    name.t   ntr   StudentizedRange   alpha
Tukey   group    3     3.506426           0.05

Means:

Group   weight   std        r    Min    Max    Q25      Q50     Q75
ctrl    5.032    0.583091   10   4.17   6.11   4.55     5.155   5.2925
trt1    4.661    0.793676   10   3.59   6.03   4.2075   4.55    4.87
trt2    5.526    0.442573   10   4.92   6.31   5.2675   5.435   5.735

Conclusion

The Tukey test is a powerful statistical tool for comparing means of multiple groups. This article has provided an informative and technical overview of the Tukey test, its implementation in R, and the Tukey HSD variant in RStudio. By applying these techniques, researchers and data analysts can make informed decisions and gain valuable insights from their data.

 
About the author

Zubair Goraya
Ph.D. Scholar | Certified Data Analyst | Blogger | Completed 5000+ data projects | Passionate about unravelling insights through data.
