How do I interpret the output of cor.test() in R?

cor.test() output contains: t (the test statistic), df (degrees of freedom = n-2), p-value (below 0.05 means significant at the 5% level), and the 95% confidence interval for the true correlation. The 'sample estimates: cor' line is the actual correlation coefficient. If the confidence interval does not cross zero, the correlation is reliably non-zero.

cor Function in R | Calculate Correlation Coefficients in R

The cor function in R computes correlation coefficients between numeric variables — either as a single value between two vectors or as a full correlation matrix across a data frame. Paired with cor.test(), it also delivers the p-value and confidence interval you need to report statistically sound results. This guide covers every practical use case: basic syntax, all three methods (Pearson, Spearman, Kendall), missing-data handling, significance testing, and visualization — with copy-ready code for each.

Task	Code	Output
Correlation between two vectors	`cor(mtcars$mpg, mtcars$hp)`	Single coefficient, −1 to 1
Full correlation matrix	`cor(mtcars)`	n × n matrix
Correlation test with p-value	`cor.test(mtcars$mpg, mtcars$hp)`	t, df, p-value, 95% CI
Spearman correlation	`cor(x, y, method = "spearman")`	Rank-based coefficient
Kendall correlation	`cor(x, y, method = "kendall")`	Concordance-based coefficient
Handle missing values (listwise)	`cor(df, use = "complete.obs")`	Matrix, complete rows only
Handle missing values (pairwise)	`cor(df, use = "pairwise.complete.obs")`	Matrix, max pairs used
Correlation matrix with p-values	`rcorr(as.matrix(df))` — Hmisc	Matrix + significance levels

Key Points

The cor function in R returns a value between −1 and 1. A result near +1 means a strong positive relationship; near −1 means a strong negative relationship; near 0 means no linear relationship.
Use cor.test() — not just cor() — whenever you need to report whether the correlation is statistically significant. It gives you a t-statistic, degrees of freedom, p-value, and a 95% confidence interval.
Choose the right method: Pearson for continuous, normally distributed data; Spearman for ranked or non-normal data; Kendall's tau for small samples or data with many tied ranks.
Always handle missing values explicitly. Leaving the use parameter at its default ("everything") returns NA if any value is missing. Use "complete.obs" or "pairwise.complete.obs" instead.
Visualize with corrplot or ggplot2 to spot patterns across many variables simultaneously. A correlation heatmap communicates structure that a raw matrix cannot.
Correlation is not causation. A coefficient of −0.78 between mpg and hp (as in mtcars) tells you they move together — it does not tell you that horsepower causes poor fuel economy.

What Is the cor Function in R?

The cor function in R is a base-R function that computes the correlation coefficient — a standardized measure of the linear relationship between two or more numeric variables. It accepts either two separate vectors or a full data frame, and returns either a single number or a square correlation matrix.

Aspect	Detail
Function name	`cor()`
Package	Base R — no installation needed
Output range	−1 to +1
Default method	Pearson
Companion function	`cor.test()` — adds p-value and CI

cor() Syntax

cor(x, y = NULL, use = "everything", method = c("pearson", "kendall", "spearman"))

x: a numeric vector, matrix, or data frame.
y: a second numeric vector, matrix, or data frame. Omit when x is a matrix or data frame (produces the full matrix).
use: how to handle missing values. Options: "everything", "all.obs", "complete.obs", "na.or.complete", "pairwise.complete.obs".
method: the correlation algorithm. Default: "pearson".

How to Interpret the Correlation Coefficient

Value range	Meaning
0.9 to 1.0 (or −0.9 to −1.0)	Very strong correlation
0.7 to 0.9 (or −0.7 to −0.9)	Strong correlation
0.5 to 0.7 (or −0.5 to −0.7)	Moderate correlation
0.3 to 0.5 (or −0.3 to −0.5)	Weak correlation
0.0 to 0.3 (or 0.0 to −0.3)	Negligible or no correlation
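
The bands above can be applied in code. The following is a minimal sketch, not part of base R: label_correlation() is a hypothetical helper name, and it assumes a value sitting exactly on a boundary (for example 0.7) belongs to the stronger band, since the table leaves that open.

```r
# Hypothetical helper: map |r| onto the rule-of-thumb bands from the table.
# Assumption: boundary values (0.3, 0.5, 0.7, 0.9) fall into the stronger band.
label_correlation <- function(r) {
  bands <- cut(abs(r),
               breaks = c(0, 0.3, 0.5, 0.7, 0.9, 1),
               labels = c("negligible", "weak", "moderate",
                          "strong", "very strong"),
               right = FALSE,          # intervals are [lower, upper)
               include.lowest = TRUE)  # so that |r| = 1 is still labelled
  as.character(bands)
}

label_correlation(cor(mtcars$mpg, mtcars$hp))
label_correlation(c(-0.95, 0.42, 0.05))
```

These labels are conventions rather than statistical results, so the cut-offs may need adjusting to the norms of your field.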

Basic Usage of the cor Function in R

Calculating Correlation Between Two Variables

Pass two numeric vectors to cor() to get a single correlation coefficient. The example below uses the built-in mtcars dataset to check the relationship between miles-per-gallon (mpg) and horsepower (hp).

data(mtcars)
cor(mtcars$mpg, mtcars$hp)

Correlation Between Two Variables using the cor function in R — output showing -0.7761684

The result is −0.776 — a strong negative correlation. As horsepower increases, fuel efficiency decreases. The value is between −1 and 1, where the sign gives direction and the magnitude gives strength.

Generating a Correlation Matrix

Pass a full data frame to cor() to produce a correlation matrix — a symmetric table showing the coefficient for every pair of numeric variables at once.

library(dplyr)
# Keep only numeric columns, then compute the correlation matrix
mtcars %>%
  select(where(is.numeric)) %>%
  cor()

Correlation Matrix generated by the cor function in R using mtcars dataset

The diagonal always shows 1.000 — every variable is perfectly correlated with itself. Off-diagonal values are the pairwise coefficients you analyze. This matrix is the fastest way to scan for strong relationships across many variables simultaneously.
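
Scanning the matrix by eye gets harder as the number of variables grows. One possible way to list the strongest pairs is sketched below, using only base R. The names cm, idx, and pairs_df are illustrative choices, not part of any API.

```r
cm <- cor(mtcars)

# Each pair appears twice in a symmetric matrix; the upper triangle
# keeps one copy of every pair and excludes the diagonal.
idx <- which(upper.tri(cm), arr.ind = TRUE)

pairs_df <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)

# Sort by absolute strength, strongest first
pairs_df <- pairs_df[order(-abs(pairs_df$r)), ]
head(pairs_df, 5)
```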

Correlation Test in R: Using cor.test()

cor() gives you the coefficient. cor.test() gives you the full statistical picture: the coefficient, a t-statistic, degrees of freedom, p-value, and a 95% confidence interval. Use it whenever you need to report whether a correlation is statistically significant.

cor.test() Syntax

cor.test(x, y,
         alternative = c("two.sided", "less", "greater"),
         method = c("pearson", "kendall", "spearman"),
         exact = NULL, conf.level = 0.95, continuity = FALSE, ...)

Running a Correlation Test in R — Example

cor.test(mtcars$mpg, mtcars$hp)

cor.test() output in R showing t-statistic, df, p-value, and 95 percent confidence interval

How to Read the cor.test() Output — Line by Line

Output line	What it means
`t = -6.7424`	The test statistic. Larger absolute values mean stronger evidence against no correlation.
`df = 30`	Degrees of freedom = n − 2. Here n = 32 cars, so df = 30.
`p-value = 1.788e-07`	Probability of observing a correlation at least this extreme if the true correlation is zero. Below 0.05 = significant at the 5% level.
`95 percent confidence interval: -0.8852686 -0.5860994`	The interval does not cross zero → the negative correlation is reliably non-zero.
`cor = -0.7761684`	The Pearson correlation coefficient. Same as `cor(mtcars$mpg, mtcars$hp)`.

The p-value of 1.79e−07 is far below 0.05, so the correlation is statistically significant. The confidence interval (−0.885 to −0.586) excludes zero, so a true correlation of zero is implausible given these data.
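
The values in the printed output are also stored in the object that cor.test() returns, which is useful when you report them in a document. The sketch below pulls them out by name and, as a sanity check, rebuilds the Pearson t statistic from r and df using t = r * sqrt(df) / sqrt(1 - r^2). The variable names are illustrative.

```r
res <- cor.test(mtcars$mpg, mtcars$hp)

r  <- unname(res$estimate)   # sample correlation
df <- unname(res$parameter)  # degrees of freedom, n - 2

res$p.value    # p-value as a plain number
res$conf.int   # lower and upper bounds of the 95% interval

# Rebuild the t statistic by hand and compare it with the reported one
t_manual <- r * sqrt(df) / sqrt(1 - r^2)
all.equal(t_manual, unname(res$statistic))
```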

Interpreting p-values and Confidence Intervals

p-value	Interpretation
p < 0.001	Very strong evidence of a real correlation
p < 0.05	Statistically significant at the standard threshold
p < 0.10	Marginally significant — interpret cautiously
p ≥ 0.10	No statistically significant correlation detected

Choosing the Right Correlation Method in R

The cor function in R supports three methods. Picking the wrong one will give you a coefficient that misrepresents the actual relationship in your data.

Situation	Method	Code
Continuous data, normally distributed, no major outliers	Pearson (default)	`cor(x, y)`
Ranked / ordinal data, non-normal, or outliers present	Spearman	`cor(x, y, method = "spearman")`
Small sample size or many tied ranks	Kendall's tau	`cor(x, y, method = "kendall")`

Pearson, Spearman, and Kendall — Side-by-Side

# Pearson (default) — linear relationship
cor(mtcars$mpg, mtcars$hp)

# Spearman — rank-based, robust to non-normality
cor(mtcars$mpg, mtcars$hp, method = "spearman")

# Kendall — concordance of ranks, best for small n or ties
cor(mtcars$mpg, mtcars$hp, method = "kendall")

Pearson, Spearman and Kendall correlation methods compared using cor() in R

All three methods return different values for the same data. Pearson measures the linear relationship; Spearman measures the monotonic relationship on ranks; Kendall measures concordance between ranked pairs. The underlying relationship between mpg and hp is strong enough that all three point in the same direction here — but that will not always be the case in your data.
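
A small synthetic example makes the difference concrete. The data below are invented for illustration: y always rises with x, but along a curve rather than a straight line, so the rank-based methods report a perfect relationship while Pearson does not.

```r
# Synthetic data: perfectly monotonic, but curved rather than linear
x <- 1:20
y <- exp(x / 2)

pearson  <- cor(x, y)                       # penalised by the curvature
spearman <- cor(x, y, method = "spearman")  # ranks agree perfectly
kendall  <- cor(x, y, method = "kendall")   # every pair is concordant

c(pearson = pearson, spearman = spearman, kendall = kendall)
```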

Running a Spearman or Kendall Correlation Test in R

# Spearman correlation test in R
cor.test(mtcars$mpg, mtcars$hp, method = "spearman")

# Kendall correlation test in R
cor.test(mtcars$mpg, mtcars$hp, method = "kendall")

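Note: cor.test() with method = "spearman" or "kendall" does not produce a confidence interval in base R (the printed output simply omits it, and the conf.int element of the result is NULL). With tied values, as in mtcars, R also warns that it cannot compute an exact p-value and falls back to an approximation; pass exact = FALSE to request the approximation directly. To get confidence intervals for Spearman, use the DescTools package with SpearmanRho(x, y, conf.level = 0.95).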
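
If you prefer to stay in base R, a percentile bootstrap is one common way to approximate an interval for Spearman's rho. This is a rough sketch under stated assumptions, not the method DescTools uses: the seed and the 2000 resamples are arbitrary choices, and other bootstrap variants may behave better in small samples.

```r
set.seed(123)  # arbitrary seed, only so the resampling is repeatable

res <- cor.test(mtcars$mpg, mtcars$hp, method = "spearman", exact = FALSE)
is.null(res$conf.int)  # base R stores no interval for Spearman

# Percentile bootstrap: resample rows with replacement and recompute rho
n <- nrow(mtcars)
boot_rho <- replicate(2000, {
  i <- sample.int(n, replace = TRUE)
  cor(mtcars$mpg[i], mtcars$hp[i], method = "spearman")
})

ci <- quantile(boot_rho, c(0.025, 0.975))
ci
```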

Handling Missing Data in cor() and cor.test()

The default use = "everything" returns NA for any pair that includes a missing value. This silently breaks your correlation matrix. Always set the use parameter explicitly.

use parameter	Behaviour	When to use it
`"everything"`	Returns NA if any value is missing	Only when you are certain data is complete
`"complete.obs"`	Listwise deletion — drops entire row if any value is NA	When missing data is rare and random
`"pairwise.complete.obs"`	Uses all available pairs for each correlation separately	When missing data is common — preserves more observations

# Listwise deletion — only rows with complete data across ALL variables
cor(mtcars, use = "complete.obs")

# Pairwise deletion — maximum data used per pair
cor(mtcars, use = "pairwise.complete.obs")

Handling missing data in cor() using complete.obs in R

For many research datasets, "pairwise.complete.obs" is a reasonable choice because it does not discard entire rows for unrelated missing values. Be aware that each coefficient may then rest on a different subset of rows, and the resulting matrix is not guaranteed to be positive semi-definite. Use "complete.obs" when you need every correlation in the matrix to be computed on the same set of observations, as some downstream analyses such as PCA require.
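
Because mtcars has no missing values, the three settings give identical results on it. The sketch below works on a copy with a few values deliberately blanked out so the difference becomes visible. The data frame df_na and the positions of the NAs are made up for illustration.

```r
# Copy three columns and inject missing values (positions chosen arbitrarily)
df_na <- mtcars[, c("mpg", "hp", "wt")]
df_na$hp[c(2, 5)] <- NA
df_na$wt[9] <- NA

m_default  <- cor(df_na)                                # NA wherever hp or wt is involved
m_listwise <- cor(df_na, use = "complete.obs")          # only rows complete in all 3 columns
m_pairwise <- cor(df_na, use = "pairwise.complete.obs") # each pair uses its own complete rows

sum(complete.cases(df_na))        # rows kept by listwise deletion
m_default
m_listwise["mpg", "wt"]
m_pairwise["mpg", "wt"]           # uses one more row pattern than listwise
```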

Advanced: Correlation Matrix with p-values Using Hmisc

Base R's cor() does not attach significance markers to a matrix. The Hmisc package's rcorr() function solves this — it returns both the coefficient matrix and a matrix of p-values simultaneously.

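# Install Hmisc on first use, then load it
if (!requireNamespace("Hmisc", quietly = TRUE)) {
  install.packages("Hmisc")
}
library(Hmisc)

# Format a correlation matrix with significance stars (** p < .01, * p < .05)
corstudiodatalab <- function(x) {
  x   <- as.matrix(x)
  res <- rcorr(x)   # compute coefficients and p-values in a single call
  R   <- res$r
  p   <- res$P      # the diagonal of P is NA
  mystars <- ifelse(p < .01, "**|", ifelse(p < .05, "* |", "  |"))
  # Prepend a dummy negative column so format() pads every value to the
  # same width, then drop that column again
  R <- format(round(cbind(rep(-1.111, ncol(x)), R), 3))[, -1]
  Rnew <- matrix(paste(R, mystars, sep = ""), ncol = ncol(x))
  diag(Rnew) <- paste(diag(R), "  |", sep = "")  # no stars on the diagonal
  rownames(Rnew) <- colnames(x)
  colnames(Rnew) <- paste(colnames(x), "|", sep = "")
  as.data.frame(Rnew)
}

# Keep numeric columns only (all mtcars columns are numeric)
corstudiodatalab(mtcars[sapply(mtcars, is.numeric)])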

Correlation matrix with significance stars using rcorr() from Hmisc package in R

The ** marker means p < 0.01; * means p < 0.05. This output format is publication-ready and immediately tells you which correlations are worth interpreting versus which may be noise.

Visualization of Correlation Matrices in R

A raw correlation matrix with 10+ variables is difficult to scan. Visualizations let you identify clusters of related variables, spot sign changes, and communicate findings to non-statistical audiences.

Tool	Best for
corrplot	Quick, publication-quality correlation plots with minimal code
ggplot2 + reshape2	Fully customizable heatmaps integrated into a ggplot2 workflow

corrplot — Circle Method

library(corrplot)
corr_matrix <- cor(mtcars)
corrplot(corr_matrix, method = "circle")

ggplot2 Heatmap

library(ggplot2)
library(reshape2)

# Compute and reshape correlation matrix
corr_matrix <- cor(mtcars)
melted_corr <- melt(corr_matrix)

# Build heatmap
ggplot(data = melted_corr, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile(color = "white") +
  scale_fill_gradient2(low = "blue", high = "red", mid = "white",
                       midpoint = 0, limits = c(-1, 1), space = "Lab",
                       name = "Correlation") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1,
                                   size = 12, hjust = 1)) +
  labs(title = "Correlation Matrix Heatmap", x = "", y = "")

ggplot2 correlation heatmap in R using geom_tile and scale_fill_gradient2

Red tiles show strong positive correlations; blue tiles show strong negative correlations; white tiles indicate near-zero relationships. This layout makes it immediately obvious which variable pairs deserve further analysis.

Integrating Correlation Analysis in a Reproducible Workflow

A correlation analysis that cannot be reproduced is not publishable. Use R Markdown to embed your cor() and cor.test() calls inside a document that renders code, output, and narrative together. Use Shiny when your audience needs to explore the matrix interactively — for example, filtering by variable group or switching between Pearson and Spearman dynamically.

Best Practices Checklist

Set a random seed (set.seed()) before any sampling or imputation steps that precede correlation analysis.
Check for outliers with a scatterplot matrix (pairs(mtcars)) before choosing Pearson vs Spearman.
Check normality with shapiro.test() and a Q-Q plot (qqnorm()) for each variable; consider Spearman if a variable departs clearly from normality.
Apply Bonferroni correction when testing many pairs simultaneously to control the family-wise error rate, or the Benjamini-Hochberg procedure to control the false discovery rate.
Document your use parameter choice and justify it in your methods section.
Never report a correlation coefficient without its p-value or confidence interval.

Common Pitfalls and How to Avoid Them

Pitfall 1 — Correlation Does Not Mean Causation

A coefficient of −0.78 between mpg and hp tells you these variables move together consistently. It does not tell you that increasing horsepower causes lower fuel efficiency. There may be a confounding variable (e.g., vehicle weight) driving both. Always pair correlation analysis with domain knowledge and, when appropriate, regression modelling.
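
One way to probe the vehicle-weight explanation is a partial correlation: regress each variable on wt and correlate what is left over. The sketch below is an illustration of that idea under the assumption that weight acts linearly. It shows association after adjusting for weight, not causation.

```r
# Raw correlation between mpg and hp
r_raw <- cor(mtcars$mpg, mtcars$hp)

# Remove the linear effect of weight from each variable
res_mpg <- resid(lm(mpg ~ wt, data = mtcars))
res_hp  <- resid(lm(hp ~ wt, data = mtcars))

# Partial correlation of mpg and hp, controlling for wt
r_partial <- cor(res_mpg, res_hp)

c(raw = r_raw, partial_given_wt = r_partial)
```

In mtcars the partial correlation is weaker than the raw one, which is consistent with weight accounting for part of the mpg and hp relationship.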

Pitfall 2 — Using Pearson on Non-Normal Data

Pearson's coefficient measures linear association, and the p-value and confidence interval from cor.test() assume the two variables are approximately bivariate normal. Applying Pearson to ordinal survey responses, count data, or heavily skewed distributions can produce a coefficient that understates or overstates the true relationship. Check the distributions first; switch to Spearman when in doubt.

Pitfall 3 — Ignoring Outliers

A single extreme observation can move a Pearson coefficient by 0.2 or more in small samples. Plot your data before running cor(). If outliers are present and cannot be removed on scientific grounds, use Spearman — it is rank-based and therefore robust to extreme values.
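
The effect is easy to reproduce with made-up numbers. In the sketch below, ten points follow a clear upward trend and an eleventh, invented outlier is added. Pearson changes sign, while Spearman is pulled down but stays positive, so "robust" here means less sensitive, not immune.

```r
# Ten synthetic points with a clear upward trend
x <- 1:10
y <- c(2, 1, 4, 3, 6, 5, 8, 7, 10, 9)

# Add one extreme, invented observation
x_out <- c(x, 11)
y_out <- c(y, -40)

data.frame(
  method  = c("pearson", "spearman"),
  clean   = c(cor(x, y), cor(x, y, method = "spearman")),
  outlier = c(cor(x_out, y_out), cor(x_out, y_out, method = "spearman"))
)
```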

Pitfall 4 — Leaving use = "everything" (the default)

If your data frame has any missing values, the default setting returns NA for every affected pair without warning. Always set use explicitly. If you receive a matrix full of NAs, this is almost certainly the cause.

Pitfall 5 — Multiple Testing Without Correction

A 10-variable correlation matrix produces 45 unique pairs. At a 0.05 threshold, you expect about two significant results (45 × 0.05 = 2.25) purely by chance even if none of the variables are truly correlated. Apply the Bonferroni correction (p.adjust(p_values, method = "bonferroni")) or the Benjamini-Hochberg FDR procedure when testing many pairs.
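
As a sketch of how this looks in practice, the code below runs cor.test() on every pair of mtcars columns (11 variables, so 55 pairs) and compares how many pairs stay below 0.05 before and after adjustment. The object names are illustrative.

```r
# All unique pairs of column names (one pair per column of the result)
var_pairs <- combn(names(mtcars), 2)

# Raw Pearson p-value for each pair
p_raw <- apply(var_pairs, 2, function(v) {
  cor.test(mtcars[[v[1]]], mtcars[[v[2]]])$p.value
})

# Adjust for multiple testing
p_bonf <- p.adjust(p_raw, method = "bonferroni")  # family-wise error rate
p_bh   <- p.adjust(p_raw, method = "BH")          # false discovery rate

c(raw        = sum(p_raw  < 0.05),
  bonferroni = sum(p_bonf < 0.05),
  BH         = sum(p_bh   < 0.05))
```

Bonferroni is the more conservative of the two, so it never flags more pairs than Benjamini-Hochberg at the same threshold.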

Conclusion

The cor function in R and its companion cor.test() together cover the full workflow of correlation analysis: computing the coefficient, testing its significance, and building publication-ready matrices. Use Pearson for linear continuous data, Spearman for ranked or non-normal data, and Kendall's tau for small samples with ties. Always handle missing values explicitly with the use parameter, verify significance with cor.test(), and visualize results with corrplot or ggplot2 to communicate findings clearly. Embedding this workflow in R Markdown ensures it stays reproducible and shareable.

Frequently Asked Questions

What is the difference between cor() and cor.test() in R?

cor() returns only the correlation coefficient. cor.test() returns the coefficient plus a t-statistic, degrees of freedom, p-value, and 95% confidence interval. Use cor() for quick exploration; use cor.test() whenever you need to report whether the result is statistically significant.

How do I run a correlation test in R?

Use cor.test(x, y). Example: cor.test(mtcars$mpg, mtcars$hp). Add method = "spearman" or method = "kendall" for non-parametric variants. The default tests Pearson correlation.

What does a p-value in cor.test() mean?

The p-value is the probability of observing a correlation this extreme (or more extreme) if the true correlation were zero. A p-value below 0.05 means the correlation is statistically significant at the 5% level, so it is unlikely to be a sampling artefact.

When should I use Spearman instead of Pearson correlation in R?

Use Spearman when your data is ordinal or ranked, clearly non-normal, or contains outliers you cannot remove. Spearman measures the monotonic relationship on ranks rather than the raw values, making it more robust. Code: cor(x, y, method = "spearman") or cor.test(x, y, method = "spearman").

How do I handle missing values in cor() in R?

Set the use parameter explicitly. use = "complete.obs" drops any row with a missing value across all variables. use = "pairwise.complete.obs" uses all available pairs for each correlation separately — preserving more data. The default "everything" returns NA for any affected pair.

What is the difference between cov() and cor() in R?

cov() computes covariance — an unstandardized measure of how two variables change together. Its magnitude depends on the scale of the variables. cor() standardizes covariance to produce a value always between −1 and 1, making it scale-independent and directly comparable across different variable pairs.
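
The relationship can be checked directly: dividing the covariance by the product of the two standard deviations gives the Pearson coefficient. The short sketch below also rescales one variable by an arbitrary factor of 1000 to show that covariance changes with units while correlation does not.

```r
x <- mtcars$mpg
y <- mtcars$hp

# Correlation is covariance divided by the product of standard deviations
r_manual <- cov(x, y) / (sd(x) * sd(y))
all.equal(r_manual, cor(x, y))

# Rescaling y (an arbitrary change of units) changes cov() but not cor()
cov(x, y)
cov(x, y * 1000)
cor(x, y * 1000)
```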

How do I get a correlation between two columns in R?

Use cor(df$column1, df$column2). For significance testing: cor.test(df$column1, df$column2).

What is the Hmisc rcorr() function and when should I use it?

rcorr() from the Hmisc package computes an entire correlation matrix and its corresponding p-value matrix simultaneously. Use it when you need significance levels for every pair in a matrix — base R's cor() does not provide p-values for matrices, only cor.test() does, and only for one pair at a time.
