Pearson Correlation Coefficient ~ Guide & Examples

The Pearson correlation coefficient is the most common measure of linear correlation between two variables and is widely used in experimental research. This article covers the types of correlation it describes, how to calculate it, and how to test it for significance, with worked examples for clarity.

Index

1 Pearson Correlation Coefficient – In a Nutshell
2 Definition: Pearson correlation coefficient
3 Types of Pearson correlation coefficients
4 Visualizing the Pearson correlation coefficient
5 Calculating the Pearson correlation coefficient
6 Pearson correlation coefficient: Significance test
7 Pearson correlation coefficient in a thesis
8 FAQs

Pearson Correlation Coefficient – In a Nutshell

The Pearson correlation coefficient is a descriptive statistic that measures the strength and direction of the linear relationship between two quantitative variables.
In simpler terms, it summarizes one feature of a dataset: how closely two variables move together.
This article gives insight into the types of correlation the coefficient can indicate.
It also outlines the steps for calculating the Pearson correlation coefficient.

Definition: Pearson correlation coefficient

The Pearson correlation coefficient is a descriptive statistic that measures the strength and direction of the linear relationship between two quantitative variables. In simpler terms, it summarizes how closely two variables in a dataset move together. The Pearson correlation coefficient is also known as:

Bivariate correlation
The correlation coefficient
Pearson’s r
Pearson product-moment correlation coefficient (PPMCC)

Its formula is as follows:

$r= \frac{n \Sigma xy - \big( \Sigma x\big) \big( \Sigma y\big) }{ \sqrt{[n \Sigma x^{2}- \big( \Sigma x \big)^{2} ][n \Sigma y^{2}- \big( \Sigma y\big)^{2}]} }$
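The formula can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `pearson_r` and the toy data (`hours`, `score`) are assumptions made for this example and do not come from the article.

```python
import math

def pearson_r(x, y):
    """Pearson's r computed from the raw-sums formula above."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    # The denominator is zero (r is undefined) when x or y is constant.
    return numerator / denominator

# Hypothetical toy data: score tends to rise with hours, but not perfectly.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
print(round(pearson_r(hours, score), 2))  # 0.77
```

A perfectly linear pair such as `[1, 2, 3]` and `[2, 4, 6]` returns 1.0, and reversing one of the lists returns -1.0.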

Types of Pearson correlation coefficients

The Pearson correlation coefficient is a number between -1 and 1 that measures the strength and direction of the relationship between two variables. The table below summarizes the possible cases.

Pearson correlation coefficient (r)	Correlation type	Interpretation	Example
Between 0 and 1	Positive correlation	When one variable changes, the other tends to change in the same direction	Height and weight of a person: taller people tend to be heavier
0	No correlation	There is no linear relationship between the variables	Cost of shoes and width of cars: the price of shoes is unrelated to the width of cars, and vice versa
Between 0 and -1	Negative correlation	When one variable changes, the other tends to change in the opposite direction	Elevation and temperature: the higher you go, the lower the temperature

[Figures: positive correlation, negative correlation, no correlation]

How the effect size (the strength of the relationship) is interpreted varies by discipline. The following general rules of thumb are a common starting point.

Pearson correlation coefficient (r) value	Strength	Direction
Higher than .5	Strong	Positive
.3 to .5	Moderate	Positive
0 to .3	Weak	Positive
0	None	None
0 to -.3	Weak	Negative
-.3 to -.5	Moderate	Negative
Below -.5	Strong	Negative
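The rules of thumb in the table can be expressed as a small lookup function. This is only a sketch: the name `interpret_r` is hypothetical, and treating the boundary values .3 and .5 as "moderate" is an assumption, because the table does not say which side the boundaries belong to.

```python
def interpret_r(r):
    """Map r to (strength, direction) using the rule-of-thumb table above."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return ("None", "None")
    direction = "Positive" if r > 0 else "Negative"
    size = abs(r)
    if size > 0.5:
        strength = "Strong"
    elif size >= 0.3:  # assumption: .3 and .5 themselves count as moderate
        strength = "Moderate"
    else:
        strength = "Weak"
    return (strength, direction)

print(interpret_r(0.46))   # ('Moderate', 'Positive')
print(interpret_r(-0.72))  # ('Strong', 'Negative')
```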

In addition to serving as a descriptive statistic, the Pearson correlation coefficient is an inferential statistic, so it can also be used to test statistical hypotheses.

Visualizing the Pearson correlation coefficient

You can visualize Pearson’s r as a measure of how close the observations in experimental research are to a line of best fit. Also, it tells you whether the slope of the line of best fit is positive or negative.

Example

When r is 1 or -1, all observations fall exactly on the line of best fit.

Pearson correlation coefficient vs. Spearman’s rank correlation coefficients

Besides the Pearson correlation coefficient, another popular correlation coefficient is Spearman’s rank correlation coefficient.

It is a go-to method when at least one of the following characteristics is true:

The variables are ordinal
The variables are not distributed normally
The data features outliers
The variables have a non-linear but monotonic relationship
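The difference between the two coefficients can be illustrated with a short sketch. Spearman's coefficient is Pearson's r applied to the ranks of the values, so it reaches 1 for any strictly increasing relationship, even a curved one. The helper names (`ranks`, `spearman_rho`) and the cubic toy data are assumptions for illustration, and the simple ranking assumes there are no tied values.

```python
import math

def pearson_r(x, y):
    """Pearson's r from the raw-sums formula."""
    n = len(x)
    numerator = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
    denominator = math.sqrt(
        (n * sum(a * a for a in x) - sum(x) ** 2)
        * (n * sum(b * b for b in y) - sum(y) ** 2)
    )
    return numerator / denominator

def ranks(values):
    """Rank from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman's coefficient as Pearson's r on the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # monotonic, but clearly not linear

print(pearson_r(x, y) < 1)            # True: the curve lowers Pearson's r
print(round(spearman_rho(x, y), 3))   # 1.0
```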

Calculating the Pearson correlation coefficient

You can apply the formula by hand, or use software tools like R or Excel to calculate the Pearson correlation coefficient for you.

Example

You are researching the relationship between the weight and length of newborn babies and have data from 10 babies born within the last four weeks at a local clinic. After converting the imperial measurements to metric units, you enter the data in this table:

Weight (kg)	Length (cm)
3.33	52.9
3.63	53.2
3.02	49.7
3.82	48.4
3.59	54.9
3.42	54.2
2.87	43.7
3.36	54.4
3.03	47.2
3.46	45.2

Step 1: Calculating the sums of x and y

The first step is renaming the variables from weight and length to $x$ and $y$ . Next, add up all the $x$ and $y$ values as indicated in the formula by the symbol $\Sigma$ (take the sum of).

Example:

Weight = $x$

Length = $y$

Therefore:

$\Sigma x=3.33+3.63+3.02+3.82+3.59+3.42+2.87+3.36+3.03+3.46$
$\Sigma x=33.53$
$\Sigma y=52.9+53.2+49.7+48.4+54.9+54.2+43.7+54.4+47.2+45.2$
$\Sigma y=503.8$
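As a cross-check, the two sums can be computed directly from the data table. The variable names `weight` and `length` are chosen for this sketch.

```python
# Values copied from the data table above.
weight = [3.33, 3.63, 3.02, 3.82, 3.59, 3.42, 2.87, 3.36, 3.03, 3.46]  # x, in kg
length = [52.9, 53.2, 49.7, 48.4, 54.9, 54.2, 43.7, 54.4, 47.2, 45.2]  # y, in cm

sum_x = sum(weight)
sum_y = sum(length)

print(round(sum_x, 2))  # 33.53
print(round(sum_y, 1))  # 503.8
```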

Step 2: Calculating x² and y² and the respective sums

Next, create two new columns containing the squares of the values in $x$ and $y$ . Then, calculate the sums of the new columns.

Example

$X$	$Y$	$X^{2}$	$Y^{2}$
3.33	52.9	11.09	2798.4
3.63	53.2	13.18	2830.2
3.02	49.7	9.12	2470.1
3.82	48.4	14.59	2342.6
3.59	54.9	12.89	3014
3.42	54.2	11.7	2937.6
2.87	43.7	8.24	1909.7
3.36	54.4	11.29	2959.4
3.03	47.2	9.18	2227.8
3.46	45.2	11.97	2043

Calculations:

$\Sigma X^{2}=11.09+13.18+9.12+14.59+12.89+11.7+8.24+11.29+9.18+11.97$
$\Sigma X^{2}=113.25$
$\Sigma Y^{2}=2798.4+2830.2+2470.1+2342.6+3014+2937.6+1909.7+2959.4+2227.8+2043$
$\Sigma Y^{2}=25532.8$
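The sums of squares can be checked with the sketch below. It squares the unrounded values from the data table, so its totals can differ in the last digit from totals obtained by adding rounded table entries.

```python
# Values copied from the data table above.
weight = [3.33, 3.63, 3.02, 3.82, 3.59, 3.42, 2.87, 3.36, 3.03, 3.46]  # x, in kg
length = [52.9, 53.2, 49.7, 48.4, 54.9, 54.2, 43.7, 54.4, 47.2, 45.2]  # y, in cm

sum_x2 = sum(x * x for x in weight)
sum_y2 = sum(y * y for y in length)

print(round(sum_x2, 2))  # 113.24
print(round(sum_y2, 2))  # 25532.88
```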

Step 3: Calculating the cross product and its sum

Next, create a column with the products of x and y and name it the cross product. Then, calculate the sum of the new column.

Example

$X$	$Y$	$X^{2}$	$Y^{2}$	$XY(X \times Y)$
3.33	52.9	11.09	2798.4	176.16
3.63	53.2	13.18	2830.2	193.12
3.02	49.7	9.12	2470.1	150.1
3.82	48.4	14.59	2342.6	184.9
3.59	54.9	12.89	3014	197.1
3.42	54.2	11.7	2937.6	185.4
2.87	43.7	8.24	1909.7	125.4
3.36	54.4	11.29	2959.4	182.8
3.03	47.2	9.18	2227.8	143
3.46	45.2	11.97	2043	156.4

Calculations:

$\Sigma XY=176.16+193.12+150.1+184.9+197.1+185.4+125.4+182.8+143+156.4$
$\Sigma XY=1694.38$
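The cross products and their sum can be reproduced as follows. As before, the sketch multiplies the unrounded table values, so the total can differ slightly from a sum of rounded products.

```python
# Values copied from the data table above.
weight = [3.33, 3.63, 3.02, 3.82, 3.59, 3.42, 2.87, 3.36, 3.03, 3.46]  # x, in kg
length = [52.9, 53.2, 49.7, 48.4, 54.9, 54.2, 43.7, 54.4, 47.2, 45.2]  # y, in cm

cross_products = [x * y for x, y in zip(weight, length)]
sum_xy = sum(cross_products)

print(round(cross_products[0], 2))  # 176.16
print(round(sum_xy, 2))             # 1694.32
```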

Step 4: Calculating Pearson correlation coefficient r

Use the formula above and the figures for each section to calculate the Pearson correlation coefficient.

Example

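$n=10$
$\Sigma x=33.53$
$\Sigma y=503.8$
$\Sigma X^{2}=113.25$
$\Sigma Y^{2}=25532.8$
$\Sigma XY=1694.38$

Insert the results into the formula of r:

$r= \frac{10(1694.38)-(33.53)(503.8)}{ \sqrt{[10(113.25)-(33.53)^{2}][10(25532.8)-(503.8)^{2}]} }$
$r= \frac{51.39}{ \sqrt{(8.24)(1513.56)} }$
$r \approx 0.46$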
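The final substitution can be verified by feeding the five sums into the formula. The function name `r_from_sums` is an assumption for this sketch, and the sums are recomputed from the data table rather than typed in by hand.

```python
import math

def r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r from the sample size and the five sums of Steps 1 to 3."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Values copied from the data table above.
weight = [3.33, 3.63, 3.02, 3.82, 3.59, 3.42, 2.87, 3.36, 3.03, 3.46]  # x, in kg
length = [52.9, 53.2, 49.7, 48.4, 54.9, 54.2, 43.7, 54.4, 47.2, 45.2]  # y, in cm

r = r_from_sums(
    n=len(weight),
    sum_x=sum(weight),
    sum_y=sum(length),
    sum_x2=sum(x * x for x in weight),
    sum_y2=sum(y * y for y in length),
    sum_xy=sum(x * y for x, y in zip(weight, length)),
)
print(round(r, 2))  # 0.46
```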

Pearson correlation coefficient: Significance test

You can use the Pearson correlation coefficient to test if the relationship between two variables is significant.

The Pearson correlation coefficient of a sample, r, is an estimate of rho (ρ), the correlation in the population. Knowing r and the sample size n, you can test whether ρ is significantly different from 0.

Null hypothesis $(H_{0} ): \rho = 0$
Alternative hypothesis $(H_{1} ): \rho \neq 0$

You can use software such as R or Stata to test the hypothesis. Alternatively, you can follow these four steps:

Step 1: Calculating the t value

Calculate the t value with the following formula:

$t= \frac{r}{ \sqrt{ \frac{1-r^{2}}{n-2} } }$

Example

For the weight and length of the 10 babies in the table above, the Pearson correlation coefficient is about 0.46. So, $n=10$ and $r=0.46$

Therefore, using the formula above,

$t= \frac{0.46}{ \sqrt{ \frac{1-0.46^{2}}{10-2} } } \approx 1.465$
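The t formula translates directly into code. In this sketch the function name `t_statistic` is hypothetical, and the inputs r = 0.46 and n = 10 are illustrative values for a sample of ten pairs.

```python
import math

def t_statistic(r, n):
    """t value for testing H0: rho = 0, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| is 1")
    return r / math.sqrt((1 - r ** 2) / (n - 2))

print(round(t_statistic(0.46, 10), 3))  # 1.465
```

Because the formula divides by the square root of 1 - r², the sign of t always matches the sign of r, which is why the later comparison uses the absolute value.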

Step 2: Finding the critical value of t

You can look up the critical value of t ( $t^{*}$ ) in a t table. To do so, you need the following information:

The degrees of freedom (df), calculated using the formula $df=n-2$
Significance level α: Usually 0.05
One-tailed or two-tailed: Two-tailed is the right option for correlations

Example

For a two-tailed significance test at $\alpha=0.05$ and $df=8$ , the critical value is 2.306.

Step 3: Comparing the t value to the critical value

Then, determine if the absolute t value is greater than the critical value. Note that “absolute” implies that you should disregard the minus sign if the t value is negative.

Example

$t=1.465$

And critical value $(t^{*})=2.306$

Therefore: the t value is less than the critical value of $t$

Step 4: Deciding whether to reject the null hypothesis

If the absolute t value is larger than $t^{*}$ , the relationship is statistically significant (p < α). You can then reject the null hypothesis in favor of the alternative hypothesis.
If the absolute t value is less than $t^{*}$ , the relationship is not statistically significant (p > α). You cannot reject the null hypothesis, and the data do not support the alternative hypothesis.

Example

In the previous example, the t value for the correlation between newborns' weight and length is less than $t^{*}$ . Therefore, we do not reject the null hypothesis that the population correlation coefficient $\rho$ is 0.
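The four steps can be combined into one short sketch. The names `CRITICAL_T_05` and `correlation_test` are hypothetical. The dictionary holds two-tailed critical values at α = 0.05 as printed in standard t tables, and covers only a few degrees of freedom to keep the example small. The inputs r = 0.46 and n = 10 are illustrative.

```python
import math

# Two-tailed critical values of t at alpha = 0.05, keyed by degrees of freedom.
CRITICAL_T_05 = {5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}

def correlation_test(r, n, critical_values=CRITICAL_T_05):
    """Return (t, critical value, significant?) for H0: rho = 0."""
    df = n - 2
    t = r / math.sqrt((1 - r ** 2) / df)
    t_crit = critical_values[df]  # raises KeyError if df is not in the small table
    return t, t_crit, abs(t) > t_crit

t, t_crit, significant = correlation_test(0.46, 10)
print(round(t, 3), t_crit, significant)  # 1.465 2.306 False
```

With a much larger coefficient, for example r = 0.80 at the same sample size, the same call reports a significant result.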

Pearson correlation coefficient in a thesis

The Pearson correlation coefficient usually comes up in the results section of an academic paper or thesis. Apply the rules below if you want to report in APA style:

No need for a reference
Italicize r
Do not include a leading zero before the decimal point, because r cannot be greater than 1 or less than -1
Report the value to two decimal places
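The number-formatting rules can be sketched as a small helper. The function name `format_r_apa` is hypothetical, and the sketch assumes the APA convention of dropping the leading zero for statistics that cannot exceed 1 in absolute value. Plain text cannot show italics, so italicizing r is left to the word processor.

```python
def format_r_apa(r):
    """Format r with two decimal places and no leading zero, e.g. 'r = .46'."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    text = f"{abs(r):.2f}"          # two decimal places
    if text.startswith("0"):
        text = text[1:]             # drop the leading zero: '0.46' becomes '.46'
    sign = "-" if r < 0 and text != ".00" else ""
    return f"r = {sign}{text}"

print(format_r_apa(0.46))   # r = .46
print(format_r_apa(-0.5))   # r = -.50
```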

FAQs

How is the Pearson correlation coefficient (r) calculated?

It is calculated using the formula below:

$r= \frac{n \Sigma xy - ( \Sigma x)( \Sigma y)}{ \sqrt{[n \Sigma x^{2} - ( \Sigma x)^{2}][n \Sigma y^{2} - ( \Sigma y)^{2}]} }$
