# How to Calculate Pearson’s Correlation Coefficient

Pearson’s correlation coefficient, often represented by the symbol “r”, is a statistical measure of the strength and direction of the linear relationship between two continuous variables. It is widely used in domains such as finance, medicine, and the social sciences. The coefficient ranges from -1 to 1: values close to -1 indicate a strong negative linear correlation, values close to 1 indicate a strong positive linear correlation, and 0 indicates no linear correlation (the variables may still be related in a non-linear way).

In this article, we will outline the step-by-step process of calculating Pearson’s correlation coefficient for two given datasets.

**Step 1: Prepare Your Data**

Before starting the calculation, make sure you have two sets of continuous data of equal length, where each value in one set is paired with the value at the same position in the other. Continuous datasets consist of numeric values rather than categorical or ordinal data.

For example, let Data Set A = {x1, x2, x3,…, xn} and Data Set B = {y1, y2, y3,…, yn} be two sets with ‘n’ number of observations.
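The requirements above can be checked in code before any arithmetic is done. The following is a minimal sketch in Python; the sample values and the helper name `check_paired` are hypothetical and chosen only for illustration.

```python
# Hypothetical sample data, used for illustration only.
x = [1, 2, 3, 4, 5]   # Data Set A
y = [2, 4, 5, 4, 5]   # Data Set B


def check_paired(x, y):
    """Return n if x and y can be used as paired numeric samples, else raise ValueError."""
    if len(x) != len(y):
        raise ValueError("both datasets must have the same number of observations")
    if len(x) < 2:
        raise ValueError("at least two paired observations are needed")
    for value in list(x) + list(y):
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"non-numeric value: {value!r}")
    return len(x)


n = check_paired(x, y)
print(n)  # 5
```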

**Step 2: Calculate Mean Values**

Compute the mean (average) value of each dataset using the following formula:

Mean_X = (Σx) / n

Mean_Y = (Σy) / n

Where Σx represents the sum of all elements in Data Set A, Σy represents the sum of all elements in Data Set B, and ‘n’ denotes the number of elements in each dataset.
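As a quick illustration, the means can be computed with plain Python. The two lists are hypothetical sample data, not values from any real study.

```python
# Hypothetical sample data, used for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)            # number of paired observations
mean_x = sum(x) / n   # Mean_X = (Σx) / n
mean_y = sum(y) / n   # Mean_Y = (Σy) / n

print(mean_x, mean_y)  # 3.0 4.0
```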

**Step 3: Compute Deviations from the Mean**

Subtract the corresponding mean from each observation to get its deviation:

Deviation_Xi = xi - Mean_X

Deviation_Yi = yi - Mean_Y
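A short sketch of this step, again on hypothetical sample data. A useful sanity check is that the deviations of each dataset sum to zero (up to floating-point rounding).

```python
# Hypothetical sample data, used for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Deviation of every observation from the mean of its own dataset.
dev_x = [xi - mean_x for xi in x]
dev_y = [yi - mean_y for yi in y]

print(dev_x)  # [-2.0, -1.0, 0.0, 1.0, 2.0]
print(dev_y)  # [-2.0, 0.0, 1.0, 0.0, 1.0]
```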

**Step 4: Multiply Deviations**

Multiply the deviations derived in Step 3 for each pair of corresponding elements (xi and yi):

Product_Deviation_i = Deviation_Xi * Deviation_Yi

**Step 5: Calculate the Sum of Product Deviations**

Find the sum of all product deviations obtained in Step 4:

Σ(Product_Deviations) = Σ(Deviation_Xi * Deviation_Yi)
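Steps 4 and 5 can be sketched together: pair the deviations with `zip`, multiply them, and add up the results. The data is the same hypothetical sample as before, repeated here so the snippet runs on its own.

```python
# Hypothetical sample data, used for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Step 4: product of the two deviations for each pair (xi, yi).
products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]

# Step 5: sum of the product deviations.
sum_products = sum(products)

print(sum_products)  # 6.0
```

A positive sum means that the two variables tend to lie on the same side of their means at the same time; a negative sum means they tend to lie on opposite sides.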

**Step 6: Square and Sum the Deviations**

To compute the sum of squared deviations for each dataset, follow these steps:

a) Square each deviation value for both sets:

Squared_Deviation_Xi = (Deviation_Xi)^2

Squared_Deviation_Yi = (Deviation_Yi)^2

b) Calculate the sum of squared deviations:

Σ(Squared_Deviation_X) = Σ[(Deviation_Xi)^2]

Σ(Squared_Deviation_Y) = Σ[(Deviation_Yi)^2]
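A minimal sketch of this step on hypothetical sample data. Note that each deviation is squared first and the squares are then summed; squaring the sum of the deviations instead would always give zero.

```python
# Hypothetical sample data, used for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Square each deviation, then add the squares up.
sum_sq_x = sum((xi - mean_x) ** 2 for xi in x)
sum_sq_y = sum((yi - mean_y) ** 2 for yi in y)

print(sum_sq_x, sum_sq_y)  # 10.0 6.0
```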

**Step 7: Calculate Pearson’s Correlation Coefficient**

Finally, compute Pearson’s correlation coefficient using the formula:

r = Σ(Product_Deviations) / √[Σ(Squared_Deviation_X) * Σ(Squared_Deviation_Y)]
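Putting the seven steps together, one possible implementation is sketched below. The function name `pearson_r` and the sample data are hypothetical. Raising an error when either dataset is constant is a design choice of this sketch: in that case the denominator is zero and r is undefined.

```python
import math


def pearson_r(x, y):
    """Pearson's r for two equal-length sequences of numbers (illustrative sketch)."""
    if len(x) != len(y):
        raise ValueError("both datasets must have the same number of observations")
    n = len(x)
    if n < 2:
        raise ValueError("at least two paired observations are needed")

    # Step 2: means
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Step 3: deviations from the mean
    dev_x = [xi - mean_x for xi in x]
    dev_y = [yi - mean_y for yi in y]

    # Steps 4 and 5: sum of the products of paired deviations
    sum_products = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

    # Step 6: sums of squared deviations
    sum_sq_x = sum(dx ** 2 for dx in dev_x)
    sum_sq_y = sum(dy ** 2 for dy in dev_y)

    # Step 7: the coefficient itself
    denominator = math.sqrt(sum_sq_x * sum_sq_y)
    if denominator == 0:
        raise ValueError("r is undefined when a dataset has no variation")
    return sum_products / denominator


# Hypothetical sample data, used for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```

For this sample, r = 6 / √(10 * 6) ≈ 0.7746, which indicates a fairly strong positive linear relationship.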

**Conclusion:**

Now that you’ve calculated Pearson’s correlation coefficient, you can interpret the result to understand the strength and direction of the linear relationship between your two datasets. Keep in mind that correlation does not by itself imply causation; further research is needed to establish whether one variable actually influences the other. Even so, knowing how to calculate and interpret this coefficient can provide valuable insights in many fields of study.