Descriptive Statistics - Measures of Association
Updated: Jan 26
The price of light is less than the cost of darkness.
Measures of association help analyze the relationship between two variables. Below is a scatter diagram of sample x against sample y (the two samples in the diagram are linearly related).
Two widely used measures of association are covariance and the correlation coefficient.
1. Covariance
2. Correlation Coefficient
Covariance is a descriptive measure of the linear association between two variables. A positive covariance indicates a positive linear association between x and y: as the value of x increases, y tends to increase. A negative covariance indicates a negative linear association: as the value of x increases, y tends to decrease.
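To make the sign behaviour concrete, here is a minimal sketch of the sample covariance, assuming the usual definition with an n - 1 denominator. The function name `sample_covariance` and the data are made up for illustration.

```python
def sample_covariance(x, y):
    """Sample covariance: sum of (xi - mean_x) * (yi - mean_y), divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length, with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross / (n - 1)


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 5, 4, 6]   # tends to increase with x
y_falling = [6, 4, 5, 4, 2]  # tends to decrease with x

cov_rising = sample_covariance(x, y_rising)
cov_falling = sample_covariance(x, y_falling)
print(cov_rising, cov_falling)  # the first is positive, the second is negative
```

The sign tells you the direction of the linear association. The size of the number is harder to read, because it depends on the units of x and y.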
The value of covariance depends on the units of measurement. For example, suppose we are interested in the relationship between height x and weight y for individuals. Measuring height in inches gives much larger values for the deviations (xi − x̄) than measuring it in feet, so the covariance is larger too, even though the relationship itself is unchanged.
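The height example can be sketched in a few lines. The heights and weights below are invented, and `sample_covariance` is a helper written for this illustration. The only change between the two calls is the unit of height.

```python
def sample_covariance(x, y):
    """Sample covariance with an n - 1 denominator."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


INCHES_PER_FOOT = 12

# Invented data: the same four people, heights in feet and weights in pounds.
heights_ft = [5.0, 5.5, 6.0, 6.5]
weights_lb = [120, 150, 165, 200]

# Same heights expressed in inches.
heights_in = [h * INCHES_PER_FOOT for h in heights_ft]

cov_ft = sample_covariance(heights_ft, weights_lb)
cov_in = sample_covariance(heights_in, weights_lb)
ratio = cov_in / cov_ft
print(cov_ft, cov_in, ratio)  # the covariance in inches is 12 times the one in feet
```

Nothing about the people changed, yet the covariance grew by the conversion factor. That is why covariance on its own is hard to compare across data sets.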
A measure of the relationship between two variables that is not affected by the units of measurement for x and y is the correlation coefficient. It measures the strength of the linear association between the two variables and takes values between -1 and +1. Values near +1 indicate a strong positive linear relationship, values near -1 indicate a strong negative linear relationship, and values near zero indicate the lack of a linear relationship.
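Why the units drop out can be seen from a short derivation. This assumes the standard sample definitions. The symbols s_xy (sample covariance), s_x and s_y (sample standard deviations) and the scale factor a are notation introduced here, not taken from the post.

```latex
r_{xy} = \frac{s_{xy}}{s_x \, s_y},
\qquad
s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})

% Rescale x by a constant a > 0 (for example a = 12, feet to inches):
x_i' = a\,x_i
\;\Rightarrow\;
s_{x'y} = a\,s_{xy}, \quad s_{x'} = a\,s_x
\;\Rightarrow\;
r_{x'y} = \frac{a\,s_{xy}}{a\,s_x\,s_y} = r_{xy}
```

The factor a appears once in the numerator and once in the denominator, so it cancels. The same argument works for rescaling y.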
Pearson's Correlation Coefficient
Pearson's product moment correlation coefficient is a statistic that measures linear correlation between two variables X and Y. It has a value between +1 and −1. A value of +1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation.
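The three reference values can be checked on small made-up data sets. This is a minimal sketch, assuming the usual product-moment formula. The `pearson` helper is written for this illustration and does not guard against constant inputs, which would give a zero denominator.

```python
import math


def pearson(x, y):
    """Pearson's r: sum of cross-deviations over the root of the product of squared deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


# Made-up data for illustration only.
x = [-2, -1, 0, 1, 2]
y_line_up = [2 * v + 1 for v in x]   # exact increasing line
y_line_down = [-3 * v for v in x]    # exact decreasing line
y_parabola = [v * v for v in x]      # clear pattern, but not a linear one

r_up = pearson(x, y_line_up)
r_down = pearson(x, y_line_down)
r_parabola = pearson(x, y_parabola)
print(r_up, r_down, r_parabola)
```

The parabola case is worth a second look: y is completely determined by x, yet the coefficient is zero, because Pearson's coefficient only picks up the linear part of a relationship.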
Spearman's Correlation Coefficient
Spearman's rank correlation coefficient is a nonparametric measure of rank correlation. It assesses how well the relationship between two variables can be described using a monotonic function. While Pearson's correlation assesses linear relationships, Spearman's correlation assesses monotonic relationships (whether linear or not). If there are no repeated data values, a perfect Spearman correlation of +1 or −1 occurs when each of the variables is a perfect monotone function of the other.
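The difference between the two coefficients shows up on data that is monotonic but not linear. The sketch below assumes there are no repeated values and computes Spearman's coefficient as Pearson's coefficient applied to the ranks. The helper names `pearson` and `ranks` and the data are illustrative.

```python
import math


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    return numerator / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def ranks(values):
    """Rank each value from 1 (smallest) to n (largest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


# Made-up data: y = x cubed is strictly increasing but far from a straight line.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

pearson_r = pearson(x, y)
spearman_rho = pearson(ranks(x), ranks(y))
print(pearson_r, spearman_rho)  # Pearson is below 1, Spearman reaches 1
```

Because y always increases with x, the ranks match exactly and Spearman's coefficient is at its maximum, while Pearson's coefficient is pulled below 1 by the curvature.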
The Spearman correlation between two variables will be high when observations have a similar rank between the two variables, and low when observations have a dissimilar rank between the two variables. The following formula applies only if all ranks are distinct, with no duplicates: ρ = 1 − 6 Σ dᵢ² / (n(n² − 1)), where dᵢ is the difference between the two ranks of observation i and n is the number of observations.
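For distinct ranks, the usual shortcut is rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), where d_i is the difference between the two ranks of observation i. Here is a minimal sketch of it. The function names and data are made up, and the code refuses tied values because the shortcut does not hold for them.

```python
def ranks(values):
    """Rank each value from 1 (smallest) to n (largest); ties are rejected."""
    if len(set(values)) != len(values):
        raise ValueError("tied values: the distinct-rank formula does not apply")
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_distinct(x, y):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)), distinct ranks only."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length, with at least 2 points")
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Made-up data: the ranks of y mostly follow the ranks of x, with two swaps.
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]

rho = spearman_distinct(x, y)
print(rho)
```

When the data does contain ties, a common approach is to give tied values their average rank and compute Pearson's coefficient on those ranks instead.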
Please note that a correlation coefficient can be used to analyze the strength and direction of the relationship between variables, but it does not imply causation.
Hope this post was helpful! If you’re interested in reading more, you can subscribe and be notified when the next article is published.