ASSIGNMENT GURUJI: February 2017

1) What is the Spearman's Rank Correlation Coefficient?
Ans:- The Spearman's Rank Correlation Coefficient is used to discover the strength of a link between two sets of data. This example looks at the strength of the link between the price of a convenience item (a 50cl bottle of water) and distance from the Contemporary Art Museum in El Raval, Barcelona.


A correlation can easily be drawn as a scatter graph, but the most precise way to compare several pairs of data is to use a statistical test - this establishes whether the correlation is really significant or if it could have been the result of chance alone.

Spearman’s Rank correlation coefficient is a technique which can be used to summarise the strength and direction (negative or positive) of a relationship between two variables.

The result will always be between 1 and minus 1.

Method - calculating the coefficient

Create a table from your data.
Rank the two data sets. Ranking is achieved by giving the ranking '1' to the biggest number in a column, '2' to the second biggest value and so on. The smallest value in the column will get the lowest ranking. This should be done for both sets of measurements.
Tied scores are given the mean (average) rank. For example, the three tied scores of 1 euro in the example below are ranked fifth in order of price, but occupy three positions (fifth, sixth and seventh) in a ranking hierarchy of ten. The mean rank in this case is calculated as (5+6+7) ÷ 3 = 6.
Find the difference in the ranks (d): This is the difference between the ranks of the two values on each row of the table. The rank of the second value (price) is subtracted from the rank of the first (distance from the museum).
Square the differences (d²) to remove negative values, and then sum them (Σd²).
Calculate the coefficient (R) using the formula below. The answer will always be between 1.0 (a perfect positive correlation) and -1.0 (a perfect negative correlation).
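
The method steps above can be sketched in Python roughly as follows. This is only an illustrative sketch: the function names mean_ranks and spearman_r are made up for this example and do not come from any library. It follows the text in giving rank 1 to the biggest value, giving tied values the mean rank, and using R = 1 - 6Σd² / (n³ - n).

```python
def mean_ranks(values):
    """Rank values so the biggest gets rank 1; tied values share the mean rank."""
    # Indices of the values, ordered from biggest to smallest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Mean of the positions pos+1 .. end+1 occupied by the tied values.
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_r(first, second):
    """R = 1 - 6 * sum(d^2) / (n^3 - n), with d = rank(first) - rank(second)."""
    if len(first) != len(second):
        raise ValueError("both data sets need the same number of values")
    n = len(first)
    if n < 2:
        raise ValueError("at least two pairs are needed")
    rank_first = mean_ranks(first)
    rank_second = mean_ranks(second)
    # Square each rank difference and add the squares up.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_first, rank_second))
    return 1 - 6 * sum_d2 / (n ** 3 - n)
```

With ten values where three are tied in fifth, sixth and seventh place, mean_ranks gives each of the three the rank 6.0, matching the (5+6+7) ÷ 3 = 6 calculation above.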

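When written in mathematical notation the Spearman Rank formula looks like this:

R = 1 - (6Σd²) / (n³ - n)

Here Σd² is the sum of the squared rank differences and n is the number of pairs of measurements.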

Now to put all these values into the formula.

Find the sum of the d² values (Σd²) by adding up all the values in the Difference² column. In our example this is 285.5. Multiplying this by 6 gives 1713.
Now for the bottom line of the equation. The value n is the number of sites at which you took measurements. In our example this is 10. Substituting this value into n³ - n we get 1000 - 10 = 990.
We now have the formula: R = 1 - (1713/990) which gives a value for R:
1 - 1.73 = -0.73
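
As a quick check of the arithmetic, the same substitution can be done in a few lines of Python. The values 285.5 and 10 are the ones quoted in the text; the variable names are just for illustration.

```python
sum_d2 = 285.5  # sum of the Difference² column, as given in the text
n = 10          # number of sites, as given in the text

numerator = 6 * sum_d2    # top line of the equation: 1713.0
denominator = n ** 3 - n  # bottom line of the equation: 990
r = 1 - numerator / denominator

print(round(r, 2))  # -0.73
```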

What does this R value of -0.73 mean?

The closer R is to +1 or -1, the stronger the likely correlation. A perfect positive correlation is +1 and a perfect negative correlation is -1. The R value of -0.73 suggests a fairly strong negative relationship.

A further technique is now required to test the significance of the relationship.

The R value of -0.73 must be looked up on the Spearman Rank significance table below as follows:

Work out the 'degrees of freedom' you need to use. This is the number of pairs in your sample minus 2 (n-2). In the example it is 8 (10 - 2).
Now plot your result on the table.
If it is below the line marked 5%, then it is possible your result was the product of chance and you must reject the hypothesis.
If it is above the 0.1% significance level, then we can be 99.9% confident the correlation has not occurred by chance.
If it is above 1%, but below 0.1%, you can say you are 99% confident.
If it is above 5%, but below 1%, you can say you are 95% confident (i.e. statistically there is a 5% likelihood the result occurred by chance).
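
The look-up rules above can be sketched as a small Python function. This is a hedged sketch, not a replacement for the significance table: the names degrees_of_freedom and confidence_band are hypothetical, and the three critical values are not hard-coded here. They are assumed to be read off a published table by the user for the relevant degrees of freedom. Treating a result exactly equal to a critical value as significant is also an assumption of this sketch.

```python
def degrees_of_freedom(n_pairs):
    """Degrees of freedom as described in the text: number of pairs minus 2."""
    return n_pairs - 2


def confidence_band(r, crit_5, crit_1, crit_01):
    """Compare |r| with critical values read from a significance table.

    crit_5, crit_1 and crit_01 are the table values at the 5%, 1% and 0.1%
    levels for the relevant degrees of freedom; the caller must supply them.
    """
    if not (crit_5 <= crit_1 <= crit_01):
        raise ValueError("critical values should increase from 5% to 0.1%")
    strength = abs(r)  # the sign only gives the direction of the relationship
    if strength >= crit_01:
        return "99.9% confident"
    if strength >= crit_1:
        return "99% confident"
    if strength >= crit_5:
        return "95% confident"
    return "not significant"
```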

In the example, the value of 0.73 (ignoring the minus sign, which only shows the direction of the relationship) gives a significance level of slightly less than 5%. That means that the probability of the relationship you have found being a chance event is about 5 in 100. You are 95% certain that your hypothesis is correct. The reliability of your sample can be stated in terms of how many researchers completing the same study as yours would obtain the same results: 95 out of 100.

Friday, 17 February 2017