1.9 - Hypothesis Test for the Population Correlation Coefficient

There is one more point we haven't stressed yet in our discussion about the correlation coefficient r and the coefficient of determination \(R^{2}\): the two measures summarize the strength of a linear relationship in samples only. If we obtained a different sample, we would obtain different correlations, different \(R^{2}\) values, and therefore potentially different conclusions. As always, we want to draw conclusions about populations, not just samples. To do so, we either have to conduct a hypothesis test or calculate a confidence interval. In this section, we learn how to conduct a hypothesis test for the population correlation coefficient \(\rho\) (the Greek letter "rho").

In general, a researcher should use the hypothesis test for the population correlation \(\rho\) to learn of a linear association between two variables, when it isn't obvious which variable should be regarded as the response. Let's clarify this point with examples of two different research questions.

Consider evaluating whether or not a linear relationship exists between skin cancer mortality and latitude. We will see in Lesson 2 that we can perform either of the following tests:

For this example, it is fairly obvious that latitude should be treated as the predictor variable and skin cancer mortality as the response.

By contrast, suppose we want to evaluate whether or not a linear relationship exists between a husband's age and his wife's age (Husband and Wife data). In this case, one could treat the husband's age as the response:

[Figure: husband's age vs wife's age plot]

...or one could treat the wife's age as the response:

[Figure: wife's age vs husband's age plot]

In cases such as these, we answer our research question concerning the existence of a linear relationship by using the t-test for testing the population correlation coefficient \(H_{0}\colon \rho = 0\).

Let's jump right to it! We follow standard hypothesis test procedures in conducting a hypothesis test for the population correlation coefficient \(\rho\).

Steps for Hypothesis Testing for \(\boldsymbol{\rho}\)

Step 1: Hypotheses

First, we specify the null and alternative hypotheses:

Step 2: Test Statistic

Second, we calculate the value of the test statistic using the following formula:

Test statistic:  \(t^*=\dfrac{r\sqrt{n-2}}{\sqrt{1-R^2}}\) 

Step 3: P-Value

Third, we use the resulting test statistic to calculate the P-value. As always, the P-value is the answer to the question "how likely is it that we'd get a test statistic t* as extreme as we did if the null hypothesis were true?" The P-value is determined by referring to a t-distribution with n-2 degrees of freedom.

Step 4: Decision

Finally, we make a decision:

Example 1-5: Husband and Wife Data

Let's perform the hypothesis test on the husband's age and wife's age data in which the sample correlation based on n = 170 couples is r = 0.939. To test \(H_{0} \colon \rho = 0\) against the alternative \(H_{A} \colon \rho ≠ 0\), we obtain the following test statistic:

\begin{align} t^*&=\dfrac{r\sqrt{n-2}}{\sqrt{1-R^2}}\\ &=\dfrac{0.939\sqrt{170-2}}{\sqrt{1-0.939^2}}\\ &=35.39\end{align}
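The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch rather than part of the original lesson: the function name `correlation_t_statistic` is hypothetical, and it relies on the fact that in simple linear regression \(R^2 = r^2\).

```python
import math


def correlation_t_statistic(r: float, n: int) -> float:
    """Return t* = r * sqrt(n - 2) / sqrt(1 - r^2) for testing H0: rho = 0."""
    if n <= 2:
        raise ValueError("the test needs more than two observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Husband and Wife data: r = 0.939 from n = 170 couples
t_star = correlation_t_statistic(0.939, 170)
print(round(t_star, 2))  # 35.39
```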

To obtain the P-value, we need to compare the test statistic to a t-distribution with 168 degrees of freedom (since 170 - 2 = 168). In particular, we need to find the probability that we'd observe a test statistic more extreme than 35.39, and then, since we're conducting a two-sided test, multiply the probability by 2. Minitab helps us out here:

Student's t distribution with 168 DF

The output tells us that the probability of getting a test statistic smaller than 35.39 is greater than 0.999. Therefore, the probability of getting a test statistic greater than 35.39 is less than 0.001. We multiply by 2 and determine that the P-value is less than 0.002.

Since the P-value is small (smaller than 0.05, say), we can reject the null hypothesis. There is sufficient statistical evidence at the \(\alpha = 0.05\) level to conclude that there is a significant linear relationship between a husband's age and his wife's age.
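If Minitab is not at hand, the two-sided P-value can be approximated directly. The sketch below is an assumption-laden stand-in for the software lookup, not what Minitab does: it integrates the t density numerically with Simpson's rule using only the standard library, and the names `t_pdf` and `two_sided_p_value` are hypothetical.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_star: float, df: int, steps: int = 2000) -> float:
    """Approximate 2 * P(T > |t*|); steps must be even for Simpson's rule."""
    upper = abs(t_star)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3          # P(0 < T < |t*|)
    return max(0.0, 1.0 - 2.0 * area_zero_to_t)


p_value = two_sided_p_value(35.39, 168)
print(p_value < 0.002)  # True
```

The result agrees with the conclusion in the text that the P-value is below 0.002. A dedicated statistics library would normally be preferred for this lookup.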

Incidentally, we can let statistical software like Minitab do all of the dirty work for us. In doing so, Minitab reports:

Correlation: WAge, HAge

Pearson correlation of WAge and HAge = 0.939

P-Value = 0.000

Final Note

One final note: as always, we should clarify when it is okay to use the t-test for testing \(H_{0} \colon \rho = 0\). The guidelines are a straightforward extension of the "LINE" assumptions made for the simple linear regression model. It's okay:

Hypothesis Test for Correlation

Let's look at the hypothesis test for correlation, including the hypothesis test for correlation coefficient, the hypothesis test for negative correlation and the null hypothesis for correlation test.

What is the hypothesis test for correlation coefficient?

When given a sample of bivariate data (data which include two variables), it is possible to calculate how linearly correlated the data are, using a correlation coefficient.

The product moment correlation coefficient (PMCC) describes the extent to which one variable correlates with another, in other words, the strength of the correlation between two variables. The PMCC for a sample of data is denoted by r, while the PMCC for a population is denoted by ρ.

The PMCC is limited to values between -1 and 1 (inclusive).

If r = 1, there is a perfect positive linear correlation. All points lie on a straight line with a positive gradient, and the higher one of the variables is, the higher the other.

If r = 0, there is no linear correlation between the variables.

If r = -1, there is a perfect negative linear correlation. All points lie on a straight line with a negative gradient, and the higher one of the variables is, the lower the other.

Correlation is not equivalent to causation, but a PMCC close to 1 or -1 can indicate that there is a higher likelihood that two variables are related.

The PMCC can be calculated on a graphics calculator by finding the regression line of y on x, which reports r automatically, or by using the formula \(r = \dfrac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}\), which is in the formula booklet. The closer r is to 1 or -1, the stronger the correlation between the variables, and hence the more closely associated the variables are.

You need to be able to carry out hypothesis tests on a sample of bivariate data to determine whether a linear relationship can be established for the entire population. By calculating the PMCC and comparing it to a critical value, it is possible to decide whether there is significant evidence of a linear relationship.
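As a rough illustration of the booklet formula, the sketch below computes \(S_{xy}\), \(S_{xx}\) and \(S_{yy}\) from raw data and combines them into r. The function name `pmcc` and the data are made up for illustration; they are not taken from the article.

```python
import math


def pmcc(xs, ys):
    """Product moment correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data lying exactly on a line with positive gradient
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
print(pmcc(xs, ys))  # 1.0
```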

What is the hypothesis test for negative correlation?

To conduct a hypothesis test, a number of keywords must be understood:

Null hypothesis (\(H_0\)): the hypothesis assumed to be correct until proven otherwise

Alternative hypothesis (\(H_1\)): the conclusion made if \(H_0\) is rejected.

Hypothesis test: a mathematical procedure to examine a value of a population parameter proposed by the null hypothesis compared to the alternative hypothesis.

Test statistic: is calculated from the sample and tested in cumulative probability tables or with the normal distribution as the last part of the significance test.

Critical region: the range of values that lead to the rejection of the null hypothesis.

Significance level: the actual significance level is the probability of rejecting \(H_0\) when it is in fact true.

The null hypothesis is also known as the 'working hypothesis'. It is what we assume to be true for the purpose of the test, or until proven otherwise.

The alternative hypothesis is what is concluded if the null hypothesis is rejected. It also determines whether the test is one-tailed or two-tailed.

A one-tailed test allows for the possibility of an effect in one direction, while a two-tailed test allows for the possibility of an effect in two directions, in other words, both the positive and the negative directions.

Method: A series of steps must be followed to determine the existence of a linear relationship between 2 variables.

1. Write down the null and alternative hypotheses (\(H_0\) and \(H_1\)). The null hypothesis is always \(\rho = 0\), while the alternative hypothesis depends on what is asked in the question. Both hypotheses must be stated in symbols only (not in words).

2. Using a calculator, work out the value of the PMCC of the sample data, r.

3. Use the significance level and sample size to figure out the critical value. This can be found in the PMCC table in the formula booklet.

4. Take the absolute value of the PMCC, |r|, and compare it to the critical value. If the absolute value is greater than the critical value, the null hypothesis should be rejected. Otherwise, the null hypothesis should be accepted.

5. Write a full conclusion in the context of the question. The conclusion should be stated in full: both in statistical language and in words reflecting the context of the question.

A negative correlation signifies that higher values of one variable are associated with lower values of the other, whereas a positive correlation signifies that higher values of one variable are associated with higher values of the other.

How to interpret results based on the null hypothesis

From the observed results (test statistic), a decision must be made, determining whether to reject the null hypothesis or not.

[Figure: one-tailed and two-tailed tests at the 5% level of significance]

Both the one-tailed and two-tailed tests are shown at the 5% level of significance. However, the 5% is distributed in both the positive and negative side in the two-tailed test, and solely on the positive side in the one-tailed test.

From the null hypothesis, the result could lie anywhere on the graph. If the observed result lies in the shaded area, the test statistic is significant at 5%, in other words, we reject \(H_0\). Therefore, \(H_0\) could actually be true but still be rejected. Hence, the significance level, 5%, is the probability that \(H_0\) is rejected even though it is true, in other words, the probability that \(H_0\) is incorrectly rejected. When \(H_0\) is rejected, \(H_1\) (the alternative hypothesis) is used to write the conclusion.

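We can define the null and alternative hypotheses for one-tailed and two-tailed tests:

For a one-tailed test: \(H_0\colon \rho = 0\) and \(H_1\colon \rho > 0\), or \(H_0\colon \rho = 0\) and \(H_1\colon \rho < 0\).

For a two-tailed test: \(H_0\colon \rho = 0\) and \(H_1\colon \rho \neq 0\).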

Let us look at an example of testing for correlation.

12 students sat two biology tests: one was theoretical and the other was practical. The results are shown in the table.

a) Find the product moment correlation coefficient for this data, to 3 significant figures.

b) A teacher claims that students who do well in the theoretical test tend to do well in the practical test. Test this claim at the 0.05 level of significance, clearly stating your hypotheses.

a) Using a calculator, we find the PMCC (enter the data into two lists and calculate the regression line; the PMCC will appear). r = 0.935 (to 3 significant figures)

b) We are testing for a positive correlation, since the claim is that a higher score in the theoretical test is associated with a higher score in the practical test. We will now use the five steps we previously looked at.

1. State the null and alternative hypotheses. \(H_0\colon \rho = 0\) and \(H_1\colon \rho > 0\)

2. Calculate the PMCC. From part a), r = 0.935

3. Figure out the critical value from the sample size and significance level. The sample size, n, is 12. The significance level is 5%. The hypothesis is one-tailed since we are only testing for positive correlation. Using the table from the formula booklet, the critical value is shown to be cv = 0.4973

4. The absolute value of the PMCC is 0.935, which is larger than 0.4973. Since the PMCC is larger than the critical value at the 5% level of significance, we can reach a conclusion.

5. Since the PMCC is larger than the critical value, we choose to reject the null hypothesis. We can conclude that there is significant evidence to support the claim that students who do well in the theoretical biology test also tend to do well in the practical biology test.
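The comparison in steps 4 and 5 can be written as a small decision rule. This is a hedged sketch: `reject_null` is a hypothetical helper, the critical value must still be looked up in the table for the chosen tail and significance level, and the rule is a direction-aware variant of step 4 (for a one-tailed test it checks the sign of r instead of only its absolute value).

```python
def reject_null(r: float, critical_value: float, alternative: str) -> bool:
    """Decide whether to reject H0: rho = 0 by comparing r with a table value.

    alternative is 'greater' (H1: rho > 0), 'less' (H1: rho < 0)
    or 'two-sided' (H1: rho != 0).
    """
    if alternative == "greater":
        return r > critical_value
    if alternative == "less":
        return r < -critical_value
    if alternative == "two-sided":
        return abs(r) > critical_value
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Biology tests: r = 0.935, n = 12, one-tailed 5% critical value 0.4973
print(reject_null(0.935, 0.4973, "greater"))  # True
```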

Let us look at a second example.

A tetrahedral die (four faces) is rolled 40 times and 6 'ones' are observed. Is there any evidence at the 10% level that the probability of a score of 1 is less than a quarter?

The expected number of 'ones' is \(40 \times \frac{1}{4} = 10\). The question asks whether the observed result (test statistic 6) is unusually low.

We now follow the same series of steps. Here the parameter under test is a probability p rather than a correlation coefficient.

1. State the null and alternative hypotheses. \(H_0\colon p = 0.25\) and \(H_1\colon p < 0.25\)

2. We cannot calculate the PMCC since we are only given data for the frequency of 'ones'.

3. A one-tailed test is required (\(p < 0.25\)) at the 10% significance level. We model the number of 'ones', X, with a binomial distribution, so \(X \sim B(40, 0.25)\), and then use the cumulative binomial tables. The observed value is X = 6, and \(P(X \le 6 \text{ 'ones' in 40 rolls}) = 0.0962\).

4. Since 0.0962, or 9.62%, is less than 10%, the observed result lies in the critical region.

5. We reject \(H_0\) and accept the alternative hypothesis. We conclude that there is evidence to show that the probability of rolling a 'one' is less than \(\frac{1}{4}\).
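The cumulative binomial probability used in step 3 can be checked without tables. The following is a minimal sketch using only the standard library; `binomial_cdf` is a hypothetical helper name.

```python
import math


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ B(n, p) by summing the individual probabilities."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


# 6 'ones' observed in 40 rolls, testing against a probability of 0.25
p_value = binomial_cdf(6, 40, 0.25)
print(round(p_value, 4))  # 0.0962
print(p_value < 0.10)     # True, so the result lies in the 10% critical region
```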

Hypothesis Test for Correlation - Key takeaways

Images One-tailed test: https://en.wikipedia.org/w/index.php?curid=35569621

Frequently Asked Questions about Hypothesis Test for Correlation

Is the Pearson correlation a hypothesis test?

Not by itself. The Pearson correlation produces a PMCC value, or r value, which indicates the strength of the linear relationship between two variables. That value can then be used in a hypothesis test for correlation.

Can we test a hypothesis with correlation?

Yes. Correlation is not equivalent to causation; however, we can test hypotheses to determine whether a correlation (or association) exists between two variables.

How do you set up the hypothesis test for correlation?

You need a null hypothesis (ρ = 0) and an alternative hypothesis. The PMCC, or r value, must be calculated from the sample data. Based on the significance level and sample size, the critical value can be found from a table of values in the formula booklet. Finally, the r value and critical value can be compared to determine which hypothesis is accepted.

Final Hypothesis Test for Correlation Quiz

What does a PMCC, or r coefficient, of 1 signify?

There is a perfect positive linear correlation between 2 variables

What does a PMCC, or r coefficient, of 0 signify?

There is no linear correlation between 2 variables

What does a PMCC, or r coefficient, of -0.986 signify?

There is a strong negative linear correlation between the 2 variables

What does the null hypothesis state?

ρ = 0 (there is no correlation between the variables)

What is bivariate data?

Data which includes 2 variables

What is the critical region?

The range of values which lead to the rejection of the null hypothesis

What is the difference between a one-tailed and a two-tailed test?

A one-tailed test only examines an effect in one direction, whereas a two-tailed test examines it in two (negative and positive)

Are hypotheses written in words or symbols?

In symbols

What does a significance level of 5% indicate?

The probability of incorrectly rejecting the null hypothesis is 5%

Population, sample and hypothesis testing

What is a hypothesis?

A hypothesis is an assumption that is neither proven nor disproven. In the research process, a hypothesis is made at the very beginning and the goal is to either reject or not reject the hypothesis. In order to reject or not reject a hypothesis, data, e.g. from an experiment or a survey, are needed, which are then evaluated using a hypothesis test.

Usually, hypotheses are formulated starting from a literature review. Based on the literature review, you can then justify why you formulated the hypothesis in this way.

An example of a hypothesis could be: "Men earn more than women in the same job in Austria."

To test this hypothesis, you need data, e.g. from a survey, and a suitable hypothesis test such as the t-test or correlation analysis. Don't worry, DATAtab will help you choose the right hypothesis test.

How do I formulate a hypothesis?

In order to formulate a hypothesis, a research question must first be defined. A precisely formulated hypothesis about the population can then be derived from the research question, e.g. men earn more than women in the same job in Austria.

Hypotheses are not simple statements; they are formulated in such a way that they can be tested with collected data in the course of the research process.

To test a hypothesis, it is necessary to define exactly which variables are involved and how the variables are related. Hypotheses, then, are assumptions about the cause-and-effect relationships or the associations between variables.

What is a variable?

A variable is a property of an object or event that can take on different values. For example, eye color is a variable: it is a property of the object eye and can take different values (blue, brown, ...).

If you are researching in the social sciences, your variables may be:

If you are researching in the medical field, your variables may be:

What is the null and alternative hypothesis?

There are always two hypotheses that are exactly opposite to each other, or that claim the opposite. These opposite hypotheses are called null and alternative hypothesis and are abbreviated with H0 and H1.

Null hypothesis H0:

The null hypothesis assumes that there is no difference between two or more groups with respect to a characteristic.

The salary of men and women does not differ in Austria.

Alternative hypothesis H1:

Alternative hypotheses, on the other hand, assume that there is a difference between two or more groups.

The salary of men and women differs in Austria.

The hypothesis that you want to test or that you have derived from the theory usually states that there is an effect, e.g. gender has an effect on salary. This hypothesis is called an alternative hypothesis.

The null hypothesis usually states that there is no effect, e.g. gender has no effect on salary. In a hypothesis test, only the null hypothesis can be tested; the goal is to find out whether the null hypothesis is rejected or not.

Types of hypotheses

What types of hypotheses are available? The most common distinction is between difference and correlation hypotheses, as well as directed and undirected hypotheses.

Difference and correlation hypotheses

Difference hypotheses are used when different groups are to be distinguished, e.g., the group of men and the group of women. Correlation hypotheses are used when the relationship or correlation between variables is to be tested, e.g., the relationship between age and height.

Difference hypotheses

Difference hypotheses test whether there is a difference between two or more groups.

Examples of difference hypotheses are:

Thus, one variable is always a categorical variable, e.g., gender (male, female), smoking status (smoker, nonsmoker), or country (Germany, Austria, and France); the other variable is at least ordinally scaled, e.g., salary, percent risk of heart attack, or hours worked per week.

Correlation hypotheses

Correlation hypotheses test correlations between at least two variables, for example:

Correlation hypotheses are, for example:

As can be seen from the examples, correlation hypotheses often take the form "The more..., the higher/lower...". Thus, at least two ordinally scaled variables are being examined.

Directed and undirected hypotheses

Hypotheses are divided into directed and undirected or one-sided and two-sided hypotheses. If the hypothesis contains words like "better than" or "worse than", the hypothesis is usually directed.

In the case of an undirected hypothesis, one often finds building blocks such as "there is a difference between" in the formulation, but it is not stated in which direction the difference lies.

Undirected hypotheses

Undirected hypotheses test whether there is a relationship or a difference, and it does not matter in which direction the relationship or difference goes. In the case of a difference hypothesis, this means there is a difference between two groups, but it does not say whether one of the groups has a higher value.

In regard to a correlation hypothesis, this means there is a relationship or correlation between two variables, but it is not said whether this relationship is positive or negative.

In both cases it is not said whether this correlation is positive or negative!

Directed hypotheses

Directed hypotheses additionally indicate the direction of the relationship or the difference. In the case of the difference hypothesis a statement is made which group has a higher or lower value.

In the case of a correlation hypothesis, a statement is made as to whether the correlation is positive or negative.

The p-value for directed hypotheses

Statistical software usually calculates the undirected (two-sided) test and outputs the p-value for it.

To obtain the p-value for the directed hypothesis, it must first be checked whether the effect is in the expected direction. If it is, the p-value is divided by two. This is because the significance level is not split across two sides, but placed on one side only. More about this in the tutorial about the p-value.

If you select "one-tailed" in DATAtab for the calculated hypothesis test, the conversion is done automatically and you only need to read the result.
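The conversion described above can be sketched in a few lines. This is an illustrative helper, not DATAtab's implementation: the name `one_tailed_p` is hypothetical, and the rule assumes a test statistic with a symmetric distribution (such as t or z). When the effect points in the opposite direction, the one-tailed p-value is taken as one minus half the two-tailed value.

```python
def one_tailed_p(two_tailed_p: float, effect_in_predicted_direction: bool) -> float:
    """Convert a two-tailed p-value into a one-tailed p-value.

    Assumes the test statistic has a symmetric null distribution.
    """
    if not 0.0 <= two_tailed_p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    half = two_tailed_p / 2
    return half if effect_in_predicted_direction else 1 - half


print(one_tailed_p(0.08, True))   # 0.04
print(one_tailed_p(0.08, False))  # 0.96
```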

Step-by-step instructions for testing hypotheses

Cite DATAtab: DATAtab Team (2023). DATAtab: Online Statistics Calculator. DATAtab e.U. Graz, Austria. URL https://datatab.net

This means that we can state a null and alternative hypothesis for the population correlation ρ based on our predictions for a correlation. Let's look at how this works in an example.

Now let's go through our hypothesis testing steps:

Step 1: State hypotheses and choose α level

Remember we're going to state hypotheses in terms of our population correlation ρ. In this example, we expect GPA to decrease as distance from campus increases. This means that we are making a directional hypothesis and using a 1-tailed test. It also means we expect to find a negative value of ρ, because that would indicate a negative relationship between GPA and distance from campus. So here are our hypotheses:

\(H_0\colon \rho \ge 0\)

\(H_A\colon \rho < 0\)

We're making our predictions as a comparison with 0, because 0 would indicate no relationship. Note that if we were conducting a 2-tailed test, our hypotheses would be ρ = 0 for the null hypothesis and ρ not equal to 0 for the alternative hypothesis.

We'll use our conventional α = .05.

Step 2: Collect the sample

Here are our sample data:

Step 3: Calculate test statistic

For this example, we're going to calculate a Pearson r statistic. Recall the formula for Pearson r:

\(r = \dfrac{SP}{\sqrt{SS_X \, SS_Y}}\)

The bottom of the formula requires us to calculate the sum of squares (SS) for each measure individually and the top of the formula requires calculation of the sum of products of the two variables (SP).

We'll start with the SS terms. Remember the formula for SS is:

\(SS = \sum (X - \bar{X})^2\)

We'll calculate this for both GPA and Distance. If you need a review of how to calculate SS, review Lab 9. For our example, we get:

\(SS_{GPA} = .58\) and \(SS_{distance} = 18.39\)

Now we need to calculate the SP term. Remember the formula for SP is:

\(SP = \sum (X - \bar{X})(Y - \bar{Y})\)

If you need to review how to calculate the SP term, go to Lab 12. For our example, we get SP = -.63.

Plugging these SS and SP values into our r equation gives us r = -.19.

Now we need to find our critical value of r using a table like we did for our z and t-tests. We'll need to know our degrees of freedom, because like t, the r distribution changes depending on the sample size. For r, df = n - 2. So for our example, we have df = 5 - 2 = 3. Now, with df = 3, α = .05, and a one-tailed test, we can find \(r_{crit}\) in the Table of Pearson r values. This table is organized and used in the same way that the t-table is used.

Our \(r_{crit}\) = .805. We write \(r_{crit}(3) = -.805\) (negative because we are doing a 1-tailed test looking for a negative relationship).

Step 4: Compare observed test statistic to critical test statistic and make a decision about \(H_0\)

Our \(r_{obs}(3) = -.19\) and \(r_{crit}(3) = -.805\)

Since -.19 is not in the critical region that begins at -.805, we cannot reject the null. We must retain the null hypothesis and conclude that we have no evidence of a relationship between GPA and distance from campus.
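The calculation in Steps 3 and 4 can be reproduced from the summary values alone. This is a small sketch under the assumption that SS and SP have already been computed as in the lab; the function name `pearson_r_from_sums` is hypothetical, and the critical value is the tabled one quoted above.

```python
import math


def pearson_r_from_sums(sp: float, ss_x: float, ss_y: float) -> float:
    """Pearson r = SP / sqrt(SS_X * SS_Y)."""
    return sp / math.sqrt(ss_x * ss_y)


# GPA and distance example: SP = -.63, SS_GPA = .58, SS_distance = 18.39
r_obs = pearson_r_from_sums(-0.63, 0.58, 18.39)
r_crit = 0.805  # tabled value for df = 3, alpha = .05, one-tailed

print(round(r_obs, 2))   # -0.19
print(r_obs <= -r_crit)  # False: r_obs is not in the critical region
```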

Now try a few of these types of problems on your own. Show all four steps of hypothesis testing in your answer (some questions will require more for each step than others) and be sure to state hypotheses in terms of ρ.

(1) A high school counselor would like to know if there is a relationship between mathematical skill and verbal skill. A sample of n = 25 students is selected, and the counselor records achievement test scores in mathematics and English for each student. The Pearson correlation for this sample is r = +0.50. Do these data provide sufficient evidence for a real relationship in the population? Test at the .05 α level, two tails.

(2) It is well known that similarity in attitudes, beliefs, and interests plays an important role in interpersonal attraction. Thus, correlations for attitudes between married couples should be strong and positive. Suppose a researcher developed a questionnaire that measures how liberal or conservative one's attitudes are. Low scores indicate that the person has liberal attitudes, while high scores indicate conservatism. Here are the data from the study:

Couple A: Husband - 14, Wife - 11

Couple B: Husband - 7, Wife - 6

Couple C: Husband - 15, Wife - 18

Couple D: Husband - 7, Wife - 4

Couple E: Husband - 3, Wife - 1

Couple F: Husband - 9, Wife - 10

Couple G: Husband - 9, Wife - 5

Couple H: Husband - 3, Wife - 3

Test the researcher's hypothesis with α set at .05.

(3) A researcher believes that a person's belief in supernatural events (e.g., ghosts, ESP, etc.) is related to their education level. For a sample of n = 30 people, he gives them a questionnaire that measures their belief in supernatural events (where a high score means they believe in more of these events) and asks them how many years of schooling they've had. He finds that \(SS_{beliefs} = 10\), \(SS_{schooling} = 10\), and SP = -8. With α = .01, test the researcher's hypothesis.

Using SPSS for Hypothesis Testing with Pearson r

We can also use SPSS to conduct a hypothesis test with Pearson r. We could calculate the Pearson r with SPSS and then look at the output to make our decision about \(H_0\). The output will give us a p value for our Pearson r (listed under Sig in the Output). We can compare this p value with alpha to determine whether the result falls in the critical region (p less than alpha).

Remember from Lab 12, to calculate a Pearson r using SPSS:

The output that you get is a correlation matrix. It correlates each variable against each variable (including itself). You should notice that the table has redundant information on it (e.g., you'll find an r for height correlated with weight, and an r for weight correlated with height; these two values are identical).

Real Statistics Using Excel

Two Sample Hypothesis Testing for Correlation

We now extend the approach for one-sample hypothesis testing of the correlation coefficient to two samples.

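Theorem 1: Suppose \(r_1\) and \(r_2\) are as in Theorem 1 of Correlation Testing via Fisher Transformation, where \(r_1\) and \(r_2\) are based on independent samples, and further suppose that \(\rho_1 = \rho_2\). If z is defined as follows, then z ∼ N(0,1).

\(z = \dfrac{r'_1 - r'_2}{s}\) where \(s = \sqrt{\dfrac{1}{n_1 - 3} + \dfrac{1}{n_2 - 3}}\)

Here \(r'_i\) denotes the Fisher transformation of \(r_i\) and \(\rho'_i\) the Fisher transformation of \(\rho_i\).

Proof: By Theorem 1 of Correlation Testing via Fisher Transformation, for i = 1, 2, \(r'_i\) is normally distributed with mean \(\rho'_i\) and variance \(1/(n_i - 3)\).

By Properties 1 and 2 of Basic Characteristics of the Normal Distribution, it follows that \(r'_1 - r'_2\) is normally distributed with mean \(\rho'_1 - \rho'_2\) and variance \(s^2\),

where s is as defined above. Since \(\rho_1 = \rho_2\) it follows that \(\rho'_1 = \rho'_2\), and so \(r'_1 - r'_2\) is normally distributed with mean 0 and variance \(s^2\),

from which the result follows.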

We can use Theorem 1 to test whether the correlation coefficients of two populations are equal based on taking a sample from each population and comparing the correlation coefficients of the samples.

Example 1 : A sample of 40 couples from London is taken comparing the husband’s IQ with his wife’s. The correlation coefficient for the sample is .77. Is this significantly different from the correlation coefficient of .68 for a sample of 30 couples from Paris?

H0: ρ1 = ρ2

r1′ = FISHER(.77) = 1.020

r2′ = FISHER(.68) = 0.829

s = SQRT(1/(n1 – 3) + 1/(n2 – 3)) = SQRT(1/37 + 1/27) = 0.253

z = (r1′ – r2′) / s = 0.755

p-value = 2(1 – NORM.S.DIST(z, TRUE)) = 2(1 – NORM.S.DIST(.755, TRUE)) = 0.45

We next perform either one of the following tests:

p-value = .45 > .05 = α

z-crit = NORM.S.INV(1 – α/2) = NORM.S.INV(.975) = 1.96 > .755 = z

In either case, the null hypothesis is not rejected.
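For readers working outside Excel, the same calculation can be sketched in Python with the standard library. The function name `two_sample_correlation_test` is hypothetical; `math.atanh` plays the role of Excel's FISHER function and `statistics.NormalDist().cdf` the role of NORM.S.DIST.

```python
import math
from statistics import NormalDist


def two_sample_correlation_test(r1: float, n1: int, r2: float, n2: int):
    """Return (z, two-tailed p-value) for H0: rho1 = rho2, independent samples."""
    r1_prime = math.atanh(r1)  # Fisher transformation of r1
    r2_prime = math.atanh(r2)  # Fisher transformation of r2
    s = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (r1_prime - r2_prime) / s
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# London (r = .77, n = 40) versus Paris (r = .68, n = 30)
z, p_value = two_sample_correlation_test(0.77, 40, 0.68, 30)
print(round(z, 3), round(p_value, 2))  # 0.755 0.45
```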

Related Tests

Note that in Example 1 the couples from Paris are selected independently from the couples from London. A different test is required if the samples are dependent.

For an example with overlapping dependent samples, see Two Sample Hypothesis Testing for Correlation with Overlapping Dependent Samples.

For an example with non-overlapping dependent samples, see Two Sample Hypothesis Testing for Correlation with Non-overlapping Dependent Samples.

Worksheet Functions

Real Statistics Functions : The following function is provided in the Real Statistics Resource Pack.

Correl2Test(r1, n1, r2, n2, alpha, lab): array function which outputs z, p-value (two-tailed), lower and upper (i.e. the lower and upper bounds of the 1 – alpha confidence interval), where r1 and n1 are the correlation coefficient and sample size for the first sample and r2 and n2 are the corresponding values for the second sample. If lab = TRUE then the output takes the form of a 4 × 2 range with the first column consisting of labels, while if lab = FALSE (default) then the output takes the form of a 4 × 1 range without labels. If alpha is omitted it defaults to .05.

Correl2Test(R1, R2, R3, R4, alpha, lab) = Correl2Test(r1, n1, r2, n2, alpha, lab) where r1 = CORREL(R1, R2), n1 = the common sample size between R1 and R2 (i.e. the number of pairs from R1 and R2 which both contain numeric data), r2 = CORREL(R3, R4) and n2 = the common sample size between R3 and R4.

Correl2Test(.77,40,.68,30,.05) generated the values z = .755, p-value = .45, consistent with what we observed above, plus lower = -.296 and upper = .596. Since 0 is in the confidence interval (-.296, .596) the test is not significant and we cannot reject the null hypothesis that the two correlation coefficients are equal.
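The page does not spell out how the lower and upper bounds are computed. One calculation that reproduces the reported values is to build the interval around r₁′ – r₂′ in the Fisher scale and then map each bound back with the inverse Fisher transformation (Excel's FISHERINV, `math.tanh` in Python). The sketch below makes that assumption explicit; the function name `correl2_conf_int` is hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def correl2_conf_int(r1, n1, r2, n2, alpha=0.05):
    """Return (lower, upper) for the 1 - alpha interval, hypothetical helper.

    Assumption: bounds are formed in the Fisher scale and then
    back-transformed with tanh (the inverse Fisher transformation).
    """
    diff = atanh(r1) - atanh(r2)                   # r1' - r2'
    s = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))          # standard error
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # NORM.S.INV(1 - alpha/2)
    return tanh(diff - z_crit * s), tanh(diff + z_crit * s)


lower, upper = correl2_conf_int(0.77, 40, 0.68, 30)
print(f"lower = {lower:.3f}, upper = {upper:.3f}")  # lower = -0.296, upper = 0.596
```

Since the interval contains 0, this agrees with the conclusion of the z test.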

24 thoughts on “Two Sample Hypothesis Testing for Correlation”

Dear Charles, Thank you very much for your clear statements and example codes. I really benefited a lot.

1-) If I am not wrong, this equation (p-value = 2(1 – NORM.S.DIST(z, TRUE)) = 2(1 – NORM.S.DIST(.522, TRUE)) = 0.45) should be p-value = 2(1 – NORM.S.DIST(z, TRUE)) = 2(1 – NORM.S.DIST(0.755, TRUE)) = 0.45. The z value should be changed to 0.755.

2-) I want to ask about an article I saw that uses z = (r1′ – r2′)/standard deviation(r1′ – r2′). In that article they have 11 controls and 11 patients, and they measure a 320-point time series from different brain regions of each subject. They calculate the Pearson correlation (PC) between these time series. As a result they have 11 PCs for controls and 11 for patients. They compared these PCs between controls and patients. They transform all PCs to z values, so they have 11 r1′ and 11 r2′, and they calculate the p-value from these transformed PCs.

But their calculation of s is very different from yours. You use s = sqrt(1/(n1 – 3) + 1/(n2 – 3)) but in that article they use s = std(r1′ – r2′). What is the difference? Can I use std(r1′ – r2′)?

Thank you very much. Best Regards.

Hello Sabri, 1. Thanks for finding this error. I just made the correction on the webpage. I appreciate your help in improving the reliability of the website. 2. I don’t know what the std(x,y) function is, so it is difficult for me to comment. But if std is sqrt, then I don’t know how they came up with this standard deviation. It is completely different from the estimate on the webpage. I would have to see the article. Charles

Hi: I am trying to test correlation between Fed Rate and Inflation Rate from 1981 to 2021. I copied the data from Fed Records and ran correlation analysis online free. It returned with NaN ? with All values for each calculation = 0.

Hello Ram, This should not happen. There is a mistake somewhere. Charles

Hi thank you

I wanted to check if this would be appropriate for my test. I am looking into well-being in my group of participants who use a new social media site. I have a control group where they use two other popular sites (the reason is because many people don’t only use one social media site- so i was going to control for this). I was going to do a correlation between the use of a new social media app and well-being and then do a correlation between the control group’s use of social media and well-being. Then do a T-test between the two groups to see if they significantly differ? Do you advise this is the correct way to conduct my stats?

Sam, I am not sure how you plan to combine the two correlations into a t-test. Essentially you have two groups: treatment group (uses new app or apps) vs control group and subjects in both groups are tested for well-being. You can do a two-sample t-test where the first sample consists of the well-being scores of the treatment group and the second sample consists of the well-being scores of the subjects in the control group. Charles

Hello Charles, my question is this: I am looking for the correlation between life satisfaction and sexual satisfaction in a sample of 47 people. The correlation is a weak positive one. (0,33) Now I want to examine if the correlations differ between males and females. (I suppose I should make an independent samples t-test) But I don’t know how to do that with correlations. I also don’t know how to make different correlation calculations based on gender. What should I do? (I do that on SPSS)

Hello Bahar, This is described on this webpage. In fact, Example 1 explains how to conduct such a test. The correlation between a husband and wife is replaced by the correlation between life and sexual satisfaction. And the relationship between London and Paris is replaced by the relationship between Male and Female. Charles

Hi Charles, you have made a very impressive and informative webpage. Based on the calculation above, I wonder if the method is suitable for my project. My case is to test if the intraclass correlation coefficient (ICC) between genders of a biomarker is significantly different or not. Since there is no publication in this topic and I have been looking for a suitable method for the hypothesis test and I am not sure if your method is the right way to do.

Hi Jessie, I can’t say whether or not ICC is the correct tool to use for your situation, but from your description it may indeed be useful. Charles

Hello Charles, I have a question about correlation. I have a number of correlation coefficients between two variables A and B. I calculate the correlation coefficient for each subject (say n=10), so 10 values of r. Most of them are negative. Now I would like to test whether the correlation is on average negative. Does it make sense to make a one-sample t-test on the 10 values of r?

Hello Fabrizio, Ordinarily you would use a one-tail version of the t test described at https://real-statistics.com/correlation/one-sample-hypothesis-testing-correlation/ on your complete sample pair. From your description, though, perhaps the data in each sample is not independent, in which case this approach won’t work. Please clarify your sample approach. If you are taking multiple samples from multiple subjects, please explain why the pairs for each subject should be different at all (e.g. are you just studying measurement error)? Charles

Dear Charles, thank you for your prompt reply. I briefly explain what I am doing. I have a number of subjects, say 13. For each subject, I measured two variables, A and B, a variable number of times (not constant). For example, for subject 1, I measured A and B five times. Then for subject 2, I measured A and B seven times, etc. Within subject, I calculated the correlation between A and B. So I get 13 values of r. In most cases the values of r are negative (although not significant). Now I would like to know if this due to pure chance or not. So my idea was to make a one-sample t-test on the sample of 13 correlation coefficients. I hope it’s clearer now. Many thanks again for your feedback! Best regards, Fabrizio

I add that I expect there is a relationship between A and B, within a subject. The smaller A, the greater B. Best regards, Fabrizio

Hello Fabrizio, Provided the 13 subjects were chosen randomly and the 13 r values are normally distributed (or at least not too far off from normal), this approach seems to be valid. You need to use a one-tail version of the one-sample t test with a null hypothesis of mean r >= 0. Charles

Dear Charles, thanks again for your kind reply. And many compliments for your web site. I know no other site with so many resources and where stats are explained so well. Best regards, Fabrizio

Dear Charles,

Will the df always be n-3 in this test?

Anna, Since the normal distribution is used there is no df. Charles

Hi Charles,

I am seeking to compare either the Kendall’s Tau value of two independent samples or the Spearman’s Rho of two independent samples. That is I have an estimate of the correlation between x and y for sample 1 and that of sample 2. The samples are independent. However, the sample size in each group is small (n1=15 n2=35) and the data for x and y is not normal in either sample (this is the reason I would use either Kendall’s Tau or Spearmans’s Rho instead of Pearson’s in each of the samples).

Is there a test to compare the Kendall’s tau or the Spearman’s Rho of 2 independent samples?

Any guidance would be greatly appreciated.

-Maggie Biostat II

Maggie, If I remember correctly, with Spearman’s rho you are just calculating Pearson’s correlation on the ranks of the two pairs in the samples. If you are comparing two independent sample pairs, you should be able to use the test of two independent sample pairs described on the referenced webpage, but on the ranks, not on the original data. I don’t know how to do this for Kendall’s tau. Charles

Hi and thank you for the nice informative pages. Since I am not a very experienced user I must ask. I use your correl2test(r1, n1, r2, n2, alpha, lab) as follows: =correl2test(0,569;10190;0,641;2039;0,05) but I get only one number instead of 4. I get -4,652529256

By the way I have excel2010

Thank you in advance

Regards Gustaf

Gustaf, Correl2Test is an array function and so you can’t simply highlight one cell and press the Enter key. You need to highlight a column range with at least 4 cells and press Ctrl-Shift-Enter. See Array Formulas and Functions for more details. Charles

Thank you Charles, I will check it out.

I had some trouble with your p-value also so I solved it like this: The cell for the p-value: =IF(AW4>=0; 2*(1-NORM.DIST(AW4;0;1;TRUE));2*NORM.DIST(AW4;0;1;TRUE)) where AW4 is the z. The NORM.DIST is a new function from Excel. The other one NORMSDIST does not work anymore apparently.

I have a question though. I now tested the hypothesis that of equality. What if I tested Ho: rho1>rho2. Any tips for me there?

I am very thankful for your commitment to these pages you offer by the way.

Gustaf, You could also use the formula =2*(1-NORM.DIST(ABS(AW4);0;1;TRUE)) or =2*(1-NORM.S.DIST(ABS(AW4);TRUE)). The formula NORMSDIST still works on my computer. I understood that Excel still supports this function, but wants people to migrate to NORM.DIST. I believe that if you are testing Ho: rho1>rho2, then you should use a one-tail test, i.e. =1-NORM.S.DIST(ABS(AW4);TRUE). Charles


How to Write a Hypothesis for Correlation

A hypothesis for correlation predicts a statistically significant relationship.


A hypothesis is a testable statement about how something works in the natural world. While some hypotheses predict a causal relationship between two variables, other hypotheses predict a correlation between them. According to the Research Methods Knowledge Base, a correlation is a single number that describes the relationship between two variables. If you do not predict a causal relationship or cannot measure one objectively, state clearly in your hypothesis that you are merely predicting a correlation.

Research the topic in depth before forming a hypothesis. Without adequate knowledge about the subject matter, you will not be able to decide whether to write a hypothesis for correlation or causation. Read the findings of similar experiments before writing your own hypothesis.

Identify the independent variable and dependent variable. Your hypothesis will be concerned with what happens to the dependent variable when a change is made in the independent variable. In a correlation, the two variables undergo changes at the same time in a significant number of cases. However, this does not mean that the change in the independent variable causes the change in the dependent variable.

Construct an experiment to test your hypothesis. In a correlative experiment, you must be able to measure the exact relationship between two variables. This means you will need to find out how often a change occurs in both variables in terms of a specific percentage.

Establish the requirements of the experiment with regard to statistical significance. Instruct readers exactly how often the variables must correlate to reach a high enough level of statistical significance. This number will vary considerably depending on the field. In a highly technical scientific study, for instance, the variables may need to correlate 98 percent of the time; but in a sociological study, 90 percent correlation may suffice. Look at other studies in your particular field to determine the requirements for statistical significance.

State the null hypothesis. The null hypothesis gives an exact value that implies there is no correlation between the two variables. If the results show a percentage equal to or lower than the value of the null hypothesis, then the variables are not proven to correlate.

Record and summarize the results of your experiment. State whether or not the experiment met the minimum requirements of your hypothesis in terms of both percentage and significance.


About the Author

Brian Gabriel has been a writer and blogger since 2009, contributing to various online publications. He earned his Bachelor of Arts in history from Whitworth University.


Statistics add-in software for statistical analysis in Excel

Correlation/association hypothesis test

A hypothesis test formally tests if there is correlation/association between two variables in a population.

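The hypotheses to test depend on the type of association:

  • For a product-moment correlation, the null hypothesis states that the population correlation coefficient is equal to a hypothesized value.
  • For a rank correlation, the null hypothesis states that the variables are independent.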

When the test p-value is small, you can reject the null hypothesis and conclude that the population correlation coefficient is not equal to the hypothesized value, or for rank correlation that the variables are not independent. It is important to remember that a statistically significant test may not have any practical importance if the correlation coefficient is very small.

Pearson's and Kendall's tests are preferred as both have associated estimators of the population correlation coefficient (rho and tau respectively). Although the Spearman test is popular due to the ease of computation, the Spearman correlation coefficient is a measure of the linear association between the ranks of the variables and not the measure of association linked with the Spearman test.
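To make the remark about ranks concrete, the sketch below computes the Spearman coefficient as Pearson's correlation applied to the ranks of each variable. It is a minimal illustration only: the helper names (`average_ranks`, `pearson`, `spearman`) and the sample data are hypothetical, and ties are assumed to receive the average of the ranks they span.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman coefficient: Pearson's correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data, chosen only to illustrate the calculation
x = [10, 20, 30, 40, 50]
y = [2, 1, 4, 3, 5]
print(round(spearman(x, y), 3))  # 0.8
```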
