Chapter 10 – Correlation and Regression Project

Refer to the Section 10.1 and 10.2 course notes posted in the discussion forum and to the corresponding sections of the textbook, which contain examples on the topics covered in these sections.

Refer to the data below.

The table below lists systolic blood pressure measurements (in mm Hg) obtained from the same woman (based on data from “Consistency of Blood Pressure Difference Between the Left and Right Arms,” by Eguchi, et al., Archives of Internal Medicine, Vol. 167).

Right arm    102    101    94     79     79

Left arm     175    169    182    146    144

1. Construct a scatterplot for the variables. See the “StatCrunch Video Tutorials” below and

under Tools for Success in MyStatLab on how to graph a scatterplot using StatCrunch software.

https://mediaplayer.pearsoncmg.com/assets/statcrunch_01

https://mediaplayer.pearsoncmg.com/assets/statcrunch_02

https://mediaplayer.pearsoncmg.com/assets/statcrunch_18

2. Use the scatterplot to determine whether there is a correlation between the two variables. State the type of correlation.

3. Make a table for the data and calculate ∑x, ∑y, ∑xy, ∑x², ∑y².

4. Calculate the correlation coefficient r using the formula below.

r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}

5. Check your answer for the correlation coefficient r using statistical software such as StatCrunch or a TI-83/84 graphing calculator, and record the results from your software.

6. What does the correlation coefficient r tell us about the strength of the correlation?

7. Compute the square of the correlation coefficient, r². What does r² tell us about the best-fit line?

8. Define the best-fit line (or regression line).

9. Find the slope, y-intercept, and equation of the best-fit line for your data using either of the two sets of formulas below. Show your work.

m = r \times \frac{s_y}{s_x}, \quad b = \bar{y} - (m \times \bar{x}), \quad y = mx + b

or

b_1 = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}, \quad b_0 = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}, \quad \hat{y} = b_0 + b_1 x

10. Add a graph of the best-fit line to your scatterplot using the StatCrunch software. Include a screenshot of your data, the best-fit line, and the equation of the best-fit line from StatCrunch.

11. Summarize your findings from this project.
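Steps 3, 4, 7, and 9 can be cross-checked with a short script. The following is only a sketch using the Python standard library, not the StatCrunch or TI-83/84 workflow the project asks for; the function name `regression_summary` and the variable names are illustrative and do not come from the course notes.

```python
import math
import statistics


def regression_summary(x, y):
    """Return the five sums, r, r^2 and the best-fit line for paired data."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)

    # Building blocks shared by the formulas for r, b1 and b0.
    sxy = n * sum_xy - sum_x * sum_y
    sxx = n * sum_x2 - sum_x ** 2
    syy = n * sum_y2 - sum_y ** 2

    r = sxy / (math.sqrt(sxx) * math.sqrt(syy))
    b1 = sxy / sxx                                  # slope
    b0 = (sum_y * sum_x2 - sum_x * sum_xy) / sxx    # y-intercept
    return {
        "sum_x": sum_x, "sum_y": sum_y, "sum_xy": sum_xy,
        "sum_x2": sum_x2, "sum_y2": sum_y2,
        "r": r, "r_squared": r ** 2, "b1": b1, "b0": b0,
    }


right_arm = [102, 101, 94, 79, 79]     # x-values from the project table
left_arm = [175, 169, 182, 146, 144]   # y-values from the project table

summary = regression_summary(right_arm, left_arm)

# Cross-check with the first set of formulas: m = r * s_y / s_x, b = y_bar - m * x_bar.
m = summary["r"] * statistics.stdev(left_arm) / statistics.stdev(right_arm)
b = statistics.mean(left_arm) - m * statistics.mean(right_arm)

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
print(f"m: {m:.4f}, b: {b:.4f}")
```

Both sets of formulas in step 9 should give the same slope and intercept up to rounding, which is a useful check on hand arithmetic.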

Chapter 10 – Correlation and Regression

Section 10.1 – Correlation

Key Concept

In Part 1 of this section, we introduce the linear correlation coefficient r, which is a numerical measure of the strength of the linear relationship between two variables representing quantitative data.

Using paired sample data (sometimes called bivariate data), we find the value of r (usually using

technology), then we use that value to conclude that there is (or is not) a linear correlation

between the two variables.

In this section we consider only linear relationships, which means that when graphed, the points

approximate a straight-line pattern.

In Part 2, we discuss methods of hypothesis testing for correlation.

Part 1: Basic Concepts of Correlation

Definition

A correlation exists between two variables when the values of one are associated in some way with the values of the other.

A linear correlation exists between two variables when there is a correlation and the plotted

points of paired data result in a pattern that can be approximated by a straight line.

Exploring the Data

We can often see a relationship between two variables by constructing a scatterplot.

The figure below shows scatterplots with different characteristics.

Scatterplots of Paired Data

Measure the Strength of the Linear Correlation

The linear correlation coefficient r measures the strength of the linear relationship between the

paired quantitative x- and y-values in a sample.

Requirements for Linear Correlation

1. The sample of paired (x, y) data is a simple random sample of quantitative data.

2. Visual examination of the scatterplot must confirm that the points approximate a straight-line

pattern.

3. Any outliers must be removed if they are known to be errors. The effects of any other outliers should be considered by calculating r with and without the outliers included.

Notation for the Linear Correlation Coefficient

n = number of pairs of sample data.

∑ = denotes the addition of the items indicated.

∑x = denotes the sum of all x-values.

∑x² = indicates that each x-value should be squared and then those squares added.

(∑x)² = indicates that the x-values should be added and then the total squared.

∑xy = indicates that each x-value should first be multiplied by its corresponding y-value. After obtaining all such products, find their sum.

r = linear correlation coefficient for sample data.

ρ = linear correlation coefficient for population data.
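The difference between ∑x² and (∑x)² is easy to mix up, so a tiny numeric illustration may help. The data values below are made up purely for illustration and are not from the course notes.

```python
# Hypothetical paired data, chosen only to keep the arithmetic small.
x = [1, 2, 3]
y = [2, 4, 5]

n = len(x)                                   # number of pairs
sum_x = sum(x)                               # sum of the x-values
sum_x_squared = sum(v ** 2 for v in x)       # square each x first, then add
square_of_sum_x = sum_x ** 2                 # add the x-values first, then square
sum_xy = sum(a * b for a, b in zip(x, y))    # multiply each pair, then add

print(n, sum_x, sum_x_squared, square_of_sum_x, sum_xy)
```

Here the sum of squares is 1 + 4 + 9 = 14, while the square of the sum is 6² = 36, so the two quantities are not interchangeable in the formula for r.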

Formula

The linear correlation coefficient r measures the strength of a linear relationship between the

paired values in a sample.

r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}

Note

Computer software or a calculator can compute r.

Interpreting the Linear Correlation Coefficient r

We can base our interpretation and conclusion about correlation on a P-value obtained from

computer software or a critical value from Table A-6.

Using Table A-6 to Interpret r:

If the absolute value of the computed value of r, denoted |r|, exceeds the value in Table A-6, conclude that there is a linear correlation.

Otherwise, there is not sufficient evidence to support the conclusion of a linear correlation.

Using Software to interpret r:

If the computed P-value is less than or equal to the significance level, conclude that there is

a linear correlation.

Otherwise, there is not sufficient evidence to support the conclusion of a linear correlation.
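The two decision rules can be written as a pair of small functions. This is a sketch only: the function names are illustrative, and the dictionary holds just the three Table A-6 critical values (for a 0.05 significance level) that are quoted in the examples of these notes, not the full table.

```python
# Critical values of r at the 0.05 significance level, as quoted in the
# worked examples of these notes (n = 6, 8 and 11 only).
CRITICAL_R_05 = {6: 0.811, 8: 0.707, 11: 0.602}


def supports_correlation_by_table(r, n, table=CRITICAL_R_05):
    """Table A-6 rule: there is evidence of linear correlation when |r| exceeds the critical value."""
    return abs(r) > table[n]


def supports_correlation_by_p_value(p_value, alpha=0.05):
    """Software rule: there is evidence of linear correlation when the P-value <= significance level."""
    return p_value <= alpha


print(supports_correlation_by_table(0.988, 6))    # |r| is compared with 0.811
print(supports_correlation_by_table(0.693, 8))    # |r| is compared with 0.707
print(supports_correlation_by_p_value(0.002))
```

Taking the absolute value means a strong negative r is treated the same way as a strong positive r.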

Caution

Know that the methods of this section apply to a linear correlation. If you conclude that there

does not appear to be linear correlation, know that it is possible that there might be some other

association that is not linear.

Rounding the Linear Correlation Coefficient r

Round r to three decimal places so that it can be compared to the critical values in Table A-6.

Use a calculator or computer if possible.

Properties of the Linear Correlation Coefficient r

1. The value of r is always between −1 and 1 inclusive. That is, −1 ≤ r ≤ 1.

2. If all values of either variable are converted to a different scale, the value of r does not change.

3. The value of r is not affected by the choice of x and y. Interchange all x- and y-values and the

value of r will not change.

4. r measures the strength of a linear relationship.

5. r is very sensitive to outliers; a single outlier can dramatically affect its value.
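Properties 1 to 3 can be checked numerically. The sketch below uses made-up data and a helper named `pearson_r` (an illustrative name) that applies the formula for r given earlier; the rescaling factor 2.54 simply stands in for a change of units such as inches to centimeters.

```python
import math


def pearson_r(x, y):
    """Linear correlation coefficient computed from the sums in the formula for r."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sxy = n * sum(a * b for a, b in zip(x, y)) - sum_x * sum_y
    sxx = n * sum(a * a for a in x) - sum_x ** 2
    syy = n * sum(b * b for b in y) - sum_y ** 2
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)                       # property 3: swap x and y
x_rescaled = [2.54 * v for v in x]           # property 2: change the scale of x
r_rescaled = pearson_r(x_rescaled, y)

print(r_xy, r_yx, r_rescaled)
```

All three values agree up to floating-point rounding, and each lies between −1 and 1.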

Example 1: Given n = 6 and significance level α = 0.05

Solution

Critical Values from Table A-6 and the Computed Value of r

Conclusion:

Using Table A-6 to Interpret r:

Because |r| = |0.988| = 0.988 exceeds the critical value of 0.811 from Table A-6 (n = 6, α = 0.05), we conclude that there is sufficient evidence to support a claim of a linear correlation between the variables.

Example 2: Interpret r using a significance level of α = 0.05

The heights (in inches) of a sample of eight mother/daughter pairs of subjects were measured. Using the TI-83/84 Plus calculator with the paired mother/daughter heights, the linear correlation coefficient r is found to be 0.693 (based on data from the National Health Examination Survey).

Is there sufficient evidence to support the claim that there is a linear correlation between the

heights of mothers and heights of daughters? Explain.

Solution

Requirements are satisfied: simple random sample of quantitative data; scatterplot approximates

a straight line; no outliers.

Using Table A-6 to Interpret r:

Because |r| = |0.693| = 0.693 does not exceed the critical value of 0.707 from Table A-6 (n = 8, α = 0.05), we conclude that there is not sufficient evidence to support a claim of a linear correlation between the heights of mothers and heights of daughters.

Using Software to Interpret r:

Assume P-value = 0.325.

Solution

Since the P-value of 0.325 is greater than the significance level α = 0.05, we conclude that there is not sufficient evidence to support a claim of a linear correlation between the heights of mothers and heights of daughters.

Example 3:

The table below consists of a data set from “Graphs in Statistical Analysis,” by F. J. Anscombe, The American Statistician, Vol. 27.

x     y
10    7.46
8     6.77
13    12.74
9     7.11
11    7.81
14    8.84
6     6.08
4     5.39
12    8.15
7     6.42
5     5.73

(a) Find the value of the linear correlation coefficient r, then determine whether there is sufficient evidence to support the claim of a linear correlation between the two variables.

(b) Identify the feature of the data that would be missed if part (a) was completed without constructing the scatterplot.

Solution

(a) Calculating the linear correlation coefficient r using the formula:

r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}

n = 11

x     y        x²     y²          xy
10    7.46     100    55.6516     74.60
8     6.77     64     45.8329     54.16
13    12.74    169    162.3076    165.62
9     7.11     81     50.5521     63.99
11    7.81     121    60.9961     85.91
14    8.84     196    78.1456     123.76
6     6.08     36     36.9664     36.48
4     5.39     16     29.0521     21.56
12    8.15     144    66.4225     97.80
7     6.42     49     41.2164     44.94
5     5.73     25     32.8329     28.65

∑x = 99, ∑y = 82.5, ∑x² = 1001, ∑y² = 659.9762, ∑xy = 797.47

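r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}

r = \frac{11(797.47) - (99)(82.5)}{\sqrt{11(1001) - (99)^2}\,\sqrt{11(659.9762) - (82.5)^2}}

r = \frac{8772.17 - 8167.5}{(34.78505426)(21.29526238)}

r = 0.81628627395

r ≈ 0.816

The linear correlation coefficient is r = 0.816.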
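The hand calculation above can be reproduced step by step in a few lines. This is a minimal sketch with the Python standard library; the variable names are illustrative, and the data are the eleven pairs from the example table.

```python
import math

# Paired data from the example table.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Numerator and the two square-root factors of the denominator.
numerator = n * sum_xy - sum_x * sum_y
root_x = math.sqrt(n * sum_x2 - sum_x ** 2)
root_y = math.sqrt(n * sum_y2 - sum_y ** 2)

r = numerator / (root_x * root_y)
print(round(r, 3))
```

Comparing each intermediate value (`sum_xy`, `root_x`, `root_y`) with the corresponding line of the hand calculation makes it easier to locate an arithmetic slip.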

Using the TI-83/84 Calculator to calculate the linear correlation coefficient r

Enter the x-values in list L1

Enter the y-values in list L2

Press STAT, select TESTS, then choose LinRegTTest

Xlist: L1

Ylist: L2

Freq: 1

β & ρ: ≠ 0

RegEQ:

Calculate

LinRegTTest

Using Table A-6 to Interpret r:

Because |r| = |0.816| = 0.816 exceeds the critical value of 0.602 from Table A-6 (n = 11, α = 0.05), we conclude that there is sufficient evidence to support a claim of a linear correlation between the two variables.

Using Software to Interpret r:

Assume P-value = 0.002.

Solution

Since the P-value of 0.002 is less than the significance level α = 0.05, we conclude that there is sufficient evidence to support a claim of a linear correlation between the two variables.

(b) Identify the feature of the data that would be missed if part (a) was completed without constructing the scatterplot.

Solution

The scatterplot indicates that the relationship between the variables is essentially a perfect straight-line pattern except for the presence of one outlier.
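The requirement to calculate r with and without an outlier can be illustrated on this data set. The sketch below assumes that the outlier is the pair (13, 12.74), which is the point that stands apart from the others in the table; the helper name `pearson_r` is illustrative.

```python
import math


def pearson_r(x, y):
    """Linear correlation coefficient computed from the sums in the formula for r."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sxy = n * sum(a * b for a, b in zip(x, y)) - sum_x * sum_y
    sxx = n * sum(a * a for a in x) - sum_x ** 2
    syy = n * sum(b * b for b in y) - sum_y ** 2
    return sxy / math.sqrt(sxx * syy)


x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

outlier = (13, 12.74)   # assumed outlier, identified from the scatterplot
kept = [pair for pair in zip(x, y) if pair != outlier]

r_all = pearson_r(x, y)
r_without = pearson_r([p[0] for p in kept], [p[1] for p in kept])

print(f"r with outlier:    {r_all:.3f}")
print(f"r without outlier: {r_without:.5f}")
```

With the outlier removed, the remaining ten points lie almost exactly on a straight line, which is why r moves so close to 1.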

Common Errors Involving Correlation

1. Causation: It is wrong to conclude that correlation implies causality.

2. Averages: Averages suppress individual variation and may inflate the correlation coefficient.

3. Linearity: There may be some relationship between x and y even when there is no linear

correlation.

Caution

Know that correlation does not imply causality.

Part 2: Formal Hypothesis Test

Formal Hypothesis Test

We wish to determine whether there is a significant linear correlation between two variables.

Hypothesis Test for Correlation Notation

n = number of pairs of sample data

r = linear correlation coefficient for a sample of paired data

ρ = linear correlation coefficient for a population of paired data

Hypothesis Test for Correlation Requirements

1. The sample of paired (x, y) data is a simple random sample of quantitative data.

2. Visual examination of the scatterplot must confirm that the points approximate a straight-line

pattern.

3. Any outliers must be removed if they are known to be errors. The effects of any other outliers should be considered by calculating r with and without the outliers included.

Hypothesis Test for Correlation Hypotheses

H0: ρ = 0 (There is no linear correlation.)

H1: ρ ≠ 0 (There is a linear correlation.)

Test Statistic: r

Critical Values: Refer to Table A-6

Hypothesis Test for Correlation Conclusion

If |r| > critical value from Table A-6, reject H0 and conclude that there is sufficient evidence to support the claim of a linear correlation.

If |r| ≤ critical value from Table A-6, fail to reject H0 and conclude that there is not sufficient evidence to support the claim of a linear correlation.

Hypothesis Test for Correlation P-Value from a t Test

H0: ρ = 0 (There is no linear correlation.)

H1: ρ ≠ 0 (There is a linear correlation.)

Test Statistic: t

t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}
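The test statistic is a one-line calculation once r and n are known. The sketch below uses an illustrative function name, `t_statistic`, and plugs in the unrounded r and n = 11 from the Anscombe example in Part 1.

```python
import math


def t_statistic(r, n):
    """Test statistic t = r / sqrt((1 - r^2) / (n - 2)) for the claim of linear correlation."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


# Values from the Anscombe example: r before rounding, n = 11 pairs.
t = t_statistic(0.81628627395, 11)
print(round(t, 3))
```

Using the unrounded r avoids a small rounding drift in t; the statistic has n − 2 = 9 degrees of freedom here.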

Hypothesis Test for Correlation Conclusion

P-value: Use computer software to find the P-value corresponding to the test statistic t, which has n − 2 degrees of freedom.

If the P-value is less than or equal to the significance level, reject H0 and conclude that there is sufficient evidence to support the claim of a linear correlation.

If the P-value is greater than the significance level, fail to reject H0 and conclude that there is not sufficient evidence to support the claim of a linear correlation.
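For readers without StatCrunch or a calculator at hand, the two-tailed P-value can be approximated directly from the Student t density. The sketch below is an assumption-laden illustration, not the method used by any particular software: it integrates the density numerically with Simpson's rule, and the function names and the step count are arbitrary choices.

```python
import math


def t_pdf(t, df):
    """Probability density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p_value(t, df, steps=10_000):
    """Approximate P(|T| >= |t|) using Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric, so both tails together are 1 minus twice this area.
    return 1 - 2 * area_zero_to_t


# Test statistic and degrees of freedom for n = 11 pairs (df = n - 2 = 9).
p_value = two_tailed_p_value(4.239372102, 9)
alpha = 0.05
print(f"P-value: {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

A statistics package would normally be preferred in practice; the point of the sketch is only to show where the P-value comes from.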

Note:

The exercises in this section will involve only two-tailed tests.

One-Tailed Tests

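One-tailed tests can occur with a claim of a positive linear correlation or a claim of a negative linear correlation. In such cases, the hypotheses are as follows.

Claim of negative correlation (left-tailed test): H0: ρ = 0, H1: ρ < 0

Claim of positive correlation (right-tailed test): H0: ρ = 0, H1: ρ > 0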

Note: For these one-tailed tests, the P-value method can be used as in earlier chapters.

Example:

The table below consists of a data set from “Graphs in Statistical Analysis,” by F. J. Anscombe, The American Statistician, Vol. 27.

x     y
10    7.46
8     6.77
13    12.74
9     7.11
11    7.81
14    8.84
6     6.08
4     5.39
12    8.15
7     6.42
5     5.73

(a) Construct a scatterplot.

(b) Find the value of the linear correlation coefficient r.

(c) Compute the test statistic.

(d) Find the critical values of r from Table A-6 using α = 0.05.

(e) Find the P-value.

(f) Determine whether there is sufficient evidence to support the claim of a linear correlation between the two variables.

Solution

(a) Scatterplot: See the notes on constructing a scatterplot in Section 2.4 of the textbook.

TI-83/84 Calculator Output

(b) Linear correlation coefficient r

r = 0.816

(c) Test statistic

t = 4.239372102 ≈ 4.239

(d) Critical value of r

From Table A-6, for n = 11 and α = 0.05, the critical value of r is 0.602.

(e) P-value = 0.0021763053 ≈ 0.002

(f) Conclusion

Using Table A-6 to interpret r: Because |r| = |0.816| = 0.816 exceeds the critical value of 0.602 from Table A-6 (n = 11, α = 0.05), we conclude that there is sufficient evidence to support a claim of a linear correlation between the two variables.

Using the P-value: Because the P-value of 0.002 is less than the significance level α = 0.05, we reach the same conclusion.

Using StatCrunch

To access StatCrunch, log into MyStatLab.

Watch the video (Getting Started) below to learn how to use StatCrunch.

Remember that you can access more videos like those listed below and other resources on using

the software by clicking on the “Help” tab in StatCrunch.

Creating a Scatterplot


Section 10.2 – Regression

Homework (Question # 3)

Using Table A-6 to find the critical value

n = 70, α = 0.05, so the critical value of r is 0.305.

Based on the information in the chart above, we see that the regression equation is not a good model because the linear correlation coefficient (r = 0.283) does not exceed the critical value (0.305), so there is not sufficient evidence to support a claim of a linear correlation.

Therefore, the best predicted pulse rate is the sample mean of the y-values (ȳ = 75.1), not the value given by the regression equation.
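The prediction rule used in this homework answer can be sketched as a small function. The values r = 0.283, 0.305, and 75.1 come from the answer above; the intercept, slope, and x-value passed in are hypothetical placeholders, since the regression equation itself is not reproduced in these notes, and the function name is illustrative.

```python
def best_predicted_value(x, r, critical_value, y_mean, b0, b1):
    """Use the regression line only when |r| exceeds the critical value; otherwise use the mean of y."""
    if abs(r) > critical_value:
        return b0 + b1 * x
    return y_mean


prediction = best_predicted_value(
    x=60.0,                 # hypothetical x-value
    r=0.283,                # from the homework answer
    critical_value=0.305,   # from the homework answer
    y_mean=75.1,            # sample mean of the y-values, from the homework answer
    b0=70.0,                # hypothetical intercept, not from the notes
    b1=0.05,                # hypothetical slope, not from the notes
)
print(prediction)
```

Because |r| does not exceed the critical value here, the hypothetical slope and intercept are ignored and the sample mean is returned.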
