Chapter 10 – Correlation and Regression Project
Refer to the Section 10.1 and 10.2 course notes posted in the discussion forum and to these sections
in the textbook, which contain examples on the topics covered in these sections.
Refer to the data below.
The table below lists systolic blood pressure measurements (in mm Hg) obtained from the
same woman (based on data from “Consistency of Blood Pressure Difference Between the
Left and Right Arms,” by Eguchi, et al., Archives of Internal Medicine, Vol. 167).
Right arm   Left arm
102         175
101         169
94          182
79          146
79          144
1. Construct a scatterplot for the variables. See the “StatCrunch Video Tutorials” below and
under Tools for Success in MyStatLab on how to graph a scatterplot using StatCrunch software.
https://mediaplayer.pearsoncmg.com/assets/statcrunch_01
https://mediaplayer.pearsoncmg.com/assets/statcrunch_02
https://mediaplayer.pearsoncmg.com/assets/statcrunch_18
2. Use the scatterplot to determine whether there is correlation between the two variables. State
the type of correlation.
3. Make a table for the data and calculate ∑x, ∑y, ∑xy, ∑x², ∑y².
4. Calculate the correlation coefficient r using the formula below.
\[ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}} \]
5. Check your answer for the correlation coefficient r using statistical software such as
StatCrunch or a TI-83/84 graphing calculator, and record the results from your
software.
6. What does the correlation coefficient r tell us about the strength of the correlation?
7. Compute the square of the correlation coefficient, r². What does r² tell us about the best-fit
line?
8. Define the best-fit line (or regression line).
9. Find the slope, y-intercept, and equation of the best-fit line for your data using either of the two
sets of formulas below. Show your work.
\[ m = r \times \frac{s_y}{s_x}, \qquad b = \bar{y} - (m \times \bar{x}), \qquad y = mx + b \]
or
\[ b_1 = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}, \qquad b_0 = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}, \qquad \hat{y} = b_0 + b_1 x \]
10. Add a graph of the best-fit line to your scatterplot using the StatCrunch software. Include a
screen shot of your data, best-fit line and equation of the best-fit line from StatCrunch.
11. Summarize your findings from this project.
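The two sets of formulas in item 9 describe the same line. The sketch below checks this numerically. It is only an illustration: it uses the Anscombe data from Example 3 of the Section 10.1 notes rather than the project data, and the variable names are chosen here for readability.

```python
import math
import statistics

# Anscombe data from Example 3 of the Section 10.1 notes (not the project data).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Second formula set: slope b1 and intercept b0 directly from the sums.
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0 = (sum_y * sum_x2 - sum_x * sum_xy) / (n * sum_x2 - sum_x ** 2)

# First formula set: m = r * (s_y / s_x) and b = y_bar - m * x_bar.
r = (n * sum_xy - sum_x * sum_y) / (
    math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
)
m = r * statistics.stdev(y) / statistics.stdev(x)
b = statistics.mean(y) - m * statistics.mean(x)

print(f"b1 = {b1:.4f}, b0 = {b0:.4f}")
print(f"m  = {m:.4f}, b  = {b:.4f}")
```

Both routes give a slope of about 0.500 and an intercept of about 3.00 for this data set, so either set of formulas can be used for the project.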
Chapter 10- Correlation and Regression
Section 10.1 – Correlation
Key Concept
In Part 1 of this section, we introduce the linear correlation coefficient r, which is a numerical
measure of the strength of the linear relationship between two variables representing quantitative data.
Using paired sample data (sometimes called bivariate data), we find the value of r (usually using
technology), then we use that value to conclude that there is (or is not) a linear correlation
between the two variables.
In this section we consider only linear relationships, which means that when graphed, the points
approximate a straight-line pattern.
In Part 2, we discuss methods of hypothesis testing for correlation.
Part 1: Basic Concepts of Correlation
Definition
A correlation exists between two variables when the values of one are associated in some way with
the values of the other.
A linear correlation exists between two variables when there is a correlation and the plotted
points of paired data result in a pattern that can be approximated by a straight line.
Exploring the Data
We can often see a relationship between two variables by constructing a scatterplot.
The figure below shows scatterplots with different characteristics.
Scatterplots of Paired Data
Measure the Strength of the Linear Correlation
The linear correlation coefficient r measures the strength of the linear relationship between the
paired quantitative x- and y-values in a sample.
Requirements for Linear Correlation
1. The sample of paired (x, y) data is a simple random sample of quantitative data.
2. Visual examination of the scatterplot must confirm that the points approximate a straight-line
pattern.
3. The outliers must be removed if they are known to be errors. The effects of any other outliers
should be considered by calculating r with and without the outliers included.
Notation for the Linear Correlation Coefficient
n = number of pairs of sample data.
∑ denotes the addition of the items indicated.
∑x denotes the sum of all x-values.
∑x² indicates that each x-value should be squared and then those squares added.
(∑x)² indicates that the x-values should be added and then the total squared.
∑xy indicates that each x-value should first be multiplied by its corresponding y-value.
After obtaining all such products, find their sum.
r = linear correlation coefficient for sample data.
ρ = linear correlation coefficient for population data.
Formula
The linear correlation coefficient r measures the strength of a linear relationship between the
paired values in a sample.
\[ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}} \]
Note
Computer software or calculators can compute r.
Interpreting the Linear Correlation Coefficient r
We can base our interpretation and conclusion about correlation on a P-value obtained from
computer software or a critical value from Table A-6.
Using Table A-6 to Interpret r:
If the absolute value of the computed value of r, denoted |r|, exceeds the value in Table A-6,
conclude that there is a linear correlation.
Otherwise, there is not sufficient evidence to support the conclusion of a linear correlation.
Using Software to interpret r:
If the computed P-value is less than or equal to the significance level, conclude that there is
a linear correlation.
Otherwise, there is not sufficient evidence to support the conclusion of a linear correlation.
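The two decision rules above can be written as a pair of small functions. This is a minimal sketch: the function names are made up for illustration, and the critical values must still be looked up in Table A-6 by hand.

```python
def has_linear_correlation_by_table(r, critical_value):
    """Table A-6 method: conclude a linear correlation when |r| exceeds the critical value."""
    return abs(r) > critical_value


def has_linear_correlation_by_p_value(p_value, alpha=0.05):
    """Software method: conclude a linear correlation when the P-value is <= alpha."""
    return p_value <= alpha


# Values quoted in Examples 1 and 2 of these notes.
example_1 = has_linear_correlation_by_table(0.988, 0.811)  # n = 6
example_2 = has_linear_correlation_by_table(0.693, 0.707)  # n = 8
example_2_software = has_linear_correlation_by_p_value(0.325)

print(example_1, example_2, example_2_software)
```

A result of False does not mean "no correlation"; it only means there is not sufficient evidence to support the conclusion of a linear correlation.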
Caution
Know that the methods of this section apply to a linear correlation. If you conclude that there
does not appear to be linear correlation, know that it is possible that there might be some other
association that is not linear.
Rounding the Linear Correlation Coefficient r
Round to three decimal places so that it can be compared to critical values in Table A-6.
Use calculator or computer if possible.
Properties of the Linear Correlation Coefficient r
1. The value of r is always between −1 and 1 inclusive. That is,
−1 ≤ r ≤ 1
2. If all values of either variable are converted to a different scale, the value of r does not change.
3. The value of r is not affected by the choice of x and y. Interchange all x- and y-values and the
value of r will not change.
4. r measures strength of a linear relationship.
5. r is very sensitive to outliers; a single outlier can dramatically affect its value.
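Properties 1 to 3 can be checked numerically. The sketch below uses the Anscombe data from Example 3 of these notes; the helper name pearson_r and the particular unit conversions are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Linear correlation coefficient computed from the sums in the formula for r."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

r = pearson_r(x, y)

# Property 2: converting either variable to a different scale leaves r unchanged.
# Here x is rescaled as if from inches to centimeters, y as if from Celsius to Fahrenheit.
r_rescaled = pearson_r([2.54 * a for a in x], [1.8 * b + 32 for b in y])

# Property 3: interchanging x and y leaves r unchanged.
r_swapped = pearson_r(y, x)

print(round(r, 3), round(r_rescaled, 3), round(r_swapped, 3))
```

All three printed values agree to three decimal places, and each lies between −1 and 1 as Property 1 requires.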
Example 1: Given n = 6 and significance level α = 0.05
Solution
Critical Values from Table A-6 and the Computed Value of r
Conclusion:
Using Table A-6 to Interpret r:
Because |r| = |0.988| = 0.988 exceeds the critical value of 0.811 from Table A-6 (n = 6, α = 0.05),
we conclude that there is sufficient evidence to support a claim of a linear correlation between
the variables.
Example 2: Interpret r using a significance level of α = 0.05
The heights (in inches) of a sample of eight mother/daughter pairs of subjects were measured.
Using the TI-83/84 Plus calculator with the paired mother/daughter heights, the linear
correlation coefficient r is found to be 0.693 (based on data from the National Health
Examination Survey).
Is there sufficient evidence to support the claim that there is a linear correlation between the
heights of mothers and heights of daughters? Explain.
Solution
Requirements are satisfied: simple random sample of quantitative data; scatterplot approximates
a straight line; no outliers.
Using Table A-6 to Interpret r:
Because |r| = |0.693| = 0.693 does not exceed the critical value of 0.707 from Table A-6
(n = 8, α = 0.05), we conclude that there is not sufficient evidence to support a claim of a linear
correlation between the heights of mothers and heights of daughters.
Using Software to Interpret r:
Assume P-value = 0.325.
Solution
Since the P-value of 0.325 is not less than or equal to the significance level α = 0.05, we conclude that there is not
sufficient evidence to support a claim of a linear correlation between the heights of mothers and
heights of daughters.
Example 3:
The table below consists of a data set from Graphs in Statistical Analysis by F. J. Anscombe, The
American Statistician, Vol. 27.
x     y
10    7.46
8     6.77
13    12.74
9     7.11
11    7.81
14    8.84
6     6.08
4     5.39
12    8.15
7     6.42
5     5.73
(a) Find the value of the linear correlation coefficient r, then determine whether there is
sufficient evidence to support the claim of a linear correlation between the two variables.
(b) Identify the feature of the data that would be missed if part (a) was completed without
constructing the scatterplot.
Solution
(a) Calculating the linear correlation coefficient r using the formula:
\[ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}} \]
n = 11
x     y        x²      y²          xy
10    7.46     100     55.6516     74.6
8     6.77     64      45.8329     54.16
13    12.74    169     162.3076    165.62
9     7.11     81      50.5521     63.99
11    7.81     121     60.9961     85.91
14    8.84     196     78.1456     123.76
6     6.08     36      36.9664     36.48
4     5.39     16      29.0521     21.56
12    8.15     144     66.4225     97.8
7     6.42     49      41.2164     44.94
5     5.73     25      32.8329     28.65
∑x = 99, ∑y = 82.5, ∑x² = 1001, ∑y² = 659.9762, ∑xy = 797.47
\[ r = \frac{11(797.47) - (99)(82.5)}{\sqrt{11(1001) - (99)^2}\,\sqrt{11(659.9762) - (82.5)^2}} \]
\[ r = \frac{8772.17 - 8167.5}{(34.78505426)(21.29526238)} \]
r = 0.81628627395
r ≈ 0.816
The linear correlation coefficient is r = 0.816.
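The hand calculation for Example 3 can be reproduced with a short script that builds the columns x², y², and xy, totals them, and substitutes the totals into the formula for r. This is a sketch for checking the arithmetic; the variable names are chosen here and are not part of the notes.

```python
import math

x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

# One row per data pair: x, y, x^2, y^2, xy.
rows = [(a, b, a * a, b * b, a * b) for a, b in zip(x, y)]

print(f"{'x':>4} {'y':>7} {'x^2':>6} {'y^2':>10} {'xy':>8}")
for a, b, a2, b2, ab in rows:
    print(f"{a:>4} {b:>7.2f} {a2:>6} {b2:>10.4f} {ab:>8.2f}")

# Column totals give the five sums needed by the formula.
sum_x, sum_y, sum_x2, sum_y2, sum_xy = (sum(col) for col in zip(*rows))
print(f"{sum_x:>4} {sum_y:>7.2f} {sum_x2:>6} {sum_y2:>10.4f} {sum_xy:>8.2f}")

n = len(rows)
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator
print(f"r = {r:.3f}")
```

The totals match the sums in the table (99, 82.5, 1001, 659.9762, 797.47), and r rounds to 0.816.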
Using the TI-83/84 calculator to calculate the linear correlation coefficient r:
Enter the x-values in list L1
Enter the y-values in list L2
Press STAT
Select TESTS
Select LinRegTTest
Xlist: L1
Ylist: L2
Freq: 1
β & ρ: ≠0 <0 >0 (select ≠0)
RegEQ: (leave blank)
Calculate
Using Table A-6 to Interpret r:
Because |r| = |0.816| = 0.816 exceeds the critical value of 0.602 from Table A-6 (n = 11, α = 0.05),
we conclude that there is sufficient evidence to support a claim of a linear correlation between
the two variables.
Using Software to Interpret r:
Assume P-value = 0.002.
Solution
Since the P-value of 0.002 is less than the significance level α = 0.05, we conclude that there is
sufficient evidence to support a claim of a linear correlation between the two variables.
(b) Identify the feature of the data that would be missed if part (a) was completed without
constructing the scatterplot.
Solution
The scatterplot indicates that the relationship between the variables is essentially a perfect
straight-line pattern except for the presence of one outlier.
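The effect of that single outlier can be seen by computing r with and without it, as the requirements for linear correlation recommend. The sketch below assumes the outlier is the pair (13, 12.74), which is the point that departs from the straight-line pattern in this data set; pearson_r is a helper defined here for illustration.

```python
import math


def pearson_r(x, y):
    """Linear correlation coefficient computed from the sums in the formula for r."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

# r with all 11 pairs.
r_all = pearson_r(x, y)

# r after dropping the assumed outlier (13, 12.74).
kept = [(a, b) for a, b in zip(x, y) if (a, b) != (13, 12.74)]
r_without = pearson_r([a for a, _ in kept], [b for _, b in kept])

print(f"with outlier:    r = {r_all:.3f}")
print(f"without outlier: r = {r_without:.3f}")
```

With the outlier, r is about 0.816; without it, r is very close to 1, which is the near-perfect straight-line pattern the scatterplot reveals.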
Common Errors Involving Correlation
1. Causation: It is wrong to conclude that correlation implies causality.
2. Averages: Averages suppress individual variation and may inflate the correlation coefficient.
3. Linearity: There may be some relationship between x and y even when there is no linear
correlation.
Caution
Know that correlation does not imply causality.
Part 2: Formal Hypothesis Test
Formal Hypothesis Test
We wish to determine whether there is a significant linear correlation between two variables.
Hypothesis Test for Correlation Notation
n = number of pairs of sample data
r = linear correlation coefficient for a sample of paired data
ρ = linear correlation coefficient for a population of paired data
Hypothesis Test for Correlation Requirements
1. The sample of paired (x, y) data is a simple random sample of quantitative data.
2. Visual examination of the scatterplot must confirm that the points approximate a straight-line
pattern.
3. The outliers must be removed if they are known to be errors. The effects of any other outliers
should be considered by calculating r with and without the outliers included.
Hypothesis Test for Correlation Hypotheses
H0: ρ = 0 (There is no linear correlation.)
H1: ρ ≠ 0 (There is a linear correlation.)
Test Statistic: r
Critical Values: Refer to Table A-6
Hypothesis Test for Correlation Conclusion
If |r| > critical value from Table A-6, reject H0 and conclude that there is sufficient evidence
to support the claim of a linear correlation.
If |r| ≤ critical value from Table A-6, fail to reject H0 and conclude that there is not sufficient
evidence to support the claim of a linear correlation.
Hypothesis Test for Correlation P-Value from a t Test
H0: ρ = 0 (There is no linear correlation.)
H1: ρ ≠ 0 (There is a linear correlation.)
Test Statistic: t
\[ t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} \]
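The test statistic t = r / sqrt((1 − r²) / (n − 2)) is simple to compute directly. The sketch below uses the unrounded r and n = 11 from Example 3 of these notes; the function name t_statistic is chosen here for illustration.

```python
import math


def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


# Unrounded r from Example 3 (n = 11 pairs).
t = t_statistic(0.81628627395, 11)
print(f"t = {t:.3f}")
```

Using the unrounded r avoids a small rounding error in t; with r rounded to 0.816 the result differs slightly in the third decimal place.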
Hypothesis Test for Correlation Conclusion
P-value: Use computer software to find the P-value corresponding to the test statistic t, which has
n − 2 degrees of freedom.
If the P-value is less than or equal to the significance level, reject H0 and conclude that there
is sufficient evidence to support the claim of a linear correlation.
If the P-value is greater than the significance level, fail to reject H0 and conclude that there
is not sufficient evidence to support the claim of a linear correlation.
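For readers without StatCrunch or a TI calculator at hand, the two-tailed P-value can be approximated by numerically integrating the Student t density. This is a rough sketch under stated assumptions, not the method the software uses: the function names are made up here, and Simpson's rule stands in for a proper t-distribution routine.

```python
import math


def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + u * u / df) ** (-(df + 1) / 2)


def two_tailed_p_value(t, df, steps=10_000):
    """Two-tailed P-value: 1 - 2 * (area under the density from 0 to |t|).

    The area is approximated with Simpson's rule; steps must be even.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 1 - 2 * area


# Test statistic and degrees of freedom from the n = 11 example in these notes.
p = two_tailed_p_value(4.239372102, df=11 - 2)
print(f"P-value = {p:.4f}")
```

For t = 4.239372102 with 9 degrees of freedom this gives a P-value of about 0.002, in line with the calculator output quoted in the example.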
Note:
The exercises in this section will involve only two-tailed tests.
One-Tailed Tests
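One-tailed tests can occur with a claim of a positive linear correlation or a claim of a
negative linear correlation. In such cases, the hypotheses are as follows.
Claim of negative correlation (left-tailed test): H0: ρ = 0, H1: ρ < 0
Claim of positive correlation (right-tailed test): H0: ρ = 0, H1: ρ > 0
Note: For these one-tailed tests, the P-value method can be used as in earlier chapters.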
Example:
The table below consists of a data set from Graphs in Statistical Analysis by F. J. Anscombe, The
American Statistician, Vol. 27.
x     y
10    7.46
8     6.77
13    12.74
9     7.11
11    7.81
14    8.84
6     6.08
4     5.39
12    8.15
7     6.42
5     5.73
(a) Construct a scatterplot.
(b) Find the value of the linear correlation coefficient r.
(c) Compute the test statistic.
(d) Find the critical values of r from Table A-6 using α = 0.05.
(e) Find the P-value.
(f) Determine whether there is sufficient evidence to support the claim of a linear correlation
between the two variables.
Solution
(a) Scatterplot: See the notes on constructing a scatterplot in Section 2.4 of the textbook.
(b) Linear correlation coefficient r (from the TI-83/84 calculator output):
r = 0.816
(c) Test statistic:
t = 4.239372102 ≈ 4.239
(d) Critical value of r:
From Table A-6, for n = 11 and α = 0.05, the critical value of r is 0.602.
(e) P-value = 0.0021763053 ≈ 0.002
(f) Conclusion
Using Table A-6 to interpret r:
Because |r| = |0.816| = 0.816 exceeds the critical value of 0.602 from Table A-6 (n = 11, α = 0.05),
we conclude that there is sufficient evidence to support a claim of a linear correlation between
the two variables.
Using the P-value:
The P-value of 0.002 is less than 0.05, so we reach the same conclusion.
Using StatCrunch
To access StatCrunch, log into MyStatLab.
Watch the video (Getting Started) below to learn how to use StatCrunch.
Remember that you can access more videos like those listed below and other resources on using
the software by clicking on the “Help” tab in StatCrunch.
Creating a Scatterplot
Section 10.2 – Regression
Homework (Question # 3)
Using Table A-6 to find the critical value:
n = 70, α = 0.05, so the critical value of r is 0.305.
Based on the information in the chart above, we see that the regression equation is not a good
model because the linear correlation coefficient (r = 0.283) does not exceed the critical value
(0.305), and so there is not sufficient evidence to support a claim of a linear correlation.
Therefore, the best predicted pulse rate is the sample mean of the y-values (ȳ = 75.1), not the
value given by the regression equation.
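The reasoning in this homework answer follows a general rule: use the regression equation for predictions only when there is sufficient evidence of a linear correlation; otherwise use the sample mean of the y-values. A minimal sketch of that rule is below. The function name is made up, and the intercept, slope, and x-value passed in the first call are hypothetical placeholders, since the homework's regression equation is not shown in these notes.

```python
def best_prediction(x_value, r, critical_value, y_mean, b0, b1):
    """Return b0 + b1 * x when |r| exceeds the critical value, else the mean of y."""
    if abs(r) > critical_value:
        return b0 + b1 * x_value
    return y_mean


# Homework values: r = 0.283, critical value 0.305, y-bar = 75.1.
# b0 = 70.0, b1 = 0.1 and x_value = 60 are hypothetical placeholders.
homework = best_prediction(60, r=0.283, critical_value=0.305, y_mean=75.1, b0=70.0, b1=0.1)

# Contrast: a case where |r| exceeds the critical value, so the line is used.
# The numbers here are illustrative only.
contrast = best_prediction(10, r=0.816, critical_value=0.602, y_mean=7.5, b0=3.0, b1=0.5)

print(homework, contrast)
```

In the homework case the function returns the sample mean 75.1 regardless of the placeholder slope and intercept, which is the conclusion reached above.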