Ch. 3 Review Packet

ap stats

Uploaded by

sshreev2703
AP Statistics Chapter 3: Examining Relationships

3.1: Scatterplots and Correlation

Explanatory and Response Variables
A response variable measures an outcome of a study. An explanatory variable attempts to explain the observed outcomes. The explanatory variable is sometimes referred to as the independent variable and is typically symbolized by the variable x. The response variable is sometimes referred to as the dependent variable and is typically symbolized by the variable y.

Scatterplot
A scatterplot shows the relationship between two quantitative variables measured on the same individuals. The values of the explanatory variable appear on the horizontal axis, and the values of the response variable appear on the vertical axis. If there is no clear explanatory/response relationship between the two variables, then either variable can be placed on either axis. Each individual in the data set appears as a single point in the plot, fixed by the values of both variables for that individual.

Examining a Scatterplot
In any graph of data, look for patterns and deviations from the pattern. Describe the overall pattern of a scatterplot by the form, direction, and strength of the relationship.
* Form can be described as linear or curved.
* Direction can be described as positive, negative, or neither.
* Strength can be described as weak, moderate, or strong.
A deviation from the overall pattern of a scatterplot is called an outlier.

Association
* Two variables are positively associated if, as one increases, the other tends to increase.
* Two variables are negatively associated if, as one increases, the other tends to decrease.

Correlation
Correlation measures the strength and direction of the linear relationship between two quantitative variables. Correlation is usually represented by the letter r.

Facts about Correlation
1. When calculating correlation, it makes no difference which variable is x and which is y.
2. Correlation is only calculated for quantitative variables, not categorical ones.
3. The value of r does not change if the units of x and/or y are changed.
4. Positive r indicates a positive association between x and y. Negative r indicates a negative association.
5. Correlation is always a number between -1 and +1. Values close to +1 or -1 indicate that the points lie close to a line. The extreme values of +1 and -1 are only achieved when the points lie exactly on a line.
6. Correlation measures the strength of a linear relationship between two variables, not of curved relationships.
7. Correlation, like the mean and standard deviation, is nonresistant. Recall that this means it is greatly affected by outliers.

3.2: Least-Squares Regression

Regression Line
A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes. The line is often used to predict values of y for given values of x. Regression, unlike correlation, requires an explanatory/response relationship. In other words, when x and y are reversed, the regression line changes. Recall that correlation is the same no matter which variable is x and which is y.

Least-Squares Regression Line
The least-squares regression line is the line that makes the sum of the squares of the vertical distances from the data points to the line as small as possible.

Equation of the Least-Squares Regression Line
To find the equation of the regression line in the form ŷ = a + bx, where a is the y-intercept and b is the slope, use the following equations:
    b = r (s_y / s_x)
    a = ȳ - b x̄
Here x̄ and ȳ are the means of x and y, s_x and s_y are their standard deviations, and r is the correlation.

The Role of r-squared (Coefficient of Determination)
The square of the correlation coefficient, or r-squared, represents the proportion of the variation in the y-variable that is accounted for by its linear relationship with the x-variable. So if r-squared for the regression between x and y is 0.73, we can say that x accounts for 73% of the variation in y.

Residuals
A residual is the difference between an observed value of y and the value predicted by the regression line.
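The quantities in this summary can be tried out on a small data set. The following is a minimal Python sketch, not part of the packet: it assumes the usual textbook formulas b = r (s_y / s_x) and a = ȳ - b x̄, and the data values and function names are made up for illustration.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    # Sample standard deviation (divides by n - 1).
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    # r = average product of standardized x and y values (dividing by n - 1).
    mx, my = mean(xs), mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    n = len(xs)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)

def lsrl(xs, ys):
    # Least-squares line y-hat = a + b*x, built from r and the summary statistics.
    b = correlation(xs, ys) * sample_sd(ys) / sample_sd(xs)
    a = mean(ys) - b * mean(xs)
    return a, b

# Hypothetical data, chosen only to keep the arithmetic simple.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
a, b = lsrl(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]  # actual y - predicted y

r_swapped = correlation(ys, xs)                        # fact 1: same r with x and y swapped
r_rescaled = correlation([2.54 * x for x in xs], ys)   # fact 3: same r after a change of units

print("r =", round(r, 4), " r-squared =", round(r * r, 4))
print("y-hat =", round(a, 4), "+", round(b, 4), "* x")
print("residuals:", [round(e, 4) for e in residuals])
```

Swapping the arguments of `lsrl` gives a different line, which is one way to see that regression, unlike correlation, depends on which variable is treated as explanatory.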
That is, residual = actual y - predicted y.

Residual Plot
A residual plot is a scatterplot of each x-value and its residual value. The residual plot is used to determine whether a linear equation is a good model for a set of data, as follows:
* If the residual plot exhibits randomness, then a line is a good model for the data.
* If the residual plot exhibits a pattern, then a line is NOT a good model for the data.

Outliers and Influential Points
A point that lies outside the overall pattern of the other observations is considered an outlier. If the removal of such a point has a large effect on the correlation and/or regression, that point is considered an influential point.

Chapter 3 MC Study Guide

Multiple Choice
Identify the choice that best completes the statement or answers the question.

1. In a statistics course, a linear regression equation was computed to predict the final exam score from the score on the first test. The equation was ŷ = 10 + 0.9x, where y is the final exam score and x is the score on the first test. Carla scored 95 on the first test. On the final exam, Carla scored 98. What is the value of her residual?
a. 8
b. 2.5
c. -2.5
d. 0
e. None of the above

2. In the scatterplot below, if each x-value were decreased by one unit and the y-values remained the same, then the correlation r would
[Scatterplot not reproduced]
a. Decrease by 1 unit
b. Decrease slightly
c. Increase slightly
d. Stay the same
e. Can't tell without knowing the data values

3. In regression, the residuals are which of the following?
a. Those factors unexplained by the data
b. The difference between the observed responses and the values predicted by the regression line
c. Those data points which were recorded after the formal investigation was completed
d. Possible models unexplored by the investigator
e. None of the above

4. Which of the following statements are true?
I. Correlation and regression require explanatory and response variables.
II. Scatterplots require that both variables be quantitative.
III. Every least-squares regression line passes through (x̄, ȳ).
a. I and II only
b. I and III only
c. II and III only
d. I, II, and III
e. None of the above

5. Suppose the following information was collected, where X = diameter of tree trunk in inches, and Y = tree height in feet.

X |  4 |  2 |  8 |  6 | 10 |  6
Y |  8 |  4 | 18 | 22 | 30 |  8

If the LSRL equation is ŷ = -3.6 + 3.1x, what is your estimate of the average height of all trees having a trunk diameter of 7 inches?
a. 18.1
b. 19.1
c. 20.1
d. 21.1
e. 22.1

6. Suppose we fit the least squares regression line to a set of data. What is true if a plot of the residuals shows a curved pattern?
a. A straight line is not a good model for the data.
b. The correlation must be 0.
c. The correlation must be positive.
d. Outliers must be present.
e. The LSRL might or might not be a good model for the data, depending on the extent of the curve.

7. Which of the following are resistant?
a. Least squares regression line
b. Correlation coefficient
c. Both the least squares line and the correlation coefficient
d. Neither the least squares line nor the correlation coefficient
e. It depends

8. A copy machine dealer has data on the number x of copy machines at each of 89 customer locations and the number y of service calls in a month at each location. Summary calculations give x̄ = 8.4, s_x = 2.1, ȳ = 14.2, s_y = 3.8, and r = 0.86. What is the slope of the least squares regression line of number of service calls on number of copiers?
a. 0.86
b. 1.56
c. 0.48
d. None of these
e. Can't tell from the information given

9. There is a linear relationship between the number of chirps made by the striped ground cricket and the air temperature. A least squares fit of some data collected by a biologist gives the model ŷ = 25.2 + 3.3x, ...

... ŷ = 1400 + 2000x, where y is the raise amount and x is the performance rating. Which of statements (a) to (d) is not correct?
(a) For each increase of one point in performance rating, the raise will increase on average by $2000.
(b) This equation produces predicted raises with an average error of 0.
(c) A rating of 0 will yield a predicted raise of $1400.
(d) The correlation between salary raise and performance rating is positive.
(e) All of the above are true.

Leonardo da Vinci, the renowned painter, speculated that an ideal human would have an armspan (distance from outstretched fingertip of left hand to outstretched fingertip of right hand) that was equal to his height. The following computer regression printout shows the results of a least-squares regression on height and armspan, in inches, for a sample of 18 high school students.

[Regression printout: dependent variable is Height; s = 1.61 with 18 - 2 = 16 degrees of freedom. The R-squared values, the analysis-of-variance table (Sum of Squares, df, Mean Square, F-ratio), and the coefficient estimates are not legible in this copy.]

Which of the following statements is false?
(a) This least-squares regression model would make a prediction that is 1.63 inches higher than da Vinci projected for a 62-inch tall student.
(b) One of the students in the sample had a height of 70.5 inches and an armspan of 68 inches. The residual for this student is 1.83 inches.
(c) Da Vinci's projection is lower than the prediction that this least-squares line will make for any height.
(d) For every one-inch increase in armspan, the regression model predicts about a 0.84-inch increase in height.
(e) For a student 66 inches tall, our model would predict an armspan of about 67 inches.

Chapter 3 Part 2, Free Response
Answer completely, but be concise. Show your work.

Joey appears to be growing slowly as a toddler. His height between 18 and 30 months of age increases as follows:

Age (months) | Observed height (cm) | Predicted height | Residual
18           | 76.5                 | 70.08            |
21           | 78.7                 | 73.09            |
24           | 82.0                 | 81.6             | 0.4
27           | 84.8                 | 84.1             |
30           | 96.0                 | 62               |

The least-squares regression line fitted to this data has equation HEIGHT = 61.5 + 0.837 AGE.

9. Finish filling in the table above.

10. Sketch a residual plot on the axes provided.

11. Based on your residual plot, would you describe Joey's growth pattern from 18 to 30 months as linear? Explain.

12. According to the least-squares principle, which of the lines below provides the best fit for the data shown in the scatterplot? Justify your answer.
[Scatterplot not reproduced. The equations for choices (a) to (d) are not legible in this copy.]
(e) y = 3 - 1.5x

13. Anthropologists must often estimate from human remains how tall the person was when alive. Carla is studying how overall height can be predicted from the length of a leg bone in a group of 36 living males. The data show that the bone lengths have mean 45.9 cm and standard deviation 4.2 cm, the overall heights have mean 172.7 cm and standard deviation 8.14 cm, and the correlation between bone length and height is 0.914.
(a) Determine the equation of the least-squares regression line of height on bone length. Show your work.
(b) Interpret the correlation in the context of this problem.

14. In general, is correlation a resistant measure of association? Explain briefly or give a simple example to illustrate.
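For problem 13(a), the slope and intercept can be found from the summary statistics alone. The following is a minimal Python sketch for checking the arithmetic, assuming the standard formulas b = r (s_y / s_x) and a = ȳ - b x̄. The helper name `lsrl_from_summary` is illustrative and does not come from the packet.

```python
def lsrl_from_summary(mean_x, sd_x, mean_y, sd_y, r):
    """Return (a, b) for the line y-hat = a + b*x, using only summary statistics."""
    b = r * sd_y / sd_x        # slope: r times the ratio of standard deviations
    a = mean_y - b * mean_x    # intercept: forces the line through (x-bar, y-bar)
    return a, b

# Problem 13: x = leg bone length (cm), y = overall height (cm).
a, b = lsrl_from_summary(mean_x=45.9, sd_x=4.2, mean_y=172.7, sd_y=8.14, r=0.914)
print(f"predicted height = {a:.2f} + {b:.3f} * (bone length)")

# The line always predicts the mean height at the mean bone length.
predicted_at_mean = a + b * 45.9
print(f"prediction at the mean bone length: {predicted_at_mean:.1f} cm")
```

The same helper applies to any problem that supplies the two means, the two standard deviations, and r, as long as x is the explanatory variable and y is the response.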
