Introduction to Probability and Statistics
15th Edition
Metric Version

William Mendenhall, III
1925-2009

Robert J. Beaver
University of California, Riverside, Emeritus

Barbara M. Beaver
University of California, Riverside, Emerita

CENGAGE
Australia • Brazil • Mexico • Singapore • United Kingdom • United States

Copyright 2020 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
This is an electronic version of the print textbook. Due to electronic rights restrictions,
some third party content may be suppressed. Editorial review has deemed that any suppressed
content does not materially affect the overall learning experience. The publisher reserves the right
to remove content from this title at any time if subsequent rights restrictions require it. For
valuable information on pricing, previous editions, changes to current editions, and alternate
formats, please visit www.cengage.com/highered to search by ISBN#, author, title, or keyword for
materials in your areas of interest.

Important Notice: Media content referenced within the product description or the product
text may not be available in the eBook version.

Introduction to Probability and Statistics, Fifteenth Edition, Metric Version
William Mendenhall, III, Robert J. Beaver, Barbara M. Beaver

Metric Version prepared by Qaboos Imran

International Product Director: Timothy L. Anderson
Senior Product Assistant: Alexander Sham
Content Manager: Marianne Groth
Associate Marketing Manager: Tori Sitcawich
Associate Content Managers: Abby DeVeuve, Amanda Rose
Manufacturing Planner: Doug Bertke
IP Analyst: Reba Frederics
IP Project Manager: Carly Belcher
Production Service/Compositor: SPi Global
Inventory Analyst: Sarah Ginsberg
Art Director: Vernon T. Boes
Senior Designer: Diana Graham
Cover Image: mikroman6/Getty Images

© 2020, 2013, 2009 Cengage Learning, Inc.
WCN: 02-300

ALL RIGHTS RESERVED. No part of this work covered by the copyright
herein may be reproduced or distributed in any form or by any means,
except as permitted by U.S. copyright law, without the prior written
permission of the copyright owner.

For product information and technology assistance, contact us at
Cengage Customer & Sales Support, 1-800-354-9706
or support.cengage.com.

For permission to use material from this text or product, submit all
requests online at www.cengage.com/permissions.

ISBN: 978-0-357-11446-9

Cengage International Offices

Asia: www.cengageasia.com, tel: (65) 64101200
Australia/New Zealand: www.cengage.com.au, tel: (61) 3 9685 4111
Brazil: www.cengage.com.br, tel: (55) 11 3665 9900
India: www.cengage.co.in, tel: (91) 1143641111
Latin America: www.cengage.com.mx, tel: (52) 55 1500 6000
UK/Europe/Middle East/Africa: www.cengage.co.uk, tel: (44) 0 1264 332 424

Represented in Canada by
Nelson Education, Ltd.
tel: (416) 752 9100 / (800) 668 0671
www.nelson.com

Cengage is a leading provider of customized learning solutions with
office locations around the globe, including Singapore, the United
Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local office
at: www.cengage.com/global.

For product information: www.cengage.com/international
Visit your local office: www.cengage.com/global
Visit our corporate website: www.cengage.com

Printed in the United States of America
Print Number: 01 Print Year: 2018
Brief Contents

INTRODUCTION: WHAT IS STATISTICS?
1 DESCRIBING DATA WITH GRAPHS
2 DESCRIBING DATA WITH NUMERICAL MEASURES
3 DESCRIBING BIVARIATE DATA
4 PROBABILITY
5 DISCRETE PROBABILITY DISTRIBUTIONS
6 THE NORMAL PROBABILITY DISTRIBUTION
7 SAMPLING DISTRIBUTIONS
8 LARGE-SAMPLE ESTIMATION
9 LARGE-SAMPLE TESTS OF HYPOTHESES
10 INFERENCE FROM SMALL SAMPLES
11 THE ANALYSIS OF VARIANCE
12 SIMPLE LINEAR REGRESSION AND CORRELATION
13 MULTIPLE LINEAR REGRESSION ANALYSIS
14 ANALYSIS OF CATEGORICAL DATA
15 NONPARAMETRIC STATISTICS
APPENDIX I
DATA SOURCES
ANSWERS TO SELECTED EXERCISES
INDEX

Contents

Introduction: What Is Statistics?
The Population and the Sample
Descriptive and Inferential Statistics
Achieving the Objective of Inferential Statistics: The Necessary Steps
Keys for Successful Learning

1 Describing Data with Graphs
1.1 Variables and Data
Types of Variables
Exercises
1.2 Graphs for Categorical Data
Exercises
1.3 Graphs for Quantitative Data
Pie Charts and Bar Charts
Line Charts
Dotplots
Stem and Leaf Plots
Interpreting Graphs with a Critical Eye
Exercises
1.4 Relative Frequency Histograms
Exercises

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: How Is Your Blood Pressure?

2 Describing Data with Numerical Measures
Introduction
2.1 Measures of Center
Exercises
2.2 Measures of Variability
Exercises
2.3 Understanding and Interpreting the Standard Deviation
Tchebysheff's Theorem
The Empirical Rule
Approximating s Using the Range
Exercises
2.4 Measures of Relative Standing
z-Scores
Percentiles and Quartiles
The Five-Number Summary and the Box Plot
Exercises

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: The Boys of Summer

3 Describing Bivariate Data
Introduction
3.1 Describing Bivariate Categorical Data
Exercises
3.2 Describing Bivariate Quantitative Data
Scatterplots
The Correlation Coefficient
The Least-Squares Line
Exercises

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: Are Your Clothes Really Clean?

4 Probability
Introduction
4.1 Events and the Sample Space
Exercises
4.2 Calculating Probabilities Using Simple Events
Exercises
4.3 Useful Counting Rules
Using the TI-83/84 Plus Calculator
Exercises
4.4 Rules for Calculating Probabilities
Calculating Probabilities for Unions and Complements
Calculating Probabilities for Intersections
Exercises
4.5 Bayes' Rule
Exercises

Chapter Review
Reviewing What You've Learned
CASE STUDY: Probability and Decision Making in the Congo

5 Discrete Probability Distributions
5.1 Discrete Random Variables and Their Probability Distributions
Random Variables
Probability Distributions
The Mean and Standard Deviation for a Discrete Random Variable
Exercises
5.2 The Binomial Probability Distribution
Exercises
5.3 The Poisson Probability Distribution
Exercises
5.4 The Hypergeometric Probability Distribution
Exercises

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: A Mystery: Cancers Near a Reactor

6 The Normal Probability Distribution
6.1 Probability Distributions for Continuous Random Variables
The Continuous Uniform Probability Distribution
The Exponential Probability Distribution
Exercises
6.2 The Normal Probability Distribution
The Standard Normal Random Variable
Calculating Probabilities for a General Normal Random Variable
Exercises
6.3 The Normal Approximation to the Binomial Probability Distribution
Exercises
Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: "Are You Going to Curve the Grades?"

7 Sampling Distributions
Introduction
7.1 Sampling Plans and Experimental Designs
Exercises
7.2 Statistics and Sampling Distributions
Exercises
7.3 The Central Limit Theorem and the Sample Mean
The Central Limit Theorem
The Sampling Distribution of the Sample Mean
Standard Error of the Sample Mean
Exercises
7.4 Assessing Normality
7.5 The Sampling Distribution of the Sample Proportion
Exercises
7.6 A Sampling Application: Statistical Process Control (Optional)
A Control Chart for the Process Mean: The x̄ Chart
A Control Chart for the Proportion Defective: The p Chart
Exercises

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: Sampling the Roulette at Monte Carlo

8 Large-Sample Estimation
8.1 Where We've Been and Where We're Going
Statistical Inference
Types of Estimators
8.2 Point Estimation
Exercises
8.3 Interval Estimation
Constructing a Confidence Interval
Large-Sample Confidence Interval for a Population Mean μ
Interpreting the Confidence Interval
Large-Sample Confidence Interval for a Population Proportion p
Using Technology
Exercises
8.4 Estimating the Difference Between Two Population Means
Exercises
8.5 Estimating the Difference Between Two Binomial Proportions
Using Technology
Exercises
8.6 One-Sided Confidence Bounds
Exercises
8.7 Choosing the Sample Size
Exercises

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: How Reliable Is That Poll? CBS News: How and Where America Eats

9 Large-Sample Tests of Hypotheses
Introduction
9.1 A Statistical Test of Hypothesis
Exercises
9.2 A Large-Sample Test About a Population Mean
The Essentials of the Test
Calculating the p-Value
Two Types of Errors
The Power of a Statistical Test
Exercises
9.3 A Large-Sample Test of Hypothesis for the Difference Between Two Population Means
Hypothesis Testing and Confidence Intervals
Exercises
9.4 A Large-Sample Test of Hypothesis for a Binomial Proportion
Statistical Significance and Practical Importance
Exercises
9.5 A Large-Sample Test of Hypothesis for the Difference Between Two Binomial Proportions
Exercises
9.6 Concluding Comments on Testing Hypotheses

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: An Aspirin a Day ...?

10 Inference from Small Samples
Introduction
10.1 Student's t Distribution
Assumptions behind Student's t Distribution
Exercises
10.2 Small-Sample Inferences Concerning a Population Mean
Exercises
10.3 Small-Sample Inferences for the Difference Between Two Population Means: Independent Random Samples
Exercises
10.4 Small-Sample Inferences for the Difference Between Two Means: A Paired-Difference Test
Exercises
10.5 Inferences Concerning a Population Variance
Exercises
10.6 Comparing Two Population Variances
Exercises
10.7 Revisiting the Small-Sample Assumptions

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: School Accountability - Are We Doing Better?

11 The Analysis of Variance
11.1 The Design of an Experiment
Basic Definitions
What Is an Analysis of Variance?
The Assumptions for an Analysis of Variance
Exercises
11.2 The Completely Randomized Design: A One-Way Classification
Partitioning the Total Variation in the Experiment
Testing the Equality of the Treatment Means
Estimating Differences in the Treatment Means
Exercises
11.3 Ranking Population Means
Exercises
11.4 The Randomized Block Design: A Two-Way Classification
Partitioning the Total Variation in the Experiment
Testing the Equality of the Treatment and Block Means
Identifying Differences in the Treatment and Block Means
Some Cautionary Comments on Blocking
Exercises
11.5 The a × b Factorial Experiment: A Two-Way Classification
The Analysis of Variance for an a × b Factorial Experiment
Exercises
11.6 Revisiting the Analysis of Variance Assumptions
Residual Plots
11.7 A Brief Summary

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: How to Save Money on Groceries!

12 Simple Linear Regression and Correlation
Introduction
12.1 Simple Linear Regression
A Simple Linear Model
The Method of Least Squares
Exercises
12.2 An Analysis of Variance for Linear Regression
Exercises
12.3 Testing the Usefulness of the Linear Regression Model
Inferences About β, the Slope of the Line of Means
The Analysis of Variance F-Test
Measuring the Strength of the Relationship: The Coefficient of Determination
Interpreting the Results of a Significant Regression
Exercises
12.4 Diagnostic Tools for Checking the Regression Assumptions
Dependent Error Terms
Residual Plots
Exercises
12.5 Estimation and Prediction Using the Fitted Line
Exercises
12.6 Correlation Analysis
Exercises

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: Is Your Car "Made in the U.S.A."?

13 Multiple Linear Regression Analysis
Introduction
13.1 The Multiple Regression Model
13.2 Multiple Regression Analysis
The Method of Least Squares
The Analysis of Variance
Testing the Usefulness of the Regression Model
Interpreting the Results of a Significant Regression
Best Subsets Regression
Checking the Regression Assumptions
Using the Regression Model for Estimation and Prediction
Exercises
13.3 A Polynomial Regression Model
Exercises
13.4 Using Quantitative and Qualitative Predictor Variables in a Regression Model
Exercises
13.5 Testing Sets of Regression Coefficients
13.6 Other Topics in Multiple Linear Regression
Interpreting Residual Plots
Stepwise Regression Analysis
Binary Logistic Regression
Misinterpreting a Regression Analysis
13.7 Steps to Follow When Building a Multiple Regression Model
Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: "Made in the U.S.A." - Another Look

14 Analysis of Categorical Data
14.1 The Multinomial Experiment and the Chi-Square Statistic
14.2 Testing Specified Cell Probabilities: The Goodness-of-Fit Test
Exercises
14.3 Contingency Tables: A Two-Way Classification
The Chi-Square Test of Independence
Exercises
14.4 Comparing Several Multinomial Populations: A Two-Way Classification with Fixed Row or Column Totals
Exercises
14.5 Other Topics in Categorical Data Analysis
The Equivalence of Statistical Tests
Other Applications of the Chi-Square Test

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: Who Is the Primary Breadwinner in Your Family?

15 Nonparametric Statistics
Introduction
15.1 The Wilcoxon Rank Sum Test: Independent Random Samples
Normal Approximation for the Wilcoxon Rank Sum Test
Exercises
15.2 The Sign Test for a Paired Experiment
Normal Approximation for the Sign Test
Exercises
15.3 A Comparison of Statistical Tests
15.4 The Wilcoxon Signed-Rank Test for a Paired Experiment
Normal Approximation for the Wilcoxon Signed-Rank Test
Exercises
15.5 The Kruskal-Wallis H-Test for Completely Randomized Designs
Exercises
15.6 The Friedman F_r-Test for Randomized Block Designs
Exercises
15.7 Rank Correlation Coefficient
Exercises
15.8 Summary

Chapter Review
Technology Today
Reviewing What You've Learned
CASE STUDY: Amazon HQ2

Appendix I
Table 1 Cumulative Binomial Probabilities
Table 2 Cumulative Poisson Probabilities
Table 3 Areas under the Normal Curve
Table 4 Critical Values of t
Table 5 Critical Values of Chi-Square
Table 6 Percentage Points of the F Distribution
Table 7 Critical Values of T for the Wilcoxon Rank Sum Test, n1 ≤ n2
Table 8 Critical Values of T for the Wilcoxon Signed-Rank Test, n = 5(1)50
Table 9 Critical Values of Spearman's Rank Correlation Coefficient for a One-Tailed Test
Table 10 Random Numbers
Table 11 Percentage Points of the Studentized Range, q.05(k, df)

Data Sources

Answers to Selected Exercises

Index

Preface

Every time you pick up a newspaper or a magazine, watch TV, or scroll through Facebook,
you encounter statistics. Every time you fill out a questionnaire, register at an online
website, or pass your grocery rewards card through an electronic scanner, your personal
information becomes part of a database containing your personal statistical information.
You can't avoid it! In this digital age, data collection and analysis are part of our day-to-day
activities. If you want to be an educated consumer and citizen, you need to understand how
statistics are used and misused in our daily lives.
This international metric version is designed for classrooms and students outside of
the United States. The units of measurement used in selected examples and exercises have
been changed from U.S. Customary units to metric units. We did not update problems that
are specific to U.S. Customary units, such as passing yards in football or data related to
specific publications.

The Secret to Our Success


The first college course in introductory statistics that we ever took used Introduction to
Probability and Statistics by William Mendenhall. Since that time, this text, currently in
the fifteenth edition, has helped generations of students understand what statistics is all
about and how it can be used as a tool in their particular area of application. The secret to
the success of Introduction to Probability and Statistics is its ability to blend the old with
the new. With each revision we try to build on the strong points of previous editions, and
to look for new ways to motivate, encourage, and interest students using new technologies.

Hallmark Features of the Fifteenth Edition


The fifteenth edition keeps the traditional outline for the coverage of descriptive and
inferential statistics used in previous editions. This revision maintains the straightforward
presentation of the fourteenth edition. We have continued to simplify the language in order to
make the text more readable without sacrificing the statistical integrity of the presentation.
We want students to understand how to apply statistical procedures, and also to understand
• how to meaningfully describe real sets of data
• how to explain the results of statistical tests in a practical way
• how to tell whether the assumptions behind statistical tests are valid
• what to do when these assumptions have been violated

Exercises
As with all previous editions, the variety and number of real applications in the exercise
sets is a major strength of this edition. We have revised the exercise sets to provide new and
interesting real-world situations and real data sets, many of which are drawn from current
periodicals and journals. The fifteenth edition contains over 1900 exercises, many of which
are new to this edition. Exercises are graduated in level of difficulty; some, involving only
basic techniques, can be solved by almost all students, while others, involving practical
applications and interpretation of results, will challenge students to use more sophisticated
statistical reasoning and understanding. Exercises have been rearranged to provide a more
even distribution of exercises within each chapter and a new numbering system has been
introduced, so that numbering begins again with each new section.

Organization and Coverage


We believe that Chapters 1 through 10, with the possible exception of Chapter 3, should
be covered in the order presented. The remaining chapters can be covered in any order.
The analysis of variance chapter precedes the regression chapter, so that the instructor can
present the analysis of variance as part of a regression analysis. Thus, the most effective
presentation would order these three chapters as well.
Chapters 1-3 present descriptive data analysis for both one and two variables, using
MINITAB 18, Microsoft Excel 2016, and TI-83/84 Plus graphics. Chapter 4 includes a full
presentation of probability. The last section of Chapter 4 in the fourteenth edition of the
text, "Discrete Random Variables and Their Probability Distributions," has been moved to
become the first section in Chapter 5. As in the fourteenth edition, the chapters on analysis
of variance and linear regression include both calculational formulas and computer printouts
in the basic text presentation. These chapters can be used with equal ease by instructors
who wish to use the "hands-on" computational approach to linear regression and ANOVA
and by those who choose to focus on the interpretation of computer-generated statistical
printouts. This edition includes expanded coverage of the uniform and exponential distributions
in Chapter 5 and normal probability plots for assessing normality in Chapter 7,
in addition to an expanded t-table (Table 4 in Appendix I). New topics in Chapter 13 include
best subsets regression procedures and binary logistic regression.
One important feature in the hypothesis testing chapters involves the emphasis on
p-values and their use in judging statistical significance. With the advent of computer-generated
p-values, these probabilities have become essential in reporting the results of a
statistical analysis. As such, the observed value of the test statistic and its p-value are presented
together at the outset of our discussion of statistical hypothesis testing as equivalent
tools for decision-making. Statistical significance is defined in terms of preassigned values
of α, and the p-value approach is presented as an alternative to the critical value approach
for testing a statistical hypothesis. Examples are presented using both the p-value and
critical value approaches to hypothesis testing. Discussion of the practical interpretation
of statistical results, along with the difference between statistical significance and practical
significance, is emphasized in the practical examples in the text.
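
As a small illustration of why the p-value and critical value approaches serve as equivalent tools for decision-making, the following Python sketch applies both to the same upper-tailed large-sample z-test. The function name, the observed value z = 2.10, and the significance level α = 0.05 are assumptions chosen only for illustration; they do not come from the text. Only the Python standard library is used.

```python
from statistics import NormalDist


def upper_tail_z_test(z_observed, alpha=0.05):
    """Return the p-value, the critical value, and both decisions
    for an upper-tailed z-test (hypothetical helper for illustration)."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

    # p-value approach: area under the standard normal curve to the right of z
    p_value = 1.0 - std_normal.cdf(z_observed)
    reject_by_p_value = p_value < alpha

    # critical value approach: the z value that cuts off area alpha in the upper tail
    z_critical = std_normal.inv_cdf(1.0 - alpha)
    reject_by_critical_value = z_observed > z_critical

    return p_value, z_critical, reject_by_p_value, reject_by_critical_value


# Assumed example values, not taken from the text
p_value, z_critical, reject_p, reject_c = upper_tail_z_test(z_observed=2.10, alpha=0.05)
print(f"p-value = {p_value:.4f}, reject H0: {reject_p}")
print(f"critical value = {z_critical:.3f}, reject H0: {reject_c}")
```

Both rules compare the same upper-tail area with α, so they lead to the same decision; the p-value additionally reports how strong the evidence against the null hypothesis is.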

Special Features of the Fifteenth Edition


• NEED TO KNOW...: This edition again includes highlighted sections called "NEED
TO KNOW..." and identified by an icon. These sections provide information
consisting of definitions, procedures, or step-by-step hints on problem solving
for specific questions such as "NEED TO KNOW... How to Construct a Relative
Frequency Histogram?" or "NEED TO KNOW... How to Decide Which Test to Use?"
• Graphical and numerical data description includes both traditional and EDA methods,
using computer graphics generated by MINITAB 18 for Windows and MS Excel 2016.
• Calculator screen captures from the TI-84 Plus calculator have been used for several
examples, allowing students to access this option for data analysis.

[Figure 2.11: Relative frequency histogram for Example 2.8 (horizontal axis: Scores)]
[Figure 9.9: TI-84 Plus output for Example 9.7 (z = 0.9090909091, p = 0.1816510401, x̄ = 3400, n = 100)]
[MS Excel descriptive statistics output for Front Leg Room and Rear Leg Room]

• All examples and exercises in the text that contain printouts or calculator screen
captures are based on MINITAB 18, MS Excel 2016, or the TI-84 Plus calculator.
These outputs are provided for some exercises, while other exercises require the
student to obtain solutions without using a computer.

Name          Length (km)    Name           Length (km)
Superior          560        Titicaca           195
Victoria          334        Nicaragua          163
Huron             330        Athabasca          333
Michigan          491        Reindeer           229
Aral Sea          416        Tonle Sap          112
Tanganyika        672        Turkana            246
Baykal            632        Issyk Kul          184
Great Bear        307        Torrens            208
Nyasa             576        Vänern             146
Great Slave       477        Nettilling         107
Erie              386        Winnipegosis       226
Winnipeg          426        Albert             160
Ontario           309        Nipigon            115
Balkhash          602        Gairdner           144
Ladoga            198        Urmia              144
Maracaibo         213        Manitoba           224
Onega             232        Chad               280
Eyre              144

Source: The World Almanac and Book of Facts 2017

a. Use a stem and leaf plot to describe the lengths of the world's major lakes.
b. Use a histogram to display these same data. How does this compare to the stem
and leaf plot in part a?
c. Are these data symmetric or skewed? If skewed, what is the direction of the skewing?

6. Gulf Oil Spill Cleanup On April 20, 2010, the United States experienced a major
environmental disaster when a Deepwater Horizon drilling rig exploded in the Gulf
of Mexico. The number of personnel and equipment used in the Gulf oil spill cleanup,
beginning May 2, 2010 (Day 13) through June 9, 2010 (Day 51) is given in the
following table.

d. Use a bar graph to show the percentage of federal Gulf fishing areas closed.
e. Use a line chart to show the amounts of dispersants used. Is there any underlying
straight line relationship over time?

7. Election Results (DS0129) The 2016 election was a race in which Donald Trump
defeated Hillary Clinton and other candidates, winning 304 electoral votes, or 57%
of the 538 available. However, Trump won only 46.1% of the popular vote, while
Clinton won 48.2%. The popular vote (in thousands) for Donald Trump in each of the
50 states is listed as follows 18:

AL 1319    HI  129    MA 1091    NM  320    SD  228
AK  163    ID  409    MI 2280    NY 2820    TN 1523
AZ 1252    IL 2146    MN 1323    NC 2363    TX 4685
AR  685    IN 1557    MS  701    ND  217    UT  515
CA 4484    IA  801    MO 1595    OH 2841    VT   95
CO 1202    KS  671    MT  279    OK  949    VA 1769
CT  673    KY 1203    NE  496    OR  782    WA 1222
DE  185    LA 1179    NV  512    PA 2971    WV  489
FL 4618    ME  336    NH  346    RI  181    WI 1405
GA 2089    MD  943    NJ 1602    SC 1155    WY  174

a. By just looking at the table, what shape do you think the distribution for the
popular vote by state will have?
b. Draw a relative frequency histogram to describe the distribution of the popular
vote for President Trump in the 50 states.
c. Did the histogram in part b confirm your guess in part a? Are there any outliers?
How can you explain them?


TECHNOLOGY TODAY
The Role of Computers and Calculators in the
Fifteenth Edition: Technology Today
Computers and scientific or graphing calculators are now common tools for college students
in all disciplines. Most students are accomplished users of word processors, spreadsheets, and
databases, and they have no trouble navigating through software packages in the Windows
environment. Many own either a scientific or a graphing calculator, very often one of the many
calculators made by Texas Instruments™. We believe, however, that advances in computer
technology should not turn statistical analyses into a "black box." Rather, we choose to use
the computational shortcuts that modern technology provides to give us more time to
emphasize statistical reasoning as well as the understanding and interpretation of statistical results.
In this edition, students will be able to use computers both for standard statistical
analyses and as a tool for reinforcing and visualizing statistical concepts. Both MS Excel 2016 and
MINITAB 18 are used exclusively as the computer packages for statistical analysis along with
procedures available using the TI-83 or TI-84 Plus calculators. However, we have chosen to
isolate the instructions for generating computer and calculator output into individual sections
called Technology Today at the end of each chapter. Each discussion uses numerical examples
to guide the student through the MS Excel commands and options necessary for the procedures
presented in that chapter, and then presents the equivalent steps and commands needed to
produce the same or similar results using MINITAB and the TI-83/84 Plus. We have included screen
captures from MS Excel, MINITAB 18, and the TI-84 Plus, so that the student can actually work
through these sections as "mini-labs."
If you do not need "hands-on" knowledge of MINITAB, MS Excel, or the TI-83/84 Plus, or if
you are using another calculator or software package, you may choose to skip these sections and
simply use the printouts as guides for the basic understanding of computer or calculator outputs.

TECHNOLOGY TODAY
Numerical Descriptive Measures in Excel
MS Excel provides most of the basic descriptive statistics presented in Chapter 2 using a
single command on the Data tab. Other descriptive statistics can be calculated using the
Function Library group on the Formulas tab.

EXAMPLE The following data are the front and rear leg rooms (in inches) for 10 different compact sports
utility vehicles:

Make & Model         Front Leg Room    Rear Leg Room
Chevrolet Equinox         42.5             30.0
Ford Escape               41.5             28.0
Hyundai Tucson            41.5             28.0
Jeep Cherokee             43.5             30.0
Jeep Compass              41.5             28.0

Numerical Descriptive Measures in MINITAB
MINITAB provides most of the basic descriptive statistics presented in Chapter 2 using a
single command in the drop-down menus.

EXAMPLE The following data are the front and rear leg rooms (in inches) for 10 different compact sports
utility vehicles:

Make & Model         Front Leg Room    Rear Leg Room
Chevrolet Equinox         42.5             30.0
Ford Escape               41.5             28.0
Hyundai Tucson            41.5             28.0
Jeep Cherokee             43.5             30.0

Numerical Descriptive Measures on the TI-83/84 Plus Calculators
The TI-83/84 Plus calculators can be used to calculate descriptive statistics and create box
plots using the stat ► CALC and 2nd ► stat plot commands.

EXAMPLE The following data are the front and rear leg rooms (in inches) for 10 different compact sports
utility vehicles 13:

Make & Model         Front Leg Room    Rear Leg Room
Chevrolet Equinox         42.5             30.0
Ford Escape               41.5             28.0


Study Aids
The many and varied exercises in the text provide the best learning tool for students
embarking on a first course in statistics. The answers to all odd-numbered exercises are given
in the back of the text. Each application exercise has a title, making it easier for students
and instructors to immediately identify both the context of the problem and the area of
application. All of the basic exercises have been rewritten and all of the applied exercises
restructured according to increasing difficulty. New exercises have been introduced, dated
exercises have been deleted, and a new numbering system has been introduced within each
section.

The Basics
Normal Approximation? Can the normal approximation be used to approximate
probabilities for the binomial random variable x, with values for n and p given
in Exercises 1-4? If not, is there another approximation that you could use?
1. $n = 25$ and $p = .6$          2. $n = 45$ and $p = .05$
3. $n = 25$ and $p = .3$          4. $n = 15$ and $p = .5$
Using the Normal Approximation Find the mean and

12. $P(x \ge 6)$ and $P(x > 6)$ when $n = 15$ and $p = .5$
13. $P(4 \le x \le 6)$ when $n = 25$ and $p = .2$
14. $P(x \ge 7)$ and $P(x = 5)$ when $n = 20$ and $p = .3$
15. $P(x \ge 10)$ when $n = 20$ and $p = .4$

Applying the Basics
16. A USA Today snapshot found that 47% of Americans associate "recycling" with
Earth Day.9 Suppose a random sample of $n = 50$ adults are polled and that the

Students should be encouraged to use the "NEED TO KNOW..." sections as they
occur in the text. The placement of these sections is intended to answer questions as they
would normally arise in discussions. In addition, there are numerous hints called "NEED
A TIP?" that appear in the margins of the text. The tips are short and concise.

In the previous three chapters, you have learned a lot about probability distributions, such
as the binomial and normal distributions. The shape of the normal distribution is determined
by its mean μ and its standard deviation σ, while the shape of the binomial distribution is
determined by p. These numerical descriptive measures, called parameters, are needed
to calculate the probability of observing sample results.
In practical situations, you may be able to decide which type of probability distribution
to use as a model, but the values of the parameters that specify its exact form are unknown.
Here are two examples:

• Need a Tip?
Parameter ⇔ Population
Statistic ⇔ Sample

Finally, sections called Key Concepts and Formulas appear in each chapter as a review
in outline form of the material covered in that chapter.

CHAPTER REVIEW
Key Concepts and Formulas
I. Measures of the Center of a Data Distribution
1. Arithmetic mean (mean) or average
a. Population: $\mu$
b. Sample of $n$ measurements: $\bar{x} = \frac{\sum x_i}{n}$
2. Median: position of the median $= .5(n + 1)$
3. Mode
4. The median may be preferred to the mean if the data are highly skewed.

2. The Empirical Rule can be used only for relatively mound-shaped data sets.
Approximately 68%, 95%, and 99.7% of the measurements are within one, two, and
three standard deviations of the mean, respectively.

IV. Measures of Relative Standing
1. Sample z-score: $z = \frac{x - \bar{x}}{s}$
2. pth percentile; p% of the measurements are smaller, and (100 - p)% are larger.
3. Lower quartile, $Q_1$: position of $Q_1 = .25(n + 1)$
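
The formulas in this review outline can be tried out directly. The following is a minimal Python sketch, not part of the text's own software coverage (which uses MINITAB, MS Excel, and the TI-83/84 Plus). The function names and the small data set are hypothetical and chosen only for illustration.

```python
import statistics

def summarize(measurements):
    """Compute the center and position measures listed in the review outline."""
    data = sorted(measurements)
    n = len(data)
    return {
        "n": n,
        "mean": sum(data) / n,                # x-bar = (sum of x_i) / n
        "median_position": 0.5 * (n + 1),     # position of the median
        "median": statistics.median(data),
        "q1_position": 0.25 * (n + 1),        # position of the lower quartile
        "std_dev": statistics.stdev(data),    # sample standard deviation s
    }

def z_score(x, mean, s):
    """Sample z-score: distance of x from the mean in standard deviations."""
    return (x - mean) / s

# Hypothetical measurements, used only to exercise the formulas.
scores = [2, 4, 4, 5, 7, 9, 11]
summary = summarize(scores)
largest_z = z_score(max(scores), summary["mean"], summary["std_dev"])

print(summary)
print(f"z-score of the largest measurement: {largest_z:.2f}")
```

The positions are counted in the ordered data set, so a median position of 4 in a sample of seven points to the fourth smallest measurement.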


Instructor Resources

WebAssign
WebAssign for Mendenhall/Beaver/Beaver's Introduction to Probability and Statistics, 15th
Edition, Metric Version is a flexible and fully customizable online instructional solution
that puts powerful tools in the hands of instructors, empowering you to deploy assignments,
instantly assess individual student and class performance, and help your students master the
course concepts. With WebAssign's powerful digital platform and Introduction to Probability
and Statistics's specific content, you can tailor your course with a wide range of assignment
settings, add your own questions and content, and access student and course analytics and
communication tools.

MindTap Reader
Available via WebAssign, MindTap Reader is Cengage's next-generation eBook. MindTap
Reader provides robust opportunities for students to annotate, take notes, navigate, and
interact with the text. Instructors can edit the text and assets in the Reader, as well as add
videos or URLs.

Cognero
Cengage Learning Testing, powered by Cognero, is a flexible, online system that allows
you to import, edit, and manipulate content from the text's Test Bank or elsewhere,
including your own favorite test questions; create multiple test versions in an instant; and
deliver tests from your LMS, your classroom, or wherever you want.

Instructor Solutions Manual


This time-saving online manual provides complete solutions to all the problems in the text.
You can download the solutions manual from the Instructor Companion Website.

Instructor Companion Website


Everything you need for your course in one place! This collection of book-specific class
tools is available online via www.cengage.com/login. Access and download PowerPoint
presentations, images, Instructor Solutions Manual, data sets, and more.

SnapStat
Tell the story behind the numbers with SnapStat in WebAssign. Designed with students
to bring stats to life, SnapStat uses interactive visuals to perform complex analysis online.
Labs and Projects in WebAssign allow students to crunch their own data or choose from
pre-existing data sets to get hands-on with technology and see for themselves that Statistics
is much more than just numbers.

Student Resources

WebAssign
WebAssign for Mendenhall/Beaver/Beaver's Introduction to Probability and Statistics,
15th Edition, Metric Version lets you prepare for class with confidence. Its online learning
platform for your math, statistics, and science courses helps you practice and absorb what
you learn. Videos and tutorials walk you through concepts when you're stuck, and instant
feedback and grading let you know where you stand, so you can focus your study time
and perform better on in-class assignments. Study smarter with WebAssign!

MindTap Reader
Available via WebAssign, MindTap Reader is Cengage's next-generation eBook. MindTap
Reader provides robust opportunities for students to annotate, take notes, navigate, and
interact with the text. Annotations captured in MindTap are automatically tied to the
Notepad app, where they can be viewed chronologically and in a cogent, linear fashion.

Online Technology Guides


Online Technology Guides, accessed via www.cengage.com, provide step-by-step instructions
for completing problems using common statistical software.

SnapStat
Learn the story behind the numbers with SnapStat in WebAssign. Designed with students
to bring stats to life, SnapStat uses interactive visuals to perform complex analysis online.
Labs and Projects in WebAssign allow you to crunch your own data or choose from
pre-existing data sets to get hands-on with technology and see for yourself that Statistics is
much more than just numbers.

Acknowledgments
The authors are grateful to Catherine Van Der Laan and the editorial staff of Cengage
Learning for their patience, assistance, and cooperation in the preparation of this edition.
Thanks are also due to fifteenth edition reviewers Olcay Akman, Matt Harris, Zhongming
Huang, Bo Kai, Sarah Miller, and Katie Wheaton. We wish to thank authors and
organizations for allowing us to reprint selected material; acknowledgments are made wherever
such material appears in the text.
Robert J. Beaver
Barbara M. Beaver

Introduction
What Is Statistics?

(Photo: Andrea Ricardi, Italy/Moment/Getty Images)

What is statistics? Have you ever met a statistician? Do you know what a statistician
does? Maybe you are thinking of the person who sits in the broadcast booth at the
Rose Bowl, recording the number of pass completions, yards rushing, or interceptions
thrown on New Year's Day. Or maybe just hearing the word statistics sends a shiver of
fear through you. You might think you know nothing about statistics, but almost every
time you turn on the news or scroll through your favorite news app, you will find
statistics in one form or other! Here are some examples that we found just before the
2017 November elections:

• Northam Heads Into Virginia Governor's Race With A Small Lead. The first
major statewide elections since President Trump was inaugurated take place on
Tuesday... And while the race's final result by itself isn't likely to tell us much
about the national political environment, it is likely to have a big effect on the 2018
midterms. Polls show a fairly close race, with Northam slightly favored to win [over
Ed Gillespie]. An average of the last 10 surveys give Northam a 46 percent-to-43
percent advantage. Over the past month, there has been a tightening of the race, with
Gillespie closing what had been a 6-point lead. In the individual polls, though, there
is a fairly wide spread. Northam has led by as much as 17 percentage points
(a Quinnipiac University survey) and has trailed by as much as 8 points (a Hampton
University poll). 1
- www.fivethirtyeight.com
• Why Trump Has a Lock on the 2020 GOP Nomination. In interviews with nearly
three-dozen GOP strategists and fundraisers over the past several tumultuous weeks,
virtually everyone told me that... they expect Trump to coast to the GOP
nomination in 2020... the hurdles to a 2020 primary challenge are vivid when considering
a recent Washington Post/ABC News poll that found 91% of Trump voters said
they'd vote for him again... This ABC News/Washington Post poll was conducted by
landline and cellular telephone Oct. 29-Nov. 1, 2017, in English and Spanish, among
a random national sample of 1005 adults. Results have a margin of sampling error of
3.5 points, including the design effect. 2
- www.cnn.com
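
The margin of sampling error quoted in the second article can be put in context with a short calculation. The sketch below is in Python and rests on two assumptions that the article does not state: a 95% confidence level (z = 1.96) and the worst-case proportion p = 0.5. Under those assumptions it computes the margin a simple random sample of 1005 adults would give, and the design effect implied by the reported 3.5 points. The function name is hypothetical.

```python
import math

def srs_margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a sample proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

n = 1005                  # sample size reported in the article
reported_margin = 0.035   # 3.5 points, "including the design effect"

srs_margin = srs_margin_of_error(n)

# A design effect inflates the sampling variance, so the margin grows by its
# square root. Solving for it gives the value implied by the reported margin.
implied_design_effect = (reported_margin / srs_margin) ** 2

print(f"Simple random sampling margin: {100 * srs_margin:.2f} points")
print(f"Implied design effect: {implied_design_effect:.2f}")
```

Under these assumptions the simple random sampling margin comes out near 3.1 points, which is why the reported figure is somewhat larger once the survey design is taken into account.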


Articles similar to these can be found in all forms of news media, and, just before a
presidential or congressional election, a new poll is reported almost every day. These articles are
very familiar to us; however, they might leave you with some unanswered questions. How
were the people in the poll selected? Will these people give the same response tomorrow?
Will they give the same response on election day? Will they even vote? Are these people
representative of all those who will vote on election day? It is the job of a statistician to ask
these questions and to find answers for them in the language of the poll.
Most Believe "Cover-Up" of JFK Assassination Facts
A majority of the public believes the assassination of President John F. Kennedy was part of a
larger conspiracy, not the act of one individual. In addition, most Americans think there was a
cover-up of facts about the 1963 shooting. Almost 50 years after JFK's assassination, a FOX
news poll shows many Americans disagree with the government's conclusions about the killing.
The Warren Commission found that Lee Harvey Oswald acted alone when he shot Kennedy,
but 66 percent of the public today think the assassination was "part of a larger conspiracy" while
only 25 percent think it was the "act of one individual."
"For older Americans, the Kennedy assassination was a traumatic experience that began
a loss of confidence in government," commented Opinion Dynamics President John Gorman.
"Younger people have grown up with movies and documentaries that have pretty much pushed
the 'conspiracy' line. Therefore, it isn't surprising there is a fairly solid national consensus that
we still don't know the truth."
(The poll asked): "Do you think that we know all the facts about the assassination of
President John F. Kennedy or do you think there was a cover-up?"

                We Know All the Facts (%)   There Was a Cover-Up (%)   Not Sure (%)
All                        14                          74                   12
Democrats                  11                          81                    8
Republicans                18                          69                   13
Independents               12                          71                   17

- www.foxnews.com 3

When you see an article like this one, do you simply read the title and the first paragraph,
or do you read further and try to understand the meaning of the numbers? How did the
authors get these numbers? Did they really interview every American with each political
affiliation? It is the job of the statistician to answer some of these questions.
Hot News: 98.6°F Not Normal
After believing for more than a century that 98.6°F was the normal body temperature for
humans, researchers now say normal is not normal anymore.
For some people at some hours of the day, 99.9°F could be fine. And readings as low as
96°F turn out to be highly human.
The 98.6°F standard was derived by a German doctor in 1868. Some physicians have always
been suspicious of the good doctor's research. His claim: 1 million readings, in an epoch
without computers.
So Mackowiak & Co. took temperature readings from 148 healthy people over a three-day
period and found that the mean temperature was 98.2°F. Only 8 percent of the readings were
98.6°F.
- The Press-Enterprise 4

What questions do you have when you read this article? How did the researcher select the
148 people, and how can we be sure that the results based on these 148 people are accurate
when applied to the general population? How did the researcher arrive at the normal "high"
and "low" temperatures given in the article? How did the German doctor record 1 million
temperatures in 1868? This is another statistical problem with an application to everyday life.
Statistics is a branch of mathematics that has applications in almost every part of our
daily life. It is a new and unfamiliar language for most people, however, and, like any
new language, statistics can seem overwhelming at first glance. But once the language of
statistics is learned and understood, it provides a powerful tool for data analysis in many
different fields of application.

The Population and the Sample


In the language of statistics, one of the most basic concepts is sampling. In most statistical
problems, a specified number of measurements or data (a sample) is drawn from a much
larger body of measurements, called the population.

[Diagram: the sample shown as a part of the population]

For the body-temperature experiment, the sample is the set of body-temperature
measurements for the 148 healthy people chosen by the experimenter. We hope that the sample
is representative of a much larger body of measurements (the population): the body
temperatures of all healthy people in the world!
Which is more important to us, the sample or the population? In most cases, we are
interested primarily in the population, but identifying each member of the population may
be difficult or impossible. Imagine trying to record the body temperature of every healthy
person on earth or the presidential preference of every registered voter in the United States!
Instead, we try to describe or predict the behavior of the population on the basis of
information obtained from a representative sample from that population.
The words sample and population have two meanings for most people. For example,
you read that a Gallup poll conducted in the United States was based on a sample of
1823 people. Presumably, each person interviewed is asked a particular question, and that
person's response represents a single measurement in the sample. Is the sample the set of
1823 people, or is it the 1823 responses that they give?
In statistics, we distinguish between the set of objects on which the measurements are
taken and the measurements themselves. To experimenters, the objects on which
measurements are taken are called experimental units. The sample survey statistician calls them
elements of the sample.
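
The relationship between a population and a sample can be illustrated with a small simulation. The following Python sketch builds a hypothetical population of body temperatures and draws 148 measurements from it, echoing the body-temperature study. The population size, the normal shape, and the spread of 0.7°F are assumptions made only for illustration; they are not taken from the study.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch gives the same result on every run

# Hypothetical population: simulated body temperatures in degrees Fahrenheit.
# The center of 98.2 echoes the article; the spread of 0.7 is an assumption.
population = [random.gauss(98.2, 0.7) for _ in range(100_000)]

# The experimental units are the people; the sample is the set of
# measurements (temperatures) taken on the 148 people selected.
sample = random.sample(population, 148)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
```

In practice the population mean is unknown, which is exactly why the sample mean is used to describe it. The simulation makes both visible only because the population here is artificial.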

Descriptive and Inferential Statistics


When first presented with a set of measurements, whether a sample or a population, you
need to find a way to organize and summarize it. The branch of statistics that gives us tools
for describing sets of measurements is called descriptive statistics. You have seen
descriptive statistics in many forms: bar charts, pie charts, and line charts presented by a political
candidate; numerical tables in the media; or the average rainfall amounts on your favorite
weather app. Computer-generated graphics and numerical summaries are commonplace in
our everyday communication.


DEFINITION

Descriptive statistics are procedures used to summarize and describe the important
characteristics of a set of measurements.

If the set of measurements is the entire population, you need only to draw conclusions
based on the descriptive statistics. However, it might be too expensive or too time
consuming to identify each member of the population. Or measuring every member of the
population might destroy it, as when you measure the amount of force required to crack a
football helmet. For these or other reasons, you may have only a sample from the population. By
looking at the sample, you want to answer questions about the population as a whole. The
branch of statistics that deals with this problem is called inferential statistics.

DEFINITION

Inferential statistics are procedures used to make inferences (that is, draw conclusions,
make predictions, make decisions) about a population from information contained in a
sample drawn from this population.

The objective of inferential statistics is to make inferences about a population from
information contained in a sample.

Achieving the Objective of Inferential Statistics: The Necessary Steps
How can you make inferences about a population using information contained in a sample?
The task becomes simpler if you organize the problem into a series of logical steps.
1. Specify the questions to be answered and identify the population of interest. In
the Virginia election poll, the objective is to determine who will get the most votes
on election day. So, the population of interest is the set of all votes in the Virginia
election. When you select a sample, it is important that the sample be representative
of this population, not the population of voter preferences on some day prior to
the election.
2. Decide how to select the sample. This is called the design of the experiment or the
sampling procedure. Is the sample representative of the population of interest? For
example, if a sample of registered voters is selected from the city of San Francisco,
will this sample be representative of all voters in California? Will it be the same as
a sample of "likely voters," those who are likely to actually vote in the election?
Is the sample large enough to answer the questions posed in step 1 without wasting
time and money on additional information? A good sampling design will answer
the questions posed with minimal cost to the experimenter.
3. Select the sample and analyze the sample information. No matter how much
information the sample contains, you must use an appropriate method of analysis to
extract it. Many of these methods, which depend on the sampling procedure in
step 2, are explained in the text.


4. Use the information from step 3 to make an inference about the population.
Many different procedures can be used to make this inference, and some are
better than others. For example, 10 different methods might be available to estimate
human response to an experimental drug, but one procedure might be more accurate
than others. You should use the best inference-making procedure available (many of
these are explained in the text).
5. Determine the reliability of the inference. Since you are using only a fraction of
the population in drawing the conclusions described in step 4, you might be wrong!
If an agency conducts a statistical survey for you and estimates that your company's
product will gain 34% of the market this year, how much confidence can you place
in this estimate? Is this estimate accurate to within 1, 5, or 20 percentage points? Is
it reliable enough to be used in setting production goals? Every statistical inference
should include a measure of reliability that tells you how much confidence you
have in the inference.
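
Step 5 can be made concrete with a small calculation. The Python sketch below takes the 34% market-share estimate from the example and shows how its margin of error changes with the number of people surveyed. The survey sizes are hypothetical, and the interval uses the usual large-sample normal approximation at an assumed 95% confidence level; neither detail comes from the example itself.

```python
import math

def proportion_interval(p_hat, n, z=1.96):
    """Large-sample interval for a proportion: p_hat plus or minus a margin."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin

p_hat = 0.34  # estimated market share from the example

margins = {}
for n in (25, 400, 10_000):  # hypothetical survey sizes
    low, high, margin = proportion_interval(p_hat, n)
    margins[n] = margin
    print(f"n = {n:>6}: {100 * low:.1f}% to {100 * high:.1f}% "
          f"(within {100 * margin:.1f} points)")
```

Under these assumptions, the three survey sizes give margins of roughly 19, 5, and 1 percentage points, so the same 34% estimate can be very reliable or nearly useless for setting production goals, depending on how much of the population was sampled.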

Now that you have learned a few basic terms and concepts, we again pose the
question asked at the beginning of this discussion: Do you know what a statistician does? The
statistician's job is to carry out all of the preceding steps.

Keys for Successful Learning


As you begin to study statistics, you will find that there are many new terms and concepts
to be mastered. Since statistics is an applied branch of mathematics, many of these basic
concepts are mathematical, developed and based on results from calculus or higher
mathematics. However, you do not have to be able to prove the results in order to apply them
in a logical way. In this text, we use numerical examples and commonsense arguments to
explain statistical concepts, rather than more complicated mathematical arguments.
Com puters and calculators are now readily available to many students and provide
them with an invaluable tool. In the study of statistics, even the beginning student can
use packaged programs to pe1fo1m statistical analyses with a high degree of speed and
accuracy. Some of the more common statistical packages available at computer facilities
are MIN/TAB™, SAS , and SPSS. Personal computers and laptops w ill s upport MIN/TAB,
MS EXCEL, JMP, and others. Many students are familiar with the Tl -83 o r Tl-84 Plus cal-
culators, that have many built-in stati stics functions. There are even online statistical pro-
grams and interactive "applets" that students can use.
These programs, cal led statistical software, differ in the types of analyses availab le,
the options within the programs, and the forms of printed results (called output). However,
they are all similar. In this book, we use both MIN/TAB and Microsoft Excel as statistical tools.
Unders tanding the bas ic output of these packages will help you interpret the output from
other software syste ms. Similarly, understanding the results shown on yo ur Tl-83 or Tl-84
Plus calculator will make understanding a different calculator much eas ier.
At the end of most chapters, you will find a sectio n called " Technology Today." T hese
sectio ns present nume1ical examples to g uide you through the MIN/TAB, MS Excel, and
Tl-83/84 Plus co mmands and optio ns that are used fo r the procedures in that chapter. If
you are using MIN/TAB, MS Excel, or your Tl-83/84 Plus calcul ator in a lab or home setting,
you may want to work through this section using your own com puter or calculator so that
you become fa mili ar with the hands-on methods. If you do not need hands-o n knowl-
edge of MIN/TAB, MS Excel, or the Tl-83/84 Plus, you may choose to skip this section

Copyright 2020 Cengagc Learning. AU Rights Reserved. ti.fay not be copied. .sc,:mncd, a- duplicated. in wllolc or in part. Due to electronic rights. some th ird par ty content may be s uppressed from the cBook amVor cChaptcr(s).
Editorial review has deemed that any suppressed content docs not materially affect the ovcrnll lcaming experience. Ccngagc Lctirning reserve. Lile right to remove additional content at any time if s ubsequent rights rcsLrict ion.~ require it.
6 INTRODUCTION What Is Statistics?

and simply use the computer printouts or calculator screen captures for analysis as they
appear in the text.
Most important, using statistics successfully requires common sense and logical
thinking. For example, if we want to find the average height of all students at a particular
university, would we select our entire sample from the members of the basketball team? In the
body-temperature example, the logical thinker would question an 1868 average based on
1 million measurements, taken when computers had not yet been invented.
As you learn new statistical terms, concepts, and techniques, remember to view every
problem with a critical eye and be sure that the rule of common sense applies. Throughout
the text, we will remind you of the pitfalls and dangers in the use or misuse of statistics.
Benjamin Disraeli once said that there are three kinds of lies: lies, damn lies, and statistics!
Our purpose is to prove this claim to be wrong, to show you how to make statistics work
for you and not lie for you!
As you continue through the book, refer back to this introduction every once in a while.
Each chapter will increase your knowledge of statistics and should, in some way, help you
achieve one of the steps described here. Each of these steps is important in achieving the
overall objective of inferential statistics: to make inferences about a population using
information contained in a sample drawn from that population.

Describing Data
with Graphs

How Is Your Blood Pressure?


Is your blood pressure normal, or is it too high or too low?
The case study at the end of this chapter examines a large
set of blood pressure data. You will use graphs to describe
these data and compare your blood pressure with that of
others of your same age and gender.

© Photographee.eu/Shutterstock.com

LEARNING OBJECTIVES
Many sets of measurements are samples selected from larger populations. Other sets constitute
the entire population, as in a national census. In this chapter, you will learn what a variable is,
how to classify variables into several types, and how measurements or data are generated. You
will then learn how to use graphs to describe data sets.

CHAPTER INDEX
• Data distributions and their shapes (1.1, 1.3)
• Dotplots (1.3)
• Pie charts, bar charts, line charts (1.2, 1.3)
• Qualitative and quantitative variables-discrete and continuous (1.1)
• Relative frequency histograms (1.4)
• Stem and leaf plots (1.3)
• Univariate and bivariate data (1.1)
• Variables, experimental units, samples and populations, data (1.1)

• Need to Know...
How to Construct a Stem and Leaf Plot
How to Construct a Relative Frequency Histogram


1.1 Variables and Data


In Chapters 1 and 2, we will present some basic techniques in descriptive statistics, the
branch of statistics concerned with describing sets of measurements, both samples and
populations. Once you have collected a set of measurements, how can you display this set
in a clear, understandable, and readable form? First, you must be able to define what is
meant by measurements or "data" and to categorize the types of data that you are likely to
encounter in real life. We begin by introducing some definitions.

DEFINITION

A variable is a characteristic that changes or varies over time and/or for different
individuals or objects under consideration.

For example, body temperature is a variable that changes over time within a single
individual; it also varies from person to person. Religious affiliation, ethnic origin, income,
height, age, and number of offspring are all variables: characteristics that vary depending
on the individual chosen.
In the Introduction, we defined an experimental unit or an element of the sample as the
object on which a measurement is taken. This is the same as saying that an experimental
unit is the object on which a variable is measured. When a variable is actually measured on
a set of experimental units, a set of measurements or data result.

DEFINITION

An experimental unit is the individual or object on which a variable is measured.


A single measurement or data value results when a variable is actually measured on
an experimental unit.

If a measurement is obtained for every experimental unit in the entire collection, the resulting
data set constitutes the population of interest. Any smaller subset of measurements is a sample.

A population is the set of all measurements of interest to the investigator.

A sample is a subset of measurements selected from the population of interest.
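
The relationship between a population and a sample can be sketched in a few lines of Python. This is only an illustration: the GPA values, the population size of 1,000, and the random seed are all hypothetical, chosen so that the example is reproducible, and are not data from the text.

```python
import random

# Hypothetical population: one GPA measurement for each of 1,000 undergraduates.
# The values are simulated for illustration only.
rng = random.Random(0)
population = [round(rng.uniform(2.0, 4.0), 1) for _ in range(1000)]

# A sample is a subset of the measurements in the population.
# Here 5 measurements are selected at random, without replacement.
sample = rng.sample(population, k=5)

print("Population size:", len(population))
print("Sample size:", len(sample))
print("Sample GPAs:", sample)
```

Every value in `sample` also appears in `population`, which is what makes it a subset of the population of interest.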

EXAMPLE 1.1 A set of five students is selected from all undergraduates at a large university, and measurements
are entered into a spreadsheet as shown in Figure 1.1. Identify the various elements involved
in obtaining this set of measurements.

Solution The experimental unit on which the variables are measured is a particular
undergraduate student on the campus, found in column A. Five variables are measured for each
student: grade point average (GPA), gender, year in college, major, and current number of units
enrolled. Each of these characteristics varies from student to student. If we consider the GPAs
of all students at this university to be the population of interest, the five GPAs in column B
represent a sample from this population. If the GPA of each undergraduate student at the university
had been measured, we would have the entire population of measurements for this variable.

Figure 1.1 Measurements on five undergraduate students

     A        B    C       D     E            F
1    Student  GPA  Gender  Year  Major        Number of Units
2    1        2    F       Fr    Psychology   16
3    2        2.3  F       So    Mathematics  15
4    3        2.9  M       So    English      17
5    4        2.7  M       Fr    English      15
6    5        2.6  F       Jr    Business     14

The second variable measured on the students is gender, in column C. This variable is
somewhat different from GPA, because it typically takes one of two values: male (M) or
female (F). If we could identify each member of the population, it would consist of a set
of Ms and Fs, one for each student at the university. The third and fourth variables, year
and major, also involve nonnumerical data: year has four categories (Fr, So, Jr, Sr), and
major has one category for each undergraduate major on campus. The last variable, current
number of units enrolled, is numerically valued, consisting of a set of numbers rather than
a set of qualities or characteristics.
Although we have discussed each variable individually, remember that we have measured
each of these five variables on a single experimental unit: the student. Therefore, in this
example, a "measurement" really consists of five observations, one for each of the five
measured variables. For example, the measurement taken on student 2 produces this observation:
(2.3, F, So, Mathematics, 15)

There is a difference between a single variable measured on a single experimental unit
and multiple variables measured on a single experimental unit as in Example 1.1.

DEFINITION

Univariate data results when a single variable is measured on a single experimental unit.

Bivariate data results when two variables are measured on a single experimental unit.
Multivariate data results when more than two variables are measured.

If you measure the body temperatures of 148 people, the resulting data are univariate. In
Example 1.1, five variables were measured on each student, resulting in multivariate data.
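
One way to picture the difference is to store the measurement on a single experimental unit as a record. The sketch below uses the values for student 2 from Figure 1.1; the class name `StudentRecord` and its field names are hypothetical, introduced only for this illustration.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class StudentRecord:
    """One experimental unit (a student) with five measured variables."""
    gpa: float
    gender: str
    year: str
    major: str
    units: int


# Student 2 from Figure 1.1: five observations on one experimental unit.
student_2 = StudentRecord(gpa=2.3, gender="F", year="So",
                          major="Mathematics", units=15)

# Multivariate data: more than two variables measured on the unit.
multivariate = astuple(student_2)

# Bivariate data: two variables measured on the unit, for example GPA and units.
bivariate = (student_2.gpa, student_2.units)

# Univariate data: a single variable measured on the unit.
univariate = student_2.gpa

print(multivariate)
print(bivariate)
print(univariate)
```

The tuple stored in `multivariate` has the same five entries as the observation for student 2 shown in Example 1.1.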

■ Types of Variables
Variables can be classified into one of two types: qualitative or quantitative.

DEFINITION

Qualitative variables measure a quality or characteristic on each experimental unit.


Quantitative variables measure a numerical quantity or amount on each experimental unit.


• Need a Tip?
Qualitative ⇔ "quality" or characteristic
Quantitative ⇔ "quantity" or number

Qualitative variables produce data that can be separated into categories. Hence, they are
called categorical variables, and produce categorical data. The variables gender, year,
and major in Example 1.1 are qualitative variables that produce categorical data. Here are
some other examples:
• Political affiliation: Republican, Democrat, Independent
• Taste ranking: excellent, good, fair, poor
• Color of an M&M'S® candy: brown, yellow, red, orange, green, blue
Quantitative variables, often represented by the letter x, produce numerical data, such
as those listed here:
• x = Prime interest rate
• x = Number of passengers on a flight from Los Angeles to New York City
• x = Weight of a package ready to be shipped
• x = Volume of orange juice in a glass
Notice the difference in the types of numerical values that these quantitative variables
assume. The number of passengers, for example, can only be x = 0, 1, 2, ..., whereas the
weight of a package can be any value greater than zero, or x > 0. To describe this difference,
we define two types of quantitative variables: discrete and continuous.

DEFINITION

A discrete variable can assume only a finite or countable number of values.

A continuous variable can assume the infinitely many values corresponding to the
points on a line interval.

• Need a Tip?
Discrete ⇔ "listable"
Continuous ⇔ "unlistable"

The name discrete refers to the discrete gaps between the possible values that the variable
can assume. Variables such as number of family members, number of new car sales, and
number of defective tires returned for replacement are all examples of discrete variables. On
the other hand, variables such as height, weight, time, distance, and volume are continuous
because they can assume values at any point along a line interval. For any two values you
pick, a third value can always be found between them!

EXAMPLE 1.2 Identify each of the following variables as qualitative or quantitative:

1. The most frequent use of your microwave oven (reheating, defrosting, warming, other)
2. The number of consumers who refuse to answer a telephone survey
3. The door chosen by a mouse in a maze experiment (A, B, or C)
4. The winning time for a horse running in the Kentucky Derby
5. The number of children in a fifth-grade class who are reading at or above grade level

• Need a Tip?
Discrete variables often involve the "number of" items in a set.

Solution Variables 1 and 3 are both qualitative because only a quality or characteristic
is measured for each individual. The categories for these two variables are shown
in parentheses. The other three variables are quantitative. Variables 2 and 5 are discrete
variables that can be any of the values x = 0, 1, 2, ..., with a maximum value depending on
the number of consumers called or the number of children in the class, respectively. Variable
4, the winning time for a Kentucky Derby horse, is the only continuous variable in the list.
The winning time, if it could be measured with sufficient accuracy, could be 121 seconds,
121.5 seconds, 121.25 seconds, or any values between any two times we have listed.
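
The classifications in Example 1.2 can be written down as a small lookup table, and the "third value between any two" property of a continuous variable can be checked with a midpoint. This is a sketch under stated assumptions: the names `VariableType`, `example_1_2`, and `value_between` are hypothetical, and the dictionary keys are shortened labels for the five variables in the example.

```python
from enum import Enum


class VariableType(Enum):
    QUALITATIVE = "qualitative"
    DISCRETE = "quantitative, discrete"
    CONTINUOUS = "quantitative, continuous"


# Classifications as stated in the solution to Example 1.2.
example_1_2 = {
    "most frequent use of microwave oven": VariableType.QUALITATIVE,
    "consumers refusing a telephone survey": VariableType.DISCRETE,
    "door chosen by a mouse": VariableType.QUALITATIVE,
    "Kentucky Derby winning time": VariableType.CONTINUOUS,
    "children reading at or above grade level": VariableType.DISCRETE,
}


def value_between(a: float, b: float) -> float:
    """Return the midpoint of a and b, a value that lies between them."""
    return (a + b) / 2


for name, kind in example_1_2.items():
    print(f"{name}: {kind.value}")

# Between winning times of 121 and 121.5 seconds there is always another time.
print(value_between(121, 121.5))
```

The midpoint of 121 and 121.5 seconds is 121.25 seconds, the third time listed in the solution.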


Why worry about different kinds of variables (shown in Figure 1.2) and the data that they
generate? Because different types of data require different methods for description, so that
the data can be presented clearly and understandably to your audience!

Figure 1.2 Types of data

Data
    Qualitative
    Quantitative
        Discrete
        Continuous

1.1 EXERCISES
The Basics
Experimental Units Define the experimental units for the variables described in Exercises 1-5.
1. Gender of a student
2. Number of errors on a midterm exam
3. Age of a cancer patient
4. Number of flowers on an azalea plant
5. Color of a car entering a parking lot

Qualitative or Quantitative? Are the variables in Exercises 6-9 qualitative or quantitative?
6. Amount of time it takes to assemble a simple puzzle
7. Number of students in a first-grade classroom
8. Rating of a newly elected politician (excellent, good, fair, poor)
9. State in which a person lives

Discrete or Continuous? Are the variables in Exercises 10-18 discrete or continuous?
10. Population in a certain area of the United States
11. Weight of newspapers recycled on a single day
12. Number of claims filed with an insurance company during a single day
13. Number of consumers in a poll of 1,000 who consider nutritional labeling on food
products to be important
14. Number of boating accidents along an 80-kilometer stretch of the Colorado River
15. Time required to complete a questionnaire
16. Cost of a head of lettuce
17. Number of brothers and sisters you have
18. Yield of wheat (in tonnes) from a one-hectare plot

Populations or Samples? In Exercises 19-22, determine whether the data collected
represents a population or a sample.
19. A researcher uses a statewide database to determine the percentage of Michigan
drivers who have had an accident in the last 5 years.
20. One thousand citizens were interviewed and their opinions regarding gun control
were recorded.
21. Twenty animals are put on a new diet and their weight gain over 3 months is recorded.
22. The income distribution of the top 10% of wage earners in the United States is
determined using data from the Internal Revenue Service.


Applying the Basics
23. Parking on Campus Six vehicles selected from a campus vehicle database are shown
in the table.

                                                One-way Commute        Age of Vehicle
Vehicle  Type        Make             Carpool?  Distance (kilometers)  (years)
1        Car         Honda            No        37.8                   6
2        Car         Toyota           No        27.5                   3
3        Truck       Toyota           No        16.2                   4
4        Van         Dodge            Yes       50.7                   2
5        Motorcycle  Harley-Davidson  No        40.8
6        Car         Chevrolet        No        8.6                    9

a. What are the experimental units?
b. List the variables that are being measured. What types are they?
c. Is this univariate, bivariate, or multivariate data?

24. Past U.S. Presidents A data set gives the ages at death for each of the 38 past
presidents of the United States now deceased.
a. Is this data set a population or a sample?
b. What is the variable being measured?
c. Is the variable in part b quantitative or qualitative?

25. Voter Attitudes You are a candidate for your state legislature, and you want to
survey voter attitudes about your chances of winning.
a. What is the population that is of interest to you and from which you want to choose
your sample?
b. How is the population in part a dependent on time?

26. Cancer Survival Times A researcher wants to estimate the survival time of a cancer
patient after a course of radiation therapy.
a. What is the variable of interest to the researcher?
b. Is the variable in part a qualitative, quantitative discrete, or quantitative continuous?
c. What is the population of interest?
d. How could the researcher select a sample from the population?
e. What problems might occur in sampling from this population?

27. New Teaching Methods A researcher wants to know whether a new way of teaching
reading to deaf students is working. She measures a student's score on a reading test
before and after being taught using the new method.
a. What is the variable being measured? What type of variable is it?
b. What is the experimental unit?
c. What is the population of interest?

1.2 Graphs for Categorical Data


After the data have been collected, they can be consolidated and summarized to show the
following information:
• What values of the variable have been measured
• How often each value has occurred
First, construct a statistical table and then use it to create a graph called a data
distribution. The type of graph you choose depends on the type of variable you have
measured.
When the variable of interest is qualitative or categorical, the statistical table is a list
of the categories along with a measure of how often each value occurred. You can measure
"how often" in three different ways:
• The frequency, or number of measurements in each category
• The relative frequency, or proportion of measurements in each category
• The percentage of measurements in each category


If you let n be the total number of measurements in the set, you can find the relative
frequency and percentage using these relationships:

Relative frequency = Frequency / n

Percent = 100 × Relative frequency

The sum of the frequencies is always n, the sum of the relative frequencies is 1, and the sum
of the percentages is 100%.
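
These relationships can be sketched in Python. The example below is a minimal illustration, not one of the text's own tools: the function name `frequency_table` is hypothetical, and the input is the "Year" column of Figure 1.1, reused here only because it is small.

```python
from collections import Counter
import math


def frequency_table(measurements):
    """Return rows of (category, frequency, relative frequency, percent)."""
    counts = Counter(measurements)
    n = len(measurements)  # total number of measurements in the set
    rows = []
    for category, frequency in counts.items():
        relative_frequency = frequency / n
        percent = 100 * relative_frequency
        rows.append((category, frequency, relative_frequency, percent))
    return rows


# The "Year" variable for the five students in Figure 1.1.
years = ["Fr", "So", "So", "Fr", "Jr"]
table = frequency_table(years)

for category, frequency, relative_frequency, percent in table:
    print(f"{category}: {frequency}  {relative_frequency:.2f}  {percent:.0f}%")

# The three totals stated in the text: n, 1, and 100%.
print(sum(row[1] for row in table) == len(years))
print(math.isclose(sum(row[2] for row in table), 1.0))
print(math.isclose(sum(row[3] for row in table), 100.0))
```

The three checks at the end correspond to the totals described above: the frequencies sum to n, the relative frequencies to 1, and the percentages to 100%.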
When the variable is qualitative, the categories should be chosen so that
• a measurement will fall into one and only one category and
• each measurement has a category to fall into.

• Need a Tip?
Three steps to a data distribution:
(1) Raw data ⇒
(2) Statistical table ⇒
(3) Graph

For example,
• To categorize meat products according to the type of meat used, you might use beef,
chicken, seafood, pork, turkey, other.
• To categorize ranks of college faculty, you might use professor, associate professor,
assistant professor, instructor, lecturer, other.
The "other" category is included in both cases to allow for the possibility that a
measurement cannot be assigned to one of the earlier categories.
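
The role of the "other" category can be shown with a short Python sketch. The function name `assign_category` and the list of products are hypothetical; only the meat categories come from the example above.

```python
def assign_category(value, categories, fallback="other"):
    """Return the one category a measurement falls into.

    A value that matches none of the named categories is assigned to the
    fallback, so every measurement has exactly one category.
    """
    return value if value in categories else fallback


# Named categories from the meat-products example; "other" is the fallback.
meat_types = {"beef", "chicken", "seafood", "pork", "turkey"}

# Hypothetical raw data, including two meats that are not named categories.
products = ["beef", "lamb", "turkey", "venison", "pork"]

assigned = [assign_category(product, meat_types) for product in products]
print(assigned)
```

Because each product is mapped to a single label, a measurement falls into one and only one category, and the fallback guarantees that each measurement has a category to fall into.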
Once the measurements have been summarized in a statistical table, you can use either
a pie chart or a bar chart to display the distribution of the data. A pie chart is the familiar
circular graph that shows how the measurements are distributed among the categories. A bar
chart shows the same distribution of measurements among the categories, with the height
of the bar measuring how often a particular category was observed.

EXAMPLE 1.3 In a public education survey, 400 school administrators were asked to rate the quality of
education in the United States. Their responses are summarized in Table 1.1. Construct a pie
chart and a bar chart for this set of data.

Solution To construct a pie chart, assign one sector of a circle to each category. The
angle of each sector is determined by the proportion of measurements (or relative frequency)
in that category. Since a circle contains 360°, you can use this equation to find the angle:

Angle = Relative frequency X 360°

■ Table 1.1 U.S. Education Rating by 400 Educators


Rating Frequency
A 35
B 260
C 93
D 12
Total 400

• Need a Tip?
Proportions add to 1.
Percents add to 100.
Sector angles add to 360°.

Table 1.2 shows the ratings along with the frequencies, relative frequencies, percentages,
and sector angles necessary to construct the pie chart shown in Figure 1.3. While pie charts
use percentages to determine the relative sizes of the "pie slices," bar charts usually plot
frequency against the categories. A bar chart for these data is shown in Figure 1.4.


■ Table 1.2 Calculations for the Pie Chart in Example 1.3

Rating  Frequency  Relative Frequency  Percent  Angle
A        35        35/400 = .09          9%     .09 × 360 = 32.4°
B       260        260/400 = .65        65%     234.0°
C        93        93/400 = .23         23%     82.8°
D        12        12/400 = .03          3%     10.8°
Total   400        1.00                100%     360°
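
The calculations behind Table 1.2 can be reproduced with a short Python sketch. It assumes, based on the first row of the table (.09 × 360 = 32.4°), that each relative frequency is rounded to two decimal places before the angle is computed; the `decimal` module is used so that 35/400 = .0875 rounds up to .09 exactly as it would by hand. The variable names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Frequencies from Table 1.1.
ratings = {"A": 35, "B": 260, "C": 93, "D": 12}
n = sum(ratings.values())  # total number of measurements

rows = []
for rating, frequency in ratings.items():
    # Relative frequency, rounded to two decimals (assumed to match the table).
    relative = (Decimal(frequency) / Decimal(n)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    percent = relative * 100   # Percent = 100 x Relative frequency
    angle = relative * 360     # Angle = Relative frequency x 360 degrees
    rows.append((rating, frequency, relative, percent, angle))

for rating, frequency, relative, percent, angle in rows:
    print(f"{rating}  {frequency:>3}  {relative}  {percent}%  {angle} degrees")

print("Total angle:", sum(row[4] for row in rows))
```

With unrounded relative frequencies the individual angles would differ slightly (for example, .0875 × 360 = 31.5° for rating A), but they would still add to 360°.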

These two graphs look quite different. The pie chart shows the relationship of the parts to
the whole; the bar chart shows the actual quantity or frequency for each category. Since the
categories in this example are ordered "grades" (A, B, C, D), we would not want to rearrange
the bars in the chart to change its shape. In a pie chart, the order of presentation is irrelevant.

Figure 1.3 Pie chart for Example 1.3

Figure 1.4 Bar chart for Example 1.3 (vertical axis: Frequency; horizontal axis: Rating)

EXAMPLE 1.4 A snack size bag of peanut M&M'S candies contains 21 candies with the colors listed in
Table 1.3. The variable "color" is qualitative, so Table 1.4 lists the six categories along with
a tally of the number of candies of each color. The last three columns of Table 1.4 show how
often each category occurred. Since the categories are colors and have no particular order,
you could construct bar charts with many different shapes just by reordering the bars. To
emphasize that brown is the most frequent color, followed by blue, green, and orange, we order
the bars from largest to smallest and create the bar chart in Figure 1.5. A bar chart in which
the bars are ordered from largest to smallest is called a Pareto chart.


■ Table 1.3 Raw Data: Colors of 21 Candies


Brown Green Brown Blue
Red Red Green Brown
Yellow Orange Green Blue
Brown Blue Blue Brown
Orange Blue Brown Orange
Yellow

■ Table 1.4 Statistical Table: M&M'S Data for Example 1.4

Category  Tally    Frequency  Relative Frequency  Percent
Brown     IIIII I   6          6/21                28%
Green     III       3          3/21                14
Orange    III       3          3/21                14
Yellow    II        2          2/21                10
Red       II        2          2/21                10
Blue      IIIII     5          5/21                24
Total              21                              100%
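
The tally in Table 1.4 and the largest-to-smallest ordering of a Pareto chart can be sketched in Python from the raw data in Table 1.3. This is an illustration only: the text-based bars stand in for a real chart, and colors with equal frequencies are left in the order in which they first appear in the raw data, which is an arbitrary choice.

```python
from collections import Counter

# Raw data from Table 1.3, read row by row.
colors = [
    "Brown", "Green", "Brown", "Blue",
    "Red", "Red", "Green", "Brown",
    "Yellow", "Orange", "Green", "Blue",
    "Brown", "Blue", "Blue", "Brown",
    "Orange", "Blue", "Brown", "Orange",
    "Yellow",
]

# Step 1: tally the number of candies of each color.
counts = Counter(colors)

# Step 2: order the categories from largest to smallest frequency.
pareto = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Step 3: draw one text bar per category, longest bar first.
for color, frequency in pareto:
    print(f"{color:<7} {'#' * frequency} {frequency}")
```

Brown and blue lead the chart, as in Figure 1.5; the tied categories may appear in a different order from the figure, since any ordering of ties still gives bars that run from largest to smallest.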

Figure 1.5 Pareto chart for Example 1.4 (vertical axis: Frequency; horizontal axis: Color,
in the order Brown, Blue, Green, Orange, Yellow, Red)

1.2 EXERCISES
The Basics
Pie and Bar Charts The data in Exercises 1-3 represent different ways to classify a group
of 100 students in a statistics class. Construct a bar chart and pie chart to describe each
set of data.

1.
Final Grade  Frequency
A            31
B            36
C            21
D            9
F            3

2.
Status        Frequency
Freshman      32
Sophomore     34
Junior        17
Senior        9
Grad Student  8

3.
College                        Frequency
Humanities, Arts, & Sciences   43
Natural/Agricultural Sciences  32
Business                       17
Other                          8

4. Groups of People Fifty people are grouped into four categories (A, B, C, and D) and
the number of people who fall into each category is shown in the table:

Category  Frequency
A         11
B         14
C         20
D         5

a. Construct a pie chart to describe the data.
b. Construct a bar chart to describe the data.
c. Does the shape of the bar chart in part b change depending on the order of
presentation of the four categories? Is the order of presentation important?
d. What proportion of the people are in category B, C, or D?
e. What percentage of the people are not in category B?


5. Jeans A manufacturer of jeans has plants in California, Arizona, and Texas. Twenty-five
pairs of jeans are randomly selected from the computerized database, and the state in which
each is produced is recorded:

CA AZ AZ TX CA
CA CA TX TX TX
AZ AZ CA AZ TX
CA AZ TX TX TX
CA AZ AZ CA CA

a. Use a pie chart to describe the data.
b. Use a bar chart to describe the data.
c. What proportion of the jeans are made in Texas?
d. What state produced the most jeans in the group?
e. If you want to find out whether the three plants produced equal numbers of jeans, how
can you use the charts from parts a and b to help you? What conclusions can you draw
from these data?

Applying the Basics
Presidential Popularity After the elections of 2016, a poll was taken to study the approval
ratings for past presidents George W. Bush and Barack Obama. The poll, involving 1,009
U.S. adults 18 years or older living in the United States and the District of Columbia,
gives approval ratings by gender, race, age, and party ID.1 Use this data for Exercises 6-10.

Category        George W. Bush  Barack Obama
U.S. Adults     59              63
Gender
  Men           56              60
  Women         60              66
Race
  White         64              55
  Nonwhite      47              82
Age
  18 to 34      42              75
  35 to 54      64              62
  55+           65              55
Party ID
  Republicans   82              22
  Independents  56              65
  Democrats     41              95

6. Draw a bar chart to describe the approval rating of George W. Bush based on party ID.
7. Draw a bar chart to describe the approval rating of George W. Bush based on age.
8. Draw a bar chart to describe the approval rating of Barack Obama based on party ID.
9. Draw a bar chart to describe the approval rating of Barack Obama based on age.
10. What effect, if any, do the variables of gender, race, age, and party affiliation have
on the approval ratings?

11. Want to Be President? In an opinion poll conducted by ABC News, nearly 80% of the
teens said they were not interested in being the president of the United States.2 When
asked "What's the main reason you would not want to be president?" they gave the
responses as follows:

Other career plans/no interest  40%
Too much pressure               20%
Too much work                   15%
Wouldn't be good at it          14%
Too much arguing                5%

a. Are all of the reasons accounted for in this table? Add another category if necessary.
b. Would you use a pie chart or a bar chart to graphically describe the data? Why?
c. Draw the chart you chose in part b.
d. If you were the person conducting the opinion poll, what other types of questions might
you want to investigate?

Facebook Fanatics (data set DS0101) The social networking site Facebook has grown rapidly
in the last 10 years. The following table shows the average number of daily users (in
millions) as it has grown from 2010 to 2017 in different regions in the world.3 Use this
data for Exercises 12-15.

Region                2010  2017
United States/Canada  99    183
Europe                107   271
Asia                  64    453
Rest of the world     58    419
Total                 328   1,326
Source: Company reports/2017 as of Q2

12. Use a pie chart to describe the distribution of average daily users for the four
regions in 2017.
13. Use a bar chart to describe the distribution of average daily users for the four
regions in 2010.
14. Use a bar chart to describe the distribution of average daily users for the four
regions in 2017.
15. How would you describe the changes in the distribution of average daily users during
this 7-year period?


16. Back to Work How long does it take you to adjust to your normal work routine after coming
back from vacation? A bar graph with data from a USA Today snapshot is shown here:
a. Are all of the opinions accounted for in the table? Add another category if necessary.
b. Is the bar chart drawn accurately? That is, are the three bars in the correct proportion
to each other?
c. Use a pie chart to describe the opinions. Which graph is more interesting to look at?

[Figure: bar graph titled "Adjustment from Vacation"]

Diamonds Are Forever! Much of the world's diamond mining industry is located in Africa,
Russia, and Canada. A visual representation of the various shares of the world's diamond
revenues, adapted from Time Magazine,⁴ is shown as follows. Use this information to answer
the questions in Exercises 17-20.

[Figure: "Share of World Diamond Revenues." Source: Kimberley Process]

17. Draw a pie chart to describe the various shares of the world's diamond revenues.
18. Draw a bar chart to describe the various shares of the world's diamond revenues.
19. Draw a Pareto chart to describe the various shares of the world's diamond revenues.
20. Which of the charts is the most effective in describing the data?

[DS0102] 21. Car Colors The most popular colors for compact and sports cars in a recent year
are given in the table.⁵

Color                 Percentage    Color                Percentage
Silver                    14        White/white pearl        21
Black/black effect        21        Beige/brown               4
Gray                      17        Yellow/gold               2
Blue                       9        Green
Red                       11        Other                    <1

Source: The World Almanac and Book of Facts 2017

Use an appropriate graph to describe these data.

1.3 Graphs for Quantitative Data


Quantitative variables measure an amount or quantity on each experimental unit. If the
variable can take only a finite or countable number of values, it is a discrete variable. A
variable that can take on the infinite number of values corresponding to points on a line
interval is called continuous.

■ Pie Charts and Bar Charts


Sometimes a quantitative variable might be measured on different segments of the population,
or for different categories of classification. For example, you might measure the
average incomes for people of different age groups, different genders, or living in different
geographic areas of the country. In such cases, you can use pie charts or bar charts to
describe the data, using the amount measured in each category. The pie chart displays how
the total quantity is distributed among the categories, and the bar chart uses the height of
the bar to display the amount in a particular category.


EXAMPLE 1.5 The amount of money expended in fiscal year 2016 by the U.S. Department of Defense in
various categories is shown in Table 1.5.⁶ Use both a pie chart and a bar chart to describe the
data. Compare the two forms of presentation.

■ Table 1.5 Expenses by Category

Category                       Amount ($ billions)
Military personnel                   138.6
Operation and maintenance            244.4
Procurement                          118.9
Research and development              69.0
Military construction                  6.9
Other                                  2.5

Total                                580.3

Solution Two variables are being measured: the category of expenditure (qualitative)
and the amount of the expenditure (quantitative). The bar chart in Figure 1.6 displays the
categories on the horizontal axis and the amounts on the vertical axis.

Figure 1.6 Bar chart for Example 1.5 (categories on the horizontal axis, amounts on the vertical axis)

For the pie chart in Figure 1.7, each "pie slice" represents the proportion of the total
expenditures ($580.3 billion) corresponding to its particular category. For example, for the
research and development category, the angle of the sector is

\[ \frac{69.0}{580.3} \times 360^\circ = 42.8^\circ \]
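
The same arithmetic can be repeated for every category. The following is a minimal Python sketch, not part of the original example; the function name `sector_angles` and the dictionary layout are illustrative assumptions, while the amounts come from Table 1.5.

```python
# Amounts from Table 1.5 (in $ billions).
expenses = {
    "Military personnel": 138.6,
    "Operation and maintenance": 244.4,
    "Procurement": 118.9,
    "Research and development": 69.0,
    "Military construction": 6.9,
    "Other": 2.5,
}


def sector_angles(amounts):
    """Return each category's pie-chart sector angle in degrees.

    Hypothetical helper: angle = (category amount / total) * 360.
    """
    total = sum(amounts.values())
    return {name: value / total * 360 for name, value in amounts.items()}


angles = sector_angles(expenses)
for name, angle in angles.items():
    print(f"{name:28s}{angle:6.1f} degrees")
```

Because each angle is a share of the total multiplied by 360, the six angles add up to a full circle.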

Figure 1.7 Pie chart for Example 1.5 (sectors labeled with category and amount, for example
Operation and maintenance 244.4, Research and development 69.0, Military construction 6.9)


Both graphs show that the largest amounts of money were spent on personnel and opera-
tions. Since there is no particular order to the categories, you are free to rearrange the bars
or sectors of the graphs in any way you like. The shape of the bar chart has no bearing on
its interpretation.

■ Line Charts
When a quantitative variable is recorded over time at equally spaced intervals (such as daily,
weekly, monthly, quarterly, or yearly), the data set forms a time series. Time series data are
most effectively presented on a line chart with time as the horizontal axis. The idea is to
try to find a pattern or trend that will likely continue into the future, and then to use that
pattern to make accurate predictions for the immediate future.

EXAMPLE 1.6 In the year 2025, the oldest "baby boomers" (born in 1946) will be 79 years old, and the oldest
"Gen Xers" (born in 1965) will be 2 years from Social Security eligibility. How will this affect
the consumer trends in the next 40 years? Will there be sufficient funds for "baby boomers" to
collect Social Security benefits? The United States Bureau of the Census gives projections for
the portion of the U.S. population that will be 85 and over in the coming years, as shown in
Table 1.6.⁵ Use a line chart to illustrate the data. What is the effect of stretching and shrinking
the vertical axis on the line chart?

■ Table 1.6 Population Growth Projections


Year 2020 2030 2040 2050 2060
85 and over (millions) 6.7 9.1 14.6 19.0 19.7

Source: The World Almanac and Book of Facts 2017, p. 618

• Need a Tip? Beware of stretching or shrinking axes when you look at a graph!

Solution The quantitative variable "85 and over" is measured at five points in time,
creating a time series that you can graph with a line chart. The time intervals are marked
on the horizontal axis and the projections on the vertical axis. The data points are then
connected to form the line charts in Figure 1.8. Notice the difference in the vertical scales of
the two graphs. Shrinking the scale on the vertical axis causes large changes to appear small,
and vice versa. To avoid misleading conclusions, look carefully at the scales of the vertical
and horizontal axes. However, from both graphs you get a clear picture of the steadily
increasing number of those 85 and older over the next 40 years.
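
One way to quantify the effect of the vertical scale is to ask what fraction of the axis height the data actually occupy. The sketch below is illustrative only: the helper `fraction_of_axis` and the two sets of axis limits (5 to 20 and 0 to 100) are assumptions chosen for the comparison, while the projections come from Table 1.6.

```python
# Projections from Table 1.6: U.S. population aged 85 and over (millions)
# for the years 2020, 2030, 2040, 2050, and 2060.
millions = [6.7, 9.1, 14.6, 19.0, 19.7]


def fraction_of_axis(values, axis_min, axis_max):
    """Fraction of the vertical axis covered by the rise from the
    smallest to the largest plotted value (hypothetical helper)."""
    return (max(values) - min(values)) / (axis_max - axis_min)


# Assumed axis limits: a tight axis and a stretched-out one.
tight = fraction_of_axis(millions, 5, 20)
loose = fraction_of_axis(millions, 0, 100)
print(f"axis 5 to 20:  the rise fills {tight:.0%} of the height")
print(f"axis 0 to 100: the rise fills {loose:.0%} of the height")
```

The data are identical in both cases; only the axis limits change how steep the increase appears.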

Figure 1.8 Line charts for Example 1.6 (two panels with different vertical scales; horizontal axis: Year)


■ Dotplots
Many sets of quantitative data consist of numbers that cannot easily be separated into cat-
egories or intervals of time. You need a different way to graph this type of data!
The simplest graph for quantitative data is the dotplot. For a small set of measurements
(for example, the set 2, 6, 9, 3, 7, 6), you can simply plot the measurements as points
on a horizontal axis, as shown in Figure 1.9(a). For a large data set, however, such as the
one in Figure 1.9(b), the dotplot can be hard to interpret.
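
A dotplot for the small set can be sketched in plain text. The following Python function is an illustrative assumption, not the book's software; it assumes single-digit integer measurements so that the columns line up, and stacks one dot per occurrence above each axis value.

```python
from collections import Counter


def dotplot(values):
    """Return a text dotplot: one column per integer on the axis,
    one dot stacked for each occurrence of that value.
    Assumes single-digit integers so the columns stay aligned."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)
    rows = []
    # Build the rows from the tallest stack down to the axis.
    for level in range(max(counts.values()), 0, -1):
        row = " ".join("•" if counts[v] >= level else " " for v in axis)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in axis))
    return "\n".join(rows)


print(dotplot([2, 6, 9, 3, 7, 6]))
```

The repeated value 6 appears as a stack of two dots; every other measurement is a single dot.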

Figure 1.9 Dotplots for small and large data sets: (a) Small Set, (b) Large Set

■ Stem and Leaf Plots


Another simple way to display the distribution of a quantitative data set is the stem and leaf
plot. This plot uses the actual numerical values of each data point.

Need to Know...
How to Construct a Stem and Leaf Plot
1. Divide each measurement into two parts: the stem and the leaf.
2. List the stems in a column, with a vertical line to their right.
3. For each measurement, record the leaf portion in the same row as its
corresponding stem.
4. Order the leaves from lowest to highest in each stem.
5. Provide a key to your stem and leaf coding so that the reader can re-create the
actual measurements if necessary.

EXAMPLE 1.7 Table 1.7 lists the prices (in dollars) of 19 different brands of walking shoes. Use a stem and
leaf plot to display the data.

■ Table 1.7 Prices of Walking Shoes


90 70 70 70 75 70
65 68 60 74 70 95
75 70 68 65 40 65
70

Solution To create the stem and leaf, divide each observation between the ones and the
tens place. The number to the left is the stem; the number to the right is the leaf. Thus, for
the shoes that cost $65, the stem is 6 and the leaf is 5. The stems, ranging from 4 to 9, are
listed in Figure 1.10, along with the leaves for each of the 19 measurements. If you indicate
that the leaf unit is 1, the reader will realize that the stem 6 and the leaf 8, for example,
represent the number 68, recorded to the nearest dollar.
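
The construction steps can be sketched in Python. This is a minimal illustration under stated assumptions: the function name `stem_and_leaf` is hypothetical, the data are whole-dollar prices from Table 1.7, and the stem is taken to be the tens digit with a leaf unit of 1.

```python
from collections import defaultdict

# Prices from Table 1.7 (dollars).
prices = [90, 70, 70, 70, 75, 70,
          65, 68, 60, 74, 70, 95,
          75, 70, 68, 65, 40, 65,
          70]


def stem_and_leaf(values):
    """Map each tens-digit stem to its sorted ones-digit leaves."""
    leaves = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)   # 65 -> stem 6, leaf 5
        leaves[stem].append(leaf)
    # Include every stem in the range, even those with no leaves.
    return {stem: sorted(leaves[stem])
            for stem in range(min(leaves), max(leaves) + 1)}


plot = stem_and_leaf(prices)
for stem, row in plot.items():
    print(f"{stem} | {''.join(map(str, row))}")
print("Leaf unit = 1")
```

Stems 5 and 8 are printed with no leaves, which keeps the gaps in the distribution visible.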


Figure 1.10 Stem and leaf plot for the data in Table 1.7

Leaves as recorded       After reordering
4 | 0                    4 | 0
5 |                      5 |
6 | 580855               6 | 055588
7 | 0005040500           7 | 0000000455
8 |                      8 |
9 | 05                   9 | 05
Leaf unit = 1

• Need a Tip? Stem | Leaf

Sometimes the available stem choices result in a plot that contains too few stems and
a large number of leaves within each stem. In this situation, you can stretch the stems by
dividing each one into several lines, depending on the leaf values assigned to them. Stems
are usually divided in one of two ways:
• Into two lines, with leaves 0-4 in the first line and leaves 5-9 in the second line
• Into five lines, with leaves 0-1, 2-3, 4-5, 6-7, and 8-9 in the five lines, respectively

EXAMPLE 1.8 The data in Table 1.8 are the weights at birth of 30 full-term babies, born at a metropolitan
hospital and recorded to the nearest tenth of a pound.⁷ Construct a stem and leaf plot to display
the distribution of the data.

■ Table 1.8 Birth Weights of 30 Full-Term Newborn Babies


7.2 7.8 6.8 6.2 8.2
8.0 8.2 5.6 8.6 7.1
8.2 7.7 7.5 7.2 7.7
5.8 6.8 6.8 8.5 7.5
6.1 7.9 9.4 9.0 7.8
8.5 9.0 7.7 6.7 7.7

Solution The data, though recorded to an accuracy of only one decimal place, are
measurements of the continuous variable x = weight, which can take on any positive value.
If you scan the data in Table 1.8, you will find that the highest and lowest weights are
9.4 and 5.6, respectively. But how are the remaining weights distributed? If you use the
decimal point as the dividing line between the stem and the leaf, you have only five stems,
which does not give you a very good picture. When you divide each stem into two lines,
there are eight stems, because the first line of stem 5 and the second line of stem 9 are
empty! This is a more descriptive plot, as shown in Figure 1.11. For these data, the leaf
unit is .1, and the reader can tell that the stem 8 and the leaf 2, for example, represent the
measurement x = 8.2.
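
Splitting each stem into two lines can be sketched as follows. The function `split_stem_plot` is a hypothetical helper, and converting each weight to whole tenths before splitting is an implementation choice made here to avoid floating-point rounding; the data come from Table 1.8. Lines with no leaves are simply not produced, which matches this example because only the first line of stem 5 and the second line of stem 9 are empty.

```python
# Birth weights from Table 1.8 (pounds, to the nearest tenth).
weights = [7.2, 7.8, 6.8, 6.2, 8.2,
           8.0, 8.2, 5.6, 8.6, 7.1,
           8.2, 7.7, 7.5, 7.2, 7.7,
           5.8, 6.8, 6.8, 8.5, 7.5,
           6.1, 7.9, 9.4, 9.0, 7.8,
           8.5, 9.0, 7.7, 6.7, 7.7]


def split_stem_plot(values):
    """Return (stem, sorted leaves) rows with each stem split in two:
    leaves 0-4 on the first line, leaves 5-9 on the second."""
    rows = {}
    for value in values:
        tenths = round(value * 10)       # 8.2 -> 82, exact integer
        stem, leaf = divmod(tenths, 10)  # 82 -> stem 8, leaf 2
        half = 0 if leaf <= 4 else 1
        rows.setdefault((stem, half), []).append(leaf)
    return [(stem, sorted(leaves))
            for (stem, half), leaves in sorted(rows.items())]


rows = split_stem_plot(weights)
for stem, leaves in rows:
    print(f"{stem} | {''.join(map(str, leaves))}")
print("Leaf unit = .1")
```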

Figure 1.11 Stem and leaf plot for the data in Table 1.8

Leaves as recorded       After reordering
5 | 86                   5 | 68
6 | 12                   6 | 12
6 | 8887                 6 | 7888
7 | 221                  7 | 122
7 | 879577587            7 | 557777889
8 | 0222                 8 | 0222
8 | 565                  8 | 556
9 | 040                  9 | 004
Leaf unit = .1

If you turn the stem and leaf plot sideways, so that the vertical line is now a horizontal
axis, you can see that the data have "piled up" or been "distributed" along the axis in a
pattern that can be described as "mound-shaped," like a pile of sand on the beach. This plot
again shows that the weights of these 30 newborns range between 5.6 and 9.4; many weights
are between 7.5 and 8.0 pounds.


■ Interpreting Graphs with a Critical Eye


Once you have created a graph or graphs for a set of data, what should you look for as you
attempt to describe the data?
• First, check the horizontal and vertical scales, so that you are clear about what is
being measured.
• Look at the location of the data distribution. Where on the horizontal axis is the
center of the distribution? If you are comparing two distributions, are they both
centered in the same place?
• Look at the shape of the distribution. Does the distribution have one "peak," a
point that is higher than any other? If so, this is the most frequently occurring
measurement or category. Is there more than one peak? Are there an approximately
equal number of measurements to the left and right of the peak?
• Look for any unusual measurements or outliers. That is, are any measurements
much bigger or smaller than all of the others? These outliers may not be
representative of the other values in the set.

Distributions are often described according to their shapes.

DEFINITION

A distribution is symmetric if the left and right sides of the distribution, when divided
at the middle value, form mirror images.
A distribution is skewed to the right if a greater proportion of the measurements lie to
the right of the peak value. Distributions that are skewed right contain a few unusually
large measurements.
A distribution is skewed to the left if a greater proportion of the measurements lie to the left of
the peak value. Distributions that are skewed left contain a few unusually small measurements.
A distribution is unimodal if it has one peak; a bimodal distribution has two peaks.
Bimodal distributions often represent a mixture of two different populations in the data set.
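
These definitions can be turned into a rough check by comparing the proportions of measurements on each side of the peak. The sketch below is only an illustration for a distribution with a single peak: the function `describe_shape`, the sample data, and especially the `tolerance` cutoff for "approximately equal" are assumptions, since the text gives no numerical rule. If two values tie for the highest count, the one that appears first in the data is treated as the peak.

```python
from collections import Counter


def describe_shape(values, tolerance=0.10):
    """Compare the proportions of measurements on each side of the peak.

    `tolerance` is an assumed cutoff for "approximately equal";
    it is not taken from the text.
    """
    peak, _ = Counter(values).most_common(1)[0]
    n = len(values)
    left = sum(v < peak for v in values) / n
    right = sum(v > peak for v in values) / n
    if abs(left - right) <= tolerance:
        shape = "roughly symmetric"
    elif right > left:
        shape = "skewed to the right"
    else:
        shape = "skewed to the left"
    return peak, left, right, shape


# Hypothetical data: a peak at 3 with a long right tail.
sample = [2, 3, 3, 3, 3, 4, 4, 5, 5, 6, 7, 9]
print(describe_shape(sample))
```

A graph is still the better tool for judging shape; a check like this only restates the definition in numbers.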

EXAMPLE 1.9 Look at the four dotplots shown in Figure 1.12. Describe the locations and shapes of these
distributions.

Figure 1.12 Shapes of data distributions for Example 1.9 (four dotplots)

• Need a Tip?
Symmetric ⇔ mirror images
Skewed right ⇔ long right tail
Skewed left ⇔ long left tail

Solution The first dotplot shows a relatively symmetric distribution with a single peak
located at x = 4. If you were to fold the page at this peak, the left and right halves would
almost be mirror images. Sometimes this shape is called "mound-shaped," because the data
points seem to pile up like a mound of sand. The second dotplot is also symmetric, but it is
flat or "uniform" rather than mound-shaped. The third dotplot, however, is far from symmetric.
It has a long "right tail," meaning that there are a few unusually large observations.
If you were to fold the page at the peak, a larger proportion of measurements would be on
the right side than on the left. This distribution is skewed to the right. Similarly, the fourth
dotplot with the long "left tail" is skewed to the left.

EXAMPLE 1.10 An administrative assistant for the athletics department at a local university is monitoring the
GPAs for eight members of the women's volleyball team. He enters the GPAs into the database
but accidentally misplaces the decimal point in the last entry.

2.8 3.0 3.0 3.3 2.4 3.4 3.0 .21

Use a dotplot to describe the data and uncover the assistant's mistake.

Solution The dotplot of this small data set is shown in Figure 1.13(a). You can clearly see
the outlier or unusual observation caused by the assistant's data entry error. Once the error
has been corrected, as in Figure 1.13(b), you can see the correct distribution of the data set.
Since this is a very small set, it is hard to describe the shape of the distribution, although it
seems to have a peak value around 3.0 and it appears to be relatively symmetric.

Figure 1.13 Distributions of GPAs for Example 1.10: (a) with the data entry error, (b) with the
error corrected (horizontal axis: GPAs)

• Need a Tip? Outliers lie out, away from the main body of data.

When comparing graphs created for two data sets, you should compare their scales
of measurement, locations, and shapes, and look for unusual measurements or outliers.
Remember that outliers are not always caused by errors or incorrect data entry. Sometimes
they provide very valuable information that should not be ignored. You may need
to investigate whether an outlier is a valid measurement that is simply unusually large or
small, or whether there has been some sort of mistake in the data collection. If the scales
differ widely, be careful about making comparisons or drawing conclusions that might be
inaccurate!


1.3 EXERCISES
The Basics
Dotplots Construct a dotplot for the data given in Exercises 1-2. Describe the shape of the
distribution and look for any outliers.
1. 2.0, 1.0, 1.1, 0.9, 1.0, 1.2, 1.3, 1.1, 0.9, 1.0, 0.9, 1.4, 0.9, 1.0, 1.0
2. 53, 61, 58, 56, 58, 60, 54, 54, 62, 58, 60, 58, 56, 56, 58

[DS0103] Stem and Leaf I Construct a stem and leaf plot for these 50 measurements and answer
the questions in Exercises 3-5.

3.1 4.9 2.8 3.6 2.5 4.5 3.5 3.7 4.1 4.9
2.9 2.1 3.5 4.0 3.7 2.7 4.0 4.4 3.7 4.2
3.8 6.2 2.5 2.9 2.8 5.1 1.8 5.6 2.2 3.4
2.5 3.6 5.1 4.8 1.6 3.6 6.1 4.7 3.9 3.9
4.3 5.7 3.7 4.6 4.0 5.6 4.9 4.2 3.1 3.9

3. Describe the shape of the distribution. Do you see any outliers?
4. Use the stem and leaf plot to find the smallest observation.
5. Find the eighth and ninth largest observations.

[DS0104] Stem and Leaf II Use the following set of data to answer the questions in Exercises 6-8.

4.5 3.2 3.5 3.9 3.5 3.9
4.3 4.8 3.6 3.3 4.3 4.2
3.9 3.7 4.3 4.4 3.4 4.2
4.4 4.0 3.6 3.5 3.9 4.0

6. Draw a stem and leaf plot, using the number in the ones place as the stem.
7. Draw a stem and leaf plot, using each number in the ones place twice to form the stems.
8. Does the stem and leaf plot in Exercise 7 improve the presentation of the data?

Comparing Graphs A discrete variable can take on only the values 0, 1, or 2. Use the set of
20 measurements on this variable to answer the questions in Exercises 9-12.

2 0 2
2 1 0 0
2 2 1 0
0 2

9. Draw a dotplot to describe the data.
10. How could you define the stem and leaf for this data set? Draw the stem and leaf plot.
11. Describe the shape of the distribution. Do you see any outliers?
12. Compare the dotplot and the stem and leaf plot. Do they convey roughly the same information?

Line Charts Construct a line chart to describe the data and answer the questions in
Exercises 13-14.
13. Navigating a Maze A psychologist measured the length of time it took for a rat to get
through a maze on each of 5 days. Do you think that any learning is taking place?

Day              1 2 3 4 5
Time (seconds)   43 46 32 25

[DS0105] 14. Measuring over Time A quantitative variable is measured once a year for a 10-year
period. What does the line chart tell you about the data?

Year   Measurement    Year   Measurement
  1       61.5          6       58.2
  2       62.3          7       57.5
  3       60.7          8       57.5
  4       59.8          9       56.1
  5       58.0         10       56.0

Applying the Basics
15. Cheeseburgers Create a dotplot for the number of cheeseburgers eaten in a given week by
10 college students.

4 5 4 2 1
3 3 4 2 7

a. How would you describe the shape of the distribution?
b. What proportion of the students ate more than 4 cheeseburgers that week?

[DS0106] 16. Test Scores The test scores on a 100-point test were recorded for 20 students.

61 93 91 86 55 63 86 82 76 57
94 89 67 62 72 87 68 65 75 84

a. Use a stem and leaf plot to describe the data.
b. Describe the shape and location of the scores.
c. Is the shape of the distribution unusual? Can you think of any reason that the scores
would have such a shape?


[DS0107] 17. RBC Counts The red blood cell count of a healthy person was measured on each of
15 days. The number recorded is measured in millions of cells per microliter (µL).

5.4 5.2 5.0 5.2 5.5
5.3 5.4 5.2 5.1 5.3
5.3 4.9 5.4 5.2 5.2

a. Use a stem and leaf plot to describe the data.
b. Describe the shape and location of the red blood cell counts.
c. If the person's red blood cell count is measured today as 5.7 million cells per microliter,
would this be unusual? What conclusions might you draw?

[DS0108] 18. Calcium Contents The calcium content (Ca) of a powdered mineral substance was
analyzed 10 times with the following percent composition recorded.

2.71 2.82 2.79 2.81 2.68
2.71 2.81 2.69 2.75 2.76

a. Draw a dotplot to describe the data. (HINT: The scale of the horizontal axis should range
from 2.60 to 2.90.)
b. Draw a stem and leaf plot for the data. Use the numbers in the hundredths and thousandths
place as the stem.
c. Are any of the measurements inconsistent with the other measurements, indicating that the
technician may have made an error in the analysis?

[DS0109] 19. Aqua Running Aqua running has been suggested as a method of exercise for injured
athletes and others who want a low-impact aerobics program. A study reported in the Journal of
Sports Medicine reported the heart rates of 20 healthy volunteers at a cadence of 96 steps per
minute.⁸ The data are listed here:

 87 109  79  80  96  95  90  92  96  98
101  91  78 112  94  98  94 107  81  96

a. Construct a stem and leaf plot to describe the data.
b. Discuss the characteristics of the data distribution.

20. The Great Calorie Debate Want to lose weight? You can do it by cutting calories, as long
as you get enough nutritional value from the foods that you do eat! Here is a picture showing
the number of calories in some of America's favorite foods adapted from an article in The
Press-Enterprise.⁹

Number of Calories
Hershey's Kiss                                    26
Oreo cookie                                       53
350 mL can of Coke                               140
350 mL bottle of Budweiser beer                  145
Slice of a large Papa John's pepperoni pizza     330
Burger King Whopper with cheese                  800

a. Do the sizes, heights, and volumes of the six items accurately represent the number of
calories in the item?
b. Draw an actual bar chart to describe the number of calories in these six food favorites.

[DS0110] 21. Education Pays Off! Education pays off, according to some data from the Bureau of
Labor Statistics. The median weekly earnings and the unemployment rates for eight different
levels of education are shown in the table.¹⁰

Educational                        Unemployment    Median Usual
Attainment                         Rate (%)        Weekly Earnings ($)
Doctoral degree                       1.6              1,664
Professional degree                   1.6              1,745
Master's degree                       2.4              1,380
Bachelor's degree                     2.7              1,156
Associate degree                      3.6                819
Some college, no degree               4.4                756
High school diploma                   5.2                692
Less than a high school diploma       7.4                504

Note: Data are for persons age 25 and over. Earnings are for full-time wage and salary workers.
Source: Current Population Survey, U.S. Department of Labor, U.S. Bureau of Labor Statistics,
April 20, 2017

a. Draw a bar chart to describe the unemployment rates as they vary by education level.
b. Draw a bar chart to describe the median weekly earnings as they vary by education level.
c. Summarize the information using the graphs in parts a and b.

[DS0111] 22. Organized Religion Statistics of the world's religions are only approximate,
because many religions do not keep track of their membership numbers. An estimate of these
numbers (in millions) is shown in the table.¹¹

                                  Members                               Members
Religion                          (millions)    Religion                (millions)
Buddhism                             376        Judaism                     14
Christianity                       2,100        Sikhism                     23
Hinduism                             900        Chinese traditional        394
Islam                              1,500        Other                       61
Primal indigenous and                400
African traditional


a. Use a pie chart to describe the total membership in the world's organized religions.
b. Use a bar chart to describe the total membership in the world's organized religions.
c. Order the religious groups from the smallest to the largest number of members. Use a Pareto
chart to describe the data. Which of the three displays is the most effective?

[DS0112] 23. Hazardous Waste How safe is your neighborhood? Are there any hazardous waste
sites nearby? The table and the stem and leaf plot show the number of hazardous waste sites in
each of the 50 states and the District of Columbia in 2016.⁵

AL  15    HI   3    MA  33    NM  16    SD   2
AK   6    ID   9    MI  67    NY  87    TN  17
AZ   9    IL  49    MN  25    NC  39    TX  53
AR   9    IN  40    MS   9    ND   0    UT  18
CA  99    IA  13    MO  33    OH  43    VT  12
CO  21    KS  13    MT  19    OK   8    VA  31
CT  15    KY  13    NE  16    OR  14    WA  51
DE  14    LA  15    NV   1    PA  97    WV  10
DC   1    ME  13    NH  21    RI  12    WI  38
FL  54    MD  21    NJ 115    SC  25    WY   2
GA  17

Source: The World Almanac and Book of Facts 2017, p. 335

Stem-and-leaf of Hazardous Waste  N = 51
  6   0  011223
 12   0  689999
 21   1  022333344
 (9)  1  555667789
 21   2  111
 18   2  55
 16   3  133
 13   3  89
 11   4  03
  9   4  9
  8   5  134
  5   5
  5   6
  5   6  7
Leaf Unit = 1
HI  87, 97, 99, 115

a. Describe the shape of the distribution. Identify the unusually large measurements marked
"HI" by state.
b. Can you think of a reason why these states would have a large number of hazardous waste
sites? What other variable might you measure to help explain why the data behave as they do?

[DS0113] 24. Top 20 Movies The table that follows shows the weekend gross ticket sales for the
top 20 movies for the weekend of August 25-28, 2017¹²:

                               Weekend Gross                                          Weekend Gross
Movie                          ($ millions)    Movie                                  ($ millions)
1. The Hitman's Bodyguard         $10.3        11. Girl's Trip                            $2.4
2. Annabelle Creation               7.7        12. The Nut Job 2: Nutty by Nature          2.3
3. Leap!                            4.7        13. Despicable Me 3                         1.8
4. Wind River                       4.6        14. The Dark Tower                          1.7
5. Logan Lucky                      4.2        15. Wonder Woman                            1.7
6. Dunkirk                          3.9        16. All Saints                              1.5
7. Spiderman Homecoming             2.8        17. Kidnap (2017)                           1.5
8. Birth of the Dragon              2.7        18. The Glass Castle                        1.4
9. Mayweather vs. McGregor          2.6        19. Baby Driver                             1.2
10. The Emoji Movie                 2.5        20. War for the Planet of the Apes          0.9

a. Draw a stem and leaf plot for the data. Describe the shape of the distribution and look for
outliers.
b. Draw a dotplot for the data. Which of the two graphs is more informative? Explain.

[DS0114] 25. American Presidents The following table lists the ages at the time of death for
the 38 deceased American presidents from George Washington to Ronald Reagan⁵:

Washington       67    Arthur            57
J. Adams         90    Cleveland         71
Jefferson        83    B. Harrison       67
Madison          85    McKinley          58
Monroe           73    T. Roosevelt      60
J.Q. Adams       80    Taft              72
Jackson          78    Wilson            67
Van Buren        79    Harding           57
W. H. Harrison   68    Coolidge          60
Tyler            71    Hoover            90
Polk             53    F. D. Roosevelt   63
Taylor           65    Truman            88
Fillmore         74    Eisenhower        78
Pierce           64    Kennedy           46
Buchanan         77    L. Johnson        64
Lincoln          56    Nixon             81
A. Johnson       66    Ford              93
Grant            63    Reagan            93
Hayes            70
Garfield         49

a. Before you graph the data, think about the distribution of the ages at death for the
presidents. What shape do you think it will have?


b. Draw a stem and leaf plot for the data. Describe the shape. Does it surprise you?
c. The five youngest presidents at the time of death appear in the lower "tail" of the
distribution. Identify these five presidents.
d. Three of the five youngest have one thing in common. What is it?

1.4 Relative Frequency Histograms


A relative frequency histogram resembles a bar chart, but it is used to graph quantitative
rather than qualitative data. The data in Table 1.9 are the birth weights of 30 full-term
newborn babies, reproduced from Example 1.8 and shown as a dotplot in Figure 1.14(a). First,
divide the interval from the smallest to the largest measurements into subintervals or classes
of equal length. If you stack up the dots in each subinterval (Figure 1.14(b)), and draw a
bar over each stack, you will have created a frequency histogram or a relative frequency
histogram, depending on the scale of the vertical axis.

■ Table 1.9 Birth Weights of 30 Full-Term Newborn Babies


7.2 7.8 6.8 6.2 8.2
8.0 8.2 5.6 8.6 7.1
8.2 7.7 7.5 7.2 7.7
5.8 6.8 6.8 8.5 7.5
6.1 7.9 9.4 9.0 7.8
8.5 9.0 7.7 6.7 7.7

Figure 1.14 How to construct a histogram: (a) dotplot of the birth weights; (b) the dots stacked within classes of width .5 from 6.0 to 9.5 [horizontal axis: Birth Weights]

Here are some definitions and terms that are commonly used when constructing relative
frequency histograms.

DEFINITIONS

• A class is a subinterval created when you divide up the interval from the smallest to
the largest measurement.
• The class boundaries are the numbers that create the upper and lower limits of the
class.
• The class width is the difference between the upper and lower class boundaries.
• The class frequency is the number of measurements falling into that particular class.
• A relative frequency histogram for a quantitative data set is a bar graph in which the
height of the bar shows "how often" (measured as a proportion or relative frequency)
measurements fall into each subinterval or class. The classes or subintervals are plotted
along the horizontal axis.


The way in which you create the classes or subintervals is a matter of personal choice.
However, as a rule of thumb, the number of classes should range from 5 to 12; the more
data available, the more classes you need. Choose the classes so that each measurement falls
into one and only one class. You can use Table 1.10 as a guide for selecting the approximate
number of classes for a particular data set. Remember, though, that this is only a guide; you
may use more or fewer classes if it makes the graph more descriptive.

■ Table 1.10 Choosing the Number of Classes


Sample Size 25 50 100 200 500
Number of Classes 6 7 8 9 10

For the birth weights in Table 1.9, we decided to use eight intervals of equal length. Since
the total span of the birth weights is

9.4 - 5.6 = 3.8

the minimum class width necessary to cover the range of the data is (3.8 ÷ 8) = .475.
For convenience, we round this approximate width up to .5. Beginning the first interval
at the lowest value, 5.6, we form subintervals from 5.6 up to but not including 6.1,
6.1 up to but not including 6.6, and so on. By using the method of left inclusion, and
including the left class boundary point but not the right boundary point in the class, we
eliminate any confusion about where to place a measurement that happens to fall on a
class boundary point.
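If you would like to check this arithmetic by computer, the short Python sketch below repeats it. The function name class_boundaries and its round_to argument (the "convenient" step to which the width is rounded up) are our own illustrative choices, not something defined in this text. Decimal arithmetic is used so that boundaries such as 6.1 are stored exactly.

```python
from decimal import Decimal, ROUND_CEILING

def class_boundaries(smallest, largest, num_classes, round_to):
    """Return (approximate width, rounded width, list of class boundaries).

    round_to is an assumed 'convenient' step; the width is rounded UP
    to the next multiple of it, as described in the text.
    """
    smallest, largest = Decimal(smallest), Decimal(largest)
    step = Decimal(round_to)
    approx_width = (largest - smallest) / num_classes
    # Round up so that the classes are sure to cover the whole range
    width = (approx_width / step).to_integral_value(rounding=ROUND_CEILING) * step
    boundaries = [smallest + i * width for i in range(num_classes + 1)]
    return approx_width, width, boundaries

approx, width, bounds = class_boundaries("5.6", "9.4", 8, "0.5")
print("approximate width:", approx)
print("rounded width:", width)
print("boundaries:", [str(b) for b in bounds])
```

With left inclusion, a measurement x belongs to the class whose boundaries satisfy lower <= x < upper, so each measurement falls into exactly one class.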
Table 1.11 shows the eight classes, labeled from 1 to 8 for identification. The boundaries
for the eight classes, along with a tally of the number of measurements that fall in each
class, are also listed in the table. As with the charts in Section 1.3, you can now measure
how often each class occurs using frequency or relative frequency.
To construct the relative frequency histogram, plot the class boundaries along the horizontal
axis. Draw a bar over each class interval, with height equal to the relative frequency
for that class. The relative frequency histogram for the birth weight data, Figure 1.15, shows
at a glance how birth weights are distributed over the interval 5.6 to 9.4.

Need a Tip? Relative frequencies add to 1; frequencies add to n.

■ Table 1.11 Relative Frequencies for the Data of Table 1.9

Class   Class Boundaries   Tally       Class Frequency   Class Relative Frequency
1       5.6 to < 6.1       ||          2                 2/30
2       6.1 to < 6.6       ||          2                 2/30
3       6.6 to < 7.1       ||||        4                 4/30
4       7.1 to < 7.6       |||||       5                 5/30
5       7.6 to < 8.1       ||||| |||   8                 8/30
6       8.1 to < 8.6       |||||       5                 5/30
7       8.6 to < 9.1       |||         3                 3/30
8       9.1 to < 9.6       |           1                 1/30


Figure 1.15 Relative frequency histogram [horizontal axis: Birth Weights, 5.6 to 9.6 in steps of .5; vertical axis: Relative Frequency, 0 to 8/30]

EXAMPLE 1.11 Twenty-five Starbucks® customers are polled in a marketing survey and asked, "How often do
you visit Starbucks in a typical week?" Table 1.12 lists the responses for these 25 customers.
Use a relative frequency histogram to describe the data.

■ Table 1.12 Number of Visits in a Typical Week for 25 Customers


6 7 5 6
4 6 4 6 8
6 5 6 3 4
5 5 5 7 6
3 5 7 5 5

Solution The variable being measured is "number of visits to Starbucks," which is a
discrete variable that takes on only integer values. In this case, it is simplest to choose the
classes or subintervals as the integer values over the range of observed values: 1, 2, 3, 4,
5, 6, 7, and 8. You could write the intervals as .5 to < 1.5, 1.5 to < 2.5, and so on, but notice
that the only values that can actually occur are the integer values, 1, 2, ..., 8. Table 1.13
shows the classes and their corresponding frequencies and relative frequencies. The relative
frequency histogram is shown in Figure 1.16.

■ Table 1.13 Frequency Table for Example 1.11


Number of Visits   Class                         Relative
to Starbucks       Boundaries      Frequency     Frequency
1                  .5 to < 1.5     1             .04
2                  1.5 to < 2.5    0             0
3                  2.5 to < 3.5    2             .08
4                  3.5 to < 4.5    3             .12
5                  4.5 to < 5.5    8             .32
6                  5.5 to < 6.5    7             .28
7                  6.5 to < 7.5    3             .12
8                  7.5 to < 8.5    1             .04


Figure 1.16 Relative frequency histogram for Example 1.11 [horizontal axis: Visits, 1 to 8; vertical axis: Relative Frequency, 0 to 0.35]

The distribution is skewed to the left and there is a gap between 1 and 3.

Need to Know...
How to Construct a Relative Frequency Histogram
1. Choose the number of classes, usually between 5 and 12. The more data you
have, the more classes you should use.
2. Find the approximate class width by dividing the difference between the largest
and smallest values by the number of classes.
3. Round the approximate class width up to a convenient number.
4. If the data are discrete, you might assign one class for each integer value. For a
large number of integer values, you may need to group them into classes.
5. List the class boundaries. The lowest class must include the smallest measurement.
Then add the remaining classes, including the left boundary point but not
the right.
6. Build a statistical table containing the classes, their frequencies, and their relative
frequencies.
7. Draw the histogram like a bar graph, with the class intervals on the horizontal
axis and relative frequencies as the heights of the bars.
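The steps in this box can be sketched in a few lines of Python. The sketch below is only one possible way to do it: the function name relative_frequency_table is our own, the data are the birth weights of Table 1.9, and the starting point, class width, and number of classes are the choices made earlier in this section. Each row of asterisks is a rough text version of one bar, with one asterisk per measurement.

```python
from decimal import Decimal
from fractions import Fraction

# Birth weights from Table 1.9
BIRTH_WEIGHTS = [
    7.2, 7.8, 6.8, 6.2, 8.2,
    8.0, 8.2, 5.6, 8.6, 7.1,
    8.2, 7.7, 7.5, 7.2, 7.7,
    5.8, 6.8, 6.8, 8.5, 7.5,
    6.1, 7.9, 9.4, 9.0, 7.8,
    8.5, 9.0, 7.7, 6.7, 7.7,
]

def relative_frequency_table(data, start, width, num_classes):
    """Return rows (lower, upper, frequency, relative frequency).

    Uses left inclusion: a value x is counted in the class with
    lower <= x < upper. Decimal keeps boundaries such as 6.1 exact.
    """
    values = [Decimal(str(x)) for x in data]
    start, width = Decimal(start), Decimal(width)
    rows = []
    for i in range(num_classes):
        lower = start + i * width
        upper = lower + width
        freq = sum(1 for v in values if lower <= v < upper)
        rows.append((lower, upper, freq, Fraction(freq, len(values))))
    return rows

rows = relative_frequency_table(BIRTH_WEIGHTS, "5.6", "0.5", 8)
n = len(BIRTH_WEIGHTS)
for lower, upper, freq, rel in rows:
    print(f"{lower} to < {upper}   {freq:2d}   {freq}/{n}   {'*' * freq}")
```

The frequencies printed by this sketch can be compared with the Class Frequency column of Table 1.11.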

A relative frequency histogram can be used to describe the location and shape of a data
set, and to check for outliers. For example, the birth weight data were relatively symmetric,
with no unusual measurements, while the Starbucks data were skewed left. Since the bar
drawn above each class represents the relative frequency or proportion of the measurements
in that class, these heights can be used to calculate the following:
• The proportion of the measurements that fall in a particular class or group of classes
• The probability that a measurement drawn at random from the set will fall in a particular
class or group of classes
Look at the relative frequency histogram for the birth weight data in Figure 1.15. What
proportion of the newborns have birth weights of 7.6 or higher? This involves all classes
beyond 7.6 in Table 1.11. Because there are 17 newborns in those classes, the proportion
who have birth weights of 7.6 or higher is 17/30, or approximately 57%. This is also the
percentage of the total area under the histogram in Figure 1.15 that lies to the right of 7.6.


Suppose you wrote each of the 30 birth weights on a piece of paper, put them in a hat, and
drew one at random. What is the chance that this piece of paper contains a birth weight of
7.6 or higher? Since 17 of the 30 pieces of paper fall in this category, you have 17 chances
out of 30; that is, the probability is 17/30. The word probability is not new to you; we will
discuss it in more detail in Chapter 4.
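The same count can be reproduced from the class frequencies alone. The sketch below is a minimal illustration, assuming the frequencies of Table 1.11; the names TABLE_1_11 and proportion_at_or_above are hypothetical and chosen only for this example. The fraction it returns can be read either as a proportion of the data set or as the probability for a single random draw from the hat.

```python
from decimal import Decimal
from fractions import Fraction

# (lower class boundary, class frequency) pairs copied from Table 1.11
TABLE_1_11 = [("5.6", 2), ("6.1", 2), ("6.6", 4), ("7.1", 5),
              ("7.6", 8), ("8.1", 5), ("8.6", 3), ("9.1", 1)]

def proportion_at_or_above(table, boundary):
    """Proportion of measurements in the classes that start at `boundary` or higher."""
    n = sum(freq for _, freq in table)
    count = sum(freq for lower, freq in table
                if Decimal(lower) >= Decimal(boundary))
    return Fraction(count, n)

p = proportion_at_or_above(TABLE_1_11, "7.6")
print(p)                  # 17/30
print(f"{float(p):.0%}")  # 57%
```

Note that this only works when the value of interest (here 7.6) is itself a class boundary; otherwise you would need to go back to the original measurements.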
Although we are only describing the set of n = 30 birth weights, we might also be
interested in the population from which the sample was drawn, which is the set of birth
weights of all babies born at this hospital. Or, if we are interested in the weights of
newborns in general, we might consider our sample as representative of the population
of birth weights for newborns at similar hospitals. A sample histogram provides valuable
information about the population histogram, the graph that describes the distribution
of the entire population. Remember, though, that different samples from the same
population will produce different histograms, even if you use the same class boundaries.
However, you can expect that the sample and population histograms will be similar. As
you add more and more data to the sample, the two histograms become more and more
alike. If you enlarge the sample to include the entire population, the two histograms will
be identical!
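You can watch this happen in a small simulation. The sketch below is only an illustration under stated assumptions: the "population" is a made-up set of 5000 values generated by the computer (it is not data from this text), the class width of .5 and the helper names relative_frequencies and largest_gap are our own, and the seed is fixed only so that the run can be repeated. For each sample size, it reports the largest difference in bar height between the sample histogram and the population histogram.

```python
import math
import random
from collections import Counter

def relative_frequencies(values, start=4.0, width=0.5):
    """Map class index -> relative frequency, using left inclusion."""
    counts = Counter(math.floor((v - start) / width) for v in values)
    n = len(values)
    return {k: c / n for k, c in counts.items()}

def largest_gap(hist_a, hist_b):
    """Largest difference in bar height between two histograms."""
    classes = set(hist_a) | set(hist_b)
    return max(abs(hist_a.get(k, 0.0) - hist_b.get(k, 0.0)) for k in classes)

rng = random.Random(2020)  # fixed seed so the run is repeatable
# Hypothetical population of 5000 measurements (NOT data from the text)
population = [round(rng.gauss(7.5, 0.9), 1) for _ in range(5000)]
population_hist = relative_frequencies(population)

for n in (30, 300, 3000, len(population)):
    sample = rng.sample(population, n)  # sampling without replacement
    gap = largest_gap(relative_frequencies(sample), population_hist)
    print(f"n = {n:4d}   largest gap in bar heights = {gap:.3f}")
```

The gap will usually, though not always, shrink as n grows; in the last line, where the sample is the whole population, it is exactly zero.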

1.4 EXERCISES
The Basics
Graphing Relative Frequency Histograms Construct a relative frequency histogram using the statistical tables in Exercises 1-2. How would you describe the shape of the distribution?

1.
Class Boundaries   Relative Frequency
100 to < 120       .08
120 to < 140       .22
140 to < 160       .49
160 to < 180       .17
180 to < 200       .04

2.
Number of Household Pets   Frequency
0                          13
1                          19
2                          12
3                          4
4
5                          0
6                          1

Interpreting Relative Frequency Histograms Use the relative frequency histogram that follows to calculate the proportion of measurements falling into the intervals given in Exercises 3-8. Remember that the classes include the left boundary point, but not the right.

[Figure: relative frequency histogram; horizontal axis: x, 30.5 to 34.5 in steps of .5; vertical axis: Relative Frequency, 0 to 0.4]

3. 33 or more
4. 32 to < 33.5
5. Less than 31
6. Greater than or equal to 33.5
7. At least 34
8. At least 31.5 but less than 33.5

Class Boundaries In Exercises 9-12, use the information given to find a convenient class width. Then list the class boundaries that can be used to create a relative frequency histogram.

9. 7 classes for n = 50 measurements; minimum value = 10; maximum value = 110
10. 6 classes for n = 20 measurements; minimum value = 25.5; maximum value = 76.8
11. 10 classes for n = 120 measurements; minimum value = 0.31; maximum value = 1.73


12. 8 classes for n = 75 measurements; minimum value = 0; maximum value = 192

Relative Frequency Histogram I Construct a relative frequency histogram for these 50 measurements using classes starting at 1.6 with a class width of .5. Then answer the questions in Exercises 13-16.

3.1 4.9 2.8 3.6 2.5 4.5 3.5 3.7 4.1 4.9
2.9 2.1 3.5 4.0 3.7 2.7 4.0 4.4 3.7 4.2
3.8 6.2 2.5 2.9 2.8 5.1 1.8 5.6 2.2 3.4
2.5 3.6 5.1 4.8 1.6 3.6 6.1 4.7 3.9 3.9
4.3 5.7 3.7 4.6 4.0 5.6 4.9 4.2 3.1 3.9

13. How would you describe the shape of the distribution?
14. What fraction of the measurements are less than 5.1?
15. What is the probability that a measurement drawn at random from this set will be greater than or equal to 3.6?
16. What fraction of the measurements are from 2.6 up to but not including 4.6?

Relative Frequency Histogram II Construct a relative frequency histogram for these 20 measurements on a discrete variable that can take only the values 0, 1, and 2. Then answer the questions in Exercises 17-20.

2 0 2
2 1 0 0
2 2 1 0
0 2

17. What proportion of the measurements are greater than 1?
18. What proportion of the measurements are less than 2?
19. If a measurement is selected at random from the 20 measurements shown, what is the probability that it is a 2?
20. Describe the shape of the distribution. Do you see any outliers?

Test Scores The test scores on a 100-point test were recorded for 20 students. Construct a relative frequency distribution for the data, using 6 classes of width 8, and starting at 52. Then answer the questions in Exercises 21-23.

61 93 91 86 55 63 86 82 76 57
94 89 67 62 72 87 68 65 75 84

21. Describe the shape and location of the scores.
22. Is the shape of the distribution unusual? Can you think of any reason that the scores would have such a shape?
23. Compare the shape of the histogram to the stem and leaf plot from Exercise 16, Section 1.3. Are the shapes roughly the same?

Applying the Basics
24. Survival Times Altman and Bland report the survival times for patients with active hepatitis, half treated with prednisone and half receiving no treatment.¹³ The data that follow are adapted from their data for those treated with prednisone. The survival times are recorded to the nearest month:

 8    87   127   147
11    93   133   148
52    97   139   157
57   109   142   162
65   120   144   165

a. Look at the data. Can you guess the approximate shape of the data distribution?
b. Construct a relative frequency histogram for the data. What is the shape of the distribution?
c. Are there any outliers in the set? If so, which survival times are unusually short?

25. A Recurring Illness The length of time (in months) between the onset of a particular illness and its recurrence was recorded for n = 50 patients:

 2.1  4.4  2.7 32.3  9.9  9.0  2.0  6.6  3.9  1.6
14.7  9.6 16.7  7.4  8.2 19.2  6.9  4.3  3.3  1.2
 4.1 18.4   .2  6.1 13.5  7.4   .2  8.3   .3  1.3
14.1  1.0  2.4  2.4 18.0  8.7 24.0  1.4  8.2  5.8
 1.6  3.5 11.4 18.0 26.7  3.7 12.6 23.1  5.6   .4

a. Construct a relative frequency histogram for the data.
b. Would you describe the shape as roughly symmetric, skewed right, or skewed left?
c. Find the fraction of recurrence times less than or equal to 10 months.


26. Preschool The ages (in months) at which 50 children were first enrolled in a preschool are listed as follows.

38 40 30 35 39 40 48 36 31 36
47 35 34 43 41 36 41 43 48 40
32 34 41 30 46 35 40 30 46 37
55 39 33 32 32 45 42 41 36 50
42 50 37 39 33 45 38 46 36 31

a. Construct a relative frequency histogram for these data. Start the lower boundary of the first class at 30 and use a class width of 5 months.
b. What proportion of the children were 35 months or older, but less than 45 months of age when first enrolled in preschool?
c. If one child were selected at random from this group of children, what is the probability that the child was less than 50 months old when first enrolled in preschool?

27. How Long Is the Line? To decide on the number of service counters needed for stores to be built in the future, a supermarket chain gathered information on the length of time (in minutes) required to service customers, using a sample of 60 customers' service times, shown here:

3.6 1.9 2.1  .3  .8  .2 1.0 1.4 1.8 1.6
1.1 1.8  .3 1.1  .5 1.2  .6 1.1  .8 1.7
1.4  .2 1.3 3.1  .4 2.3 1.8 4.5  .9  .7
 .6 2.8 2.5 1.1  .4 1.2  .4 1.3  .8 1.3
1.1 1.2  .8 1.0  .9  .7 3.1 1.7 1.1 2.2
1.6 1.9 5.2  .5 1.8  .3 1.1  .6  .7  .6

a. Construct a relative frequency histogram for the supermarket service times.
b. Describe the shape of the distribution. Do you see any outliers?
c. Assuming that the outliers in this data set are valid observations, how would you explain them to the management of the supermarket chain?

28. Batting Champions The officials of major league baseball have crowned a batting champion in the National League each year since 1876. A sample of winning batting averages is listed in the table¹⁴:

Year   Name               Average
2000   Todd Helton        .372
2017   Charlie Blackmon   .331
1917   Edd Roush          .341
1934   Paul Waner         .362
1911   Honus Wagner       .334
1898   Willie Keeler      .379
1924   Rogers Hornsby     .424
1963   Tommy Davis        .326
1992   Gary Sheffield     .330
1954   Willie Mays        .345
1975   Bill Madlock       .354
1958   Richie Ashburn     .350
1942   Ernie Lombardi     .330
1948   Stan Musial        .376
1971   Joe Torre          .363
1996   Tony Gwynn         .353
1961   Roberto Clemente   .351
1968   Pete Rose          .335
1885   Roger Connor       .371
2009   Hanley Ramirez     .342

a. Use a relative frequency histogram to describe the batting averages for these 20 champions.
b. If you were to randomly choose one of the 20 names, what is the chance that you would choose a player whose average was above .400 for his championship year?

29. Ages of Pennies We collected 50 pennies and recorded their ages, by calculating AGE = CURRENT YEAR - YEAR ON PENNY.

5 9 2 20 0 25 0 17
1 4 4 3 0 25 3 3 8 28
5 21 19 9 0 5 0 2 1 0
0 1 19 0 2 0 20 16 22 10
19 36 23 0 17 6 0 5 0

a. Before drawing any graphs, what do you think the distribution of penny ages will look like? Will it be mound-shaped, symmetric, skewed right, or skewed left?
b. Draw a relative frequency histogram to display the penny ages. How would you describe the shape of the distribution?

30. Ages of Pennies, continued The data here represent the ages of a different set of 50 pennies, again calculated using AGE = CURRENT YEAR - YEAR ON PENNY.

41  9  0  4  3  0  3  8 21  3
 2 10  4  0 14  0 25 12 24 19
 3  1 14  7  2  4  4  5  1 20
14  9  3  5  3  0  8 17 16  0
 0  7  3  5 23  7 28 17  9  2


a. Draw a relative frequency histogram to display these penny ages. Is the shape similar to the shape of the relative frequency histogram in Exercise 29?
b. Are there any unusually large or small measurements in the set?

31. Windy Cities Are some cities more windy than others? Does Chicago deserve to be nicknamed "The Windy City"? These data are the average wind speeds (in kilometers per hour) for 54 selected cities in the United States⁵:

13.1 12.2 15.4 11.0 11.2 12.0 18.1 12.0 12.5
11.2 18.4 16.8 16.5 11.8 56.2 16.0 14.9 12.6
13.3 16.5 15.8 11.8 12.5 11.4 14.9 12.3 16.3
11.7 13.3 15.7 15.2 13.4 12.8  9.8 14.6 14.4
 9.9 12.6 15.2  9.8 16.3 10.6 12.6 13.4 18.4
15.0 15.8  7.0 10.6 15.5 15.7 12.8 17.0 13.6
Source: World Almanac and Book of Facts 2017, p. 343

a. Construct a relative frequency histogram for the data. (HINT: Choose the class boundaries without including the value x = 56.2 in the range of values.)
b. The value x = 56.2 was recorded at Mt. Washington, New Hampshire. Does the geography of that city explain the observation?
c. The average wind speed in Chicago is recorded as 15.8 kilometers per hour. Do you think this is unusually windy?

32. Student Heights The self-reported heights of 105 students in a biostatistics class are described in the relative frequency histogram shown here.

[Figure: relative frequency histogram; horizontal axis: Heights, 152.5 to 190.0 in steps of 7.5; vertical axis: Relative Frequency, marked at 5/105 and 10/105]

a. Describe the shape of the distribution.
b. Do you see any unusual feature in this histogram?
c. Can you think of an explanation for the two peaks in the histogram? Is there something that is causing the heights to mound up in two separate peaks? What is it?

33. Starbucks Students at the University of California, Riverside (UCR), along with many other Californians, love their Starbucks! The distances in kilometers from campus for the 39 Starbucks stores within 16 kilometers of UCR are shown here¹⁵:

 0.6  1.0  1.6  1.8  4.5  5.8  5.9  6.1  6.4  6.4
 7.0  7.2  8.5  8.5  8.8  9.3  9.4  9.8 10.2 10.6
11.2 12.0 12.2 12.2 12.5 13.0 13.3 13.8 13.9 14.1
14.1 14.2 14.2 14.6 14.7 15.0 15.4 15.5 15.7

a. Construct a relative frequency histogram to describe the distances from the UCR campus, using 8 classes of width 2, starting at 0.0.
b. What is the shape of the histogram? Do you see any unusual features?
c. Can you explain why the histogram looks the way it does?

As you continue to work through the exercises in this chapter, you will get better at
recognizing different types of data and at choosing the best graph to use. Remember that the
type of graph you use is not as important as the interpretation that accompanies the picture.
Look for these important features:
• Location of the center of the data
• Shape of the distribution of data
• Unusual observations in the data set
Using these features to guide you, you can interpret and compare sets of data using
graphs, which are only the first of many statistical tools that you will soon have to
work with.


CHAPTER REVIEW
Key Concepts
I. How Data Are Generated
   1. Experimental units, variables, and measurements
   2. Samples and populations
   3. Univariate, bivariate, and multivariate data
II. Types of Variables
   1. Qualitative or categorical
   2. Quantitative
      a. Discrete
      b. Continuous
III. Graphs for Univariate Data Distributions
   1. Qualitative or categorical data
      a. Pie charts
      b. Bar charts
   2. Quantitative data
      a. Pie and bar charts
      b. Line charts
      c. Dotplots
      d. Stem and leaf plots
      e. Relative frequency histograms
   3. Describing data distributions
      a. Shapes: symmetric, skewed left, skewed right, unimodal, and bimodal
      b. Proportion of measurements in certain intervals
      c. Outliers

TECHNOLOGY TODAY
Introduction to Microsoft Excel
MS Excel is designed for a variety of applications, including statistical applications. We
will assume that you are familiar with Windows, and that you know the basic techniques
necessary for executing commands from the tabs, groups, and drop-down menus at the
top of the screen. If not, perhaps a lab or teaching assistant can help you to master the
basics. The current version of MS Excel at the time of this printing is Excel 2016, used
in the Windows 10 environment. When the program opens, a spreadsheet appears (see
Figure 1.17), containing rows and columns into which you can enter data. Tabs at the
bottom of the screen identify the spreadsheets available for use; when multiple spreadsheets
are saved as a collection, these spreadsheets are called a workbook.

Figure 1.17 [Screenshot: a blank MS Excel spreadsheet]


Graphing with Excel


Pie charts, bar charts, and line charts can all be created in MS Excel. Data is entered into
the spreadsheet, including labels if needed. Highlight the data to be graphed, and then click
the chart type that you want on the Insert tab in the Charts group. Once the chart has been
created, it can be edited in a variety of ways to change its appearance.

EXAMPLE 1.12 (Pie and Bar Charts) The qualitative variable "class status" has been recorded for each of
105 students in an introductory statistics class, and the frequencies are recorded in Table 1.14.
■ Table 1.14 Status of Students in Statistics Class
Status Freshman Sophomore Junior Senior Grad Student
Frequency 5 23 32 35 10

1. Enter the categories into column A of the first spreadsheet and the frequencies into
column B. You should have two columns of data, including the labels.
2. Highlight the data, using your left mouse to click-and-drag from cell A1 to cell B6
(sometimes written as A1:B6). Click the Insert tab and click on the Pie icon in the
Charts group. In the drop-down list, you will see a variety of styles to choose from.
Select the first option in the 2D Pie section to produce the pie chart. Double-click on the
title "Frequency" and change the title to "Student Status."
3. Editing the pie chart: Once the chart has been created, use your mouse to make sure that
the chart is selected; a box with round handles will appear around the chart. You should
see a green area above the tabs marked "Chart Tools." Click the Design tab, and look at
the drop-down lists in the Chart Layouts and Chart Styles groups. These lists allow you
to alter the appearance of your chart. In Figure 1.18(a), the pie chart has been changed
using Layout 6 in Quick Layout (in the Chart Layouts group) so that the percentages
are shown in the appropriate sectors and the legend is on the right. By clicking on the
legend, we have dragged it so that it is closer to the pie chart.

Figure 1.18 (a) Pie chart titled "Student Status" with sectors for Freshman, Sophomore, Junior, Senior, and Grad Student; (b) bar chart titled "Student Status" [horizontal axis: Status; vertical axis: Frequency, 0 to 40]

4. Click on various parts of the pie chart (legend, chart area, sector) and a box with round
handles will appear. Double-click, and a format menu will appear on the right side of
the screen. You can adjust the appearance of the selected object or region in this menu.
When you are done, click the "X" in the upper right corner to close the format menu.


5. Still in the Design section, but in the Type group, click on Change Chart Type and
choose the simplest Column type. Click OK to create a bar chart for the same data set.
6. Editing the bar chart: Again, you can experiment with the various options in the Chart
Layouts and Chart Styles groups to change the look of the chart. You can click the entire
bar chart ("chart area") or the interior ("Plot area") to stretch the chart. You can change
colors by double-clicking on the appropriate region. We have chosen a design using
Layout 9 in Quick Layout (in the Chart Layouts group) that allows axis titles (we have
edited them) and have deleted the "frequency legend entry." We have decreased the gaps
between the bars by right-clicking on one of the bars, selecting Format Data Series,
and changing the Gap Width to 50%. The edited bar chart is shown in Figure 1.18(b).
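Excel computes the sector percentages for you, but it can be helpful to see where they come from. The Python sketch below is only an illustration, using the frequencies of Table 1.14; the names STATUS_COUNTS and pie_sectors are our own and have nothing to do with Excel. Each sector's percentage is its frequency divided by the total, and its angle is that same fraction of 360 degrees.

```python
# Frequencies copied from Table 1.14
STATUS_COUNTS = {
    "Freshman": 5,
    "Sophomore": 23,
    "Junior": 32,
    "Senior": 35,
    "Grad Student": 10,
}

def pie_sectors(counts):
    """Return {label: (percent of total, sector angle in degrees)}."""
    total = sum(counts.values())
    return {label: (100 * c / total, 360 * c / total)
            for label, c in counts.items()}

sectors = pie_sectors(STATUS_COUNTS)
for label, (percent, angle) in sectors.items():
    print(f"{label:12s} {percent:5.1f}%  {angle:6.1f} degrees")
```

The bar chart uses the same table without any conversion: the height of each bar is simply the frequency for that category.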

EXAMPLE 1.13 (Line Charts) The Dow Jones Industrial Average was monitored at the close of trading for
10 days in a recent year, with the results shown in Table 1.15.

■ Table 1.15 Dow Jones Industrial Average


Day    1       2       3       4       5       6       7       8       9       10
DJIA   21,479  21,478  21,320  21,414  21,409  21,532  21,553  21,638  21,630  21,575

1. Click the tab at the bottom of the screen marked "+" to open a new spreadsheet. Enter
the Day into column A and the DJIA into column B. You should have two columns of
data, including the labels.
2. Highlight the DJIA data in column B, using your left mouse to click-and-drag from cell
B1 to cell B11 (sometimes written as B1:B11). Click the Insert tab and click on the
Line icon in the Charts group. Select the first option in the 2D Line list to produce the
line chart.
3. Editing the line chart: Again, you can experiment with the various options in the Chart
Layouts and Chart Styles groups to change the look of the chart. We have chosen a
design (Layout 10 from Quick Layout) that allows titles on both axes, which we have
changed to "Day" and "DJIA," and we have deleted the title and the legend on the right
side. The line chart is shown in Figure 1.19.

Figure 1.19 [Line chart of the DJIA; horizontal axis: Day; vertical axis scaled from 21100 to 21700]

4. Note: If your time series involves time periods that are not equally spaced, it is better to
use a scatterplot with points connected to form a line chart. This procedure is described
in the Technology Today section in Chapter 3 of the text.


EXAMPLE 1.14 (Frequency Histograms) The top 50 over-the-counter (OTC) stocks in a recent year were
found using an equal weighting of 1-year total return and average daily dollar volume growth.
These 50 weightings, or averages, are shown here¹⁶:

■ Table 1.16 Rankings of Top 50 OTC Stocks

1395.27   807.73   515.74   305.59   245.39   176.81   143.70   113.52   95.75   83.44
1196.82   780.16   392.52   297.60   231.13   166.09   142.82   112.60   88.73   80.85
1147.05   729.44   374.27   268.97   195.94   165.73   135.82   105.74   88.38   78.67
1138.47   642.91   350.20   258.30   194.91   152.97   135.47   105.10   85.02   78.20
 832.23   598.51   350.13   246.19   189.79   147.95   124.06   103.15   83.48   76.48

1. Many of the statistical procedures that we will use in this textbook require the
installation of the Analysis ToolPak add-in. To load this add-in, click File ► Options
► Add-ins. At the bottom of the dialog box, click on Go, just to the right of the Manage
Excel Add-ins drop-down list. Select Analysis ToolPak, Analysis ToolPak VBA and
click OK.
2. Click the tab at the bottom of the screen marked "+" to add a new spreadsheet. Enter
the data into the first column and include the label "Average" in the first cell.
3. Excel refers to the maximum value for each class interval as a bin. This means that
Excel is using a method of right inclusion, which is slightly different from the method
presented in Section 1.4. For this example, we choose to use the class intervals 0-150,
>150-300, >300-450, etc. Enter the bin values (150, 300, 450, ..., 1500) into the
second column of the spreadsheet, labeling them as "Bins" in cell B1.
4. Select Data ► Data Analysis ► Histogram and click OK. The Histogram dialog box
will appear, as shown in Figure 1.20.

Figure 1.20 [Excel Histogram dialog box]

5. Highlight or type in the appropriate Input Range and Bin Range for the data. Notice
that you can click the minimize button on the right of the box before you
click-and-drag to highlight. Click the minimize button again to see the entire dialog box. The
Input Range will appear as $A$1:$A$51, with the dollar sign indicating an absolute
cell range. Make sure to click the "Labels" and "Chart Output" check boxes. Pick a
convenient cell location for the output (we picked E1) and click OK. The frequency
table and histogram will appear in the spreadsheet. The histogram (Figure 1.21(a))
doesn't appear quite like we wanted.
6. Editing the histogram: Click on the frequency legend entry and the histogram title and
press the Delete key. Then select the Data Series by double-clicking on a bar. In the
format menu that appears, change the Gap Width to 0% (no gap) and click the "X" in the
top right corner to close the menu. Stretch the graph by dragging the lower right corner,
and edit the colors, title, and labels if necessary to finish your histogram, as shown in
Figure 1.21(b). Remember that the numbers shown along the horizontal axis are the bins,
the upper limits of the class intervals, not the midpoints of the intervals.


Figure 1.21 (a) [Default Excel histogram with legend entry "Frequency"] (b) [Edited histogram; horizontal axis "Average" with bins 150, 300, ..., 1500, More; vertical axis: Frequency]

7. You can save your Excel workbook for use at a later time using File ► Save or File
► Save As and naming it "Chapter 1."
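
The right-inclusion rule described in step 3 can be checked outside Excel. The following is a minimal Python sketch, not part of the Excel procedure: it assumes the 50 averages of Table 1.16 and the bin values 150, 300, ..., 1500, and the function name right_inclusive_counts is a hypothetical helper introduced only for illustration. A value equal to a bin value is counted in that bin, and anything above the last bin goes into a final "More" class.

```python
from bisect import bisect_left

# The 50 OTC averages from Table 1.16.
averages = [
    1395.27, 807.73, 515.74, 305.59, 245.39, 176.81, 143.70, 113.52, 95.75, 83.44,
    1196.82, 780.16, 392.52, 297.60, 231.13, 166.09, 142.82, 112.60, 88.73, 80.85,
    1147.05, 729.44, 374.27, 268.97, 195.94, 165.73, 135.82, 105.74, 88.38, 78.67,
    1138.47, 642.91, 350.20, 258.30, 194.91, 152.97, 135.47, 105.10, 85.02, 78.20,
    832.23, 598.51, 350.13, 246.19, 189.79, 147.95, 124.06, 103.15, 83.48, 76.48,
]

# Bin values (upper class limits) as entered in step 3: 150, 300, ..., 1500.
bins = list(range(150, 1501, 150))


def right_inclusive_counts(values, upper_limits):
    """Count values per class, where each class includes its upper limit."""
    counts = [0] * (len(upper_limits) + 1)  # the extra slot is the "More" class
    for x in values:
        # bisect_left returns the first bin whose upper limit is >= x,
        # so a value equal to a limit falls in that bin (right inclusion).
        counts[bisect_left(upper_limits, x)] += 1
    return counts


counts = right_inclusive_counts(averages, bins)
for upper, count in zip(bins + ["More"], counts):
    print(upper, count)
```

The printed counts should add to 50, which is a quick check that no observation was dropped.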

TECHNOLOGY TODAY
Introduction to MINITAB™
MINITAB computer software is a Windows-based program designed specifically for statistical
applications. We will assume that you are familiar with Windows, and that you know the
basic techniques necessary for executing commands from the tabs and drop-down menus at
the top of the screen. If not, perhaps a lab or teaching assistant can help you to master the
basics. The current version of MINITAB at the time of this printing is MINITAB 18, used in the
Windows 10 environment. When the program opens, the main screen (see Figure 1.22) is
displayed, containing two windows: the Data window, similar to an Excel spreadsheet, and
the Session window, in which your results will appear. Just as with MS Excel, MINITAB allows
you to save worksheets (similar to Excel spreadsheets), projects (collections of worksheets),
or graphs.

Figure 1.22 [MINITAB main screen, showing the Session window above the Data window for Worksheet 1]


Graphing with MINITAB


All of the graphical methods that we have discussed in this chapter can be created in
MINITAB. Data is entered into a MINITAB worksheet, with labels entered in the gray cells just
below the column name (C1, C2, etc.) in the Data window.

EXAMPLE 1.15 (Pie and Bar Charts) The qualitative variable "class status" has been recorded for each of
105 students in an introductory statistics class, and the frequencies are shown in Table 1.17.
■ Table 1.17 Status of Students in Statistics Class
Status      Freshman   Sophomore   Junior   Senior   Grad Student
Frequency       5          23        32       35         10

1. Enter the categories into column C1, with your own descriptive name, perhaps "Status,"
in the gray cell. Notice that the name C1 has changed to C1-T because you are entering
text rather than numbers. Enter the five numerical frequencies into C2, naming it
"Frequency."
2. To construct a pie chart for these data, click on Graph ► Pie Chart, and a Dialog
box will appear (see Figure 1.23). Click the radio button marked Chart values from a
table. Then place your cursor in the box marked "Categorical variable" and either
(1) highlight C1 in the list at the left and choose Select, (2) double-click on C1 in the
list at the left, or (3) type C1 in the "Categorical variable" box. Similarly, place the cursor
in the box marked "Summary variables" and select C2. Click Labels and select the tab marked
Slice Labels. Check the boxes marked "Category names" and "Percent." When you
click OK twice, MINITAB will create the pie chart in Figure 1.24(a). We have removed
the legend by selecting and deleting it.

Figure 1.23 [MINITAB Pie Chart dialog box, with the "Chart values from a table" option]

3. As you become better at using the pie chart command, you can take advantage of some
of the options available. Once the chart is created, right-click on the pie chart and select
Edit Pie. You can change the colors and format of the chart, "explode" important sectors
of the pie, and change the order of the categories. If you right-click on the pie chart and
select Update Graph Automatically, the pie chart will automatically update when you
change the data in columns C1 and C2 of the MINITAB worksheet.
4. To construct a bar chart, use the command Graph ► Bar Chart. In the Dialog box
that appears, choose Simple. Choose an option in the "Bars represent" drop-down list,
depending on the way that the data has been entered into the worksheet. For the data
in Table 1.17, we choose "Values from a table" and click OK. When the Dialog box
appears, place your cursor in the "Graph variables" box and select C2 and select C1 in
the "Categorical variable" box. Click OK to finish the bar chart, shown in Figure 1.24(b).
Once the chart is created, right-click on various parts of the bar chart and choose Edit
to change the look of the chart.

Figure 1.24 (a) [Pie Chart of Status] (b) [Chart of Frequency; horizontal axis: Status]
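
The "Percent" slice labels requested in step 2 are simply each frequency divided by the total. As a rough check on what the pie chart should display, here is a short Python sketch using the frequencies in Table 1.17; the variable names are illustrative assumptions and nothing here is MINITAB code.

```python
# Class status categories and frequencies from Table 1.17.
statuses = ["Freshman", "Sophomore", "Junior", "Senior", "Grad Student"]
frequencies = [5, 23, 32, 35, 10]

total = sum(frequencies)  # number of students in the class

# Percent of the pie belonging to each category.
percents = [100 * f / total for f in frequencies]

for status, freq, pct in zip(statuses, frequencies, percents):
    print(f"{status:<13}{freq:>4}{pct:>7.1f}%")
```

The percentages should sum to 100 apart from rounding in the displayed labels.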

EXAMPLE 1.16 (Line Charts) The Dow Jones Industrial Average was monitored at the close of trading for
10 days in a recent year with the results shown in Table 1.18.
■ Table 1.18 Dow Jones Industrial Average
Day        1       2       3       4       5       6       7       8       9      10
DJIA  21,479  21,478  21,320  21,414  21,409  21,532  21,553  21,638  21,630  21,575

1. Although we could simply enter this data into third and fourth columns of the current
worksheet, let's create a new worksheet using File ► New ► Worksheet. Enter the
Days into column C1 and the DJIA into column C2.
2. To create the line chart, use Graph ► Time Series Plot ► Simple. In the Dialog box
that appears, place your cursor in the "Series" box and select "DJIA" from the list to
the left. Under Time/Scale, choose "Stamp" and select column C1 ("Day") in the box
labeled "Stamp Columns." Click OK twice to obtain the line chart shown in Figure 1.25.

Figure 1.25 [Time Series Plot of DJIA; horizontal axis: Day; vertical axis scaled from 21300 to 21650]

EXAMPLE 1.17 (Dotplots, Stem and Leaf Plots, Histograms) The top 50 over-the-counter (OTC) stocks in a
recent year were found using an equal weighting of 1-year total return and average daily
dollar volume growth. These 50 weightings, or averages, are listed in Table 1.19.¹⁶ Create a new
worksheet (File ► New ► Worksheet). Enter the data into column C1 and name it "Average"
in the gray cell just below the C1.
■ Table 1.19 Rankings of Top 50 OTC Stocks
1395.27   807.73   515.74   305.59   245.39   176.81   143.70   113.52   95.75   83.44
1196.82   780.16   392.52   297.60   231.13   166.09   142.82   112.60   88.73   80.85
1147.05   729.44   374.27   268.97   195.94   165.73   135.82   105.74   88.38   78.67
1138.47   642.91   350.20   258.30   194.91   152.97   135.47   105.10   85.02   78.20
 832.23   598.51   350.13   246.19   189.79   147.95   124.06   103.15   83.48   76.48

1. To create a dotplot, use Graph ► Dotplot. In the Dialog box that appears, choose One Y
► Simple and click OK. To create a stem and leaf plot, use Graph ► Stem-and-Leaf.
For either graph, place your cursor in the "Graph variables" box, and select "Average"
from the list to the left (see Figure 1.26).
Figure 1.26 [MINITAB Dotplot: One Y, Simple dialog box, with "Average" listed as C1]
2. You can choose from a variety of formatting options before clicking OK. The dotplot
appears as a graph, while the stem and leaf plot appears in the Session window. To print
either a Graph window or the Session window, click on the window to make it active
and use File ► Print Graph (or Print Session Window).


3. To create a histogram, use Graph ► Histogram. In the Dialog box that appears,
choose Simple and click OK, selecting "Average" for the "Graph variables" box. Select
Scale ► Y-Scale Type and click the radio button marked "Frequency." (You can edit
the histogram later to show relative frequencies.) Click OK twice. Once the histogram
has been created, right-click on the y-axis and choose Edit Y-Scale. Under the tab
marked "Scale," you can click the radio button marked "Position of ticks" and type in
0 5 10 15 20. Then click the tab marked "Labels," the radio button marked "Specified"
and type 0 5/50 10/50 15/50 20/50. Click OK. This will reduce the number of ticks
on the y-axis and change them to relative frequencies. Finally, double-click on the
word "Frequency" along the y-axis. Change the box marked "Text" to read "Relative
Frequency" and click OK.
4. To adjust the type of boundaries for the histogram, right-click on the bars of the
histogram and choose Edit Bars. Use the tab marked "Binning" to choose either
"Cutpoints" or "Midpoints" for the histogram; you can specify the cutpoint or midpoint
positions if you want. In this same Edit box, you can change the colors, fill type, and
font style of the histogram. If you right-click on the bars and select Update Graph
Automatically, the histogram will automatically update when you change the data in
the "Average" column.
As you become more familiar with MINITAB, you can explore the various options
available for each type of graph. It is possible to plot more than one variable at a time, to change
the axes, to choose the colors, and to modify graphs in many ways. However, even with
the basic default commands, it is clear that the distribution of OTC stocks in Figure 1.27 is
highly skewed to the right.
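
The relabeling in step 3 works because a relative frequency is a frequency divided by the number of measurements, here n = 50. A small Python sketch of that conversion follows; the variable names are illustrative assumptions, and the tick positions are the ones typed into the "Position of ticks" box.

```python
from fractions import Fraction

n = 50                              # number of OTC stocks in Table 1.19
tick_positions = [0, 5, 10, 15, 20]  # frequency ticks typed in step 3

# Labels in the form typed into the "Specified" box: 0 5/50 10/50 15/50 20/50.
tick_labels = ["0" if t == 0 else f"{t}/{n}" for t in tick_positions]

# The same relative frequencies as exact fractions and as decimals.
relative = [Fraction(t, n) for t in tick_positions]

print(" ".join(tick_labels))
print([float(r) for r in relative])
```

Writing the labels as 5/50, 10/50, and so on keeps the link to the original counts visible, while the decimal form gives the proportion of the 50 stocks directly.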

Figure 1.27 [Histogram of Average; vertical axis: Relative Frequency, with ticks labeled 5/50, 10/50, 15/50; horizontal axis: Average]

TECHNOLOGY TODAY
Introduction to the TI-83/84 Plus Calculators
Many of you are familiar with the TI-83 or TI-84 Plus calculators. These two calculators
operate in almost the same way, and can be used for many applications, including a large
number of statistical applications. When the calculator is turned on, you will see a screen
with a blinking cursor, where you can do many of your numerical calculations. To use many
of the statistical functions, however, data must first be entered into lists in the stat list editor.
Once the data has been entered, there are many different graphs that are available to you.


Graphing with the TI-83/84 Plus Calculator


First, clear the calculator of any unwanted plots, functions, and drawings. Press 2nd ► stat
plot. Use the directional arrows (or press 4) to select 4:PlotsOff and press enter twice. The
calculator will respond with Done. You can turn a plot on in the stat plot menu when you
need it. Clear unwanted functions by pressing y = and using the clear button. Finally, press
2nd ► draw ► 1:ClrDraw to clear unwanted drawings.

EXAMPLE 1.18 (Bar Charts) The qualitative variable "class status" has been recorded for each of 105 students
in an introductory statistics class, and the frequencies are shown in Table 1.20.
■ Table 1.20 Status of Students in Statistics Class
Status      Freshman   Sophomore   Junior   Senior   Grad Student
Frequency       5          23        32       35         10

1. Enter the data into the stat list editor by pressing stat. The cursor (the black highlight)
should be on the EDIT command and 1:Edit... and, when you press enter, the stat list
editor (Figure 1.28) will appear.
The qualitative variable "class status" is coded numerically as Freshman = 1,
Sophomore = 2, ..., Grad Student = 5, and then entered into the first five rows of list
L1 using the directional arrows to navigate through the table. The five frequencies are
entered into list L2.

Figure 1.28 [TI stat list editor, with the cursor at L1(1)]
2. To create the bar chart, press 2nd ► stat plot. The cursor will be on 1:Plot1; press
enter. There are four choices to be made on the next screen (Figure 1.29), pressing enter
after each choice, and using the directional arrows to navigate the screen.
Choose to turn the plot On and choose the histogram Type (the third option). The Xlist
choice specifies the data for the horizontal (x) axis, in this case L1. The Ylist choice
(labeled Freq on the screen in Figure 1.29) specifies the data for the vertical (y) axis,
in this case L2. Since the default value for Ylist is 1, you will need to change it using
2nd ► L2. Now press zoom and 9:ZoomStat to see the bar chart, created automatically
by the calculator, shown in Figure 1.30(a).

Figure 1.29 [TI stat plot screen for Plot1: On; Type: histogram; Xlist: L1; Freq: L2; Color: BLUE]

Figure 1.30 (a) [Bar chart produced by ZoomStat] (b) [WINDOW screen, with the settings below]

WINDOW
Xmin=1
Xmax=5.5
Xscl=0.5
Ymin=-10.52415
Ymax=40.95
Yscl=59
Xres=1
ΔX=0.01704545454545
TraceStep=0.034090909090...

3. Notice that there are no labels on either axis. You can understand the chart by pressing
trace and moving the blinking cursor from left to right with the directional arrows. The
"Freshman" bar begins at 1, ends at 1.5, and has a height of 5. The gap between the first
and second bars begins at 1.5 and ends at 2, where the "Sophomore" bar begins. That is,
the calculator is creating "left-inclusive" classes.
4. Editing the bar chart: Press window to see the exact settings for the chart (Figure 1.30(b)).
Any of these settings can be changed. The minimum and maximum values for the x- and
y-axes are Xmin, Xmax, Ymin, and Ymax. The widths of the intervals along the x- and
y-axes are Xscl and Yscl, respectively. For Figure 1.30(a), the lower boundary of the first
class is 1, and the upper boundary of the last class is 5.5; each bar is 0.5 units wide, and
the y-axis has tick marks at every 59 units. If the window settings are changed as shown
in Figure 1.31(a) and you press graph, the bar chart will appear as in Figure 1.31(b).
The bars are now centered over the class status values, 1, 2, 3, 4, and 5. [NOTE: If you
press zoom ► 9:ZoomStat instead of graph at this point, the settings will revert to the
calculator's default settings, and the bar chart will appear as in Figure 1.30(a).]
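
To see why changing Xmin centers the bars, it may help to compute which class each coded status value falls into under the two window settings. The Python sketch below is an illustration only, assuming that classes start at Xmin, have width Xscl, and are left-inclusive as described in step 3; class_interval is a hypothetical helper name, not a calculator command.

```python
import math


def class_interval(x, xmin, xscl):
    """Return the left-inclusive class [lower, upper) that contains x."""
    k = math.floor((x - xmin) / xscl)  # number of whole class widths above xmin
    lower = xmin + k * xscl
    return lower, lower + xscl


codes = [1, 2, 3, 4, 5]  # Freshman = 1, ..., Grad Student = 5

# Settings from Figure 1.30(b): each bar starts at its code value.
zoomstat = [class_interval(c, xmin=1, xscl=0.5) for c in codes]

# Settings from Figure 1.31(a): each bar is centered over its code value.
centered = [class_interval(c, xmin=0.25, xscl=0.5) for c in codes]

for c, first, second in zip(codes, zoomstat, centered):
    print(c, first, second)
```

With Xmin = 1 the Freshman bar covers 1 to 1.5, as the trace in step 3 shows; with Xmin = 0.25 it covers 0.75 to 1.25, so its midpoint is the code value 1.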

Figure 1.31 (a) [WINDOW screen, with the settings below] (b) [Bar chart with bars centered over 1, 2, 3, 4, and 5]

WINDOW
Xmin=0.25
Xmax=6.25
Xscl=0.5
Ymin=0
Ymax=40
Yscl=5
Xres=1
ΔX=0.02272727272727
TraceStep=0.045454545454...


EXAMPLE 1.19 (Line Charts) The Dow Jones Industrial Average was monitored at the close of trading for
10 days in a recent year, with the results shown in Table 1.21.
■ Table 1.21 Dow Jones Industrial Average
Day        1       2       3       4       5       6       7       8       9      10
DJIA  21,479  21,478  21,320  21,414  21,409  21,532  21,553  21,638  21,630  21,575

1. In the stat list editor, clear lists L1 and L2 by placing the cursor on the list name and
pressing clear and enter. Then enter the data in Table 1.21, with Day in list L1 and the
DJIA in L2.
2. Follow a procedure similar to that used for the bar chart. Press 2nd ► stat plot ► 1:Plot1.
On the screen that appears, make sure that the plot is On and choose the Line Chart
Type (the second option). Then Xlist = L1 (days) and Ylist = L2 (DJIA). Press zoom
► 9:ZoomStat (or just zoom ► 9) to display the line chart, shown in Figure 1.32.

Figure 1.32 [TI line chart of the DJIA over the 10 days]

3. As with the bar chart, you can use trace to navigate through the screen and see the DJIA
values for the 10 days. You can also use the window screen to modify the settings for
the x- and y-axes. Again, there are no labels on either axis, but you can still see the trend
in the DJIA over the 10 days.

EXAMPLE 1.20 (Frequency Histograms) The top 50 over-the-counter (OTC) stocks in a recent year were
found using an equal weighting of 1-year total return and average daily dollar volume growth.
These 50 weightings, or averages, are shown here¹⁶:
■ Table 1.22 Rankings of Top 50 OTC Stocks
1395.27   807.73   515.74   305.59   245.39   176.81   143.70   113.52   95.75   83.44
1196.82   780.16   392.52   297.60   231.13   166.09   142.82   112.60   88.73   80.85
1147.05   729.44   374.27   268.97   195.94   165.73   135.82   105.74   88.38   78.67
1138.47   642.91   350.20   258.30   194.91   152.97   135.47   105.10   85.02   78.20
 832.23   598.51   350.13   246.19   189.79   147.95   124.06   103.15   83.48   76.48

1. We will use the Histogram type in the stat plot menu again, but in this case, each
observation occurs only once, so that the frequencies (Freq) will be 1. Enter the data
into list L1 in the stat list editor.
2. To create the histogram, press 2nd ► stat plot ► 1:Plot1. On the screen that appears,
make sure that the plot is On, choose the Histogram Type (the third option) and Xlist =
L1 (averages). Type the number "1" into Ylist and press zoom ► 9:ZoomStat (or just
zoom ► 9) to display the histogram.


3. Editing the histogram: Once the histogram has been created, use window to adjust the class
boundaries and the width of the classes. The window screen is shown in Figure 1.33(a).
We have chosen to use classes of width 150, beginning at 0 and ending just above
the largest value at 1500. Remember that the TI calculators use the method of
"left-inclusion" as presented in Section 1.4. Once you have changed the window settings,
press graph to display the histogram in Figure 1.33(b). The distribution of OTC stocks
is highly skewed to the right, with a few unusually large measurements.
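
The two inclusion rules met in this section differ only for a measurement that falls exactly on a class boundary. The Python sketch below contrasts them for classes of width 150 starting at 0. It is an illustration under stated assumptions: the helper names are hypothetical, and the test values 149.99, 150, and 150.01 are made up to sit around a boundary (none of the 50 OTC averages lies on one, so both rules give the same class counts for these data).

```python
import math
from bisect import bisect_left


def left_inclusive_class(x, xmin, width):
    """Class [lower, upper) containing x: the lower boundary belongs to the class."""
    k = math.floor((x - xmin) / width)
    return xmin + k * width, xmin + (k + 1) * width


def right_inclusive_class(x, upper_limits, width):
    """Class (lower, upper] containing x: the upper boundary belongs to the class."""
    upper = upper_limits[bisect_left(upper_limits, x)]
    return upper - width, upper


width = 150
upper_limits = list(range(150, 1501, 150))  # 150, 300, ..., 1500

# Hypothetical values around the boundary 150, plus the largest OTC average.
for x in (149.99, 150, 150.01, 1395.27):
    print(x,
          left_inclusive_class(x, 0, width),
          right_inclusive_class(x, upper_limits, width))
```

Only the value 150 changes class: left inclusion places it in 150 to 300, while right inclusion places it in 0 to 150.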

Figure 1.33 (a) [WINDOW screen, with the settings below] (b) [Histogram of the 50 OTC averages]

WINDOW
Xmin=0
Xmax=1500
Xscl=150
Ymin=0
Ymax=25
Yscl=5
Xres=1
ΔX=5.6818181818182
TraceStep=11.363636363636

REVIEWING WHAT YOU'VE LEARNED


1. Quantitative or Qualitative? Identify each variable
as quantitative or qualitative:
a. Ethnic origin of a candidate for public office
b. Score (0-100) on a placement examination
c. Fast-food restaurant preferred by a student
(McDonald's, Burger King, or Carl's Jr.)
d. Mercury concentration in a sample of tuna

2. Symmetric or Skewed? Do you expect the distributions
of the following variables to be symmetric or
skewed? Explain.
a. Price of a 250-gram can of peas
b. Height in centimeters of freshman women at your
university
c. Number of broken taco shells in a package of 100 shells
d. Number of ticks found on each of 50 trapped
cottontail rabbits

3. Continuous or Discrete? Identify each variable as
continuous or discrete:
a. Length of time between arrivals at a medical clinic
b. Time required to finish an examination
c. Weight of two dozen shrimp
d. A person's body temperature
e. Number of people waiting for treatment at a hospital
emergency room

4. Continuous or Discrete, again Identify each variable
as continuous or discrete:
a. Number of properties for sale by a real estate
agency
b. Depth of a snowfall
c. Length of time for a driver to respond when about to
have a collision
d. Number of aircraft arriving at the Atlanta airport in a
given hour

5. Major World Lakes A lake is a body of
water surrounded by land. Hence, some bodies
of water named "seas," like the Caspian Sea,
are actually salt lakes. In the table that follows, the
length in kilometers is listed for the major natural
lakes of the world, excluding the Caspian Sea, which
has a length of 1,216 kilometers.⁵


Name          Length (km)    Name           Length (km)
Superior          560        Titicaca           195
Victoria          334        Nicaragua          163
Huron             330        Athabasca          333
Michigan          491        Reindeer           229
Aral Sea          416        Tonle Sap          112
Tanganyika        672        Turkana            246
Baykal            632        Issyk Kul          184
Great Bear        307        Torrens            208
Nyasa             576        Vanern             146
Great Slave       477        Nettilling         107
Erie              386        Winnipegosis       226
Winnipeg          426        Albert             160
Ontario           309        Nipigon            115
Balkhash          602        Gairdner           144
Ladoga            198        Urmia              144
Maracaibo         213        Manitoba           224
Onega             232        Chad               280
Eyre              144

Source: The World Almanac and Book of Facts 2017

a. Use a stem and leaf plot to describe the lengths of
the world's major lakes.
b. Use a histogram to display these same data. How
does this compare to the stem and leaf plot in part a?
c. Are these data symmetric or skewed? If skewed,
what is the direction of the skewing?

6. Gulf Oil Spill Cleanup On April 20, 2010, the
United States experienced a major environmental
disaster when a Deepwater Horizon drilling rig
exploded in the Gulf of Mexico. The number of personnel
and equipment used in the Gulf oil spill cleanup,
beginning May 2, 2010 (Day 13) through June 9, 2010
(Day 51) is given in the following table.¹⁷

                                    Day 13   Day 26   Day 39   Day 51
Number of personnel (1000s)            3.0     17.5     20.0     24.0
Federal Gulf fishing areas closed       3%       8%      25%      32%
Booms laid (kilometers)                 74      504     1030     1454
Dispersants used (1000 liters)         590     1893     3293     4326
Vessels deployed (100s)                1.0      6.0     14.0     35.0

a. What types of graphs could you use to display these
data?
b. Before you draw your graphs, what trends do you
see in each of the variables?
c. Use a line chart to show the number of personnel
deployed over this 51-day period.
d. Use a bar graph to show the percentage of federal
Gulf fishing areas closed.
e. Use a line chart to show the amounts of dispersants
used. Is there any underlying straight line relationship
over time?

7. Election Results The 2016 election was a race
in which Donald Trump defeated Hillary Clinton
and other candidates, winning 304 electoral votes,
or 57% of the 538 available. However, Trump only won
46.1% of the popular vote, while Clinton won 48.2%.
The popular vote (in thousands) for Donald Trump in
each of the 50 states is listed as follows¹⁸:

AL 1319   HI  129   MA 1091   NM  320   SD  228
AK  163   ID  409   MI 2280   NY 2820   TN 1523
AZ 1252   IL 2146   MN 1323   NC 2363   TX 4685
AR  685   IN 1557   MS  701   ND  217   UT  515
CA 4484   IA  801   MO 1595   OH 2841   VT   95
CO 1202   KS  671   MT  279   OK  949   VA 1769
CT  673   KY 1203   NE  496   OR  782   WA 1222
DE  185   LA 1179   NV  512   PA 2971   WV  489
FL 4618   ME  336   NH  346   RI  181   WI 1405
GA 2089   MD  943   NJ 1602   SC 1155   WY  174

a. By just looking at the table, what shape do you think
the distribution for the popular vote by state will
have?
b. Draw a relative frequency histogram to describe the
distribution of the popular vote for President Trump
in the 50 states.
c. Did the histogram in part b confirm your guess in
part a? Are there any outliers? How can you explain
them?

8. Election Results, continued Refer to Exercise 7.
Listed here is the percentage of the popular vote
received by President Trump in each of the
50 states¹⁸:

AL 62   HI 30   MA 33   NM 40   SD 62
AK 51   ID 59   MI 47   NY 37   TN 61
AZ 49   IL 39   MN 45   NC 50   TX 52
AR 61   IN 57   MS 58   ND 63   UT 46
CA 32   IA 51   MO 57   OH 52   VT 30
CO 43   KS 57   MT 56   OK 65   VA 44
CT 44   KY 63   NE 59   OR 39   WA 37
DE 42   LA 58   NV 46   PA 48   WV 68
FL 49   ME 45   NH 46   RI 39   WI 47
GA 51   MD 34   NJ 41   SC 55   WY 68
