M2208 - 9 Practice Lab Manual JULY15

Lab Manual for STATS

Uploaded by

Kirk O'Connell

Math 2208/2209

Introduction to Statistics
Lab Manual

Department of Mathematics
Mount Saint Vincent University
Copyright © September, 2015
Table of Contents

MATH2208 Practice Exercises (from textbook)
MATH2209 Practice Exercises (from textbook)
Structure of the Statistics Labs
  Attendance
  Class Cancellation
  Preparation
  Academic Integrity
  Grading
  Be Prepared
  Cell Phones
  Scent Free Policy
Dos and Don’ts of Statistics Labs
How to Learn Mathematics
Frequently Asked Questions
MINITAB Survival Guide
  Logging In & Out of the Campus Network
  Minitab 17 Basics
  Using Statistics Commands in MINITAB17
  General Commands in MINITAB 17
  Minitab Express Basics
  Using Statistics Commands in MINITAB Express
  General Commands in MINITAB Express
Using Your Statistical Calculator
  Basic Functions
  Statistical Functions
Practice Exercises for MATH2208
  Basic Skills for Math 2208
  Chapter 2: Displaying and Describing Categorical Data
  Chapter 3: Displaying and Summarizing Quantitative Data
  Chapter 4: Understanding and Comparing Distributions
  Chapter 5: The Standard Deviation as a ruler and the Normal Model
  Chapters 6 - 8: Linear Regression
  Chapter 10: Sample Surveys
  Chapter 11: Experiments & Observational Studies
  Chapter 12: From Randomness to Probability
  Chapter 14: Random Variables
  Chapter 15: Sampling Distribution Models
  Chapter 16: Confidence Intervals for Proportions
  Chapter 17: Testing Hypothesis about Proportions
  Chapter 18: More About Tests
  Chapter 19: Comparing Two Proportions
Practice Exercises for MATH2209
  Chapter 15: Central Limit Theorem
  Chapter 20: Inferences About Means
  Chapter 21: Comparing Means
  Chapter 22: Paired Samples and Blocks
  Chapter 23: Inference for Two-way Tables
  Chapter 24: Inference for Regression
  Chapter 25: Analysis of Variance
  Chapter 26: Multifactor Analysis of Variance
  Chapter 27: Multiple Regression

MATH2208 Practice Exercises (from textbook)

De Veaux, Velleman, Bock, Vukov and Wong, Second Canadian Edition

Please note: Brief answers to the odd-numbered exercises may be found in the back of the
text. There are some errors in these solutions; we will post a list of these errors as they are
discovered.

Chapter & Topic, followed by the assigned Exercises

1: Stats Starts Here
   Exercises: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23

2: Displaying and Describing Categorical Data
   Exercises: 1, 3, 5, 7, 9, 11, 13, 17, 21, 25, 27, 29, 31, 33, 35, 37
   Note: for Q21, 25, 27, 29, 31, 33, 35, 37, when graphing to compare conditional
   distributions, we recommend side-by-side plots of the sort done in lab. (The student’s
   solution manual shows only segmented bar charts.)

3: Displaying and Summarizing Quantitative Data
   Exercises: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 29, 31, 37, 39, 41
   Note: Q17 asks you to calculate the standard deviation; for the purpose of this course,
   the value will be given.

4: Understanding and Comparing Distributions
   Exercises: 1, 3, 7, 9, 11, 13, 17 (challenging), 19 (note: part f is challenging), 21, 23

5: The Standard Deviation as a Ruler and the Normal Model
   Exercises: 5, 7, 9, 17(a,b,c,e), 19, 21, 23, 25(a,b,d), 29(a,b), 33(a,b), 35(a), 37

6: Scatterplots, Association, and Correlation
   Exercises: 5, 7, 11, 27

7: Linear Regression
   Exercises: 9, 11(a), 13, 19, 47

8: Regression Wisdom
   Exercises: 11, 15, 19

10: Sample Surveys
   Exercises: any odd-numbered exercise
   (Note: see posted Errata file for information about solutions)

11: Experiments and Observational Studies
   Exercises: 15, 17, 23, 31, 51, 53, 61

12: From Randomness to Probability
   Exercises: 11, 13, 15

13: Probability Rules!
   Exercises: 9, 13, 15, 19

14: Random Variables
   Exercises: 27(c,d), 75(a,b,c,d,e), 77

15: Sampling Distribution Models
   Exercises: 17, 21, 25, 27

16: Confidence Intervals for Proportions
   Exercises: 7, 11, 15, 17, 21, 27, 33

17: Testing Hypothesis About Proportions
   Exercises: 7, 9, 11, 15, 17, 25, 27

18: More About Tests
   Exercises: 5, 7, 11, 13(a,b), 17, 19

19: Comparing Two Proportions*
   Exercises: 9, 11, 17, 19, 21, 23, 25, 27, 33

*Note: Chapter 19 involves a lot of calculations. Minitab output for these questions is
posted on your lab Moodle site.

MATH2209 Practice Exercises (from textbook)

De Veaux, Velleman, Bock, Vukov and Wong, Second Canadian Edition

Please note: Brief answers to the odd-numbered exercises may be found in the back of the
text. There are some errors in these solutions; we will post a list of these errors as they are
discovered. You may choose to use Minitab to do the calculations needed for some of these
questions. Brief Minitab instructions for each topic are at the end of this question list.

Chapter 15: Sampling Distribution Models (for means)

• Q43
• Q45
• Q55
• Q57 part a)
• Q59 parts a and c; parts b and d are harder but make a good challenge

Chapter 20: Inferences about Means

• Q1                              • Q23
• Q5                              • Q25
• Q7                              • Q29
• Q9                              • Q31
• Q11                             • Q33
• Q13                             • Q35
• Q15 a, b & c                    • Q37 recommend MINITAB: parts
• Q17 a,b,c, & d, omit part e
• Q19                             • Q39 parts a & b only
• Q21

Chapter 21: Comparing Means

• Q7
• Q9
• Q11: use df = 31
• Q13: df = 42; Q15: df = 33
• Q21 (do not perform a test here, just explain why you can’t use the methods of Ch 24)
• Q31: df = 37
• Q33: df = 10 with outlier, df = 8 without
• Q35: df = 339; use df = 200 from the table, or use MINITAB
• Q39: df = 20

Chapter 22: Paired Samples and Blocks

Warning! Not all the questions have paired data; some use independent samples.

• Q1
• Q3 - good to think about this.
• Q15
• Q17
• Q19 parts a and b only.
• Q23
• Q25

Chapter 23

• Q23 (for part f, check the assumptions)
• Q25
• Q33

Other suitable problems: Q35, 39, 43 or 45. We recommend that you use MINITAB to find the
test statistic.

Chapter 24

Note for MSVU-style hypotheses & conclusions: we will include the word “mean” in front of
the response (Y), and also include “for the population”, describing the population in context
when appropriate.

• Q1
• Q17
• Q19
• Q23: note part b) refers to “practical” significance
• Q25
• Q29
• Q31: part a) only
• Q33
• Q35
• Q43: omit part f).

Chapter 25

Note for MSVU-style hypotheses & conclusions: you need the “population mean” rather than just
“mean”, and you need to describe the relevant populations. In this section, some of the data
are NOT appropriate for 1-way ANOVA.

• Q1
• Q7
• Q9: think carefully!
• Q11
• Q15
• Q19

Chapter 26

Note for MSVU-style hypotheses & conclusions: you need to add “population mean” to the response
variable. You are not required to use the Greek parameters; you can just use words if you prefer.
For example, in Q11, the MSVU “Ho” hypothesis would be
Ho: time of day has no effect on the population mean number of shots for this pop model.
Solution manual: Ho: time of day has no effect on the number of shots

• Q5
• Q13
• Q15
• Q19: good for discussion. In part a), give hypotheses for the effects of brand & environment;
in part b), you can use the interaction plot to answer the question.
• Q7 & Q9: think carefully. In Q7 something is wrong; note that you are NOT asked to do
the analysis until Q9. You could use MINITAB for this, but not ANOVA.

Minitab Instructions

Chapter 21:

Stat > Basic Statistics > 2-Sample t

• Choose samples in different columns if you have the data in C1 and C2.
• Choose summarized data if you want to enter means and standard deviations.
• The Options button allows you to specify the alternative hypothesis and/or confidence level.
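
To check a 2-sample t result by hand from summarized data, a minimal Python sketch follows. It assumes the unpooled (Welch) form of the test, which may not match the options chosen in Minitab. The function name welch_t and the summary values are made up for illustration and do not come from any exercise.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for an unpooled two-sample t test from summary statistics."""
    v1 = sd1 ** 2 / n1   # variance of the first sample mean
    v2 = sd2 ** 2 / n2   # variance of the second sample mean
    se = math.sqrt(v1 + v2)              # standard error of the difference in means
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical summary values: mean, standard deviation, sample size for each group
t_stat, df = welch_t(52.0, 4.0, 16, 48.0, 3.0, 9)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

The approximate df is usually not a whole number. Statistical tables need a whole number, so it is common to round down.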

Chapter 22:
You can use MINITAB in two ways.
1) Enter the data into two columns and use
Stat > Basic Statistics > Paired t
2) Enter the data into two columns, use the Calc menu to store the differences in a new
column, and use the one-sample t method on the column of differences
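
The second method can be sketched in Python as follows: the paired t statistic is just a one-sample t statistic computed on the column of differences. The data values and variable names here are hypothetical and are not taken from any textbook exercise.

```python
import math
import statistics

# Hypothetical paired measurements on the same five individuals
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after = [10.0, 14.0, 11.0, 11.0, 12.0]

# Step 1: build the column of differences (before minus after)
diffs = [b - a for b, a in zip(before, after)]

# Step 2: one-sample t on the differences, testing a mean difference of 0
n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)           # sample standard deviation (n - 1 divisor)
t_stat = mean_d / (sd_d / math.sqrt(n))
df = n - 1

print(f"mean difference = {mean_d:.2f}, t = {t_stat:.3f}, df = {df}")
```

Because both methods work from the same differences, they should give the same test statistic and degrees of freedom.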

Structure of the Statistics Labs

The lab component of Math 2208/2209 has been designed to enhance your learning
experience by providing a supportive environment where you can apply the concepts
presented in the textbook and lectures to real world statistics problems. We have carefully
constructed problems that are representative of many disciplines: medicine, psychology,
sociology, education, business, and marketing, to name a few. In fact, some of the questions
are based on research conducted by Mount faculty. You will be able to identify these questions
easily as they are the only ones that have citations included at the end of the question. All
other questions, though they may seem to be based on real research, are not. “Any similarities
to actual research are strictly coincidental”.

We encourage you to work in groups, as approaching a problem from a variety of perspectives
can facilitate better learning. However, if you are not comfortable working in groups, feel free
to complete the labs on your own. It is important to attempt each question on your own (or as
a group); and if you get stuck, please ask one of the lab staff for guidance.

During each lab class, you will be assigned a number of problems, which are to be completed
during class time and handed in at the end of class. Labs may not be completed outside of a
scheduled lab time.

Attendance

Please be aware that you are expected to attend the lab session for which you are registered.
If you are unable to attend your scheduled lab session, you must make alternative
arrangements with the lab instructor as soon as possible. As an alternative to missing a lab,
you can arrange with the lab instructor to complete your lab in any of the other lab classes
held during the same week. If your absence is due to illness or an equally serious matter,
appropriate documentation is required. If the reason for your absence is more unusual, contact
the lab instructor prior to your scheduled lab to explain your circumstances. Please do not
assume that any excuse will be accepted. Acceptance of unusual excuses is at the
discretion of the lab instructor, and the lab instructor’s decision is final.

Class Cancellation

If your lab session is cancelled due to inclement weather or other circumstances, information
will be posted on the lab Moodle site as to how the cancelled lab will be handled.

Preparation

Ordinarily, no homework is given for the labs. You are expected to attend class so that you are
prepared for labs. Your lecture notes, and any handouts given by your professor, are the primary
source of information when completing labs. If you miss a class, you should do your best to
obtain the notes for that lecture prior to attending your lab that week. Lab staff will be
available to help you, and your textbook will serve as a reference as needed.

Academic Integrity

The lab experience affords you an opportunity to work collaboratively with your classmates.
Unfortunately, it also creates an environment in which students may be tempted to
misrepresent themselves. Please be aware that university regulations on plagiarism and
cheating will be strictly enforced. Labs submitted for credit must be your own work,
completed during the assigned lab time.

Grading

Although the lab is less formal than a test or exam, the grading of labs is quite rigorous. You
are able to refer to your class notes, textbook, and lab staff for help when answering the
assigned questions. As such, it is expected that answers will be complete. All written
responses are to be expressed in the context of the problem and in complete
sentences, unless otherwise stated. The correct use of language is strictly enforced in the
labs. If you are unsure if a given response is suitable, please feel free to ask one of the lab
staff.

If you have any questions about a grade you receive on a lab, please direct your questions to
the lab instructor, not the TAs or your professor.

Be Prepared

For each lab class, please ensure that you bring the following:
• a sharpened pencil & eraser (working in pen can get very messy)
• a calculator & its manual (the lab staff can help you with some models, but they are not
familiar with all brands of calculators)
• your lecture notes and any handouts from your instructor
• your textbook

Cell Phones

As with any regular class, cell phones are not permitted to be on during lab time. Phones
ringing during class are distracting to other students. If you have special circumstances which
require you to have a phone on, please speak to the lab instructor.

Scent Free Policy

Finally, please observe the Mount’s policy on “Scents”. Many people are adversely affected by
scented body products, especially strong perfumes and colognes. As a courtesy to your
classmates and instructors, please refrain from wearing such products to class.

Dos and Don’ts of Statistics Labs

To make your lab experience successful, a few guidelines need to be adhered to. The
following points are a summary and clarification of the information provided in the previous
section of your lab manual, “Structure of the Statistics Labs”.

Do:
• Come to lab prepared
• Ask questions
• Stay on task during the lab period
• Be considerate of others in the room
• Bring hardcopies of notes/handouts given to you by your professor
• Bring your textbook
• Bring your calculator, pencils and eraser
• Use permitted resources to help you

Do Not:
• Take unmarked labs out of the lab room.
• Bring any materials from another semester of statistics to the lab. This includes past lab
or practice exercise solutions.
• Use your phone or laptop or other device during lab without checking with the lab
instructor first.

How to Learn Mathematics

A Sports Analogy

Recently, I have been taking skating lessons. The skating instructor, Debra, first demonstrates
what movement she wants the class to do, explaining all the steps involved. When Debra
demonstrates a move, it always looks easy. Once she is finished, I usually feel that I
understand how the particular movement is executed; however, when I try it for the first time,
I cannot seem to do it. What looked so easy for Debra turns out to be much harder for me. In
order to master the move, I have to practice over and over again.

Mathematics is a skill just like skating, and learning mathematics is much like learning to
skate: it takes practice, practice, practice. Some of you may have the idea that the world is
divided into two groups of people: those who can do math and those who cannot (and you
may classify yourself as one of the “cannot” people). Fortunately, this idea is not true. Just as
everyone can learn how to skate (though few will become Olympic class skaters), everyone
can learn the mathematics necessary for other disciplines (though not everyone will become a
mathematician). All you need to do is work at it.

1. Schedule Time to Practice

Practice is the most important step in learning mathematics. For university courses, it is
generally expected that you spend at least two hours studying outside class time for every
hour you spend in class, which may be a major change from high school. For math courses,
most of this time should be spent working on problems. It is better to practice in frequent,
shorter sessions than in a few long ones. For example, spending one hour each night
practicing for six nights probably will be more efficient than practicing six hours straight on the
night before a quiz.

2. Go to Class!!!

In order to practice math, you need to have some understanding of it. For most people,
attending class is the best way to obtain this understanding. Try to follow the examples the
instructor gives, and if you do not understand something, ask a question (most likely several
other people will have the same question and will be grateful you asked). If you do not feel
comfortable asking the question in class, write it down in your notes and ask the professor (or
a friend) after class. If, on the other hand, the topic covered in class is one you already have
mastered, use the class examples for review: copy them down and try working them out for
yourself rather than watching the professor, and then check your solution against the
instructor’s work.

3. Read the Textbook

A textbook is not like a novel; it should be read slowly, with a pencil and paper handy. Read
over the explanations, and then try the examples for yourself, filling in any steps that may be
omitted. If possible, try reading the relevant section of the textbook before it is covered in
class. By doing so, it will be easier to follow the lecture and ask questions. However, if you are
unable to read ahead, use the textbook explanations to fill in any gaps in your understanding
from class.

4. Attend the Lab

The lab is an excellent forum in which to practice statistics problems. Here, you can work on
problems in a group environment, benefiting from the knowledge of your peers who may have
a different instructor than you. Additionally, you can request the help of the lab staff if you
really get stuck on a problem.

5. Do the Practice Exercises (in the textbook & the lab manual)

Again, practice is the most important step in learning mathematics. How many exercises you
do depends on how comfortable you are with a given topic. If you feel comfortable with the
material, you may need to try only a few exercises to be sure you have mastered the topic. If
you’re not sure that you understand the material, it may be best to start with the class
examples. Copy out the question, then cover up the solution and try to work it out for
yourself. By so doing, you will be able to see if you really understand the problem. If you
cannot do the question, read it over and try again or try the textbook examples. Next, try the
exercises. Check your answer after each question to ensure that you are on the right track. At
first, you may need to have your notes open in front of you to follow the model of the class
examples. Eventually, you should practice some exercises with your notes closed, as you will
not be able to refer to your notes during quizzes or exams. You even may want to try timing
yourself on later exercises to get practice at working under pressure.

6. If You Get Stuck, Get Help - ASAP

There are a number of people from whom you can get help: your instructor, the lab instructor,
a lab assistant, a classmate, a friend, or a relative. As a last resort, you can hire a tutor (be
aware that it is easy to get dependent on a tutor to do the exercises with you, which is not the
same as doing them by yourself). The critical point is getting help right away, before you fall
too far behind. The best thing to do is to take the exercises that you have tried
(unsuccessfully) to the helper and try to find out where you are going wrong. It is also useful
for you to work through several more exercises with the helper watching. Just watching
someone else work out more examples is not good practice. Remember, you probably did not
learn to ride a bike by watching someone else do it. If there seems to be some topic that you
just cannot get, try to remain calm. Sometimes just leaving a problem alone for a while and
coming back to it later makes a big difference.

7. Consider Forming a Study Group

Many students find study groups particularly useful for mathematics. If you can, meet
regularly with several classmates either before you do the practice exercises to clarify the
concepts or after you have done the exercises to compare your answers. Study groups benefit
both weaker and stronger students. Weaker students benefit from having someone explain
how to do problems that they could not do on their own, while stronger students solidify their
understanding of a topic by explaining it to someone else.

8. Go Over Your Quizzes, Labs, & Homework Assignments

When you get a quiz, lab, or assignment back, go over it to determine where you went wrong.
Read any comments that may be given. If solutions have been provided, look them over to
see how they differ from the answers you provided. If necessary, try more practice exercises
or get help. If you ignore the feedback provided, you will most likely make the same mistakes
again on the exam. In fact, instructors often make some questions for exams similar to those
that were not done well on quizzes or assignments.

9. Study for Tests & Exams with More Practicing

Just studying your notes before an exam may be appropriate for some courses, but not for
math. Again, you need to practice. The first step in preparing for exams is regular practice
across the entire term. This way, by the time the exam gets close, you will have mastered the
individual topics and will be ready to practice material in combination (i.e., review exercises).
Another way to study for exams is through “Random Distributed Practice”. This method
involves writing questions down on the front of index cards - one question per card. Choose
the questions from class examples or practice exercises; write one or two from the topics you
are most comfortable with and more from the topics that give you some trouble. On the back
of the card, write down how to start solving the question. The critical part of a question for
exams is recognizing the type of question and getting started. Shuffle the cards, and go
through them: read the question, mentally consider how to start, and then check the back.
Keep the cards handy and go through them whenever you have a few spare minutes.

Frequently Asked Questions

1. Can I use a calculator?

Yes. However, remember to round off your answer appropriately only at the end. You should
keep more decimal places than you need during the intermediate steps (easiest to do if you
use the memory function on your calculator), or your final answer may be inaccurate due to
round-off error.
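
As a rough illustration of round-off error, the Python sketch below computes a margin of error for a proportion twice: once rounding only at the end, and once rounding the intermediate steps. The counts (17 successes out of 60) and the critical value 1.96 are made-up example values, not taken from any exercise.

```python
import math

successes, n = 17, 60   # hypothetical sample counts
z_star = 1.96           # critical value for 95% confidence

# Careful approach: keep full precision until the final step
p_hat = successes / n
se = math.sqrt(p_hat * (1 - p_hat) / n)
me_late = round(z_star * se, 3)

# Careless approach: round each intermediate result before using it
p_rounded = round(successes / n, 1)
se_rounded = round(math.sqrt(p_rounded * (1 - p_rounded) / n), 2)
me_early = round(z_star * se_rounded, 3)

print("rounded only at the end:", me_late)
print("rounded along the way:  ", me_early)
```

The two answers differ in the third decimal place even though the same formula was used, which is why intermediate values should be kept in the calculator's memory.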

2. Why do I lose marks for “bad form” if I have the right answer?

Remember that the course outline for this and every other MSVU course indicates that “the
correct use of language is one of the criteria included in the evaluation of all written work”.
University graduates are expected to have good written communication skills. Finding the right
answer to a question is part of mathematics, and so is communicating the method and the
answer effectively.

3. I understand what we are doing in class, so why am I failing the quizzes?

It is not enough to understand a topic; you must master it. In order to master a topic, you
must practice. Understanding the method is a good start, but you also have to be able to
apply it without referring to examples in your notes (see “How to Learn Mathematics”).

4. Will “this” be on the exam?

Any material covered or referred to in class, or in lab, or in practice exercises from the
textbook could be on the exam (unless you are told otherwise). “This” includes material from
the beginning of the course, and even prerequisite material. You could also be expected to
transfer your statistics knowledge to a related application. Moreover, if you take (or are
taking) Math2209, you will be expected to know the material from 2208. Math is cumulative:
later topics usually depend on earlier ones, and thus, you cannot just learn the material for a
quiz or exam and then just forget it.

MINITAB Survival Guide

This guide will show you how to use some basic components of MINITAB, the statistical
software used for this course. It is assumed that you have basic computer skills such as being
able to use a Windows desktop, pull-down menus, and file managers. To obtain maximum
benefit, you should read the explanations, and then type the appropriate commands to follow
the examples given in the first five sections. All of the example commands for the user are
indicated by This Typeface. For example, the following command directs you to pull down the
Stat menu and then select the Basic Statistics function: Stat > Basic Statistics

You may use Minitab in one of two ways:

1. Use the campus network.
2. Install the Minitab program on your home PC or laptop by downloading it from the
MSVU IT&S website (under Software for Home). There is a link directly to the download
on your lab Moodle site. You will need to enter your name and student ID number to
download the software.

Please note: if you are using your own computer, you are responsible for the
installation and set-up.

Logging In & Out of the Campus Network

Follow the steps below to access Minitab17 on campus.

1. Get a Student Username

In order to use the computers on campus, you need to get a student username, password,
and printer account from the computer help desk located on the main floor of the EMF library.
You will use the same student user-name for all of your computing needs at MSVU: course
work, email, Internet.

2. Find a Computer

The S316 computer lab has restricted hours (check the door to see when it is available), but
there may be an assistant present who can help you if you have any problems logging in,
using the computers, or getting printouts. Please note that the assistants CANNOT help you
with the specific details of your assignment; contact your professor if you have questions in
that regard. Once you are comfortable using the MSVU computers, you may prefer to use the
S315, EV136 or library computers.

3. Logging In & Starting MINITAB17

Again, if you are inexperienced, try logging in with an assistant present. Read the usage
agreement on the screen of the computer and use the mouse to click that you agree. Now,
enter your student username and password. You probably will be asked to change your
password. Enter a new password (something that you can remember which is at least six
characters long and contains at least two digits), then enter it again to confirm it was typed
correctly. After a short (or sometimes, long) wait you will see several icons (small pictures)
come up on the computer screen; one of them should be a green icon with white bars labelled
MINITAB17. Double click on this icon to open the MINITAB17 software. If there is no
MINITAB17 icon on the screen, you may find it by searching through the programs menu
(START > PROGRAMS > MINITAB).

If you have tried all of these options and still cannot access MINITAB,
ask the assistant in the computer lab for help.

4. Logging Out

When you are ready to leave the computer lab (not now!), you should also log off the
network (START > SHUT DOWN > RESTART). If you do not log off from the MSVU network,
the next person to use the computer you were at will be able to access all your files, including
your email.

Minitab 17 Basics

If you have successfully opened MINITAB as described previously, you will see a standard
looking Window, with the name MINITAB - Untitled across the top bar. Below the title is a
row of Menus (File, Edit, Data, etc.), followed by a row of button icons (file folder, save, print,
etc.), and finally by two windows — the Session window and the Worksheet window.

1. Entering Data

The worksheet grid is divided into columns (C1, C2, C3, ...) and rows (1, 2, 3, ....). A row
contains data on all variables for a given individual while a column contains data on all
individuals for a given variable. To enter data into column C1, for instance, click on row 1 of
C1 to highlight that cell, and then type in the first number. Now, go to row 2 by using the
down arrow (or by clicking on the cell) and type in the next number. Continue in this way
down the column until all of the data for the variable are entered. You can enter data for other
variables into columns C2, C3, etc. If you make a data entry error, simply click on the cell in
question to highlight it and re-enter or delete the number.

Try entering the following data, which represents a sample of students and their response to
the question “What is your favourite season?”

C1: Spring, Summer, Fall, Winter
C2: 12, 14, 16, 8



There are different types of data that we will be working with, so to see how some other
Minitab Commands work, enter the following values for a sample of five individuals into
columns 3 and 4:
C3: 45, 53, 28, 47, 39
C4: 54, 27, 59, 32, 48

2. Naming Data Columns

Giving a name to a column serves two purposes:


i. The column may be referred to by its name as it often is easier to remember the name
of a variable than the number of a column.
ii. All output will be labelled with the name making the output easier to read.

To name a column, highlight the grey worksheet cell which is immediately below the column
label (above row one of the data), then type in the desired column name. Select names that
are descriptive of the data.

Recall that for the data entered in Step 1, the values in C1 represent the season and the
values in C2 represent how many students chose that as their favourite. So give the name
‘Season’ to C1 and ‘Number of Students’ to C2.

Note that for the second set of data entered in Step 1, the values in C3 represent
temperatures and the values in C4 represent reaction times. Give the name ‘temp’ to C3 and
‘react’ to C4.
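
As a rough illustration only (this is Python, not Minitab), the worksheet built so far can be pictured as a set of named columns. The sketch below uses a plain dictionary; the name `worksheet` is an assumption made for this example and does not correspond to anything in Minitab.

```python
# Hypothetical stand-in for the Minitab worksheet: one list per named column.
worksheet = {
    "Season": ["Spring", "Summer", "Fall", "Winter"],   # C1
    "Number of Students": [12, 14, 16, 8],              # C2
    "temp": [45, 53, 28, 47, 39],                       # C3
    "react": [54, 27, 59, 32, 48],                      # C4
}

# A column holds every individual's value for one variable ...
print(worksheet["temp"])

# ... while a row holds one individual's values for the related variables.
first_individual = (worksheet["temp"][0], worksheet["react"][0])
print(first_individual)
```

Referring to a column by name rather than by number is what makes later output easier to read.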

3. Data Display

You can get the contents of the worksheet to display in the Session window with the
command Data > Display Data. In the dialogue box that opens you can select or double click
the columns you wish to display. Then, click OK and the columns in the worksheet will be
displayed in the Session window:

Note: You can get a hardcopy printout of the session window, as explained later in this
section.

Using Statistics Commands in MINITAB17

The following MINITAB commands allow you to perform a number of useful statistical
computations. The results for most statistics commands are displayed in the Session window.
One exception is some graphics commands (e.g., histogram, plot) which open in a new
window.

1. PIE CHART – used to obtain pie charts

Click on the command, Graph > Pie Chart… and a dialogue box will open. Choose the
option “chart values from a table”, and put the label from C1 in the “Categorical Variable”
box. C2 will go in the “Summary Variables” box. Then click on Labels, and Slice Labels.
Click in the boxes to label the pie slices with the variable name and percent. Click OK to
create the pie chart.

Remove the legend (since your slices are labelled) by clicking on it and hitting the “Del” button
on your keyboard. You can also remove the colour by double-clicking on the pie and, under the
“Attributes” tab, choosing Custom under “Fill Pattern” and white for the background
colour.
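
If you would like to check the percentages shown on the pie slices by hand, the following Python sketch (not Minitab code) computes them from the sample counts. The variable names are chosen for this example only.

```python
# Season counts from columns C1 and C2 of the sample worksheet.
seasons = ["Spring", "Summer", "Fall", "Winter"]
counts = [12, 14, 16, 8]

total = sum(counts)
# Percent of students choosing each season.
percents = {s: 100 * c / total for s, c in zip(seasons, counts)}

for season, pct in percents.items():
    print(f"{season}: {pct:.1f}%")
```

The four percentages should add up to 100, which is a quick check that no category was missed.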

2. BAR CHART – used to obtain bar charts

Create a bar graph (Graph > Bar Chart) of the data. Select “Values from Table” from the
drop-down menu, then under “Chart options”, choose to show Y as a percent. Click in the
“graph variables” box then click C2 on the left. Then click in the “categorical variable” box
and click C1.

3. DESCRIBE - used to obtain descriptive statistics

The following command may be used to generate a number of descriptive statistics including
mean, standard deviation, median, quartiles, etc. for specified variables:

Stat > Basic Statistics > Display Descriptive Statistics...

As with the data display command described earlier, a dialogue box will open; double click on
the columns for which you wish to get statistics, and then click OK.

Use the sample data to get descriptive statistics for C3 (temperature). The results should
appear in the Session window. Note that there are many other options and accessories
available in the dialogue box — if these are needed in future assignments, they will be
explained then.
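
For comparison, here is a minimal Python sketch of the same kind of summary for the temp data, using only the standard `statistics` module. It is not Minitab output; quartile conventions differ between packages, so the quartiles printed here may not match every program exactly.

```python
import statistics

temp = [45, 53, 28, 47, 39]  # column C3

mean = statistics.mean(temp)
st_dev = statistics.stdev(temp)      # sample standard deviation (divisor n - 1)
median = statistics.median(temp)
# The default "exclusive" method interpolates at positions p * (n + 1).
q1, _, q3 = statistics.quantiles(temp, n=4)

print(f"N={len(temp)} Mean={mean} StDev={st_dev:.3f}")
print(f"Min={min(temp)} Q1={q1} Median={median} Q3={q3} Max={max(temp)}")
```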

4. HISTOGRAM - used to obtain histograms

After using the command, Graph > Histogram..., a dialogue box will open. Highlight the
Simple button and then click OK. Another dialogue box will open in which you can double
click or select the columns for which you wish to get histograms. Then, click OK.

Use the sample data to get a histogram for C3 (temp). Please note that a new window will
open for each histogram you make during your Minitab session; these windows can be
minimized or closed (make sure you print the graph before you close the window). If you
want to see the temp histogram window again after it has been minimized, use the
following WINDOW command: Window > Histogram of temp

5. STEM-AND-LEAF PLOT - used to create a stem plot

Use the command, Graph > Stem-and-Leaf..., and a dialogue box will open. Double click
on the columns for which you wish to get stem plots, and then click OK. Try this command
with column C3 data; the results should appear in the Session window.
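
To see how a stem plot is put together, the Python sketch below builds a simple one for the temp data by hand. It assumes two-digit whole numbers (tens digit as the stem, units digit as the leaf), and its layout is a simplification that will not look identical to Minitab's display.

```python
from collections import defaultdict

temp = [45, 53, 28, 47, 39]  # column C3

# Stem = tens digit, leaf = units digit (assumes two-digit whole numbers).
stems = defaultdict(list)
for value in sorted(temp):
    stems[value // 10].append(value % 10)

lines = []
for stem, leaves in sorted(stems.items()):
    leaf_text = " ".join(str(leaf) for leaf in leaves)
    lines.append(f"{stem} | {leaf_text}")

print("\n".join(lines))
```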

6. REGRESSION - used to perform regression analysis

The following command fits a regression equation to the data: Stat > Regression >
Regression > Fit Regression Model. A dialogue box will open; click in the Responses box
then choose the response (or dependent) variable Y. Then click in the Continuous predictors
box, and choose the column to be used as the Predictor (or independent or explanatory)
variable X. Finally, click OK.

Try this command with column C4 as Y and C3 as X. The results should appear in the Session
window.
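
The fitted line can also be worked out directly from the least-squares formulas. The Python sketch below does this for the sample data, with react (C4) as Y and temp (C3) as X; it illustrates the calculation and is not a reproduction of Minitab's output.

```python
temp = [45, 53, 28, 47, 39]    # X: predictor (column C3)
react = [54, 27, 59, 32, 48]   # Y: response (column C4)

n = len(temp)
x_bar = sum(temp) / n
y_bar = sum(react) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x in temp)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temp, react))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

print(f"react = {intercept:.2f} + ({slope:.3f}) temp")
```

The slope is negative for these data: higher temperatures go with shorter reaction times in this small sample.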

7. CORRELATE - used to find a correlation coefficient

Use the command, Stat > Basic Statistics > Correlation…, and a dialogue box will open.
Double click on the two columns of the variables for which you wish to find the correlation and
then click OK. Try this command with columns C3 and C4; the results should appear in the
Session window.
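
As a cross-check on the value you obtain, the correlation coefficient can be computed from its definition. The following Python sketch does so for columns C3 and C4; the variable names are chosen for this example.

```python
import math

temp = [45, 53, 28, 47, 39]    # column C3
react = [54, 27, 59, 32, 48]   # column C4

n = len(temp)
x_bar = sum(temp) / n
y_bar = sum(react) / n

s_xx = sum((x - x_bar) ** 2 for x in temp)
s_yy = sum((y - y_bar) ** 2 for y in react)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temp, react))

# Pearson correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)
print(f"r = {r:.3f}")
```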

8. SCATTERPLOT - used to generate a scatterplot

This command produces a scatterplot with the data for the response variable on the vertical
axis (Y) and the data for the predictor variable on the horizontal axis (X). After you select the
following command a dialogue box will open: Graph > Scatterplot.... Highlight the Simple
button and then click OK. Another dialogue box will open. As with regression, double click on
the column to be used as the response variable Y and on the column to be used as the
predictor variable X, and finally click OK.

Try this command with column C4 as Y and C3 as X; the results will appear in a new window.
As with the HISTOGRAM command, this window can be closed or minimized, and if you want
to see this window again after it has been minimized, it can be found using the WINDOW
command.

General Commands in MINITAB 17

The following are some other commands that are useful in MINITAB17.

1. SAVE Commands - used to save files

There are various commands you can use to save your work. For instance, File > Save Project
(or pressing the diskette button, or using Ctrl+S) will save the entire project (i.e., all windows
including the worksheet, session, graphics, info, etc.). If you only want to save the data in the
worksheet, you can use the command, File > Save Current Worksheet. These commands
initiate standard Windows file management dialogue boxes; it is assumed that you are
competent in the use of these standard dialogue boxes.

2. OPEN Commands - used to open existing files

The commands to retrieve previously saved projects and worksheets are analogous to the
SAVE commands: File > Open Project... and File > Open Worksheet...

To open Minitab files from the ActiveStats CD that came with your text, put the CD into your
CD drive. The Minitab files are in a folder called
DEVEAU_VELLEMAN_BOCK DATASETS\STATS AND DATA MODELS\MINITAB

3. HELP Command - used to get help with MINITAB

This command is similar to a Standard Windows-style help interface. You may wish to
experiment with HELP > TUTORIALS.

4. PRINT Command - used to print files

You can get a hardcopy printout of any window in MINITAB, including the Session and
Graphics windows. It is particularly important to print these windows in order to turn in your
MINITAB assignments. For instance, to print the Session window you simply make sure that it
is active (i.e., click on the bar at the top of the Session window if it is not lit up as blue) and
then click the printer icon. The standard Windows Printer dialogue box will open, and you will
need to select the printer to which you wish to send your printout (Seton 316, Help Desk,
etc.). Also, make sure you print any graphs you need. Either do so when you first create the
graph by clicking the printer icon or at the end of your session by maximizing each graph
window and clicking on the printer icon.

You can also export the Session Window or graphs to Microsoft Word or PowerPoint. Simply
click in the Session Window, or on the graph you wish to copy, and right-click. If you already
have a Word or PowerPoint file open, it will send the information to that file. If not, a new file
will be created. If you are adding to an existing document, be sure to click in the document
where you want the new information added before sending from Minitab.

Important Money & Tree Saving Tip

Before you print the results contained in a Session Window, you can edit the contents.
To delete sections of the results that you do not need to hand in, simply place
the cursor at the beginning of the section you wish to remove, and hold the left
mouse button down to highlight the material to be deleted. Once the material is
highlighted, just press the delete key.

5. EXIT - used to leave MINITAB

When finished, to leave MINITAB simply close the main window or use the following
command: File > Exit.
Minitab Express Basics

If you have successfully opened MINITAB as described previously, you will see a standard
looking Window, with the name Untitled – Minitab Express across the top bar. Below the
title is a row of Menus (File, Home, Data, etc.), followed by the default tri-pane view of the
Navigator (left), Output Pane (top) and Data Pane (bottom). You can change the layout
of this window on the Home menu; for the purpose of this Guide, we will leave the default
view.

1. Entering Data

The worksheet grid is divided into columns (C1, C2, C3, ...) and rows (1, 2, 3, ....). A row
contains data on all variables for a given individual while a column contains data on all
individuals for a given variable. To enter data into column C1, for instance, click on row 1 of
C1 to highlight that cell, and then type in the first number. Now, go to row 2 by using the
down arrow (or by clicking on the cell) and type in the next number. Continue in this way
down the column until all of the data for the variable are entered. You can enter data for other
variables into columns C2, C3, etc. If you make a data entry error, simply click on the cell in
question to highlight it and re-enter or delete the number.

Try entering the following data, which represents a sample of students and their response to
the question “What is your favourite season?”

C1: Spring, Summer, Fall, Winter
C2: 12, 14, 16, 8


32 Minitab Survival Guide: Minitab Express

There are different types of data that we will be working with, so to see how some other
Minitab Commands work, enter the following values for a sample of five individuals into
columns 3 and 4:
C3: 45, 53, 28, 47, 39
C4: 54, 27, 59, 32, 48

2. Naming Data Columns

Giving a name to a column serves two purposes:


i. The column may be referred to by its name as it often is easier to remember the name
of a variable than the number of a column.
ii. All output will be labelled with the name making the output easier to read.

To name a column, click in the white worksheet cell which is immediately below the column
label (above row one of the data), then type in the desired column name. Select names that
are descriptive of the data.

Recall that for the data entered in Step 1, the values in C1 represent the season and the
values in C2 represent how many students chose that as their favourite. So give the name
‘Season’ to C1 and ‘Number of Students’ to C2.

Note that for the second set of data entered in Step 1, the values in C3 represent
temperatures and the values in C4 represent reaction times. Give the name ‘temp’ to C3 and
‘react’ to C4.

This is what the Worksheet pane of your window should look like now:

Using Statistics Commands in MINITAB Express

The following MINITAB commands allow you to perform a number of useful statistical
computations. The results for most statistics commands are displayed in the Output Pane. One
exception is some graphics commands (e.g., histogram, plot) which are viewed by clicking on
the appropriate label on the Navigator Pane.

1. PIE CHART – used to obtain pie charts

Click on the command, Graphs > Pie Chart… and a dialogue box will open. Choose the
option “summarized values for each category in a table”, and put the label from C1 in the
“Category names” box. C2 will go in the “Summary Variables” box.

2. BAR CHART – used to obtain bar charts

Create a bar graph (Graphs > Bar Chart > Summarized data) of the data. Choose “Number
of Students” as the Summary variable, and “Season” as the categorical variable.

3. STATISTICS - used to obtain descriptive statistics

The following command may be used to generate a number of descriptive statistics including
mean, standard deviation, median, quartiles, etc. for specified variables:

Statistics > Descriptive Statistics...

Use the sample data to get descriptive statistics for C3 (temperature). This is the variable;
leave the Group variable box blank. The results should appear in the Output Pane. Note that
there are many other options and accessories available in the dialogue box — if these are
needed in future assignments, they will be explained then.

4. HISTOGRAM - used to obtain histograms

After using the command, Graphs > Histogram >Simple, a dialogue box will open. Double
click or select the columns for which you wish to get histograms. Then, click OK.

Use the sample data to get a histogram for C3 (temp).



5. STEM-AND-LEAF PLOT - used to create a stem plot

Use the command, Graphs > Stem-and-Leaf, and a dialogue box will open. Double click on
the columns for which you wish to get stem plots (the name of the column should be in the
Variable box), and then click OK. Try this command with column C3 data; the results
should appear in the Output Pane.

6. REGRESSION - used to perform regression analysis

The following command fits a regression equation to the data: Statistics > Simple
Regression. A dialogue box will open; click in the Responses box then choose the response
(or dependent) variable Y. Then click in the predictors box, and choose the column to be
used as the Predictor (or independent or explanatory) variable X. Finally, click OK.

Try this command with column C4 as Y and C3 as X. The results should appear in the Output
Pane.

7. CORRELATE - used to find a correlation coefficient

Use the command, Statistics > Correlation, and a dialogue box will open. Double click on the
two columns of the variables for which you wish to find the correlation and then click OK. Try
this command with columns C3 and C4; the results should appear in the Output Pane.

8. SCATTERPLOT - used to generate a scatterplot

This command produces a scatterplot with the data for the response variable on the vertical
axis (Y) and the data for the predictor variable on the horizontal axis (X). Select the
following command: Graphs > Scatterplot > Simple. A dialogue box
will open. As with regression, double click on the column to be used as the Y variable and on
the column to be used as the X variable, and click OK.

Try this command with column C4 as Y and C3 as X; the results will appear in a new window.

If you’ve been following along in Minitab, you can see in the Navigator Pane all commands you
have used. Click on any of the items in that list to see the graph or output associated with
that command.

General Commands in MINITAB Express

The following are some other commands that are useful in MINITAB.

1. SAVE Commands - used to save files

There are various commands you can use to save your work. For instance, File > Save Project
(or pressing the diskette button, or using Ctrl+S) will save the entire project (i.e., all windows
including the worksheet, session, graphics, info, etc.). This command initiates a standard
Windows file management dialogue box; it is assumed that you are competent in the use of
these standard dialogue boxes.

2. OPEN Commands - used to open existing files

The command to retrieve previously saved projects is analogous to the SAVE command:
File > Open.

To open Minitab files from the ActiveStats CD that came with your text, put the CD into your
CD drive. The Minitab files are in a folder called
DEVEAU_VELLEMAN_BOCK DATASETS\STATS AND DATA MODELS\MINITAB

3. HELP Command - used to get help with MINITAB

Note that all Help for Minitab Express is online; you need an internet connection to access
Help. However, you may wish to look at the “Getting Started with Minitab Express” video, or
experiment with the “How To” section.

4. PRINT Command - used to print files

You can get a hardcopy printout of any window in MINITAB, including the Session and
Graphics windows. It is particularly important to print these windows in order to turn in your
MINITAB assignments. For instance, to print the Session window you simply make sure that it
is active (i.e., click on the bar at the top of the Session window if it is not lit up as blue) and
then click the printer icon. The standard Windows Printer dialogue box will open, and you will
need to select the printer to which you wish to send your printout (Seton 316, Help Desk,
etc.). Also, make sure you print any graphs you need. Either do so when you first create the
graph by clicking the printer icon or at the end of your session by maximizing each graph
window and clicking on the printer icon.

You can also export the Output Pane or graphs to Microsoft Word. Simply click on the output
or on the graph you wish to copy, right-click, and choose copy. Then go to where you want
to copy it, right-click, and choose paste.

5. EXIT - used to leave MINITAB


When finished, to leave MINITAB simply close the main window or use the following
command: File > Exit.
Using Your Statistical Calculator

Basic Functions

As was noted on your course syllabus, you will need a calculator which will allow you to enter
expressions using brackets to indicate the order of operations for this course. Hopefully, you
already have a suitable calculator, but if not, one is available at the MSVU book store. Along
with the appropriate calculator, you must have the user manual. There is a large variety of
calculators available, and your professor and lab instructor are not familiar with all of them
(especially the graphing calculators). If you no longer have the instructions for your calculator, do
not despair. There are a couple of websites that may help:
• Basic instructions for various models, including Texas Instruments, Hewlett-Packard,
Radio Shack, Sharp, and Casio are available at
http://office.manualsonline.com/manuals/device/calculator.html
• Complete instructions for most Texas Instruments calculators are available at
http://education.ti.com/us/global/guides.html#graph

Work through this section if you are not familiar with the operation of your calculator.
Depending on the model you have, some of the following features may not apply. The
instructions that follow are primarily for the Casio fx-300MS and Casio fx-300MS Plus. Where
applicable, instructions for the Texas Instruments TI-83 are given in a rectangular text box.
Finally, some common variations on the instructions are provided in the elliptical text boxes.

a. MODE Key

Before beginning any calculation, you need to ensure that your calculator is in the proper
mode. For this course, you should be familiar with COMP MODE (used for simple calculations), SD
MODE (used to calculate means and standard deviations), and REG MODE (used to calculate least-squares
regression coefficients and correlation coefficients). If you use your calculator in the wrong mode, the
memory and bracket keys may not work as you expect.

Select MODE
Press Mode , and then 1 for COMP (Computation Mode).
Press Mode , and then 2 for SD (Standard Deviation Mode).
Press Mode , and then 3 for REG and 1 for Lin (Linear Regression Mode).

With some calculators, SD MODE may be referred to as STAT - 1 VAR,
and LIN-REG MODE may be referred to as STAT - 2 VAR.

b. Display

For most calculators, you can specify how many decimal places will be displayed in an answer.
In this course, it is preferable to maintain all decimal places (i.e., 9). Additionally, you often
can specify the notation that answers will be displayed in. The default setting for many
calculators is scientific notation. For example, 1/10,000 may be displayed as 1×10^-04. If you
understand that 1×10^-04 = 0.0001, you may not wish to change the display; however, if you find
scientific notation confusing, change the display to decimal notation.
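
If it helps to see the two notations side by side, the short Python sketch below formats the same value both ways. It only illustrates the idea; the exact layout on your calculator's display will differ.

```python
value = 1 / 10_000

scientific = f"{value:.0e}"   # scientific notation
decimal = f"{value:.9f}"      # fixed notation with 9 decimal places

print(scientific)
print(decimal)
```

Both strings describe the same number; only the way it is written changes.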

Change the Number of Decimal Places Displayed


Press Mode Mode Mode , then 1 for Fix, and 9 for 9 decimal places.

With the TI-83, press Mode , then use the arrow keys to select Float and the desired number of decimal places.

Change the Display Notation


Press Mode Mode Mode , then 3 for Norm, and 2 for decimal notation.

The TI-83 displays decimal notation by default, unless the number is less than 0.001.

c. Correcting Errors

Imagine that you are in the middle of a long calculation when your finger slips, and you
inadvertently press the wrong number or function key. Fortunately, it is not always necessary
to start the calculation over from the beginning.

Assuming you want to evaluate 2 + 3 = 5, three mistakes are possible: mispunching the 2
key, the + key, or the 3 key. With the Casio fx-300MS and fx-300MS PLUS, there are two ways
to correct such an error.

Use the Delete Key


If you press 2 , and then press × by mistake, it can be corrected immediately by
pressing DEL , and then + 3 = ; you should see 5 on the display.

The DEL key on the TI-83 works the same way.

With some calculators, a CE key is used to remove the last entry.

Use the Replay Key


If you notice the error after the whole calculation is complete (i.e. 2 × 3 = 6), you can use
the replay keys to get to the incorrect entry.

Press REPLAY REPLAY + = ; you should see 5 on the display.

d. Negative Numbers

The most common mistakes made in mathematical calculations involve the negative sign. As
such, it is essential that you know how to enter a negative number on your calculator.

Enter a Negative Number


To evaluate 6 × -3, press 6 × (-) 3 = ; you should see -18 on
the display.

Follow the same procedure for the TI-83.

e. Memory

Mastering the use of your calculator’s memory function is key to the accurate and efficient
evaluation of complicated formulas. Most calculators have the following memory keys: add into
memory (M+ or Min), subtract from memory (M-), recall from memory (MR, RCL, or RCL M),
clear memory (CM or Mcl), and store in memory (STO or STO M). However, some calculators
require the use of separate M, +, and - keys. It is also important to know that some
calculators, even solar powered ones, have a battery to maintain the memory function;
therefore, simply turning the calculator off will not clear the memory. Moreover, problems with
the memory function can arise when the battery is low.

Clear the Memory


Press Shift Mode , then 1 for Mcl, and then = .

To clear the memory on a TI-83, press 2nd , then + for MEM, 5 to select Reset…,
1 to select All Memory…, and 2 to select Reset; you should see Mem cleared on the display.

Enter a Number into Memory


To enter 17 into memory, press 1 7 Shift , then RCL for STO, and then
M+ ; you should see 17M on the display.
Recall Memory
Press RCL M+ ; you should see 17 on the display.

Add a Number in the Display to the Number in Memory


To add 3 to the number stored in memory (i.e., 17), press 3 M+ .

Now, press RCL M+ ; you should see 20 on the display.

Subtract a Number in the Display from the Number in Memory


To subtract 8 from the number stored in memory (i.e., 20), press 8 , then
Shift M+ for M-.
Now, press RCL M+ ; you should see 12 on the display.

Use the Memory Function in a Calculation


Suppose you want to double the value of the number held in memory. You need only press
2 × RCL M+ = ; you should see 24 on the display.
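
The memory steps above can be traced with a small Python model. This is only a toy sketch: the class `CalculatorMemory` and its method names are invented for this example and simply mimic the STO, M+, M- and RCL keys on a single memory register.

```python
class CalculatorMemory:
    """Toy model of a single independent memory register (M)."""

    def __init__(self):
        self.m = 0

    def store(self, value):      # STO M
        self.m = value

    def add(self, value):        # M+
        self.m += value

    def subtract(self, value):   # M-
        self.m -= value

    def recall(self):            # RCL M
        return self.m


memory = CalculatorMemory()
memory.store(17)
memory.add(3)
after_add = memory.recall()
memory.subtract(8)
after_subtract = memory.recall()
doubled = 2 * memory.recall()    # recalling does not change the stored value
print(after_add, after_subtract, doubled)
```

Note that using the recalled value in a calculation leaves the number in memory unchanged.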

f. Squares and Square Roots

Most calculators have square (x²) and square root (√) keys. These functions not only can
reduce the chances of making an error in a calculation, but also can save time, which is
important during a timed exam.

Square a Positive number


To square 2, press 2 X2 = ; you should see 4 on the display.

Square a Negative number


For many calculators, it is necessary to use brackets when squaring negative numbers.
To square -7, press ( (-) 7 ) X2 = ; you should see 49 on
the display.

Follow the same procedure for the TI-83.
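
The reason brackets matter is the order of operations: the exponent is applied before the negative sign. Python follows the same rule, so the short sketch below shows both results.

```python
without_brackets = -7 ** 2     # exponent is applied first: -(7 ** 2)
with_brackets = (-7) ** 2      # the negative number itself is squared

print(without_brackets, with_brackets)
```

Only the bracketed version gives the square of -7.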



Find the Square Root of a Number


To find the square root of 9, press √ 9 = ; you should see 3 on the display.

To find the square root of 9 using the TI-83, press 2nd , then X2 for √, and then 9 ) = .

Practice Exercise

Try each of the calculations below. By using the memory function on your calculator, you
should be able to do these calculations without having to write down any partial or
intermediate steps. Remember, unless brackets are used, you do exponents first, then
multiplication and division, and finally addition and subtraction (BEDMAS). In this course, you
occasionally may be required to calculate certain statistics on your calculator; thus, it would be
to your advantage to be comfortable performing calculations like the ones below.

a. (23.32 + 51.70 + 45.23 + 15.72 + 91.87) / 5 =

b. 23.32² + 51.70² + 45.23² + 15.72² + 91.87² =

c. 23 × 0.5 + 36 × 12.3 + 19 × 0.97 + 0.1 × 7.6 =

d. √17 =

e. 2 + √15 × 1.7 =

Answers: a) 45.568 b) 13949.681 c) 473.49 d) 4.123 e) 8.584
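
If you want to confirm your calculator work, the five calculations can be checked with the Python sketch below. The operators used in the third and fifth expressions are an assumption: they were chosen because they reproduce the listed answers.

```python
import math

a = (23.32 + 51.70 + 45.23 + 15.72 + 91.87) / 5
b = 23.32**2 + 51.70**2 + 45.23**2 + 15.72**2 + 91.87**2
c = 23 * 0.5 + 36 * 12.3 + 19 * 0.97 + 0.1 * 7.6
d = math.sqrt(17)
e = 2 + math.sqrt(15) * 1.7    # square root first, then multiply, then add

for label, value in zip("abcde", (a, b, c, d, e)):
    print(f"{label}) {value:.3f}")
```

Python applies the same order of operations as BEDMAS, so no extra brackets are needed beyond those in the first expression.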



OPTIONAL Calculator Information


Using Your Statistical Calculator:
Statistical Functions

1. Single Variable Statistics

Statisticians calculate statistics on a sample of measurements. In this course, the first type of
statistics that you will be introduced to is single variable statistics. By setting your calculator to
statistical mode (see a. MODE Key above), you will be able to calculate these values accurately and efficiently.

IMPORTANT! The first step in calculating single variable statistics is to CLEAR the statistical
memory. Because many calculators are equipped with a battery to maintain various memory
functions, simply turning off the calculator WILL NOT clear the statistical memory.

Clear the Statistical Memory


Press Shift , then Mode for CLR, 1 for Scl, and then = ; you should see Stat
clear and 0 on the display.

To clear the statistical memory of the TI-83, you can follow the instructions given earlier under Clear the Memory, or you can clear
the data stored in the lists. To clear List 1, press STAT , and 1 for edit. Scroll to the top using the up arrow key
until L1 is highlighted. Now, press CLEAR and ENTER .

After you are sure that the statistical memory is clear, you can begin entering data. For
practice, try entering the following data set: {23.32, 51.7, 45.23, 15.72, 91.87}.

Enter Data
For the first datum, press 2 3 . 3 2 , then M+ for DT; you
should see n=1 on the display. Next, press 5 1 . 7 M+ ; you
should see n=2 on the display. Continue in similar fashion until you have entered all the
data.

To enter data using the TI-83, press STAT , and 1 for edit. Scroll to the first row in L1 using the arrow keys,
and now press 2 3 . 3 2 , and then ENTER to enter the first datum. Next, press
5 1 . 7 , and then ENTER . Continue until all the data are entered.

When you finish entering data, it is always a good idea to check it. Just as many calculators
allow you to correct an error in a long calculation, many also allow you to correct an error in
data entry.

Check Data

To check the data, simply press the REPLAY key to scroll through the data. For each entry,
you will see two displays. When you press REPLAY the first time, you should see x1 =
23.32 on the display. Press it again, and you should see Freq 1 = 1.

To check data using the TI-83, simply scroll through the list using the up and down arrow keys.

Correct an Erroneous Data Entry


Suppose you mistakenly entered 52.7 instead of 51.7 for x2. When you are at the screen
displaying x2 = 52.7, press 5 1 . 7 =. Now, you should see x2 =
51.7 on the display.

To correct data using the TI-83, highlight the incorrect datum (e.g., 52.7) using the arrow keys. Now, press
5 1 . 7 ENTER .

After the data are entered and checked, you need only to press one or two keys to calculate
any one of a variety of statistics. The two calculations that you will use most in this course are
the sample mean and sample standard deviation.

Calculate the Sample Mean (x̄)


Press Shift , 2 for S-VAR, then 1 for x̄ , and finally = ; you should see
45.568 on the display.

Calculate the Sample Standard Deviation (xσn-1)


Press Shift , 2 for S-VAR, then 3 for xσn-1 and finally = ; you should see
29.8641402 on the display.

Assuming your data are entered into L1 of the TI-83, press STAT , use the right arrow key to highlight CALC, and then
press 1 for 1-Var Stats, followed by ENTER . You should see a list of various statistics on the display. Use the down arrow key to scroll
through the list. The sample standard deviation is denoted by Sx.
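As a cross-check on the calculator steps above, the same two statistics can be computed in software. The following is a minimal Python sketch, not part of the original lab procedure; it assumes the practice data set from this section and uses the standard library's statistics module, whose stdev function divides by n - 1 in the same way as the xσn-1 key.

```python
import statistics

# Practice data set from this section.
data = [23.32, 51.7, 45.23, 15.72, 91.87]

mean = statistics.mean(data)   # sample mean
sd = statistics.stdev(data)    # sample standard deviation (divides by n - 1)

print(f"n = {len(data)}")
print(f"sample mean = {mean:.3f}")
print(f"sample standard deviation = {sd:.7f}")
```

The printed values should agree with the calculator displays quoted above, up to rounding.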

Practice Exercise

Calculate the mean ( ȳ ) and standard deviation (yσn-1) of the following data: {14.62,
12.12, 13.63, 14.07}

Answers: ȳ = 13.61 yσn-1 = 1.07

2. Two Variable Statistics

Soon after you become acquainted with single variable statistics, you will be introduced to two
variable statistics. Here, you will be required to use the linear regression mode of your
calculator (see p.24).

Importantly, as with single variable statistics, the first step in calculating two variable
statistics is to CLEAR the statistical memory. Follow the same procedure as with single
variable statistics (p.29).

Now, you can begin entering data. For practice, try entering the following data set:

X 5 3 8 4 6
Y 15 12 20 15 18

Enter Data
Data for linear regression are entered in pairs (i.e., x,y), with the comma key separating the two values. For the first pair listed above,
press 5 , 1 5 M+ ; you should see n=1 on the display. For the
next pair, press 3 , 1 2 M+ ; you should see n=2 on the display.
Continue the same way until all the pairs are entered.

To enter pairs of data using the TI-83, press STAT , and 1 for Edit. Scroll to the first row in L1 using the
arrow keys, and press 5 ENTER , 3 ENTER , and so forth until all the variable X data are entered.
Now, press the right arrow key to select the second list (L2). Starting in the first row, press 1 5 ENTER ,
and so forth until all the variable Y data are entered.

When you finish entering the data, remember to check it. Use the same procedure as used
with single variable statistics to check the data and correct any errors (see p.30). After the data
are entered and checked, you need only to press one or two keys to calculate any one of a
variety of statistics. The most common ones used in this course are the variable means and
standard deviations, the sample slope and y-intercept, the correlation coefficient, and the
coefficient of determination.

Calculate the Sample Slope (b1)


Press Shift 2 for S-VAR, and then press REPLAY REPLAY 2 for B – the
sample slope – and lastly press = ; you should see 1.554054054 on the display.

Calculate the Sample Y-Intercept (b0)


Press Shift 2 for S-VAR, and then press REPLAY REPLAY 1 for A – the
sample y-intercept – and finally press = ; you should see 7.918918919 on the display.

Calculate the Correlation Coefficient (r)


Press Shift 2 for S-VAR, and then press REPLAY REPLAY 3 for r – the
correlation coefficient – and finally press = ; you should see 0.969851362 on the
display.

Calculate the Coefficient of Determination (r²)


Press Shift 2 for S-VAR, then press REPLAY REPLAY 3 for r, and now
press x² and = ; you should see 0.940611664 on the display.

Calculate a Predicted Value of Y


Suppose you wish to know the predicted value of Y when the value of X is 7. First, press
7 , Shift 2 for S-VAR, REPLAY REPLAY REPLAY 2 for ŷ , and finally = ; you should
see 7 ŷ and 18.7972974 on the display.

By default, the TI-83 displays only the sample slope and y-intercept when you construct a regression equation. If you
change the default setting, the TI-83 will also display the correlation coefficient and the coefficient of determination. To
change the default setting, press 2nd CATALOG , scroll down to select DiagnosticOn , and press ENTER twice.

To calculate the slope and y-intercept using the TI-83, press STAT , use the right arrow key to highlight CALC, and press 8 to select LinReg(a + bx). Now,
press 2nd L1 , 2nd L2 , (with the comma key after each list name), and then VARS , the right arrow key, and 1 to open the Y-
VARS Function menu. Press 1 to select Y1, and then press ENTER . You should see LinReg y = a + bx,
a = 7.918918919 [ y-intercept], b = 1.554054054 [ slope], r² = 0.940611664, and r = 0.969851362 on the display.

To find the predicted value of Y when X is 7, press VARS , the right arrow key, and 1 to open the Y-VARS Function menu. Now, press
1 to select Y1, and then press ( 7 ) ENTER . You should see Y1(7) and 18.7972974 on
the display.
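The regression quantities that the calculator reports can also be reproduced from the definitions. The sketch below is an illustration under stated assumptions, not the calculator's internal method: it uses the practice data above and the usual sums of squared deviations, with variable names (sxx, syy, sxy) chosen here for convenience.

```python
from math import sqrt

# Practice data from this section.
x = [5, 3, 8, 4, 6]
y = [15, 12, 20, 15, 18]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squared deviations and cross-products about the means.
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx              # sample slope
b0 = y_bar - b1 * x_bar     # sample y-intercept
r = sxy / sqrt(sxx * syy)   # correlation coefficient
r_squared = r ** 2          # coefficient of determination
y_hat = b0 + b1 * 7         # predicted Y when X = 7

print(f"b1 = {b1:.9f}, b0 = {b0:.9f}")
print(f"r = {r:.9f}, r^2 = {r_squared:.9f}")
print(f"predicted Y at X = 7: {y_hat:.6f}")
```

The results should match the slope, intercept, r, and r² shown on the calculator displays above; the last digit of the predicted value may differ slightly because of rounding.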

Practice Exercise

Using the following set of data, calculate b1, b0, r, r², and ŷ when x = 6.5.

X 6 3 8 4 9
Y 14 7 24 10 31

Answers: b1 = 3.846 b0 = -5.877 r = 0.977 r² = 0.955 when x = 6.5, ŷ = 19.123


Practice Exercises for MATH2208

Basic Skills for Math 2208

The purpose of these questions is to help you become more familiar with your calculator,
and to review some basic math concepts. It is especially important to know how to change
the Display Mode of your calculator. You may find it useful to read the previous section of
the lab manual, “Using Your Statistical Calculator”. If desired, try the practice exercises in
that section.

Complete the following calculations. Note that most of these calculations can be done
without having to write down partial or intermediate steps, by using the memory function
or brackets on your calculator (be sure to remember BEDMAS).

Always keep an appropriate number of decimal places in your final answer, at least three.
Do not round your answer until the final step; keep all decimal places during intermediate
calculations.

Question 1

a. (32.2 + 15.7 + 54.2 + 51.7) / 4 __________________

b. (16.2 – 31.5)² + (43.8 – 31.5)² + (29.6 – 31.5)² + (36.5 – 31.5)² _____________________

c. 46 * 2.3 + 26 * 22.3 + 12 * 0.47 __________________

d. 56  42  0.4 __________________

Question 2

Calculate the mean, ȳ , for the following data: 33.8, -27.4, 32.5, and 41.3.

ȳ = ___________

Question 3

a. Express the following as a percentage.

0.35 __________ 0.075 __________

b. Express the following as a decimal.

45% __________ 90% __________

c. Express the following as a decimal.

1.6 × 10⁻⁴ __________ 5.5 × 10⁻³ __________

d. Express the following in scientific notation.

0.00036 __________ 0.000085 __________

e. If μ = 6 and σ = 1.05, find the approximate value for the following expression.

(y − μ) / σ , when y = 3.21 _________________

f. If ȳ = 12, μ = 24, s = 21, and n = 49, find the exact value for the following:

(ȳ − μ) / (s / √n) _________________

| ȳ − μ | _________________

g. Express the following in symbols.

F is at least 18 _________________

p is less than 0.10 but greater than or equal to 0.05 _________________


Chapter 2: Displaying and Describing Categorical Data

The progress of male and female Ph.D. graduate students at a major university was the
subject of a study designed to see if sex differences existed. All students who had entered
Ph.D. programs in a given year were classified as to their sex and status six years later. The
following categories were used: completed the degree, still enrolled, and dropped out.

a. Complete the following table by finding the row totals, column totals and overall total
(bottom right-hand corner).

                              Degree Status

Sex            Completed   Still Enrolled   Dropped Out   Row Total

Male                 432              134           238
Female                98               33            98
Column Total

b. How many students have completed their degree? _______

c. What percentage of the students who completed their degree were women?

Calculation: ______ / ______=_________ Answer: ________ %

d. What percentage of male students were still enrolled?

Calculation: ____________=________ Answer: ________ %

e. What percentage of the students who dropped out of their program were men?

Calculation: ______ / ______=_________ Answer: ________ %

f. For students that are male, calculate the distribution of Degree Status. Then, for students
that are female, calculate the distribution of Degree Status [i.e., the conditional
distributions of Status given Sex].

                              Degree Status

Sex       Completed              Still Enrolled         Dropped Out

Male      432 / ____ = ____ %    134 / ____ = ____ %    238 / ____ = ____ %

Female     98 / ____ = ____ %     33 / ____ = ____ %     98 / ____ = ____ %
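The idea of a conditional distribution in part (f) is that each row is converted to percentages of its own row total. The sketch below illustrates that calculation in Python; the group names and counts are hypothetical and are deliberately not the Ph.D. data above.

```python
# Hypothetical counts for illustration only; not the Ph.D. data above.
counts = {
    "Group A": {"Yes": 30, "No": 20},
    "Group B": {"Yes": 10, "No": 40},
}

def conditional_distribution(row):
    """Convert one row of counts to percentages of that row's total."""
    total = sum(row.values())
    return {category: 100 * n / total for category, n in row.items()}

for group, row in counts.items():
    percents = conditional_distribution(row)
    summary = ", ".join(f"{cat}: {pct:.1f}%" for cat, pct in percents.items())
    print(f"{group} -> {summary}")
```

Each row's percentages add to 100%, which is a quick check that the row total, and not the overall total, was used as the denominator.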

g. i) For students that are male, draw a bar chart of the distribution of Degree Status. ii) For
students that are female, draw a bar chart of the distribution of Degree Status. Make sure
to use percentages and use the same scale for both charts to make your comparisons
easier. Remember to label your graphs clearly. Be sure to give each chart a title, include
variable names, and label both the horizontal and vertical axes.

i) ii)

h. Based on the bar charts above, address the question of sex differences in degree status.
Does status in the Ph.D. program appear to be related to sex (i.e., does the distribution of
degree status depend on sex)? Why or why not?
Chapter 3: Displaying and Summarizing Quantitative Data

Question 1

In anticipation of the 1984 Olympics, the L.A. Times (Aug. 15, 1983) reported ozone levels in
parts per million (ppm) for several sites to be used for Olympic events the following summer.
The ozone readings taken between July 28 and August 12 at one such site are given below.

Site 1: East L.A. College 10 14 13 18 12 22 14 19 22 03 14 16 13 06 07 19

a. Make a stem and leaf display for the above data, i) without ordered leaves, ii) with ordered
leaves. It is desirable to split the stems here; for example, values from 10 to 14 will be one
stem, 15 to 19 another. Be sure to label your display.

i) without ordered leaves, in the order that the data are given.

0 |
0 |
1 |
1 |
2 |
2 |

ii) with ordered leaves. Be sure to label your display.



b. Display the ozone levels of the site in a histogram with 7 bins. First, construct a frequency
table and then construct the histogram. The bins will be as follows: 3-6, 6-9, 9-12, 12-15,
15-18, 18-21, 21-24. As always, be sure to give your graph a title and label the axes
carefully. Create your histogram using percentages (relative frequencies).

Note: If a data point falls on a cut-point, place it in the upper bin. For example, 21 would
go in the 21-24 bin.

Class Count (Frequency) Percent (Relative Frequency)

3-6

6-9

9-12

12-15

15-18

18-21

21-24

Total
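The cut-point rule in the note above (a value on a boundary goes in the upper bin) can be expressed as "lower edge ≤ value < upper edge". The following Python sketch illustrates that rule and the conversion of counts to percentages. The readings are hypothetical values chosen to land on cut-points; they are not the East L.A. College data, and the function name bin_label is invented for this example.

```python
# Bin edges from part (b): 3-6, 6-9, ..., 21-24.
edges = [3, 6, 9, 12, 15, 18, 21, 24]

# Hypothetical readings chosen to land on cut-points; not the East L.A. data.
readings = [3, 6, 8, 12, 21, 23]

def bin_label(value, edges):
    """Return the bin 'lo-hi' with lo <= value < hi (cut-points go in the upper bin)."""
    for lo, hi in zip(edges, edges[1:]):
        if lo <= value < hi:
            return f"{lo}-{hi}"
    raise ValueError(f"{value} is outside the bins")

# Count the readings in each bin.
counts = {f"{lo}-{hi}": 0 for lo, hi in zip(edges, edges[1:])}
for value in readings:
    counts[bin_label(value, edges)] += 1

# Print count and relative frequency (percent) for each bin.
total = len(readings)
for label, count in counts.items():
    print(f"{label:>6}  {count:>2}  {100 * count / total:5.1f}%")
```

With these hypothetical readings, 6 is counted in the 6-9 bin and 21 in the 21-24 bin, as the note requires.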

c. Describe the shape of both the histogram from part (b) and the stem and leaf plot from
part (a).

Question 2

Dr. Einstein taught Intro Stats for the first time in the fall of 2009. One of the most challenging
aspects of his new job was creating tests that were neither too easy nor too hard. After the
first midterm was written, Dr. Einstein decided to compare the distribution of grades received
in his class to the distribution of grades received in a more senior faculty member’s class.
These data are displayed in the following two histograms.

a. How many of Dr. Einstein’s students received marks between 75 and 90?
Calculation: _______________ Answer: __________

b. What percentage of the senior faculty member’s class received grades of 85 or more?
Calculation: _______________ Answer: _________%

c. Which class had the larger proportion of grades below 80? ____________________

d. I estimate the median grade in the senior faculty member’s class to be between
_________________. Explain.

e. Compare the two distributions in terms of their symmetry, modality and outliers.

f. Based on the two histograms, does it appear that Dr. Einstein’s midterm was easier than
that of the senior faculty member? Explain.
Chapter 4: Understanding and Comparing Distributions

By this point in the course, you should be able to describe data that have been graphically
displayed, commenting on the shape, centre, and spread of a distribution. Also, you should be
able to describe data using numbers and get information about the shape, centre, and spread
of a distribution based on these numbers.

For this question, we are going to use the data about ozone readings that we used in the
practice exercise for stem plots. Remember, the ozone levels for East LA College and the
Coliseum were measured in parts per million (ppm) during the two-week period between July
28 and August 12, 1983. These data are given below.

Site 1: East LA College 10 14 13 18 12 22 14 19 22 3 14 16 13 6 7 19


Site 2: Coliseum 8 13 10 9 16 12 13 14 17 13 9 16 1 9 8 12

a. Find the five number summary for East LA, and the IQR for both East LA and the Coliseum.
The five number summary for the Coliseum has been provided.

East LA:
Minimum: _____ , Q1: ______ , Median: ______ , Q3: ______ , Maximum: ______

IQR = __________________ = ___________________ Answer:


(formula) (calculation)

Coliseum:
Minimum: ___1__ , Q1: __9__ , Median: __12__ , Q3: __13.5__ , Maximum: __17__

IQR = __________________ = ___________________ Answer:


(formula) (calculation)

b. Explain, using the outlier rule, why the value of 1 (from the Coliseum data) is an outlier.
Be sure to show all of your work.
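The five-number summary and the 1.5 × IQR outlier rule can be sketched in a few lines of Python. This is an illustration under an assumption: quartiles are taken as the medians of the lower and upper halves of the sorted data (with the median left out of both halves when n is odd). Textbooks and software use slightly different quartile conventions, so other tools may give slightly different values. The function name five_number_summary is invented for this example.

```python
import statistics

# Coliseum ozone readings (ppm) from the table above.
coliseum = [8, 13, 10, 9, 16, 12, 13, 14, 17, 13, 9, 16, 1, 9, 8, 12]

def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles as medians of the two halves."""
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return (s[0], statistics.median(lower), statistics.median(s),
            statistics.median(upper), s[-1])

minimum, q1, median, q3, maximum = five_number_summary(coliseum)
iqr = q3 - q1

# Outlier rule: flag values more than 1.5 IQRs beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in coliseum if v < lower_fence or v > upper_fence]

print("five-number summary:", minimum, q1, median, q3, maximum)
print("IQR:", iqr, "fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

Under this quartile convention, the code reproduces the Coliseum summary printed above (1, 9, 12, 13.5, 17).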

c. Use the values from part (a) to make side-by-side boxplots. The boxplot for East LA has
been given to you; there were no outliers at that site. Remember to give your graph an
appropriate title which includes the “Who” (individuals, which are not always people) and the
“What” (variable), and label the axis.

d. Compare the two distributions in terms of their centres, spread, and outliers.

e. Determine the mean for each site; the standard deviations have been provided.

Mean SD
Site 1: East LA 5.50
Site 2: Coliseum 3.99

f. For the first observation in the East LA data, 10, what is the deviation from the mean?

Dev = __________________ = ___________________ Answer:


(formula) (calculation)

g. Find the square of the deviation found in (f). ____________________________


*Remember the variance formula uses squared deviations. See page 65 of text.
Chapter 5: The Standard Deviation as a ruler and the Normal Model

Let us assume that head circumference for women follows a Normal model with a mean of 56 cm and a
standard deviation of 2 cm.

a. Label the density curve for this distribution with the value of the mean, and 3 standard
deviations above/below the mean.

b. Use the 68-95-99.7 rule to answer the following two questions.

i. Between what two values do the middle 68% of all head circumferences fall?

ii. What is the circumference of a woman’s head if only 2.5% of women have heads
that are larger than hers?

c. On the diagram in part (a), label the head circumference of 53 cm and shade the
corresponding area for the proportion of women who have a head circumference smaller
than 53 cm.

Use Table Z to answer the following questions.

d. About what proportion of women have a head circumference smaller than 53 cm?

e. About what proportion of the women have a head circumference greater than 59 cm?

f. About what proportion of women have a head circumference of more than 1.5 standard
deviations away from the mean?

g. About what proportion of women have a head circumference of within 1.5 standard
deviations of the mean?
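The 68-95-99.7 rule used in part (b) is a rounded statement about areas under the standard Normal curve, and Table Z tabulates those same areas. As a hedged illustration, the Python sketch below uses the standard library's NormalDist class to compute the area within 1, 2, and 3 standard deviations of the mean; it works with z-scores only and does not use the head circumference values.

```python
from statistics import NormalDist

# Standard Normal model: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {}
for k in (1, 2, 3):
    within[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} SD of the mean: {100 * within[k]:.2f}%")
```

The three areas round to 68%, 95%, and 99.7%. The same cdf call with any z-score gives the area to the left of that z-score, which is how Table Z is usually laid out.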
Chapters 6 - 8: Linear Regression

Question 1

Turn to page 258 in your textbook and read question 19. Use the Minitab output and plots
provided to answer the questions below.

[Figure: Scatterplot of Age (yr) vs Diameter (in.). Vertical axis: Age (yr), 0 to 40. Horizontal axis: Diameter (in.), 0 to 18.]

a. Based on the scatterplot, what can you say about the relationship between the Diameter
and Age of the trees?

b. Identify the tree with a diameter of 11 inches on the scatterplot.



Here is the Minitab output for this data:

Regression Analysis: Age (yr) versus Diameter (in)

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 2905.5 2905.55 93.44 0.000
Diameter (in) 1 2905.5 2905.55 93.44 0.000
Error 25 777.4 31.10
Lack-of-Fit 16 608.7 38.05 2.03 0.141
Pure Error 9 168.7 18.74
Total 26 3683.0

Model Summary

S R-sq R-sq(adj) R-sq(pred)


5.57643 78.89% 78.05% 76.31%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant -0.97 2.60 -0.37 0.711
Diameter (in) 2.206 0.228 9.67 0.000 1.00

Regression Equation

Age (yr) = -0.97 + 2.206 Diameter (in)

Fits and Diagnostics for All Observations

Obs Age (yr) Fit Resid Std Resid


1 4.00 3.00 1.00 0.20
2 5.00 3.00 2.00 0.39
3 8.00 3.88 4.12 0.80
4 8.00 8.73 -0.73 -0.14
5 8.00 13.58 -5.58 -1.03
6 10.00 8.73 1.27 0.24
7 10.00 16.01 -6.01 -1.10
8 12.00 22.85 -10.85 -1.98
9 13.00 16.01 -3.01 -0.55
10 14.00 11.16 2.84 0.53
11 16.00 20.86 -4.86 -0.89
12 18.00 21.30 -3.30 -0.60
13 20.00 25.71 -5.71 -1.05
14 22.00 27.26 -5.26 -0.97
15 23.00 21.74 1.26 0.23
16 25.00 30.56 -5.56 -1.03
17 28.00 28.14 -0.14 -0.03
18 29.00 20.86 8.14 1.49
19 30.00 28.14 1.86 0.34
20 30.00 32.99 -2.99 -0.56
21 33.00 37.84 -4.84 -0.93
22 34.00 30.56 3.44 0.64
23 35.00 32.99 2.01 0.38
24 38.00 23.29 14.71 2.69 R
25 38.00 32.99 5.01 0.94
26 40.00 35.42 4.58 0.87
27 42.00 35.42 6.58 1.24

R Large residual
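Several entries in the Model Summary can be traced back to the Analysis of Variance table. The Python sketch below is an illustration, not a description of Minitab's internals: it copies the sums of squares and degrees of freedom from the output above and applies the usual definitions of R-sq, R-sq(adj), and S. Small differences in the last digits are expected because the table values are rounded.

```python
from math import sqrt

# Values copied from the Analysis of Variance table above.
ss_regression = 2905.5
ss_error = 777.4
ss_total = 3683.0
df_error = 25
df_total = 26

# R-sq: fraction of the total variation accounted for by the regression.
r_sq = ss_regression / ss_total

# R-sq(adj): the same idea using mean squares instead of sums of squares.
r_sq_adj = 1 - (ss_error / df_error) / (ss_total / df_total)

# S: standard deviation of the residuals, the square root of the error mean square.
s = sqrt(ss_error / df_error)

print(f"R-sq      = {100 * r_sq:.2f}%")
print(f"R-sq(adj) = {100 * r_sq_adj:.2f}%")
print(f"S         = {s:.3f}")
```

These values agree with the Model Summary line (S = 5.57643, R-sq = 78.89%, R-sq(adj) = 78.05%) to the precision of the rounded inputs.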

[Figure: Fitted Line Plot, Age (yr) = -0.974 + 2.206 Diameter (in); S = 5.57643, R-Sq = 78.9%, R-Sq(adj) = 78.0%. Vertical axis: Age (yr), 0 to 40. Horizontal axis: Diameter (in), 0 to 18.]

c. Are the conditions for regression satisfied? Explain in the context of the question, using the
plots provided to help you when needed.

i. Quantitative Variable Condition:

ii. Straight Enough Condition:

iii. Outlier Condition:

iv. Does the Plot Thicken? Condition:

________________________________________________________________________

_______________________________________________________________________

d. State the least-squares regression line relating predicted Age and Diameter for the trees.

____________________________________________________________________

e. Use the regression equation to find the predicted age, ŷ, for a tree that is 11 inches in
diameter.

Diameter of tree: _______ Predicted age: ______________


(Don’t forget to state your units.)

f. Find and circle the predicted age for a tree that is 11 inches in diameter on the MINITAB
output (observation 24). Compare this value with that of the value computed in part (e).

Value from output:_____________

Comparison:________________________________________________________________
________________________________________________________________________
g. Calculate the residual for the tree that has a diameter of 11 inches by hand and verify it
(circle it) on the MINITAB output.

Residual = ______________ = _____________ Answer: ____________


(formula) (calculation)

Copy the Minitab value here: _________________________

h. Identify the tree that has a diameter of 11 inches and the predicted value for that
observation on the Fitted Line Plot. Also, indicate the residual on this plot.

i. The slope of the least-squares regression line is ____________.

j. Interpret the value of the slope in the context of the problem.

__________________________________________________________________________

__________________________________________________________________________

_________________________________________________________________

k. The value of the coefficient of determination (r² or R-sq) is ____________.



l. Interpret this value in the context of the problem.

__________________________________________________________________________

__________________________________________________________________________

m. Calculate the value of the correlation coefficient (r) between diameter and age.

r = ____________

n. Interpret this value in the context of the problem.

__________________________________________________________________________

__________________________________________________________________________

_________________________________________________________________

o. Based on your answers to (l) and (n), do you think Diameter is a good predictor of Age?

__________________________________________________________________________

__________________________________________________________________________

p. Is your answer to part (o) consistent with your answer in part (a)? Explain.

__________________________________________________________________________

________________________________________________________________________

q. Would you be justified in using the regression equation to predict the Age if the diameter
of the tree was 20 inches? Explain.

r. Circle the residual for a tree with a diameter of 11 inches on the residual plot above.

s. What does this residual plot indicate about the appropriateness of the linear model fit to
the data?

Question 2

Turn to page 241 of your textbook, and answer question 12 for all but the first graph.

Top right-hand graph:

(a)

(b)

(c)

(d)

Bottom left graph:

(a)

(b)

(c)

(d)

Bottom right-hand graph:

(a)

(b)

(c)

(d)

Chapter 10: Sample Surveys

Question 1

Turn to page 297 of your textbook and read question 11. Copy the question to the space
below, and answer the textbook question.

a. Population:

b. Parameter of interest:

c. Sampling Frame:

d. Sample:

e. Sampling Method:

f. Any bias or other problems:



Chapter 11: Experiments & Observational Studies

A university researcher wants to determine how concerned faculty members are about campus
security and whether faculty who teach in the evenings are more concerned about it than
those who only teach during the day. Of the more than 800 faculty members employed at the
university, 300 teach at least one evening course. The researcher plans to distribute a survey
to 160 faculty members: 60 of whom teach at least one evening class and 100 of whom only
teach day classes. One of the questions was “How comfortable do you feel about your
personal safety on campus?”, where the respondents must rank their comfort level on a scale
from 1 (not at all comfortable) to 10 (very comfortable).

a. Is this study observational, experimental, or a survey? Explain.

b. The sample is

c. Identify the “W’s” for the sample where applicable. (See page 8 of your text)

Who (the individuals being measured):

What (variable and units):

When:

Where:

Why:

How (was the variable measured and recorded):

d. The populations are

e. The experimental/observational units are

f. List some possible confounding or lurking variables.



Chapter 12: From Randomness to Probability

Question 1

Turn to page 349 of your textbook and read question 1. Copy the question to the space below,
and answer the textbook question.

a. Question:

Answer:

b. Question:

Answer:

c. Question:

Answer:

d. Question:

Answer:

Question 2

Turn to page 373 of your textbook and read question 35. Copy the question to the space
below, and answer the textbook question.

Question:

Answer:

Chapter 14: Random Variables

Question 1

Turn to page 410 of your text and read question 75. Answer questions a - c from the book; a
space is provided to copy the question if you wish.

a. What is the expected difference between the larger and smaller bowls?

b. What is the standard deviation of that difference?

c. If the differences can be described by a Normal model (distribution), what’s the probability
that the small bowl of cereal contains more cereal than the large one?

Chapter 15: Sampling Distribution Models

Question 1

Turn to page 444 of your text book and answer question 21.
If you wish, copy the question into this space.

____________________________________________________________________________

____________________________________________________________________________

____________________________________________________________________________

ANSWER (Hint: Write the question as a DECIMAL):

a. Mean: ___________________________________________________

Standard Deviation: ________________________________________

b. Assumptions

c. _______________________________________________________________________

_______________________________________________________________________

*Hint: Write the question as a DECIMAL.



Chapter 16: Confidence Intervals for Proportions

Question 1

Turn to page 473 in your text. Copy the STORY (not the questions) from question 21 in this
space, and answer the following questions.

____________________________________________________________________________

____________________________________________________________________________

____________________________________________________________________________

____________________________________________________________________________

____________________________________________________________________________

a. Confirm that the assumptions are satisfied.

Independence Assumption:

Randomization Condition:

Sample Size Assumption:

Success/Failure Condition:

b. Calculate a 95% confidence interval for the proportion of all auto accidents that involve
teenage drivers.

Consider the MINITAB output below.


Test and CI for One Proportion

Sample X N Sample p 95% CI


1 91 582 0.156357 (0.126850, 0.185864)

Using the normal approximation.
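The interval in this output follows the one-proportion formula p̂ ± z* × sqrt(p̂(1 − p̂)/n). The Python sketch below reproduces that calculation from the X and N values shown in the output. It is a cross-check under the normal approximation, not Minitab's own code, and the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

# Counts copied from the Minitab output above.
x, n = 91, 582
confidence = 0.95

p_hat = x / n                                            # sample proportion
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value, about 1.96
se = sqrt(p_hat * (1 - p_hat) / n)                       # standard error of p-hat
margin = z_star * se                                     # margin of error

lower = p_hat - margin
upper = p_hat + margin

print(f"sample p = {p_hat:.6f}")
print(f"{confidence:.0%} CI = ({lower:.6f}, {upper:.6f})")
```

The endpoints should agree with the 95% CI in the Minitab output to about six decimal places.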

c. Compare your confidence interval with that from the MINITAB output.

__________________________________________________________________________

__________________________________________________________________________

d. Interpret the confidence interval in the context of the question.

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

e. Explain what “95% confidence” means.

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

f. Copy part (d) from the book and answer.

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

Answer: __________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

Question 2

Turn to page 474 in your text. Read question 33 and answer the following questions.

a. Answer part (a) from the textbook.

n = __________________

b. In general, what would happen to a confidence interval if the sample size increases?

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

c. In general, what would happen to a confidence interval if the confidence level decreases?

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

Chapter 17: Testing Hypotheses About Proportions

A bank manager wanted to increase the amount of money deposited at his bank. One way he
felt that he could increase business was to target clients of his bank who also had investments
in other financial institutions. In 1990, 30% of the bank's customers had investments in other
financial institutions. How does this compare to now? His assistant manager thinks that the
proportion has decreased over time. To see if there is any evidence that the assistant manager is correct, he randomly surveyed
200 customers and found that 44 had investments in other financial institutions.

a. State the hypotheses, defining any parameters used.

__________________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

b. State the assumption(s) that are required to carry out the hypothesis test, and show that
they have been met in context. Refer to the appropriate conditions in your answer.

__________________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

_________________________________________________________________________

c. Calculate the appropriate test statistic. Show the formula and the values substituted in.

ANS: ______________

Test and CI for One Proportion


Test of p = 0.3 vs p < 0.3

Sample   X    N   Sample p   95% Upper Bound   Z-Value   P-Value
1       44  200   0.220000          0.268180     -2.47     0.007

Using the normal approximation.
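The Z-Value and P-Value in this output come from the one-proportion z-statistic, z = (p̂ − p0) / sqrt(p0(1 − p0)/n), with the standard error based on the hypothesized value p0. The Python sketch below reproduces both numbers from the counts in the output. It is a cross-check under the normal approximation, and the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

# Values copied from the Minitab output above: test of p = 0.3 vs p < 0.3.
p0 = 0.30
x, n = 44, 200

p_hat = x / n                      # sample proportion
se0 = sqrt(p0 * (1 - p0) / n)      # standard error under the null hypothesis
z = (p_hat - p0) / se0             # test statistic

# Lower-tail alternative (p < 0.3), so the P-value is the area to the left of z.
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}")
print(f"P-value = {p_value:.3f}")
```

For an upper-tail alternative the P-value would instead be 1 minus the cdf, which is the only change needed to adapt the sketch.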

d. Compare your answer with that found on the MINITAB output.

e. Find the resulting P-value from Table Z.

f. Briefly assess the strength of the evidence.

Strength:

Significance: Not applicable; no significance level is given!

g. State your conclusion in the context of the problem.

h. Based on the assumptions, do you have any reservations about your conclusion in (g)?
Explain.

Chapter 18: More About Tests

Question 1

Turn to page 526 of your text book and answer question 3. (A space is provided for you to
copy the question if you wish.)

ANSWER:

Question 2

Turn to page 526 of your text and answer question 5. (A space is provided for you to copy the
question if you wish.)

ANSWER:

Question 3

Turn to page 526 of your text and read the story for question 7. (A space is provided for you
to copy the question if you wish.)

a. Calculate a 95% confidence interval for the population proportion.

b. Answer part (b) from the question in the text book.

_________________________________________________________________________

_________________________________________________________________________

ANSWER: _________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

Question 4

Turn to page 528 of your text book and answer question 23, parts a, b, and c, only. Be sure to
explain parts b) and c) in CONTEXT! (A space is provided for you to copy the question if you
wish.)
____________________________________________________________________________

____________________________________________________________________________

____________________________________________________________________________

____________________________________________________________________________

____________________________________________________________________________

a. _________________________________________________________________________

_________________________________________________________________________

ANSWER: _________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

b. _________________________________________________________________________

_________________________________________________________________________

ANSWER: _________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

c. _________________________________________________________________________

_________________________________________________________________________

ANSWER: _________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

Question 5

A supermarket manager has received a number of complaints concerning the price scanner at
the speedy checkout. Apparently, it has been overcharging customers. The manager is
understandably concerned. He plans to do a hypothesis test. If there is evidence at the 1%
level that more than 3% of speedy checkout items are being overcharged, he will consider
replacing the scanner.

a. State the hypotheses, defining any parameters used.

He randomly selects 360 items at the speedy checkout and records how many items are
overcharged. The relevant Minitab Output is shown below.

Test and CI for One Proportion

Test of p = 0.03 vs p > 0.03

95% Lower
Sample X N Sample p Bound Z-Value P-Value
1 12 360 0.033333 0.017772 0.37 0.355

Using the normal approximation.

b. State the test statistic from the output. z = _________________________

c. Show the formula for z with the values substituted in. You do not need to complete the
calculation. (Although, as practice for the exam, you could complete it and check your answer
against the MINITAB result.)

d. State the p-value from the output and shade the appropriate area on the diagram to
illustrate the P-value.

P-value = ____________________

e. Briefly assess the strength and significance of the evidence.

i) Significance: Is your p-value less than or equal to the specified significance level?
Explain.

So, are the results of the test significant at the 1% level? Yes / No (circle one)

Decide if the following are correct. Indicate with true or false.

______ Conclude Ho and reject Ha.

______ Do not reject Ho and do not conclude Ha.

______ Reject Ho and conclude Ha.

ii) Strength: Which of the following best explains the strength of the P-value?

_____ very small P-value; strong evidence to support Ha

_____ small P-value; good evidence to support Ha

_____ moderately small P-value; weak evidence to support Ha

_____ large P-value; virtually no evidence to support Ha



f. State your conclusion in the context of the problem.

g. Considering the assumptions, do you have any reservations about your conclusion in (f)?
Explain.

h. Which type of error could you have committed? (Type I or II) Explain in context.
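
If you complete the calculation in part (c), a short script is one way to check your work against the output. The following is a minimal sketch, assuming the usual one-proportion z-test with the standard error computed under Ho; the function name `one_proportion_z_test` is illustrative only and is not part of Minitab.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(x, n, p0):
    """Upper-tail z-test of Ho: p = p0 against Ha: p > p0."""
    p_hat = x / n
    se_null = sqrt(p0 * (1 - p0) / n)   # standard deviation of p-hat if Ho is true
    z = (p_hat - p0) / se_null
    p_value = 1 - NormalDist().cdf(z)   # area to the right of z
    return p_hat, z, p_value


# Values from the scanner problem: 12 overcharged items out of 360, Ho: p = 0.03
p_hat, z, p_value = one_proportion_z_test(x=12, n=360, p0=0.03)
print(f"p-hat = {p_hat:.6f}, z = {z:.2f}, P-value = {p_value:.3f}")

# Expected counts under Ho, often compared with 10 in a success/failure check
print(f"n*p0 = {360 * 0.03:.1f}, n*(1 - p0) = {360 * 0.97:.1f}")
```

The rounded results should agree with the Z-Value and P-Value printed in the output above.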

Chapter 19: Comparing Two Proportions

When building or renovating a home, there are many decisions that have to be made. One
very important decision is the choice of roofing material. Here in Nova Scotia, most
homeowners use Asphalt roofing shingles; however, some use Aluminum roofing. A local
roofing contractor was interested in whether there was a difference in the quality of the two
roofing materials with regard to leaks. This local roofing contractor has a huge project coming
up. If there is evidence at the 5% significance level of a difference in quality between the two
materials, the contractor will choose the better material to use in their next project.

From a random sample of 200 houses roofed with Asphalt shingles, 36 experienced leaks
within the first ten years. From a random sample of 125 houses roofed with Aluminum roofing,
19 experienced leaks within the first ten years.

Is there sufficient evidence at the 5% level of significance to conclude that there is a
difference in the proportion of all Asphalt and all Aluminum roofs that experience leaks?
Use the Minitab output below to help you answer the questions.

Test and CI for Two Proportions


Sample X N Sample p
1 36 200 0.180000
2 19 125 0.152000

Difference = p (1) - p (2)


Estimate for difference: 0.028
95% CI for difference: (-0.0544390, 0.110439)
Test for difference = 0 (vs not = 0): Z = 0.67 P-Value = 0.506

a. State the null and alternative hypotheses, defining any parameters used.

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

b. State the assumption(s) needed to carry out the hypothesis test, and show that they have
been met in the context of the question.

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

c. What is the value of the test statistic and the p-value? Give the formula for the test
statistic and show how the appropriate values are substituted in. (You do not have to
actually calculate this.)

Test Statistic (Z)=_________

P-Value = ______________

d. Indicate the p-value on the diagram below, and find the p-value to 4 decimal places
using Table Z.

P-value = ___________

e. Briefly assess the strength and significance of the evidence.

Strength: _______________________________________________________________

_______________________________________________________________________

Significance: ____________________________________________________________

_______________________________________________________________________

f. State your conclusion in the context of the problem.

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

g. What is the 95% confidence interval for the difference in the proportion of all Asphalt and
all Aluminum roofs that experience leaks? Give the formula for the confidence interval and
show how the appropriate values are substituted in. (You do not have to actually calculate
anything here.)

Confidence interval: ________________________



h. State the interval in the form of an estimate with a margin of error.

______________________________________________________________________

i. Interpret your confidence interval in the context of the problem.

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

______________________________________________________________________

j. A 95% confidence interval will generally give results consistent with (circle one)

(a) A 2 sided test at the 10% significance level

(b) A 1 sided test at the 5% significance level

(c) A 2 sided test at the 5% significance level

k. Show how this applies to our test and CI: Based on your confidence interval, would you say
that there is a difference in the quality of the two roofing materials with regard to leaks?
Why or why not? Does this agree with your conclusion from the hypothesis test?

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________
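
The test statistic and confidence interval for the roofing problem can be checked with a few lines of code. This is a sketch under stated assumptions, not a description of how Minitab works internally: it computes the z statistic with both an unpooled and a pooled standard error so you can see which one the printed Z is consistent with. All variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 36, 200   # Asphalt: roofs with leaks, roofs sampled
x2, n2 = 19, 125   # Aluminum: roofs with leaks, roofs sampled

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Unpooled standard error: each sample proportion is used separately
se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Pooled standard error: one common proportion, as Ho (p1 = p2) assumes
p_pooled = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

z_unpooled = diff / se_unpooled
z_pooled = diff / se_pooled
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z_unpooled)))

# 95% confidence interval for p1 - p2 (always uses the unpooled standard error)
z_star = NormalDist().inv_cdf(0.975)
ci = (diff - z_star * se_unpooled, diff + z_star * se_unpooled)

print(f"z (unpooled) = {z_unpooled:.2f}, z (pooled) = {z_pooled:.3f}")
print(f"two-sided P-value = {p_two_sided:.3f}")
print(f"95% CI = ({ci[0]:.6f}, {ci[1]:.6f})")
```

With these data the unpooled version appears to reproduce the printed Z = 0.67 and the printed interval, while the pooled version gives a slightly smaller z of about 0.65. The conclusion at the 5% level is the same either way.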

Practice Exercises for MATH2209

Chapter 15: Central Limit Theorem

Statistics have many everyday applications. For example, bus companies record information
about the number of passengers per run on various bus routes. This information is used to
help determine if particular bus routes require more frequent service. A bus company claims
that the average number of passengers on the University Route is 37.5, with a standard
deviation of 18.1; assume for now this claim is correct.

Use Table Z to answer the following questions.

a. Approximately, what is the probability of observing a sample of 25 runs where the mean
number of passengers exceeds 42?

b. What assumptions must you make in order to answer question (a)? Explain in the context
of the question.

__________________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

c. Approximately, what is the probability of observing a sample of 75 runs where the mean
number of passengers exceeds 42?

d. What assumptions must you make in order to answer question (c)? Explain in the context
of the question.

__________________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

e. Approximately, what is the probability that the number of passengers will exceed 42 on a
particular run?

f. What assumptions are needed to answer part (e)?
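
The three probabilities in parts (a), (c) and (e) differ only in the standard deviation used. The sketch below, assuming the bus company's claimed mean and standard deviation are correct, shows this with Python's standard library; the function name is illustrative. Because Table Z uses z rounded to two decimals, table answers may differ slightly in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 37.5, 18.1   # claimed mean and standard deviation of passengers per run
threshold = 42


def prob_sample_mean_exceeds(n):
    """P(sample mean of n runs > threshold) under the Normal model for means."""
    se = sigma / sqrt(n)                       # standard deviation of the sample mean
    return 1 - NormalDist(mu, se).cdf(threshold)


p_25 = prob_sample_mean_exceeds(25)            # part (a)
p_75 = prob_sample_mean_exceeds(75)            # part (c)

# Part (e): a single run. This step is only meaningful if the number of
# passengers on individual runs is itself roughly Normal, which is a much
# stronger assumption than the ones needed for parts (a) and (c).
p_single = 1 - NormalDist(mu, sigma).cdf(threshold)

print(f"n = 25: {p_25:.4f}")
print(f"n = 75: {p_75:.4f}")
print(f"single run: {p_single:.4f}")
```

Notice how the probability shrinks as the sample size grows, since the sample mean varies less than individual runs do.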



Chapter 20: Inferences About Means

Catching undersized scallops is a serious offence under the Fisheries Act. This act states that
the mean number of scallops per 500g must not exceed 39. Fisheries officers have suspected
that Captain Hook might be catching undersized scallops. The Crown must prove beyond
reasonable doubt that Captain Hook is guilty before they can formally charge him. As such,
they employ a stringent 1% level of significance as their criterion.

They take random samples of 500g each of scallops from his fishing vessel on 20 randomly
selected occasions, for which the mean number of scallops was 40.95, with a standard
deviation of 8.34.

Use the Minitab output to help answer the following questions. (Note: there are two outputs;
one is two-sided, the other is one-sided. Be careful which output you use to help you answer
each question.)

One-Sample T

Test of mu = 39 vs > 39

99% Lower
N Mean StDev SE Mean Bound T P
20 40.95 8.34 1.86 36.21 1.05 0.154

One-Sample T

Test of mu = 39 vs not = 39

N Mean StDev SE Mean 99% CI T P
20 40.95 8.34 1.86 (35.61, 46.29) 1.05 0.309

a. State the hypotheses to be tested, defining any parameters used. Assume that the
assumptions are met and the conditions hold.

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

b. State the test statistic and P-value from the output, and show how to find the P-value from
Table T.

Output: t = ________________________ P-value ________________________

Table T: df = __________ Value(s) found on Table: __________ p-value: __________



c. Briefly assess the strength and significance of the evidence.

i. Strength:

ii. Significance:

d. State your conclusion in the context of the problem.

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

e. State the 99% confidence interval and show how it is calculated. Give the formula and
show how the appropriate values are substituted in.

Confidence interval _________________________

Formula:

Values filled in:

f. Interpret the 99% confidence interval in the context of the question.

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________

g. Based on the confidence interval, do you think that Captain Hook’s catches differ, on
average, from 39 scallops per 500g? Why or why not?

________________________________________________________________________

________________________________________________________________________

________________________________________________________________________
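
The standard error, t statistic and 99% confidence interval for the scallop problem can be reproduced from the summary statistics. This is a minimal sketch; the critical value `t_star = 2.861` is an assumption taken from a standard t-table (df = 19, 99% confidence), so check it against your own Table T.

```python
from math import sqrt

n, xbar, s = 20, 40.95, 8.34   # sample size, sample mean, sample standard deviation
mu0 = 39                       # hypothesized mean from the Fisheries Act limit

se = s / sqrt(n)               # standard error of the mean
t = (xbar - mu0) / se          # one-sample t statistic
df = n - 1

t_star = 2.861                 # assumed Table T value: df = 19, 99% confidence
ci = (xbar - t_star * se, xbar + t_star * se)

print(f"SE = {se:.2f}, t = {t:.2f}, df = {df}")
print(f"99% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The rounded values should agree with the SE Mean, T and 99% CI columns of the two-sided output. The P-value itself still has to come from Table T or from the output.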

Chapter 21: Comparing Means

Recently, more and more university courses have been offered through distance education. In
keeping with this trend, educational researchers have started to focus on the efficacy of
various video presentation formats. One group of researchers examined which of two
presentation formats was more effective for an introductory calculus course. They randomly
assigned 55 students to Presentation Method 1 and 54 students to Presentation Method 2.
Both presentation methods covered the same information; however, the style in which the
information was presented differed. One week later, all 109 students wrote a short quiz
covering the material presented in the videos. The means and standard deviations for the
grades are given below.

Presentation Method    Sample Size    Mean    Std. Dev.
Method 1               55             81.7    9.1
Method 2               54             75.6    18.6

a. Based on the data, would it be reasonable to pool the standard deviations? Why or why
not? What assumption is required to justify pooling?

Based on your answer to part (a), perform the appropriate test to determine whether there is
sufficient evidence at the 5% significance level that Method 1 is more effective than Method 2.
(Use the Minitab output on the next page to help you.)

b. State the appropriate hypotheses, defining any parameters used.



Minitab Output:

Two-Sample T-Test and CI

Sample N Mean StDev SE Mean


1 55 81.70 9.10 1.2
2 54 75.6 18.6 2.5

Difference = mu (1) - mu (2)


Estimate for difference: 6.10
95% lower bound for difference: 1.42
T-Test of difference = 0 (vs >): T-Value = 2.17 P-Value = 0.017 DF = 76

c. What is the value of the test statistic? Give the formula and show how the appropriate
values are substituted in. (You do not actually have to calculate this.) State the degrees of
freedom from the output.

The test statistic: ____________ Degrees of Freedom: ________________


d. Find and report the p-value from Table T: _______________________________

Find and report the p-value from the Minitab output: p-value = ______________

e. Briefly assess the strength and significance of the evidence.

Strength:

Significance:

f. State your conclusion in the context of the problem.



Minitab Output with a confidence interval:

Two-Sample T-Test and CI

Sample N Mean StDev SE Mean


1 55 81.70 9.10 1.2
2 54 75.6 18.6 2.5

Difference = mu (1) - mu (2)


Estimate for difference: 6.10
95% CI for difference: (0.50, 11.70)
T-Test of difference = 0 (vs not =): T-Value = 2.17 P-Value = 0.033 DF = 76

g. State the 95% confidence interval for the difference between Method 1 and Method 2. Give
the formula for the confidence interval and show how the appropriate values are
substituted in. (You do not have to actually calculate this.)

Confidence Interval: _______________________

Formula:

Values filled in:

h. Interpret the interval in the context of the problem.

i. Based on the confidence interval, do you think that the difference between the two
methods is of practical significance? Explain.
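
When the standard deviations are not pooled, the test statistic and degrees of freedom can be checked as follows. This is a sketch assuming the usual unpooled (Welch) two-sample t procedure; the variable names are illustrative.

```python
from math import sqrt

n1, mean1, s1 = 55, 81.7, 9.1    # Method 1
n2, mean2, s2 = 54, 75.6, 18.6   # Method 2

v1, v2 = s1**2 / n1, s2**2 / n2  # squared standard errors of each sample mean
se = sqrt(v1 + v2)               # standard error of the difference in means
t = (mean1 - mean2) / se

# Approximate (Welch-Satterthwaite) degrees of freedom; usually not a whole number
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(f"SE = {se:.3f}, t = {t:.2f}, df = {df:.2f}")
```

The formula gives a df of about 76.7, and the output reports DF = 76, which suggests the fractional part is dropped rather than rounded. Note also that one standard deviation is roughly twice the other, which is relevant to part (a).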

Chapter 22: Paired Samples and Blocks

The Bouncing Baby Company manufactures a variety of baby formulas. Recently, the
company’s scientists developed a new formula specifically for low birth weight babies. The new
formula was designed to facilitate faster weight gain than their standard formula. Before
releasing the new formula on the market, a research team tested the formula to determine
whether it would promote faster weight gain in low birth weight babies. Twelve low birth
weight infants, paired on the basis of birth weight, were used to compare the new formula
with the standard formula. Weight gains (grams) for the infants are provided in the table
below.

Infant Pair         1       2       3       4       5       6
New Formula         3604    3050    3344    3758    3361    3507
Standard Formula    3140    3100    2832    3458    3374    2930
New – Standard      464     -50     512     ____    ____    ____

Determine if there is evidence at a 5% level of significance that, on average, the new formula
is better than the standard formula. Use the Minitab output to help you answer the following
questions.

Paired T-Test and CI: New, Old

Paired T for New - Old

N Mean StDev SE Mean


New 6 3437 245 100
Old 6 3139 243 99
Difference 6 298 272 111

95% lower bound for mean difference: 75


T-Test of mean difference = 0 (vs > 0): T-Value = 2.69 P-Value = 0.022

a. Complete the table above by calculating the remaining differences (New - Standard)

b. State your hypotheses, defining any parameters used.

_____________________________________________________________________

_____________________________________________________________________

_____________________________________________________________________

c. What is the value of the test statistic?


t = __________

d. What is the value of the degrees of freedom? DF = __________



e. i. Find and report the p-value from Table T:

Value(s) from Table= ___________

Two-tailed or one-tailed? ___________ p-value = ______________

ii. Circle and report the p-value from the Minitab output: p-value = ______________

f. Briefly assess the strength and significance of the evidence.

Strength: _______________________________________________________________

_______________________________________________________________________

Significance: ____________________________________________________________

_______________________________________________________________________

g. Express your conclusion in terms of the problem.

______________________________________________________________________

_______________________________________________________________________

_______________________________________________________________________

Minitab Output for Confidence Interval:


Paired T for New - Standard

N Mean StDev SE Mean


New 6 3437 245 100
Standard 6 3139 243 99
Difference 6 298 272 111

95% CI for mean difference: (13, 583)


T-Test of mean difference = 0 (vs not = 0): T-Value = 2.69 P-Value = 0.043

h. State the 95% confidence interval for the mean difference in weight gain. Give the formula
for the confidence interval and show how the appropriate values are substituted in. (You
do not have to actually calculate this.)

Confidence Interval ____________________________



i. Interpret the confidence interval in context.

Suppose you carried out your test by finding the difference between weight gains by
subtracting New from Standard instead of subtracting Standard from New.

j. Complete the table below by finding the remaining differences (Standard – New).

Infant Pair         1       2       3       4       5       6
New Formula         3604    3050    3344    3758    3361    3507
Standard Formula    3140    3100    2832    3458    3374    2930
Standard – New      -464    50      -512    ____    ____    ____

The Minitab output would look like this.


Paired T-Test and CI: Old, New

Paired T for Old - New

N Mean StDev SE Mean


Old 6 3139 243 99
New 6 3437 245 100
Difference 6 -298 272 111

95% upper bound for mean difference: -75


T-Test of mean difference = 0 (vs < 0): T-Value = -2.69 P-Value = 0.022

k. What do you notice? What is the same? What is different?

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________
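
A paired t-test is simply a one-sample t-test on the differences, which the following sketch makes explicit using the weight gains from the table. It also reverses the order of subtraction, as in part (j), so you can compare the two sets of results. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

new      = [3604, 3050, 3344, 3758, 3361, 3507]
standard = [3140, 3100, 2832, 3458, 3374, 2930]

# New - Standard for each infant pair
diffs = [a - b for a, b in zip(new, standard)]
n = len(diffs)
d_bar = mean(diffs)          # mean difference
s_d = stdev(diffs)           # sample standard deviation of the differences
se = s_d / sqrt(n)
t = d_bar / se
df = n - 1

# Reversing the subtraction (Standard - New) flips the sign of every difference
reversed_diffs = [-d for d in diffs]
t_reversed = mean(reversed_diffs) / (stdev(reversed_diffs) / sqrt(n))

print(diffs)
print(f"mean = {d_bar:.0f}, sd = {s_d:.0f}, SE = {se:.0f}, t = {t:.2f}, df = {df}")
print(f"t with the order reversed = {t_reversed:.2f}")
```

The standard deviation and standard error are unchanged by the reversal; only the signs of the mean difference and of t change.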

Chapter 23: Inference for Two-way Tables

Taking good notes is necessary for successful academic performance. Interestingly, the ability
to make useful notes begins to develop early in childhood. Dr. Michelle Eskritt, a professor in
the Department of Psychology here at the Mount, has conducted a number of studies
examining both children’s and adults’ use of notations as memory aids. In one such study¹,
children in Kindergarten and Grade 1 were asked to make notes that would help them to
remember the colour and shape of cards that a customer had ordered. Useful notes required
information on both the colour and shape of the cards; these were referred to as Full
Notations. Notes that included only one of the two types of information were classified as
Partial Notations, and notes that contained no information about the cards (e.g., the child drew
a picture of a dog) were classified as Non-mnemonic Notations. Data from a sample of 47
children are in the table below.

Notations
Grade in School Full Partial Non-mnemonic Row Total
Kindergarten 2 10 11 23
Grade 1 16 4 4 24
Column Total 18 14 15 47

a. Is this an observational study or an experiment? Explain in the context of the question.

b. In the table above, circle the number representing Grade 1 students who took partial
notations.

c. For the students in Grade 1, calculate the distribution of Notation Type.

Notations
Grade in School    Full               Partial            Non-mnemonic       Total
Kindergarten       2/23 = 8.70%       10/23 = 43.48%     11/23 = 47.83%     100%
Grade 1            16/___ = ____%     4/___ = ____%      4/___ = ____%

¹ Eskritt, M., & Olson, D. R. (2001). A comparison of young children’s production and evaluation of notations
and verbal messages. Poster presented at the Biennial Meeting of the Society for Research in Child Development,
Minneapolis.

d. For students in Kindergarten, the distribution of Notation type has been provided, along
with its graph. For students in Grade 1, graph the distribution of Notation type (i.e. for
those individuals who are in Grade 1). Remember to label your graph clearly, and give it a
title.

e. Based on the graphs in part (d), does the distribution of notation depend on the student’s
grade for the individuals in this sample (i.e. are Grade and Notation type associated)?
Explain.

Using a 5% level of significance, do the data provide evidence that the distributions of
Notation type differ between all students in Kindergarten and all students in Grade 1 (i.e. does
the distribution of Notation type depend on the Grade)? Use the Minitab output to help you.

f. Is this a test of homogeneity or a test of independence? Explain in the context of the
question.

Chi-Square Test: Full, Partial, Non-mnemonic

Expected counts are printed below observed counts


Chi-Square contributions are printed below expected counts

Full Partial Non-mnemonic Total


1 2 10 11 23
8.81 6.85 7.34
5.263 1.447 1.824

2 16 4 4 24
9.19 7.15 7.66
5.043 1.387 1.748

Total 18 14 15 47

Chi-Sq = 16.713, DF = 2, P-Value = 0.000

g. On the Minitab output, circle the number representing Grade 1 students who took partial
notations.

h. State the appropriate hypotheses in plain English. Remember to define the population.

i. For those Grade 1 students who took partial notations, state the expected value and the
chi-squared contribution (from the output) and show how each is calculated.

Expected Value : _____________ χ² contribution : ___________________

Expected Value =

χ² contribution =

j. What is the value of the test statistic? Show how it is calculated from the contributions.

χ² = ______________________________________________________________

k. Calculate the degrees of freedom and report the P-value from the table in your text AND
from the Minitab output.

Degrees of Freedom: ______________________________________________

Value(s) from table: __________________________

P-value from the table: _____________________

P-value from the output: ______________________

l. Briefly assess the strength and significance of the evidence.

Strength:

Significance:

m. State your conclusion in the context of the problem.
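
The expected counts, chi-square contributions and test statistic in the output can all be rebuilt from the observed table. The sketch below assumes the standard chi-square calculation (expected = row total × column total ÷ grand total). The P-value line relies on a special case: with exactly 2 degrees of freedom, the upper-tail area of the chi-square distribution equals exp(−x/2). For other degrees of freedom you would need a table or statistical software.

```python
from math import exp

observed = [
    [2, 10, 11],   # Kindergarten: Full, Partial, Non-mnemonic
    [16, 4, 4],    # Grade 1:      Full, Partial, Non-mnemonic
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = (row total * column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Contribution of each cell = (observed - expected)^2 / expected
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

chi_sq = sum(sum(row) for row in contributions)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Valid only because df = 2 here
p_value = exp(-chi_sq / 2)

print(f"Grade 1, Partial: expected = {expected[1][1]:.2f}, "
      f"contribution = {contributions[1][1]:.3f}")
print(f"chi-square = {chi_sq:.3f}, df = {df}, P-value = {p_value:.6f}")
```

The P-value is far below 0.0005, which is why the output displays it as 0.000.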



Chapter 24: Inference for Regression

Question 1

A statistics professor wished to test whether midterm grades were predictive of final exam
grades. She randomly selected ten students from an introductory statistics course to see if
their midterm grade was predictive of the grade they received on their final. Below is the
MINITAB printout of her analysis.

Regression Analysis: Final versus Midterm

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 3463.51 3463.51 333.86 0.000
Midterm 1 3463.51 3463.51 333.86 0.000
Error 8 82.99 10.37
Total 9 3546.50

Model Summary

S R-sq R-sq(adj) R-sq(pred)


3.22090 97.66% 97.37% 96.95%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant -22.12 4.85 -4.56 0.002
Midterm 1.1833 0.0648 18.27 0.000 1.00

Regression Equation

Final = -22.12 + 1.1833 Midterm

Fits and Diagnostics for All Observations

Obs Final Fit Resid Std Resid


1 34.00 35.86 -1.86 -0.71
2 53.00 52.43 0.57 0.19
3 55.00 57.16 -2.16 -0.71
4 66.00 64.26 1.74 0.57
5 37.00 37.05 -0.05 -0.02
6 68.00 71.36 -3.36 -1.11
7 73.00 65.45 7.55 2.47 R
8 80.00 80.83 -0.83 -0.28
9 84.00 85.56 -1.56 -0.55
10 95.00 95.03 -0.03 -0.01

R Large residual

Prediction for Final

Regression Equation

Final = -22.12 + 1.1833 Midterm

Variable Setting
Midterm 79

Fit SE Fit 95% CI 95% PI


71.3631 1.08559 (68.8597, 73.8665) (63.5252, 79.2011)

a. State the equation of the least-squares regression line.

_______________________________________________________________________

b. Calculate the predicted final grade for a midterm grade of 79, and verify the value on the
printout. Plot the predicted value on the fitted line plot, and label it.

Value from printout: __________

Fitted Line Plot

[Figure: fitted line plot of Final (vertical axis, 30 to 100) against Midterm (horizontal axis,
50 to 100), with the line Final = -22.12 + 1.183 Midterm. S = 3.22090, R-Sq = 97.7%,
R-Sq(adj) = 97.4%.]

c. Calculate the residual for a student with a midterm grade of 79 and verify the result on the
printout. Identify the residual on the graph above.

Value from printout: _________

d. Based on the plot, does the regression line appear to fit the data well? Why or why not?

e. The sample slope is __________. Interpret this value in the context of the problem.

Use the residual plots below, and the fitted line plot, to answer the following questions. (You
will only be concerned with the top two, and the one in the bottom left.)

f. Check assumptions for valid inference, clearly referring to the relevant plots.
Independence Assumption: This condition is met since the data was randomly selected
from a large population [satisfying the randomization condition].

Linearity Assumption:
Straight Enough Condition:

Equal Variance Assumption:
Does the Plot Thicken? Condition:

Normal Population Assumption:
Nearly normal and the outlier condition:

i. There is one clear outlier. Explain why this could be problematic. Circle the outlier on
the residual plots (omit the lower right plot) and on the fitted line plot.

ii. With the exception of the outlier, are the residuals approximately normal? Why or why
not?

At the 1% level of significance, do the data indicate that there is a positive linear association
between the midterm grades and the final grades of all statistics students in the course?

g. State the appropriate hypotheses, defining any parameters used.



h. State the value of the test statistic, and circle it on the Minitab output. ___________.
i. Using Table T, find the P-value.
Degrees of Freedom: _____________

Value(s) from the Table: _________________ P-value from the text: ____________

j. State the P-value from the printout, and circle it. P =_________

k. Briefly assess the strength and significance of the evidence.

Strength:

Significance:

l. State your conclusion in the context of the problem.

m. Construct a 95% confidence interval for the population slope.

n. Interpret the confidence interval for the population slope in the context of the problem.

____________

o. State the value of the coefficient of determination (r²), and interpret this value in the
context of the question.

____________

p. Calculate the value of the correlation coefficient (r), and interpret this value in the context
of the question.

____________

q. Find and interpret the 95% confidence interval (95% CI) given in the Prediction for Final
output (Midterm = 79), in the context of the problem.

____________

r. Find and interpret the 95% prediction interval (95% PI) given in the Prediction for Final
output (Midterm = 79), in the context of the problem.

____________
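
Several of the quantities asked for in Question 1 can be recomputed from the rounded values on the printout. The sketch below makes two assumptions that you should verify yourself: that the student with a midterm grade of 79 is Obs 6 (the row whose Fit is 71.36), and that the Table T critical value for 95% confidence with df = 8 is 2.306. Because the printout rounds the coefficients, results may differ from Minitab's in the last decimal place.

```python
from math import sqrt

b0, b1 = -22.12, 1.1833            # intercept and slope from the Coefficients table
se_b1 = 0.0648                     # standard error of the slope
ss_regression, ss_total = 3463.51, 3546.50

# Parts (b) and (c): predicted value and residual for a midterm grade of 79
predicted = b0 + b1 * 79
observed_final = 68.00             # assumed: Obs 6, whose Fit matches the prediction
residual = observed_final - predicted   # residual = observed - predicted

# Part (h): t statistic for the slope
t_slope = b1 / se_b1

# Part (m): 95% confidence interval for the population slope
t_star = 2.306                     # assumed Table T value: df = n - 2 = 8
slope_ci = (b1 - t_star * se_b1, b1 + t_star * se_b1)

# Parts (o) and (p): r-squared and r (positive root, since the slope is positive)
r_squared = ss_regression / ss_total
r = sqrt(r_squared)

print(f"predicted = {predicted:.2f}, residual = {residual:.2f}")
print(f"t = {t_slope:.2f}")
print(f"95% CI for slope = ({slope_ci[0]:.3f}, {slope_ci[1]:.3f})")
print(f"r-squared = {r_squared:.4f}, r = {r:.4f}")
```

The t statistic computed this way comes out slightly below the printed 18.27, which is what you would expect when dividing two rounded numbers.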

Question 2

Turn to page 735 in your textbook, and read question 23. Answer the following questions.

a. What is the explanatory variable?

b. What is the response variable?

Note that in the textbook, a partial Minitab output was given. Here is more information to help
you:

Regression Analysis: calories versus sodium

Analysis of Variance

Source DF Adj SS Adj MS F-Value P-Value


Regression 1 2608 2607.9 7.45 0.008
sodium 1 2608 2607.9 7.45 0.008
Error 75 26244 349.9
Lack-of-Fit 25 12080 483.2 1.71 0.054
Pure Error 50 14164 283.3
Total 76 28852

Model Summary

S R-sq R-sq(adj) R-sq(pred)


18.7062 9.04% 7.83% 3.91%

Coefficients

Term Coef SE Coef T-Value P-Value VIF


Constant 95.73 4.61 20.77 0.000
sodium 0.0699 0.0256 2.73 0.008 1.00

Regression Equation

calories = 95.73 + 0.0699 sodium

Fits and Diagnostics for Unusual Observations

Obs calories Fit Resid Std Resid


3 70.00 113.89 -43.89 -2.38 R
4 50.00 105.51 -55.51 -2.99 R
45 150.00 102.36 47.64 2.57 R
46 150.00 106.21 43.79 2.36 R
47 160.00 106.21 53.79 2.89 R
55 50.00 95.73 -45.73 -2.52 R
56 50.00 95.73 -45.73 -2.52 R

R Large residual




Answer part (a) from the textbook by following the steps below.

c. State the appropriate hypotheses, defining any parameters used.

d. State the value of the test statistic and p-value from the Minitab output.

t = ______________ p-value = ____________

e. Using Table T, find the p-value.

Degrees of Freedom: _____________ Value(s) from Table: _________________

P-value from the table: ____________

f. Briefly assess the strength and significance of the evidence.

Strength:

Significance: No significance level given, so we cannot assess significance.

g. State your conclusion in the context of the problem.
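
Table T only brackets the p-value between two tabled values. As a check on your table lookup, the sketch below computes the two-sided p-value numerically using only the Python standard library. The function names and the use of Simpson's rule are our own choices for illustration, not a method from the textbook; the degrees of freedom are n - 2 = 75, taken from the Error row of the output.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and |t|.

    The area from 0 to |t| is found with Simpson's rule (steps must be even)
    and doubled, because the t distribution is symmetric about 0.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 1 - 2 * area_0_to_t

df = 77 - 2                       # 77 observations, one explanatory variable
p_value = two_sided_p(2.73, df)   # t-value for the sodium slope
print(f"df = {df}, two-sided p-value = {p_value:.4f}")
```

The result should agree with the P-value printed in the Coefficients table once it is rounded to three decimal places.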



Chapter 25: Analysis of Variance

Question 1

In the Climate Change Plan for Canada, government researchers reported that nearly one
quarter of this country’s greenhouse gases are generated by the transportation sector, with
the majority of these emissions coming from cars and trucks. Not surprisingly, these
researchers noted that approximately two thirds of greenhouse gases from transportation are
produced in urban areas. Based on the information provided in the Climate Change Plan for
Canada report, larger cities should have higher pollution indices. Do these data suggest that
there is a significant difference in the mean pollution indices for Halifax, Vancouver and
Toronto?

Researchers randomly selected 5 sites in each of the three cities. The Minitab output is given
below to help you (note that some values are missing; you will be asked to complete the ANOVA
table as part of this practice exercise).

One-way ANOVA: Halifax, Toronto, Vancouver

Source  DF  SS     MS    F     P
Factor  __  3.396  ____  ____  0.002
Error   __  _____  ____
Total   __  5.294

S = 0.3967 R-Sq = 64.27% R-Sq(adj) = 58.31%

Individual 95% CIs For Mean Based on Pooled StDev

Level      N  Mean    StDev
Halifax    5  2.2800  0.3768
Toronto    5  3.4200  0.4764
Vancouver  5  2.6400  0.3209

[Character plot of the three 95% confidence intervals on an axis running from 2.00 to 3.50.]

Pooled StDev = 0.3967
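
The missing entries in the ANOVA output can be rebuilt from the values that are printed. The sketch below is one way to check your hand calculations; it assumes equal group sizes (5 sites per city) and uses variable names of our own. It also rebuilds the Factor sum of squares from the three sample means as a cross-check.

```python
import math

k, n_per_group = 3, 5              # three cities, five sites in each
n_total = k * n_per_group
means = {"Halifax": 2.28, "Toronto": 3.42, "Vancouver": 2.64}

# Degrees of freedom
df_factor = k - 1
df_error = n_total - k
df_total = n_total - 1

# Factor SS from the group means: n * sum of (group mean - grand mean)^2
grand_mean = sum(means.values()) / k
ss_factor = n_per_group * sum((m - grand_mean) ** 2 for m in means.values())

ss_total = 5.294                   # printed in the output
ss_error = ss_total - ss_factor    # sums of squares add up to the total
ms_factor = ss_factor / df_factor
ms_error = ss_error / df_error
f_value = ms_factor / ms_error

print(f"df: {df_factor}, {df_error}, {df_total}")
print(f"SS factor = {ss_factor:.3f}, SS error = {ss_error:.3f}")
print(f"MS factor = {ms_factor:.3f}, MS error = {ms_error:.4f}, F = {f_value:.2f}")
print(f"sqrt(MS error) = {math.sqrt(ms_error):.4f}")
```

The square root of the error mean square should be close to the printed pooled standard deviation, S = 0.3967. It differs slightly in the third decimal place, so treat the last digits of the completed table as approximate.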

a. State the appropriate hypotheses and define one of the parameters.



b. Plot the sample means on the Dot plot, and draw a line to connect them. Is there a
noticeable difference between the sample means? Explain.

[Figure: Dot plot of Pollution Indices vs City. Vertical axis: Pollution Index (2.0 to 4.0); horizontal axis: City (Halifax, Toronto, Vancouver).]

c. Check the assumptions and conditions, clearly referring to the plots below.

[Figure: Residual Plots for Halifax, Toronto, Vancouver. Panels: Normal Probability Plot of the Residuals (Percent vs Residual); Residuals Versus the Fitted Values (Residual vs Fitted Value); Histogram of the Residuals (Frequency vs Residual).]


i. Independence Assumption:

ii. Equal Variance Assumption

Similar Variance Condition:

Does the plot thicken? Condition:

iii. Normal Population Assumption:

Nearly normal and the outlier condition:

Are the residuals approximately normal? Why or why not?

Are there any outliers?



d. Complete the following ANOVA table. Show your calculations.

Source  df  Sum of Squares  Mean Square  F     P-value
Group   __  3.396           ___________  ____  0.002
Error   __  ______________  ___________
Total   __  5.294

e. State the value of the p-value from the Minitab output.

p-value = ____________

f. Briefly assess the strength and significance of the evidence.

Strength: _______________________________________________________________

_______________________________________________________________________

Significance: No significance level given, so we cannot assess significance.

g. State your conclusion in the context of the problem.



h. If we want to look at the data more closely by comparing confidence intervals, how many
confidence intervals do we need to consider?
_________________

i. Using the Bonferroni method for multiple comparisons, what confidence level should we be
using here to ensure an overall confidence level of 95%?

j. Calculate the interval for the difference in the mean pollution indices between Halifax and
Toronto, given that the t** is 2.779.

k. Interpret the confidence interval in the context of the problem.

l. Below is a table showing the confidence intervals for all the comparisons. Fill in your values
from part (j) in the table below.

                     Lower Limit  Upper Limit
Halifax – Toronto    ___________  ___________
Halifax – Vancouver  -1.057       0.337
Toronto – Vancouver  0.083        1.477

m. Do the confidence intervals support the claim that larger cities have higher pollution
indices? Why or why not?
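
The Bonferroni steps in parts (h) to (l) can be sketched in a few lines of Python. This is a check on the hand calculation, assuming the usual pooled form of the interval, (difference in sample means) ± t** × s_p × sqrt(1/n + 1/n), with the pooled standard deviation and t** = 2.779 taken from the question. The variable and function names are our own.

```python
import math

means = {"Halifax": 2.28, "Toronto": 3.42, "Vancouver": 2.64}
n = 5                    # sites per city
pooled_sd = 0.3967       # Pooled StDev from the Minitab output
t_star_star = 2.779      # Bonferroni critical value given in part (j)

# Number of pairwise comparisons among 3 cities, and the confidence
# level each interval needs for an overall level of 95%.
num_pairs = math.comb(len(means), 2)
per_interval_conf = 1 - 0.05 / num_pairs

# The margin of error is the same for every pair because the group sizes are equal.
margin = t_star_star * pooled_sd * math.sqrt(1 / n + 1 / n)

def interval(a, b):
    """Bonferroni interval for (mean of city a) - (mean of city b), rounded to 3 decimals."""
    diff = means[a] - means[b]
    return (round(diff - margin, 3), round(diff + margin, 3))

hal_tor = interval("Halifax", "Toronto")
hal_van = interval("Halifax", "Vancouver")
tor_van = interval("Toronto", "Vancouver")

print(f"{num_pairs} intervals, each at {100 * per_interval_conf:.2f}% confidence")
print("Halifax - Toronto:  ", hal_tor)
print("Halifax - Vancouver:", hal_van)
print("Toronto - Vancouver:", tor_van)
```

The Halifax – Vancouver and Toronto – Vancouver intervals produced this way match the limits printed in the table for part (l), which is a good sign that the same formula applies to the Halifax – Toronto interval.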

Chapter 26: Multifactor Analysis of Variance

Question 1

Turn to page 809 of your textbook, and read Question 12.

Do the data provide sufficient evidence at the 5% level of significance that the temperature
setting has an effect on the average cleanliness score? Do the data provide sufficient
evidence at the 5% level that the cycle length has an effect on the average cleanliness score?

The Minitab output is provided below to help you answer the questions.

Two-way ANOVA: Score versus Temp, Cycle

Source DF SS MS F P
Temp 3 33.2519 11.0840 23.47 0.000
Cycle 3 7.1969 2.3990 5.08 0.025
Error 9 4.2506 0.4723
Total 15 44.6994

S = 0.6872 R-Sq = 90.49% R-Sq(adj) = 84.15%
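
Each F-value in the output above is the factor's mean square divided by the error mean square. The sketch below recomputes the table entries from the DF and SS columns and compares each printed P-value with the 5% significance level stated in the question. The names are our own, and this is only a check on reading the output, not a replacement for stating hypotheses and conclusions.

```python
import math

# (DF, SS) for each factor, copied from the two-way ANOVA output above.
rows = {"Temp": (3, 33.2519), "Cycle": (3, 7.1969)}
df_error, ss_error = 9, 4.2506
ss_total = 44.6994
p_values = {"Temp": 0.000, "Cycle": 0.025}   # printed P-values
alpha = 0.05                                 # significance level from the question

ms_error = ss_error / df_error
f_values = {name: (ss / df) / ms_error for name, (df, ss) in rows.items()}
r_squared = 1 - ss_error / ss_total          # share of variation explained by both factors
s = math.sqrt(ms_error)
significant = {name: p < alpha for name, p in p_values.items()}

for name in rows:
    print(f"{name}: F = {f_values[name]:.2f}, p < {alpha}: {significant[name]}")
print(f"R-sq = {100 * r_squared:.2f}%, S = {s:.4f}")
```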

a. State the null and alternative hypotheses.

i) Temperature setting:

ii) Cycle length:



b. Briefly assess the strength and significance of the evidence.

Strength: _______________________________________________________________

_______________________________________________________________________

Significance:

c. State your conclusion in the context of the question.



Question 2

There are many factors that can influence the growth of trees, and sometimes it is an
interaction of those factors that may have more of an influence.

The provincial department of agriculture is interested in whether or not there is an interaction
between the type of orchard environment and the effects of fertilizer and irrigation on the
growth of apple trees. Two sites with different environments were each split into four areas.
The four treatments (control, fertilizer, irrigation, both fertilization and irrigation) were
randomly assigned to the four areas in each site, and Blossom Red apple trees were planted
at the sites. Five apples were randomly selected from each of the eight areas, and their weight
was measured in Newtons (see Note 2). Using a 5% level of significance, is there evidence of an
interaction between the type of environment and treatment on the mean weight of Blossom
Red apples?

                      No Treatment  Fertilizer  Irrigation  Fertilizer & Irrigation
Site 1 (rich, moist)  0.184         0.332       0.164       1.334
Site 2 (sandy, dry)   0.458         0.630       0.278       1.308
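
On an interaction plot, the lines for the two sites are parallel when the gap between the sites is the same for every treatment. The sketch below computes that gap (Site 2 minus Site 1) for each treatment from the table above. It is only a numerical companion to the plot, with names of our own, and it does not replace the formal test of the interaction.

```python
treatments = ["No Treatment", "Fertilizer", "Irrigation", "Fertilizer & Irrigation"]
site1 = [0.184, 0.332, 0.164, 1.334]   # rich, moist
site2 = [0.458, 0.630, 0.278, 1.308]   # sandy, dry

# Gap between the two lines at each treatment; constant gaps mean parallel lines.
gaps = [round(b - a, 3) for a, b in zip(site1, site2)]
spread = round(max(gaps) - min(gaps), 3)

# A change of sign in the gap means the two lines cross somewhere on the plot.
lines_cross = min(gaps) < 0 < max(gaps)

for name, gap in zip(treatments, gaps):
    print(f"{name}: Site 2 - Site 1 = {gap:+.3f}")
print(f"Spread of the gaps = {spread}, lines cross: {lines_cross}")
```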

a. On the interaction plot, draw in lines connecting the means for the two sites. Clearly label
the lines you draw.

[Figure: Scatterplot of Means vs Treatment. Vertical axis: Means (0.0 to 1.4); horizontal axis: Treatment (1.0 to 4.0).]

Note 2: 1 Newton ≈ 102 g

b. Explain why an interaction term is appropriate. Refer to the plot of means and the Additive
Enough condition in your answer.

c. Check the assumptions and conditions, clearly referring to the plots below.

[Figure: Residual Plots for wt. Panels: Normal Probability Plot (Percent vs Residual); Versus Fits (Residual vs Fitted Value); Histogram (Frequency vs Residual); Versus Order (Residual vs Observation Order).]


d. A partial ANOVA table for these data is given below. Complete the table by calculating the
missing values and finding the appropriate P-value. Show your work.

Two-way ANOVA: wt versus treatment, site

Source       DF  SS       MS        F     P
treatment     3  1.7518   0.583949  2.57  0.072
site          1  0.3258   0.325803  1.43  0.240
Interaction   3  ______   ________  ____  0.034
Error        __  7.2755   ________
Total        39  11.2592

S = 0.4768 R-Sq = 35.38% R-Sq(adj) = 21.25%

Reminder: the question and the significance level to use were given at the beginning of the
question.

e. State the appropriate hypotheses in plain English.

f. State the values of the appropriate test statistic and p-value.

Test Statistic: P-value:

g. Briefly assess the strength and significance of the evidence.

Strength: _______________________________________________________________

_______________________________________________________________________

Significance: ____________________________________________________________

_______________________________________________________________________

h. State your conclusion in the context of the problem.
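
The blanks in the partial ANOVA table from part (d) follow from two facts: the degrees of freedom add up to the total, and so do the sums of squares. A minimal sketch of that arithmetic is given below, with names of our own; compare its output with your hand calculations.

```python
import math

# Printed entries from the partial two-way ANOVA table in part (d).
df_treatment, df_site, df_interaction, df_total = 3, 1, 3, 39
ss_treatment, ss_site, ss_error, ss_total = 1.7518, 0.3258, 7.2755, 11.2592

# Degrees of freedom and sums of squares both add up to the Total row.
df_error = df_total - df_treatment - df_site - df_interaction
ss_interaction = ss_total - ss_treatment - ss_site - ss_error

ms_interaction = ss_interaction / df_interaction
ms_error = ss_error / df_error
f_interaction = ms_interaction / ms_error

print(f"Error df = {df_error}")
print(f"Interaction SS = {ss_interaction:.4f}, MS = {ms_interaction:.4f}")
print(f"Error MS = {ms_error:.4f}, Interaction F = {f_interaction:.2f}")
print(f"sqrt(Error MS) = {math.sqrt(ms_error):.4f}")
```

As a check, the square root of the error mean square should reproduce S = 0.4768 from the output, and the error degrees of freedom should equal the 40 apples minus the 8 site-by-treatment groups.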



Chapter 27: Multiple Regression

In 2008, a study was conducted to look at the effect of a number of variables on the selling
price of homes in Duke County, New York (a fictional location). The following results were
obtained:

Regression Analysis: Value ($) versus Lot Size, Age, Bed

Analysis of Variance

Source         DF       Adj SS       Adj MS  F-Value  P-Value
Regression      3  24252323172   8084107724    13.19    0.000
  Lot Size      1     84601233     84601233     0.14    0.712
  Age           1  13230814791  13230814791    21.58    0.000
  Bed           1   2803654124   2803654124     4.57    0.039
Error          36  22068352828    613009801
  Lack-of-Fit  28  21703447828    775123137    16.99    0.000
  Pure Error    8    364905000     45613125
Total          39  46320676000

Model Summary

S        R-sq    R-sq(adj)  R-sq(pred)
24759.0  52.36%  48.39%     39.87%

Coefficients

Term      Coef    SE Coef  T-Value  P-Value  VIF
Constant  302660  40572     7.46    0.000
Lot Size     846   2278     0.37    0.712    1.52
Age        -3961    852    -4.65    0.000    1.32
Bed        12017   5619     2.14    0.039    1.17

Regression Equation

Value ($) = 302660 + 846 Lot Size - 3961 Age + 12017 Bed

Fits and Diagnostics for Unusual Observations

Obs  Value ($)  Fit     Resid   Std Resid
 18  286500     241084   45416   2.02  R
 35  205000     254099  -49099  -2.10  R

R Large residual

Regression Equation

Value ($) = 302660 + 846 Lot Size - 3961 Age + 12017 Bed

Variable Setting
Lot Size 10
Age 40
Bed 3

Fit     SE Fit   95% CI            95% PI
188750  12030.4  (164352, 213149)  (132923, 244578)
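
The Fit and the two intervals in the output above can be checked from the printed equation and standard errors. The sketch below is an illustration with names of our own. It assumes the usual relationships: the confidence interval is Fit ± t* × SE Fit, and the prediction interval is Fit ± t* × sqrt(S² + SE Fit²). The critical value t* is recovered from the printed confidence interval rather than looked up.

```python
import math

# Coefficients and settings copied from the Minitab output above.
coef = {"Constant": 302660, "Lot Size": 846, "Age": -3961, "Bed": 12017}
setting = {"Lot Size": 10, "Age": 40, "Bed": 3}

# Predicted value from the printed regression equation.
predicted = coef["Constant"] + sum(coef[name] * value for name, value in setting.items())

fit, se_fit, s = 188750, 12030.4, 24759.0
ci = (164352, 213149)
pi = (132923, 244578)

# Critical value implied by the printed confidence interval.
t_star = (ci[1] - ci[0]) / 2 / se_fit

# The prediction interval adds the residual variation (S) to the uncertainty in the mean.
pi_half_width = t_star * math.sqrt(s ** 2 + se_fit ** 2)

print(f"Predicted value from the equation: {predicted}")
print(f"Implied t* = {t_star:.3f}")
print(f"PI half-width: computed {pi_half_width:.0f}, printed {(pi[1] - pi[0]) / 2:.0f}")
```

The hand-calculated prediction differs slightly from the printed Fit, presumably because the coefficients in the printed equation are rounded. The prediction interval is much wider than the confidence interval because it describes a single home rather than the mean value of all homes with these settings.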

a. State the regression equation.

b. Use the regression equation in part (a) to calculate the predicted value for the
subpopulation on the output, which represents a home that (fill in the blanks with the
correct values):

Has ______ bedrooms, is _______ years old, and has a lot size of _______ acres.

i. Calculation:

ii. State the predicted value from the output, and circle it on the output. __________

c. The coefficient of determination (r²) for the model is ________. Interpret this value in the
context of the problem.

d. The regression coefficient for age is b₂ = ______. Interpret this value in the context of the
problem.

[Figure: Scatterplot of Value ($) vs FITS. Vertical axis: Value ($) (160000 to 300000); horizontal axis: FITS (150000 to 275000).]

e. What are the assumptions and conditions of a multiple regression model? Are they
satisfied?

Is there sufficient evidence at the 5% level of significance that a linear relationship exists
between the selling price of a home and the size of the lot, the age and the number of bedrooms?

f. State the hypotheses, and define one of the parameters, β₁.

g. The test statistic is __________, and the P-value is ________. Circle them on the output.

h. Use the appropriate table to look up the p-value.

Degrees of Freedom: __________________ Value(s) from Table: _______________

P-value from the table: __________________

i. Briefly assess the strength and significance of the evidence.

Strength: _______________________________________________________________

_______________________________________________________________________

Significance: ____________________________________________________________

_______________________________________________________________________

j. State your conclusion in the context of the problem.



k. State the 95% confidence interval for the subpopulation on the output. Interpret this
interval in the context of the problem.

Interval: ______________________________________

Interpret:__________________________________________________________________

__________________________________________________________________________

__________________________________________________________________________

________________________________________________________________________

l. State the 95% prediction interval for the subpopulation on the output. Interpret this
interval in the context of the problem.

Interval: ___________________________________________________

Interpret:_________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

_________________________________________________________________________

Is there sufficient evidence at the 5% significance level that the average selling price will
increase when the lot size increases? Use the Minitab output to help you answer the following
questions.

m. State the hypotheses and define any parameters used.

n. State the test statistic and p-value from the Minitab output, and circle them on the output.

Test Statistic: ____________________ P-Value: __________________________



o. Use the appropriate table to look up the p-value.

Degrees of Freedom: __________________

Value(s) from Table: _________________ P-value from the table: ____________

p. Briefly assess the strength and significance of the evidence.

Strength:__________________________________________________________________

Significance:__________________________________________________________

q. State your conclusion in the context of the problem.

r. Calculate a 90% confidence interval for the coefficient of "Age" and interpret.

Interpret: _________________________________________________________________
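
Two of the calculations in parts (m) to (r) can be sketched as follows. The critical value t* ≈ 1.688 for 90% confidence with 36 degrees of freedom is an assumption taken from a standard t-table; your textbook's table may only list a nearby row, which would change the interval slightly. The one-sided p-value assumes the alternative hypothesis is that the Lot Size coefficient is greater than zero. The variable names are our own.

```python
# 90% confidence interval for the Age coefficient: b ± t* × SE(b)
coef_age, se_age = -3961, 852      # from the Coefficients table
t_star_90 = 1.688                  # assumed t-table value for df = 36 (Error DF)
margin = t_star_90 * se_age
ci_age = (coef_age - margin, coef_age + margin)

# One-sided p-value for Lot Size. Minitab prints a two-sided p-value, so it is
# halved when the observed t-value lies in the direction of the alternative.
t_lot, p_two_sided = 0.37, 0.712
p_one_sided = p_two_sided / 2 if t_lot > 0 else 1 - p_two_sided / 2

print(f"90% CI for the Age coefficient: ({ci_age[0]:.1f}, {ci_age[1]:.1f})")
print(f"One-sided p-value for Lot Size: {p_one_sided:.3f}")
```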
