Decision Sciences 1: Sample Questions (Set 2)
(1) Carpetland salespersons average $8000 per week in sales, with a standard deviation of
$500. Steve Contois, the firm’s vice president, proposes a compensation plan with new
selling incentives. Steve hopes that the results of a trial selling period will enable him to
conclude that the compensation plan increases the average sales per salesperson. He
selects 200 salespersons randomly to test his theory. He decides that he will consider the
plan to be successful if the average sales per week for those 200 selected exceeds $8500.
a) Develop the appropriate null and alternative hypotheses.
b) Find the probability of making a Type I error in this situation.
c) Does the above test procedure seem sensible to you?
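For part (b), one way to set up the computation is sketched below. It assumes the null hypothesis is evaluated at its boundary µ = 8000, that the standard deviation of $500 still applies during the trial, and that the sample mean is approximately normal. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 8000      # mean weekly sales under H0 (boundary value, an assumption)
sigma = 500     # population standard deviation, assumed unchanged by the plan
n = 200         # salespersons in the trial
cutoff = 8500   # the plan is judged successful if the sample mean exceeds this

std_error = sigma / sqrt(n)
z_cutoff = (cutoff - mu0) / std_error

# Type I error: the sample mean exceeds the cutoff although mu is really 8000
alpha = 1 - NormalDist().cdf(z_cutoff)

print(f"standard error = {std_error:.2f}")
print(f"z at cutoff    = {z_cutoff:.2f}")
print(f"P(Type I)      = {alpha:.3g}")
```

The size of `z_cutoff` is worth keeping in mind when answering part (c).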
(2) Individuals filing federal income tax returns prior to March 31 received an average
refund of $1056. Consider the population of “last-minute” filers who mail their tax return
during the last five days of the income tax period (typically April 10 to April 15).
a) A researcher suggests that a reason individuals wait until the last five days is that on
average these individuals receive lower refunds than the early filers. Develop the
hypotheses such that rejection of the null will support the researcher’s contention.
b) For a sample of 400 individuals who filed a tax return between April 10 and 15, the
sample mean refund was $910. Based on prior experience a population standard
deviation of $1600 may be assumed. What is the p-value of the test?
c) At α = 0.05, what is your conclusion?
d) Repeat the preceding hypothesis test using the critical value approach.
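A minimal sketch of parts (b) to (d), assuming the hypotheses are H0: µ ≥ 1056 against Ha: µ < 1056 (a lower-tail test) and that the sample mean is approximately normal. The names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, xbar, sigma, n, alpha = 1056, 910, 1600, 400, 0.05

z = (xbar - mu0) / (sigma / sqrt(n))     # test statistic
p_value = NormalDist().cdf(z)            # lower-tail area, since Ha is mu < mu0
z_crit = NormalDist().inv_cdf(alpha)     # critical value: reject H0 when z <= z_crit

print(f"z = {z:.3f}, p-value = {p_value:.4f}, critical z = {z_crit:.3f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```

The p-value rule and the critical-value rule always lead to the same decision, because `p_value <= alpha` holds exactly when `z <= z_crit`.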
(3) The mean hourly wage for employees in goods-producing industries is currently $24.57.
Suppose we take a sample of employees from the manufacturing industry to see if the mean
hourly wage differs from the reported mean of $24.57 for the goods-producing industries.
a) State the null and alternative hypotheses we should use to test the above.
b) Suppose a sample of 30 employees showed a sample mean of $23.89 per hour.
Assume a population standard deviation of $2.40 per hour and compute the p-value.
c) At 5% level of significance, what is your conclusion?
d) Repeat the preceding hypothesis test using the critical value approach.
(4) Joan’s nursery specializes in custom-designed landscaping for residential areas. The
estimated labor cost associated with a particular landscaping proposal is based on the
number of plantings of trees, shrubs, and so on to be used for the project. For cost-
estimating purposes, managers use two hours of labor time for the planting of a medium-
sized tree. Actual times (in hours) from a sample of 10 plantings during the past month
are as follows:
With a 0.05 level of significance, test if the mean tree-planting time differs from two hours.
(5) Costs are rising for all kinds of medical care. The mean monthly rent at assisted-living
facilities was reported to have increased 17% over the last five years to $3486 (the Wall
Street Journal, October 27, 2012). Assume this cost estimate is based on a sample of 120
facilities and, from past studies, it can be assumed that the population standard deviation is
σ = $650.
a) Develop a 90% confidence interval estimate of the population mean monthly rent.
b) Develop a 95% confidence interval estimate of the population mean monthly rent.
c) Develop a 99% confidence interval estimate of the population mean monthly rent.
d) What happens to the width of the confidence interval as the confidence level is
increased? Does this seem reasonable? Explain.
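The three intervals in parts (a) to (c) can be sketched with one loop, assuming the usual known-σ interval, sample mean ± z × σ/√n. The dictionary `intervals` is just an illustrative container.

```python
from math import sqrt
from statistics import NormalDist

xbar, sigma, n = 3486, 650, 120
std_error = sigma / sqrt(n)

intervals = {}
for level in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # two-sided critical value
    margin = z * std_error
    intervals[level] = (xbar - margin, xbar + margin)
    print(f"{level:.0%}: {xbar - margin:.0f} to {xbar + margin:.0f} "
          f"(margin {margin:.1f})")
```

Comparing the printed margins across the three levels gives a starting point for part (d).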
(6) The 92 million Americans of age 50 and over control 50 percent of all discretionary
income. AARP estimates that the average annual expenditure on restaurants and carryout
food was $1873 for individuals in this age group. Suppose this estimate is based on a sample
of 80 persons and that the sample standard deviation is $550.
a) At 95% confidence, what is the margin of error?
b) What is the 95% confidence interval for the population mean amount spent on
restaurants and carryout food?
c) What is your estimate of the total amount spent by Americans of age 50 and over on
restaurants and carryout food?
d) If the amount spent on restaurants and carryout food is skewed to the right, would
you expect the median amount spent to be greater or less than $1873?
(7) Sales personnel for Skillings Distributors submit weekly reports listing the customer
contacts made during the week. A sample of 65 weekly reports showed a sample mean of
19.5 customer contacts per week. The sample standard deviation was 5.2. Provide 90% and
95% confidence intervals for the population mean number of weekly customer contacts for
the sales personnel.
(8) The U.S. Energy Information Administration (US EIA) reported that the average price for
a gallon of regular gasoline is $3.94 (US EIA website, April 6, 2012). The US EIA updates its
estimates of average gas prices on a weekly basis. Assume the standard deviation is $0.25
for the price of a gallon of regular gasoline and recommend the appropriate sample size for
the US EIA to use if they wish to report each of the following margins of error at 95%
confidence.
a) The desired margin of error is $0.10.
b) The desired margin of error is $0.07.
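A short sketch of the sample-size calculation, assuming the standard planning formula n = (z σ / E)² rounded up to the next whole number. The helper `sample_size` is a hypothetical name.

```python
from math import ceil
from statistics import NormalDist

sigma = 0.25                          # planning value for the standard deviation
z = NormalDist().inv_cdf(0.975)       # two-sided 95% critical value

def sample_size(margin):
    """Smallest n whose margin of error does not exceed `margin`."""
    return ceil((z * sigma / margin) ** 2)

for margin in (0.10, 0.07):
    print(f"margin {margin:.2f}: n = {sample_size(margin)}")
```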
(9) A special industrial battery must have a life of at least 400 hours. A hypothesis test is to
be conducted with a 0.02 level of significance. If the batteries from a particular production
run have an actual mean use life of 385 hours, the production manager wants a sampling
procedure that only 10% of the time would show erroneously that the batch is acceptable.
What sample size is recommended for the hypothesis test? Use 30 hours as an estimate of
the population standard deviation.
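One way to approach this, assuming the hypotheses H0: µ ≥ 400 against Ha: µ < 400, a Type I error probability of 0.02, a Type II error probability of 0.10 when the true mean is 385, and the standard one-tailed formula n = (z_α + z_β)² σ² / (µ0 − µa)². The names are illustrative.

```python
from math import ceil
from statistics import NormalDist

mu0, mu_a = 400, 385      # hypothesized mean and the alternative of interest
sigma = 30                # planning value for the standard deviation
alpha, beta = 0.02, 0.10  # Type I and Type II error probabilities

z_alpha = NormalDist().inv_cdf(1 - alpha)
z_beta = NormalDist().inv_cdf(1 - beta)

n = ceil((z_alpha + z_beta) ** 2 * sigma ** 2 / (mu0 - mu_a) ** 2)
print(f"z_alpha = {z_alpha:.3f}, z_beta = {z_beta:.3f}, n = {n}")
```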
(10) Customers arrive at a movie theatre at the advertised movie time only to find that they
have to sit through several previews and pre-preview ads before the movie starts. Many
complain that the time devoted to previews is too long. A preliminary sample conducted by
the Wall Street Journal showed that the standard deviation of the amount of time devoted
to previews was 4 minutes. Use that as a planning value for the standard deviation in
answering the following questions.
a) If we want to estimate the population mean time for previews at movie theatres
with a margin of error of 75 seconds, what sample size should be used? Assume 95%
confidence.
b) Suppose that for a sample of 120 movies, the sample mean of the previews is 6
minutes and 35 seconds. Estimate a 99% confidence interval for the population
mean time for previews at movie theatres.
(11) There has been a trend toward less driving in the last few years, especially by young
people. From 2001 to 2009 the annual vehicle miles travelled by people from 16 to 34 years
of age decreased from 10,300 to 7900 miles per person. Assume the standard deviation was
2000 miles in 2009. Suppose you would like to conduct a survey to develop a 95%
confidence interval estimate of the annual vehicle-miles per person for people 16 to 34
years of age at the current time. A margin of error of 100 miles is desired. How large a
sample should be used for the current survey?
(12) The Pew Research Centre Internet Project, conducted on the 25th anniversary of the
Internet, involved a survey of 857 Internet users (Pew Research Centre website, April 1,
2014). It provided a variety of statistics on Internet users. For instance, in 2014, 87% of
American adults were Internet users. In 1995 only 14% of American adults used the
Internet.
a) The sample survey showed that 90% of respondents said the Internet has been a
good thing for them personally. Develop a 95% confidence interval for the
proportion of respondents who say the Internet has been a good thing for them
personally.
b) If 67% of Internet users agree that the Internet has strengthened their relationship
with family and friends, find a 95% confidence interval for the population proportion
of people believing the same.
c) Fifty-six percent of Internet users have seen an online group come together to help a
person or community solve a problem. Develop a 95% confidence interval for the
proportion of Internet users who say online groups have helped solve a problem.
d) How are the margins of error in the above three parts related to the sample
proportions?
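Parts (a) to (c) share one formula, so a single helper can cover them. The sketch below assumes the large-sample interval, sample proportion ± z × √(p̂(1 − p̂)/n), with n = 857. The function `proportion_interval` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

n = 857
z = NormalDist().inv_cdf(0.975)   # two-sided 95% critical value

def proportion_interval(p_hat, n, z):
    """Return (margin, lower, upper) for a large-sample interval around p_hat."""
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return margin, p_hat - margin, p_hat + margin

margins = {}
for p_hat in (0.90, 0.67, 0.56):
    margin, low, high = proportion_interval(p_hat, n, z)
    margins[p_hat] = margin
    print(f"p = {p_hat:.2f}: margin {margin:.4f}, interval {low:.4f} to {high:.4f}")
```

Listing the three margins side by side is a convenient way into part (d).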
(13) A poll for the presidential campaign sampled 491 potential voters in June. A primary
purpose of the poll was to obtain an estimate of the proportion of potential voters who
favoured each candidate. Assume a planning value of 𝑝 = 0.5 and a 95% confidence level.
a) For 𝑝 = 0.5, what was the planned margin of error for the June poll?
b) Closer to the November election, better precision and smaller margins of error are
desired. Assume the following margins of error are requested for surveys to be
conducted during the presidential campaign. Compute the recommended sample
size for each survey.
Survey              Margin of Error
September           0.04
October             0.03
Early November      0.02
Pre-Election Day    0.01
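A sketch covering both parts, assuming the usual planning formulas for a proportion: margin E = z √(p(1 − p)/n) and sample size n = z² p(1 − p) / E², rounded up. The helper name `sample_size` is illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)   # two-sided 95% critical value
p_star = 0.5                      # planning value for the proportion

# Part (a): margin of error implied by the June sample of 491
june_margin = z * sqrt(p_star * (1 - p_star) / 491)
print(f"June margin of error: {june_margin:.4f}")

# Part (b): smallest sample size that achieves each requested margin
def sample_size(margin):
    return ceil(z ** 2 * p_star * (1 - p_star) / margin ** 2)

requested = [("September", 0.04), ("October", 0.03),
             ("Early November", 0.02), ("Pre-Election Day", 0.01)]
sizes = {survey: sample_size(margin) for survey, margin in requested}
for survey, size in sizes.items():
    print(f"{survey}: n = {size}")
```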
(14) The manager of an automobile dealership is considering a new bonus plan designed to
increase sales volume. Currently, the mean sales volume is 14 automobiles per month. The
manager wants to conduct a research study to see whether the new bonus plan increases
sales volume. To collect data on the plan, a sample of sales personnel will be allowed to sell
under the new bonus plan for a one-month period. Develop the null and alternative
hypotheses most appropriate for this situation.
(15) A production line operation is designed to fill cartons with laundry detergent to a mean
weight of 32 ounces. In order to check the efficiency and accuracy of the production line,
the manager decided to take a sample of 100 cartons and weigh them. Suppose the
mean of the weights of the sample observations is 31.97 ounces. Assume that the
population standard deviation of the filling weights is 0.17 ounces.
a) Formulate the null and alternative hypotheses that will help in deciding whether to
shut down and adjust the production line.
b) Is there enough evidence to say that the production line is not operating correctly?
(16) A survey was conducted among randomly selected registered flats in a city regarding
water bills. The table below provides selected summary statistics from the last month’s
water bill (in Rs.) of the sampled flats, segregated according to the region of the city and
whether the residents owned the flat or not. The rightmost column represents the standard
deviation of water bills of all sampled flats in the corresponding region. Answer all
subsequent questions on the basis of the information provided.
NOTE: To answer the following questions, determine “Region A” from the last digit of your
PGP roll no. as follows:
Roll no. ends with    Region A
1 or 5 or 9           Region 4
2 or 6 or 0           Region 3
3 or 7                Region 2
4 or 8                Region 1
a) Find a point estimate of the average monthly water bill in flats in Region A.
b) Find a 94% confidence interval estimate of the same parameter as in part (a).
c) What percentage of flats in the city are rented? Give a point estimate. What is the
standard error of this estimate? Is it the true standard error or an estimated
standard error?
d) Find a confidence interval estimate of the same target parameter as in part (c). The
confidence coefficient should be (89.5+x)% where x is the SECOND digit from right of
your PGP roll no.
e) The water supplier wants to find out if the owned flats pay a higher water bill on
average than rented ones in Region A, as given in part (a). To find a suitable answer,
statistically formulate and state the necessary null and alternative hypotheses. You
need to clearly explain your symbols. You are NOT required to conduct the test or
arrive at a conclusion.
(17) A social media platform goodbooks.co.in is used for discussion, reviews etc. for books.
Varun has made a New Year’s resolution to ramp up his reading habits this
year. But he doesn’t want to get bogged down by reading too many uninteresting books. To
take care of this problem, he has signed up on goodbooks, and decided to go according to
the five-star reviews there, but with a caveat. He knows that there are many bots putting in
five-star reviews. So, to bypass that, he has set a criterion that he is going to look at 100
five-star reviews for each book and decide to read it if he believes that more than 20% of
ALL the five-star reviews of the book have at least 150 words.
a) Varun is considering reading ‘The Day of the Jackal’. Suppose 30% of ALL five-star
reviews of this book have at least 150 words. What is the chance that among the
sample of five-star reviews collected by Varun, at least 37 would have at least
150 words?
b) Varun is now considering ‘The Afghan’. In his sample of 100 five-star reviews of ‘The
Afghan’, 29 have at least 150 words. Will Varun go ahead and read this novel?
c) While going through the reviews of ‘The Afghan’, ‘The Odessa File’ has suddenly
caught the attention of Varun, and he has gone ahead and sampled its five-star
reviews too. 41 of the 100 five-star reviews of ‘The Odessa File’ he has sampled have
at least 150 words. Varun wants to read the one that he thinks has a higher
proportion of reviews with at least 150 words, considering ALL the five-star reviews.
Does Varun have a clear choice to make? To find a suitable answer, statistically
formulate and state the necessary null and alternative hypotheses. You need to
clearly explain your symbols. You are NOT required to conduct the test or arrive at a
conclusion.
(18) The life expectancy for a particular type of battery is known to be between twenty and
twenty-five hours.
a) The regulator wants to estimate the percentage of batteries of this type that last
less than twenty-one hours within a margin of ± 0.05 of the true value with 96%
certainty. How many batteries of this type need to be tested, observing their
lifetimes, to ensure that the inference objective is met?
b) The manufacturer was given a license on the understanding that the lifetime of these
batteries would be at least twenty-four hours on average. A year later, the regulator
wants to validate whether the manufacturer is adhering to that norm. The regulator wants to
limit the chance of penalizing the manufacturer incorrectly to 1%, but also
wants to take action against the manufacturer with at least 95% probability if the
actual average lifetime is 30 minutes less than the norm. How many batteries of this
type need to be tested, observing their lifetimes, to ensure that the inference
objective is met?
(19) In a large reserve forest spanning over 500 square km, the forest department has
adopted the following scheme for counting the number of tigers (N).
In the first week, an extensive search is conducted till the first tiger is sighted. Once the tiger
is sighted, the forest department has a mechanism to catch the tiger without causing any
harm to it. Subsequently, a tranquillizer is used to make it fall asleep for a while, and an
identifying tag is permanently attached to the tiger before releasing it back to the forest.
In the second week, again another extensive search is conducted till a tiger is sighted. If this
is found to be one caught earlier (detected from the tag), the search process stops
immediately. Otherwise, the same process of tagging is done with this tiger, as was the case
with the tiger caught in the first week.
The search continues in the third week (if it was not decided to be stopped at the second
week). If this is found to be one of the tigers caught earlier (detected from the tag), the
search process stops immediately. Otherwise, the same process of tagging is done with this
tiger, as was the case with the tiger caught in the first or the second week.
The search process continues following this process till it stops as per the stated guideline
already communicated.
Assume that the search team is successful in sighting a tiger in every week in which it
conducts the search, and that only one tiger is tagged in every week of search.
Let X denote the number of weeks for which the search is conducted. It is believed that it is
possible to estimate N from the observed value of X. For this problem, however, you are
assigned a much simpler task. Suppose 𝑁 = 5.
a) Obtain the probability distribution of X.
b) Find also the expected number of weeks during which the search is conducted. What
is the standard deviation of X?
c) What is the chance of X being within two standard deviations of the mean in this given
case? Show how it violates or confirms Chebyshev’s theorem.
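The distribution in part (a) can be tabulated exactly with fractions. The sketch below assumes that the tiger sighted in any week is equally likely to be any of the N tigers, independently of earlier weeks. The problem statement does not spell this out, so treat it as a modelling assumption. Under it, the search stops in week k when the first k − 1 tigers were all distinct and the k-th sighting is one of those k − 1.

```python
from fractions import Fraction
from math import sqrt

N = 5  # number of tigers, as given

pmf = {}
p_all_new = Fraction(1)       # P(every sighting before week k was an untagged tiger)
for k in range(2, N + 2):     # X can only take the values 2, ..., N + 1
    tagged = k - 1            # tigers already tagged when week k begins
    pmf[k] = p_all_new * Fraction(tagged, N)   # week k sights a tagged tiger
    p_all_new *= Fraction(N - tagged, N)       # week k sights a new tiger instead

mean = sum(k * p for k, p in pmf.items())
variance = sum(k * k * p for k, p in pmf.items()) - mean ** 2
sd = sqrt(variance)
within_2sd = sum(p for k, p in pmf.items() if abs(k - mean) <= 2 * sd)

for k, p in pmf.items():
    print(f"P(X = {k}) = {p}")
print(f"E[X] = {mean} = {float(mean):.4f}, SD = {sd:.4f}")
print(f"P(|X - mean| <= 2 SD) = {within_2sd} = {float(within_2sd):.4f}")
```

Chebyshev's theorem guarantees at least 1 − 1/2² = 0.75 within two standard deviations, so the last printed value can be compared against that bound for part (c).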
(20) Duke Energy reported that the cost of electricity for an efficient home in a particular
neighbourhood of Cincinnati, Ohio, was $104 per month. A researcher believes that the cost of
electricity for a comparable neighbourhood in Chicago, Illinois, is higher. A sample of homes
in this Chicago neighbourhood will be taken and the sample mean monthly cost of
electricity will be used to test the following null and alternative hypotheses.
H0: µ ≤ 104
Ha: µ > 104
a) Assume the sample data led to rejection of the null hypothesis. What would be your
conclusion about the cost of electricity in the Chicago neighbourhood?
b) What is the Type I error in this situation? What are the consequences of making this
error?
(21) In Purna Analytics, there are some data scientists whose job is to develop new
algorithms and to provide the code for implementing those algorithms. Suppose that the
new algorithms developed by every data scientist can be classified as easy, moderate, or
hard. Further assume that they can take 1 hour, 2 hours, 8 hours, or 16 hours to develop
every algorithm (from start to finish), which happens according to the following joint
probability distribution:
a) Do you think that the time needed to completely develop a new algorithm is
independent of the difficulty level? Justify your answer.
b) Find the marginal distribution and the expected value of the time needed to write
the code of a random algorithm.
c) Suppose that a project requires the data scientists to develop 100 new algorithms in
a sequential manner (i.e., they can work on the second algorithm only when the first
algorithm is finished). If the project needs to be completed within 1000 hours (i.e. on
average 10 hours per algorithm), what is the probability that they will be able to
finish it on time?
(22) Mr. Zoobi Doobi arrives at the post office to open an NSC. Zoobi is a risk-averse
investor, but the interest rates offered by the bank FDs are forcing people like Zoobi to
explore other low-risk investment options. One of the problems at the post office is its lack
of modernization and effective online services. There are 40 customers before him in the
queue. It takes on average about two and a half minutes to complete service for any
customer at the counter, although naturally, the actual time taken is random. Zoobi believes
the time taken to serve a customer at the counter follows an exponential distribution.
a) What is the probability that Zoobi has to wait more than 2 hours before he reaches
the counter for service?
b) What is the probability that at most 50% of the customers in the queue before Zoobi
would take longer than 2.5 minutes to get their service at the counter?
c) Comment on the criticality/importance in the previous questions of Zoobi’s belief
about the distribution of service time at the counter.
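A sketch for exploring parts (a) to (c), assuming the 40 service times are independent with mean 2.5 minutes. For part (a) it computes both a central-limit approximation (using the exponential's standard deviation of 2.5 minutes) and the exact tail under the exponential belief, using the fact that a sum of k independent exponentials is Erlang distributed and its upper tail equals a Poisson sum. For part (b) it uses P(T > 2.5) = e⁻¹ for an exponential with mean 2.5. All names are illustrative.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist

mean_service = 2.5   # minutes per customer
k = 40               # customers ahead of Zoobi
t = 120              # minutes (2 hours)

# Part (a), normal approximation: total wait ~ Normal(k * mean, sqrt(k) * sd),
# where sd = mean for an exponential service time
clt = 1 - NormalDist(k * mean_service, sqrt(k) * mean_service).cdf(t)

# Part (a), exact under the exponential assumption:
# P(Erlang(k, rate) > t) = P(Poisson(rate * t) <= k - 1)
rate_t = t / mean_service
exact = sum(exp(-rate_t) * rate_t ** i / factorial(i) for i in range(k))

# Part (b): each customer takes longer than 2.5 minutes with probability exp(-1)
p_long = exp(-1)
at_most_half = sum(comb(k, j) * p_long ** j * (1 - p_long) ** (k - j)
                   for j in range(k // 2 + 1))

print(f"(a) normal approximation: {clt:.4f}, exact Erlang: {exact:.4f}")
print(f"(b) P(at most 20 of 40 take longer than 2.5 min): {at_most_half:.4f}")
```

How far the two part (a) figures differ, and how `p_long` would change under another service-time distribution, are natural inputs to part (c).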
(23) Eclectic Hospitality Pvt. Ltd. (EHPL) has been seriously hit by the ongoing COVID-19
pandemic. With the situation not looking bright, they have been forced to go for layoffs, as
well as salary cuts. 10% of their employees have been laid off. Their workforce before,
across different cities, was 1000, and the standard deviation in the salaries has been
observed to be constant over years from historical data, standing at Rs. 3000. To understand
the churning happening in the hospitality industry, Phoebe Accreditations (PA) have
conducted a survey on different hospitality brands. For a more in-depth study on EHPL, PA
collected a sample of 40 people from the pre-pandemic employee list of EHPL.
a) What is the chance that among these 40 employees at least 3 have been laid off?
b) What is the probability that the proportion of employees in the sample who have
been laid off is within +/- 0.05 of the population proportion of lay-offs for EHPL?
c) Those employees, across cities, who are still with EHPL have had their pay checks
reduced. In the sample of 40, 5 have lost their job at EHPL. The average salary of the
rest of them is Rs. 12000. What is a 95% confidence interval for the total amount
disbursed to the employees at EHPL after the pandemic?
(24) Vinod, a budding reporter for the Khabardar newspaper wants to write a review for the
newly released movie, Jaane Bhi Do Yaaro. He interviews 50 people coming out of the
theatre after the first show and asks them (i) if they liked the movie or not, and (ii) their age.
He finds that the interviewed group comprised 60% men and 40% women, and 60% of
the men interviewed and 80% of the women interviewed liked the movie. He also notes that
the age of the men interviewed has an average of 32 with a standard deviation of 4, while
the age of the women interviewed has an average age of 28 with a standard deviation of 3.
a) Determine the proportion of people interviewed that liked the movie and the
standard error in this estimate.
b) Vinod wants a 95% guarantee that his percentage estimate will be within a margin of
error of 0.05, so that he is not accused of spreading “fake news”. How many people
should he have interviewed to provide this guarantee?
c) Compute the 98% confidence interval on the average age of women that would
watch the movie.
(25) Two absent-minded roommates Erich and Henry, both statisticians, forget their room
keys in some way or another. Ronald, a common friend of the two, hypothesizes that Erich
always takes his key when he goes out of the house, but Henry is 50% likely to forget his key
while going out. On the other hand, at any place they might visit, each of them leaves
behind the key with probability 0.25. It is also known that their forgetfulness is independent
of each other. Moreover, note that the house can be locked without the key, but the key is
necessary to unlock and enter the house.
a) Once, Erich and Henry left their house together. They went separate ways afterward.
Erich went to two shops and then came back home. Henry, meanwhile, went to his
office to do some work and then returned home at around the same time as Erich. If
Ronald’s hypothesis is true, what is the probability that Erich and Henry will not be
able to enter their house? Assume that after leaving home, Erich can lose his key
only at the shops he visited, and Henry can lose his only at his office.
b) David, who is another common friend of all, is also a statistician. He thinks that
Ronald is not entirely correct about Henry’s forgetfulness at home. To further
establish his point, he stays at Erich and Henry’s house for the entire month of
August. During that month, he finds that Henry forgets his key 18 times while leaving
home. If we know that Henry leaves his house exactly once a day, do we have
sufficient evidence to conclude that Ronald’s aforementioned hypothesis is not true?
State the null and alternative hypotheses clearly, mention any assumptions you are
making, then find the critical region of the test to make your conclusion.