Edexcel IAL Statistics 3

Textbook

Uploaded by

Ahsan Habib
PEARSON EDEXCEL INTERNATIONAL A LEVEL
STATISTICS 3
STUDENT BOOK

Series Editors: Joe Skrakowski and Harry Smith
Authors: Greg Attwood, Tom Begley, Ian Bettison, Alan Clegg, Martin Crozier, Gill Dyer, Jane Dyer, Keith Gallick, Susan Hooker, Michael Jennings, John Kinoulty, Guilherme Frederico Lima, Jean Littlewood, Bronwen Moran, James Nicholson, Su Nicholson, Laurence Pateman, Keith Pledger, Joe Skrakowski, Harry Smith

Published by Pearson Education Limited, 80 Strand, London, WC2R 0RL.
www.pearsonglobalschools.com
Copies of official specifications for all Pearson qualifications may be found on the website: https://qualifications.pearson.com

Text © Pearson Education Limited 2020
Typeset by Tech-Set Ltd, Gateshead, UK
Original illustrations © Pearson Education Limited 2020
Illustrated by © Tech-Set Ltd, Gateshead, UK
Cover design by © Pearson Education Limited 2020

The rights of Greg Attwood, Tom Begley, Ian Bettison, Alan Clegg, Martin Crozier, Gill Dyer, Jane Dyer, Keith Gallick, Susan Hooker, Michael Jennings, John Kinoulty, Guilherme Frederico Lima, Jean Littlewood, Bronwen Moran, James Nicholson, Su Nicholson, Laurence Pateman, Keith Pledger, Joe Skrakowski and Harry Smith to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

First published 2020

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library.

Copyright notice
All rights reserved. No part of this may be reproduced in any form or by any means (including photocopying or storing it in any medium by electronic means and whether or not transiently or incidentally to some other use of this publication) without the written permission of the copyright owner, except in accordance with the provisions of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency, 5th Floor, Shackleton House, 4 Battlebridge Lane, London SE1 2HX. Applications for the copyright owner's written permission should be addressed to the publisher.

Printed in Slovakia by Neografia

Picture credits
The authors and publisher would like to thank the following individuals and organisations for permission to reproduce photographs: Shutterstock.com, Alamy Stock Photo and Getty Images.
All other images © Pearson Education Limited 2020
All artwork © Pearson Education Limited 2020

Endorsement statement
In order to ensure that this resource offers high-quality support for the associated Pearson qualification, it has been through a review process by the awarding body. This process confirmed that this resource fully covers the teaching and learning content of the specification or part of a specification at which it is aimed. It also confirms that it demonstrates an appropriate balance between the development of subject skills, knowledge and understanding, in addition to preparation for assessment.
Endorsement does not cover any guidance on assessment activities or processes (e.g. practice questions or advice on how to answer assessment questions) included in the resource, nor does it prescribe any particular approach to the teaching or delivery of a related course.
While the publishers have made every attempt to ensure that advice on the qualification and its assessment is accurate, the official specification and associated assessment guidance materials are the only authoritative source of information and should always be referred to for definitive guidance.
Pearson examiners have not contributed to any sections in this resource relevant to examination papers for which they have responsibility.
Examiners will not use endorsed resources as a source of material for any assessment set by Pearson. Endorsement of a resource does not mean that the resource is required to achieve this Pearson qualification, nor does it mean that it is the only suitable material available to support the qualification, and any resource lists produced by the awarding body shall include this and other appropriate resources.

CONTENTS

COURSE STRUCTURE
ABOUT THIS BOOK
QUALIFICATION AND ASSESSMENT OVERVIEW
EXTRA ONLINE CONTENT

CHAPTER 1 SAMPLING
1.1 SAMPLING
1.2 USING A RANDOM NUMBER TABLE
1.3 RANDOM SAMPLING
1.4 NON-RANDOM SAMPLING
CHAPTER REVIEW 1

CHAPTER 2 COMBINATIONS OF RANDOM VARIABLES
2.1 COMBINATIONS OF RANDOM VARIABLES
CHAPTER REVIEW 2

CHAPTER 3 ESTIMATORS AND CONFIDENCE INTERVALS
3.1 ESTIMATORS, BIAS AND STANDARD ERROR
3.2 CONFIDENCE INTERVALS
CHAPTER REVIEW 3

REVIEW EXERCISE 1

CHAPTER 4 CENTRAL LIMIT THEOREM AND TESTING THE MEAN
4.1 THE CENTRAL LIMIT THEOREM
4.2 APPLYING THE CENTRAL LIMIT THEOREM TO OTHER DISTRIBUTIONS
4.3 CONFIDENCE INTERVALS USING THE CENTRAL LIMIT THEOREM
4.4 HYPOTHESIS TESTING THE MEAN
4.5 HYPOTHESIS TESTING FOR THE DIFFERENCE BETWEEN MEANS
4.6 USE OF LARGE SAMPLE RESULTS FOR AN UNKNOWN POPULATION
CHAPTER REVIEW 4

CHAPTER 5 CORRELATION
5.1 SPEARMAN'S RANK CORRELATION COEFFICIENT
5.2 HYPOTHESIS TESTING FOR ZERO CORRELATION
CHAPTER REVIEW 5

CHAPTER 6 GOODNESS OF FIT AND CONTINGENCY TABLES
6.1 GOODNESS OF FIT
6.2 DEGREES OF FREEDOM AND THE CHI-SQUARED (χ²) FAMILY OF DISTRIBUTIONS
6.3 TESTING A HYPOTHESIS
6.4 TESTING THE GOODNESS OF FIT WITH DISCRETE DATA
6.5 TESTING THE GOODNESS OF FIT WITH CONTINUOUS DATA
6.6 USING CONTINGENCY TABLES
CHAPTER REVIEW 6

REVIEW EXERCISE 2
EXAM PRACTICE
THE NORMAL DISTRIBUTION FUNCTION
PERCENTAGE POINTS OF THE NORMAL DISTRIBUTION
BINOMIAL CUMULATIVE DISTRIBUTION FUNCTION
POISSON CUMULATIVE DISTRIBUTION FUNCTION
PERCENTAGE POINTS OF THE χ² DISTRIBUTION FUNCTION
CRITICAL VALUES FOR CORRELATION COEFFICIENTS
RANDOM NUMBERS
GLOSSARY
ANSWERS
INDEX

ABOUT THIS BOOK

The following three themes have been fully integrated throughout the Pearson Edexcel International Advanced Level in Mathematics series, so they can be applied alongside your learning.

1. Mathematical argument, language and proof
• Rigorous and consistent approach throughout
• Notation boxes explain key mathematical language and symbols

2. Mathematical problem-solving
• Hundreds of problem-solving questions, fully integrated into the main exercises
• Problem-solving boxes provide tips and strategies
• Challenge questions provide extra stretch
[Figure: The Mathematical Problem-Solving Cycle]

3. Transferable skills
• Transferable skills are embedded throughout this book, in the exercises and in some examples
• These skills are signposted to show students which skills they are using and developing

Finding your way around the book
• Each chapter starts with a list of Learning objectives.
• The Prior knowledge check helps make sure you are ready to start the chapter.
• Glossary terms will be identified by bold blue text on their first appearance.
• Each chapter is mapped to the specification content for easy reference.
• The real world applications of the maths you are about to learn are highlighted at the start of the chapter.
• Each section begins with an explanation and key learning points.
• Problem-solving boxes provide hints, tips and strategies, and Watch out boxes highlight areas where students often lose marks in their exams.
• Step-by-step worked examples focus on the key types of questions you'll need to tackle.
• Exercise questions are carefully graded so they increase in difficulty and gradually bring you up to exam standard.
• Exercises are packed with exam-style questions to ensure you are ready for the exams.
• Exam-style questions are flagged with (E). Problem-solving questions are flagged with (P).
• Transferable skills are signposted where they naturally occur in the exercises and examples.
• Each chapter ends with a Chapter review and a Summary of key points.
• After every few chapters, a Review exercise helps you consolidate your learning with lots of exam-style questions.
• A full practice paper at the back of the book helps you prepare for the real thing.

QUALIFICATION AND ASSESSMENT OVERVIEW

Qualification and content overview
Statistics 3 (S3) is an optional unit in the following qualifications:
• International Advanced Subsidiary in Further Mathematics
• International Advanced Level in Further Mathematics

Assessment overview
The following table gives an overview of the assessment for this unit. We recommend that you study this information closely to help ensure that you are fully prepared for this course and know exactly what to expect in the assessment.

Unit: S3: Statistics 3 (Paper code WST03/01)
Percentage: 33⅓% of IAS; 16⅔% of IAL
Mark: 75
Time: 1 hour 30 mins
Availability: June
First assessment: June 2020

IAS: International Advanced Subsidiary, IAL: International Advanced A Level.
Assessment objectives and weightings

AO1 (minimum weighting in IAS and IAL: 30%): Recall, select and use their knowledge of mathematical facts, concepts and techniques in a variety of contexts.
AO2 (30%): Construct rigorous mathematical arguments and proofs through use of precise statements, logical deduction and inference and by the manipulation of mathematical expressions, including the construction of extended arguments for handling substantial problems presented in unstructured form.
AO3 (10%): Recall, select and use their knowledge of standard mathematical models to represent situations in the real world; recognise and understand given representations involving standard models; present and interpret results from such models in terms of the original situation, including discussion of the assumptions made and refinement of such models.
AO4 (5%): Comprehend translations of common realistic contexts into mathematics; use the results of calculations to make predictions, or comment on the context; and, where appropriate, read critically and comprehend longer mathematical arguments or examples of applications.
AO5 (5%): Use contemporary calculator technology and other permitted resources (such as formulae booklets or statistical tables) accurately and efficiently; understand when not to use such technology, and its limitations. Give answers to appropriate accuracy.

Relationship of assessment objectives to units

S3                 AO1       AO2        AO3       AO4       AO5
Marks out of 75    25–30     20–25      10–15     5–10      5–10
%                  33⅓–40    26⅔–33⅓    13⅓–20    6⅔–13⅓    6⅔–13⅓

Calculators
Students may use a calculator in assessments for these qualifications. Centres are responsible for making sure that calculators used by their students meet the requirements given in the table below.
Students are expected to have available a calculator with at least the following keys: +, −, ×, ÷, π, x², √x, 1/x, x^y, ln x, e^x, x!, sine, cosine and tangent and their inverses in degrees and decimals of a degree, and in radians; memory.
Prohibitions
Calculators with any of the following facilities are prohibited in all examinations:
• databanks
• retrieval of text or formulae
• built-in symbolic algebra manipulations
• symbolic differentiation and/or integration
• language translators
• communication with other machines or the internet

EXTRA ONLINE CONTENT

Whenever you see an Online box, it means that there is extra online content available to support you.

SolutionBank
SolutionBank provides worked solutions for questions in the book. Download the solutions as a PDF or quickly find the solution you need online.

Use of technology
Explore topics in more detail, visualise problems and consolidate your understanding. Use pre-made GeoGebra activities or Casio resources for a graphic calculator.
• GeoGebra-powered interactives: interact with the maths you are learning using GeoGebra's easy-to-use tools.
• Graphic calculator interactives: explore the maths you are learning and gain confidence in using a graphic calculator.

Calculator tutorials
Our helpful video tutorials will guide you through how to use your calculator in the exams. They cover both Casio's scientific and colour graphic calculators.

CHAPTER 1 SAMPLING

Learning objectives
After completing this chapter you should be able to:
• Use random numbers for sampling
• Take a simple random sample
• Take a stratified sample
• Understand the advantages and disadvantages of the different methods of sampling, and the circumstances in which each might be used

Prior knowledge check
1 […] and fruit. From previous experience, people choose desserts in the ratio 4:3:1. […] 440 people is planned next […]. How many […] of cake should […]?

When trying to find out what people think about a product, sampling techniques are used to gather information that represents the population.
1.1 Sampling

In Statistics 2, you learned about populations and samples. Some of the key points about sampling are stated again below:
• The size of the sample can affect the validity (i.e. how true something is) of any conclusions drawn.
• The size of the sample depends on the required accuracy and available resources.
• Generally, the larger the sample, the more accurate it is, but it will cost more money or take more time.
• If the population is varied, you need a larger sample than if the population were uniform.
• Different samples can lead to different conclusions due to the natural variation in a population.
• Individual units of a population are known as sampling units.
• Often sampling units of a population are individually (separately) named or numbered to form a list called a sampling frame.

Example
Give a brief explanation, and an example of the use, of:
a a census
b a sample survey
A school wants to carry out a sample survey on their students. Give an example of:
c a sampling unit
d a sampling frame

Solution
a Census: every member of the population is observed. Example: 10-year national census.
b Sample survey: a small portion of the population is observed. Example: opinion polls.
c Sampling unit: the enrolment number of the student.
d Sampling frame: a list of all of the enrolment numbers of the students.

1.2 Using a random number table

Once you have a sampling frame where each sampling unit is a number, you can use a random number table (such as the one given in the formula book) to generate random numbers, instead of using a calculator. The random number table is constructed with great care so that each digit is equally likely to appear.

Suppose you want a sample of 50. You will need to select 50 random numbers from the table. You could start at the top left-hand corner and work down the column. If you reach the bottom of the table, you could start again at the top with the next unused digits along the top row. However, it is better to start at a randomly-selected place in the table. From there you may travel in any direction. If a number appears that has already appeared, it is ignored (in effect this is sampling without replacement). Once you have extracted 50 random numbers, the sample is selected from the numbered sampling frame by using these numbers.

• In random number sampling, each element is given a number. The numbers of the required elements are selected by using random number tables or other random number generators.

Example
You are going to take a sample of size 5 from a population of size 400. Write down the first five random numbers, using the thirteenth column of the table on page 144 and working down.

Solution
The population size is 400, so three-digit numbers are read from the table and any number outside the range of the sampling frame is ignored.
The numbers are 372, 039, 172, […] and 053.

Exercise 1A
1 Explain briefly what is meant by the term 'sampling'. Give three advantages of taking a sample instead of a census.
2 Define what is meant by a census. By referring to specific examples, suggest two reasons why a census might be used.
3 A factory makes safety ropes for climbers and has an order to supply 3000 ropes. The buyer wants to know if the load at which the ropes break is more than a certain figure. Suggest a reason why a census would not be used for this purpose.
4 Explain:
a why a sample might be preferred to a census
b what you understand by a sampling frame
c what effect the size of the population has on the size of the sampling frame.
5 Using the random number 7 for the column and 3 for the line as the starting point in the random number table, select a sample of size 6 from the numbers:
a 0–99
b 50–150
c 1–600

1.3 Random sampling

Simple random sampling
• In simple random sampling, you would assign each sampling unit a number, and then use a random number generator or table to select the required sample size.

Systematic sampling
• In systematic sampling, the required elements are chosen at regular intervals from an ordered list.
For example, if a sample of size 20 was required from a population of 100, you would take every fifth person since $100 \div 20 = 5$. The first person should be chosen at random. For example, if the first person chosen is number 2 in the list, the remaining sample would be persons 7, 12, 17, etc.

Stratified sampling
• In stratified sampling, the population is divided into mutually exclusive strata (males and females, for example) and a random sample is taken from each.
The proportion of each stratum sampled should be the same. A simple formula can be used to calculate the number of people we should sample from each stratum:

$\text{number sampled in a stratum} = \dfrac{\text{number in stratum}}{\text{number in population}} \times \text{overall sample size}$

Example
A factory manager wants to find out what his workers think about the factory canteen facilities. The manager gives a questionnaire to a sample of 80 workers. It is thought that different age groups will have different opinions.
There are 75 workers between 18 and 32 years old.
There are 140 workers aged between 33 and 47.
There are 85 workers aged between 48 and 62.
a Write down the name of the method of sampling the manager should use.
b Explain how he could use this method to select a sample of workers' opinions.

Solution
a Stratified sampling.
b There are $75 + 140 + 85 = 300$ workers altogether.
18–32: $\frac{75}{300} \times 80 = 20$ workers
33–47: $\frac{140}{300} \times 80 = 37\frac{1}{3} \approx 37$ workers
48–62: $\frac{85}{300} \times 80 = 22\frac{2}{3} \approx 23$ workers
Number the workers in each age group. Use a random number table (or generator) to produce the required quantity of random numbers. Give the questionnaire to the workers corresponding to these numbers.

Each method of random sampling has advantages and disadvantages.
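The stratum sizes in the canteen example above can be checked with a short calculation. The Python sketch below is an illustration only: the function name `stratified_allocation` and the dictionary layout are assumptions made for this example and do not come from the textbook. It applies the formula (number in stratum ÷ number in population) × overall sample size to each age group and then rounds to the nearest whole worker.

```python
def stratified_allocation(strata_sizes, sample_size):
    """Return the exact (unrounded) number to sample from each stratum.

    strata_sizes: dict mapping a stratum label to the number of units in it.
    sample_size:  overall sample size required.
    """
    population = sum(strata_sizes.values())
    return {
        label: size / population * sample_size
        for label, size in strata_sizes.items()
    }


# Age groups from the factory canteen example (300 workers, sample of 80).
workers = {"18-32": 75, "33-47": 140, "48-62": 85}

exact = stratified_allocation(workers, 80)
rounded = {label: round(value) for label, value in exact.items()}

for label in workers:
    print(label, exact[label], rounded[label])
```

Rounding each stratum separately happens to give a total of 80 here, but that is not guaranteed in general, so it is worth checking that the rounded values still add up to the required sample size.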
Simple random sampling
Advantages:
• Free of bias
• Easy and inexpensive to implement for small populations and small samples
• Each sampling unit has a known and equal chance of selection
Disadvantages:
• Not suitable when the population size or the sample size is large
• A sampling frame is needed

Systematic sampling
Advantages:
• Simple and quick to use
• Suitable for large samples and large populations
Disadvantages:
• A sampling frame is needed
• It can introduce bias if the sampling frame is not random

Stratified sampling
Advantages:
• Sample accurately reflects the population structure
• Guarantees proportional representation of groups within a population
Disadvantages:
• Population must be clearly classified into separate strata
• Selection within each stratum suffers from the same disadvantages as simple random sampling

Exercise 1B (SKILLS: REASONING/ARGUMENTATION)
1 a The head teacher of an infant school wants to take a stratified sample of 20% of the children at the school. The school has the following numbers of children.

Year 1   Year 2   Year 3
40       60       80

Work out how many children in each age group will be in the sample.
b Describe one benefit to the head teacher of using a stratified sample.
Hint: When describing advantages or disadvantages of a particular sampling method, always refer to the context of the question.
2 A survey is carried out on 100 members of the adult population of a small town. The population of the town is 2000. An alphabetical list of the inhabitants of the town is available.
a Explain one limitation of using a systematic sample in this situation.
b Describe a sampling method that would be free of bias for this survey.
3 A gym wants to take a sample of its members. Each member has a 5-digit membership number, and the gym selects every member with a membership number ending 000.
a Is this a systematic sample? Give a reason for your answer.
b Suggest one way of improving the reliability of this sample.
4 A school head teacher wants to get the opinion of year 12 and year 13 students about the facilities available in the common room. The table shows the numbers of students in each year.

          Year 12   Year 13
Male      70        […]
Female    85        […]

a Suggest a suitable sampling method that might be used to take a sample of 40 students.
b How many students from each gender in each of the two years should the head teacher ask?
5 A factory manager wants to get information about how her employees travel to work. There are 480 employees in the factory, and each has a unique employee number from 1 to 480. Explain how the manager could take a systematic sample of size 30 from these employees.
6 The director of a sports club wants to take a sample of members. Each member has a unique membership number. There are 121 members who play tennis, 145 members who play badminton and 104 members who play squash. No members play more than one sport.
a Explain how the director could take a simple random sample of 30 members and state one disadvantage of this sampling method.
The director decides to take a stratified sample of 30 members.
b State one advantage of this method of sampling.
c Work out the number of members who play each sport that the director should select for the sample.

1.4 Non-random sampling

Quota sampling
• In quota sampling, an interviewer or researcher selects a sample that reflects the characteristics of the whole population.
The population is divided into groups according to a given characteristic. The size of each group determines (decides) the proportion of the sample that should have that characteristic.
As an interviewer, you would meet people, assess their group, and then allocate them into the appropriate quota. This continues until all quotas have been filled. If a person refuses to be interviewed or the quota into which they fit is full, then you simply ignore them and move on to the next person.

Quota sampling
Advantages:
• Allows a small sample to still be representative of the population
• No sampling frame required
• Quick, easy and inexpensive
• Allows for easy comparison between different groups within a population
Disadvantages:
• Non-random sampling can introduce bias
• Population must be divided into groups, which can be costly or inaccurate
• Non-responses are not recorded

Exercise 1C (SKILLS: ANALYSIS; DECISION MAKING)
1 Interviewers at a shopping centre collect information on the spending habits from a total of 40 shoppers. Explain how they could collect the information using quota sampling.
2 Describe the similarities and differences between quota sampling and stratified random sampling.
3 An interviewer stops the first 50 people he sees outside a kebab shop on a Friday evening and asks them about their eating habits.
a Explain why his sampling method may not be representative.
b Suggest two improvements he could make to his data collection technique.
4 A researcher is collecting data on the radio listening habits of people in a local town. She asks the first five people she sees entering a supermarket on Monday morning. The number of hours per week each person listens to the radio is given below:
4  7  6  8  2
a Use the sample data to work out a prediction for the average number of hours listened per week for the town as a whole.
b Describe the reliability of the data.
c Suggest two improvements to the method used.
5 The heights, in metres, of 20 emu are listed below:
[Data table: 20 heights in metres; values illegible in the source]
a Take an opportunity sample of size 5 from the data.
b Starting from the second data value, take a systematic sample of size 5 from the data.
c Calculate the mean height for each sample.
d State, with reasons, which sampling method is likely to be more reliable.
Hint: An example of an opportunity sample from this data would be to select the first five heights from the list.

Chapter review 1
1 The table shows the daily high temperature (in °C) recorded on the first 15 days of January 2019 in Jacksonville, Florida.
[Data table: day of month (1 to 15) against daily high temperature (°C); values illegible in the source]
a Describe how you could use the random number function on your calculator to select a simple random sample of 5 dates from this data.
Hint: Make sure you describe your sampling frame.
b Use a simple random sample of 5 dates to estimate the mean high temperature in Jacksonville for the first 15 days of January 2019.
c Use all 15 dates to calculate the mean high temperature in Jacksonville for the first 15 days of January 2019. Comment on the reliability of your two samples.
2 a Give one advantage and one disadvantage of using:
i a census
ii a sample survey.
b It is decided to take a sample of 100 from a population consisting of 500 elements. Explain how you would obtain a simple random sample from this population.
3 a Explain briefly what is meant by:
i a population
ii a sampling frame.
b A market research organisation wants to take a sample of:
i owners of diesel automobiles in the UK
ii people living in Oxford who suffered back injuries during July 2017.
Suggest a suitable sampling frame in each case.
4 Write down one advantage and one disadvantage of using:
a stratified sampling
b simple random sampling.
5 The managing director of a medium-sized airline wants to know what the employees think about the overtime pay scheme. The airline employs 100 pilots and 200 cabin crew. The managing director decides to ask the pilots.
a Suggest a reason why this is likely to produce a biased sample.
b Explain briefly how the managing director could select a sample of 30 employees using:
i systematic sampling
ii stratified sampling
iii quota sampling.
6 There are 64 girls and 56 boys in a school. Explain briefly how you could take a random sample of 15 students using:
a simple random sampling
b stratified sampling.
7 As part of her class project, Deepa decided to estimate the amount of time A level students at her school spent on private study each week. She took a random sample of students from those studying arts subjects, science subjects and a mixture of arts and science subjects. Each student kept a record of the time they spent on private study during the third week of term.
a Write down the name of the sampling method used by Deepa.
b Give a reason for using this method and give one advantage this method has over simple random sampling.

Summary of key points
1 • In statistics, a population is the whole set of items that are of interest.
  • A census observes or measures every member of a population.
2 • A sample is a selection of observations taken from a subset of the population which is used to find out information about the population as a whole.
  • Individual units of a population are known as sampling units.
  • Often sampling units of a population are individually named or numbered to form a list called a sampling frame.
3 • A simple random sample of size n is one where every sample of size n has an equal chance of being selected.
  • In systematic sampling, the required elements are chosen at regular intervals from an ordered list.
  • In stratified sampling, the population is divided into mutually exclusive strata (males and females, for example) and a random sample is taken from each.
  • In quota sampling, an interviewer or researcher selects a sample that reflects the characteristics of the whole population.

CHAPTER 2 COMBINATIONS OF RANDOM VARIABLES

Learning objectives
After completing this chapter you should be able to:
• Find the distribution of linear combinations and functions of normal random variables
• Solve modelling problems involving linear combinations and functions of normal random variables

Prior knowledge check
1 A random variable $Y$ is normally distributed, $Y \sim$ […]. Find:
a $P(Y > 15)$
b […]
2 […] Find:
a $p$ such that $P(X > p) =$ […]
b $q$ such that $P(-q <$ […]  (Statistics 1, Section 7.3)
3 $X$ is a random variable with $E(X) = 7$ and $\mathrm{Var}(X) =$ […]. Find:
a […]
b $\mathrm{Var}(3X - 5)$  (Statistics 1, Section 6.5)

2.1 Combinations of random variables

Two random variables are independent if the outcome of one variable does not affect the distribution of the other. You need to be able to combine random variables with different distributions. You will use these two results:

• If $X$ and $Y$ are two independent random variables, then:
  $E(X + Y) = E(X) + E(Y)$
  $E(X - Y) = E(X) - E(Y)$
• If $X$ and $Y$ are two independent random variables, then:
  $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$
  $\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$

Example
If $X$ is a random variable with $E(X) = \mu_1$ and $\mathrm{Var}(X) = \sigma_1^2$, and $Y$ is an independent random variable with $E(Y) = \mu_2$ and $\mathrm{Var}(Y) = \sigma_2^2$, find the mean and variance of:
a $X + Y$
b $X - Y$

Solution
a $E(X + Y) = E(X) + E(Y) = \mu_1 + \mu_2$
  $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = \sigma_1^2 + \sigma_2^2$
b $E(X - Y) = E(X) - E(Y) = \mu_1 - \mu_2$
  $\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = \sigma_1^2 + \sigma_2^2$

You can combine the above result with standard results about expectations and variances of multiples of a random variable to analyse linear combinations of independent random variables.

• If $X$ and $Y$ are two independent random variables, then:
  $E(aX + bY) = aE(X) + bE(Y)$
  $E(aX - bY) = aE(X) - bE(Y)$
• If $X$ and $Y$ are two independent random variables, then:
  $\mathrm{Var}(aX + bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y)$
  $\mathrm{Var}(aX - bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y)$

In this chapter you will apply these results to analyse normally distributed random variables.

• A linear combination of normally distributed independent random variables is also normally distributed.

This result allows you to fully define the distribution of a linear combination of independent normal random variables.
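The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than part of the textbook: the helper name `linear_combination` and the distributions $X \sim N(10, 3^2)$ and $Y \sim N(4, 4^2)$ are hypothetical, chosen only to show the arithmetic. The sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later), whose `sigma` argument is the standard deviation, not the variance.

```python
from math import sqrt
from statistics import NormalDist


def linear_combination(a, X, b, Y):
    """Distribution of aX + bY for independent normal X and Y.

    The mean is a*E(X) + b*E(Y); the variance is a^2*Var(X) + b^2*Var(Y),
    so the variances are added even when b is negative.
    """
    mean = a * X.mean + b * Y.mean
    variance = a**2 * X.variance + b**2 * Y.variance
    return NormalDist(mean, sqrt(variance))


# Hypothetical distributions, chosen only for illustration.
X = NormalDist(mu=10, sigma=3)   # X ~ N(10, 3^2)
Y = NormalDist(mu=4, sigma=4)    # Y ~ N(4, 4^2)

D = linear_combination(1, X, -1, Y)   # D = X - Y
print(D.mean, D.variance)             # mean 6, variance 9 + 16 = 25

# P(X > Y) is the same as P(D > 0).
p_x_greater = 1 - D.cdf(0)
print(round(p_x_greater, 4))
```

Passing `b = -1` shows the key point of the variance rule: the mean of the difference is the difference of the means, but the variance of the difference is still the sum of the variances.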
• If $X$ and $Y$ are independent random variables with $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, then:
  $aX + bY \sim N(a\mu_1 + b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$
  $aX - bY \sim N(a\mu_1 - b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$

You can also use this result to find the distribution of sums of identically distributed independent normal random variables.

• If $X_1, X_2, \ldots, X_n$ are independent identically distributed random variables with $X_i \sim N(\mu, \sigma^2)$, then
  $\sum_{i=1}^{n} X_i \sim N(n\mu,\ n\sigma^2)$

Hint: Sometimes it is better not to use these formulae, and to use the expected values and variances as above.

Example
The independent random variables $X$ and $Y$ have distributions $X \sim N(5, 2^2)$ and $Y \sim N(10, 3^2)$.
a Find the distribution of:
  i $A = X + Y$
  ii $B = 9X - 2Y$
b Find $P(B > 30)$.
The independent random variables $Y_1$, $Y_2$, $Y_3$ and $Y_4$ all have the same distribution as $Y$. The random variable $Z$ is defined as $Z = \sum_{i=1}^{4} Y_i$.
c Find the mean and standard deviation of $Z$.

Solution
a i $E(A) = E(X + Y) = E(X) + E(Y) = 5 + 10 = 15$
    $\mathrm{Var}(A) = \mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = 4 + 9 = 13$
    $A \sim N(15, 13)$
  ii $E(B) = E(9X - 2Y) = E(9X) - E(2Y) = 9E(X) - 2E(Y) = 9 \times 5 - 2 \times 10 = 25$
    $\mathrm{Var}(B) = \mathrm{Var}(9X - 2Y) = \mathrm{Var}(9X) + \mathrm{Var}(2Y) = 9^2\,\mathrm{Var}(X) + 2^2\,\mathrm{Var}(Y) = 81 \times 4 + 4 \times 9 = 360$
    $B \sim N(25, 360)$
b $P(B > 30) = 1 - \Phi\!\left(\dfrac{30 - 25}{\sqrt{360}}\right) = 1 - \Phi(0.2635\ldots) \approx 0.396$
c $Z = Y_1 + Y_2 + Y_3 + Y_4$
  $E(Z) = E(Y_1 + Y_2 + Y_3 + Y_4) = E(Y_1) + E(Y_2) + E(Y_3) + E(Y_4) = 4 \times 10 = 40$
  $\mathrm{Var}(Z) = \mathrm{Var}(Y_1 + Y_2 + Y_3 + Y_4) = \mathrm{Var}(Y_1) + \mathrm{Var}(Y_2) + \mathrm{Var}(Y_3) + \mathrm{Var}(Y_4) = 4 \times 3^2 = 36$
  $Z \sim N(40, 36)$
  So $Z$ has a mean of 40 and a standard deviation of 6.

Example
The independent random variables $X$ and $Y$ have distributions $X \sim N(25, 6)$ and $Y \sim N(22, 10)$. Find $P(X > Y)$.

Solution
Let $C = X - Y$.
$E(C) = E(X) - E(Y) = 25 - 22 = 3$
$\mathrm{Var}(C) = \mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) = 6 + 10 = 16$
$C \sim N(3, 16)$
$P(X > Y) = P(C > 0) = P\!\left(Z > \dfrac{0 - 3}{4}\right) = P(Z > -0.75) = 0.7734$

Problem-solving: You can compare independent normal random variables by defining a new random variable to be the difference between them. If $X - Y > 0$ then $X > Y$, and if $X - Y < 0$ then $Y > X$.

Example
Bottles of water are delivered to shops in boxes containing 12 bottles each. The weights of bottles are normally distributed with mean weight 2 kg and standard deviation 0.05 kg. The weights of empty boxes are normally distributed with mean 2.5 kg and standard deviation 0.3 kg.
a Assuming that all random variables are independent, find the probability that a full box will weigh between 26 kg and 27 kg.
b Two bottles are selected at random from a box. Find the probability that they differ in weight by more than 0.1 kg.
c Find the weight $m$ that a full box should have on its label so that there is a 1% chance that it weighs more than $m$.

Solution
a Let $W = X_1 + X_2 + \ldots + X_{12} + B$, where $X_i$ is the weight of a bottle and $B$ is the weight of the empty box.
  $E(W) = E(X_1) + E(X_2) + \ldots + E(X_{12}) + E(B) = 12 \times 2 + 2.5 = 26.5$
  $\mathrm{Var}(W) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \ldots + \mathrm{Var}(X_{12}) + \mathrm{Var}(B) = 12 \times 0.05^2 + 0.3^2 = 0.12$
  $W \sim N(26.5, 0.12)$
  $P(26 < W < 27) = P\!\left(\dfrac{26 - 26.5}{\sqrt{0.12}} < Z < \dfrac{27 - 26.5}{\sqrt{0.12}}\right) = P(-1.44 < Z < 1.44) = P(Z < 1.44) - P(Z < -1.44) = 0.9251 - 0.0749 = 0.8502$
b Let $Y = X_1 - X_2$.
  $E(Y) = E(X_1) - E(X_2) = 0$
  $\mathrm{Var}(Y) = \mathrm{Var}(X_1 - X_2) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) = 0.005$
  So $Y \sim N(0, 0.005)$
  $P(|Y| > 0.1) = 1 - P(-0.1 < Y < 0.1) = 1 - P(-1.41 < Z < 1.41) = 0.1586$
c $P(W > m) = 0.01$, so $P(W < m) = 0.99$
  $P\!\left(Z < \dfrac{m - 26.5}{\sqrt{0.12}}\right) = 0.99$
  $\dfrac{m - 26.5}{\sqrt{0.12}} = 2.3263$
  $m = 2.3263 \times \sqrt{0.12} + 26.5 = 27.3$ kg (3 s.f.)

Exercise 2A
1 Given the random variables $X \sim N(80, 3^2)$ and $Y \sim N(50, 2^2)$ where $X$ and $Y$ are independent, find the distribution of $W$ where:
a $W = X + Y$
b $W = X - Y$
2 Given the random variables $X \sim N(45, 6)$, $Y \sim N(54, 4)$ and $W \sim N(49, 8)$ where $X$, $Y$ and $W$ are independent, find the distribution of $R$, where $R = X + Y + W$.
3 $X$ and $Y$ are independent normal random variables. $X \sim N(60, 25)$ and $Y \sim N(50, 16)$. Find the distribution of $T$ where:
a $T = 3X$
b $T = 7Y$
c $T = 3X + 7Y$
d $T = X - 2Y$
4 $A$, $B$ and $C$ are independent normal random variables. $A \sim N(50, 6)$, $B \sim N(60, 8)$ and $C \sim N(80, 10)$. Find:
a $P(A + B < 115)$
b $P(A + B + C > 198)$
c $P(B + C < 138)$
d $P(2A + B - C < 70)$
e $P(A + 3B - C > 140)$
f $P(105 < A + B < 116)$
5 (P) $X$ and $Y$ are independent random variables with $X \sim N(76, 15)$ and $Y \sim N(80, 10)$. Find:
a $P(Y > X)$
b $P(X > Y)$
c the probability that $X$ and $Y$ differ by:
  i less than 3
  ii more than 7.
6 Two runners recorded the mean and standard deviation of their 100 m sprint times in a table.
Runner | Mean | Standard deviation
Runner A | 13.2 seconds | 0.9 seconds
Runner B | 12.9 seconds | 1.3 seconds
a Assuming that each runner's times are normally distributed, find the probability that in a head-to-head race, runner A will win by more than 0.5 seconds. (5 marks)
A 'photo finish' occurs if the winning margin (the difference between two times) is less than 0.1 seconds.
b Find the probability of a 'photo finish'. (marks)
7 A factory makes steel rods and steel tubes. The diameter of a steel rod is normally distributed with mean 3.55 cm and standard deviation 0.02 cm. The internal diameter of a steel tube is normally distributed with mean 3.60 cm and standard deviation 0.02 cm. A rod and a tube are selected at random. Find the probability that the rod cannot pass through the tube. (marks)
8 The mass of a randomly selected jar of jam is normally distributed with a mean mass of 1 kg and a standard deviation of 12 g. The jars are packed in boxes of 6 and the mass of the box is normally distributed with mean mass 250 g and standard deviation 10 g. Find the probability that a randomly chosen box of 6 jars will have a mass less than 6.2 kg. (6 marks)

Chapter review
1 X, Y and W are independent normal random variables. X ~ N(8, 2), Y ~ N(12, 3) and W ~ N(15, 4). Find the distribution of A where:
a A = X + Y + W
b A = W − X
c A = X − Y + 3W
d A = 3X + 4W
e A = 2X − Y + W
2 Given the random variables X ~ N(20, 5) and Y ~ N(10, 4) where X and Y are independent, find:
a E(X − Y) (2 marks)
b Var(X − Y) (2 marks)
c …
5 A sweet manufacturer produces two varieties of fruit sweet, Xtras and Yummies. The masses, X and Y, in grams, of randomly selected Xtras and Yummies are such that X ~ N(30, 25) and Y ~ N(32, 16).
a Find the probability that the mass of two randomly selected Yummies will differ by more than 5 g.
(5 marks)
One sweet of each variety is selected at random.
b Find the probability that the Yummy sweet has a greater mass than the Xtra. (5 marks)
A packet contains 6 Xtras and 4 Yummies.
c Find the probability that the average mass of the sweets in the packet lies between 28.0 g and 33.0 g. (6 marks)
6 A certain brand of biscuit is individually wrapped. The mass of a biscuit can be taken to be normally distributed with mean 75 g and standard deviation 5 g. The mass of an individual wrapping is normally distributed with mean 10 g and standard deviation 2 g. Six of these individually wrapped biscuits are then packed together. The mass of the packing material is a normal random variable with mean 40 g and standard deviation 3 g. Find, to 3 decimal places, the probability that the total mass of the packet lies between 535 g and 565 g. (7 marks)
7 The independent normal random variables X and Y have distributions $N(10, 2^2)$ and $N(40, 3^2)$ respectively (i.e. in the same order as mentioned before). The random variable Q is defined as Q = 2X + Y.
a Find:
i E(Q) (marks)
ii Var(Q) (3 marks)
The random variables $X_1$, $X_2$, $X_3$, $X_4$ and $X_5$ are independent and all share the same distribution as X. The random variable R is defined as $R = \sum_{i=1}^{5} X_i$.
b i Find the distribution of R.
ii Find P(Q > R). (7 marks)
8 The usable capacity of the hard drive on a games console is normally distributed with mean 60 GB and standard deviation 2.5 GB. The amounts of storage required by games are modelled as being identically normally distributed with mean 5.5 GB and standard deviation 1.2 GB.
a Chakrita wants to save 10 randomly chosen games onto her empty hard drive. Find the probability that they will fit. (8 marks)
b State one assumption you have made in your calculations, and comment on its validity. (1 mark)
9 $X_1$, $X_2$, $X_3$ and $X_4$ are independent random variables, each with distribution N(4, 0.03).
The random variables Y and Z are defined as:
$Y = X_1 + X_2 + X_3$
$Z = 3X_4$
Find the probability that Y and Z differ by no more than 1. (5 marks)
10 A builder purchases bags of sand in two sizes, large and small. Large bags have mass L kg and small bags have mass S kg. L and S are independent normally distributed random variables with distributions $N(75, 5^2)$ and $N(40, 3^2)$ respectively. A large bag and a small bag of sand are chosen at random.
a Find the probability that the mass of the small bag is more than half the mass of the large bag. (marks)
The builder purchases 10 small bags of sand. The total mass of these bags is represented by the random variable M.
b Find P(|M − 400| < 5). (5 marks)

Challenge
You may make use of the fact that for any two independent random variables X and Y, E(XY) = E(X)E(Y). Use this result to prove that if X and Y are independent random variables, then Var(X + Y) = Var(X) + Var(Y).

Summary of key points
1 If X and Y are two independent random variables, then:
• E(X + Y) = E(X) + E(Y)
• E(X − Y) = E(X) − E(Y)
2 If X and Y are two independent random variables, then:
• Var(X + Y) = Var(X) + Var(Y)
• Var(X − Y) = Var(X) + Var(Y)
3 If X and Y are two independent random variables, then:
• E(aX + bY) = aE(X) + bE(Y)
• E(aX − bY) = aE(X) − bE(Y)
4 If X and Y are two independent random variables, then:
• Var(aX + bY) = a²Var(X) + b²Var(Y)
• Var(aX − bY) = a²Var(X) + b²Var(Y)
5 A linear combination of normally distributed random variables is also normally distributed.
6 If X and Y are independent random variables with $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$, then:
• $aX + bY \sim N(a\mu_1 + b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$
• $aX - bY \sim N(a\mu_1 - b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$
7 If $X_1, X_2, \dots, X_n$ are independent identically distributed random variables with $X_i \sim N(\mu, \sigma^2)$, then $\sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2)$.

a Find P(A > B).
The random variable b Find the distribution of X

In a large population (e.g. the trees in a forest), it would take too long or cost too much money to carry out a census (e.g. to record the height of every tree). In cases like this, population parameters such as the mean μ or the standard deviation σ are likely to be unknown. In Chapter 1, you looked at methods of sampling that allow you to take a representative sample to estimate various population parameters.

Links: A census observes every member of a population, whereas a sample is a selection of observations taken from a subset of the population. ← Statistics 2 Section 6.1

A common way of estimating population parameters is to take a random sample from the population.
▪ If X is a random variable, then a random sample of size n will consist of n observations of the random variable X. These are referred to as $X_1, X_2, X_3, \dots, X_n$ where the $X_i$:
• are independent random variables
• each have the same distribution as X.

Notation: $X_i$ represents the ith observation of a sample. The value of the observation is denoted by $x_i$.

▪ A statistic T is defined as a function of the $X_i$ that involves no other quantities, such as unknown population parameters.
For example, $\bar{X}$, the sample mean, is a statistic, whereas $\sum \frac{X_i^2}{n} - \mu^2$ is not a statistic since it involves the unknown population parameter μ.

Example 1
A sample $X_1, X_2, \dots, X_n$ is taken from a population with unknown population parameters μ and σ. State whether or not each of the following are statistics.
a …
b $\max(X_1, X_2, \dots, X_n)$
c …

a … is a statistic.
b $\max(X_1, X_2, \dots, X_n)$ is a statistic.
c …

Since it is possible to repeat the process of taking a sample, the specific value of a statistic T will be different for each sample. If all possible samples are taken, these values will form a probability distribution called the sampling distribution of T.
▪ The sampling distribution of a statistic T is the probability distribution of T.
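If the distribution of the population is known, then the sampling distribution of a statistic can sometimes be found.

Example 2
The masses, in grams, of boxes of apples are normally distributed with a mean μ and standard deviation 4. A random sample of size 25 is taken and the statistics R and T are calculated as follows:
$R = X_{25} - X_1$ and $T = X_1 + X_2 + \dots + X_{25}$
Find the distributions of R and T.

Each $X_i \sim N(\mu, 4^2)$ and the observations are independent.
E(R) = μ − μ = 0 and Var(R) = 4² + 4² = 32, so $R \sim N(0, (4\sqrt{2})^2)$
E(T) = 25μ and Var(T) = 25 × 4² = 400, so $T \sim N(25\mu, 20^2)$

Example 3
In a bag that contains a large number of counters, the number 0 is written on 60% of the counters, and the number 1 is written on the other 40%.
a Find the population mean μ and population variance σ² of the values shown on the counters.
A simple random sample of size 3 is taken from this population.
b List all the possible observations from this sample.
c Find the sampling distribution for the mean $\bar{X} = \frac{X_1 + X_2 + X_3}{3}$, where $X_1$, $X_2$ and $X_3$ are the values shown on the three counters in the sample.
d Hence find E($\bar{X}$) and Var($\bar{X}$).
e Find the sampling distribution for the sample mode M.
f Hence find E(M) and Var(M).

a If X represents the value shown on a randomly chosen counter, then X has distribution:
x | 0 | 1
P(X = x) | 3/5 | 2/5
$\mu = E(X) = \sum x P(X = x) = 0 \times \frac{3}{5} + 1 \times \frac{2}{5} = \frac{2}{5}$
$\sigma^2 = Var(X) = \sum x^2 P(X = x) - \mu^2 = \frac{2}{5} - \frac{4}{25} = \frac{6}{25}$
b The possible observations are:
(0, 0, 0)
(1, 0, 0) (0, 1, 0) (0, 0, 1)
(1, 1, 0) (1, 0, 1) (0, 1, 1)
(1, 1, 1)
c $P(\bar{X} = 0) = \left(\frac{3}{5}\right)^3 = \frac{27}{125}$, i.e. the (0, 0, 0) case
$P(\bar{X} = \frac{1}{3}) = 3 \times \frac{2}{5} \times \left(\frac{3}{5}\right)^2 = \frac{54}{125}$, i.e. the (1, 0, 0), (0, 1, 0), (0, 0, 1) cases
$P(\bar{X} = \frac{2}{3}) = 3 \times \left(\frac{2}{5}\right)^2 \times \frac{3}{5} = \frac{36}{125}$, i.e. the (1, 1, 0), (1, 0, 1), (0, 1, 1) cases
$P(\bar{X} = 1) = \left(\frac{2}{5}\right)^3 = \frac{8}{125}$, i.e. the (1, 1, 1) case
So the distribution of $\bar{X}$ is:
$\bar{x}$ | 0 | 1/3 | 2/3 | 1
$P(\bar{X} = \bar{x})$ | 27/125 | 54/125 | 36/125 | 8/125
d $E(\bar{X}) = 0 + \frac{1}{3} \times \frac{54}{125} + \frac{2}{3} \times \frac{36}{125} + 1 \times \frac{8}{125} = \frac{2}{5}$
$Var(\bar{X}) = \frac{1}{9} \times \frac{54}{125} + \frac{4}{9} \times \frac{36}{125} + \frac{8}{125} - \frac{4}{25} = \frac{2}{25}$
e The sample mode M can take the values 0 or 1.
$P(M = 0) = \frac{27}{125} + \frac{54}{125} = \frac{81}{125}$, i.e. cases (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
$P(M = 1) = \frac{44}{125}$, i.e. the other cases
So the distribution of M is:
m | 0 | 1
P(M = m) | 81/125 | 44/125
f $E(M) = \frac{44}{125}$ and $Var(M) = \frac{44}{125} - \left(\frac{44}{125}\right)^2 \approx 0.228$

▪ A statistic that is used to estimate a population parameter is called an estimator and the particular value of the estimator generated from the sample taken is called an estimate.
You need to be able to determine how reliable these sample statistics are as estimators for the corresponding population parameters.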
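The sampling distributions for the counters example can be checked by listing every possible sample and adding up the probabilities. The following is a minimal sketch, assuming samples of size 3 from a population with P(0) = 3/5 and P(1) = 2/5; the function names (`sampling_distribution`, `expectation`, `variance`) are illustrative and do not come from the text.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Population of counter values: 0 with probability 3/5, 1 with probability 2/5.
population = {0: Fraction(3, 5), 1: Fraction(2, 5)}

def sampling_distribution(statistic, n=3):
    """Exact distribution of statistic(sample) over all ordered samples of size n."""
    dist = defaultdict(Fraction)
    for sample in product(population, repeat=n):
        prob = Fraction(1)
        for value in sample:
            prob *= population[value]
        dist[statistic(sample)] += prob
    return dict(dist)

def mean(sample):
    return Fraction(sum(sample), len(sample))

def mode(sample):
    # With three values that are each 0 or 1, the mode is always unique.
    return max(set(sample), key=sample.count)

def expectation(dist):
    return sum(value * prob for value, prob in dist.items())

def variance(dist):
    mu = expectation(dist)
    return sum((value - mu) ** 2 * prob for value, prob in dist.items())

mean_dist = sampling_distribution(mean)
mode_dist = sampling_distribution(mode)
print(expectation(mean_dist), variance(mean_dist))  # 2/5 2/25
print(expectation(mode_dist), variance(mode_dist))  # 44/125 3564/15625
```

Exact fractions are used rather than floats so that the output can be compared directly with a hand calculation; 3564/15625 is 0.228 to 3 decimal places.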
Since all the $X_i$ are random variables having the same mean and variance as the population, you can sometimes find the expected value of a statistic T, E(T). This will tell you what the 'average' value of the statistic should be.

Example 4
A random sample $X_1, X_2, \dots, X_n$ is taken from a population with $X \sim N(\mu, \sigma^2)$. Show that $E(\bar{X}) = \mu$.

$\bar{X} = \frac{1}{n}(X_1 + X_2 + \dots + X_n)$
$E(\bar{X}) = \frac{1}{n}\left(E(X_1) + E(X_2) + \dots + E(X_n)\right) = \frac{1}{n} \times n\mu = \mu$

This example shows that if you use the sample mean as an estimator of the population mean, then 'on average' it will give the correct value. This is an important property for an estimator to have. You say that $\bar{X}$ is an unbiased estimator of μ. A specific value of $\bar{x}$ will be an unbiased estimate for μ.
▪ If a statistic T is used as an estimator for a population parameter θ and E(T) = θ, then T is an unbiased estimator for θ.
In Example 3, you found two statistics based on samples of size 3 from a population of counters. The two statistics that you calculated were the sample mean $\bar{X}$ and the sample mode M. You could use either of them as estimators for μ, the population mean, but you saw that $E(\bar{X}) = \mu$ and $E(M) \neq \mu$. In this case, if you wanted an unbiased estimator for μ, you would choose the sample mean $\bar{X}$ rather than the sample mode M, which we would call a biased estimator.
How about an estimator for the population mode? Neither of the statistics that you calculated had the property of being unbiased, since $E(\bar{X}) = \frac{2}{5}$ and $E(M) = \frac{44}{125}$, whereas the population mode was 0.
The bias is the expected value of the estimator minus the parameter of the population it is estimating.
▪ If a statistic T is used as an estimator for a population parameter θ, then the bias is E(T) − θ.
In this case, using M as an estimator for the population mode gives a bias of $\frac{44}{125} - 0 = \frac{44}{125}$.
▪ For an unbiased estimator, the bias is 0.
In Example 4, the mean of a sample was an unbiased estimator for the population mean. If you take a sample $X_1$ of size 1 from a population with mean μ and variance σ², then the sample mean is $\bar{X} = X_1$, because there is only one value.
So $E(\bar{X}) = E(X_1) = \mu$.
If you wanted to find an estimator for the population variance, you might try using the variance of the sample, $\frac{\sum (X_i - \bar{X})^2}{n}$.

Hint: In general, the variance of a sample will be an underestimate for the variance of the population. This is because the statistic $\frac{\sum (X_i - \bar{X})^2}{n}$ uses the sample mean $\bar{X}$ rather than the population mean μ, and on average the sample observations will be closer to $\bar{X}$ than to μ.

For our sample $X_1$ of size 1, the variance of the sample will be $\frac{(X_1 - \bar{X})^2}{1} = (X_1 - X_1)^2 = 0$.
So for a sample of size 1, the expected value of the variance of the sample is 0 ≠ σ². This illustrates that the variance of the sample is not an unbiased estimator for the variance of the population.
You can use a slightly different statistic, called the sample variance, as an unbiased estimator for the population variance.
▪ An unbiased estimator for σ² is given by the sample variance S² where:
$S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2$

Notation: S² is the estimator (a random variable), and s² is the estimate (an observation from this random variable).

There are several ways to calculate the value of s² for a particular sample:
$s^2 = \frac{S_{xx}}{n - 1} = \frac{1}{n - 1}\left(\sum x^2 - n\bar{x}^2\right) = \frac{n}{n - 1}\left(\frac{\sum x^2}{n} - \bar{x}^2\right)$

Links: You can use the equivalence of these forms to show that S² is an unbiased estimator for σ². → Challenge at the end of the exercise for this section

The form that you use will depend on the information that you are given in the question.
Although a sample of size 1 can be used as an unbiased estimator of μ, a single observation from a population will not provide a useful estimate of the population mean. You need some way of distinguishing between the quality of different unbiased estimators.

Example 5
A random sample $X_1, X_2, \dots, X_n$ is taken from a population with $X \sim N(\mu, \sigma^2)$. Show that $Var(\bar{X}) = \frac{\sigma^2}{n}$.

$Var(\bar{X}) = Var\left(\frac{1}{n}(X_1 + X_2 + \dots + X_n)\right) = \frac{1}{n^2}\left(Var(X_1) + Var(X_2) + \dots + Var(X_n)\right) = \frac{1}{n^2} \times n\sigma^2 = \frac{\sigma^2}{n}$

One reason that the sample mean is used as an estimator for μ is that the variance of the estimator, $Var(\bar{X}) = \frac{\sigma^2}{n}$, decreases as n increases. For larger values of n the value of an estimate is more likely to be close to the population mean. So, a larger value of n will result in a better estimator.
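The difference between dividing by n and dividing by n − 1 can be seen exactly on a small discrete population. The sketch below is illustrative only: it assumes a population taking the value 0 with probability 3/5 and 1 with probability 2/5, a sample size of 4 chosen arbitrarily, and helper names (`expected_value`, `variance_of_sample`, `sample_variance`) that are not from the text.

```python
from fractions import Fraction
from itertools import product

# Assumed population: value 0 with probability 3/5, value 1 with probability 2/5.
population = {0: Fraction(3, 5), 1: Fraction(2, 5)}
mu = sum(x * p for x, p in population.items())
sigma2 = sum((x - mu) ** 2 * p for x, p in population.items())

def expected_value(statistic, n):
    """Exact E(statistic) over all ordered random samples of size n."""
    total = Fraction(0)
    for sample in product(population, repeat=n):
        prob = Fraction(1)
        for x in sample:
            prob *= population[x]
        total += prob * statistic(sample)
    return total

def variance_of_sample(sample):
    """Divisor n: the variance of the sample itself."""
    n = len(sample)
    xbar = Fraction(sum(sample), n)
    return sum((x - xbar) ** 2 for x in sample) / n

def sample_variance(sample):
    """Divisor n - 1: the statistic S^2."""
    n = len(sample)
    xbar = Fraction(sum(sample), n)
    return sum((x - xbar) ** 2 for x in sample) / (n - 1)

def squared_error_of_mean(sample):
    """(sample mean - mu)^2, whose expectation is Var of the sample mean."""
    return (Fraction(sum(sample), len(sample)) - mu) ** 2

n = 4
print(sigma2)                                    # 6/25
print(expected_value(variance_of_sample, n))     # 9/50, i.e. (n - 1)/n * sigma^2
print(expected_value(sample_variance, n))        # 6/25, equal to sigma^2
print(expected_value(squared_error_of_mean, n))  # 3/50, i.e. sigma^2 / n
```

The divisor-n statistic falls short of σ² by the factor (n − 1)/n, while S² matches it exactly, and the last line agrees with σ²/n for the variance of the sample mean.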
▪ The standard deviation of an estimator is called the standard error of the estimator.
When you are using the sample mean $\bar{X}$ you can use the following result for the standard error.
▪ Standard error of $\bar{X}$ = $\frac{\sigma}{\sqrt{n}}$ or $\frac{s}{\sqrt{n}}$

Hint: Use $\frac{s}{\sqrt{n}}$ in situations where you do not know the population standard deviation.

Example 6
The table below summarises the number of breakdowns X on a busy road on 30 randomly chosen days.
Number of breakdowns | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Number of days | 3 | 5 | 4 | 3 | 5 | 4 | 4 | 2
a Calculate unbiased estimates of the mean and variance of the number of breakdowns.
Twenty more days were randomly sampled, and this sample had $\bar{y} = 6.0$ and $s_y^2 = 5.0$.
b Treating the 50 results as a single sample, obtain further unbiased estimates of the population mean and variance.
c Find the standard error of this new estimate of the mean.
d Estimate the size of sample required to achieve a standard error of less than 0.25.

Notation: 'Hat' notation is used to describe an estimate of a parameter. For example: $\hat{\mu}$ represents an estimate for the population mean μ, and $\hat{\sigma}^2$ represents an estimate for the population variance σ².

a By calculator: $\sum x = 160$ and $\sum x^2 = 990$
$\hat{\mu} = \bar{x} = \frac{160}{30} = 5.33$ (3 s.f.)
$\hat{\sigma}^2 = s_x^2 = \frac{990 - 30\bar{x}^2}{29} = 4.71$ (3 s.f.)
You can use your calculator to find unbiased estimates of the mean and variance, but you should show your working in the exam.
b First you need to use the formulae for $\bar{y}$ and $s_y^2$ to find $\sum y$ and $\sum y^2$.
$\sum y = 20 \times 6.0 = 120$
$s_y^2 = \frac{\sum y^2 - 20 \times 6.0^2}{19} = 5.0$, so $\sum y^2 = 19 \times 5.0 + 720 = 815$
So the combined sample (w) of size 50 has $\sum w = 160 + 120 = 280$ and $\sum w^2 = 990 + 815 = 1805$.
Then the combined estimate of μ is $\bar{w} = \frac{280}{50} = 5.6$
and the estimate for σ² is $s_w^2 = \frac{1805 - 50 \times 5.6^2}{49} = 4.8367\ldots = 4.84$ (3 s.f.)
The best estimate of σ² will be $s_w^2$ since it is based on a larger sample than $s_x^2$ or $s_y^2$.
c The standard error is $\frac{s_w}{\sqrt{50}} = \sqrt{\frac{4.8367\ldots}{50}} = 0.311$ (3 s.f.)
d To achieve a standard error < 0.25 you require
$\sqrt{\frac{4.8367\ldots}{n}} < 0.25$
$n > \frac{4.8367\ldots}{0.25^2}$
$n > 77.38\ldots$
So we need a sample size of at least 78.
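The pooling step in this example is easy to get wrong by hand, so a short numerical check may help. This is a minimal sketch that starts from the summary totals quoted in the working (Σx = 160, Σx² = 990 for the first 30 days, and a mean of 6.0 with unbiased variance estimate 5.0 for the further 20 days); the function names are illustrative, not from the text.

```python
import math

def sums_from_estimates(n, mean, s2):
    """Recover (sum of x, sum of x^2) from n, the sample mean and the unbiased variance estimate."""
    sum_x = n * mean
    sum_x2 = (n - 1) * s2 + n * mean ** 2
    return sum_x, sum_x2

def unbiased_estimates(n, sum_x, sum_x2):
    """Unbiased estimates of the population mean and variance from summary totals."""
    mean = sum_x / n
    s2 = (sum_x2 - n * mean ** 2) / (n - 1)
    return mean, s2

# First sample: totals quoted in the worked solution.
n_x, sum_x, sum_x2 = 30, 160, 990
# Second sample: only n, the mean and s^2 are given, so rebuild its totals first.
n_y = 20
sum_y, sum_y2 = sums_from_estimates(n_y, 6.0, 5.0)

# Combined sample of size 50.
n_w = n_x + n_y
mean_w, s2_w = unbiased_estimates(n_w, sum_x + sum_y, sum_x2 + sum_y2)
standard_error = math.sqrt(s2_w / n_w)

# Smallest whole n with sqrt(s2_w / n) strictly below the target standard error.
target = 0.25
required_n = math.floor(s2_w / target ** 2) + 1

print(round(mean_w, 2), round(s2_w, 2), round(standard_error, 3), required_n)
# 5.6 4.84 0.311 78
```

The inequality n > 77.38… is strict, which is why the sketch rounds up to the next whole number rather than rounding to the nearest one.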
We have seen that for the independent observations $X_i \sim N(\mu, \sigma^2)$, we can evaluate the statistic
$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$
In Example 4, we saw that $E(\bar{X}) = \mu$ and in Example 5 that $Var(\bar{X}) = \frac{\sigma^2}{n}$.
Since X is normally distributed and each $X_i$ is an independent observation, $\bar{X}$ must also be normally distributed, so we can create the distribution of the sample mean.
▪ If $X_i \sim N(\mu, \sigma^2)$ then $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, where $\frac{\sigma}{\sqrt{n}}$ is the standard error.

Example 7
Ten independent observations from $X \sim N(15, 3^2)$ are taken.
a State the distribution of the sample mean.
b Find P($\bar{X}$ < 14).

a Since X is normally distributed and the observations are independent, $\bar{X} \sim N\left(15, \frac{9}{10}\right)$, i.e. $\bar{X} \sim N(15, 0.9)$
b $P(\bar{X} < 14) = P\left(Z < \frac{14 - 15}{\sqrt{0.9}}\right) = P(Z < -1.054\ldots) \approx 0.146$

1 The lengths of nails produced by a certain machine are normally distributed with mean μ and standard deviation σ. A random sample of 10 nails is taken and their lengths $X_1, X_2, X_3, \dots, X_{10}$ are measured.
i Write down the distributions of the following:
a … b … c … d … e … f …
ii State which of the above are statistics.
2 A large bag of coins contains 1 cent, 5 cent and 10 cent coins in the ratio 2 : 2 : 1.
a Find the mean μ and the variance σ² for the value of coins in this population.
A random sample of two coins is taken and their values $X_1$ and $X_2$ are recorded.
b List all the possible observations from this sample.
c Find the sampling distribution for the mean $\bar{X} = \frac{X_1 + X_2}{2}$.
d Hence show that $E(\bar{X}) = \mu$ and $Var(\bar{X}) = \frac{\sigma^2}{2}$.
3 Find unbiased estimates of the mean and variance of the populations from which the following random samples have been taken.
a 21.3 19.6 18.5 22.3 17.4 16.3 18.9 17.6 18.7 16.5 19.3 21.8 20.1 22.0
b l251641328562431
c 120.4 230.6 356.1 129.8 185.6 147.6 258.3 329.7 249.3
d 0.862 0.754 0.459 0.473 0.493 0.681 0.743 0.469 0.538 0.361
4 Find unbiased estimates of the mean and the variance of the populations from which random samples with the following summaries have been taken.
a n = 120, $\sum x = 4368$, $\sum x^2 = 162\,466$
b n = 30, $\sum x = 270$, $\sum x^2 = 2546$
c n = 1037, $\sum x = 1140.7$, $\sum x^2 = 1278.08$
d n = 15, $\sum x = 168$, $\sum x^2 = 1913$
5 The concentrations, in mg per litre, of an element in 7 randomly chosen samples of water from a spring were:
240.8 237.3 236.7 236.6 234.2 233.9 232.5
a Explain what is meant by an unbiased estimator. (1 mark)
b Determine unbiased estimates of the mean and the variance of the concentration of the element per litre of water from the spring. (4 marks)
6 A sample of size 6 is taken from a population that is normally distributed with mean 10 and standard deviation 2.
a Find the probability that the sample mean is greater than 12. (3 marks)
b State, with a reason, if your answer is an approximation. (1 mark)
7 A machine fills cartons in such a way that the amount of drink in each carton is distributed normally with a mean of 40 cm³ and a standard deviation of 1.5 cm³.
A sample of four cartons is examined.
a Find the probability that the mean amount of drink is more than 40.5 cm³.
A sample of 49 cartons is examined.
b Find the probability that the mean amount of drink is more than 40.5 cm³ on this occasion.
8 Cartons of orange juice are filled by a machine. A sample of 10 cartons selected at random from the production line contained the following quantities of orange juice (in ml):
201.2 205.0 209.1 202.3 204.6 206.4 210.1 201.9 203.7 207.3
Calculate unbiased estimates of the mean and variance of the population from which this sample was taken. (marks)
9 A manufacturer of self-build furniture required bolts of two lengths, 5 cm and 10 cm, in the ratio 2 : 1 respectively.
a Find the mean μ and the variance σ² for the lengths of bolts in this population.
A random sample of three bolts is selected from a large box containing bolts in the required ratio.
b List all the possible observations from this sample.
c Find the sampling distribution for the mean $\bar{X}$.
d Hence find E($\bar{X}$) and Var($\bar{X}$).
e Find the sampling distribution for the mode M.
f Hence find E(M) and Var(M).
g Find the bias when M is used as an estimator of the population mode.
10 A biased six-sided dice has probability p of landing on a six. Every day, for a period of 25 days, the dice is rolled 10 times and the number of sixes X is recorded, giving rise to a sample $X_1, X_2, \dots, X_{25}$.
a Write down E(X) in terms of p.
b Show that the sample mean $\bar{X}$ is a biased estimator of p and find the bias.
c Suggest a suitable unbiased estimator of p.
11 The random variable X ~ U[−a, a].
a Find E(X) and E(X²).
A random sample $X_1$, $X_2$, $X_3$ is taken and the statistic $Y = X_1^2 + X_2^2 + X_3^2$ is calculated.
b Show that Y is an unbiased estimator of a².
12 Jiaqi and Mei Mei each independently took a random sample of students at their school and asked them how much money, in RMB, they earned last week. Jiaqi used his sample of size 20 to obtain unbiased estimates of the mean and variance of the amount earned by a student at their college last week. He obtained values of $\bar{x} = 15.5$ and $s^2 = 8.0$.
Mei Mei's sample of size 30 can be summarised as $\sum y = 486$ and $\sum y^2 = 8222$.
a Use Mei Mei's sample to find unbiased estimates of μ and σ². (2 marks)
b Combine the samples and use all 50 observations to obtain further unbiased estimates of μ and σ². (marks)
c Explain what is meant by standard error. (1 mark)
d Find the standard error of the mean for each of these estimates of μ. (2 marks)
e Comment on which estimate of μ you would prefer to use. (1 mark)
13 A factory worker checks a random sample of 20 bottles from a production line in order to estimate the mean volume of bottles (in cm³) from this production run. The 20 values can be summarised as $\sum x = 1300$ and $\sum x^2 = 84\,685$.
a Use this sample to find unbiased estimates of μ and σ². (marks)
A factory manager knows from experience that the standard deviation of volumes on this process, σ, should be 3 cm³
and he wishes to have an estimate of μ that has a standard error of less than 0.5 cm³.
b Recommend a sample size for the manager, showing working to support your recommendation. (marks)
c Does your recommended sample size guarantee a standard error of less than 0.5 cm³? Give a reason for your answer. (1 mark)
The manager takes a further sample of size 16 and finds $\sum x = 1060$.
d Combine the two samples to obtain a revised estimate of μ. (2 marks)
14 After growing for 10 weeks in a greenhouse, the heights of certain plants have a standard deviation of 2.6 cm. Find the smallest sample that must be taken for the standard error of the mean to be less than 0.5 cm. (marks)
15 The hardness of a new type of material was determined by measuring the depth of the hole made by a heavy pointed device. The following observations in tenths of a millimetre were obtained:
47 52 54 48 45 49 45 51 50 48
a Estimate the mean depth of hole for this material. (1 mark)
b Find the standard error for your estimate. (2 marks)
c Estimate the size of sample required so that in future the standard error of the mean should be just less than 0.05. (marks)
16 To work for a company, applicants need to complete a medical test. The probability of each applicant passing the test is p, independent of any other applicant. The medicals are carried out over two days and on the first day n applicants are seen, and on the next day 2n are seen. Let $X_1$ represent the number of applicants who pass the test on the first day and let $X_2$ represent the number who pass on the second day.
a Write down E($X_1$), E($X_2$), Var($X_1$) and Var($X_2$).
b Show that $\frac{X_1}{n}$ and $\frac{X_2}{2n}$ are both unbiased estimates of p and state, giving a reason, which you would prefer to use.
c Show that X = … is an unbiased estimator of p.
d Show that $Y = \frac{X_1 + X_2}{3n}$ is an unbiased estimator of p.
e Which of the statistics $\frac{X_1}{n}$, $\frac{X_2}{2n}$, X or Y is the best estimator of p?
The statistic $T = \frac{2X_1 + X_2}{3n}$ is proposed as an estimator of p.
f Find the bias.
17 In a bag that contains a large number of counters, the number 0 is written on 40% of the counters, the number 1 is written on 20% of the counters, and the number 2 is written on the remaining 40% of the counters.
a Find the mean μ and the variance σ² for this population of counters.
A random sample of size 3 is taken from the bag.
b List all the possible observations from this sample.
c Find the sampling distribution for the mean $\bar{X}$.
d Find E($\bar{X}$) and Var($\bar{X}$).
e Find the sampling distribution for the median N.
f Hence, find E(N) and Var(N).
g Show that N is an unbiased estimator of μ.
h Explain which estimator, $\bar{X}$ or N, you would choose as an estimator of μ.

Challenge
a Show that …
b Hence, or otherwise, show that S² is an unbiased estimator for the population variance σ².

Confidence intervals
The value of $\hat{\theta}$, which is an estimate of θ, is found from a sample. It is used as an unbiased estimate for the population parameter θ and is very unlikely to be exactly equal to θ. There is no way of establishing, from the sample data only, how close the estimate is. Instead, you can form a confidence interval for θ.
▪ A confidence interval (C.I.) for a population parameter θ is a range of values defined so that there is a specific probability that the true value of the parameter lies within that range.
For example, you could establish a 90% confidence interval, or a 95% confidence interval. A 95% confidence interval is an interval such that there is a 0.95 probability that the interval contains θ.
Different samples will generate different confidence intervals since estimates for the parameter will change based on the data in the sample and the sample size.
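The idea that a 95% confidence interval contains the parameter with probability 0.95 can be illustrated by simulation. The sketch below assumes a normal population with known standard deviation σ, for which the interval for the mean takes the standard form x̄ ± zσ/√n. The numbers (a sample mean of 20, σ = 3, n = 36, and the simulated population with μ = 50, σ = 4) are hypothetical and the function names are illustrative.

```python
import random
from statistics import NormalDist, fmean

def confidence_interval(xbar, sigma, n, level=0.95):
    """Interval xbar +/- z * sigma / sqrt(n) for a normal population with known sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    half_width = z * sigma / n ** 0.5
    return xbar - half_width, xbar + half_width

def coverage(mu, sigma, n, level=0.95, trials=2000, seed=1):
    """Proportion of simulated samples whose interval contains the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = confidence_interval(fmean(sample), sigma, n, level)
        hits += low < mu < high
    return hits / trials

low, high = confidence_interval(20, 3, 36)
print(round(low, 2), round(high, 2))    # 19.02 20.98
print(coverage(mu=50, sigma=4, n=25))   # a proportion close to 0.95
```

Each simulated sample produces a different interval, and it is the intervals that vary from sample to sample while μ stays fixed. The proportion that capture μ settles near the chosen confidence level as the number of trials grows.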
Hence, if you know the population standard deviation, you can establish a confidence interval for the population mean μ using the standardised normal distribution.

Example 8
Given that X is normally distributed, show that a 95% confidence interval for μ, based on a sample of size n, is given by:
$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$

$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ and therefore $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1^2)$

Hint: You will need to use the standardised normal distribution N(0, 1²) to tackle problems like this. If $X \sim N(\mu, \sigma^2)$ then $Z = \frac{X - \mu}{\sigma} \sim N(0, 1^2)$. ← Statistics 1 Section 7.4

Using tables, you can see that for the N(0, 1²) distribution:
P(Z > 1.9600) = P(Z < −1.9600) = 0.025
and so 95% of the distribution is between −1.9600 and 1.9600.
So P(−1.96 < Z < 1.96) = 0.95
$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < 1.96\right) = 0.95$
Look at the inequality inside the probability statement:
$-1.96\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96\frac{\sigma}{\sqrt{n}}$
$\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$
So a 95% confidence interval for μ is $\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$.

…
n > 73.719, so you need n = 74
c A width of 1.5 … z = 1.5
From the table on page 135 you find that P(Z < 1.5) = 0.9332 and so P(Z > 1.5) = P(Z < −1.5) = 1 − 0.9332 = 0.0668

1 A random sample of size 9 is taken from a normal distribution with variance 36. The sample mean is 128.
a Find a 95% confidence interval for the mean μ of the distribution.
b Find a 99% confidence interval for the mean μ of the distribution.
2 A random sample of size 25 is taken from a normal distribution with standard deviation 4. The sample mean is 85.
a Find a 90% confidence interval for the mean μ of the distribution.
b Find a 95% confidence interval for the mean μ of the distribution.
3 A 95% confidence interval is given by (25.61, 27.19). Calculate a 99% confidence interval.
4 A normal distribution has standard deviation 15. Estimate the sample size required if the following confidence intervals for the mean should have width of less than 2.
a 90%
b 95%
c 99%
5 A railway company is studying the number of seconds that express trains are late to arrive. Previous surveys have shown that the times are normally distributed and that the standard deviation is 50.
A random sample of 200 trains was selected and gave rise to a mean of 310 seconds late.
a Find a 90% confidence interval for μ, the mean number of seconds that express trains are late. (marks)
Five different independent random samples of 200 trains are selected, and each sample is used to generate a different 90% confidence interval for μ.
b Find the probability that exactly three of these confidence intervals contain μ. (marks)
6 Amy is investigating the total distance travelled by vans in current use. The standard deviation can be assumed to be 15 000 km. In a random sample, 50 vans were stopped and their mean distance travelled was found to be 75 872 km. Amy suspects that the population is not normally distributed, but claims that she can still use the normal distribution to find a confidence interval for μ. Find a 90% confidence interval for the mean distance travelled by vans in current use. (marks)
7 It is known that each year the standard deviation of the marks in a certain examination is 13.5 but the mean mark μ will fluctuate. An examiner wants to estimate the mean mark of all the candidates on the examination but he only has the marks of a sample of 250 candidates, which gives a sample mean of 68.4.
a What assumption about the candidates must the examiner make in order to use this sample mean to calculate a confidence interval for μ? (1 mark)
b Assuming that the above assumption is justified, calculate a 95% confidence interval for μ. (3 marks)
Later, the examiner discovers that the actual value of μ was 65.3.
c What conclusions might the examiner draw about his sample? (2 marks)
8 A student calculated 95% and 99% confidence intervals for the mean of a certain population but failed to label them. The two intervals were (22.7, 27.3) and (23.2, 26.8).
a State, with a reason, which interval is the 95% one.
(1 mark)
b Estimate the standard error of the mean in this case. (marks)
c What was the student's unbiased estimate of the mean μ in this case? (2 marks)
9 The director of a company has asked for a survey to estimate the mean expenditure of customers on electrical appliances. In a random sample, 100 people were questioned and the research team presented the director with a 95% confidence interval of ($128.14, $141.86). The director says that this interval is too wide and wants a confidence interval of total width $10.
a Using the same value of $\bar{x}$, find the confidence limits in this case. (3 marks)
b Find the level of confidence for the interval in part a. (2 marks)
The managing director is still not happy and now wishes to know how large a sample would be required to obtain a 95% confidence interval of total width no greater than $10.
c Find the smallest size of sample that will satisfy this request. (3 marks)
10 A factory produces steel sheets whose masses are known to be normally distributed with a standard deviation of 2.4 kg. A random sample of 36 sheets had a mean mass of 31.4 kg.
Find 99% confidence limits for the population mean. (marks)
11 A machine is set up to pour liquid into cartons in such a way that the amount of liquid poured on each occasion is normally distributed with a standard deviation of 20 ml. Find 99% confidence limits for the mean amount of liquid poured if a random sample of 40 cartons had an average content of 266 ml. (3 marks)
12 a The error made when a certain instrument is used to measure the body length of a butterfly of a particular species is known to be normally distributed with mean 0 and standard deviation 1 mm. Calculate, to 3 decimal places, the probability that the size of the error made when the instrument is used once is less than 0.4 mm. (marks)
b Given that the body length of a butterfly is measured 9 times with the instrument, calculate, to 3 decimal places, the probability that the mean of the 9 readings will be within 0.5 mm of the true length. (marks)
c Given that the mean of the 9 readings was 22.53 mm, determine a 98% confidence interval for the true body length of the butterfly. (3 marks)

1 The masses of bags of lentils, X kg, have a normal distribution with unknown mean μ kg and a known standard deviation σ kg. A random sample of 80 bags of lentils gave a 90% confidence interval for μ of (0.4533, 0.5227).
a Without carrying out any further calculations, use this confidence interval to test whether μ = 0.48. State your hypotheses clearly and write down the significance level you have used. (3 marks)
A second random sample of 120 of these bags of lentils had a mean mass of 0.482 kg.
b Calculate a 95% confidence interval for μ based on this second sample. (marks)
2 The lengths of the tails of mice in a pet shop are assumed to have unknown mean μ and unknown standard deviation σ. A random sample of 20 mice is taken and the length of their tails recorded. The sample is represented by $X_1, X_2, \dots, X_{20}$.
a State whether or not the following are statistics. Give reasons for your answers.
i … ii … (4 marks)
b Find the mean and variance of … (marks)
3 The breaking stresses of elastic bands are normally distributed. A company uses bands with a mean breaking stress of 46.50 N. A new supplier claims that they can supply bands that are stronger, and provides a sample of 100 bands for the company to test. The company checked the breaking stress X for each of these 100 bands and the results are summarised as follows:
n = 100, $\sum x = 4715$, $\sum x^2 = 222\,910$
a Find an approximate 95% confidence interval for the mean breaking stress of these new rubber bands. (3 marks)
b Do you agree with the new supplier, that they can supply bands that are stronger? (2 marks)
4 On each of 100 days, a scientist took a sample of 1 litre of water from a particular place along a river, and measured the amount, Y mg, of chlorine in the sample. The results she obtained are shown in the table.
Y | 1 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Number of days | 4 | 8 | 20 | 22 | 16 | 13 | 10 | 6
a Estimate the mean amount of chlorine present per litre of water, and estimate, to 3 decimal places, the standard error of this estimate. (3 marks)
b Obtain approximate 98% confidence limits for the mean amount of chlorine present per litre of water. (3 marks)
Given that measurements at the same point under the same conditions are taken for a further 100 days,
c estimate, to 3 decimal places, the probability that the mean of these measurements will be greater than 4.6 mg per litre of water. (3 marks)
5 The amount, to the nearest mg, of a certain chemical in particles in the air at a weather station was measured each day for 300 days. The results are shown in the table.
Amount of chemical (mg) | 12 | 13 | 14 | 15 | 16
Number of days | 5 | 42 | 210 | 31 | 12
Estimate the mean amount of this chemical in the air, and find, to 2 decimal places, the standard error of this estimate.
(3 marks)

6 Occasionally, a firm manufacturing furniture needs to check the mean distance between pairs of holes drilled by a machine in pieces of wood to ensure that no change has occurred. It is known from experience that the standard deviation of the distance is 0.43 mm. The firm intends to take a random sample of size $n$, and to calculate a 99% confidence interval for the mean of the population. The width of this interval must be no more than 0.60 mm. Calculate the minimum value of $n$. (4 marks)

7 The times taken by five-year-old children to complete a certain task are normally distributed with a standard deviation of 8.0 s. In a random sample, 25 five-year-old children from school A were given this task and their mean time was 44.2 s.
a Find 95% confidence limits for the mean time taken by five-year-old children from school A to complete this task. (3 marks)
The mean time for a random sample of 20 five-year-old children from school B was 40.9 s. The headteacher of school B concluded that the overall mean for school B must be less than that of school A. Given that the two samples were independent,
b test the headteacher's conclusion using a 5% significance level. State your hypotheses clearly. (6 marks)

8 The random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$.
a Write down the distribution of the sample mean $\bar{X}$ of a random sample of size $n$. (1 mark)
b State, with a reason, whether this distribution is exact or is an estimate. (1 mark)
An efficiency expert wishes to determine the mean time taken to drill a fixed number of holes in a metal sheet.
c Determine how large a random sample is needed so that the expert can be 95% certain that the sample mean time will differ from the true mean time by less than 15 seconds. Assume that it is known from previous studies that $\sigma = 40$ seconds. (4 marks)

9 A man regularly uses a train service which should arrive in Zurich at 09:31.
He decided to test this stated arrival time. Each weekday for a period of 4 weeks, he recorded the number of minutes $X$ that the train was late on arrival in Zurich. If the train arrived early then the value of $X$ was negative. His results are summarised as follows:
$n = 20 \qquad \sum x = 18.0 \qquad \sum x^2 = 103.21$
a Calculate unbiased estimates of the mean and variance of the number of minutes late of this train service. (5 marks)
The random variable $X$ represents the number of minutes that the train is late on arriving in Zurich. Records kept by the railway company show that over fairly short periods, the standard deviation of $X$ is 2.5 minutes. The man made two assumptions about the distribution of $X$ and the values obtained in the sample and went on to calculate a 95% confidence interval for the mean arrival time of this train service.
b State the two assumptions. (2 marks)
c Find the confidence interval. (3 marks)
d Given that the assumptions are reasonable, comment on the stated arrival time of the service. (1 mark)

10 The random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$.
a Write down the distribution of the sample mean $\bar{X}$ of a random sample of size $n$. (1 mark)
b Explain what you understand by a 95% confidence interval. (marks)
A garage sells both leaded and unleaded fuel. The distribution of the values of sales for each type is normal. During 2010, the standard deviation of individual sales of each type of fuel was £3.25. The mean of the individual sales of leaded fuel during this time was £8.72. A random sample of 100 individual sales of unleaded fuel gave a mean of £9.71. Calculate:
c an interval within which 90% of the sales of leaded fuel will lie (3 marks)
d a 95% confidence interval for the mean sales of unleaded fuel. (3 marks)
The mean of the sales of unleaded fuel for 2009 was £9.10.
e Using a 5% significance level, investigate whether there is sufficient evidence to conclude that the mean of all the 2010 unleaded sales was greater than the mean of the 2009 sales. (3 marks)
f Find the size of the sample that should be taken so that the garage owner can be 95% certain that the sample mean of sales of unleaded fuel during 2010 will differ from the true mean by less than £0.50. (4 marks)

11 a Explain what is meant by a 98% confidence interval for a population mean. (marks)
The lengths, in cm, of the leaves of oak trees are known to be normally distributed with variance 1.33 cm². A sample of 40 oak tree leaves is found to have a mean of 10.20 cm.
b Estimate, giving your answer to 3 decimal places, the standard error of the mean. (2 marks)
c Use this value to estimate 95% confidence limits for the mean length of the population of oak tree leaves, giving your answer to 2 decimal places. (3 marks)
d Find the minimum size of the sample of leaves which must be taken if the width of the 98% confidence interval for the population mean is at most 1.50 cm. (4 marks)

12 a Write down the mean and the variance of the distribution of the means of all possible samples of size $n$ taken from an infinite population having mean $\mu$ and variance $\sigma^2$. (2 marks)
b Describe the form of this distribution of sample means when:
i $n$ is large
ii the distribution of the population is normal. (2 marks)
The standard deviation of all the till receipts of a supermarket during 2014 was £4.25.
c Given that the mean of a random sample of 100 of the till receipts is £18.50, obtain an approximate 95% confidence interval for the mean of all the till receipts during 2014. (3 marks)
d Find the size of sample that should be taken so that the management can be 95% confident that the sample mean will not differ from the true mean by more than £0.50. (3 marks)
e The mean of all the till receipts of the supermarket during 2013 was £19.40.
Using a 5% significance level, investigate whether the sample in part c provides sufficient evidence to conclude that the mean of all the 2014 till receipts is different from that in 2013. (6 marks)

13 Records of the diameters of spherical metal balls produced on a certain machine show that the diameters are normally distributed with mean 0.824 cm and standard deviation 0.046 cm. Two hundred samples are randomly chosen, each consisting of 100 metal balls.
a Calculate the expected number of the 200 samples having a mean diameter less than 0.823 cm. (2 marks)
On a certain day, it was believed that the machine was faulty. It may be assumed that if the machine is faulty, it will change the mean of the diameters without changing their standard deviation. On that day, a random sample of 100 metal balls had mean diameter 0.834 cm.
b Determine a 98% confidence interval for the mean diameter of the metal balls being produced that day. (3 marks)
c Hence state whether or not you would conclude that the machine is faulty on that day given that the significance level is 2%. (3 marks)

14 A doctor claims that there is a higher mean heart rate in people who always drive to work compared to people who regularly walk to work. She measures the heart rates $X$ of 30 people who always drive to work and 36 people who regularly walk to work. Her results are summarised in the table below.
Drive to work | 30 | 52 | 60.2
Walk to work | 36 | 47 | 55.8
a Test, at the 5% level of significance, the doctor's claim. State your hypotheses clearly. (6 marks)
b State any assumptions you have made in testing the doctor's claim. (marks)
The doctor decides to add another person who drives to work to her data. She measures the person's heart rate and finds $X = 55$.
c Find an unbiased estimate of the variance for the sample of 31 people who drive to work. Give your answer to 3 significant figures. (4 marks)

1
If $X$ is a random variable, then a random sample of size $n$ will consist of $n$ observations of the random variable $X$, which are referred to as $X_1, X_2, X_3, \ldots, X_n$, where the $X_i$:
• are independent random variables
• each have the same distribution as $X$.
A statistic $T$ is defined as a random variable consisting of any function of the $X_i$ that involves no other quantities, such as unknown population parameters.
2 The sampling distribution of a statistic $T$ is the probability distribution of $T$.
3 A statistic that is used to estimate a population parameter is called an estimator and the particular value of the estimator generated from the sample taken is called an estimate.
4 If a statistic $T$ is used as an estimator for a population parameter $\theta$ and $E(T) = \theta$, then $T$ is an unbiased estimator for $\theta$.
5 If a statistic $T$ is used as an estimator for a population parameter $\theta$, then the bias is $E(T) - \theta$. For an unbiased estimator, the bias is 0.
6 An unbiased estimator for $\sigma^2$ is given by the sample variance $S^2$ where:
$S^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right)$
7 The standard deviation of an estimator is called the standard error of the estimator.
8 When using the sample mean $\bar{X}$, you can use the following result for the standard error:
Standard error of $\bar{X} = \frac{\sigma}{\sqrt{n}}$ or $\frac{s}{\sqrt{n}}$
9 A confidence interval for a population parameter $\theta$ is a range of values defined so that there is a specific probability that the true value of the parameter lies within that range.
10 A 95% confidence interval for the population mean $\mu$ is
$\left(\bar{x} - 1.96 \times \frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96 \times \frac{\sigma}{\sqrt{n}}\right)$
11 The width of a confidence interval is the difference between the upper confidence limit and the lower confidence limit. This is $2 \times z \times \frac{\sigma}{\sqrt{n}}$, where $z$ is the relevant percentage point from the standard normal distribution, for example 1.96, 1.6449, etc.

Review exercise

1 A researcher is hired by a cleaning company to survey the opinions of employees on a proposed pension scheme.
The company employs 55 managers and 495 cleaners.
a Explain what is meant by a census and give one disadvantage of using it in this context.
To collect data, the researcher decides to give a questionnaire to the first 50 cleaners to leave at the end of the day.
b State the sampling method used by the researcher.
c Give two reasons why this method is likely to produce biased results. (2)
d Explain briefly how the researcher could select a sample of 50 employees using:
i a systematic sample
ii a stratified sample.
← Statistics 3 Sections 1.1, 1.2, 1.3

2 Describe one advantage and one disadvantage of:
a quota sampling
b simple random sampling
← Statistics 3 Sections 1.1, 1.4

3 Mrs Hilyard wants to select a sample of 50 of her students to fill in a questionnaire. The school has a record of all 500 students, listed alphabetically and numbered 1 to 500. Mrs Hilyard uses the same random number table that appears on page 144 of this textbook. Starting with the top-left hand corner and working across, Mrs Hilyard chooses three random numbers. The first two suitable numbers are 384 and 100.
a What are the next two suitable numbers?
Mrs Hilyard decides to take a systematic sample instead, using the same list.
b Explain why a systematic sample may not give a sample that represents the proportion of boys and girls in the school.
c Which sampling technique should Mrs Hilyard use?
← Statistics 3 Sections 1.1, 1.2, 1.3

4 A hotel has 320 rooms, of which 180 are classified as standard, 100 are classified as premier, and 40 are classified as executive. The manager wants to obtain information about room usage in the hotel by taking a 10% sample of the rooms. Explain how the manager should obtain a stratified sample.
← Statistics 3 Section 1.2

5 At an amusement park, the duration $R$ seconds of a ride on the rollercoaster has the normal distribution $N(82, 3^2)$. The duration $F$ of a ride on the Ferris Wheel has the normal distribution $N(238, 7^2)$.
Alice rides on the rollercoaster and the Ferris Wheel.
a Find the probability that her ride on the Ferris Wheel is less than three times as long as her ride on the rollercoaster.
b State one assumption you have made and comment on its validity.
Paul rides on the rollercoaster three times in a row. The random variable $D$ represents the total duration of the three rides.
c Find the distribution of $D$.
Given that Alice starts a ride on the Ferris Wheel at the same time as Paul starts his three rides on the rollercoaster,
d find the probability that Alice and Paul's rides finish within 10 seconds of one another.
← Statistics 3 Section 2.1

6 A workshop makes two types of electrical resistor. The resistance, $X$ ohms, of resistors of Type A is such that $X \sim N(20, 4)$. The resistance, $Y$ ohms, of resistors of Type B is such that $Y \sim N(10, 0.84)$. When a resistor of each type is connected into a circuit, the resistance $R$ ohms of the circuit is given by $R = X + Y$, where $X$ and $Y$ are independent. Find:
a $E(R)$
b $Var(R)$
c $P(28.90 < R < 32.64)$
← Statistics 3 Section 2.1

7 A simple random sample X,, X, X; is taken from a normal distribution with mean $\mu$ and standard deviation $\sigma$. Given that $\bar{X} = \ldots$ and $P(\ldots > \bar{X} + k\sigma) = 0.2$, where $k$ is a constant, find the value of $k$, giving your answer correct to 3 s.f.
← Statistics 3 Section 2.1

8 In a bag, there are five coins worth 1 RMB, three coins worth 0.5 RMB and two coins worth 0.1 RMB. Two coins are taken from the bag without replacement and the mean value calculated. Write down the sample distribution for the mean value.
← Statistics 3 Section 3.1

9 The random variable $C$ is defined as $C = 2A + 5B$ where $A \sim N(15, 1.5^2)$ and $B \sim N(\mu, 2^2)$ and $A$ and $B$ are independent. Given that $P(C < 83.5) = 0.9$, find the value of $\mu$, giving your answer to 2 decimal places.
← Statistics 3 Section 2.1

10 The random variables $A_1$, $A_2$, $A_3$ and $A_4$ each have the same distribution as $A$, where $A \sim N(24, 4^2)$. The random variable $X$ has distribution $X \sim N(20, 3^2)$. The random variable $B$ is defined as KD A where $X$, $A_1$, $A_2$, $A_3$ and $A_4$ are independent. Find $P(B < 170 \mid B > 156)$.
← Statistics 3 Section 2.1

11 The random variable $X$ has a continuous uniform distribution over the interval [a 3, 5a-9], where $\alpha$ is a constant. The mean of a random sample of size $n$ taken from this distribution is $\bar{X}$.
a Show that $\bar{X}$ is a biased estimator for $\alpha$ and calculate the bias of $\bar{X}$ when used as an estimator for $\alpha$.
b Given that $Y = k\bar{X} + 4$ is an unbiased estimator, find the value of $k$.
A random sample of 10 values of $X$ is taken and the results are as follows:
17 425 32.2 423 46 45 46.3 30.7 117 499
c Use the sample to estimate the maximum value that $X$ can take.
← Statistics 3 Sections 3.1, 3.2

12 The weights of adult men are normally distributed with a mean of 84 kg and a standard deviation of 11 kg.
a Find the probability that the total weight of 4 randomly chosen adult men is less than 350 kg.
The weights of adult women are normally distributed with a mean of 62 kg and a standard deviation of 10 kg.
b Find the probability that the weight of a randomly chosen adult man is less than one and a half times the weight of a randomly chosen adult woman. (4)
← Statistics 3 Section 3.1

13 The random variable $D$ is defined as $D = A - 3B + 4C$ where $A \sim N(5, 2^2)$, $B \sim N(7, 3^2)$ and C~NO,4), and $A$, $B$ and $C$ are independent.
a Find $P(D < 44)$.
The random variables $B_1$, $B_2$ and $B_3$ are independent and each has the same distribution as $B$. The random variable X is defined as =4-SB,44AC
b Find $P(X > 0)$.
← Statistics 3 Section 2.1

14 A manufacturer produces two flavours of soft drink: cola and lemonade. The weights, $C$ and $L$, in grams, of randomly selected cola and lemonade cans are such that $C \sim N(350, 8)$ and $L \sim N(345, 17)$.
a Find the probability that the weights of two randomly selected cans of cola will differ by more than 6 g.
One can of each flavour is selected at random.
b Find the probability that the can of cola weighs more than the can of lemonade.
Cans are delivered to shops in boxes of 24 cans. The weights of empty boxes are normally distributed with mean 100 g and standard deviation 2 g.
c Find the probability that a full box of cola cans weighs between 8.51 kg and 8.52 kg.
d State an assumption you made in your calculation in part c.
← Statistics 3 Section 3.1

15 In a trial of diet A, a random sample of 80 participants was taken to record their weight loss, $x$ kg, after their first week of using the diet. The results are summarised as follows:
$\sum x = 361.6 \qquad \sum x^2 = 1753.95$
a Find unbiased estimates for the mean and variance of weight lost after the first week of using diet A.
The designers of diet A believe it can achieve a greater mean weight loss after the first week than an existing diet B. A random sample of 60 people used diet B. After the first week they had achieved a mean weight loss of 4.06 kg, with an unbiased estimate of variance of weight loss of 2.50 kg².
b Test, at the 5% level of significance, whether or not the mean weight loss after the first week using diet A is greater than that using diet B. State your hypotheses clearly.
c Explain the significance of the central limit theorem to the test in part b. (1)
d State an assumption you have made in carrying out the test in part b. (1)
← Statistics 3 Sections 3.1, 3.4

16 A random sample of the daily sales (in Rand) of a small company is taken and, using tables of the normal distribution, a 99% confidence interval for the mean daily sales is found to be (123.5, 154.7). Find a 95% confidence interval for the mean daily sales of the company. (6)
← Statistics 3 Section 3.2

17 A machine produces metal containers. The masses of the containers are normally distributed.
A random sample of 10 containers was taken and the mass of each container was recorded to the nearest 0.1 kg. The results were as follows:
49.7 50.3 51.0 49.5 49.9 50.1 50.2 50.0 49.6 49.7
a Find unbiased estimates of the mean and variance of the masses of the population of metal containers.
The machine is set to produce metal containers whose masses have a population standard deviation of 0.5 kg.
b For the population mean, find:
i a 95% confidence interval
ii a 99% confidence interval
← Statistics 3 Sections 3.1, 3.2

18 The drying times of paint can be assumed to be normally distributed. A paint manufacturer paints 10 test areas with a new paint. The following drying times, to the nearest minute, were recorded:
82 98 140 110 90 125 150 130 70 110
a Calculate unbiased estimates for the mean and the variance of the population of drying times of this paint.
Given that the population standard deviation is 25,
b find a 95% confidence interval for the mean drying time of this paint. (5)
Fifteen similar sets of tests are done and the 95% confidence interval is determined for each set.
c Find the probability that all 15 of these confidence intervals contain the population mean.
← Statistics 3 Sections 3.1, 3.2

19 Some biologists were studying a large group of birds. A random sample of 36 were measured and the wing length, $x$ mm, of each bird was recorded. The results are summarised as follows:
$\sum x = 6046 \qquad \sum x^2 = 1\,016\,338$
a Calculate unbiased estimates for the mean and the variance of the wing lengths of these birds.
Given that the wing lengths are assumed to be normally distributed and that the standard deviation of the wing lengths of this particular type of bird is actually 5.1 mm,
b find a 99% confidence interval for the mean wing length of the birds from this group.
← Statistics 3 Sections 3.1, 3.2

20 A computer company repairs large numbers of PCs and wants to estimate the mean time taken to repair a particular fault. Five repairs are chosen at random
from the company's records and the times taken, in seconds, are as follows:
205 310 405 195 320
a Calculate unbiased estimates of the mean and the variance of the population of repair times from which this sample has been taken.
It is known from previous results that the standard deviation of the repair time for this fault is 100 seconds and that the repair time is normally distributed. The company manager wants to ensure that there is a probability of at least 0.95 that the estimate of the population mean lies within 20 seconds of its true value.
b Find the minimum sample size required.
← Statistics 3 Sections 3.1, 3.2

21 A company makes individual slices of cheesecake. The weight of each slice is normally distributed with mean 135 g and standard deviation 3 g. It is possible to buy a box of 12 individual slices of cheesecake. The box has a weight which is normally distributed with mean 100 g and standard deviation 6 g.
a Find the probability that the weight of the 12 slices and the box is greater than 1.7 kg.
b What assumptions about the weights of the slices of cheesecake are you making?
← Statistics 3 Sections 3.1, 3.2

22 Maike found a 95% confidence interval to be (14.6904, 15.7096). Unfortunately, he had lost his original information, but he did remember that the standard deviation was 1.3. Calculate the sample size that Maike used to create this confidence interval.
← Statistics 3 Section 3.2

23 A $c$% confidence interval was calculated using 36 observations from a data set which is normally distributed. The value of $\sigma$ was 3.6. The confidence interval calculated was (12.9636, 16.2364).
a Find the value of $\bar{x}$.
b Find the value of $c$.
← Statistics 3 Section 3.2

24 A random sample of three independent variables $X_1$, $X_2$ and $X_3$ is taken from a distribution with mean $\mu$ and variance $\sigma^2$.
a Show that 2X; - 354 is an unbiased estimator for $\mu$.
An unbiased estimator for $\mu$ is given by $\hat{\mu} = aX_1 + bX_2$ where $a$ and $b$ are constants.
b Show that $Var(\hat{\mu}) = (2a^2 - 2a + 1)\sigma^2$.
c Hence determine the value of $a$ and the value of $b$ for which $\hat{\mu}$ has minimum variance.
← Statistics 3 Section 3.1

● Understand and apply the central limit theorem to approximate the sample mean of a random variable $X$
● Apply the central limit theorem to other distributions
● Apply the central limit theorem in finding confidence intervals
● … on the mean
● … means
● Understand … large sample

1 A random variable $X \sim N(120, 8^2)$. Find:
a $P(X > 115)$
b $P(120 < X < 130)$
c $a$ such that $P(X < a)$ …
2 A fair six-sided dice is rol… uppermost fa…
a $E(Y)$
b $Var(Y)$
c P…
← Statistics 3 Sections 6.5, 6.6
3 Robin flips a fair … the probability that the coin is flipped at least 12 times. ← Statistics 2 Sect…

… central limit theorem … about the distribution of t… population is unk… sticians use it to infer how likely the views of a sample are to be representative of the population. → Chapter review 4 Q10

The central limit theorem

If you take a random sample of $n$ observations from a normally distributed random variable $X \sim N(\mu, \sigma^2)$ then the sample mean $\bar{X}$ is also normally distributed with $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.

In fact, this result is a special case of a more powerful result called the central limit theorem. This states that the mean of a large random sample taken from any random variable is always approximately normally distributed. This result holds regardless of the distribution of the original random variable.

■ The central limit theorem says that if $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from a population with mean $\mu$ and variance $\sigma^2$, then $\bar{X}$ is approximately $\sim N\left(\mu, \frac{\sigma^2}{n}\right)$.

In general, the sample mean is only approximately distributed with $N\left(\mu, \frac{\sigma^2}{n}\right)$. As $n$ gets large, this approximation gets better. The variance of the sample mean also decreases as $n$ gets large.
You can say that for a large sample, the sample mean will be very close to the population mean.

A six-sided dice is changed so that there are three faces marked 1, two faces marked 3 and one face marked 6. The dice is rolled 40 times and the mean of the 40 scores is recorded.
a Find an approximate distribution for the mean of the scores.
b Use your approximation to estimate the probability that the mean is greater than 3.

a Let the random variable $X$ represent the score on a single roll. Then the distribution of $X$ is:
$x$ | 1 | 3 | 6
$P(X = x)$ | $\frac{1}{2}$ | $\frac{1}{3}$ | $\frac{1}{6}$
Find the mean and variance of the discrete distribution. ← Statistics 1 Sections 6.3, 6.6
So:
$\mu = E(X) = \sum x P(X = x) = 1 \times \frac{1}{2} + 3 \times \frac{1}{3} + 6 \times \frac{1}{6} = 2.5$
and
$\sigma^2 = Var(X) = \sum x^2 P(X = x) - \mu^2 = 1 \times \frac{1}{2} + 9 \times \frac{1}{3} + 36 \times \frac{1}{6} - 2.5^2 = 3.25$ or $\frac{13}{4}$
Now by the central limit theorem:
$\bar{X} \approx \sim N\left(2.5, \frac{13}{160}\right)$
b $P(\bar{X} > 3) \approx P\left(Z > \frac{3 - 2.5}{\sqrt{13/160}}\right) = P(Z > 1.75\ldots) = 1 - P(Z < 1.75) = 1 - 0.9599 = 0.0401$

You do not need to apply a continuity correction when using the central limit theorem. This is because the underlying distribution is the mean of the sample. Although this is a discrete random variable, it does not have to take integer values. It takes fractional values, and the gaps between values get smaller and smaller as $n$ gets larger.

1 The lengths of bolts produced by a machine have an unknown distribution with mean 3.03 cm and standard deviation 0.20 cm. A sample of 100 bolts is taken.
a Estimate the probability that the mean length of this sample is less than 3 cm. (3 marks)
A second sample is taken. The probability that the mean of this sample is less than 3 cm needs to be less than 1%.
b Find the minimum sample size required. (5 marks)

2 A random variable $X$ has the discrete uniform distribution $P(X = x) = \frac{1}{5}$, $x = 1, 2, 3, 4, 5$. 40 observations are taken from $X$, and their mean $\bar{X}$ is recorded. Find an estimate for $P(\bar{X} > 3.2)$. (6 marks)

3 A fair dice is rolled 35 times.
a Find the approximate probability that the mean of the 35 scores is more than 4.
b Find the approximate probability that the total of the 35 scores is less than 100.
4 The 25 children in a class each roll a fair dice 30 times and record the number of sixes they obtain. Find an estimate of the probability that the mean number of sixes recorded for the class is less than 4.5.

5 The random variable $X$ has the probability distribution shown in the table.
$x$ | 0 | 2 | 3 | 3
$P(X = x)$ | 0.1 | $k$ | $k$ | 0.3
a Find the value of $k$. (2 marks)
A random sample of 100 observations of $X$ is taken.
b Use the central limit theorem to estimate the probability that the mean of these observations is greater than 3. (6 marks)
c Comment on the accuracy of your estimate. (1 mark)

6 A fair dice is rolled $n$ times. Given that there is less than a 1% chance that the mean of all the scores differs from 3.5 by more than 0.1, find the minimum sample size.

7 The annual part-time salaries of employees at a large company have an unknown distribution with mean AUD$28,500 and standard deviation AUD$6800. A random sample of 5 members of the senior management team is taken. A researcher suggests that $N\left(28\,500, \frac{6800^2}{5}\right)$ could be used to model the distribution of the sample mean.
a Give a reason why this is unlikely to be a good model. (1 mark)
A second random sample of 15 employees from the whole company is taken.
b Estimate the probability that the mean annual salary of these employees is:
i less than AUD$25,000
ii between AUD$25,000 and AUD$30,000. (4 marks)
c Comment on the accuracy of your estimate. (1 mark)

8 An electrical company repairs very large numbers of television sets and wishes to estimate the mean time taken to repair a particular fault. It is known from previous research that the standard deviation of the time taken to repair this particular fault is 2.5 minutes. The manager wishes to ensure that the probability that the estimate differs from the true mean by less than 30 seconds is 0.95. Find how large a sample is required.
(3 marks)

Applying the central limit theorem to other distributions

You can use the central limit theorem to solve problems involving other distributions such as the binomial, the Poisson and the uniform distributions.

A supermarket manager is trying to model the number of customers who visit her store each day. She observes that, on average, 20 new customers enter the store every minute.
a Calculate the probability that fewer than 5 customers arrive in a given 15-second interval.
b Use the central limit theorem to estimate the probability that in one hour no more than 1150 customers arrive.

a Let $X$ represent the number of customers who arrive in one minute. Then $X \sim Po(20)$.
Let $Y$ represent the number of customers who arrive in a 15-second interval. Then $Y \sim Po(5)$.
$P(Y < 5) = P(Y \le 4) = 0.4405$, from tables
b Consider a sample of 60 observations taken from $X$. By the central limit theorem, $\bar{X}$ is approximately $\sim N\left(20, \frac{20}{60}\right)$ or $N\left(20, \frac{1}{3}\right)$.
Let $T$ be the total number of customers in the hour. If $T \le 1150$ then $\bar{X} \le \frac{1150}{60} = 19.166\ldots$
Problem-solving: if $\sum X_i$ is the sum of the observations from a sample of size $n$, then the sample mean is given by $\bar{X} = \frac{1}{n}\sum X_i$.
Standardising:
$P(\bar{X} \le 19.166\ldots) = P(Z < -1.44\ldots) = 1 - P(Z < 1.44\ldots) = 1 - 0.9251 = 0.0749$

1 A random sample of 10 observations is taken from a Poisson distribution with mean 3.
a Find the exact probability that the sample mean does not exceed 2.5.
b Estimate the probability that the sample mean does not exceed 2.5 using the central limit theorem, and compare your answer to part a.

2 A sample of size 20 is taken from a binomial distribution with $n = 10$ and $p = 0.2$. Estimate the probability that the sample mean does not exceed 2.4. (marks)

3 There are 20 children in a class. Each flips a biased coin 15 times. The probability of getting a head is 0.25.
a Write down the expected number of heads that each child would get. (2 marks)
b Find an estimate of the probability that the mean number of heads is at most 4. (3 marks)

4 A town is hit by three thunderstorms per month, on average.
a Find the probability that there are four thunderstorms next month. (2 marks)
b Use the central limit theorem to estimate the probability that over the course of a year, the average number of thunderstorms each month is at most 2.5. (4 marks)

5 The continuous random variable $X$ is uniformly distributed over the interval $[a - 3, 3a + 5]$ where $a$ is a constant. 40 observations of $X$ will be taken. Use the central limit theorem to find an approximate distribution for $\bar{X}$.
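
The central limit theorem calculations in this chapter's two worked examples (the altered dice rolled 40 times, and the supermarket with 20 arrivals per minute) can be checked numerically. The following is a minimal Python sketch, not part of the textbook: the helper name `clt_sample_mean` is hypothetical, and it relies only on `statistics.NormalDist` from the standard library (Python 3.8 or later). Because it does not round the z-value to two decimal places, its answers differ slightly in the fourth decimal place from the table-based values 0.0401 and 0.0749.

```python
from fractions import Fraction
from math import sqrt
from statistics import NormalDist


def clt_sample_mean(mu, var, n):
    """Hypothetical helper: CLT approximation N(mu, var / n) for the mean of n observations."""
    return NormalDist(mu=mu, sigma=sqrt(var / n))


# Altered dice: faces 1, 3 and 6 with probabilities 1/2, 1/3 and 1/6.
faces = {1: Fraction(1, 2), 3: Fraction(1, 3), 6: Fraction(1, 6)}
dice_mu = sum(x * p for x, p in faces.items())                      # E(X)
dice_var = sum(x * x * p for x, p in faces.items()) - dice_mu ** 2  # E(X^2) - mu^2

dice_mean = clt_sample_mean(float(dice_mu), float(dice_var), 40)
p_dice = 1 - dice_mean.cdf(3)  # P(mean of 40 scores > 3)

# Supermarket: arrivals per minute modelled as Po(20), so mean = variance = 20.
arrivals_mean = clt_sample_mean(20, 20, 60)
p_arrivals = arrivals_mean.cdf(1150 / 60)  # P(total over 60 minutes <= 1150)

print(f"mu = {dice_mu}, variance = {dice_var}")
print(f"P(mean > 3) is approximately {p_dice:.4f}")
print(f"P(total <= 1150) is approximately {p_arrivals:.4f}")
```

Exact fractions are used for the dice so that the mean and variance come out as 5/2 and 13/4, matching the working in the example, before being converted to floats for the normal approximation.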
