Robert V. Hogg and Allen T. Craig, The University of Iowa

Introduction to Mathematical Statistics, Fourth Edition

Macmillan Publishing Co., Inc., New York
Collier Macmillan Publishers, London

Copyright © 1978, Macmillan Publishing Co., Inc. Printed in the United States of America. All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the Publisher. Earlier editions © 1958 and 1959 and copyright © 1965 and 1970 by Macmillan Publishing Co., Inc.

Macmillan Publishing Co., Inc., 866 Third Avenue, New York, New York 10022
Collier Macmillan Canada, Ltd.

Library of Congress Cataloging in Publication Data
Hogg, Robert V.
Introduction to mathematical statistics.
Bibliography: p. Includes index.
1. Mathematical statistics. I. Craig, Allen Thornton, (date) joint author. II. Title.
QA276.H59 1978 519 77-2884
ISBN 0-02-355710-9 (Hardbound)
ISBN 0-02-978990-7 (International Edition)
Printing: 13 14 15  Year: 5 6 7 8 9

Preface

We are much indebted to our colleagues throughout the country who have so generously provided us with suggestions on both the order of presentation and the kind of material to be included in this edition of Introduction to Mathematical Statistics. We believe that you will find the book much more adaptable for classroom use than the previous edition. Again, essentially all the distribution theory that is needed is found in the first five chapters. Estimation and tests of statistical hypotheses, including nonparametric methods, follow in Chapters 6, 7, 8, and 9, respectively. However, sufficient statistics can be introduced earlier by considering Chapter 10 immediately after Chapter 6 on estimation.
Many of the topics of Chapter 11 are such that they may also be introduced sooner: the Rao-Cramér inequality (11.1) and robust estimation (11.7) after measures of the quality of estimators (6.2), sequential analysis (11.2) after best tests (7.2), multiple comparisons (11.3) after the analysis of variance (8.5), and classification (11.4) after material on the sample correlation coefficient (8.7). With this flexibility the first eight chapters can easily be covered in courses of either six semester hours or eight quarter hours, supplementing with the various topics from Chapters 9 through 11 as the teacher chooses and as the time permits. In a longer course, we hope many teachers and students will be interested in the topics of stochastic independence (11.5), robustness (11.6 and 11.7), multivariate normal distributions (12.1), and quadratic forms (12.2 and 12.3).

We are obligated to Catherine M. Thompson and Maxine Merrington and to Professor E. S. Pearson for permission to include Tables II and V, which are abridgments and adaptations of tables published in Biometrika. We wish to thank Oliver & Boyd Ltd., Edinburgh, for permission to include Table IV, which is an abridgment and adaptation of Table III from the book Statistical Tables for Biological, Agricultural, and Medical Research by the late Professor Sir Ronald A. Fisher, Cambridge, and Dr. Frank Yates, Rothamsted. Finally, we wish to thank Mrs. Karen Horner for her first-class help in the preparation of the manuscript.

R. V. H.
A. T. C.

Contents

Chapter 1  Distributions of Random Variables  1
1.1 Introduction  1
1.2 Algebra of Sets  4
1.3 Set Functions  8
1.4 The Probability Set Function  12
1.5 Random Variables  16
1.6 The Probability Density Function  23
1.7 The Distribution Function  31
1.8 Certain Probability Models  38
1.9 Mathematical Expectation  44
1.10 Some Special Mathematical Expectations  48
1.11 Chebyshev's Inequality  58

Chapter 2  Conditional Probability and Stochastic Independence  61
2.1 Conditional Probability  61
2.2 Marginal and Conditional Distributions  65
2.3 The Correlation Coefficient  73
2.4 Stochastic Independence  80

Chapter 3  Some Special Distributions
3.1 The Binomial, Trinomial, and Multinomial Distributions
3.2 The Poisson Distribution  99
3.3 The Gamma and Chi-Square Distributions  103
3.4 The Normal Distribution  109
3.5 The Bivariate Normal Distribution  117

Chapter 4  Distributions of Functions of Random Variables  122
4.1 Sampling Theory  122
4.2 Transformations of Variables of the Discrete Type  128
4.3 Transformations of Variables of the Continuous Type  132
4.4 The t and F Distributions  143
4.5 Extensions of the Change-of-Variable Technique  147
4.6 Distributions of Order Statistics  154
4.7 The Moment-Generating-Function Technique  164
4.8 The Distributions of X̄ and nS²/σ²
172
4.9 Expectations of Functions of Random Variables  176

Chapter 5  Limiting Distributions  181
5.1 Limiting Distributions  181
5.2 Stochastic Convergence  186
5.3 Limiting Moment-Generating Functions  188
5.4 The Central Limit Theorem  192
5.5 Some Theorems on Limiting Distributions  196

Chapter 6  Estimation  200
6.1 Point Estimation  200
6.2 Measures of Quality of Estimators  207
6.3 Confidence Intervals for Means  212
6.4 Confidence Intervals for Differences of Means  219
6.5 Confidence Intervals for Variances  222
6.6 Bayesian Estimates  227

Chapter 7  Statistical Hypotheses  235
7.1 Some Examples and Definitions  235
7.2 Certain Best Tests  242
7.3 Uniformly Most Powerful Tests  251
7.4 Likelihood Ratio Tests  257

Chapter 8  Other Statistical Tests  269
8.1 Chi-Square Tests  269
8.2 The Distributions of Certain Quadratic Forms  278
8.3 A Test of the Equality of Several Means  283
8.4 Noncentral χ² and Noncentral F  288
8.5 The Analysis of Variance  291
8.6 A Regression Problem  296
8.7 A Test of Stochastic Independence  300

Chapter 9  Nonparametric Methods  304
9.1 Confidence Intervals for Distribution Quantiles  304
9.2 Tolerance Limits for Distributions  307
9.3 The Sign Test  312
9.4 A Test of Wilcoxon  314
9.5 The Equality of Two Distributions  320
9.6 The Mann-Whitney-Wilcoxon Test  326
9.7 Distributions Under Alternative Hypotheses  331
9.8 Linear Rank Statistics  334

Chapter 10  Sufficient Statistics  341
10.1 A Sufficient Statistic for a Parameter  341
10.2 The Rao-Blackwell Theorem  349
10.3 Completeness and Uniqueness  353
10.4 The Exponential Class of Probability Density Functions  357
10.5 Functions of a Parameter  361
10.6 The Case of Several Parameters  364

Chapter 11  Further Topics in Statistical Inference  370
11.1 The Rao-Cramér Inequality  370
11.2 The Sequential Probability Ratio Test  374
11.3 Multiple Comparisons  380
11.4 Classification  385
11.5 Sufficiency, Completeness, and Stochastic Independence  389
11.6 Robust Nonparametric Methods  396
11.7
Robust Estimation  400

Chapter 12  Further Normal Distribution Theory  405
12.1 The Multivariate Normal Distribution  405
12.2 The Distributions of Certain Quadratic Forms  410
12.3 The Independence of Certain Quadratic Forms  414

Appendix A  References  421
Appendix B  Tables  429
Appendix C  Answers to Selected Exercises  435
Index

Chapter 1
Distributions of Random Variables

1.1 Introduction

Many kinds of investigations may be characterized in part by the fact that repeated experimentation, under essentially the same conditions, is more or less standard procedure. For instance, in medical research, interest may center on the effect of a drug that is to be administered; or an economist may be concerned with the prices of three specified commodities at various time intervals; or the agronomist may wish to study the effect that a chemical fertilizer has on the yield of a cereal grain. The only way in which an investigator can elicit information about any such phenomenon is to perform his experiment. Each experiment terminates with an outcome. But it is characteristic of these experiments that the outcome cannot be predicted with certainty prior to the performance of the experiment.

Suppose that we have such an experiment, the outcome of which cannot be predicted with certainty, but the experiment is of such a nature that the collection of every possible outcome can be described prior to its performance. If this kind of experiment can be repeated under the same conditions, it is called a random experiment, and the collection of every possible outcome is called the experimental space or the sample space.

Example 1. In the toss of a coin, let the outcome tails be denoted by T and let the outcome heads be denoted by H.
If we assume that the coin may be repeatedly tossed under the same conditions, then the toss of this coin is an example of a random experiment in which the outcome is one of the two symbols T and H; that is, the sample space is the collection of these two symbols.

Example 2. In the cast of one red die and one white die, let the outcome be the ordered pair (number of spots up on the red die, number of spots up on the white die). If we assume that these two dice may be repeatedly cast under the same conditions, then the cast of this pair of dice is a random experiment and the sample space consists of the 36 ordered pairs (1, 1), ..., (1, 6), (2, 1), ..., (2, 6), ..., (6, 6).

Let 𝒞 denote a sample space, and let C represent a part of 𝒞. If, upon the performance of the experiment, the outcome is in C, we shall say that the event C has occurred. Now conceive of our having made N repeated performances of the random experiment. Then we can count the number f of times (the frequency) that the event C actually occurred throughout the N performances. The ratio f/N is called the relative frequency of the event C in these N experiments. A relative frequency is usually quite erratic for small values of N, as you can discover by tossing a coin. But as N increases, experience indicates that relative frequencies tend to stabilize. This suggests that we associate with the event C a number, say p, that is equal or approximately equal to that number about which the relative frequency seems to stabilize. If we do this, then the number p can be interpreted as that number which, in future performances of the experiment, the relative frequency of the event C will either equal or approximate. Thus, although we cannot predict the outcome of a random experiment, we can, for a large value of N, predict approximately the relative frequency with which the outcome will be in C.
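The stabilizing behavior of f/N described above is easy to observe by simulation. The following is a minimal sketch; the helper names (relative_frequency, coin, is_head) and the seed are our own illustration, not notation from the text.

```python
import random

def relative_frequency(event, experiment, n, seed=0):
    """Perform a random experiment n times and return f/N for the event C."""
    rng = random.Random(seed)
    f = sum(1 for _ in range(n) if event(experiment(rng)))
    return f / n

def coin(rng):
    """One toss of an unbiased coin: outcome T or H, as in Example 1."""
    return rng.choice("TH")

def is_head(outcome):
    return outcome == "H"

small = relative_frequency(is_head, coin, 20)        # erratic for small N
large = relative_frequency(is_head, coin, 100_000)   # settles near 1/2
```

Running this with increasing n shows exactly the pattern the text describes: f/N wanders for N = 20 but stays very close to 1/2 once N is large.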
The number p associated with the event C is given various names. Sometimes it is called the probability that the outcome of the random experiment is in C; sometimes it is called the probability of the event C; and sometimes it is called the probability measure of C. The context usually suggests an appropriate choice of terminology.

Example 3. Let 𝒞 denote the sample space of Example 2 and let C be the collection of every ordered pair of 𝒞 for which the sum of the pair is equal to seven. Thus C is the collection (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), and (6, 1). Suppose that the dice are cast N = 400 times and let f, the frequency of a sum of seven, be f = 60. Then the relative frequency with which the outcome was in C is f/N = 60/400 = 0.15. Thus we might associate with C a number p that is close to 0.15, and p would be called the probability of the event C.

Remark. The preceding interpretation of probability is sometimes referred to as the relative frequency approach, and it obviously depends upon the fact that an experiment can be repeated under essentially identical conditions. However, many persons extend probability to other situations by treating it as a rational measure of belief. For example, the statement p = 2/5 would mean to them that their personal or subjective probability of the event C is equal to 2/5. Hence, if they are not opposed to gambling, this could be interpreted as a willingness on their part to bet on the outcome of C so that the two possible payoffs are in the ratio p/(1 − p) = (2/5)/(3/5) = 2/3. Moreover, if they truly believe that p = 2/5 is correct, they would be willing to accept either side of the bet: (a) win 3 units if C occurs and lose 2 if it does not occur, or (b) win 2 units if C does not occur and lose 3 if it does.
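Example 3 can be checked directly: enumerate the 36 ordered pairs of Example 2, pick out the event C (sum of seven), and simulate one run of N = 400 casts. The variable names and the seed below are our own; the counts come straight from the text.

```python
import itertools
import random

# The sample space of Example 2: all 36 ordered pairs (red, white).
space = list(itertools.product(range(1, 7), repeat=2))

# The event C of Example 3: pairs whose sum is seven.
C = [pair for pair in space if sum(pair) == 7]

exact = len(C) / len(space)    # 6/36 = 1/6, close to the observed 0.15

# One simulated run of N = 400 casts, analogous to the f = 60 of the text.
rng = random.Random(1)
N = 400
f = sum(1 for _ in range(N) if sum(rng.choice(space)) == 7)
rel_freq = f / N
```

The exact ratio 6/36 explains why a relative frequency near 0.15 is unsurprising: C contains 6 of the 36 equally likely ordered pairs.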
However, since the mathematical properties of probability given in Section 1.4 are consistent with either of these interpretations, the subsequent mathematical development does not depend upon which approach is used.

The primary purpose of having a mathematical theory of statistics is to provide mathematical models for random experiments. Once a model for such an experiment has been provided and the theory worked out in detail, the statistician may, within this framework, make inferences (that is, draw conclusions) about the random experiment. The construction of such a model requires a theory of probability. One of the more logically satisfying theories of probability is that based on the concepts of sets and functions of sets. These concepts are introduced in Sections 1.2 and 1.3.

EXERCISES

1.1. In each of the following random experiments, describe the sample space 𝒞. Use any experience that you may have had (or use your intuition) to assign a value to the probability p of the event C in each of the following instances:
(a) The toss of an unbiased coin where the event C is tails.
(b) The cast of an honest die where the event C is a five or a six.
(c) The draw of a card from an ordinary deck of playing cards where the event C occurs if the card is a spade.
(d) The choice of a number on the interval zero to 1 where the event C occurs if the number is less than 3/4.
(e) The choice of a point from the interior of a square with opposite vertices (−1, −1) and (1, 1) where the event C occurs if the sum of the coordinates of the point is less than 3/4.

1.2. A point is to be chosen in a haphazard fashion from the interior of a fixed circle. Assign a probability p that the point will be inside another circle, which has a radius of one-half the first circle and which lies entirely within the first circle.

1.3. An unbiased coin is to be tossed twice.
Assign a probability p₁ to the event that the first toss will be a head and that the second toss will be a tail. Assign a probability p₂ to the event that there will be one head and one tail in the two tosses.

1.2 Algebra of Sets

The concept of a set or a collection of objects is usually left undefined. However, a particular set can be described so that there is no misunderstanding as to what collection of objects is under consideration. For example, the set of the first 10 positive integers is sufficiently well described to make clear that the numbers 3/4 and 14 are not in the set, while the number 3 is in the set. If an object belongs to a set, it is said to be an element of the set. For example, if A denotes the set of real numbers x for which 0 ≤ x ≤ 1, then 3/4 is an element of the set A. The fact that 3/4 is an element of the set A is indicated by writing 3/4 ∈ A. More generally, a ∈ A means that a is an element of the set A.

The sets that concern us will frequently be sets of numbers. However, the language of sets of points proves somewhat more convenient than that of sets of numbers. Accordingly, we briefly indicate how we use this terminology. In analytic geometry considerable emphasis is placed on the fact that to each point on a line (on which an origin and a unit point have been selected) there corresponds one and only one number, say x; and that to each number x there corresponds one and only one point on the line. This one-to-one correspondence between the numbers and points on a line enables us to speak, without misunderstanding, of the "point x" instead of the "number x." Furthermore, with a plane rectangular coordinate system and with x and y numbers, to each symbol (x, y) there corresponds one and only one point in the plane; and to each point in the plane there corresponds but one such symbol.
Here again, we may speak of the "point (x, y)," meaning the "ordered number pair x and y." This convenient language can be used when we have a rectangular coordinate system in a space of three or more dimensions. Thus the "point (x₁, x₂, ..., xₙ)" means the numbers x₁, x₂, ..., xₙ in the order stated. Accordingly, in describing our sets, we frequently speak of a set of points (a set whose elements are points), being careful, of course, to describe the set so as to avoid any ambiguity. The notation A = {x; 0 < x < 1} is read "A is the one-dimensional set of points x for which 0 < x < 1." Similarly, A = {(x, y); 0 ≤ x ≤ 1, 0 ≤ y ≤ 1} can be read "A is the two-dimensional set of points (x, y) that are interior to, or on the boundary of, a square with opposite vertices at (0, 0) and (1, 1)."

We now give some definitions (together with illustrative examples) that lead to an elementary algebra of sets adequate for our purposes.

Definition 1. If each element of a set A₁ is also an element of set A₂, the set A₁ is called a subset of the set A₂. This is indicated by writing A₁ ⊂ A₂. If A₁ ⊂ A₂ and also A₂ ⊂ A₁, the two sets have the same elements, and this is indicated by writing A₁ = A₂.

Example 1. Let A₁ = {x; 0 ≤ x ≤ 1} and A₂ = {x; −1 ≤ x ≤ 2}. Then A₁ ⊂ A₂.

... The union A₁ ∪ A₂ ∪ A₃ ∪ ⋯ = {x; 0 < x ≤ 1}. Note that the number zero is not in this set, since it is not in one of the sets A₁, A₂, A₃, ....

Definition 4. The set of all elements that belong to each of the sets A₁ and A₂ is called the intersection of A₁ and A₂. The intersection of A₁ and A₂ is indicated by writing A₁ ∩ A₂. The intersection of several sets A₁, A₂, A₃, ... is the set of all elements that belong to each of the sets A₁, A₂, A₃, .... This intersection is denoted by A₁ ∩ A₂ ∩ A₃ ∩ ⋯ or by A₁ ∩ A₂ ∩ ⋯ ∩ Aₖ if a finite number k of sets is involved.

Example 8. Let A₁ = {(x, y); (x, y) = (0, 0), (0, 1), (1, 1)} and A₂ = {(x, y); (x, y) = (1, 1), (1, 2), (2, 1)}.
Then A₁ ∩ A₂ = {(x, y); (x, y) = (1, 1)}.

FIGURE 1.1 (diagrams of the sets A₁, A₂, and A₁ ∩ A₂)

Example 9. Let A₁ = {(x, y); 0 ... x > 0, y > 0}.

Definition 6. Let 𝒜 denote a space and let A be a subset of the set 𝒜. The set that consists of all elements of 𝒜 that are not elements of A is called the complement of A (actually, with respect to 𝒜). The complement of A is denoted by A*. In particular, 𝒜* = ∅.

Example 16. Let 𝒜 be defined as in Example 14, and let the set A = {x; x = 0, 1}. The complement of A (with respect to 𝒜) is A* = {x; x = 2, 3, 4}.

Example 17. Given A ⊂ 𝒜. Then A ∪ A* = 𝒜, A ∩ A* = ∅, A ∪ 𝒜 = 𝒜, A ∩ 𝒜 = A, and (A*)* = A.

EXERCISES

1.4. Find the union A₁ ∪ A₂ and the intersection A₁ ∩ A₂ of the two sets A₁ and A₂, where:
(a) A₁ = {x; x = 0, 1, 2}, A₂ = {x; x = 2, 3, 4}.
(b) A₁ = {x; 0 < x < 2}, A₂ = {x; 1 ≤ x < 3}.
(c) A₁ = {(x, y); 0 < ...

1.9. If A₁, A₂, A₃, ... are sets such that Aₖ ⊃ Aₖ₊₁, k = 1, 2, 3, ..., the sequence is said to be a nonincreasing sequence. Give an example of this kind of sequence of sets.

1.10. If A₁, A₂, A₃, ... are sets such that Aₖ ⊂ Aₖ₊₁, k = 1, 2, 3, ..., lim Aₖ as k → ∞ is defined as the union A₁ ∪ A₂ ∪ A₃ ∪ ⋯. Find lim Aₖ if:
(a) Aₖ = {x; 1/k ≤ x ≤ 3 − 1/k}, k = 1, 2, 3, ...;
(b) Aₖ = {(x, y); 1/k ≤ x² + y² ≤ 4 − 1/k}, k = 1, 2, 3, ....

1.11. If A₁, A₂, A₃, ... are sets such that Aₖ ⊃ Aₖ₊₁, k = 1, 2, 3, ..., lim Aₖ as k → ∞ is defined as the intersection A₁ ∩ A₂ ∩ A₃ ∩ ⋯. Find lim Aₖ if:
(a) Aₖ = {x; 2 − 1/k < x ≤ 2}, k = 1, 2, 3, ...;
(b) Aₖ = {x; 2 < x ≤ 2 + 1/k}, k = 1, 2, 3, ...;
(c) Aₖ = {(x, y); 0 ≤ x² + y² ≤ 1/k}, k = 1, 2, 3, ....

1.3 Set Functions

In the calculus, functions such as f(x) = 2x, −∞ < x < ∞, ... then Q(A) is undefined. At this point we introduce the following notations. The symbol

∫_A f(x) dx

will mean the ordinary (Riemann) integral of f(x) over a prescribed one-dimensional set A; the symbol

∫∫_A g(x, y) dx dy

will mean the Riemann integral of g(x, y) over a prescribed two-dimensional set A; and so on.
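The subset, intersection, and complement operations above (Definitions 1, 4, and 6, with Examples 8, 16, and 17) map directly onto Python's built-in set type. This sketch uses the sets of Example 8 and takes {0, 1, 2, 3, 4} as the space, following Example 16; the variable names are our own.

```python
# Example 8: two finite sets of ordered pairs.
A1 = {(0, 0), (0, 1), (1, 1)}
A2 = {(1, 1), (1, 2), (2, 1)}

union = A1 | A2               # all elements in A1 or A2 (or both)
intersection = A1 & A2        # Example 8: A1 ∩ A2 = {(1, 1)}

# Definition 6: complement with respect to a space (Example 16's space).
space = {0, 1, 2, 3, 4}
A = {0, 1}
A_star = space - A            # complement of A: {2, 3, 4}

# The identities of Example 17, checked mechanically.
identities = (
    A | A_star == space,      # A ∪ A* = 𝒜
    A & A_star == set(),      # A ∩ A* = ∅
    space - A_star == A,      # (A*)* = A
)
```

Definition 1 also falls out for free: `A1 <= union` is Python's subset test, so both A₁ ⊂ A₁ ∪ A₂ and A₁ ∩ A₂ ⊂ A₁ can be verified the same way.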
To be sure, unless these sets A and these functions f(x) and g(x, y) are chosen with care, the integrals will frequently fail to exist. Similarly, the symbol

Σ_A f(x)

will mean the sum extended over all x ∈ A; the symbol

Σ_A g(x, y)

will mean the sum extended over all (x, y) ∈ A; and so on.

Example 4. Let A be a set in one-dimensional space and let Q(A) = Σ_A f(x), where

f(x) = (1/2)^x, x = 1, 2, 3, ...,
     = 0 elsewhere.

If A = {x; 0 < x ≤ 3}, then

Q(A) = 1/2 + (1/2)² + (1/2)³ = 7/8.

Example 5. Let Q(A) = Σ_A f(x), where

f(x) = p^x (1 − p)^(1−x), x = 0, 1,
     = 0 elsewhere.

If A = {x; x = 0}, then

Q(A) = p⁰(1 − p)¹ = 1 − p;

if A = {x; 1 ≤ x ≤ 3}, then Q(A) = f(1) = p.

Example 6. Let A be a one-dimensional set and let

Q(A) = ∫_A e^(−x) dx.

Thus, if A = {x; 0 ≤ x < ∞}, then

Q(A) = ∫₀^∞ e^(−x) dx = 1;

if A = {x; 1 ≤ x ≤ 2}, then

Q(A) = ∫₁² e^(−x) dx = e⁻¹ − e⁻²;

if A₁ = {x; 0 ...

Theorem 4. For each C ⊂ 𝒞, 0 ≤ P(C) ≤ 1.

Proof. Since ∅ ⊂ C ⊂ 𝒞, we have by Theorem 3 that

P(∅) ≤ P(C) ≤ P(𝒞), or 0 ≤ P(C) ≤ 1,

the desired result.
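The set functions of Examples 4 and 6 can be evaluated numerically as a check. This is a minimal sketch under our own conventions: Q_sum and Q_int are invented helper names, the set A of Example 4 is read as the integers {1, 2, 3}, and a midpoint-rule sum stands in for the Riemann integral.

```python
from math import exp

def Q_sum(A):
    """Q(A) = sum of f(x) over x in A, where f(x) = (1/2)**x for x = 1, 2, 3, ...
    and f(x) = 0 elsewhere (Example 4)."""
    return sum(0.5 ** x for x in A if x >= 1)

def Q_int(a, b, n=100_000):
    """Q(A) = integral of e**(-x) over A = (a, b) (Example 6),
    approximated by the midpoint rule with n subintervals."""
    h = (b - a) / n
    return h * sum(exp(-(a + (i + 0.5) * h)) for i in range(n))

q4 = Q_sum({1, 2, 3})      # 1/2 + 1/4 + 1/8 = 7/8, as in Example 4
q6 = Q_int(1.0, 2.0)       # approximately e**-1 - e**-2, as in Example 6
```

Both values also respect the bound of Theorem 4: since the total "mass" of each f is 1, any Q(A) computed this way lands in [0, 1].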
