Sampling
This chapter explains the role of sampling, which provides reliable information in less time, at lower cost, and with less manpower than a complete enumeration. These advantages accrue only when the sample is truly representative of the population. Representativeness is achieved by organizing the sampling process through one of the well-designed sampling techniques described in this chapter. These are:

Simple random sampling
Systematic sampling
Stratified sampling
Cluster sampling
Judgment sampling

CENSUS AND SAMPLING
Data can be collected from a population on either a census or a sampling basis. A census implies 100% enumeration of the items in the population; sampling implies selecting only a part of the population.
Non-Random or Non-Probability sampling
1. Judgment sampling
2. Quota sampling
3. Convenience sampling

Simple random sampling
Simple random sampling selects samples by methods that allow each possible sample to have an equal probability of being picked and each item in the entire population to have an equal chance of being included in the sample. For example, suppose we have a population of four students in a seminar and we want samples of two students at a time for interviewing purposes. There are six possible samples of two, each equally likely to be drawn, so all students have an equal chance of being included in the sample.

There are two types of populations, finite and infinite. A finite population has a stated or limited size. An infinite population is one in which it is theoretically impossible to observe all the elements.

Random Table Method: The easiest way to select randomly is to use random numbers. These numbers can be generated either by a computer programmed to scramble numbers or by a table of random numbers, which should properly be called a table of random digits.

Lottery Method: Another way to select our sample would be to write the name or number of each member of the population on a slip of paper and deposit the slips in a box. After mixing them thoroughly, we could draw, say, 10 slips at random. This method works well with a small group of people but presents problems if the people in the population number in the thousands.

Probability sampling means that everyone in a given population has a known, nonzero chance of being surveyed for a particular piece of research; in simple random sampling that chance is equal for everyone. Let's say we want to know how many people would choose blue as their favorite color. If we wanted to answer that question in the context of the average American using a simple random sample, everyone in the United States would need to have an equal chance of being sampled for the study.
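The seminar example of four students can be checked with a short sketch. This is only an illustration in Python: the student labels A to D and the fixed seed are assumptions made for the example, and the computer's random number generator stands in for the random number table or the box of slips.

```python
import itertools
import random

# Hypothetical labels for the four students in the seminar example.
students = ["A", "B", "C", "D"]

# Every possible sample of two students (order does not matter).
all_samples = list(itertools.combinations(students, 2))

# Count how many of the equally likely samples include each student.
inclusion_counts = {s: sum(s in sample for sample in all_samples) for s in students}

# Draw one simple random sample of size 2 without replacement.
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
drawn = rng.sample(students, 2)

print(len(all_samples))   # 6 possible samples
print(inclusion_counts)   # each student appears in 3 of the 6 samples
print(drawn)
```

Because each student appears in the same number of the six equally likely samples, each has the same chance (3 in 6) of being included.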
Systematic Sampling
In simple random sampling, the units in the sample are selected with the help of a random number table. In another method of sampling, only the first unit of the sample is selected with the help of a random number table, and the rest are selected automatically according to a predetermined pattern. This method is known as systematic sampling. In systematic sampling, elements are selected from the population at a uniform interval that is measured in time, order, or space.

Systematic sampling carries the risk of introducing an error into the sampling process: if the elements lie in a sequential pattern that coincides with the sampling interval, the sample may be unrepresentative. Even so, the method may require less time and sometimes results in lower costs than simple random sampling.
As an example, suppose we want to select 10 households from a list of 40. We first calculate the sampling interval by dividing the total number of households in the population (40) by the number we want in the sample (10). In this case, the sampling interval is 4. We then select a number between 1 and the sampling interval from the random number table (in this case 3). Household #3 is the first household. We then count down the list starting with household #3 and select every 4th household. For example, the second selected household is 3 + 4, or #7. When you reach the end of the list, you should have selected your desired number of households. If you have not, you have counted wrong or miscalculated the sampling interval, and you should go back and start over. The final selection should be households #3, #7, #11, #15, #19, #23, #27, #31, #35, and #39.
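The household selection can be sketched in a few lines of Python. The function name systematic_sample and its parameters are made up for this illustration, and the sketch assumes the population size is an exact multiple of the sample size, as it is in the example.

```python
import random

def systematic_sample(population_size, sample_size, start=None, rng=None):
    """Return 1-based list positions chosen at a fixed interval from a random start."""
    interval = population_size // sample_size   # sampling interval
    if start is None:
        rng = rng or random.Random()
        start = rng.randint(1, interval)        # random start between 1 and the interval
    return [start + i * interval for i in range(sample_size)]

# The household example: 40 households, a sample of 10, random start of 3.
selected = systematic_sample(40, 10, start=3)
print(selected)  # [3, 7, 11, 15, 19, 23, 27, 31, 35, 39]
```

Leaving out the start argument lets the function pick the first household at random, which is the step the random number table performs in the text.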
Stratified Sampling
To use stratified sampling, we divide the population into relatively homogeneous groups, called strata. Then we use one of two approaches. Either we select at random from each stratum a specified number of elements corresponding to the proportion of that stratum in the population as a whole, or we draw an equal number of elements from each stratum and weight the results according to the stratum's proportion of the total population. With either approach, stratified sampling guarantees that every element in the population has a chance of being selected. Stratified sampling is appropriate when the population is already divided into groups of different sizes. The advantage of stratified samples is that, when they are properly designed, they reflect the characteristics of the population from which they were chosen more accurately than do other kinds of samples.

For example, let's say that the population of clients for our agency can be divided into three groups: Caucasian, African-American, and Hispanic-American. Furthermore, let's assume that both the African-Americans and Hispanic-Americans are relatively small minorities of the clientele (10% and 5% respectively). If we just did a simple random sample of n=100 with a sampling fraction of 10%, we would expect to get only about 10 and 5 persons, respectively, from our two smaller groups. And, by chance, we could get fewer than that! If we stratify, we can do better.

First, let's determine how many people we want to have in each group. Let's say we still want to take a sample of 100 from the population of 1000 clients over the past year, but we think that in order to say anything about subgroups we will need at least 25 cases in each group. So, let's sample 50 Caucasians, 25 African-Americans, and 25 Hispanic-Americans. We know that 10% of the population, or 100 clients, are African-American. If we randomly sample 25 of these, we have a within-stratum sampling fraction of 25/100 = 25%. Similarly, we know that 5%, or 50 clients, are Hispanic-American, so our within-stratum sampling fraction will be 25/50 = 50%. Finally, by subtraction we know that there are 850 Caucasian clients. Our within-stratum sampling fraction for them is 50/850 = about 5.88%.
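The arithmetic of the agency example can be reproduced with a small Python sketch. The client identifiers are hypothetical placeholders and the seed is fixed only for repeatability; the stratum sizes and sample sizes are the ones given above. The weight shown for each stratum (stratum size divided by sample size) is one common way to give weight to the results when strata are sampled at different rates.

```python
import random

# Stratum sizes and planned sample sizes taken from the agency example.
stratum_sizes = {"Caucasian": 850, "African-American": 100, "Hispanic-American": 50}
sample_sizes = {"Caucasian": 50, "African-American": 25, "Hispanic-American": 25}

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Hypothetical client IDs: each client is labelled (stratum, index).
strata = {name: [(name, i) for i in range(size)] for name, size in stratum_sizes.items()}

# Simple random sample without replacement inside each stratum.
sample = {name: rng.sample(members, sample_sizes[name]) for name, members in strata.items()}

# Within-stratum sampling fraction (sample size / stratum size) and the
# weight (stratum size / sample size): how many clients each sampled client stands for.
fractions = {name: sample_sizes[name] / stratum_sizes[name] for name in strata}
weights = {name: stratum_sizes[name] / sample_sizes[name] for name in strata}

for name in strata:
    print(f"{name}: n={len(sample[name])}, fraction={fractions[name]:.2%}, weight={weights[name]:g}")
```

Multiplying each stratum's sample size by its weight and adding the results recovers the 1000 clients in the population, which is a quick check that the weights are consistent.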
Because the groups are more homogeneous within-group than across the population as a whole, we can expect greater statistical precision (less variance). And, because we stratified, we know we will have enough cases from each group to make meaningful subgroup inferences.

Cluster Sampling
In cluster sampling, we divide the population into groups, or clusters, and then select a random sample of these clusters. There are two types of cluster sampling:
1. Single-stage cluster sampling
2. Multistage cluster sampling

We assume that individual clusters are representative of the population as a whole. The cost advantage of cluster sampling arises mainly because collecting data from nearby units is easier, cheaper, faster, and more convenient than observing units scattered over a wide area.

For instance, in the figure we see a map of the counties in New York State. Let's say that we have to do a survey of town governments that will require us to go to the towns personally. If we do a simple random sample statewide, we'll have to cover the entire state geographically. Instead, we decide to do a cluster sampling of five counties (marked in red in the figure). Once these are selected, we go to every town government in the five counties. Clearly this strategy will help us to economize on our mileage. Cluster or area sampling, then, is useful in situations like this, and is done primarily for efficiency of administration. Note also that we probably don't need this approach if we are conducting a mail or telephone survey, because where we call or send letters matters much less for cost and efficiency.
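The county example corresponds to single-stage cluster sampling, which might be sketched as follows. The county and town names are invented placeholders, not real New York data, and the frame of ten counties with three towns each is an assumption chosen to keep the example small.

```python
import random

# Hypothetical sampling frame: county -> list of town governments.
towns_by_county = {
    f"County {c}": [f"Town {c}-{t}" for t in range(1, 4)]
    for c in range(1, 11)
}

rng = random.Random(7)  # fixed seed only so the sketch is repeatable

# Draw a simple random sample of five clusters (counties).
chosen_counties = rng.sample(sorted(towns_by_county), 5)

# Single-stage cluster sampling: survey every town in each chosen county.
towns_to_visit = [town for county in chosen_counties for town in towns_by_county[county]]

print(chosen_counties)
print(len(towns_to_visit))  # 5 counties x 3 towns each = 15
```

A multistage design would add a further step, drawing a random sample of towns within each chosen county instead of visiting all of them.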
NON-RANDOM SAMPLING
JUDGMENT SAMPLING
In this type of sampling, the selection of units to be included in the sample depends on the judgment or assessment of the person(s) collecting the sample. The sample is selected based on their judgment as to what would constitute a representative sample. This is especially useful when the sample size is small and random sampling might leave out the units that are most important and critical to the objective of the study. For example, in a training institute the teaching staff numbered 30. However, for urgent academic or administrative matters, the director used to seek the opinion of one particular assistant professor, who was known to have balanced views, did not belong to any group, and was frank enough to express his views. Thus, the director relied on a sample of size one.
QUOTA SAMPLING
Such sampling is usually resorted to when a quota for the number of units to be included in the sample is fixed. The quota is fixed because of constraints on the availability of time or money. Within the stipulated quota, one has to select a sample that is representative of the entire population. For example, within an overall quota of interviewing 100 persons for an opinion poll, one may contact some persons from each of various categories such as college students, housewives, shopkeepers, and daily wage earners. Similarly, in an organization, one might include persons from categories of staff by function, by department, etc.

As another example, a researcher is interested in the attitudes of members of different religions towards the death penalty. In Iowa, a random sample might miss Muslims (because there are not many in that state). To be sure of their inclusion, a researcher could set a quota of 3% Muslim for the sample. However, the sample will then no longer be representative of the actual proportions in the population, which may limit generalizing to the state population. But the quota will guarantee that the views of Muslims are represented in the survey.
CONVENIENCE SAMPLING
Such sampling is dictated by the needs of convenience rather than any other consideration. For example, one could select a sample of persons from a list of credit card holders. Another example relates to opinion polls, where one may find it easier to get the opinions of people in shops or restaurants or walking on the pavement than to go from house to house. Similarly, to get an idea of preferences for brands and features of televisions, one could contact people in or outside an electronics store selling televisions.

SNOWBALL SAMPLING
A snowball sample is a non-probability sampling technique that is appropriate to use in research when the members of a population are difficult to locate. In a snowball sample, the researcher collects data on the few members of the target population he or she can locate, then asks those individuals to provide the information needed to locate other members of that population whom they know. Snowball sampling is unlikely to lead to a representative sample, but there are times when it may be the best or only method available. For instance, if you are studying the homeless, you are not likely to find a list of all the homeless people in your city. However, if you identify one or two homeless individuals who are willing to participate in your study, it is likely that they know other homeless individuals in their area and can help you locate them. The same goes for underground subcultures, or any population whose members might want to keep their identity hidden, such as undocumented immigrants or ex-convicts.

A snowball sample is a subset of a purposive sample, so named because one picks up the sample along the way, analogous to a snowball accumulating snow. A snowball sample is achieved by asking a participant to suggest someone else who might be willing or appropriate for the study. Snowball samples are particularly useful in hard-to-track populations, such as truants, drug users, etc.
For example, if a researcher wishes to interview undocumented immigrants from Mexico, he or she might interview a few undocumented individuals that he or she knows or can locate and would then rely on those subjects to help locate more undocumented individuals. This process continues until the researcher has all the interviews he or she needs or until all contacts have been exhausted.
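The referral process can be pictured as following links in a network of contacts. The sketch below is only an illustration: the participant labels P1 to P6, the referral network, and the function name snowball_sample are all hypothetical, and a real study would discover the referrals during the interviews instead of knowing them in advance.

```python
from collections import deque

# Hypothetical referral network: who each participant can put us in touch with.
referrals = {
    "P1": ["P2", "P3"],
    "P2": ["P4"],
    "P3": ["P4", "P5"],
    "P4": [],
    "P5": ["P6"],
    "P6": [],
}

def snowball_sample(seeds, referrals, target_size):
    """Interview the seeds, then follow referrals until the target is met
    or every contact has been exhausted."""
    interviewed = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(interviewed) < target_size:
        person = queue.popleft()
        interviewed.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:   # do not recruit the same person twice
                seen.add(contact)
                queue.append(contact)
    return interviewed

print(snowball_sample(["P1"], referrals, target_size=4))   # ['P1', 'P2', 'P3', 'P4']
print(snowball_sample(["P1"], referrals, target_size=10))  # stops after all six reachable people
```

The two calls show the two stopping conditions: the first ends when the researcher has enough interviews, the second when all contacts have been exhausted.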
CENTRAL LIMIT THEOREM
The relationship between the shape of the population distribution and the shape of the sampling distribution of the mean is described by the central limit theorem. It assures us that the sampling distribution of the mean approaches the normal distribution as the sample size increases. The significance of the central limit theorem is that it permits us to use sample statistics to make inferences about population parameters without knowing anything about the shape of the distribution of that population other than what we can get from the sample.
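A small simulation can make the theorem concrete. The sketch below is an illustration under assumptions not in the text: the population is taken to be exponential with mean 1 (a strongly skewed shape), the sample sizes 2 and 30 and the 2000 repetitions are arbitrary choices, and skewness is used as a rough measure of departure from the symmetric normal shape.

```python
import random
import statistics

rng = random.Random(123)  # fixed seed only so the sketch is repeatable

def sample_means(sample_size, repetitions=2000):
    """Means of repeated samples from a skewed (exponential, mean 1) population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(repetitions)
    ]

def skewness(values):
    """Average cubed standardized deviation; close to 0 for a symmetric shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

for n in (2, 30):
    means = sample_means(n)
    # This population has mean 1 and standard deviation 1, so the spread of
    # the sample means should be close to 1 / sqrt(n).
    print(n, round(statistics.fmean(means), 3),
          round(statistics.stdev(means), 3), round(skewness(means), 3))
```

With the larger sample size, the sample means should cluster more tightly around the population mean and their skewness should move toward zero, which is the behaviour the theorem describes.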