Sampling
What is sampling?
Sampling is the selection of subjects in a statistical study to
represent a larger population. Because testing every member of
a given population isn’t always feasible, researchers select
samples to make testing more efficient and cost-effective.
Probability sampling:
1. Simple random sampling
In a simple random sample, every member of the population has an equal chance of being
selected. Your sampling frame should include the whole population.
To conduct this type of sampling, you can use tools like random number generators or other
techniques that are based entirely on chance.
Example: You want to select a simple random sample of 100 employees of a social media
marketing company. You assign a number to every employee in the company database from 1
to 1000, and use a random number generator to select 100 numbers.
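The example above can be sketched in Python. This is only an illustration: the function name simple_random_sample and the seed parameter are assumptions made here, and employees are represented simply by their assigned numbers.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct IDs from 1..population_size.

    Every ID has the same chance of being selected.
    """
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    # sample() draws without replacement, so no ID can be picked twice
    return rng.sample(range(1, population_size + 1), sample_size)

# 1000 numbered employees, 100 selected
selected_ids = simple_random_sample(1000, 100, seed=42)
print(len(selected_ids))
```

Fixing the seed is useful when the selection has to be documented or repeated; leaving it as None gives a different sample on each run.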
2. Systematic sampling
Systematic sampling is similar to simple random sampling, but it is usually slightly easier to
conduct. Every member of the population is listed with a number, but instead of randomly
generating numbers, individuals are chosen at regular intervals.
Example: All employees of the company are listed in alphabetical order.
From the first 10 numbers, you randomly select a starting point: number 6. From number 6
onwards, every 10th person on the list is selected (6, 16, 26, 36, and so on), and you end up
with a sample of 100 people.
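A minimal sketch of this procedure in Python follows. The function name systematic_sample and its parameters are hypothetical; the start argument is included only so that the example's starting point of 6 can be reproduced.

```python
import random

def systematic_sample(population_size, interval, start=None, seed=None):
    """Select every interval-th position from a list numbered 1..population_size."""
    if start is None:
        # Random starting point within the first interval (both ends inclusive)
        start = random.Random(seed).randint(1, interval)
    return list(range(start, population_size + 1, interval))

# Reproduce the example: 1000 employees, every 10th person, starting at number 6
sample = systematic_sample(1000, 10, start=6)
print(sample[:4])
print(len(sample))
```

Only the starting point is random here; once it is chosen, the rest of the sample is fully determined by the interval.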
3. Stratified sampling
Stratified sampling involves dividing the population into subpopulations that may differ in
important ways. It allows you to draw more precise conclusions by ensuring that every subgroup
is properly represented in the sample.
To use this sampling method, you divide the population into subgroups (called strata) based on
the relevant characteristic (e.g., gender identity, age range, income bracket, job role).
Based on the overall proportions of the population, you calculate how many people should be
sampled from each subgroup. Then you use random or systematic sampling to select a sample
from each subgroup.
Example: The company has 800 female employees and 200 male
employees. You want to ensure that the sample reflects the gender balance of the company, so
you sort the population into two strata based on gender. Then you use random sampling on
each group, selecting 80 women and 20 men, which gives you a representative sample of 100
people.
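The proportional allocation in this example can be sketched as follows. The function name, the dictionary layout, and the employee numbering (1 to 800 for women, 801 to 1000 for men) are assumptions made purely for illustration.

```python
import random

def proportional_stratified_sample(strata, sample_size, seed=None):
    """strata maps a stratum name to the list of member IDs in that stratum."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Each stratum's share of the sample matches its share of the population.
        # When the proportions do not divide evenly, rounding can make the
        # overall total differ slightly from sample_size.
        n = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, n)  # simple random sample within the stratum
    return sample

# Hypothetical numbering: IDs 1-800 are female employees, 801-1000 are male
strata = {
    "female": list(range(1, 801)),
    "male": list(range(801, 1001)),
}
sample = proportional_stratified_sample(strata, 100, seed=1)
print({name: len(ids) for name, ids in sample.items()})
```

With 800 women and 200 men, the allocation works out to 80 and 20, matching the figures in the example.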
4. Cluster sampling
Cluster sampling also involves dividing the population into subgroups, but each subgroup
should have similar characteristics to the whole population. Instead of sampling individuals from
each subgroup, you randomly select entire subgroups.
If it is practically possible, you might include every individual from each sampled cluster. If
the clusters themselves are large, you can also sample individuals from within each cluster
using one of the techniques above. This is called multistage sampling.
This method is good for dealing with large and dispersed populations, but there is more risk of
error in the sample, as there could be substantial differences between clusters. It’s difficult to
guarantee that the sampled clusters are really representative of the whole population.
Example: The company has offices in 10 cities across the country (all with
roughly the same number of employees in similar roles). You don’t have the capacity to travel
to every office to collect your data, so you use random sampling to select 3 offices – these are
your clusters.
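A rough Python sketch of one-stage cluster sampling is shown below. The office names, the employee labels, and the figure of 100 employees per office are invented for illustration; the example only says that the offices are of roughly equal size.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """clusters maps a cluster name to the list of individuals in it."""
    rng = random.Random(seed)
    # Randomly choose whole clusters rather than individuals
    chosen = rng.sample(sorted(clusters), n_clusters)
    # One-stage cluster sampling: keep every individual in each chosen cluster
    return {name: clusters[name] for name in chosen}

# Hypothetical data: 10 offices with 100 employees each
offices = {
    f"office_{i}": [f"office_{i}_emp_{j}" for j in range(1, 101)]
    for i in range(1, 11)
}
sampled = cluster_sample(offices, 3, seed=7)
print(sorted(sampled))
```

For multistage sampling, a further random or systematic draw would be applied to the individuals inside each selected office instead of keeping all of them.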
Non-probability sampling:
1. Convenience sampling
A convenience sample simply includes the individuals who happen to be most accessible to the
researcher.
This is an easy and inexpensive way to gather initial data, but there is no way to tell if the
sample is representative of the population, so it can’t produce generalizable results.
Convenience samples are at risk for both sampling bias and selection bias.
Example: You are researching opinions about student support services
in your university, so after each of your classes, you ask your fellow students to complete a
survey on the topic. This is a convenient way to gather data, but as you only surveyed students
taking the same classes as you at the same level, the sample is not representative of all the
students at your university.
2. Snowball sampling
If the population is hard to access, snowball sampling can be used to recruit participants via
other participants. The number of people you have access to “snowballs” as you get in contact
with more people. As with convenience sampling, the downside is representativeness: because
participants recruit one another, you have no way of knowing how well your sample reflects
the population. This can lead to sampling bias.
Example: You are researching experiences of homelessness in your city.
Since there is no list of all homeless people in the city, probability sampling isn’t possible. You
meet one person who agrees to participate in the research, and she puts you in contact with
other homeless people that she knows in the area.
3. Quota sampling
Quota sampling relies on the non-random selection of a predetermined number or proportion
of units. This is called a quota.
You first divide the population into mutually exclusive subgroups (called strata) and then
recruit sample units until you reach your quota. These units share specific characteristics,
determined by you prior to forming your strata. The aim of quota sampling is to control what
or who makes up your sample.
Example: You want to gauge consumer interest in a new produce delivery
service in Boston, focused on dietary preferences. You divide the population into meat eaters,
vegetarians, and vegans. Since the company wants to cater to all consumers, you set a quota of
200 people for each dietary group, for a total sample of 600 people. In this way, all dietary
preferences are equally represented in your research, and you can easily compare these groups.
You continue recruiting until you reach the quota of 200 participants for each subgroup.
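The recruiting rule in this example (accept people as they arrive until each group's quota of 200 is full) can be sketched as below. The function name fill_quotas and the respondent stream are hypothetical; the stream is a made-up arrival order, not real data.

```python
def fill_quotas(respondents, quotas):
    """respondents yields (respondent_id, group) in the order people are recruited."""
    recruited = {group: [] for group in quotas}
    for respondent_id, group in respondents:
        # Accept a respondent only while their group's quota is still open
        if group in quotas and len(recruited[group]) < quotas[group]:
            recruited[group].append(respondent_id)
        # Stop recruiting once every quota has been reached
        if all(len(recruited[g]) >= quotas[g] for g in quotas):
            break
    return recruited

def hypothetical_stream():
    # Invented arrival pattern: meat eaters turn up more often than the other groups
    pattern = ["meat eater", "meat eater", "meat eater", "vegetarian", "vegan"]
    for i in range(2000):
        yield i + 1, pattern[i % 5]

quotas = {"meat eater": 200, "vegetarian": 200, "vegan": 200}
recruited = fill_quotas(hypothetical_stream(), quotas)
print({group: len(ids) for group, ids in recruited.items()})
```

Nothing in this sketch is random: who ends up in the sample depends entirely on who arrives first, which is what distinguishes quota sampling from stratified sampling.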