Unit V
Convenience sampling: This method depends on the ease of access to subjects, such as
surveying customers at a mall or passers-by on a busy street. It is termed convenience
sampling because it is carried out on the basis of how easy it is for the researcher to get
in touch with the subjects. Researchers have almost no control over the selection of
sample elements; selection is based purely on proximity, not representativeness. This
non-probability sampling method is used when there are time and cost limitations on
collecting feedback, and in situations with resource limitations, such as the initial
stages of research.
For example, startups and NGOs often use convenience sampling at a mall to distribute
leaflets about upcoming events or to promote a cause; they do this by standing at the
entrance of the mall and handing out pamphlets at random.
Judgmental or Purposive Sampling: In judgmental or purposive sampling, the sample is
formed at the discretion of the researcher, based purely on the purpose of the study and
an understanding of the target audience. Also known as deliberate sampling, participants
are selected solely on the basis of the research requirements, and elements that do not
serve the purpose are left out of the sample. For instance, if researchers want to
understand the thought process of people interested in studying for a master's degree,
the selection criterion would be: “Are you interested in studying for a Master's in …?”,
and those who respond with “No” would be excluded from the sample.
Snowball sampling: Snowball sampling is used in studies of subjects who are difficult to
trace. For example, it is extremely challenging to survey homeless people or illegal
immigrants. In such cases, using the snowball approach, researchers track down a few
members of that particular category, interview them, and ask them to refer others, and
results are derived on that basis. This sampling method is also implemented in situations
where the topic is highly sensitive and not openly discussed, such as surveys gathering
information about HIV/AIDS. Few affected individuals will readily respond to the
questions, but researchers can contact people they might know, or volunteers associated
with the cause, to get in touch with respondents and collect information.
Quota sampling: In quota sampling, members are selected on the basis of a pre-set
standard or quota. Because the sample is formed on the basis of specific attributes, the
created sample has the same proportions of those attributes as are found in the total
population. It is an extremely quick method of collecting samples.
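As a minimal sketch of how a quota might be filled in practice (the attribute, quota sizes, and respondent data below are made-up assumptions, not from the text):

```python
# Minimal sketch of quota sampling (illustrative; data and quotas are hypothetical).
from collections import defaultdict

def quota_sample(population, attribute, quotas):
    """Pick respondents in contact order until each attribute value meets its quota."""
    counts = defaultdict(int)
    sample = []
    for person in population:          # order of contact, e.g. arrival at a mall
        value = person[attribute]
        if counts[value] < quotas.get(value, 0):
            sample.append(person)
            counts[value] += 1
        if all(counts[v] >= q for v, q in quotas.items()):
            break                      # all quotas filled
    return sample

# Hypothetical respondents and a 3-male / 2-female quota.
people = [{"name": f"R{i}", "gender": g}
          for i, g in enumerate("MFMMFFMM")]
print(quota_sample(people, "gender", {"M": 3, "F": 2}))
```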
Use of the Non-Probability Sampling Method
There are multiple uses of the non-probability sampling method. They are:
Create a hypothesis: The non-probability sampling method is used to create a
hypothesis when little or no prior information is available. It provides an immediate
return of data and helps build a base for further research.
Exploratory research: This sampling technique is widely used when researchers aim to
conduct qualitative research, pilot studies or exploratory research.
Budget and time constraints: The non-probability method is used when there are budget
and time constraints and some preliminary data has to be collected. Since the survey
design is not rigid, it is easier to pick respondents at random and have them take
the survey or questionnaire.
Monte Carlo Methods
Monte Carlo methods refer to a family of statistical methods used to find numerical
solutions to problems such as computing the expected value of a function, or
integrating functions that can't be integrated analytically because they don't have a
closed-form solution.
The Monte Carlo method is a numerical method of solving mathematical problems by
random sampling (or by the simulation of random variables).
MC methods all share the concept of using randomly drawn samples to compute a
solution to a given problem. These problems generally come in two main categories:
simulation (generating samples from a random process) and integration (numerically
estimating integrals or expected values).
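As a minimal sketch of the integration case (the function, sample size, and use of NumPy are my own assumptions for illustration), the snippet below estimates the integral of x² over [0, 1] by averaging the function over uniform random draws:

```python
# Minimal Monte Carlo sketch: estimate the integral of x^2 over [0, 1]
# (whose exact value is 1/3) by averaging f over uniform random samples.
import numpy as np

rng = np.random.default_rng(seed=0)   # fixed seed for reproducibility
n_samples = 100_000
x = rng.uniform(0.0, 1.0, size=n_samples)
estimate = np.mean(x ** 2)            # sample mean approximates E[f(X)]

print(f"Monte Carlo estimate: {estimate:.4f}  (exact value: {1/3:.4f})")
```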
Model-Based Machine Learning (MBML)
The field of machine learning has seen the development of thousands of learning
algorithms. Typically, scientists choose from these algorithms to solve specific
problems, and their choices are often limited by their familiarity with these algorithms.
In this classical/traditional framework of machine learning, scientists are constrained to
making assumptions so as to use an existing algorithm. This is in contrast to the
model-based machine learning (MBML) approach, which seeks to create a bespoke
solution tailored to each new problem.
The goal of MBML is “to provide a single development framework which supports the
creation of a wide range of bespoke models”. This framework emerged from an
important convergence of three key ideas:
1. the adoption of a Bayesian viewpoint,
2. the use of factor graphs (a type of a probabilistic graphical model), and
3. the application of fast, deterministic, efficient and approximate inference
algorithms.
There are three steps to model-based machine learning, namely:
1. Describe the Model: Describe the process that generated the data using factor
graphs.
2. Condition on Observed Data: Condition the observed variables to their known
quantities.
3. Perform Inference: Perform backward reasoning to update the prior distribution
over the latent variables or parameters. In other words, calculate the posterior
probability distributions of latent variables conditioned on observed variables.
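As a small, assumed example of these three steps (the coin-flip model, the Beta prior, and the use of SciPy are illustrative choices, not part of the text), the sketch below describes a model for coin flips, conditions on observed data, and computes the posterior over the latent bias:

```python
# Minimal sketch of the three MBML steps on a toy coin-flip model.
# Step 1 (describe the model): heads ~ Bernoulli(theta), theta ~ Beta(a, b).
# Step 2 (condition on observed data): fix the observed flips below.
# Step 3 (perform inference): Beta is conjugate to Bernoulli, so the posterior
#   over theta is available in closed form.
from scipy import stats

a, b = 1.0, 1.0                          # uniform Beta(1, 1) prior over the bias
observed_flips = [1, 0, 1, 1, 1, 0, 1]   # hypothetical data: 1 = heads
heads = sum(observed_flips)
tails = len(observed_flips) - heads

posterior = stats.beta(a + heads, b + tails)   # Beta(a + #heads, b + #tails)
print(f"Posterior mean of theta: {posterior.mean():.3f}")
print(f"95% credible interval: {posterior.interval(0.95)}")
```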
The core idea at the heart of model-based machine learning is that all
the assumptions about the problem domain are made explicit in the form of a model. In
fact, a model is just made up of this set of assumptions, expressed in a precise
mathematical form. These assumptions include the number and types of variables in the
problem domain, which variables affect each other, and what the effect of changing one
variable is on another variable.
Value iteration and policy iteration
Value iteration and policy iteration are two classic algorithms used in the field of
reinforcement learning and Markov decision processes (MDPs) to solve for an optimal
policy.
Value Iteration:
Value iteration is an iterative algorithm that aims to find the optimal value function and
policy for a given MDP. It starts with an initial estimate of the value function and
updates it iteratively until convergence. The algorithm alternates between two steps:
a. Value evaluation: In this step, the value function is updated based on the Bellman
equation, which expresses the value of a state in terms of the expected immediate
reward and the expected value of the successor states. The update equation is given by:
V(s) <- max_a ∑ P(s'|s,a)[R(s,a,s') + γV(s')]
b. Policy improvement: After updating the value function, the policy is improved by
selecting the action that maximizes the expected return from each state. The policy
improvement step is given by:
π(s) <- argmax_a ∑ P(s'|s,a)[R(s,a,s') + γV(s')]
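A minimal sketch of value iteration on a hypothetical 2-state, 2-action MDP is given below; the transition probabilities, rewards, discount factor, and stopping tolerance are made-up assumptions used only to show the update loop:

```python
# Minimal value-iteration sketch on a hypothetical 2-state, 2-action MDP.
import numpy as np

n_states, n_actions, gamma = 2, 2, 0.9
# P[a, s, s'] = transition probability, R[a, s, s'] = reward (made-up numbers).
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[[1.0, 0.0], [0.0, 2.0]],
              [[0.5, 0.5], [1.5, 0.0]]])

V = np.zeros(n_states)
while True:
    # Bellman optimality backup: Q[a, s] = sum_s' P(s'|s,a)[R + gamma*V(s')]
    Q = np.einsum("ast,ast->as", P, R + gamma * V)
    V_new = Q.max(axis=0)          # value update: V(s) <- max_a Q(s, a)
    if np.max(np.abs(V_new - V)) < 1e-6:
        break
    V = V_new

policy = Q.argmax(axis=0)          # greedy policy w.r.t. the converged values
print("V*:", V_new, "policy:", policy)
```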
Policy Iteration:
Policy iteration is another iterative algorithm that alternates between evaluating the
current policy and improving it, repeating until the policy no longer changes. Its two
steps are:
a. Policy evaluation: In this step, the value function is computed for a given policy. It
involves solving a set of linear equations called the policy evaluation equations, which
express the value of a state in terms of the expected immediate reward and the value of
the successor states under the current policy. The update equation is given by:
V(s) <- ∑ P(s'|s,π(s))[R(s,π(s),s') + γV(s')]
b. Policy improvement: After evaluating the current policy, a new policy is derived by
selecting the action that maximizes the expected return from each state. The policy
improvement step is the same as in value iteration:
π(s) <- argmax_a ∑ P(s'|s,a)[R(s,a,s') + γV(s')]
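A matching sketch of policy iteration on the same kind of hypothetical MDP follows; again the numbers are assumptions, and the loop stops once the greedy policy stops changing:

```python
# Minimal policy-iteration sketch on a hypothetical 2-state, 2-action MDP.
import numpy as np

n_states, n_actions, gamma = 2, 2, 0.9
P = np.array([[[0.8, 0.2], [0.1, 0.9]],     # P[a, s, s'] (made-up numbers)
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[[1.0, 0.0], [0.0, 2.0]],     # R[a, s, s'] (made-up numbers)
              [[0.5, 0.5], [1.5, 0.0]]])

policy = np.zeros(n_states, dtype=int)      # start from an arbitrary policy
while True:
    # Policy evaluation: solve the linear system V = R_pi + gamma * P_pi V.
    P_pi = P[policy, np.arange(n_states)]               # shape (S, S')
    R_pi = np.einsum("st,st->s", P_pi, R[policy, np.arange(n_states)])
    V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)

    # Policy improvement: act greedily with respect to V.
    Q = np.einsum("ast,ast->as", P, R + gamma * V)
    new_policy = Q.argmax(axis=0)
    if np.array_equal(new_policy, policy):
        break                                            # policy is stable
    policy = new_policy

print("optimal policy:", policy, "V:", V)
```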
Temporal Difference (TD) Learning:
Temporal difference (TD) learning estimates value functions directly from experience,
updating the value of a state after every transition using the observed reward and the
current estimate of the next state's value. The TD(0) update rule is:
V(s) <- V(s) + α[r + γV(s') - V(s)]
where V(s) is the value estimate of state s, α is the learning rate, r is the immediate
reward received after transitioning from state s to state s', γ is the discount factor that
balances the importance of immediate and future rewards, and V(s') is the value
estimate of the next state.
TD learning algorithms are known for their ability to learn online and update estimates
efficiently. They are widely used in various domains, including game playing, robotics,
and prediction tasks in general.
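A minimal TD(0) sketch on a hypothetical five-state random walk follows; the environment, number of episodes, learning rate, and discount factor are illustrative assumptions:

```python
# Minimal TD(0) sketch: estimate state values of a 5-state random walk
# (hypothetical environment; terminating on the right yields reward +1).
import random

n_states, alpha, gamma = 5, 0.1, 1.0
V = [0.0] * n_states                       # value estimates for states 0..4

for _ in range(5000):                      # episodes
    s = n_states // 2                      # start in the middle state
    while True:
        s_next = s + random.choice([-1, 1])
        if s_next < 0:                     # terminate on the left, reward 0
            r, v_next, done = 0.0, 0.0, True
        elif s_next >= n_states:           # terminate on the right, reward +1
            r, v_next, done = 1.0, 0.0, True
        else:
            r, v_next, done = 0.0, V[s_next], False
        # TD(0) update: V(s) <- V(s) + alpha * [r + gamma*V(s') - V(s)]
        V[s] += alpha * (r + gamma * v_next - V[s])
        if done:
            break
        s = s_next

print("estimated values:", [round(v, 2) for v in V])
```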
Deterministic Rewards:
Deterministic rewards refer to the type of rewards that have a fixed and predictable
outcome based on a particular action or state. When an agent takes a specific action in a
given state, it will always receive the same reward. The deterministic nature of rewards
simplifies the learning process for the agent since it can easily associate actions with
specific outcomes. For example, in a game where the agent receives a reward of +10 for
reaching a certain goal state, this reward is deterministic because every time the agent
reaches that state, it will receive the same reward of +10.
Non-deterministic Rewards:
Non-deterministic rewards, on the other hand, have an element of randomness or
uncertainty associated with them. When an agent takes a particular action in a given
state, it may receive different rewards each time, even if the action is repeated in the
same state. Non-deterministic rewards can introduce additional complexity to the
learning process since the agent needs to estimate the expected value or distribution of
rewards associated with different actions. An example of non-deterministic rewards
could be a game where the agent receives a reward of +5 with a certain probability, but
there is also a chance of receiving a reward of -2 or +10.
Non-deterministic Actions:
Non-deterministic actions are actions where the outcome or effect of taking a particular
action in a given state is uncertain or probabilistic. The same action taken in the same
state can lead to different resulting states with certain probabilities. This introduces an
element of randomness in the decision-making process. For example, in a game where
the agent can move in four directions but there is a chance of slipping and moving in a
random direction, selecting the action "up" may result in moving up with a high
probability but also slipping to the left or right with a lower probability.
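The short sketch below illustrates both ideas with made-up numbers: a reward drawn at random from a few possible outcomes, and a "slippery" move that follows the chosen direction only with some probability:

```python
# Illustrative sketch of non-deterministic rewards and actions (made-up numbers).
import random

def sample_reward():
    """Non-deterministic reward: +5 usually, but sometimes -2 or +10."""
    return random.choices([5, -2, 10], weights=[0.7, 0.2, 0.1])[0]

def slippery_move(action):
    """Non-deterministic action: intended direction 80% of the time,
    otherwise slip to the neighbouring directions (10% each)."""
    directions = ["up", "right", "down", "left"]
    i = directions.index(action)
    return random.choices(
        [action, directions[(i - 1) % 4], directions[(i + 1) % 4]],
        weights=[0.8, 0.1, 0.1])[0]

print([sample_reward() for _ in range(5)])
print([slippery_move("up") for _ in range(5)])
```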
Occam learning
In computational learning theory, Occam learning is a model of algorithmic learning
where the objective of the learner is to output a concise representation of received
training data. This is closely related to probably approximately correct (PAC) learning,
where the learner is evaluated on its predictive power of a test set.
Occam learnability implies PAC learning, and for a wide variety of concept classes, the
converse is also true: PAC learnability implies Occam learnability.
Vapnik–Chervonenkis (VC) dimension
The Vapnik–Chervonenkis (VC) dimension is a measure of the capacity (complexity,
expressive power, richness, or
flexibility) of a space of functions that can be learned by a statistical
classification algorithm. It is defined as the cardinality of the largest set of points that
the algorithm can shatter.
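As a small illustration (my own example, not from the text), the brute-force check below verifies that threshold classifiers on the real line, h_t(x) = 1 if x >= t else 0, shatter any single point but cannot shatter two points, so their VC dimension is 1:

```python
# Brute-force shattering check for threshold classifiers on the real line:
# h_t(x) = 1 if x >= t else 0.  Their VC dimension is 1.
from itertools import product

def realizable(points, labels):
    """Can some threshold t reproduce the given labeling of the points?"""
    candidates = [min(points) - 1.0] + list(points) + [max(points) + 1.0]
    return any(
        all((1 if x >= t else 0) == y for x, y in zip(points, labels))
        for t in candidates)

def shattered(points):
    """A set is shattered if every labeling in {0,1}^n is realizable."""
    return all(realizable(points, labels)
               for labels in product([0, 1], repeat=len(points)))

print(shattered([0.0]))        # True: one point can be labeled either way
print(shattered([0.0, 1.0]))   # False: labeling (1, 0) would need 0 >= t > 1
```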