
ML UNIT 5 QB Answers

Discuss the Explanation-based Learning of Search Control Knowledge.

Inductive learning methods are methods that generalize from observed training examples.
• The key practical limitation of these inductive learners is that they perform poorly when insufficient data is available.
• One way to address this is to develop learning algorithms that accept explicit prior knowledge as an input, in addition to the training data.
• Explanation-based learning is one such approach.
• It uses prior knowledge to analyze, or explain, each training example in order to infer which example features are relevant to the target function and which are irrelevant.
• These explanations allow it to generalize more accurately than purely inductive learning.
• Explanation-based learning uses prior knowledge to reduce the complexity of the hypothesis space to be searched, thereby reducing sample complexity and improving the generalization accuracy of the learner.

Example 1: Consider the task of learning to play chess. Here we want the program to recognize a particular class of game positions, i.e. the target concept "chessboard positions in which black will lose its queen within two moves." Figure 1 shows a positive training example of this concept.

If we used a purely inductive learning method for this task, it would be difficult, because the chessboard is fairly complex (the 32 pieces may be arranged on the 64 squares in a vast number of ways) and the relevant patterns are defined by the relative positions of particular pieces rather than by the board as a whole. We would therefore need to provide thousands of training examples similar to Figure 1 before we could expect an inductively learned hypothesis to generalize correctly to new situations.

In contrast, a human considering only the single example in Figure 1 would suggest a general hypothesis for the target concept, such as "board positions in which the black king and queen are simultaneously attacked," and would not even consider the (equally consistent) hypothesis "board positions in which four white pawns are still in their original locations." A purely inductive learner cannot generalize successfully from that one example.

Why do we regard this training example as a positive instance of the target concept? People rely on explanations such as "because white's knight is attacking both the king and queen, black must move out of check, thereby allowing the knight to capture the queen." Such explanations provide the information needed to rationally generalize from the details of the training example to a correct general hypothesis.

What prior knowledge is needed to learn chess in this way? It is simply knowledge of the rules of the game: the legal moves of the knight and the other pieces, the fact that players must alternate moves, and the fact that to win one player must capture the opponent's king. In principle this knowledge suffices to compute the optimal move, but in practice the calculation is frustratingly complex; despite possessing this complete, perfect knowledge of chess, we humans remain unable to play the game optimally.

1. Give an illustrative example of KBANN.


Here each instance describes a physical object in terms of the material from which it is made, whether it is light, and so on. The task is to learn the target concept Cup defined over such physical objects. The domain theory defines a Cup as an object that is Stable, Liftable, and an OpenVessel. The domain theory also defines each of these three attributes in terms of more primitive attributes, terminating in the attributes that describe the instances.

Table 1 describes a set of training examples and a domain theory for the Cup target concept.

Note that the domain theory is not perfectly consistent with the training examples, so neither source of information is sufficient on its own. KBANN uses the domain theory and the training examples together to learn the target concept more accurately than it could from either alone.
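Before walking through the network construction, it helps to have the domain theory in a machine-readable form. The following is a minimal Python encoding; only the top-level Cup clause is taken from the text above, while the remaining clauses and attribute names are illustrative assumptions standing in for Table 1.

```python
# Illustrative stand-in for the Cup domain theory of Table 1.
# Only the top-level clause (Cup <- Stable, Liftable, OpenVessel) is stated
# in the text above; the remaining clauses and attribute names are assumed
# here so that the later KBANN sketch has something concrete to work with.
CUP_DOMAIN_THEORY = [
    # (consequent, [antecedents]); a leading "~" would mark a negated antecedent
    ("Cup",        ["Stable", "Liftable", "OpenVessel"]),
    ("Stable",     ["BottomIsFlat"]),
    ("Liftable",   ["Graspable", "Light"]),
    ("Graspable",  ["HasHandle"]),
    ("OpenVessel", ["HasConcavity", "ConcavityPointsUp"]),
]
```

With the domain theory in hand, KBANN constructs its initial network as follows: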

1. In the first stage, an initial network is constructed that is consistent with the domain theory.


2. KBANN follows the convention that a sigmoid output value greater than 0.5 is interpreted as True and a value
below 0.5 as False.

3. Each unit is therefore constructed so that its output will be greater than 0.5 just in those cases where the
corresponding Horn clause applies.

4. A weight of W (a positive constant) is assigned to each input corresponding to a non-negated antecedent of the
clause. For each input corresponding to a negated antecedent, the weight is set to -W (a code sketch of this
weighting scheme appears at the end of this example).

5. The threshold weight w0 of the unit is then set to -(n - 0.5)W, where n is the number of non-negated
antecedents. When the input values are 1 or 0, the weighted sum plus w0 is then positive exactly when all of the
clause's antecedents are satisfied.

6. Each sigmoid unit input is connected to the appropriate network input or to the output of another sigmoid
unit, so as to mirror the graph of dependencies among the corresponding attributes in the domain theory. As a
final step, many additional inputs are added to each threshold unit, with their weights set approximately to zero.
The solid lines in the network of Figure 2 indicate unit inputs with weights of W, whereas the lightly shaded
lines indicate connections with initial weights near zero.

7. The second stage of KBANN uses the training examples and the BACKPROPAGATION algorithm to refine the
initial network weights wherever the network's classifications are inconsistent with the training examples. If the
network is already consistent, no backpropagation is needed.

8. In our example the initial network is not consistent with the training examples, so backpropagation is applied.


The refined network is shown in Figure 3, with dark solid lines indicating the largest positive weights, dashed
lines the largest negative weights, and light lines indicating negligible weights.
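The weight-initialization scheme of steps 4 and 5 can be sketched in a few lines of Python. This is a minimal sketch only: it builds on the illustrative CUP_DOMAIN_THEORY above, assumes W = 4.0, omits the extra near-zero connections of step 6 and the backpropagation stage, and handles just one clause per consequent.

```python
from typing import Dict, List, Tuple

W = 4.0  # assumed positive constant used for antecedent weights

def init_unit(antecedents: List[str]) -> Tuple[Dict[str, float], float]:
    """Build one sigmoid unit for a Horn clause: weight W per non-negated
    antecedent, -W per negated antecedent (marked with '~'), and threshold
    w0 = -(n - 0.5) * W, where n is the number of non-negated antecedents."""
    weights: Dict[str, float] = {}
    n_pos = 0
    for a in antecedents:
        if a.startswith("~"):          # negated antecedent
            weights[a[1:]] = -W
        else:                          # non-negated antecedent
            weights[a] = W
            n_pos += 1
    w0 = -(n_pos - 0.5) * W
    return weights, w0

def init_network(theory) -> Dict[str, Tuple[Dict[str, float], float]]:
    """Create one unit per Horn clause, keyed by the clause consequent."""
    return {head: init_unit(body) for head, body in theory}

# The Cup unit fires (weighted sum + w0 > 0) only when all three antecedents are 1.
units = init_network(CUP_DOMAIN_THEORY)   # from the earlier sketch
weights, w0 = units["Cup"]
inputs = {"Stable": 1, "Liftable": 1, "OpenVessel": 1}
print(sum(weights[a] * inputs[a] for a in weights) + w0)   # 2.0 > 0 -> True
```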
2. Describe remarks on explanation-based learning.

REMARKS ON EXPLANATION-BASED LEARNING

• Unlike inductive methods, PROLOG-EBG produces justified general hypotheses by using prior
knowledge to analyze individual examples.

• The explanation of how the example satisfies the target concept determines which example
attributes are relevant: those mentioned by the explanation.

• The further analysis of the explanation, regressing the target concept to determine its weakest
preimage with respect to the explanation, allows deriving more general constraints on the values of
the relevant features.

• The generality of the learned Horn clauses will depend on the formulation of the domain theory and
on the sequence in which training examples are considered.

• PROLOG-EBG implicitly assumes that the domain theory is correct and complete. If the domain
theory is incorrect or incomplete, the resulting learned concept may also be incorrect.

There are several related perspectives on explanation-based learning that help to understand its capabilities
and limitations.

> EBL as theory-guided generalization of examples. EBL uses its given domain theory to generalize rationally
from examples, distinguishing the relevant example attributes from the irrelevant, thereby allowing it to
avoid the bounds on sample complexity that apply to purely inductive learning.

> EBL as example-guided reformulation of theories. The PROLOG-EBG algorithm can be viewed as a method
for reformulating the domain theory into a more operational form, by creating rules that (a) follow deductively
from the domain theory, and (b) classify the observed training examples in a single inference step. Thus, the
learned rules can be seen as a reformulation of the domain theory into a form that classifies instances of the
target concept in a single inference step.

> EBL as "just" restating what the learner already "knows." In one sense, the learner in our SafeToStack
example begins with full knowledge of the SafeToStack concept. If its initial domain theory is sufficient to
explain any observed training examples, then it is also sufficient to predict their classification in advance.

3. What are the inductive-analytical approaches to learning? Discuss.

INDUCTIVE-ANALYTICAL APPROACHES TO LEARNING

The Learning Problem


Given:
A set of training examples D, possibly containing errors
A domain theory B, possibly containing errors
A space of candidate hypotheses H

Determine:
A hypothesis that best fits the training examples and domain theory
Which hypothesis to consider?
→ One which fits training data well
→ One which fits domain theory well
errorD(h) is defined to be the proportion of examples from D that are misclassified by h. Let us define the
error errorB(h) of h with respect to a domain theory B to be the probability that h will disagree with B on the
classification of a randomly drawn instance. We can attempt to characterize the desired output hypothesis in
terms of these errors.

We require a hypothesis that minimizes some combined measure of these errors, such as

    argmin over h in H of [ kD * errorD(h) + kB * errorB(h) ]

Although this criterion is appealing at first glance, it is not clear what values to assign to kD and kB to specify
the relative importance of fitting the data versus fitting the theory.

If we have a poor theory and a great deal of data, it is best to weight errorD(h) more heavily; if we have a strong
theory and a small, noisy data set, it is best to weight errorB(h) more heavily. The difficulty is that the learner
typically does not know in advance how reliable the training data and the domain theory are, so it cannot set
these weights with confidence.
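As a concrete illustration of this weighted criterion, the following sketch scores a handful of candidate hypotheses by kD * errorD(h) + kB * errorB(h) and picks the minimizer. The toy instance space, hypotheses, domain theory, and weight values are assumptions made purely for illustration.

```python
import random

# Toy instance space: objects described by two boolean features.
DATA = [({"light": 1, "has_handle": 1}, 1),     # (x, label) pairs, possibly noisy
        ({"light": 0, "has_handle": 1}, 0),
        ({"light": 1, "has_handle": 0}, 0)]

def domain_theory(x):                            # B: assumed prior theory
    return x["light"] and x["has_handle"]

HYPOTHESES = {                                   # H: candidate hypotheses
    "h1_light_only":  lambda x: x["light"],
    "h2_handle_only": lambda x: x["has_handle"],
    "h3_both":        lambda x: x["light"] and x["has_handle"],
}

def error_D(h):
    """Proportion of training examples misclassified by h."""
    return sum(h(x) != y for x, y in DATA) / len(DATA)

def error_B(h, n_samples=1000):
    """Estimated probability that h disagrees with the domain theory B
    on a randomly drawn instance."""
    draws = [{"light": random.randint(0, 1), "has_handle": random.randint(0, 1)}
             for _ in range(n_samples)]
    return sum(h(x) != domain_theory(x) for x in draws) / n_samples

k_D, k_B = 1.0, 0.5                              # assumed relative weights
best = min(HYPOTHESES,
           key=lambda name: k_D * error_D(HYPOTHESES[name])
                            + k_B * error_B(HYPOTHESES[name]))
print(best)
```

In practice, of course, the learner cannot simply be handed good values for kD and kB, which is what motivates the Bayesian view below.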

One principled way to weight these sources of information is Bayes theorem. Bayes theorem describes how to
compute the posterior probability P(h|D) of hypothesis h given the observed training data D: it combines the
observed data D with prior knowledge in the form of P(h), P(D), and P(D|h). We can therefore think of P(h),
P(D), and P(D|h) as a form of background knowledge or domain theory, and we should choose the hypothesis
whose posterior probability is highest. However, if P(h), P(D), and P(D|h) are not perfectly known, Bayes
theorem alone does not prescribe how to combine them with the observed data; in that case we must assume
prior probability distributions for P(h), P(D), and P(D|h).
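For reference, the rule being applied here, and the resulting maximum a posteriori (MAP) choice of hypothesis, can be written as follows (a standard formulation, not spelled out in the original answer):

```latex
P(h \mid D) = \frac{P(D \mid h)\,P(h)}{P(D)}
\qquad\qquad
h_{MAP} = \operatorname*{arg\,max}_{h \in H} \, P(D \mid h)\,P(h)
```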

Hypothesis space search:

We can characterize most learning methods as search algorithms by describing the hypothesis space H they
search, the initial hypothesis h0 at which they begin their search, the set of search operators O that define
individual search steps, and the goal criterion G that specifies the search objective.
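Viewed this way, a learner is just a search procedure parameterized by these components. The sketch below is a generic illustration of that view; the dataclass, the greedy hill-climbing loop, and all the names in it are assumptions for illustration, not part of any specific algorithm discussed in this unit.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, TypeVar

H = TypeVar("H")  # a hypothesis

@dataclass
class SearchSpec:
    """A learning method viewed as search: an initial hypothesis h0,
    search operators O, a goal criterion G, and a score used to compare
    hypotheses (e.g. the combined error measure above)."""
    h0: H
    operators: Callable[[H], Iterable[H]]   # O: candidate next hypotheses
    goal: Callable[[H], bool]               # G: does h meet the objective?
    score: Callable[[H], float]

def greedy_search(spec: SearchSpec, max_steps: int = 100) -> H:
    """Generic greedy hill-climbing over the hypothesis space."""
    h = spec.h0
    for _ in range(max_steps):
        if spec.goal(h):
            break
        candidates = list(spec.operators(h))
        if not candidates:
            break
        best = min(candidates, key=spec.score)
        if spec.score(best) >= spec.score(h):
            break                            # no improving step: local optimum
        h = best
    return h
```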

Three different ways in which prior knowledge can be used to alter this search are:

• Use prior knowledge to derive an initial hypothesis from which to begin the search. In this approach the
domain theory B is used to construct an initial hypothesis h0 that is consistent with B. A standard inductive
method is then applied, starting with this initial hypothesis h0.
• Use prior knowledge to alter the objective of the hypothesis space search. In this approach, the goal
criterion G is modified to require that the output hypothesis fits the domain theory as well as the training
examples.
• Use prior knowledge to alter the available search steps. In this approach, the set of search operators O is
altered by the domain theory.

4. Explain the EBNN algorithm with an example.


EBNN Algorithm
The EBNN (Explanation-Based Neural Network learning) algorithm (Mitchell and Thrun 1993a; Thrun 1996)
builds on the TANGENTPROP algorithm in two significant ways.
• First, instead of relying on the user to provide training derivatives, EBNN computes training
derivatives itself for each observed training example. These training derivatives are calculated by
explaining each training example in terms of a given domain theory, then extracting training
derivatives from this explanation.
• Second, EBNN addresses the issue of how to weight the relative importance of the inductive and
analytical components of learning: the weight μ is chosen independently for each training example,
based on how accurately the domain theory explains that example.

The inputs to EBNN include (1) a set of training examples of the form (xi, f(xi)) with no training derivatives
provided, and (2) a domain theory analogous to that used in explanation-based learning and in KBANN, but
represented by a set of previously trained neural networks rather than a set of Horn clauses. The output of
EBNN is a new neural network that approximates the target function f.

To illustrate the type of domain theory used by EBNN, consider the figure. The top portion of the figure depicts
an EBNN domain theory for the target function Cup, with each rectangular block representing a distinct
neural network in the domain theory. Notice that in this example there is one network for each of the Horn
clauses in the symbolic domain theory of Table 1. For example, the network labelled Graspable takes as input the
description of an instance and produces as output a value indicating whether the object is graspable (EBNN
typically represents true propositions by the value 0.8 and false propositions by the value 0.2). This network is
analogous to the Horn clause for Graspable given in Table 1. Some networks take the outputs of other
networks as their inputs (e.g., the rightmost network, labelled Cup, takes its inputs from the outputs of the
Stable, Liftable, and OpenVessel networks). Thus, the networks that make up the domain theory can be
chained together to infer the target function value for the input instance, just as Horn clauses might be
chained together for this purpose. In general, these domain theory networks may be provided to the learner
by some external source, or they may be the result of previous learning by the same system. EBNN makes
use of these domain theory networks to learn the new target function; it does not alter the domain theory
networks during this process.

The goal of EBNN is to learn a new neural network that describes the target function. We will refer to this new
network as the target network. In the example of the figure, the target network, shown at the bottom of
the figure, takes as input the description of an arbitrary instance and outputs a value indicating whether the
object is a Cup. The EBNN algorithm thus uses a domain theory expressed as a set of previously learned neural
networks, together with a set of training examples, to train its output hypothesis.
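To make the "fit the training values and also fit the domain-theory derivatives" idea concrete, the sketch below assembles an EBNN-style per-example loss, using finite differences to extract derivatives from a domain-theory predictor. The stand-in networks, the finite-difference extraction, and the rule for the weight mu are simplifications assumed for illustration; they are not the exact published EBNN formulation.

```python
import numpy as np

def domain_theory_predict(x: np.ndarray) -> float:
    """Stand-in for the chained domain-theory networks A(x): here a fixed,
    previously 'trained' smooth predictor (assumed for illustration)."""
    w = np.array([0.9, 0.1, 0.8])
    return float(1.0 / (1.0 + np.exp(-(x @ w - 1.0))))

def theory_gradient(x: np.ndarray, eps: float = 1e-4) -> np.ndarray:
    """Extract training derivatives dA/dx_j by central finite differences,
    playing the role of EBNN's explanation-based derivative extraction."""
    grad = np.zeros_like(x)
    for j in range(len(x)):
        xp, xm = x.copy(), x.copy()
        xp[j] += eps
        xm[j] -= eps
        grad[j] = (domain_theory_predict(xp) - domain_theory_predict(xm)) / (2 * eps)
    return grad

def ebnn_example_loss(target_net, target_grad, x, y, c=1.0):
    """Value-fit term plus mu-weighted derivative-fit term. mu shrinks toward 0
    when the theory's prediction A(x) is far from the observed label y, so an
    inaccurate theory is trusted less on that example."""
    a_x = domain_theory_predict(x)
    mu = max(0.0, 1.0 - abs(a_x - y) / c)       # per-example analytical weight
    value_err = (target_net(x) - y) ** 2
    slope_err = float(np.sum((theory_gradient(x) - target_grad(x)) ** 2))
    return value_err + mu * slope_err

# Example use with a trivially simple target network (also assumed):
f_hat = lambda x: float(x.mean())
f_hat_grad = lambda x: np.full_like(x, 1.0 / len(x))
print(ebnn_example_loss(f_hat, f_hat_grad, np.array([1.0, 0.0, 1.0]), 1.0))
```

A real implementation would sum this loss over all training examples and minimize it by gradient descent over the target network's weights, which is where the TANGENTPROP machinery comes in.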

5. Explain learning with perfect domain theories: PROLOG-EBG.

LEARNING WITH PERFECT DOMAIN THEORIES: PROLOG-EBG

• We consider explanation-based learning from domain theories that are perfect, that is, domain
theories that are correct and complete. A domain theory is said to be correct if each of its
assertions is a truthful statement about the world.
• A domain theory is said to be complete with respect to a given target concept and instance space if
the domain theory covers every positive example in the instance space.
• Note that this definition of completeness does not require that the domain theory be able to prove that
negative examples do not satisfy the target concept.
• However, if we adopt the common convention that an assertion is assumed false unless it can be proved,
then this definition of completeness effectively includes full coverage of both positive and negative
examples by the domain theory; this is the setting in which PROLOG-EBG operates.

PROLOG-EBG Algorithm:

PROLOG-EBG is a sequential covering algorithm that considers the training data incrementally.

For each new positive training example that is not yet covered by a learned Horn clause, it forms a new Horn
clause by:
(1) explaining the new positive training example,
(2) analyzing this explanation to determine an appropriate generalization, and
(3) refining the current hypothesis by adding a new Horn clause rule that covers this positive example, as well as
other similar instances.
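The control structure just described can be summarized in a short sketch. The three helper callables (explain, analyze_explanation, and covers) are placeholders for the explanation, regression/weakest-preimage, and clause-matching machinery that a real PROLOG-EBG implementation would provide; only the sequential covering loop itself is taken from the description above.

```python
def prolog_ebg(training_examples, domain_theory,
               explain, analyze_explanation, covers):
    """Sequential covering skeleton of PROLOG-EBG.

    explain(example, domain_theory)   -> an explanation (proof) that the example
                                         satisfies the target concept
    analyze_explanation(explanation)  -> a generalized Horn clause (weakest preimage)
    covers(rules, example)            -> True if some learned rule already covers it
    All three are assumed callables supplied by the caller.
    """
    learned_rules = []
    positives = [x for x, label in training_examples if label]
    for x in positives:
        if covers(learned_rules, x):
            continue                                 # already covered, skip
        explanation = explain(x, domain_theory)      # (1) explain the example
        new_rule = analyze_explanation(explanation)  # (2) generalize the explanation
        learned_rules.append(new_rule)               # (3) refine the hypothesis
    return learned_rules
```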

6. Explain the combined inductive-analytical approaches to learning.

Motivation:
• There are two paradigms for machine learning: inductive learning and analytical learning.
• Purely analytical learning methods offer the advantage of generalizing more accurately from less data by
using prior knowledge to guide learning. However, they can be misled when given incorrect or insufficient
prior knowledge.
Example: analytical methods such as PROLOG-EBG seek general hypotheses that fit the prior knowledge while
covering the observed data.
• Purely inductive methods offer the advantage that they require no explicit prior knowledge and learn
regularities based solely on the training data. However, they can fail when given insufficient training data,
and can be misled by the implicit inductive bias they must adopt in order to generalize beyond the observed
data.
Example: inductive methods such as decision tree induction and neural network BACKPROPAGATION seek
general hypotheses that fit the observed training data.
• Combining the two offers the possibility of more powerful learning methods.

The two approaches work well for different types of problems. By combining them we can hope to devise a
more general learning approach that covers a broader range of learning tasks. Figure 1 shows a spectrum of
learning problems that varies by the availability of prior knowledge and training data. At the left extreme, a
large volume of training data is available but no prior knowledge, so purely inductive learning methods with
high sample complexity are necessary. At the right extreme, a perfect domain theory is available but little
training data, enabling the use of purely analytical methods such as PROLOG-EBG. Most practical learning
problems lie somewhere between these two extremes of the spectrum.

Some specific properties we would like from such a combined learning method include:

• Given no domain theory, it should learn at least as effectively as purely inductive methods.
• Given a perfect domain theory, it should learn at least as effectively as purely analytical methods.
• Given an imperfect domain theory and imperfect training data, it should combine the two to outperform
either purely inductive or purely analytical methods.
• It should accommodate an unknown level of error in the training data.
• It should accommodate an unknown level of error in the domain theory.

7. How to make use of prior knowledge to alter the search objective? Explain.
USING PRIOR KNOWLEDGE TO ALTER THE SEARCH OBJECTIVE
The KBANN approach described earlier begins the gradient descent search with a hypothesis that perfectly fits
the domain theory, then perturbs this hypothesis as needed to maximize the fit to the training data.
An alternative way of using prior knowledge is to incorporate it directly into the error criterion minimized by
gradient descent, so that the network must fit a combined function of the training data and the domain theory.

The EBNN algorithm, described in detail in the answer to Question 4 above, takes exactly this approach: it uses
a domain theory expressed as a set of previously trained neural networks to extract training derivatives for each
example, and then minimizes an error criterion that combines the usual inductive term (fit to the observed
training values) with an analytical term (fit to the domain-theory derivatives), weighting the analytical term
separately for each training example according to how well the domain theory explains that example.

8. Explain the inductive-analytical approaches to learning.


INDUCTIVE-ANALYTICAL APPROACHES TO LEARNING

The answer is the same as for Question 3 above: see the statement of the learning problem (training examples D,
a domain theory B, and a hypothesis space H), the combined error measure weighting errorD(h) and errorB(h),
the Bayesian interpretation of how to weight data against theory, and the three ways of using prior knowledge
during the hypothesis space search (to initialize the hypothesis, to alter the search objective, and to alter the
available search steps).

9. Explain how to initialize the hypothesis by using prior knowledge.
USING PRIOR KNOWLEDGE TO INITIALIZE THE HYPOTHESIS

One approach to using prior knowledge is to initialize the hypothesis to perfectly fit the domain theory, then
inductively refine this initial hypothesis as needed to fit the training data. This approach is used by the
KBANN (Knowledge-Based Artificial Neural Network) algorithm to learn artificial neural networks.

In KBANN, an initial network is first constructed so that, for every possible instance, the classification assigned
by the network is identical to that assigned by the domain theory. The BACKPROPAGATION algorithm is then
employed to adjust the weights of this initial network as needed to fit the training examples.

If the domain theory is correct, the initial hypothesis will already classify all of the training examples correctly.
If the initial hypothesis is found to classify the training examples imperfectly, it is refined inductively (using
BACKPROPAGATION) to improve its fit to the training examples.

The intuition behind KBANN is that even if the domain theory is only approximately correct, initializing the
network to fit this domain theory will give a better starting approximation to the target function than
initializing the network to random initial weights.

The KBANN Algorithm

KBANN exemplifies the initialize-the-hypothesis approach to using domain theories. It assumes a domain theory
represented by a set of propositional, non-recursive Horn clauses.

The two stages of the KBANN algorithm are first to create an artificial neural network that perfectly fits the
domain theory and second to use the BACKPROPAGATION algorithm to refine this initial network to fit the
training examples.

REMARKS:
• The chief benefit of KBANN over purely inductive BACKPROPAGATION is that it typically generalizes more
accurately than BACKPROPAGATION when given an approximately correct domain theory, especially when
training data is scarce.
• Limitations of KBANN include the fact that it can accommodate only propositional domain theories, that is,
collections of variable-free Horn clauses. It is also possible for KBANN to be misled when given highly
inaccurate domain theories, in which case its generalization accuracy can deteriorate below that of
BACKPROPAGATION.
