Chapter 6
Learning
Contents
Classical conditioning: nature, basic terms & processes, and principles; factors in classical conditioning; applications of classical conditioning. Operant conditioning: nature, basic terms & processes; principles of operant conditioning; schedules of reinforcement.
Definition
Learning is often defined as a relatively lasting change in behavior that is the result of experience. When
you think of learning, it might be easy to fall into the trap of only considering formal education that takes
place during childhood and early adulthood, but learning is actually an ongoing process that takes place
throughout all of life.
Have you ever wondered how our behaviors are learned? Meet Ivan Pavlov and B.F. Skinner, two
behavioral psychologists who pioneered the theories of classical and operant conditioning, respectively.
Let's examine how the theories they studied help us understand the way we learn.
Association
Learning by association is one of the most common types of learning. Learning by association happens in
humans and animals. Let's use a young girl named Sally as an example to explain learning by association.
Ever since she was old enough, Sally's father has taken her on enjoyable outdoor activities in the
winter. Before starting the fun, Sally's father always puts strawberry lip balm on his daughter to protect
her from the cold. Many years later, Sally buys a strawberry lip balm for her trip to the beach. But when
she pulls the cover off, all she can think of is snow. What happened? Sally has learned by association.
Sally's consistent experiences with strawberry lip balm and winter fun with her father have made her
associate that fruity scent with something it isn't commonly associated with: cold, snowy winters.
Sally might also think of her father, snowshoes, and several other things that were paired together, or
associated, when she was younger. In the early 1900s, psychologists were starting to make their own
associations about learning. They found that they could elicit responses in animals through methods of
association called conditioning. Since then, we have found that conditioning works for humans too! So how
does conditioning work?
Classical Conditioning
First, let's visit Mr. Pavlov. He studied what is called classical conditioning. You'll sometimes also hear
this referred to as respondent conditioning. In classical conditioning, learning refers to involuntary
responses that result from experiences that occur before a response. Classical conditioning occurs when
you learn to associate two different stimuli. No voluntary behavior is involved. The first
stimulus that you will encounter is called the unconditioned stimulus. An
unconditioned stimulus produces a response without any previous learning. This
response is called an unconditioned response. Ivan Pavlov (1849-1936) was a
Russian scientist interested in studying how digestion works in mammals. He
observed and recorded information about dogs and their digestive process. As
part of his work, he began to study what triggers dogs to salivate. It should have
been an easy study: mammals produce saliva to help them break down food, so
the dogs should have simply begun drooling when presented with food.
But what Pavlov discovered when he observed the dogs was that drooling had a much more far-reaching
effect than he ever thought: it paved the way for a new theory about behavior and a new way to study
humans.
Behaviorists have described several different phenomena associated with classical conditioning. Some of
these elements involve the initial establishment of the response while others describe the disappearance of
a response. These elements are important in understanding the classical conditioning process. Let's take a
closer look at five key principles of classical conditioning.
Acquisition
Acquisition is the initial stage of learning when a response is first established and gradually strengthened.
During the acquisition phase of classical conditioning, a neutral stimulus is repeatedly paired with an
unconditioned stimulus.
As you may recall, an unconditioned stimulus is something that naturally and automatically triggers a
response without any learning. After an association is made, the subject will begin to emit a behavior
in response to the previously neutral stimulus, which is now known as a conditioned stimulus. It is at
this point that we can say that the response has been acquired. For example, imagine that you are
conditioning a dog to salivate in response to the sound of a bell. You repeatedly pair the presentation of
food with the sound of the bell. You can say the response has been acquired as soon as the dog begins
to salivate in response to the bell tone. Once the response has been established, you can gradually
reinforce the salivation response to make sure the behavior is well learned.
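The "gradually strengthened" part of acquisition can be pictured with a short simulation. The sketch below (Python) assumes a simple linear learning rule of the kind often used to model acquisition curves; the lecture presents no formal model, so the learning rate and ceiling values here are purely illustrative.

# A minimal sketch of acquisition, assuming a simple linear learning rule
# (the lecture itself presents no formal model). Each trial pairs the bell
# (the neutral, soon-to-be conditioned stimulus) with food (the unconditioned
# stimulus), and associative strength climbs toward a ceiling.

LEARNING_RATE = 0.3   # hypothetical: how quickly the association forms
CEILING = 1.0         # maximum associative strength the pairing supports

def run_acquisition(trials: int) -> list[float]:
    """Return associative strength after each bell-plus-food pairing."""
    strength = 0.0
    history = []
    for _ in range(trials):
        # Each pairing moves strength a fraction of the way to the ceiling.
        strength += LEARNING_RATE * (CEILING - strength)
        history.append(round(strength, 3))
    return history

if __name__ == "__main__":
    # Strength rises quickly at first and then levels off: the negatively
    # accelerated curve that "established and gradually strengthened" describes.
    print(run_acquisition(10))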
Extinction
Extinction occurs when a conditioned response gradually weakens and disappears because the conditioned stimulus is repeatedly presented without the unconditioned stimulus. If a bell that once signaled food is sounded again and again with no food following, the dog's salivation to the bell will fade away.
Spontaneous Recovery
Spontaneous recovery is the reappearance of an extinguished conditioned response after a rest period. A dog whose conditioned salivation has been extinguished may briefly salivate to the bell again when it is presented after a pause, though the recovered response is usually weaker than the original.
Stimulus Generalization
Stimulus generalization is the tendency for stimuli similar to the conditioned stimulus to evoke the
conditioned response after the response has been conditioned. For example, if a dog has been conditioned to salivate at the sound of a
bell, the animal may also exhibit the same response to stimuli that are similar to the conditioned stimulus.
In John B. Watson's famous Little Albert Experiment, for example, a small child was conditioned to fear a
white rat. The child demonstrated stimulus generalization by also exhibiting fear in response to other
fuzzy white objects including stuffed toys and Watson's own hair.
Stimulus Discrimination
Discrimination is the ability to differentiate between a conditioned stimulus and other stimuli that have
not been paired with an unconditioned stimulus. For example, if a bell tone were the conditioned stimulus,
discrimination would involve being able to tell the difference between the bell tone and other similar
sounds. Because the subject is able to distinguish between these stimuli, they will only respond when the
conditioned stimulus is presented.
It can be helpful to look at a few examples of how the classical conditioning process operates both in
experimental and real-world settings.
Fear Response
John B. Watson's experiment with Little Albert is a perfect example of the fear response. The child
initially showed no fear of a white rat, but after the rat was paired repeatedly with loud, scary sounds, the
child would cry when the rat was present. The child's fear also generalized to other fuzzy white objects.
Prior to the conditioning, the white rat was a neutral stimulus. The unconditioned stimulus was the loud,
clanging sounds, and the unconditioned response was the fear response created by the noise. By
repeatedly pairing the rat with the unconditioned stimulus, the white rat (now the conditioned stimulus)
came to evoke the fear response (now the conditioned response). This experiment illustrates how phobias
can form through classical conditioning. In many cases, a single pairing of a neutral stimulus (a dog, for
example) and a frightening experience (being bitten by the dog) can lead to a lasting phobia (being afraid
of dogs).
Taste Aversions
Another example of classical conditioning can be seen in the development of conditioned taste
aversions. Researchers John Garcia and Robert Koelling first noticed this phenomenon when they
observed how rats that had been exposed to nausea-causing radiation developed an aversion to
flavored water after the radiation and the water were presented together. In this example, the
radiation represents the unconditioned stimulus, and the nausea represents the unconditioned
response. After the pairing of the two, the flavored water is the conditioned stimulus, while the
nausea that formed when exposed to the water alone is the conditioned response.

Later research demonstrated that such classically conditioned aversions could be produced through a single
pairing of the conditioned stimulus and the unconditioned stimulus. Researchers also found that
such aversions can even develop if the conditioned stimulus (the taste of the food) is presented
several hours before the unconditioned stimulus (the nausea-causing stimulus).

Why do such associations develop so quickly? Forming such associations can have survival benefits
for the organism. If an animal eats something that makes it ill, it needs to avoid eating the same
food in the future to avoid sickness or even death. This is a great example of what is known as
biological preparedness: some associations form more readily because they aid in survival.

In one famous field study, researchers injected sheep carcasses with a poison that would make coyotes
sick but not kill them. The goal was to help sheep ranchers reduce the number of sheep lost to
coyote killings. Not only did the experiment work by lowering the number of sheep killed, it also
caused some of the coyotes to develop such a strong aversion to sheep that they would actually
run away at the scent or sight of a sheep.

Applications of Classical Conditioning
In reality, people do not respond exactly like Pavlov's dogs. There are, however, numerous real-world
applications for classical conditioning. For example, many dog trainers use classical conditioning
techniques to help people train their pets. These techniques are also useful for helping people cope
with phobias or anxiety problems. Therapists might, for example, repeatedly pair something that
provokes anxiety with relaxation techniques in order to create an association. Teachers can apply
classical conditioning in the classroom by creating a positive environment to help students overcome
anxiety or fear. Pairing an anxiety-provoking situation, such as performing in front of a group, with
pleasant surroundings helps the student learn new associations. Instead of feeling anxious and tense
in these situations, the child will learn to stay relaxed and calm.
Operant Conditioning
Operant conditioning is a method of learning that occurs through rewards and punishments: an association is formed between a behavior and its consequence. For example, when lab rats press a lever while a green light is on, they receive a food pellet as a reward.
When they press the lever while a red light is on, they receive a mild electric shock. As a result, they learn
to press the lever when the green light is on and to avoid pressing it when the red light is on.
But operant conditioning is not just something that takes place in experimental settings while training lab
animals. It also plays a powerful role in everyday learning. Reinforcement and punishment take place in
natural settings all the time, as well as in more structured settings such as classrooms or therapy sessions.
The History of Operant Conditioning
Operant conditioning was first described by behaviorist B.F. Skinner, which is why you may occasionally
hear it referred to as Skinnerian conditioning. As a behaviorist, Skinner believed that it was not necessary
to look at internal thoughts and motivations to explain behavior. Instead, he suggested, we should look
only at the external, observable causes of human behavior. Through the first part of the 20th century,
behaviorism became a major force within psychology. The ideas of John B. Watson dominated this school
of thought early on. Watson focused on the principles of classical conditioning, once famously suggesting
that he could take any person regardless of their background and train them to be anything he chose.
Early behaviorists focused their interests on associative learning. Skinner was more interested in how the
consequences of people's actions influenced their behavior. Skinner used the term operant to refer to any
"active behavior that operates upon the environment to generate consequences." Skinner's theory
explained how we acquire the range of learned behaviors we exhibit every day. His theory was heavily
influenced by the work of psychologist Edward Thorndike, who had proposed what he called the law of
effect. According to this principle, actions that are followed by desirable outcomes are more likely to be
repeated while those followed by undesirable outcomes are less likely to be repeated. Operant
conditioning relies on a simple premise: Actions that are followed by reinforcement will be strengthened
and more likely to occur again in the future. If you tell a funny story in class and everybody laughs, you
will probably be more likely to tell that story again in the future. If you raise your hand to ask a question
and your teacher praises your polite behavior, you will be more likely to raise your hand the next time you
have a question or comment. Because the behavior was followed by reinforcement, or a desirable
outcome, the preceding action is strengthened. Conversely, actions that result in punishment or
undesirable consequences will be weakened and less likely to occur again in the future. If you tell the
same story again in another class but nobody laughs this time, you will be less likely to repeat the story
again in the future. If you shout out an answer in class and your teacher scolds you, then you might be less
likely to interrupt the class again.
Skinner is regarded as the father of operant conditioning, but his work was based on Thorndike's law of
effect. Skinner introduced a new term into the law of effect: reinforcement. Behavior that is
reinforced tends to be repeated (i.e., strengthened); behavior that is not reinforced tends to die out, or be
extinguished (i.e., weakened). Skinner (1948) studied operant conditioning by conducting experiments
using animals, which he placed in a "Skinner box" similar to Thorndike's puzzle box.
Animals in a Skinner box learn to obtain food by operating on their environment within the box. To
illustrate Skinner's experiment, suppose we want to teach a hungry pigeon to peck a key that is located in
its box. At first the pigeon will wander around the box, exploring the environment in a relatively
random fashion. At some point, however, it will probably peck the key by chance, and when it does, it will
receive food. The first time this happens, the pigeon will not learn the connection between pecking
and receiving food and will continue to explore the box. With repeated pairings, it will eventually
peck the key to satisfy its hunger, thereby demonstrating that it has learned that the receipt of food is
contingent on the pecking behavior.
B.F. Skinner (1938) coined the term operant conditioning; it means, roughly, the changing of behavior by the
use of reinforcement given after the desired response.
Types of Behaviors
• Respondent behaviors are those that occur automatically and reflexively, such as pulling your
hand back from a hot stove or jerking your leg when the doctor taps on your knee. You don't
have to learn these behaviors. They simply occur automatically and involuntarily.
• Operant behaviors, on the other hand, are those under our conscious control. Some may occur
spontaneously and others purposely, but it is the consequences of these actions that then
influence whether or not they occur again in the future. Our actions on the environment and the
consequences of that action make up an important part of the learning process.
While classical conditioning could account for respondent behaviors, Skinner realized that it could not
account for a great deal of learning. Instead, Skinner suggested that operant conditioning held far greater
importance. Skinner invented different devices during his boyhood, and he put these skills to work during
his studies on operant conditioning. He created a device known as an operant conditioning chamber, often
referred to today as a Skinner box. The chamber could hold a small animal, such as a rat or pigeon. The
box also contained a bar or key that the animal could press to receive a reward. To track responses,
Skinner also developed a device known as a cumulative recorder. The device recorded responses as an
upward movement of a line so that response rates could be read by looking at the slope of the line.
Skinner identified several basic terms and processes in operant conditioning (a short code sketch after this list shows how the reinforcement and punishment categories fit together):
1. POSITIVE REINFORCEMENT - Reward – presenting the subject with something that it likes. E.g.,
Skinner rewarded his rats with food pellets.
2. NEGATIVE REINFORCEMENT - Reward – in the sense of removing or avoiding some
aversive (painful) stimulus. E.g., Skinner's rats learned to press the lever in order to switch off the electric
current in the cage.
3. PUNISHMENT - Imposing an aversive or painful stimulus. E.g., Skinner’s rats were given electric
shocks.
4. PRIMARY REINFORCERS - These are stimuli which are naturally reinforcing because they
directly satisfy a need. E.g., food, water.
5. SECONDARY REINFORCERS - These are stimuli, which are reinforcing through their association
with a primary reinforcer. I.e., they do not directly satisfy a need but may be the means to do so. E.g.,
Money! You cannot eat it or drink it but if you have it, you can buy whatever you want. So, a
secondary reinforcer can be just as powerful a motivator as a primary reinforcer.
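The reinforcement and punishment terms above (together with the positive/negative punishment distinction used in the examples later in this chapter) fit a simple two-by-two rule: whether a stimulus is presented or removed, and whether the behavior is strengthened or weakened. The sketch below (Python) is just an illustrative classifier; the function name and framing are assumptions, not from the lecture.

# An illustrative classifier for the four operant procedures. The two
# defining features: is a stimulus presented or removed, and is the
# behavior strengthened or weakened?

def classify(stimulus_presented: bool, behavior_strengthened: bool) -> str:
    """Name the operant procedure from its two defining features."""
    if behavior_strengthened:
        return "positive reinforcement" if stimulus_presented else "negative reinforcement"
    return "positive punishment" if stimulus_presented else "negative punishment"

# Skinner's rats: food pellet presented, lever pressing strengthened.
print(classify(stimulus_presented=True, behavior_strengthened=True))   # positive reinforcement
# Rats pressing the lever to switch off the current: stimulus removed,
# pressing strengthened.
print(classify(stimulus_presented=False, behavior_strengthened=True))  # negative reinforcement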
Principles of Operant Conditioning:
1. Acquisition and shaping: for the acquisition of an operant response, the subject must be
motivated to obtain the reinforcement; if the reinforcement is food, then the subject must be
hungry. Operant responses are typically established through a gradual process called shaping: the
reinforcement of closer and closer approximations of a desired response (see the sketch after this list).
2. Extinction: when reinforcement is withheld, the learned response gradually weakens and eventually
disappears. A rat that stops receiving food pellets for pressing the lever will eventually stop pressing it.
3. Spontaneous recovery: the reappearance, after a rest period, of a behavior that had been
extinguished. Rats whose lever pressing has undergone extinction in the Skinner box will begin to press
the lever again after a rest period.
4. Generalization: the organism learns to respond to one stimulus and then generalizes this
response to similar stimuli.
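Shaping is easiest to see as a loop: reinforce whatever response is closest to the target right now, then gradually tighten the criterion. The sketch below (Python) uses a number as a stand-in for "how close the response is to the target"; the target value, criterion steps, and response model are illustrative assumptions, not from the lecture.

# A minimal sketch of shaping: reinforcing closer and closer approximations
# of a desired response. All values here are illustrative.

import random

TARGET = 10.0  # the desired final response (hypothetical units)

def shape(trials: int) -> float:
    """Reinforce successive approximations and return the final typical response."""
    criterion = 1.0   # begin by reinforcing any rough approximation
    typical = 0.0     # the subject's typical response; reinforcement shifts it
    for _ in range(trials):
        response = typical + random.uniform(-1.0, 2.0)   # natural variation
        if response >= criterion:                        # close enough: reinforce
            typical = max(typical, response)             # behavior drifts toward target
            criterion = min(TARGET, criterion + 0.5)     # then demand a bit more
    return typical

if __name__ == "__main__":
    print(f"Typical response after 100 trials: {shape(100):.1f}")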
Schedules of Reinforcement
A schedule of reinforcement is basically a rule stating which instances of behavior will be reinforced. In
some cases, a behavior might be reinforced every time it occurs. Sometimes, a behavior might not be
reinforced at all. Either positive reinforcement or negative reinforcement may be used as a part of operant
conditioning. In both cases, the goal of reinforcement is to strengthen a behavior so that it will likely
occur again. Reinforcement schedules take place in both naturally occurring learning situations as well as
more structured training situations. In real-world settings, behaviors are probably not going to be
reinforced every time they occur. In situations where you are intentionally trying to reinforce a specific
action (such as in school, sports, or in animal training), you would follow a specific reinforcement
schedule. Some schedules are better suited to certain types of training situations. In some cases, training
might call for one schedule and then switch to another once the desired behavior has been taught.
The two foundational forms of reinforcement schedules are referred to as continuous reinforcement and
partial reinforcement.
1. Continuous Reinforcement
In continuous reinforcement, the desired behavior is reinforced every single time it occurs. This schedule
is best used during the initial stages of learning to create a strong association between the behavior and
response. Imagine, for example, that you are trying to teach a dog to shake your hand. During the initial
stages of learning, you would stick to a continuous reinforcement schedule to teach and establish the
behavior. This might involve grabbing the dog's paw, shaking it, saying "shake," and then offering a
reward each time you perform these steps. Eventually, the dog will start to perform the action on its own.
Continuous reinforcement schedules are most effective when trying to teach a new behavior: every
narrowly defined response is followed by a narrowly defined consequence.
2. Partial Reinforcement
Once the response is firmly established, a continuous reinforcement schedule is usually switched to a
partial reinforcement schedule. In partial (or intermittent) reinforcement, the response is reinforced only
part of the time. Learned behaviors are acquired more slowly with partial reinforcement, but the response
is more resistant to extinction.
Think of the earlier example in which you were training a dog to shake. While you initially used
continuous reinforcement, reinforcing the behavior every time is simply unrealistic. In time, you would
switch to a partial schedule to provide additional reinforcement once the behavior has been established or
after considerable time has passed.
There are four basic types of partial reinforcement schedules (a code sketch of all four follows this list):
1. Fixed-ratio schedules are those in which a response is reinforced only after a specified number of
responses. This schedule produces a high, steady rate of responding with only a brief pause after the
delivery of the reinforcer. An example of a fixed-ratio schedule would be delivering a food pellet to a
rat after it presses a bar five times.
2. Variable-ratio schedules occur when a response is reinforced after an unpredictable number of
responses. This schedule creates a high steady rate of responding. Gambling and lottery games are
good examples of a reward based on a variable ratio schedule. In a lab setting, this might involve
delivering food pellets to a rat after one bar press, again after four bar presses, and then again after
two bar presses.
3. Fixed-interval schedules are those where the first response is rewarded only after a specified amount
of time has elapsed. This schedule causes high amounts of responding near the end of the interval but
slower responding immediately after the delivery of the reinforcer. An example of this in a lab setting
would be reinforcing a rat with a food pellet for the first bar press after a 30-second interval has elapsed.
4. Variable-interval schedules occur when a response is rewarded after an unpredictable amount of time
has passed. This schedule produces a slow, steady rate of response.
An example of this would be delivering a food pellet to a rat after the first bar press following a one-
minute interval; a second pellet for the first response following a five-minute interval; and a third pellet
for the first response following a three-minute interval.
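Because each schedule is simply a rule stating which responses earn a reinforcer, all four can be written out precisely. The sketch below (Python) expresses each schedule as a small class whose respond() method reports whether a particular response is reinforced; the class names, parameters, and random distributions are illustrative choices, not part of the lecture.

# Illustrative implementations of the four partial reinforcement schedules.
# respond() is called once per response (e.g., per bar press) and returns
# True when that response is reinforced.

import random
import time

class FixedRatio:
    """Reinforce every nth response (e.g., every 5th bar press)."""
    def __init__(self, n: int):
        self.n, self.count = n, 0
    def respond(self) -> bool:
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False

class VariableRatio:
    """Reinforce after an unpredictable number of responses, averaging n."""
    def __init__(self, n: int):
        self.n = n
        self.required = random.randint(1, 2 * n - 1)
        self.count = 0
    def respond(self) -> bool:
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = random.randint(1, 2 * self.n - 1)
            return True
        return False

class FixedInterval:
    """Reinforce the first response after a fixed time (e.g., 30 seconds)."""
    def __init__(self, seconds: float):
        self.seconds = seconds
        self.last = time.monotonic()
    def respond(self) -> bool:
        if time.monotonic() - self.last >= self.seconds:
            self.last = time.monotonic()
            return True
        return False

class VariableInterval:
    """Reinforce the first response after an unpredictable time, averaging s."""
    def __init__(self, seconds: float):
        self.seconds = seconds
        self.last = time.monotonic()
        self.wait = random.uniform(0, 2 * seconds)
    def respond(self) -> bool:
        if time.monotonic() - self.last >= self.wait:
            self.last = time.monotonic()
            self.wait = random.uniform(0, 2 * self.seconds)
            return True
        return False

# Usage: a fixed-ratio-5 schedule reinforces every fifth press.
schedule = FixedRatio(5)
print([schedule.respond() for _ in range(10)])  # True at presses 5 and 10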
Using the Appropriate Schedule
Deciding when to reinforce a behavior can depend on a number of factors. In cases where you are
specifically trying to teach a new behavior, a continuous schedule is often a good choice. Once the
behavior has been learned, switching to a partial schedule is often preferable. In daily life, partial
schedules of reinforcement occur much more frequently than do continuous ones. For example, imagine if
you received a reward every time you showed up to work on time. Over time, the reward would come to be
expected rather than reinforcing, and withholding it would feel like a punishment.
Instead, rewards like these are usually doled out on a much less predictable partial reinforcement
schedule. Not only are these much more realistic, but they also tend to produce higher response rates
while being less susceptible to extinction. Partial schedules reduce the risk of satiation once a behavior
has been established. If a reward is given without end, the subject may stop performing the behavior if the
reward is no longer wanted or needed. For example, imagine that you are trying to teach a dog to sit. If
you use food as a reward every time, the dog might stop performing once it is full. In such instances,
something like praise or attention may be more effective in reinforcing an already-established behavior.
We can find examples of operant conditioning at work all around us. Consider the case of children
completing homework to earn a reward from a parent or teacher, or employees finishing projects to
receive praise or promotions. More examples of operant conditioning in action include:
• After performing in a community theater play, you receive applause from the audience. This acts
as a positive reinforcer, inspiring you to try out for more performance roles.
• A professor tells students that if they have perfect attendance all semester, then they do not have
to take the final comprehensive exam. By removing an unpleasant stimulus (the final test),
students are negatively reinforced to attend class regularly.
• If you fail to hand in a project on time, your boss becomes angry and berates your performance in
front of your co-workers. This acts as a positive punisher, making it less likely that you will finish
projects late in the future.
• A teen girl does not clean up her room as she was asked, so her parents take away her phone for
the rest of the day. This is an example of a negative punishment in which a positive stimulus is
taken away.
In some of these examples, the promise or possibility of rewards causes an increase in behavior. Operant
conditioning can also be used to decrease a behavior via the removal of a desirable outcome or the
application of a negative outcome.