CPY 3202 Lecture 4

The document discusses operant conditioning, a learning theory associated with Thorndike and Skinner, emphasizing that behavior is influenced by its consequences rather than specific stimuli. Thorndike's laws of learning, including the law of effect and the law of exercise, highlight how satisfying responses are reinforced while discomforting ones are weakened, and Skinner expanded on these principles through experiments demonstrating reinforcement and punishment. The document also outlines various types of reinforcement, schedules of reinforcement, and implications for learning practices in educational settings.


CPY 3202

LECTURE 4
BEHAVIOURISM: OPERANT CONDITIONING

• Associated with Thorndike (1874-1949) and Skinner (1904-1990)


• Basic principle: Animal behaviour is not triggered or elicited by specific stimuli; instead:
• Organisms operate on their environments and this operant
behaviour is instrumental in bringing about certain
consequences
• The consequences determine the probability of that behaviour
being repeated
• The learner is much more active in operant conditioning than in
classical conditioning
Operant conditioning: Summary of E.L. Thorndike’s work
• Thorndike built puzzle boxes for use with animals (usually cats), in which the animal had to perform a simple act (such as pulling a string or pushing a pedal) to make its escape and reach a dish of food placed within its view just outside the cage.

[Figure: Thorndike’s puzzle box. Source: Nevid, 2017]


Thorndike’s work cont’d
• The cats were deprived of food for a considerable period before the
experiments began, so they were highly motivated.
• After eating, the cats were put straight back into the box and the experiment
repeated.
• The animal would first engage in seemingly random behaviors until it
accidentally performed the response that released the door. Thorndike
argued that the animals did not employ reasoning, insight, or any other form
of higher intelligence to find their way to the exit.
• Rather, it was through a random process of trial and error that they gradually
eliminated useless responses and eventually chanced upon the successful
behavior. Successful responses were then “stamped in” by the pleasure they
produced and became more likely to be repeated in the future.
Thorndike’s laws of learning
• From his observations, Thorndike formulated two major laws of learning:
❑ Law of effect: states that responses that have satisfying effects are strengthened and become
more likely to occur again in a given situation, whereas responses that lead to discomfort are
weakened and become less likely to recur.
NOTE: Modern psychologists call the first part of the law of effect reinforcement and
the second part punishment (Benjamin, 1988).
• Implications: We are more likely to repeat responses that have satisfying effects and are
less likely to repeat those that lead to discomfort.
❑Law of exercise: the more a stimulus-response connection was practised or used,
the stronger it would become; and the less it was used the weaker it would become. (The law was
later modified to include the fact that connections were only strengthened by rewarded practice)
• Implications: Practice makes perfect
Thorndike’s laws and educational practice
❑Thorndike went on to study how the principles of animal learning that
he formulated could be applied to human behavior and especially to
education. He believed that although human behavior is certainly
more complex than animal behavior, it, too, can be explained on the
basis of trial-and-error learning in which accidental successes become
“stamped in” by positive consequences.
❑The law of effect led to the use of concrete rewards and verbal praise
❑The law of exercise led to the use of repetition and practice
Skinner’s operant conditioning
• Skinner was a strict behaviorist who believed that psychologists
should limit themselves to the study of observable behavior. Because
“private events,” such as thoughts and feelings, cannot be observed,
he believed they have no place in a scientific account of behavior. For
Skinner, the mind was a “black box” whose contents cannot be
illuminated by science.
• For Skinner, classical conditioning was limited to explaining how new
stimuli can elicit existing behaviors, such as salivation. It could not
account for new behaviors, such as the behavior of the experimental
animals in Thorndike’s puzzle box.
Skinner’s operant conditioning cont’d…
• Skinner found in Thorndike’s work a guiding principle that
behavior is shaped by its consequences.
• But he rejected Thorndike’s mentalistic concept that
consequences influence behavior because they produce
“satisfying effects.”
• Skinner proposed that organisms learn responses that operate
on the environment to produce consequences; he therefore
called this learning process operant conditioning.
Skinner’s experiments
• Did experiments using a Skinner box (which allowed organisms to do things in, rather than escape from, the box). He used rats or pigeons for his experiments.
• The box had a lever under which there was a food tray. The organism had total control: by pressing the lever, a food pellet would be introduced into the box.
• The rats had the drive to explore around for several minutes. By accident, they would
press the lever, and food would be introduced into the box
• The random behaviours soon became consistent or reinforced with repeated reward.
• Extinction would also take place, followed by spontaneous recovery. Extinction happened when no food was forthcoming after pressing the lever. Pressing behaviour was also weakened by the introduction of an aversive stimulus: if pressing the lever delivered an electric shock, the voluntary behaviour of pressing the lever gradually weakened and finally disappeared.
[Figure: The Skinner box]
Operant conditioning: Learning points
1) Reinforcement: A consequence that increases the probability that a behaviour will occur
2) Punishment: A consequence that decreases the probability that a behaviour will occur
NOTE: Reinforcement strengthens behaviour while punishment weakens behaviour
REINFORCEMENT
• A reinforcer is a stimulus whose presentation or removal increases the probability
of a response.
• The process whereby a reinforcer is presented or removed in order to strengthen
behaviour is called reinforcement
• Types of reinforcement:
❑Positive reinforcement – involves the presentation of a pleasurable stimulus (food, money, a smile, a pat on the back, etc.) to strengthen behaviour
❑Negative reinforcement – involves the removal or avoidance of an aversive (painful) state of affairs. It involves providing a way or a means of escaping or avoiding negative/painful consequences.
Reinforcement cont’d
• When the consequence that strengthens behaviour is the appearance (addition) of a new stimulus, the situation is defined as positive reinforcement.
• When the consequence that strengthens behaviour is the disappearance (subtraction) of a stimulus, the process is called negative reinforcement. If a particular action leads to avoiding or escaping an aversive situation, the action is likely to be repeated in a similar situation in future.
Reinforcement examples
Type of reinforcement | Description | Example
Positive reinforcement | Receiving something pleasant will increase behaviour occurrences (a consequence that creates a pleasant state, making behaviour more likely to occur) | A student is praised for asking a question. Consequently, the student asks more questions.
Negative reinforcement | Removing something unpleasant will increase behaviour occurrences (a consequence that takes away an unpleasant state, making behaviour more likely to occur) | 1) A child who is tired of hearing a parent’s nagging will do their homework. The child does the homework to remove the nagging. 2) A whining child – you give in to the child because you are negatively reinforced by the whining. Next time the child whines, you are likely to give in because it helps get rid of an unpleasant state of affairs.
Alternative classification: Contingent vs noncontingent reinforcement
• Contingent reinforcement, as the name suggests, is all about
dependencies. It’s the classic, “If you do this, then you get that” scenario.
This type of reinforcement involves a reward, but to earn it, a task must be
completed to a certain standard.
• Example: Students working towards an A grade at the end of the
semester know precisely what they’re working towards and the steps they
need to take to get there.
• Noncontingent Reinforcement (NCR) is the process of delivering rewards based on the
passage of time. Rewards are not given based on behavior.
• Example: Think of NCR as rewards that are not tied to the student’s behavior and delivered
with no strings attached.
• NCR is used to prevent behavior before it happens and to increase appropriate behavior
over time.
Types of reinforcers: Primary reinforcers
• A reinforcer is a stimulus that is presented or removed with the consequence of increasing the probability of a response. There are two types of reinforcers, namely primary and secondary reinforcers.

• Primary reinforcers include all those items that satisfy the primary human needs. They include
food, shelter and clothing. A primary reinforcer is rewarding in and of itself without association
with other reinforcers. If you reward a child with a sweet for displaying good behaviour the child
does not associate a sweet with anything to find it rewarding. It is rewarding just as it is.

• However, primary reinforcers are not encouraged because they are not effective in bringing about
long-lasting change in behavior. The effect is temporary.
Types of reinforcers: Secondary reinforcers
A secondary reinforcer is one whose value has to be learned through association with
other reinforcers. We have to learn to find secondary reinforcers reinforcing through
classical conditioning. For instance, money or school grades are good examples of
secondary reinforcers.

Money is useless on its own. You can see that from the way children throw it in the bin
or tear a currency note up before they realize what it can buy for them. It only becomes
a reinforcer when it is associated with the acquisition of basic needs.

The same thing applies to school grades. On their own they mean nothing to you. However, when you begin to associate them with getting a job which will enable you to get money that you will use to satisfy your basic needs, then the grades become very powerful reinforcers.
Classification of reinforcers (Piazza, Roanne & Karsten, 2011)
 Most positive reinforcers can be classified under five somewhat overlapping headings: consumable, activity, manipulative, possessional, and social.
 Consumable reinforcers are items that one can eat or drink, such as candy, cookies, fruit, and beverages.
 Examples of activity reinforcers are the opportunities to watch television, look at a picture book, or even stare out of a window.
 Manipulative reinforcers include the opportunities to play with a favorite toy, build with LEGO®, color or paint, ride a bicycle, or surf the Internet.
 Possessional reinforcers include the opportunities to sit in one’s favorite chair, wear a favorite shirt or dress, have a private room, or enjoy some other item that one can possess (at least temporarily).
 Social reinforcement includes affectionate pats and hugs, praise, nods, smiles, and even a simple glance or other indication of social attention. Attention from others is a very strong reinforcer for almost everyone.
Schedules of reinforcement
• A schedule of reinforcement describes when and how often a response should be
reinforced. There is a relationship between the frequency of reinforcement and the
maintenance of behavior.

❑ There are five major schedules of reinforcement each associated with a characteristic
pattern of responding:

(i) continuous schedule

(ii) fixed interval schedule

(iii) variable interval schedule

(iv) fixed ratio schedule

(v) variable ratio schedule


Schedules of reinforcement cont’d
❑Continuous Schedule
• This is a schedule in which every correct or desirable response is reinforced. For example, if every time a child displays desirable behaviour the child gets rewarded, the parent would be using a continuous schedule of reinforcement.
• NOTE: People, however, are much better at working for delayed reinforcement.
Schedules of reinforcement cont’d
❑Fixed Interval Schedule
• In this schedule, reinforcement is given after a certain given period of time has elapsed. For
example, you may decide to reinforce responses after every 10 seconds or after a week.
The time interval after which a response is reinforced is predictable. This is one of the main
weaknesses of this schedule. Since it is predictable a person can decide to display the
desired behaviour towards the end of the time interval.

• EXAMPLE: a student who knows that exams come only at the end of the semester tends to study at the end of the semester only. Or those students in a boarding school who know that inspection of the rooms happens only on Saturday morning, and that the person with the cleanest room is rewarded, will tend to clean their rooms on Friday evening.
Schedules of reinforcement cont’d
Fixed Ratio Schedule
• In a fixed ratio schedule, reinforcement is given after a fixed number of
responses have occurred. For example, dairy farmers get paid after a fixed
number of litres of milk have been delivered no matter how long it takes, or
workers are paid according to the number of items they produce.
• Fixed-ratio schedules produce a constant, high level of response, with a slight
dip in responses occurring after each reinforcement. Therefore, the faster
people work, the more items they produce and the more money they earn.
However, quality may suffer if quantity alone determines how
reinforcements are dispensed.
Schedules of reinforcement cont’d
Variable Interval Schedule
• In this case, reinforcement is given after an unpredictable length of time has elapsed. It could be given after a month or after a week; nobody knows when it will be given. The person is surprised by the reinforcement.
• Variable-interval schedules typically produce a slow but steady rate of response. They are also more resistant to extinction than fixed-interval schedules because one cannot reliably predict when a response will be rewarded.
Schedules of reinforcement cont’d
Variable Ratio Schedule
❑ Reinforcement in this case is given after an unpredictable (varied) number of responses have occurred. It can be given after two responses, seven responses, or one response, etc. The number of responses is not fixed.
❑ Variable-ratio schedules typically produce high, steady rates of response. They are also more resistant to extinction than fixed-ratio schedules because one cannot reliably predict whether a given number of responses will be rewarded.
• An effective example of a variable-ratio schedule is a slot machine found in
casinos. The machines are set to pay off on a variable ratio: For example,
after an average of 10,000 quarters are played, a payout of winnings will
occur. But it is not predictable exactly when that will occur, because the
payoff may occur much earlier or much later than that average. This variable-
ratio schedule works well to keep the response coming because of the
uncertainty about exactly when the reward will occur.
Schedules of reinforcement: summary

Notice how ratio schedules produce much faster response rates than interval schedules.
However, there is usually a short pause following each reinforced set of responses under fixed-
ratio schedules. Variable-interval schedules typically produce a slow but steady rate of response.

[Figure: response patterns under each schedule of reinforcement. Source: Nevid, 2017]


Practice Exercise
• Operant conditioning is a form of learning in which the (a) _______ of behavior
influence the strength or likelihood that the behavior will occur. Edward Thorndike
developed the law of (b) _______, which holds that responses that have satisfying
effects will be strengthened whereas those that lead to discomfort will be
weakened. B. F. Skinner developed the principles of (c) _______ conditioning,
including the roles of positive and negative reinforcement and of schedules of
reinforcement.
• A schedule of (d) _______ reinforcement produces the most rapid learning but
also the most rapid extinction of a response when reinforcement is withheld. (e)
_______ is slower under a partial reinforcement schedule, but resistance to
extinction varies with the particular type of schedule. Response rates also vary
with the particular type of partial reinforcement schedule.
Implications for learning
• 1) When new tasks are being learnt, reinforcement should be given
immediately; there should be no delay between the response and
reinforcement.
• 2) In the early stages of learning, every correct response should be
reinforced. As learning progresses, more responses should be required
before reinforcement is given. Eventually, the reinforcement should be
more intermittent in order to maintain the behaviour.
• 3) Steps or improvements in the right direction should be reinforced.
Requiring perfect performance/answers before reinforcement
discourages learners and hinders the learning process.
4. Use the Premack principle

 The Premack principle (first formulated by David Premack, 1959) states that if the opportunity to engage in a behavior that has a high probability of occurring is made contingent on a behavior that has a low probability of occurring, then the behavior that has a low probability of occurring will increase.
 Example: Suppose that parents of a 13-year-old boy observe that, during the school year,
their son spends several hours each weekday evening on Facebook or texting friends, but
almost never studies or does homework. If the parents were to assume control of their
son’s cell phone and computer each evening, and if they were to tell their son, “From now
on, each school night, for each hour of studying or doing homework, you can have access to
your computer and cell phone for the following half hour,” then studying and homework
would likely increase in frequency.
• 5) Undesirable behaviour should never be reinforced.
NOTE: The problem with reinforcement
 Unfortunately, it is possible for positive reinforcement to be used unknowingly to strengthen undesirable behavior.
 Many undesirable behaviors are due to the social attention that such behavior evokes from peers, teachers, parents, and others.
e.g., a teacher who gives an easier alternative assignment to a student who complains that the given assignment is too hard.
[Figure: The problem with reinforcement – illustration]
Key differences between classical conditioning and operant conditioning
• Classical conditioning
✓The organism is passive
✓Responses are reflexes (limited number of possible responses)
✓Connects an involuntary response to a neutral stimulus
✓Responses are not followed by a reinforcer
• Operant conditioning
✓ The organism is active (“operating” on the world)
✓ Responses are voluntary behaviours (limitless possible responses)
✓ Encourages or discourages behaviour by pairing it with a consequence (i.e., responses are followed by a reinforcer)
Assignment
1) Give two examples of negative reinforcement.
2) Identify three classroom situations in which negative behaviour might be reinforced.
