Report in Philosophy
When Pavlov discovered that any object or event which the dogs learned to
associate with food (such as the lab assistant) would trigger the same response,
he realized that he had made an important scientific discovery. Accordingly, he
devoted the rest of his career to studying this type of learning.
Pavlovian Conditioning
Pavlov (1902) started from the idea that there are some things that a dog does not
need to learn. For example, dogs don’t learn to salivate whenever they see food.
This reflex is ‘hard-wired’ into the dog.
In behaviorist terms, food is an unconditioned stimulus and salivation is an unconditioned response (i.e., a stimulus-response connection that requires no learning).
Unconditioned Stimulus (Food) > Unconditioned Response (Salivate)
In his experiment, Pavlov used a metronome as his neutral stimulus. By itself the metronome did not elicit a response from the dogs.
Neutral Stimulus (Metronome) > No Conditioned Response
Next, Pavlov began the conditioning procedure, whereby the clicking metronome
was introduced just before he gave food to his dogs. After a number of repeats
(trials) of this procedure he presented the metronome on its own.
As you might expect, the sound of the clicking metronome on its own now caused
an increase in salivation.
Conditioned Stimulus (Metronome) > Conditioned Response (Salivate)
So the dog had learned an association between the metronome and the food and a
new behavior had been learned. Because this response was learned (or
conditioned), it is called a conditioned response (and also known as a Pavlovian
response). The neutral stimulus has become a conditioned stimulus.
Pavlov found that for associations to be made, the two stimuli had to be presented close together in time. He called this the law of temporal contiguity. If the time between the conditioned stimulus (e.g., the metronome) and the unconditioned stimulus (food) is too great, then learning will not occur.
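To make the pairing process concrete, the sketch below simulates acquisition using the single-stimulus form of the later Rescorla-Wagner learning rule. This is only an illustrative model, not Pavlov's own method; the learning rate and response threshold are arbitrary assumptions.

```python
# Toy model of Pavlovian acquisition (illustrative values, not Pavlov's data).
# Each metronome-food pairing closes part of the gap between the current
# associative strength and its maximum of 1.0.

def pavlovian_acquisition(n_trials, learning_rate=0.3, threshold=0.5):
    v = 0.0  # associative strength between metronome (CS) and food (UCS)
    for trial in range(1, n_trials + 1):
        v += learning_rate * (1.0 - v)  # pairing: prediction error pulls V upward
        salivates = v >= threshold      # CR appears once the link is strong enough
        print(f"trial {trial}: strength={v:.2f}, salivates to metronome: {salivates}")
    return v

pavlovian_acquisition(6)
```

On these assumptions the dog begins salivating to the metronome alone after the second pairing, and further pairings strengthen the association with diminishing returns.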
Pavlov and his studies of classical conditioning have become famous since his early work between 1890 and 1930. Classical conditioning is "classical" in that it is the first systematic study of the basic laws of learning and conditioning.
Summary
To summarize, classical conditioning (later developed by Watson, 1913) involves
learning to associate an unconditioned stimulus that already brings about a
particular response (i.e., a reflex) with a new (conditioned) stimulus, so that the
new stimulus brings about the same response.
Pavlov developed some rather unfriendly technical terms to describe this process.
The unconditioned stimulus (or UCS) is the object or event that originally
produces the reflexive / natural response.
The response to this is called the unconditioned response (or UCR). The neutral
stimulus (NS) is a new stimulus that does not produce a response.
Once the neutral stimulus has become associated with the unconditioned
stimulus, it becomes a conditioned stimulus (CS). The conditioned response (CR)
is the response to the conditioned stimulus.
Classical Conditioning
By Saul McLeod, updated 2018
Critical Evaluation
Classical conditioning emphasizes the importance of learning from the
environment, and supports nurture over nature. However, it is limiting to
describe behavior solely in terms of either nature or nurture, and attempts to do
this underestimate the complexity of human behavior. It is more likely that
behavior is due to an interaction between nature (biology) and nurture
(environment).
A strength of classical conditioning theory is that it is scientific. This is because it's based on empirical evidence from controlled experiments. For
example, Pavlov (1902) showed how classical conditioning could be used to make
a dog salivate to the sound of a bell.
Classical conditioning is also a reductionist explanation of behavior. This is
because a complex behavior is broken down into smaller stimulus-response units
of behavior.
Supporters of a reductionist approach say that it is scientific. Breaking
complicated behaviors down to small parts means that they can be scientifically
tested. However, some would argue that the reductionist view lacks validity.
Thus, while reductionism is useful, it can lead to incomplete explanations.
A final criticism of classical conditioning theory is that it is deterministic. This
means that it does not allow for any degree of free will in the individual.
Accordingly, a person has no control over the reactions they have learned from
classical conditioning, such as a phobia.
The deterministic approach also has important implications for psychology as a
science. Scientists are interested in discovering laws which can then be used to
predict events. However, by creating general laws of behavior, deterministic
psychology underestimates the uniqueness of human beings and their freedom to
choose their own destiny.
What Is Classical
Conditioning?
A Step-by-Step Guide to How Classical Conditioning Really
Works
By Kendra Cherry | Reviewed by Steven Gans, MD
Classical conditioning is a type of learning that had a major influence on the school of
thought in psychology known as behaviorism. Discovered by Russian physiologist Ivan
Pavlov, classical conditioning is a learning process that occurs through associations
between an environmental stimulus and a naturally occurring stimulus.
It's important to note that classical conditioning involves placing a neutral signal before a
naturally occurring reflex. In Pavlov's classic experiment with dogs, the neutral signal
was the sound of a tone and the naturally occurring reflex was salivating in response to
food. By associating the neutral stimulus with the environmental stimulus (the presentation of food), the sound of the tone alone could produce the salivation response.
To understand more about how classical conditioning works, it is important to be familiar with the basic principles of the process.
The first part of the classical conditioning process requires a naturally occurring stimulus
that will automatically elicit a response. The smell of food, which triggers salivation, is a good example of a naturally occurring stimulus.
During this phase of the process, the unconditioned stimulus (UCS) results in an
unconditioned response (UCR). For example, presenting food (the UCS) naturally and
automatically triggers a salivation response (the UCR).
At this point, there is also a neutral stimulus that produces no effect - yet. It isn't until this
neutral stimulus is paired with the UCS that it will come to evoke a response.
Let's take a closer look at the two critical components of this phase of classical
conditioning.
During the second phase of the classical conditioning process, the previously neutral
stimulus is repeatedly paired with the unconditioned stimulus. As a result of this pairing,
an association between the previously neutral stimulus and the UCS is formed. At this
point, the once neutral stimulus becomes known as the conditioned stimulus (CS). The
subject has now been conditioned to respond to this stimulus.
Once the association has been made between the UCS and the CS, presenting the
conditioned stimulus alone will come to evoke a response even without the unconditioned
stimulus. The resulting response is known as the conditioned response (CR).
1. Acquisition
Acquisition is the initial stage of learning when a response is first established and
gradually strengthened. During the acquisition phase of classical conditioning, a neutral
stimulus is repeatedly paired with an unconditioned stimulus. As you may recall, an
unconditioned stimulus is something that naturally and automatically triggers a response
without any learning. After an association is made, the subject will begin to emit a
behavior in response to the previously neutral stimulus, which is now known as
a conditioned stimulus. It is at this point that we can say that the response has been
acquired.
For example, imagine that you are conditioning a dog to salivate in response to the sound
of a bell. You repeatedly pair the presentation of food with the sound of the bell. You can
say the response has been acquired as soon as the dog begins to salivate in response to the
bell tone.
Once the response has been established, you can gradually reinforce the salivation
response to make sure the behavior is well learned.
2. Extinction
Extinction occurs when a conditioned response weakens and eventually disappears because the conditioned stimulus is no longer paired with the unconditioned stimulus. For example, if the smell of food (the unconditioned stimulus) had been paired with the sound of a whistle (the conditioned stimulus), the whistle would eventually come to evoke the conditioned response of hunger. However, if the unconditioned stimulus (the smell of food) were no longer paired with the conditioned stimulus (the whistle), eventually the conditioned response (hunger) would disappear.
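The same toy model used in the acquisition sketch earlier can illustrate extinction: with the UCS absent, the prediction error turns negative and the associative strength decays. Again, the numbers are illustrative assumptions only.

```python
# Extinction in the toy model: the CS (whistle) is presented without the UCS
# (smell of food), so the learned strength decays toward zero.

def extinction(strength, n_trials, learning_rate=0.3):
    for trial in range(1, n_trials + 1):
        strength += learning_rate * (0.0 - strength)  # UCS absent: V is pulled down
        print(f"extinction trial {trial}: strength={strength:.2f}")
    return strength

extinction(strength=0.88, n_trials=5)  # start from a well-acquired association
```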
3. Spontaneous Recovery
Spontaneous recovery is the reappearance of a previously extinguished conditioned response after a rest period during which the conditioned stimulus is not presented.
4. Stimulus Generalization
Stimulus generalization is the tendency for the conditioned stimulus to evoke similar
responses after the response has been conditioned.
For example, if a dog has been conditioned to salivate at the sound of a bell, the animal
may also exhibit the same response to stimuli that are similar to the conditioned stimulus.
In John B. Watson's famous Little Albert Experiment, for example, a small child was
conditioned to fear a white rat. The child demonstrated stimulus generalization by also
exhibiting fear in response to other fuzzy white objects, including stuffed toys and Watson's own hair.
5. Stimulus Discrimination
Stimulus discrimination is the ability to differentiate between the conditioned stimulus and other, similar stimuli that have not been paired with the unconditioned stimulus.
For example, if a bell tone were the conditioned stimulus, discrimination would involve
being able to tell the difference between the bell tone and other similar sounds. Because
the subject is able to distinguish between these stimuli, he or she will only respond when
the conditioned stimulus is presented.
It can be helpful to look at a few examples of how the classical conditioning process
operates both in experimental and real-world settings.
Classical Conditioning of a Fear Response
Let's examine the elements of this classic experiment. Prior to the conditioning, the white
rat was a neutral stimulus. The unconditioned stimulus was the loud, clanging sounds and
the unconditioned response was the fear response created by the noise. By repeatedly
pairing the rat with the unconditioned stimulus, the white rat (now the conditioned
stimulus) came to evoke the fear response (now the conditioned response).
In one famous field study, researchers injected sheep carcasses with a poison that would
make coyotes sick but not kill them. The goal was to help sheep ranchers reduce the
number of sheep lost to coyote killings. Not only did the experiment work by lowering
the number of sheep killed, it also caused some of the coyotes to develop such a strong
aversion to sheep that they would actually run away at the scent or sight of a sheep.
In reality, people do not respond exactly like Pavlov's dogs. There are, however,
numerous real-world applications for classical conditioning. For example, many dog
trainers use classical conditioning techniques to help people train their pets.
These techniques are also useful for helping people cope with phobias or anxiety
problems. Therapists might, for example, repeatedly pair something that provokes
anxiety with relaxation techniques in order to create an association.
Teachers are able to apply classical conditioning in the class by creating a positive
classroom environment to help students overcome anxiety or fear. Pairing an anxiety-
provoking situation, such as performing in front of a group, with pleasant surroundings
helps the student learn new associations. Instead of feeling anxious and tense in these
situations, the child will learn to stay relaxed and calm.
Skinner - Operant
Conditioning
By Saul McLeod, updated 2018
Operant conditioning is a method of learning that occurs through rewards and
punishments for behavior. Through operant conditioning, an individual makes
an association between a particular behavior and a consequence (Skinner, 1938).
By the 1920s, John B. Watson had left academic psychology, and
other behaviorists were becoming influential, proposing new forms of learning
other than classical conditioning. Perhaps the most important of these was Burrhus Frederic Skinner who, for obvious reasons, is more commonly known as B.F. Skinner.
Skinner's views were slightly less extreme than those of Watson (1913). Skinner
believed that we do have such a thing as a mind, but that it is simply more
productive to study observable behavior rather than internal mental events.
The work of Skinner was rooted in a view that classical conditioning was far too
simplistic to be a complete explanation of complex human behavior. He believed
that the best way to understand behavior is to look at the causes of an action and
its consequences. He called this approach operant conditioning.
Negative Reinforcement
The removal of an unpleasant reinforcer can also strengthen behavior. This is
known as negative reinforcement because it is the removal of an adverse stimulus
which is ‘rewarding’ to the animal or person. Negative reinforcement strengthens
behavior because it stops or removes an unpleasant experience.
For example, if you do not complete your homework, you give your teacher £5.
You will complete your homework to avoid paying £5, thus strengthening the
behavior of completing your homework.
Skinner showed how negative reinforcement worked by placing a rat in his
Skinner box and then subjecting it to an unpleasant electric current which caused
it some discomfort. As the rat moved about the box, it would accidentally knock the lever; as soon as it did so, the electric current was switched off. After a few times of being put in the box, the rats quickly learned to go straight to the lever. The consequence of escaping the electric current ensured that they would repeat the action again and again.
In fact Skinner even taught the rats to avoid the electric current by turning on a
light just before the electric current came on. The rats soon learned to press the
lever when the light came on because they knew that this would stop the electric
current being switched on.
These two learned responses are known as Escape Learning and Avoidance
Learning.
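A minimal sketch of this escape/avoidance process appears below. The probabilities and the reinforcement increment are illustrative assumptions, not Skinner's procedure or data.

```python
import random

# Negative reinforcement in a "Skinner box" sketch: pressing the lever removes
# (escape) or prevents (avoidance) the shock, so pressing is strengthened.

press_prob = 0.1  # chance the rat presses as soon as the warning light comes on
for trial in range(1, 9):
    if random.random() < press_prob:
        print(f"trial {trial}: avoidance - press at the light, shock never starts")
    else:
        print(f"trial {trial}: escape - shock starts until the lever is pressed")
    # Removing the aversive stimulus reinforces pressing on every trial.
    press_prob = min(1.0, press_prob + 0.15)
```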
Punishment (weakens behavior)
Punishment is defined as the opposite of reinforcement since it is designed to
weaken or eliminate a response rather than increase it. It is an aversive event that
decreases the behavior that it follows.
Like reinforcement, punishment can work either by directly applying an
unpleasant stimulus like a shock after a response or by removing a potentially
rewarding stimulus, for instance, deducting someone’s pocket money to punish
undesirable behavior.
Note: It is not always easy to distinguish between punishment and negative
reinforcement.
There are many problems with using punishment, such as:
Punished behavior is not forgotten, it's suppressed - behavior returns when
punishment is no longer present.
Causes increased aggression - shows that aggression is a way to cope with
problems.
Creates fear that can generalize to undesirable behaviors, e.g., fear of
school.
Does not necessarily guide toward desired behavior - reinforcement tells
you what to do, punishment only tells you what not to do.
Schedules of Reinforcement
Imagine a rat in a “Skinner box.” In operant conditioning, if no food pellet is
delivered immediately after the lever is pressed then after several attempts the rat
stops pressing the lever (how long would someone continue to go to work if their
employer stopped paying them?). The behavior has been extinguished.
Behaviorists discovered that different patterns (or schedules) of
reinforcement had different effects on the speed of learning and extinction.
Ferster and Skinner (1957) devised different ways of delivering reinforcement
and found that this had effects on
1. The Response Rate - The rate at which the rat pressed the lever (i.e.,
how hard the rat worked).
2. The Extinction Rate - The rate at which lever pressing dies out (i.e.,
how soon the rat gave up).
Skinner found that the type of reinforcement which produces the slowest rate of
extinction (i.e., people will go on repeating the behavior for the longest time
without reinforcement) is variable-ratio reinforcement. The type of
reinforcement which has the quickest rate of extinction is continuous
reinforcement.
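One intuition for this finding, sketched below under purely illustrative assumptions: an animal trained on a sparse variable-ratio schedule needs many more unrewarded responses before the lack of reward becomes clearly inconsistent with what it learned to expect.

```python
# Toy comparison of extinction after continuous vs. variable-ratio training.
# The decay rule and all numbers are illustrative assumptions, not
# Ferster and Skinner's data.

def responses_until_extinction(reinforcement_rate, give_up_threshold=0.05):
    """reinforcement_rate: fraction of presses rewarded during training."""
    expectation = 1.0  # how strongly the animal still expects reward
    presses = 0
    while expectation > give_up_threshold:
        # Each unrewarded press is weaker evidence against a sparse schedule.
        expectation *= (1.0 - reinforcement_rate * 0.5)
        presses += 1
    return presses

print("continuous (every press rewarded):", responses_until_extinction(1.0))  # 5
print("variable ratio, ~1 in 10 rewarded:", responses_until_extinction(0.1))  # 59
```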
Token Economy
Token economy is a system in which targeted behaviors are reinforced with
tokens (secondary reinforcers) and later exchanged for rewards (primary
reinforcers).
Tokens can be in the form of fake money, buttons, poker chips, stickers, etc., while the rewards can range anywhere from snacks to privileges or activities. For
example, teachers use token economy at primary school by giving young children
stickers to reward good behavior.
Token economy has been found to be very effective in managing psychiatric
patients. However, the patients can become over reliant on the tokens, making it
difficult for them to adjust to society once they leave prison, hospital, etc.
Staff implementing a token economy programme have a lot of power. It is
important that staff do not favor or ignore certain individuals if the programme is
to work. Therefore, staff need to be trained to give tokens fairly and consistently
even when there are shift changes such as in prisons or in a psychiatric hospital.
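A token-economy ledger can be sketched as a simple data structure; the class, behaviors, and exchange rates below are hypothetical, and a real programme would define target behaviors and exchange rules clinically and apply them consistently across staff.

```python
# Minimal token-economy ledger sketch (all names and values hypothetical).

class TokenEconomy:
    def __init__(self, exchange_rates):
        self.exchange_rates = exchange_rates  # reward name -> token cost
        self.balances = {}                    # person -> tokens held

    def reinforce(self, person, tokens=1):
        """Give tokens (secondary reinforcers) for a target behavior."""
        self.balances[person] = self.balances.get(person, 0) + tokens

    def exchange(self, person, reward):
        """Trade tokens for a primary reinforcer if the balance allows."""
        cost = self.exchange_rates[reward]
        if self.balances.get(person, 0) >= cost:
            self.balances[person] -= cost
            return f"{person} receives {reward}"
        return f"{person} needs more tokens for {reward}"

economy = TokenEconomy({"snack": 3, "extra activity time": 5})
economy.reinforce("Alex", tokens=4)
print(economy.exchange("Alex", "snack"))  # Alex receives snack
```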
Behavior Shaping
A further important contribution made by Skinner (1951) is the notion of
behavior shaping through successive approximation. Skinner argues that the
principles of operant conditioning can be used to produce extremely complex behavior if rewards and punishments are delivered in such a way as to move an organism closer and closer to the desired behavior each time.
To do this, the conditions (or contingencies) required to receive the reward
should shift each time the organism moves a step closer to the desired behavior.
According to Skinner, most animal and human behavior (including language) can
be explained as a product of this type of successive approximation.
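The shifting-contingency idea can be sketched as a loop in which reinforcement is delivered only when the current criterion is met, and each success moves the criterion toward the target. The performance scale, step sizes, and variability below are illustrative assumptions.

```python
import random

# Shaping by successive approximation: reinforce responses that meet the
# current criterion, then demand a closer approximation of the target.

target = 100.0       # the desired behavior on some performance scale
criterion = 20.0     # an initial, easily met requirement
performance = 15.0   # the learner's current typical behavior

while criterion < target:
    attempt = performance + random.uniform(-5, 10)  # trial-to-trial variability
    if attempt >= criterion:
        performance = attempt                     # reinforced variants are retained
        criterion = min(target, performance + 5)  # shift the contingency closer
        print(f"rewarded at {attempt:.0f}; new criterion {criterion:.0f}")
```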
Educational Applications
In the conventional learning situation, operant conditioning applies largely to
issues of class and student management, rather than to learning content. It is
very relevant to shaping skill performance.
A simple way to shape behavior is to provide feedback on learner performance,
e.g., compliments, approval, encouragement, and affirmation. A variable-ratio schedule
produces the highest response rate for students learning a new task, whereby
initially reinforcement (e.g., praise) occurs at frequent intervals, and as the
performance improves reinforcement occurs less frequently, until eventually only
exceptional outcomes are reinforced.
For example, if a teacher wanted to encourage students to answer questions in
class they should praise them for every attempt (regardless of whether their
answer is correct). Gradually the teacher will only praise the students when their
answer is correct, and over time only exceptional answers will be praised.
Unwanted behaviors, such as tardiness and dominating class discussion can be
extinguished through being ignored by the teacher (rather than being reinforced
by having attention drawn to them).
Knowledge of success is also important as it motivates future learning. However,
it is important to vary the type of reinforcement given so that the behavior is
maintained. This is not an easy task, as the teacher may appear insincere if
he/she thinks too much about the way to behave.
Summary
Looking at Skinner's classic studies of pigeon and rat behavior, we can identify
some of the major assumptions of the behaviorist approach.
• Psychology should be seen as a science, to be studied in a scientific manner.
Skinner's study of behavior in rats was conducted under carefully
controlled laboratory conditions.
• Behaviorism is primarily concerned with observable behavior, as opposed to
internal events like thinking and emotion. Note that Skinner did not say that the rats
learned to press a lever because they wanted food. He instead concentrated on
describing the easily observed behavior that the rats acquired.
• The major influence on human behavior is learning from our environment. In the
Skinner study, because food followed a particular behavior the rats learned to repeat
that behavior, e.g., operant conditioning.
• There is little difference between the learning that takes place in humans and that
in other animals. Therefore research (e.g., operant conditioning) can be carried out
on animals (rats, pigeons) as well as on humans. Skinner proposed that the way
humans learn behavior is much the same as the way the rats learned to press a lever.
So, if your layperson's idea of psychology has always been of people in
laboratories wearing white coats and watching hapless rats try to negotiate mazes
in order to get to their dinner, then you are probably thinking of behavioral
psychology.
Behaviorism and its offshoots tend to be among the most scientific of
the psychological perspectives. The emphasis of behavioral psychology is on how
we learn to behave in certain ways.
We are all constantly learning new behaviors and how to modify our existing
behavior. Behavioral psychology is the psychological approach that focuses on
how this learning takes place.
Critical Evaluation
Operant conditioning can be used to explain a wide variety of behaviors, from the
process of learning, to addiction and language acquisition. It also has practical
application (such as token economy) which can be applied in classrooms, prisons
and psychiatric hospitals.
However, operant conditioning fails to take into account the role of inherited
and cognitive factors in learning, and thus is an incomplete explanation of the
learning process in humans and animals.
For example, Kohler (1924) found that primates often seem to solve problems in a flash of insight rather than by trial-and-error learning. Also, social learning
theory (Bandura, 1977) suggests that humans can learn automatically through
observation rather than through personal experience.
The use of animal research in operant conditioning studies also raises the issue of
extrapolation. Some psychologists argue we cannot generalize from studies on animals to humans, as their anatomy and physiology are different, and they cannot think about their experiences and invoke reason, patience, memory or self-comfort.
Consider a lab rat: when he presses a blue button, he receives a food pellet as a reward, but when he presses the red button he receives a mild electric shock. As a result, he learns to press the blue button but avoid the red button.
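A toy action-value sketch of this example (the learning rule and rate are illustrative assumptions, not a claim about the underlying mechanism):

```python
# Reward raises the value of pressing blue; punishment lowers the value of
# pressing red; the rat comes to choose the higher-valued action.

values = {"blue": 0.0, "red": 0.0}
alpha = 0.5  # illustrative learning rate

def update(button, outcome):
    """outcome: +1 for a food pellet, -1 for a mild electric shock."""
    values[button] += alpha * (outcome - values[button])

for _ in range(5):
    update("blue", +1)  # food pellet follows blue presses
    update("red", -1)   # shock follows red presses

print(values, "-> the rat presses", max(values, key=values.get))  # blue
```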
But operant conditioning is not just something that takes place in experimental settings
while training lab animals; it also plays a powerful role in everyday learning.
Reinforcement and punishment take place almost every day in natural settings as well as
in more structured settings such as the classroom or therapy sessions.
Let's take a closer look at how operant conditioning was discovered, the impact it had on
psychology, and how it is used to change old behaviors and teach new ones.
The term operant conditioning was coined by behaviorist B.F. Skinner, which is why you may occasionally hear it referred to as Skinnerian conditioning. As a behaviorist, Skinner
believed that it was not really necessary to look at internal thoughts and motivations in
order to explain behavior. Instead, he suggested, we should look only at the external,
observable causes of human behavior.
Through the first part of the 20th century, behaviorism had become a major force within
psychology. The ideas of John B. Watson dominated this school of thought early on.
Watson focused on the principles of classical conditioning, once famously suggesting that
he could take any person regardless of their background and train them to be anything he
chose.
Where the early behaviorists had focused their interests on associative learning, Skinner
was more interested in how the consequences of people's actions influenced their
behavior.
Skinner used the term operant to refer to any "active behavior that operates upon the
environment to generate consequences." In other words, Skinner's theory explained how
we acquire the range of learned behaviors we exhibit each and every day.
His theory was heavily influenced by the work of psychologist Edward Thorndike, who
had proposed what he called the law of effect. According to this principle, actions that are
followed by desirable outcomes are more likely to be repeated while those followed by
undesirable outcomes are less likely to be repeated.
Operant conditioning relies on a fairly simple premise - actions that are followed by
reinforcement will be strengthened and more likely to occur again in the future. If you tell
a funny story in class and everybody laughs, you will probably be more likely to tell that
story again in the future. If you raise your hand to ask a question and your teacher praises
your polite behavior, you will be more likely to raise your hand the next time you have a
question or comment. Because the behavior was followed by reinforcement, or a
desirable outcome, the preceding actions are strengthened.
Types of Behaviors
While classical conditioning could account for respondent behaviors, Skinner realized
that it could not account for a great deal of learning. Instead, Skinner suggested that
operant conditioning held far greater importance.
Skinner invented various devices during his boyhood, and he put these skills to work during his studies on operant conditioning.
Reinforcement is any event that strengthens or increases the behavior it follows. There
are two kinds of reinforcers:
1. Positive reinforcers are favorable events or outcomes that are presented after the
behavior. In situations that reflect positive reinforcement, a response or behavior is
strengthened by the addition of something, such as praise or a direct reward. For
example, if you do a good job at work, your manager might give you a bonus.
2. Negative reinforcers involve the removal of unfavorable events or outcomes
after the display of a behavior. In these situations, a response is strengthened by
the removal of something considered unpleasant. For example, if your child starts
to scream in the middle of the grocery store, but stops once you hand him a treat,
you will be more likely to hand him a treat the next time he starts to scream. Your
action led to the removal of the unpleasant condition (the child screaming),
negatively reinforcing your behavior.
Reinforcement Schedules
Reinforcement is not necessarily a straightforward process and there are a number of
factors that can influence how quickly and how well new things are learned. Skinner
found that when and how often behaviors were reinforced played a role in the speed and
strength of acquisition. In other words, the timing and frequency of reinforcement
influenced how new behaviors were learned and how old behaviors were modified.
We can find examples of operant conditioning at work all around us. Consider the case of
children completing homework to earn a reward from a parent or teacher, or employees
finishing projects to receive praise or promotions.
Some more examples of operant conditioning in action:
If your child acts out during a shopping trip, you might give him a treat to get him
to be quiet. Because you have positively reinforced the misbehavior, he will
probably be more likely to act out again in the future in order to receive another
treat.
After performing in a community theater play, you receive applause from the
audience. This acts as a positive reinforcer inspiring you to try out for more
performance roles.
You train your dog to fetch by offering him praise and a pat on the head whenever
he performs the behavior correctly.
A professor tells students that if they have perfect attendance all semester, then
they do not have to take the final comprehensive exam. By removing an
unpleasant stimulus (the final test) students are negatively reinforced to attend
class regularly.
If you fail to hand in a project on time, your boss becomes angry and berates your
performance in front of your co-workers. This acts as a positive punisher making it
less likely that you will finish projects late in the future.
A teen girl does not clean up her room as she was asked, so her parents take away
her phone for the rest of the day. This is an example of a negative punishment in
which a positive stimulus is taken away.
While behaviorism may have lost much of the dominance it held during the early part of the 20th century, operant conditioning remains an important and often utilized tool in the
learning and behavior modification process. Sometimes natural consequences lead to
changes in our behavior. In other instances, rewards and punishments may be consciously
doled out in order to create a change.
Operant conditioning is something you may immediately recognize in your own life,
whether it is in your approach to teaching your children good behavior or in training the
family dog to stop chewing on your favorite slippers. The important thing to remember is
that with any type of learning, it can sometimes take time. Consider the type of
reinforcement or punishment that may work best for your unique situation and assess
which type of reinforcement schedule might lead to the best results.
Building cognitive learning skills teaches students how to learn more effectively. Students learn to do more than repeat what they have learned; they develop a deeper understanding of the material.
Memory
Cognitive learning promotes a deeper understanding of a subject, which improves recall in the long run.
Application
The cognitive learning approach gives students the chance to reflect on what they are learning and how it applies to other material. This helps students develop the problem-solving skills they need to create new connections between what they are learning, and allows them to explore the material and develop a deeper understanding. It also teaches students to make connections and apply new concepts and ideas.
Improves Confidence
With a deeper understanding of topics and stronger learning skills, students can approach their schoolwork with confidence.
Lifelong Learning
Giving students the chance to actively engage in learning makes it fun and exciting. This helps students develop a lifelong love for learning outside of the classroom.
Cognitive Learning Theory implies that the different processes concerning learning can be explained
by analyzing the mental processes first. It posits that with effective cognitive processes, learning is
easier and new information can be stored in the memory for a long time. On the other hand,
ineffective cognitive processes result in learning difficulties that can be seen at any time during the lifetime of an individual.
A. Social Cognitive Theory
Social Cognitive Theory identifies three variables:
behavioral factors
environmental factors (extrinsic)
personal factors (intrinsic)
These 3 variables in Social Cognitive Theory are said to be interrelated with each other, causing
learning to occur. An individual’s personal experience can converge with the behavioral
determinants and the environmental factors.
In the person-environment interaction, human beliefs, ideas and cognitive competencies are
modified by external factors such as a supportive parent, stressful environment or a hot climate. In
the person-behavior interaction, the cognitive processes of a person affect his behavior; likewise,
performance of such behavior can modify the way he thinks. Lastly, in the environment-behavior interaction, external factors can alter the way you display the behavior. Also, your behavior can
affect and modify your environment. This model clearly implies that for effective and positive learning
to occur an individual should have positive personal characteristics, exhibit appropriate behavior and
stay in a supportive environment.
In addition, Social Cognitive Theory states that new experiences are to be evaluated by the learner
by means of analyzing his past experiences with the same determinants. Learning, therefore, is a
result of a thorough evaluation of the present experience versus the past.
Basic Concepts
Social Cognitive Theory includes several basic concepts that can manifest not only in adults but also
in infants, children and adolescents.
1. Observational Learning
Learning from other people by means of observing them is an effective way of gaining knowledge and altering behavior.
2. Reproduction
The process of increasing the repetition of a behavior by putting the individual in a comfortable environment with readily accessible materials, motivating him to retain the new knowledge and behavior learned and to practice them.
3. Self-efficacy
The course wherein the learner improves his newly learned knowledge or behavior by putting it into practice.
4. Emotional coping
Good coping mechanisms against a stressful environment and negative personal characteristics can lead to effective learning, especially in adults.
5. Self-regulatory capability
The ability to control behavior even within an unfavorable environment.
B. Cognitive Behavioral Theory
Cognitive Behavioral Theory describes the role of cognition (knowing) in determining and predicting the behavioral pattern of an individual. This theory was developed by Aaron Beck.
The Cognitive Behavioral Theory says that individuals tend to form self-concepts that affect the
behavior they display. These concepts can be positive or negative and can be affected by a person’s
environment.
Cognitive Behavioral Theory further explains human behavior and learning using the cognitive triad. This triad includes negative thoughts about:
the self
the world/environment
the future
Types of learning
1. Implicit learning
Implicit learning is a kind of “blind” learning, as you acquire
information without realizing what you’re learning.
The main characteristic of this type of cognitive learning is that it is
unintentional. The learner doesn’t set out to learn the information
and the learning results from an automatic motor behavior.
Certain activities require an unintentional type of learning, like
walking or talking. There are more things than you might think that
you learned implicitly, without even realizing that you were
learning.
2. Explicit learning
Explicit learning is characterized by the intention of learning and
consciously setting out to learn. There are many examples of this
type of cognitive learning, like reading this article to learn about
explicit learning, as the intention is to learn about the topic.
Explicit learning is an intentional act that requires sustained
attention and an effort to learn.
3. Meaningful learning
This type of cognitive learning uses cognitive, emotional, and
motivational dimensions. The person organizes and connects their personal experiences with their learning, relating their experiences
to the information or idea that they are learning. This means that
the new concept will be unique to each individual person, as each
person will have their own history and experiences.
4. Associative learning
If you’ve ever heard of Pavlov’s dogs, you know this kind of
cognitive learning. Associative learning is defined as the association
between a determined stimulus and a precise behavior. In the case
of Pavlov’s dogs, the sound of the bell saying that it was time to eat
was translated into the dogs salivating and anticipating eating
whenever they heard the bell.
5. Discovery learning
When you actively search for information and go out of your way to
learn, you're learning through discovery. This type of cognitive learning occurs when an individual takes an interest, learns, relates concepts, and adapts them to their existing cognitive schema.
6. Emotional learning
This type of learning involves the individual’s emotional
development and is what helps develop emotional intelligence,
which is what manages and controls emotions.
Emotions also play an important role in learning, which we will talk
about later.
Learning Windows
If we’re talking about learning, we have to mention learning
windows.
Brain-based learning defends the idea that there are certain
“windows” where you are in your prime to learn. It refers to critical
periods that favor one type of learning over another, says Francisco
Mora.
While it is possible to learn to talk at any age, our bodies are
primed to learn between the ages of 0-3, which makes it the
optimal time to learn to talk. Even though it’s possible to learn later
in life, it’s more difficult and it’s possible that the results won’t be
the same.
One of the pioneers of educational psychology, E.L. Thorndike formulated three laws of
learning in the early 20th century. [Figure 2-7] These laws are universally accepted and
apply to all kinds of learning: the law of readiness, the law of exercise, and the law of
effect. Since Thorndike set down his laws, three more have been added: the law of
primacy, the law of intensity, and the law of recency.
Instructors can take two steps to keep their students in a state of readiness to learn.
First, instructors should communicate a clear set of learning objectives to the student
and relate each new topic to those objectives. Second, instructors should introduce
topics in a logical order and leave students with a need to learn the next topic. The
development and use of a well-designed curriculum accomplish this goal.
Readiness to learn also involves what is called the “teachable moment” or a moment of
educational opportunity when a person is particularly responsive to being taught
something. One of the most important skills to develop as an instructor is the ability to
recognize and capitalize on “teachable moments” in aviation training. An instructor can
find or create teachable moments in any flight training activity: pattern work, air work in the local practice area, cross-country flights, flight reviews, or instrument proficiency checks.
For example, while on final approach several deer cross the runway. Bill capitalizes on
this teachable moment to stress the importance of always being ready to perform a go-
around.
Effect
All learning involves the formation of connections, and connections are strengthened or weakened according to the law of effect. Responses to a situation that are followed by satisfaction are strengthened; responses followed by discomfort are weakened. Thus, learning is strengthened
when accompanied by a pleasant or satisfying feeling, and weakened when associated
with an unpleasant feeling. Experiences that produce feelings of defeat, frustration,
anger, confusion, or futility are unpleasant for the student. For example, if Bill teaches
landings to Beverly during the first flight, she is likely to feel inferior and be frustrated,
which weakens the learning connection.
The learner needs to have success in order to have more success in the future. It is
important for the instructor to create situations designed to promote success. Positive
training experiences are more apt to lead to success and motivate the learner, while
negative training experiences might stimulate forgetfulness or avoidance. When
presented correctly, scenario-based training (SBT) provides immediate positive experiences in terms of real-world applications.
Exercise
Connections are strengthened with practice and weakened when practice is
discontinued, which reflects the adage “use it or lose it.” The learner needs to practice
what has been learned in order to understand and remember the learning. Practice
strengthens the learning connection; disuse weakens it. Exercise is most meaningful
and effective when a skill is learned within the context of a real world application.
Primacy
Primacy, the state of being first, often creates a strong, almost unshakable impression
and underlies the reason an instructor must teach correctly the first time and the student
must learn correctly the first time. For example, a maintenance student learns a faulty
riveting technique. Now the instructor must correct the bad habit and reteach the correct
technique. Relearning is more difficult than initial learning.
Also, if the task is learned in isolation, it is not initially applied to the overall
performance, or if it must be relearned, the process can be confusing and time
consuming. The first experience should be positive, functional, and lay the foundation
for all that is to follow.
Intensity
Immediate, exciting, or dramatic learning connected to a real situation teaches a learner
more than a routine or boring experience. Real world applications (scenarios) that
integrate procedures and tasks the learner is capable of learning make a vivid impression, and he or she is less likely to forget the experience. For example, using
realistic scenarios has been shown to be effective in the development of proficiency in
flight maneuvers, tasks, and single-pilot resource management (SRM) skills.
Recency
The principle of recency states that things most recently learned are best remembered.
Conversely, the further a learner is removed in time from a new fact or understanding,
the more difficult it is to remember. For example, it is easy for a learner to recall a
torque value used a few minutes earlier, but it is more difficult or even impossible to
remember an unfamiliar one used a week earlier.
Instructors recognize the principle of recency when they carefully plan a summary for a
ground school lesson, a shop period, or a postflight critique. The instructor repeats,
restates, or reemphasizes important points at the end of a lesson to help the learner
remember them. The principle of recency often determines the sequence of lectures
within a course of instruction.
In SBT, the closer the training or learning time is to the time of the actual scenario, the
more apt the learner is to perform successfully. This law is most effectively addressed
by making the training experience as much like the scenario as possible.
Laws of Learning: Primary and
Secondary | Psychology
(i) Law of Use:
'When a modifiable connection is made between a situation and a response, that connection's strength is, other things being equal, increased.'
(ii) Law of Disuse:
'When a modifiable connection is not made between a situation and a response over a length of time, that connection's strength, other things being equal, decreases.'
In brief, we may say that repetition and drill help learning, and their absence causes forgetfulness. We also believe in the common proverb, 'practice makes a man perfect'. Drill is based on the principle that
repetition fixes the facts to be learnt. That is the reason why the pupils
have to repeat arithmetical tables, formulae, spelling lists and
definitions in order to establish these.
In all skill lessons, say handwriting, dance, music, craft, and drawing, repetition is necessary. Lack of practice or exercise causes the memory of the learned material to weaken, which causes forgetfulness. We forget because subsequent experiences tend to rule out what has been learnt.
Educational Implications:
(i) We should have constant practice in what has once been learnt.
(ii) Much time should not elapse between one practice and the subsequent one. Delayed use or long disuse may cause forgetfulness.
Law of Effect:
Thorndike defines it as follows:
“When a modifiable connection between a situation and
response is made and is accompanied or followed by a
satisfying state of affairs that connection’s strength is
increased, but when made and accompanied by an annoying
state of affairs its strength is decreased”.
In simpler words, it means that a response which achieves the goal and thus provides satisfaction will be stamped in, while those accompanied by dissatisfaction will be stamped out. In short, the feeling or emotional state affects learning.
Educational Implications:
(i) As failure is accompanied by a discouraging emotional state, it should be avoided. The evaluation system should be so modified that nobody is called 'a failure'. A student may pass in 4 subjects out of 7.
He should be given a certificate to that effect, and encouraged to
appear again in the other three subjects.
(ii) Reward and recognition play a great role in encouraging the pupil.
Due recognition should be given to good achievement, so that the
pupil is cheered up to march forward.
(iii) Memory is also directly related to this law. Pleasant things are remembered better than unpleasant things. What interests us most, what is vital for us, and what gives us great satisfaction is remembered the most. The pupil forgets the home-task because it is an unpleasant job for him.
Law of Readiness:
'When a person feels ready to act or to learn, he acts or learns more effectively and with greater satisfaction than when not ready.' Before actual learning, one must be mentally prepared; one's mind must be mentally set.
Educational Implications:
(i) Readiness means desire to do a job. In the absence of desire
learning cannot be effective. Hence the teacher must arouse the
interest or readiness of the pupils. In teaching any topic, he must tap
their previous knowledge, arouse interest for the new topic through
suitable questions, and then announce the aim of the new lesson. So 'motivation' is one of the important steps in lesson-planning.
(ii) Curiosity is essential for learning. Hence the teacher should
arouse curiosity for learning, so that the pupils feel ready to imbibe the
new experiences. Some teachers do not prepare their pupils
psychologically for their lessons. They dole out the knowledge they
possess in a mechanical way. The teacher should, before taking up the
new lesson arouse interest and curiosity by making the problems real
and concrete. Abstract elements not connected with real-life situations should be avoided.
Secondary or Subordinate Laws of Learning:
Thorndike gave the following Secondary laws also:
1. Law of Primacy.
2. Law of Recency.
1. Law of Primacy:
'Learning that takes place in the beginning is the best and most lasting.'
Usually we say, first impression is the best. Hence the pupils should
make the right start, and be most serious even from the first day. The
learning on the first day is most vivid and strong. The teacher also
should be most serious on the first day of teaching. He must impress
his students on the very first day.
2. Law of Recency:
‘Recent acts are lasting’. We remember those things better which are
recent. Hence a pupil should revise his entire course just before the
examination. Without revision, he is apt to forget even the best
assimilated matter. The revision just before the examination helps
him.
Teaching set
Civil Secret Teachings 1.11 (11)
Observed lessons
Reward and punishment send signals not just to the person but to all who know about the action. If more know, then more will respond to the signals.
Reward indicates trust.
Punishment indicates a need for certainty.
The rewards and punishments given by the ruler are far more significant than those given
by lesser people.
Discussion
Reward and punishment are two sides of the same coin. They are both forms of extrinsic motivation, which can be rather pernicious in that it appears to work at the time.
In conditioning, punishment stops action while reward encourages it. Yet many use
punishment with the intent of persuading people what they should do. This is one reason why
punishment can be ineffective. It can also cause reaction or other forms of coping that easily become dysfunctional.
Intrinsic motivation, on the other hand, seeks to build deep personal motivation through
inspiration and other more difficult forms of motivating people. The main problem for many
leaders is that intrinsic motivation is harder, requiring more time and skill. Yet done well it is
far more powerful.
In general, the Civil Secret Teachings puts far more emphasis on intrinsic motivation, which
illustrates the maturity of the author, even though it was written many centuries ago.
Introduction
Reward and punishment are potent modulators of human and animal behavior (Thorndike,
1911; Pavlov, 1927; Skinner, 1938; Sutton and Barto, 1998). However, despite the great increase
in knowledge in the past two decades of the neural basis of the reward effect (Schultz, 2002), and
that of punishment to a lesser extent, we lack clear data about how reward and punishment
influence the learning of specific behaviors, apart from those in classical and instrumental
conditioning, and how this might be mediated at a neural level (Delgado, 2007). To address this
issue, we focused on procedural learning, a distinct learning behavior that is the foundation of
many of our motor skills and perhaps other functions such as cognitive, category and perceptual
learning (Squire, 2004). The fact that procedural learning is thought to be largely dependent on the basal ganglia (Willingham et al., 2002), which also mediate the effect of reward and punishment (Schultz, 2002), makes it an ideal behavior to study.
To test the influence of reward and punishment on procedural learning, we used a modified
version of the serial reaction time (SRT) task (Nissen and Bullemer, 1987), a simple and robust
experimental probe (Robertson, 2007), during which continuous modulation of motor output was
required and reinforcement and non-reinforcement learning were dissociated. We opted to use
monetary reward (and punishment) because it is a strong modulator of human behavior and has
clear effects on brain activity (Breiter et al., 2001; Delgado et al., 2003). As in the original SRT
task, subjects pressed one of four buttons with the right hand when instructed by a visual display.
Trials were presented in blocks in which the lights were illuminated either randomly or,
unbeknownst to the subject, on the basis of a 12-element repeating sequence. All subjects first
performed several blocks of random trials to minimize the effects of learning the general
visuomotor behavior on later blocks, and to establish an individual criterion response time (cRT)
on which subsequent reward and punishment would be based. Subjects were then randomly
assigned to reward, punishment, and control groups. Because we used reaction time (RT) as an
index of learning, subjects were rewarded for faster responses, which were easiest to
generate through learning the repeating sequence. To control for a potential distractor effect of
the incentives on the learning process, several additional blocks without reward or punishment
were presented after the trials with incentives. The blocks were presented in the order outlined
in Figure 1A-B. In the current report we describe the behavioral impact of reward and punishment using a large cohort of subjects, and the neural correlates of this behavior in a smaller number of different subjects on whom functional imaging was also performed.
Figure 1
Behavioral results: (A) Time course of response time gains and, (B), error rates in subject groups during
task performance. The mean (± SEM) response time gain for each block, relative to individual subjects'
cRT, in the reward [yellow (n=21)], punishment [blue (n=24)] and control [black (n=19)] groups. The
gray hatching indicates the blocks during which incentives were used. Random and sequence blocks were presented either with (R and S) or without (r and s) monetary incentives. (C) The absolute
gain in response time due to learning in subject groups. The mean (± SEM) of the absolute response gain
in the transfer portion of the task (blocks 12-15) was derived by comparing RT in the sequence block
(block 14) to the mean of the adjacent blocks (blocks 12,13, and 16) for each group. The difference
between the reward and the other two groups was significant (p =0.003). Error bars indicate standard
error of mean.
Subjects
In the behavioral study we tested 91 healthy human subjects (69 female; mean age 21.7 ± 3.5 years); in the functional imaging component, we studied 41 right-handed subjects (22 female; mean age 21 ± 2.58 years). All subjects were of similar socioeconomic backgrounds recruited from the
University community, and gave informed consent to participate in the study which was
approved by the local Institutional Review Board.
Behavioral task
The task (Fig. 1) was a modification of the original SRTT of Nissen and Bullemer (Nissen and
Bullemer, 1987). Subjects were presented with four visual stimuli arranged horizontally. Each
stimulus was associated with one of the four fingers of the right hand; illumination of the visual
stimulus indicated the finger to be moved. Subjects used a key-press device to respond as quickly
as possible when instructed by the visual display. Stimuli were presented in blocks of 96 trials in
either a pseudorandom or repeating 8 × 12-element sequence with an 800 ms inter-stimulus
interval. In order to minimize awareness, the sequence was constrained (Willingham et al.,
2000). In contrast to the original SRT task, random and sequence blocks were presented either with (R-blocks and S-blocks) or without (r-blocks and s-blocks) monetary incentives.
For the purposes of the current experiment, we define reward and punishment in terms of
positive and negative monetary incentives. Of the 91 original subjects, 25 subjects were excluded
because they developed explicit knowledge of the sequence (see below), and a further two subjects were eliminated because they did not perform the test properly, leaving a cohort of 64 subjects. At the beginning of the experiment, subjects performed four practice r-blocks; we then
calculated a criterion response time (cRT) for each subject based on their median RT in the last
of the four blocks. Subjects were then randomized into reward (n=21), punishment (n=24) and
control (n=19) groups. It is important to note that, since learning occurs implicitly in the SRT task, reward or punishment could not be applied directly in this paradigm. However, learning in the SRT task leads to faster reaction times, and this is usually used as a behavioural index of learning in this task. We therefore rewarded or punished subjects according to the change (increase/decrease) in their individual reaction time. Those in the reward group were informed that they would be rewarded (+4 cents) for each trial on which their RT was less than the cRT; the penalty group were penalized (−4 cents) if their RT was greater than their cRT. Rewarded subjects started with $0, while the punished group were given $38. Because of normalization of
incentives to base performance both rewarded and punished subjects ended up with an average of
$21-22, while the control subjects received a fixed payment of $23. The incentive schedule also
controlled for motivation between the two groups. Subjects received ongoing feedback of their
performance both through an incrementing (or decrementing) counter displaying their current
monetary position displayed on the screen and also by the color of the visual stimuli (green and
red stimuli indicated that the RT on the preceding trial was less or greater than the cRT,
respectively). Since learning of the sequence enabled subjects to reduce their RT significantly
compared to the cRT, those in the reward group were rewarded for learning, while the penalty
group were punished if they did not learn. Control subjects were presented with an equal number
of red and green stimuli and told that the color was of no significance. To control for the
potential distractor effect of the counter in the incentive groups, the controls saw an identical
tally that kept a count of either red or green lights. After the initial practice session (four
consecutive r-blocks) the full experiment comprised 15 blocks in the following order: r-R-R-R-
R-S-S-S-S-R-S-r-r-s-r (R = random blocks; S = sequence blocks; upper case, blocks with
incentives in rewarded and punished groups; lower case, blocks without incentives). The color of
the feedback visual stimuli was counter-balanced across subjects (Fig. S1). Blocks were
separated by 30 second breaks. Of the 64 subjects that are the basis of the behavioral component
of the current report, the first 27 subjects performed only the first 11 blocks since the remaining
4 blocks were added later to measure the knowledge transfer in the absence of incentives in the
different groups.
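The per-trial incentive rule described above can be sketched as follows; the function and variable names are hypothetical, since the paper specifies only the ±4-cent contingency on RT relative to the cRT and the green/red color feedback.

```python
# Sketch of one trial's incentive and feedback logic (names are hypothetical).

def trial_feedback(rt_ms, crt_ms, group, balance_cents):
    """Return the updated balance and the feedback color shown to the subject."""
    faster = rt_ms < crt_ms
    if group == "reward" and faster:
        balance_cents += 4   # +4 cents when RT beats the criterion
    elif group == "punishment" and not faster:
        balance_cents -= 4   # -4 cents when RT misses the criterion
    color = "green" if faster else "red"  # color signals RT vs. cRT
    return balance_cents, color

# Controls saw equal numbers of red and green stimuli and were told the
# color had no significance, so no balance update would apply to them.
balance, color = trial_feedback(rt_ms=410, crt_ms=450, group="reward", balance_cents=0)
print(balance, color)  # 4 green
```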
A different group of 41 subjects was used for the imaging experiment, of whom 32 (reward, n=11; punished, n=13; control, n=17) learned the sequence implicitly. The behavioral paradigm was identical to that of the behavioral experiment,
with the exception of the monetary incentives, which were adjusted as follows: the reward group started with $0 and were rewarded with 10 cents for every correct response; in the punished group, 20 cents were deducted for each incorrect response from a base of $95; the average final
payment for the rewarded and punished groups was $55, which was also the fixed payment given
to the control group.
Explicit Learning
Immediately after performing both the behavioral and the imaging experiments, all subjects were
tested for explicit knowledge of the sequence. The participants were told that the stimuli might
have appeared in a repeating sequence and were asked to reproduce the sequence in a free recall
task. They were allowed to use the key-press device if they thought it might be helpful. Subjects
who could not recall anything were encouraged to guess. If a participant could reproduce a
correct string of more than 4 consecutive items of the sequence, learning was considered to be
explicit and the subject excluded from the analysis. We used a relatively conservative measure of
explicit knowledge (Willingham and Goedert-Eschmann, 1999) to ensure that the effects
described were attributable primarily to procedural knowledge. To limit the number of subjects excluded from the fMRI analysis because of explicit learning, we used a slightly less conservative measure, whereby learning was considered explicit only if a subject could reproduce a correct string of more than 5 consecutive items of the sequence.
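One way to operationalize this criterion is to search the recalled string for its longest run of consecutively correct items; the sketch below is our illustration, and the treatment of the repeating sequence as cyclic is our assumption, not stated in the text:

    def longest_correct_run(recall, sequence):
        """Length of the longest stretch of `recall` matching consecutive
        items of the (cyclically repeating) `sequence`."""
        doubled = sequence + sequence  # allows runs that wrap around the cycle
        best = 0
        for start in range(len(recall)):
            for offset in range(len(sequence)):
                length = 0
                while (start + length < len(recall)
                       and offset + length < len(doubled)
                       and recall[start + length] == doubled[offset + length]):
                    length += 1
                best = max(best, length)
        return best

    def is_explicit(recall, sequence, threshold=4):
        """threshold=4 for the behavioral experiment, 5 for the fMRI analysis."""
        return longest_correct_run(recall, sequence) > threshold

    seq = [1, 3, 2, 4, 3, 1, 4, 2]            # hypothetical 8-item sequence
    print(is_explicit([2, 4, 3, 1, 4], seq))  # True: 5 consecutive correct items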
Imaging parameters
Image acquisition was done with a 3 Tesla Whole Body MR System (MAGNETOM Trio,
Siemens Medical Systems, Erlangen, Germany) using a Siemens head coil. A high-resolution (1 mm isotropic voxel) T1-weighted 3D-FLASH sagittal sequence (160 slices per slab, FOV 256×256 mm, TR=20 ms, TE=4.7 ms, flip angle=22°, slice thickness=1 mm) was first
acquired to enable localization of functional images. Thereafter whole brain fMRI was
performed using an echo-planar imaging (EPI) sequence measuring blood oxygenation level
dependent (BOLD) signal. A total of 30 functional slices per volume were acquired for all subjects in all runs. The slices were 3 mm thick and were acquired in the transverse plane (matrix size 64×64, FOV 192×192 mm) with a 33% gap. A complete scan of the whole brain was acquired in 2560 ms (TR), with flip angle=80° and TE=30 ms; a total of 611 volumes were acquired over the whole experiment.
Statistical Analyses
Behavioral data
Statistical analyses were done using SPSS 11.0. Response time gains in individual subjects were calculated by subtracting the RT of each trial from the cRT for that subject. Since RT gains included negative values and were not normally distributed, we added 1000 to each value and applied a logarithmic transformation before any statistical calculations. However, when differences in RT or RT gains are stated or displayed in graphs, the original, non-transformed scale is used.
Generalized linear models (GLMs) were used for comparisons of the different conditions among groups. The threshold for significance was 0.05, adjusted for multiple comparisons where necessary. To determine the extent of learning in each of the three behavioral groups (reward, punishment, control), we used a custom orthogonal contrast between the sequence block (block 14) and the surrounding random blocks (blocks 13 and 15), with contrast weights ½, −1, and ½ for blocks 13, 14, and 15, respectively.
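For illustration, a brief sketch of the transformation and contrast just described, using hypothetical numbers (the actual analyses were run in SPSS):

    import numpy as np

    def transformed_gains(trial_rts, crt):
        """RT gain per trial (cRT - RT), shifted by 1000 and log-transformed."""
        gains = crt - np.asarray(trial_rts, dtype=float)
        return np.log(gains + 1000.0)

    # Learning contrast over blocks 13 (random), 14 (sequence), 15 (random):
    weights = np.array([0.5, -1.0, 0.5])
    block_means = np.array([520.0, 470.0, 515.0])  # hypothetical mean RTs (ms)
    contrast = weights @ block_means
    print(contrast)  # positive: the sequence block is faster, i.e. learning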
fMRI data
GLMs were used for voxel-wise data analysis. A block-design protocol was used to demonstrate the main effect of procedural learning across all subjects; specifically, we regrouped the 15 blocks into four regressors [random and sequence, with (R, S) and without (r, s) incentives] plus the baseline in the rewarded and punished subject groups. We performed an additional block-based analysis to address the neural basis of the ‘performance effect’ using a GLM that included each block (15 in total) as an independent regressor and contrasted the first non-incentive block (r; block 1) with the first incentive block (R; block 2). An event-related protocol was used to assess the influence of incentives by testing the difference between trial types (rewarded, punished, control) and differences within types relating to the presence or absence of incentive. The event-related analysis was performed for blocks 2–11, and each behavioral event (800 ms) was classified according to our experimental manipulation (reward, punishment, or control) and the performance of the subject (green or red lights). These predictors were entered as fixed factors in a mixed GLM, with subjects as random factors (to control for possible differences among subjects).
The statistical parameters of these models were computed voxel-wise for the entire brain, and activation maps were computed for various contrasts between the predictors. The criteria for the activation maps were a minimum cluster size of 10 adjacent voxels (raised to 100 adjacent voxels to demonstrate the performance effect) and a statistical threshold for the cluster of p<0.001. The output of these models was selectively used for region-of-interest analyses of both the learning and performance effects using SPSS.
We performed two additional voxel-wise analyses. The first (Table S3) was to determine the brain areas related to the enhanced learning in the reward group (with the punished group as control). We first applied the main effects of the block-design model as a mask (uncorrected p<0.05, cluster volume > 108 mm³), within which we tested the interaction between Learning and Group (t=3.174; p<0.005; cluster volume > 108 mm³); Group was defined as reward/punishment, and Learning as sequence (S blocks)/baseline. The second analysis (Table S5) was to detect the functional activation associated with the ‘performance effect’ in the punished group. A GLM was built that included each block (15 in total) as an independent regressor. The normalized cRT (criterion response time; mean 0, SD 1, for the first 2 blocks only) was included in this model to determine the neural basis of the ‘performance effect’.
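For readers unfamiliar with block-design GLMs, the sketch below illustrates how regressors of this kind are typically constructed (a boxcar convolved with a canonical hemodynamic response function). The TR is taken from the acquisition parameters above, but the onsets, names, and HRF parameterization are our assumptions, not the authors' actual pipeline:

    import numpy as np
    from scipy.stats import gamma

    TR = 2.56  # seconds, from the acquisition parameters above

    def hrf(t):
        """Canonical double-gamma hemodynamic response function."""
        peak = gamma.pdf(t, 6)         # positive response, peaking around 5 s
        undershoot = gamma.pdf(t, 16)  # undershoot, peaking around 15 s
        return peak - undershoot / 6.0

    def block_regressor(onsets, duration, n_vols):
        """Boxcar for one block type (onsets and duration in volumes),
        convolved with the HRF and trimmed to the run length."""
        boxcar = np.zeros(n_vols)
        for onset in onsets:
            boxcar[onset:onset + duration] = 1.0
        kernel = hrf(np.arange(0.0, 30.0, TR))
        return np.convolve(boxcar, kernel)[:n_vols]

    # e.g. a regressor for the incentivized sequence (S) blocks in a
    # 611-volume run, with hypothetical onsets:
    reg_S = block_regressor(onsets=[200, 240, 280, 320], duration=30, n_vols=611)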
Results
As mentioned above, we first present the data from a behavioral study on a large cohort of
subjects followed by the data from functional imaging on a different but smaller group
performing the same task.
Figure 2
Effect of reward and punishment on brain activity. (A,B) Event-related averaging in the rewarded group (n=11): when rewarded trials (green stimuli) were compared to trials without reward (red stimuli), there was significant (p<0.001) activation in the striatum on both sides, as well as in the nucleus accumbens on the left. (C) Trials with punishment, compared to trials without, in the punished group (n=13) resulted in a decreased BOLD signal in the striatum bilaterally (activation above baseline during neutral stimuli) and an increase in the insula bilaterally. There was no activation in these regions when comparing red and green stimuli (both without incentives) in the controls (n=17).
We next determined the areas responsible for procedural learning in all subjects (across groups)
by comparing the activity during sequence blocks to the activity during random blocks. This
analysis showed a significant increase in BOLD signal during the learning phase in the putamen
bilaterally (Fig. 3A,B) as well as in the pontine gray matter and in the dentate nucleus on the
ipsilateral side (Table S2) indicating that these areas are implicated in the procedural learning
process. In order to evaluate the effect of incentives more specifically in these areas important
for procedural learning, we plotted (Fig. 3C), for each implicit learning group separately, the
changes in the BOLD signal in the putamen (left and right combined) during the following three
trial types: (i) trials in which no incentive was possible (named ‘neutral’ during r & s blocks), (ii)
incentive trials in R & S blocks (green trials for rewarded and red for punished group), and (iii)
non-incentive trials during R & S blocks (red trials for rewarded and green for punished group).
We then analyzed the BOLD signal in these ROIs using a GLM model with group (control,
rewarded, punished) and color of the stimulus (neutral, green, red) in the current trial as
independent factors. Pair-wise comparisons between various groups were assessed using Sidak
tests, adjusting for the number of multiple comparisons. We found a significant main effect of group [F(2,1016)=32.59, p<0.001, MSE=0.093], a significant main effect of stimulus color [F(2,1016)=18.17, p<0.001, MSE=0.093], and a significant interaction [F(2,1016)=4.41, p<0.001, MSE=0.093] between the independent factors. Sidak tests for pair-wise
comparisons showed that, overall, the BOLD signal change in the putamen was different in all
three groups: it was the highest for the rewarded group, followed by controls and then by the
punished group (all differences significant at p<0.05, adjusted for multiple comparisons). In the
reward group, there was a significant increase in BOLD signal (Sidak test significant at p<0.001)
for trials that were rewarded compared to non-rewarded trials or those in which no reward
(incentive) was possible. In the punished group, the BOLD signal change during incentive trials
(i.e. punishment; red stimuli) was not different than neutral trials (Sidak test: p=0.386), but it was
significantly less (Sidak test: p<0.005) than that during non-incentive trials (green trials). It can
be seen that the colour of the stimulus had no effect on striatal activation in the control group.
Furthermore, there was no significant difference in the BOLD signal during non-incentive trials
between control and rewarded groups (red trials) or between control and punished groups (green
trials), indicating a stable baseline activation when executing the task without incentives. We
further confirmed the findings of this ROI approach using a voxel-based analysis of the
interaction between Learning and Group (see Table S3). The voxel-wise analysis also showed an interaction effect in the middle frontal gyrus, in an area consistent with the location of the dorsal premotor cortex, which suggests that the effect we documented in the basal ganglia is transmitted to the cortex, thus facilitating changes in motor output.
Figure 3
Activation in areas critical for procedural learning. (A,B) A comparison of sequence blocks with random
blocks over all groups in all subjects with implicit learning (n=41) showed a significant (p < 0.001,
uncorrected) increase in activation during motor learning in the corpus striatum (putamen and globus
pallidus) bilaterally. (C) More specific analysis of the striatal region in each group during implicit
learning showed that in the rewarded group there was a significant (p < 0.001; Sidak tests adjusted for
multiple comparisons) change in BOLD signal during trials in which subjects were rewarded (purple bar)
compared to non-rewarded trials (grey bar) or trials in which no reward could be obtained (black bar). In
contrast, the BOLD signal in the punishment group during punished trials (purple bar) and non-punished
trials (grey bar) did not significantly diverge from that during trials in blocks where no punishment could
be obtained (black bar). The difference between trials with punishment and trials without punishment is, however, significant (p < 0.005; Sidak tests adjusted for multiple comparisons). The trials in the control
group show that there was no effect of stimulus color (square on each bar). The color of the visual
stimulus for particular trial types is indicated at the midpoint of each bar. Error bars show the 95%
confidence interval.
Although in the behavioral experiment punishment showed no effect on the process of learning, it led to an immediate performance effect, expressed as a decrease in reaction time following the introduction of punishment (Fig. 1A) during the random blocks. To examine the
neural substrate of this performance effect, we compared the BOLD response in the last block of
random stimuli without incentives to the first block with incentives (Table S4). In the punished group, this contrast showed an increase in activation in the insula, the inferior frontal gyrus, and the hippocampus, in addition to other areas (Fig. 4). Among the areas activated in this contrast,
only the change in activation in the insula was correlated (Pearson r=−0.54; p<0.005; n=26) with
the performance change in punished subjects. Although the rewarded subjects also showed a performance effect, this did not correlate with changes in insula activation (Pearson r=0.001; p=0.997; n=22). We confirmed the performance-related activation of the insula in a separate voxel-wise analysis (see Table S5).
Figure 4
Neural substrate of change in performance. (A) Activation in right insula, in punished subjects only,
following the contrast between Block 2 (R) random, with incentives (green and red stimuli combined),
and Block 1 random (r), without incentives. Statistical threshold was t(12) ≥ 5.7 with a minimum cluster
size of 100 voxels (1×1×1 mm resolution). (B,C) The correlation between changes in BOLD signal and
reaction time relative to the cRT for Blocks 1 and 2 combined for the rewarded (B) and the punished (C)
groups. In each panel, each subject has a pair of points, one from Block 1 and one from Block 2.
Discussion
Our results point to fundamental differences between the effects of reward and punishment on
behavior and to their quite distinct neural substrates. In addition, we extend the relevance of
reward-based learning in the basal ganglia to the learning of habits (Packard and Knowlton,
2002) and skills and sequences (Houk et al., 1995; Berns and Sejnowski, 1998; Suri and Schultz,
1998).
The first issue to consider is whether reward and punishment have similar effects on behavior.
We designed the opponent behaviors in our experiment as they might be constructed in everyday life. Our objective was to encourage fast responses to the visual stimuli; therefore, we
rewarded the desired behavior in one group and punished the undesired behavior in another.
Both manipulations had a measurable effect on behavior and deciding which is preferable
depends on whether one is interested in short term changes in performance without enhancement
of learning or longer-term changes in learning itself. Only the reward group showed enhanced implicit learning of the motor sequence; though the punished group also learned, they did no better than the control subjects. The lack of an effect of punishment on learning, though
there were clear effects on other aspects of behavior (see Fig.1), was surprising and would not
necessarily have been predicted on the basis of the reinforcement learning literature. B.F.
Skinner regarded punishment as a ‘questionable technique’, speculated as to whether it actually
worked, and stressed the fact that even when it did, its effects tended to be short lived (Skinner,
1953). Of course, we recognize that our results are only directly applicable to procedural learning
paradigms and may well not generalize to the very wide range of behaviors for which reward and
punishment are used as modulators. Nevertheless, procedural learning is an important aspect of
learning and the data we present might well be applicable to the rehabilitation of motor function
in patients with various forms of motor disability including those following a stroke.
The second question is whether our results tell us anything about the interaction among positive
and negative incentives (reward and punishment), motivation, and learning. Although this issue
has been dealt with extensively in the literature on Pavlovian conditioning (Dickinson and Balleine, 2002), it has not been clear whether reward and punishment interact with separate
motivational systems that have different neurochemical and neuroanatomical substrates and
produce differential behavioral effects (Bindra, 1974; Dickinson and Dearing, 1979; Dayan and Balleine, 2002), and the question has never been addressed in the context of implicit procedural learning. The behavioral task we used enabled us to separate the effects of reward and
punishment on motor performance from those on motor learning. The fact that there were
qualitatively distinct effects on these behavioral measures suggests that reward and punishment
may actually engage qualitatively different motivational systems. However, it is not clear
whether there were quantitative differences in motivation between these groups. The similarity
of the RT-error trade-off in both groups (Fig. 1) might be interpreted as indicating similar motivation; however, a more parsimonious explanation is that they used similar criteria for this trade-off. The negativity bias (Taylor, 1991) in the decision-making literature would suggest that
the motivation might be stronger in the punished group, given the equal monetary value of
reward and punishment, making it all the more remarkable that the rewarded subjects clearly
learned more. If reward and punishment actually access separate motivation systems, then one
would expect that this should be evident in the underlying neural substrate. In the current study,
the activity in the dorsal and ventral striatum that related to the reward per se replicates previous
findings (McClure et al., 2003; O'Doherty et al., 2003; O'Doherty et al., 2004) and most likely
represents the neural correlate of the dopaminergic neurons coding a prediction error signal in
addition to being consistent with the two-process account of reinforcement learning (Montague et
al., 1996; Sutton and Barto, 1998). By contrast, punishment led to activation predominantly in
the inferior frontal gyrus and in the insula, the latter being the most consistently activated area in
a variety of studies relating to punishment (Elliott et al., 2000; Sanfey et al., 2003; Daw et al.,
2006). It has been hypothesized that punishment, even when it is associated with striatal
activation (Seymour et al., 2004), does not operate through the dopaminergic system (Ungless et
al., 2004) but is more likely mediated through the serotonergic system originating in the median
raphe nucleus (Daw et al., 2002). The end result of activating such a motivational system in our case was the change in performance we documented, with which the insula was the only area to be significantly correlated. The difference between the results of this and many other studies is
that we were able to correlate the neural substrates of reward and punishment with qualitatively
different behavior outcomes suggesting that these modulators might indeed operate through
different motivational systems.
The final issue to consider is how the results contribute to our knowledge of the role of the basal ganglia in procedural learning. There is a general consensus that the basal ganglia are an important substrate for procedural learning (Grafton et al., 1995; Rauch et al., 1997; Willingham et al., 2002), particularly when learning becomes more established (Poldrack et
al., 2005; Seidler et al., 2005). The location of activation associated with learning in the current
study, the dorsal striatum (putamen), is similar to that identified by others (Poldrack et al.,
2005; Seidler et al., 2005). The dorsal striatum is involved in learning stimulus-action-reward
associations during instrumental learning (Haruno et al., 2004; O'Doherty et al., 2004). In our
experiment, the association between stimulus and action was deterministic, therefore the activity
in the putamen cannot be related to learning the stimulus-action association. Similarly, because activity in the putamen was higher in the sequence blocks compared to the random blocks irrespective of reward delivery, this activity is not solely related to the action-reward association. Our data suggest
that reward facilitates procedural motor learning within the motor system by modulating the
activity of the putamen which has extensive connections to premotor areas. It is likely that the
effect is translated into an improvement in motor learning by a dopamine-induced potentiation of
corticostriatal synapses in the striatum, similar to that which occurs following direct stimulation
of the substantia nigra (Reynolds et al., 2001). In the same context, our results may also be
relevant to patients with Parkinson disease, who show some deficits in procedural learning tasks,
but who are disabled primarily because of an inability to produce coherent sequences of over-
learned movements. It is possible that the lack of dopamine in these patients results in an
impairment of an intrinsic reward system (Mazzoni et al., 2007), based on an internal
representation of motor performance, thus disrupting the type of corticostriatal facilitation we
demonstrated, and thereby affecting the performance of sequential movements.
Abstract
Reward and punishment motivate behavior, but it is unclear exactly how they impact
skill performance and whether the effect varies across skills. The present study
investigated the effect of reward and punishment in both a sequencing skill and a
motor skill context. Participants trained on either a sequencing skill (serial reaction
time task) or a motor skill (force-tracking task). Skill knowledge was tested
immediately after training, and again 1 hour, 24–48 hours, and 30 days after training.
We found a dissociation of the effects of reward and punishment on the tasks,
primarily reflecting the impact of punishment. While punishment improved serial
reaction time task performance, it impaired force-tracking task performance. In
contrast to prior literature, neither reward nor punishment benefitted memory retention,
arguing against the common assumption that reward ubiquitously benefits skill
retention. Collectively, these results suggest that punishment impacts skilled behavior more than reward, in a complex, task-dependent fashion.
Introduction
Reward and punishment, including biological reinforcers such as food, water, or pain,
are important motivators for both human and animal behavior. The majority of
neuroscience research has focused on studying the effects of reward and punishment
on decision-making1,2,3. However, in recent years interest in using reward and punishment to augment motor skill learning has surged4,5,6,7,8,9, raising the enticing possibility that valenced feedback could be implemented in rehabilitation settings to improve physical therapy outcomes10,11,12,13. Yet the variation in methodologies, performance metrics, and retention timescales used across different studies makes establishing general principles challenging.
The present study examines the impact of reward and punishment on two different skill
learning tasks: the serial reaction time task (SRTT), a sequencing task wherein
participants press buttons in response to a stimulus appearing on a screen 14; and the
force tracking task (FTT) 15,16,17, a motor task wherein participants squeeze a force
transducer to follow a cursor on screen (Fig. 1A–D). The two tasks were implemented in as similar a manner as possible to facilitate comparison between them. In
an initial training session, participants trained on either the SRTT or FTT and received
valenced feedback (monetary reward, monetary punishment, or motivated control [see
methods]) based on their performance (calculated as [Mean Reaction Time/Accuracy
per block] for the SRTT; mean distance from the target per block for the FTT). In both tasks, probe trials, during which stimuli were presented in either a fixed or a random order, were presented before and after training. General skill learning was assessed by
comparing initial performance to performance after training, regardless of the probe
block type. Sequence-specific skill learning was distinguished from general skill
learning by comparing performance on fixed versus random probe blocks. Skill
retention was then probed in the absence of feedback at 1 hour, 24 hours, and 30 days
after completion of the training.
Figure 1
(A) Experimental design. Seventy-two participants were divided between two
skill learning tasks: a task that demands integration of multiple memory
systems, the serial reaction time task (SRTT), and a task that is learned
primarily by the motor network, the force-tracking task (FTT). Within each task,
participants were randomly assigned to three different feedback groups
(reward, punishment, control). (B) Experimental timeline. For each task, trials
were grouped into blocks of trials. Unbeknownst to the participants, during
some blocks (“fixed sequence blocks”) the stimulus would appear according to
a repeating pattern. During other periods the appearance of the stimulus was
randomly determined (“random sequence blocks”). Following familiarization
blocks, participants were trained on the task with valenced feedback. To
assess sequence knowledge, training was bookended by early and late probes
in which participants performed three blocks arranged random - sequence -
random. Participants were then tested for sequence knowledge without
feedback 1 hour, 24 hours, and 30 days after learning. (C) Serial reaction time
task. Participants were presented with four locations on a screen denoted by
“O’s”. A trial began when one “O” changed to an “X”. Participants were
instructed to press the corresponding button on a controller as fast and
accurately as possible. After 800 ms, the X changed back to an O, and
participants were given valenced feedback for their performance on that trial.
Performance in the SRTT was based on reaction time and accuracy of the
button press. If a participant was accurate and faster than their performance on their previous 96 trials, they would receive positive feedback (reward, or absence of punishment) on that trial. If they were slower or inaccurate, they would receive the negative outcome (either punishment or absence of reward).
(D) Force-tracking task. Participants held a force transducer in their right hand
and saw a black circle (start position), a blue circle (target), and a white circle
(cursor). Participants were instructed to squeeze the force transducer to keep
the cursor as close to the center of the target as possible. The target moved
continuously during the trial (12 seconds), followed by a 2 second break
between trials. The distance of the cursor from the target was the measure of
performance. If the participant was closer to the center of the target than they were on their previous 8 trials, they would receive positive feedback. During sequence blocks the target followed one of six trajectories (D, left), whereas during random blocks the target would follow a random trajectory.
Participants were able to learn both tasks successfully and the skill learned was almost
entirely retained at 30 days. Overall, we saw little effect of reward on either learning or
retention. Punishment had no effect on skill retention, but had significant, task-
dependent effects on learning. In the SRTT punishment improved speed with minimal
impact on accuracy. In contrast, punishment impaired performance on the FTT. These
results suggest that the effect of feedback varies depending on the skill being learned, and that while feedback impacts online performance, the reported benefit of reward to retention may be less robust than previously demonstrated.
Results
Punishment improves online performance of the serial reaction time task
We investigated the impact of reward and punishment on SRTT sequence learning in
three different ways. First, we compared sequence knowledge during sequence
knowledge probes either early in learning (immediately following familiarization when
valenced feedback was first introduced) or late in learning (at the end of the training
session) (see Fig. 1). During these probes, we estimated sequence knowledge by
calculating the reaction time (RT) difference between fixed and random blocks (Fig.
2). A repeated measures ANOVA, with Group (reward, punishment, control), Sequence (fixed, random) and Time-point (early, late) as factors revealed a significant three-way interaction between Group, Sequence, and Time-point (F(2,33) = 5.370, p < 0.01). Follow-up analyses indicated that both punishment and reward groups acquired more skill knowledge during the early sequence knowledge probe than control (F(2,33) = 5.213, p < 0.05; punishment v control: t(22) = 3.455, p < 0.005; reward v control: t(22) = 2.545, p < 0.02), but did not differ from each other (reward v punishment: t(22) = 0.707, p = 0.487). Further, the control group evidenced a greater gain in sequence knowledge from the early to the late sequence knowledge probe compared to reward (t(22) = 2.884, p < 0.01), although this comparison was not significant for control versus punishment when correcting for multiple comparisons (t(22) = 2.075, p = 0.05), in part reflecting the
benefit of feedback to early learning. These results suggest that feedback facilitates
rapid sequence learning on the SRTT.
Figure 2
Sequence knowledge in the SRTT. Both the punishment and reward groups acquired more skill knowledge during the early sequence knowledge probe than control. The control group showed a greater gain in sequence knowledge from the early to the late probe than reward (t(22) = 2.884, p < 0.012), possibly due to the benefit of reward during the early learning period; the punishment group did not differ from control when correcting for multiple comparisons [t(22) = 2.075, p = 0.05 (α: 0.05/3 = 0.0167)]. Feedback did not affect retention at any time point (lower right panel). Main panel shows mean ± SEM. Box plots show median, crosses show within-group outliers. Asterisks denote periods with significant effects of feedback (p < 0.05).
Second, to examine the effect of valenced feedback on learning rate, we compared the
median reaction time across the six consecutive sequence training blocks immediately
following the early sequence knowledge probe using a repeated measures ANOVA
with Block (1–6) and Group as factors. Participants showed improvement over the course of training (Main effect of Block: F(5,33) = 11.224, p < 0.001). We also found a main effect of Group (F(2,33) = 3.286, p < 0.05), and follow-up tests indicated that the punishment group was significantly faster than control overall (punishment versus control: t(22) = 2.884, p < 0.012), but there was no difference between reward and control (t(22) = 0.480, p = 0.636) or between reward and punishment (t(22) = 1.757, p = 0.093, two-tailed) during the training period. The lack of a significant Group by Sequence interaction in the post-probe highlights that this is a general, rather than sequence-specific, improvement.
Collectively, these results show that both reward and punishment increased early learning of the sequence, with punishment additionally having a marked effect on performance during training.
Turning to the FTT, we examined performance during the six consecutive sequence training blocks using a repeated measures ANOVA, with Block and Group as factors. All feedback groups showed improvement across the training period (Main effect of Block: F(5,165) = 8.478, p < 0.001; S2 versus S7: t(35) = 2.836, p < 0.01). Although reward tended to outperform punishment during training, there was no effect of Group on learning rate in the FTT (Group × Block: F(10,165) = 1.186, p = 0.156).
Finally, we examined the effect of valenced feedback on retention in the FTT. Five
participants did not complete the retention probes due to timetabling. This left us with
10 control, 9 reward, and 11 punishment participants for retention analyses. All groups
demonstrated retention of sequence knowledge at all time-points (Main effect of Sequence: F(1,27) = 86.387, p < 0.001; t(35) = 9.030, p < 0.001). There was no main
effect or interaction with feedback Group on retention.
Collectively, these results show that the primary effect of feedback in the FTT was for punishment to impair learning from the pre- to post-training probe time points.
Discussion
This study sought to determine whether the impact of reward and punishment
generalizes across different types of motor skill learning, as implemented using a
Serial Reaction Time Task (SRTT) and a Force Tracking Task (FTT). We found that
punishment had opposing effects on performance of the two skills. During performance
of the SRTT, training with punishment led to improved reaction times overall with
minimal detriment to accuracy. In contrast, punishment impaired performance of the
FTT. These effects were only present whilst feedback was being given; there was no
effect of training with feedback on general or sequence-specific retention measured at
1 hour, 24 hours, and 30 days in either task. Our results refute any simple model of the
interaction between feedback and performance. Instead, we show that the impact of
feedback depends on the training environment and the skill being learned.
There may be a number of reasons for this task-specific effect of feedback. While both
tasks rely on sequence learning, they differ with respect to the mechanism that
facilitates improvement. The motivational salience of punishment (i.e. loss aversion)
may explain the performance benefit seen on the SRTT, where the added attention
facilitated by punishment has been hypothesized to recruit additional neural resources
to aid SRTT performance 8,18. However, a purely motivational account cannot explain
the deleterious effect of punishment on performance of the FTT. Therefore, we need to consider alternative explanations that may account for the differential effects of reward and punishment on performance of these two tasks.
The two tasks also differ with respect to their motor demands. Specifically, in our
implementation, performance on the FTT relies on more precise motor control than the
SRTT. Within the motor system, others have reported that reward-related
dopaminergic activity reduces motor noise 19, while dopaminergic activity associated
with punishment leads to an increase in motor variability, i.e. noise 20. We found that
punishment impaired general (i.e. non sequence-specific) performance on the FTT.
After one hour, during the retention test without feedback, the punishment group
performed as well as the reward and control groups. We think that our findings are
consistent with the hypothesis that punishment may increase motor noise, which may
have led to impaired performance by the punishment group during training. Because
increased motor variability was not directly measured in our implementation of the
SRTT, participants would not be penalized for any variation in movement that did not
impact reaction time directly. If an assessment of motor variability were included in the evaluation of SRTT performance, one might find that punishment impairs this dimension of performance as well. Our implementations of the SRTT and the FTT do not include a direct measure of motor variability, so we cannot explicitly address this issue in the present study. Future work should examine this question.
The implementations of the tasks used here also differed with respect to the
information content of a given instance of feedback. Ordinarily, learning on the SRTT
relies on the positive prediction error encoded in the striatum that occurs on fixed-
sequence trials8,21. The reward or punishment in the SRTT may augment this positive
prediction error and facilitate performance and learning. In contrast, the moment-to-
moment feedback given on the FTT is not associated with an instantaneous positive
prediction error signal. Rather, our implementation of the FTT is similar to
discontinuous motor tasks that rely on the cerebellum and may therefore not benefit
from moment-to-moment feedback 22 (but also see Galea, et al.4 for an additional
account of cerebellar learning with feedback). Finally, although information content was not intentionally manipulated, this difference may also alter the effect of reward and punishment on these tasks.
Unlike prior studies, we saw no benefit of reward to retention 4,7,8,10. Most studies that
have looked at reward and punishment in skill learning have only examined immediate
recall4,8,10, and only one study has shown a benefit of reward to long-term retention of
a motor skill7. In their study, Abe et al.7 observed that the control and punishment groups evidenced diminished performance after 30 days compared to their post-training time-point. Importantly, Abe et al.7 also found that the reward group showed offline gains from the immediate time point to 24 hours after training, and this effect persisted through 30 days. So, while in our study the punishment and control groups did not evidence forgetting from 24 hours to 30 days, potentially limiting our sensitivity to the effect of reward, the reward group in our study also did not show any offline gains.
As such, we are confident in our finding that reward did not impact retention.
While not discussed at length by Abe and colleagues, their punishment group
performed significantly worse during training, suggesting that the skill was not learned
as effectively by participants in that group. Therefore, it is unclear whether the
difference in memory observed in their study can be attributed to a benefit of reward to
consolidation or to ineffective acquisition when training with punishment. Our study
design differed from the implementation used by Abe and colleagues7 with respect to the input device (whole-hand grip force in our study, precision pinch force by Abe and colleagues), feedback timing, and trial duration. However, our result questions the robustness of the finding that reward benefits skill retention. We optimized our design to be sensitive to differences in online learning rather than retention, and future studies
should examine other factors that influence the effect of feedback on retention of skill
memories.
With respect to the SRTT, it is worth considering that our participants evidenced less
sequence-specific learning than some others have found in unrewarded versions of this
task, where the difference between sequence and random trials can be up to 80
ms23,24,25. However, there is considerable variability in the difference between
sequence and random trials on the SRTT reported in the literature, and some groups
have reported sequence-specific learning effects on the SRTT to be between 10 and 30
ms26,27. The difference reported after learning by the Control, Reward, and Punishment groups in our study is approximately equal to that reported for the rewarded group by Wachter, et al.8 (~30 ms), and greater than that observed in their control and punishment groups, which showed substantially less sequence-specific knowledge than we observed. We are therefore confident that participants were able to learn and express sequence-specific knowledge in all three feedback conditions.
Finally, we recognize that there are difficulties in comparing performance across tasks.
Because the tasks used here vary in performance outcome (response time in the SRTT,
tracking error in the FTT), comparing them in a quantitative way is not possible.
However, the dissociation in the effect of punishment in these contexts provides
compelling evidence that the effect does depend on task. Moreover, our study brings
together the previously disparate literature examining the effects of reward and
punishment on skill learning. This result shines light on the challenge of extrapolating
from a single experiment in a specific context to a more general account of skill
learning.
Experimental design
The study design was the same for both tasks (Fig. 1A). Participants trained on either
the serial reaction time task (SRTT), or the force-tracking task (FTT). For both tasks,
trials were presented over 15 blocks. A 30-second break (minimum) separated each
block of trials. Unbeknownst to the participants, during some blocks (“fixed sequence
blocks”) the stimulus would appear according to a repeating pattern (described below
for each task). During other periods the appearance of the stimulus was randomly
determined (“random sequence blocks”).
Familiarization and training blocks were conducted in the bore of an MRI scanner. To
acclimatize participants to the task, and establish their baseline level of performance,
the task began with three random-sequence blocks without feedback (“familiarization
blocks”). Participants were unaware of the forthcoming feedback manipulation during
these familiarization blocks. Then the feedback period began, starting with a pre-
training probe (three blocks, random – fixed – random), then the training blocks (six
consecutive fixed-sequence blocks), and, finally, a post-training probe (three blocks,
random – fixed – random). The difference in performance between the average of the
two random blocks, versus the fixed sequence block, during the probes was used to
index sequence knowledge 28.
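As a minimal illustration of this index (our notation; for RT-based measures, positive values indicate sequence knowledge):

    def sequence_knowledge(random1, random2, fixed):
        """Mean of the two random blocks minus the fixed block."""
        return (random1 + random2) / 2.0 - fixed

    print(sequence_knowledge(430.0, 425.0, 400.0))  # hypothetical mean RTs (ms) -> 27.5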
To test the impact of reward and punishment on skill learning, participants were
randomised into one of 3 feedback groups: reward, punishment, or uninformative
(control). During the feedback period, reward, punishment, or control feedback was
provided based on the participant’s ongoing performance. The feedback paradigm for
each task is outlined separately below.
Participants were given retention probes at one hour, 24–48 hours, and 30 days after
training. No feedback was delivered during the retention probes. The second probe
always occurred after at least one night’s sleep.
The initial visit (Familiarization, Early probe, Learning, and Late Probe) took place
while participants underwent MRI scanning.
Participants
Seventy-eight participants (47 female; mean age = 25 ± 4.25 years) participated in this
experiment. All participants were right-handed, free from neurological disorders, and
had normal or corrected-to-normal vision. All participants gave informed consent and
the study was performed with National Institutes of Health Institutional Review Board
approval in accordance with the Declaration of Helsinki (93-M-0170, NCT00001360).
Data from six individuals were removed from the study due to inattention (defined as
non-responsive or inaccurate on greater than 50% of trials during training) or inability
to complete the training session.
Between each block, participants saw the phrase “Nice job, take a breather”. After five
seconds, a black fixation-cross appeared on the screen for 25 seconds. Five seconds
before the next block began, the cross turned blue to cue the participants that the block
was about to start.
During the retention probes, participants performed three blocks (random – fixed – random) on a 15-inch MacBook Pro using a button box identical to the one used during
training. During these retention probes, the next trial began 200 ms after the participant
initiated their response. No feedback was given during the retention blocks. The first
button press made after stimulus presentation was considered the participant’s
response. All responses were included in the analysis. Any missed trial was counted as
an error, and only correct trials were considered for analysis of RTs.
Force-tracking task
In the force-tracking task (FTT), participants continuously modulated their grip force
to match a target force output 16,17. In the traditional implementation, participants are
exposed to a single pattern of force modulation repeated each trial. This design does
not allow discrimination between general improvement (i.e. familiarization with the
task and/or the force transducer) and improvement specific to the trained sequence of
force modulation. Therefore, we decided to adapt the traditional FTT method to align it
with the experimental design that is traditional for the SRTT, i.e. by including random
sequence blocks.
For data analysis, the squared distance from the cursor to the target was calculated at
each frame refresh (60 Hz). The first 10 frames were removed from each trial. The
mean of the remaining time points was calculated to determine performance, and trials
were averaged across blocks.
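A short sketch of this computation, assuming (our assumption) that cursor and target positions arrive as (n_frames, 2) arrays sampled at 60 Hz:

    import numpy as np

    def trial_error(cursor_xy, target_xy):
        """Mean squared cursor-target distance, dropping the first 10 frames."""
        sq_dist = np.sum((cursor_xy - target_xy) ** 2, axis=1)
        return sq_dist[10:].mean()

    def block_error(trials):
        """trials: list of (cursor_xy, target_xy) pairs for one block."""
        return np.mean([trial_error(c, t) for c, t in trials])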
Feedback
All participants were paid a base remuneration of $80 for participating in the study. At
the start of the feedback period, participants were informed they could earn more
money based on their performance.
During the feedback period, participants were given either reward, punishment, or
control feedback. The presence of reward or the absence of punishment was based on the participant’s performance. In both the SRTT and the FTT, an initial criterion was
defined, based on the participant’s median performance during the final familiarization
block. As participants progressed through training, this criterion was re-evaluated after
each block, to encourage continuous improvement. In the reward group, the feedback
indicated that the participant’s performance was getting better at the task. In the
punishment group, the feedback indicated they were getting worse. Because the
frequency of feedback events differed between the reward and punishment groups
(reward from high-to-low as training progressed, punishment from low-to-high), the
control group was split into two different sub-groups (control-reward and control-
punishment). The control groups received feedback at a frequency that matched the
corresponding feedback group but was not related to their performance. Participants in
the control group were made aware that the feedback was not meaningful. We
considered the reward and punishment control groups together in the analyses, as is
typical in these studies 7,8.
In the SRTT, performance was defined as the accuracy (correct or incorrect) and
reaction time (RT) of a given trial. Feedback was given on a trial-by-trial basis (Fig.
1C). This was indicated to the participant when the white frame around the stimulus
changed to green (reward) or red (punishment). In the reward group, the participants
were given feedback if their response was accurate and their RT was faster than their
criterion RT, which indicated that they earned money ($0.05 from a starting point of
$0) on that trial. In the punishment group, participants were given feedback if they
were incorrect, or their RT was slower than their criterion, which indicated that they
lost money ($0.05 deducted from a starting point of $55) on that trial. Participants in
the control-reward and control-punishment groups saw red or green colour changes,
respectively, at a frequency matched to punishment and reward, respectively. Control
participants were told that they would be paid based on their speed and accuracy.
Importantly, to control for the motivational differences between gain and loss,
participants were not told the precise value of a given trial. This allowed us to assess
the hedonic value of the feedback, rather than the level on a perceived-value function.
Between blocks, for the reward and punishment groups, the current earning total was
displayed (e.g. “You have earned $5.00”). Control participants saw the phrase, “You
have earned money”. The criterion RT was calculated as median performance in the
first familiarization block. After each block, the median plus standard deviation of performance was calculated and compared with the criterion. If this test value was faster (SRTT) or more accurate (FTT) than the previous criterion, the criterion was updated. During the SRTT, only correct responses were considered when
establishing the criterion reaction time.
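The sketch below summarizes the trial-by-trial feedback rule and the block-wise criterion update described above; variable names and the exact form of the update (keeping the faster of the old criterion and the new test value) are our reading of the text, not published code:

    import numpy as np

    def initial_criterion(familiarization_rts):
        """Criterion RT: median performance in the familiarization block."""
        return float(np.median(familiarization_rts))

    def srtt_feedback(rt, correct, criterion, group):
        """Per-trial payoff: +$0.05 for beating the criterion (reward group),
        -$0.05 for missing it (punishment group)."""
        good = correct and rt < criterion
        if group == "reward":
            return 0.05 if good else 0.0
        if group == "punishment":
            return -0.05 if not good else 0.0
        return 0.0  # control: feedback is frequency-matched, not contingent

    def update_criterion(criterion, correct_rts):
        """After each block, update if median + SD beats (is faster than)
        the previous criterion."""
        test = float(np.median(correct_rts) + np.std(correct_rts))
        return min(criterion, test)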
Feedback in the FTT was based on the distance of the cursor from the target (Fig. 1C).
For the reward group, participants began with $0. As participants performed the task,
their cursor turned from white to green when the distance from the target was less than
their criterion. This indicated that they were gaining money at that time. In the
punishment group, participants began with $45, and the cursor turned red if it was
outside their criterion distance. This indicated that they were losing money. For the control-reward and control-punishment groups, the cursor changed to green or red, respectively, but this was unrelated to their performance. For the control groups, the duration of each feedback instance, as well as the cumulative feedback given on each trial, was matched to
the appropriate group. Between each block, participants were shown their cumulative
earnings. Control participants saw the phrase “You have money”.
Statistical analyses
In both tasks, the six training blocks were compared using a repeated-measures
ANOVA to establish differences in learning rate (Block x Group). Learning was
indexed by comparing the performance (RT and accuracy separately for SRTT;
squared distance from the target [squared error] for FTT) on the sequence blocks to the
average of the two random blocks at the pre and post training time points using a
repeated-measures ANOVA (Time point x Sequence x Group). Memory for the
sequence was evaluated by comparing the fixed block, to the mean of the two random
blocks, at each retention time point using a repeated-measures ANOVA (Time point x
Sequence x Group). A Bonferroni correction was applied for post-hoc analyses to
correct for multiple comparisons. If sphericity was violated, the Huynh-Feldt correction was applied.
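As an illustration of the repeated-measures analysis, here is a hedged sketch using statsmodels' AnovaRM on long-format placeholder data. Column names and values are hypothetical, and note that AnovaRM handles only within-subject factors, so the between-subject Group factor would in practice require a mixed ANOVA, as presumably run in the authors' statistics package:

    import numpy as np
    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    # Long format: one row per subject x training block (placeholder data)
    df = pd.DataFrame({
        "subject": [s for s in range(12) for _ in range(6)],
        "block":   [b for _ in range(12) for b in range(1, 7)],
        "rt":      np.random.default_rng(0).normal(480, 30, 72),
    })
    res = AnovaRM(df, depvar="rt", subject="subject", within=["block"]).fit()
    print(res.anova_table)  # F test for the within-subject Block factor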
Additional Information
How to cite this article: Steel, A. et al. The impact of reward and punishment on skill
learning depends on task demands. Sci. Rep. 6, 36056; doi: 10.1038/srep36056 (2016).
Maybe it is the paint covering their clothes or the pen and paper always in hand, but creative people are always up to something. Whatever the traits and giveaways are, being a creative person has health benefits: it lowers levels of stress and anxiety, has healing-like effects, assists in self-expression, enhances self-worth, and lowers feelings of depression, among other great benefits.
Characteristics and Qualities of Creative People
Although each and every creative person is different from the next, many of them have similar
personality traits. Here are some of the most distinct characteristics and qualities that some
highly creative people have and embody.
Independent
Usually, creative people have a strong sense of independence. They enjoy being alone with their
ideas and making art out of their thoughts. They are comfortable with being on their own when
exploring new cities or even eating at a restaurant by themselves. In a way, they would prefer not
to work on projects with other people since it is easier for them to dive in and take charge all by
themselves.
Curious
Creative people are always curious about how certain things work and what makes people tick.
They are the people who constantly ask questions and want to become knowledgeable about
random things. Their curiosity comes across in their work and art since they are brave enough to
question and then answer things others will not.
Deep thinkers
It should come as no surprise that creative people are deep thinkers. They dive deep into
philosophical questions and want to get to the root of everything. They internalize their thoughts
and their mind is always running. They are intelligent and become knowledgeable about religion,
politics, niche subjects, and the meaning of life in general. They tend to use their deep thoughts
and the inner-most workings of their mind for inspiration for their work.
Open-minded
Creative people tend to be very open to new ideas and new ways of thinking since they are
constantly thinking about life and things from different perspectives. For example, if they are a
writer, they are putting themselves in each of their characters and their mindsets when drafting
their novel. So when they are living their life, putting themselves in other people’s shoes and
thinking in a different mindset comes easily to them.
Interesting
Creative people are truly interesting individuals. They have such a vibrant personality that can
entertain and keep you occupied for hours. They have a wide range of interests, which makes talking to them about nearly anything a fun time. They know how to converse with a diverse group of people and always have something interesting to say.
Fun
Creative people are a good time. They are fun to be around since they have such a thirst for life.
They are usually excited about new people, places, and things. They want to soak in everything
that life has to offer and then put their experiences into their work.
Ambitious
Creative people understand that no task is too small for them to tackle. They are ready to take on
a lot of work all at once and they know they can produce awesome results. Most of the time, they
do so under an intense deadline and plenty of stress. Whether it is a first draft of a novel due
within a month or a collection of sculptures they have to create for an art show quickly
approaching, they know they can do it.
Sensitive
Creative people are sensitive since they are very in-tune with their emotions and feelings. They
capture this sensitivity and use it as inspiration for their work. They have a certain sensitivity for
people, feedback, beauty, and other aspects of their life. They have a heightened sense of
sensitivity at all times.
Active
Creative people are active people. They have no tolerance for boredom since they always want to
be creating and improving their craft. They are down for anything that comes their way and are
ready for adventure at a moment’s notice. They thrive on new and exciting experiences since they provide inspiration for their novels, poetry, music, dance, or artwork.
Spreads Happiness
Creative people share their gift and passion with the world and this, in turn, spreads
happiness. Spreading happiness through creativity happens when this passionate, creative person
shares their work and art with others. Their art evokes certain emotions and responses from
people, and it may also inspire other people to become creative themselves. Creative people may enjoy sharing their work with others because it also makes them happy to receive great reviews of their work or see the expressions others have when they view their masterpieces. Creative people tend to be happy, positive, and upbeat.
Final Thoughts
Creative people embody incredible characteristics and qualities that truly make them stand out as
individuals. If you have a creative friend, family member, or coworker, it is always a joy to be
surrounded by them and their unique personality. If you are a highly creative person yourself, do
some self-reflection to see if you are embodying these characteristics and what other qualities
you have that correlate with your creative passion.
You can also try to foster creativity in your children or boost your own creativity as an adult if
this is something you are passionate about. Since creativity has many physical and mental health
benefits, it is something you should be exploring and implementing into your life. Becoming a
more creative person will enhance your life.
What other qualities or characteristics of creative people have you noticed? Share your thoughts
with us in the comment section below!
Looking for more creativity? Need inspiration? Head to Coloring.Club for all your creative and
coloring wants and needs.
Even the brightest students can sometimes find themselves academically
underperforming, often through no fault of their own. When students find
themselves in this situation, it’s often because they’re stuck in a rut and are
not sure what to do to improve. If this sounds like you, the first step is to work
out the reasons why you may be underperforming, and the next step is to
work out how to tackle the problem. If you’re not sure how to go about it, this
article shows you what you can do to form an improvement plan to help you
achieve the grades you know you’re capable of achieving.