Reinforcement
In its earliest technical usages, the term reinforcement implied strengthening, echoing its colloquial usage. It has been applied to a broad range of phenomena in learning, including the operant or instrumental behavior studied by B. F. Skinner and the respondent or classical conditioning procedures of Ivan Pavlov. The term, which also applies to certain types of machine learning procedures studied by computer scientists and engineers, now refers mostly to cases in which behavior has some consequences and, by virtue of these consequences, comes to occur more often. The term reward, sometimes used as a nontechnical synonym, is not equivalent. For example, one can speak of delivering a reward even without evidence that the reward has an effect on behavior.
As a classical example of reinforcement, imagine a rat in a chamber with a lever and a cup into which food pellets can be delivered. If pressing the lever does nothing, the rat presses only occasionally. If each press produces a food pellet, however, the rat presses the lever more often. The food is called a reinforcer, and the rat's lever press is said to have been reinforced. The response that increases must be the one that produced the consequence. For example, if lever presses produce shock and only the rat's jumping increases, it would be inappropriate to speak of either lever pressing or jumping as reinforced. An ambiguity of usage is that reinforcement sometimes refers to delivery of the reinforcer ("the lever press was reinforced" means that lever presses produced food) and sometimes to the resulting behavior change ("the lever press was reinforced" means that lever presses occurred more often because they produced food). The sense is usually clear from the context.
The Operant Nature of Reinforcement
Edward Thorndike's experiments with animals in problem boxes were early examples of studies of reinforcement. Typically, a food-deprived animal was placed inside a box with food available outside. Sooner or later, typically by chance, the animal operated a device that released the door and freed it from the box. With repetition it operated the device more and more rapidly upon being placed in the box. To describe his findings, Thorndike formulated his law of effect. The law went through many revisions, but its essence was that responses can be made more probable by some consequences and less probable by others; in language closer to Thorndike's, responses with satisfying effects are stamped in, whereas those with annoying effects are stamped out.
Behavior that is modified by its consequences is called operant behavior or, in older and less common usages, instrumental behavior, which is behavior that operates on its environment. It is said to be emitted because it is primarily determined by its consequences; it does not require an eliciting stimulus, as in reflex relations (e.g., as when a puff of air to the eye produces a blink). Operants are classes of responses defined by their environmental effects rather than by their topography (their form or the particular muscle groups involved). For example, a rat might press a lever with its left paw, its right paw, or both paws; it might even depress the lever by biting or sitting on it. But similar movements at the other end of the chamber, far from the lever, would not be lever presses no matter how closely they resembled the responses that operated the lever.
Skinner significantly extended the analysis of reinforcement. One of his critical contributions was to clarify the difference between operant learning and the classical or respondent learning that had been studied by Pavlov (in Pavlov's experiments, the organism's responses had no effect on the stimuli that were presented).
Consider the relation between a red traffic light and a driver's stepping on the brakes. The red light sets the occasion for stepping on the brakes; this response occurs at a red light and not at a green light because it has different consequences in the presence of each stimulus. A stimulus that signals or sets the occasion for different consequences of responding is discriminative. Three terms are involved: the discriminative stimulus, the response, and the consequences of the response in the presence of that stimulus.
For example, imagine a pigeon in a chamber with a feeder; above the feeder is a small disk or key that can be lit from behind with lights of different colors. Suppose that reinforcement of the pigeon's key peck depends on the presence of a green or a red light. If the peck produces food only when the key is green, the pigeon will come to peck in the presence of green but not red. Its pecking in the presence of green is a discriminated operant. Discriminative stimuli correspond to the stimuli colloquially called signals or cues. They do not elicit responses; instead, they set the occasions on which responses have consequences.
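The pigeon example can be sketched as a toy simulation. This is only an illustration under stated assumptions, not a model taken from the experimental literature: the function name `train_discrimination`, the starting probabilities, the `step` size, and the simple rule that moves the peck probability toward 1 after food and toward 0 after no food are all hypothetical choices.

```python
import random

def train_discrimination(trials=2000, step=0.05, seed=0):
    """Toy model: peck probability is tracked separately for each key color."""
    rng = random.Random(seed)
    p_peck = {"green": 0.5, "red": 0.5}  # assumed starting probabilities
    for _ in range(trials):
        color = rng.choice(["green", "red"])   # discriminative stimulus
        if rng.random() < p_peck[color]:       # response: the pigeon pecks
            food = (color == "green")          # consequence depends on the stimulus
            target = 1.0 if food else 0.0
            # Assumed update rule: drift toward 1 after food, toward 0 otherwise.
            p_peck[color] += step * (target - p_peck[color])
    return p_peck
```

After training, the simulated pigeon pecks often when the key is green and rarely when it is red. The key color never forces a peck in the code; it only determines what a peck produces, which is the sense in which a discriminative stimulus sets the occasion for a response.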
Positive and Negative Reinforcement
A variety of consequences can serve as reinforcers, ranging from those of obvious biological significance, such as food or water or a sexual partner, to minor changes in things seen or heard or touched. Consequences are not restricted merely to the production of stimuli. Responses can remove stimuli, as when operating a switch turns off a light; responses can prevent stimuli from occurring, as when unplugging a lamp before repairing it eliminates the possibility of a shock; and responses can even change the consequences of other responses, as when replacing a burned-out light bulb makes the previously ineffective response of operating the light switch an effective response again. Each type of consequence may affect later behavior.
A stimulus is a positive reinforcer if its presentation increases the likelihood of responses that produce it and a negative reinforcer if its removal increases the likelihood of responses that terminate or prevent or postpone it. Negative reinforcers are sometimes called aversive stimuli. Responding that turns off an aversive stimulus (e.g., when a rat turns off a loud noise by pressing a lever) is escape responding. Responding that prevents or postpones an aversive stimulus (e.g., when a rat's lever press prevents or postpones the delivery of an electric shock) is avoidance responding.
Reinforcement and Punishment
Consequences can reduce responding instead of increasing it. For example, if shock is delivered whenever a rat grooms its tail, the rat may groom its tail less often. Responding reduced by its consequences is said to be punished. Punishment, simply the inverse of reinforcement, has had a more controversial history. Thorndike's early versions of the law of effect argued that behavior could be stamped out by annoyers as well as stamped in by satisfiers, but he later withdrew the part that dealt with the stamping out of behavior.
A rat's reinforced lever pressing can be reduced when presses are punished by shock, but the responding recovers to earlier levels once punishment is discontinued. Because punishment suppresses responding only temporarily, some have argued that it is ineffective. Yet reinforcement would also have to be judged ineffective by this criterion. In early studies standards for the effectiveness of punishment were different from those for the effectiveness of reinforcement. Investigations tended to concentrate on the recovery of responding after punishment was discontinued rather than on the reduction of responding during punishment.
Punishment is effective, but that does not recommend it. It is usually an undesirable method for eliminating behavior. Thorndike and his successors were probably right for the wrong reasons. For example, one major problem is that aversive stimuli used as punishers have side effects, such as eliciting aggressive behavior.
Shaping and the Selection of Behavior
Reinforcement does not establish connections or associations between responses and stimuli. Rather, it changes the probability of classes of responses. Some responses become more likely and others less likely. In other words, some responses survive in the organism's behavior, whereas others do not. In this respect, reinforcement works by selection. It operates on populations of responses within the lifetime of the individual organism much as evolutionary selection operates on populations of organisms over successive generations.
The selective nature of reinforcement is best illustrated by shaping, a procedure that creates a new response through the successive reinforcement of other responses that more and more closely approximate it. For example, a rat may rarely if ever press a lever with a force that approximates its own body weight, but it can be induced to do so by selective reinforcement of stronger and stronger presses. Shaping begins with whatever behavior is available. At the start, most presses will probably be weak; but with reinforcement of the strongest ones, the force of pressing will increase a little, and the criterion for reinforcement then moves up to the strongest of the new population of responses. This step in turn produces another increase in force and another change in criterion, and so on, until the rat is pressing with a force that might never have occurred in the absence of the shaping procedure. Reinforcement has had many useful applications, and shaping has figured in a significant way in many of them, including the training of animals and the production of speech in autistic children.
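The moving criterion described above can be sketched in a few lines. This is a minimal illustration with assumed details: press forces are drawn from a normal distribution, the strongest fifth of each batch is reinforced, and the population mean drifts partway toward the reinforced presses. The function name `shape_force` and all of its parameters are hypothetical.

```python
import random

def shape_force(target=100.0, start_mean=10.0, spread=3.0,
                batch=50, step=0.5, max_rounds=500, seed=0):
    """Raise the typical press force by reinforcing only the strongest presses."""
    rng = random.Random(seed)
    mean = start_mean
    rounds = 0
    while mean < target and rounds < max_rounds:
        presses = [rng.gauss(mean, spread) for _ in range(batch)]
        # Criterion: only the strongest fifth of the current population is reinforced.
        criterion = sorted(presses)[int(0.8 * batch)]
        reinforced = [f for f in presses if f >= criterion]
        # Selection: the population drifts toward the reinforced responses,
        # so the next batch (and the next criterion) sits a little higher.
        reinforced_mean = sum(reinforced) / len(reinforced)
        mean += step * (reinforced_mean - mean)
        rounds += 1
    return mean, rounds
```

With these assumed values, a press near the target force essentially never occurs in the starting population, yet the criterion reaches it through many small steps, each of which reinforces only responses that already occur.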
Computational Reinforcement
Learning through reinforcement is studied in computer science and engineering, where the term reinforcement learning is applied to learning algorithms that include some of the principles of reinforcement observed in animal behavior. However, a different type of learning is most commonly studied in these fields. This is called supervised learning (or sometimes "learning by example" or "learning with a teacher"), in which a system learns from a collection of examples, where each example consists of an input, usually a list of numbers, together with a desired, or target, output specifying the response that the system should produce when given that input. The object of the task is for the system to learn a response rule that produces the desired responses for the example inputs and also for novel inputs it may receive in the future.
A reinforcement learning system, in contrast, learns from signals that evaluate the quality of the learning system's behavior without explicitly telling the system what its behavior should be. Evaluative feedback tells the learning system whether or not, and possibly by how much, its behavior has improved or deteriorated; or it provides a measure of the "goodness" of the behavior; or it just provides an indication of success or failure. Evaluative feedback does not directly tell the learner what it should have done or how it should change its behavior. Instead of trying to match a standard of correctness (i.e., match desired outputs), a reinforcement learning system tries to maximize the goodness of behavior as indicated by evaluative feedback. To do this, it has to actively try alternatives, compare the resulting evaluations, and use some kind of selection mechanism to guide behavior toward the better alternatives. Whereas supervised learning is instructional, reinforcement learning is selectional, like reinforcement in animal learning.
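The try, compare, and select cycle described above can be sketched with a small example. The code below uses an epsilon-greedy rule on a three-action task. That rule is one simple selection mechanism chosen here for illustration, not one named in this article, and the function name `run_bandit`, the payoff probabilities, and the value of `epsilon` are assumptions.

```python
import random

def run_bandit(payoff_probs=(0.2, 0.5, 0.8), steps=5000, epsilon=0.1, seed=0):
    """Learn which action is best from evaluative feedback alone."""
    rng = random.Random(seed)
    n = len(payoff_probs)
    value = [0.0] * n  # running estimate of each action's goodness
    count = [0] * n    # how often each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)                       # try an alternative
        else:
            action = max(range(n), key=lambda a: value[a])  # select the best so far
        # Evaluative feedback: a score for the action taken, never the correct action.
        reward = 1.0 if rng.random() < payoff_probs[action] else 0.0
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]  # incremental mean
    return value, count
```

The learner is never told that the third action is the right one. It comes to choose that action most often only because the evaluations it receives there are better on average.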
Reinforcement learning is useful from a computational perspective because the evaluation component, sometimes called a "critic," requires much less information and knowledge than the "teacher" in a supervised learning system. It is possible to evaluate a system's behavior without knowing what the correct behavior would be. In fact, a critic does not even need access to the learning system's actions; it can evaluate their consequences on some complex process. It does not need to know anything about the mechanism by which the actions produce these consequences. A complication is that actions can have delayed as well as immediate consequences; evaluative feedback generally evaluates the consequences of all of the system's past behavior. Researchers have developed a number of methods enabling a system to learn efficiently under these conditions.
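One family of methods for the delayed-consequence problem is temporal-difference learning, which is treated at length in Sutton and Barto (1998). The sketch below is a minimal illustration rather than a reproduction of any algorithm from that book: it assumes a corridor of five states traversed in order, a reward of 1 only on the final step, and hypothetical values for the step size `alpha` and the discount factor `gamma`.

```python
def td_chain(n_states=5, episodes=200, alpha=0.1, gamma=0.9):
    """TD(0) value estimates for a corridor where only the last step pays off."""
    value = [0.0] * (n_states + 1)  # value[n_states] is the terminal state, kept at 0
    for _ in range(episodes):
        for s in range(n_states):   # the system always moves one state forward
            reward = 1.0 if s == n_states - 1 else 0.0  # feedback arrives only at the end
            # Compare the current estimate with the reward plus the discounted next estimate.
            td_error = reward + gamma * value[s + 1] - value[s]
            value[s] += alpha * td_error
    return value[:n_states]
```

Over repeated episodes the estimate for the last state rises first, and the estimates for earlier states follow, each discounted by `gamma` per step. Credit for the final reward is thereby passed back to the earlier behavior that led to it.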
Several technologies have benefited from applications of reinforcement learning. Although inspired by reinforcement phenomena in animal behavior, research in computational reinforcement learning seeks mainly to create learning methods that can make artificial systems more autonomous and adaptable; it does not attempt to create accurate models of animal learning.
Conclusion
The principle of reinforcement emerged in experimental psychology to account for increases in an animal's responding that are due to its consequences. Computer scientists and engineers have adapted this principle to create systems that improve their behavior over time by learning from their own experiences.
See also: ALGORITHMS, LEARNING; OPERANT BEHAVIOR
Bibliography
Estes, W. K. (1944). An experimental study of punishment. Psychological Monographs 57, 263.
Pavlov, I. P. (1927). Conditioned reflexes, trans. G. V. Anrep. London: Oxford University Press.
Skinner, B. F. (1938). The behavior of organisms. New York: Appleton-Century-Crofts.
—— (1969). Contingencies of reinforcement. New York: Appleton-Century-Crofts.
—— (1981). Selection by consequences. Science 213, 501-504.
Sutton, R. S., and Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
Thorndike, E. L. (1898). Animal intelligence: An experimental study of the associative processes in animals. Psychological Review Monograph Supplements 2, 109.
A. Charles Catania
Revised by Andrew G. Barto
Reinforcement
Although the term reinforcement has many common uses and associated meanings, its meaning is precise when used by behavior analysts and behavior therapists. The act or process of making a reinforcer contingent on behavior is termed positive reinforcement, and a reinforcer is any object or event that, when delivered following some behavior, increases the probability that the behavior will occur again. A typical example might come from a laboratory experiment with rats. A rat is placed in a small plastic chamber. The rat can press a lever located on one wall of the chamber. When the rat presses the lever, a small food pellet drops into a dish. If the rat returns to the lever and continues to press it, it would be said that the food pellet functions as a reinforcer and that the behavior is maintained by positive reinforcement.
There is often confusion between positive reinforcement and negative reinforcement. Negative reinforcement occurs when a behavior results in terminating an aversive stimulus. In the case of the rat, the aversive stimulus might be a loud noise. A lever press turns off the stimulus. If the rat continues to press the lever, it would be said that the loud noise functions as a negative reinforcer and the behavior is maintained by negative reinforcement. Thus, both positive and negative reinforcement refer to increases in behavior, but they differ in whether a pleasant stimulus is presented as the result of some behavior (positive reinforcement) or an aversive stimulus is removed as the result of some behavior (negative reinforcement). Negative reinforcement is also referred to as escape (if the response turns off the stimulus each time it appears) or avoidance (if the response can postpone presentation of the stimulus).
It is important to note that reinforcement is a concept that refers to the relationship between behavior and its consequences. Stimuli or events are not assumed to have inherent reinforcing effects. For example, although most people like money and will continue to exhibit behavior that results in obtaining money, it cannot be assumed that money functions as a reinforcer for everyone. For example, money might not serve as a reinforcer for a monk devoted to an ascetic lifestyle. The defining characteristic of reinforcement depends on how a behavior is changed and not on the types of things that serve as reinforcing events (Morse & Kelleher, 1977). Factors that help determine whether a given object or event is reinforcing or punishing for a given individual include that individual's previous experiences and other features of the environment that coexist and are associated with the object or event. The upshot is that different things may function as reinforcers for different people.
Drugs can serve as reinforcers that maintain drug-seeking and drug-taking behaviors. This can be observed in the prevalence of drug use among humans and has also been shown in laboratory research with animals. In a typical laboratory experiment, an animal such as a rat or monkey has a catheter placed in a vein and connected to a pump-driven syringe. The animal can press a lever to activate the pump, and this results in a dose of a drug such as cocaine, heroin, nicotine, or alcohol being infused into the vein. If the animal continues to press the lever to obtain the drug, then the drug is said to serve as a reinforcer. Interestingly, those drugs that lead to addiction in humans also serve as reinforcers in animals. The only exception is marijuana (THC), which is used fairly extensively by humans but does not function as a reinforcer in animals. It should be noted that drugs that serve as reinforcers under one condition may not serve as reinforcers under other conditions. For example, nicotine serves as a reinforcer only at low doses and when doses are properly spaced. Nevertheless, the observation that drugs of abuse generally function as reinforcers in experimental animals has brought the study of drug-seeking behavior and drug abuse into a framework that allows carefully controlled behavioral analyses and the application of well-established and objective behavioral principles (Schuster & Johanson, 1981).
The acquisition of drug use in humans predominantly involves positive reinforcement, whereas the maintenance of drug use can involve both positive and negative reinforcement. The ability of a drug to serve as a positive reinforcer is usually associated with its pleasurable subjective effects (e.g., a "rush," a "high," or other feelings of intoxication). But again, given the definition of reinforcement, it is not necessary for a drug to be subjectively reinforcing or pleasurable in order for it to maintain behavior. Many drugs are also associated with symptoms of withdrawal when abstinence is initiated following a period of regular use. In this case, taking the drug again may terminate the aversive state of withdrawal; in this way, drug use is maintained by negative reinforcement. Drug use can also be influenced by sources of reinforcement other than the direct effects of the drug. For example, social encouragement and praise from a peer group can play an important role in the development of drug use by teenagers. Biological factors may also come into play. For example, some individuals may be more or less susceptible than others to feeling and recognizing the pleasurable effects of drugs. When drug use is viewed as a behavior maintained by the reinforcing effects of drugs, it suggests that this behavior is not amoral or uncontrolled but rather that it is the result of normal behavioral processes.
(See also: Addiction: Concepts and Definitions: Causes of Substance Abuse: Learning; Research, Animal Model: Intracranial Self-Stimulation; Wikler's Pharmacologic Theory of Drug Addiction)
BIBLIOGRAPHY
Morse, W. H., & Kelleher, R. T. (1977). Determinants of reinforcement and punishment. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior. Englewood Cliffs, NJ: Prentice Hall.
Schuster, S. R., & Johanson, C. E. (1981). An analysis of drug-seeking behavior in animals. Neuroscience and Biobehavioral Reviews, 5, 315-323.
Maxine Stitzer
Reinforcement
Definition
Reinforcement is an act, process, circumstance, or condition that increases the probability of a person repeating a response. A reinforcer is a stimulus that follows some behavior and increases the probability that the behavior will occur. For example, when teaching a child to clean his room, a parent might reinforce him by giving him a reward for doing so without being reminded. The reward reinforces the desired behavior and increases the probability that the child will clean his room again in the future.
Description
In psychology, reinforcement is associated with two types of conditioning: classical (or Pavlovian) conditioning and operant (or Skinnerian) conditioning. Operant conditioning is frequently used to increase the frequency of a desired response by developing positive associations between it and a stimulus or reinforcer. Behavior modification uses the principles of operant conditioning or other learning techniques to improve symptoms or to increase desired behavior.
In operant conditioning (as developed by B. F. Skinner), positive reinforcers are rewards that strengthen a conditioned response after it has occurred, such as feeding a hungry pigeon after it has pecked a key. Negative reinforcers are stimuli that are removed when the desired response has been obtained. For example, when a rat is receiving an electric shock and presses a bar that stops the shock, the shock is a negative reinforcer—it is an aversive stimulus that reinforces the bar-pressing behavior. The application of negative reinforcement may be divided into two types: escape and avoidance conditioning. In escape conditioning, the subject learns to escape an unpleasant or aversive stimulus (a dog jumps over a barrier to escape electric shock). In avoidance conditioning, the subject is presented with a warning stimulus, such as a buzzer, just before the aversive stimulus occurs and learns to act on it in order to avoid the stimulus altogether.
Punishment can be used to decrease unwanted behaviors. Punishment is the application of an aversive stimulus, or the removal of a valued one, in reaction to a particular behavior. For children, a punishment could be the removal of television privileges when they disobey their parents or teachers. The removal of the privileges follows the undesired behavior and decreases its likelihood of occurring again.
Reinforcement may be administered according to various schedules. A particular behavior may be reinforced every time it occurs, which is referred to as continuous reinforcement. In many cases, however, behaviors are reinforced only some of the time, which is termed partial or intermittent reinforcement. Reinforcement may also be based on the number of responses or scheduled at particular time intervals. In addition, it may be delivered regularly or irregularly. These variables combine to produce four basic types of partial reinforcement. In fixed-ratio (FR) schedules, reinforcement is provided following a set number of responses (a factory worker is paid for every garment he assembles). With variable-ratio (VR) schedules, reinforcement is provided after a variable number of responses (a slot machine pays off after varying numbers of attempts). Fixed-interval (FI) schedules provide for reinforcement of the first response made after a given interval has elapsed since the previous reinforcement (contest entrants are not eligible for a prize if they have won one within the past 30 days). Finally, with variable-interval (VI) schedules, the first response is reinforced after an interval that varies from one reinforcement to the next.
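The four basic schedules can be sketched as two small classes, one counting responses and one tracking elapsed time. This is an illustrative sketch only. The class names, the `respond(now)` method, and the particular distributions used for the variable schedules (a uniform spread around the mean) are assumptions, not a standard laboratory specification.

```python
import random

class RatioSchedule:
    """Reinforce after a required number of responses (FR if fixed, VR if variable)."""
    def __init__(self, mean_ratio, variable=False, rng=None):
        self.mean_ratio, self.variable = mean_ratio, variable
        self.rng = rng or random.Random(0)
        self._reset()

    def _reset(self):
        self.count = 0
        # VR: the requirement varies around the mean; FR: it is always the same.
        self.required = (self.rng.randint(1, 2 * self.mean_ratio - 1)
                         if self.variable else self.mean_ratio)

    def respond(self, now):
        """Register one response; return True if it is reinforced."""
        self.count += 1
        if self.count >= self.required:
            self._reset()
            return True
        return False


class IntervalSchedule:
    """Reinforce the first response after a required time has elapsed (FI or VI)."""
    def __init__(self, mean_interval, variable=False, rng=None):
        self.mean_interval, self.variable = mean_interval, variable
        self.rng = rng or random.Random(0)
        self._reset(0.0)

    def _reset(self, now):
        # VI: the wait varies around the mean; FI: it is always the same.
        wait = (self.rng.uniform(0, 2 * self.mean_interval)
                if self.variable else self.mean_interval)
        self.available_at = now + wait

    def respond(self, now):
        """Register one response at time `now`; return True if it is reinforced."""
        if now >= self.available_at:
            self._reset(now)
            return True
        return False
```

For instance, `RatioSchedule(5)` reinforces every fifth response no matter when the responses occur, whereas `IntervalSchedule(30)` ignores how many responses are made and reinforces only the first one made at least 30 time units after the last reinforcement. Passing `variable=True` gives the VR and VI counterparts.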
See also Behavior modification.
Resources
BOOKS
Flora, Stephen Ray. The Power of Reinforcement. Albany, NY: State University of New York Press, 2004.
VandenBos, Gary R., ed. APA Dictionary of Psychology. Washington D.C.: American Psychological Association, 2007.
PERIODICALS
Bogdan, Ryan, and Diego A. Pizzagalli. “Acute Stress Reduces Reward Responsiveness: Implications for Depression.” Biological Psychiatry 60.10 (Nov. 2006): 1147-54.
Dallery, Jesse, Irene M. Glenn, and Bethany R. Raiff. “An Internet-Based Abstinence Reinforcement Treatment for Cigarette Smoking.” Drug and Alcohol Dependence 86.2-3 (Jan. 2007): 230-38.
Doran, Neal, Dennis McChargue, and Lee Cohen. “Impulsivity and the Reinforcing Value of Cigarette Smoking.” Addictive Behaviors 32.1 (Jan. 2007): 90-98.
Fernandez, Melanie A., et al. “The Principles of Extinction and Differential Reinforcement of Other Behaviors in the Intensive Cognitive-Behavioral Treatment of Primarily Obsessional Pediatric OCD.” Clinical Case Studies 5.6 (Dec. 2006): 511-21.
Hand, Dennis J., Andrew T. Fox, and Mark P. Reilly. “Response Acquisition With Delayed Reinforcement in a Rodent Model of Attention-Deficit/Hyperactivity Disorder (ADHD).” Behavioural Brain Research 75.2 (Dec. 2006): 337-42.
Rieskamp, Jörg. “Positive and Negative Recency Effects in Retirement Savings Decisions.” Journal of Experimental Psychology: Applied 12.4 (Dec. 2006): 233-50.
Ruth A. Wienclaw, PhD
Reinforcement
In either classical or operant conditioning, a stimulus that increases the probability that a particular behavior will occur.
In classical (Pavlovian) conditioning, where the response has no effect on whether the stimulus will occur, reinforcement produces an immediate response without any training or conditioning. When meat is offered to a hungry dog, the dog does not learn to salivate; the behavior occurs spontaneously. Similarly, a negative reinforcer, such as an electric shock, produces an immediate, unconditioned escape response. To produce a classically conditioned response, the positive or negative reinforcer is paired with a neutral stimulus until the two become associated with each other. Thus, if the sound of a bell accompanies a negative stimulus such as an electric shock, the experimental subject will eventually be conditioned to produce an escape or avoidance response to the sound of the bell alone. Once conditioning has created an association between a certain behavior and a neutral stimulus, such as the bell, this stimulus itself may serve as a reinforcer to condition future behavior. When this happens, the formerly neutral stimulus is called a conditioned reinforcer, as opposed to a naturally positive or negative reinforcer, such as food or an electric shock.
In operant conditioning (as developed by B. F. Skinner), positive reinforcers are rewards that strengthen a conditioned response after it has occurred, such as feeding a hungry pigeon after it has pecked a key. Negative reinforcers are unpleasant stimuli that are removed when the desired response has been obtained. The application of negative reinforcement may be divided into two types: escape and avoidance conditioning. In escape conditioning, the subject learns to escape an unpleasant or aversive stimulus (a dog jumps over a barrier to escape electric shock). In avoidance conditioning, the subject is presented with a warning stimulus, such as a buzzer, just before the aversive stimulus occurs and learns to act on it in order to avoid the unpleasant stimulus altogether.
Reinforcement may be administered according to various schedules. A particular behavior may be reinforced every time it occurs, which is referred to as continuous reinforcement. In many cases, however, behaviors are reinforced only some of the time, which is termed partial or intermittent reinforcement. Reinforcement may also be based on the number of responses or scheduled at particular time intervals. In addition, it may be delivered regularly or irregularly. These variables combine to produce four basic types of partial reinforcement. In fixed-ratio (FR) schedules, reinforcement is provided following a set number of responses (a factory worker is paid for every garment he assembles). With variable-ratio (VR) schedules, reinforcement is provided after a variable number of responses (a slot machine pays off after varying numbers of attempts). Fixed-interval (FI) schedules provide for reinforcement of the first response made after a given interval has elapsed since the previous reinforcement (contest entrants are not eligible for a prize if they have won one within the past 30 days). Finally, with variable-interval (VI) schedules, the first response is reinforced after an interval that varies from one reinforcement to the next.
See also Avoidance learning; Behavior modification; Classical conditioning; Pavlov, Ivan
Further Reading
Craighead, W. Edward. Behavior Modification: Principles, Issues, and Applications. Boston: Houghton Mifflin, 1976.
Skinner, B.F. About Behaviorism. New York: Knopf, 1974.
Reinforcement
Reinforcement generally refers to the increase or strengthening of a particular response following the delivery or removal of a stimulus or event. Given that this is perhaps the most fundamental process in operant learning theory, it is critical to understand the difference between positive and negative reinforcement. Simply stated, positive reinforcement involves the presentation or delivery of something "positive," and negative reinforcement involves the removal, reduction, or termination of something "negative." It is important to note that both processes have the same effect; that is, they both strengthen or reinforce particular behaviors. This is a critical point, given that negative reinforcement is often mistakenly equated with punishment.
More specifically, positive reinforcement is a process in which a stimulus is presented following a particular behavior, thereby strengthening that behavior. The stimulus is referred to as a "reinforcer" and is roughly synonymous with the word "reward." The following is a simple example of positive reinforcement: Clarice's teacher provided lavish praise (a positive reinforcer) after Clarice used the word "please." Clarice's use of this word was positively reinforced; thus, she will be more likely to say "please" in the future.
Negative reinforcement is a process that involves the removal or reduction of a negative or unwanted stimulus after a behavior occurs, thereby strengthening that behavior. Negative reinforcement involves responding to "escape" from an annoying or aversive stimulus (i.e., a negative reinforcer or "punisher").
The following example demonstrates how Adam's gift-giving behavior was negatively reinforced: Adam's angry girlfriend, Holly, was not speaking to him. To escape this aversive and unpleasant situation, Adam gave Holly a bouquet of roses. The act of giving roses led to the removal of an unwanted and aversive situation (i.e., Holly began speaking to him again); thus, Adam's behavior was negatively reinforced. As a result, Adam will be more likely to give roses to his angry girlfriend in the future.
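Because negative reinforcement is so often confused with punishment, it can help to lay out the distinction as a small lookup table. The sketch below is only an illustration: the function name `classify_consequence` and its argument labels are hypothetical, and the label "punishment by removal" is an assumed term for the case in which taking something away reduces behavior.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Name the process from what happened to the stimulus and to the behavior.

    stimulus_change: "presented" or "removed"
    behavior_change: "increases" or "decreases"
    """
    table = {
        ("presented", "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("presented", "decreases"): "punishment",
        ("removed", "decreases"): "punishment by removal",  # assumed label
    }
    return table[(stimulus_change, behavior_change)]
```

In these terms, Clarice's praise is `("presented", "increases")` and Adam's roses are `("removed", "increases")`. Both are reinforcement because the behavior increases in both cases; whether a stimulus is presented or removed decides only between positive and negative.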
See also: LEARNING; SKINNER, B. F.
Bibliography
Chiesa, M. Radical Behaviorism: The Philosophy and the Science. Boston, MA: Authors Cooperative, 1994.
Iwata, Brian. "Negative Reinforcement in Applied Behavior Analysis: An Emerging Technology." Journal of Applied Behavior Analysis 20 (1987): 361-378.
Martin, Gary, and Joseph Pear. Behavior Modification: What It Is and How to Do It, 6th edition. Upper Saddle River, NJ: Prentice Hall, 1996.
Skinner, B. F. Science and Human Behavior. New York: Macmillan, 1953.
Skinner, B. F. Contingencies of Reinforcement: A Theoretical Analysis. New York: Appleton-Century-Crofts, 1969.
Laurie A. Greco
Reinforcement
Definition
A reinforcer is a stimulus that follows some behavior and increases the probability that the behavior will occur. For example, when a dog's owner is trying to teach the dog to sit on command, the owner may give the dog a treat every time the dog sits when commanded to do so. The treat reinforces the desired behavior.
Description
In operant conditioning (as developed by B. F. Skinner), positive reinforcers are rewards that strengthen a conditioned response after it has occurred, such as feeding a hungry pigeon after it has pecked a key. Negative reinforcers are stimuli that are removed when the desired response has been obtained. For example, when a rat is receiving an electric shock and presses a bar that stops the shock, the shock is a negative reinforcer: it is an aversive stimulus that reinforces the bar-pressing behavior. The application of negative reinforcement may be divided into two types: escape and avoidance conditioning. In escape conditioning, the subject learns to escape an unpleasant or aversive stimulus (a dog jumps over a barrier to escape electric shock). In avoidance conditioning, the subject is presented with a warning stimulus, such as a buzzer, just before the aversive stimulus occurs and learns to act on it in order to avoid the stimulus altogether.
Punishment can be used to decrease unwanted behaviors. Punishment is the application of an aversive stimulus, or the removal of a valued one, in reaction to a particular behavior. For children, a punishment could be the removal of television privileges when they disobey their parents or teacher. The removal of the privileges follows the undesired behavior and decreases its likelihood of occurring again.
Reinforcement may be administered according to various schedules. A particular behavior may be reinforced every time it occurs, which is referred to as continuous reinforcement. In many cases, however, behaviors are reinforced only some of the time, which is termed partial or intermittent reinforcement. Reinforcement may also be based on the number of responses or scheduled at particular time intervals. In addition, it may be delivered regularly or irregularly. These variables combine to produce four basic types of partial reinforcement. In fixed-ratio (FR) schedules, reinforcement is provided following a set number of responses (a factory worker is paid for every garment he assembles). With variable-ratio (VR) schedules, reinforcement is provided after a variable number of responses (a slot machine pays off after varying numbers of attempts). Fixed-interval (FI) schedules provide for reinforcement of the first response made after a given interval has elapsed since the previous reinforcement (contest entrants are not eligible for a prize if they have won one within the past 30 days). Finally, with variable-interval (VI) schedules, the first response is reinforced after an interval that varies from one reinforcement to the next.
See also Behavior modification
reinforcement
re·in·force·ment / ˌrē-inˈfôrsmənt/ • n. the action or process of reinforcing or strengthening. ∎ the process of encouraging or establishing a belief or pattern of behavior, esp. by encouragement or reward. ∎ (reinforcements) extra personnel sent to increase the strength of an army or similar force: a small force would hold the position until reinforcements could be sent. ∎ the strengthening structure or material employed in reinforced concrete or plastic.