INTRODUCTION TO LEARNING

______________________________

Reinforcement

 Psychologists have focused more attention on the power of consequences--rewards, punishment, and removing something unpleasant--to change behavior than on any other method. Some behavior modifiers use only this method; others don't use it at all. However, it is not known exactly how reinforcement works: (a) do rewards strengthen the habit (response tendencies in a specific situation) or (b) do rewards merely give us information, letting us know which responses result in the pay offs we want? Or, (c) do rewards act primarily as pay offs for performing a certain action, thus motivating us? This has been a controversy for decades. We still don't know. Perhaps all three processes are involved; that's my guess. Let's look at some of the complexity.

 Behaviorists have a specific definition for a reinforcer: a reinforcer is anything (like food) that is produced by an operant behavior (like pressing a bar) which increases the likelihood that the behavior will occur again in the future. Ordinarily, this is called a payoff or a reward (I often use reinforcer, payoff, and reward interchangeably), but you should realize that a reinforcer, on rare occasions, acts differently from a reward. For example, if your Dad makes a dessert every night but on one particular night announces that you get dessert that night because you studied before supper, this "reward" will probably have no effect on your studying (and, thus, isn't a reinforcer) because it really isn't meaningfully connected to or contingent on your studying. You get dessert anyway. Another example: if a teacher criticizes your handwriting, encouraging you to be more careful, and it results in your writing more neatly, then these reprimands function like reinforcers for better writing (or were they punishment for sloppy writing?). Certainly, rewards don't always work and produce the desired behavior, but, by definition, reinforcement always increases the strength of the preceding behavior.

 There are some other problems with the above definition of a reinforcer. It implies that reinforcers only influence behaviors. But there is reason to suppose that emotional reactions, thoughts, attitudes, and physiological processes are also affected by reinforcers. Also, the above definition may imply that only extrinsic material rewards (in the environment) are reinforcers, but, as we will see, simply our belief that others are impressed with us may be rewarding and feeling proud or excited may be a reinforcement. Certainly love, hate, and addictions "increase the likelihood of certain behaviors" but are they "produced by operant behaviors?" These emotions and needs precede the behavior and seem to motivate certain behaviors which will lead to desired pay offs (including feeling better which is negative reinforcement). Perhaps a need (like hunger) exists before there can be a reinforcer (food), but the drive or need is not ordinarily considered part of the reward. Again, the point is that needs, reinforcements, and rewards are related but somewhat different concepts.

 It may also surprise you but rewards will, strangely enough, sometimes reduce the frequency of the preceding behavior, i.e. have the effects of punishment. Extrinsic rewards are, in some circumstances, harmful, e.g. rewards (like "pay") may turn fun into "work," lower our motivation to do the "work," and reduce the amount of innovativeness or thinking we do about the "work" at hand, thus, making our behavior more automated and stereotyped. Warnings about when not to use material rewards are given later in the section on intrinsic motivation. Other examples of harmful rewards: giving concrete rewards (money, car use) for good grades results in lower grades! Threatening and pressuring students to do better is harmful but giving praise, offering to help, and giving encouragement is helpful (Brown, 1990). Repeatedly rewarding the student for completing easy tasks results in the student feeling less able and being less motivated. Even rewarding excellence with honor rolls and status may be detrimental if students restrict their interests or avoid hard courses to keep their GPA high. There are no simple rules that all wise people know. It is important to know some of the complexities (see Kohn, 1993, for an excellent practical summary).

 To further complicate matters, the effectiveness of a reinforcer (reward), of course, depends on the individual. Listening to loud music is a great reward for some people; it's punishment for others. Accumulating a lot of money is critical for some and rather meaningless for others. Likewise, failure affects us differently. If you are success-oriented, a failure experience seems to increase your drive to succeed and you will try again to accomplish the task. If personality-wise you focus primarily on avoiding failure, a failure is too punishing and you lose interest in the task; you won't try it again. You have to find your own reinforcers (see method #16 in chapter 11).

Losers visualize the penalties of failure. Winners visualize the rewards of success.
-Rob Gilbert

If at first you don't succeed, try, try again. This is easy for the success-oriented, hard for the person trying to avoid failing.

 Also, while it seems logical, experimentalists didn't point out until recently that the effects of a reinforcer depend on the context, i.e. a reward has much more impact on behavior if it is powerful relative to the other rewards available in the environment. Likewise, a reinforcer received in an environment rich with many other wonderful, freely available rewards is not going to have much impact on behavior (remember John?). Thus, the payoff for argumentative-rebellious behavior could be reduced by increasing the rewards obtained from completely different behaviors, such as studying, doing the dishes, getting a job, etc. Perhaps just being in a supportive, reassuring group would reduce the reinforcement gotten from arguing or fighting. Likewise, a weak reward in a rich environment can be strengthened by reducing the free reinforcement available or by making some of the other reinforcers also contingent on the desired behavior (McDowell, 1982). Example: The satisfaction of cleaning your room may be overwhelmed by the other pleasures in the room--TV, electronic games, clothes, friends on the phone, food, etc. Self-helpers need to consider the context of their self-reinforcement.
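
 A rough sketch of this context effect, assuming Herrnstein's hyperbolic equation from the matching-law literature. The formula, the function name, and every number below are illustrative assumptions, not something stated in the text above. The idea is that the rate of a behavior depends on its own payoff relative to all the other payoffs available.

```python
def response_rate(r, r_other, k=100.0):
    """Assumed model (Herrnstein's hyperbola): B = k * r / (r + r_other).

    r       -- reinforcement earned by the target behavior
    r_other -- "free" reinforcement available from everything else
    k       -- hypothetical maximum response rate (arbitrary units)
    """
    if r < 0 or r_other < 0:
        raise ValueError("reinforcement rates must be non-negative")
    total = r + r_other
    return 0.0 if total == 0 else k * r / total


# The same reward for cleaning your room, in two different rooms:
lean_room = response_rate(r=10, r_other=5)   # few competing pleasures
rich_room = response_rate(r=10, r_other=90)  # TV, games, phone, food...

print(f"lean environment: {lean_room:.1f}")
print(f"rich environment: {rich_room:.1f}")
```

 Under these assumed numbers, the identical reward supports far less behavior in the rich room. Lowering r_other (less free reinforcement) or raising r (making other rewards contingent on the task) both push the rate back up, which is the strategy described above.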

 Considering all this complexity, some psychologists (Klein and Mowrer, 1989) advocate giving up the word reinforcer because it is so unclear. For instance, if presenting food to a very full cat doesn't alter the cat's behavior, then food isn't a reinforcer in this instance, is it? As Bandura suggests, maybe a reinforcer is merely an incentive--a motivator--when the animal is needy. For instance, it is clear that some solutions to problems can be learned but not used (we may find the bathroom long before we need it), suggesting that immediate reinforcement (although, what about the relief of knowing there is one available?) is not necessary for learning to occur. It has also been shown that thin people eat when they are hungry; overweight people eat when food is available and attractive ("The cookies will get stale if they aren't eaten"). The eating-without-being-hungry reaction at first looks like an automatic, almost uncontrollable habit response, not a matter of reinforcement by reducing hunger (but maybe some other need is reduced).

 An example of the motivational aspect of reinforcers is your weekly pay check. Especially after 20 years, the money isn't a necessary reinforcement for learning how to do your job. The pay and the threat of losing your job are simply motivations; you work, in part, for the money. On the other hand, while it is common for self-helpers to reinforce studying by taking restful breaks, calling a friend, having a coke, taking a walk, etc., it seems unlikely that a person would study four hours every night just for those minor immediate rewards. Also, the grade arrives weeks or months after the studying! Hardly an immediate reinforcer. So, what explains studying? Or working for a promotion? Frankly, psychology doesn't explain this very well. I think we study, in part, because we repeatedly remind ourselves of the long-range positive and negative consequences of studying, and it feels good to be making progress towards a valued future. The little rewards the self-helper gives him/herself (the 10-minute break) may make the "work" a little more pleasant and probably remind us of our long-range goals, but those goals are usually the powerful motivators.

 Early learning theorists thought that being paired very close together (contiguity) was the key to connecting the CS with the UCS (in classical) and the response with the reinforcement (in operant). Recent research has shown that close pairing does not necessarily result in learning, but rather the CS must predict the UCS and the operant behavior must truly produce the reinforcement (not just be followed by a reward). The reinforcement must be contingent on the operant behavior. Contingency--knowing some behavior leads to certain pay offs--is the basis for conditioning. The motivated student must believe that studying leads to better grades and better grades lead to more success and success leads to more satisfaction and so on.
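
 One simple way to put a number on contingency, offered only as a sketch: compare how likely the payoff is when the behavior occurs with how likely it is when the behavior does not occur. The function name and the probabilities below are made-up assumptions for illustration.

```python
def contingency(p_payoff_with_behavior, p_payoff_without_behavior):
    """Difference between the chance of the payoff with and without the behavior.

    Near 0 means the payoff arrives regardless of what you do;
    near 1 means the behavior truly produces the payoff.
    """
    for p in (p_payoff_with_behavior, p_payoff_without_behavior):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be between 0 and 1")
    return p_payoff_with_behavior - p_payoff_without_behavior


# Dessert is served every night whether or not you study (assumed values).
dessert = contingency(1.0, 1.0)

# Good grades are much more likely with studying than without (assumed values).
grades = contingency(0.8, 0.2)

print(f"dessert contingency: {dessert:.2f}")
print(f"grades contingency:  {grades:.2f}")
```

 In the dessert case the reward always follows studying closely in time, yet the contingency is zero, so little learning would be expected. That is the difference between mere contiguity and true contingency.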

 Naturally, with all this controversy about reinforcement today, it is also questioned whether self-reinforcement will work. Many say it is the most effective self-help method we have; others totally ignore the method (Brigham, 1989). Isn't it amazing that we don't know how much of the effect of a reinforcer is due to receiving the reward itself, the personal reaction of the person to the rewarder (you or someone else), the reaction to being in control or controlled, and/or to the personal satisfaction of being successful and earning a reward? It's all intermixed. Maybe the confusion explains why people aren't more self-rewarding in order to produce more desired behavior. We apparently don't strongly believe in self-reinforcement or we'd be doing it all the time. Maybe, as Skinner thought, it is punishing to withhold a reward from ourselves, e.g. if you deprived yourself of an available fantastic reward--say a Porsche 944--until after completing the desired "target" behavior (say getting all A's this semester), would the strain of waiting for the Porsche be so unpleasant that the Porsche wouldn't actually reinforce studying? It isn't easy to say, is it? And, there is another question: would most people just cheat (if they could) and immediately take the car, forgetting about achieving the "target" GPA? I think most people could rationalize taking that beautiful little car out of storage for a special occasion or a little vacation. (In which case, you are reinforcing cheating and rationalizing.) Learning to live by the rules is a real problem, as we will see next.

 Another problem is that researchers studying self-reinforcement in children have confounded "self-control" (e.g. getting a prize after doing your school work) with external control (where the teacher sets up the reward system, including evaluating the work, deciding when and what prizes are given, etc.). Someone has to plan, execute, and monitor the system--either the teacher or the student. In most of these studies of "self-reinforcement," the little kids aren't taught to be skillful modifiers of their own behavior. So, when the teacher or a psychologist is running the project, it really isn't a self-directed project (although the student may physically give him/herself a toy as a reinforcement). If the children in these studies are not monitored by the teacher and if they grade themselves and have free access to the prizes, they tend to lie and cheat, taking the prizes rather freely (Gross and Wojnilower, 1984). That is no surprise and not a compelling argument against all self-reinforcement. It does raise questions but it is still possible that we--as adults and even as children--can learn to forego goodies and fun for a little while, so we can make these reinforcers contingent on doing the things that will improve our lives in the long run. To assume otherwise, i.e. that humans can't delay gratification and would always cheat to get what they want now, is a very negative view of the species. And it doesn't square with the bulk of the data (Mischel, 1981). Many people are testing the notion that useful knowledge (with or without reinforcers) enables a person to become self-directed (including you as you read this book).

 One more complication is that there are two aspects of self-reinforcement all mixed together. This is an example: (a) the satisfaction of sinking long shots while practicing basketball and (b) giving yourself a coke as a "reward" after doing well in basketball practice. Do both (a) and (b) actually reinforce accurate shooting? Or does (b) only reinforce practicing, not accuracy? How do we know? Secord (1977) says self-rewards and self-praise don't add much reinforcement beyond the satisfaction of doing well. On the other hand, the intrinsic satisfaction of making long shots isn't exactly self-reinforcement (you aren't in total control--you don't make every shot and you didn't create the thrill). Secord focuses on helping people set up the conditions (not reinforcement) that increase their chances of doing what they want to do but haven't been able to do, namely in my example, make more long shots (see change of environment in chapter 11).

 Age also partly determines which approaches you need to use with children or teenagers. With young children, you can teach parents and teachers how to modify the child's behavior by rewarding or punishing it. With teenagers, this manipulation of rewards frequently will not work because parents can't control much of the teenager's environment. Besides, teenagers are into self-control, i.e. doing their own thing, and skillful at resisting control. Therefore, the usual approach with teenagers is to teach them self-management--ways of changing their own environment--so that both they and their parents or teachers are happy. Often the major task the teenager needs to learn is which of his/her behaviors will irritate others and which will eventually be reinforced by others.

 Many behaviors produce a variety of consequences. Brigham (1989) points out that almost all problem behaviors occur when the complex consequences of an action are both immediate and delayed, e.g.:

  1. taking immediate pleasures but running into trouble in the long run (smoking, over-eating, building love relationships with two people at the same time, being so let's-have-a-good-time-oriented at work that you are fired),

  2. taking immediate small pleasures but losing out on major satisfactions later on (spending money impulsively as soon as you get it rather than saving your money for major, important purchases later, having a brief affair resulting in losing a good long-term relationship, teasing a person to the point that it becomes a big fight),

  3. avoiding a minor immediate unpleasant situation but risking a major problem (not going to the doctor to have an irregular, dark mole checked, avoiding treatment for an emotional or addiction problem, neglecting to buy condoms or to take the pill), and

  4. avoiding a minor immediate unpleasant situation and, thereby, missing out on an important future event (not studying hard enough to get into medical or law or graduate school, avoiding meeting people and not developing social skills that would lead to an enjoyable social life and wonderful relationships).

 Research has shown that animals and humans tend to take the smaller immediate reward, rather than waiting for a larger delayed pay off. Consider this example: suppose someone offered you $8 immediately for an hour of work or $10 for the work if you would wait three days to be paid, which would you take? Most would take the $8 now. But suppose someone offered you $8 for the work in 30 days or $10 in 33 days, i.e. the same 25% gain for the same 3-day wait, which would you take? The 33 day offer, of course. Maybe immediate, no-wait pay offs are just more satisfying. Maybe "a bird in the hand is worth two in the bush." Maybe life teaches us that promises may be broken. In any case, being aware of the appeal of immediate pay offs, and of our excessive focus on them, can help us cope with these situations. Where the immediate pleasures need to be decreased (#1 and #2), one should avoid the situations and develop other incompatible responses, like assuming more of a responsible leadership role at work instead of playing around. One needs to keep his/her eyes on the big long-range consequences (see motivation in chapter 14). Where one needs to tackle unpleasant immediate tasks (#3 and #4), one should change the environment or oneself so that the necessary immediate behavior is well rewarded while at the same time focusing on learning to enjoy dancing and studying. Again, keep the future in mind so you can avoid major problems and achieve major goals. When we are fully aware of all the consequences of our actions, we can have more self-control and more payoffs in the long run.
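
 The $8 versus $10 example can be worked through with a little arithmetic. The sketch below assumes a hyperbolic discounting formula, value = amount / (1 + k * delay), which is one common way to model this kind of choice; the formula, the discount rate k = 0.2 per day, and the function names are all assumptions for illustration, not something the text above specifies.

```python
def discounted_value(amount, delay_days, k=0.2):
    """Assumed hyperbolic discounting: how much a delayed payoff feels worth now."""
    if delay_days < 0:
        raise ValueError("delay must be non-negative")
    return amount / (1.0 + k * delay_days)


def preferred(option_a, option_b, k=0.2):
    """Each option is (amount, delay_days); return the one that feels worth more now."""
    return max(option_a, option_b,
               key=lambda opt: discounted_value(opt[0], opt[1], k))


# Choice 1: $8 right now versus $10 in 3 days.
near_choice = preferred((8, 0), (10, 3))

# Choice 2: $8 in 30 days versus $10 in 33 days (same $2, same 3-day wait).
far_choice = preferred((8, 30), (10, 33))

print("near choice:", near_choice)
print("far choice: ", far_choice)
```

 With this assumed rate, $10 in 3 days feels like only 10 / 1.6 = $6.25 today, so the $8 now wins. Pushed a month into the future, the two offers feel like about $1.14 and $1.32, so the $10 wins. The extra three days matter a lot when the alternative is "right now" and hardly at all when both payoffs are far away.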

 Regardless of the outcome of these many debates and questions about the technical term reinforcement, you can rest assured that the outcome or consequences of a specific behavior will in some way influence the occurrence of that behavior in the future. Providing a material reward isn't always the best thing to do. But, assuring that genuine satisfaction follows the desired behavior will enhance your learning and/or your motivation.

 As we conclude our discussion of learning, it must be made clear that (1) learning processes are quite complicated, but there is a great deal of useful knowledge available to us in this area, (2) theories often fail to explain or predict real life behavior, and the early theorists neglected many crucial causes of our behavior, and (3) learning theorists and experimental researchers have seldom developed helpful treatment or self-help methods. Hundreds of therapy and self-help procedures already exist; they were mostly invented by suffering people and creative practitioners. However, research and theories are important for knowing with greater certainty which methods work, how well they work, and why. That's why researchers should help much more in the process of "giving psychology away."

