“The Negative Effects of Positive Reinforcement” by Michael Perone: Another Misrepresented Article


Note: I have been working on this paper for 18 months. Today when I published it, I was unaware that Dr. Perone was the head of a recent task force that concluded that contingent electric skin shock of a population that could include people with developmental disabilities, emotional disorders, and autistic-like behaviors could be part of an “ethically sound treatment program.” It casts his paper in a different light. I’m leaving my writeup published for now because I think we need these answers to what is an often-quoted paper. Please don’t consider it in support of Dr. Perone in any way.

“The Negative Effects of Positive Reinforcement” by Dr. Michael Perone is a scholarly article some trainers like to use to muddy the waters about positive reinforcement training. They throw out Dr. Perone’s article title like a bogeyman and use it to defend aversive methods in dog training. That usually indicates they haven’t read it. It’s a thoughtful article and has some interesting things to consider, but it doesn’t say what they seem to think it does. Not even close.

I’m going to list here and summarize the effects of positive reinforcement mentioned in the article. I’ll summarize why they have almost nothing to do with well-executed dog training. They give us something to think about in our human lives. But they apply almost exclusively to humans and our lifestyles, and the ones that can apply to animals are easily avoided.

Positive Reinforcement Can Have Delayed Aversive Consequences

Perone attributes the first mention of these aversive consequences to Skinner and quotes him several times (1971, 1983).

Here’s what they are talking about. Let’s say I spend my whole weekend water-skiing. I may come home with a sunburn (but the sun felt so good!), sore or strained muscles (but every run was great!), and maybe even a hangover (gosh that socializing was the best!). Don’t drink and boat, folks, this is just an example. I may be so wrung out after my fun weekend that I won’t have enough energy to finish the report I was supposed to have completed by Monday. All the things I did were fun and reinforcing at the time and I kept doing them, to the detriment of my body.

These potential longer-term aversive effects are one category of “negative effects” Perone is talking about.

How much do they apply to positive reinforcement-based animal training? Hardly at all! We don’t choose training methods and activities with delayed aversive consequences. As animal guardians, we aim to protect our animals from such consequences in both training and the rest of their lives. For example, we don’t let dogs overdo playing in the water hose—we don’t want to risk obsession or water intoxication. We don’t let a dog with an injury play endless games of fetch, even if they beg us. We interrupt dogs playing with each other when they begin to ramp up into over-arousal. The equivalent of my water-skiing weekend shouldn’t happen.

Perone quotes Skinner about activities that are so reinforcing they exhaust him. Skinner wrote, “Fatigue is a ridiculous hangover from too much reinforcement” (1983). He was concerned that the attraction of highly reinforcing activities would prevent him from more important activities with less immediate reinforcement. This is a crucial concern for any human with control over their activity choices, and one many of us wrestle with for most of our lives. Should I do the immediate fun thing or the less fun thing that has good results over time?

But this is unlikely to be a concern for positive reinforcement-based animal trainers. On the contrary, well-executed positive reinforcement training is a highly reinforcing activity for both the human and animal. It also has delayed positive consequences for both parties.

Do I even need to point out that aversive methods often have long-term aversive consequences, even deadly consequences? There is just no comparison.

Positive Reinforcement Can Make People Vulnerable to Exploitation by Government and Business

This is true. Exploiters can use positive reinforcement (praise, social acceptance, money, tangible items) to draw people into dangerous or unfair situations from which they can’t escape. This happens on the large scale but also on the small, interpersonal scale. This danger, again, has very little application to training animals or to our lives with animals. We already have a ton of control over their lives, even those of us who do our best to give our animals freedom. We work hard to make even the onerous experiences of life fun for our animals: things such as some husbandry activities, taking meds, and physical therapy. And we use positive reinforcement to give the animal more choices, more opportunities, a wider world. Plus remember: it’s fun.

Some Reinforcing Activities Naturally Have Delayed Aversive Consequences

This is a reiteration of the first point, but Perone includes a list of “more mundane” activities for short-term pleasure here.

Positive reinforcement is implicated in eating junk food instead of a balanced meal, watching television instead of exercising, buying instead of saving, playing instead of working, or working instead of spending time with one’s family. Positive reinforcement underlies our propensity toward heart disease, cancer, and other diseases that are related more to maladaptive lifestyles than to purely physiological or anatomical weaknesses.

Perone, 2003, referencing Skinner, 1971

Of course!

Here is my own example: Let’s say I eat a whole bag of Cheetos because they are engineered to taste good and cause me to want more and more. The behaviors of reaching into the bag or the bowl and putting a piece in my mouth and all other behaviors that get those Cheetos ingested are immediately and powerfully reinforced. Delayed aversive consequences can include stomachache, bloating, poor nutrition, and that “ick” feeling. Oh yeah, and getting the orange stuff all over my fingers. (See big important note at the bottom of the post. I am not food- or body-shaming here.)

Again, this doesn’t apply to animal training or living with our pets. For instance, with both horses and dogs, we educate ourselves about bloat and do our best to prevent the circumstances that can cause it. And I’m pretty sure I don’t have a single positive reinforcement dog training friend who would let their dog eat a whole bag of Cheetos.

But once during an agility trial, I gave Zani too many rich treats over the course of the day. On our last run, she had diarrhea in the ring. Was my conclusion, “Welp, better stop using positive reinforcement”? Of course not. My conclusion was, “You asshole, you made your dog sick with that Braunschweiger. It could have even been worse; dogs can suffer or even die of pancreatitis from too much fatty food. Don’t do that again.”

Aspects of Positive Reinforcement Schedules Can Be Aversive

Top-down view of a pigeon pecking a yellow button in a Skinner box

Perone describes two studies identifying aspects of positive reinforcement schedules that can be aversive. Yes, in a controlled laboratory environment, we can test to see whether an animal will work to avoid a certain positive reinforcement schedule.

In the first study, the researchers studied the effects on pigeons of a change from a rich reinforcement schedule (Variable Interval 30 seconds) to a leaner one (VI 120 seconds). With some clever indicators to the pigeons of which schedule was in effect, they showed the leaner schedule was an aversive condition compared to the richer schedule and that indicators of the leaner schedule could act as conditioned punishers (Jwaideh & Mulvaney, 1976).
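To get a feel for how big the step from VI 30 seconds to VI 120 seconds is, here is a rough sketch in Python. It is only an illustration, not the procedure from the study. The function names, the steady one-peck-per-second bird, and the choice to draw intervals from an exponential distribution are all my assumptions.

```python
import random


def max_reinforcers_per_hour(mean_interval_s: float) -> float:
    """Upper bound on reinforcers per hour for a VI schedule with this mean interval."""
    return 3600.0 / mean_interval_s


def simulate_vi_session(mean_interval_s, session_s=3600, peck_every_s=1.0, seed=0):
    """Count reinforcers earned in one simulated session on a variable-interval schedule.

    Assumptions (illustrative only, not taken from the studies): intervals are
    drawn from an exponential distribution, and the bird pecks at a steady rate.
    """
    rng = random.Random(seed)
    earned = 0
    # Time at which the next reinforcer becomes available.
    available_at = rng.expovariate(1.0 / mean_interval_s)
    t = 0.0
    while t < session_s:
        if t >= available_at:
            # The first peck after the interval elapses is reinforced.
            earned += 1
            # The next interval starts timing once the reinforcer is collected.
            available_at = t + rng.expovariate(1.0 / mean_interval_s)
        t += peck_every_s
    return earned


rich = simulate_vi_session(30)    # VI 30 seconds
lean = simulate_vi_session(120)   # VI 120 seconds
print(max_reinforcers_per_hour(30), max_reinforcers_per_hour(120))
print(rich, lean)
```

Under these assumptions, the richer schedule tops out at about 120 reinforcers per hour and the leaner one at about 30, so the same amount of pecking pays off roughly a quarter as often on the lean schedule.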

In the second study, pigeons were taught to recognize predictors of changes in reinforcement schedules and reinforcer magnitude. They were given the option to “escape,” to peck a key that would stop the trial until they pecked it again. When the trial was stopped, the indicator lights changed, the “house-light” color and intensity changed, and no pecks on any keys were reinforced. It turned out that within a schedule, the pigeons were most likely to take a time-out just after being reinforced. During schedule transitions, the pigeons were most likely to take a time-out when the indicators told them they were switching from high magnitude reinforcers to lower magnitude reinforcers (Everly et al., 2014). These situations meet the criteria for aversiveness because the birds were opting to escape, to “quit the game” for a time.

These are valuable lessons. It’s important to note that these were “free operant” experiments, rather than the discrete trials we generally use in training. This post discusses the difference. In life, we should have very few situations in which we make large step-downs in reinforcer magnitude or frequency for the same behavior. But it can happen by accident or out of ignorance. If there is likely to be a step-down of this sort, we need to take action about it.

Sable dog trotting toward camera with her mouth open and tail up (looking happy)
Summer in a competitive rally run

The example that comes to mind is competitive obedience. I used to compete in rally obedience with my dog Summer. While learning and practicing, I generally reinforced (and reinforced well, with meat or cheese) every behavior. Then I carefully stepped down to every second or third behavior. This was OK with her, and she maintained her enthusiasm. But what would have happened if, at that point, I had suddenly taken her into an obedience ring and performed a minute-and-a-half-long run of 25 behaviors with no reinforcement until the end? Well, maybe nothing bad performance-wise the first time. Her behaviors were strong and resistant to extinction. But it wouldn’t have been kind, and over time (it doesn’t take much time at all!) she would have learned the trial environment predicted no goodies while in the ring. This happened to a lot of dogs before skilled positive reinforcement trainers entered the obedience world.

Thanks to modern dog training methods, we now know lots of ways to make the ring experience happier for the dog and not have that huge step-down in fun. These include using conditioned reinforcers and putting some thought into our reinforcement schedules. Luckily, I had good teachers. What I did in practice was gradually wean Summer from intermittent treats during the run while teaching her she would get a mega-treat (a whole jar of chicken baby food) at the end of the run. We even practiced a fun “hurry from the ring to our crating area to get the treat” sequence as part of the routine when preparing. Believe me, this switch did not diminish her interest in and happiness with rally at all! And I was able to do the same during trials, so trials didn’t predict a leaner schedule to her.

Conclusion

Please note what I have not said here. I have not said that training with positive reinforcement has no possible negative consequences. It can. When we humans hold access to all the good stuff, it takes a mindful approach to avoid coercion. But if we are positive reinforcement-based trainers, avoiding coercion is already a top goal. Schedule effects such as Perone describes are a very good thing for us to learn about to provide the best, happiest experience for our animals. Punitive schedule changes can be avoided.

In the meantime, keep in mind that the negative side effects of positive reinforcement training listed in this article by Perone are minimal in animal training. These effects are not at all comparable to the potential fallout from force-based training, which can ruin the lives of dogs and destroy relationships.

The title of the article causes some trainers who use highly aversive methods to hope it can work as a “gotcha” to support their stance. “Look, positive reinforcement is just as bad!” Except it doesn’t show that at all, and they would know that if they had read it. Or they do know, and expect you not to read it. Next time you see it referenced, feel free to link to this post.

Training with positive reinforcement, even moderately well, is unlikely to have delayed aversive effects. It’s more likely to have both current and delayed beneficial effects.

A Note about Cheetos

I eat Cheetos and other snack foods. I’m aware they are engineered to be extremely tasty but not satisfying, so we eat more. I eat them anyway. I don’t food shame anybody. I don’t idealize thin body types. I hope everyone reading has the resources to treat themselves to plenty of their preferred pleasures in life, both short-term and long-term.

Further Reading

I find this article by Balsam and Bondy, The Negative Side Effects of Reward, a far better discussion of challenges we might encounter when doing positive reinforcement training. Before you get worried: this article is not at all damning of positive reinforcement-based animal training either. It gives some very practical information about challenges we already recognize. For instance, if you use a powerful food reinforcer, you may get more “food approaching” behavior than the behavior you are trying to capture and reinforce. (“My dog is distracted by the food!”) This is a fairly minor training challenge. The other points in the article are similar. Again, the “negative side effects” are not at all comparable to the fallout associated with force-based training.

Also, for advanced reading and more information about how to make positive reinforcement training the best it can possibly be, take a look at Nonlinear Contingency Analysis by Layng, Andronis, Codd, and Abdel-Jalil (2021).

Thank you to my well-qualified friend who looked over my post. All mistakes, of course, are my own.


References

Balsam, P. D., & Bondy, A. S. (1983). The negative side effects of reward. Journal of Applied Behavior Analysis, 16(3), 283-296.

Everly, J. B., Holtyn, A. F., & Perone, M. (2014). Behavioral functions of stimuli signaling transitions across rich and lean schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 101(2), 201-214.

Jwaideh, A. R., & Mulvaney, D. E. (1976). Punishment of observing by a stimulus associated with the lower of two reinforcement frequencies. Learning and Motivation, 7, 211-222.

Layng, T. J., Andronis, P. T., Codd, R. T., & Abdel-Jalil, A. (2021). Nonlinear contingency analysis: Going beyond cognition and behavior in clinical practice. Routledge.

Perone, M. (2003). Negative effects of positive reinforcement. The Behavior Analyst, 26, 1-14.

Skinner, B. F. (1971). Beyond freedom and dignity. New York: Knopf.

Skinner, B. F. (1983). A matter of consequences. New York: Knopf.


