Steve Martin Natural Encounters Inc.
Susan G. Friedman Utah State University
Clear, two-way communication is the cornerstone of successful animal training. Through clear communication, expert trainers fluidly shape an animal’s responses from one approximation to the next, resulting in a new, complex behavior in minutes instead of weeks. One of the most important communication tools is the conditioned reinforcer, also known as a secondary reinforcer, event marker, marker, bridging stimulus and bridge. Conditioned reinforcers improve two-way communication because they can be delivered the instant the right behavior occurs. This close temporal association between the behavior and the reinforcer is an essential characteristic of effective reinforcement known as contiguity.
Marian Kruse and Keller Breland were among the first animal trainers to use clickers to improve training outcomes more than 70 years ago. Along with Bob Bailey and others, they explored a wide variety of other conditioned reinforcers too, such as whistles, lights, touch and words. These conditioned reinforcers are now commonplace in our training toolkits.
Interestingly, the precise function of conditioned reinforcers is still being investigated (for a good discussion of the different accounts, see Pierce and Cheney, 2008). However, on a practical level we have the information we most need to know about conditioned reinforcers:
1) How to make them – pair a neutral stimulus closely and repeatedly with a well-established reinforcer, i.e., respondent (classical) conditioning.
2) How to break them – stop pairing the conditioned reinforcer with a backup reinforcer, i.e., respondent extinction.
As we consult in zoos around the world, we often see trainers inadvertently breaking, or weakening, their conditioned reinforcers by not backing them up with a well-established reinforcer. We call this training approach Blazing Clickers. Blazing clickers is defined as the unsystematic, rapid-fire clicking of each correct response in a series of correct responses, without following every click with a well-established, backup reinforcer, i.e., click, no treat.
Over the many discussions we’ve had with trainers who blaze their clickers, we’ve come to believe that this approach results from several misconceptions about basic behavioral processes related to conditioned reinforcers. The purpose of this paper is to improve your training effectiveness by addressing five common misconceptions associated with blazing clickers, and to add our voices to those trainers who recommend pairing every, or nearly every, click with a well-established backup reinforcer (see, for example, Fernandez, 2001; Ramirez, 1999; B. Bailey, personal communication, April 17, 2011; K. Pryor, personal communication, April 16, 2011).
For ease of communication throughout this paper we use the following shorthand:
1. The word click refers to any conditioned reinforcer used in training to reinforce a behavior with super contiguity. It is used synonymously with conditioned or secondary reinforcer, bridging stimulus, bridge, event marker and marker.
2. The word treat refers to any well-established reinforcer, conditioned or unconditioned, used to condition and maintain the reinforcing strength of the click. Treat is used synonymously with backup reinforcer (most often in animal training the backup reinforcer is food).
3. The term blazing clickers refers to the practice of repeatedly clicking without systematically delivering the backup reinforcer, also referred to as solo clicks.
Misconception #1 – Blazing clickers is a good approach because the clicker is a reinforcer (a secondary reinforcer), so the animal doesn’t need another one (the treat).
Some trainers say they don’t need to follow the click with a treat because the clicker is not only a marker or a bridge, but a bona fide secondary reinforcer too. Why deliver two reinforcers when one will do? It’s true that a well-conditioned secondary reinforcer can be as strong as, or even stronger than, a primary reinforcer, given a long, strong conditioning history. However, a critical difference between primary and secondary reinforcers is that primary reinforcers are automatically reinforcing – pre-wired so to speak; secondary reinforcers depend on experience, specifically close repeated pairing with other well-established reinforcers, to acquire and maintain their reinforcing strength. In fact, the procedure for returning a secondary reinforcer to its neutral state is un-pairing, i.e., delivering the secondary reinforcer repeatedly without a backup reinforcer, known as respondent extinction (a conditioned stimulus, CS, is presented without the subsequent unconditioned stimulus, US).
Somewhere between consistent pairing and no pairing is the progressive weakening of the secondary reinforcer. While secondary reinforcers do have a “shelf life,” the span of that shelf life is not knowable and those secondary reinforcers that do have a long shelf life are the result of a consistent click-treat history made up of dozens, if not hundreds, of pairings (Pierce & Cheney, 2008). Each time a click occurs without a backup reinforcer, it is quite literally a respondent extinction trial and secondary reinforcers can lose their strength to reinforce very quickly, a problem we have observed many times. As the click fails to reliably predict a treat, animals scan the environment for other clues that the backup reinforcer (food) is on the way, such as the subtle movement of the trainer’s hand toward the treat bag or bucket. In fact, in the case where trainers work up close with their animals, we often see animals respond to their trainer’s body language before or after the click is sounded (independent of the click). It may be that given close proximity, many animals respond faster to what they see than what they hear.
Misconception #2 – Blazing clickers makes training more interesting for the animal. If you treat every time you click, the session is too predictable and animals get bored with training.
Some trainers have explained to us that blazing clickers is a good way to keep animals interested in training, to ward off the boredom produced by the humdrum repetition of consistent click-treat. It is true that variety is the spice of life but we think the spice should come from the variety and quantity of reinforcers you provide, the behaviors you train, and the pace with which you train them, rather than blazing clickers.
Imagine finding your refrigerator locked 3 or 4 times a week just to keep things interesting for you. A different hypothesis for why an animal may become inattentive in a training session is worth considering, which we call blazing behaviors. That’s the rapid-fire cueing of mundane responses, responses that lead to no important skill or improved quality of life for the animal. For example, we routinely see training sessions composed of multiple, rapidly delivered cues to target different body parts, each touch lasting only a fraction of a second. It sounds like this:
“Gracie, arm-click, finger-click, shoulder-click, ear-click, foot-click, knee-click, back-click, goooooooooooood, treat-treat-treat.”
Talk about humdrum! When observing this type of training session, we find ourselves wondering exactly what is the purpose of teaching an animal to touch so many body parts in rapid succession in less than 20 seconds. Targeting is most useful when the touch is held for some duration of time. As a longer duration behavior, targeting can easily be leveraged into important medical and husbandry behaviors.
Misconception #3 – Blazing clickers builds stronger behaviors than consistently pairing click-treat because inconsistent pairing is a variable schedule of reinforcement like a slot machine.
Some trainers think that blazing clickers is a variable schedule of reinforcement that should lead to stronger behavior since the backup reinforcer is intermittently withheld. A variable schedule is one of several intermittent schedules of reinforcement where the number of responses (or time interval, duration, etc.) required for reinforcement changes around a set average. It is correct that intermittent schedules build persistence into fluent behavior, i.e., the behavior is slower to extinguish. However, there are two misconceptions rolled into this one rationale for blazing clickers. First, if the clicker really is an effective conditioned reinforcer, withholding the treat doesn’t change the fact that you are still using a continuous reinforcement schedule of clicks. If the click is not an effective conditioned reinforcer, we’re faced with the very real possibility that the click is meaningless noise the animal has to sort through to find the behavior-consequence contingency.
Second, it is worth considering that there is no absolute or inherent value in building behavioral persistence when it isn’t needed. Cued behaviors are a case in point: You have to show up to cue them, so why miss an opportunity to increase an animal’s daily amount of reinforcement by using a variable schedule? When persistence is required, the best approach is to first teach the new behavior with continuous reinforcement (click-treat) for the clearest communication of the behavior-consequence contingency. Next, gradually thin the reinforcers over time (known as stretching the reinforcement ratio) to the desired variable schedule, changing the amount of behavior unpredictably while increasing the amount of behavior required for reinforcement overall. For example, if a trainer wants a lion to make several trips to a public viewing window each day, a variable ratio schedule of reinforcement (i.e., the click-treat together, no solo clicks!) would be the right tool. Starting with a continuous schedule to get a high rate of window passes first, the trainer can then gradually thin the click-treat reinforcer by requiring an increasing, but variable, number of passes to get reinforcement. Implementing this training strategy takes time and careful planning to keep the reinforcement rate high enough for the lion to remain engaged in the training. A variable duration schedule can be used to increase the length of time the lion stays at the window.
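The thinning procedure described above can be sketched as a simple planning aid. The following is only a minimal illustration, not a prescribed protocol: the function names, the sequence of average ratios (1, 2, 3, 5), the number of reinforcers per step, and the choice to draw each requirement uniformly between 1 and twice the average minus 1 are all our assumptions for the example. Every requirement ends in a click and treat together; there are no solo clicks anywhere in the plan.

```python
import random


def variable_ratio_requirements(mean_ratio, n_reinforcers, rng):
    """Return the number of responses required before each click-treat.

    Hypothetical helper: requirements are drawn uniformly from
    1 .. 2*mean_ratio - 1, so they vary unpredictably but average
    mean_ratio over the long run. A mean_ratio of 1 is continuous
    reinforcement (every response earns a click-treat).
    """
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    upper = 2 * mean_ratio - 1
    return [rng.randint(1, upper) for _ in range(n_reinforcers)]


def thinning_plan(mean_ratios, reinforcers_per_step, seed=0):
    """Build a plan that stretches the ratio one step at a time.

    mean_ratios is an illustrative, gradually increasing sequence of
    average requirements; the right steps for a real animal depend on
    its behavior, not on this sketch.
    """
    rng = random.Random(seed)
    return [
        (mean, variable_ratio_requirements(mean, reinforcers_per_step, rng))
        for mean in mean_ratios
    ]


# Example: start on a continuous schedule, then thin gradually.
plan = thinning_plan([1, 2, 3, 5], reinforcers_per_step=6)
for mean, requirements in plan:
    print(f"average of {mean} passes per click-treat: {requirements}")
```

In the lion example, each number in a step would be the count of window passes required before the next click-treat. A trainer would stay at a step until the behavior is fluent and drop back to a smaller average if the lion begins to disengage.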
Misconception #4 – Blazing clickers reduces frustration aggression because the animal learns not to expect a treat every time. Otherwise, all hell can break loose if you run out of food before the session is over.
Some trainers have expressed the concern that animals trained with a consistent click-treat history will become aggressive when a treat is not presented. One way to solve this problem is to ensure that the treat is always presented after a click. This requires planning the right amount of backup reinforcers and parceling them out carefully during each training session, or ending a training session early because you have run out of food (something that should only happen once). You can also plan for a shorter training session with larger quantities of reinforcers and fewer behaviors or repetitions. This may improve the motivation of your learner and help you avoid the mundane repetition of blazing behaviors described in #2, above.
It is possible to teach an animal to offer a lot of behavior for mainly secondary reinforcers; see, for example, Alferink, Crossman, & Cheney (1973), which describes the process by which a conditioned reinforcer, a hopper light alone, came to maintain pigeons’ disk pecking 300 times in the presence of free food. However, conditioning such a strong secondary reinforcer requires the systematic implementation of a plan that includes hundreds of click-treat pairings, a strong backup reinforcer, and eventually variable schedules carefully delivered to avoid stretching the ratio of reinforcement too abruptly or too far (known as ratio strain). Such a structured approach is very different from haphazardly choosing to not follow each click with a valued backup reinforcer based on a hypothesized relationship between consistent click-treat training and animal aggression. On the other hand, there is abundant data that an extinction schedule can elicit aggression (called extinction-induced or frustration-induced aggression). That is a concern when a click has lost its reinforcing strength from solo clicking and the treat is being intermittently withheld.
Misconception #5 – Blazing clickers is good for telling the animal that what it just did is right and the animal should keep doing it. They’re definitely smart enough to learn a click means different things.
Of course it is no problem to teach animals a keep going signal (KGS). It certainly is exciting to see a sea lion responding to the KGS by swimming another speed-lap, or a macaw flying a few more lazy loops around the theater, or an elephant keeping its leg in position for a foot trim. It is a problem, however, when one signal, the click, is used to mean two entirely different communications. A red traffic light can’t effectively signal to drivers step on the brake and also step on the gas. Thank goodness we have red and green lights!
It is very unclear communication to have the same click mean two entirely different things such as food is coming and keep doing this behavior. We’ve seen animals walk away from the training station (now that’s some clear communication), when the same click was used in both these ways.
A well-conditioned click may well serve more than one function for a given behavior. It may mark the right behavior so the animal learns what to repeat to get food reinforcement again, it may bridge the behavior to the food reinforcer, and it may be a discriminative stimulus to end the behavior and prepare for food. A well-conditioned KGS doesn’t interrupt the flow of behavior, by definition. When using a KGS, a different signal is needed to communicate, “Food is coming now!”
Clickers, whistles and other conditioned reinforcers are valuable tools that help trainers communicate to animals the precise response they need to repeat to get a treat. When a conditioned reinforcer is reliably paired with a well-established backup reinforcer then communication is clear, motivation remains high and behaviors are learned quickly. However, when a click isn’t systematically paired with a backup reinforcer the communication becomes unclear, as evidenced by decreased motivation, increased aggression, and weak performance.
When the click begins to lose meaning because of repeated use without a treat, animals begin to search for other stimuli to predict their outcomes. They often watch for body language clues that predict the treat is imminent, thereby further strengthening the behavior-consequence contingency while the click becomes just noise. While it’s true a secondary reinforcer doesn’t lose its ability to strengthen behavior the first time it’s used without a backup reinforcer, the number of solo clicks to extinction can’t be predicted, and it can happen very quickly. So, while we may be able to get away with the occasional solo click, blazing clickers is not best training practice. When the click doesn’t carry information an animal can depend on, the result is undependable behavior.
1 Martin, S. & Friedman, S.G. (2011, November). Blazing clickers. Paper presented at Animal Behavior Management Alliance conference, Denver, CO.
Alferink, L.A., Crossman, E.K., & Cheney, C.D. (1973). Control of responding by a conditioned reinforcer in the presence of free food. Animal Learning and Behavior, 1, 38-40.
Fernandez, E.J. (2001). Click or treat: A trick or two in the zoo. American Animal Trainer Magazine, 2, 41-44.
Pierce, W.D., & Cheney, C.D. (2008). Behavior analysis and learning (4th ed.). New York, NY: Psychology Press, 221-240.
Ramirez, K. (1999). Animal training: Successful animal management through positive reinforcement. Chicago, IL: Shedd Aquarium, p. 14.