The PDF was originally downloaded from
and is transcribed below for easier reading and use with text-to-speech.
Shaping behavior through reinforcement
Operant conditioning enables a trainer to correct unwanted behaviors by using a variable reward system
By Cheryl S. Smith
Article Transcription by Tameesha VanEtten of Valorzen Canine Training
This was the challenging concept at a recent Karen Pryor/Gary Wilkes Dog Training Seminar.
Karen Pryor is a former dolphin trainer and author of the highly popular book “Don’t Shoot the Dog.” Gary Wilkes is a veterinary behavior specialist who focuses on control of serious behavior problems in dogs. They have presented their seminars to such diverse groups as the National Association of Dog Obedience Instructors and the American Psychological Association.
What they teach is operant conditioning. The concept sounds a little scary, utilizing terms such as successive approximation, limited hold, shaping and stimulus control. Actually, operant conditioning is a very effective way to use reinforcement, and it works on birds, cats, dogs, dolphins, humans, otters and probably any other sentient being you might care to name.
We all use positive reinforcement when we praise or give a treat. But does a dog really understand exactly what you are praising, especially when a new behavior is being taught? If you are teaching a dog to sit, for example, has she already sat down or gotten up by the time the words “Good girl” come out of your mouth? What does the dog then think is the desired behavior?
This is where a conditioned reinforcer comes in. Words are too imprecise and lengthy to pinpoint a small action, but a whistle or a clicker can be sounded at the instant of the behavior you are after. If you follow the sound with the usual treat or praise, the dog will quickly learn that the chosen sound - be it a whistle, clicker, or something else - means, “You’re doing the right thing; your reward will be coming.” Because you can now delay the reward, you can work at a distance from the dog and still let it know when it is performing as desired.
With this concept as our only new information, we played the Training Game. One of the seminar participants volunteered to be the “dolphin” (training subject), and left the room so that we could decide on a behavior that our trainer would teach the dolphin. We chose to have the dolphin go to the center of the “pool” and spin in circles. Our volunteer came back in and we began.
She moved aimlessly about the room and got a click every time she headed toward the center. On several passes she got clicked as she reached the center. But on the next pass there was no click. She was visibly startled, and backed up. When she reached the center she got a click.
Now she knew to go to the center of the room, but had no idea what to do once she got there. So she went to the center and turned slightly to look at the trainer for a clue . . . and she got a click. Now she thought she really had it. She marched off in a new direction. This got a click at the turn several times, but then no click.
Our dolphin hurried back to the center. Quarter turn to the left, click. Turn back, nothing. Again, same results. Half-turn to the left, click. It was only another few seconds before she was spinning joyously in circles.
The exercise was a revelation. No verbal information at all was given, but our trainer had the volunteer performing the behavior in 10 minutes or less. Since we are much more adept at reading the facial expressions and body language of other people than of other species, we could easily see the emotions of the volunteer. She came in intrigued and curious. The first few clicks seemed to be agreeable, but when she wasn’t sure what to do once she got to the center she became confused and mildly upset. Once she thought she had it figured out she was pleased, and when she was proved wrong, she became frustrated and almost ready to quit.
Having witnessed this, we now watched a young Labrador Retriever with no obedience training learn to down and stay. The treat was shown to the dog and moved so that the dog sat. The dog got a click and the treat. This was repeated a few times, then the treat was moved so that the dog was sitting but with her head bent down. At no time did the trainer push or pull on the dog.
The really amazing part came after the dog had gotten maybe a dozen clicks for lying down. The trainer showed her the treat, then moved it back against his chest. The dog tried to follow it and was calmly told “wrong.” The dog lay down and got a click and the treat. The clicks started again - the treat was shown to the dog and taken away. We could see the dog thinking, and she slowly collapsed into the down. For that she got a click and a handful of treats.
The whole action had to be reinforced when the trainer stood up (he had been sitting on the ground), but in no time the dog would down and stay until she received the click. You could then start putting commands or signals to the behavior, and have a dog that knows the “down-stay” in one or two lessons.
Watching a person go through the procedure first helped us to see the reactions of the dog, who exhibited a very similar range of emotions, from enjoyment to confusion to frustration to joy.
Next we were introduced to negative reinforcement. Again, this is something we all use, whether it is a scolding, a scruff shake or a yank on the chain. However, the rule in operant conditioning is to give a warning before you give any correction.
A 9-month-old St. Bernard with a pulling problem was used for this demonstration. A rolled-up towel, referred to as a “bonker,” was the negative reinforcement. Our trainer began walking around with the dog, and when the dog started to move ahead of him, he said “No,” and threw the bonker at the dog’s head. Obviously, a little rolled-up towel is not going to hurt a big St. Bernard, but at the third “No,” the dog backed up until he was behind the trainer.
Our two trainers explained that the negative reinforcement is used to disrupt the unwanted behavior. Once that is achieved, you replace the unwanted behavior with some desired behavior. Now the St. Bernard was clicked and treated for being in heel position, and this gigantic puppy was walking sedately at a heel in no time.
While it is fine to give a treat every time while you are shaping a behavior, if you were to continue this way, the dog would soon learn to do the least possible work to earn the treat. Once a behavior is occurring reliably, you can switch to variable reinforcement; you reward only the best performances of the behavior.
Don’t raise your sights too high all at once, or the dog may get discouraged and quit; rather, gradually demand better and better performances. Improvement usually happens quickly, and both you and the dog will be delighted with the outcome.
There was so much more to this seminar than can be covered in one article - stimulus control, behavior chains and a funny warning about compound commands (the “sit sit” syndrome). For a more in-depth discussion of these concepts, read Pryor’s book, “Don’t Shoot the Dog.” Be sure to play the Training Game with people; you’ll learn much about how this system works. As Pryor warns, this is a way to think about training, not a set of rigid rules. Each trainer will shape a behavior slightly differently, clicking at a different time and demanding higher or lower standards from the beginning.
You may even find yourself using what you’ve learned on your boss or co-worker. Primarily, I hope you will use it to better understand your dog.
I knew I wanted to train my next dog without a choke chain and without punishing him before he knew what he was doing to earn the punishment. Operant conditioning has given me a whole new language to use.
Cheryl S. Smith lives in Campbell, Calif., with one human and four canine roommates, some of whom are trained better than others.