(Sorry, folks, this has nothing to do with tracking!)

We all spend hundreds of hours training our dogs for trials. How can we be sure we are not taking the long road and wasting time unnecessarily?

There are a few facts of life that can help us to be more effective in our training. Here is a reminder of a couple of those facts.

Learning quickly

When teaching your dog any new exercise, the sit for example, he will learn fastest if you reward him every time he does it. So if you are luring him with a titbit, or pushing his bum down to the floor, or clicking spontaneous sits, or whatever method you prefer, reward him every time – he will learn faster. Rewarding the dog every time in this way is known as a schedule of continuous reinforcement.

• Dogs (and other animals) learn faster on a continuous reinforcement schedule. - At the start of each new exercise reward every correct response.

A Continuous Reinforcement schedule can be abbreviated to CRF. That’s science for you – come up with a string of big words and then abbreviate it! Anyway for our purposes CRF means reward every correct response – titbit him every time he sits.

This may be the fastest way to teach a new exercise, but beware: it does have pitfalls. Many mums will remember praising their young children for using the toilet, but how often do you hear a mother praising an eighteen-year-old for the same behaviour?

We must move on; otherwise, how can we teach more complex behaviours if the dog gets his reward just for sitting?

Although dogs learn fastest if rewarded every time, the learning is not very strong: it is more susceptible to extinction, to becoming ‘unlearnt’. This is why dogs that are fed from the table at every mealtime quickly learn not to beg if the feeding suddenly stops. Dogs that are fed only sometimes, however, are more difficult to teach not to beg, because they have sometimes been rewarded and sometimes not. This makes them more patient and more persistent: ‘Okay, I didn’t get fed this time, but I probably will next time.’ They learn to persist because they were not always rewarded.

Dogs are intrinsic gamblers.

Can you remember playing the ‘one armed bandits’ on the pier as a child? If you found a machine that paid out regularly, you kept putting in more pennies. If that machine then stopped paying out for a while, you would keep playing in the belief that it would soon pay out again (after all, it had proven itself a good machine). Another payout would keep you playing, and if you then received a ‘jackpot’ you would be really hooked. If, after several more small payouts, the rewards stopped again, you would probably think that a ‘jackpot’ was about to come your way. Now you feed your pennies in recklessly! Your behavioural response is getting stronger. You are sure that another jackpot is on its way. Occasional rewards are all that is now necessary to keep you playing.

Compare this to walking onto the pier and trying a machine that did not pay out the first few times you put your money into it. You would quickly move on to find a ‘better’ machine. The early rewards (or lack of them) have a significant effect on your behaviour. First impressions are important, so when teaching a new exercise (or a new bit) reward lots in the early stages. Move on to skipping rewards occasionally and adding a jackpot after several unrewarded attempts (ensure that the one you reward is really good – that it is worthy of a jackpot).

• A variable schedule of reinforcement produces more reliable results. Progress to not rewarding every response.

A good way of progressing would be to move the exercise to a new location (your garden, the park, etc.) on a CRF (continuous reinforcement schedule). The same exercise in a new location counts as a new bit, so we reward every time he sits in the more distracting environment. At home in the lounge, where he is now reliable on this exercise, progress to a variable schedule of reinforcement (VR), in which the ratio between correct responses and rewards is variable, or unpredictable. So sometimes when he gets it right, just say ‘Good dog,’ with no click, treat, toy or whatever reward he is used to. Next time he does it, reward him as before. After three or four more repetitions, skip another reward. He should try harder the next time. If he does, reward that faster response with a jackpot. Continue to progress in this way.

Once he is reliable in the second environment (the new location), move on to a Variable Ratio schedule there too (let’s just say VR, now that we all know it simply means the rewards come on an unpredictable basis). Begin the exercise in a third location using a CRF (continuous reinforcement schedule). Build in this way until the sit is on a VR in all locations. Sit means sit, wherever he is and whatever is going on around him. He may or may not be rewarded.

Use this method to ensure each exercise is fully learnt in differing environments and with varying distractions, etc.
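For readers who like to see an idea written down, the difference between the two schedules can be sketched in a few lines of Python. This is only an illustration, not a training method in itself. The function name reward_plan, the mean_ratio setting and the rule for choosing the gap between rewards are all assumptions made for the example, since the only number given here is skipping a reward after three or four repetitions.

```python
import random


def reward_plan(n_responses, schedule, mean_ratio=3, rng=None):
    """Return a list of booleans, one per correct response.

    True means "reward this one", False means "just say 'Good dog'".
    schedule is "CRF" (reward every correct response) or "VR"
    (reward after an unpredictable number of correct responses).
    mean_ratio is a hypothetical setting: the average number of
    correct responses per reward on the VR schedule.
    """
    rng = rng or random.Random()
    if schedule == "CRF":
        return [True] * n_responses
    if schedule != "VR":
        raise ValueError("schedule must be 'CRF' or 'VR'")

    plan = []
    # Assumed rule: the gap to the next reward is drawn uniformly from
    # 1 .. 2*mean_ratio - 1, which averages out to mean_ratio.
    until_next = rng.randint(1, 2 * mean_ratio - 1)
    for _ in range(n_responses):
        until_next -= 1
        if until_next == 0:
            plan.append(True)
            until_next = rng.randint(1, 2 * mean_ratio - 1)
        else:
            plan.append(False)
    return plan


if __name__ == "__main__":
    print("Park (new bit), CRF:", reward_plan(8, "CRF"))
    print("Lounge (reliable), VR:", reward_plan(8, "VR", mean_ratio=2))
```

Starting with a low mean_ratio such as 2 and raising it only slowly mirrors the advice to skip rewards occasionally at first rather than dropping them all at once. With mean_ratio set to 1 the VR plan rewards every response, which is simply CRF again.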

Just to recap, then:

• Dogs (and other animals) learn faster on a continuous reinforcement schedule. - At the start of each new exercise (or when moving the exercise to a new location) reward every correct response.

• A variable schedule of reinforcement produces more reliable results. Progress to not rewarding every response.

Warning

The fastest way for something to become ‘unlearnt’ is to move too quickly from rewarding every time to not rewarding enough. Remember the dog begging at the table? If he was fed at every meal and then the feeding suddenly stopped he quickly learned that begging was not productive any more so he stopped begging. This process is known as ‘extinction.’ So if you think your dog ‘jolly well knows what sit means now’ and you stop the rewards too abruptly then you actually put the sit onto an extinction schedule and he will soon stop sitting on command. At this stage you might be tempted to change from reward based training to making him do it.

This would be a shame because all the positive associations you have built with the exercises will now be lost as you suddenly become annoyed and force him to sit. However, if you are aware of this theory you will plan your training so that the transition from CRF to VR is gradual – just like the best one armed bandits on the pier.

So now let’s compare the dog that was begging at the table with teaching the dog to come when called. We start, ideally, with a puppy. We call him when we know he is likely to respond, that is, when there is nothing else going on. The puppy comes and we feed him a small treat or a portion of his daily ration. This is repeated many times. It is on a CRF. We make it easy for the pup to get it right and we reward him every time he does. We are teaching him what the word ‘come’ means. At the same time we are teaching him that it is nice to come when called and that he can understand us (as long as we are consistent with our words).

As he becomes reliable in the lounge, you can begin to skip some of the rewards there (moving on to a VR). Remember, don’t suddenly stop or the behaviour will extinguish, just as the begging did when the feeding at the table stopped suddenly. Meanwhile, out in the park, you can begin to teach the come on a CRF. Try to make it easy for him to get it right by calling when there is not much else happening around him. Because he is in a more distracting environment (it’s a new bit), you must reward every correct response. Remember to allow him to go back to play after rewarding him for coming. Coming back for a reward will become a good game, and then you can move this on to a VR. Remember, he is a gambler! Get him hooked and then keep him guessing. But the rewards must be rewarding enough or else he will stop playing, just as you would swap machines if your one armed bandit paid out only occasionally and then only one or two pennies. Not worth it!

Summary

The dog will learn faster on a CRF schedule (rewarding him every time he offers the correct response). Use this to teach new exercises or to teach an already learned exercise in a new environment (new place, added distractions – people, dogs, etc.).

Reward every time when teaching new exercises, when moving a learnt exercise to a new situation or when teaching a new component of an exercise.

If you suddenly stop giving the rewards, extinction will occur: that is, the dog will stop responding. For example, he learnt to sit on command and was rewarded every time he did so. You then decide he knows it and no longer needs the titbit. The dog now learns that sitting is no longer rewarded, so he no longer sits on command. The response has broken down and become ‘unlearnt.’

Do not change too quickly from rewarding every time to seldom rewarding or the response will break down.

Once the new response is learnt, gradually move that behaviour onto a variable schedule of reinforcement (unpredictable rewarding). Start in an ‘easy’ environment, keeping the more challenging situations on a continuous schedule until they too are thoroughly learnt. Gradually introduce the variable schedule of reinforcement to the more challenging situations. Never move quickly from rewarding every response to seldom rewarding.

A variable schedule of reinforcement produces a more reliable response than a continuous schedule. It is less susceptible to extinction. That makes it most valuable when competing!

Anne Bussey
