Thursday, February 14, 2013

But horses aren't predators!

Originally Posted by PunksTank    
Continuing to love your posts JillyBean!! I was wondering if you could explain one more thing about food rewards (if you were already planning too sorry for jumping the gun :P). I hear so many people say "horses don't think like predators, their food is at their feet, so they don't know how to work for food" - While I disagree with this, seeing my horses dig in the snow for the little grass underneath, and seeing other horses who have learned to kick walls or whinny for food. But I was wondering if you could explain it for people who believe that?
Excellent question and one that I've also seen come up a lot.

Bottom line, every living thing needs energy and usually devotes the majority of its life working to obtain it by consuming food in some way/shape/form. Horses are no exception. It is a simple fact of life: horses need food to survive and so they work for it - whether they're in the wild and searching for grazing grounds or domesticated and chasing the rest of the herd off their flake of hay. Since horses work so hard to obtain their food simply to stay alive, it's an easy thing to exploit as a reinforcer/motivator in clicker training, especially if you use something they enjoy eating and don't get all the time.

Where predators are concerned, I'd actually be more worried about using food with them than I would non-predators. Most predators, like dogs, are fine being given treats. However, there are a few (snakes come to mind) that like their prey to be alive and lose interest in meals that aren't moving. If you were to clicker train an animal such as this, you'd have to find something that motivates them. If squirming meals were the only thing that motivated them to work, you'd have to use that as your reinforcer!

Herbivores, on the other hand, actually need more food than predators to function and that is why they are ALWAYS eating. Plants actually do not contain a whole lot of nutrition, so herbivores eat a lot, poop out most of it, and so must continue eating more. In contrast, a predator, like a lion, can get all its nutrients from one meal and some can go weeks without eating.

Which brings us to the issue of being full. Any animal trained with treats does risk getting full and losing motivation to eat (like when you have a HUGE meal and don't even want to look at dessert! Rare, I know, but it does happen lol). You don't need a starved animal, but right after feeding time probably isn't the best time to try clicker training, either. Generally, if you work with your horse any other time than after feeding time, they should be decently motivated to work for treats since they have such high energy (food) demands. Horses allowed free-choice hay and grass are usually ok since they're getting a slow and steady food intake (as opposed to stuffing themselves once or twice a day) and should still want food when you're working with them since they're working for food all day anyway. However, you'll need to pay attention to your horse and get to know them to find out when their optimal training time will be based on the desire for food (or whatever your reward is) and any other factors that affect motivation and attention.

All of this applies to all animals for the same reasons - here are a few examples of "unlikely" animals working with clicker training, none of which are predators:


Goldfish (I don't think they're predators, and even if they are, how incredible that CT is so simple and elegant it can be used with virtually any animal!):

And, not going to lie, this is my favorite one I found for so many reasons and I LOVE these camels! (And for us CT junkies, check out the targeting, the "stand" game aka "stand on your mat", and the camel/trainer reaction to when the camel asks for food!)

Clicker Emergencies

I'm going to call this "clicker emergencies" to distinguish between this and clicker training. Though these examples aren't intended to actually teach the horse anything, having a clicker trained horse does come with a few side benefits that I've found very useful.

One example is for when you need your horse to do something new and there's no time to actually train the behavior. For instance, my colt needed somewhat urgent hoof care when I purchased him. He'd never been worked on by a farrier before and had a terrible flare and a few other issues I wanted to attend to right away, especially since it seemed like he was having strange bone development in order to balance himself on his hooves. By the time the farrier came out (about a week into using clicker training), Flash knew what the clicker meant, but we didn't have time to work on picking up, much less holding, his feet for the farrier. My dad, a skeptic about my clicker training, came out to help me hold him for the farrier. Flash was not happy and didn't participate, and I could tell my farrier was exercising a tremendous amount of patience. It wasn't long before I told him I could go get my clicker and that would probably help. My dad said the farrier probably didn't want me messing around and giving treats, but the farrier said to go ahead and do anything I thought might help. Out came my bag and the clicker! Normally, I would practice just picking up feet, then holding feet for a second, and then holding them longer and longer to actually train the behavior. However, there was no time for that. As soon as the farrier picked up Flash's foot, I started clicking and treating constantly. If he pulled his foot away or put it down, the clicking and treating stopped. It took him one try to figure out the game and then he was the easiest 18-month-old you've ever tried to work with! Again, this didn't teach him to hold his feet, but it got us through a nearly-impossible hoof trim. In addition, simply feeding him wouldn't have worked since it would have just created a mouthy and impatient horse trying to get more snacks. With the clicker, he knew he had to earn the treats and that they wouldn't just be given to him for no reason.
(Since then, we've done a lot of work to train him to be good about his feet, going through the process I describe above of asking more and more from him in order to earn the click, and I can now work with all his hooves without any problems and without the aid of a clicker or treats)

My second example of where the clicker has helped in a tight spot is to get a horse's attention in a critical and urgent situation. For example, last fall I was leading Flash back from a ride and he got excited and took off loping and bucking home, pulling the lead rope out of my hands. However, the place I was boarding was off a main road with lots of 50-60mph traffic and there was a good chance he would run right out on the road if I couldn't get him stopped. I yelled "whoa" and "Flash!", but he was headed for home! Then, almost by instinct, I started clicking my clicker furiously to get his attention - and he stopped immediately! Hey, he wasn't going to miss out on a treat! He stood still and waited for me to catch up to give it to him - at this point, I started clicking about every 5 seconds to tell him he was doing what I wanted (standing still) and keep him standing there while I caught up. Crisis averted!

Do I have to use a clicker and treats?

First, the clicker: The purpose of the clicker is to provide a "bridge" between the behavior you're trying to reinforce and the actual reward. This enables you to "mark" specific behaviors by clicking simultaneously with them when it would be impossible to give them a reward for the behavior right then.

A clicker works very well as a bridge because it is a distinct and consistent sound that creates a strong, clear association between behavior -> marker (the clicker sound) -> reward. A sound works better than any other type of marker because it will pretty much always be noticed and recognized.

However, any sound that is distinct and consistent will work for "clicker training". For instance, I know some people use the caps from Snapple bottles (they click when you push them in) and PunksTank uses a smooching sound and doesn't even have to carry a clicker device! The key to a good "marker" is making sure that it's always the same and always associated with your reward. For instance, if you make a smooch noise for a cue, then a smooch noise will not be an effective marker since it's not clear what you're indicating when you make the noise. Moreover, "good boy" or "good girl" is usually a poor marker choice because you're likely to say the same words or even just make the same sounds in other contexts and confuse the horse, and even our best efforts to say this the same way every time will likely fail since things like emotion will affect how we say it. Personally, I don't trust myself to be consistent enough with any verbal marker, and so I have my clicker permanently attached to my wrist with a high-quality elastic wristband and it's just one of the pieces of tack I grab when I intend to work with my horse. If I can grab a halter and lead rope, I can grab my clicker :)

Now, the treats:
Once you understand what a reinforcer really is, you can decide what you'd like to use as your reinforcer. As long as it motivates the horse to work, it is a reinforcer! Preferably, you want a reinforcer that the horse will work for over a period of time as well. Does your horse work for a scratch behind his ears? If you'd like, you can use that instead of treats! However, treats are often the most convenient reinforcer for a number of reasons. First, most horses are food-motivated simply because it's a basic need, so we can exploit it. Not all food will work for all horses - for instance, one of my horses only likes a few bites of grain and then loses interest. Grain would not be a good reinforcer for him, while it probably would be for most horses. I like using "cookies" because I believe they're healthier and I don't have to worry about him getting too much. Plus, I can change flavors to keep him interested. I try to find the smallest ones I can so that I can give a small reward without feeding too much each time.

Backing Up, Day 2

OK, so now that I've finally gotten all that theory and technicality stuff down, I can finally update on our progress today!

I went out with the goal to just work on what we started yesterday (backing up with a verbal cue), adding speed and getting him to respond to the verbal cue. AND, per PunksTank's suggestion, I wanted to make sure I kept our training session short.

I was pleasantly surprised with how yesterday's lesson apparently "sunk in" overnight! I'm betting that the same thing would have happened even with just a short break yesterday like PunksTank suggested. Unfortunately, I board my horses so it's a little difficult to spread out our sessions with breaks, but I'll have to get creative. For now, I'll just do little mini-lessons. I'm not sure how long I was out there today, but I made a point of stopping while we were ahead and keeping it shorter than yesterday.

I turned Flash out in the arena as soon as we got out there. He was eager to find out what game we were playing today, so he followed me wherever I went and stopped respectfully when I did (we've worked on where he's supposed to walk respectfully before and he got a reminder the other day when I reacted by shaking his halter without the clicker - he's been very respectful since). Then, I turned around and said "back up" - and he took a step backward! I immediately clicked and treated. He's backing up about 50% of the time on just the verbal cue now and will continue backing up if I keep saying it (backupbackupbackup...). He'll even do so at a decent speed, though I still want to get him faster. If I pick up my energy and walk toward him, shaking my finger at his chest like I did yesterday, then he picks up speed and moves pretty well.

I forgot to mention yesterday how he was swinging his hip some and not backing up straight, but I fixed that by swinging the lead rope at his hip and turning his head slightly, so he straightened back out. He seems to have worked the "straight" thing out now, especially since we're picking up speed and he has to move fairly straight in order to do so quickly.

After a few minutes of backing, he wandered off. I think he's feeding off some other cue he's not quite understanding and that I'm not trying to give, because he basically lunged himself on his own for a while. That alerted me to the fact that I needed to teach him a "come here" cue since he was so convinced he was supposed to be going around me (I try to do most of our training at liberty and didn't have the lead on to stop him). So, for the next few minutes, I focused on just asking him to come. Essentially, I called his name and extended the back of my hand to him and had him target it. Pretty soon, I could send him off by swinging the lead rope and then ask him to come in and touch my hand. Once I had his attention again, I asked him to back up a few steps, then come back forward when I called him and extended my hand. We only did this a few times, and then I decided it was a good place to stop while he was still interested and paying attention.

Tomorrow, I think I'll continue working on the "back up" and "come here" cues and focus on those until we have them down really well :)

One last interesting note - we worked a LOT on leading last year out of necessity, including trotting when asked. He knows his cue very well, even after he had the winter off, and immediately trotted up to me when I asked him to catch up while leading him to the arena. However, he never passed me and slowed down as soon as his head was at my shoulder. It's so nice to have a cutie trotting after me and managing the slack in the lead rope appropriately!

Pats and verbal rewards: Are they reinforcement?

I have never been able to wrap my mind around WHY we seem to think that patting a horse or telling it "Good Boy" would be rewarding for a horse. Personally, I think we do it because we find it rewarding. Human language on its own is meaningless to a horse, and I can't imagine that the horse (or any animal) really wants to be patted - with one of my horses, it would actually be counter-productive since that horse is really sensitive to things like that and shies away from them.

Realistically, the only way a pat or a verbal reward could be any sort of reinforcer would be if it was done consistently enough with other things to become associated with those things. For instance, if your horse gets a quick break or a change in activity when they get their pat or "good boy", they could become associated with one another. Essentially, you've done the same thing that clicker training does when it creates a "bridge": it takes an inherently meaningless signal and gives it meaning through association. However, since you're likely not being consistent and intentionally pairing the real reward with your pat or verbal reward, it probably won't become very strongly associated with any sort of reward that the horse wants to work for.

However, just for kicks and giggles, let's assume that horses find pats and being told "good boy" or "good girl" very rewarding.... It would still be a terrible reinforcer, much in the same way simply "treat training" is a terrible reinforcer and for the same reasons. The trouble with treat training is that you cannot give the horse immediate feedback on specific behaviors since it's impossible to give them a treat at that moment. Usually, if it's impossible to feed a treat, it would probably be impossible to give them a pat. Thus, it's not really connected to the specific behavior you're working on but rather an overall "I did something right."

The ultimate test to find out whether your pats or words are real reinforcers would be to stop giving them and keep everything else you're doing exactly the same. If you stopped patting or saying "good boy", would your horse still work for you at the same level/speed he does now? My bet would be yes - because he's not working for the pat or words. Rather, he's working for the release of pressure, the real reinforcer. Thus, since the pats aren't actually motivating the horse to perform the desired behaviors more often, they are, by definition, not reinforcers at all.

(Disclaimer - I'm not saying you shouldn't pat/pet/rub your horse or tell them "good boy". In fact, though I don't pat because that just isn't something I do for whatever reason, I do give lots of rubs and verbal "good"-s because I do think it reinforces my relationship with my horse. I don't expect it to assist with my training beyond simply establishing a bond with my horse and being comfortable and happy around each other. In contrast, I expect the reinforcement with the clicker to actually produce results in our training.)

Reinforcement Schedules

Once you understand reinforcement and punishment and the different ways that they work, now it comes to WHEN you reinforce. This isn't quite as critical as knowing why training works the way it does in the first place, so I'll keep it short and sweet for those that are interested in the various types of reinforcement schedules and the results they produce.

The first main type of reinforcement schedule is continuous reinforcement. This is the type of schedule most commonly used in clicker training. Basically, this means that the behavior is reinforced each time it's given. In other words, if I'm teaching my horse to pick up his feet, I click and treat each time the horse picks his foot up. This is best used during the initial stages of learning as it creates a strong association between the behavior and reinforcement. However, once the behavior is firmly associated with the reinforcement (and your horse knows what you expect from him), you can do two things - ask for more and/or switch to a partial reinforcement schedule. Personally, I do both. I'll explain the "asking for more" in the next post, but the basic idea is that the horse has to take the behavior one step further before getting a reinforcement (now he has to hold his foot up longer... and longer....) and you're actually asking for your horse to learn something new (i.e. holding a foot rather than just picking it up). However, for this post, I'm going to explain switching to a partial reinforcement schedule in order to reinforce the SAME behavior that was already taught. This prevents what we call "extinction" - in other words, the behavior stopping since we're not reinforcing it anymore (for those of you who think that a clicker trained horse will ALWAYS need a clicker, listen up!).

Partial reinforcement: this means that the horse doesn't get a reinforcement every time it does what you're asking. Instead, it only gets a reinforcement part of the time. This way, you can ask for the behavior more often without a reinforcement (i.e. you can ask for behavior without a clicker) and the horse will still respond even though it doesn't get a treat every time.

I'm only going to worry about the things that are most important here - if you want to know more, Google "reinforcement schedules".

Here are the key terms you need to know to understand partial reinforcement:
Fixed = when you reinforce doesn't change.
Variable = it's unpredictable when you'll reinforce behavior
Ratio = when you reinforce depends on the number of times the behavior is performed
Interval = when you reinforce depends on the amount of time that has passed (I'm not going to discuss this one here, though).

There are four types of partial reinforcement, and different schedules lead to different results. I've included a graph below that illustrates these. I'm only going to explain fixed ratio and variable ratio here, though, because they directly apply to clicker training.

Fixed ratio means that you reinforce after a specific number of correct behaviors. Generally, this leads to a steady rate of responses in order to earn the reward with only a brief pause after getting the reward. For example, every time a kid completes three math problems, he gets a piece of candy, so he does three math problems, receives his candy, eats it and pauses, then decides he wants another one so gets back to work again. The weakness here is that, if the horse doesn't receive a treat after the expected number of responses, the behavior can break down and the horse stops responding.

Variable ratio solves this problem. With variable ratio reinforcement, the horse never knows when it's going to get a reward - it can perform the desired behavior any number of times and may or may not receive the reinforcer. This is the most powerful reinforcement schedule as it produces a high and steady rate of the desired behavior. Don't believe me? This is how gambling addiction works: You never know when you're going to win. Even without any sort of reward most of the time (and even while losing money!), people keep on gambling and gambling because every now and then they win $5 back, $2 back, $10 back, etc., and they think they just might hit the jackpot with the next round.
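
For anyone who likes to see this laid out concretely, here is a minimal Python sketch of continuous, fixed ratio, and variable ratio reinforcement. It is only an illustration, not a training tool. The function names (`fixed_ratio`, `variable_ratio`, `run`) are made up for this example, and the variable ratio is simplified to a random draw on every correct response with a 1-in-3 chance of a reward, which is an assumption made for this sketch. In the printed pattern, "C" means click and treat and "." means a correct response with no reward.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th correct response (n=1 is continuous reinforcement)."""
    count = 0

    def should_reinforce():
        nonlocal count
        count += 1
        if count == n:
            count = 0  # start counting toward the next reward
            return True
        return False

    return should_reinforce


def variable_ratio(mean_n, rng):
    """Reinforce each correct response with probability 1/mean_n.

    Simplifying assumption: the reward arrives after an unpredictable
    number of responses that averages out to about mean_n.
    """

    def should_reinforce():
        return rng.random() < 1.0 / mean_n

    return should_reinforce


def run(schedule, responses):
    """Show the reward pattern over a number of correct responses."""
    return "".join("C" if schedule() else "." for _ in range(responses))


rng = random.Random(0)  # seeded only so repeated runs give the same pattern
continuous = run(fixed_ratio(1), 12)
fixed = run(fixed_ratio(3), 12)
variable = run(variable_ratio(3, rng), 12)

print("continuous:    ", continuous)
print("fixed ratio 3: ", fixed)
print("variable ratio:", variable)
```

The fixed ratio line always rewards on every third response, so the pattern is easy to predict. The variable ratio line has no pattern to learn, which is the whole point: any given response might be the one that pays off.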

This applies to clicker training when teaching the horse to respond the way you want it to without the clicker. I use this to reinforce behaviors that my horse knows and that I expect, but want to reward every now and then. When I'm "phasing out" the clicker, I'll ask the horse to do what I want and only click and treat every now and then. Thus, he learns that he can respond even without the clicker. Eventually, I won't use the clicker at all when asking for this behavior - this behavior is expected and the horse knows what he's supposed to be doing (thus I avoid the horse trying something else or getting confused because he didn't get a reinforcing click and treat). For things my horse knows REALLY well, I do click and treat every now and then just to say "good boy" in a way that's meaningful for him. I could probably be just fine without it, but I like to reinforce these behaviors every once in a while (i.e. maybe once in a week or even a month) just because. Since he never knows when what he's doing might earn him a treat, he's always listening even when he doesn't get one!

Operant Conditioning: Applying it to horses

Ok, so in my last post, I explained the different aspects of operant conditioning. (Operant conditioning is simply a fancy psychology term for saying we can train behavior through motivation and is applicable to pretty much every voluntary behavior known to living organisms).

Now, let's apply it to horses:

Positive reinforcement: This is where clicker training finds a home. With clicker training, a reinforcer (usually a treat) is introduced when the desired behavior is performed. Technically, treat training is also positive reinforcement as far as it is able to reinforce the behavior you're wanting. The addition of a clicker as a "bridge" between the behavior and actually receiving a treat simply allows us to be more intentional, accurate, and flexible with the behaviors we are trying to reinforce, which I already discussed in a previous post.

Negative reinforcement: This is where training off of pressure comes into play. In "traditional" training, pressure is applied to guide/ask the horse to do something, and then the pressure is released when the horse responds correctly. The horse learns to work for the release of pressure.

Positive punishment: This is used each time you smack your horse for getting in your space. For instance, if he is mugging you for treats and you give him a firm thwack on the nose, he learns not to mug you for treats or else!

Negative punishment: I had a hard time coming up with one for this, but it just dawned on me yesterday - this is often used in *proper* clicker training if you are going to give the horse a treat and the horse reaches out for it a little too eagerly. The correct thing to do in this situation would be to wrap your fingers around the treat and take it back. Withholding the treat discourages the horse from reaching for it, and since you are eliminating that behavior by taking something away that he would have gotten otherwise, it is negative punishment.
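
If it helps to keep the four straight, the labels boil down to two yes/no questions: was something added or taken away, and is the behavior supposed to happen more or less often? The short Python sketch below is just a memory aid built on that idea. The names `Quadrant` and `classify` are invented for this example, and the four entries are the horse examples from the list above.

```python
from enum import Enum


class Quadrant(Enum):
    POSITIVE_REINFORCEMENT = "positive reinforcement"
    NEGATIVE_REINFORCEMENT = "negative reinforcement"
    POSITIVE_PUNISHMENT = "positive punishment"
    NEGATIVE_PUNISHMENT = "negative punishment"


def classify(stimulus_added, behavior_increases):
    """Name the quadrant from the two yes/no questions.

    stimulus_added: True if something is added, False if something is removed.
    behavior_increases: True if the goal is more of the behavior, False if less.
    """
    if behavior_increases:
        if stimulus_added:
            return Quadrant.POSITIVE_REINFORCEMENT
        return Quadrant.NEGATIVE_REINFORCEMENT
    if stimulus_added:
        return Quadrant.POSITIVE_PUNISHMENT
    return Quadrant.NEGATIVE_PUNISHMENT


# (description, something added?, want more of the behavior?)
examples = [
    ("click and treat when he backs up", True, True),
    ("release the pressure when he responds", False, True),
    ("thwack on the nose for mugging", True, False),
    ("take the treat back when he grabs for it", False, False),
]

for description, added, increases in examples:
    print(f"{description}: {classify(added, increases).value}")
```

"Positive" and "negative" here only mean added or removed, not good or bad, which is the part that trips most people up.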

Oftentimes, these are simultaneously used to govern behavior. I already gave an example above about how each of these applied to why I graded those papers, but now here's an example of combining these within a training session: If I wanted to teach my horse to back up (as I did yesterday), I first gave the cue that I want him to learn ("back up"). However, since that didn't mean anything to him yet, I stepped forward and put pressure on his front shoulder. He took a step back, and I released the pressure (negative reinforcement), clicked, and treated (positive reinforcement). However, if he ever reached for a treat, I would have bopped him on the nose (positive punishment) and withheld the treat (negative punishment). During any given training session, you'll usually find me using a combination of positive and negative reinforcement (clicking and treating as well as using pressure). I don't use punishment unless he does something I don't want, obviously. Usually, that doesn't happen, though, since he's so keen on trying to figure out what I want :)

There are many other examples of how these are used with horses and in our everyday lives. Hopefully, you're beginning to get an idea of how important these principles are and how they apply to just about everything you do. For example, I am currently writing this because I want the positive reinforcement of hearing other people's comments and knowing I helped them learn something as well as the hoped-for negative reinforcement of fewer people writing off clicker training simply because they don't understand it.

Which, by the way, brings us to reinforcement schedules and how some schedules are more powerful than others in sustaining behavior - but that's for another post ;)