Can horses really plan and strategise?
Reflections on a fascinating new study from Nottingham Trent University
A few weeks ago, researchers at Nottingham Trent University published the first study to find indications of higher cognitive abilities in horses. The study, “Whoa, No-Go: Evidence consistent with model-based strategy use in horses during an inhibitory task” (Evans et al. 2024), explored whether horses can predict a sequence of outcomes and decide how to act based on these predictions. This type of learning is called model-based learning, because it requires the creation of mental “models” of the world that can be accessed when decisions need to be made. It requires more advanced cognitive processes than model-free learning, which is simple action-consequence learning where an animal learns what works and what doesn’t through repeated trial and error.
Happily, the study got a lot of media attention. Less happily, the media headlines stated that “horses can plan and strategise”, which is not an accurate representation of the results. Predictably, this headline has travelled around the world and has already been weaponised by those eager to find excuses to believe the old myths of scheming horses who spend their days plotting how to make our lives difficult. This has in turn led to a backlash on social media among well-intentioned people who are concerned about the consequences of this misrepresentation.
So what is true? Can horses plan and strategise? What does the study actually say? Let’s unpack all this, starting with the study itself.
The experiment and the results
Evans et al.’s study is ambitious, well-designed, and an important puzzle piece in our understanding of equine cognition. The authors recruited 20 adult horses for the study, 7 mares and 13 geldings. That’s a decent sample size for behavioural research with larger animals (although having nearly twice as many males as females may skew the results if there is a sex-based difference in performance). All horses were kept at Nottingham Trent University, and the study was conducted in their home environment with minimal disruption to their daily management schedule.
The experiment consisted of three stages. I call them games here, because I find that thinking about behavioural experiments as games with rules is a more intuitive and fun way of engaging with the science. In the first game, the rule was simple: the experimenter held up a poster in front of the horse, and if they touched it with their muzzle, they were given a food reward.
In the second game, the rules became slightly more complicated: the horse was still supposed to touch the poster for a reward, except when a light was on. If they touched the poster when the light was on, they got no reward. Scientists call this a stop contingency. The illustration below shows the two rules of the second game.
In the third game, the rules were the same as in the second game, but there were now consequences for breaking them. If a horse touched the poster when the light was on, the experimenter initiated a “time out” by moving the poster out of reach for 10 seconds and “pausing” the game. After the 10-second time out, the experimenter approached the horse again to continue where they had left off. The premise is that the horses enjoy participating in the game because it gives them an opportunity to earn a food reward. So when they lose the opportunity to play for a while during the time outs, they become more motivated to follow the rules because breaking them now comes at a cost.
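For readers who like to see rules written down, here is a minimal sketch of the three games as a single Python function. It is my own illustration, not code from the study. The names (Outcome, trial_outcome) are made up, and I am assuming that the light plays no role in the first game and that nothing happens when a horse simply does not touch the poster.

```python
from enum import Enum


class Outcome(Enum):
    REWARD = "food reward"
    NOTHING = "no reward"
    TIME_OUT = "no reward and a 10-second time out"


def trial_outcome(game: int, light_on: bool, touched: bool) -> Outcome:
    """Return what happens on one trial under the rules described above."""
    if game not in (1, 2, 3):
        raise ValueError("game must be 1, 2 or 3")
    if not touched:
        # Assumption: withholding a touch is neither rewarded nor penalised.
        return Outcome.NOTHING
    # Game 1 has no stop signal (assumed), so every touch pays off.
    if game == 1 or not light_on:
        return Outcome.REWARD
    # Touching while the light is on: unrewarded in game 2, costly in game 3.
    return Outcome.TIME_OUT if game == 3 else Outcome.NOTHING


# The only difference between games 2 and 3 is the price of breaking the rule.
print(trial_outcome(2, light_on=True, touched=True).value)
print(trial_outcome(3, light_on=True, touched=True).value)
```

Written this way, it is easy to see that the rule itself never changes between the second and third game. Only the consequence of breaking it does.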
How does this relate to model-based vs. model-free learning?
The researchers hypothesised that if the horses were learning the rules of the game through model-free learning (i.e. through trial-and-error), there would be a gradual decline in the third game in how often they touched the poster when the light was on, as it would take them several repetitions to make the connection between their behaviour and the negative consequences.
On the other hand, if they were learning the rules through model-based learning, they would have already understood the rules in the second game but decided not to follow them as there was no cost involved. If this were the case, the moment a cost was introduced, there would be an immediate decrease in how often they touched the poster when the light was on, because they would be able to infer that the consequences of the rule breaking had changed.
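The two predictions can be sketched as a toy simulation. This is my own simplification and not the authors' analysis. Every number in it is hypothetical: the 10 light-on trials per session, the starting touch tendency of 0.8, the learning rate, and the lapse rate. The model-free learner lowers its tendency to touch a little after each time out, while the model-based learner is assumed to need just one time out to update what it already knows about the rule.

```python
def model_free_errors(sessions=3, stop_trials=10, p_start=0.8, alpha=0.05):
    """Expected errors per game-3 session for a trial-and-error learner.

    Each time out lowers the touch tendency p by alpha * p. A touch happens
    with probability p, so the expected drop per trial is alpha * p * p.
    """
    p, per_session = p_start, []
    for _ in range(sessions):
        errors = 0.0
        for _ in range(stop_trials):
            errors += p           # expected errors on this trial
            p -= alpha * p * p    # expected effect of a possible time out
        per_session.append(round(errors, 2))
    return per_session


def model_based_errors(sessions=3, stop_trials=10, lapse=0.1):
    """Expected errors per game-3 session for a learner that knows the rule.

    Assumes a single time out is enough to update the "cost" entry of its
    mental model, after which it only touches through occasional lapses.
    """
    per_session = []
    for s in range(sessions):
        discovery = 1.0 if s == 0 else 0.0  # the one touch that reveals the cost
        per_session.append(round(discovery + lapse * (stop_trials - discovery), 2))
    return per_session


print("model-free :", model_free_errors())   # declines gradually across sessions
print("model-based:", model_based_errors())  # [1.9, 1.0, 1.0]
```

With these made-up settings the model-free learner still makes several errors in every session of the third game, while the model-based learner drops to its lapse rate almost at once. The shape of the curve is what matters here, not the particular values.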
And the researchers did indeed see just such a drastic decrease in how often the horses touched the poster once the time out was introduced in the third game. Below is a box plot that shows the results and the statistical significance. Sessions 1-3 (the blue boxes) show the results of the second game, while sessions 4-6 (the green boxes on the right) show the results of the third game. The y-axis shows the number of times in a session that the horses touched the poster when the light was on, i.e. when they were not supposed to. There is a clear and statistically significant (p = 0.02) decrease between the second and third game. In other words, if nothing had really changed between the games, a difference this pronounced would turn up by chance only about 2% of the time.
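To make the idea of a p-value more concrete, here is a small sketch of an exact paired permutation test. This is not the test Evans et al. used, and the error counts below are invented for eight imaginary horses purely for illustration. The question the test asks is: if the time out made no difference, so that each horse's change was equally likely to go up or down, how often would the overall change be at least as large as the one observed?

```python
from itertools import product


def sign_flip_p_value(before, after):
    """Exact two-sided paired permutation test on the summed differences."""
    diffs = [b - a for b, a in zip(before, after)]
    observed = abs(sum(diffs))
    # Under "no real change", each horse's difference could just as well
    # have had the opposite sign, so we try every possible sign pattern.
    extreme = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            extreme += 1
    return extreme / total


# Invented error counts per horse (NOT data from the study).
made_up_game2 = [9, 7, 8, 5, 6, 9, 4, 8]
made_up_game3 = [3, 2, 5, 4, 1, 6, 5, 2]

print(sign_flip_p_value(made_up_game2, made_up_game3))  # 0.0234375
```

Only 6 of the 256 possible sign patterns give a change as large as the one in this toy data set, which is where the value of roughly 0.02 comes from. Note that the test says nothing by itself about which learning mechanism produced the change.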
The results of the study are quite compelling - it seems pretty clear that something drastic changes between the second and third game. The researchers conclude that this dramatic change is “consistent with model-based learning”, because it suggests that the horses may have understood the rules in the second game but only decided to follow them once a cost was introduced in the third game, i.e. they were able to predict the outcomes of different actions and choose the action that gave the outcome they desired (avoiding a time-out/continuing to play the game and earn rewards). If the horses had used model-free learning, on the other hand, they would have had to learn through trial-and-error in the third game which would have taken longer, and the decline in “errors” would have been slower and more gradual.
This is certainly one possible explanation, but the authors also provide three alternative explanations. They acknowledge that the results could be due to seasonal differences in motivation (as the second and third game were played three weeks apart), or due to the horses responding to other cues in the environment that the researchers were not able to control for, or even due to other learning mechanisms. To this I would add the possibility that the cost of breaking the rules was perceived as so great that learning was speeded up. Learning is complex, and we know horses, like other animals, are capable of learning in just one trial if the outcome has a strong emotional component (such as a very frightening experience). If they were highly motivated to participate in the game, it’s possible that losing the opportunity to do so through a time-out was a sufficiently negative experience for them to make a quick connection between their actions and the outcome.
This possibility seems even more likely in view of the large individual variation in the results. In the discussion, the authors state that “it is worth noting that considerable individual variation in errors was observed in all sessions indicating strategy was not consistently demonstrated by all subjects”. They do not explore this further, and I wish they had because for me this is the most interesting part of the study!
Individual variation and “the chaos factor”
I saw a similar pattern in a cognitive bias experiment I did a few years ago as part of my master’s research. The horses in my study learned to distinguish between buckets placed in two different locations. In one location there would always be carrots in the bucket, while in the other location, the bucket would always be empty. What I found was that some horses would keep approaching the empty bucket and it would take them many trials to “learn” that it was empty, while others would stop approaching the empty bucket after just one or two trials.
At the time I wondered if this was due to differences in individual motivations - some horses may have been motivated to keep approaching “just in case” there might be a carrot in the bucket, while others were motivated to conserve energy, or do something else instead. I recall Peter Godfrey-Smith bringing up a similar example from octopus research in his book “Other Minds”: in 1959, researchers set out to test a simple operant conditioning protocol on three octopuses named Albert, Bertram and Charles. The task was to pull a lever, and if they did, they were rewarded with a sardine. While Albert and Bertram performed with focus and diligence, Charles ran riot in his tank and did pretty much anything except pull the lever, including squirting water at the researchers.
This large individual variation in behaviour under similar conditions can be seen in pretty much all behavioural research, in species as diverse as horses and octopuses - I call it “the chaos factor”. Traditionally, scientists would regard this individual variation as statistical noise and clean any outliers out of the data set. Today, however, many of us have realised that this individuality is a feature and not a bug, because animals are not a monolith. They have different personalities, different experiences, different motivations, and different abilities. All these differences inevitably impact the choices they make, in behavioural tests and in real life.
While combining and averaging individual results might be necessary for statistical analysis, it deprives us of some of the most interesting aspects of animal behaviour - their individuality. Finding new experimental and analytical models that take this individuality into account will be the new frontier for behavioural research.
How smart are horses, really?
Where does this all leave us? Do horses strategise and plan, as the BBC, The Guardian and other publications would have us believe?
As usual, the answer is “it’s complicated”.
Let us first briefly recap model-free vs model-based learning. Model-free learning is learning through trial and error, where the animal makes a connection between its actions and the consequences through repetition. It is a simple form of learning with minimal cognitive involvement. Model-based learning, on the other hand, is when an animal creates a mental “model” of the different potential outcomes of its actions. This type of learning requires more advanced cognitive processes than model-free learning, because the animal understands that there are different possible actions that can lead to different outcomes, and decides which action to take based on some type of evaluation of the potential outcome.
I say “some type of” evaluation, because we have little reason to think model-based learning is all-or-nothing, meaning every species either evaluates future outcomes in the exact same way we humans do or not at all. Scientists increasingly accept that mental traits exist on an evolutionary continuum, just like physical traits. While it is possible that model-based learning emerged suddenly in H. sapiens some 200 000 years ago through an extreme mutation, it is incredibly improbable. It is far more likely that model-based learning, and other cognitive abilities, evolved from earlier stages just like limbs or the neocortex. And just like limbs and the neocortex, there can be functional variations between different species depending on the particular needs of their ecological niches.
So the first thing we need to do is ask: what purpose would the evolution of model-based learning have served for the horse as a species? If we approach cognition from this evolutionary perspective - as I think we should - then we can build a theoretical argument for the existence of (some type of) model-based learning in horses because of the complexity of the world they have evolved to inhabit. Horses live on large and varying home ranges, and they navigate complex and dynamic social relationships. It is reasonable to think that this requires at least some ability to make tactical decisions based on predicted outcomes.
There is also plenty of anecdotal evidence to support this theoretical argument. For example, home ranges will have different areas with water, shelter, grazing and so on. At any given moment, horses will need to decide whether to stay where they are or go somewhere else. Water sources can be far apart and take time to get to, so it would make sense for them to have evolved some ability to plan ahead in terms of when to leave and which route to take.
Another example closer to home is the fact that some horses will create designated toilet areas in their fields. This likely requires a number of cognitive processes, including interoception (realising they need to poop), impulse control (not pooping immediately), memory retrieval (remembering that they have a toilet area), and a mental map of the field (knowing how to get to the toilet area). If there are several toilet areas in the field, they may also need to utilise some type of model-based prediction to decide which one to go to.
A common argument against the existence of any “higher” cognitive processes in horses is the fact that horse brains have a smaller frontal cortical area than we do. This is at best a very weak argument. Horses and humans share a general mammalian brain structure, and there are certainly differences in both absolute and proportional size between our brains. But these differences are not in and of themselves predictive of function. Neuroanatomy is a complex research field, and similar cognitive processes can originate in different areas of the brain in different species.
An extreme example is the octopus, which doesn’t even have a centralised brain (or any prefrontal cortex in the mammalian sense) but still shows evidence of learning and rather complex cognition. Human and octopus brains have evolved independently for c. 750 million years and look very different, yet still ended up with some analogous cognitive abilities, like the ability to learn from consequences. So neither brain size nor shape is in itself predictive of brain function. We can’t know, just from looking at an animal’s brain, what cognitive abilities that animal will have or not have. Instead, we have to look at the behaviour, which is what Evans et al. have done.
Does their study prove that horses possess model-based learning? I would say no, because in science, a single study ranks low on the hierarchy of evidence. Evans et al. provide the first experimental indication that there might be something going on with equine cognition that we have previously not considered, but in order to prove the existence of model-based learning in horses, we need to repeat the study many more times with other populations. Once we have 5-10 studies with decent sample sizes that find the same thing, then we can consider this to be evidence of model-based learning in horses.
Can horses plan and strategise? Probably, to the extent that this has been necessary for their survival over the last 50 million years. This does not mean they can set up retirement funds or scheme to make our next ride miserable. Whatever planning and strategising they may be able to do will be focused on things that are relevant to them, like navigating social herd dynamics, finding different food sources, deciding where to poop, when to seek shelter from flies etc.
Where next for equine cognition research? Well, someone needs to start funding it. And I don’t mean novel experiments, I mean the boring legwork. Every cognition study needs to be repeated 10-20 times so that we can build a credible evidence base. So if anyone reading this has some spare money lying around that they don’t need, please give it to Evans and her team. They’re doing great work, and we need to make sure they can do more of it!
All opinions expressed here are my own.
As a former special education teacher with a lot of experience in cognitive assessment, plus 25+ years of adult horse experience, my answer to the question is--a guarded yes. Add to that the caveat that "not all horses" are capable of such thinking. There is a range of intelligence in horses just like there is in humans.
Much of this planning and strategizing is going to be based on things important *to the horse*, not necessarily to the human. A horse that has specific toileting areas will do so for reasons important to the horse--either limiting pasture areas fouled by poop in order to have clean eating spaces, or else (in the case of stallions and some geldings) territorial marking. Motivation for these purposes will differ from horse to horse--a mare is not necessarily going to be interested in territorial marking, and we all know male horses who don't care about territorial marking or keeping things clean.
Acquisition of a desired treat is going to be much more of a common motivation. I have two rather intelligent horses who are both extremely food-motivated, the gelding more than the mare (he has experienced food deprivation in the past where she has not). The mare is a retired reining horse and has demonstrated the ability to retain complex patterns in her memory, including anticipating flying change requests in one exercise where the change is requested after a certain number of small squares in one direction (reliably up to seven squares, inconsistent after that). As part of a bonding exercise between her and the gelding as well as easing the transition from work to retirement, I started feeding her treats from the gelding's back while riding in the field.
It was amusing to watch her begin to anticipate and attempt to create opportunities where she controlled the feeding of treats. I normally saved this activity for when the gelding and I finished working. She would anticipate our path (this in a 40 acre field) and intercept us earlier than planned. After she learned that this wouldn't work, she began to place herself near the most likely location where we would finish work and feed a treat to her. This happened in a space where she had been worked in the past, before retirement. The only time she would not do this was when hay was fed in the field (this happens in winter pasture). The key is that she would watch us even while eating fresh hay, and move in our direction once she anticipated the ending of our work session.
This was not a training activity. I would otherwise ride up to the mare and call her over. She instigated the interception of horse and rider on her own, after carefully and consistently watching us work and learning our behavioral patterns, which were very similar to ones we did when she was being ridden. I have had over 19 years of experience with this particular horse and seen her demonstrate similar behaviors where she clearly thought through a situation to achieve a desired outcome--and rarely was it associated with human training goals, but rather at liberty, often in a field with other horses.
Thank you, this was a super well written comment. One thing that - as a practical-only horse trainer - I didn't find when reading the paper, the first time at least, was how many repetitions were made in each training session. My experience is that when teaching a horse something simple, like touching an object or not touching it, the number of rewards/no rewards needed is low. Usually three or four repetitions are needed to teach a naive horse to touch an object. I haven't tried teaching a cue for not touching, preferring to use another behaviour that can also be rewarded, but I think there would be a low number of repetitions needed there, too. Unless the horse in question is really optimistic and super resistant to negative punishment.