Math Tests, Birthdays, O.J. Simpson, and Texas Sharpshooters

Corey Hoffstein

10 years ago

Summary

Humans are not terribly good at accurately assessing probability and dealing with randomness. This may stem from these skills having little evolutionary value.
We exhibit hindsight and confirmation bias. We weave narratives around completely random events. We assess probabilities using the wrong context.
These biases make performance evaluation difficult since security, asset class, and strategy returns have a random component.
Appreciating the role of randomness in portfolio results may quell some of the constant disappointment with diversification.

Humans are generally really, really bad at dealing with randomness and probability. In a 2008 article, Michael Shermer labeled this difficulty “folk numeracy.” He described “folk numeracy” as follows:

“Folk numeracy is our natural tendency to misperceive and miscalculate probabilities, to think anecdotally instead of statistically, and to focus on and remember short-term trends and small-number runs. We notice a short stretch of cool days and ignore the long-term global-warming trend. We note the consternation with the recent downturn in the housing and stock markets, forgetting the half-century upward-pointing trend line. Sawtooth data trend lines, in fact, are exemplary of folk numeracy: our senses are geared to focus on each tooth’s up or down angle, whereas the overall direction of the blade is nearly invisible to us.”

From an evolutionary perspective, none of this should be too surprising. Consider a person living in a jungle full of dangerous predators. He hears a rustle behind him. Is there any value in calculating the probability that the rustle was caused by a dangerous predator? Of course there isn’t. The cost of a false positive – fleeing a harmless situation – is tiny compared to the cost of a false negative – staying put since the odds are the situation is benign, only to be killed by a tiger.

As Shermer’s article points out, our problems with probability are multi-faceted. Below, we’ll discuss a few of the main issues.

Narratives

Humans tend to weave narratives around the events they experience, even if those events are the result of pure randomness.

Consider a high school math student. After studying diligently for a ten-question test, she has a firm grasp on 90% of the material. On average, this means that she should be an A student. However, luck will play a pretty big role in her success or failure. She will do better when the questions selected by the teacher overlap closely with her knowledge and worse when they do not.

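If each of the ten questions independently has a 90% chance of covering material she knows, the probability of a C or lower (seven or fewer correct answers) is 7.0%. While this outcome would be unlucky for someone who knows 90% of the material, it is certainly not out of the question.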
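
The 7.0% figure can be checked with a short binomial calculation. This is a minimal sketch rather than the article's own method: it assumes each question is answered correctly with independent probability 0.9 and that a "C or lower" means seven or fewer correct answers out of ten. The function name `prob_at_most` is ours, chosen for illustration.

```python
from math import comb


def prob_at_most(k_max: int, n: int, p: float) -> float:
    """Return P(X <= k_max) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_max + 1))


# ASSUMPTIONS (not stated in the article): questions are independent,
# each is answered correctly with probability 0.9, and a "C or lower"
# corresponds to 7 or fewer correct answers on a 10-question test.
p_c_or_lower = prob_at_most(7, n=10, p=0.9)
print(f"P(C or lower) = {p_c_or_lower:.1%}")  # prints "P(C or lower) = 7.0%"
```

Changing `p` or the grade cutoff shows how sensitive the result is to those assumptions.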

Say the student does indeed get a C. Everyone involved is likely to start to weave a narrative to explain the disappointing grade. The student might start to lose confidence, worrying that she was not as prepared as she thought. The parents might think the student is slacking in her studies or distracted by friends or extracurricular activities. She might get scolded, grounded, or forced to go to a tutor. All of this storytelling occurs despite the fact that the score was nothing more than a bit of bad luck.

Context

In other instances, humans get half of the way there. They stop to consider probabilities and the role of randomness, but because they use the wrong context they still end up far from the mark.

The Birthday Paradox is a prime example of this. Say that you are in a room with 22 other individuals and are asked to estimate the probability that at least two people in the room share the same birthday.

Posed with this question, most people guess that the probability is quite low. In fact, the true probability of at least two people sharing a birthday is 50.7%. Why? Incorrect context.

The actual question posed was: What is the probability that at least two people in the room share the same birthday?

However, most people answer a slightly different question: What is the probability that someone in the room shares a birthday with me?

The answer to this second question is quite low at 5.9%. Most people get the right answer to the wrong question. Their context is off.
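
Both birthday probabilities can be reproduced in a few lines. The sketch below assumes 365 equally likely birthdays and ignores leap years and seasonal patterns. The function names are ours, chosen for illustration.

```python
def prob_shared_birthday(n_people: int, days: int = 365) -> float:
    """P(at least two of n_people share a birthday), assuming uniform birthdays."""
    p_all_distinct = 1.0
    for i in range(n_people):
        # Person i must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def prob_someone_matches_me(n_others: int, days: int = 365) -> float:
    """P(at least one of n_others shares MY birthday), assuming uniform birthdays."""
    return 1.0 - ((days - 1) / days) ** n_others


p_any_pair = prob_shared_birthday(23)      # you plus 22 others
p_match_me = prob_someone_matches_me(22)   # the 22 others versus your birthday
print(f"Any shared birthday:     {p_any_pair:.1%}")  # prints 50.7%
print(f"Someone shares my date:  {p_match_me:.1%}")  # prints 5.9%
```

The gap comes from the number of comparisons: 23 people form 253 distinct pairs, while only 22 of those pairs involve you.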

Another interesting example of incorrect context occurred during the O.J. Simpson trial, which has returned to the spotlight thanks to two recent television series.

A key aspect of the trial involved past domestic violence perpetrated by Simpson against Nicole Brown. One of Simpson’s defense attorneys, Alan Dershowitz, argued against the relevance of this evidence by saying that fewer than 1 in 2,500 abusers go on to murder their spouses.

Dershowitz wasn’t necessarily incorrect, but he provided the jury with the wrong context (which, perhaps, was his intention, given that he was defending Simpson). In probability terms, he claims that the probability that a man murders his wife given that he abused her in the past is less than 1 in 2,500.

But this ignores the key fact that someone did in fact murder Brown.

The jury should be interested in a slightly different probability: the probability that a man murders his wife given that he abused her and that she has been murdered. Using some additional data and a powerful tool called Bayes’ theorem, we find that this probability is actually around 89%. 89% may not be high enough to convict by itself, but would be powerful when combined with other evidence of guilt.
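
The article does not list the "additional data" it used, so the sketch below is only an illustration of how Bayes' theorem gets to a number of this size. It treats Dershowitz's "fewer than 1 in 2,500" as exactly 1 in 2,500, and it assumes a hypothetical 1-in-20,000 chance that an abused woman is murdered by someone other than her abuser. That second figure is our assumption, picked for illustration, not a number from the article.

```python
def prob_abuser_given_murdered(p_by_abuser: float, p_by_other: float) -> float:
    """P(abuser is the killer | abused woman was murdered).

    Bayes' theorem with two mutually exclusive explanations for the murder:
    the abuser did it, or someone else did.
    """
    return p_by_abuser / (p_by_abuser + p_by_other)


# From the article: roughly 1 in 2,500 abusers go on to murder their partner.
p_by_abuser = 1 / 2500
# ASSUMPTION (hypothetical, not from the article): chance that an abused
# woman is murdered by someone other than her abuser.
p_by_other = 1 / 20000

posterior = prob_abuser_given_murdered(p_by_abuser, p_by_other)
print(f"P(abuser is the killer | murdered) = {posterior:.0%}")  # prints 89%
```

The posterior depends only on the ratio of the two inputs. Because the abuser is eight times more likely than anyone else to be the killer under these assumptions, the result is 8/9.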

Hindsight Bias and Confirmation Bias

From Wikipedia:

“Hindsight bias is the inclination, after an event has occurred, to see the event as having been predictable, despite there having been little or no objective basis for predicting it.”

“Confirmation bias is the tendency to search for, interpret, favor, and recall information in a way that confirms one’s preexisting beliefs or hypotheses, while giving disproportionally less consideration to alternative possibilities.”

The Texas Sharpshooter Fallacy illustrates both of these biases.

Consider a cowboy randomly shooting at the side of a barn. This cowboy is a terrible shot and so bullet holes are randomly dispersed on the barn’s surface. Inevitably, some clusters start to develop as the cowboy takes more and more shots. The cowboy locates the densest clusters and draws bull’s-eyes around them. In the future, the cowboy uses the barn as evidence of his sharpshooting prowess.

The Texas Sharpshooter Fallacy is more than an entertaining story. It’s pervasive in many walks of life. Journalist David McRaney pointed out a number of examples in an article on the topic. For example, he wrote:

“In World War II, Londoners took notice when bombing raids consistently missed certain neighborhoods. People began to believe German spies lived in the spared buildings. They didn’t. Analysis afterwards by psychologists Daniel Kahneman and Amos Tversky showed the bombing strike patterns were random.”

Diversification Disappoints

So what in the world does any of this have to do with investing?

A few weeks back we explained why diversification, often called the only free lunch on Wall Street, will always disappoint.

In our opinion, much of this frustration stems from the probability-related issues discussed above. Diversification is valuable precisely because the future is uncertain, which means it should be evaluated with probability and randomness in mind.

As an example, consider an investor who spreads her capital equally among ten different mutual funds (labeled A through J). Her performance, both on a position-by-position and aggregate basis, is summarized in the following periodic table of returns.

These are random, simulated returns generated assuming that each Fund’s returns are independently and identically distributed normal random variables.

One of the first things the investor will probably notice is the performance of Fund J. Fund J not only is the worst performing over the full period, but also has the dubious distinction of underperforming the overall portfolio in all five years. As a result, Fund J detracted from portfolio performance each and every year.

The investor draws a bull’s-eye right on Fund J, just like the Texas Sharpshooter.

Then the investor thinks, “Wow, what are the odds that a Fund underperforms for five consecutive years? Probably pretty low.”

And she isn’t wrong. The probability of Fund J underperforming for five years in a row is only 3.125%. However, her context is wrong, just like in the Birthday Paradox and O.J. Simpson examples. The more relevant question is: “What is the probability that at least one of my ten funds underperforms for five straight years?” Using some simplifying assumptions, the answer is around 28%, far from a rare event[1].

As a quick mathematical aside, it’s worth noting this probability will actually increase as more funds are added to the portfolio. More diversified portfolios have greater odds of having long-term underperformers. With twenty funds, the probability increases to nearly 50% – equivalent to a coin flip. With thirty funds, the probability is above 60% and so it’s actually more likely than not that we have at least one fund underperform for 5 years in a row.
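
A rough version of this calculation is sketched below. It assumes each fund has a 50% chance of underperforming in any given year, independently across years and across funds. The cross-fund independence is our simplification: underperformance relative to the portfolio is not strictly independent between funds, and the article's exact assumptions may differ. The function name is ours, chosen for illustration.

```python
def prob_at_least_one_streak(n_funds: int, n_years: int = 5, p_under: float = 0.5) -> float:
    """P(at least one of n_funds underperforms in every one of n_years).

    ASSUMPTION: underperformance is independent across years and across funds.
    """
    p_streak = p_under ** n_years            # one given fund: 0.5**5 = 3.125%
    return 1.0 - (1.0 - p_streak) ** n_funds


for n_funds in (1, 10, 20, 30):
    print(f"{n_funds:>2} funds: {prob_at_least_one_streak(n_funds):.1%}")
```

Under these assumptions the results are roughly 27%, 47%, and 61% for ten, twenty, and thirty funds, close to the figures quoted above. The pattern matches the Birthday Paradox: the chance for any one named fund is small, but the chance that some fund in a large portfolio has a long losing streak is not.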

With this incorrect probability assessment in hand, the investor weaves a narrative as to why Fund J underperformed. Maybe there was a change in portfolio manager or the Fund’s process is fundamentally broken. Maybe the Fund is too big or too small. This is no different than searching for explanations as to why your A-student got a C on a math test.

To be clear, we are not suggesting that investors put their collective heads in the sand and chalk up all performance variation to randomness. A diligent parent won’t ignore their children’s grades and a diligent investor won’t stop trying to understand the sources of outperformance and underperformance just because randomness and luck happen to play a significant role in results, especially when sample sizes are small.

However, we do believe that the best investors appreciate the starring role that randomness plays in investment results. Gaining this appreciation may soothe some of the constant disappointment with diversification.

[1] We assume that each fund’s annual returns are independent and identically distributed normal random variables and that there is no serial correlation from year to year. We use these assumptions for all of the hypothetical portfolio analysis.

Corey Hoffstein

Corey is co-founder and Chief Investment Officer of Newfound Research. Corey holds a Master of Science in Computational Finance from Carnegie Mellon University and a Bachelor of Science in Computer Science, cum laude, from Cornell University.