This post is available as a PDF download here.
Summary
- Research suggests that simple heuristics are often far more robust than more complicated, theoretically optimal solutions.
- Taken too far, we believe simplicity can actually introduce significant fragility into an investment process.
- Using trend equity as an example, we demonstrate how using only a single signal to drive portfolio allocations can make a portfolio highly sensitive to the impact of randomness, clouding our ability to determine the difference between skill and luck.
- We demonstrate that a slightly more complicated process that combines signals significantly reduces the portfolio’s sensitivity to randomness.
- We believe that the optimal level of simplicity is found at the balance of diversification benefit and introduced estimation risk. When a more complicated process can introduce meaningful diversification gain into a strategy or portfolio with little estimation risk, it should be considered.
Introduction
In the world of finance, simple can be surprisingly robust. DeMiguel, Garlappi, and Uppal (2005)1, for example, demonstrate that a naïve, equal-weight portfolio frequently delivers higher Sharpe ratios, higher certainty-equivalent returns, and lower turnover out-of-sample than competitive “optimal” allocation policies. In one of our favorite papers, Haldane (2012)2 demonstrates that simplified heuristics often outperform more complicated algorithms in a variety of fields.
Yet taken to an extreme, we believe that simplicity can have the opposite effect, introducing extreme fragility into an investment strategy.
As an absurd example, consider a highly simplified portfolio that is 100% allocated to U.S. equities. Introducing bonds into the portfolio may not seem like a large mental leap, but consider that this small change introduces an axis of decision making that brings with it a number of considerations. The proportion we allocate between stocks and bonds requires, at the very least, estimates of an investor’s objectives, risk tolerances, market outlook, and confidence levels in these considerations.3
Despite this added complexity, few investors would consider the all-equity portfolio to be the more “robust” of the two by almost any reasonable definition of robustness.
Yet this is precisely the type of behavior we see all too often in tactical portfolios – and particularly in trend equity strategies – where investors follow a single signal to make dramatic allocation decisions.
So Close and Yet So Far
To demonstrate the potential fragility of simplicity, we will examine several trend-following signals applied to broad U.S. equities:
- Price minus the 10-month moving average
- 12-1 month total return
- 13-minus-34-week exponential moving average cross-over
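The three signals listed above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: each signal is expressed as a percentage distance from zero (one plausible way to measure how close it is to flipping), the 12-1 month return is taken to skip the most recent month, and the exponential averages use the common 2 / (span + 1) smoothing factor. The function names are hypothetical.

```python
def sma_signal(monthly_prices, window=10):
    """Latest price relative to its `window`-month simple moving average."""
    avg = sum(monthly_prices[-window:]) / window
    return monthly_prices[-1] / avg - 1.0


def momentum_12_1(monthly_prices):
    """Return from 12 months ago to 1 month ago (needs at least 13 prices)."""
    return monthly_prices[-2] / monthly_prices[-13] - 1.0


def ema(values, span):
    """Exponential moving average, seeded with the first observation."""
    alpha = 2.0 / (span + 1)
    level = values[0]
    for v in values[1:]:
        level = alpha * v + (1.0 - alpha) * level
    return level


def ema_cross_signal(weekly_prices, fast=13, slow=34):
    """Fast EMA relative to slow EMA; positive means the fast average is above."""
    return ema(weekly_prices, fast) / ema(weekly_prices, slow) - 1.0
```

In every case a positive value corresponds to a buy and a negative value to a sell, so a reading within 1% of zero would be one of the "close calls" discussed in this section.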
Below we plot over time the distance each of these signals is from turning off. Whenever the line crosses over the 0% threshold, it means the signal has flipped direction, with negative values indicating a sell and positive values indicating a buy.
In orange we highlight those periods where the signal is within 1% of changing direction. We can see that for each signal there are numerous occasions where the signal was within this threshold but avoided flipping direction. Similarly, we can see a number of scenarios where the signal just breaks the 0% threshold only to revert back shortly thereafter. In the former case, the signal has often just managed to avoid whipsaw, while in the latter it has usually become unfortunately subject to it.
Source: Kenneth French Data Library. Calculations by Newfound Research.
Is the avoidance of whipsaw representative of the “skill” of the signals while the realization of whipsaw is just bad luck? Or might it be that the avoidance of whipsaw is often just as much luck as the realization of whipsaw is poor skill? How can we determine what is skill and what is luck when there are so many “close calls” and “just hits”?
What is potentially confusing for investors new to this space is that academic literature and practitioner evidence find that these highly simplified approaches are surprisingly robust across a variety of investment vehicles, geographies, and time periods. What we must stress, however, is that evidence of general robustness is not evidence of specific robustness; i.e. there is little evidence suggesting that a single approach applied to a single instrument over a specific time horizon will be particularly robust.
What Randomness Tells Us About Fragility
To emphasize the potential fragility of utilizing a single in-or-out signal to drive our allocation decisions, we run a simple test:
- Begin with daily market returns
- Add a small amount of white noise (mean 0%; standard deviation 0.025%) to daily market returns
- Calculate a long/flat trend equity strategy using 12-1 month momentum signals4
- Calculate the rolling 12-month return of the strategy minus the alternate market history return.
- Repeat 1,000 times to generate 1,000 slightly alternate histories.
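The steps above can be sketched as follows. This is a minimal illustration rather than the code behind our results: it assumes 21 trading days per month, a 0% return when the strategy is flat, and a signal that is re-evaluated once a month, none of which is specified in the list. All function names are hypothetical.

```python
import random

DAYS_PER_MONTH = 21  # assumption: approximate trading days per month


def perturb(returns, rng, sd=0.00025):
    """Add white noise (mean 0%, standard deviation 0.025%) to daily returns."""
    return [r + rng.gauss(0.0, sd) for r in returns]


def long_flat_returns(returns, cash=0.0):
    """Daily returns of a long/flat strategy driven by 12-1 month momentum."""
    prices = [1.0]
    for r in returns:
        prices.append(prices[-1] * (1.0 + r))
    out, invested = [], False
    for t, r in enumerate(returns):
        if t % DAYS_PER_MONTH == 0 and t >= 12 * DAYS_PER_MONTH:
            # prices[t] is the level before day t's return, so no look-ahead.
            past = prices[t - 12 * DAYS_PER_MONTH]
            recent = prices[t - DAYS_PER_MONTH]
            invested = recent / past - 1.0 > 0.0
        out.append(r if invested else cash)
    return out


def compound(returns):
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


def rolling_relative(strategy, market, window=12 * DAYS_PER_MONTH):
    """Rolling 12-month strategy return minus the market return."""
    return [compound(strategy[i - window:i]) - compound(market[i - window:i])
            for i in range(window, len(market) + 1)]


def alternate_histories(returns, n=1000, seed=0):
    """One rolling relative-return path per slightly altered market history."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n):
        noisy = perturb(returns, rng)
        paths.append(rolling_relative(long_flat_returns(noisy), noisy))
    return paths
```

Note that each strategy is compared against its own perturbed market history, not the original one, which is what the fourth step calls for.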
The design of this test aims to deduce how fragile a strategy is via the introduction of randomness. By measuring 12-month rolling relative returns versus the modified benchmarks, we can compare the 1,000 slightly alternate histories to one another in an effort to determine the overall stability of the strategy itself.
Now bear with us, because while the next graph is a bit difficult to read, it succinctly captures the thrust of our entire thesis. At each point in time, we first calculate the average 12-month relative return of all 1,000 strategies. This average provides a baseline of expected relative strategy performance.
Next, we calculate the maximum and minimum 12-month relative performance and subtract the average. This spread – which is plotted in the graph below – aims to capture the potential return differential around the expected strategy performance due to randomness. Or, put another way, the spread captures the potential impact of luck in strategy results due only to slight changes in market returns.
Source: Kenneth French Data Library. Calculations by Newfound Research.
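For readers who want to reproduce the spread calculation described above, a small sketch follows. It assumes the simulated results are stored as a list of equal-length paths, one per alternate history; the name `luck_spread` is ours.

```python
def luck_spread(paths):
    """Best and worst 12-month relative return at each date, minus the average."""
    upper, lower = [], []
    for values in zip(*paths):  # all simulations at a single point in time
        mean = sum(values) / len(values)
        upper.append(max(values) - mean)
        lower.append(min(values) - mean)
    return upper, lower
```

The upper series is never negative and the lower series is never positive, so together they bracket how far luck alone could have moved the result in either direction.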
We can see that the spread frequently exceeds 5% and sometimes even exceeds 10%. Thus, a tiny bit of injected randomness has a massive effect upon our realized results. Using a single signal to drive our allocation appears particularly fragile, and success or failure over the short run can largely be dictated by the direction the random winds blow.
A backtest based upon a single signal may look particularly good, but this evidence suggests we should dampen our confidence as the strategy may actually have just been the accidental beneficiary of good fortune. In this situation, it is nearly impossible to distinguish skill from luck when in a slightly alternate universe we may have had substantially different results. After all, good luck in the past can easily turn into misfortune in the future.
Now let us perform the same exercise again using the same random sequences we generated. But rather than using a single signal to drive our allocation we will blend the three trend-following approaches above to determine the proportional amount of equities the portfolio should hold.5 We plot the results below using the same scale in the y-axis as the prior plot.
Source: Kenneth French Data Library. Calculations by Newfound Research.
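The blend can be illustrated with a very small sketch. It assumes each signal casts an equal in-or-out vote and that the equity allocation is simply the share of positive votes; the function name is hypothetical.

```python
def blended_allocation(signals):
    """Fraction of the portfolio in equities: the share of positive signals."""
    votes = [1.0 if s > 0 else 0.0 for s in signals]
    return sum(votes) / len(votes)


# Two of three signals positive: the portfolio holds two-thirds equities.
allocation = blended_allocation([0.02, -0.01, 0.005])
```

The practical consequence is that a single signal flipping moves the portfolio by one-third rather than from fully invested to fully divested.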
We can see that our more complicated approach actually exhibits a significant reduction in the effects of randomness, with outlier events significantly decreased and far more symmetry in both positive and negative impacts.
Below we plot the actual spreads themselves. We can see that the spread from the combined signal approach is lower than the single signal approach on a fairly consistent basis. In the cases where the spread is larger, it is usually because the sensitivity is arising from either the 10-month SMA or 13-minus-34-week EWMA signals. Were spreads for single signal strategies based upon those approaches plotted, they would likely be larger during those time periods.
Source: Kenneth French Data Library. Calculations by Newfound Research.
Conclusion
So, where is the balance? How can we tell when simplicity creates robustness and when it introduces fragility? As we discussed in our article A Case Against Overweighting International Equity, we believe the answer is diversification versus estimation risk.
In our case above, each trend signal is just a model: an estimate of what the underlying trend is. As with all models, it is imprecise and our confidence level in any individual signal at any point in time being correct may actually be fairly low. We can wrap this all together by simply saying that each signal is actually shrouded in a distribution of estimation risk. But by combining multiple trend signals, we exploit the benefits of diversification in an effort to reduce our overall estimation risk.
Thus, while we may consider a multi-model approach less transparent and more complicated, that added layer of complication serves to increase internal diversification and reduce estimation risk.
It should not go overlooked that the manner in which the signals were blended represents a model with its own estimation risk. Our choice to simply equal-weight the signals indicates a zero-confidence position in views about relative model accuracy and relative marginal diversification benefits among the models. Had we chosen a more complicated method of combining signals, it is entirely possible that the realized estimation risk could overwhelm the diversification gain we aimed to benefit from in the first place. Or, conversely, that very same added estimation risk could be entirely justified if we could continue to meaningfully improve diversification benefits.
If we return to our original example of a 100% equity portfolio versus a blended stock-bond mix, the diversification versus estimation risk trade-off becomes obvious. Introducing bonds into our portfolio creates such a significant diversification gain that the estimation risk is often an insignificant consideration. The same might not be true, however, in a tactical equity portfolio.
Research and empirical evidence suggest that simplicity is surprisingly robust. But we should be skeptical of simplicity for the sake of simplicity when it forgoes low-hanging diversification opportunities, lest we make our portfolios and strategies unintentionally fragile.
What do portfolios and teacups have in common?
By Corey Hoffstein
On December 17, 2018
Introduction
At Newfound, we spend a lot less time trying to figure out how to be more right than we spend trying to figure out how to be less wrong. One area of particular interest for us is the idea of unintended bets: the exposures in a portfolio we may not even be aware of. And if we knew we had the exposure, we might not even want it.
For example, consider a portfolio that invests in either broad U.S., broad international, or broad emerging market equities based upon valuations. A significant tilt towards non-U.S. assets may be a valuation-driven decision, but for U.S. investors it creates significant exposure to fluctuations in the U.S. dollar versus foreign currencies.
Of course, exposures are not limited only to assets. Exposures may be broader macro-economic, stylistic, thematic, geographic, or even political factors.
These unintended bets can go far beyond explicit and implicit exposures. In our example, the choice of how to measure value may lead to meaningfully different portfolios, despite the same overarching thesis. For example, a naïve CAPE ratio versus adjusting for differences in relative sector composition dramatically alters the view of whether international equities are significantly cheaper than U.S. equities. These potential differences capture what we like to call “model specification risk.”
Finally, we can be subject to unintended bets based upon when the portfolio is re-evaluated and reconstituted. Evaluating valuations in January, for example, may lead to a different decision versus evaluating them in July.
How can we avoid these unintended bets? At Newfound, we believe that the answer falls back to diversification: not only in the traditional sense of what we invest in, but also across how we make decisions and when we make them.
When left uncontrolled, unintended bets can make a strategy incredibly fragile.
What, precisely, does it mean for a strategy to be fragile? A strategy is fragile when small variations of strategy inputs – be it asset returns or other measures – lead to meaningful dispersion in realized results.
Now we want to distinguish between volatility and fragility. Volatility is the dispersion of strategy returns across time, while fragility is the dispersion in end-of-period wealth across variations of the strategy.
As an example, a portfolio that invests only in the S&P 500 is very volatile but not particularly fragile. Given the last ten years of returns for the S&P 500, slight variations in annual returns would not lead to significant dispersion in end-of-period wealth. On the other hand, a strategy that flips a coin every December and invests for the next year in the S&P 500 when it lands on heads or short-term U.S. Treasuries when it lands on tails would have lower expected volatility than the S&P 500 but would be much more fragile. We need simply consider a few scenarios (e.g. all heads or all tails) to understand the potential dispersion such a strategy is subject to.
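To make the coin-flip example concrete, the sketch below enumerates every possible sequence of ten annual flips. The annual returns are hypothetical numbers chosen purely for illustration; they are not actual S&P 500 or Treasury returns.

```python
from itertools import product

# Hypothetical annual returns, for illustration only (not actual market data).
equity = [0.15, -0.05, 0.20, 0.10, 0.02, 0.12, -0.10, 0.25, 0.08, 0.11]
bills = [0.02] * len(equity)


def terminal_wealth(flips):
    """Growth of $1 when heads (True) holds equities and tails holds bills."""
    wealth = 1.0
    for heads, e, b in zip(flips, equity, bills):
        wealth *= 1.0 + (e if heads else b)
    return wealth


# Every one of the 2**10 equally likely coin sequences.
outcomes = [terminal_wealth(f) for f in product([True, False], repeat=len(equity))]
all_heads = terminal_wealth([True] * len(equity))
all_tails = terminal_wealth([False] * len(equity))
dispersion = max(outcomes) - min(outcomes)
```

The return path of the market is identical in every scenario; only the coin changes. The gap between the best and worst outcomes is therefore a measure of fragility rather than volatility.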
In the remainder of this commentary, we will demonstrate how diversification across the what, how, and when axes can reduce strategy fragility.
The Experiment Setup
Since a large degree of our focus at Newfound is on managing trend equity mandates, we will explore fragility through the lens of this style. For those unfamiliar with the approach, trend equity strategies aim to capture a significant portion of equity market growth while avoiding substantial and prolonged drawdowns through the application of trend following. A naïve implementation of such an idea would be to invest in the S&P 500 when its prior 12-month return has been positive and invest in short-term U.S. Treasuries otherwise.
To learn something about the fragility of a strategy, we are going to have to inject some randomness. After all, no amount of history will tell us about the fragility of a teacup that has spent its entire life sitting on a shelf; we will need to see it fall on the floor to actually learn something.
As with our recent commentary When Simplicity Met Fragility, we will inject randomness by adding white noise to asset returns. Specifically, we will add to daily returns a draw from a random normal distribution with mean 0% and standard deviation 0.025%. Using this slightly altered history, we will then run our investment strategy.
By performing this process a large number of times (10,000 in this commentary), we can explore how the outcome of the strategy is impacted by these slight variations in return history. The greater the dispersion in results, the more fragile the strategy is.
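A minimal sketch of this experiment is shown below. It measures dispersion as the log difference in terminal wealth against the median run, which is the measure plotted later in this commentary. The `strategy` argument is a placeholder for any function that maps daily market returns to daily strategy returns; both function names are hypothetical.

```python
import math
import random
import statistics


def fragility(returns, strategy, n=10_000, sd=0.00025, seed=0):
    """Log terminal wealth of each noisy run, relative to the median run."""
    rng = random.Random(seed)
    log_wealth = []
    for _ in range(n):
        # White noise: mean 0%, standard deviation 0.025% per day.
        noisy = [r + rng.gauss(0.0, sd) for r in returns]
        log_wealth.append(sum(math.log1p(r) for r in strategy(noisy)))
    median = statistics.median(log_wealth)
    return [w - median for w in log_wealth]


def buy_and_hold(returns):
    """Baseline: hold the asset every day, applying no strategy at all."""
    return returns
```

Running `fragility` with `buy_and_hold` gives the dispersion that comes from the injected noise alone, which is a useful baseline when judging any strategy layered on top.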
To demonstrate how diversification across the three different axes can affect fragility, we will start with a naïve trend equity strategy – investing in broad U.S. equities using a single trend model that is rebalanced on a monthly basis – and vary the three components in isolation.
The What
The “what” axis simply asks, “what are we invested in?”
How can our choice of “what” affect fragility? Consider a slight variation to our coin-flip strategy from before. Instead of flipping a single coin, we will now flip two coins. The first coin determines whether we invest 50% of the portfolio in either the S&P 500 or short-term U.S. Treasuries, while the second coin determines whether we invest the other 50% of the portfolio in either the Russell 1000 or short-term U.S. Treasuries.
In our single coin example, each year we expected to invest in the S&P 500 50% of the time and in short-term U.S. Treasuries 50% of the time. With two coins, we now expect to be fully invested 25% of the time, partially invested 50% of the time, and divested 25% of the time.
Let’s take this notion to further limits. Consider now flipping 100 coins where each determines the allocation decision for 1% of our portfolio, where heads leads to an investment in a large-cap U.S. equity portfolio and tails leads to an investment in short-term U.S. Treasuries. Now being fully invested or divested is a vanishingly improbable event; in fact, for a given year there is roughly a 95% chance that your allocation to equities falls between 40% and 60%.1
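These probabilities follow directly from the binomial distribution and can be checked with a few lines of Python. The sketch assumes fair, independent coins; the helper name is ours.

```python
from math import comb


def prob_heads_between(n, lo, hi, p=0.5):
    """Probability that n independent coins show between lo and hi heads."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(lo, hi + 1))


# Two coins: divested, half invested, or fully invested.
two_coins = {k / 2: prob_heads_between(2, k, k) for k in range(3)}

# One hundred coins: allocation between 40% and 60%, and fully invested.
p_40_60 = prob_heads_between(100, 40, 60)
p_fully_invested = prob_heads_between(100, 100, 100)
```

The exact figure for the 40% to 60% band comes out at roughly 96%, slightly above the 95% quoted, while the chance of being fully invested is smaller than one in a billion billion.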
Even though we’ve applied the exact same process to each investment, diversifying across more investments has dramatically reduced the fragility of our coin-flipping strategy.
Now let’s translate this from the theoretical to the practical. We will begin with a simple trend following strategy that invests in the underlying asset when prior 12-1 month returns have been positive or invests in the risk-free rate, re-evaluating the trend at the end of each month.
To explore the impact of diversifying our what, we will implement this strategy five different ways:
The graph below plots the distribution of log difference in terminal wealth against the median outcome for each of these five approaches. Lines within each “violin” show the 25th, 50th, and 75th percentiles.
The graph clearly demonstrates that by increasing our exposure across the “what” axis, the dispersion in terminal wealth is dramatically reduced.
Source: Kenneth French Data Library. Calculations by Newfound Research.
But why is reduced dispersion in terminal wealth necessarily better?
It implies a greater consistency in outcome, which is not only important for setting forward expectations, but is also important for evaluating past performance (whether backtested or live). This evidence tells us that if we are evaluating a trend equity strategy that employs a single model to make in-or-out decisions on broad U.S. equities on a monthly basis, it will be nearly impossible to tell whether the realized results are in line with reasonable expectations or overly optimistic (we can probably guess that they aren’t overly pessimistic, as those sorts of returns typically aren’t marketed).
To justify a concentration in the “what” axis, we would have to demonstrate that the worst-case scenarios would still represent a meaningful improvement in expected terminal wealth versus a more diversified approach.
It should be noted that our experiment design prohibits dispersion from ever being fully reduced, as we are injecting randomness into past returns. Even if no strategy is applied, there will be some inherent dispersion in final wealth. For example, below we plot the dispersion that occurs simply from adding randomness to past returns with a buy-and-hold approach.
Increasing the number of assets in the portfolio inherently reduces dispersion for buy-and-hold because diversification helps drive the expected impact of the injected randomness towards its mean: zero. With only one asset, on the other hand, outlier events are free to wreak havoc on results.
Source: Kenneth French Data Library. Calculations by Newfound Research.
Note that adding a strategy on top of buy-and-hold can exacerbate the fragility issue, making diversification that much more important.
The How
The “how” axis asks, “how are we making investment decisions?”
Many investors are already somewhat familiar with diversification along the “how” axis, often diversifying their active exposures across multiple managers who might have similar investment mandates but slightly different processes.
We like to call this “process diversification” and think of it as akin to the parable of the blind men and the elephant. Each blind man touches a different part of the elephant and pronounces his belief in what he is touching based upon his isolated view. The blind man touching the leg, for example, might think he is touching a sturdy tree while the blind man touching the tail might believe he is grabbing a rope.
None is correct in isolation but taken together we may gain a more well-rounded picture.
Similarly, two managers may claim to invest based upon valuations, but the manner in which they do so gives them a very different picture of where value can be found.
The idea of process diversification was explored in the 1999 paper “Do You Need More than One Manager for a Given Equity Style?” by Franklin Fant and Edward O’Neal. Fant and O’Neal found that while a multi-manager approach does very little for return variability across time (i.e. portfolio volatility), it does a lot for end-of-period wealth variability. They find this to be true across almost all equity style box categories. In other words: taking a multi-manager approach can reduce fragility.
Let us return to our prior coin flip example. Instead of making a choice to invest in the S&P 500 based upon a coin-flip, however, we will combine a number of different signals. For example, we might flip a coin, roll a die, measure the weather, and look at the second hand of a clock. Each signal gives us some sort of in-or-out decision, and we average these decisions together to get our allocation. As before, as we incorporate more signals, we decrease the probability that we end up with extreme allocations, leading to a more consistent terminal wealth distribution.
Again, we should stress here that the objective is not just outright elimination of dispersion in terminal wealth. After all, if that were our sole pursuit, we could simply stuff our money under our mattress. Rather, assuming we will be implementing some active investment strategy that we hope has a positive long-term expected return, our aim should be to reduce the dispersion in terminal wealth for that strategy.
Of course, in investing we would not expect the processes to be entirely independent. With trend following, for example, most popular models are actually mathematically linked to one another, and therefore generate signals that are highly correlated. Nevertheless, even modest diversification can have meaningful benefits with respect to strategy fragility.
To explore the impact of diversification along the how axis, we implement our trend following strategy six different ways. Each invests in broad U.S. equities and rebalances monthly but differs in the number of trend-following models employed.2
The results are plotted below.
Source: Kenneth French Data Library. Calculations by Newfound Research.
Again, we can see that increased diversification across the how axis dramatically reduces dispersion in terminal wealth. Our takeaway is largely the same: without an ex-ante view as to which particular model (or group of models) is best (i.e. a view of how to be more right), diversification can lead to greater consistency in results. We will be less wrong.
A subtler conclusion of this analysis is that it should be very difficult to conclude that one model is better than another. We can see that if we risk selecting just one model to govern our process, seemingly minor variations in historical returns can lead to dramatically different terminal wealth results, as evidenced by the bulging distribution. Inverting this line of thinking, we should also be suspicious of any claim that a given model is superior when it rests on a single backtest. For example, even if a 12-1 month total return model performs better than a 10-month moving average model on historical S&P 500 returns, we should be highly skeptical as to the robustness of the conclusion that the 12-1 model is best.
The When
The “when” axis asks, “when are we making our investment decision?”
This is an oft overlooked question in public markets, but it is commonly addressed in the world of private equity and venture capital. Due to the illiquid nature of those markets, investors will often attempt to diversify their business cycle risk by establishing positions in multiple funds over time, giving them exposure to different “vintages.” The idea here is simple: the opportunity set available at different points in time can vary and if we allocate all of our earmarked capital to a particular year, we may miss out on later opportunities.
Consider our original coin-flipping example where we flipped a single coin every December to determine whether we would buy the S&P 500 or hold our capital in short-term Treasuries. But why was it necessary that we make the decision in December? Why not July? Or January? Or September?
While we would not expect there to be point-in-time risk for coin flipping, we can still consider the net effect of a vintage-based allocation methodology. Here we will assume that we flip a coin each month and rebalance 1/12th of our capital based upon the result.
Again, the probability of allocating to the extremes (100% invested or 100% divested) is dramatically reduced (each has approximately a 0.02% chance of occurring) and we reduce strategy fragility to any specific coin flip.
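The mechanics of the monthly coin flip can be sketched as a rolling book of twelve tranches, where each new flip replaces the oldest one. The sketch assumes fair, independent flips and equally sized tranches; the function name is hypothetical.

```python
import random
from collections import deque


def rolling_allocations(months, rng, tranches=12):
    """Equity weight over time when one tranche is re-flipped each month."""
    book = deque((rng.random() < 0.5 for _ in range(tranches)), maxlen=tranches)
    weights = []
    for _ in range(months):
        book.append(rng.random() < 0.5)  # newest flip replaces the oldest tranche
        weights.append(sum(book) / tranches)
    return weights


# Exact chance that all twelve live tranches are heads at a given date.
p_all_heads = 0.5 ** 12
```

Because only one tranche changes per month, the equity weight can never move by more than 1/12th from one month to the next. The probability of all twelve tranches agreeing is 1 in 4,096, or roughly 0.02%.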
But just how impactful is this notion? Below we plot the rolling 1-year total return difference between two 60% S&P 500 / 40% 5-year U.S. Treasury fixed-mix portfolios, with one being rebalanced in February and one in August. Even for this highly simplified example, we can see that the total return spread between the two portfolios blows out to over 700 basis points in March 2010 due to the fact that the February portfolio rebalanced back into equities at nearly the exact bottom of the crisis.
Source: Global Financial Data. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
To increase diversification across the “when” axis, we want to increase the number of vintages we deploy. For our trend following example, we will assume that the portfolio allocates between broad U.S. equities and the risk-free rate based upon a single model, but with an increasing number of evenly-spaced vintages. Again, we will run 10,000 simulations that each slightly perturb historical U.S. equity market returns and compare the terminal wealth variation for approaches that employ a different number of vintages.
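One way to picture the vintage approach is sketched below. It assumes a 21-day rebalance cycle, equally sized vintages spaced evenly through that cycle, and a trend signal that is already known before trading on each day; none of these details is taken from our actual implementation, and the function name is hypothetical.

```python
def tranche_weights(signal, period=21, tranches=4):
    """Daily equity weight when evenly spaced vintages each refresh their
    in-or-out position once every `period` days."""
    step = period // tranches
    held = [0.0] * tranches  # every vintage starts in the risk-free asset
    weights = []
    for t, s in enumerate(signal):
        for k in range(tranches):
            if t % period == k * step:  # vintage k's rebalance day
                held[k] = 1.0 if s > 0 else 0.0
        weights.append(sum(held) / tranches)
    return weights


# A signal that is positive for 10 days and negative afterwards.
example = [1.0] * 10 + [-1.0] * 30
single = tranche_weights(example, tranches=1)
weekly = tranche_weights(example, tranches=4)
```

With a single vintage the portfolio stays fully invested until day 21 despite the signal turning negative on day 10. With four vintages, exposure builds and unwinds in quarter steps, so no single rebalance date dominates the outcome.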
We can see in the graph below that, as with the other axes of diversification, as we increase the number of vintages employed, the variance decreases. While the 25th and 75th percentiles do not decrease as dramatically as for the other axes, we can see that the extreme variations are reined in substantially when we move from 1 monthly tranche to 4 weekly tranches.
Source: Kenneth French Data Library. Calculations by Newfound Research.
Conclusion
We see two critical conclusions from this analysis:
To conclude our analysis, below we present a graph that combines diversification across all three axes. We again run 10,000 samples, randomly perturbing returns. For each sample, we then run four variations:
It should come as no surprise that as we increase the amount of diversification across all three axes, the dispersion in terminal wealth is dramatically reduced.3
Source: Kenneth French Data Library. Calculations by Newfound Research.
It is also important to note that while our analysis focused on trend following strategies, this same line of thinking applies across all investment approaches. As an example, consider a quantitative value manager who buys the top five cheapest stocks, as measured by price-to-book, in the S&P 500 each December and then holds them for the next year. Questions worth pondering are:
While low levels of diversification across the what, how, and when axes are not necessarily an indicator that a model is inherently fragile, they should be a red flag that more effort is required to demonstrate that it is not fragile.