Summary
- Benchmarking a trend-following strategy can be a difficult exercise in managing behavioral biases.
- While the natural tendency is often to benchmark equity trend-following to all-equities (e.g. the S&P 500), this does not accurately give the strategy credit for choosing to be invested when the market is going up.
- A 50/50 portfolio of equities and cash is generally an appropriate benchmark for long/flat trend-following strategies, both for setting expectations and for gauging current relative performance.
- If we acknowledge that for a strategy to outperform over the long-run, it must undergo shorter periods of underperformance, using this symmetric benchmark can isolate the market environments in which underperformance should be expected.
- Diversifying risk-management approaches (e.g. pairing strategic allocation with tactical trend-following) can manage events that are unfavorable to one strategy, and benchmarking is a tool to set expectations around the level of risk management necessary in different market environments.
Any strategy that deviates from the most basic approach is compared to a benchmark. But how do you choose an appropriate benchmark?
The complicated nature of benchmarking can be easily seen by considering something as simple as a value stock strategy.
You may pit the concentrated value manager you currently use against the more diversified value manager you used previously. At that time, you may have compared that value manager to a systematic smart-beta ETF like the iShares S&P 500 Value ETF (ticker: IVE). And if you were invested in that ETF, you might compare its performance to the S&P 500.
What prevents you from benchmarking them all to the S&P 500? Or from benchmarking the concentrated value strategy to all of the other three?
Benchmark choices are not unique and are highly dependent on what aspect of performance you wish to measure.
Benchmarking is one of the most frequently abused facets of investing. It can be extremely useful when applied in the correct manner, but most of the time, it is simply a hurdle to sticking with an investment plan.
In an ideal world, the only benchmark for an investor would be whether or not they are on track for hitting their financial goals. However, in an industry obsessed with relative performance, choosing a benchmark is a necessary exercise.
This commentary will explore some of the important considerations when choosing a benchmark for trend-following strategies.
The Purpose of a Trend-Following Benchmark
As an investment manager, our goal with benchmarking is to check that a strategy’s performance is in line with our expectations. Performance versus a benchmark can answer questions such as:
- Is the out- or underperformance appropriate for the given market environment?
- Is the magnitude of out- or underperformance typical?
- How is the strategy behaving in the context of other ways of managing risk?
With long/flat trend-following strategies, the appropriate benchmark should gauge when the manager is making correct or incorrect calls in either direction.
Unfortunately, we frequently see long/flat equity trend-following strategies benchmarked to an all-equity index like the S&P 500. This is similar to the coinflip game we outlined in our previous commentary about protecting and participating with trend-following.[1]
The behavioral implications of this kind of benchmarking are summarized in the table below.
The two cases with wrong calls – to move to cash when the market goes up or remain invested when the market goes down – are appropriately labeled, as is the correct call to move to cash when the market is going down. However, when the market is going up and the strategy is invested, it is merely keeping up with its benchmark even though it is behaving just as one would want it to.
To reward the strategy in either correct call case, the benchmark should consist of allocations to both equity and cash.
A benchmark like this can provide objective answers to the questions outlined above.
Deriving a Trend-Following Benchmark
Sticking with the trend-following strategy example we outlined in our previous commentary[2], we can look at some of the consequences of choosing different benchmarks in terms of how much the trend-following strategy deviates from them over time.
The chart below shows the annualized tracking error of the strategy to the range of strategic proportions of equity and cash.
Source: Kenneth French Data Library. Data from July 1926 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees. Returns assume the reinvestment of all distributions. This document does not reflect the actual performance results of any Newfound investment strategy or index. All returns are backtested and hypothetical. Past performance is not a guarantee of future results.
The benchmark that minimizes the tracking error is a 47% allocation to equities and 53% to cash. This 0.47 is also the beta of the trend-following strategy, so we can think of this benchmark as accounting for the risk profile of the strategy over the entire 92-year period.
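To see why the tracking-error-minimizing weight coincides with beta, note that the active return against a blend with equity weight w is (strategy minus cash) minus w times (equity minus cash), so the variance-minimizing w is the regression slope of strategy excess returns on equity excess returns. The following is a minimal sketch of that calculation. The function names are hypothetical, the data are synthetic, and the stand-in strategy holds equities in a random half of the months rather than following a trend rule, so the output will not reproduce the 47% figure above.

```python
import random
import statistics


def min_tracking_error_weight(strategy, equity, cash):
    """Equity weight w that minimizes the variance of
    strategy - (w * equity + (1 - w) * cash)."""
    s_ex = [s - c for s, c in zip(strategy, cash)]  # strategy excess returns
    e_ex = [e - c for e, c in zip(equity, cash)]    # equity excess returns
    s_mean, e_mean = statistics.fmean(s_ex), statistics.fmean(e_ex)
    cov = sum((s - s_mean) * (e - e_mean) for s, e in zip(s_ex, e_ex))
    var = sum((e - e_mean) * (e - e_mean) for e in e_ex)
    return cov / var  # regression beta of the strategy on equities


def tracking_error(strategy, equity, cash, w):
    """Per-period standard deviation of active returns versus a w / (1 - w) blend."""
    active = [s - (w * e + (1 - w) * c) for s, e, c in zip(strategy, equity, cash)]
    return statistics.pstdev(active)


# Synthetic monthly returns (assumption): not the Kenneth French data.
random.seed(0)
equity = [random.gauss(0.008, 0.05) for _ in range(600)]
cash = [0.003] * 600
# Stand-in long/flat strategy: invested in a random half of the months.
strategy = [e if random.random() < 0.5 else c for e, c in zip(equity, cash)]

w_star = min_tracking_error_weight(strategy, equity, cash)
print(f"weight minimizing tracking error: {w_star:.2f}")
for w in (0.0, w_star, 1.0):
    te = tracking_error(strategy, equity, cash, w)
    print(f"equity weight {w:.2f}: tracking error {te:.4f}")
```

Multiplying the per-period tracking error by the square root of the number of periods per year gives an annualized figure comparable to the chart.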
But what if we took a narrower view by constraining this analysis to recent performance?
The chart below shows the equity allocation of the benchmark that minimizes the tracking error to the trend-following strategy over rolling 1-year periods.
Source: Kenneth French Data Library. Data from July 1926 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees. Returns assume the reinvestment of all distributions. This document does not reflect the actual performance results of any Newfound investment strategy or index. All returns are backtested and hypothetical. Past performance is not a guarantee of future results.
A couple of features stand out here.
First, if we constrain our lookback period to one year, a time period over which many investors exhibit anchoring bias, then the “benchmark” that we may think we will closely track – the one we are mentally tied to – might be the one that we deviate the most from over the next year.
And secondly, the approximately 50/50 benchmark calculated using the entire history of the strategy is rarely the one that minimizes tracking error over the short term.
The median equity allocation in these benchmarks is 80%, the average is 67%, and the data is highly clustered at the extremes of 100% equity and 100% cash.
Source: Kenneth French Data Library. Data from July 1926 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees. Returns assume the reinvestment of all distributions. This document does not reflect the actual performance results of any Newfound investment strategy or index. All returns are backtested and hypothetical. Past performance is not a guarantee of future results.
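The clustering at the extremes follows almost mechanically from the long/flat rule: any one-year window in which the strategy never leaves equities is matched best by 100% equity, and any window spent entirely in cash is matched best by 100% cash. The sketch below illustrates this with synthetic data and a hypothetical strategy that is invested for 24 months and then in cash for 24 months. Clipping the weight to the range 0 to 1 is our assumption about how a strategic equity/cash blend would be constrained.

```python
import random
import statistics


def best_weight(strategy, equity, cash):
    """Equity weight in [0, 1] that minimizes tracking error over the sample."""
    s_ex = [s - c for s, c in zip(strategy, cash)]
    e_ex = [e - c for e, c in zip(equity, cash)]
    s_mean, e_mean = statistics.fmean(s_ex), statistics.fmean(e_ex)
    cov = sum((s - s_mean) * (e - e_mean) for s, e in zip(s_ex, e_ex))
    var = sum((e - e_mean) * (e - e_mean) for e in e_ex)
    if var == 0.0:
        return None  # equities did not move; the weight is undefined
    return min(1.0, max(0.0, cov / var))


def rolling_best_weights(strategy, equity, cash, window=12):
    """Best weight for every trailing window of `window` periods."""
    return [
        best_weight(strategy[i - window:i], equity[i - window:i], cash[i - window:i])
        for i in range(window, len(strategy) + 1)
    ]


# Synthetic monthly returns (assumption): not the Kenneth French data.
random.seed(1)
equity = [random.gauss(0.008, 0.05) for _ in range(48)]
cash = [0.003] * 48
# Hypothetical strategy: invested for two years, then in cash for two years.
strategy = equity[:24] + cash[24:]

weights = rolling_best_weights(strategy, equity, cash)
full_sample = best_weight(strategy, equity, cash)
print("rolling 12-month weights:", [round(w, 2) for w in weights])
print("full-sample weight:", round(full_sample, 2))
```

The windows that sit entirely inside one regime return a weight of 1 or 0, while the full-sample weight lands somewhere in between, which mirrors the contrast between the rolling and full-history results above.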
The Intuitive Trend-Following Benchmark
Is there a problem in determining a benchmark using the tracking error over the entire period?
One issue is that it is being calculated with the benefit of hindsight. If you had started a trend-following strategy back in the 1930s, you would have arrived at a different equity allocation for the benchmark based on this analysis given the available data (e.g. using data up until the end of 1935 yields an equity allocation of 37%).
To remove this reliance on having a sufficiently long backtest, our preference is to rely more on the strategy’s rules and how we would use it in a portfolio to determine our trend-following benchmarks.
For a trend-following strategy that pivots between stocks and cash, a 50/50 benchmark is a natural choice.
It is broad enough to include the assets in the trend-following strategy’s investment universe while being neutral to the calls to be long or flat.
That the tracking error minimization over the entire data set produces a nearly 50/50 portfolio simply provides empirical evidence for its use.
One argument against using a 50/50 blend could focus on the fact that the market is generally up more frequently than it is down, at least historically. While this is true, the magnitude of down moves has often been larger than the magnitude of up moves. Since this strategy is explicitly meant as a risk management tool, accounting for both the magnitude and the frequency is prudent.
Another argument against its use could be the belief that we are entering a different market environment where history will not be an accurate guide going forward. However, given the random nature of market moves coupled with the behavioral tendencies of investors to overreact, herd, and anchor, a benchmark close to a 50/50 is likely still a fitting choice.
Setting Expectations with a Trend-Following Benchmark
Now that we have a benchmark to use, how do we use it to set our expectations?
Neglecting the historical data for the moment, from the ex-ante perspective, it is helpful to decompose a typical market cycle into four different segments and assess how we expect trend-following to behave:
- Initial decline – Equity markets begin to sell off, and the fully invested trend-following strategy underperforms the 50/50 benchmark.
- Prolonged drawdown – The trend-following strategy adapts to the decline and moves to cash. The trend-following strategy outperforms.
- Initial recovery – The trend-following strategy is still in cash and lags the benchmark as prices rebound off the bottom.
- Sustained recovery – The trend-following strategy reinvests and captures more of the upside than the benchmark.
Of course, this is a somewhat ideal scenario that rarely plays out perfectly. Whipsaw events occur as prices recover (decline) before declining (recovering) again.
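The four segments can be traced on a stylized price path. The sketch below is an illustration under simplifying assumptions, not the strategy from the backtest: the path is a smooth rise, decline, and recovery; the trend rule compares the latest price to a 6-period moving average and acts in the following period; and cash earns nothing. Note that the uptrend before the decline falls in the same bucket as the sustained recovery, since in both cases the market is rising and the strategy is invested.

```python
def build_path(start, legs):
    """Price path from (per-period return, number of periods) legs."""
    prices = [start]
    for rate, periods in legs:
        for _ in range(periods):
            prices.append(prices[-1] * (1.0 + rate))
    return prices


def label(market_return, invested):
    """Map market direction and strategy position to one of the four segments."""
    if market_return < 0:
        return "initial decline" if invested else "prolonged drawdown"
    return "sustained recovery" if invested else "initial recovery"


def relative_by_phase(prices, lookback=6, cash_return=0.0):
    """Order in which segments occur and total return relative to 50/50 in each."""
    sequence, totals = [], {}
    for t in range(lookback, len(prices)):
        window = prices[t - lookback:t]  # prices known before period t
        invested = prices[t - 1] > sum(window) / lookback
        market = prices[t] / prices[t - 1] - 1.0
        strategy = market if invested else cash_return
        benchmark = 0.5 * market + 0.5 * cash_return
        phase = label(market, invested)
        if not sequence or sequence[-1] != phase:
            sequence.append(phase)
        totals[phase] = totals.get(phase, 0.0) + (strategy - benchmark)
    return sequence, totals


# Stylized path (assumption): +1% for 12 periods, -3% for 8, +3% for 12.
prices = build_path(100.0, [(0.01, 12), (-0.03, 8), (0.03, 12)])
sequence, totals = relative_by_phase(prices)
print(" -> ".join(sequence))
for phase, total in totals.items():
    print(f"{phase}: {total:+.3f} versus the 50/50 benchmark")
```

On a path this smooth there is no whipsaw, so the segments appear in the textbook order; a noisier path would interleave them.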
But it is important to note how the level of risk relative to this 50/50 benchmark varies over time.
Contrast this with something like an all-equity strategy benchmarked to the S&P 500 where the risk is likely to be similar during most market environments.
Now, if we look at the historical data, we can see this borne out in the graph of the drawdowns for trend-following and the 50/50 benchmark.
Source: Kenneth French Data Library. Data from July 1926 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees. Returns assume the reinvestment of all distributions. This document does not reflect the actual performance results of any Newfound investment strategy or index. All returns are backtested and hypothetical. Past performance is not a guarantee of future results.
In most prolonged and major (>20%) drawdowns, trend-following first underperforms the benchmark, then outperforms, then lags as equities improve, and then outperforms again.
Using the most recent example of the Financial Crisis, we can see the capture ratios versus the benchmark in each regime.
Source: Kenneth French Data Library. Data from October 2007 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees. Returns assume the reinvestment of all distributions. This document does not reflect the actual performance results of any Newfound investment strategy or index. All returns are backtested and hypothetical. Past performance is not a guarantee of future results.
The underperformance of the trend-following strategy versus the benchmark is in line with expectations based on how the strategy is designed to work.
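For readers who want to compute capture ratios themselves, here is a minimal sketch. The formula used for the chart is not stated here, so this follows one common convention as an assumption: compound the strategy's returns over the periods in which the benchmark rose (or fell) and divide by the benchmark's compounded return over those same periods. The sample returns are made up for illustration.

```python
def compound(returns):
    """Total return from a sequence of per-period returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def capture_ratios(strategy, benchmark):
    """Return (up capture, down capture) of the strategy versus the benchmark.
    A ratio is None when the benchmark had no periods of that sign."""
    up = [(s, b) for s, b in zip(strategy, benchmark) if b > 0]
    down = [(s, b) for s, b in zip(strategy, benchmark) if b < 0]

    def ratio(pairs):
        if not pairs:
            return None
        return compound(s for s, _ in pairs) / compound(b for _, b in pairs)

    return ratio(up), ratio(down)


# Made-up returns for illustration only.
benchmark = [0.02, -0.04, 0.03, -0.01]
strategy = [0.04, -0.08, 0.00, 0.00]
up_capture, down_capture = capture_ratios(strategy, benchmark)
print(f"up capture: {up_capture:.2f}, down capture: {down_capture:.2f}")
```

Applying the function separately to the returns within each regime gives regime-level ratios. An up capture above 1 with a down capture below 1 is the favorable combination.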
Another way to use the benchmark to set expectations is to look at rolling returns historically. This gives context for the current out- or underperformance relative to the benchmark.
From this we can see which percentile the current return falls into or check to see how many standard deviations it is away from the average level of relative performance.
Source: Kenneth French Data Library. Data from July 1926 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees. Returns assume the reinvestment of all distributions. This document does not reflect the actual performance results of any Newfound investment strategy or index. All returns are backtested and hypothetical. Past performance is not a guarantee of future results.
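The percentile and standard-deviation checks described above take only a few lines. The sketch below is illustrative: the function names are hypothetical, the percentile is defined as the share of historical observations at or below the current value, and the standard deviation is the sample standard deviation of the historical rolling relative returns.

```python
import statistics


def compound(returns):
    """Total return from a sequence of per-period returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def rolling_relative(strategy, benchmark, window):
    """Strategy minus benchmark total return over each trailing window."""
    return [
        compound(strategy[i - window:i]) - compound(benchmark[i - window:i])
        for i in range(window, len(strategy) + 1)
    ]


def percentile_rank(history, value):
    """Percentage of historical observations at or below `value`."""
    return 100.0 * sum(1 for h in history if h <= value) / len(history)


def z_score(history, value):
    """Distance of `value` from the historical mean, in standard deviations."""
    return (value - statistics.fmean(history)) / statistics.stdev(history)


# Made-up monthly returns for illustration only.
strategy = [0.02, 0.00, 0.00, 0.03, 0.01, -0.02, 0.00, 0.02]
benchmark = [0.01, -0.02, 0.01, 0.015, 0.005, -0.01, 0.01, 0.01]
history = rolling_relative(strategy, benchmark, window=3)
current = history[-1]
print(f"current 3-month relative return: {current:+.4f}")
print(f"percentile: {percentile_rank(history, current):.0f}")
print(f"z-score: {z_score(history, current):+.2f}")
```

With a long history, a current reading deep in either tail is the cue to ask whether the market environment explains it.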
In all this, there are a few important points to keep in mind:
- Price moves that occur faster than the scope of the trend-following measurement can be one source of the largest underperformance events.
- In a similar vein, whipsaw is a key risk of trend-following. Highly oscillatory markets will not be favorable to trend-following. In these scenarios, trend-following can underperform even fully invested equities.
- With percentile analysis, there is always a first time for anything. Having a rich data history covering a variety of market scenarios mitigates this, but setting new percentiles, either on the low end or high end, is always possible.
- Sometimes a strategy is expected to lag its benchmark in a given market environment. A primary goal with benchmarking is to accurately set these expectations for the potential magnitude of relative performance and design the portfolio accordingly.
Conclusion
Benchmarking a trend-following strategy can be a difficult exercise in managing behavioral biases. With the tendency to benchmark all equity-based strategies to an all-equity index, investors often set themselves up for a let-down in a bull market with trend-following.
With benchmarking, the focus is often on lagging the benchmark by “too much.” This is what an all-equity benchmark can do to trend-following. However, the issue is symmetric: beating the benchmark by “too much” can also signal either an issue with the strategy or with the benchmark choice. This is why we would not benchmark a long/flat trend-following strategy to cash.
A 50/50 portfolio of equities and cash is generally an appropriate benchmark for long/flat trend-following strategies. This benchmark allows us to measure the strategy’s ability to allocate correctly both when equities are increasing and when they are decreasing.
Too often, investors use benchmarking solely to see which strategy is beating the benchmark by the most. While this can be a use for very similar strategies (e.g. a set of different value managers), we must always be careful not to compare apples to oranges.
A benchmark should not conjure up an image of a dog race where the set of investment strategies are the dogs and the benchmark is the bunny out ahead, always leading the way.
We must always acknowledge that for a strategy to outperform over the long-run, it must undergo shorter periods of underperformance. Diversifying approaches can manage events that are unfavorable to one strategy, and benchmarking is a tool to set expectations around the level of risk management necessary in different market environments.
[1] https://blog.thinknewfound.com/2018/05/leverage-and-trend-following/
[2] https://blog.thinknewfound.com/2018/03/protect-participate-managing-drawdowns-with-trend-following/
The New Glide Path
By Corey Hoffstein
On July 2, 2018
In Portfolio Construction, Risk Management, Sequence Risk, Weekly Commentary
Summary
In past commentaries, we have written at length about investor sequence risk. Summarized simply, sequence risk is the sensitivity of investor goals to the sequence of market returns. In finance, we traditionally assume the sequence of returns does not matter. However, for investors and institutions that are constantly making contributions and withdrawals, the sequence can be incredibly important.
Consider, for example, an investor who retires with $1,000,000 and uses the traditional 4% spending rule to allocate a $40,000 annual withdrawal to themselves. Suddenly, in the first year, their portfolio craters to $500,000. That $40,000 no longer represents 4% of the portfolio; it now represents 8%.
Significant drawdowns and fixed withdrawals mix like oil and water.
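A small worked example makes the point concrete. The sketch below takes the $1,000,000 portfolio and $40,000 withdrawal from the example above and applies the same four annual returns in two different orders. The return figures are made up, and we assume the withdrawal is taken at the start of each year before that year's return is earned.

```python
def withdrawal_rate(spending, wealth):
    """Annual spending as a fraction of current wealth."""
    return spending / wealth


def ending_wealth(start, returns, spending):
    """Wealth after withdrawing `spending` at the start of each year
    and then earning that year's return on what remains."""
    wealth = start
    for r in returns:
        wealth = (wealth - spending) * (1.0 + r)
    return wealth


# Made-up annual returns: identical values, opposite order.
crash_first = [-0.50, 0.30, 0.30, 0.30]
crash_last = list(reversed(crash_first))

print(f"withdrawal rate after the crash: {withdrawal_rate(40_000, 500_000):.0%}")
for name, path in (("crash first", crash_first), ("crash last", crash_last)):
    with_spending = ending_wealth(1_000_000, path, 40_000)
    without_spending = ending_wealth(1_000_000, path, 0)
    print(f"{name}: ${with_spending:,.0f} with withdrawals, "
          f"${without_spending:,.0f} without")
```

Without withdrawals the two orderings end at the same value, since multiplication is commutative. With withdrawals, the investor who suffers the crash first ends with roughly $847,000 versus roughly $975,000 for the investor who suffers it last.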
Sequence risk is the exact reason why traditional glide paths have investors de-risk their portfolios over time from growth-focused, higher volatility assets like equities to traditionally less volatile assets, like short-duration investment grade fixed income.
Bonds, however, are not the only way investors can manage risk. There are a variety of other methods, and frequent readers will know that we are strong advocates for the incorporation of trend-following techniques.
But how much trend-following should investors use? And when?
That is exactly what this commentary aims to explore.
Building a New Glidepath
In many ways, this is a very open-ended question. As a starting point, we will create some constraints that simplify our approach:
Source: St. Louis Federal Reserve and Kenneth French Database. Past performance is hypothetical and backtested. Trend Strategy is a simple 200-day moving average cross-over strategy that invests in U.S. equities when the price of U.S. equities is above its 200-day moving average and in U.S. T-Bills otherwise. Returns are gross of all fees and assume the reinvestment of all dividends. None of the equity curves presented here represent a strategy managed by Newfound Research.
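The trend rule described in the note above can be sketched in a few lines. The function names are hypothetical, and one detail is our assumption: the signal observed at the close of one day determines the position held over the next day, which avoids trading on information that was not yet available.

```python
def moving_average_signal(prices, lookback=200):
    """True when the price is above its trailing `lookback`-period average,
    False otherwise, and None until enough history exists."""
    signals = []
    for i, price in enumerate(prices):
        if i < lookback - 1:
            signals.append(None)
            continue
        average = sum(prices[i - lookback + 1:i + 1]) / lookback
        signals.append(price > average)
    return signals


def long_flat_returns(prices, bill_returns, lookback=200):
    """Per-period returns from holding equities when the prior period's
    signal is True and T-Bills otherwise."""
    signals = moving_average_signal(prices, lookback)
    returns = []
    for i in range(1, len(prices)):
        if signals[i - 1] is None:
            continue  # not enough history to form a signal yet
        equity_return = prices[i] / prices[i - 1] - 1.0
        returns.append(equity_return if signals[i - 1] else bill_returns[i])
    return returns


# Tiny made-up series with a 3-period lookback, for illustration only.
prices = [1.0, 2.0, 3.0, 2.0, 1.0, 4.0]
bills = [0.001] * len(prices)
print(moving_average_signal(prices, lookback=3))
print(long_flat_returns(prices, bills, lookback=3))
```

With daily prices, leaving `lookback` at its default of 200 corresponds to the 200-day rule.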
To generate our glide path, we will use a process of backwards induction similar to that proposed by Gordon Irlam in his article Portfolio Size Matters (Journal of Personal Finance, Vol 13 Issue 2). The process works thusly:
As a technical side-note, we should mention that exploring all possible portfolio configurations is a computationally taxing exercise, as would be an optimization-based approach. To circumvent this, we employ a quasi-random low-discrepancy sequence generator known as a Sobol sequence. This process allows us to generate 100 samples that efficiently span the space of a 4-dimensional unit hypercube. We can then normalize these samples and use them as our sample allocations.
If that all sounded like gibberish, the main thrust is this: we’re not really checking every single portfolio configuration, but trying to use a large enough sample to capture most of them.
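To give a flavor of how low-discrepancy sampling produces candidate allocations, here is a minimal sketch. Important caveat: it uses a Halton sequence, a different and simpler low-discrepancy sequence, as a stand-in for the Sobol sequence, purely because it can be written in a few lines with no external libraries. The bases and function names are our own choices for illustration.

```python
def radical_inverse(index, base):
    """Reflect the base-`base` digits of `index` about the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton_points(count, bases=(2, 3, 5, 7)):
    """`count` points in the unit hypercube, one coordinate per base.
    Indexing starts at 1 to skip the all-zero point."""
    return [[radical_inverse(i, b) for b in bases] for i in range(1, count + 1)]


def to_allocations(points):
    """Normalize each point so that its coordinates sum to one."""
    return [[x / sum(point) for x in point] for point in points]


# 100 candidate portfolios across four assets.
samples = to_allocations(halton_points(100))
for weights in samples[:3]:
    print([round(w, 3) for w in weights])
```

Each row is one candidate portfolio across the four assets. Dividing by the sum is a simple way to obtain valid weights, though it does not spread the samples perfectly evenly over all possible allocations.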
By working backwards, we can tackle what would be an otherwise computationally intractable problem. In effect, we are saying, “if we know the optimal decision at time T+1, we can use that knowledge to guide our decision at time T.”
This methodology also allows us to recognize that the relative wealth level to spending level is important. For example, having $2,000,000 at age 70 with a $40,000 real spending rate is very different from having $500,000, and we would expect the optimal allocation to differ.
Consider the two extremes. The first extreme is we have an excess of wealth. In this case, since we are optimizing to maximize the probability of success, the result will be to take no risk and hold a significant amount of T-Bills. If, however, we had optimized to acknowledge a desire to bequeath wealth to the next generation, you would likely see the opposite extreme: with little risk of failure, you can load up on stocks and try to maximize growth.
The second extreme is having a significant dearth of wealth. In this case, we would expect to see the optimizer recommend a significant amount of stocks, since the safer assets will likely guarantee failure while the risky assets provide a lottery’s chance of success.
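The backward-induction idea and the two wealth extremes can be illustrated with a toy model. This is a generic sketch under heavy simplifying assumptions, not the procedure used for the glide path: there are only two assets (stocks and bills), stocks have four equally likely made-up real returns, bills return zero, spending is a fixed $40,000 taken at the start of each year, wealth is tracked on a $10,000 grid (rounded down, which is conservative), and ties between allocations go to the lower stock weight.

```python
SPEND = 40_000
STEP = 10_000
GRID = [i * STEP for i in range(301)]      # wealth from $0 to $3,000,000
STOCK_RETURNS = [-0.30, 0.00, 0.10, 0.25]  # made-up, equally likely
BILL_RETURN = 0.0
STOCK_WEIGHTS = [0.0, 0.25, 0.5, 0.75, 1.0]


def lookup(values, wealth):
    """Value at the grid point at or below `wealth`, capped at the top."""
    return values[min(int(wealth // STEP), len(GRID) - 1)]


def solve(years):
    """Work backwards from the horizon. Returns, for the first year and each
    grid wealth level, the probability of funding every withdrawal and the
    stock weight that achieves it."""
    values = [1.0] * len(GRID)  # at the horizon, every withdrawal has been made
    policy = [None] * len(GRID)
    for _ in range(years):
        new_values, new_policy = [], []
        for wealth in GRID:
            remaining = wealth - SPEND
            if remaining < 0:   # cannot fund this year's withdrawal
                new_values.append(0.0)
                new_policy.append(None)
                continue
            best_prob, best_weight = -1.0, None
            for weight in STOCK_WEIGHTS:
                prob = 0.0
                for r in STOCK_RETURNS:
                    growth = 1.0 + weight * r + (1.0 - weight) * BILL_RETURN
                    prob += lookup(values, remaining * growth)
                prob /= len(STOCK_RETURNS)
                if prob > best_prob:  # strict: ties keep the lower stock weight
                    best_prob, best_weight = prob, weight
            new_values.append(best_prob)
            new_policy.append(best_weight)
        values, policy = new_values, new_policy
    return values, policy


values, policy = solve(30)
for wealth in (600_000, 1_200_000, 2_000_000):
    i = wealth // STEP
    print(f"${wealth:,}: success probability {values[i]:.2f}, "
          f"stock weight {policy[i]}")
```

At $1,200,000 the portfolio can fund 30 withdrawals from bills alone, so the toy model holds no stocks, which is the first extreme described above. A case small enough to check by hand is `solve(3)` at $110,000: only stock weights of 75% or more combined with the best return scenario can fund all three withdrawals, so the model takes equity risk for a 25% chance of success, which is the second extreme.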
The Results
To plot the results both over time as well as over the different wealth levels, we have to plot each asset individually, which we do below. As an example of how to read these graphs, below we can see that in the table for U.S. equities, at age 74 and a $1,600,000 wealth level, the glide path would recommend an 11% allocation to U.S. equities.
A few features we can identify:
Ignoring the data artifacts, we can broadly see that trend following seems to receive a fairly healthy weight in the earlier years of retirement and at wealth levels where capital preservation is critical, but growth cannot be entirely sacrificed. For example, we can see that an investor with $1,000,000 at age 60 would allocate approximately 30% of their portfolio to a trend following strategy.
Note that the initially assumed $40,000 consumption level aligns with the generally recommended 4% withdrawal assumption. In other words, the levels here are less important than their size relative to desired spending.
It is also worth pointing out again that this analysis uses historical returns. Hence, we see a large allocation to T-Bills which, once upon a time, offered a reasonable rate of return. This may not be the case going forward.
Conclusion
Financial theory generally assumes that the order of returns is not important to investors. Any investor contributing or withdrawing from their investment portfolio, however, is dramatically affected by the order of returns. It is much better to save before a large gain or spend before a large loss.
For investors in retirement who are making frequent and consistent withdrawals from their portfolios, sequence risk manifests itself in large and prolonged drawdowns. Strategies that can help avoid these losses are, therefore, potentially very valuable.
This is the basis of the traditional glidepath. By de-risking the portfolio over time, investors become less sensitive to sequence risk. However, as bond yields remain low and investor life expectancy increases, investors may need to rely more heavily on higher volatility growth assets to avoid running out of money.
To explore these concepts, we have built our own glide path using four assets: broad U.S. equities, 10-year U.S. Treasuries, U.S. T-Bills, and a trend following strategy. Not surprisingly, we find that trend following commands a significant allocation, particularly in the years and wealth levels where sequence risk is highest, and that it often takes the place of equities themselves.
Beyond recognizing the potential value-add of trend following, however, an important second takeaway may be that there is room for significant value-add in going beyond traditional target-date-based glide paths for investors.