Summary
- While trend following may help investors avoid prolonged drawdowns, it is susceptible to whipsaw where false signals cause investors to either buy high and sell low (realizing losses) or sell low and buy high (a missed opportunity).
- Empirical evidence suggests that using economic data in the United States as a filter of when to employ trend-following – a “growth-trend timing” model – has historically been fruitful.
- When evaluated in other countries, growth-trend timing has been historically successful in mitigating whipsaw losses without sacrificing the ability to avoid large drawdowns. However, we see mixed results on whether this actually improves upon naïve trend-following.
- We find that countries that can be influenced by factors originating outside of their borders might not benefit from an introspective economic signal.
We apologize in advance, as this commentary will be fairly graph- and table-heavy.
We have written fairly extensively on the topic of factor timing in the past, and much of the apparent success has proven both hard to implement and hard to recreate out of sample.
One of the inherent pains of trend following is the existence of whipsaws, or more precisely, the misidentification of perceived market trends that turn out to be more noise than signal. An article from Philosophical Economics proposed using several economic indicators to tune down the noise that might affect price-driven signals such as trend following. Generally, this strategy imposed an overlay that turned trend following “on” when the year-over-year change in an economic indicator was negative, signaling a higher likelihood of recession, and conversely adopted a buy-and-hold stance when the economic indicators were not flashing warning lights.
This strategy presents a certain appeal as leading economic indicators may, as their name implies, lead the market for some time until capital preservation is warranted. Switching to a trend-following approach may allow a strategy to continue to participate in market appreciation while it lasts. On the other hand, using economic confirmation as a filter may help a strategy avoid the whipsaw costs generated from noisy market dips while positive economic conditions persist.
In an effort to test such a strategy out-of-sample, we took the approach global, hoping to capture a broader cross-section of economic and market environments.
First, we will consider trend following with no timing using the economic indicators.1
Below we plot the equity curves for Australia, Germany, Italy, Japan, Singapore, the United Kingdom, and the United States, alongside a strategy that is long the market when the market is above the trailing twelve-month average (“12 Month average”) and steps to cash when the price is below it. The ratio between the two is also included to show the relative cumulative performance between the trend strategy and the respective market. An increasing ratio means that the trend following strategy is adding value over buy-and-hold.
Source: MSCI, Global Financial Data. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
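To make the mechanics of the trend rule concrete, the following is a minimal sketch of a long/flat moving-average strategy and the ratio used to compare it with buy-and-hold. It is illustrative only: the function names, the zero cash return, the choice to sit in cash until a full window of prices exists, and the use of month-end prices that include the current month in the average are all assumptions, not details taken from our backtest.

```python
def trailing_mean(prices, window):
    """Average of the last `window` prices ending at each index (None until enough data)."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def trend_equity_curves(prices, window=12, cash_return=0.0):
    """Return (market_curve, trend_curve, ratio), each starting at 1.0.

    The signal observed at month-end t-1 decides the position held over
    month t, so no future information is used.
    """
    avg = trailing_mean(prices, window)
    market, trend = [1.0], [1.0]
    for t in range(1, len(prices)):
        r = prices[t] / prices[t - 1] - 1.0
        prev_avg = avg[t - 1]
        # Assumption: hold cash until a full window of prices is available.
        invested = prev_avg is not None and prices[t - 1] > prev_avg
        market.append(market[-1] * (1.0 + r))
        trend.append(trend[-1] * (1.0 + (r if invested else cash_return)))
    # A rising ratio means the trend strategy is adding value over buy-and-hold.
    ratio = [s / m for s, m in zip(trend, market)]
    return market, trend, ratio


# Hypothetical prices with a short window so the example stays readable.
market, trend, ratio = trend_equity_curves([100, 110, 120, 90, 60, 70], window=2)
```

In this toy series the strategy steps to cash after the first large decline, so the ratio ends above one even though the strategy also misses the final rebound.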
Through the graphs above, it becomes clear that much of the trend premium is realized by avoiding the large, prolonged bear markets that tend to occur during economic distress. In between these periods, however, the trend strategy lags the market. It makes sense, then, that a potential improvement to this strategy would be to implement an augmentation that could better distinguish between real price break-outs and those that lead to a whipsaw in the portfolio.
Growth-Trend Timing
For each country, we look at a number of economic indicators, including: corporate earnings growth, employment, housing starts, industrial production, and retail sales growth.2 The strategy then follows the rules described above: if the economic indicator in question displays a negative percentage change over the previous twelve-month period, a position is taken in a trend-following strategy utilizing a twelve-month moving average signal. Otherwise, a buy-and-hold position is established.
To ensure that we are not benefitting from look-ahead bias, a lag of three months was imposed on each of the economic indicators, as it would be unrealistic to assume that the economic levels would be known at the end of each month.
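The rule and the publication lag can be sketched in a few lines. This is a simplified illustration rather than our production code: the function name and arguments are hypothetical, both series are assumed to be month-end lists aligned on the same dates, and the fallback to buy-and-hold when history is too short is an assumption.

```python
def gtt_in_market(prices, indicator, t, ma_window=12, yoy_window=12, lag=3):
    """Return True if a growth-trend timing rule holds the market after month-end t."""
    # Only use the indicator reading that would plausibly have been published by month t.
    known = t - lag
    if known - yoy_window < 0 or t + 1 < ma_window:
        # Assumption: default to buy-and-hold until enough history exists.
        return True
    growth = indicator[known] / indicator[known - yoy_window] - 1.0
    if growth >= 0:
        # Economic indicator is not contracting: stay in buy-and-hold.
        return True
    # Economic indicator is contracting: defer to the moving-average trend signal.
    average = sum(prices[t + 1 - ma_window:t + 1]) / ma_window
    return prices[t] > average


# Hypothetical data: prices drift lower while the indicator keeps growing.
falling_prices = [100 - i for i in range(16)]
rising_indicator = [100 + i for i in range(16)]
print(gtt_in_market(falling_prices, rising_indicator, t=15))  # stays invested
```

Note that the trend signal itself is not lagged, since prices are observable in real time; only the economic series is shifted.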
Unfortunately, some of the economic data series could not be found for the entire period in which prices are available. The analysis can still prove useful, though, by indicating the economic regimes in which trend following benefits from growth-trend timing, or by identifying where one indicator may work when another does not.3
In the charts below, we plot the growth-trend timing (referred to as GTT for the remainder of this commentary) for each country utilizing the available signals. The charts represent the relative cumulative performance over the respective country’s market return. For example, when the lines remain flat, the GTT approach has adopted buy-and-hold exposure and therefore matches the respective market’s returns. Any changes in the ratios are due to the GTT strategy investing in the trend following strategy.
Source: MSCI, Global Financial Data, St. Louis Fed, Bloomberg. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
What we see from the above figures is a mixed bag of results.
The overlay of economic indicators was, by and large, successful in mitigating whipsaw losses, as each country reaped the benefits of being primarily long the market during bull markets. Whereas the 12-month moving average strategy tended to slowly give up a portion of the gains realized from severe market environments, the majority of the GTT strategies remained relatively stagnant until the next major correction.
There are some instances, however, where the indicator was late to the economic party. It is worth remembering that the market is, in theory, a forward-looking measure, and therefore sudden economic shocks may not be captured in economic data as quickly as they are in market returns. This created cases where the strategy either missed the chance to be out of the market during a correction or was sitting on the sidelines during the subsequent recoveries. Notably, the employment signal in Australia, Italy, Singapore, and the United Kingdom tended to be a poor leading indicator, as the strategy tended to stay invested longer in bear markets than the trend strategy did.
A Candidate for Ensembling
The implicit assumption in the analysis above is that the included indicators behave in similar ways. For example, by using a twelve-month lookback period for the indicators, we are assuming that each indicator will begin to trend in roughly the same way.
That may not be a particularly fair assumption. Whereas housing starts and retail sales are generally considered leading indicators, employment (unemployment) rates are normally categorized as lagging indicators. For this reason, it may be more beneficial to use a shorter lookback period so as to pick up on potential problems in the economy as they begin to present themselves. Further, some signals tend to be more erratic than others, suggesting that a meaningful lookback period for one indicator may not be meaningful for another. With no perfect reason to prefer one lookback over another, we might consider different lookback periods so as to diversify any specification risk that may exist within the strategy.
With the benefit of hindsight, we know that not all recessions occur for the same reasons, so being reliant on one signal that has worked in the past may not be as beneficial in the future. With this in mind, we should consider that all indicators hold some information as to the state of the economy since one indicator may be signaling the all-clear while another may be flashing warning lights.
For the same reason medical professionals take multiple readings to gain insight into the state of the body, we should also consider any available signals to ascertain the health of the economy.
To ensemble this strategy, we will vary the lookbacks from six to eighteen months, while holding the lag at three months, as well as combine the available economic signals for each country. For the sake of brevity, we will hold the trend-following strategy the same with a twelve-month moving average.
Remember, if the economic signal is negative, it does not mean that we are immediately out of the market: a negative economic signal simply moves the strategy into a trend-following approach. With 5 economic indicators and 13 lookback periods, we have 65 possible strategies for each country. As an example, if 40 of these 65 models were positive and 25 were negative, we would hold roughly 62% (40/65) in the market and 38% (25/65) in the trend-following strategy.
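The arithmetic of the ensemble can be sketched as follows. The indicator labels and the helper name are illustrative, and the sketch assumes every indicator is available for the country in question (which, as noted above, is not always the case).

```python
from itertools import product

INDICATORS = [
    "corporate earnings growth",
    "employment",
    "housing starts",
    "industrial production",
    "retail sales growth",
]
LOOKBACKS = range(6, 19)  # six through eighteen months, inclusive

# One model per (indicator, lookback) pair: 5 x 13 = 65 models.
MODELS = list(product(INDICATORS, LOOKBACKS))


def ensemble_weights(growth_is_positive):
    """Split capital between buy-and-hold and trend following.

    `growth_is_positive` holds one boolean per model: True when that model's
    lagged indicator growth is non-negative.
    """
    signals = list(growth_is_positive)
    w_market = sum(1 for s in signals if s) / len(signals)
    return w_market, 1.0 - w_market


# The example from the text: 40 positive models and 25 negative ones.
w_market, w_trend = ensemble_weights([True] * 40 + [False] * 25)
print(f"{w_market:.0%} market, {w_trend:.0%} trend")  # 62% market, 38% trend
```

The trend-following sleeve is itself either in the market or in cash, so the 38% allocation only leaves the market if the price signal is also negative.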
The resulting performance statistics can be seen in the table below.
Source: MSCI, Global Financial Data, St. Louis Fed, Bloomberg. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
From the table above, we see that there are, again, mixed results. One country that particularly stands out is Italy in that the sign on its return flipped to negative and the drawdown was actually deeper with GTT than with a simple buy-and-hold strategy.
Source: MSCI, Global Financial Data, St. Louis Fed, Bloomberg. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
Digging deeper, it appears that the GTT strategy for Italy was actually whipsawed by more than just trend-following. Housing start data for Italy was not readily available until December 2008, so Italy may have been at a relative disadvantage when compared against the other countries. Since the reliable data we could find begins at the end of 2008 and the majority of the whipsaw losses occur post-Great Financial Crisis, we can run the analysis again, but with housing start data being added in upon its availability.
Source: MSCI, Global Financial Data, St. Louis Fed, Bloomberg. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
Adding housing starts in as an indicator did not meaningfully alter the results over the period. One hypothesis is that the indicators included could not fully encapsulate the complex state of Italy’s economy over the period. Italy has weathered three technical recessions over the past decade, so this could be a regime where the market is looking to sources outside the country for indications of distress or where the economic indicator is not reflective of the pressures driving the market.
Source: MSCI, St. Louis Fed. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
Above, we can see several divergences between the market movement and changes in real GDP. Specifically, in the past decade, we see that the market reacted to information that didn’t materialize in the country’s real GDP. More likely, the market was reacting to regional financial distress driven by debt concerns.
The MSCI Italy index is currently composed of 24 constituents with multinational business operations. Additionally, the index maintains large concentrations in financials, utilities, and energy: 33%, 25%, and 14%, respectively.4 Because of this sector concentration, utilizing the economic indicators may overly focus on the economic health of Italy while ignoring external factors such as energy prices or broader financial distress that could be swaying the market needle.
A parallel explanation could be that the Eurozone is entangled enough that signals could be interfering with each other between countries. Further research could seek to disaggregate signals between the Eurozone and the member-countries, attempting to differentiate between zone, regional, and country signals to ascertain further meaning.
Additionally, economic indicators are influenced by both the private and public sector so this could represent a disconnect between public company health and private company health.
Conclusion
In this commentary, we sought to answer the question, “can we improve trend-following by drawing information from a country’s economy?” It intuitively makes sense that an investor would generally opt for remaining in the market unless there are systemic issues that may lead to market distress. A strategy that successfully differentiates between market choppiness and periods of potential recession would drastically mitigate any losses incurred from whipsaw, thereby capturing a majority of the equity premium as well as the trend premium.
We find that growth-trend timing has been relatively successful in countries such as the United States, Germany, and Japan. However, each country being analyzed should be considered in light of its specific circumstances.
Peeking under the hood of Italy, it becomes clear that market movements may be influenced by more than a country’s implicit economic health. In such a case, we should pause and ask ourselves whether a macroeconomic indicator is truly reflective of that country’s economy or if there are other market forces pulling the strings.
Timing Trend Model Specification with Momentum
By Corey Hoffstein
On December 23, 2019
Summary
Over the last several years, we’ve advocated on numerous occasions for a more holistic view of diversification: one that goes beyond just what we invest in, but also considers how those decisions are made and when they are made.
We believe that this style of thinking can be applied “all the way down” our process. For example, how-based diversification would advocate for the inclusion of both value and momentum processes, as well as for different approaches to capturing value and momentum.
Unlike correlation-based what diversification, how-based diversification often does little for traditional portfolio risk metrics. For example, in Is Multi-Manager Diversification Worth It? we demonstrated that within most equity categories, allocating across multiple managers does almost nothing to reduce portfolio volatility. It does, however, have a profound impact on the dispersion of terminal wealth that is achieved, often by avoiding manager-specific tail-risks. In other words, our certainty of achieving a given outcome may be dramatically improved by taking a multi-manager approach.
Ensemble approaches to portfolio construction can be thought of as adopting this same multi-manager approach by creating a set of virtual managers to allocate across.
In late 2018, we wrote two notes that touched upon this: When Simplicity Met Fragility and What Do Portfolios and Teacups Have in Common? In both studies we injected a bit of randomness into asset returns to measure the stability of trend-following strategies. We found that highly simplistic models tended to exhibit significant deviations in results with just slightly modified inputs, suggesting that they are highly fragile. Increasing diversification across what, how, and when axes led to a significant improvement in outcome stability.
As empirical evidence, we studied the real-time results of the popular Dual Momentum GEM strategy in our piece Fragility Case Study: Dual Momentum GEM, finding that slight deviations in model specification lead to significantly different allocation conclusions and therefore meaningfully different performance results. This was particularly pronounced over short horizons.
Tying trend-following to option theory, we then demonstrated how an ensemble of trend following models and specifications could be used to increase outcome certainty in Tightening the Uncertain Payout of Trend-Following.
Yet while more diversification appears to make portfolios more consistent in the outcomes they achieve, empirical evidence also suggests that certain specifications can lead to superior results for prolonged periods of time. For example, slower trend following signals appear to have performed much, much better than fast trend following signals over the last two decades.
One of the benefits of being a quant is that it is easy to create thousands of virtual managers, all of whom may follow the same style (e.g. “trend”) but implement with a different model (e.g. prior total return, price-minus-moving-average, etc) and specification (e.g. 10 month, 200 day, 13 week / 34 week cross, etc). An ancillary benefit is that it is also easy to re-allocate capital among these virtual managers.
Given this ease, and knowing that certain specifications can go through prolonged periods of out-performance, we might ask: can we time specification choices with momentum?
Timing Trend Specification
In this research note, we will explore whether momentum signals can help us time our specification choices as they relate to a simple long/flat U.S. trend equity strategy.
Using data from the Kenneth French library, our strategy will hold broad U.S. equities when the trend signal is positive and shift to the risk-free asset when trends are negative. We will develop 1023 different strategies by employing three different models – prior total return, price-minus-moving-average, and dual-moving-average-cross-over – with lookback choices spanning from 20-to-360 days in length.
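As a rough sketch of how such a family of virtual managers might be enumerated, the snippet below defines the three signal types and crosses them with every daily lookback from 20 to 360 inclusive, which gives 3 x 341 = 1023 specifications. The function names are hypothetical, and the way the dual-moving-average model derives its short window from a single lookback (a fixed fraction of the long window) is purely an assumption made for illustration.

```python
LOOKBACKS = range(20, 361)  # 341 daily lookbacks, inclusive


def prior_total_return(prices, t, lookback):
    """Positive when the price has risen over the lookback window."""
    return prices[t] > prices[t - lookback]


def price_minus_moving_average(prices, t, lookback):
    """Positive when the price is above its trailing average."""
    window = prices[t - lookback + 1:t + 1]
    return prices[t] > sum(window) / lookback


def dual_moving_average_cross(prices, t, lookback, fast_fraction=0.25):
    """Positive when a short average is above the long average.

    Assumption: the short window is a fixed fraction of the long window.
    """
    fast = max(1, int(lookback * fast_fraction))
    fast_avg = sum(prices[t - fast + 1:t + 1]) / fast
    slow_avg = sum(prices[t - lookback + 1:t + 1]) / lookback
    return fast_avg > slow_avg


MODELS = {
    "prior_total_return": prior_total_return,
    "price_minus_moving_average": price_minus_moving_average,
    "dual_moving_average_cross": dual_moving_average_cross,
}
SPECIFICATIONS = [(name, lb) for name in MODELS for lb in LOOKBACKS]


def signals_at(prices, t):
    """Return {(model, lookback): bool} for every specification at day t."""
    if t < max(LOOKBACKS):
        raise ValueError("not enough price history for the longest lookback")
    return {(name, lb): MODELS[name](prices, t, lb) for name, lb in SPECIFICATIONS}
```

Each specification then holds equities when its signal is True and the risk-free asset otherwise.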
After constructing the 1023 different strategies, we will then apply a momentum model that ranks the models based upon prior returns and equally-weights our portfolio across the top 10%. These choices are made daily and implemented with 21 overlapping portfolios to reduce the impact of rebalance timing luck.
It should be noted that because the underlying strategies are only allocating between U.S. equities and a risk-free asset, they can go through prolonged periods where they have identical returns or where more than 10% of models share the highest prior return. In these cases, we select all models that have returns equal-to-or-greater-than the model identified at the 10th percentile.
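The tie-handling rule can be illustrated with a small selection function. This is a sketch under stated assumptions: the function name is hypothetical, the number of models in the top decile is rounded down (with a minimum of one), and the overlapping-portfolio step is left out for brevity.

```python
def select_top_decile(prior_returns, fraction=0.10):
    """Equal-weight every model whose prior return meets the top-decile cutoff.

    `prior_returns` maps a specification to its trailing return. Ties at the
    cutoff are all included, so more than 10% of models can be selected.
    """
    ranked = sorted(prior_returns.values(), reverse=True)
    # Assumption: round the decile size down, but always keep at least one model.
    n_top = max(1, int(len(ranked) * fraction))
    cutoff = ranked[n_top - 1]
    chosen = [spec for spec, ret in prior_returns.items() if ret >= cutoff]
    return {spec: 1.0 / len(chosen) for spec in chosen}


# Hypothetical case: 15 of 20 models share the best prior return.
tied = {f"model_{i}": (0.05 if i < 15 else 0.01) for i in range(20)}
weights = select_top_decile(tied)
print(len(weights))  # 15
```

When many specifications are in the same position, the selection therefore drifts back toward the diversified portfolio rather than picking arbitrarily among tied models.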
Before comparing performance results, we think it is worthwhile to take a quick look under the hood to see whether the momentum-based approach is actually creating meaningful tilts in specification selection. Below we plot both aggregate model and lookback weights for the 126-day momentum strategy.
Source: Kenneth French Data Library. Calculations by Newfound Research.
We can see that while the model selection remains largely balanced, with the exception of a few periods, the lookback horizon selection is far more volatile. On average, the strategy preferred intermediate-to-long-term signals (i.e. 181-to-360 day), but we can see intermittent periods where short-term models found favor.
Did this extra effort generate value, though? Below we plot the ratio of the momentum strategies’ equity curves versus the naïve diversified approach.
We see little consistency in relative performance and four of the five strategies end up flat-to-worse. Only the 252-day momentum strategy out-performs by the end of the testing period and this is only due to a stretch of performance from 1950-1964. In fact, since 1965 the relative performance of the 252-day momentum model has been negative versus the naively diversified approach.
Source: Kenneth French Data Library. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
This analysis suggests that naïve, momentum-based specification selection does not appear to have much merit against a diversified approach for our simple trend equity strategy.
The Potential Benefits of Virtual Rebalancing
One potential benefit of an ensemble approach is that rebalancing across virtual managers can generate growth under certain market conditions. Similar to a strategically rebalanced portfolio, we find that when returns across virtual managers are expected to be similar, consistent rebalancing can harvest excess returns above a buy-and-hold approach.
The trade-off, of course, is that when there is autocorrelation in specification performance, rebalancing creates a drag. However, given that the evidence above suggests that relative performance between specifications is not persistent, we might expect that continuously rebalancing across our ensemble of virtual managers may actually allow us to harvest returns above and beyond what might be possible with just selecting an individual manager.
Source: Kenneth French Data Library. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
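A stylized two-manager example helps illustrate the trade-off. The return streams below are invented purely for illustration and are not drawn from our data: in the first case the managers take turns out-performing and end with the same terminal wealth, while in the second one manager out-performs persistently.

```python
def buy_and_hold(returns_a, returns_b):
    """Terminal wealth from splitting $1 evenly and never rebalancing."""
    a = b = 0.5
    for ra, rb in zip(returns_a, returns_b):
        a *= 1.0 + ra
        b *= 1.0 + rb
    return a + b


def rebalanced(returns_a, returns_b):
    """Terminal wealth from resetting to a 50/50 split every period."""
    wealth = 1.0
    for ra, rb in zip(returns_a, returns_b):
        wealth *= 1.0 + 0.5 * ra + 0.5 * rb
    return wealth


# Case 1: relative performance alternates, so neither manager wins over time.
alt_a = [0.20, -0.10] * 5
alt_b = [-0.10, 0.20] * 5

# Case 2: relative performance persists (autocorrelated out-performance).
per_a = [0.20] * 10
per_b = [-0.10] * 10

print(round(buy_and_hold(alt_a, alt_b), 4), round(rebalanced(alt_a, alt_b), 4))
print(round(buy_and_hold(per_a, per_b), 4), round(rebalanced(per_a, per_b), 4))
```

In the alternating case, rebalancing compounds at a steady 5% per period and finishes ahead of buy-and-hold; in the persistent case, rebalancing keeps taking capital away from the winner and finishes well behind.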
Conclusion
In this study, we explored whether we could time model specification choices in a simple trend equity strategy using momentum signals.
Testing different lookback horizons of 21-through-378 days, we found little evidence of meaningful persistence in the returns of different model specifications. In fact, four of the five momentum models we studied actually under-performed a naïve, diversified approach. The one model that did out-perform only seemed to do so due to strong performance realized over the 1950-1964 period, and it has relatively under-performed ever since.
While this evidence suggests that timing specification with momentum may not be a fruitful approach, it does suggest that the lack of return persistence may benefit diversification for a second reason: rebalancing. Indeed, barring any belief that one specification would necessarily do better than another, consistently re-pooling and distributing resources through rebalancing may actually lead to the growth-optimal solution.1 This potentially implies an even higher hurdle rate for specification-timers to overcome.