This post is available as a PDF download here.
Summary
- Systematic value strategies have struggled in the post-2008 environment, so one that has performed well catches our eye.
- The Barclays Shiller CAPE sector rotation strategy – a value-based sector rotation strategy – has out-performed the S&P 500 by 267 basis points annualized since it launched in 2012.
- The strategy applies a unique Relative CAPE metric to account for structural differences in sector valuations as well as a momentum filter that seeks to avoid “value traps.”
- In an effort to derive the source of out-performance, we explore various other valuation metrics and model specifications.
- We find that what has actually driven performance in the past may have little to do with value at all.
It is no secret that systematic value investing of all sorts has struggled as of late. With the curious exception, that is, of the Barclays Shiller CAPE sector rotation strategy, a strategy explored by Bunn, Staal, Zhuang, Lazanas, Ural and Shiller in their 2014 paper Es-cape-ing from Overvalued Sectors: Sector Selection Based on the Cyclically Adjusted Price-Earnings (CAPE) Ratio. Initial performance suggests that the idea has performed quite well out-of-sample, which stands out among the many “smart-beta” strategies that have failed to live up to their backtests.
Source: CSI Data. Calculations by Newfound Research. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.
Why is this strategy finding success where other value strategies have not? That is what we aim to explore in this commentary.
On a monthly basis, the Shiller CAPE sector rotation portfolio is rebalanced into an equal-weight allocation across four of the ten primary GICS sectors. The four are selected first by ranking the 10 primary sectors based upon their Relative CAPE ratios and choosing the cheapest five sectors. Of those cheapest five sectors, the sector with the worst trailing 12-month return (“momentum”) is removed.
The CAPE ratio – standing for Cyclically-Adjusted Price-to-Earnings ratio – is the current price divided by the 10-year moving average of inflation-adjusted earnings. The purpose of this smoothing is to reduce the impact of business cycle fluctuations.
The potential problem with using the raw CAPE value for each sector is that certain sectors have structurally higher and lower CAPE ratios than their peers. High growth sectors – e.g. Technology – tend to have higher CAPE ratios because they reinvest a substantial portion of their earnings while more stable sectors – e.g. Utilities – tend to have much lower CAPE ratios. Were we to simply sort sectors based upon their current CAPE ratio, we would tend to create structural over- and under-weights towards certain sectors.
To adjust for this structural difference, the strategy uses the idea of a Relative CAPE ratio, which is calculated by taking the current CAPE ratio and dividing it by a rolling 20-year average CAPE ratio for that sector. The thesis behind this step is that dividing by a long-term mean normalizes the sectors and allows for better comparison. Relative CAPE values above 1 mean that the sector is more expensive than it has historically been, while values less than 1 mean it is cheaper.
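As a rough illustration of the two ratios described above, the sketch below computes a CAPE and a Relative CAPE from made-up numbers. The function names, the inputs, and the choice to include the current reading in the rolling average are assumptions for illustration, not details confirmed by the strategy's documentation.

```python
from statistics import mean

def cape(price, real_earnings_history):
    """Current price divided by the average of trailing inflation-adjusted earnings."""
    return price / mean(real_earnings_history)

def relative_cape(cape_history):
    """Latest CAPE divided by the average CAPE over the supplied lookback window.

    Assumption: the window includes the latest observation.
    """
    return cape_history[-1] / mean(cape_history)

# Hypothetical CAPE histories (oldest first); not actual sector data.
tech_cape_history = [60.0, 45.0, 30.0, 25.0]
utilities_cape_history = [14.0, 15.0, 16.0, 19.0]

tech_rel = relative_cape(tech_cape_history)            # 25 / 40 = 0.625
utilities_rel = relative_cape(utilities_cape_history)  # 19 / 16 = 1.1875
```

In this toy example the sector with the higher raw CAPE still looks cheaper on a Relative CAPE basis, because its own history is dominated by much higher readings.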
It is important to note here that the actual selection is still performed on a cross-sector basis. It is entirely possible that all the sectors appear cheap or expensive on a historical basis at the same time. The portfolio will simply pick the cheapest sectors available.
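The selection rules described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names and the sector figures are hypothetical, and ties are broken by input order rather than by any rule from the original paper.

```python
def select_sectors(relative_cape, momentum_12m, n_cheapest=5):
    """Return the sectors held after the value screen and the momentum filter."""
    # Value screen: keep the sectors with the lowest Relative CAPE.
    cheapest = sorted(relative_cape, key=relative_cape.get)[:n_cheapest]
    # Momentum filter: drop the one with the worst trailing 12-month return.
    worst = min(cheapest, key=momentum_12m.get)
    return [s for s in cheapest if s != worst]

def equal_weights(sectors):
    """Equal-weight allocation across the selected sectors."""
    return {s: 1.0 / len(sectors) for s in sectors}

# Hypothetical inputs for a subset of sectors; not actual market data.
rel = {"Tech": 0.60, "Energy": 0.70, "Financials": 0.75, "Health Care": 0.80,
       "Staples": 0.90, "Utilities": 1.10, "Industrials": 1.20}
mom = {"Tech": 0.15, "Energy": -0.20, "Financials": 0.05, "Health Care": 0.10,
       "Staples": 0.02, "Utilities": 0.08, "Industrials": 0.12}

held = select_sectors(rel, mom)   # Energy is cheap but has the worst momentum
weights = equal_weights(held)     # 25% in each of the four remaining sectors
```

Note that the ranking is purely cross-sectional: even if every Relative CAPE were above 1, the function would still return four sectors.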
Poking and Prodding the Parameters
With an understanding of the rules, our first step is to poke and prod a bit to figure out what is really driving the strategy.
We begin by first exploring the impact of using the Relative CAPE ratio versus just the CAPE ratio.
For each of these ratios, we’ll plot two strategies. The first is a naïve Value strategy, which will equally-weight the four cheapest sectors. The second is the Shiller strategy, which chooses the top five cheapest sectors and drops the one with the worst momentum. This should provide a baseline for comparing the impact of the momentum filter.
Strategy returns are plotted relative to the S&P 500.
Source: Siblis Research; Morningstar; CSI Data. Calculations by Newfound Research. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.
For the Relative CAPE ratio, we also vary the lookback period for calculating the rolling average CAPE from 5- to 20-years.
Source: Siblis Research; Morningstar; CSI Data. Calculations by Newfound Research. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.
A few things immediately stand out:
- Interestingly, standard CAPE actually appears to perform better than Relative CAPE for both the traditional value and Shiller implementations.
- The Relative CAPE approach fared much more poorly from 2004-2007 than the simple CAPE approach.
- There is little difference in performance for the Value and Shiller strategy for standard CAPE, but a meaningful difference for Relative CAPE.
- While standard CAPE value has had stagnant relative performance since 2007, Relative CAPE appears to continue to work for the Shiller approach.
- A naïve value implementation seems to perform quite poorly for Relative CAPE, while the Shiller strategy appears to perform rather well.
- There is meaningful performance dispersion based upon the lookback period, with longer-dated lookbacks (darker shades) appearing to perform better than shorter-period lookbacks (lighter shades) for the Relative CAPE variation.
The second-to-last point is particularly curious, as it implies that using momentum to “avoid the value trap” creates significant value (no pun intended; okay, pun intended) for the strategy.
Varying the Value Metric (in Vain)
To gain more insight, we next test the impact of the choice of the CAPE ratio. Below we plot the relative returns of different Shiller-based strategies (again varying lookbacks from 5- to 20-years), but use price-to-book, trailing 12-month price-to-earnings, and trailing 12-month EV/EBITDA as our value metrics.
A few things stand out:
- Value-based sector rotation seems to have “worked” from 2000 to 2009, regardless of our metric of choice.
- Almost all value-based strategies appear to exhibit significant relative out-performance during the dot-com and 2008 recessions.
- After 2009, most value strategies appear to exhibit random relative performance versus the S&P 500.
- All three approaches appear to suffer since 2016.
Source: Siblis Research; Morningstar; CSI Data. Calculations by Newfound Research. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.
At this point, we have to ask: is there something special about the Relative CAPE that makes it inherently superior to other metrics?
A Big Bubble-Based Bet?
If we take a step back for a moment, it is worth asking ourselves a simple question: what would it take for a sector rotation strategy to out-perform the S&P 500 over the last decade?
With the benefit of hindsight, we know Consumer Discretionary and Technology have led the pack, while traditionally stodgy sectors like Consumer Staples and Utilities have lagged behind (though not nearly as poorly as Energy).
As we mentioned earlier, a naïve rank on the CAPE ratio would almost certainly prefer Utilities and Staples over Technology and Discretionary. Thus, for us to outperform the market, we must somehow construct a value metric that identifies the two most chronically expensive sectors (ignoring back-dated valuations for the new Communication Services sector) as being among the cheapest.
This is where dividing by the rolling 20-year average comes into play. In spirit, it makes a certain degree of sense. In practice, however, this plays out perfectly for Technology, which went through such an enormous bubble in the late 1990s that the 20-year average was meaningfully skewed upward by an outlier event. Thus, for almost the entire 20-year period after the dot-com bubble, Technology appears to be relatively cheap by comparison. After all, you can buy for 30x earnings today what you used to be able to buy for 180x!
The result is a significant – and near-permanent – tilt towards Technology since the beginning of 2012, which can be seen in the graph of strategy weights below.
One way to explore the impact of this choice is to calculate the weight differences between a top-4 CAPE strategy and a top-4 Relative CAPE strategy, which we also plot below. We can see that after early 2012, the Relative CAPE strategy is structurally overweight Technology and underweight Financials and Utilities. Prior to 2008, we can see that it is structurally underweight Energy and overweight Consumer Staples.
If we take these weights and use them to construct a return stream, we can isolate the return impact the choice of using Relative CAPE versus CAPE has. Interestingly, the long Technology / short Financials & Utilities trade did not appear to generate meaningful out-performance in the post-2012 era, suggesting that something else is responsible for post-2012 performance.
Source: Siblis Research; Morningstar; CSI Data. Calculations by Newfound Research. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.
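The weight-difference return stream described above can be sketched as follows. The portfolios and the single month of sector returns are hypothetical, and the helper names are ours; the point is only to show how subtracting one set of weights from the other produces a dollar-neutral long/short portfolio whose return isolates the metric choice.

```python
def weight_difference(weights_a, weights_b):
    """Per-sector weight of portfolio A minus portfolio B (missing sectors count as 0)."""
    sectors = sorted(set(weights_a) | set(weights_b))
    return {s: weights_a.get(s, 0.0) - weights_b.get(s, 0.0) for s in sectors}

def long_short_return(diff_weights, sector_returns):
    """One-period return of the long/short portfolio implied by the weight differences."""
    return sum(w * sector_returns[s] for s, w in diff_weights.items())

# Hypothetical top-4 portfolios; not the actual historical holdings.
relative_cape_w = {"Tech": 0.25, "Health Care": 0.25, "Staples": 0.25, "Energy": 0.25}
cape_w = {"Financials": 0.25, "Utilities": 0.25, "Staples": 0.25, "Energy": 0.25}
diff = weight_difference(relative_cape_w, cape_w)

# Hypothetical one-month sector returns.
month_returns = {"Tech": 0.04, "Health Care": 0.02, "Staples": 0.01,
                 "Energy": -0.01, "Financials": 0.03, "Utilities": 0.00}
ls = long_short_return(diff, month_returns)  # 0.25*(0.04 + 0.02 - 0.03 - 0.00) = 0.0075
```

Sectors held by both portfolios net to zero, so only the disagreements between the two metrics contribute to the return.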
The Miraculous Mojo of Momentum
This is where the 12-month momentum filter plays a crucial role. Narratively, it is to avoid value traps. Practically, it helps the strategy deftly dodge Financials in 2008, avoiding a significant melt-down in one of the S&P 500’s largest sectors.
Now, you might think that valuations alone should have allowed the strategy to avoid Technology in the dot-com fallout. As it turns out, the Technology CAPE fell so precipitously that in using the Relative CAPE metric the Technology sector was still ranked as one of the top five cheapest sectors from 3/2001 to 11/2002. The only way the strategy was able to avoid it? The momentum filter.
Removing this filter makes the relative results a lot less attractive. Below we re-plot the relative performance of a simple “top 4” Relative CAPE strategy.
Source: Siblis Research; Morningstar; CSI Data. Calculations by Newfound Research. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.
Just how much impact does the momentum filter have? We can isolate the effect by taking the weights of the Shiller strategy and subtracting the weights of the Value strategy to construct a long/short index that isolates the effect. Below we plot the returns of this index.
It should be noted that the legs of the long/short portfolio only have a notional exposure of 25%, as that is the most the Value and Shiller strategies can deviate by. Nevertheless, even with this relatively small weight, when isolated the filter generates an annualized return of 1.8% per year with an annualized volatility of 4.8% and a maximum drawdown of 11.6%.
Scaled to a long/short with 100% notional per leg, annualized returns jump to 6.0%, though volatility and maximum drawdown climb to 20.4% and 52.6%, respectively.
Source: Siblis Research; Morningstar; CSI Data. Calculations by Newfound Research. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.
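To see why each leg of the isolated momentum filter carries at most 25% notional, consider the following sketch. It assumes, as in the text, that the Value strategy holds the four cheapest sectors and the Shiller strategy holds the five cheapest minus the worst-momentum one; the sector labels and momentum values are hypothetical.

```python
def value_portfolio(ranked_cheapest_first):
    """Equal-weight the four cheapest sectors."""
    return {s: 0.25 for s in ranked_cheapest_first[:4]}

def shiller_portfolio(ranked_cheapest_first, momentum_12m):
    """Take the five cheapest sectors and drop the one with the worst momentum."""
    five = ranked_cheapest_first[:5]
    worst = min(five, key=momentum_12m.get)
    return {s: 0.25 for s in five if s != worst}

def filter_overlay(ranked, momentum_12m):
    """Shiller weights minus Value weights: the isolated momentum filter."""
    value = value_portfolio(ranked)
    shiller = shiller_portfolio(ranked, momentum_12m)
    names = sorted(set(value) | set(shiller))
    return {s: shiller.get(s, 0.0) - value.get(s, 0.0) for s in names}

# Hypothetical sectors ranked cheapest first, with hypothetical 12-month returns.
ranked = ["A", "B", "C", "D", "E", "F"]
mom = {"A": 0.10, "B": -0.30, "C": 0.05, "D": 0.02, "E": 0.01, "F": 0.20}

overlay = filter_overlay(ranked, mom)                         # long E, short B
long_notional = sum(w for w in overlay.values() if w > 0)
short_notional = -sum(w for w in overlay.values() if w < 0)
scaled = {s: 4 * w for s, w in overlay.items()}               # 100% notional per leg
```

When the fifth-cheapest sector is itself the one with the worst momentum, the two portfolios coincide and the overlay is flat for that month.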
Conclusion
Few, if any, systematic value strategies have performed well as of late. When one does – as with the Shiller CAPE sector rotation strategy – it is worth further review.
As a brief summary of our findings:
- Despite potential structural flaws in measuring cross-sectional sector value, CAPE outperformed Relative CAPE for a naïve rank-based value strategy.
- There is significant dispersion in results using the Relative CAPE metric depending upon which lookback parameterization is selected. Initial tests suggest that the longer lookbacks appear to have been more effective.
- Using valuation metrics other than CAPE – e.g. P/B, P/E (TTM), and EV/EBITDA (TTM) – does not appear as effective in recent years.
- Longer lookbacks allow the Relative CAPE methodology to create a structural overweight to the Technology sector over the last 15 years.
- The momentum filter plays a crucial role in avoiding the Technology sector in 2001-2002 and the Financial sector in 2008.
Taken all together, it is hard to not question whether these results are unintentionally datamined. Unfortunately, we just do not have enough data to extend the tests further back in time for truly out-of-sample analysis.
What we can say, however, is that the backtested and live performance hinges almost entirely on a few key trades:
- Avoiding Technology in 2001-2002 due to the momentum filter.
- Avoiding Financials in 2008 due to the momentum filter.
- Avoiding a Technology underweight in recent years because the dot-com bubble inflated the “average” historical CAPE.
- Avoiding Energy in 2014-2016 due to the momentum filter.
Three of these four trades are driven by the momentum filter. When we further consider that the Shiller strategy is, in effect, the combination of the returns of the pure value implementation – which suffered in the dot-com run-up and was a mostly random walk thereafter – and the returns of the isolated momentum filter, it becomes rather difficult to call this a value strategy at all.
As of the date of this document, neither Newfound Research nor Corey Hoffstein holds a position in the securities discussed in this article and do not have any plans to trade in such securities. Newfound Research and Corey Hoffstein do not take a position as to whether this security should be recommended for any particular investor.
Macro and Momentum Factor Rotation
By Yuling "Ivey" Zhang
On September 30, 2019
Summary
While the empirical evidence suggests that factor investing has historically generated an excess (risk-adjusted) return premium over the long-run, short-term performance can be volatile. Because of this, an increasing number of investors are adopting a diversified approach to factor investing, holding multiple factors at once.
Although even simple allocation models – such as naïve equal-weight – have historically harvested these diversification benefits, some researchers believe that dynamic factor allocation can further enhance the returns.
One argument for taking a dynamic approach is that the performance variability is cyclical and linked to different stages of the economic cycle. The different underlying economic drivers lead to differentiated active returns and therefore lead to potential opportunities for cross-factor rotation.
For example, in a recent white paper, MSCI provides several dynamic models in what they call “Adaptive Multi-Factor Allocation.”
In this commentary, we replicate two variations of MSCI’s Adaptive Multi-Factor models: the macro-cycle-based and momentum-based allocation methodologies. While MSCI’s methodology is tested on world equity data, we employ US equity data as a (slightly) out-of-sample test. In line with MSCI’s research, six long-only US factors – Value, Size, Low Volatility, High Yield, Quality, and Momentum – were selected as our investible universe.
Macro Cycle-Based Allocation
Many research studies suggest that macro indicators have strong explanatory power for systematic factor returns. One hypothesis is that the excess returns of factors could be compensation for bearing different forms of macroeconomic risk.
The historical performance of factor investing seems to support this argument, as returns have been cyclical across different stages of the business cycle. For example, value and size factors tend to be most affected by negative economic growth (providing a risk-based argument for their long-term premium), while quality and low volatility are usually the most defensive factors due to their structurally lower equity betas. Therefore, it might make sense to invest in defensive factors during periods of economic slowdown. Conversely, cyclical factors could add more value during expansionary phases.
Following MSCI’s methodology, economic cycles are classified into four primary states: Expansion, Slowdown, Contraction and Recovery. Each state is defined based upon the level and slope of a 3-month moving average (“MA”) minus a 12-month MA. In this commentary, we will employ three macroeconomic indicators (the three best-performing macro indicators in MSCI’s model):
If the 3-month MA is above the 12-month MA and the spread between the two is increasing, the economic state is labeled as an Expansion. If the spread is decreasing, however, the economic state is labeled as a Slowdown. On the other hand, if the 3-month MA is below the 12-month MA and the spread is declining, the state is Contraction. If the spread is negative but increasing, then the economy is in a Recovery.
Exhibit 1: Economic Cycles of ISM PMI Index
Source: Quandl PMI Composite Index. Calculations by Newfound Research. Results are hypothetical and should not be used for investment purpose.
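The four-state classification rule can be sketched as below. This is an illustrative reading of the rule, not MSCI's code: we assume "increasing" means the spread is higher than in the prior month, we treat a zero spread as negative, and the indicator readings are invented.

```python
def moving_average(values, window, end):
    """Average of the `window` observations ending at index `end` (inclusive)."""
    return sum(values[end - window + 1 : end + 1]) / window

def classify_state(indicator, t):
    """Label month `t` using the 3-month MA minus the 12-month MA.

    Requires t >= 12 so that the prior month's 12-month MA is also available.
    """
    if t < 12:
        raise ValueError("need at least 13 observations up to and including t")
    spread_now = moving_average(indicator, 3, t) - moving_average(indicator, 12, t)
    spread_prev = moving_average(indicator, 3, t - 1) - moving_average(indicator, 12, t - 1)
    rising = spread_now > spread_prev  # assumption: compare with the prior month
    if spread_now > 0:
        return "Expansion" if rising else "Slowdown"
    return "Recovery" if rising else "Contraction"

# Hypothetical indicator readings (oldest first); not real PMI data.
readings = [60, 59, 58, 57, 56, 55, 54, 53, 52, 51, 50, 49, 49, 49]
state = classify_state(readings, len(readings) - 1)  # spread is negative but narrowing
```

Here the short average is still below the long average, but the gap has started to close, so the month is labeled a Recovery.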
An equal-weight portfolio comprising 3 of the 6 factors is constructed for each state. According to the MSCI paper, the 3 factors for each stage are predetermined based on their historical performance and the past studies on each factor. While we have described our skepticism of these choices in our previous commentary on Style Surfing the Business Cycle, we will assume that the intuition on these factor mixes is correct.
The given combinations are as follows:
With the business cycle signals from the moving average cross-overs and the regime-based factor baskets, we can implement the dynamic factor strategies. We construct a portfolio for each indicator, rebalancing monthly based upon the identified economic regime. We also construct a blended portfolio by combining these three sub-portfolios together in hopes of benefiting from signal diversification.
Below we plot the relative returns for each portfolio against the MSCI USA Index and compare them to the equal-weighted portfolio across all 6 factors.
Exhibit 2: Relative Performance of Macro-Cycle Timing Portfolios vs. MSCI USA Index
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns are hypothetical and are not intended to be interpreted as recommendation to any portfolio construction. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index. Sample Period is November 1997 – August 2019.
As we can observe from Exhibit 2, the blended portfolio (and each individual strategy) has meaningfully outperformed both a market-cap weighted benchmark as well as an equal-weight portfolio of the six underlying factors over the past 20+ years. It seems like the macro-cycle-based factor allocation provides a promising return.
Why is that? One potential reason is that factor returns are linked to the economic cycle and that using monthly (or even daily) updated macro indicators can provide timely insights into the economic state. These higher-frequency signals allow us to potentially capture smaller business cycle fluctuations that are not announced by the NBER (National Bureau of Economic Research) or other institutions. The 3-month vs. 12-month moving average may also help filter undesired noise from the process.
Another potential reason is that the factors were well selected for each identified economic state. However, as we highlighted in our previous commentary, we know the factor allocations for each state were largely determined by the historical performances of each factor during that period of the business cycle.
This raises an important question: is the result a byproduct of data mining or the materialization of an unintentional hindsight bias?
To explore this question, we will perform a random sampling test. Specifically, we will look at the results of alternative portfolio choices we could have made. With four economic states and 3 (out of 6) factors selected for each, there are 160,000 possible economic state / factor portfolio configurations. Below we plot the annualized return distribution of these different configurations and highlight where the MSCI selection falls:
Exhibit 3: Distribution of Random Sampling Test
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index.
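The 160,000 figure follows from choosing 3 of 6 factors (20 baskets) independently for each of the four states, giving 20^4 configurations. The sketch below verifies the count and shows one way a configuration could be drawn at random and a chosen result ranked against a distribution; the helper names and the seed are our own assumptions, and no returns are computed here.

```python
from itertools import combinations
from math import comb
import random

FACTORS = ["Value", "Size", "Low Volatility", "High Yield", "Quality", "Momentum"]
STATES = ["Expansion", "Slowdown", "Contraction", "Recovery"]

baskets = list(combinations(FACTORS, 3))   # every possible 3-factor basket
n_configs = len(baskets) ** len(STATES)    # one basket per state: 20 ** 4

def random_configuration(rng):
    """Draw one basket per economic state, independently and uniformly."""
    return {state: rng.choice(baskets) for state in STATES}

def percentile_rank(value, population):
    """Percentage of the population strictly below `value`."""
    population = list(population)
    return 100.0 * sum(x < value for x in population) / len(population)

rng = random.Random(0)                     # fixed seed, for repeatability only
sample = random_configuration(rng)
```

With only 160,000 configurations, enumerating all of them rather than sampling is also feasible.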
Among all the possible combinations, the portfolio defined by MSCI’s factor choices lies at the 98th percentile on the annualized return distribution curve. While we would certainly want the choice to perform better than a random selection, such strong performance might suggest the choice was impacted by the benefit of hindsight.
However, it is interesting to note that in MSCI’s definition, value is held in three states of the business cycle (contraction, recovery, and expansion), while the value factor in MSCI’s construction may not actually be the most appropriate factor, either intuitively or historically, for all three periods. For example, during periods of expansion, some argue that the market tends to favor companies with high growth potential instead of firms with low intrinsic value.
Below we plot the annualized returns for each factor during each macro-economic state (as defined by the CFNAI indicator).
Exhibit 4: Annualized Returns for Each Factor During Each Macro State
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index.
Of the 12 possible factor choices to match up with the fully data-mined allocations, MSCI aligns with 9. Still, we cannot assert that MSCI’s predefined rotation rule is a byproduct of pure data mining. It could be a mix of data mining and prevalent beliefs (e.g. the defensive nature of value prior to the 2008 crisis). We should also remember that we are testing U.S. equity factors while the original MSCI research was performed on world equity data, which might lead to subtly different factor choices.
We should also be careful to consider how market unpredictability might negatively skew the returns. As there is no set reason for how or why a financial crisis might unfold, the reliability of using predetermined definitions based solely upon past history may be questionable for future performance.
Exhibit 5: Monthly Return Distribution of MSCI US Value Factor in Contraction, Recovery, and Expansion Phases
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index. Sample Period is November 1997 – August 2019.
Momentum-Based Allocation
MSCI also tests a dynamic factor construction based upon momentum signals. This approach is also not without academic basis. For example, Research Affiliates performed a study on the momentum effect amongst 51 factors and found that factors exhibit stronger momentum than both individual stocks and industries. They found that momentum is a prevailing property of almost all factors.
To test the viability of momentum-based allocation, we follow MSCI’s methodology and rank the factors based upon their prior returns, rebalancing monthly and holding the top 3 ranked factors. The ranking is calculated based upon the last 1-month, 6-month, and 12-month total returns for each factor.
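A minimal sketch of the ranking step is shown below. It assumes the trailing total return is compounded from monthly returns and that no recent month is skipped; both are our assumptions rather than details confirmed by the MSCI paper, and the return histories are invented.

```python
def trailing_return(monthly_returns, lookback):
    """Compound the last `lookback` monthly returns into a total return."""
    growth = 1.0
    for r in monthly_returns[-lookback:]:
        growth *= 1.0 + r
    return growth - 1.0

def top_factors(factor_returns, lookback, k=3):
    """Rank factors by trailing total return and keep the top k."""
    scores = {f: trailing_return(r, lookback) for f, r in factor_returns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical monthly factor returns (oldest first); not MSCI index data.
history = {
    "Value": [0.01] * 12,
    "Size": [-0.01] * 12,
    "Low Volatility": [0.005] * 12,
    "High Yield": [0.0] * 12,
    "Quality": [0.02] * 12,
    "Momentum": [0.015] * 11 + [-0.05],
}

held_6m = top_factors(history, 6)     # recent drop pushes Momentum out of the top 3
held_12m = top_factors(history, 12)   # longer window still ranks Momentum third
```

The example shows how the formation period alone can change the holdings: one bad month weighs more heavily on a 6-month window than on a 12-month one.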
Below we plot the relative performance for each formation period versus a benchmark index. We also plot the relative performance of a naïve, equal-weight factor portfolio. Exhibit 6 plots the approach applied to U.S. equity factors while Exhibit 7 attempts to replicate MSCI’s original results with World equity factors.
Exhibit 6: Relative Performance of Momentum-Based Multi-Factor Portfolios vs MSCI USA Index
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns are hypothetical and are not intended to be interpreted as recommendation to any portfolio construction. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index. Sample Period is December 1998 – August 2019.
Exhibit 7: Relative Performance of Momentum-Based Multi-Factor Portfolios vs MSCI World Index
Source: MSCI World Standard Universe. Calculations by Newfound Research. Returns are hypothetical and are not intended to be interpreted as recommendation to any portfolio construction. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index. Sample Period is December 1998 – August 2019.
In line with MSCI’s results, all three momentum-based indicators generate excess returns over the benchmark as well as the equal-weighted portfolio. The best performing indicator is the 6-month variation. However, our out-of-sample test using US equity factors failed to generate returns similar to those of the World equity factors. This is probably because factor momentum is stronger at the global level.
Comparing the Two Methodologies
Now that we have introduced our test results for macro-based and momentum-based dynamic factor allocation, we want to compare their performance. Summary performance information is reported below:
[Table: Macro-Based (Blended) vs. Momentum-Based (6-Month)]
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index.
It is worth noting that momentum-based timing rotation provides a more stable annualized return with lower volatility while also maintaining a similar Sharpe ratio compared to the macro-based allocation. The trading frequency is also lower on an annual basis.
Since these are long-only strategies, a key risk is underperforming during large equity market drawdowns, adding insult to injury. To see these effects, we can perform a scenario test for these two allocation methodologies during the dot-com bubble and the 2008 Financial Crisis. Defined periods are based on NBER’s listed history.
Exhibit 8: Monthly Returns of Macro-Based Allocation vs. Momentum-Based Allocation Under Two Recessions
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index.
In general, momentum-based allocation provides better combinations of factors during recessions but tends to react more slowly when the market starts to exit the recession and enter recovery. This is likely due to the inherent lag from the lookback periods that the momentum strategy has to undertake before generating signals.
Conclusion
Rotating among factors can be very tempting, and in this commentary, we examined two potential ways to implement this strategy: macro cycle-based and momentum-based.
Regardless of which strategy we select, it is important to remember that risk is not destroyed but rather shifted into different forms.
With the momentum-based allocation, the embedded assumption is that the near future will look like the recent past.
On the other hand, dynamic allocation based upon macro-economic regimes requires us to estimate both the current regime we are in as well as which factors will do well during that regime. In sacrificing more dynamic combinations, the fixed structure of the regime allocations may help reduce the impact of short-term noise that might lead to whipsaw trades in a momentum-based approach.
One positive about the momentum-based methodology is that the factor selection is inherently dynamic whereas the macro cycle-based method prespecifies regime-dependent factor baskets. We could expect future returns to remain consistent with the historical hypothetical performance, but this is an assumption that may be informed by hindsight-based “intuition”. The trade-off with using momentum to be dynamic is that the lag of the signals may fail to capitalize on potential opportunities during the transitions between business cycle states. This was the case in the recovery state during the last two recessions.
We should also keep in mind that there are only 6 factors in our investible universe. What would the returns look like if we add more factors to our universe? What if we use different constructions of the same factors? Will the momentum-based timing rotation still outperform the benchmark? This is an open question for future research.
Both macro-based and momentum-based dynamic factor allocation proved successful in our (slightly) out-of-sample test. However, we should stress that all tests were performed gross of any fees and costs, which can have a substantial impact upon results (especially for high turnover strategies). Furthermore, the success of the macro-based test was highly dependent upon the factors selected for each macro regime, and there is a risk those factors were determined with hindsight bias.
Nevertheless, we believe this evidence suggests that further research is warranted, perhaps incorporating a blend of the approaches as well as other specifications to provide further signal diversification.