This post is available as a PDF download here.
Summary
- While many investors have adopted a multi-factor approach to style investing, some have pushed these boundaries by advocating for an active, rotational approach to factor allocation.
- In a recent white paper, MSCI suggests several methods that might be conducive for performing style rotation, including macro-, momentum-, and value-based signals.
- In this commentary, we attempt to test the macro- and momentum-based approaches on (slightly) out-of-sample data.
- We find that both approaches have historically out-performed a naïve, equal-weight factor portfolio. However, the results for the macro-based approach are so good, they raise questions about hindsight bias. Momentum results, on the other hand, are far less compelling on U.S. equity factors than the World equity factors tested by MSCI.
- After appropriately discounting for fees, taxes, and other costs, as well as adequately discounting for testing biases, these methods may not offer much benefit over a naïve, equal-weight approach.
While the empirical evidence suggests that factor investing has historically generated an excess (risk-adjusted) return premium over the long run, short-term performance can be volatile. Because of this, an increasing number of investors are adopting a diversified approach to factor investing, holding multiple factors at once.
Although even simple allocation models – such as naïve equal-weight – have historically harvested these diversification benefits, some researchers believe that dynamic factor allocation can further enhance the returns.
One argument for taking a dynamic approach is that the performance variability is cyclical and linked to different stages of the economic cycle. The different underlying economic drivers lead to differentiated active returns and therefore lead to potential opportunities for cross-factor rotation.
For example, in a recent white paper, MSCI provides several dynamic models in what they call “Adaptive Multi-Factor Allocation.”
In this commentary, we replicate two variations of MSCI’s Adaptive Multi-Factor models: the macro-cycle-based and momentum-based allocation methodologies. While MSCI’s methodology is tested on world equity data, we employ US equity data as a (slightly) out-of-sample test. In line with MSCI’s research, six long-only US factors – Value, Size, Low Volatility, High Yield, Quality, and Momentum – were selected as our investible universe.
Macro Cycle-Based Allocation
Many research studies suggest that macro indicators have strong explanatory power for systematic factor returns. One hypothesis is that the excess returns of factors could be compensation for bearing different forms of macroeconomic risk.
Historical performance of factor investing seems to support this argument, as returns have been cyclical across different stages of the business cycle. For example, value and size factors tend to be most affected by negative economic growth (providing a risk-based argument for their long-term premium), while quality and low volatility are usually the most defensive factors due to their structurally lower equity betas. Therefore, it might make sense to invest in defensive factors during periods of economic slowdown. Conversely, cyclical factors could add more value during expansionary phases.
Following MSCI’s methodology, economic cycles are classified into four primary states: Expansion, Slowdown, Contraction and Recovery. Each state is defined based upon the level and slope of a 3-month moving average (“MA”) minus a 12-month MA. In this commentary, we will employ three macroeconomic indicators (the three best-performing macro indicators in MSCI’s model):
- PMI (United States ISM Purchasing Managers Index)
- CFNAI (Chicago Fed National Activity Index)
- ADS (Aruoba-Diebold-Scotti Business Conditions Index)
If the 3-month MA is above the 12-month MA and the spread between the two is increasing, the economic state is labeled as an Expansion. If the spread is positive but decreasing, the economic state is labeled as a Slowdown. On the other hand, if the 3-month MA is below the 12-month MA and the spread is declining, the state is Contraction. If the spread is negative but increasing, then the economy is in a Recovery.
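The classification rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_regimes` is hypothetical, simple (unweighted) moving averages are assumed, and ties (a spread of exactly zero, or an unchanged spread) are arbitrarily assigned to the negative and declining branches, which the text does not specify.

```python
from statistics import mean


def classify_regimes(values, short=3, long=12):
    """Label months by the sign and direction of the short-MA minus long-MA spread.

    `values` is a chronological list of indicator readings (e.g. monthly PMI).
    The first label corresponds to the second month for which a spread exists.
    """
    spreads = []
    for i in range(long - 1, len(values)):
        short_ma = mean(values[i - short + 1 : i + 1])
        long_ma = mean(values[i - long + 1 : i + 1])
        spreads.append(short_ma - long_ma)

    regimes = []
    for prev, curr in zip(spreads, spreads[1:]):
        rising = curr > prev  # assumption: an unchanged spread counts as declining
        if curr > 0:  # assumption: a spread of exactly zero counts as negative
            regimes.append("Expansion" if rising else "Slowdown")
        else:
            regimes.append("Recovery" if rising else "Contraction")
    return regimes
```

Passing `short=3, long=12` reproduces the windows described above; shorter windows are convenient for checking the logic by hand.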
Exhibit 1: Economic Cycles of ISM PMI Index
Source: Quandl PMI Composite Index. Calculations by Newfound Research. Results are hypothetical and should not be used for investment purposes.
An equal-weight portfolio comprising 3 of the 6 factors is constructed for each state. According to the MSCI paper, the 3 factors for each stage are predetermined based on their historical performance and the past studies on each factor. While we have described our skepticism of these choices in our previous commentary on Style Surfing the Business Cycle, we will assume that the intuition on these factor mixes is correct.
The given combinations are as follows:
- Expansion: Momentum, Size, Value
- Slowdown: Momentum, Quality, Low Volatility
- Contraction: Low Volatility, Quality, Value
- Recovery: Size, Value, High Yield
With the business cycle signals from the moving average cross-overs and the regime-based factor baskets, we can implement the dynamic factor strategies. We construct a portfolio for each indicator, rebalancing monthly based upon the identified economic regime. We also construct a blended portfolio by combining these three sub-portfolios together in hopes of benefiting from signal diversification.
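As a rough sketch of the construction just described, the snippet below maps each regime to its equal-weight basket and blends the sub-portfolios implied by several indicators. The names `BASKETS`, `regime_weights`, and `blended_weights` are illustrative, and blending the three sub-portfolios in equal proportion is an assumption; the text only says they are combined.

```python
# Regime-dependent factor baskets as listed in the text.
BASKETS = {
    "Expansion": ("Momentum", "Size", "Value"),
    "Slowdown": ("Momentum", "Quality", "Low Volatility"),
    "Contraction": ("Low Volatility", "Quality", "Value"),
    "Recovery": ("Size", "Value", "High Yield"),
}


def regime_weights(regime):
    """Equal-weight the three factors assigned to a regime."""
    basket = BASKETS[regime]
    return {factor: 1.0 / len(basket) for factor in basket}


def blended_weights(regimes_by_indicator):
    """Average the sub-portfolio weights across indicators (assumed equal blend)."""
    n = len(regimes_by_indicator)
    blended = {}
    for regime in regimes_by_indicator.values():
        for factor, weight in regime_weights(regime).items():
            blended[factor] = blended.get(factor, 0.0) + weight / n
    return blended


# Hypothetical month in which two indicators signal Expansion and one Slowdown.
example = blended_weights({"PMI": "Expansion", "CFNAI": "Expansion", "ADS": "Slowdown"})
```

When the indicators disagree, the blend holds more than three factors at once, which is one way the signal diversification mentioned above shows up in the holdings.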
Below we plot the relative returns for each portfolio against the MSCI USA Index and compare them to the equal-weighted portfolio across all 6 factors.
Exhibit 2: Relative Performance of Macro-Cycle Timing Portfolios vs. MSCI USA Index
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns are hypothetical and are not intended to be interpreted as a recommendation for any portfolio construction. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index. Sample Period is November 1997 – August 2019.
As we can observe from Exhibit 2, the blended portfolio (and each individual strategy) has meaningfully outperformed both a market-cap weighted benchmark as well as an equal-weight portfolio of the six underlying factors over the past 20+ years. It seems like the macro-cycle-based factor allocation provides a promising return.
Why is that? One potential reason is that factor returns are linked to the economic cycle and that using monthly (or even daily) updated macro indicators can provide timely insights into the economic state. These higher-frequency signals allow us to potentially capture smaller business cycle fluctuations that are not announced by the NBER (National Bureau of Economic Research) or other institutions. The 3-month vs. 12-month moving average may also help filter undesired noise from the process.
Another potential reason is that the factors were well selected for each identified economic state. However, as we highlighted in our previous commentary, we know the factor allocations for each state were largely determined by the historical performances of each factor during that period of the business cycle.
This raises an important question: is the result a byproduct of data mining or the materialization of an unintentional hindsight bias?
To explore this question, we will perform a random sampling test. Specifically, we will look at the results of alternative portfolio choices we could have made. With four economic states and 3 (out of 6) factors selected for each, there are 160,000 possible economic state / factor portfolio configurations. Below we plot the annualized return distribution of these different configurations and highlight where the MSCI selection falls:
Exhibit 3: Distribution of Random Sampling Test
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index.
Among all the possible combinations, the portfolio defined by MSCI’s factor choices lies at the 98th percentile on the annualized return distribution curve. While we would certainly want the choice to perform better than a random selection, such strong performance might suggest the choice was impacted by the benefit of hindsight.
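The figure of 160,000 configurations follows from choosing 3 of 6 factors (20 baskets) independently for each of the four states, i.e. 20^4. The sketch below verifies that count and shows one way a random configuration and a percentile rank might be computed; `random_configuration` and `percentile_rank` are hypothetical helpers, not the code used to produce Exhibit 3.

```python
import random
from itertools import combinations
from math import comb

FACTORS = ["Value", "Size", "Low Volatility", "High Yield", "Quality", "Momentum"]
STATES = ["Expansion", "Slowdown", "Contraction", "Recovery"]

# Every 3-factor basket that could be assigned to a single state.
baskets = list(combinations(FACTORS, 3))

# Each state picks a basket independently of the others.
n_configs = len(baskets) ** len(STATES)


def random_configuration(rng):
    """Draw one state-to-basket assignment uniformly at random."""
    return {state: rng.choice(baskets) for state in STATES}


def percentile_rank(value, sample):
    """Percentage of sampled outcomes strictly below `value`."""
    return 100.0 * sum(x < value for x in sample) / len(sample)
```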
However, it is interesting to note that in MSCI’s definition, value is held in three states of the business cycle (contraction, recovery, and expansion), even though the value factor in MSCI’s construction may not, either intuitively or historically, be the most appropriate factor for all three periods. For example, during periods of expansion, some argue that the market tends to favor companies with high growth potential instead of firms with low intrinsic value.
Below we plot the annualized returns for each factor during each macro-economic state (as defined by the CFNAI indicator).
Exhibit 4: Annualized Returns for Each Factor During Each Macro State
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index.
Of the 12 possible factor choices to match up with the fully data-mined allocations, MSCI aligns with 9. Still, we cannot assert that MSCI’s predefined rotation rule is a byproduct of pure data mining. It could be a mix of data mining and prevalent beliefs (e.g. the defensive nature of value prior to the 2008 crisis). We should also remember that we are testing U.S. equity factors while the original MSCI research was performed on world equity data, which might lead to subtly different factor choices.
We should also be careful to consider how market unpredictability might negatively skew the returns. As there is no set reason for how or why a financial crisis might unfold, the reliability of using predetermined definitions based solely upon past history may be questionable for future performance.
Exhibit 5: Monthly Return Distribution of MSCI US Value Factor in Contraction, Recovery, and Expansion Phases
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index. Sample Period is November 1997 – August 2019.
Momentum-Based Allocation
MSCI also tests a dynamic factor construction based upon momentum signals. This approach is also not without academic basis. For example, Research Affiliates performed a study on the momentum effect amongst 51 factors and found that factors exhibit stronger momentum than both individual stocks and industries. They found that momentum is a prevailing property of almost all factors.
To test the viability of momentum-based allocation, we follow MSCI’s methodology and rank the factors based upon their prior returns, rebalancing monthly and holding the top 3 ranked factors. The ranking is calculated based upon the last 1-month, 6-month, and 12-month total returns for each factor.
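A minimal sketch of this ranking step is shown below, assuming each factor's history is a chronological list of monthly total returns. The helper names `trailing_return` and `top_factors` and the sample figures are purely illustrative; how ties between equally ranked factors are broken is not specified in the text.

```python
def trailing_return(monthly_returns, lookback):
    """Compound the most recent `lookback` monthly returns into a total return."""
    total = 1.0
    for r in monthly_returns[-lookback:]:
        total *= 1.0 + r
    return total - 1.0


def top_factors(returns_by_factor, lookback, n=3):
    """Rank factors by trailing total return and keep the top `n`."""
    scores = {
        factor: trailing_return(history, lookback)
        for factor, history in returns_by_factor.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical two-month history, for illustration only.
sample = {
    "Value": [0.05, -0.02],
    "Size": [0.00, 0.03],
    "Low Volatility": [0.01, 0.01],
    "High Yield": [-0.01, 0.00],
    "Quality": [0.02, 0.02],
    "Momentum": [-0.03, 0.04],
}
one_month_picks = top_factors(sample, lookback=1)
two_month_picks = top_factors(sample, lookback=2)
```

The selected factors would then be held in equal weight until the next monthly rebalance, with `lookback` set to 1, 6, or 12 for the three variations tested.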
Below we plot the relative performance for each formation period versus a benchmark index. We also plot the relative performance of a naïve, equal-weight factor portfolio. Exhibit 6 plots the approach applied to U.S. equity factors while Exhibit 7 attempts to replicate MSCI’s original results with World equity factors.
Exhibit 6: Relative Performance of Momentum-Based Multi-Factor Portfolios vs MSCI USA Index
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns are hypothetical and are not intended to be interpreted as a recommendation for any portfolio construction. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index. Sample Period is December 1998 – August 2019.
Exhibit 7: Relative Performance of Momentum-Based Multi-Factor Portfolios vs MSCI World Index
Source: MSCI World Standard Universe. Calculations by Newfound Research. Returns are hypothetical and are not intended to be interpreted as a recommendation for any portfolio construction. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index. Sample Period is December 1998 – August 2019.
In line with MSCI’s results, all three momentum-based indicators generate excess returns over the benchmark as well as the equal-weighted portfolio. The best performing indicator is the 6-month variation. However, our out-of-sample test using US equity factors failed to generate returns comparable to those of the World equity factors. One possible explanation is that factor momentum is stronger at the global level.
Comparing the Two Methodologies
Now that we have introduced our test results for macro-based and momentum-based dynamic factor allocation, we want to compare their performance. Summary performance information is reported below:
| | MSCI US Index | Macro-Based (Blended) | Momentum-Based (6-Month) |
| Annualized Return | 7.3% | 10.2% | 9.1% |
| Annualized Volatility | 15.0% | 14.1% | 13.2% |
| Sharpe Ratio | 0.55 | 0.76 | 0.75 |
| Rebalance Freq. | | 8.7 / year | 5.2 / year |
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index.
It is worth noting that the momentum-based rotation delivers lower volatility and a similar Sharpe ratio compared to the macro-based allocation, albeit with a lower annualized return. Its trading frequency is also lower on an annual basis.
Since these are long-only strategies, a key risk is underperforming during large equity market drawdowns, adding insult to injury. To see these effects, we can perform a scenario test for these two allocation methodologies during the dot-com bubble and the 2008 Financial Crisis. Defined periods are based on NBER’s listed history.
Exhibit 8: Monthly Returns of Macro-Based Allocation vs. Momentum-Based Allocation Under Two Recessions
Source: MSCI USA Standard Universe. Calculations by Newfound Research. Returns assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index.
In general, momentum-based allocation provides better combinations of factors during recessions but tends to react more slowly when the market starts to exit the recession and enter recovery. This is likely due to the inherent lag of the lookback periods that the momentum strategy requires before generating signals.
Conclusion
Rotating among factors can be very tempting, and in this commentary, we examined two potential ways to implement this strategy: macro cycle-based and momentum-based.
Regardless of which strategy we select, it is important to remember that risk is not destroyed but rather shifted into different forms.
With the momentum-based allocation, the embedded assumption is that the near future will look like the recent past.
On the other hand, dynamic allocation based upon macro-economic regimes requires us to estimate both the current regime we are in as well as which factors will do well during that regime. In sacrificing more dynamic combinations, the fixed structure of the regime allocations may help reduce the impact of short-term noise that might lead to whipsaw trades in a momentum-based approach.
One positive about the momentum-based methodology is that the factor selection is inherently dynamic whereas the macro cycle-based method prespecifies regime-dependent factor baskets. We could expect future returns to remain consistent with the historical hypothetical performance, but this is an assumption that may be informed by hindsight-based “intuition”. The trade-off with using momentum to be dynamic is that the lag of the signals may fail to capitalize on potential opportunities during the transitions between business cycle states. This was the case in the recovery state during the last two recessions.
We should also keep in mind that there are only 6 factors in our investible universe. What would the returns look like if we add more factors to our universe? What if we use different constructions of the same factors? Will the momentum-based timing rotation still outperform the benchmark? This is an open question for future research.
Both macro-based and momentum-based dynamic factor allocation proved successful in our (slightly) out-of-sample test. However, we should stress that all tests were performed gross of any fees and costs, which can have a substantial impact upon results (especially for high turnover strategies). Furthermore, the success of the macro-based test was highly dependent upon the factors selected for each macro regime, and there is a risk those factors were determined with hindsight bias.
Nevertheless, we believe this evidence suggests that further research is warranted, perhaps incorporating a blend of the approaches as well as other specifications to provide further signal diversification.
The Dumb (Timing) Luck of Smart Beta
By Corey Hoffstein
On November 18, 2019
Summary
We’ve written about the concept of rebalance timing luck a lot. It’s a cowbell we’ve been beating for over half a decade, with our first article going back to August 7th, 2013.
As a reminder, rebalance timing luck is the performance dispersion that arises from the choice of a particular rebalance date (e.g. semi-annual rebalances that occur in June and December versus March and September).
We’ve empirically explored the impact of rebalance timing luck as it relates to strategic asset allocation, tactical asset allocation, and even used our own Systematic Value strategy as a case study for smart beta. All of our results suggest that it has a highly non-trivial impact upon performance.
This summer we published a paper in the Journal of Index Investing that proposed a simple solution to the timing luck problem: diversification. If, for example, we believe that our momentum portfolio should be rebalanced every quarter – perhaps as an optimal balance of cost and signal freshness – then we proposed splitting our capital across the three portfolios that spanned different three-month rebalance periods (e.g. JAN-APR-JUL-OCT, FEB-MAY-AUG-NOV, MAR-JUN-SEP-DEC). This solution is referred to either as “tranching” or “overlapping portfolios.”
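To make the tranching idea concrete, the sketch below enumerates the distinct rebalance schedules for a given frequency and splits capital across them. The function names are hypothetical, and an equal split across tranches is an assumption consistent with, but not stated in, the description above.

```python
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def rebalance_variations(rebalances_per_year):
    """List every distinct monthly-offset schedule for a rebalance frequency."""
    if 12 % rebalances_per_year != 0:
        raise ValueError("frequency must divide 12 evenly")
    step = 12 // rebalances_per_year
    return ["-".join(MONTHS[offset::step]) for offset in range(step)]


def tranche_weights(rebalances_per_year):
    """Split capital equally across the schedules (assumed equal tranches)."""
    variations = rebalance_variations(rebalances_per_year)
    return {name: 1.0 / len(variations) for name in variations}


quarterly = rebalance_variations(4)
```

For a quarterly strategy this yields the three schedules named above, each holding one third of the capital; a semi-annual strategy has six variations and an annual one has twelve.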
The paper also derived a formula for estimating timing luck ex-ante, with a simplified representation of:
$$L = \frac{T}{2F} \times S$$
Where L is the timing luck measure, T is turnover rate of the strategy, F is how many times per year the strategy rebalances, and S is the volatility of a long/short portfolio that captures the difference of what a strategy is currently invested in versus what it could be invested in if the portfolio was reconstructed at that point in time.
Without numbers, this equation still informs some general conclusions:
Bullet points 1 and 3 may seem similar but capture subtly different effects. This is likely best illustrated with two examples on different extremes. First consider a very high turnover strategy that trades within a universe of highly correlated securities. Now consider a very low turnover strategy that is either 100% long or 100% short U.S. equities. In the first case, the highly correlated nature of the universe means that differences in specific holdings may not matter as much, whereas in the second case the perfect inverse correlation means that small portfolio differences lead to meaningfully different performance.
L, in and of itself, is a bit tricky to interpret, but effectively attempts to capture the potential dispersion in performance between a particular rebalance implementation choice (e.g. JAN-APR-JUL-OCT) versus a timing-luck-neutral benchmark.
After half a decade, you’d think we’ve spilled enough ink on this subject.
But given that just about every single major index still does not address this issue, and since our passion for the subject clearly verges on fever pitch, here comes some more cowbell.
Equity Style Portfolio Definitions
In this note, we will explore timing luck as it applies to four simplified smart beta portfolios based upon holdings of the S&P 500 from 2000-2019:
Quality is a bit more complicated only because the quality factor has far less consistency in accepted definition. Therefore, we adopted the signals utilized by the S&P 500 Quality Index.
For each of these equity styles, we construct portfolios that vary across two dimensions:
For the different rebalance frequencies, we also generate portfolios that represent each possible rebalance variation of that mix. For example, Momentum portfolios with 50 stocks that rebalance annually have 12 possible variations: a January rebalance, February rebalance, et cetera. Similarly, there are 12 possible variations of Momentum portfolios with 100 stocks that rebalance annually.
By explicitly calculating the rebalance date variations of each Style x Holding x Frequency combination, we can construct an overlapping portfolios solution. To estimate empirical annualized timing luck, we calculate the standard deviation of monthly return dispersion between the different rebalance date variations of the overlapping portfolio solution and annualize the result.
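One reading of this calculation is sketched below. It makes two assumptions that should be flagged: the overlapping portfolio is approximated as the equal-weight average of the variations' returns in each month, and the differences are pooled across all variations before taking a (population) standard deviation. The function name `empirical_timing_luck` is illustrative.

```python
from math import sqrt
from statistics import mean, pstdev


def empirical_timing_luck(variation_returns):
    """Annualized dispersion of each variation versus the overlapping portfolio.

    `variation_returns` maps a schedule name to a list of monthly returns;
    all lists are assumed to cover the same months.
    """
    series = list(variation_returns.values())
    n_months = len(series[0])
    # Assumption: overlapping portfolio = equal-weight average of the variations.
    overlapping = [mean(s[t] for s in series) for t in range(n_months)]
    diffs = [s[t] - overlapping[t] for s in series for t in range(n_months)]
    # Scale the monthly standard deviation to an annual figure.
    return pstdev(diffs) * sqrt(12)


# Hypothetical two-month example with two rebalance variations.
luck = empirical_timing_luck({"APR-OCT": [0.01, 0.03], "MAY-NOV": [0.03, 0.01]})
```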
Empirical Timing Luck Results
Before looking at the results plotted below, we would encourage readers to hypothesize as to what they expect to see. Perhaps not in absolute magnitude, but at least in relative magnitude.
For example, based upon our understanding of the variables affecting timing luck, would we expect an annually rebalanced portfolio to have more or less timing luck than a quarterly rebalanced one?
Should a more concentrated portfolio have more or less timing luck than a less concentrated variation?
Which factor has the greatest risk of exhibiting timing luck?
Source: Sharadar. Calculations by Newfound Research.
To create a sense of scale across the styles, below we isolate the results for semi-annual rebalancing for each style and plot it.
Source: Sharadar. Calculations by Newfound Research.
In relative terms, there is no great surprise in these results:
What is perhaps the most surprising is the sheer magnitude of timing luck. Consider that the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality portfolios all hold 100 securities and are rebalanced semi-annually. Our study suggests that timing luck for such approaches may be as large as 2.5%, 4.4%, 1.1%, and 2.0% respectively.
But what does that really mean? Consider the realized performance dispersion of different rebalance date variations of a Momentum portfolio that holds the top 100 securities in equal weight and is rebalanced on a semi-annual basis.
Source: Sharadar. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
The 4.4% estimate of annualized timing luck is a measure of dispersion between each underlying variation and the overlapping portfolio solution. If we isolate two sub-portfolios and calculate rolling 12-month performance dispersion, we can see that the difference can be far larger, as one might exhibit positive timing luck while the other exhibits negative timing luck. Below we do precisely this for the APR-OCT and MAY-NOV rebalance variations.
Source: Sharadar. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
In fact, since these variations are identical in every which way except for the date on which they rebalance, a portfolio that is long the APR-OCT variation and short the MAY-NOV variation would explicitly capture the effects of rebalance timing luck. If we assume the rebalance timing luck realized by these two portfolios is independent (which our research suggests it is), then the volatility of this long/short is approximately the rebalance timing luck estimated above scaled by the square-root of two.
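Derivation: For variations v_i and v_j and overlapping-portfolio solution V, with timing luck L = \sigma(v_i - V) = \sigma(v_j - V) and the two deviations assumed independent:
$$\sigma^2(v_i - v_j) = \sigma^2\left((v_i - V) - (v_j - V)\right) = \sigma^2(v_i - V) + \sigma^2(v_j - V) = 2L^2$$
so that \sigma(v_i - v_j) = \sqrt{2} \times L.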
Thus, if we are comparing two identically-managed 100-stock momentum portfolios that rebalance semi-annually, our 95% confidence interval for performance dispersion due to timing luck is +/- 12.4% (2 x SQRT(2) x 4.4%).
Even for more diversified, lower turnover portfolios, this remains an issue. Consider a 400-stock low-volatility portfolio that is rebalanced quarterly. Empirical timing luck is still 0.5%, suggesting a 95% confidence interval of +/- 1.4%.
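The arithmetic behind these two intervals can be checked with a short helper. The name `dispersion_band` is hypothetical, and the multiplier of 2 is the usual normal-distribution approximation to a 95% interval used in the calculation above.

```python
from math import sqrt


def dispersion_band(timing_luck, z=2.0):
    """Half-width of the dispersion interval between two independent variations.

    The long/short of two variations has volatility sqrt(2) x timing luck;
    `z` standard deviations (2 is roughly 95% under normality) gives the band.
    """
    return z * sqrt(2.0) * timing_luck


momentum_band = round(dispersion_band(4.4), 1)  # 100-stock, semi-annual momentum
low_vol_band = round(dispersion_band(0.5), 1)   # 400-stock, quarterly low volatility
```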
S&P 500 Style Index Examples
One critique of the above analysis is that it is purely hypothetical: the portfolios studied above aren’t really those offered in the market today.
We will take our analysis one step further and replicate (to the best of our ability) the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality indices. We then created different rebalance schedule variations. Note that the S&P 500 Low Volatility index rebalances quarterly, so there are only three possible rebalance variations to compute.
Source: Sharadar. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
We see a meaningful dispersion in terminal wealth levels, even for the S&P 500 Low Volatility index, which appears at first glance in the graph to have little impact from timing luck.
| Index | Minimum Terminal Wealth | Maximum Terminal Wealth |
| Enhanced Value | $4.45 | $5.45 |
| Momentum | $3.07 | $4.99 |
| Low Volatility | $6.16 | $6.41 |
| Quality | $4.19 | $5.25 |
We should further note that there does not appear to be one set of rebalance dates that does significantly better than the others. For Value, FEB-AUG looks best while JUN-DEC looks the worst; for Momentum it’s almost precisely the opposite.
Furthermore, we can see that even seemingly closely related rebalances can have significant dispersion: consider MAY-NOV and JUN-DEC for Momentum. Here is a real doozy of a statistic: at one point, the MAY-NOV implementation for Momentum is down 50.3% while the JUN-DEC variation is down just 13.8%.
These differences are even more evident if we plot the annual returns for each strategy’s rebalance variations. Note, in particular, the extreme differences in Value in 2009, Momentum in 2017, and Quality in 2003.
Source: Sharadar. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
Conclusion
In this study, we have explored the impact of rebalance timing luck on the results of smart beta / equity style portfolios.
We empirically tested this impact by designing a variety of portfolio specifications for four different equity styles (Value, Momentum, Low Volatility, and Quality). The specifications varied by concentration as well as rebalance frequency. We then constructed all possible rebalance variations of each specification to calculate the realized impact of rebalance timing luck over the test period (2000-2019).
In line with our mathematical model, we generally find that those strategies with higher turnover have higher timing luck and those that rebalance more frequently have less timing luck.
The sheer magnitude of timing luck, however, may come as a surprise to many. For reasonably concentrated portfolios (100 stocks) with semi-annual rebalance frequencies (common in many index definitions), annual timing luck ranged from 1-to-4%, which translated to a 95% confidence interval in annual performance dispersion of about +/-3% to +/-12.5%.
The sheer magnitude of timing luck calls into question our ability to draw meaningful relative performance conclusions between two strategies.
We then explored more concrete examples, replicating the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality indices. In line with expectations, we find that Momentum (a high turnover strategy) exhibits significantly higher realized timing luck than a lower turnover strategy rebalanced more frequently (i.e. Low Volatility).
For these four indices, the amount of rebalance timing luck leads to a staggering level of dispersion in realized terminal wealth.
“But Corey,” you say, “this only has to do with systematic factor managers, right?”
Consider that most of the major equity style benchmarks are managed with annual or semi-annual rebalance schedules. Good luck to anyone trying to identify manager skill when your benchmark might be realizing hundreds of basis points of positive or negative performance luck a year.