This post is available as a PDF download here.
Summary
- We have shown many times that timing luck – when a portfolio chooses to rebalance – can have a large impact on the performance of tactical strategies.
- However, fundamental strategies like value portfolios are susceptible to timing luck, as well.
- Once the rebalance frequency of a strategy is set, we can mitigate the risk of choosing a poor rebalance date by diversifying across all potential variations.
- In many cases, this mitigates the risk of realizing poor performance from an unfortunate choice of rebalance date while achieving a risk profile similar to the top tier of potential strategy variations.
- By utilizing strategies that manage timing luck, investors can more accurately distinguish performance differences arising from luck from those arising from skill.
On August 7th, 2013 we wrote a short blog post titled The Luck of Rebalance Timing. That means we have been prattling on about the impact of timing luck for over six years now (with apologies to our compliance department…).
(For those still unfamiliar with the idea of timing luck, we will point you to a recent publication from Spring Valley Asset Management that provides a very approachable introduction to the topic.)
While most of our earliest studies related to the impact of timing luck in tactical strategies, over time we realized that timing luck could have a profound impact on just about any strategy that rebalances on a fixed frequency. We found that even a simple fixed-mix allocation of stocks and bonds could see annual performance spreads exceeding 700bp due only to the choice of when they rebalanced in a given year.
In seeking to generalize the concept, we derived a formula that would estimate how much timing luck a strategy might have. The details of the derivation can be found in our paper recently published in the Journal of Index Investing, but the basic formula is:
L = (T / (2F)) × S
Here L is the estimated timing luck, T is strategy turnover, F is how many times per year the strategy rebalances, and S is the volatility of a long/short portfolio capturing the difference between what the strategy is currently invested in versus what it could be invested in.
We’re biased, but we think the intuition here works out fairly nicely:
- The higher a strategy’s turnover, the greater the impact of our choice of rebalance dates. For example, if we have a value strategy that has 50% turnover per year, an implementation that rebalances in January versus one that rebalances in July might end up holding very different securities. On the other hand, if the strategy has just 1% turnover per year, we don’t expect the differences in holdings to be very large and therefore timing luck impact would be minimal.
- The more frequently we rebalance, the lower the timing luck. This makes sense, as more frequent rebalancing limits the potential difference in holdings across implementation dates. Consider again a value strategy with 50% turnover. If our portfolio rebalances every other month, there are two potential implementations: one that rebalances January, March, May, etc. and one that rebalances February, April, June, etc. We would expect the difference in portfolio holdings to be much more limited than in the case where we rebalance only annually.
- The last term, S, is most easily explained with an example. If we have a portfolio that can hold either the Russell 1000 or the S&P 500, we do not expect there to be a large amount of performance dispersion regardless of when we rebalance or how frequently we do so. The volatility of a portfolio that is long the Russell 1000 and short the S&P 500 is so small, it drives timing luck near zero. On the other hand, if a portfolio can hold the Russell 1000 or be short the S&P 500, differences in holdings due to different rebalance dates can lead to massive performance dispersion. Generally speaking, S is larger for more highly concentrated strategies with large performance dispersion in their investable universe.
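To make the three levers concrete, here is a minimal sketch that evaluates the estimate for a few cases. It assumes the formula takes the form L = (T / (2F)) × S; the function name and every input value are hypothetical, chosen only to show the direction of each effect rather than to reproduce any figure from our research.

```python
def estimate_timing_luck(turnover, rebalances_per_year, long_short_vol):
    """Estimate timing luck as (T / (2F)) * S.

    The functional form is an assumption of this sketch; the paper has
    the derivation.
    """
    if rebalances_per_year <= 0:
        raise ValueError("rebalances_per_year must be positive")
    return turnover / (2.0 * rebalances_per_year) * long_short_vol


# Hypothetical inputs, chosen only to illustrate the direction of each lever.
S = 0.10  # assumed volatility of the long/short difference portfolio

annual_high_turnover = estimate_timing_luck(0.50, 1, S)     # 50% turnover, annual
annual_low_turnover = estimate_timing_luck(0.01, 1, S)      # 1% turnover, annual
bimonthly_high_turnover = estimate_timing_luck(0.50, 6, S)  # 50% turnover, 6x a year

for label, value in [
    ("annual, 50% turnover", annual_high_turnover),
    ("annual, 1% turnover", annual_low_turnover),
    ("every other month, 50% turnover", bimonthly_high_turnover),
]:
    print(f"{label}: {value:.2%}")
```

Under these assumed inputs, lower turnover and more frequent rebalancing both shrink the estimate, and a value of S near zero drives it toward zero regardless of the other two terms.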
Timing Luck in Smart Beta
To date, we have not meaningfully tested timing luck in the realm of systematic equity strategies. A few weeks ago, however, we introduced our Systematic Value portfolio, which seeks to deliver concentrated exposure to the value style while avoiding unintended process and timing luck bets. In this commentary, we use it to provide a concrete example of the potential impact.
To achieve this, we implement an overlapping portfolio process. Each month we construct a concentrated deep value portfolio, selecting just 50 stocks from the S&P 500. However, because we believe the evidence suggests that value is a slow-moving signal, we aim for a holding period between 3-to-5 years. To achieve this, our capital is divided across the prior 60 months of portfolios.
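As a rough illustration of what dividing capital across prior months of portfolios means for the aggregate holdings, the sketch below blends the most recent monthly portfolios into one set of weights. It assumes each monthly portfolio is equal-weighted and that each tranche receives an equal share of capital with no drift between rebalances; those simplifications, the function name, and the toy tickers are ours and are not a description of the live process.

```python
from collections import defaultdict


def blended_weights(monthly_portfolios, n_tranches=60):
    """Blend the most recent `n_tranches` monthly portfolios into one book.

    `monthly_portfolios` is a list of ticker lists, oldest first. Equal
    weighting within and across tranches is an assumption of this sketch.
    """
    recent = monthly_portfolios[-n_tranches:]
    if not recent:
        raise ValueError("need at least one monthly portfolio")
    weights = defaultdict(float)
    for holdings in recent:
        for ticker in holdings:
            weights[ticker] += 1.0 / (len(recent) * len(holdings))
    return dict(weights)


# Toy example with three monthly two-stock portfolios and two tranches.
snapshots = [["A", "B"], ["B", "C"], ["C", "D"]]
book = blended_weights(snapshots, n_tranches=2)
print(book)
```

A stock that keeps appearing in successive monthly portfolios accumulates weight, while one that drops out fades as its tranches age out of the window.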
All of which means that we have monthly snapshots of deep value portfolios going back to November 2012, providing us data to construct all sorts of rebalance variations.
The Luck of Annual Rebalancing
Given our portfolio snapshots, we will create annually rebalanced portfolios. With monthly portfolios, there are twelve variations we can construct: a portfolio that reconstitutes each January; one that reconstitutes each February; a portfolio that reconstitutes each March; et cetera.
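The construction of the twelve variations can be sketched as a simple schedule: for each month, which snapshot does a given variation hold? The code below is illustrative only. It assumes a variation adopts the snapshot dated in its reconstitution month and holds nothing before its first reconstitution; the function name and the date encoding are hypothetical.

```python
def annual_variation_schedule(dates, rebalance_month):
    """Return, for each (year, month) in `dates`, the index of the snapshot held.

    The variation switches to the current snapshot whenever the calendar
    month equals `rebalance_month`; before its first reconstitution the
    entry is None. These conventions are assumptions of this sketch.
    """
    held = None
    schedule = []
    for i, (_, month) in enumerate(dates):
        if month == rebalance_month:
            held = i
        schedule.append(held)
    return schedule


# Monthly dates starting in November 2012 (26 months, purely for illustration).
dates = [(2012 + (10 + k) // 12, (10 + k) % 12 + 1) for k in range(26)]

# One schedule per calendar month gives the twelve annual variations.
schedules = {m: annual_variation_schedule(dates, m) for m in range(1, 13)}
print(dates[0], schedules[1][:4])
```

Every variation runs the identical process; only the month in which the held snapshot is refreshed differs.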
Below we plot the equity curves for these twelve variations.
Source: CSI Analytics. Calculations by Newfound Research. Results are hypothetical. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.
We cannot stress enough that these portfolios are all implemented using a completely identical process. The only difference is when they run that process. The annualized returns range from 9.6% to 12.2%. And those two portfolios with the largest disparity rebalanced just a month apart: January and February.
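For readers who want to reproduce this type of comparison on their own data, the sketch below computes an annualized return from an equity curve and the spread between the best and worst variation. The function names and the toy curves are hypothetical; they are not our data.

```python
def annualized_return(equity_curve, periods_per_year=12):
    """Compound annual growth rate implied by an equity curve."""
    n_periods = len(equity_curve) - 1
    if n_periods < 1:
        raise ValueError("need at least two observations")
    growth = equity_curve[-1] / equity_curve[0]
    return growth ** (periods_per_year / n_periods) - 1.0


def return_spread(curves_by_variation, periods_per_year=12):
    """Return (best name, worst name, spread in annualized return)."""
    returns = {
        name: annualized_return(curve, periods_per_year)
        for name, curve in curves_by_variation.items()
    }
    best = max(returns, key=returns.get)
    worst = min(returns, key=returns.get)
    return best, worst, returns[best] - returns[worst]


# Toy one-year curves sampled annually (hypothetical values).
toy = {"January": [1.00, 1.10], "February": [1.00, 1.05]}
best, worst, spread = return_spread(toy, periods_per_year=1)
print(best, worst, f"{spread:.2%}")
```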
To avoid timing luck, we want to diversify when we rebalance. The simplest way of achieving this goal is through overlapping portfolios. For example, we can build portfolios that rebalance annually, but allocate to two different dates. One portfolio could place 50% of its capital in the January rebalance index and 50% in the July rebalance index.
Another variation could place 50% of its capital in the February index and 50% in the August index. There are six possible variations, which we plot below.
The best performing variation (January and July) returned 11.7% annualized, while the worst (February and August) returned 9.7%. While the spread has narrowed, it would be dangerous to mistake a 200bp annualized difference for alpha when it is really rebalancing luck.
We can go beyond just two overlapping portfolios, though. Below we plot the three variations that contain four overlapping portfolios (January-April-July-October, February-May-August-November, and March-June-September-December). The best variation now returns 10.9% annualized while the worst returns 10.1% annualized. We can see how overlapping portfolios are shrinking the variation in returns.
Finally, we can plot the variation that employs 12 overlapping portfolios. This variation returns 10.6% annualized; almost perfectly in line with the average annualized return of the underlying 12 variations. No surprise: diversification has neutralized timing luck.
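The overlapping variations above can be enumerated mechanically. The sketch below lists the evenly spaced month groupings for a given number of overlapping portfolios and blends their equity curves. Averaging the curves corresponds to splitting capital equally at inception and never rebalancing between the sub-indices, which is an assumption of this sketch; the function names are hypothetical.

```python
def overlap_groups(n_tranches, months_per_year=12):
    """List the evenly spaced month groupings for `n_tranches` overlapping portfolios."""
    if n_tranches < 1 or months_per_year % n_tranches != 0:
        raise ValueError("n_tranches must evenly divide months_per_year")
    step = months_per_year // n_tranches
    return [
        tuple(start + step * k for k in range(n_tranches))
        for start in range(1, step + 1)
    ]


def blend_equity_curves(curves):
    """Equal-capital blend of equity curves that share the same dates."""
    length = len(curves[0])
    if any(len(curve) != length for curve in curves):
        raise ValueError("all curves must have the same length")
    return [sum(curve[t] for curve in curves) / len(curves) for t in range(length)]


print(overlap_groups(2))   # six January/July-style pairs
print(overlap_groups(4))   # three quarterly groupings
print(overlap_groups(12))  # the single all-months grouping
```

With monthly snapshots, two overlapping portfolios give six groupings, four give three, and twelve give one, matching the counts described above.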
But besides being “average by design,” how can we measure the benefits of diversification?
As with most ensemble approaches, we see a reduction in realized risk metrics. For example, below we plot the maximum realized drawdown for annual variations, semi-annual variations, quarterly variations, and the monthly variation. While the dispersion is limited to just a few hundred basis points, we can see that the diversification embedded in the monthly variation is able to reduce the bad luck of choosing an unfortunate rebalance date.
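For reference, maximum realized drawdown can be computed from an equity curve as the largest peak-to-trough decline. The sketch below is a minimal version with a hypothetical function name and toy data.

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, returned as a positive fraction."""
    if not equity_curve:
        raise ValueError("equity_curve must not be empty")
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)                  # running high-water mark
        worst = max(worst, 1.0 - value / peak)   # decline from that mark
    return worst


toy_curve = [100.0, 120.0, 90.0, 130.0, 117.0]  # hypothetical values
print(f"{max_drawdown(toy_curve):.2%}")
```

Running this for each variation's curve, and for the blended curves, gives the comparison described above.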
Just Rebalance More Frequently?
One of the major levers in the timing luck equation is how frequently the portfolio is rebalanced. However, we firmly believe that while rebalancing frequency impacts timing luck, timing luck should not be a driving factor in our choice of rebalance frequency.
Rather, rebalance frequency choices should be a function of the speed at which our signal decays (e.g. fast-changing signals such as momentum versus slow-changing signals like value) versus implementation costs (e.g. explicit trading costs, market impact, and taxes). Only after this choice is made should we seek to limit timing luck.
Nevertheless, we can ask the question, “how does rebalancing more frequently impact timing luck in this case?”
To answer this question, we will evaluate quarterly-rebalanced portfolios. The distinction here from the quarterly overlapping portfolios above is that the entire portfolio is rebalanced each quarter rather than only a quarter of the portfolio. Below, we plot the equity curves for the three possible variations.
The best performing variation returns 11.7% annualized while the worst returns 9.7% annualized, for a spread of 200 basis points. This is actually larger than the spread we saw with the three quarterly overlapping portfolio variations, and likely due to the fact that turnover within the portfolios increased meaningfully.
While we can see that increasing the frequency of rebalancing can help, in our opinion the choice of rebalance frequency should be distinct from the choice of managing timing luck.
Conclusion
In our opinion, there are at least two meaningful conclusions here:
The first is for product manufacturers (e.g. index issuers) and is rather simple: if you’re going to have a fixed rebalance schedule, please implement overlapping portfolios. It isn’t hard. It is literally just averaging. We’re all better off for it.
The second is for product users: realize that performance dispersion between similarly-described systematic strategies can be heavily influenced by when they rebalance. The excess return may really just be a phantom of luck, not skill.
The solution to this problem, in our opinion, is to either: (1) pick an approach and just stick to it regardless of perceived dispersion, accepting the impact of timing luck; (2) hold multiple approaches that rebalance on different days; or (3) implement an approach that accounts for timing luck.
We believe the first approach is easier said than done. And without a framework for distinguishing between timing luck and alpha, we’re largely making arbitrary choices.
The second approach is certainly feasible but has the potential downside of requiring more holdings as well as potentially forcing an investor to purchase an approach they are less comfortable with. For example, while blending IWD (Russell 1000 Value), RPV (S&P 500 Pure Value), VLUE (MSCI U.S. Enhanced Value), and QVAL (Alpha Architect U.S. Quantitative Value) may create a portfolio that rebalances on many different dates (annual in May; annual in December; semi-annual in May and November; and quarterly, respectively), it also introduces significant process differences. That said, research suggests that investors may benefit from further manager/process diversification.
For investors with conviction in a single strategy implementation, the last approach is certainly the best. Unfortunately, as far as we are aware, there are only a few firms who actively implement overlapping portfolios (including Newfound Research, O’Shaughnessy Asset Management, AQR, and Research Affiliates). Until more firms adopt this approach, timing luck will continue to loom large.
Es-CAPE Velocity: Value-Driven Sector Rotation
By Corey Hoffstein
On August 26, 2019
Summary
It is no secret that systematic value investing of all sorts has struggled as of late. With the curious exception, that is, of the Barclays Shiller CAPE sector rotation strategy, a strategy explored by Bunn, Staal, Zhuang, Lazanas, Ural and Shiller in their 2014 paper Es-cape-ing from Overvalued Sectors: Sector Selection Based on the Cyclically Adjusted Price-Earnings (CAPE) Ratio. Initial performance suggests that the idea has performed quite well out-of-sample, which stands out among the many “smart-beta” strategies that have failed to live up to their backtests.
Source: CSI Data. Calculations by Newfound Research. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.
Why is this strategy finding success where other value strategies have not? That is what we aim to explore in this commentary.
On a monthly basis, the Shiller CAPE sector rotation portfolio is rebalanced into an equal-weight allocation across four of the ten primary GICS sectors. The four are selected first by ranking the 10 primary sectors based upon their Relative CAPE ratios and choosing the cheapest five sectors. Of those cheapest five sectors, the sector with the worst trailing 12-month return (“momentum”) is removed.
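The selection rule can be sketched in a few lines. This is an illustration of the rule as described above, not the index provider's code: the function name, the tie-breaking behavior (whatever Python's sort order yields), and all input values are hypothetical.

```python
def shiller_selection(relative_cape, momentum_12m, n_cheapest=5):
    """Pick the cheapest sectors by Relative CAPE, drop the worst momentum, equal-weight the rest."""
    cheapest = sorted(relative_cape, key=relative_cape.get)[:n_cheapest]
    worst_momentum = min(cheapest, key=lambda sector: momentum_12m[sector])
    selected = [sector for sector in cheapest if sector != worst_momentum]
    return {sector: 1.0 / len(selected) for sector in selected}


# Hypothetical inputs for ten sectors.
relative_cape = {
    "Energy": 0.70, "Financials": 0.75, "Technology": 0.80,
    "Health Care": 0.85, "Industrials": 0.90, "Materials": 1.00,
    "Utilities": 1.10, "Staples": 1.20, "Discretionary": 1.30,
    "Telecom": 1.40,
}
momentum_12m = {
    "Energy": -0.20, "Financials": 0.05, "Technology": 0.15,
    "Health Care": 0.10, "Industrials": 0.02, "Materials": 0.08,
    "Utilities": 0.04, "Staples": 0.06, "Discretionary": 0.12,
    "Telecom": 0.01,
}

weights = shiller_selection(relative_cape, momentum_12m)
print(weights)
```

In this toy example Energy is the cheapest sector but also has the worst trailing return among the cheapest five, so it is the one removed.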
The CAPE ratio – standing for Cyclically-Adjusted Price-to-Earnings ratio – is the current price divided by the 10-year moving average of inflation-adjusted earnings. The purpose of this smoothing is to reduce the impact of business cycle fluctuations.
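A minimal sketch of the calculation follows. It assumes annual earnings and price-index observations and restates past earnings in the latest period's price level; actual implementations typically work with monthly data, and the function name and inputs here are hypothetical.

```python
def cape(price, annual_eps, annual_cpi, years=10):
    """Price divided by the `years`-year average of inflation-adjusted earnings.

    `annual_eps` and `annual_cpi` are aligned lists, oldest first. Annual
    sampling is an assumption of this sketch.
    """
    if len(annual_eps) != len(annual_cpi):
        raise ValueError("earnings and CPI series must be aligned")
    if len(annual_eps) < years:
        raise ValueError("not enough history")
    eps = annual_eps[-years:]
    cpi = annual_cpi[-years:]
    latest_cpi = cpi[-1]
    # Restate each year's earnings in the latest year's price level.
    real_eps = [e * latest_cpi / c for e, c in zip(eps, cpi)]
    return price / (sum(real_eps) / years)


# Toy two-year example: earnings of 4 and 6 while the price level doubles.
print(cape(140.0, [4.0, 6.0], [100.0, 200.0], years=2))
```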
The potential problem with using the raw CAPE value for each sector is that certain sectors have structurally higher and lower CAPE ratios than their peers. High growth sectors – e.g. Technology – tend to have higher CAPE ratios because they reinvest a substantial portion of their earnings while more stable sectors – e.g. Utilities – tend to have much lower CAPE ratios. Were we to simply sort sectors based upon their current CAPE ratio, we would tend to create structural over- and under-weights towards certain sectors.
To adjust for this structural difference, the strategy uses the idea of a Relative CAPE ratio, which is calculated by taking the current CAPE ratio and dividing it by a rolling 20-year average CAPE ratio for that sector. The thesis behind this step is that dividing by a long-term mean normalizes the sectors and allows for better comparison. Relative CAPE values above 1 mean that the sector is more expensive than it has historically been, while values less than 1 mean it is cheaper.
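The normalization can be sketched as follows. We assume monthly CAPE observations and a rolling window that includes the current observation; both are assumptions of this sketch, as are the function name and the toy history.

```python
def relative_cape(cape_history, window=240):
    """Current CAPE divided by its rolling average over `window` observations.

    `cape_history` is ordered oldest first; 240 monthly observations
    correspond to 20 years.
    """
    if len(cape_history) < window:
        raise ValueError("not enough history for the requested window")
    recent = cape_history[-window:]
    return cape_history[-1] / (sum(recent) / window)


# Toy history: a single extreme reading inflates the average, so the
# current reading looks cheap relative to history.
toy_history = [20.0, 20.0, 60.0, 20.0]
print(relative_cape(toy_history, window=4))
```

The toy history shows how one outlier in the window pulls the ratio below 1 even though the current reading equals most of the past readings.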
It is important to note here that the actual selection is still performed on a cross-sector basis. It is entirely possible that all the sectors appear cheap or expensive on a historical basis at the same time. The portfolio will simply pick the cheapest sectors available.
Poking and Prodding the Parameters
With an understanding of the rules, our first step is to poke and prod a bit to figure out what is really driving the strategy.
We begin by first exploring the impact of using the Relative CAPE ratio versus just the CAPE ratio.
For each of these ratios, we’ll plot two strategies. The first is a naïve Value strategy, which will equally-weight the four cheapest sectors. The second is the Shiller strategy, which chooses the top five cheapest sectors and drops the one with the worst momentum. This should provide a baseline for comparing the impact of the momentum filter.
Strategy returns are plotted relative to the S&P 500.
Source: Siblis Research; Morningstar; CSI Data. Calculations by Newfound Research. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.
For the Relative CAPE ratio, we also vary the lookback period for calculating the rolling average CAPE from 5- to 20-years.
A few things immediately stand out:
The second-to-last point is particularly curious, as it implies that using momentum to “avoid the value trap” creates significant value (no pun intended; okay, pun intended) for the strategy.
Varying the Value Metric (in Vain)
To gain more insight, we next test the impact of the choice of the CAPE ratio. Below we plot the relative returns of different Shiller-based strategies (again varying lookbacks from 5- to 20-years), but use price-to-book, trailing 12-month price-to-earnings, and trailing 12-month EV/EBITDA as our value metrics.
A few things stand out:
At this point, we have to ask: is there something special about the Relative CAPE that makes it inherently superior to other metrics?
A Big Bubble-Based Bet?
If we take a step back for a moment, it is worth asking ourselves a simple question: what would it take for a sector rotation strategy to out-perform the S&P 500 over the last decade?
With the benefit of hindsight, we know Consumer Discretionary and Technology have led the pack, while traditionally stodgy sectors like Consumer Staples and Utilities have lagged behind (though not nearly as poorly as Energy).
As we mentioned earlier, a naïve rank on the CAPE ratio would almost certainly prefer Utilities and Staples over Technology and Discretionary. Thus, for us to outperform the market, we must somehow construct a value metric that identifies the two most chronically expensive sectors (ignoring back-dated valuations for the new Communication Services sector) as being among the cheapest.
This is where dividing by the rolling 20-year average comes into play. In spirit, it makes a certain degree of sense. In practice, however, this plays out perfectly for Technology, which went through such an enormous bubble in the late 1990s that the 20-year average was meaningfully skewed upward by an outlier event. Thus, for almost the entire 20-year period after the dot-com bubble, Technology appears to be relatively cheap by comparison. After all, you can buy for 30x earnings today what you used to be able to buy for 180x!
The result is a significant, near-permanent tilt towards Technology since the beginning of 2012, which can be seen in the graph of strategy weights below.
One way to explore the impact of this choice is to calculate the weight differences between a top-4 CAPE strategy and a top-4 Relative CAPE strategy, which we also plot below. We can see that after early 2012, the Relative CAPE strategy is structurally overweight Technology and underweight Financials and Utilities. Prior to 2008, we can see that it is structurally underweight Energy and overweight Consumer Staples.
If we take these weights and use them to construct a return stream, we can isolate the return impact of choosing Relative CAPE over CAPE. Interestingly, the long Technology / short Financials & Utilities trade did not appear to generate meaningful out-performance in the post-2012 era, suggesting that something else is responsible for post-2012 performance.
The Miraculous Mojo of Momentum
This is where the 12-month momentum filter plays a crucial role. Narratively, it is to avoid value traps. Practically, it helps the strategy deftly dodge Financials in 2008, avoiding a significant melt-down in one of the S&P 500’s largest sectors.
Now, you might think that valuations alone should have allowed the strategy to avoid Technology in the dot-com fallout. As it turns out, the Technology CAPE fell so precipitously that in using the Relative CAPE metric the Technology sector was still ranked as one of the top five cheapest sectors from 3/2001 to 11/2002. The only way the strategy was able to avoid it? The momentum filter.
Removing this filter makes the relative results a lot less attractive. Below we re-plot the relative performance of a simple “top 4” Relative CAPE strategy.
Just how much impact does the momentum filter have? We can isolate the effect by taking the weights of the Shiller strategy and subtracting the weights of the Value strategy to construct a long/short index that isolates the effect. Below we plot the returns of this index.
It should be noted that the legs of the long/short portfolio only have a notional exposure of 25%, as that is the most the Value and Shiller strategies can deviate by. Nevertheless, even with this relatively small weight, the isolated filter generates an annualized return of 1.8% with an annualized volatility of 4.8% and a maximum drawdown of 11.6%.
Scaled to a long/short with 100% notional per leg, annualized returns jump to 6.0%, though volatility and maximum drawdown climb to 20.4% and 52.6%, respectively.
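The long/short construction can be sketched as a weight difference followed by a rescaling. The sector labels, function names, and weights below are hypothetical; the sketch only shows why each leg carries 25% notional when the two strategies differ by a single sector.

```python
def long_short_weights(shiller, value):
    """Shiller weights minus Value weights, dropping sectors where they agree."""
    sectors = set(shiller) | set(value)
    diff = {s: shiller.get(s, 0.0) - value.get(s, 0.0) for s in sectors}
    return {s: w for s, w in diff.items() if w != 0.0}


def notional_per_leg(long_short):
    """Total weight on the long side (the short side mirrors it here)."""
    return sum(w for w in long_short.values() if w > 0)


def scale_to_notional(long_short, target=1.0):
    """Rescale the weights so that each leg has `target` notional exposure."""
    current = notional_per_leg(long_short)
    if current == 0:
        raise ValueError("the two strategies hold identical weights")
    return {s: w * target / current for s, w in long_short.items()}


# Hypothetical month: Value holds the four cheapest sectors (A-D); the
# Shiller rule drops D for poor momentum and holds the fifth cheapest, E.
value_weights = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
shiller_weights = {"A": 0.25, "B": 0.25, "C": 0.25, "E": 0.25}

ls = long_short_weights(shiller_weights, value_weights)
print(ls, notional_per_leg(ls), scale_to_notional(ls))
```

Because this rescales each period's weights, compounded statistics such as annualized return and maximum drawdown need not be simple multiples of their unscaled values.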
Conclusion
Few, if any, systematic value strategies have performed well as of late. When one does – as with the Shiller CAPE sector rotation strategy – it is worth further review.
As a brief summary of our findings:
Taken all together, it is hard to not question whether these results are unintentionally datamined. Unfortunately, we just do not have enough data to extend the tests further back in time for truly out-of-sample analysis.
What we can say, however, is that the backtested and live performance hinges almost entirely on a few key trades:
Three of these four trades are driven by the momentum filter. When we further consider that the Shiller strategy is in effect the sum of the returns of the pure value implementation (which suffered in the dot-com run-up and was a mostly random walk thereafter) and the returns of the isolated momentum filter, it becomes rather difficult to call this a value strategy at all.
As of the date of this document, neither Newfound Research nor Corey Hoffstein holds a position in the securities discussed in this article and do not have any plans to trade in such securities. Newfound Research and Corey Hoffstein do not take a position as to whether this security should be recommended for any particular investor.