This post is available as a PDF download here.
Summary
- In this case study, we explore building a simple, low cost, systematic municipal bond portfolio.
- The portfolio is built using the low volatility, momentum, value, and carry factors across a set of six municipal bond sectors. It favors sectors with lower volatility, better recent performance, cheaper valuations, and higher yields. As with other factor studies, a multi-factor approach is able to harvest major benefits from active strategy diversification since the factors have low correlations to one another.
- The factor tilts lead to over- and underweights to both credit and duration through time. Currently, the portfolio is significantly underweight duration and modestly overweight credit.
- A portfolio formed with the low volatility, value, and carry factors has sufficiently low turnover that these factors may have value in setting strategic allocations across municipal bond sectors.
Recently, we’ve been working on building a simple, ETF-based municipal bond strategy. Probably to the surprise of nobody who regularly reads our research, we are coming at the problem from a systematic, multi-factor perspective.
For this exercise, our universe consists of six municipal bond indices:
- Bloomberg Barclays AMT-Free Short Continuous Municipal Index
- Bloomberg Barclays AMT-Free Intermediate Continuous Municipal Index
- Bloomberg Barclays AMT-Free Long Continuous Municipal Index
- Bloomberg Barclays Municipal Pre-Refunded-Treasury-Escrowed Index
- Bloomberg Barclays Municipal Custom High Yield Composite Index
- Bloomberg Barclays Municipal High Yield Short Duration Index
These indices, all of which are tracked by VanEck Vectors ETFs, offer access to municipal bonds across a range of durations and credit qualities.
Before we get started, why are we writing another multi-factor piece after addressing factors in the context of a multi-asset universe just two weeks ago?
The simple answer is that we find the topic to be that pressing for today’s investors. In a world of depressed expected returns and elevated correlations, we believe that factor-based strategies have a role as both return generators and risk mitigators.
Our confidence in what we view as the premier factors (value, momentum, low volatility, carry, and trend) stems largely from their robustness in out-of-sample tests across asset classes, geographies, and timeframes. The results in this case study not only suggest that a factor-based approach is feasible in muni investing, but also, in our opinion, strengthen the case for factor investing in other contexts (e.g. equities, taxable fixed income, commodities, currencies, etc.).
Constructing Long/Short Factor Portfolios
For the municipal bond portfolio, we consider four factors:
- Value: Buy undervalued sectors, sell overvalued sectors
- Momentum: Buy strong recent performers, sell weak recent performers
- Low Volatility: Buy low risk sectors, sell high risk sectors
- Carry: Buy higher yielding sectors, sell lower yielding sectors
As a first step, we construct long/short single factor portfolios. The weight on index i at time t in long/short factor portfolio f is equal to:
In this formula, c is a scaling coefficient, S is index i’s time t score on factor f, and N is the number of indices in the universe at time t.
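The formula itself did not survive in this copy of the post. One form consistent with the definitions above (a scaling coefficient c, a factor score S, and a universe size N) is a score measured relative to the cross-sectional average. It is offered as an assumption, not as the exact original expression:

```latex
w_{i,t}^{f} = c_{f,t} \left( S_{i,t}^{f} - \frac{1}{N_t} \sum_{j=1}^{N_t} S_{j,t}^{f} \right)
```

Under this assumed form, sectors with above-average scores are held long and sectors with below-average scores are held short, with c setting the overall size of the positions.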
We measure each factor with the following metrics:
- Value: Normalized deviation of real yield from the 5-year trailing average yield[1]
- Momentum: Trailing twelve month return
- Low Volatility: Historical standard deviation of monthly returns[2]
- Carry: Yield-to-worst
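As a rough illustration of two of these metrics, the sketch below computes a momentum score and a low volatility score from a list of monthly returns. The 60-month window, the 3-year minimum history, and the sign flip come from footnote [2]; the function names and the use of the sample standard deviation are our assumptions.

```python
import statistics


def momentum_score(monthly_returns):
    """Trailing twelve-month compounded return (None if under 12 months of data)."""
    if len(monthly_returns) < 12:
        return None
    growth = 1.0
    for r in monthly_returns[-12:]:
        growth *= 1.0 + r
    return growth - 1.0


def low_vol_score(monthly_returns, window=60, min_obs=36):
    """Negative standard deviation of monthly returns over a rolling window.

    The sign is flipped so that a higher score is better, as with the
    other factor scores. Returns None if fewer than min_obs months exist.
    """
    recent = monthly_returns[-window:]
    if len(recent) < min_obs:
        return None
    # Assumption: sample (n - 1) standard deviation.
    return -statistics.stdev(recent)
```

The value and carry scores would additionally require yield and inflation data, so they are omitted here.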
For the value, momentum, and carry factors, the scaling coefficient is set so that the portfolio is dollar neutral (i.e. we are long and short the same dollar amount of securities). For the low volatility factor, the scaling coefficient is set so that the volatilities of the long and short portfolios are approximately equal. This is necessary since a dollar neutral construction would be perpetually short “beta” to the overall municipal bond market.
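The dollar neutral case can be sketched in a few lines. This is a minimal illustration, assuming weights are proportional to each sector's score minus the cross-sectional average and that the long side is scaled to a hypothetical $1 of exposure; the function name and the sample scores are ours.

```python
def dollar_neutral_weights(scores, gross_per_side=1.0):
    """Turn factor scores into long/short weights that sum to zero.

    scores: dict mapping sector name -> factor score (higher is better).
    gross_per_side: dollars held long (and, equally, short). Hypothetical.
    """
    mean = sum(scores.values()) / len(scores)
    demeaned = {k: s - mean for k, s in scores.items()}
    long_total = sum(d for d in demeaned.values() if d > 0)
    if long_total == 0:
        # All scores identical: no view, so no positions.
        return {k: 0.0 for k in scores}
    c = gross_per_side / long_total  # the scaling coefficient
    return {k: c * d for k, d in demeaned.items()}


# Hypothetical carry scores (yields, in percent) for four sectors.
example = dollar_neutral_weights(
    {"short": 1.0, "intermediate": 2.0, "long": 3.0, "high_yield": 6.0}
)
```

Because the demeaned scores sum to zero, the long and short sides always offset in dollar terms. The low volatility factor cannot use this single coefficient as written, since it scales the two sides to equal volatility rather than equal dollars.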
All four factors are profitable over the period from June 1998 to April 2017. The value factor is the top performer both from an absolute return and risk-adjusted return perspective.
There is significant variation in performance over time. All four factors have years where they are the best performing factor and years where they are the worst performing factor. The average annual spread between the best performing factor and the worst performing factor is 11.3%.
The individual long/short factor portfolios are diversifying both to each other (average pairwise correlation of -0.11) and to the broad municipal bond market.
Moving From Single Factor to Multi-Factor Portfolios
The diversified nature of the long/short return streams makes a multi-factor approach hard to beat in terms of risk-adjusted returns. This is another example of the type of strategy diversification that we have long lobbied for.
As evidence of these benefits, we have built two versions of a portfolio combining the low volatility, value, carry, and momentum factors. The first version targets an equal dollar allocation to each factor. The second version uses a naïve risk parity approach to target an approximately equal risk contribution from each factor.
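The two weighting schemes can be sketched as follows. We assume here that "naïve risk parity" means inverse-volatility weighting that ignores correlations, which is a common reading but not confirmed by the text. Only the 1.6% Low Volatility figure appears in this post; the other volatilities are hypothetical placeholders.

```python
def equal_dollar_weights(names):
    """Equal dollar allocation to each factor."""
    return {n: 1.0 / len(names) for n in names}


def naive_risk_parity_weights(vols):
    """Inverse-volatility weights, ignoring correlations (an assumption)."""
    inverse = {k: 1.0 / v for k, v in vols.items()}
    total = sum(inverse.values())
    return {k: x / total for k, x in inverse.items()}


# Annualized volatilities; all but low_vol are hypothetical.
factor_vols = {"low_vol": 0.016, "value": 0.03, "carry": 0.04, "momentum": 0.04}
equal_weights = equal_dollar_weights(list(factor_vols))
parity_weights = naive_risk_parity_weights(factor_vols)
```

Under inverse-volatility weighting, each factor's weight times its volatility is the same, so lower volatility factors receive larger dollar allocations.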
Both approaches outperform all four individual factors on a risk-adjusted basis, delivering Sharpe Ratios of 1.19 and 1.23, respectively, compared to 0.96 for the top single factor (value).
To stress this point, diversification is so plentiful across the factors that even the simplest portfolio construction methodologies outperform an investor who was able to identify the best performing factor with perfect foresight. For additional context, we constructed a “Look Ahead Mean-Variance Optimization (“MVO”) Portfolio” by calculating the Sharpe optimal weights using actual realized returns, volatilities, and correlations. The Look Ahead MVO Portfolio has a Sharpe Ratio of 1.43, not too far ahead of our two multi-factor portfolios. The approximate weights in the Look Ahead MVO Portfolio are 49% to Low Volatility, 25% to Value, 15% to Carry, and 10% to Momentum. While the higher Sharpe Ratio factors (Low Volatility and Value) do get larger allocations, Momentum and Carry are still well represented due to their diversification benefits.
From a risk perspective, both multi-factor portfolios have lower volatility than any of the individual factors and a maximum drawdown that is within 1% of the individual factor with the least amount of historical downside risk. It’s also worth pointing out that the risk parity construction leads to a return stream that is very close to normally distributed (skew of 0.1 and kurtosis of 3.0).
In the graph below, we present another lens through which we can view the tremendous amount of diversification that can be harvested between factors. Here we plot how the allocation to a specific factor, using MVO, will change as we vary that factor’s Sharpe Ratio. We perform this analysis for each factor individually, holding all other parameters fixed at their historical levels.
As an example, to estimate the allocation to the Low Volatility factor at a Sharpe Ratio of 0.1, we:
- Assume the covariance matrix is equal to the historical covariance over the full sample period.
- Assume the excess returns for the other three factors (Carry, Momentum, and Value) are equal to their historical averages.
- Assume the annualized excess return for the Low Volatility factor is 0.16% so that the Sharpe Ratio is equal to our target of 0.1 (Low Volatility’s annualized volatility is 1.6%).
- Calculate the MVO optimal weights using these excess return and risk assumptions.
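The four steps above can be sketched in code. This is an illustration under stated assumptions, not the calculation behind the graph: Sharpe optimal weights are taken to be proportional to the inverse covariance matrix times excess returns and normalized to sum to one. Only Low Volatility's 1.6% volatility comes from the post; the other volatilities, the uniform -0.10 correlation, and the other factors' excess returns are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mvo_weights(excess_returns, cov):
    """Sharpe optimal weights, normalized to sum to one (an assumption)."""
    raw = solve(cov, excess_returns)
    total = sum(raw)
    return [r / total for r in raw]


# Order: Low Volatility, Value, Carry, Momentum.
VOLS = [0.016, 0.03, 0.04, 0.04]      # only 0.016 is from the post
CORR = -0.10                          # hypothetical uniform correlation
OTHER_EXCESS = [0.024, 0.020, 0.016]  # hypothetical excess returns
COV = [[VOLS[i] * VOLS[j] * (1.0 if i == j else CORR) for j in range(4)]
       for i in range(4)]


def low_vol_allocation(target_sharpe):
    """MVO weight on Low Volatility when its Sharpe Ratio is set to the target."""
    excess = [target_sharpe * VOLS[0]] + OTHER_EXCESS  # 0.1 * 1.6% = 0.16%
    return mvo_weights(excess, COV)[0]
```

Calling `low_vol_allocation` over a grid of target Sharpe Ratios traces out one line of the kind of chart described above, though with these placeholder inputs the levels will not match the published figures.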
As expected, Sharpe Ratios and allocation sizes are positively correlated. Higher Sharpe Ratios lead to higher allocations.
That being said, three of the factors (Low Volatility, Carry, and Momentum) would receive allocations even if their Sharpe Ratios were slightly negative.
The allocations to carry and momentum are particularly insensitive to Sharpe Ratio level. Momentum would receive an allocation of 4% with a 0.00 Sharpe, 9% with a 0.25 Sharpe, 13% with a 0.50 Sharpe, 17% with a 0.75 Sharpe, and 20% with a 1.00 Sharpe. For the same Sharpe Ratios, the allocations to Carry would be 10%, 15%, 19%, 22%, and 24%, respectively.
Holding these factors provides a strong ballast within the multi-factor portfolio.
Moving From Long/Short to Long Only
Most investors have neither the space in their portfolio for a long/short muni strategy nor sufficient access to enough affordable leverage to get the strategy to an attractive level of volatility (and hence return). A more realistic approach would be to layer our factor bets on top of a long only strategic allocation to muni bonds.
In a perfect world, we could slap one of our multi-factor long/short portfolios right on top of a strategic municipal bond portfolio. The results of this approach (labeled “Benchmark + Equal Weight Factor Long/Short” in the graphics below) are impressive (Sharpe Ratio of 1.17 vs. 0.93 for the strategic benchmark and return to maximum drawdown of 0.72 vs. 0.46 for the strategic benchmark). Unfortunately, this approach still requires just a bit of shorting. The size of the total short ranges from 0% to 19% with an average of 5%.
We can create a true long only portfolio (“Long Only Factor”) by removing all shorts and normalizing so that our weights sum to one. Doing so modestly reduces risk, return, and risk-adjusted return, but still leads to outperformance vs. the benchmark.
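The long only step is simple enough to show directly. The sketch below adds long/short tilts to a benchmark, floors the result at zero, and rescales so the weights sum to one. The function name, the benchmark weights, and the tilts are all hypothetical.

```python
def long_only(weights):
    """Remove short positions and renormalize so the weights sum to one."""
    clipped = {k: max(w, 0.0) for k, w in weights.items()}
    total = sum(clipped.values())
    if total == 0:
        raise ValueError("no long positions remain after removing shorts")
    return {k: w / total for k, w in clipped.items()}


# Hypothetical benchmark weights and factor tilts for three sectors.
benchmark = {"short": 0.30, "intermediate": 0.50, "long": 0.20}
tilts = {"short": 0.25, "intermediate": 0.05, "long": -0.30}
combined = {k: benchmark[k] + tilts[k] for k in benchmark}
portfolio = long_only(combined)
```

In this example the combined weight on the long sector is negative, so it is set to zero and the remaining weights are scaled down proportionally.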
Below we plot both the historical and current allocations for the long only factor portfolio. Currently, the portfolio would have approximately 25% in each of the short-term investment grade, pre-refunded, and short-term high yield sectors, with the remaining 25% split roughly 80/20 between high yield and intermediate-term investment grade. There is currently no allocation to long-term investment grade.
A few interesting observations relating to the long only portfolio and muni factor investing in general:
- The factor tilts lead to clear duration and credit bets over time. Below we plot the duration and a composite credit score for the factor portfolio vs. the benchmark over time.
Currently, the portfolio is near an all-time low in terms of duration and is slightly tilted towards lower credit quality sectors relative to the benchmark. Historically, the factor portfolio was most often overweight both duration and credit, having this positioning in 53.7% of the months in the sample. The second and third most common tilts were underweight duration / underweight credit (22.0% of sample months) and underweight duration / overweight credit (21.6% of sample months). The portfolio was overweight duration / underweight credit in only 2.6% of sample months.
- Even for more passive investors, a factor-based perspective can be valuable in setting strategic allocations. The long only portfolio discussed above has annualized turnover of 77%. If we remove the momentum factor, which is by far the biggest driver of turnover, and restrict ourselves to a quarterly rebalance, we can reduce turnover to just 18%. This does come at a cost, as the Sharpe Ratio drops from 1.12 to 1.04, but historical performance would still be strong relative to our benchmark. This suggests that carry, value, and low volatility may be valuable in setting strategic allocations across municipal bond ETFs with only periodic updates at a normal strategic rebalance frequency.
- We ran regressions with our long/short factors on all funds in the Morningstar Municipal National Intermediate category with a track record that extended over our full sample period from June 1998 to April 2017. Below, we plot the betas of each fund to each of our four long/short factors. Blue bars indicate that the factor beta was significant at a 5% level. Gray bars indicate that the factor beta was not significant at a 5% level. We find little evidence of the active managers following a factor approach similar to what we outline in this post. Part of this is certainly the result of the constrained nature of the category with respect to duration and credit quality. In addition, these results do not speak to whether any of the managers use a factor-based approach to pick individual bonds within their defined duration and credit quality mandates.
The average beta to the low volatility factor, ignoring non-statistically significant values, is -0.23. This is most likely a function of category since the category consists of funds with both investment grade credit quality and durations ranging between 4.5 and 7.0 years. In contrast, our low volatility factor on average has short exposure to the intermediate and long-term investment grade sectors.
Only 14 of the 33 funds in the universe have statistically significant exposure to the value factor with an average beta of -0.03.
The average beta to the carry factor, ignoring non-statistically significant values, is -0.23. As described above with respect to low volatility, this is most likely a function of the category, as our carry factor favors the long-term investment grade and high yield sectors.
Only 9 of the 33 funds in the universe have statistically significant exposure to the momentum factor with an average beta of 0.02.
Conclusion
Multi-factor investing has generated significant press in the equity space due to the (poorly named) “smart beta” movement. The popular factors in the equity space have historically performed well both within other asset classes (rates, commodities, currencies, etc.) and across asset classes. The municipal bond market is no different. A simple, systematic multi-factor process has the potential to improve risk-adjusted performance relative to static benchmarks. The portfolio can be implemented with liquid, low cost ETFs.
Moving beyond active strategies, factors can also be valuable tools when setting strategic sector allocations within a municipal bond sleeve and when evaluating and blending municipal bond managers.
Perhaps more importantly, the out-of-sample evidence for the premier factors (momentum, value, low volatility, carry, and trend) across asset classes, geographies, and timeframes continues to mount. In our view, this evidence can be crucial in getting investors comfortable with introducing systematic active premia into their portfolios as both return generators and risk mitigators.
[1] Computed using yield-to-worst. Inflation estimates are based on 1-year and 10-year survey-based expected inflation. We average the value score over the last 2.5 years, allowing the portfolio to realize a greater degree of valuation mean reversion before closing out a position.
[2] We use a rolling 5-year (60-month) window to calculate standard deviation. We require at least 3 years of data for an index to be included in the low volatility portfolio. The standard deviation is multiplied by -1 so that higher values are better across all four factor scores.
The Dumb (Timing) Luck of Smart Beta
By Corey Hoffstein
On November 18, 2019
Summary
We’ve written about the concept of rebalance timing luck a lot. It’s a cowbell we’ve been beating for over half a decade, with our first article going back to August 7th, 2013.
As a reminder, rebalance timing luck is the performance dispersion that arises from the choice of a particular rebalance date (e.g. semi-annual rebalances that occur in June and December versus March and September).
We’ve empirically explored the impact of rebalance timing luck as it relates to strategic asset allocation, tactical asset allocation, and even used our own Systematic Value strategy as a case study for smart beta. All of our results suggest that it has a highly non-trivial impact upon performance.
This summer we published a paper in the Journal of Index Investing that proposed a simple solution to the timing luck problem: diversification. If, for example, we believe that our momentum portfolio should be rebalanced every quarter – perhaps as an optimal balance of cost and signal freshness – then we proposed splitting our capital across the three portfolios that spanned different three-month rebalance periods (e.g. JAN-APR-JUL-OCT, FEB-MAY-AUG-NOV, MAR-JUN-SEP-DEC). This solution is referred to either as “tranching” or “overlapping portfolios.”
The paper also derived a formula for estimating timing luck ex-ante, with a simplified representation of:
Where L is the timing luck measure, T is turnover rate of the strategy, F is how many times per year the strategy rebalances, and S is the volatility of a long/short portfolio that captures the difference of what a strategy is currently invested in versus what it could be invested in if the portfolio was reconstructed at that point in time.
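The formula itself is missing from this copy of the post. A form consistent with the variable definitions above, offered as a reconstruction rather than a verified quotation of the paper, is:

```latex
L = \frac{T}{2F} \, S
```

For illustration only, with hypothetical inputs of T = 100% annual turnover, F = 2 rebalances per year, and S = 10%, this form gives L = 2.5%.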
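Without numbers, this equation still informs some general conclusions:
- Strategies with higher turnover have higher timing luck.
- Strategies that rebalance more frequently have lower timing luck.
- Strategies whose current holdings can differ more from what they would hold if reconstructed today (a higher S) have higher timing luck.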
Bullet points 1 and 3 may seem similar but capture subtly different effects. This is likely best illustrated with two examples on different extremes. First consider a very high turnover strategy that trades within a universe of highly correlated securities. Now consider a very low turnover strategy that is either 100% long or 100% short U.S. equities. In the first case, the highly correlated nature of the universe means that differences in specific holdings may not matter as much, whereas in the second case the perfect inverse correlation means that small portfolio differences lead to meaningfully different performance.
L, in and of itself, is a bit tricky to interpret, but effectively attempts to capture the potential dispersion in performance between a particular rebalance implementation choice (e.g. JAN-APR-JUL-OCT) versus a timing-luck-neutral benchmark.
After half a decade, you’d think we’ve spilled enough ink on this subject.
But given that just about every single major index still does not address this issue, and since our passion for the subject clearly verges on fever pitch, here comes some more cowbell.
Equity Style Portfolio Definitions
In this note, we will explore timing luck as it applies to four simplified smart beta portfolios based upon holdings of the S&P 500 from 2000-2019: Value, Momentum, Low Volatility, and Quality.
Quality is a bit more complicated only because the quality factor has far less consistency in accepted definition. Therefore, we adopted the signals utilized by the S&P 500 Quality Index.
For each of these equity styles, we construct portfolios that vary across two dimensions: concentration (the number of stocks held) and rebalance frequency.
For the different rebalance frequencies, we also generate portfolios that represent each possible rebalance variation of that mix. For example, Momentum portfolios with 50 stocks that rebalance annually have 12 possible variations: a January rebalance, February rebalance, et cetera. Similarly, there are 12 possible variations of Momentum portfolios with 100 stocks that rebalance annually.
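The enumeration of rebalance variations can be sketched as follows. This is a minimal illustration, assuming rebalances are evenly spaced through the calendar year; the function name `rebalance_schedules` is ours and hypothetical.

```python
def rebalance_schedules(rebalances_per_year):
    """List every distinct set of rebalance months for a given frequency.

    Months are numbered 1 (January) through 12 (December).
    """
    if rebalances_per_year < 1 or 12 % rebalances_per_year != 0:
        raise ValueError("rebalances_per_year must divide 12 evenly")
    step = 12 // rebalances_per_year
    return [tuple(range(start, 13, step)) for start in range(1, step + 1)]


annual = rebalance_schedules(1)       # 12 variations, one per month
semi_annual = rebalance_schedules(2)  # 6 variations
quarterly = rebalance_schedules(4)    # 3 variations

# An overlapping ("tranched") portfolio splits capital equally across them.
tranche_weight = 1.0 / len(quarterly)
```

Each tuple corresponds to one rebalance date variation, so a quarterly strategy has three variations and each receives one third of the capital in the overlapping solution.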
By explicitly calculating the rebalance date variations of each Style x Holding x Frequency combination, we can construct an overlapping portfolios solution. To estimate empirical annualized timing luck, we calculate the standard deviation of the monthly return differences between each rebalance date variation and the overlapping portfolio solution, and annualize the result.
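A simplified version of this estimate is sketched below. It assumes the overlapping portfolio's monthly return is the equal-weight average of the variations' returns, pools the differences across all variations, and annualizes by the square root of 12. The exact pooling and weighting used in the study are not spelled out in the post, so those choices are ours.

```python
import math
import statistics


def empirical_timing_luck(variation_returns):
    """Annualized dispersion of each variation around the overlapping portfolio.

    variation_returns: list of equal-length lists of monthly returns,
    one list per rebalance date variation.
    """
    n_months = len(variation_returns[0])
    n_vars = len(variation_returns)
    # Assumption: overlapping portfolio return = equal-weight monthly average.
    overlap = [sum(v[t] for v in variation_returns) / n_vars
               for t in range(n_months)]
    # Pool the monthly differences across every variation.
    diffs = [v[t] - overlap[t]
             for v in variation_returns for t in range(n_months)]
    return statistics.pstdev(diffs) * math.sqrt(12)
```

If every variation earned identical returns, the differences would all be zero and the measure would be zero, which matches the intuition that timing luck is purely a dispersion effect.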
Empirical Timing Luck Results
Before looking at the results plotted below, we would encourage readers to hypothesize as to what they expect to see. Perhaps not in absolute magnitude, but at least in relative magnitude.
For example, based upon our understanding of the variables affecting timing luck, would we expect an annually rebalanced portfolio to have more or less timing luck than a quarterly rebalanced one?
Should a more concentrated portfolio have more or less timing luck than a less concentrated variation?
Which factor has the greatest risk of exhibiting timing luck?
Source: Sharadar. Calculations by Newfound Research.
To create a sense of scale across the styles, below we isolate the results for semi-annual rebalancing for each style and plot it.
Source: Sharadar. Calculations by Newfound Research.
In relative terms, there is no great surprise in these results:
What is perhaps the most surprising is the sheer magnitude of timing luck. Consider that the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality portfolios all hold 100 securities and are rebalanced semi-annually. Our study suggests that timing luck for such approaches may be as large as 2.5%, 4.4%, 1.1%, and 2.0% respectively.
But what does that really mean? Consider the realized performance dispersion of different rebalance date variations of a Momentum portfolio that holds the top 100 securities in equal weight and is rebalanced on a semi-annual basis.
Source: Sharadar. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
The 4.4% estimate of annualized timing luck is a measure of dispersion between each underlying variation and the overlapping portfolio solution. If we isolate two sub-portfolios and calculate rolling 12-month performance dispersion, we can see that the difference can be far larger, as one might exhibit positive timing luck while the other exhibits negative timing luck. Below we do precisely this for the APR-OCT and MAY-NOV rebalance variations.
Source: Sharadar. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
In fact, since these variations are identical in every which way except for the date on which they rebalance, a portfolio that is long the APR-OCT variation and short the MAY-NOV variation would explicitly capture the effects of rebalance timing luck. If we assume the rebalance timing luck realized by these two portfolios is independent (which our research suggests it is), then the volatility of this long/short is approximately the rebalance timing luck estimated above scaled by the square-root of two.
Derivation: For variations vi and vj and overlapping-portfolio solution V, then:
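The derivation itself is missing from this copy. A sketch consistent with the independence assumption and the square-root-of-two scaling described above, further assuming that each variation has the same timing luck L relative to V, is:

```latex
\begin{aligned}
\sigma^2(v_i - v_j) &= \sigma^2\big((v_i - V) - (v_j - V)\big) \\
&= \sigma^2(v_i - V) + \sigma^2(v_j - V) - 2\,\mathrm{Cov}(v_i - V,\; v_j - V) \\
&= L^2 + L^2 - 0 = 2L^2 \\
\sigma(v_i - v_j) &= \sqrt{2}\, L
\end{aligned}
```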
Thus, if we are comparing two identically-managed 100-stock momentum portfolios that rebalance semi-annually, our 95% confidence interval for performance dispersion due to timing luck is +/- 12.4% (2 x SQRT(2) x 4.4%).
Even for more diversified, lower turnover portfolios, this remains an issue. Consider a 400-stock low-volatility portfolio that is rebalanced quarterly. Empirical timing luck is still 0.5%, suggesting a 95% confidence interval of +/- 1.4%.
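The arithmetic behind these intervals is easy to check. The sketch below applies the 2 x SQRT(2) scaling to a timing luck estimate; the function name is hypothetical.

```python
import math


def dispersion_interval(timing_luck):
    """Half-width of the approximate 95% interval for the performance gap
    between two rebalance variations: 2 x SQRT(2) x timing luck."""
    return 2.0 * math.sqrt(2.0) * timing_luck


momentum_100_semi_annual = dispersion_interval(0.044)  # 4.4% timing luck
low_vol_400_quarterly = dispersion_interval(0.005)     # 0.5% timing luck
```

The factor of two approximates a two-standard-deviation band, and the square root of two converts the timing luck of one variation into the volatility of the difference between two.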
S&P 500 Style Index Examples
One critique of the above analysis is that it is purely hypothetical: the portfolios studied above aren’t really those offered in the market today.
We will take our analysis one step further and replicate (to the best of our ability) the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality indices. We then created different rebalance schedule variations. Note that the S&P 500 Low Volatility index rebalances quarterly, so there are only three possible rebalance variations to compute.
Source: Sharadar. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
We see a meaningful dispersion in terminal wealth levels, even for the S&P 500 Low Volatility index, which appears at first glance in the graph to have little impact from timing luck.
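| Index | Minimum Terminal Wealth | Maximum Terminal Wealth |
| --- | --- | --- |
| S&P 500 Enhanced Value | $4.45 | $5.45 |
| S&P 500 Momentum | $3.07 | $4.99 |
| S&P 500 Low Volatility | $6.16 | $6.41 |
| S&P 500 Quality | $4.19 | $5.25 |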
We should further note that there does not appear to be one set of rebalance dates that does significantly better than the others. For Value, FEB-AUG looks best while JUN-DEC looks the worst; for Momentum it’s almost precisely the opposite.
Furthermore, we can see that even seemingly closely related rebalances can have significant dispersion: consider MAY-NOV and JUN-DEC for Momentum. Here is a real doozy of a statistic: at one point, the MAY-NOV implementation for Momentum is down 50.3% while the JUN-DEC variation is down just 13.8%.
These differences are even more evident if we plot the annual returns for each strategy’s rebalance variations. Note, in particular, the extreme differences in Value in 2009, Momentum in 2017, and Quality in 2003.
Source: Sharadar. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.
Conclusion
In this study, we have explored the impact of rebalance timing luck on the results of smart beta / equity style portfolios.
We empirically tested this impact by designing a variety of portfolio specifications for four different equity styles (Value, Momentum, Low Volatility, and Quality). The specifications varied by concentration as well as rebalance frequency. We then constructed all possible rebalance variations of each specification to calculate the realized impact of rebalance timing luck over the test period (2000-2019).
In line with our mathematical model, we generally find that those strategies with higher turnover have higher timing luck and those that rebalance more frequently have less timing luck.
The sheer magnitude of timing luck, however, may come as a surprise to many. For reasonably concentrated portfolios (100 stocks) with semi-annual rebalance frequencies (common in many index definitions), annual timing luck ranged from roughly 1% to 4%, which, using the 2 x SQRT(2) scaling above, translates to a 95% confidence interval in annual performance dispersion of about +/-3% to +/-12.5%.
The sheer magnitude of timing luck calls into question our ability to draw meaningful relative performance conclusions between two strategies.
We then explored more concrete examples, replicating the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality indices. In line with expectations, we find that Momentum (a high turnover strategy) exhibits significantly higher realized timing luck than a lower turnover strategy rebalanced more frequently (i.e. Low Volatility).
For these four indices, the amount of rebalance timing luck leads to a staggering level of dispersion in realized terminal wealth.
“But Corey,” you say, “this only has to do with systematic factor managers, right?”
Consider that most of the major equity style benchmarks are managed with annual or semi-annual rebalance schedules. Good luck to anyone trying to identify manager skill when your benchmark might be realizing hundreds of basis points of positive or negative performance luck a year.