The Research Library of Newfound Research


The Dumb (Timing) Luck of Smart Beta


Summary

  • In past research notes we have explored the impact of rebalance timing luck on strategic and tactical portfolios, even using our own Systematic Value methodology as a case study.
  • In this note, we generate empirical timing luck estimates for a variety of specifications for simplified value, momentum, low volatility, and quality style portfolios.
  • Relative results align nicely with intuition: higher concentration and less frequent rebalancing lead to higher levels of realized timing luck.
  • For more reasonable specifications – e.g. 100 stock portfolios rebalanced semi-annually – timing luck ranges between 100 and 400 basis points depending upon the style under investigation, suggesting a significant risk of performance dispersion due only to when a portfolio is rebalanced and nothing else.
  • The large magnitude of timing luck suggests that any conclusions drawn from performance comparisons between smart beta ETFs or against a standard style index may be spurious.

We’ve written about the concept of rebalance timing luck a lot.  It’s a cowbell we’ve been beating for over half a decade, with our first article going back to August 7th, 2013.

As a reminder, rebalance timing luck is the performance dispersion that arises from the choice of a particular rebalance date (e.g. semi-annual rebalances that occur in June and December versus March and September).

We’ve empirically explored the impact of rebalance timing luck as it relates to strategic asset allocation, tactical asset allocation, and even used our own Systematic Value strategy as a case study for smart beta.  All of our results suggest that it has a highly non-trivial impact upon performance.

This summer we published a paper in the Journal of Index Investing that proposed a simple solution to the timing luck problem: diversification.  If, for example, we believe that our momentum portfolio should be rebalanced every quarter – perhaps as an optimal balance of cost and signal freshness – then we proposed splitting our capital across the three portfolios that spanned different three-month rebalance periods (e.g. JAN-APR-JUL-OCT, FEB-MAY-AUG-NOV, MAR-JUN-SEP-DEC).  This solution is referred to either as “tranching” or “overlapping portfolios.”

The paper also derived a formula for estimating timing luck ex-ante, with a simplified representation of:

$$L = \frac{T}{2F} S$$

Where L is the timing luck measure, T is the turnover rate of the strategy, F is how many times per year the strategy rebalances, and S is the volatility of a long/short portfolio that captures the difference between what a strategy is currently invested in and what it could be invested in if the portfolio were reconstructed at that point in time.

Without numbers, this equation still informs some general conclusions:

  • Higher turnover strategies have higher timing luck.
  • Strategies that rebalance more frequently have lower timing luck.
  • Strategies with a less constrained universe will have higher timing luck.

Bullet points 1 and 3 may seem similar but capture subtly different effects.  This is likely best illustrated with two examples on different extremes.  First consider a very high turnover strategy that trades within a universe of highly correlated securities.  Now consider a very low turnover strategy that is either 100% long or 100% short U.S. equities.  In the first case, the highly correlated nature of the universe means that differences in specific holdings may not matter as much, whereas in the second case the perfect inverse correlation means that small portfolio differences lead to meaningfully different performance.

L, in and of itself, is a bit tricky to interpret, but effectively attempts to capture the potential dispersion in performance between a particular rebalance implementation choice (e.g. JAN-APR-JUL-OCT) versus a timing-luck-neutral benchmark.

After half a decade, you’d think we would have spilled enough ink on this subject.

But given that just about every single major index still does not address this issue, and since our passion for the subject clearly verges on fever pitch, here comes some more cowbell.

Equity Style Portfolio Definitions

In this note, we will explore timing luck as it applies to four simplified smart beta portfolios based upon holdings of the S&P 500 from 2000-2019:

  • Value: Sort on earnings yield.
  • Momentum: Sort on prior 12-1 month returns.
  • Low Volatility: Sort on realized 12-month volatility.
  • Quality: Sort on average rank-score of ROE, accruals ratio, and leverage ratio.

Quality is a bit more complicated only because the quality factor has far less consistency in accepted definition.  Therefore, we adopted the signals utilized by the S&P 500 Quality Index.

For each of these equity styles, we construct portfolios that vary across two dimensions:

  • Number of Holdings: 50, 100, 150, 200, 250, 300, 350, and 400.
  • Frequency of Rebalance: Quarterly, Semi-Annually, and Annually.

For the different rebalance frequencies, we also generate portfolios that represent each possible rebalance variation of that mix.  For example, Momentum portfolios with 50 stocks that rebalance annually have 12 possible variations: a January rebalance, February rebalance, et cetera.  Similarly, there are 12 possible variations of Momentum portfolios with 100 stocks that rebalance annually.
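To make the counting concrete, the sketch below enumerates the distinct month schedules available at each rebalance frequency. The function name `rebalance_variations` is our own label for illustration and is not part of the original study; it simply assumes rebalances are evenly spaced through the calendar year.

```python
import calendar


def rebalance_variations(rebalances_per_year):
    """Return every distinct month schedule for an evenly spaced rebalance frequency."""
    if 12 % rebalances_per_year != 0:
        raise ValueError("rebalances_per_year must divide 12 evenly")
    step = 12 // rebalances_per_year  # months between consecutive rebalances
    schedules = []
    for offset in range(step):  # each starting month defines one variation
        months = [offset + 1 + k * step for k in range(rebalances_per_year)]
        label = "-".join(calendar.month_abbr[m].upper() for m in months)
        schedules.append(label)
    return schedules


quarterly = rebalance_variations(4)    # 3 variations
semi_annual = rebalance_variations(2)  # 6 variations
annual = rebalance_variations(1)       # 12 variations
```

Under this assumption a quarterly schedule has 3 variations, a semi-annual schedule has 6, and an annual schedule has 12, regardless of the style or the number of holdings.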

By explicitly calculating the rebalance date variations of each Style x Holding x Frequency combination, we can construct an overlapping portfolios solution.  To estimate empirical annualized timing luck, we calculate the standard deviation of monthly return dispersion between the different rebalance date variations of the overlapping portfolio solution and annualize the result.
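One possible reading of this calculation is sketched below. It is not necessarily the exact implementation behind the results that follow. The sketch assumes the input is a matrix of monthly returns with one column per rebalance date variation, approximates the overlapping portfolio as the equal-weight average of the variations each month, pools the deviations across all variations, and annualizes with the square root of 12. The function name `empirical_timing_luck` is hypothetical.

```python
import numpy as np


def empirical_timing_luck(variation_returns):
    """Annualized timing luck from a (months x variations) array of monthly returns."""
    r = np.asarray(variation_returns, dtype=float)
    # Assumption: the overlapping portfolio is the equal-weight blend of all variations.
    overlapping = r.mean(axis=1, keepdims=True)
    # Monthly return dispersion of each variation versus the overlapping portfolio.
    dispersion = r - overlapping
    # Pool every variation's deviations into one sample standard deviation.
    monthly_sd = dispersion.std(ddof=1)
    return monthly_sd * np.sqrt(12)


# Tiny made-up example: two variations observed over two months.
toy_returns = np.array([[0.01, 0.03],
                        [0.02, 0.00]])
toy_luck = empirical_timing_luck(toy_returns)
```

If every variation had identical returns, the dispersion matrix would be all zeros and the estimate would be zero, which is the behavior we would want from a timing-luck-neutral set of portfolios.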

Empirical Timing Luck Results

Before looking at the results plotted below, we would encourage readers to hypothesize as to what they expect to see.  Perhaps not in absolute magnitude, but at least in relative magnitude.

For example, based upon our understanding of the variables affecting timing luck, would we expect an annually rebalanced portfolio to have more or less timing luck than a quarterly rebalanced one?

Should a more concentrated portfolio have more or less timing luck than a less concentrated variation?

Which factor has the greatest risk of exhibiting timing luck?

Source: Sharadar.  Calculations by Newfound Research.

To create a sense of scale across the styles, below we isolate the results for semi-annual rebalancing for each style and plot it.

Source: Sharadar.  Calculations by Newfound Research.

In relative terms, there is no great surprise in these results:

  • More frequent rebalancing limits the risk of portfolios changing significantly between rebalance dates, thereby decreasing the impact of timing luck.
  • More concentrated portfolios exhibit larger timing luck.
  • Faster-moving signals (e.g. momentum) tend to exhibit more timing luck than more stable, slower-moving signals (e.g. low volatility).

What is perhaps the most surprising is the sheer magnitude of timing luck.  Consider that the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality portfolios all hold 100 securities and are rebalanced semi-annually.  Our study suggests that timing luck for such approaches may be as large as 2.5%, 4.4%, 1.1%, and 2.0% respectively.

But what does that really mean?  Consider the realized performance dispersion of different rebalance date variations of a Momentum portfolio that holds the top 100 securities in equal weight and is rebalanced on a semi-annual basis.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. 

The 4.4% estimate of annualized timing luck is a measure of dispersion between each underlying variation and the overlapping portfolio solution.  If we isolate two sub-portfolios and calculate rolling 12-month performance dispersion, we can see that the difference can be far larger, as one might exhibit positive timing luck while the other exhibits negative timing luck.  Below we do precisely this for the APR-OCT and MAY-NOV rebalance variations.

Source: Sharadar.  Calculations by Newfound Research.

In fact, since these variations are identical in every which way except for the date on which they rebalance, a portfolio that is long the APR-OCT variation and short the MAY-NOV variation would explicitly capture the effects of rebalance timing luck.  If we assume the rebalance timing luck realized by these two portfolios is independent (which our research suggests it is), then the volatility of this long/short is approximately the rebalance timing luck estimated above scaled by the square-root of two.

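Derivation: For variations v_i and v_j and overlapping-portfolio solution V, with timing luck L = σ(v_i − V) = σ(v_j − V) and the two deviations assumed independent:

$$\sigma^2(v_i - v_j) = \sigma^2\big((v_i - V) - (v_j - V)\big) = \sigma^2(v_i - V) + \sigma^2(v_j - V) = 2L^2 \quad\Rightarrow\quad \sigma(v_i - v_j) = \sqrt{2}\,L$$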

Thus, if we are comparing two identically-managed 100-stock momentum portfolios that rebalance semi-annually, our 95% confidence interval for performance dispersion due to timing luck is +/- 12.4% (2 x SQRT(2) x 4.4%).

Even for more diversified, lower turnover portfolios, this remains an issue.  Consider a 400-stock low-volatility portfolio that is rebalanced quarterly.  Empirical timing luck is still 0.5%, suggesting a 95% confidence interval of +/- 1.4% (2 x SQRT(2) x 0.5%).
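The arithmetic behind both intervals can be reproduced in a few lines. This is only a sketch of the calculation quoted above: it assumes the two variations' timing luck is independent and uses a multiplier of 2 standard deviations for the approximate 95% band. The function name `dispersion_interval` is ours.

```python
import math


def dispersion_interval(timing_luck, z=2.0):
    """Approximate +/- band for the return gap between two independent variations."""
    # The long/short of two variations has volatility sqrt(2) x timing luck.
    return z * math.sqrt(2.0) * timing_luck


momentum_band = dispersion_interval(0.044)  # 100-stock, semi-annual momentum
low_vol_band = dispersion_interval(0.005)   # 400-stock, quarterly low volatility
```

Rounded to one decimal place, these evaluate to roughly 12.4% and 1.4%, matching the figures in the text.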

S&P 500 Style Index Examples

One critique of the above analysis is that it is purely hypothetical: the portfolios studied above aren’t really those offered in the market today.

We take our analysis one step further and replicate (to the best of our ability) the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality indices.  We then create the different rebalance schedule variations of each.  Note that the S&P 500 Low Volatility index rebalances quarterly, so there are only three possible rebalance variations to compute.

Source: Sharadar.  Calculations by Newfound Research.

We see a meaningful dispersion in terminal wealth levels, even for the S&P 500 Low Volatility index, which appears at first glance in the graph to have little impact from timing luck.

| Style | Minimum Terminal Wealth | Maximum Terminal Wealth |
|---|---|---|
| Enhanced Value | $4.45 | $5.45 |
| Momentum | $3.07 | $4.99 |
| Low Volatility | $6.16 | $6.41 |
| Quality | $4.19 | $5.25 |

 

We should further note that there does not appear to be one set of rebalance dates that does significantly better than the others.  For Value, FEB-AUG looks best while JUN-DEC looks the worst; for Momentum it’s almost precisely the opposite.

Furthermore, we can see that even seemingly closely related rebalances can have significant dispersion: consider MAY-NOV and JUN-DEC for Momentum. Here is a real doozy of a statistic: at one point, the MAY-NOV implementation for Momentum is down 50.3% while the JUN-DEC variation is down just 13.8%.

These differences are even more evident if we plot the annual returns for each strategy’s rebalance variations.   Note, in particular, the extreme differences in Value in 2009, Momentum in 2017, and Quality in 2003.

Source: Sharadar.  Calculations by Newfound Research.

Conclusion

In this study, we have explored the impact of rebalance timing luck on the results of smart beta / equity style portfolios.

We empirically tested this impact by designing a variety of portfolio specifications for four different equity styles (Value, Momentum, Low Volatility, and Quality).  The specifications varied by concentration as well as rebalance frequency.  We then constructed all possible rebalance variations of each specification to calculate the realized impact of rebalance timing luck over the test period (2000-2019).

In line with our mathematical model, we generally find that those strategies with higher turnover have higher timing luck and those that rebalance more frequently have less timing luck.

The sheer magnitude of timing luck, however, may come as a surprise to many.  For reasonably concentrated portfolios (100 stocks) with semi-annual rebalance frequencies (common in many index definitions), annual timing luck ranged from roughly 1% to 4%, which translates to a 95% confidence interval in annual performance dispersion of about +/-3% to +/-12.5%.

The sheer magnitude of timing luck calls into question our ability to draw meaningful relative performance conclusions between two strategies.

We then explored more concrete examples, replicating the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality indices.  In line with expectations, we find that Momentum (a high turnover strategy) exhibits significantly higher realized timing luck than a lower turnover strategy rebalanced more frequently (i.e. Low Volatility).

For these four indices, the amount of rebalance timing luck leads to a staggering level of dispersion in realized terminal wealth.

“But Corey,” you say, “this only has to do with systematic factor managers, right?”

Consider that most of the major equity style benchmarks are managed with annual or semi-annual rebalance schedules.  Good luck to anyone trying to identify manager skill when your benchmark might be realizing hundreds of basis points of positive or negative performance luck a year.

 

Navigating Municipal Bonds With Factors


Summary

  • In this case study, we explore building a simple, low cost, systematic municipal bond portfolio.
  • The portfolio is built using the low volatility, momentum, value, and carry factors across a set of six municipal bond sectors. It favors sectors with lower volatility, better recent performance, cheaper valuations, and higher yields.  As with other factor studies, a multi-factor approach is able to harvest major benefits from active strategy diversification since the factors have low correlations to one another.
  • The factor tilts lead to over- and underweights to both credit and duration through time. Currently, the portfolio is significantly underweight duration and modestly overweight credit.
  • A portfolio formed with the low volatility, value, and carry factors has sufficiently low turnover that these factors may have value in setting strategic allocations across municipal bond sectors.

 

Recently, we’ve been working on building a simple, ETF-based municipal bond strategy.  Probably to the surprise of nobody who regularly reads our research, we are coming at the problem from a systematic, multi-factor perspective.

For this exercise, our universe consists of six municipal bond indices:

  • Bloomberg Barclays AMT-Free Short Continuous Municipal Index
  • Bloomberg Barclays AMT-Free Intermediate Continuous Municipal Index
  • Bloomberg Barclays AMT-Free Long Continuous Municipal Index
  • Bloomberg Barclays Municipal Pre-Refunded-Treasury-Escrowed Index
  • Bloomberg Barclays Municipal Custom High Yield Composite Index
  • Bloomberg Barclays Municipal High Yield Short Duration Index

These indices, all of which are tracked by VanEck Vectors ETFs, offer access to municipal bonds across a range of durations and credit qualities.

Source: VanEck

Before we get started, why are we writing another multi-factor piece after addressing factors in the context of a multi-asset universe just two weeks ago?

The simple answer is that we find the topic to be that pressing for today’s investors.  In a world of depressed expected returns and elevated correlations, we believe that factor-based strategies have a role as both return generators and risk mitigators.

Our confidence in what we view as the premier factors (value, momentum, low volatility, carry, and trend) stems largely from their robustness in out-of-sample tests across asset classes, geographies, and timeframes.  The results in this case study not only suggest that a factor-based approach is feasible in muni investing, but also, in our opinion, strengthen the case for factor investing in other contexts (e.g. equities, taxable fixed income, commodities, currencies, etc.).

Constructing Long/Short Factor Portfolios

For the municipal bond portfolio, we consider four factors:

  1. Value: Buy undervalued sectors, sell overvalued sectors
  2. Momentum: Buy strong recent performers, sell weak recent performers
  3. Low Volatility: Buy low risk sectors, sell high risk sectors
  4. Carry: Buy higher yielding sectors, sell lower yielding sectors

As a first step, we construct long/short single factor portfolios.  The weight on index i at time t in long/short factor portfolio f is equal to:

In this formula, c is a scaling coefficient,  S is index i’s time t score on factor f, and N is the number of indices in the universe at time t.

We measure each factor with the following metrics:

  1. Value: Normalized deviation of real yield from the 5-year trailing average yield[1]
  2. Momentum: Trailing twelve month return
  3. Low Volatility: Historical standard deviation of monthly returns[2]
  4. Carry: Yield-to-worst

For the value, momentum, and carry factors, the scaling coefficient c is set so that the portfolio is dollar neutral (i.e. we are long and short the same dollar amount of securities).  For the low volatility factor, the scaling coefficient is set so that the volatilities of the long and short portfolios are approximately equal.  This is necessary since a dollar neutral construction would be perpetually short “beta” to the overall municipal bond market.
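As a rough illustration of the dollar-neutral case, the sketch below builds long/short weights from a vector of factor scores. The exact weighting formula is not reproduced here, so several choices are our own assumptions: weights are proportional to each index's score minus the cross-sectional average score, and the scaling coefficient is chosen so the portfolio is long $1 and short $1. The function name and the example scores are hypothetical, and the volatility-matched low volatility construction is not covered.

```python
import numpy as np


def dollar_neutral_weights(scores):
    """Long/short weights proportional to cross-sectionally demeaned scores."""
    s = np.asarray(scores, dtype=float)
    n = s.size                       # number of indices in the universe
    demeaned = s - s.sum() / n       # deviations from the average sum to zero
    gross_long = demeaned[demeaned > 0].sum()
    if gross_long == 0:              # all scores identical: no factor bet
        return np.zeros(n)
    c = 1.0 / gross_long             # assumed scaling: $1 long, $1 short
    return c * demeaned


# Hypothetical factor scores for a six-index universe.
example_scores = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
example_weights = dollar_neutral_weights(example_scores)
```

Because the demeaned scores sum to zero, any positive scaling coefficient leaves the portfolio dollar neutral; the coefficient only controls gross exposure.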

Data Source: Bloomberg. Calculations by Newfound Research. All returns are hypothetical and backtested. Returns reflect the reinvestment of all distributions and are gross of all fees (including any management fees and transaction costs). The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results.

All four factors are profitable over the period from June 1998 to April 2017.  The value factor is the top performer both from an absolute return and risk-adjusted return perspective.

Data Source: Bloomberg. Calculations by Newfound Research. The benchmark is an equal-weight portfolio of all indices in the universe adjusted for the indices that are calibrated and included in each long/short factor index based on data availability.

 

There is significant variation in performance over time.  All four factors have years where they are the best performing factor and years where they are the worst performing factor.  The average annual spread between the best performing factor and the worst performing factor is 11.3%.

Data Source: Bloomberg. Calculations by Newfound Research. 1998 is a partial year beginning in June 1998 and 2017 is a partial year ending in April 2017.

 

The individual long/short factor portfolios are diversifying with respect to both each other (average pairwise correlation of -0.11) and the broad municipal bond market.

Data Source: Bloomberg. Calculations by Newfound Research.

 

Data Source: Bloomberg. Calculations by Newfound Research.

Moving From Single Factor to Multi-Factor Portfolios

The diversified nature of the long/short return streams makes a multi-factor approach hard to beat in terms of risk-adjusted returns.  This is another example of the type of strategy diversification that we have long lobbied for.

As evidence of these benefits, we have built two versions of a portfolio combining the low volatility, value, carry, and momentum factors.  The first version targets an equal dollar allocation to each factor.  The second version uses a naïve risk parity approach to target an approximately equal risk contribution from each factor.
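The two weighting schemes can be sketched as follows. The risk parity version uses simple inverse-volatility weights, which ignores correlations between factors. The volatility inputs below are made-up round numbers for illustration, not the realized factor volatilities, and the function names are ours.

```python
import numpy as np


def equal_dollar_weights(n_factors):
    """Same dollar allocation to every factor."""
    return np.full(n_factors, 1.0 / n_factors)


def inverse_volatility_weights(vols):
    """Naive risk parity: weight each factor by 1 / volatility, then normalize."""
    inv = 1.0 / np.asarray(vols, dtype=float)
    return inv / inv.sum()


# Hypothetical annualized volatilities for four factors.
hypothetical_vols = np.array([0.02, 0.02, 0.04, 0.04])
ew = equal_dollar_weights(4)
rp = inverse_volatility_weights(hypothetical_vols)
# Standalone risk (weight x volatility) is equal across factors under rp.
standalone_risk = rp * hypothetical_vols
```

With these inputs the two lower-volatility factors each receive one third of the capital and the two higher-volatility factors each receive one sixth, so every factor carries the same standalone risk.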

Both approaches outperform all four individual factors on a risk-adjusted basis, delivering Sharpe Ratios of 1.19 and 1.23, respectively, compared to 0.96 for the top single factor (value).

Data Source: Bloomberg. Calculations by Newfound Research. The factor risk parity construction uses a simple inverse volatility methodology. Volatility estimates are shrunk in the early periods when less data is available.

 

To stress this point, diversification is so plentiful across the factors that even the simplest portfolio construction methodologies outperform an investor who was able to identify the best performing factor with perfect foresight.  For additional context, we constructed a “Look Ahead Mean-Variance Optimization (“MVO”) Portfolio” by calculating the Sharpe optimal weights using actual realized returns, volatilities, and correlations.  The Look Ahead MVO Portfolio has a Sharpe Ratio of 1.43, not too far ahead of our two multi-factor portfolios.  The approximate weights in the Look Ahead MVO Portfolio are 49% to Low Volatility, 25% to Value, 15% to Carry, and 10% to Momentum.  While the higher Sharpe Ratio factors (Low Volatility and Value) do get larger allocations, Momentum and Carry are still well represented due to their diversification benefits.

Data Source: Bloomberg. Calculations by Newfound Research.

 

From a risk perspective, both multi-factor portfolios have lower volatility than any of the individual factors and a maximum drawdown that is within 1% of the individual factor with the least amount of historical downside risk.  It’s also worth pointing out that the risk parity construction leads to a return stream that is very close to normally distributed (skew of 0.1 and kurtosis of 3.0).

Data Source: Bloomberg. Calculations by Newfound Research.

 

In the graph below, we present another lens through which we can view the tremendous amount of diversification that can be harvested between factors.  Here we plot how the allocation to a specific factor, using MVO, will change as we vary that factor’s Sharpe Ratio.  We perform this analysis for each factor individually, holding all other parameters fixed at their historical levels.

As an example, to estimate the allocation to the Low Volatility factor at a Sharpe Ratio of 0.1, we:

  1. Assume the covariance matrix is equal to the historical covariance over the full sample period.
  2. Assume the excess returns for the other three factors (Carry, Momentum, and Value) are equal to their historical averages.
  3. Assume the annualized excess return for the Low Volatility factor is 0.16% so that the Sharpe Ratio is equal to our target of 0.1 (Low Volatility’s annualized volatility is 1.6%).
  4. Calculate the MVO optimal weights using these excess return and risk assumptions.
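The four steps above can be sketched as follows. The optimizer here is the unconstrained Sharpe-optimal solution (weights proportional to the inverse covariance matrix times excess returns, normalized to sum to one), which is an assumption on our part since the constraints of the original optimization are not stated. Apart from Low Volatility's 1.6% volatility, every input is a hypothetical placeholder rather than a historical estimate, and the function names are ours.

```python
import numpy as np


def tangency_weights(mu, cov):
    """Unconstrained Sharpe-optimal weights, normalized to sum to one."""
    raw = np.linalg.solve(cov, mu)
    return raw / raw.sum()


def allocation_at_sharpe(target_idx, target_sharpe, mu, cov):
    """Allocation to one factor after resetting its Sharpe Ratio, all else fixed."""
    mu_adj = np.array(mu, dtype=float)           # copy so the caller's input is untouched
    vol = np.sqrt(cov[target_idx, target_idx])   # step 1: covariance held fixed
    mu_adj[target_idx] = target_sharpe * vol     # step 3: implied excess return
    return tangency_weights(mu_adj, cov)[target_idx]  # step 4


# Hypothetical inputs: index 0 plays the role of Low Volatility (1.6% volatility).
vols = np.array([0.016, 0.03, 0.03, 0.03])
corr = np.full((4, 4), -0.1)
np.fill_diagonal(corr, 1.0)
cov = corr * np.outer(vols, vols)
mu = [0.02, 0.02, 0.02, 0.02]                    # step 2: other factors held fixed

alloc_zero = allocation_at_sharpe(0, 0.0, mu, cov)
alloc_low = allocation_at_sharpe(0, 0.1, mu, cov)   # excess return of 0.16%
alloc_high = allocation_at_sharpe(0, 1.0, mu, cov)
```

Even in this toy setup, the target factor receives a positive allocation at a Sharpe Ratio of zero because its negative correlation with the other factors reduces portfolio volatility.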

Data Source: Bloomberg. Calculations by Newfound Research.

 

As expected, Sharpe Ratios and allocation sizes are positively correlated.  Higher Sharpe Ratios lead to higher allocations.

That being said, three of the factors (Low Volatility, Carry, and Momentum) would receive allocations even if their Sharpe Ratios were slightly negative.

The allocations to carry and momentum are particularly insensitive to Sharpe Ratio level.  Momentum would receive an allocation of 4% with a 0.00 Sharpe, 9% with a 0.25 Sharpe, 13% with a 0.50 Sharpe, 17% with a 0.75 Sharpe, and 20% with a 1.00 Sharpe.  For the same Sharpe Ratios, the allocations to Carry would be 10%, 15%, 19%, 22%, and 24%, respectively.

Holding these factors provides a strong ballast within the multi-factor portfolio.

Moving From Long/Short to Long Only

Most investors have neither the space in their portfolio for a long/short muni strategy nor sufficient access to enough affordable leverage to get the strategy to an attractive level of volatility (and hence return).  A more realistic approach would be to layer our factor bets on top of a long only strategic allocation to muni bonds.

In a perfect world, we could slap one of our multi-factor long/short portfolios right on top of a strategic municipal bond portfolio.  The results of this approach (labeled “Benchmark + Equal Weight Factor Long/Short” in the graphics below) are impressive (Sharpe Ratio of 1.17 vs. 0.93 for the strategic benchmark and return to maximum drawdown of 0.72 vs. 0.46 for the strategic benchmark).  Unfortunately, this approach still requires just a bit of shorting. The size of the total short ranges from 0% to 19% with an average of 5%.

We can create a true long only portfolio (“Long Only Factor”) by removing all shorts and normalizing so that our weights sum to one.  Doing so modestly reduces risk, return, and risk-adjusted return, but still leads to outperformance vs. the benchmark.
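A minimal sketch of this long only conversion is shown below: add the long/short overlay to the strategic benchmark, set any negative weights to zero, and rescale so the weights sum to one. The benchmark and overlay values are hypothetical, and the function name `long_only_weights` is ours.

```python
import numpy as np


def long_only_weights(benchmark, overlay):
    """Benchmark plus long/short overlay, with shorts removed and weights renormalized."""
    combined = np.asarray(benchmark, dtype=float) + np.asarray(overlay, dtype=float)
    clipped = np.clip(combined, 0.0, None)  # remove all short positions
    return clipped / clipped.sum()          # normalize so weights sum to one


# Hypothetical six-sector example: equal-weight benchmark plus a dollar-neutral overlay.
benchmark = np.full(6, 1.0 / 6.0)
overlay = np.array([0.10, 0.05, -0.25, 0.05, 0.10, -0.05])
combined = benchmark + overlay
total_short = -combined[combined < 0].sum()  # size of the short before clipping
weights = long_only_weights(benchmark, overlay)
```

In this example only the third sector ends up short after the overlay is applied, so it is dropped to zero and the remaining five sectors are scaled down proportionally.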

Data Source: Bloomberg. Calculations by Newfound Research.

 

Data Source: Bloomberg. Calculations by Newfound Research.

 

Below we plot both the historical and current allocations for the long only factor portfolio.  Currently, the portfolio would have approximately 25% in each of short-term investment grade, pre-refunded, and short-term high yield, with the remaining 25% split roughly 80/20 between high yield and intermediate-term investment grade. There is currently no allocation to long-term investment grade.

Data Source: Bloomberg. Calculations by Newfound Research. All allocations are backtested and hypothetical. The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results.

 

Data Source: Bloomberg. Calculations by Newfound Research. All allocations are backtested and hypothetical. The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results.

 

A few interesting observations relating to the long only portfolio and muni factor investing in general:

  1. The factor tilts lead to clear duration and credit bets over time.  Below we plot the duration and a composite credit score for the factor portfolio vs. the benchmark over time.

    Data source: Calculations by Newfound Research. All allocations are backtested and hypothetical. The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results. Weighted average durations are estimated using current constituent durations.

    Data source: Calculations by Newfound Research. All allocations are backtested and hypothetical. The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results. Weighted average credit scores are estimated using current constituent credit scores. Credit scores use S&P’s methodology to aggregate scores based on the distribution of credit scores of individual bonds.

    Currently, the portfolio is near an all-time low in terms of duration and is slightly tilted towards lower credit quality sectors relative to the benchmark.  Historically, the factor portfolio was most often overweight both duration and credit, having this positioning in 53.7% of the months in the sample.  The second and third most common tilts were underweight duration / underweight credit (22.0% of sample months) and underweight duration / overweight credit (21.6% of sample months).  The portfolio was overweight duration / underweight credit in only 2.6% of sample months.

  2. Even for more passive investors, a factor-based perspective can be valuable in setting strategic allocations.  The long only portfolio discussed above has annualized turnover of 77%.  If we remove the momentum factor, which is by far the biggest driver of turnover, and restrict ourselves to a quarterly rebalance, we can reduce turnover to just 18%.  This does come at a cost, as the Sharpe Ratio drops from 1.12 to 1.04, but historical performance would still be strong relative to our benchmark. This suggests that carry, value, and low volatility may be valuable in setting strategic allocations across municipal bond ETFs with only periodic updates at a normal strategic rebalance frequency.
  3. We ran regressions with our long/short factors on all funds in the Morningstar Municipal National Intermediate category with a track record that extended over our full sample period from June 1998 to April 2017.  Below, we plot the betas of each fund to each of our four long/short factors.  Blue bars indicate that the factor beta was significant at a 5% level.  Gray bars indicate that the factor beta was not significant at a 5% level.  We find little evidence of the active managers following a factor approach similar to what we outline in this post.  Part of this is certainly the result of the constrained nature of the category with respect to duration and credit quality.  In addition, these results do not speak to whether any of the managers use a factor-based approach to pick individual bonds within their defined duration and credit quality mandates.

    Data source: Calculations by Newfound Research. Analysis over the period from June 1998 to April 2017.

    The average beta to the low volatility factor, ignoring non-statistically significant values, is -0.23.  This is most likely a function of category since the category consists of funds with both investment grade credit quality and durations ranging between 4.5 and 7.0 years.  In contrast, our low volatility factor on average has short exposure to the intermediate and long-term investment grade sectors.

    Data source: Calculations by Newfound Research. Analysis over the period from June 1998 to April 2017.

    Only 14 of the 33 funds in the universe have statistically significant exposure to the value factor with an average beta of -0.03.

    Data source: Calculations by Newfound Research. Analysis over the period from June 1998 to April 2017.

    The average beta to the carry factor, ignoring non-statistically significant values, is -0.23.  As described above with respect to low volatility, this is most likely a function of category, as our carry factor favors the long-term investment grade and high yield sectors.

    Data source: Calculations by Newfound Research. Analysis over the period from June 1998 to April 2017.

    Only 9 of the 33 funds in the universe have statistically significant exposure to the momentum factor with an average beta of 0.02.
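The fund-level regressions described in the third observation above can be sketched as follows. This is a minimal, standard-library-only illustration under stated assumptions: the fund and factor returns are simulated (the 226-month length, the true betas, and the noise levels are all hypothetical), and significance is judged with a normal approximation (|t| > 1.96) rather than an exact t-distribution.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(y, factors):
    """Regress y on an intercept plus factor columns; return (betas, t_stats)."""
    x = [[1.0] + list(row) for row in zip(*factors)]
    n, k = len(x), len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(x, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    # Diagonal of (X'X)^-1, obtained one unit vector at a time.
    diag = [
        solve(xtx, [1.0 if i == j else 0.0 for i in range(k)])[j]
        for j in range(k)
    ]
    t_stats = [b / math.sqrt(sigma2 * d) for b, d in zip(beta, diag)]
    return beta, t_stats


# Simulated monthly data; every number below is a hypothetical input.
rng = random.Random(0)
months = 226
names = ["value", "momentum", "carry", "low_vol"]
factors = [[rng.gauss(0.0, 0.01) for _ in range(months)] for _ in names]
true_beta = {"value": 0.0, "momentum": 0.0, "carry": 0.0, "low_vol": -0.25}
fund = [
    0.002
    + sum(true_beta[nm] * f[t] for nm, f in zip(names, factors))
    + rng.gauss(0.0, 0.002)
    for t in range(months)
]

betas, t_stats = ols(fund, factors)
# Skip index 0 (the intercept) when flagging factor betas.
significant = {nm: abs(t) > 1.96 for nm, t in zip(names, t_stats[1:])}
```

In the plots above, a flagged beta would correspond to a blue bar and an unflagged beta to a gray bar.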

Conclusion

Multi-factor investing has generated significant press in the equity space due to the (poorly named) “smart beta” movement.  The popular factors in the equity space have historically performed well both within other asset classes (rates, commodities, currencies, etc.) and across asset classes.  The municipal bond market is no different.  A simple, systematic multi-factor process has the potential to improve risk-adjusted performance relative to static benchmarks.  The portfolio can be implemented with liquid, low-cost ETFs.

Moving beyond active strategies, factors can also be valuable tools when setting strategic sector allocations within a municipal bond sleeve and when evaluating and blending municipal bond managers.

Perhaps more importantly, the out-of-sample evidence for the premier factors (momentum, value, low volatility, carry, and trend) across asset classes, geographies, and timeframes continues to mount.  In our view, this evidence can be crucial in getting investors comfortable with introducing systematic active premia into their portfolios as both return generators and risk mitigators.

 

[1] Computed using yield-to-worst.  Inflation estimates are based on 1-year and 10-year survey-based expected inflation.  We average the value score over the last 2.5 years, allowing the portfolio to realize a greater degree of valuation mean reversion before closing out a position.

[2] We use a rolling 5-year (60-month) window to calculate standard deviation.  We require at least 3 years of data for an index to be included in the low volatility portfolio.  The standard deviation is multiplied by -1 so that higher values are better across all four factor scores.
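The low volatility score in footnote [2] can be sketched as below. This is an illustrative reading of the footnote rather than our exact implementation: we assume monthly returns in a plain list, a sample (not population) standard deviation, and "at least 3 years" interpreted as 36 monthly observations. The example return series are hypothetical.

```python
import statistics


def low_vol_score(monthly_returns, window=60, min_obs=36):
    """Negative standard deviation over the trailing window.

    Returns None when fewer than min_obs observations are available,
    i.e. the index is not yet eligible for the low volatility portfolio.
    """
    recent = monthly_returns[-window:]
    if len(recent) < min_obs:
        return None
    # Multiply by -1 so that a higher score is better, as with the other factors.
    return -statistics.stdev(recent)


# Hypothetical return histories for two indices.
calm = [0.001, -0.001] * 30
wild = [0.03, -0.03] * 30
calm_score = low_vol_score(calm)
wild_score = low_vol_score(wild)
```

With these inputs the calmer series receives the higher (less negative) score, so it would rank ahead of the more volatile one.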

 

 

What are Growth and Value?

This commentary is available as a PDF here.

SUMMARY

  • Growth and value have intuitive definitions, but there are many ways to quantify each.
  • As with broad factors, such as value, momentum, and dividend growth, the specific metrics used to describe growth and value may fall in and out of favor, depending on the market environment.
  • Taking a diversified approach to quantifying value and growth can lead to more consistent performance over time.

In our commentary a few weeks ago, we pointed out a key flaw that many index providers have in their growth and value style indices. The industry norm lumps “low value” in with “growth” and “low growth” in with “value” when, in reality, growth and value are independent characteristics of companies. The result is that many of the growth and value ETFs that track these indices are not giving investors what they expect – or what they want.

Final index construction aside, let’s go down to a more fundamental level: what are growth and value in the first place, and how do we measure them?

Intuitively, growth refers to companies that are growing and are expected to continue growing, and value refers to companies that are currently cheap relative to their fair price.

Simple enough.

But a quick survey of index providers finds that the characteristics they use to measure a stock’s growth and value characteristics vary across the board:

Growth Characteristics:

  • Long-term forward earnings per share (EPS) growth rate (CRSP, MSCI, Russell)
  • Short-term forward EPS growth rate (CRSP, MSCI)
  • Current internal growth rate (MSCI)
  • Long-term historical EPS growth trend (CRSP, MSCI, S&P)
  • Long-term historical sales per share growth trend (CRSP, MSCI, Russell, S&P)
  • 12-month price change (S&P)
  • Investment-to-assets ratio (CRSP)
  • Return on assets, ROA (CRSP)

Value Characteristics:

  • Book-to-price ratio (CRSP, MSCI, S&P, Russell)
  • Forward earnings-to-price ratio (CRSP, MSCI)
  • Earnings-to-price ratio (CRSP, S&P)
  • Sales-to-price ratio (CRSP, S&P)
  • Dividend yield (CRSP, MSCI)

Only one metric on each list is common to all four index providers (sales per share growth trend for growth and book-to-price ratio for value).

So who is right?

We can test the performance of many of these metrics using data readily available online. The forward-looking growth data are more difficult to find historically, but general financial statement data is available on Morningstar’s website.

To keep matters simple, we will look at three metrics for each of growth and value. For growth: 3-year EPS growth, 3-year sales per share growth, and ROA. For value: the P/E, P/S, and P/B ratios.

And to keep things as realistic as possible, we will evaluate the stocks in the S&P 500 as they stood at the end of 2014. Relative to the current set of companies in the S&P 500, we added back in some companies that dropped out of the S&P 500 (mainly energy and materials companies) in 2015. Some mergers and acquisitions also make getting data for the companies more difficult. For example, Covidien was bought by Medtronic, AT&T bought DirecTV, and Kraft merged with Heinz. Since we will be focusing on relative performance differences rather than on absolute ones, we will simply reconstruct a proxy S&P 500 index using the data that is available. In all, our universe contains 481 companies.

Using the fundamental data from December 2014, we can sort based on each metric and select the top 160 companies (about one-third of the universe) and see how that “value” or “growth” portfolio would have performed in 2015. Within each portfolio, we equally weight for simplicity. Results are compared to an equal-weight benchmark to control for any out- or underperformance arising from the equal-weight allocation methodology as opposed to stock selection.
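The sort-and-select step above can be sketched as follows. This is a simplified illustration, not the code behind the charts: the tickers, metric values, and returns are hypothetical, and the `higher_is_better` flag is our own device for handling metrics where a lower value is preferred (e.g. P/E for value). We also assume a single holding period with no intra-period rebalancing, so an equal-weight portfolio's return is the simple average of its constituents' returns.

```python
def top_third_portfolio(metric, returns, n_select=160, higher_is_better=True):
    """Pick the n_select best names by metric and equal-weight them.

    Returns (picks, portfolio_return, excess_over_equal_weight_universe).
    """
    ranked = sorted(metric, key=metric.get, reverse=higher_is_better)
    picks = ranked[:n_select]
    portfolio = sum(returns[t] for t in picks) / len(picks)
    benchmark = sum(returns[t] for t in metric) / len(metric)
    return picks, portfolio, portfolio - benchmark


# Hypothetical six-stock universe (the post uses 481 companies and n_select=160).
roa = {"A": 0.30, "B": 0.10, "C": 0.20, "D": 0.05, "E": 0.15, "F": 0.00}
ret_2015 = {"A": 0.10, "B": -0.02, "C": 0.04, "D": 0.00, "E": 0.02, "F": -0.08}

picks, port_ret, excess = top_third_portfolio(roa, ret_2015, n_select=2)
```

Measuring against the equal-weight universe, rather than the cap-weighted index, isolates the effect of stock selection from the effect of the weighting scheme.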

There is significant variation during the year depending on which metric was used.

Growth portfolios

Source: Data from Yahoo! Finance and Morningstar, calculations by Newfound

Value portfolios

Source: Data from Yahoo! Finance and Morningstar, calculations by Newfound

For growth, all of the portfolios tracked each other until mid-March when the portfolio formed on sales growth began to diverge. The portfolios formed on EPS growth and ROA continued to track each other until mid-June. At this time, ROA rallied hard, eclipsing the sales growth portfolio in the 4th quarter of 2015.

On the value front, the P/S ratio led through most of the year before falling back to the pack in the fall. The P/E and P/B portfolios ended the year in very similar places, with the P/S portfolio eking out a ~65bp benefit over the other two portfolios.

 

Which Metric to Choose

One year is hardly enough data to make a sound judgment as to which metric is the best for selecting growth and value stocks. As we have said many times before, even though we may know a factor (e.g. value) has outperformed in the past and is likely to do so in the future based on behavioral evidence, stating whether that factor will outperform in any given year is tough.

Likewise, deciding which measure of a factor will outperform in a given year is also difficult. Even with value companies, a metric like the P/E ratio may not work well when companies with strong sales experience short-term earnings shocks or when companies are able to inflate earnings based on accounting allowances. The P/B ratio may not work well in periods when service-oriented companies, which rely on intangible human capital as a large driver of growth, are being rewarded in the market.

Let’s take a closer look at some popular ways of quantifying the value factor.

“Value”, as it stands in academic literature, is commonly measured using the P/B ratio. This is what the famous Fama-French Three Factor Model uses as its basis for calculating the value factor, high-minus-low (HML).

However, using data from Kenneth French going back to 1951, we can see that, for long-only portfolios, those formed both on P/E and P/S actually beat the portfolio formed on P/B both on an absolute and risk-adjusted basis.

Table: long-only value portfolios formed on P/B, P/E, and P/S

Furthermore, AQR showed in their 2014 paper, “The Devil in HML’s Details,” that not only does the metric matter, but the method of calculating the metric matters, as well. While Fama and French calculated HML using book value data that was lagged by 6 months to ensure that data would be available, they also lagged price data by the same amount. The AQR paper proposed using the most recent price data for calculating P/B ratios and showed that their method was superior to the standard lagged-price method because using more current price data better captures the relationship between value and momentum.

The P/S and P/E ratios used in the table above are also calculated using lagged price data. Based on AQR’s research, we expect that those results might also be improved by using the current price data.

 

Different Measures of Factors May Ebb and Flow

We should be careful not to rush to judgment though. The fact that P/B has underperformed the other value metrics does not mean we should drop it entirely. It is helpful to remember that individual factors can go through periods of significant underperformance. The same is true for different ways of measuring a single factor. For example, over rolling 12-month periods, the return difference between portfolios formed on P/B, P/S, and P/E – all “value” metrics – has often been in excess of 2000bp!

Put bluntly: your mileage may vary dramatically depending on which value metric you choose.
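To make the rolling comparison concrete, here is a minimal sketch of how a 12-month return spread between metric-based portfolios could be computed. The monthly return series are hypothetical placeholders, and we assume the spread is defined as the best portfolio's rolling return minus the worst portfolio's at each date.

```python
def rolling_return(returns, window=12):
    """Compounded return over each trailing window of monthly returns."""
    out = []
    for end in range(window, len(returns) + 1):
        growth = 1.0
        for r in returns[end - window:end]:
            growth *= 1.0 + r
        out.append(growth - 1.0)
    return out


def rolling_spread_bp(portfolios, window=12):
    """Best-minus-worst rolling return across portfolios, in basis points."""
    rolled = [rolling_return(r, window) for r in portfolios.values()]
    return [(max(vals) - min(vals)) * 1e4 for vals in zip(*rolled)]


# Hypothetical monthly returns for two value portfolios (13 months each).
portfolios = {
    "P/B": [0.00] * 13,
    "P/E": [0.01] * 13,
}
spreads = rolling_spread_bp(portfolios)
```

A 1% monthly gap compounds to a spread of well over 1000bp in a year, so gaps of the size we report do not require extreme monthly differences.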

Portfolios ranges

Source: Data from Kenneth French Data Library, calculations by Newfound

With our 2015 example, we saw that P/S resulted in the best performing portfolio, but as we said before, different measures tend to cycle unpredictably. We can see which ones have been in favor historically by comparing each individual portfolio to the average of all three portfolios.

Single factor

Source: Data from Kenneth French Data Library, calculations by Newfound

The fact that many index providers combine multiple metrics into a composite growth or value score is an acknowledgement of this unpredictability.

Averaging the different value portfolios would have led to a fraction of outperforming periods on par with the best individual portfolios, higher average outperformance than the P/S portfolio, and lower average underperformance than all three individual portfolios.
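The statistics in the comparison above can be sketched as follows. This is an illustration under assumptions: the rolling return series are hypothetical, and the reference series against which out- and underperformance are measured is left as an input, since the appropriate choice depends on the comparison being made.

```python
def average_series(series_list):
    """Equal-weight average of several return series, period by period."""
    return [sum(vals) / len(vals) for vals in zip(*series_list)]


def relative_stats(portfolio, reference):
    """Hit rate and average out/underperformance versus a reference series."""
    diffs = [p - r for p, r in zip(portfolio, reference)]
    wins = [d for d in diffs if d > 0]
    losses = [d for d in diffs if d < 0]
    return {
        "hit_rate": len(wins) / len(diffs),
        "avg_out": sum(wins) / len(wins) if wins else 0.0,
        "avg_under": sum(losses) / len(losses) if losses else 0.0,
    }


# Hypothetical rolling one-year returns for three value portfolios.
pb = [0.10, 0.02, -0.05, 0.08]
pe = [0.06, 0.04, -0.01, 0.06]
ps = [0.08, 0.06, -0.03, 0.10]
blend = average_series([pb, pe, ps])

# Hypothetical reference series (e.g. a broad benchmark).
reference = [0.05, 0.04, -0.02, 0.03]
pb_stats = relative_stats(pb, reference)
blend_stats = relative_stats(blend, reference)
```

Because the blend averages the three series, its worst periods can never be worse than the worst individual portfolio in that same period, which is what drives the lower average underperformance.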

One year periods

Rolling perf

Source: Data from Kenneth French Data Library, calculations by Newfound

If you read our previous commentary about multi-factor portfolio construction, you’ll notice that the averaging we did above is approach #1 (the “or” method). In effect, we are investing in companies that have either low P/S, P/B, or P/E ratios. One way to implement this would be to form portfolios based on each metric and then average the allocations into a final value portfolio.

In practice, most index providers score companies based on each selected metric, normalize the scores, and then average them (sometimes using different weightings). The portfolio is then formed using this composite score. This is more in line with approach #2 from the commentary (the “and” approach), which favors companies that have some degree of combined strength across multiple metrics.
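The difference between the two approaches can be sketched with a toy example. This is a simplified illustration, not any index provider's methodology: the four tickers and their scores are hypothetical, the metrics are expressed as yields (book-to-price, earnings-to-price) so that higher means cheaper, and we assume z-scores as the normalization and equal weights across metrics.

```python
import statistics


def zscores(values):
    """Normalize a metric to zero mean and unit (population) standard deviation."""
    mu = statistics.mean(values.values())
    sd = statistics.pstdev(values.values())
    return {k: (v - mu) / sd for k, v in values.items()}


def or_portfolio(metrics, n):
    """Build one equal-weight top-n portfolio per metric, then average the weights."""
    weights = {}
    for scores in metrics.values():
        for name in sorted(scores, key=scores.get, reverse=True)[:n]:
            weights[name] = weights.get(name, 0.0) + 1.0 / (n * len(metrics))
    return weights


def and_portfolio(metrics, n):
    """Average normalized scores into a composite, then equal-weight the top n."""
    names = list(next(iter(metrics.values())))
    z = [zscores(scores) for scores in metrics.values()]
    composite = {name: sum(s[name] for s in z) / len(z) for name in names}
    picks = sorted(composite, key=composite.get, reverse=True)[:n]
    return {name: 1.0 / n for name in picks}


# Hypothetical scores: A is cheapest on B/P, C is cheapest on E/P,
# and B is fairly cheap on both without leading either list.
metrics = {
    "B/P": {"A": 1.00, "B": 0.80, "C": 0.20, "D": 0.10},
    "E/P": {"A": 0.02, "B": 0.07, "C": 0.08, "D": 0.03},
}
or_weights = or_portfolio(metrics, n=1)
and_weights = and_portfolio(metrics, n=1)
```

Here the "or" approach splits capital between A and C, the leaders on each individual metric, while the "and" approach selects B, which has combined strength across both metrics.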

While we used value and momentum in the commentary to illustrate why using the “and” approach is problematic in multi-factor portfolios, using this approach isn’t as bad when attempting to identify a single factor. The problem with value and momentum stemmed from the difference in time that each factor took to mature. Using the “and” approach introduced drag from the shorter maturity factor.

If there is no convincing argument that one growth or value measure takes longer to mature than another (for instance, that P/S normalizes faster than P/B), then taking the “and” approach is not likely to result in a worse outcome. In this case, where we are simply trying to identify growth or value, we care more about the predictive nature of each metric that goes into forming the portfolio.

The index providers vary considerably with regard to which characteristics they look at and how they weight them to arrive at a final portfolio. If you believe that the P/B ratio is the best determinant of company value, then you will get the purest exposure with Russell. If you think return on assets is an important contributing factor to company growth, CRSP’s index will be more in line with your view.

However, if you are like us and concede that while there are many ways to quantify growth and value, no one method can outperform over every single period, a diversified approach may be your best option.
