The Research Library of Newfound Research


Should I Stay or Should I Growth Now?

This post is available as a PDF download here.

Summary

  • Naïve value factor portfolios have been in a drawdown since 2007.
  • More thoughtful implementations performed well after 2008, with many continuing to generate excess returns versus the market through 2016.
  • Since 2017, however, most value portfolios have experienced a steep drawdown in their relative performance, significantly underperforming glamour stocks and the market as a whole.
  • Many investors are beginning to point to the relative fundamental attractiveness of value versus growth, arguing that value is well poised to out-perform going forward.
  • In this research note, we aim to provide further data for the debate, constructing two different value indices (a style-box driven approach and a factor-driven approach) and measuring the relative attractiveness of fundamental measures versus both the market and growth stocks.

 

“Should I stay or should I go now?
If I go, there will be trouble
And if I stay it will be double”

— The Clash

 

It is no secret that quantitative value strategies have struggled as of late.  Naïve sorts – like the Fama-French HML factor – peaked around 2007, but most quants would stick their noses up and say, “See? Craftsmanship matters.”  Composite metrics, industry-specific scoring, sector-neutral constraints, factor-neutral constraints, and quality screens all helped quantitative value investors stay in the game.

Even a basket of long-only value ETFs didn’t peak against the S&P 500 until mid-2014.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results.  Performance is backtested and hypothetical.  Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions.  The Value ETF basket is an equal-weight portfolio of FVAL, IWD, JVAL, OVLU, QVAL, RPV, VLU, and VLUE, with each ETF being included when it is first available.  Performance of the long/short portfolio is calculated as the monthly return of the Value ETF Basket minus the monthly return of the S&P 500 (“SPY”).

Many strategies were able to keep the mojo going until 2016 or so.  But at that point, the wheels came off for just about everyone.

A decade of under-performance for the most naïve approaches and three-plus years of under-performance for some of the most thoughtful has many people asking, “is quantitative value an outdated idea?  Should we throw in the towel and just buy growth?”

Of course, it should come as no surprise that many quantitative value managers are now clamoring that this is potentially the best time to invest in value since the dot-com bubble.  “No pain, no premium,” as we like to say.

Nevertheless, the question of value’s attractiveness itself is muddied for a variety of reasons:

  • How are we defining value?
  • Are we talking about long/short factors or long-only implementations?
  • Are we talking about the style-box definition or the factor definition of value?

By no means will this commentary be a comprehensive evaluation as to the attractiveness of Value, but we do hope to provide some more data for the debate.

Replicating Style-Box Growth and Value

If you want the details of how we are defining Growth and Value, read on.  Otherwise, you can skip ahead to the next section.

Morningstar invented the style box back in the early 1990s.  Originally, value was simply defined based upon price-to-book and price-to-earnings.  But somewhere along the line, things changed.  Not only was the definition of value expanded to include more metrics, but growth was given an explicit set of metrics to quantify it, as well.

The subtle difference here is rather than measuring cheap versus expensive, the new model more explicitly attempted to capture value versus growth.  The problem – at least in my opinion – is that the model makes it such that the growth-iest fund is now the one that simultaneously ranks the highest on growth metrics and the lowest on value metrics.  Similarly, the value-iest fund is the one that ranks the highest on value metrics and the lowest on growth metrics.  So growth is growing but expensive and value is cheap but contracting.

The index providers took the same path Morningstar did.  For example, while MSCI originally defined value and growth based only upon price-to-book, they later amended it to include not only other value metrics, but growth metrics as well.  S&P Dow Jones and FTSE Russell follow this same general scheme.  Which is all a bit asinine if you ask me.1

Nevertheless, it is relevant to the discussion as to whether value is attractive or not, as value defined by a style-box methodology can differ from value as defined by a factor methodology.  Therefore, to dive under the hood, we created our own “Frankenstein’s style-box” by piecing together different components of S&P Dow Jones’, FTSE Russell’s, and MSCI’s methodologies.

  • The parent universe is the S&P 500.
  • Growth metrics are 3-year earnings-per-share growth, 3-year revenue-per-share growth, internal growth rate2, and 12-month price momentum.3
  • Value metrics are book-to-price4, earnings-to-price5, free-cash-flow-to-price, and sales-to-enterprise-value6.
  • Metrics are all winsorized at the 90th percentile.
  • Z-scores for each Growth and Value metric are calculated using market-capitalization weighted means and standard deviations.
  • An aggregate Growth and Value score is calculated for each security as the sum of the underlying style z-scores.
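As a minimal sketch of the scoring steps above (numpy assumed; the one-sided clip is our reading, since the winsorization convention is not fully specified, and the inputs are illustrative):

```python
import numpy as np

def capweighted_zscores(metric, mktcap, winsor_pct=90):
    """Winsorize a raw style metric, then z-score it using a
    market-capitalization-weighted mean and standard deviation."""
    m = np.asarray(metric, dtype=float)
    # Clip extreme values at the 90th percentile (one-sided here;
    # a two-sided winsorization is another reasonable reading)
    m = np.minimum(m, np.percentile(m, winsor_pct))
    # Market-cap weights
    w = np.asarray(mktcap, dtype=float)
    w = w / w.sum()
    # Cap-weighted mean and standard deviation
    mu = np.sum(w * m)
    sigma = np.sqrt(np.sum(w * (m - mu) ** 2))
    return (m - mu) / sigma

def aggregate_score(metrics, mktcap):
    """Aggregate style score: sum of z-scores across the metrics."""
    return sum(capweighted_zscores(m, mktcap) for m in metrics)
```

By construction, each z-score vector has a cap-weighted mean of zero, so the aggregate score measures a security's style tilt relative to the cap-weighted universe.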

From this point, we basically follow MSCI’s methodology.  Each security is plotted onto a “style space” (see image below) and assigned value and growth inclusion factors based upon the region it falls into.  These inclusion factors represent the proportion of a security’s market cap that can be allocated to the Value or Growth index.

Securities are then sorted by their distance from the origin point.  Starting with the securities that are furthest from the origin (i.e. those with more extreme style scores), market capitalizations are proportionally allocated to Value and Growth based upon their inclusion factors.  Once one style hits 50%, the remaining securities are allocated to the other style regardless of inclusion factors.
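Under simplifying assumptions, the allocation pass might be sketched as follows.  The inclusion-factor rule used here (full weight when a security sits clearly on one side of the style space, a 50/50 split near the diagonal) is a hypothetical stand-in for MSCI's more granular region table:

```python
import numpy as np

def allocate_styles(value_z, growth_z, mktcap, target=0.5):
    """Split each security's market cap between Value and Growth.

    Hypothetical inclusion-factor rule: securities clearly on the
    value side of the style space (value_z - growth_z > 0.5) get a
    100% Value inclusion factor, clearly-growth securities get 0%,
    and those near the diagonal are split 50/50.  (The actual MSCI
    region table is more granular.)
    """
    v = np.asarray(value_z, dtype=float)
    g = np.asarray(growth_z, dtype=float)
    cap = np.asarray(mktcap, dtype=float)
    total = cap.sum()

    diff = v - g
    value_if = np.where(diff > 0.5, 1.0, np.where(diff < -0.5, 0.0, 0.5))

    # Process securities from the most extreme style scores inward
    order = np.argsort(-(v ** 2 + g ** 2))
    value_cap = growth_cap = 0.0
    alloc = {}
    for i in order:
        vi = value_if[i]
        # Once one style reaches 50% of total cap, the remainder
        # goes entirely to the other style
        if value_cap >= target * total:
            vi = 0.0
        elif growth_cap >= target * total:
            vi = 1.0
        alloc[int(i)] = (cap[i] * vi, cap[i] * (1.0 - vi))
        value_cap += cap[i] * vi
        growth_cap += cap[i] * (1.0 - vi)
    return alloc, value_cap, growth_cap
```

Every dollar of market cap ends up in exactly one of the two styles (or split between them), which is what guarantees the roughly 50/50 division described below.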

Source: MSCI.

The result of this process is that each style represents approximately 50% of the total market capitalization of the S&P 500.  The market capitalization for each security will be fully represented in the combination of growth and value and may even be represented in both Value and Growth as a partial weight (though never double counted).

Portfolios are rebalanced semi-annually using six overlapping portfolios.
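A sketch of how six overlapping portfolios might be blended (the staggering scheme shown — one tranche rebalancing each month, each holding its targets for six months — is our assumption about the mechanics):

```python
import numpy as np

def blended_weights(monthly_targets, t, n_tranches=6):
    """Average the holdings of n_tranches overlapping portfolios.

    monthly_targets[m] is the target-weight vector the ranking
    process would produce at month m.  Tranche k rebalances at
    months where m % n_tranches == k and holds those targets for
    n_tranches months; the strategy holds the tranche average.
    Requires t >= n_tranches - 1 so every tranche has rebalanced.
    """
    assert t >= n_tranches - 1
    tranches = []
    for k in range(n_tranches):
        # Most recent month at or before t when tranche k rebalanced
        m = t - ((t - k) % n_tranches)
        tranches.append(np.asarray(monthly_targets[m], dtype=float))
    return np.mean(tranches, axis=0)
```

Overlapping portfolios smooth out rebalance-timing luck: no single rebalance date determines the whole portfolio.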

How Attractive is Value?

To evaluate the relative attractiveness of Growth versus Value, we will evaluate two approaches.

In the first approach, we will make the assumption that fundamentals will not change but prices will revert.  In this approach, we will plot the ratio of price-to-fundamental measures (e.g. price-to-earnings of Growth over price-to-earnings of Value) minus 1.  This can be thought of as how far price would have to revert between the two indices before valuations are equal.

As an example, consider the following two cases.  First, Value has an earnings yield of 2% and Growth has an earnings yield of 1%.  In this case, both are expensive (Value has a P/E of 50 and Growth has a P/E of 100), but the price of Value would have to double (or the price of Growth would have to get cut in half) for their valuations to meet.  As a second case, Value has an earnings yield of 100% and Growth has an earnings yield of 50%.  Both are very cheap, but we would still have to see the same price moves for their valuations to meet.
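The arithmetic of both cases reduces to a one-line helper (multiples here are price-to-fundamental ratios such as P/E):

```python
def required_reversion(growth_multiple, value_multiple):
    """Ratio of price-to-fundamental multiples minus 1: how far the
    price of Value would have to rise (or Growth fall) before the
    two valuations are equal."""
    return growth_multiple / value_multiple - 1.0

# Case 1: both expensive (P/E of 100 vs. 50) -> 100% required move
# Case 2: both cheap (P/E of 2 vs. 1) -> the same 100% required move
```

The expensive pair and the cheap pair imply the same 100% required price move, which is why the measure works as a relative, not absolute, gauge of attractiveness.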

For our second approach, we will assume prices and fundamentals remain constant and ask the question, “how much carry do I earn for this trade?”  Specifically, we will measure shareholder yield (dividend yield plus buyback yield) for each index and evaluate the spread.

In both cases, we will decompose our analysis into Growth versus the Market and the Market versus Value to gain a better perspective as to how each leg of the trade is influencing results.

Below we plot the relative ratio for price-to-book, price-to-earnings, price-to-free-cash-flow, and price-to-sales.

Source: Sharadar.  Calculations by Newfound Research.

A few things stand out:

  • The ratio of Growth’s price-to-book versus the S&P 500’s price-to-book appears to be at 2000-level highs. Even the ratio of the S&P 500’s price-to-book versus Value’s price-to-book appears extreme.  However, the interpretation of this data relies heavily upon whether we believe price-to-book is still a relevant valuation metric.  If not, this result may simply be a byproduct of naïve value construction loading up on financials and ignoring technology companies, leading to an artificially high spread.  The fact that Growth versus the S&P 500 has far out-stripped the S&P 500 versus Value on this metric suggests the result may simply be caused by Growth loading up on industries where the market feels book value is no longer relevant.
  • The ratio of price-to-earnings has certainly increased in the past year for both Growth versus the S&P 500 and the S&P 500 versus Value, suggesting an even larger spread for Growth versus Value. We can see, however, that we are still a far way off from 2000 highs.
  • Ratios for free cash flows actually look to be near 20-year lows.
  • Finally, we can see that ratios in price-to-sales have meaningfully increased in the last few years. Interestingly, Growth versus the S&P 500 has climbed much faster than the S&P 500 versus Value, suggesting that moving from Growth to the S&P 500 may be sufficient for de-risking against reversion.  Again, while these numbers sit at decade highs, they are still well below 2000-era levels.

Below we plot our estimate of carry (i.e. our return expectation given no change in prices): shareholder yield.  Again, we see recent-era highs, but levels still well below 2000 and 2008 extremes.

Source: Sharadar.  Calculations by Newfound Research.

Taken all together, value certainly appears cheaper – and a trade we would likely be paid more to sit on than in the past – but a 2000s-era opportunity seems a stretch.

Growth is not Glamour

One potential flaw in the above analysis is that we are evaluating “Value 1.0” indices.  More modern factor indices drop the “not Growth” aspect of defining value, preferring to focus only on valuation metrics.  Therefore, to acknowledge that investors today may be evaluating the choice of a Growth 1.0 index versus a modern Value factor index, we repeat the above analysis using a Value strategy more consistent with current smart-beta products.

Specifically, we winsorize earnings yield, free-cash-flow yield, and sales yield and then compute market-cap-weighted z-scores.  A security’s Value score is then equal to its average z-score across all three metrics with no mention of growth scores.  The strategy selects the securities in the top quintile of Value scores and weights them in proportion to their value-score-scaled market capitalization.  The strategy is rebalanced semi-annually using six overlapping portfolios.
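A minimal sketch of this construction (numpy assumed; the toy inputs and the assumption that top-quintile scores are positive are ours):

```python
import numpy as np

def zscore_cw(x, cap, winsor_pct=90):
    """Winsorized, market-cap-weighted z-score of a yield metric."""
    x = np.minimum(np.asarray(x, dtype=float),
                   np.percentile(x, winsor_pct))
    w = np.asarray(cap, dtype=float) / np.sum(cap)
    mu = np.sum(w * x)
    sd = np.sqrt(np.sum(w * (x - mu) ** 2))
    return (x - mu) / sd

def value_factor_weights(earn_yield, fcf_yield, sales_yield,
                         mktcap, quintile=0.2):
    """Top-quintile value portfolio, weighted by value-score-scaled
    market cap (assumes the selected scores are positive)."""
    cap = np.asarray(mktcap, dtype=float)
    # Value score: average z-score across the three yield metrics
    score = np.mean([zscore_cw(earn_yield, cap),
                     zscore_cw(fcf_yield, cap),
                     zscore_cw(sales_yield, cap)], axis=0)
    # Select the top quintile by score
    cutoff = np.quantile(score, 1.0 - quintile)
    raw = np.where(score >= cutoff, np.clip(score, 0.0, None) * cap, 0.0)
    return raw / raw.sum()
```

Note the contrast with the style-box construction: there is no growth score anywhere in the pipeline, only cheapness.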

Source: Sharadar.  Calculations by Newfound Research.

We can see:

  • In the Value 1.0 approach, the Growth-versus-S&P 500 leg appeared much more expensive than the S&P 500-versus-Value leg. With a more concentrated approach, the S&P 500 now appears far more expensive versus Value than Growth does versus the S&P 500.
  • Relative price-to-book (despite price-to-book no longer being a focus metric) still appears historically high. While it peaked in Q3 2019, meaningful reversion could still occur.  All the same caveats as before apply, however.
  • Relative price-to-earnings did appear to hit multi-decade highs (excluding the dot-com era) in early 2019. If the prior 6/2016-to-2/2018 reversion is the playbook, then we appear to be halfway home.
  • Relative price-to-free-cash-flow and price-to-sales are both near recent highs, but both below 2008 and dot-com era levels.

Plotting the carry for this trade, we see a more meaningful divergence between Value and Growth.  Furthermore, the carry for bearing Value risk does appear to be at decade highs; however, it is certainly not at extreme levels and has actually reverted from its Q3 2019 highs.

Source: Sharadar.  Calculations by Newfound Research.

Conclusion

In this research note, we sought to explore the current value-of-value.  Unfortunately, it proves to be an elusive question, as the very definition of value is difficult to pin down.

For our first approach, we build a style-box driven definition of Value.  We then plot the relative ratio of four fundamental measures – price-to-book, price-to-earnings, price-to-sales, and price-to-free-cash-flow – of Growth versus the S&P 500 and the S&P 500 versus Value.  We find that both Growth and the S&P 500 look historically expensive on price-to-book and price-to-earnings metrics (implying that Value is very, very cheap), whereas just Growth looks particularly expensive for price-to-sales (implying that Value may not be cheap relative to the Market).  However, none of the metrics look particularly cheap compared to the dot-com era.

We also evaluate Shareholder Yield as a measure of carry, finding that Value minus Growth reached a 20-year high in 2019 if the dot-com and 2008 periods are excluded.

Recognizing that many investors may prefer a more factor-based definition of value, we run the same analysis for a more concentrated value portfolio.  Whereas the first analysis generally pointed to Growth versus the S&P 500 being more expensive than the S&P 500 versus Value trade, the factor-based approach finds the opposite conclusion. Similar to the prior results, Value appears historically cheap for price-to-book, price-to-earnings, and price-to-sales metrics, though it appears to have peaked in Q3 2019.

Finally, the Shareholder Yield spread for the factor approach also appears to be at multi-decade highs ignoring the dot-com and 2008 extremes.

Directionally, this analysis suggests that Value may indeed be cheaper-than-usual.  Whether that cheapness is rational or not, however, is only something we’ll know with the benefit of hindsight.

For further reading on style timing, we highly recommend Style Timing: Value vs Growth (AQR).  For more modern interpretations: Value vs. Growth: The New Bubble (QMA), It’s Time for a Venial Value-Timing (AQR), and Reports of Value’s Death May Be Greatly Exaggerated (Research Affiliates).

 


 

The Limit of Factor Timing

This post is available as a PDF download here.

Summary

  • We have shown previously that it is possible to time factors using value and momentum but that the benefit is not large.
  • By constructing a simple model for factor timing, we examine what accuracy would be required to do better than a momentum-based timing strategy.
  • While the accuracy required is not high, finding the system that achieves that accuracy may be difficult.
  • For investors focused on managing the risks of underperformance – both in magnitude and frequency – a diversified factor portfolio may be the best choice.
  • Investors seeking outperformance will have to bear more concentration risk and may be open to more model risk as they forego the diversification among factors.

A few years ago, we began researching factor timing – moving among value, momentum, low volatility, quality, size etc. – with the hope of earning returns in excess not only of the equity market, but also of buy-and-hold factor strategies.

To time the factors, our natural first course of action was to exploit the behavioral biases that may create the factors themselves. We examined value and momentum across the factors and used these metrics to allocate to factors that we expected to outperform in the future.

The results were positive. However, taking into account transaction costs led to the conclusion that investors were likely better off simply holding a diversified factor portfolio.

We then looked at ways to time the factors using the business cycle.

The results in this case were even less convincing and were a bit too similar to a data-mined optimal solution to instill much faith going forward.

But this evidence does not necessarily remove the temptation to take a stab at timing the factors, especially since explicit transaction costs have been slashed for many investors accessing long-only factors through ETFs.

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions.

After all, there is a lot to gain by choosing the right factors. For example, in the first 9 months of 2019, the spread between the best (Quality) and worst (Value) performing factors was nearly 1,000 basis points (“bps”). One month prior, that spread had been double!

In this research note, we will move away from devising a systematic approach to timing the factors (as AQR asserts, this is deceptively difficult) and instead focus on what a given method would have to overcome to achieve consistent outperformance.

Benchmarking Factor Timing

With all equity factor strategies, the goal is usually to outperform the market-cap weighted equity benchmark.

Since all factor portfolios can be thought of as a market cap weighted benchmark plus a long/short component that captures the isolated factor performance, we can focus our study solely on the long/short portfolio.

Using the common definitions of the factors (from Kenneth French and AQR), we can look at periods over which these self-financing factor portfolios generate positive returns to see if overlaying them on a market-cap benchmark would have added value over different lengths of time.1

We will also include the performance of an equally weighted basket of the four factors (“Blend”).

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. Data from July 1957 – September 2019.

Factor outperformance over one-month periods is transient. If the goal is to outperform most often, then the blended portfolio satisfies this requirement, and any timing strategy would have to be accurate enough to overcome this already existing spread.

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. Data from July 1957 – September 2019.

The results for the blended portfolio are so much better than the stand-alone factors because the factors have much lower correlations to one another than most other asset classes do, allowing even naïve diversification to add tremendous value.

The blended portfolio also cuts downside risk in terms of returns. If the timing strategy is wrong, and chooses, for example, momentum in an underperforming month, then it could take longer for the strategy to climb back to even. But investors are used to short periods of underperformance and often (we hope) realize that some short-term pain is necessary for long-term gains.

Looking at the same analysis over rolling 1-year periods, we do see some longer periods of factor outperformance. Some examples are quality in the 1980s, value in the mid-2000s, momentum in the 1960s and 1990s, and size in the late-1970s.

However, there are also decent stretches where the factors underperform. For example, the recent decade for value, quality in the early 2010s, momentum sporadically in the 2000s, and size in the 1980s and 1990s. If the timing strategy gets stuck in these periods, then there can be a risk of abandoning it.

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. Data from July 1957 – September 2019.

Again, a blended portfolio would have addressed many of these underperforming periods, giving up some of the upside with the benefit of reducing the risk of choosing the wrong factor in periods of underperformance.

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. Data from July 1957 – September 2019.

And finally, if we extend our holding period to three years, which may be used for a slower-moving signal based on either value or the business cycle, we see that the diversified portfolio still outperforms in the largest share of rolling periods and has a strong ratio of upside to downside.

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. Data from July 1957 – September 2019.

The diversified portfolio stands up to scrutiny against the individual factors but could a generalized model that can time the factors with a certain degree of accuracy lead to better outcomes?

Generic Factor Timing

To construct a generic factor timing model, we will consider a strategy that decides to hold each factor or not with a certain degree of accuracy.

For example, if the accuracy is 50%, then the strategy would essentially flip a coin for each factor. Heads and that factor is included in the portfolio; tails and it is left out. If the accuracy is 55%, then the strategy will hold the factor with a 55% probability when the factor return is positive and not hold the factor with the same probability when the factor return is negative. Just to be clear, this strategy is constructed with look-ahead bias as a tool for evaluation.

All factors included in the portfolio are equally weighted, and if no factors are included, then the return is zero for that period.
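The toy model can be sketched as follows (numpy assumed; the look-ahead is deliberate, as noted above):

```python
import numpy as np

def simulate_timing(factor_returns, accuracy, rng):
    """Toy timing model (uses look-ahead by design): each period,
    each factor is 'called' correctly with probability `accuracy` --
    held if its return is positive, skipped if negative; incorrect
    calls do the opposite.  Held factors are equally weighted; if
    nothing is held, the period return is zero."""
    r = np.asarray(factor_returns, dtype=float)
    T, K = r.shape
    # True where the coin flip lands on the correct call
    correct = rng.random((T, K)) < accuracy
    positive = r > 0
    hold = np.where(correct, positive, ~positive)
    out = np.zeros(T)
    for t in range(T):
        if hold[t].any():
            out[t] = r[t, hold[t]].mean()
    return out
```

Running this many times per accuracy level produces the distributions against which the diversified portfolio's hit rate and out/underperformance are compared below.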

This toy model will allow us to construct distributions to see where the blended portfolio of all the factors falls in terms of frequency of outperformance (hit rate), average outperformance, and average underperformance. The following charts show the percentiles of the diversified portfolio for the different metrics and model accuracies using 1,000 simulations.2

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. Data from July 1957 – September 2019.

In terms of hit rate, the diversified portfolio behaves in the top tier of the models over all time periods for accuracies up to about 57%. Even with a model that is 60% accurate, the diversified portfolio was still above the median.

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. Data from July 1957 – September 2019.

For average underperformance, the diversified portfolio also did very well in the context of these factor timing models. The low correlation between the factors leads to opportunities for the blended portfolio to limit the downside of individual factors.

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. Data from July 1957 – September 2019.

For average outperformance, the diversified portfolio did much worse than the timing model over all time horizons. We can attribute this also to the low correlation between the factors, as choosing only a subset of factors and equally weighting them often leads to more extreme returns.

Overall, the diversified portfolio manages the risks of underperformance, both in magnitude and in frequency, at the expense of sacrificing outperformance potential. We saw this in the first section when we compared the diversified portfolio to the individual factors.

But if we want to have increased return potential, we will have to introduce some model risk to time the factors.

Checking in on Momentum

Momentum is one model-based way to time the factors. Under our definition of accuracy in the toy model, a 12-1 momentum strategy on the factors has an accuracy of about 56%. While the diversified portfolio exhibited some metrics in line with strategies that were even more accurate than this, it never bore concentration risk: it always held all four factors.
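Under our reading of the accuracy definition, a 12-1 momentum call can be scored as follows (a sketch using compounded trailing returns; the sign-matching convention is our assumption):

```python
import numpy as np

def momentum_accuracy(monthly_returns):
    """Fraction of months where the sign of 12-1 momentum (trailing
    twelve-month compounded return, skipping the most recent month)
    matches the sign of the next month's return."""
    r = np.asarray(monthly_returns, dtype=float)
    hits = total = 0
    for t in range(12, len(r)):
        # Signal uses months t-12 through t-2 (skips month t-1)
        signal = np.prod(1.0 + r[t - 12:t - 1]) - 1.0
        realized = r[t]
        if signal == 0.0 or realized == 0.0:
            continue  # no call made / no resolution
        total += 1
        hits += int((signal > 0) == (realized > 0))
    return hits / total
```

Applying this per factor and averaging is one way to arrive at an aggregate accuracy figure like the ~56% quoted above.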

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. Data from July 1957 – September 2019.

For the hit rate percentiles of the momentum strategy, we see a more subdued response. Momentum does not win as much as the diversified portfolio over the different time periods.

But not winning as much can be fine if you win bigger when you do win.

The charts below show that momentum does indeed have a higher outperformance percentile but a worse underperformance percentile, especially over 1-month periods, likely due to mean-reversionary whipsaw.

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. Data from July 1957 – September 2019.

While momentum is definitely not the only way to time the factors, it is a good baseline to see what is required for higher average outperformance.

Now, turning back to our generic factor timing model, what accuracy would you need to beat momentum?

Sharpening our Signal

The answer is: not a whole lot. Most of the time, we only need to be about 53% accurate to beat the momentum-based factor timing.

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. 

The caveat is that this is the median performance of the simulations. The accuracy figure climbs closer to 60% if we use the 25th percentile as our target.

While these may not seem like extremely high requirements for running a successful factor timing strategy, it is important to observe that not many investors are doing this. True accuracy may be hard to discover, and sticking with the system may be even harder when the true accuracy can never be known.

Conclusion

If you made it this far looking for some rosy news on factor timing or the Holy Grail of how to do it skillfully, you may be disappointed.

However, for most investors looking to generate some modest benefits relative to market-cap equity, there is good news. Any signal for timing factors does not have to be highly accurate to perform well, and in the absence of a signal for timing, a diversified portfolio of the factors can lead to successful results by the metrics of average underperformance and frequency of underperformance.

For those investors looking for higher outperformance, concentration risk will be necessary.

Any timing strategy on low correlation investments will generally forego significant diversification in the pursuit of higher returns.

While this may be the goal when constructing the strategy, we should always pause and determine whether the potential benefits outweigh the costs. Transaction costs may be lower now. However, there are still operational burdens and the potential stress caused by underperformance when a system is not automated or when results are tracked too frequently.

Factor timing may be possible, but timing and tactical rotation may be better suited to scenarios where some of the model risk can be mitigated.

Style Surfing the Business Cycle

This post is available as a PDF download here.

Summary

  • In this commentary, we ask whether we should consider rotating factor exposure based upon the business cycle.
  • To eliminate a source of model risk, we assume perfect knowledge of future recessions, allowing us to focus only on whether prevailing wisdom about which factors work during certain economic phases actually adds value.
  • Using two models of factor rotation and two definitions of business cycles, we construct four timing portfolios and ultimately find that rotating factor exposures does not add meaningful value above a diversified benchmark.
  • We find that the cycle-driven factor rotation recommendations are extremely close to data-mined optimal results. The similarity of the recommendations coupled with the lackluster performance of conventional style timing recommendations may highlight how fragile the rotation process inherently is.

Just as soon as the market began to meaningfully adopt factor investing, someone had to go and ask, “yeah, but can they be timed?”  After all, while the potential opportunity to harvest excess returns is great, who wants to live through a decade of relative drawdowns like we’re seeing with the value factor?

And thus the great valuation-spread factor timing debates of 2017 were born and from the ensuing chaos emerged new, dynamic factor rotation products.

There is no shortage of ways to test factor rotation: valuation-spreads, momentum, and mean-reversion to name a few.  We have even found mild success using momentum and mean reversion, though we ultimately question whether the post-cost headache is worth the potential benefit above a well-diversified portfolio.

Another potential idea is to time factor exposure based upon the state of the economic or business cycle.

It is easy to construct a narrative for this approach.  For example, it sounds logical that you might want to hold higher quality, defensive stocks during a recession to take advantage of the market’s flight-to-safety.  On the other hand, it may make sense to overweight value during a recovery to exploit larger mispricings that might have occurred during the contraction period.

An easy counter-example, however, is the performance of value during the last two recessions.  During the dot-com fallout, cheap out-performed expensive by a wide margin.  This fit a wonderful narrative of value as a defensive style of investing: we are buying assets at a discount to intrinsic value and therefore establishing a margin of safety.

Of course, we need only look towards 2008 to see a very different scenario.  From peak to trough, AQR’s HML Devil factor had a drawdown of nearly 40% during that crisis.

Two recessions with two very different outcomes for a single factor.  But perhaps there is still hope for this approach if we diversify across enough factors and apply it over the long run.

The problem we face with business cycle style timing is really two-fold.  First, we have to be able to identify the factors that will do well in a given market environment.  Equally important, however, is our ability to predict the future economic environment.

Philosophically, there are limitations in our ability to accurately identify both simultaneously.  After all, if we could predict both perfectly, we could construct an arbitrage.

If we believe the markets are at all efficient, then being able to identify the factors that will out-perform in a given state of the business cycle should lead us to conclude that we cannot predict the future state of the business cycle. Similarly, if we believe we can predict the future state of the business cycle, we should not be able to predict which factors will necessarily do well.

Philosophical arguments aside, we wanted to test the efficacy of this approach. 

Which Factors and When?

Rather than simply perform a data-mining exercise to determine which factors have done well in each economic environment, we wanted to test prevalent beliefs about factor performance and economic cycles.  To do this, we identified marketing and research materials from two investment institutions that tie factor allocation recommendations to the business cycle.

Both models expressed a view using four stages of the economic environment: a slowdown, a contraction, a recovery, and an economic expansion.

Model #1

  • Slowdown: Momentum, Quality, Low Volatility
  • Contraction: Value, Quality, Low Volatility
  • Recovery: Value, Size
  • Expansion: Value, Size, Momentum

Model #2

  • Slowdown: Quality, Low Volatility
  • Contraction: Momentum, Quality, Low Volatility
  • Recovery: Value, Size
  • Expansion: Value, Size, Momentum
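The two rotation schedules above can be encoded directly as lookup tables.  The dictionaries below are our own hypothetical shorthand (phase and factor names are ours, not the institutions’), but they make it easy to see exactly where the models agree and disagree:

```python
# Hypothetical encoding of the two style-rotation models described above.
# Phase and factor labels are our own shorthand, not the institutions'.
MODEL_1 = {
    "slowdown":    ["momentum", "quality", "low_vol"],
    "contraction": ["value", "quality", "low_vol"],
    "recovery":    ["value", "size"],
    "expansion":   ["value", "size", "momentum"],
}

MODEL_2 = {
    "slowdown":    ["quality", "low_vol"],
    "contraction": ["momentum", "quality", "low_vol"],
    "recovery":    ["value", "size"],
    "expansion":   ["value", "size", "momentum"],
}

# The models only disagree in the slowdown and contraction phases.
disagreements = [p for p in MODEL_1 if set(MODEL_1[p]) != set(MODEL_2[p])]
```

Note that the disagreement is confined entirely to the recessionary half of the cycle; both models hold identical recovery and expansion baskets.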

Defining the Business Cycle

Given these models, our next step was to identify the prevailing economic environment.  Rather than build a predictive model, however, we decided to dust off our crystal ball.  After all, if business-cycle-based factor rotation does not work with perfect foresight of the economic environment, what hope do we have when we must predict the environment?

We elected to use the National Bureau of Economic Research’s (“NBER”) listed history of US business cycle expansions and contractions.  With the benefit of hindsight, NBER dates each recession as the period running from a peak in the business cycle to the subsequent trough.

Unfortunately, NBER only provides a simple indicator as to whether a given month is in a recession or not.  We were left to fill in the blanks around what constitutes a slowdown, a contraction, a recovery, and an expansionary period.  Here we settled on two definitions:

Definition #1

  • Slowdown: The first half of an identified recession
  • Contraction: The second half of an identified recession
  • Recovery: The first third of a non-recessionary period
  • Expansion: The remaining part of a non-recessionary period

Definition #2

  • Slowdown: The 12 months leading up to a recession
  • Contraction: The identified recessionary periods
  • Recovery: The 12 months after an identified recession
  • Expansion: The remaining non-recessionary period

For definition #2, in the case where two recessions were 12 or fewer months apart (as was the case in the 1980s), the intermediate period was split evenly between recovery and slowdown.
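Definition #1 is straightforward to mechanize.  The sketch below, a hypothetical implementation of ours (the function name and interface are assumptions, not code from the study), takes a monthly boolean NBER recession indicator and splits each recessionary run into halves and each non-recessionary run into a first third and a remainder:

```python
import pandas as pd

def label_phases_v1(recession: pd.Series) -> pd.Series:
    """Label each month per Definition #1, given a boolean monthly
    NBER recession indicator (True = recession).  Hypothetical sketch."""
    labels = pd.Series(index=recession.index, dtype=object)
    # Assign an id to each run of consecutive months with the same state.
    regime_id = (recession != recession.shift()).cumsum()
    for _, block in recession.groupby(regime_id):
        n = len(block)
        if block.iloc[0]:
            # Recessionary run: first half is the slowdown,
            # second half the contraction.
            half = n // 2
            labels.loc[block.index[:half]] = "slowdown"
            labels.loc[block.index[half:]] = "contraction"
        else:
            # Non-recessionary run: first third is the recovery,
            # the remainder the expansion.
            third = n // 3
            labels.loc[block.index[:third]] = "recovery"
            labels.loc[block.index[third:]] = "expansion"
    return labels
```

Definition #2 (fixed 12-month windows around each recession) would require a similar pass but with calendar-based offsets rather than proportional splits.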

Implementing Factor Rotation

After establishing the rotation rules and using our crystal ball to identify the different periods of the business cycle, our next step was to build the factor rotation portfolios.

We first sourced monthly long/short equity factor returns for size, value, momentum, and quality from AQR’s data library.  To construct a low-volatility factor, we used portfolios sorted on variance from the Kenneth French library and subtracted bottom-quintile returns from top-quintile returns.

As the goal of our study is to identify the benefit of factor timing, we de-meaned the monthly returns by the average of all factor returns in that month to identify relative performance.

We constructed four portfolios using the two factor rotation definitions and the two economic cycle definitions.  Generically, at the end of each month, we would use the next month’s economic cycle label to identify which factors to hold in our portfolio.  Identified factors were held in equal weight.
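The backtest loop itself can be sketched in a few lines.  The function below is a minimal, hypothetical illustration of the procedure described above (de-mean cross-sectionally, then equal-weight the prescribed factors each month); it is not the study’s actual code, and the interface is our own assumption:

```python
import pandas as pd

def rotation_backtest(factor_returns: pd.DataFrame,
                      phase_labels: pd.Series,
                      model: dict) -> pd.Series:
    """Hypothetical sketch: de-mean each month's factor returns
    cross-sectionally, then hold (equal-weight) the factors the model
    prescribes for that month's business-cycle phase."""
    # De-mean by the equal-weight average of all factors each month, so
    # results measure performance relative to a diversified factor basket.
    demeaned = factor_returns.sub(factor_returns.mean(axis=1), axis=0)
    out = pd.Series(index=factor_returns.index, dtype=float)
    for dt in factor_returns.index:
        held = model[phase_labels.loc[dt]]
        out.loc[dt] = demeaned.loc[dt, held].mean()
    return out
```

One consequence of the de-meaning step is that a month in which the model holds every factor earns exactly zero relative return, by construction.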

Below we plot the four equity curves.  Remember that these series are generated using de-meaned return data and thus reflect out-performance relative to an equal-weight factor benchmark.

 Source: NBER, AQR, and Kenneth French Data Library. Calculations by Newfound Research. Returns are backtested and hypothetical. Returns assume the reinvestment of all distributions.  Returns are gross of all fees.  None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary.  You cannot invest in an index.

It would appear that even with a crystal ball, conventional wisdom about style rotation and business cycles may not hold.  And even where it might, we can see multi-decade periods where it adds little-to-no value.

Data-Mining Our Way to Success

If we are going to use a crystal ball, we might as well just blatantly data-mine our way to success and see what we learn along the way.

To achieve this goal, we can simply look at the annualized de-meaned returns of each factor during each period of the business cycle.

Source: NBER, AQR, and Kenneth French Data Library.  Calculations by Newfound Research.  Returns are backtested and hypothetical.  Returns assume the reinvestment of all distributions.  Returns are gross of all fees.  None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary.  You cannot invest in an index.
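Mechanically, this phase-by-phase table is a simple group-by.  A minimal sketch, under the assumption of monthly data and simple annualization (arithmetic monthly mean times twelve; the function name is ours):

```python
import pandas as pd

def phase_table(factor_returns: pd.DataFrame,
                phase_labels: pd.Series) -> pd.DataFrame:
    """Annualized de-meaned return of each factor within each
    business-cycle phase.  Hypothetical sketch; annualizes as the
    arithmetic monthly mean times 12."""
    demeaned = factor_returns.sub(factor_returns.mean(axis=1), axis=0)
    return demeaned.groupby(phase_labels).mean() * 12
```

Selecting, for each phase, the factors with the largest entries in this table is precisely the data-mined "optimal" rotation discussed below.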

Despite two different definitions of the business cycle, we can see strong alignment in which factors work when.  Slowdowns / pre-recessionary periods are tilted towards momentum and defensive factors like quality and low-volatility.  Momentum may seem like a curious factor here, but its high turnover may give it a chameleon-like nature that can tilt it defensively in certain scenarios.

In a recession, momentum is replaced with value while quality and low-volatility remain. In the initial recovery, small-caps, value, and momentum are favored.  In this case, while value may actually be benefiting from multiple expansion, small-caps may simply be a way to play higher beta.  Finally, momentum is strongly favored during an expansion.

Yet even a data-mined solution is not without its flaws.  Below we plot rolling 3-year returns for our data-mined timing strategies.  Again, remember that these series are generated using de-meaned return data and thus reflect out-performance relative to an equal-weight factor benchmark.

Source: NBER, AQR, and Kenneth French Data Library.  Calculations by Newfound Research.  Returns are backtested and hypothetical.  Returns assume the reinvestment of all distributions.  Returns are gross of all fees.  None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary.  You cannot invest in an index.

Despite a crystal ball telling us what part of the business cycle we are in and completely data-mined results, there are still a number of 3-year periods with low-to-negative results.  And we have not even considered manager costs, transaction costs, or taxes yet.

A few more important things to note.

Several of these factors exhibit strong negative performance during certain parts of the market cycle, indicating a potential for out-performance by taking the opposite side of the factor.  For example, value appears to do poorly during pre-recession and expansion periods.  One hypothesis is that during expansionary periods, markets tend to over-extrapolate earnings growth potential, favoring growth companies that appear more expensive.

We should also remember that our test is on long/short portfolios and may not necessarily be relevant for long-only investors.  While we can think of a long-only portfolio as a market-cap portfolio plus a long/short portfolio, the implicit long/short is not necessarily identical to academic factor definitions.

Finally, it is worth considering that these results are data-mined over a 50+ year period, which may allow outlier events to dramatically skew the results.  Momentum, for example, famously exhibited dramatic crashes during the Great Depression and in the 2008-crisis, but may have actually relatively out-performed in other recessions.

Conclusion

In this commentary we sought to answer the question, “can we use the business cycle to time factor exposures?”  Assuming access to a crystal ball that could tell us where we stood precisely in the business cycle, we found that conventional wisdom about factor timing did not add meaningful value over time.  We do not hold out much hope, based on this conventional wisdom, that someone without a crystal ball would fare much better.

Despite explicitly trying to select models that reflected conventional wisdom, we found a significant degree of similarity between their recommendations and those that came from blindly data-mining optimal results.  Nevertheless, even slight differences in recommendations led to lackluster results.

The similarities between data-mined results and conventional wisdom, however, should give us pause.  While the argument for conventional wisdom is often a well-articulated economic rationale, we have to wonder whether we have simply fooled ourselves with a narrative that has been inherently constructed with the benefit of hindsight.
