
Summary

  • After stumbling across a set of old research notes from 2009 and 2012, we attempt to implement a Cyclicals versus Defensives sector trade out-of-sample.
  • Post-2012 returns prove unconvincing and we find little evidence supporting the notion that PMI changes can be used for constructing this trade.
  • Using data from the Kenneth French website, we extend the study to 1948, and similarly find that changes in PMI (regardless of lookback period) are not an effective signal for trading Cyclical versus Defensive sectors.

I love coming across old research because it allows for truly out-of-sample testing.

Earlier this week, I stumbled across a research note from 2009 and a follow-up note from 2012, both exploring the use of macro-based signals for constructing dollar-neutral long/short sector trades.  Specifically, the pieces focused on using manufacturing Purchasing Managers' Indices (PMIs) as a predictor for Cyclical versus Defensive sectors.1

The strategy outlined is simple: when the prior month change in manufacturing PMI is positive, the strategy is long Cyclicals and short Defensives; when the change is negative, the strategy is long Defensives and short Cyclicals.  The intuition behind this signal is that PMIs provide a guide to hard economic activity.
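To make the rule concrete, here is a minimal Python sketch of the signal and the resulting long/short return stream. The function names and list-based inputs are our own illustrative choices, not anything taken from the original notes. Two assumptions are worth flagging: the notes do not say what to do when PMI is unchanged, so the sketch simply stays flat, and we assume each month's PMI reading is known before the following month's returns accrue.

```python
def pmi_signal(pmi, lookback=1):
    """Return +1 when PMI rose over `lookback` months, -1 when it fell.

    Assumption: 0 (no position) when the change is zero or when there is
    not yet enough history to compute the change.
    """
    signal = []
    for t in range(len(pmi)):
        if t < lookback:
            signal.append(0)
            continue
        change = pmi[t] - pmi[t - lookback]
        signal.append((change > 0) - (change < 0))
    return signal


def long_short_returns(pmi, cyclicals, defensives, lookback=1):
    """Monthly returns of the dollar-neutral trade.

    The signal observed at the end of month t-1 is applied to the
    Cyclicals-minus-Defensives return spread in month t.
    """
    signal = pmi_signal(pmi, lookback)
    returns = []
    for t in range(1, len(pmi)):
        spread = cyclicals[t] - defensives[t]
        returns.append(signal[t - 1] * spread)
    return returns


# Made-up numbers purely for illustration.
pmi = [50.0, 51.0, 50.5, 50.5, 52.0]
cyclicals = [0.00, 0.02, -0.01, 0.03, 0.01]
defensives = [0.00, 0.01, 0.01, 0.01, 0.02]
print(pmi_signal(pmi))
print(long_short_returns(pmi, cyclicals, defensives))
```

Lagging the signal by one month is the important design choice here: using the same month's PMI change to trade the same month's returns would introduce look-ahead bias.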

The sample period for the initial test is from 1998 to 2009, a period over which the strategy performed quite well on a global basis and even better when using the more forward-looking ratio of new orders to inventory.

Red flags start to go up, however, when we read the second note from 2012.  “It appears that the new orders-to-inventory ratio has lost its ability to forecast the output index.”  “In addition, the optimal lookback period … has shifted from one to two months.”

At this point, we can believe one of a few things:

  • The initial strategy works, has simply hit a rough patch in the three years after publishing, and will work again in the future.
  • The initial strategy worked but has broken since publishing.
  • The initial strategy never worked and was an artifact of datamining.

I won’t even bother addressing the whole “one-month versus two-month” comment. Long-time readers know where we come down on ensembles versus parameter specification…

Fortunately, we do not have to pass qualitative judgement: we can let the numbers speak for themselves.

While the initial notes focused on global implementation, we can rebuild the strategy using U.S. equity sectors with U.S. manufacturing PMI as the driving signal. This serves as an out-of-sample test across assets and also provides approximately 7 more years of out-of-sample time over which to evaluate returns.

Below we plot the results of this strategy for both 1-month and 2-month lookback periods, highlighting the in-sample and out-of-sample periods for each specification based upon the date the original research notes were published.  We use the State Street Select Sector SPDR ETFs as our implementation vehicles, with the exception of Telecom, for which we use the iShares Dow Jones US Telecom ETF.

Source: CSI Data; Quandl. Calculations by Newfound Research.  Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

 

The first thing we notice is that the original 1-month implementation, which appeared to work on a global scale, does not seem particularly robust when implemented with U.S. sectors.  Results after the publication date do not fare much better.

The 2-month specification, however, does appear to work reasonably well both in- and out-of-sample.

But is there something inherently magical about that two-month specification?  We are hard-pressed to find a narrative explanation.

If we plot lookback specifications from 3 to 12 months, we see that the 2-month specification proves to be a significant outlier. Given the high correlation among all the other specifications, it is more likely that the 2-month lookback was the beneficiary of luck than that it captured a particular edge.
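One way to put a number on "high correlation" is to compute the average pairwise correlation of monthly returns across specifications. The sketch below is illustrative only: the helper names and the dictionary layout (lookback in months mapped to a list of monthly returns) are our assumptions, and the return figures are invented rather than taken from our results.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length return series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_pairwise_correlation(returns_by_lookback):
    """Mean correlation over every pair of lookback specifications."""
    pairs = list(combinations(sorted(returns_by_lookback), 2))
    total = sum(
        pearson(returns_by_lookback[a], returns_by_lookback[b])
        for a, b in pairs
    )
    return total / len(pairs)


# Invented monthly returns keyed by lookback (in months).
returns_by_lookback = {
    3: [0.010, -0.020, 0.015, 0.005],
    4: [0.012, -0.018, 0.011, 0.002],
    5: [0.008, -0.022, 0.017, 0.006],
}
print(average_pairwise_correlation(returns_by_lookback))
```

If most specifications move together and one stands apart, the simplest explanation for the standout is chance rather than a uniquely informative parameter.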

Source: CSI Data; Quandl. Calculations by Newfound Research.  Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

 

Perhaps we’re not giving this idea enough breathing room.  After all, were we to evaluate most value strategies in the most recent decades, we’d likely declare them insignificant as well.

With manufacturing PMI data extending back to 1948, we can use sector index data from the Kenneth French website to reconstruct this strategy.

Unfortunately, the Kenneth French definitions do not match GICS perfectly, so we have to change the definition of Cyclicals and Defensives slightly.  Using the Kenneth French data, we will define Cyclicals to be Manufacturing, Non-Durables, Technology, and Shops. Defensives are defined to be Durables, Telecom, Health Care, and Utilities.

We use the same strategy as before, going long Cyclicals and short Defensives when changes in PMI are positive, and short Cyclicals and long Defensives when changes in PMI are negative.  We again vary the lookback period from 1 to 12 months.
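The lookback sweep can be sketched in a few lines of Python. Treat this as a rough illustration under stated assumptions rather than our exact implementation: the sector keys simply reuse the names in the text (not the data library's own column labels), the baskets are equal-weighted (the weighting scheme is an assumption), a zero PMI change means no position, and all specifications start on the same month so their results are comparable.

```python
CYCLICALS = ["Manufacturing", "Non-Durables", "Technology", "Shops"]
DEFENSIVES = ["Durables", "Telecom", "Health Care", "Utilities"]


def basket(sector_returns, names):
    """Equal-weighted monthly basket return (equal weighting is assumed)."""
    n_months = len(sector_returns[names[0]])
    return [
        sum(sector_returns[name][t] for name in names) / len(names)
        for t in range(n_months)
    ]


def sweep_lookbacks(pmi, sector_returns, lookbacks=range(1, 13)):
    """Map each lookback (in months) to its long/short monthly returns."""
    cyc = basket(sector_returns, CYCLICALS)
    dfn = basket(sector_returns, DEFENSIVES)
    start = max(lookbacks) + 1  # common start month for every lookback
    results = {}
    for k in lookbacks:
        series = []
        for t in range(start, len(pmi)):
            # PMI change over k months, measured as of the prior month.
            change = pmi[t - 1] - pmi[t - 1 - k]
            side = (change > 0) - (change < 0)
            series.append(side * (cyc[t] - dfn[t]))
        results[k] = series
    return results


# Tiny made-up example: Cyclicals earn 2% a month, Defensives 1%.
pmi = [50.0, 51.0, 53.0, 52.0, 51.0]
sector_returns = {name: [0.02] * 5 for name in CYCLICALS}
sector_returns.update({name: [0.01] * 5 for name in DEFENSIVES})
print(sweep_lookbacks(pmi, sector_returns, lookbacks=(1, 2)))
```

Aligning every specification to the same start month costs a few observations for the short lookbacks, but it keeps differences in results from being driven by differences in sample period.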

Source: Kenneth French Data Library; Quandl. Calculations by Newfound Research. Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

 

The results are less than convincing.  Not only do we see significant dispersion across implementations, but there is also no consistency in those implementations that do well versus those that do not.

Perhaps worse, the best performing variation only returned a paltry 1.40% annualized gross of any implementation costs.  Once we start accounting for transaction costs, slippage, and management fees, this figure deflates towards zero rather quickly.
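To see how quickly costs can erode a small gross return, consider the following sketch. The function names and the cost model are hypothetical: a fixed charge per unit of position change is a simplification, and the 10 basis point figure is an illustrative assumption, not an estimate of actual trading costs.

```python
def annualized_return(monthly_returns):
    """Geometric annualized return from a list of monthly returns."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    years = len(monthly_returns) / 12
    return growth ** (1 / years) - 1


def net_of_costs(monthly_returns, positions, cost_per_unit_turnover):
    """Subtract a cost proportional to each month's change in position.

    `positions` holds the +1 / -1 / 0 stance for each month; flipping
    from +1 to -1 counts as two units of turnover.
    """
    net = []
    previous = 0
    for r, position in zip(monthly_returns, positions):
        turnover = abs(position - previous)
        net.append(r - cost_per_unit_turnover * turnover)
        previous = position
    return net


# Made-up example: a strategy that flips its stance every month.
gross = [0.002, 0.001, 0.003, 0.000, 0.002, 0.001] * 2
positions = [1, -1] * 6
net = net_of_costs(gross, positions, cost_per_unit_turnover=0.001)
print(annualized_return(gross), annualized_return(net))
```

Because the signal can flip as often as monthly, turnover is potentially high, which is why even modest per-trade costs matter so much for a strategy with a low gross return.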

Source: Kenneth French Data Library; Quandl. Calculations by Newfound Research. Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

Conclusion

There is no shortage of quantitative research in the market and the research can be particularly compelling when it seems to fit a pre-existing narrative.

Cyclicals versus Defensives are a perfect example.  Their very names imply the regimes during which they are supposed to add value, but actually translating this notion into a robust strategy proves to be less than easy.

I would make the philosophical argument that it quite simply cannot be easy.  Consider the two pieces of information we need to believe for this strategy to work:

  • Cyclicals outperform Defensives in an economic expansion and Defensives outperform Cyclicals in an economic contraction.
  • We can forecast economic expansions and contractions before they are priced into the market.

If we have very high confidence in both statements, it effectively implies an arbitrage.

Therefore, if we have very high confidence in the truth of the first statement, then for markets to be reasonably efficient, we must have little confidence in the second statement.

Similarly, if we have very high confidence in the truth of the second statement, then for markets to be reasonably efficient, we must have little confidence in the first statement.

Thus, a more reasonable expectation might be that Cyclicals tend to outperform Defensives during an expansion, and Defensives tend to outperform Cyclicals in a contraction, but there may be meaningful exceptions depending upon the particular cycle.

Furthermore, we may believe we have an edge in forecasting expansions and contractions (perhaps not with just PMI, though), but there will be many false positives and false negatives along the way.

Taken together, we might believe we can construct such a strategy, but errors in both assumptions will lead to periods of frustration.  However, we should recognize that for such an “open secret” strategy to work in the long run, there have to be troughs of sorrow deep enough to avoid permanent crowding.

In this case, we believe there is little evidence to suggest that level changes in PMI provide particular insight into Cyclicals versus Defensives, but that does not mean there are no macro signals that might.