The Research Library of Newfound Research

Pursuing Factor Purity

This post is available as a PDF download here.

Summary

  • Factors play an important role for quantitative portfolio construction.
  • How a factor is defined and how a factor portfolio is constructed play important roles in the results achieved.
  • Naively constructed portfolios – such as most “academic” factors – can lead to latent style exposures and potentially large unintended bets.
  • Through numerical techniques, we can seek to develop pure factors that provide targeted exposure to one style while neutralizing exposure to the rest.
  • In this research note, we implement a regression-based and an optimization-based approach to achieving pure factor portfolios and report the results achieved.

Several years ago, we penned a note titled Separating Ingredients and Recipe in Factor Investing (May 21, 2018).  In the note we discussed why we believe it is important for investors and allocators to consider not just what ingredients are going into their portfolios – i.e. securities, styles, asset classes, et cetera – but the recipe by which those ingredients are combined.  Far too often the ingredients are given all the attention, but mistake salt for sugar and I can guarantee that you’re not going to enjoy your cake, regardless of the quality of the salt.

As an example, the note focused on constructing momentum portfolios.  By varying the momentum measure, lookback period, rebalance frequency, portfolio construction, weighting scheme, and sector constraints we constructed over 1,000 momentum strategies.  The resulting dispersion between the momentum strategies was more-often-than-not larger than the dispersion between generic value (top 30% price-to-book) and momentum (top 30% by 12-1 prior returns).

Yet having some constant definition for factor portfolios is desirable for a number of reasons, including both alpha signal generation and return attribution.

One potential problem for naïve factor construction – e.g. a simple characteristic rank-sort – is that it can lead to time-varying correlations between factors.

For example, below we plot the correlation between momentum and value, size, growth, and low volatility factors.  We can see significant time-varying behavior; for example, in 2018 momentum and low volatility exhibited moderate negative correlation, while in 2019 they exhibited significant positive correlation.

The risk of time-varying correlations is that they can potentially lead to the introduction of unintended bets within single- or multi-factor portfolios or make it more difficult to determine with accuracy a portfolio’s sensitivity to different factors.

More broadly, low and stable correlations are preferable – assuming they can be achieved without meaningfully sacrificing expected returns – because they should allow investors to develop portfolios with lower volatility and higher information ratios.

Naively constructed equity styles can also exhibit time-varying correlations to traditional economic factors (e.g. interest rate risk), risk premia (e.g. market beta) or risk factors (e.g. sector or country exposure).

But equity styles can even exhibit time-varying sensitivities to themselves.  For example, below we multiply the weights of naively constructed long/short style portfolios against the characteristic z-scores for the underlying holdings.  As the characteristics of the underlying securities change, so does the actual weighted characteristic score of the portfolio.  While some signals stay quite steady (e.g. size), others can vary substantially; sometimes value is just more value-y.

Source: Sharadar.  Calculations by Newfound Research.  Factor portfolios are self-financing long/short portfolios that are long the top quintile and short the bottom quintile of securities, equally weighted and rebalanced monthly, ranked based upon their specific characteristics (see below).
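To make the weighted characteristic score concrete, here is a minimal sketch of the calculation described above.  The function name and the five-stock example are hypothetical illustrations, not data from our study.

```python
import numpy as np

def weighted_exposure(weights, z_scores):
    """Portfolio-level characteristic exposure: sum over securities of w_i * z_i."""
    return float(np.dot(weights, z_scores))

# Hypothetical five-stock universe: long the two highest scores, short the two lowest.
z = np.array([1.5, 0.5, 0.0, -0.5, -1.5])
w = np.array([0.5, 0.5, 0.0, -0.5, -0.5])

exposure = weighted_exposure(w, z)  # 0.75 + 0.25 + 0.25 + 0.75 = 2.0
```

Holding the weights fixed, any drift in the underlying z-scores changes this number, which is the time variation the plot is meant to capture.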

In the remainder of this note, we will explore two approaches to constructing “pure” factor portfolios: portfolios that target a single style while neutralizing exposure to risk factors and other style premia.

Using the S&P 500 as our parent universe, we will construct five different factors defined by the security characteristics below:

  • Value (VAL): Earnings yield, free cash flow yield, and revenue yield.
  • Size (SIZE): Negative log market capitalization.
  • Momentum (MOM): 12-1 month total return.
  • Quality (QUAL): Return on equity1, negative accruals ratio, negative leverage ratio2.
  • Low Volatility (VOL): Negative 12-month realized volatility.

All characteristics are first cross-sectionally winsorized at the 5th and 95th percentiles, then cross-sectionally z-scored, and finally averaged (if a style is represented by multiple scores) to create a single score for each security.
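The scoring steps above can be sketched as follows.  This is a minimal illustration, assuming each raw characteristic arrives as a one-dimensional array aligned by security with no missing values; the function names and the simulated inputs are ours for illustration only.

```python
import numpy as np

def winsorize(x, lower=5, upper=95):
    """Clip values to the cross-sectional 5th and 95th percentiles."""
    lo, hi = np.percentile(x, [lower, upper])
    return np.clip(x, lo, hi)

def zscore(x):
    """Cross-sectional z-score (population standard deviation)."""
    return (x - x.mean()) / x.std()

def style_score(characteristics):
    """Winsorize, then z-score each characteristic, then average across them."""
    z = [zscore(winsorize(np.asarray(c, dtype=float))) for c in characteristics]
    return np.mean(z, axis=0)

# Hypothetical raw inputs for a three-metric style such as value.
rng = np.random.default_rng(0)
raw = [rng.normal(size=500) for _ in range(3)]
score = style_score(raw)  # one composite score per security
```

Note that the averaged score is not re-standardized here, so a multi-metric style will generally have a cross-sectional standard deviation below one.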

Naively constructed style benchmarks are 100% long the top-ranked quintile of securities and 100% short the bottom-ranked quintile, with securities receiving equal weights.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results.  Performance is backtested and hypothetical.  Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions.  
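For reference, the naive quintile benchmark described above can be sketched in a few lines.  The function name is hypothetical, and ties in the scores are broken arbitrarily by the sort, which is an assumption of this sketch rather than a documented choice.

```python
import numpy as np

def naive_long_short_weights(scores, quantile=0.2):
    """100% long the top quintile and 100% short the bottom quintile, equally weighted."""
    scores = np.asarray(scores, dtype=float)
    n = len(scores)
    k = int(np.floor(n * quantile))   # number of names in each leg
    order = np.argsort(scores)        # ascending by score
    w = np.zeros(n)
    w[order[-k:]] = 1.0 / k           # long leg sums to +100%
    w[order[:k]] = -1.0 / k           # short leg sums to -100%
    return w

# Hypothetical ten-stock universe with scores 0..9.
w = naive_long_short_weights(np.arange(10.0))
```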

Factor Mimicry with Fama-MacBeth

Our first approach to designing “pure” factor portfolios is inspired by Fama-MacBeth (1973)3.  Fama-MacBeth regression is a two-step approach:

  1. Regress each security against proposed risk factors to determine the security’s beta for that risk factor;
  2. Regress all security returns for a fixed time period against the betas to determine the risk premium for each factor.
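As a rough sketch of the two steps above, the following uses ordinary least squares for both passes.  The function name, array shapes, and the noise-free simulated data are assumptions made for illustration; many implementations also include an intercept in the second pass, which we omit here.

```python
import numpy as np

def fama_macbeth(returns, factors):
    """returns: (T, N) security returns; factors: (T, K) factor returns.

    Returns the estimated betas (N, K) and per-period risk premia (T, K).
    """
    T, _ = returns.shape
    # Step 1: time-series regression of each security on the factors (with intercept).
    X = np.column_stack([np.ones(T), factors])
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)   # shape (K + 1, N)
    betas = coef[1:].T                                   # shape (N, K)
    # Step 2: cross-sectional regression of each period's returns on the betas.
    premia, *_ = np.linalg.lstsq(betas, returns.T, rcond=None)  # shape (K, T)
    return betas, premia.T

# Hypothetical noise-free data, so both steps recover the inputs exactly.
rng = np.random.default_rng(0)
true_betas = rng.normal(size=(50, 2))
factors = rng.normal(scale=0.02, size=(60, 2))
returns = factors @ true_betas.T

betas, premia = fama_macbeth(returns, factors)
average_premia = premia.mean(axis=0)   # time-series average of the period estimates
```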

Similarly, we will assume a factor model where the return for a given security can be defined as:

$$R_i = \beta_{i,m} R_m + \sum_j \beta_{i,j} RF_j + \epsilon_i$$

where R_m is the return of the market and RF_j is the return for some risk factor.  In this equation, the betas define a security’s sensitivity to a given risk factor.  However, instead of using the Fama-MacBeth two-step approach to solve for the factor betas, we can replace the betas with factor characteristic z-scores.

Using these known scores, we can both estimate the factor returns using standard regression4 and extract the weights of the factor mimicking portfolios.  The upside to this approach is that each factor mimicking portfolio will, by design, have constant unit exposure to its specific factor characteristic and zero exposure to the others.
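A minimal sketch of how the weights fall out of the regression: with an exposure matrix X (securities by factors), the ordinary least squares estimator is (X'X)^-1 X'r, so each row of (X'X)^-1 X' is a set of security weights.  The function name and simulated inputs below are hypothetical.

```python
import numpy as np

def mimicking_weights(X):
    """X: (N, K) exposures.  Returns (K, N); row k holds the weights for factor k."""
    return np.linalg.solve(X.T @ X, X.T)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # hypothetical z-scores for three styles
r = rng.normal(scale=0.05, size=100)   # hypothetical one-period security returns

W = mimicking_weights(X)
exposures = W @ X          # approximately the 3x3 identity matrix
factor_returns = W @ r     # identical to the OLS coefficients of r on X
```

Because W @ X is the identity, each portfolio has unit exposure to its own characteristic and zero exposure to the others, which is the property described above.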

Here we should note that unless an intercept is added to the regression equation, the factor mimicking portfolios will be beta-neutral but not dollar-neutral.  This can have a substantial impact on factors like low volatility (VOL), where we expect our characteristics to be informative about risk-adjusted returns but not absolute returns.  We can see the impact of this choice in the factor return graphs plotted below.5
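The effect of the intercept can be checked numerically.  In this sketch, which uses simulated betas and z-scores rather than our actual data, adding a column of ones to the regression forces every style portfolio to have zero exposure to that column, which is the same as saying its weights sum to zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
beta = rng.normal(1.0, 0.3, size=n)   # hypothetical market betas
z = rng.normal(size=(n, 2))           # hypothetical z-scores for two styles

X_no_intercept = np.column_stack([beta, z])
X_intercept = np.column_stack([np.ones(n), beta, z])

W_no_intercept = np.linalg.pinv(X_no_intercept)   # rows: beta, style 1, style 2
W_intercept = np.linalg.pinv(X_intercept)         # rows: intercept, beta, style 1, style 2

# Net dollar exposure of the style portfolios under each specification.
net_no_intercept = W_no_intercept[1:].sum(axis=1)  # generally non-zero
net_intercept = W_intercept[2:].sum(axis=1)        # zero up to rounding

# Beta exposure of the style portfolios is zero in both cases.
beta_no_intercept = W_no_intercept[1:] @ beta
beta_intercept = W_intercept[2:] @ beta
```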

Furthermore, by utilizing factor z-scores, this approach will neutralize characteristic exposure, but not necessarily return exposure.  In other words, correlations between factor returns may not be zero.  A further underlying assumption of this construction is that an equal-weight portfolio of all securities is style neutral.  Given that equal-weight portfolios are generally considered to embed positive size and value tilts, this is an assumption we should be cognizant of.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results.  Performance is backtested and hypothetical.  Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. 

Attempting to compare these mimic portfolios versus our original naïve construction is difficult as they target a constant unit of factor exposure, varying their total notional exposure to do so.  Therefore, to create an apples-to-apples comparison, we adjust both sets of factors to target a constant volatility of 5%.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results.  Performance is backtested and hypothetical.  Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. 
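One simple way to put two return streams on a common volatility footing is sketched below.  We assume monthly returns and a single full-sample scaling factor; whether the scaling is applied ex post or on a rolling basis is a choice of this sketch, and the simulated returns are hypothetical.

```python
import numpy as np

def scale_to_target_vol(returns, target=0.05, periods_per_year=12):
    """Rescale a long/short return series so its annualized volatility equals target."""
    returns = np.asarray(returns, dtype=float)
    realized = returns.std(ddof=1) * np.sqrt(periods_per_year)
    return returns * (target / realized)

rng = np.random.default_rng(0)
monthly = rng.normal(0.002, 0.03, size=240)   # hypothetical monthly factor returns
scaled = scale_to_target_vol(monthly)
realized_vol = scaled.std(ddof=1) * np.sqrt(12)
```

Since the portfolios are self-financing, this amounts to a constant change in notional exposure and leaves the information ratio unchanged.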

We can see that neutralizing market beta and other style factors leads to an increase in annualized return for value, size, momentum, and quality factors, leading to a corresponding increase in information ratio.  Unfortunately, none of these results are statistically significant at a 5% threshold.

Nevertheless, it may still be informative to take a peek under the hood to see how the weights shook out.  Below we plot the average weight by security characteristic percentile (at each rebalance, securities are sorted into percentile score bins and their weights are summed together; weights in each bin are then averaged over time).

Before reviewing the weights, however, it is important to recall that each portfolio is designed to capture a constant unit exposure to a style and therefore total notional exposure will vary over time.  To create a fairer comparison across factors, then, we scale the weights such that each leg has constant 100% notional exposure.
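The binning and leg-scaling steps can be sketched as below for a single rebalance date; averaging the resulting arrays across dates gives the plotted profile.  The function names and the ten-stock example are hypothetical.

```python
import numpy as np

def scale_legs(weights):
    """Rescale so the long leg sums to +100% and the short leg to -100%."""
    w = np.asarray(weights, dtype=float)
    long_leg = np.clip(w, 0.0, None)
    short_leg = np.clip(w, None, 0.0)
    return long_leg / long_leg.sum() + short_leg / abs(short_leg.sum())

def weights_by_bin(weights, scores, n_bins=100):
    """Sum the weights of securities falling into each score-rank bin."""
    ranks = np.argsort(np.argsort(scores))      # 0 = lowest score
    bins = (ranks * n_bins) // len(scores)      # bin index for each security
    return np.bincount(bins, weights=weights, minlength=n_bins)

# Hypothetical ten-stock example, binned into quintiles for readability.
scores = np.arange(10.0)
raw_w = np.array([-0.2, -0.2, 0, 0, 0, 0, 0, 0, 0.1, 0.3])
profile = weights_by_bin(scale_legs(raw_w), scores, n_bins=5)
```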

As we would generally expect, all the factors are overweight high-scoring securities and underweight low-scoring securities.  What is interesting to note, however, is that the shapes by which they achieve their exposure are different.  Value, for example, leans strongly into top-decile securities, whereas quality leans heavily away from (i.e. shorts) the bottom decile.  Unlike the other factors, which are largely positively sloped in their weights, low volatility exhibits fairly constant positive exposure above the 50th percentile.

What may come as a surprise to many is how diversified the portfolios appear to be across securities.  This is because the regression result is equivalent to minimizing the sum of squared weights subject to target exposure constraints.

Source: Sharadar.  Calculations by Newfound Research.
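The equivalence between the regression weights and a minimum sum-of-squares portfolio can be sketched as follows.  The notation is ours: X is the matrix of characteristic exposures (securities by factors), w is the weight vector, and e_k is the unit vector selecting the target factor.

$$\min_{w} \; \tfrac{1}{2} w^\top w \quad \text{subject to} \quad X^\top w = e_k$$

$$\mathcal{L} = \tfrac{1}{2} w^\top w - \lambda^\top \left( X^\top w - e_k \right), \qquad \frac{\partial \mathcal{L}}{\partial w} = w - X \lambda = 0 \;\Rightarrow\; w = X \lambda$$

$$X^\top X \lambda = e_k \;\Rightarrow\; w = X \left( X^\top X \right)^{-1} e_k$$

This w is the k-th row of (X'X)^-1 X', so w'r is exactly the ordinary least squares estimate of the k-th factor return.  Minimizing the sum of squared weights penalizes concentration, which is why the resulting portfolios spread across so many securities.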

While we focused specifically on neutralizing style exposure, this approach can be extended to also neutralize industry / sector exposure (e.g. with dummy variables), region exposure, and even economic factor exposure.  Special care must be taken, however, to address potential issues of multi-collinearity.

Pure Quintile Portfolios with Optimization

Liu (2016)6 proposes an alternative means for constructing pure factor portfolios using an optimization-based approach.  Specifically, long-only quintile portfolios are constructed such that:

  • They minimize the sum of squared weights;
  • Their weighted characteristic exposure for the target style is equal to the weighted characteristic exposure of a naïve, equally-weighted, matching quintile portfolio; and
  • Weighted characteristic exposure for non-targeted styles equals zero.

While the regression-based approach was fast due to its closed-form solution, an optimization-based approach can potentially allow for greater flexibility in objectives and constraints.
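A minimal sketch of this type of optimization is shown below using SciPy’s SLSQP solver.  This is our own illustration under stated assumptions, not the implementation in Liu (2016): we assume the portfolio may hold any security in the universe, is fully invested, and that the simulated z-scores make the constraints feasible.  The function name and inputs are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def pure_quintile_weights(Z, target_col, quintile_mask):
    """Long-only, fully invested weights with minimum sum of squares.

    Z: (N, K) style z-scores.  The target style matches the exposure of the naive
    equal-weight quintile (quintile_mask); all other style exposures are set to zero.
    """
    n, k = Z.shape
    target = np.zeros(k)
    target[target_col] = Z[quintile_mask, target_col].mean()
    constraints = [
        {"type": "eq", "fun": lambda w: w.sum() - 1.0, "jac": lambda w: np.ones(n)},
        {"type": "eq", "fun": lambda w: Z.T @ w - target, "jac": lambda w: Z.T},
    ]
    result = minimize(
        lambda w: w @ w,                 # objective: sum of squared weights
        np.full(n, 1.0 / n),             # start from equal weights
        jac=lambda w: 2.0 * w,
        bounds=[(0.0, 1.0)] * n,         # long-only
        constraints=constraints,
        method="SLSQP",
        options={"maxiter": 500, "ftol": 1e-12},
    )
    return result.x

rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 3))                  # hypothetical z-scores for three styles
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
top = Z[:, 0] >= np.quantile(Z[:, 0], 0.8)     # naive top quintile on style 0
w_top = pure_quintile_weights(Z, target_col=0, quintile_mask=top)
```

Repeating the call with a bottom-quintile mask and subtracting the two weight vectors would give a dollar-neutral long/short portfolio in the spirit of the construction described here.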

Below we replicate the approach proposed in Liu (2016) and then create dollar-neutral long/short factor portfolios by going long the top quintile portfolio and short the bottom quintile portfolio.  Portfolios are re-optimized and rebalanced monthly.  Unlike the regression-based approach, however, these portfolios do not seek to be beta-neutral.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results.  Performance is backtested and hypothetical.  Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. 

We can see that the general shapes of the factor equity curves remain largely similar to the naïve implementations.  Unlike the results reported in Liu (2016), however, we measure a decline in return among several factors (e.g. value and size).  We also find that annualized volatility is meaningfully reduced for all the optimized portfolios; taken together, information ratio differences are statistically indistinguishable from zero.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results.  Performance is backtested and hypothetical.  Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. 

As with the regression-based approach, we can also look at the average portfolio exposures over time to characteristic ranks.  Below we plot these results for both the naïve and optimized Value quintiles.  We can see that the top and bottom quintiles lean heavily into top- and bottom-decile securities, while 2nd, 3rd, and 4th quintiles had more diversified security exposure on average.  Similar weighting profiles are displayed by the other factors.

Source: Sharadar.  Calculations by Newfound Research.

Conclusion

Factors are easy to define in general but difficult to define explicitly.  Commonly accepted academic definitions are easy to construct and track, but often at the cost of inconsistent style exposure and the risk of latent, unintended bets.  Such impure construction may lead to time-varying correlations between factors, making it more difficult for managers to manage risk as well as disentangle the true source of returns.

In this research note we explored two approaches that attempt to correct for these issues: a regression-based approach and an optimization-based approach.  With each approach, we sought to eliminate non-target style exposure, resulting in a pure factor implementation.

Despite a seemingly well-defined objective, we still find that how “purity” is defined can lead to different results.  For example, in our regression-based approach we targeted unit style exposure and beta-neutrality, allowing total notional exposure to vary.  In our optimization-based approach, we constructed long-only quintiles independently, targeting the same weighted-average characteristic exposure as a naïve, equal-weight factor portfolio.  We then built a long/short implementation from the top and bottom quintiles.  The results between the regression-based and optimization-based approaches were markedly different.

And, statistically, not any better than the naïve approaches.

This is to say nothing of other potential choices we could make about defining “purity.”  For example, what assumptions should we make about industry, sector, or regional exposures?

More broadly, is “purity” even desirable?

In Do Factors Market Time? (June 5, 2017) we demonstrated that beta timing was an unintentional byproduct of naïve value, size, and momentum portfolios and had actually been a meaningful tailwind for value from 1927-1957.  Some factors might actually be priced across industries rather than just within them (Vyas and van Baren (2019)7).  Is the chameleon-like nature of momentum to rapidly tilt towards whatever style, sector, or theme has been recently outperforming a feature or a bug?

And this is all to say nothing of the actual factor definitions we selected.

While impurity may be a latent risk for factor portfolios, we believe this research suggests that purity is in the eye of the beholder.

Factor Orphans

Summary

  • To generate returns that are different than the market, we must adopt a positioning that is different than the market.
  • With the increasing adoption of systematic factor portfolios, we explore whether an anti-factor stance can generate contrarian-based profits.
  • Specifically, we explore the idea of factor orphans: stocks that are not included in any factor portfolio at a given time.
  • To identify these stocks, we replicate four popular factor indices: the S&P 500 Enhanced Value index, the S&P 500 Momentum index, the S&P 500 Low Volatility index, and the S&P 500 Quality index.
  • On average, there are over 200 stocks in the S&P 500 that are orphaned at any given time.
  • Generating an equal-weight portfolio of these stocks does not exhibit meaningfully different performance than a naïve equal-weight S&P 500 portfolio.

Contrarian investing is nothing new.  Holding a variant perception to the market is often cited as a critical component to generating differentiated performance.  The question in the details is, however, “contrarian to what?”

In the last decade, we’ve witnessed a dramatic rise in the popularity of systematically managed active strategies.  These so-called “smart beta” portfolios seek to harvest documented risk premia and market anomalies and implement them with ruthless discipline.

But when massively adopted, do these strategies become the commonly-held view and therefore more efficiently priced into the market?  Would this mean that the variant perception would actually be buying those securities totally ignored by these strategies?

This is by no means a new idea.  Morningstar has long maintained its Unloved strategy that purchases the three equity categories that have witnessed the largest outflows at the end of the year.  A few years ago, Vincent Deluard constructed a “DUMB” beta portfolio that included all the stocks shunned by popular factor ETFs.  In the short out-of-sample period over which the strategy was tested, it largely kept pace with an equal-factor portfolio.  More recently, a Bank of America research note claimed that a basket of most-hated securities – as defined by companies neglected by mutual funds and shorted by hedge funds – had tripled the S&P 500’s return over the past year.

The approach certainly has an appealing narrative: as the crowd zigs to adopt smart beta, we zag.  But has it worked?

To test this concept, we wanted to identify what we call “factor orphans”: those securities not held by any factor portfolio.  Once identified, we can build a portfolio holding these stocks and track its performance over time.

As quants, this idea strikes us as a little crazy.  A stock not held in a value, momentum, low volatility, or quality index is likely one that is expensive, highly volatile, with poor fundamentals and declining performance.  Precisely the type of stock factor investing would tell us not to own.

But perhaps the fact that these securities are orphaned means that there are no more sellers: the major cross-section of market strategies have already abandoned the stock.  Thus, stepping in to buy them may allow us to offload them later when they are picked back up by these systematic approaches.

Perhaps this idea is crazy enough it just might work…

To test this idea, we first sought to replicate four common factor benchmarks: the S&P 500 Enhanced Value index, the S&P 500 Momentum index, the S&P 500 Low Volatility index and the S&P 500 Quality index.  Once replicated, we can use the underlying baskets as being representative of the holdings for factor portfolios in general.

Results of our replication efforts are plotted below.  We can see that our models fit the shape of most of the indices closely, with very close fits for the Momentum and Low Volatility portfolios.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results.  Performance is backtested and hypothetical.  Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. 

The Quality replication represents the largest deviation from the underlying index, but still approximates the shape of the total return profile rather closely.  This gives us confidence that the portfolio we constructed is a quality portfolio (which should come as no surprise, as securities were selected based upon common quality metrics), but the failure to more closely replicate this index may represent a thorn in our ability to identify truly orphaned stocks.

At the end of each month, we identify the set of all securities held by any of the four portfolios.  The securities in the S&P 500 (at that point in time) but not in the factor basket are the orphaned stocks.  Somewhat surprisingly, we find that approximately 200 names are orphaned at any given time, with the number reaching as high as 300 during periods when underlying factors converge.
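The monthly orphan identification reduces to a set difference.  A minimal sketch, with a hypothetical function name and made-up tickers standing in for the index constituents and replicated factor baskets:

```python
def find_orphans(universe, factor_holdings):
    """Securities in the universe that no factor portfolio holds."""
    held = set().union(*factor_holdings.values())
    return set(universe) - held

# Hypothetical tickers for a single month-end.
universe = ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF"]
factor_holdings = {
    "value": ["AAA", "BBB"],
    "momentum": ["BBB", "CCC"],
    "low_volatility": ["AAA"],
    "quality": ["CCC", "DDD"],
}
orphans = find_orphans(universe, factor_holdings)  # {"EEE", "FFF"}
```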

Also interesting is that the actual overlap in holdings in the factor portfolios is quite low, rarely exceeding 30%.  This is likely due to the rather concentrated nature of the indices selected, which hold only 100 stocks at a given time.

Source: Sharadar.  Calculations by Newfound Research.

Once our orphaned stocks are identified, we construct a portfolio that holds them in equal weight.  We rebalance our portfolio monthly to sell those stocks that have been acquired by a factor portfolio and roll into those securities that have been abandoned.

We plot the results of our exercise below as well as an equally weighted S&P 500 benchmark.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results.  Performance is backtested and hypothetical.  Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. 

While the total return is modestly less (but certainly not statistically significantly so), what is most striking is how little deviation there is in the orphaned stock portfolio versus the equal-weight benchmark.

However, as we have demonstrated in the past, the construction choices in a portfolio can have a significant impact upon the realized results.  As we look at the factor portfolios themselves, we must acknowledge that they represent relative tilts to the benchmark, and that the absence of one security might actually represent a significantly smaller relative underweight to the benchmark than the absence of another.  Or the absence of one security may actually represent a smaller relative underweight than another that is actually included.

Therefore, as an alternative test we construct an equal-weight factor portfolio and subtract the S&P 500 market-capitalization weights.  The result is the implied over- and under-weights of the combined factor portfolios.  We then rank securities to select the 100 most under-weight securities each month and hold them in equal weight.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results.  Performance is backtested and hypothetical.  Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. 
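The most-underweight selection above can be sketched as follows.  The function name and the five-stock weights are hypothetical; in the actual test the universe is the S&P 500 and 100 names are selected.

```python
import numpy as np

def most_underweight(factor_weights, cap_weights, n_select):
    """Blend factor portfolios equally, subtract cap weights, pick the largest underweights."""
    combined = np.mean(factor_weights, axis=0)   # equal-weight blend of the factor portfolios
    active = combined - np.asarray(cap_weights)  # implied over- and under-weights
    picks = np.argsort(active)[:n_select]        # most negative active weights first
    return picks, active

# Hypothetical five-stock universe with two factor portfolios.
cap = np.array([0.40, 0.30, 0.15, 0.10, 0.05])
factors = np.array([
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5, 0.0],
])
picks, active = most_underweight(factors, cap, n_select=2)
# active = [-0.15, -0.30, 0.35, 0.15, -0.05]; picks = [1, 0]
```

In this toy example the fifth security is a factor orphan yet carries only a small underweight, while the first security is held by a factor portfolio and still ranks among the most underweight.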

Of course, we didn’t actually have to perform this exercise had we stepped back to think for a moment.  We generally know that these (backtested) factors have out-performed the benchmark.  Therefore, selecting stocks that they are underweight means we’re taking the opposite side of the factor trade, which we know has not worked.

Which does draw an important distinction between most underweight and orphaned.  It would appear that factor orphans do not necessarily create the strong anti-factor tilt the way that the most underweight portfolio does.

For the sake of completeness, we can also evaluate the portfolios containing securities held in just one of the factor portfolios, two of the factor portfolios, three of the factor portfolios, or all of the factor portfolios at a given time.
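Grouping securities by the number of factor portfolios that hold them is a simple counting exercise.  A minimal sketch with a hypothetical function name and made-up tickers:

```python
from collections import Counter

def group_by_factor_count(universe, factor_holdings):
    """Map k -> set of securities held by exactly k factor portfolios."""
    counts = Counter(t for holdings in factor_holdings.values() for t in set(holdings))
    groups = {k: set() for k in range(len(factor_holdings) + 1)}
    for ticker in universe:
        groups[counts[ticker]].add(ticker)   # Counter returns 0 for unheld tickers
    return groups

universe = ["AAA", "BBB", "CCC", "DDD", "EEE"]
factor_holdings = {
    "value": ["AAA", "BBB"],
    "momentum": ["AAA", "BBB", "CCC"],
    "low_volatility": ["AAA"],
    "quality": ["AAA", "DDD"],
}
groups = group_by_factor_count(universe, factor_holdings)
```

Here groups[0] contains the orphans, and groups[4] contains any security held by all four factors at once.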

Below we plot the count of securities in such portfolios over time.  We can see that it is very uncommon to identify securities that are simultaneously held by all the factors, or even three of the factors, at once.

Source: Sharadar.  Calculations by Newfound Research.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results.  Performance is backtested and hypothetical.  Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. 

We can see that the portfolio built from stocks held in just one factor (“In One”) closely mimics the portfolio built from stocks held in no factor (“In Zero”), which in turn mimics the S&P 500 Equal Weight portfolio.  This is likely because the portfolios include so many securities that they effectively bring you back to the index.

On the other end of the spectrum, we see the considerable risks of concentration manifest in the portfolios built from stocks held in three or four of the factors.  The portfolio comprised of stocks held in all four factors simultaneously (“In Four”) not only goes long stretches of holding nothing at all, but is also subject to large bouts of volatility due to the extreme concentration.

We also see this for the portfolio that holds stocks held by three of the factors simultaneously (“In Three”).  While this portfolio has modestly more diversification – and even appears to out-perform the equal-weight benchmark – the concentration risk finally materializes in 2018-2019, causing a dramatic drawdown.

The portfolio holding stocks held in just two of the factors (“In Two”), though, appears to offer some out-performance opportunity.  Perhaps by forcing just two factors to agree, we strike a balance between confirmation among signals and portfolio diversification.

Unfortunately, our enthusiasm quickly wanes when we realize that this portfolio closely matches the results achieved just by naively equally-weighting exposure among the four factor portfolios themselves, which is far more easily implemented.

Source: Sharadar.  Calculations by Newfound Research.  Past performance is not an indicator of future results.  Performance is backtested and hypothetical.  Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions.

Conclusion

To achieve differentiated results, we must take a differentiated stance from the market.  As systematic factor portfolios are more broadly adopted, we should consider asking ourselves if taking an anti-factor stance might lead to contrarian-based profits.

In this study, we explore the idea of factor orphans: stocks not held by any factor portfolio at a given time.  Our hypothesis is that these orphaned securities may be systematically over-sold, leading to an opportunity for future out-performance if they are re-acquired by the factor portfolios at a later date.

We begin by replicating four factor indices: the S&P 500 Enhanced Value index, the S&P 500 Momentum index, the S&P 500 Low Volatility index, and the S&P 500 Quality index.  Replicating these processes allows us to identify historical portfolio holdings, which in turn allows us to identify stocks not held by the factors.

We are able to closely replicate the S&P 500 Momentum and Low Volatility portfolios, create meaningful overlap with the S&P 500 Enhanced Value method, and generally capture the S&P 500 Quality index.  The failure to more closely replicate the S&P 500 Quality index may have a meaningful impact on the results herein, though we believe our methodology still captures the generic return of a quality strategy.

We find that, on average, there are over 200 factor orphans at a given time.  Constructing an equal-weight portfolio of these orphans, however, only seems to lead us back to an S&P 500 Equal Weight benchmark.  While there does not appear to be an edge in this strategy, it is interesting that there does not appear to be a negative edge either.

Recognizing that long-only factor portfolios represent active bets expressed as over- and underweights relative to the S&P 500, we also construct a portfolio of the most underweight stocks.  Not surprisingly, as this portfolio actively captures a negative factor tilt, the strategy meaningfully underperforms the S&P 500 Equal Weight benchmark.  Though the relative underperformance meaningfully dissipates in recent years.

Finally, we develop portfolios to capture stocks held in just one, two, three, or all four of the factors simultaneously.  We find the portfolios comprised of stocks held in either three or four of the factors at once exhibit significant concentration risk.  As with the orphan portfolio, the portfolio of stocks held by just one of the factors closely tracks the S&P 500 Equal Weight benchmark, suggesting that it might be over-diversified.

The portfolio holding stocks held by just two factors at a time appears to be the Goldilocks portfolio, with enough concentration to be differentiated from the benchmark but not so much as to create significant concentration risk.

Unfortunately, this portfolio also almost perfectly replicates a naïve equal-weight portfolio among the four factors, suggesting that the approach is likely a wasted effort.

In conclusion, we find no evidence that factor orphans have historically offered a meaningful excess return opportunity.  Nor, however, do they appear to have been a drag on portfolio returns either.  We should acknowledge, however, that the adoption of factor portfolios accelerated rapidly after the Great Financial Crisis, and that backtests may not capture current market dynamics.  More recent event studies of orphaned stocks being added to factor portfolios may provide more insight into the current environment.

Factor Fimbulwinter

Summary

  • Value investing continues to experience a trough of sorrow. In particular, the traditional price-to-book factor has failed to establish new highs since December 2006 and sits in a 25% drawdown.
  • While price-to-book has been the academic measure of choice for 25+ years, many practitioners have begun to question its value (pun intended).
  • We have also witnessed the turning of the tides against the size premium, with many practitioners no longer considering it to be a valid stand-alone anomaly. This comes 35+ years after being first published.
  • With this in mind, we explore the evidence that would be required for us to dismiss other, already established anomalies.  Using past returns to establish prior beliefs, we simulate out forward environments and use Bayesian inference to adjust our beliefs over time, recording how long it would take for us to finally dismiss a factor.
  • We find that for most factors, we would have to live through several careers to finally witness enough evidence to dismiss them outright.
  • Thus, while factors may be established upon a foundation of evidence, their forward use requires a bit of faith.

In Norse mythology, Fimbulvetr (commonly referred to in English as “Fimbulwinter”) is a great and seemingly never-ending winter.  It continues for three seasons – long, horribly cold years that stretch on longer than normal – with no intervening summers.  It is a time of bitterly cold, sunless days where hope is abandoned and discord reigns.

This winter-to-end-all-winters is eventually punctuated by Ragnarok, a series of events leading up to a great battle that results in the ultimate death of the major gods, destruction of the cosmos, and subsequent rebirth of the world.

Investment mythology is littered with Ragnarok-styled blow-ups and we often assume the failure of a strategy will manifest as sudden catastrophe.  In most cases, however, failure may more likely resemble Fimbulwinter: a seemingly never-ending winter in performance with returns blown to-and-fro by the harsh winds of randomness.

Value investors can attest to this.  In particular, the disciples of price-to-book have suffered greatly as of late, with “expensive” stocks having outperformed “cheap” stocks for over a decade.  The academic interpretation of the factor sits nearly 25% below its prior high-water mark seen in December 2006.

Expectedly, a large number of articles have been written about the death of the value factor.  Some question the factor itself, while others simply argue that price-to-book is a broken implementation.

But are these simply retrospective narratives, driven by a desire to have an explanation for a result that has defied our expectations?  Consider: if price-to-book had exhibited positive returns over the last decade, would we be hearing from nearly as large a number of investors explaining why it is no longer a relevant metric?

To be clear, we believe that many of the arguments proposed for why price-to-book is no longer a relevant metric are quite sound. The team at O’Shaughnessy Asset Management, for example, wrote a particularly compelling piece that explores how changes to accounting rules have led book value to become a less relevant metric in recent decades.1

Nevertheless, we think it is worth taking a step back, considering an alternate course of history, and asking ourselves how it would impact our current thinking.  Often, we look back on history as if it were the obvious course.  “If only we had better prior information,” we say to ourselves, “we would have predicted the path!”2  Rather, we find it more useful to look at the past as just one realized path of many that could have happened, none of which were preordained.  Randomness happens.

With this line of thinking, the poor performance of price-to-book can just as easily be explained by a poor roll of the dice as it can be by a fundamental break in applicability.  In fact, we see several potential truths based upon performance over the last decade:

  1. This is all normal course performance variance for the factor.
  2. The value factor works, but the price-to-book measure itself is broken.
  3. The price-to-book measure is over-crowded in use, and thus the “troughs of sorrow” will need to be deeper than ever to get weak hands to fold and pass the alpha to those with the fortitude to hold.
  4. The value factor never existed in the first place; it was an unfortunate false positive that saturated the investing literature and broad narrative.

The problem at hand is two-fold: (1) the statistical evidence supporting most factors is considerable and (2) the decade-to-decade variance in factor performance is substantial.  Taken together, you run into a situation where a mere decade of underperformance likely cannot undo the previously established significance.  Just as frustrating is the opposite scenario. Consider that these two statements are not mutually exclusive: (1) price-to-book is broken, and (2) price-to-book generates positive excess return over the next decade.

In investing, factor return variance is large enough that the proof is not in the eating of the short-term return pudding.

The small-cap premium is an excellent example of the difficulty in discerning, in real time, the integrity of an established factor.  The anomaly has failed to establish a meaningful new high since it was originally published in 1981.  Only in the last decade – nearly 30 years later – have the tides of the industry finally seemed to turn against it as an established anomaly and potential source of excess return.

Thirty years.

The remaining broadly accepted factors – e.g. value, momentum, carry, defensive, and trend – have all been demonstrated to generate excess risk-adjusted returns across a variety of economic regimes, geographies, and asset classes, creating a great depth of evidence supporting their existence. What evidence, then, would make us abandon our faith in the Church of Factors?

To explore this question, we ran a simple experiment for each factor.  Our goal was to estimate how long it would take to determine that a factor was no longer statistically significant.

Our assumption is that the salient features of each factor’s return pattern will remain the same (i.e. autocorrelation, conditional heteroskedasticity, skewness, kurtosis, et cetera), but the forward average annualized return will be zero since the factor no longer “works.”

Towards this end, we ran the following experiment: 

  1. Take the full history for the factor and calculate prior estimates for mean annualized return and standard error of the mean.
  2. De-mean the time-series.
  3. Randomly select a 12-month chunk of returns from the time series and use the data to perform a Bayesian update to our mean annualized return.
  4. Repeat step 3 until the annualized return is no longer statistically non-zero at a 99% confidence threshold.
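
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the exact procedure behind the reported figures: the update is assumed to be a normal-normal conjugate update with the observation variance fixed at the historical annualized variance, a 12-month return is approximated by the sum of monthly returns, and the function name, the 2.576 two-sided critical value, and the `max_years` cap are all hypothetical choices.

```python
import math
import random
import statistics


def years_until_dismissal(monthly_returns, rng, z_crit=2.576, max_years=2000):
    """Count simulated years until the posterior mean is no longer significant."""
    r = list(monthly_returns)
    n = len(r)
    monthly_mean = statistics.fmean(r)

    # Step 1: prior mean (annualized) and squared standard error of the mean.
    ann_var = 12.0 * statistics.variance(r)   # assumed known observation variance
    mean = 12.0 * monthly_mean
    var = ann_var / (n / 12.0)                # variance of the mean over n/12 years

    # Step 2: de-mean the series so that forward returns average zero.
    demeaned = [x - monthly_mean for x in r]

    years = 0
    # Step 4: repeat until |mean| / standard error falls below the critical value.
    while abs(mean) / math.sqrt(var) >= z_crit and years < max_years:
        # Step 3: draw a random contiguous 12-month block and update the belief.
        start = rng.randrange(n - 11)
        x = sum(demeaned[start:start + 12])   # approximate 12-month return
        precision = 1.0 / var + 1.0 / ann_var
        mean = (mean / var + x / ann_var) / precision
        var = 1.0 / precision
        years += 1
    return years
```

Calling `years_until_dismissal(returns, random.Random(seed))` with many different seeds and taking the median of the results gives a distribution of waiting times in the spirit of the experiment. Drawing contiguous blocks, rather than individual months, is what preserves short-horizon features such as autocorrelation and volatility clustering.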

For each factor, we ran this test 10,000 times, creating a distribution that tells us how many years into the future we would have to wait until we were certain, from a statistical perspective, that the factor is no longer significant.

Sixty-seven years.

Based upon this experiment, sixty-seven years is the median number of years we will have to wait until we officially declare price-to-book (“HML,” as it is known in the literature) to be dead.3  At the risk of being morbid, we’re far more likely to die before the industry finally sticks a fork in price-to-book.

We perform this experiment for a number of other factors – including size (“SMB” – “small-minus-big”), quality (“QMJ” – “quality-minus-junk”), low-volatility (“BAB” – “betting-against-beta”), and momentum (“UMD” – “up-minus-down”) – and see much the same result.  It will take decades before sufficient evidence mounts to dethrone these factors.

                              HML    SMB[4]   QMJ    BAB    UMD
Median Years-until-Failure    67     43       132    284    339

 

Now, it is worth pointing out that these figures for a factor like momentum (“UMD”) might be a bit skewed due to the design of the test.  If we examine the long-run returns, we see a fairly docile return profile punctuated by sudden and significant drawdowns (often called “momentum crashes”).

Since a large proportion of the cumulative losses are contained in these short but pronounced drawdown periods, de-meaning the time-series ultimately means that the majority of 12-month periods actually exhibit positive returns.  In other words, by selecting random 12-month samples, we actually expect a high frequency of those samples to have a positive return.

For example, using this process, 49.1%, 47.6%, 46.7%, and 48.8% of rolling 12-month periods are positive for the HML, SMB, QMJ, and BAB factors, respectively.  For UMD, that number is 54.7%.  Furthermore, if you drop the worst 5% of rolling 12-month periods for UMD, the average positive period is 1.4x larger than the average negative period.  Taken together, not only are you more likely to select a positive 12-month period, but those positive periods are, on average, 1.4x larger than the negative periods you will pick, except for the rare (<5%) cases.
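
The share of positive rolling windows is straightforward to compute. The sketch below is illustrative only: it assumes a 12-month return is approximated by the sum of de-meaned monthly returns, and the function name is hypothetical.

```python
import statistics


def share_positive_rolling(monthly_returns, window=12):
    """Fraction of rolling windows with a positive summed return after de-meaning."""
    mean = statistics.fmean(monthly_returns)
    demeaned = [r - mean for r in monthly_returns]
    # One summed return per overlapping window of `window` months.
    sums = [
        sum(demeaned[start:start + window])
        for start in range(len(demeaned) - window + 1)
    ]
    return sum(1 for s in sums if s > 0) / len(sums)
```

A toy series makes the skew visible: 23 months of +1 followed by a single month of -23 has a zero mean, yet 12 of its 13 rolling 12-month windows are positive because the entire loss sits in one month.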

The process of the test was selected to incorporate the salient features of each factor.  However, in the case of momentum, it may lead to somewhat outlandish results.

Conclusion

While an evidence-based investor should be swayed by the weight of the data, the simple fact is that most factors are so well established that the majority of current practitioners will likely go their entire careers without experiencing evidence substantial enough to dismiss any of the anomalies.

Therefore, in many ways, there is a certain faith required to use them going forward. Yes, these are ideas and concepts derived from the data.  Yes, we have done our best to test their robustness out-of-sample across time, geographies, and asset classes.  Yet we must also admit that there is a non-zero probability, however small it is, that these are false positives: a fact we may not have sufficient evidence to address until several decades hence.

And so a bit of humility is warranted.  Factors will not suddenly stand up and declare themselves broken.  And those that are broken will still appear to work from time-to-time.

Indeed, the death of a factor will be more Fimbulwinter than Ragnarok: not so violent as to be the end of days, but enough to cause pain and frustration among investors.

 

Addendum

We have received a large number of inbound notes about this commentary, which fall upon two primary lines of questions.  We want to address these points.

How were the tests impacted by the Bayesian inference process?

The results of the tests within this commentary are rather astounding.  We did seek to address some of the potential flaws of the methodology we employed, but by and large we feel the overarching conclusion remains on a solid foundation.

While we only presented the results of the Bayesian inference approach in this commentary, as a check we actually tested two other approaches:

  1. A Bayesian inference approach assuming that forward returns would be a random walk with constant variance (based upon historical variance) and zero mean.
  2. Forward returns were simulated using the same bootstrap approach, but the factor was being discovered for the first time and the entire history was being evaluated for its significance.

These two tests were an effort to isolate the effects of the different components of our test.

What we found was that while the reported figures changed, the overall  magnitude did not.  In other words, the median death-date of HML may not have been 67 years, but the order of magnitude remained much the same: decades.

Stepping back, these results were something of a foregone conclusion.  We would not expect an effect that has been determined to be statistically significant over a hundred-year period to unravel in a few years.  Furthermore, we would expect a number of scenarios that continue to bolster the statistical strength just due to randomness alone.

Why are we defending price-to-book?

The point of this commentary was not to defend price-to-book as a measure.  Rather, it was to bring up a larger point.

As a community, quantitative investors often leverage statistical significance as a defense for the way we invest.

We think that is a good thing.  We should look at the weight of the evidence.  We should be data driven.  We should try to find ideas that have proven to be robust over decades of time and when applied in different markets or with different asset classes.  We should want to find strategies that are robust to small changes in parameterization.

Many quants would argue (including us among them), however, that there also needs to be a why.  Why does this factor work?  Without the why, we run the risk of glorified data mining.  With the why, we can choose for ourselves whether we believe the effect will continue going forward.

Of course, there is nothing that prevents the why from being pure narrative fallacy.  Perhaps we have simply woven a story into a pattern of facts.

With price-to-book, one might argue we have done the exact opposite.  The effect, technically, remains statistically significant and yet plenty of ink has been spilled as to why it shouldn’t work in the future.

The question we must answer, then, is, “when does statistical significance apply and when does it not?”  How can we use it as a justification in one place and completely ignore it in others?

Furthermore, if we are going to rely on hundreds of years of data to establish significance, how can we determine when something is “broken” if the statistical evidence does not support it?

Price-to-book may very well be broken.  But that is not the point of this commentary.  The point is simply that the same tools we use to establish and defend factors may prevent us from tearing them down.

 

Navigating Municipal Bonds With Factors

This post is available as a PDF download here.

Summary

  • In this case study, we explore building a simple, low cost, systematic municipal bond portfolio.
  • The portfolio is built using the low volatility, momentum, value, and carry factors across a set of six municipal bond sectors. It favors sectors with lower volatility, better recent performance, cheaper valuations, and higher yields.  As with other factor studies, a multi-factor approach is able to harvest major benefits from active strategy diversification since the factors have low correlations to one another.
  • The factor tilts lead to over- and underweights to both credit and duration through time. Currently, the portfolio is significantly underweight duration and modestly overweight credit.
  • A portfolio formed with the low volatility, value, and carry factors has sufficiently low turnover that these factors may have value in setting strategic allocations across municipal bond sectors.

 

Recently, we’ve been working on building a simple, ETF-based municipal bond strategy.  Probably to the surprise of nobody who regularly reads our research, we are coming at the problem from a systematic, multi-factor perspective.

For this exercise, our universe consists of six municipal bond indices:

  • Bloomberg Barclays AMT-Free Short Continuous Municipal Index
  • Bloomberg Barclays AMT-Free Intermediate Continuous Municipal Index
  • Bloomberg Barclays AMT-Free Long Continuous Municipal Index
  • Bloomberg Barclays Municipal Pre-Refunded-Treasury-Escrowed Index
  • Bloomberg Barclays Municipal Custom High Yield Composite Index
  • Bloomberg Barclays Municipal High Yield Short Duration Index

These indices, all of which are tracked by VanEck Vectors ETFs, offer access to municipal bonds across a range of durations and credit qualities.

Source: VanEck

Before we get started, why are we writing another multi-factor piece after addressing factors in the context of a multi-asset universe just two weeks ago?

The simple answer is that we find the topic to be that pressing for today’s investors.  In a world of depressed expected returns and elevated correlations, we believe that factor-based strategies have a role as both return generators and risk mitigators.

Our confidence in what we view as the premier factors (value, momentum, low volatility, carry, and trend) stems largely from their robustness in out-of-sample tests across asset classes, geographies, and timeframes.  The results in this case study not only suggest that a factor-based approach is feasible in muni investing, but also, in our opinion, strengthen the case for factor investing in other contexts (e.g. equities, taxable fixed income, commodities, currencies, etc.).

Constructing Long/Short Factor Portfolios

For the municipal bond portfolio, we consider four factors:

  1. Value: Buy undervalued sectors, sell overvalued sectors
  2. Momentum: Buy strong recent performers, sell weak recent performers
  3. Low Volatility: Buy low risk sectors, sell high risk sectors
  4. Carry: Buy higher yielding sectors, sell lower yielding sectors

As a first step, we construct long/short single factor portfolios.  The weight on index i at time t in long/short factor portfolio f is equal to:

In this formula, c is a scaling coefficient,  S is index i’s time t score on factor f, and N is the number of indices in the universe at time t.
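
One construction consistent with the symbols described above is a rank-based weighting, in which each index's weight is proportional to the distance of its score rank from the average rank. The sketch below assumes that scheme; it is an illustration rather than the post's exact specification. The function name, the handling of ties (ignored), and the choice of c so that the long and short legs each sum to one are all assumptions.

```python
def rank_weights(scores):
    """Dollar-neutral long/short weights from cross-sectional score ranks.

    Assumes a higher score is more attractive and that there are at least
    two indices with distinct scores.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    for position, i in enumerate(order, start=1):
        ranks[i] = float(position)            # 1 = lowest score, n = highest

    mean_rank = (n + 1) / 2.0
    centered = [rank - mean_rank for rank in ranks]

    # Scaling coefficient c: chosen so the long leg sums to +1. Because the
    # centered ranks sum to zero, the short leg then sums to -1.
    c = 1.0 / sum(x for x in centered if x > 0)
    return [c * x for x in centered]
```

With four indices scored 0.1, 0.5, 0.3, and 0.9, the weights are -0.75, 0.25, -0.25, and 0.75. Because the centered ranks always sum to zero, any choice of c leaves the portfolio dollar neutral; c only sets the gross exposure.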

We measure each factor with the following metrics:

  1. Value: Normalized deviation of real yield from the 5-year trailing average yield[1]
  2. Momentum: Trailing twelve month return
  3. Low Volatility: Historical standard deviation of monthly returns[2]
  4. Carry: Yield-to-worst
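
Three of these metrics can be sketched from monthly data; carry is simply the quoted yield-to-worst. The code below is a hedged illustration: the function names are hypothetical, "normalized" is assumed to mean dividing by the trailing standard deviation of real yields, the trailing average is assumed to be taken over real yields, and the low volatility score is negated so that a higher score is always more attractive.

```python
import statistics


def momentum_score(monthly_returns, months=12):
    """Compounded return over the trailing `months` months."""
    growth = 1.0
    for r in monthly_returns[-months:]:
        growth *= 1.0 + r
    return growth - 1.0


def value_score(real_yields, window=60):
    """Latest real yield versus its trailing average, in standard deviations.

    Assumption: a yield above its 5-year (60-month) average signals cheapness.
    """
    history = real_yields[-window:]
    deviation = real_yields[-1] - statistics.fmean(history)
    return deviation / statistics.stdev(history)


def low_vol_score(monthly_returns):
    """Negative sample standard deviation, so lower risk scores higher."""
    return -statistics.stdev(monthly_returns)
```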

For the value, momentum, and carry factors, the scaling coefficient c is set so that the portfolio is dollar neutral (i.e. we are long and short the same dollar amount of securities).  For the low volatility factor, the scaling coefficient is set so that the volatilities of the long and short portfolios are approximately equal.  This is necessary since a dollar neutral construction would be perpetually short “beta” to the overall municipal bond market.

Data Source: Bloomberg. Calculations by Newfound Research. All returns are hypothetical and backtested. Returns reflect the reinvestment of all distributions and are gross of all fees (including any management fees and transaction costs). The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results.

All four factors are profitable over the period from June 1998 to April 2017.  The value factor is the top performer both from an absolute return and risk-adjusted return perspective.

Data Source: Bloomberg. Calculations by Newfound Research. All returns are hypothetical and backtested. Returns reflect the reinvestment of all distributions and are gross of all fees (including any management fees and transaction costs). The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results. The benchmark is an equal-weight portfolio of all indices in the universe adjusted for the indices that are calibrated and included in each long/short factor index based on data availability.

 

There is significant variation in performance over time.  All four factors have years where they are the best performing factor and years where they are the worst performing factor.  The average annual spread between the best performing factor and the worst performing factor is 11.3%.

Data Source: Bloomberg. Calculations by Newfound Research. All returns are hypothetical and backtested. Returns reflect the reinvestment of all distributions and are gross of all fees (including any management fees and transaction costs). The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results. The benchmark is an equal-weight portfolio of all indices in the universe adjusted for the indices that are calibrated and included in each long/short factor index based on data availability. 1998 is a partial year beginning in June 1998 and 2017 is a partial year ending in April 2017.

 

The individual long/short factor portfolios are diversifying with respect to both each other (average pairwise correlation of -0.11) and the broad municipal bond market.


Moving From Single Factor to Multi-Factor Portfolios

The diversified nature of the long/short return streams makes a multi-factor approach hard to beat in terms of risk-adjusted returns.  This is another example of the type of strategy diversification that we have long lobbied for.

As evidence of these benefits, we have built two versions of a portfolio combining the low volatility, value, carry, and momentum factors.  The first version targets an equal dollar allocation to each factor.  The second version uses a naïve risk parity approach to target an approximately equal risk contribution from each factor.

Both approaches outperform all four individual factors on a risk-adjusted basis, delivering Sharpe Ratios of 1.19 and 1.23, respectively, compared to 0.96 for the top single factor (value).
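
The two weighting schemes can be sketched as follows. The source notes describe the risk parity version as a simple inverse volatility methodology. The function names here are hypothetical, and the shrinkage of early-period volatility estimates is not modeled.

```python
def equal_dollar_weights(n_factors):
    """Same dollar allocation to every factor."""
    return [1.0 / n_factors] * n_factors


def inverse_vol_weights(vols):
    """Naive risk parity: weights proportional to 1 / volatility.

    Correlations are ignored, so each factor's stand-alone risk
    (weight times volatility) is equal, not its true risk contribution.
    """
    inverse = [1.0 / v for v in vols]
    total = sum(inverse)
    return [x / total for x in inverse]
```

For factor volatilities of 1%, 2%, and 4%, the inverse volatility weights are 4/7, 2/7, and 1/7, so each factor carries the same weight-times-volatility.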

Data Source: Bloomberg. Calculations by Newfound Research. All returns are hypothetical and backtested. Returns reflect the reinvestment of all distributions and are gross of all fees (including any management fees and transaction costs). The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results. The benchmark is an equal-weight portfolio of all indices in the universe adjusted for the indices that are calibrated and included in each long/short factor index based on data availability. The factor risk parity construction uses a simple inverse volatility methodology. Volatility estimates are shrunk in the early periods when less data is available.

 

To stress this point, diversification is so plentiful across the factors that even the simplest portfolio construction methodologies outperform an investor who was able to identify the best performing factor with perfect foresight.  For additional context, we constructed a “Look Ahead Mean-Variance Optimization (“MVO”) Portfolio” by calculating the Sharpe optimal weights using actual realized returns, volatilities, and correlations.  The Look Ahead MVO Portfolio has a Sharpe Ratio of 1.43, not too far ahead of our two multi-factor portfolios.  The approximate weights in the Look Ahead MVO Portfolio are 49% to Low Volatility, 25% to Value, 15% to Carry, and 10% to Momentum.  While the higher Sharpe Ratio factors (Low Volatility and Value) do get larger allocations, Momentum and Carry are still well represented due to their diversification benefits.


 

From a risk perspective, both multi-factor portfolios have lower volatility than any of the individual factors and a maximum drawdown that is within 1% of the individual factor with the least amount of historical downside risk.  It’s also worth pointing out that the risk parity construction leads to a return stream that is very close to normally distributed (skew of 0.1 and kurtosis of 3.0).


 

In the graph below, we present another lens through which we can view the tremendous amount of diversification that can be harvested between factors.  Here we plot how the allocation to a specific factor, using MVO, will change as we vary that factor’s Sharpe Ratio.  We perform this analysis for each factor individually, holding all other parameters fixed at their historical levels.

As an example, to estimate the allocation to the Low Volatility factor at a Sharpe Ratio of 0.1, we:

  1. Assume the covariance matrix is equal to the historical covariance over the full sample period.
  2. Assume the excess returns for the other three factors (Carry, Momentum, and Value) are equal to their historical averages.
  3. Assume the annualized excess return for the Low Volatility factor is 0.16% so that the Sharpe Ratio is equal to our target of 0.1 (Low Volatility’s annualized volatility is 1.6%).
  4. Calculate the MVO optimal weights using these excess return and risk assumptions.
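
The four steps above can be sketched as below. This is a minimal illustration that assumes unconstrained Sharpe-optimal (tangency) weights, proportional to the inverse covariance matrix times the excess returns and normalized to sum to one. The post does not state which constraints its optimizer used, and all function names are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def tangency_weights(mu, cov):
    """Unconstrained max-Sharpe weights, normalized to sum to one."""
    raw = solve(cov, mu)
    total = sum(raw)
    return [w / total for w in raw]


def allocation_at_sharpe(k, target_sharpe, mu, cov):
    """Weights after resetting factor k's excess return to hit a target Sharpe."""
    adjusted = list(mu)                                   # step 2: others unchanged
    adjusted[k] = target_sharpe * math.sqrt(cov[k][k])    # step 3: Sharpe x vol
    return tangency_weights(adjusted, cov)                # step 4
```

With a volatility of 1.6% and a target Sharpe Ratio of 0.1, step 3 produces the 0.16% excess return quoted above. Because the weights are unconstrained, they can turn negative, and the normalization is only meaningful when the raw weights sum to a positive number; a long-only optimizer would behave differently near zero.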


 

As expected, Sharpe Ratios and allocation sizes are positively correlated.  Higher Sharpe Ratios lead to higher allocations.

That being said, three of the factors (Low Volatility, Carry, and Momentum) would receive allocations even if their Sharpe Ratios were slightly negative.

The allocations to carry and momentum are particularly insensitive to Sharpe Ratio level.  Momentum would receive an allocation of 4% with a 0.00 Sharpe, 9% with a 0.25 Sharpe, 13% with a 0.50 Sharpe, 17% with a 0.75 Sharpe, and 20% with a 1.00 Sharpe.  For the same Sharpe Ratios, the allocations to Carry would be 10%, 15%, 19%, 22%, and 24%, respectively.

Holding these factors provides a strong ballast within the multi-factor portfolio.

Moving From Long/Short to Long Only

Most investors have neither the space in their portfolio for a long/short muni strategy nor sufficient access to enough affordable leverage to get the strategy to an attractive level of volatility (and hence return).  A more realistic approach would be to layer our factor bets on top of a long only strategic allocation to muni bonds.

In a perfect world, we could slap one of our multi-factor long/short portfolios right on top of a strategic municipal bond portfolio.  The results of this approach (labeled “Benchmark + Equal Weight Factor Long/Short” in the graphics below) are impressive (Sharpe Ratio of 1.17 vs. 0.93 for the strategic benchmark and return to maximum drawdown of 0.72 vs. 0.46 for the strategic benchmark).  Unfortunately, this approach still requires just a bit of shorting. The size of the total short ranges from 0% to 19% with an average of 5%.

We can create a true long only portfolio (“Long Only Factor”) by removing all shorts and normalizing so that our weights sum to one.  Doing so modestly reduces risk, return, and risk-adjusted return, but still leads to outperformance vs. the benchmark.
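
Both constructions reduce to a few lines. The sketch below is illustrative, with hypothetical function names; it layers a long/short overlay on a benchmark, measures the residual short, and then removes shorts and renormalizes.

```python
def overlay(benchmark, long_short):
    """Benchmark weights plus long/short factor weights, index by index."""
    return [b + x for b, x in zip(benchmark, long_short)]


def total_short(weights):
    """Size of the short book, reported as a positive number."""
    return -sum(w for w in weights if w < 0)


def long_only(weights):
    """Drop all short positions and rescale the rest to sum to one."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    return [w / total for w in clipped]
```

For example, an equal-weight benchmark of four indices combined with an overlay of -30%, +10%, +5%, and +15% leaves a 5% short in the first index. Clipping that short and renormalizing gives weights of 0, 1/3, 2/7, and 8/21.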


 

Below we plot both the historical and current allocations for the long only factor portfolio.  Currently, the portfolio would have approximately 25% in each of short-term investment grade, pre-refunded, and short-term high yield, with the remaining 25% split roughly 80/20 between high yield and intermediate-term investment grade. There is currently no allocation to long-term investment grade.

Data Source: Bloomberg. Calculations by Newfound Research. All allocations are backtested and hypothetical. The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results.

 


 

A few interesting observations relating to the long only portfolio and muni factor investing in general:

  1. The factor tilts lead to clear duration and credit bets over time.  Below we plot the duration and a composite credit score for the factor portfolio vs. the benchmark over time.

    Data source: Calculations by Newfound Research. All allocations are backtested and hypothetical. The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results. Weighted average durations are estimated using current constituent durations.

    Data source: Calculations by Newfound Research. All allocations are backtested and hypothetical. The hypothetical indices start on June 30, 1998. The start date was chosen based on data availability of the underlying indices and the time necessary to calibrate the factor models. Data is through April 30, 2017. The portfolios are reconstituted monthly. Past performance does not guarantee future results. Weighted average credit scores are estimated using current constituent credit scores. Credit scores use S&P’s methodology to aggregate scores based on the distribution of credit scores of individual bonds.

    Currently, the portfolio is near an all-time low in terms of duration and is slightly tilted towards lower credit quality sectors relative to the benchmark.  Historically, the factor portfolio was most often overweight both duration and credit, having this positioning in 53.7% of the months in the sample.  The second and third most common tilts were underweight duration / underweight credit (22.0% of sample months) and underweight duration / overweight credit (21.6% of sample months).  The portfolio was overweight duration / underweight credit in only 2.6% of sample months.

  2. Even for more passive investors, a factor-based perspective can be valuable in setting strategic allocations.  The long only portfolio discussed above has annualized turnover of 77%.  If we remove the momentum factor, which is by far the biggest driver of turnover, and restrict ourselves to a quarterly rebalance, we can reduce turnover to just 18%.  This does come at a cost, as the Sharpe Ratio drops from 1.12 to 1.04, but historical performance would still be strong relative to our benchmark. This suggests that carry, value, and low volatility may be valuable in setting strategic allocations across municipal bond ETFs with only periodic updates at a normal strategic rebalance frequency.
  3. We ran regressions with our long/short factors on all funds in the Morningstar Municipal National Intermediate category with a track record that extended over our full sample period from June 1998 to April 2017.  Below, we plot the betas of each fund to each of our four long/short factors.  Blue bars indicate that the factor beta was significant at a 5% level.  Gray bars indicate that the factor beta was not significant at a 5% level.  We find little evidence of the active managers following a factor approach similar to what we outline in this post.  Part of this is certainly the result of the constrained nature of the category with respect to duration and credit quality.  In addition, these results do not speak to whether any of the managers use a factor-based approach to pick individual bonds within their defined duration and credit quality mandates.

    Data source: Calculations by Newfound Research. Analysis over the period from June 1998 to April 2017.

    The average beta to the low volatility factor, ignoring non-statistically significant values, is -0.23.  This is most likely a function of category since the category consists of funds with both investment grade credit quality and durations ranging between 4.5 and 7.0 years.  In contrast, our low volatility factor on average has short exposure to the intermediate and long-term investment grade sectors.

    Data source: Calculations by Newfound Research. Analysis over the period from June 1998 to April 2017.

    Only 14 of the 33 funds in the universe have statistically significant exposure to the value factor with an average beta of -0.03.

    Data source: Calculations by Newfound Research. Analysis over the period from June 1998 to April 2017.

    The average beta to the carry factor, ignoring non-statistically significant values, is -0.23.  As described above with respect to low volatility, this is most likely a function of the category, as our carry factor favors the long-term investment grade and high yield sectors.

    Data source: Calculations by Newfound Research. Analysis over the period from June 1998 to April 2017.

    Only 9 of the 33 funds in the universe have statistically significant exposure to the momentum factor with an average beta of 0.02.
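
The summary statistics quoted for the fund regressions above (how many funds have a significant beta, and the average beta among them) can be sketched as follows. This is a minimal illustration, not the calculation used in the post: the (beta, t-statistic) pairs are made up, and the 1.96 cutoff is an assumed large-sample approximation of a two-sided 5% test.

```python
# Hypothetical (beta, t-statistic) pairs for five funds against one factor.
# These numbers are invented for illustration; they are not from the post.
fund_betas = [(-0.40, -4.1), (-0.20, -2.5), (0.05, 0.6), (-0.30, -2.2), (0.02, 0.3)]

T_CRITICAL = 1.96  # assumed large-sample cutoff for a two-sided 5% test


def summarize_betas(pairs, t_critical=T_CRITICAL):
    """Return (number of significant betas, their average beta or None)."""
    significant = [beta for beta, t_stat in pairs if abs(t_stat) > t_critical]
    if not significant:
        return 0, None
    return len(significant), sum(significant) / len(significant)


count, avg_beta = summarize_betas(fund_betas)
print(f"{count} of {len(fund_betas)} funds significant; average beta {avg_beta:.2f}")
```

Averaging only the significant betas, as the text does, discards funds whose exposure is indistinguishable from zero, so the average describes the funds with a detectable tilt rather than the category as a whole.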

Conclusion

Multi-factor investing has generated significant press in the equity space due to the (poorly named) “smart beta” movement.  The popular factors in the equity space have historically performed well both within other asset classes (rates, commodities, currencies, etc.) and across asset classes.  The municipal bond market is no different.  A simple, systematic multi-factor process has the potential to improve risk-adjusted performance relative to static benchmarks.  The portfolio can be implemented with liquid, low cost ETFs.

Moving beyond active strategies, factors can also be valuable tools when setting strategic sector allocations within a municipal bond sleeve and when evaluating and blending municipal bond managers.

Perhaps more importantly, the out-of-sample evidence for the premier factors (momentum, value, low volatility, carry, and trend) across asset classes, geographies, and timeframes continues to mount.  In our view, this evidence can be crucial in getting investors comfortable with introducing systematic active premia into their portfolios as both return generators and risk mitigators.

 

[1] Computed using yield-to-worst.  Inflation estimates are based on 1-year and 10-year survey-based expected inflation.  We average the value score over the last 2.5 years, allowing the portfolio to realize a greater degree of valuation mean reversion before closing out a position.

[2] We use a rolling 5-year (60-month) window to calculate standard deviation.  We require at least 3 years of data for an index to be included in the low volatility portfolio.  The standard deviation is multiplied by -1 so that higher values are better across all four factor scores.
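
The low volatility score in footnote [2] can be sketched as below. This is a rough illustration under stated assumptions: the function name is hypothetical, the sample (rather than population) standard deviation is assumed, and an index with between 3 and 5 years of history is assumed to use all the data it has.

```python
from statistics import stdev

WINDOW = 60      # rolling window in months (5 years), per the footnote
MIN_MONTHS = 36  # minimum history required (3 years), per the footnote


def low_vol_score(monthly_returns):
    """Return -1 * std dev of the trailing window, or None if history is too short."""
    if len(monthly_returns) < MIN_MONTHS:
        return None  # index is excluded from the low volatility portfolio
    window = monthly_returns[-WINDOW:]  # uses all data if fewer than 60 months exist
    return -stdev(window)  # sign flipped so that higher scores are better


# Hypothetical return histories, invented for illustration.
calm = [0.01, -0.01] * 30   # 60 months of small swings
wild = [0.03, -0.03] * 30   # 60 months of large swings
short = [0.01] * 12         # only 1 year of data

print(low_vol_score(calm) > low_vol_score(wild))  # calmer index scores higher
print(low_vol_score(short))                       # too little history
```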

 

 

What are Growth and Value?

This commentary is available as a PDF here.

SUMMARY

  • Growth and value have intuitive definitions, but there are many ways to quantify each.
  • As with broad factors, such as value, momentum, and dividend growth, the specific metrics used to describe growth and value may fall in and out of favor, depending on the market environment.
  • Taking a diversified approach to quantifying value and growth can lead to more consistent performance over time.

In our commentary a few weeks ago, we pointed out a key flaw that many index providers have in their growth and value style indices. The industry norm lumps “low value” in with “growth” and “low growth” in with “value” when, in reality, growth and value are independent characteristics of companies. The result is that many of the growth and value ETFs that track these indices are not giving investors what they expect – or what they want.

Final index construction aside, let’s go down to a more fundamental level: what are growth and value in the first place, and how do we measure them?

Intuitively, growth refers to companies that are growing and are expected to keep growing, and value refers to companies that are currently cheap relative to their fair price.

Simple enough.

But a quick survey of index providers finds that the characteristics they use to measure a stock’s growth and value characteristics vary across the board:

Growth Characteristics:

  • Long-term forward earnings per share (EPS) growth rate (CRSP, MSCI, Russell)
  • Short-term forward EPS growth rate (CRSP, MSCI)
  • Current internal growth rate (MSCI)
  • Long-term historical EPS growth trend (CRSP, MSCI, S&P)
  • Long-term historical sales per share growth trend (CRSP, MSCI, Russell, S&P)
  • 12-month price change (S&P)
  • Investment-to-assets ratio (CRSP)
  • Return on assets, ROA (CRSP)

Value Characteristics:

  • Book-to-price ratio (CRSP, MSCI, S&P, Russell)
  • Forward earnings-to-price ratio (CRSP, MSCI)
  • Earnings-to-price ratio (CRSP, S&P)
  • Sales-to-price ratio (CRSP, S&P)
  • Dividend yield (CRSP, MSCI)

Only one metric on each list is common to all four index providers (sales per share growth trend for growth and book-to-price ratio for value).

So who is right?

We can test the performance of many of these metrics using data readily available online. The forward-looking growth data are more difficult to find historically, but general financial statement data is available on Morningstar’s website.

To keep matters simple, we will look at three metrics for each of growth and value. For growth: 3-year EPS growth, 3-year sales per share growth, and ROA. For value: the P/E, P/S, and P/B ratios.

And to keep things as realistic as possible, we will evaluate the stocks in the S&P 500 as they stood at the end of 2014. Relative to the current set of companies in the S&P 500, we added back in some companies that dropped out of the S&P 500 (mainly energy and materials companies) in 2015. Some mergers and acquisitions also make getting data for the companies more difficult. For example, Covidien was bought by Medtronic, AT&T bought DirecTV, and Kraft merged with Heinz. Since we will be focusing on relative performance differences rather than on absolute ones, we will simply reconstruct a proxy S&P 500 index using the data that is available. In all, our universe contains 481 companies.

Using the fundamental data from December 2014, we can sort based on each metric and select the top 160 companies (about one-third of the universe) and see how that “value” or “growth” portfolio would have performed in 2015. Within each portfolio, we equally weight for simplicity. Results are compared to an equal-weight benchmark to control for any out- or underperformance arising from the equal-weight allocation methodology as opposed to stock selection.
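
The sort-and-select procedure just described can be sketched as follows. This is a minimal illustration rather than the code behind the results: the function names, tickers, P/E ratios, and returns are all hypothetical.

```python
def top_fraction_equal_weight(metric_by_ticker, fraction=1 / 3, higher_is_better=True):
    """Rank tickers on one metric and equally weight the best `fraction` of them."""
    ranked = sorted(metric_by_ticker, key=metric_by_ticker.get, reverse=higher_is_better)
    n_select = max(1, round(len(ranked) * fraction))
    return {ticker: 1.0 / n_select for ticker in ranked[:n_select]}


def portfolio_return(weights, return_by_ticker):
    """Weighted sum of single-period returns."""
    return sum(w * return_by_ticker[t] for t, w in weights.items())


# Hypothetical universe: P/E ratios at the sort date and subsequent one-year returns.
pe = {"AAA": 8.0, "BBB": 12.0, "CCC": 15.0, "DDD": 20.0, "EEE": 25.0, "FFF": 40.0}
returns = {"AAA": 0.10, "BBB": 0.06, "CCC": 0.02, "DDD": 0.04, "EEE": -0.02, "FFF": 0.04}

# For a value sort on P/E, lower is better.
value_weights = top_fraction_equal_weight(pe, higher_is_better=False)

# Equal-weight benchmark over the whole universe isolates stock selection.
benchmark_weights = {t: 1.0 / len(returns) for t in returns}
excess = portfolio_return(value_weights, returns) - portfolio_return(benchmark_weights, returns)
print(sorted(value_weights), f"excess return: {excess:.2%}")
```

With a 481-stock universe, rounding one-third of the names gives the 160 holdings used in the text.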

There is significant variation during the year depending on which metric was used.

Growth portfolios

Source: Data from Yahoo! Finance and Morningstar, calculations by Newfound

Value portfolios

Source: Data from Yahoo! Finance and Morningstar, calculations by Newfound

For growth, all of the portfolios tracked each other until mid-March when the portfolio formed on sales growth began to diverge. The portfolios formed on EPS growth and ROA continued to track each other until mid-June. At this time, ROA rallied hard, eclipsing the sales growth portfolio in the 4th quarter of 2015.

On the value front, the P/S ratio led through most of the year before falling back to the pack in the fall. The P/E and P/B portfolios ended the year in very similar places, with the P/S portfolio eking out a ~65bp benefit over the other two portfolios.

 

Which Metric to Choose

One year is hardly enough data to make a sound judgment as to which metric is the best for selecting growth and value stocks. As we have said many times before, even though we may know a factor (e.g. value) has outperformed in the past and is likely to do so in the future based on behavioral evidence, stating whether that factor will outperform in any given year is tough.

Likewise, deciding which measure of a factor will outperform in a given year is also difficult. Even with value companies, a metric like P/E ratio may not work well when companies with strong sales experience short-term earnings shocks or when companies are able to inflate earnings based on accounting allowances. The P/B ratio may not work well in periods when service-oriented companies, which rely on intangible human capital as a large driver of growth, are being rewarded in the market.

Let’s take a closer look at some popular ways of quantifying the value factor.

“Value”, as it stands in academic literature, is commonly measured using the P/B ratio. This is what the famous Fama-French Three Factor Model uses as its basis for calculating the value factor, high-minus-low (HML).

However, using data from Kenneth French going back to 1951, we can see that, for long-only portfolios, those formed on P/E and on P/S actually beat the portfolio formed on P/B on both an absolute and a risk-adjusted basis.

table

Furthermore, AQR showed in their 2014 paper, “The Devil in HML’s Details,” that not only does the metric matter, but the method of calculating the metric matters, as well. While Fama and French calculated HML using book value data that was lagged by 6 months to ensure that data would be available, they also lagged price data by the same amount. The AQR paper proposed using the most recent price data for calculating P/B ratios and showed that their method was superior to the standard lagged-price method because using more current price data better captures the relationship between value and momentum.

The P/S and P/E ratios used in the table above are also calculated using lagged price data. Based on AQR’s research, we expect that those results might also be improved by using the current price data.
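
To make the lagged-price versus current-price distinction concrete, here is a small worked example. All numbers are hypothetical, and the variable names are ours, not AQR's or Fama and French's.

```python
def book_to_price(book_value_per_share, price):
    """Book-to-price ratio; the inverse of P/B."""
    return book_value_per_share / price


# Hypothetical inputs, invented for illustration.
book_lagged = 20.0   # book value per share, lagged 6 months so it is known to be available
price_lagged = 50.0  # share price from the same lagged date
price_now = 40.0     # most recent share price, after a 20% decline

bp_lagged_price = book_to_price(book_lagged, price_lagged)  # both inputs lagged
bp_current_price = book_to_price(book_lagged, price_now)    # lagged book, current price

print(f"B/P with lagged price:  {bp_lagged_price:.2f}")
print(f"B/P with current price: {bp_current_price:.2f}")
```

In this example the stock looks cheaper under the current-price measure because its recent decline is reflected immediately, which is how the choice of price date ties the value measure to recent price moves.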

 

Different Measures of Factors May Ebb and Flow

We should be careful not to rush to judgment though. The fact that P/B has underperformed the other value metrics does not mean we should drop it entirely. It is helpful to remember that individual factors can go through periods of significant underperformance. The same is true for different ways of measuring a single factor. For example, over rolling 12-month periods, the return difference between portfolios formed on P/B, P/S, and P/E – all “value” metrics – has often been in excess of 2000bp!

Put bluntly: your mileage may vary dramatically depending on which value metric you choose.

Portfolios ranges

Source: Data from Kenneth French Data Library, calculations by Newfound
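
The rolling 12-month spread between value portfolios discussed above can be sketched as follows. This is an illustrative sketch only: the function names and return series are hypothetical, and the spread is assumed to be the best-minus-worst trailing return across the three portfolios.

```python
def rolling_returns(monthly_returns, window=12):
    """Compound monthly returns over each trailing `window`-month period."""
    out = []
    for end in range(window, len(monthly_returns) + 1):
        growth = 1.0
        for r in monthly_returns[end - window:end]:
            growth *= 1.0 + r
        out.append(growth - 1.0)
    return out


def rolling_spread_bp(portfolios, window=12):
    """Best-minus-worst trailing return across portfolios, in basis points."""
    rolled = [rolling_returns(series, window) for series in portfolios.values()]
    return [(max(vals) - min(vals)) * 10_000 for vals in zip(*rolled)]


# Hypothetical monthly returns (13 months each), invented for illustration.
portfolios = {
    "P/B": [0.000] * 13,
    "P/E": [0.005] * 13,
    "P/S": [0.010] * 13,
}

spreads = rolling_spread_bp(portfolios)
print([round(s) for s in spreads])  # one spread per rolling 12-month window
```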

With our 2015 example, we saw that P/S resulted in the best performing portfolio, but as we said before, different measures tend to cycle unpredictably. We can see which ones have been in favor historically by comparing each individual portfolio to the average of all three portfolios.

Single factor

Source: Data from Kenneth French Data Library, calculations by Newfound

The fact that many index providers combine multiple metrics into a composite growth or value score is an acknowledgement of this unpredictability.

Averaging the different value portfolios would have led to a fraction of outperforming periods on par with the best individual portfolios, higher average outperformance than the P/S portfolio, and lower average underperformance than all three individual portfolios.

One year periods

Rolling perf

Source: Data from Kenneth French Data Library, calculations by Newfound
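
The comparison statistics mentioned above (fraction of outperforming periods, average outperformance, and average underperformance) can be computed along these lines. This is a sketch under assumptions: the function names are hypothetical, the return series are invented, and the choice of benchmark is ours since the text does not spell it out.

```python
def average_portfolio(*series):
    """Period-by-period average of several return series (equal blend)."""
    return [sum(vals) / len(vals) for vals in zip(*series)]


def relative_stats(portfolio, benchmark):
    """Hit rate and average out/underperformance of a portfolio versus a benchmark."""
    excess = [p - b for p, b in zip(portfolio, benchmark)]
    wins = [x for x in excess if x > 0]
    losses = [x for x in excess if x < 0]
    return {
        "hit_rate": len(wins) / len(excess),
        "avg_outperformance": sum(wins) / len(wins) if wins else 0.0,
        "avg_underperformance": sum(losses) / len(losses) if losses else 0.0,
    }


# Hypothetical one-year returns over four periods, invented for illustration.
pb = [0.10, -0.05, 0.08, 0.02]
ps = [0.04, 0.06, 0.12, -0.03]
pe = [0.07, 0.02, 0.01, 0.04]
benchmark = [0.05, 0.03, 0.06, 0.00]

blend = average_portfolio(pb, ps, pe)
for name, series in {"P/B": pb, "P/S": ps, "P/E": pe, "Blend": blend}.items():
    print(name, relative_stats(series, benchmark))
```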

If you read our previous commentary about multi-factor portfolio construction, you’ll notice that the averaging we did above is approach #1 (the “or” method). In effect, we are investing in companies that have a low P/S, P/B, or P/E ratio. One way to implement this would be to form portfolios based on each metric and then average the allocations into a final value portfolio.

In practice, most index providers score companies based on each selected metric, normalize the scores, and then average them (sometimes using different weightings). The portfolio is then formed using this composite score. This is more in line with approach #2 from the commentary (the “and” approach), which favors companies that have some degree of combined strength across multiple metrics.
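
The difference between the two approaches can be sketched as below. This is a simplified illustration, not any index provider's methodology: the function names and data are hypothetical, metrics are expressed as yields so that higher means cheaper, scores are normalized with simple z-scores, and the composite uses equal weights.

```python
from statistics import mean, pstdev


def z_scores(values_by_ticker):
    """Normalize one metric across the universe (assumes the values are not all equal)."""
    values = list(values_by_ticker.values())
    mu, sigma = mean(values), pstdev(values)
    return {t: (v - mu) / sigma for t, v in values_by_ticker.items()}


def top_n(scores, n):
    """The n tickers with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def or_portfolio(metrics, n):
    """Approach #1: one equal-weight portfolio per metric, then average the weights."""
    weights = {}
    for scores in metrics.values():
        for t in top_n(scores, n):
            weights[t] = weights.get(t, 0.0) + 1.0 / (n * len(metrics))
    return weights


def and_portfolio(metrics, n):
    """Approach #2: average the normalized scores, then select on the composite."""
    normalized = [z_scores(m) for m in metrics.values()]
    tickers = list(next(iter(metrics.values())))
    composite = {t: mean([z[t] for z in normalized]) for t in tickers}
    return {t: 1.0 / n for t in top_n(composite, n)}


# Hypothetical value metrics, invented for illustration (higher = cheaper).
metrics = {
    "E/P": {"AAA": 0.12, "BBB": 0.07, "CCC": 0.06, "DDD": 0.02},
    "S/P": {"AAA": 0.50, "BBB": 0.90, "CCC": 1.00, "DDD": 0.40},
    "B/P": {"AAA": 0.30, "BBB": 0.80, "CCC": 0.60, "DDD": 0.90},
}

or_weights = or_portfolio(metrics, n=1)
and_weights = and_portfolio(metrics, n=1)
print("or: ", sorted(or_weights))
print("and:", sorted(and_weights))
```

In this toy universe the "or" portfolio holds the three stocks that each top a single metric, while the "and" portfolio holds the one stock that is reasonably cheap on all three without leading on any of them.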

While we used value and momentum in the commentary to illustrate why using the “and” approach is problematic in multi-factor portfolios, using this approach isn’t as bad when attempting to identify a single factor. The problem with value and momentum stemmed from the difference in time that each factor took to mature. Using the “and” approach introduced drag from the shorter maturity factor.

If there is no convincing argument that an individual growth or value measure takes longer to mature than another (for instance, does P/S normalize faster than P/B), then taking the “and” approach is not likely to result in a worse outcome. In this case, where we are simply trying to identify growth or value, we care more about the predictive nature of each metric that goes into forming the portfolio.

The index providers vary considerably with regard to what characteristics they look at and how they weight them to arrive at a final portfolio. If you believe that the P/B ratio is the best determinant of company value, then you will get the purest exposure with Russell. If you think return on assets is an important contributing factor to company growth, CRSP’s index will be more in line with your view.

However, if you are like us and concede that while there are many ways to quantify growth and value, no one method can outperform over every single period, a diversified approach may be your best option.
