The Research Library of Newfound Research

Month: January 2019

Tightening the Uncertain Payout of Trend-Following

This post is available as a PDF download here.

Summary

  • Long/flat trend-following strategies have historically delivered payout profiles similar to those of call options, with positive payouts for larger positive underlying asset returns and slightly negative payouts for near-zero or negative underlying returns.
  • However, this functional relationship contains a fair amount of uncertainty for any given trend-following model and lookback period.
  • In portfolio construction, we tend to favor assets that offer high expected returns, diversifying return profiles, or both.
  • Since broad investor behavior provides a basis for systematic trend-following models to have positive expected returns, taking a multi-model approach to trend-following can be used to reduce the variance around the expected payout profile.

Introduction

Over the past few months, we have written much about model diversification as a tactic for managing specification risk, even with specific case studies. When we consider the three axes of diversification, model diversification pertains to the “how” axis, which focuses on strategies that have the same overarching objective but go about achieving it in different ways.

Long/flat trend-following, especially with equity investments, aims to protect capital on the downside while maintaining participation in positive markets. This leads to a payout profile that looks similar to that of a call option.1

However, while a call option offers a defined payout based on the price of an underlying asset and a specific maturity date, a trend-following strategy does not provide such a guarantee. There is a degree of uncertainty.

The good news is that uncertainty can potentially be diversified given the right combinations of assets or strategies.

In this commentary, we will dive into a number of trend-following strategies to see what has historically led to this benefit and the extent to which diversification would reduce the uncertainty around the expected payoff.

Diversification in Trend-Following

The justification for a multi-model approach boils down to a simple diversification argument.

Say you would like to include trend-following in a portfolio as a way to manage risk (e.g. sequence risk for a retiree). There is academic and empirical evidence that trend-following works over a variety of time horizons, generally ranging from 3 to 12 months. And there are many ways to measure trends, such as moving average crossovers, trailing returns, deviations from moving averages, risk-adjusted returns, etc.

The basis for deciding ex-ante which variant will be the best over our own investment horizon is tenuous at best. Backtests can show one iteration outperforming over a given time horizon, but most of the differences between strategies are either noise from a statistical point of view or realized over a longer time period than any investor has the lifespan (or mettle) to endure.

However, we expect each one to generate positive returns over a sufficiently long time horizon. Whether this is one year, three years, five years, 10 years, 50 years… we don’t know. What we do know is that out of the multitude of variations of trend-following, we are very likely to pick one that is not the best or even in the top segment of the pack in the short-term.

From a volatility standpoint, when the strategies are fully invested, they will have volatility equal to the underlying asset. Determining exactly when the diversification benefits will come into play – that is, when some strategies are invested and others are not – is a fool’s errand.

Modern portfolio theory has done a disservice in making correlation seem like an inherent trait of an investment. It is not.

The fact that multiple trend-following strategies can coincide precisely for stretches of time before behaving completely differently from each other makes many portfolio construction techniques useless.  We do not expect correlation benefits to always be present.  These are nonlinear strategies, and fitting them into a linear world does not make sense.

If you have pinned up ReSolve Asset Management’s flow chart of portfolio choice above your desk (from Portfolio Optimization: A General Framework for Portfolio Choice), then the decision on this is easy.

Source: ReSolve Asset Management.  Reprinted with permission

From this simple framework, we can break the different performance regimes down as follows:

The Math Behind the Diversification

The expected value of a trend-following strategy can be thought of as a function of the underlying security return:

Where the subscript i is used to indicate that the function is dependent on the specific trend-following strategy.

If we combine multiple trend-following strategies into a portfolio, then the expectation is the average of these functions (assuming an equal weight portfolio per the ReSolve chart above):

What’s left to determine is the functional form of f.

Continuing in the vein of the call option payoff profile, we can use the Black-Scholes equation as the functional form (with the risk-free rate set to 0). This leaves three parameters with which to fit the formula to the data: the volatility (with the time to expiration term lumped in, i.e. sigma * sqrt(T-t)), the strike, and the initial cost of the option.

where d1 and d2 are defined in the standard fashion and N is the cumulative normal distribution function.

rK is the strike price in the option formula expressed as a percent relative to the current value of the underlying security.

In the following example, we will attempt to provide some meaning to the fitted parameters. However, keep in mind that any mapping is not necessarily one-to-one with the option parameters. The functional form may apply, but the parameters are not ones that were set in stone ex-ante.2
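As a rough illustration of the functional form described above, the sketch below prices a zero-rate Black-Scholes call on an underlying normalized to 1 and subtracts a fixed cost. The normalization, the function names (`bs_payoff`, `blend_payoff`), and the way the cost enters the formula are assumptions made for this sketch; they are not necessarily the exact expression used in this commentary.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_payoff(r, vol, strike, cost):
    """Expected strategy return for an underlying return r.

    Assumed form: a Black-Scholes call with a zero risk-free rate on an
    underlying normalized to 1, less a fixed cost.
      vol    -- time-scaled volatility, i.e. sigma * sqrt(T - t)
      strike -- strike as a return relative to the current price (rK)
      cost   -- vertical shift of the curve
    """
    s = 1.0 + r          # ending value of the underlying
    k = 1.0 + strike     # strike level
    d1 = (math.log(s / k) + 0.5 * vol ** 2) / vol
    d2 = d1 - vol
    return s * norm_cdf(d1) - k * norm_cdf(d2) - cost

def blend_payoff(r, params):
    """Equal-weight average of the individual expected-value functions.

    `params` is a list of (vol, strike, cost) tuples, one per strategy.
    """
    return sum(bs_payoff(r, *p) for p in params) / len(params)
```

Under these assumptions, evaluating `bs_payoff(0.0, vol, strike, cost)` gives the expected strategy return when the underlying is flat, and shrinking `vol` toward zero collapses the curve onto the hockey-stick payoff at expiry.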

An Example: Trend-Following on the S&P 500

As an example, we will consider a trend-following model on the S&P 500 using monthly time-series momentum with lookback windows ranging from 4 to 16 months. The risk-free rate was used when the trends were negative.
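A minimal sketch of such a long/flat time-series momentum rule is shown below. It assumes the signal is the sign of the asset's trailing total return over the lookback window and that the signal formed at the end of one month is applied to the next month's return; the commentary does not spell out these details, so both are assumptions, as is the function name.

```python
def tsmom_long_flat(asset_returns, rf_returns, lookback):
    """Monthly returns of a long/flat time-series momentum strategy.

    The signal for month t is the asset's compounded return over the
    prior `lookback` months. A positive signal holds the asset in month t;
    otherwise the strategy earns the risk-free rate.
    """
    strategy = []
    for t in range(lookback, len(asset_returns)):
        trailing = 1.0
        for r in asset_returns[t - lookback:t]:
            trailing *= 1.0 + r
        in_market = trailing - 1.0 > 0.0
        strategy.append(asset_returns[t] if in_market else rf_returns[t])
    return strategy
```

Running the function once per lookback (4 through 16 months) would produce the family of strategies discussed in this example.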

The graph below shows an example of the option price fit to the data using a least-squares regression for the 15-month time series momentum strategy using rolling 3-year returns from 1927 to 2018.

Source: Global Financial Data and Kenneth French Data Library. Calculation by Newfound. Returns are backtested and hypothetical. Returns assume the reinvestment of all distributions. Returns are gross of all fees. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

The volatility parameter was 9.5%, the strike was 2.3%, and the cost was 1.7%.
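To make the fitting step concrete, here is a small sketch of a least-squares fit of the three parameters. It swaps in a coarse grid search for whatever regression routine was actually used, relies on an assumed payoff form (zero-rate Black-Scholes call on an underlying normalized to 1, less a cost), and checks itself on synthetic points generated from known parameters rather than on the S&P 500 data.

```python
import math
from itertools import product

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_payoff(r, vol, strike, cost):
    # Assumed functional form: zero-rate Black-Scholes call on an
    # underlying normalized to 1, shifted down by a fixed cost.
    s, k = 1.0 + r, 1.0 + strike
    d1 = (math.log(s / k) + 0.5 * vol ** 2) / vol
    return s * norm_cdf(d1) - k * norm_cdf(d1 - vol) - cost

def sum_sq_error(params, xs, ys):
    """Sum of squared residuals between the curve and the observations."""
    return sum((bs_payoff(x, *params) - y) ** 2 for x, y in zip(xs, ys))

def fit_by_grid(xs, ys, vols, strikes, costs):
    """Return the (vol, strike, cost) triple with the smallest squared error."""
    return min(product(vols, strikes, costs),
               key=lambda p: sum_sq_error(p, xs, ys))

# Synthetic check: points generated from known parameters should be recovered.
true_params = (0.10, 0.02, 0.01)
xs = [i / 20.0 for i in range(-10, 21)]      # underlying returns, -50% to +100%
ys = [bs_payoff(x, *true_params) for x in xs]
grid = [i / 100.0 for i in range(1, 21)]     # 1% to 20% in 1% steps
best = fit_by_grid(xs, ys, grid, grid, grid)
```

A grid search is slow and only as precise as its step size; a numerical optimizer would be the natural replacement once the functional form is settled.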

What do these parameters mean?

As we said before, this can be a bit tricky. Painting in broad strokes:

  • The volatility parameter describes how “elbowed” the payoff profile is. Small values are akin to an option close to expiry where the payoff profile changes abruptly around the strike price. Larger values yield a more gentle change in slope.
  • The strike represents the point at which the payoff profile changes from participation to protection, to borrow trend-following lingo. In the example, where the strike is 2.3%, this means that the strategy would be expected to start protecting capital when the S&P 500 return is less than 2.3%. There is some cost associated with this value being high.
  • The cost is the vertical shift of the payoff profile, but it is not good to think of it as the insurance premium of the trend-following strategy. It is only one piece. To see why this is the case, consider that the fitted volatility may be large and that the option price curve may be significantly above the final payout curve (i.e. if the time-scaled volatility went to zero).

So what is the actual “cost” of the strategy?

With trend-following, since whipsaw is generally the largest potential detractor, we will look at the expected return on the strategy when the S&P 500 is flat, that is, an absence of an average trend. It is possible for the cost to be negative, indicating a positive expected trend-following return when the market was flat.

Looking at the actual fit of the data from a statistical perspective, the largest deviations from the expected value (the residuals from the regression) are seen during large positive returns for the S&P 500, mainly coming out of the Great Depression. This characteristic of individual trend-following models is generally attributable to the delay in getting back into the market after a prolonged, severe drawdown due to the time it takes for a new positive trend to be established.

Source: Global Financial Data and Kenneth French Data Library. Calculation by Newfound. Returns are backtested and hypothetical. Returns assume the reinvestment of all distributions. Returns are gross of all fees. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

Part of the seemingly large number of outliers is simply due to the fact that these returns exhibit autocorrelation since the periods are rolling, which means that the data points have some overlap. If we filtered the data down into non-overlapping periods, some of these outliers would be removed.
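The overlap can be removed mechanically by keeping only every 36th rolling observation when the data are monthly and the window is three years. The helper names below are hypothetical, and the choice of starting point for the non-overlapping sample is arbitrary.

```python
def rolling_returns(monthly_returns, window):
    """Compounded return over each trailing `window`-month period."""
    out = []
    for end in range(window, len(monthly_returns) + 1):
        growth = 1.0
        for r in monthly_returns[end - window:end]:
            growth *= 1.0 + r
        out.append(growth - 1.0)
    return out

def non_overlapping(rolling_values, window):
    """Keep every `window`-th rolling value so no two share a month."""
    return rolling_values[::window]
```

The trade-off is sample size: roughly 90 years of monthly data yields about 30 independent 3-year observations instead of more than a thousand overlapping ones.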

The outliers that remain are a fact of life for trend-following strategies. While they cannot be totally removed, some of them may be managed using multiple lookback periods.

The following chart illustrates the expected values for the trend-following strategies over all the lookback periods.

Source: Global Financial Data and Kenneth French Data Library. Calculation by Newfound. Returns are backtested and hypothetical. Returns assume the reinvestment of all distributions. Returns are gross of all fees. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

The shorter-term lookback windows have expected value curves that are less horizontal on the left side of the chart (higher volatility parameter).

As we said before, the cost of the trend-following strategy can be represented by the strategy’s expected return when the S&P 500 is flat. This can be thought of as the premium for the insurance policy of the trend-following strategies.

Source: Global Financial Data and Kenneth French Data Library. Calculation by Newfound. Returns are backtested and hypothetical. Returns assume the reinvestment of all distributions.  Returns are gross of all fees. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

The blend does not have the lowest cost, but this cost is only one part of the picture. The parameters for the expected value functions do nothing to capture the distribution of the data around – either above or below – these curves.

The diversification benefits are best seen in the distribution of the rolling returns around the expected value functions.

Source: Global Financial Data and Kenneth French Data Library. Calculation by Newfound. Returns are backtested and hypothetical. Returns assume the reinvestment of all distributions. Returns are gross of all fees. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

Now with a more comprehensive picture of the potential outcomes, a cost difference of even 3% is less than one standard deviation, making the blended strategy much more robust to whipsaw for the potential range of S&P 500 returns.
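One simple way to quantify this dispersion is the root-mean-square residual of realized strategy returns around the fitted expected-value curve. The sketch below assumes that measure (the commentary does not state which statistic was used) and takes the fitted curve as an arbitrary function `expected_fn`.

```python
import math

def equal_weight(series_list):
    """Period-by-period average of several strategies' returns."""
    return [sum(vals) / len(vals) for vals in zip(*series_list)]

def residual_dispersion(underlying, strategy, expected_fn):
    """Root-mean-square deviation of strategy returns from the fitted curve."""
    residuals = [s - expected_fn(u) for u, s in zip(underlying, strategy)]
    return math.sqrt(sum(e * e for e in residuals) / len(residuals))
```

When individual models miss the curve in different directions over the same period, the residuals of the equal-weight blend partially cancel, which is the variance reduction the blended strategy relies on.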

As a side note, the cost of the short window (4 and 5 month) strategies is relatively high. However, since there are many rolling periods when these models are the best performing of the group, there can still be a benefit to including them. With them in the blend, we still see a reduction in the dispersion around the expected value function.

Expanding the Multitude of Models

To take the example even further down the multi-model path, we can look at the same analysis for varying lookback windows for a price-minus-moving-average model and an exponentially weighted moving average model.

Source: Global Financial Data and Kenneth French Data Library. Calculation by Newfound. Returns are backtested and hypothetical. Returns assume the reinvestment of all distributions. Returns are gross of all fees. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

Source: Global Financial Data and Kenneth French Data Library. Calculation by Newfound. Returns are backtested and hypothetical. Returns assume the reinvestment of all distributions. Returns are gross of all fees. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.
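For reference, the two additional trend measures can be sketched as simple price-versus-average rules. The exact signal definitions are not given here, so the comparisons below (latest price against a simple moving average, and latest price against an exponentially weighted average with `alpha = 2 / (span + 1)`) are assumptions.

```python
def price_minus_ma_signal(prices, window):
    """True when the latest price is above its simple moving average."""
    recent = prices[-window:]
    return prices[-1] > sum(recent) / len(recent)

def ewma(prices, span):
    """Exponentially weighted moving average with alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1.0)
    avg = prices[0]
    for p in prices[1:]:
        avg = alpha * p + (1.0 - alpha) * avg
    return avg

def ewma_signal(prices, span):
    """True when the latest price is above its EWMA."""
    return prices[-1] > ewma(prices, span)
```

A True signal would map to holding the index and a False signal to holding the risk-free asset, mirroring the long/flat time-series momentum models.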

And finally, we can combine all three trend-following measurement style blends into a final composite blend.

Source: Global Financial Data and Kenneth French Data Library. Calculation by Newfound. Returns are backtested and hypothetical. Returns assume the reinvestment of all distributions. Returns are gross of all fees. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

As with nearly every study on diversification, the overall blend is not the best by all metrics. In this case, its cost is higher than the EWMA blended model and its dispersion is higher than the TS blended model. But it exhibits the type of middle-of-the-road characteristics that lead to results that are robust to an uncertain future.

Conclusion

Long/flat trend-following strategies have payoff profiles similar to call options, with larger upsides and limited downsides. Unlike call options (and all derivative securities), which pay a deterministic amount based on the price of the underlying security, the payoff of a trend-following strategy is uncertain.

Using historical data, we can calculate the expected payoff profile and the dispersion around it. We find that by blending a variety of trend-following models, both in how they measure trend and the length of the lookback window, we can often reduce the implied cost of the call option and the dispersion of outcomes.

A backtest of an individual trend-following model can look the best over a given time period, but there are many factors that play into whether that performance will be valid going forward. The assets have to behave similarly, potentially both on an absolute and relative basis, and an investor has to hold the investment for a long enough time to weather short-term underperformance.

A multi-model approach can address both of these.

It will reduce the model specification risk that is present ex-ante. It will not pick the best model, but then again, it will not pick the worst.

From an investor perspective, this diversification reduces the spread of outcomes, which can lead to an easier product to hold as a long-term investment. Diversification among the models may not always be present (i.e. when style risk dominates and all trend-following strategies do poorly), but when it is, it reduces the chance of taking on uncompensated risks.

Taking on compensated risks is a necessary part of investing, and in the case of trend-following, the style risk is something we desire. Removing as many uncompensated risks as possible leads to more pure forms of this style risk and strategies that are robust to unfavorable specifications.

Drawdowns and Portfolio Longevity

This post is available as a PDF download here.

Summary

  • While retirement planning is often performed with Monte Carlo simulations, investors only experience a single path.
  • Large or prolonged drawdowns early in retirement can have a significant impact upon the probability of success.
  • We explore this idea by simulating returns of a 60/40 portfolio and measuring the probability of portfolio failure based upon a quantitative measure of risk called the Ulcer Index.
  • We find that a high Ulcer Index reading early in an investor’s retirement can dramatically increase the probability of failure as well as decrease the expected longevity of a portfolio.

Introduction

At Newfound we often say, “while other asset managers focus on alpha, our first focus is on risk.”

Not that there is anything wrong with the pursuit of alpha.  We’d argue that the pursuit of alpha is actually a necessary component for well-functioning financial markets.

It’s simply that we have never met a financial advisor who has built a financial plan that assumed any sort of alpha.  Alpha is great if we can harvest it, but the empirical evidence suggesting how difficult that can be (both for the manager net-of-fees as well as the investor behaviorally) would make the presumption of achieving alpha rather bold.

Furthermore, alpha is a zero-sum game: we can’t all plan for it.

Risk, however, is a crucial element of every investor’s plan.  Bearing too little risk can lead to a portfolio that “fails slowly,” falling short of achieving the escape velocity required to outpace inflation.  Bearing too much risk, however, can lead to sudden and catastrophic ruin: a case of “failing fast.”

When investors hit retirement, the usual portfolio math changes.  While we’re taught in Finance 101 that the order of returns does not matter, the introduction of portfolio withdrawals makes the order of returns a large determinant of plan success.  This phenomenon is known as “sequence risk” and it peaks in the years just before and after retirement.

Typically, we look at returns through the lens of the investment.  In retirement, however, what really matters is the returns of the investor.

We’re often told that our primitive brain, trained on the African veldt, is unsuited for investing.  Yet our brain seems to understand quite well that we do not get to live our lives as the average of a Monte Carlo simulation.

If we lose our arm to a lion because we did not flee when we heard a rustle in the bushes, we do not end up with half of an arm because of all the other parallel universes where we did flee.  On the timeline we live, the situation is binary.

As investors, the same is true.  We live but a single path and there are very real, very permanent knock-out conditions we need to be aware of.  Prolonged and significant drawdowns during the first years of retirement rank among the most dangerous.

Drawdowns and the Risk of Ruin

A retirement plan typically establishes a safe withdrawal rate.  This is the amount of inflation-adjusted money an investor can withdraw from their portfolio every year and still retain a sufficiently high probability that they will not run out of money before they die.

A well-established (albeit controversial) rule is that 4% of an investor’s portfolio level at retirement is usually an appropriate withdrawal amount.  For example, if an investor retires with a $1,000,000 portfolio, they can theoretically safely withdraw $40,000 a year.  Another way to think of this is that the portfolio reflects 25 years of spending assuming growth matches inflation.

The problem with portfolio drawdowns is that the withdrawal rate now reflects a larger proportion of capital unless it is commensurately adjusted downward.  For example, if the portfolio falls to $700,000, a $40,000 withdrawal is now 5.7% of capital and the portfolio reflects just 17.5 years of spending units.

Even shallow, prolonged drawdowns can have a damaging effect.  If the portfolio falls to $900,000 and stays stagnant for the next five years, the $40,000 withdrawals grow from representing 4% of the portfolio to nearly 5.5% of the portfolio.  If we do not adjust the withdrawal, at five years into retirement we have gone from 25 spending units to 18.5, losing a year and a half of portfolio longevity.
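The arithmetic behind these figures is easy to reproduce. For the shallow, prolonged case the sketch assumes the $900,000 balance is measured after the first year's withdrawal and is followed by four flat years of $40,000 withdrawals; that timing is an assumption chosen because it reproduces the 18.5 spending units quoted above.

```python
def spending_snapshot(balance, annual_withdrawal):
    """Withdrawal as a share of capital and years of spending remaining."""
    return annual_withdrawal / balance, balance / annual_withdrawal

rate_start, years_start = spending_snapshot(1_000_000, 40_000)   # 4.0%, 25 years
rate_drop, years_drop = spending_snapshot(700_000, 40_000)       # about 5.7%, 17.5 years

# Shallow, prolonged case (timing assumed, see the note above).
balance = 900_000
for _ in range(4):
    balance -= 40_000
rate_flat, years_flat = spending_snapshot(balance, 40_000)       # about 5.4%, 18.5 years

# Five years in, a portfolio that merely kept pace with inflation would
# still hold 20 spending units.
years_lost = (years_start - 5) - years_flat                      # 1.5 years
```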

As sudden and steep drawdowns can be just as damaging as shallow and prolonged ones, we prefer to use a quantitative measure known as the Ulcer Index to measure this risk.  Specifically, the Ulcer Index is calculated as the root mean square of monthly drawdowns, capturing both severity and duration simultaneously.
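A minimal sketch of the calculation, assuming drawdowns are measured in percent from the running peak of a monthly value series:

```python
import math

def ulcer_index(values):
    """Root mean square of percentage drawdowns from the running peak."""
    peak = values[0]
    squared = []
    for v in values:
        peak = max(peak, v)
        drawdown_pct = 100.0 * (v / peak - 1.0)   # zero or negative
        squared.append(drawdown_pct ** 2)
    return math.sqrt(sum(squared) / len(squared))
```

Because every month below the prior peak contributes to the sum, a drawdown that lingers raises the index even if it never deepens, while squaring makes deep drawdowns count disproportionately.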

In an effort to demonstrate the damaging impact of drawdowns early in retirement, we will run the following experiment:

  • Generate 250,000 simulations, each block-bootstrapped from monthly real U.S. equity and real U.S. 5-year Treasury bond returns from 1918 – 2018.
  • Assume a 65-year-old investor with a $1,000,000 starting portfolio and a fixed real $3,333 withdrawal monthly ($40,000 annual).
  • Assume the investor holds a 60/40 portfolio at all times.
  • For each simulation:
    • Calculate the Ulcer Index of the first five years of portfolio returns (ignoring withdrawals).
    • Determine how many years until the portfolio runs out of money.
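The building blocks of the experiment above can be sketched as follows. The block length, the timing of withdrawals (taken at the start of each month, before that month's return), and all function names are assumptions for illustration; the historical return series themselves are not included.

```python
import random

def sixty_forty(equity_returns, bond_returns):
    """Monthly returns of a portfolio rebalanced to 60/40 every month."""
    return [0.6 * e + 0.4 * b for e, b in zip(equity_returns, bond_returns)]

def block_bootstrap(history, n_months, block_len, rng):
    """Resample `history` in contiguous blocks until n_months are drawn."""
    path = []
    while len(path) < n_months:
        start = rng.randrange(len(history) - block_len + 1)
        path.extend(history[start:start + block_len])
    return path[:n_months]

def months_until_ruin(returns, start_balance=1_000_000.0, withdrawal=3_333.0):
    """Months survived before a withdrawal can no longer be met in full.

    Returns None if the portfolio lasts through the whole path.
    """
    balance = start_balance
    for month, r in enumerate(returns):
        if balance < withdrawal:
            return month
        balance = (balance - withdrawal) * (1.0 + r)
    return None

def simulate_paths(history, n_sims, n_months, block_len, seed=0):
    """Draw n_sims bootstrapped return paths from one historical series."""
    rng = random.Random(seed)
    return [block_bootstrap(history, n_months, block_len, rng)
            for _ in range(n_sims)]
```

Sampling contiguous blocks rather than single months preserves some of the serial dependence in returns, which matters here because prolonged drawdowns are exactly what the experiment is trying to capture.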

Based upon this data, below we plot the probability of failure – i.e. the probability we run out of money before we die – given an assumed age of death as well as the Ulcer Index realized by the portfolio in the first five years of retirement.

As an example of how to read this graph, consider the darkest blue line in the middle of the graph, which reflects an assumed age of death of 84.  Along the x-axis are different bins of Ulcer Index levels, with lower numbers reflecting fewer and less severe drawdowns, while higher numbers reflect steeper and more frequent ones.

As we trace the line, we can see that the probability of failure – i.e. running out of money before death – increases dramatically as the Ulcer Index increases.  While for shallow and infrequent drawdowns the probability of failure is <5%, we can see that the probability approaches 50% for more severe, frequent losses.
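Each line on such a graph is a conditional frequency: within every Ulcer Index bin, the share of simulations that ran out of money before the assumed age of death. A minimal sketch, with a hypothetical bin width and a hypothetical input layout of `(ulcer_index, years_until_ruin)` pairs:

```python
def failure_probability_by_bin(results, age_at_death, bin_width=2.5, retire_age=65):
    """Share of simulations that run out of money before death, per Ulcer bin.

    `results` holds (ulcer_index, years_until_ruin) pairs, with
    years_until_ruin set to None when the portfolio never runs out.
    """
    horizon = age_at_death - retire_age
    counts = {}
    for ulcer, years in results:
        bin_start = int(ulcer // bin_width) * bin_width
        total, failed = counts.get(bin_start, (0, 0))
        is_failure = years is not None and years < horizon
        counts[bin_start] = (total + 1, failed + int(is_failure))
    return {b: failed / total for b, (total, failed) in sorted(counts.items())}
```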

Beyond the binary question of failure, it is also important to consider when a portfolio runs out of money relative to when we die.  Below we plot how many years prior to death a portfolio runs out of money, on average, based upon the Ulcer Index.

Once again using the darkest blue line as an example, we can see that for most minor-to-moderate Ulcer Index levels, the portfolio would only run out of money a year or two before we die in the case of failure.  For more extreme losses, however, the portfolio can run out of money a full decade before we kick the bucket.

It is worth stressing here that these Ulcer Index readings are derived using simulations based upon prior realized U.S. equity and fixed income returns.  In other words, while improbable (see the histogram below), extreme readings are not impossible.

It is worth further acknowledging that U.S. assets have experienced some of the highest realized risk premia in the world, and more conservative estimates may put a higher probability mass on more extreme Ulcer Index readings.

Conclusion

For early retirees, large or prolonged drawdowns early in retirement can have a significant impact on the probability of success.

In this commentary, we capture both the depth and duration of drawdowns using a single metric known as the Ulcer Index.  We simulate 250,000 possible return paths for a 60/40 portfolio and calculate the Ulcer Index in the first five years of returns.  We then plot the probability of failure as well as expected portfolio longevity conditional upon the Ulcer Index level realized.

We clearly see a positive relationship between failure and Ulcer Index, with larger and more prolonged drawdowns earlier in retirement leading to a higher probability of failure.  This phenomenon is precisely why investors tend to de-risk their portfolios over time.

While the right risk profile and a well-diversified portfolio make for a strong foundation, we believe that investors should also consider expanding their investment palette to include alternative assets and style premia that may be more defensive oriented in nature.  For example, defensive equities (e.g. low-volatility and quality approaches) have historically demonstrated an ability to reduce drawdown risk.  Diversified, multi-asset style premia also tend to exhibit low correlation to traditional risk factors and a low intrinsic style premia.

Here at Newfound, we focus on trend equity strategies, which seek to overlay trend-following approaches on top of equity exposures in an effort to reduce left-tail risk and create a higher quality of return profile.

However an investor chooses to build their portfolio, it should be risk that is at the forefront of their mind.

Fragility Case Study: Dual Momentum GEM

This post is available as a PDF download here.

Summary

  • Recent market volatility has caused many tactical models to make sudden and significant changes in their allocation profiles.
  • Periods such as Q4 2018 highlight model specification risk: the sensitivity of a strategy’s performance to specific implementation decisions.
  • We explore this idea with a case study, using the popular Dual Momentum GEM strategy and a variety of lookback horizons for portfolio formation.
  • We demonstrate that the year-to-year performance difference can span hundreds, if not thousands, of basis points between the implementations.
  • By simply diversifying across multiple implementations, we can dramatically reduce model specification risk and even potentially see improvements in realized metrics such as Sharpe ratio and maximum drawdown.

Introduction

Among do-it-yourself tactical investors, Gary Antonacci’s Dual Momentum is the strategy we tend to see implemented the most.  The Dual Momentum approach is simple: by combining both relative momentum and absolute momentum (i.e. trend following), Dual Momentum seeks to rotate into areas of relative strength while preserving the flexibility to shift entirely to safety assets (e.g. short-term U.S. Treasury bills) during periods of pervasive, negative trends.

In our experience, the precise implementation of Dual Momentum tends to vary (with various bells-and-whistles applied) from practitioner to practitioner.  The most popular benchmark model, however, is the Global Equities Momentum (“GEM”), with some variation of Dual Momentum Sector Rotation (“DMSR”) a close second.

Recently, we’ve spoken to several members in our extended community who have bemoaned the fact that Dual Momentum kept them mostly aggressively positioned in Q4 2018 and signaled a defensive shift at the beginning of January 2019, at which point the S&P 500 was already in a -14% drawdown (having peaked at over -19% on December 24th).  Several DIYers even decided to override their signal in some capacity, either ignoring it entirely, waiting a few days for “confirmation,” or implementing some sort of “half-and-half” rule where they are taking a partially defensive stance.

Ignoring the fact that a decision to override a systematic model somewhat defeats the whole point of being systematic in the first place, this sort of behavior highlights another very important truth: there is a significant gap of risk that exists between the long-term supporting evidence of an investment style (e.g. momentum and trend) and the precise strategy we attempt to implement with (e.g. Dual Momentum GEM).

At Newfound, we call that gap model specification risk.  There is significant evidence supporting both momentum and trend as quantitative styles, but the precise means by which we measure these concepts can lead to dramatically different portfolios and outcomes.  When a portfolio’s returns are highly sensitive to its specification – i.e. slight variations in returns or model parameters lead to dramatically different return profiles – we label the strategy as fragile.

In this brief commentary, we will use the Global Equities Momentum (“GEM”) strategy as a case study in fragility.

Global Equities Momentum (“GEM”)

To implement the GEM strategy, an investor merely needs to follow the decision tree below at the end of each month.
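A minimal sketch of the monthly decision, following the commonly published description of GEM rather than anything reproduced in this text: U.S. equities are first compared with Treasury bills (absolute momentum), then with international equities (relative momentum). The asset labels and function name are placeholders.

```python
def gem_allocation(us_equity_ret, intl_equity_ret, tbill_ret):
    """Pick the single holding for next month from trailing lookback returns."""
    if us_equity_ret > tbill_ret:                 # absolute momentum (trend)
        if us_equity_ret > intl_equity_ret:       # relative momentum
            return "US equities"
        return "International equities"
    return "Aggregate bonds"
```

The inputs are total returns over the chosen lookback, conventionally 12 months; changing only that lookback is what produces the different specifications compared below.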

From a practitioner standpoint, there are several attractive features of this model.  First, it is based upon the long-run evidence of both trend-following and momentum.  Second, it is very easy to model and generate signals for.  Finally, it is fairly lightweight from an implementation perspective: only twelve potential rebalances a year (and often far fewer), with the portfolio only holding one ETF at a time.

Despite the evidence that “simple beats complex,” the simplicity of GEM belies its inherent fragility.  Below we plot the equity curves for GEM implementations that employ different lookback horizons for measuring trend and momentum, ranging from 6 to 12 months.

Source: CSI Analytics.  Calculations by Newfound Research.  Returns are backtested and hypothetical.  Returns assume the reinvestment of all distributions.  Returns are gross of all fees except for underlying ETF expense ratios.  None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary.  You cannot invest in an index.

We can see a significant dispersion in potential terminal wealth.  That dispersion, however, is not necessarily consistent with the notion that one formation period is inherently better than another.  While we would argue, ex-ante, that there should be little performance difference between a 9-month and 10-month lookback – they both, after all, capture the notion of “intermediate-term trends” – the former returned just 43.1% over the period while the latter returned 146.1%.

These total return figures further hide the year-to-year disparity that exists.  The 9-month model, for example, was not a consistent loser.  Below we plot these results, highlighting both the best (blue) and worst (orange) performing specifications.  We see that the yearly spread between these strategies can be hundreds-to-thousands of basis points; consider that in 2010, the strategy formed using a 10-month lookback returned 12.2% while the strategy formed using a 9-month lookback returned -9.31%.

Same thesis.  Same strategy.  Slightly different specification.  Dramatically different outcomes.  That single year is likely the difference between hired and fired for most advisors and asset managers.

Source: CSI Analytics.  Calculations by Newfound Research.  Returns are backtested and hypothetical.  Returns assume the reinvestment of all distributions.  Returns are gross of all fees except for underlying ETF expense ratios.  None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary.  You cannot invest in an index.


For those bemoaning their 2018 return, note that the 10-month specification would have netted a positive result!  That specification turned defensive at the end of October.

Now, some may cry “foul” here.  The evidence for trend and momentum is, after all, centuries in length and the efficacy of all these horizons is supported.  Surely the noise we see over this ten-year period would average out over the long run, right?

The unfortunate reality is that these performance differences are not expected to mean-revert.  The gambler’s fallacy would have us believe that bad luck in one year should be offset by good luck in another and vice versa.  Unfortunately, this is not the case.  While we would expect, at any given point in time, that each strategy has equal likelihood of experiencing good or bad luck going forward, that luck is expected to occur completely independently from what has happened in the past.

The implication is that performance differences due to model specification are not expected to mean-revert and are therefore expected to be random, but very permanent, return artifacts.1
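To make the point concrete, here is a minimal worked sketch.  It assumes (our simplification, in line with the independence described above) that the yearly return difference between two specifications is independent from year to year with mean zero and constant variance.  The symbols d_t, D_T, and s are introduced here purely for illustration.

```latex
\begin{aligned}
D_T &= \sum_{t=1}^{T} d_t, \qquad \mathbb{E}[d_t] = 0, \quad \operatorname{Var}(d_t) = s^2 \\
\mathbb{E}\left[ D_T \mid D_t \right] &= D_t \quad \text{for } T > t \\
\operatorname{Var}(D_T) &= T s^2, \qquad \operatorname{Var}\!\left( \tfrac{D_T}{T} \right) = \frac{s^2}{T}
\end{aligned}
```

Under these assumptions the average yearly difference does shrink toward zero as T grows, but the expected cumulative gap stays wherever luck has already put it, and its variance grows with time rather than shrinking.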

The larger problem at hand is that none of us have a hundred years to invest.  In reality, most investors have a few decades.  And we act with the temperament of having just a few years.  Therefore, bad luck can have very permanent and very scarring effects not only upon our psyche, but upon our realized wealth.

But consider what happens if we try to neutralize the role of model specification risk and luck by diversifying across the seven different models equally (rebalanced annually).  We see returns closer in line with the median result, a boost to realized Sharpe ratio, and a reduction in the maximum realized drawdown.

Source: CSI Analytics.  Calculations by Newfound Research.  Returns are backtested and hypothetical.  Returns assume the reinvestment of all distributions.  Returns are gross of all fees except for underlying ETF expense ratios.  None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary.  You cannot invest in an index.

These are impressive results given that all we employed was naïve diversification.
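For readers who want to see the mechanics, below is a minimal sketch of an equal-weight blend that is rebalanced annually.  The function names and the yearly returns are hypothetical and purely illustrative; they are not the backtested figures shown above.

```python
def blend_equal_weight(annual_returns_by_model):
    """Yearly returns of an equal-weight blend that is rebalanced annually.

    annual_returns_by_model maps a model label to a list of yearly returns
    (as decimals); every list must cover the same years.
    """
    series = list(annual_returns_by_model.values())
    n_years = len(series[0])
    if any(len(s) != n_years for s in series):
        raise ValueError("all models need the same number of years")
    # Resetting to equal weights each year means the blend's return in a
    # given year is the simple average of the models' returns that year.
    return [sum(s[y] for s in series) / len(series) for y in range(n_years)]


def terminal_wealth(returns, start=1.0):
    """Compound a list of yearly returns into ending wealth."""
    wealth = start
    for r in returns:
        wealth *= 1.0 + r
    return wealth


# Hypothetical yearly returns for three lookback specifications (not real data).
models = {
    "9-month": [-0.09, 0.05, 0.12],
    "10-month": [0.12, -0.02, 0.10],
    "12-month": [0.03, 0.04, 0.08],
}
blend = blend_equal_weight(models)
blend_wealth = terminal_wealth(blend)
```

In any single year the blend can never be the worst specification, because its return is an average of the individual results for that year.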

Conclusion

The odd thing about strategy diversification is that it guarantees we will be wrong.  Each and every year, we will, by definition, allocate at least part of our capital to the worst performing strategy.  The potential edge, however, is in being vaguely wrong rather than precisely wrong.  The former is annoying.  The latter can be catastrophic.

In this commentary we use the popular Dual Momentum GEM strategy as a case study to demonstrate how model specification choices can lead to performance differences that span hundreds, if not thousands, of basis points a year.  Unfortunately, we should not expect these performance differences to mean-revert.  The realizations of good and bad luck are permanent, and potentially very significant, artifacts within our track records.

By simply diversifying across the different models, however, we can dramatically reduce specification risk and thereby reduce strategy fragility.

To be clear, no amount of diversification will protect you from the risk of the style.  As we like to say, “risk cannot be destroyed, only transformed.”  In that vein, trend following strategies will always incur some sort of whipsaw risk.  The question is whether it is whipsaw related to the style as a whole or to the specific implementation.

For example, in the graphs above we can see that Dual Momentum GEM implemented with a 10-month formation period experienced whipsaw in 2011 when few of the other implementations did.  This is more specification whipsaw than style whipsaw.  On the other hand, we can see that almost all the specifications exhibited whipsaw in late 2015 and early 2016, an indication of style whipsaw, not specification whipsaw.

Specification risk we can attempt to control for; style risk is just something we have to bear.

At Newfound, evidence such as this informs our own trend-following mandates.  We seek to diversify ourselves across the axes of what (“what are we investing in?”), how (“how are we making the decisions?”), and when (“when are we making those decisions?”) in an effort to reduce specification risk and provide the greatest style consistency possible.


 

Is Multi-Manager Diversification Worth It?


Summary

  • Portfolio risk is traditionally quantified by volatility.  The benefits of diversification are measured in how portfolio volatility is changed with the addition or subtraction of different investments.
  • Another measure of portfolio risk is the dispersion in terminal wealth: a measure that attempts to capture the potential difference in realized returns. For example, two equity managers that each hold 30 stock portfolios may exhibit similar volatility levels but will likely have very different realized results.
  • In this commentary we explore existing literature covering the potential diversification benefits that can arise from combining multiple managers together.
  • Empirical evidence suggests that in heterogeneous categories (e.g. many hedge fund styles), combining managers can reduce portfolio volatility. Yet even in homogeneous categories (e.g. equity style boxes), combining managers can have a pronounced effect on reducing the dispersion in terminal wealth.
  • Finally, we address the question as to whether manager diversification leads to dilution, arguing that a combination of managers will reduce idiosyncratic process risks but maintain overall style exposure.

Introduction

In their 2014 paper The Free Lunch Effect: The Value of Decoupling Diversification and Risk, Croce, Guinn, and Robinson draw a distinction between the risk reduction effects that occur due to de-risking and those that occur due to diversification benefits.

To illustrate the distinction, the authors compare the volatility of an all equity portfolio versus a balanced stock/bond mix.  In the 1984-2014 sample period, they find that the all equity portfolio has an annualized volatility of 15.25% while the balanced portfolio has an annualized volatility of just 9.56%.

Over 75% of this reduction in volatility, however, is due simply to the fact that bonds were much less volatile than stocks over the period.  In fact, of the 568-basis-point reduction, only 124 basis points were due to actual diversification benefits.
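The split can be sketched in a few lines.  The approach below is one common way to separate the two effects, and we are assuming (not asserting) that it mirrors the authors' method: de-risking is the gap between the riskier asset's volatility and the weighted-average volatility, and diversification is the gap between the weighted-average volatility and the actual portfolio volatility.  The 60/40 weights, 15%/5% volatilities, and zero correlation are hypothetical inputs; only the 568 and 124 basis point figures come from the text above.

```python
import math


def two_asset_vol(w1, vol1, vol2, corr):
    """Volatility of a two-asset portfolio with weight w1 in asset 1."""
    w2 = 1.0 - w1
    variance = (w1 * vol1) ** 2 + (w2 * vol2) ** 2 + 2.0 * w1 * w2 * corr * vol1 * vol2
    return math.sqrt(variance)


def split_vol_reduction(w1, vol1, vol2, corr):
    """Split the volatility reduction versus holding only asset 1.

    Returns (de_risking, diversification), both in volatility units.
    """
    w2 = 1.0 - w1
    weighted_avg = w1 * vol1 + w2 * vol2           # volatility with no diversification
    portfolio = two_asset_vol(w1, vol1, vol2, corr)
    de_risking = vol1 - weighted_avg               # from holding a calmer asset
    diversification = weighted_avg - portfolio     # from imperfect correlation
    return de_risking, diversification


# Hypothetical inputs, for illustration only.
de_risking, diversification = split_vol_reduction(0.6, 0.15, 0.05, 0.0)

# Share of the reduction attributable to de-risking, using the figures quoted above.
total_bp, diversification_bp = 568, 124
de_risking_share = (total_bp - diversification_bp) / total_bp
```

With the quoted figures, the de-risking share works out to a little over three quarters, consistent with the "over 75%" statement.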

Why does this matter?

Because de-risking carries none of the benefits of diversification.  If there is a commensurate trade-off between expected return and risk, then all we have done is reduced the expected return of our portfolio.1

It is only by combining assets of like volatility – and, it is assumed, like expected return – that we should be able to enjoy the free lunch of diversification.

Unfortunately, unless you are willing to apply leverage (e.g. risk parity), such free lunch opportunities across assets are limited in practice. The classic example of inter-asset diversification, though, is taught in Finance 101: as we add more stocks to a portfolio, we drive the contribution of idiosyncratic volatility towards zero.
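The Finance 101 result can be sketched with the textbook formula for an equal-weight portfolio.  The sketch assumes every stock has the same volatility and every pair of stocks has the same correlation; the 30% volatility and 0.2 correlation are hypothetical inputs.

```python
import math


def equal_weight_vol(n, stock_vol, avg_corr):
    """Volatility of an equal-weight portfolio of n stocks.

    Assumes identical volatility for every stock and a single common
    pairwise correlation (a simplification for illustration).
    """
    # The 1/n term is the idiosyncratic piece that diversifies away;
    # the correlation term is the systematic piece that remains.
    variance = stock_vol ** 2 * (1.0 / n + (1.0 - 1.0 / n) * avg_corr)
    return math.sqrt(variance)


# Hypothetical inputs: 30% stock volatility, 0.2 average pairwise correlation.
vol_by_count = {n: equal_weight_vol(n, 0.30, 0.20) for n in (1, 5, 30, 100)}
floor = 0.30 * math.sqrt(0.20)  # the limit as n grows without bound
```

Most of the decline happens in the first few dozen names, after which portfolio volatility flattens out near the systematic floor.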

Yet volatility is only one way to measure risk.  If we build a portfolio of 30 stocks and you build a portfolio of 30 stocks, the portfolios may have nearly identical levels of volatility, but we almost assuredly will end up with different realized results.  This difference between the expected and the realized is captured by a measure known as terminal wealth dispersion, first introduced by Robert Radcliffe in his book Investment: Concepts, Analysis, Strategy.

This form of risk naturally arises when we select between investment managers.  Two managers may both select securities from the same universe using the same investment thesis, but the realized results of their portfolios can be starkly different.  In rare cases, the specific choice of one manager over another can even lead to catastrophic results.

The selection of a manager reflects not only an allocation to an asset class, but also reflects an allocation to a process.  In this commentary, we ask: how much diversification benefit exists in process diversification?

The Theory Behind Manager Diversification

In Factors from Scratch, the research team at O’Shaughnessy Asset Management (OSAM), in partnership with anonymous blogger Jesse Livermore, digs into the driving elements behind value and momentum equity strategies.

They find that value stocks do tend to exhibit negative EPS growth, but this decay in fundamentals is offset by multiple expansion.  In other words, markets do appear to correctly identify companies with contracting fundamentals, but they also exaggerate and over-extrapolate that weakness.  The historical edge for the strategy has been that the re-rating – measured via multiple expansion – tends to overcompensate for the contraction in fundamentals.

For momentum, OSAM finds a somewhat opposite effect.  The strategy correctly identifies companies with strengthening fundamentals, but during the holding period a valuation contraction occurs as the market recognizes that its outlook might have been too optimistic. Historically, however, the growth outweighed the contraction to create a net positive effect.

These are the true, underlying economic and behavioral effects that managers are trying to capture when they implement value and momentum strategies.

These are not, however, effects we can observe directly in the market; they are effects that we have to forecast.  To do so, we have to utilize semi-noisy signals that we believe are correlated. Therefore, every manager’s strategy will be somewhat inefficient at capturing these effects.

For example, there are a number of quantitative measures we may apply in our attempt to identify value opportunities; e.g. price-to-book, price-to-earnings, and EBITDA-to-enterprise-value to name a few. Two different noisy signals might end up with different performance just due to randomness.

This noise between signals is further compounded when we consider all the other decisions that must be made in the portfolio construction process.  Two managers may use the same signals and still end up with very different portfolios based upon how the signals are translated into allocations.

Consider this: Morningstar currently2 lists 1,217 large-cap value funds in its mutual fund universe and trailing 1-year returns ranged from 1.91% to -22.90%. This is not just a case of extreme outliers, either: the spread between the 10th and 90th percentile returning funds was 871 basis points.

It bears repeating that these are funds that, in theory, are all trying to achieve the same goal: large-cap value exposure.

Yet this result is not wholly surprising to us.  In Separating Ingredients and Recipe in Factor Investing we demonstrated that the performance dispersion between different momentum strategy definitions (e.g. momentum measure, look-back length, rebalance frequency, weighting scheme, et cetera) was larger than the performance dispersion between the traditional Fama-French factors themselves in 90% of rolling 1-year periods.  As it turns out, intra-factor differences can cause greater dispersion than inter-factor differences.

Without an ex-ante view as to the superiority of one signal, one process, or one fund versus another, it seems prudent for a portfolio to have diversified exposure to a broad range of signals that seem plausibly related to the underlying phenomenon.

Literature Review

While foundational literature on modern portfolio diversification extends back to the 1950s, little has been written in the field of manager diversification. While it is a well-established teaching that a portfolio of 25-40 stocks is typically sufficient to reduce idiosyncratic risk, there is no matching rule for how many managers to combine together.

One of the earliest articles on the topic was written by Edward O’Neal in 1997, titled How Many Mutual Funds Constitute a Diversified Mutual Fund Portfolio?

Published in the Financial Analysts Journal, this article explores risk across two different dimensions: the volatility of returns over time and the dispersion in terminal period wealth.  Again, the idea behind the latter measure is that two investors with identical horizons and different investments will achieve different terminal wealth levels, even if those investments have the same volatility.

Exploring equity mutual fund returns from 1986 to 1997, the study adopts a simulation-based approach to constructing portfolios and tracking returns.  Multi-manager portfolios of varying sizes are randomly constructed and compared against other multi-manager portfolios of the same size.

O’Neal finds that while combining managers has little-to-no effect on volatility (manager returns were too homogeneous), it has a significant effect upon the dispersion of terminal wealth.  To quote the article,

Holding more than a single mutual fund in a portfolio appears to have substantial diversification benefits. The traditional measure of volatility, the time-series standard deviation, is not greatly influenced by holding multiple funds. Measures of the dispersion in terminal-wealth levels, however, which are arguably more important to long-term investors than time-series risk measures, can be reduced significantly. The greatest portion of the reduction occurs with the addition of small numbers of funds. This reduction in terminal-period wealth dispersion is evident for all holding periods studied. Two out of three downside risk measures are also substantially reduced by including multiple funds in a portfolio. These findings are especially important for investors who use mutual funds to fund fixed-horizon investment goals, such as retirement and college savings.

Allocating to three managers instead of just one could reduce the dispersion in terminal wealth by nearly 50%, an effect found to be quite consistent across the different time horizons measured.
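The flavor of this simulation-based approach can be sketched as follows.  This is not O’Neal’s code or data: the fund universe is synthetic, with every fund earning a shared style return plus its own independent noise, and all parameters (return, volatilities, number of funds, years, and draws) are hypothetical.

```python
import random
import statistics


def simulate_fund_universe(n_funds, n_years, rng,
                           style_mean=0.08, style_vol=0.15, idio_vol=0.05):
    """Synthetic yearly fund returns: shared style return plus fund-specific noise."""
    style = [rng.gauss(style_mean, style_vol) for _ in range(n_years)]
    return [[s + rng.gauss(0.0, idio_vol) for s in style] for _ in range(n_funds)]


def portfolio_returns(funds):
    """Yearly returns of an equal-weight, annually rebalanced mix of funds."""
    n_years = len(funds[0])
    return [sum(f[y] for f in funds) / len(funds) for y in range(n_years)]


def terminal_wealth(returns):
    wealth = 1.0
    for r in returns:
        wealth *= 1.0 + r
    return wealth


def dispersion_by_size(universe, sizes, n_draws, rng):
    """For each portfolio size, draw random fund combinations and measure
    (std. dev. of terminal wealth across draws, average time-series volatility)."""
    results = {}
    for k in sizes:
        wealth, vols = [], []
        for _ in range(n_draws):
            picks = rng.sample(universe, k)
            rets = portfolio_returns(picks)
            wealth.append(terminal_wealth(rets))
            vols.append(statistics.stdev(rets))
        results[k] = (statistics.stdev(wealth), statistics.mean(vols))
    return results


rng = random.Random(0)  # fixed seed so the sketch is repeatable
universe = simulate_fund_universe(n_funds=100, n_years=10, rng=rng)
results = dispersion_by_size(universe, sizes=[1, 3, 6], n_draws=2000, rng=rng)
```

In this synthetic setup, time-series volatility barely moves with the number of funds because the shared style return dominates it, while the spread in terminal wealth falls noticeably as funds are added.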

In 1999, O’Neal teamed up with L. Franklin Fant to publish Do You Need More than One Manager for a Given Equity Style? Adopting a similar simulation-based approach, Fant and O’Neal explored multi-manager equity portfolios in the context of the style-box framework.

And, as before, they find that taking a multi-manager approach has little effect upon portfolio volatility.

It does, however, again prove to have a significant effect on the deviation in terminal wealth.

To quote the paper,

Regardless of the style category considered, the variability in terminal wealth levels is significantly reduced by using more managers. The first few additional managers make the most difference, as terminal wealth standard deviation declines at a decreasing rate with the number of managers. Concentrating on the variability of periodic portfolio returns fails to document the advantage of using multiple managers within style categories.

Second, some categories benefit more from additional managers than others. Plan sponsors would do well to allocate relatively more managers to the styles that display the greatest diversification benefits. Growth styles and small-cap styles appear to offer the greatest potential for diversification.

In 2002, François-Serge Lhabitant and Michelle Learned pursued a similar vein of research in the realm of hedge funds in their article Hedge Fund Diversification: How Much is Enough?  They employ the same simulation-based approach but evaluate diversification effects within the different hedge fund styles.

They find that while diversification does little to affect the expected return for a given style, it does appear to help reduce portfolio volatility: sometimes quite significantly so. This somewhat contradictory result to the prior research is likely due to the fact that hedge funds within a given category exhibit far more heterogeneity in process and returns than do equity managers in the same style box.

(Note that while the graphs below only show the period 1990-1993, the paper explores three time periods, 1990-1993, 1994-1997, and 1998-2001, and finds a similar conclusion in all three.)

Perhaps most importantly, however, they find a rather significant reduction in risk characteristics like a portfolio’s realized maximum drawdown.

To quote the article,

We find that naively adding more funds to a portfolio tends to leave returns stable, decrease the standard deviation, and reduce downside risk. Thus, diversification should be increased as long as the marginal benefits of adding a new asset to a portfolio exceeds the marginal cost.

If a sample of managers is relatively style pure, then a fewer number of managers will minimize the unsystematic risk of that style. On the contrary, if the sample is really heterogeneous, increasing the number of managers may still provide important diversification benefits.

Taken together, this literature paints an important picture:

  • Diversifying across managers in the same category will likely do little to reduce portfolio volatility, except in the cases where categories are broad enough to capture many heterogeneous managers.
  • Diversifying across managers appears to significantly reduce the potential dispersion in terminal wealth.

But why is minimizing “the dispersion of terminal wealth” important?  The answer is the same reason why we diversify in the first place: risk management.

The potential for high dispersion in terminal wealth means that we can have dramatically different outcomes based upon the choices we are making, placing significant emphasis on our skill in manager selection.  Choosing just one manager is “more right” style thinking rather than our preferred “less wrong.”

But What About Dilution?

The number one response we hear when we talk about manager diversification is: “when we combine managers, won’t we just dilute our exposure back to the market?”

The answer, as with all things, is: “it depends.”  For the sake of brevity, we’re just going to leave it with, “no.”

No?

No.

If we identify three managers as providing exposure to value, then it makes little logical sense that somehow a combination of them would suddenly remove that exposure.  Subtraction through addition only works if there is a negative involved; i.e. one of the managers would have to provide anti-value exposure to offset the others.

Remember that an active manager’s portfolio can always be decomposed into two pieces: the benchmark and a dollar-neutral long/short portfolio that isolates the active over/under-weights that manager has made.

To “dilute back to the benchmark,” we’d have to identify managers and then weight them such that all of their over/under-weights net out to equal zero.

Candidly, we’d be impressed if you managed to do that.  Especially if you combine managers within the same style who should all be, at least directionally, taking similar bets.  The dilution that occurs is only across those bets which they disagree on and therefore reflect the idiosyncrasies of their specific process.
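A toy example of the decomposition makes this tangible.  The four-stock benchmark, the two manager portfolios, and the helper names below are all hypothetical; active share (half the sum of absolute active weights) is used here only as a convenient summary of how far a portfolio sits from its benchmark.

```python
def active_weights(portfolio, benchmark):
    """Over/under-weights versus the benchmark (the dollar-neutral piece)."""
    names = set(portfolio) | set(benchmark)
    return {n: portfolio.get(n, 0.0) - benchmark.get(n, 0.0) for n in names}


def blend(portfolios):
    """Equal-weight combination of several managers' holdings."""
    names = set().union(*portfolios)
    return {n: sum(p.get(n, 0.0) for p in portfolios) / len(portfolios) for n in names}


def active_share(active):
    return 0.5 * sum(abs(w) for w in active.values())


# Hypothetical four-stock benchmark and two managers in the same style.
benchmark = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
manager_1 = {"A": 0.40, "B": 0.30, "C": 0.20, "D": 0.10}
manager_2 = {"A": 0.40, "B": 0.10, "C": 0.20, "D": 0.30}

active_1 = active_weights(manager_1, benchmark)
active_2 = active_weights(manager_2, benchmark)
active_blend = active_weights(blend([manager_1, manager_2]), benchmark)
```

Both managers overweight A and underweight C, and those bets survive in full in the blend.  They disagree on B and D, and only those positions partially net out, so the blend's active share falls but stays well above zero.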

What a multi-manager implementation allows us to diversify is our selection risk, leading to a return profile more “in-line” with a given style or category.  In fact, Lhabitant and Learned (2002) demonstrated this exact notion with a graph that plots the correlation of multi-manager portfolios with their broad category.  While somewhat tautological, an increase in manager diversification leads to a return profile closer to the given style than to the idiosyncrasies of those managers.

We can also see this with a practical example.  Below we take several available ETFs that implement quantitative value strategies and plot their rolling 52-week return relative to the S&P 500. We also construct a multi-manager index (“MM_IDX”) that is a naïve, equal-weight portfolio.  The only wrinkle to this portfolio is that ETFs are not introduced immediately, but rather slowly over a 12-month period.3

Source: CSI Analytics.  Calculations by Newfound Research.  It is not possible to invest in an index.  Returns are total returns (i.e. assume the reinvestment of all distributions) and are gross of all fees except for underlying expense ratios of ETFs. Past performance does not guarantee future results. 

 

We can see that while the multi-manager blend is never the best performing strategy, it is also never the worst.  Never the hero; never a zero.
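For those curious how an equal-weight index with a gradual phase-in might be built, here is one possible sketch.  The exact mechanics behind MM_IDX are not spelled out in this commentary, so the linear 12-month ramp, the function name, and the ETF labels below are assumptions made for illustration.

```python
def phase_in_weights(months_since_launch, ramp_months=12):
    """Equal-weight targets where new funds ramp in linearly (an assumed scheme).

    months_since_launch maps a fund label to the number of months it has been
    available; funds with zero or negative values are treated as not yet launched.
    """
    raw = {}
    for name, months in months_since_launch.items():
        if months <= 0:
            continue  # not available yet, so no weight
        # A fund counts as a fraction of a full member until the ramp completes.
        raw[name] = min(months, ramp_months) / ramp_months
    total = sum(raw.values())
    if total == 0:
        return {}
    return {name: w / total for name, w in raw.items()}


# Hypothetical example: one seasoned ETF, one launched six months ago, one not yet live.
weights = phase_in_weights({"ETF_A": 36, "ETF_B": 6, "ETF_C": 0})
```

Here the seasoned fund holds two thirds of the index and the newer fund one third, with the newer fund drifting up to a full equal share once its ramp completes.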

It should be noted that while manager diversification may be able to reduce the idiosyncratic returns that result from process differences, it will not prevent losses (or relative underperformance) of the underlying style itself.  In other words, we might avoid the full brunt of losses specific to the Sequoia Fund, but no amount of diversification would prevent the relative drag seen by the quantitative value style in general over the last decade.

We can see this in the graph above by the fact that all the lines generally tend to move together.  2015 was bad for value managers.  2016 was much better.  But we can also see that every once in a while, a specific implementation will hit a rough patch that is idiosyncratic to that approach; e.g. IWD in 2017 and most of 2018.

Multi-manager diversification is the tool that allows us to avoid the full brunt of this risk.

Conclusion

Taken together, the research behind manager diversification suggests:

  • In heterogeneous categories (e.g. many hedge fund styles), manager diversification may reduce portfolio volatility.
  • In more homogeneous categories (e.g. equity style boxes), manager diversification may reduce the dispersion in terminal wealth.
  • Multi-manager implementations appear to reduce realized portfolio risk metrics such as maximum drawdown. This is likely partially due to the reduction in portfolio volatility, but also due to a reduction in exposure to funds that exhibit catastrophic losses.
  • Multi-manager implementations do not necessarily “dilute” the portfolio back to market exposure, but rather “dilute” the portfolio back to the style exposure, reducing exposure to idiosyncratic process risk.

For advisors and investors, this evidence may cause a sigh of relief.  Instead of having to spend time trying to identify the best manager or the best process, there may be significant advantages to simply “avoiding the brain damage”4 and allocating equally among a few.  In other words, if you don’t know which low-volatility ETF to pick, just buy a couple and move on with your life.

But what are the cons?

  • A multi-manager approach may be tax inefficient, as we will need to rebalance allocations back to parity between the exposures.
  • A multi-manager approach may lead to fund bloat within a portfolio, doubling or tripling the number of holdings we have. While this is merely optical, except possibly in small portfolios, we recognize there exists an aversion to it.
  • By definition, performance will be middling: the cost of avoiding the full brunt of losers is that we also give up the full benefit of winners. We’re reluctant to label this as a con, as it is arguably the whole point of diversification, but it is worth pointing out that the same behavioral biases that emerge in portfolio reviews of asset allocation will likely re-emerge in reviews of manager selection, especially over short time horizons.

For investment managers, a natural interpretation of this research is that approaches blending different signals and portfolio construction methods together should lead to more consistent outcomes.  It should be no surprise, then, that asset managers adopting machine learning are finding significant advantages with ensemble techniques. After all, they invoke the low-hanging fruit of manager diversification.

Perhaps most interesting is that this research suggests that fund-of-funds really are not such bad ideas so long as costs can be kept under control.  As the asset management business continues to be more competitive, perhaps there is an edge – and a better client result – found in cooperation.

 
