The Research Library of Newfound Research

Author: Justin Sibears

From 2012-2019, Justin Sibears served as Managing Director and Portfolio Manager at Newfound Research. At Newfound, Justin was responsible for portfolio management, investment research, strategy development, and communication of the firm's views to clients.

Justin holds a Master of Science in Computational Finance and a Master of Business Administration from Carnegie Mellon University, as well as a BBA in Mathematics and Finance from the University of Notre Dame.

Timing Equity Returns Using Monetary Policy

This post is available as PDF download here.

Summary

  • Can the monetary policy environment be used to predict global equity market returns? Should we overweight/buy countries with expansionary monetary policy regimes and underweight/sell countries with contractionary monetary policy regimes?
  • In twelve of the fourteen countries studied, both nominal and real equity returns are higher (lower) when the central bank's most recent action was to cut (hike) rates. For example, nominal U.S. equity returns are 1.8% higher during expansionary environments, and real U.S. equity returns are 3.6% higher.  The gap is even larger outside the United States.
  • However, the monetary policy regime explains very little of the overall variation in equity returns from a statistical standpoint.
  • While many of the return differentials during expansionary vs. contractionary regimes seem large at first glance, few are statistically significant once we realistically account for the salient features of equity returns and monetary policy. In other words, we can’t be sure the return differentials didn’t arise simply due to luck.
  • As a result, evidence suggests that making buy/sell decisions on the equity markets of a given country using monetary policy regime as the lone signal is overly ambitious.

Can the monetary policy environment be used to predict global equity market returns?  Should we overweight/buy countries with expansionary monetary policy and underweight/sell countries with contractionary monetary policy?

Such are the softball questions that our readers tend to send in.

Intuitively, it’s clear that monetary policy has some type of impact on equity returns.  After all, if the Fed raised rates to 10% tomorrow, that would clearly impact stocks.

The more pertinent question though is if these impacts always tend to be in one direction.  It’s relatively straightforward to build a narrative around why this could be the case.  After all, the Fed’s primary tool to manage its unemployment and inflation mandates is the discount rate.  Typically, we think about the Fed hiking interest rates when the economy gets “too hot” and cutting them when it gets “too cold.”  If hiking (cutting) rates has the goal of slowing (stimulating) the economy, it’s plausible to think that equity returns would be pushed lower (higher).

There are a number of good academic papers on the subject. Ioannadis and Kontonikas (2006) is a good place to start. The paper investigates the impact of monetary policy shifts on equity returns in thirteen OECD countries1 from 1972 to 2002.

Their analysis can be split into two parts.  First, they explore whether there is a contemporaneous relationship between equity returns and short-term interest rates (i.e. how do equity returns respond to interest rate changes?)2.  If there is a relationship, are returns likely to be higher or lower in months where rates increase?

Source: “Monetary Policy and the Stock Market: Some International Evidence” by Ioannadis and Kontonikas (2006).

 

In twelve of the thirteen countries, there is a negative relationship between interest rate changes and equity returns.  Equity returns tend to be lower in months where short-term rates increase.  The relationship is statistically significant at the 5% level in eight of the countries, including the United States.

While these results are interesting, they aren’t of much direct use for investors because, as mentioned earlier, they are contemporaneous.  Knowing that equity returns are lower in months where short-term interest rates rise is actionable only if we can accurately predict the interest rate movements ahead of time.

As an aside, if there is one predictive interest rate model we subscribe to, it’s that height matters.

Fortunately, this is where the authors’ second avenue of analysis comes into play.  In this section, they first classify each month as being part of either a contractionary or an expansionary monetary policy regime.  A month is part of a contractionary regime if the last change in the discount rate was positive (i.e. the last action by that country’s central bank was a hike).  Similarly, a month is part of an expansionary regime if the last central bank action was a rate cut.
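As a rough sketch, this regime classification can be implemented in a few lines of pandas. The rate series below is purely hypothetical; the actual analysis uses each country's discount-rate history, and we assume the series begins with an actual rate action so every month can be labeled:

```python
import numpy as np
import pandas as pd

# Hypothetical monthly discount-rate changes (in percentage points); in
# practice these would come from the central bank's published decisions.
rate_changes = pd.Series(
    [0.25, 0.0, 0.0, -0.50, 0.0, 0.25],
    index=pd.period_range("2000-01", periods=6, freq="M"),
)

# Carry the sign of the most recent non-zero change forward: a month is in
# a contractionary regime if the last action was a hike, expansionary if
# the last action was a cut.
last_action = rate_changes.replace(0.0, np.nan).ffill()
regime = np.where(last_action > 0, "contractionary", "expansionary")
print(list(regime))
```

With these toy inputs, the hike in month one governs months one through three, the cut in month four governs months four and five, and the final hike flips the label back to contractionary.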

We illustrate this classification for the United States below.  Orange shading indicates contractionary regimes and gray shading indicates expansionary regimes.

The authors then regress monthly equity returns on a dummy variable representing which regime a month belongs to.  Importantly, this is not a contemporaneous analysis: we know whether the last rate change was positive or negative heading into the month.  Quoting the paper:

“The estimated beta coefficients associated with the local monetary environment variable are negative and statistically significant in six countries (Finland, France, Italy, Switzerland, UK, US).  Hence, for those countries our measure of the stance of monetary policy contains significant information, which can be used to forecast expected stock returns.  Particularly, we find that restrictive (expansive) monetary policy stance decreases (increases) expected stock returns.”

Do we agree?

Partially.  When we analyze the data using a similar methodology and with data updated through 20183, we indeed find a negative relationship between monetary policy environment and forward 1-month equity returns.  For example, annualized nominal returns in the United States were 10.6% and 8.8% in expansionary and contractionary regimes, respectively.  The gap is larger for real returns – 7.5% in expansionary environments and 3.9% in contractionary environments.

Source: Bloomberg, MSCI, Newfound Research. Past performance does not guarantee future results. Return data relies on hypothetical indices and is exclusive of all fees and expenses. Returns assume the reinvestment of dividends.
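As a rough illustration, regime-conditional annualized returns of the kind cited above can be computed along these lines. The function name and toy inputs are hypothetical sketches, not the data behind the charts:

```python
import pandas as pd

def annualized_return_by_regime(monthly_returns: pd.Series,
                                regime: pd.Series) -> pd.Series:
    """Annualized geometric return of the months falling in each regime.

    Assumes the two Series share the same monthly index, with `regime`
    taking values like "expansionary" / "contractionary".
    """
    def annualize(r: pd.Series) -> float:
        # Compound the monthly returns, then rescale to a 12-month rate.
        return (1.0 + r).prod() ** (12.0 / len(r)) - 1.0

    return monthly_returns.groupby(regime).apply(annualize)

# Toy check with constant 1% monthly returns: both regimes should annualize
# to (1.01)**12 - 1, roughly 12.68%.
rets = pd.Series([0.01] * 24)
reg = pd.Series(["expansionary"] * 12 + ["contractionary"] * 12)
print(annualized_return_by_regime(rets, reg).round(4))
```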

 

A similar, albeit more pronounced, pattern emerges when we go outside the United States and consider thirteen other countries.

 

Source: Bloomberg, MSCI, Newfound Research. Past performance does not guarantee future results. Return data relies on hypothetical indices and is exclusive of all fees and expenses. Returns assume the reinvestment of dividends.

 

The results are especially striking in ten of the fourteen countries examined; by comparison, the effect in the U.S. is relatively small.

 

Source: Bloomberg, MSCI, Newfound Research. Past performance does not guarantee future results. Return data relies on hypothetical indices and is exclusive of all fees and expenses. Returns assume the reinvestment of dividends.

 

That being said, we think the statistical significance (and therefore investing merit) is less obvious.  Now, it is certainly the case that many of these differences are statistically significant when measured traditionally.  In this sense, our results agree with Ioannadis and Kontonikas (2006).

However, there are two issues to consider.  First, the R2 values for the regressions are very low.  For example, the highest R2 in the paper is 0.037 for Finland.  In other words, the monetary regime models do not do a particularly great job explaining stock returns.

Second, it’s important to take a step back and think about how monetary regimes evolve.  Central banks, especially today, typically don’t raise rates one month, cut the next, raise the next, etc.  Instead, these regimes tend to last multiple months or years.  The traditional significance testing assumes the former type of behavior, when the latter better reflects reality.

Now, this wouldn’t be a major issue if stock returns were what statisticians call “IID” (independent and identically distributed).  The results of a coin flip are IID.  The probability of heads and tails are unchanged across trials and the result of one flip doesn’t impact the odds for the next.

Daily temperatures are not IID.  The distribution of temperatures is very different for a day in December than it is for a day in July, at least for most of us.  They are not identical.  Nor are they independent.  Today's high temperature gives us some information about tomorrow's: there is a good chance that tomorrow's high lands near today's.

Needless to say, stock returns behave more like temperatures than they do coin flips.  This combination of facts – stock returns being non-IID (exhibiting both heteroskedasticity4 and autocorrelation) and monetary policy regimes having the tendency to persist over the medium term – leads to false positives.  What at first glance look like statistically significant relationships are no longer up to snuff because the model was poorly constructed in the first place.

To flesh out these issues, we used two different simulation-based approaches to test for the significance of return differences across regimes.5

The first approach works as follows for each country:

  1. Compute the probability of expansionary and contractionary regimes using that country's actual history.
  2. Randomly classify each month into one of the two regimes using the probabilities from #1.
  3. Compute the difference between annualized returns in expansionary vs. contractionary regimes using that country’s actual equity returns.
  4. Return to #2, repeating 10,000 times total.
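A minimal Python sketch of this first, IID resampling approach follows; the function name and synthetic inputs are illustrative, not the actual country data:

```python
import numpy as np

rng = np.random.default_rng(0)

def annualize(r):
    """Annualized geometric return of a sequence of monthly returns."""
    return (1.0 + r).prod() ** (12.0 / len(r)) - 1.0

def iid_regime_simulation(returns, p_expansionary, n_sims=10_000):
    """Approach 1: relabel every month independently (IID regimes) and
    record each simulated expansionary-minus-contractionary differential."""
    returns = np.asarray(returns)
    diffs = np.empty(n_sims)
    for i in range(n_sims):
        exp_mask = rng.random(len(returns)) < p_expansionary
        if exp_mask.all() or not exp_mask.any():
            diffs[i] = np.nan  # degenerate draw: one regime is empty
            continue
        diffs[i] = annualize(returns[exp_mask]) - annualize(returns[~exp_mask])
    return diffs

# Null distribution for a hypothetical country: 30 years of monthly
# returns, expansionary 60% of the time.
diffs = iid_regime_simulation(rng.normal(0.007, 0.04, 360), 0.6, n_sims=2000)
print(np.nanpercentile(diffs, [2.5, 97.5]))
```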

This approach assumes that today’s monetary policy regime says nothing about what tomorrow’s may be. We have transformed monetary policy into an IID variable.  Below, we plot the regime produced by a single iteration of the simulation. Clearly, this is not realistic.

Source: Newfound Research

 

The second approach is similar to the first in all ways except how the monetary policy regimes are simulated.  The algorithm is:

  1. Compute the transition matrix for each country using that country’s actual history of monetary policy shifts. A transition matrix specifies the likelihood of moving to each regime state given that we were in a given regime the prior month.  For example, if last month was contractionary, we may have a 95% probability of staying contractionary and a 5% probability of moving to an expansionary state.
  2. Randomly classify each month into one of the two regimes using the transition matrix from #1. We have to determine how to seed the simulation (i.e. which state do we start off in).  We do this randomly using the overall historical probability of contractionary/expansionary regimes for that country.
  3. Compute the difference between annualized returns in expansionary vs. contractionary regimes using that country’s actual equity returns.
  4. Return to #2, repeating 10,000 times total.
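The second approach can be sketched as follows. Again, this is illustrative: the transition matrix here is estimated from a synthetic regime history rather than actual central bank data, and we assume both regimes appear in that history:

```python
import numpy as np

rng = np.random.default_rng(1)

def annualize(r):
    return (1.0 + r).prod() ** (12.0 / len(r)) - 1.0

def markov_regime_simulation(returns, regimes, n_sims=10_000):
    """Approach 2: simulate persistent regimes from the empirical transition
    matrix. `regimes` is 0/1 per month (0 = expansionary, 1 = contractionary)
    and must contain both states."""
    returns = np.asarray(returns)
    regimes = np.asarray(regimes)
    n = len(returns)

    # Empirical transition counts -> probabilities P[s, s'].
    P = np.zeros((2, 2))
    for a, b in zip(regimes[:-1], regimes[1:]):
        P[a, b] += 1.0
    P /= P.sum(axis=1, keepdims=True)

    p_contractionary = regimes.mean()  # unconditional frequency seeds paths
    diffs = np.empty(n_sims)
    for i in range(n_sims):
        path = np.empty(n, dtype=int)
        path[0] = rng.random() < p_contractionary
        for t in range(1, n):
            path[t] = rng.random() < P[path[t - 1], 1]
        if path.all() or not path.any():
            diffs[i] = np.nan  # path never left one regime
            continue
        diffs[i] = annualize(returns[path == 0]) - annualize(returns[path == 1])
    return diffs

# Synthetic but persistent regime history: two long regimes and a reversal.
hist = np.array([0] * 24 + [1] * 24 + [0] * 12)
diffs = markov_regime_simulation(rng.normal(0.007, 0.04, 60), hist, n_sims=500)
print(np.nanpercentile(diffs, [2.5, 97.5]))
```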

The regimes produced by this simulation look much more realistic.

Source: Newfound Research

 

When we compare the distribution of return differentials produced by each of the simulation approaches, we see that the second produces a wider range of outcomes.

 

Source: Newfound Research

 

In the table below, we present the confidence intervals for return differentials using each algorithm.  We see that the differentials are statistically significant in six of the fourteen countries when we use the first methodology that produces unrealistic monetary regimes.  Only four countries show statistically significant results with the improved second method.

 

| Country | Spread Between Annualized Real Returns | 95% CI (First Method) | P-Value (First Method) | 95% CI (Second Method) | P-Value (Second Method) |
| --- | --- | --- | --- | --- | --- |
| Australia | +9.8% | -1.1% to +20.7% | 7.8% | -1.5% to +21.1% | 8.9% |
| Belgium | +14.6% | +4.1% to +25.1% | 0.6% | +0.7% to +28.5% | 3.9% |
| Canada | -0.7% | -12.2% to +10.8% | 90.5% | -14.2% to +12.8% | 91.9% |
| Finland | +29.0% | +6.5% to +51.5% | 1.2% | -2.4% to +60.4% | 7.1% |
| France | +17.3% | -0.5% to +35.1% | 5.7% | -10.8% to +45.4% | 22.7% |
| Germany | +10.8% | -1.1% to +22.7% | 7.5% | -2.8% to +24.4% | 12.0% |
| Italy | +17.3% | +3.6% to +31.0% | 1.3% | -0.2% to +34.8% | 5.3% |
| Japan | +26.5% | +12.1% to +40.9% | 0.0% | +3.4% to +49.6% | 2.5% |
| Netherlands | +16.8% | -1.8% to +35.4% | 7.6% | -11.6% to +45.2% | 24.7% |
| Spain | +23.8% | +11.3% to +36.3% | 0.0% | +9.9% to +37.7% | 0.1% |
| Sweden | +30.4% | +12.7% to +48.1% | 0.1% | +4.7% to +56.1% | 2.1% |
| Switzerland | +2.3% | -11.5% to +16.1% | 74.4% | -26.3% to +30.9% | 87.5% |
| United Kingdom | -0.6% | -11.5% to +10.3% | 91.4% | -12.0% to +10.8% | 91.8% |
| United States | +3.6% | -5.0% to +12.2% | 41.1% | -6.0% to +13.2% | 46.2% |

Source: Bloomberg, MSCI, Newfound Research
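For reference, a simulation-based two-sided p-value of the kind shown above is just the fraction of null differentials at least as extreme as the observed one. The helper and numbers below are a hypothetical sketch:

```python
import numpy as np

def two_sided_p_value(simulated_diffs, observed_diff):
    """Fraction of null-simulated differentials at least as extreme, in
    absolute value, as the differential actually observed."""
    sims = np.asarray(simulated_diffs, dtype=float)
    sims = sims[~np.isnan(sims)]
    return float(np.mean(np.abs(sims) >= abs(observed_diff)))

# Illustration with a synthetic, roughly normal null distribution: an
# observed spread two standard deviations out should come in near p = 0.05.
rng = np.random.default_rng(2)
null = rng.normal(0.0, 0.08, 10_000)
print(round(two_sided_p_value(null, 0.16), 3))
```

Widening the null distribution, as the second (Markov) method does, mechanically pushes these p-values up, which is why fewer countries clear the 5% bar under the more realistic regime model.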

 

Conclusion

We find that global equity returns have been more than 10% higher during expansionary regimes.  At first glance, such a large differential suggests there may be an opportunity to profitably trade stocks based on what regime a given country is in.

Unfortunately, the return differentials, while large, are generally not statistically significant when we account for the realistic features of equity returns and monetary policy regimes. In plain English, we can’t be sure that the return differentials didn’t arise simply due to randomness.

This result isn’t too surprising when we consider the complexity of the relationship between equity returns and interest rates (despite what financial commentators may have you believe).  Interest rate changes can impact both the numerator (dividends/dividend growth) and denominator (discount rate) of the dividend discount model in complex ways.  In addition, there are numerous other factors that impact equity returns and are unrelated / only loosely related to interest rates.

When such complexity reigns, it is probably a bit ambitious to rely on a standalone measure of monetary policy regime as a predictor of equity returns.

 


 

The State of Risk Management

This post is available as PDF download here

Summary

  • We compare and contrast different approaches to risk managing equity exposure; including fixed income, risk parity, managed futures, tactical equity, and options-based strategies; over the last 20 years.
  • We find that all eight strategies studied successfully reduce risk, while six of the eight strategies improve risk-adjusted returns. The lone exceptions are two options-based strategies that involve being long volatility and therefore are on the wrong side of the volatility risk premium.
  • Over time, performance of the risk management strategies varies significantly both relative to the S&P 500 and compared to the other strategies. Generally, risk-managed strategies tend to behave like insurance, underperforming on the upside and outperforming on the downside.
  • Diversifying your diversifiers by blending a number of complementary risk-managed strategies together can be a powerful method of improving long-term outcomes. The diversified approach to risk management shows promise in terms of reducing sequence risk for those investors nearing or in retirement.

I was perusing Twitter the other day and came across this tweet from Jim O’Shaughnessy, legendary investor and author of What Works on Wall Street.

As always, Jim's wisdom is invaluable.  But what does this idea mean for Newfound as a firm?  Our first focus is on managing risk.  As a result, one of the questions that we MUST know the answer to is how to get more investors comfortable with sticking to a risk management plan through a full market cycle.

Unfortunately, performance chasing seems to us to be just as prevalent in risk management as it is in investing as a whole.  The benefits of giving up some upside participation in exchange for downside protection seemed like a no-brainer in March of 2009.  After 8+ years of strong equity market returns (although it hasn't always been as smooth a ride as the market commentators may make you think), the juice may not quite seem worth the squeeze.

While we certainly don't profess to know the answer to our burning question from above, we do think the first step towards finding one is a thorough understanding of the risk management landscape.  In that vein, this week we will update our State of Risk Management presentation from early 2016.

We examine eight strategies that roughly fit into four categories:

  • Diversification Strategies: strategic 60/40 stock/bond mix1 and risk parity2
  • Options Strategies: equity collar3, protective put4, and put-write5
  • Equity Strategies: long-only defensive equity that blends a minimum volatility strategy6, a quality strategy7, and a dividend growth strategy8 in equal weights
  • Trend-Following Strategies: managed futures9 and tactical equity10

The Historical Record

We find that over the period studied (December 1997 to July 2018) six of the eight strategies outperform the S&P 500 on a risk-adjusted basis both when we define risk as volatility and when we define risk as maximum drawdown.  The two exceptions are the equity collar strategy and the protective put strategy.  Both of these strategies are net long options and therefore are forced to pay the volatility risk premium.  This return drag more than offsets the reduction of losses on the downside.

Data Source: Bloomberg, CSI. Calculations by Newfound Research. Past performance does not guarantee future results. Volatility is a statistical measure of the amount of variation around the average returns for a security or strategy. All returns are hypothetical index returns. You cannot invest directly in an index and unmanaged index returns do not reflect any fees, expenses, sales charges, or trading expenses. Index returns include the reinvestment of dividends. No index is meant to measure any strategy that is or ever has been managed by Newfound Research. The Tactical Equity strategy was constructed by Newfound in August 2018 for purposes of this analysis and is therefore entirely backtested and not an investment strategy that is currently managed and offered by Newfound.

 

Data Source: Bloomberg, CSI. Calculations by Newfound Research. Past performance does not guarantee future results. Drawdown is a statistical measure of the losses experienced by a security or strategy relative to its historical maximum. The maximum drawdown is the largest drawdown over the security or strategy’s history. All returns are hypothetical index returns. You cannot invest directly in an index and unmanaged index returns do not reflect any fees, expenses, sales charges, or trading expenses. Index returns include the reinvestment of dividends. No index is meant to measure any strategy that is or ever has been managed by Newfound Research. The Tactical Equity strategy was constructed by Newfound in August 2018 for purposes of this analysis and is therefore entirely backtested and not an investment strategy that is currently managed and offered by Newfound.

 

Not Always a Smooth Ride

While it would be nice if this outperformance accrued steadily over time, reality is quite a bit messier.  All eight strategies exhibit significant variation in their rolling one-year returns vs. the S&P 500.  Interestingly, the two strategies with the widest ranges of historical one-year performance vs. the S&P 500 are also the two strategies that have delivered the most downside protection (as measured by maximum drawdown).  Yet another reminder that there is no free lunch in investing.  The more aggressively you wish to reduce downside capture, the more short-term tracking error you must endure.

Relative 1-Year Performance vs. S&P 500 (December 1997 to July 2018)

Data Source: Bloomberg, CSI. Calculations by Newfound Research. Past performance does not guarantee future results. All returns are hypothetical index returns. You cannot invest directly in an index and unmanaged index returns do not reflect any fees, expenses, sales charges, or trading expenses. Index returns include the reinvestment of dividends. No index is meant to measure any strategy that is or ever has been managed by Newfound Research. The Tactical Equity strategy was constructed by Newfound in August 2018 for purposes of this analysis and is therefore entirely backtested and not an investment strategy that is currently managed and offered by Newfound.

 

Thinking of Risk Management as (Uncertain) Portfolio Insurance

When we examine this performance dispersion across different market environments, we find a totally intuitive result: risk management strategies generally underperform the S&P 500 when stocks advance and outperform the S&P 500 when stocks decline.  The hit rate for the risk management strategies relative to the S&P 500 is 81.2% in the four years that the S&P 500 was down (2000, 2001, 2002, and 2008) and 19.8% in the seventeen years that the S&P was up.

In this way, risk management strategies are akin to insurance.  A premium, in the form of upside capture ratios less than 100%, is paid in exchange for a (hopeful) reduction in downside capture.

With this perspective, it's totally unsurprising that these strategies have underperformed since the market bottomed during the global financial crisis.  Seven of the eight strategies (with the long-only defensive equity strategy being the lone exception) underperformed the S&P 500 on an absolute return basis, and six of the eight (with defensive equity and the 60/40 stock/bond blend being the exceptions) underperformed on a risk-adjusted basis.

Annual Out/Underperformance Relative to S&P 500 (December 1997 to July 2018)

Data Source: Bloomberg, CSI. Calculations by Newfound Research. Past performance does not guarantee future results. All returns are hypothetical index returns. You cannot invest directly in an index and unmanaged index returns do not reflect any fees, expenses, sales charges, or trading expenses. Index returns include the reinvestment of dividends. No index is meant to measure any strategy that is or ever has been managed by Newfound Research. The Tactical Equity strategy was constructed by Newfound in August 2018 for purposes of this analysis and is therefore entirely backtested and not an investment strategy that is currently managed and offered by Newfound.

 

Diversifying Your Diversifiers

The good news is that there is significant year-to-year variation in the performance across strategies, as evidenced by the periodic table of returns above, suggesting there are diversification benefits to be harvested by allocating to multiple risk management strategies.  The average annual performance differential between the best performing strategy and the worst performing strategy is 20.0%.  This spread was less than 10% in only 3 of the 21 years studied.
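To make the best-minus-worst spread calculation concrete, here is a toy sketch; the numbers are made up, while the actual analysis uses the eight strategies above over 1998-2018:

```python
import pandas as pd

# Hypothetical annual returns (%) for three strategies over three years.
annual = pd.DataFrame({
    "60/40":           [ 4.0, -8.0, 12.0],
    "Managed Futures": [10.0,  6.0, -2.0],
    "Put-Write":       [ 2.0, -4.0,  8.0],
}, index=[2016, 2017, 2018])

# Best-minus-worst performer each year: large, persistent spreads suggest
# the strategies are far from interchangeable, i.e. diversifiable.
spread = annual.max(axis=1) - annual.min(axis=1)
print(spread.tolist())  # [8.0, 14.0, 14.0]
```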

We see the power of diversifying your diversifiers when we test simple equal-weight blends of the risk management strategies.  Both blends have higher Sharpe ratios than seven of the eight individual strategies and higher excess-return-to-drawdown ratios than six of the eight.

This is a very powerful result, indicating that naïve diversification is nearly as good as being able to pick the best individual strategies with perfect foresight.

Data Source: Bloomberg, CSI. Calculations by Newfound Research. Past performance does not guarantee future results. All returns are hypothetical index returns. You cannot invest directly in an index and unmanaged index returns do not reflect any fees, expenses, sales charges, or trading expenses. Index returns include the reinvestment of dividends. No index is meant to measure any strategy that is or ever has been managed by Newfound Research. The Tactical Equity strategy was constructed by Newfound in August 2018 for purposes of this analysis and is therefore entirely backtested and not an investment strategy that is currently managed and offered by Newfound.

 

Why Bother with Risk Management in the First Place?

As we’ve written about previously, we believe that for most investors investing “failure” means not meeting one’s financial objectives.  In the portfolio management context, failure comes in two flavors.  “Slow” failure results from taking too little risk, while “fast” failure results from taking too much risk.

In his book, Red Blooded Risk, Aaron Brown summed up this idea nicely: “Taking less risk than is optimal is not safer; it just locks in a worse outcome.  Taking more risk than is optimal also results in a worse outcome, and often leads to complete disaster.”

Risk management is not synonymous with risk reduction.  It is about taking the right amount of risk, not too much or too little.

Having a pre-defined risk management plan in place before a crisis can help investors avoid panicked decisions that can turn a bad, but survivable event into catastrophe (e.g. the retiree that sells all of his equity exposure in early 2009 and then stays out of the market for the next five years).

It’s also important to remember that individuals are not institutions.  They have a finite investment horizon.  Those that are at or near retirement are exposed to sequence risk, the risk of experiencing a bad investment outcome at the wrong time.

We can explore sequence risk using Monte Carlo simulation.  We start by assessing the S&P 500 with no risk management overlay and assume a 30-year retirement horizon.  The simulation process works as follows:

  1. Randomly choose a sequence of 30 annual returns from the set of actual annual returns over the period we studied (December 1998 to July 2018).
  2. Adjust returns for inflation.
  3. For the sequence of returns chosen, calculate the perfect withdrawal rate (PWR). Clare et al. (2016) define the PWR as “the withdrawal rate that effectively exhausts wealth at death (or at the end of a fixed, known period) if one had perfect foresight of all returns over the period.”11
  4. Return to #1, repeating 1000 times in total.
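The PWR calculation and bootstrap can be sketched as follows. This is illustrative: the inputs to `simulate_pwr` below are synthetic, not the S&P 500 history used in the charts, and we assume withdrawals occur at the start of each year:

```python
import numpy as np

rng = np.random.default_rng(3)

def perfect_withdrawal_rate(real_returns):
    """Constant start-of-year withdrawal, as a fraction of initial wealth,
    that exactly exhausts the portfolio at the horizon's end given perfect
    foresight: solves W_t = (W_{t-1} - w)(1 + r_t) with W_n = 0."""
    # growth[i] = product of (1 + r_j) from year i to the end.
    growth = np.cumprod(1.0 + np.asarray(real_returns)[::-1])[::-1]
    return growth[0] / growth.sum()

def simulate_pwr(annual_real_returns, horizon=30, n_sims=1000):
    """Bootstrap horizon-length sequences from the historical annual
    returns and compute the PWR for each draw."""
    pool = np.asarray(annual_real_returns)
    return np.array([
        perfect_withdrawal_rate(rng.choice(pool, size=horizon, replace=True))
        for _ in range(n_sims)
    ])

# Synthetic stand-in for the historical annual real return sample.
pwrs = simulate_pwr(rng.normal(0.05, 0.17, 21))
print(round(float(pwrs.mean()), 3), np.percentile(pwrs, [2.5, 97.5]).round(3))
```

A quick sanity check on the formula: with zero returns every year, the PWR over a 30-year horizon is exactly 1/30 of starting wealth per year.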

We plot the distribution of PWRs for the S&P 500 below.  While the average PWR is a respectable 5.7%, the range of outcomes is very wide (0.6% to 14.7%).  The 95 percent confidence interval around the mean is 2.0% to 10.3%.  This is sequence risk.  Unfortunately, investors do not have the luxury of experiencing the average; they see only one draw.  Get lucky and you may get to fund a better lifestyle than you could have imagined with little to no financial stress.  Get unlucky and you may have trouble paying the bills and will be sweating every market move.

Calculations by Newfound Research. Past performance does not guarantee future results. All returns are hypothetical index returns. You cannot invest directly in an index and unmanaged index returns do not reflect any fees, expenses, sales charges, or trading expenses. Index returns include the reinvestment of dividends.

 

Next, we repeat the simulation, replacing the pure S&P 500 exposure with the equal-weight blend of risk management strategies excluding the equity collar and the protective put.  We see quite a different result.  The average PWR is similar (6.2% vs. 5.7%), but the range of outcomes is much smaller (95% confidence interval from 4.4% to 8.1%).  At its very core, this is what implementing a risk management plan is all about: reducing the role of investment luck in financial planning.  We give up some of the best outcomes (in the right tail of the S&P 500 distribution) in exchange for reducing the probability of the very worst outcomes (in the left tail).

Calculations by Newfound Research. Past performance does not guarantee future results. All returns are hypothetical index returns. You cannot invest directly in an index and unmanaged index returns do not reflect any fees, expenses, sales charges, or trading expenses. Index returns include the reinvestment of dividends.

Conclusion

There is no holy grail when it comes to risk management.  While a number of approaches have historically delivered strong results, each comes with its own pros and cons.

In an uncertain world where we cannot predict exactly what the next crisis will look like, diversifying your diversifiers by combining a number of complementary risk-managed strategies may be a prudent course of action. We believe that this type of balanced approach has the potential to deliver compelling results over a full market cycle while managing the idiosyncratic risk of any one manager or strategy.

Diversification can also help to increase the odds of an investor sticking with their risk management plan as the short-term performance lows won’t be quite as low as they would be with a single strategy (conversely, the highs won’t be as high either).

That being said, having the discipline to stick with a risk management plan also requires being realistic.  While it would be great to build a strategy with 100% upside and 0% downside, such an outcome is unrealistic.  Risk-managed strategies tend to behave a lot like uncertain insurance for the portfolio.  A premium, in the form of upside capture ratios less than 100%, is paid in exchange for a (hopeful) reduction in downside capture.  This upside underperformance is a feature, not a bug.  Trying too hard to correct it may lead to overfit strategies that fail to deliver adequate protection on the downside.

Machine Learning, Subset Resampling, and Portfolio Optimization

This post is available as a PDF download here

Summary

  • Portfolio optimization research can be challenging due to the plethora of factors that can influence results, making it hard to generalize results outside of the specific cases tested.
  • That being said, building a robust portfolio optimization engine requires a diligent focus on estimation risk. Estimation risk is the risk that the inputs to the portfolio optimization process (i.e. expected returns, volatilities, correlations) are imprecisely estimated by sampling from the historical data, leading to suboptimal allocations.
  • We summarize the results from two recent papers we’ve reviewed on the topic of managing estimation risk. The first paper relies on techniques from machine learning while the second paper uses a form of simulation called subset resampling.
  • Both papers report that their methodologies outperform various heuristic and optimization-based benchmarks.
  • We perform our own tests by building minimum variance portfolios using the 49 Fama/French industry portfolios.  We find that while both outperform equal-weighting on a risk-adjusted basis, the results are not statistically significant at the 5% level.

 

This week, we are going to review a couple of recent papers we’ve come across on the topic of reducing estimation risk in portfolio optimization.

Before we get started, we want to point out that while there are many fascinating papers on portfolio optimization, it is also one of the most frustrating areas to study in our opinion.  Why?  Because ultimately portfolio optimization is a very, very complex topic.  The results will be impacted in significant ways by a number of factors like:

  • What is the investment universe studied?
  • Over what time period?
  • How are the parameters estimated?
  • What are the lookback periods used to estimate parameters?
  • And so on…

Say that you find a paper that argues for the superiority of equal-weighted portfolios over mean-variance optimization by testing on a universe of large-cap U.S. equities. Does this mean that equal-weighting is superior to mean-variance optimization in general?  We tend to believe not.  Rather, we should take the study at face value: equal-weighting was superior to the particular style of mean-variance in this specific test.

In addition, the result in and of itself says nothing about why the outperformance occurred.  It could be that equal-weighting is a superior portfolio construction technique.

But maybe the equal-weighted stock portfolio just happens by chance to be close to the true Sharpe optimal portfolio.  If I have a number of asset classes that have reasonably similar returns, risks, and correlations, it is very likely that equal-weighting does a decent job of getting close to the Sharpe optimal solution.  On the other hand, consider an investment universe that consists of 9 equity sectors and U.S. Treasuries.  In this case, equal-weighting is much less likely to be close to optimal and we would find it more probable that optimization approaches could outperform.

Maybe equal-weighting exposes the stock portfolio to risk-premia like the value and size factors that improve performance.  I suspect that to some extent the outperformance of minimum variance portfolios in a number of studies is at least partially explained by the exposures that these portfolios have to the defensive or low beta factor (the tendency of low risk exposures to outperform high risk exposures on a risk-adjusted basis).

Maybe the mean estimates in the mean-variance optimization are just terrible and the results are less an indictment of MVO than of the particular mean estimation technique used.  To some extent, the difficulty of estimating means is a major part of the argument for equal-weighting or other heuristic or shrinkage-based approaches.  At the same time, we see a number of studies that estimate expected returns using sample means with long (i.e. 5- or 10-year) lookbacks.  These long-term horizons are exactly the period over which returns tend to mean revert, and so the evidence would suggest these are precisely the types of mean estimates you wouldn’t want to use.  To properly test mean-variance, we should at least use mean estimates that have a chance of succeeding.

All this is a long-winded way of saying that it can be difficult to use the results from research papers to build a robust, general purpose portfolio optimizer.  The results may have limited value outside of the very specific circumstances explored in that particular paper.

That being said, this does not give us an excuse to stop trying.  With that preamble out of the way, we’ll return to our regularly scheduled programming.

 

Estimation Risk in Portfolio Optimization

Estimation risk is the risk that the inputs to the portfolio optimization process (i.e. expected returns, volatilities, correlations) are imprecisely estimated by sampling from the historical data, leading to suboptimal allocations.

One popular approach to dealing with estimation risk is to simply ignore parameters that are hard to estimate.  For example, the naïve 1/N portfolio, which allocates an equal amount of capital to each investment in the universe, completely forgoes using any information about the distribution of returns.  DeMiguel, Garlappi and Uppal (2007)[1] tested fourteen variations of sample-based mean-variance optimization on seven different datasets and concluded that “…none is consistently better than the 1/N rule in terms of Sharpe Ratio, certainty-equivalent return, or turnover, which indicates that, out of sample, the gain from optimal diversification is more than offset by estimator error.”

Another popular approach is to employ “shrinkage estimators” for key inputs.  For example, Ledoit and Wolf (2004)[2] propose shrinking the sample correlation matrix towards (a fancy way of saying “averaging it with”) the constant correlation matrix.  The constant correlation matrix is simply the correlation matrix where each off-diagonal element is set equal to the average pairwise correlation across all assets (the diagonal elements remain equal to one).

Generally speaking, shrinkage involves blending an “unstructured estimator” like the sample correlation matrix with a “structured estimator” like the constant correlation matrix that tries to represent the data with few free parameters. Shrinkage tends to limit extreme observations, thereby reducing the unwanted impact that such observations can have on the optimization result.
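As a rough sketch of the constant-correlation idea, the target construction and a fixed blend might look like the following.  Note that this is only an illustration: Ledoit and Wolf's actual estimator also derives the optimal shrinkage intensity analytically, and the function name here is our own.

```python
import numpy as np

def constant_corr_shrinkage(sample_cov, intensity):
    """Blend the sample covariance toward a constant-correlation target.

    intensity = 0 returns the sample covariance unchanged;
    intensity = 1 returns the pure constant-correlation target.
    """
    vols = np.sqrt(np.diag(sample_cov))
    corr = sample_cov / np.outer(vols, vols)
    p = corr.shape[0]
    # Average pairwise correlation across all off-diagonal elements
    avg_corr = (corr.sum() - p) / (p * (p - 1))
    target = avg_corr * np.outer(vols, vols)
    np.fill_diagonal(target, vols ** 2)  # keep sample variances on the diagonal
    return (1 - intensity) * sample_cov + intensity * target
```

At intensity 1, every pairwise correlation in the result is the same number, which is exactly the "few free parameters" structure referred to below.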

Interestingly, the common practice of imposing a short-sale constraint when performing mean-variance optimization or minimum variance optimization is equivalent to shrinking the expected return estimates[3] and the covariance estimates[4], respectively.

Both papers that we’ll discuss here are alternate ways of performing shrinkage.

Applying Machine Learning to Reduce Estimation Risk

The first paper, Reducing Estimation Risk in Mean-Variance Portfolios with Machine Learning by Daniel Kinn (2018)[5], explores using a standard machine learning approach to reduce estimation risk in portfolio optimization.

Kinn’s approach recognizes that estimation error can be decomposed into two sources: bias and variance.  Both bias and variance result in suboptimal results, but in very different ways.  Bias results from the model doing a poor job of capturing the pertinent features of the data.  Variance, on the other hand, results from the model being sensitive to the data used to train the model.

To get a better intuitive sense of bias vs. variance, consider two weather forecasters, Mr. Bias and Ms. Variance.  Both Mr. Bias and Ms. Variance work in a town where the average temperature is 50 degrees.  Mr. Bias is very stubborn and set in his ways.  He forecasts that the temperature will be 75 degrees each and every day.  Ms. Variance, however, is known for having forecasts that jump up and down.  Half of the time she forecasts a temperature of 75 degrees and half of the time she forecasts a temperature of 25 degrees.

Both forecasters have roughly the same amount of forecast error, but the nature of their errors is very different.  Mr. Bias is consistent but has way too rosy a picture of the town’s weather.  Ms. Variance, on the other hand, actually has the right idea when it comes to long-term weather trends, but her volatile forecasts still leave much to be desired.

The following graphic from EliteDataScience.com gives another take on explaining the difference between the two concepts.

Source: https://elitedatascience.com/bias-variance-tradeoff

 

When it comes to portfolio construction, some popular techniques can be neatly classified into one of these two categories.  The 1/N portfolio, for example, has no variance (weights will be the same every period), but may have quite a bit of bias if it is far from the true optimal portfolio.  Sample-based mean-variance optimization, on the other hand, should have no bias (assuming the underlying distributions of asset class returns do not change over time), but can be highly sensitive to parameter estimates and therefore exhibit high variance.  At the end of the day, we are interested in minimizing total estimation error, which will generally involve a trade-off between bias and variance.

Source: https://elitedatascience.com/bias-variance-tradeoff

 

Finding where this optimal trade-off lies is exactly what Kinn sets out to accomplish with the machine learning algorithm described in this paper.  The general outline of the algorithm is pretty straightforward:

  1. Identify the historical data to be used in calculating the sample moments (expected returns, volatilities, and correlations).
  2. Add a penalty function to the objective function that we are going to optimize. The paper discusses a number of different penalty functions, including Ridge, Lasso, Elastic Net, and Principal Component Regression.  These penalty functions will effectively shrink the estimated parameters, with the exact nature of the shrinkage dependent on the penalty function being used.  By doing so we introduce some bias, but hopefully with the benefit of reducing variance even further, thereby reducing overall estimation error.
  3. Use K-fold cross-validation to fit the parameter(s) of the penalty function. Cross-validation is a machine learning technique where the training data is divided into various in-sample and out-of-sample sets.  The parameter(s) chosen will be those that produce the lowest estimation error on the out-of-sample data.
  4. Using the optimized parameters from #3, fit the model on the entire training set. The result will be the optimized portfolio weights for the next holding period.
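A heavily simplified sketch of steps 2 through 4 follows.  Here the "penalty" is shrinkage of the sample covariance toward its diagonal, with the shrinkage weight chosen by K-fold cross-validation on out-of-sample portfolio variance.  Kinn's actual algorithm works through a regression formulation with Ridge, Lasso, or PCR penalties, so treat this only as an illustration of the cross-validation machinery; all names here are our own.

```python
import numpy as np

def min_var_weights(cov):
    """Closed-form (long/short) minimum variance weights."""
    w = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return w / w.sum()

def cv_shrunk_min_var(returns, lambdas, k=5):
    """Choose shrinkage toward the diagonal by K-fold cross-validation.

    For each candidate lambda, build the portfolio on each training fold
    and score it by realized variance on the held-out fold (step 3);
    then refit on all the data with the best lambda (step 4).
    """
    n = returns.shape[0]
    folds = np.array_split(np.arange(n), k)

    def shrunk(cov, lam):
        return (1 - lam) * cov + lam * np.diag(np.diag(cov))

    scores = []
    for lam in lambdas:
        oos = 0.0
        for fold in folds:
            train = np.setdiff1d(np.arange(n), fold)
            cov = np.cov(returns[train], rowvar=False)
            w = min_var_weights(shrunk(cov, lam))
            oos += np.var(returns[fold] @ w)
        scores.append(oos / k)
    best = lambdas[int(np.argmin(scores))]
    return min_var_weights(shrunk(np.cov(returns, rowvar=False), best)), best
```

A lambda of zero recovers plain sample-based minimum variance; a lambda of one ignores correlations entirely, so the cross-validation is effectively choosing where to sit on the bias-variance spectrum described above.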

Kinn tests three versions of the algorithm (one using a Ridge penalty function, one using a Lasso penalty function, and one using principal component regression) on the following real-world data sets.

  • 20 randomly selected stocks from the S&P 500 (covers January 1990 to November 2017)
  • 50 randomly selected stocks from the S&P 500 (covers January 1990 to November 2017)
  • 30 industry portfolios using stocks listed on the NYSE, AMEX, and NASDAQ (covers January 1990 to January 2018)
  • 49 industry portfolios using stocks listed on the NYSE, AMEX, and NASDAQ (covers January 1990 to January 2018)
  • 200 largest cryptocurrencies by market value as of the end of 2017 (if there was ever a sign that this is a 2018 paper on portfolio optimization, it has to be that one of the datasets relates to crypto)
  • 1200 cryptocurrencies observed from September 2013 to December 2017

As benchmarks, Kinn uses traditional sample-based mean-variance, sample-based mean-variance with no short selling, minimum variance, and 1/N.

The results are pretty impressive with the machine learning algorithms delivering statistically significant risk-adjusted outperformance.

Here are a few thoughts/comments we had when implementing the paper ourselves:

  1. The specific algorithm, as outlined in the paper, is a bit inflexible in the sense that it only works for mean-variance optimization where the means and covariances are estimated from the sample. In other words, we couldn’t use the algorithm to compute a minimum variance portfolio or a mean-variance portfolio where we want to substitute in our own return estimates.  That being said, we think there are some relatively straightforward tweaks that can make the process applicable in these scenarios.
  2. In our tests, the parameter optimization for the penalty functions was a bit unstable. For example, when using the principal component regression, we might identify two principal components as being worth keeping in one month and then ten principal components being worth keeping in the next month.  This can in turn lead to instability in the allocations.  While this is a concern, it could be dealt with by smoothing the parameters over a number of months (although this introduces more questions, like how exactly to smooth and over what period).
  3. The results tend to be biased towards having significantly fewer holdings than the 1/N benchmark. For example, see the right-hand chart in the exhibit below.  While this is by design, we do tend to get wary of results showing such concentrated portfolios to be optimal, especially when in the real world we know that asset class distributions are far from well-behaved.

 

Applying Subset Resampling to Reduce Estimation Error

The second paper, Portfolio Selection via Subset Resampling by Shen and Wang (2017)[6], uses a technique called subset resampling.  This approach works as follows:

  1. Select a random subset of the securities in the universe (e.g. if there are 30 commodity contracts, you could pick ten of them).
  2. Perform the portfolio optimization on the subset selected in #1.
  3. Repeat steps #1 and #2 many times.
  4. Average the resulting allocations together to get the final result.
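The four steps above can be sketched as follows for a minimum variance objective (the subset size and simulation count are illustrative choices, and weights may be negative since no long-only constraint is imposed):

```python
import numpy as np

def subset_resampled_min_var(returns, subset_size, n_sims=500, seed=0):
    """Repeatedly optimize over random subsets and average the allocations."""
    rng = np.random.default_rng(seed)
    p = returns.shape[1]
    weights = np.zeros(p)
    for _ in range(n_sims):
        # Step 1: select a random subset of the universe
        idx = rng.choice(p, size=subset_size, replace=False)
        # Step 2: minimum variance optimization on the subset
        cov = np.cov(returns[:, idx], rowvar=False)
        w = np.linalg.solve(cov, np.ones(subset_size))
        weights[idx] += w / w.sum()
    # Steps 3 and 4: the loop repeats the draw; averaging gives the result
    return weights / n_sims
```

Because each subset's weights sum to one, the averaged allocation does as well, so no re-normalization is needed at the end.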

The table below shows an example of how this would work for three asset classes and three simulations with two asset classes selected in each subset.

One way we can try to get intuition around subset resampling is by thinking about the extremes.  If we resampled using subsets of size 1, then we would end up with the 1/N portfolio.  If we resampled using subsets that were the same size as the universe, we would just have the standard portfolio optimized over the entire universe.  With subset sizes greater than 1 and less than the size of the whole universe, we end up with some type of blend between 1/N and the traditionally optimized portfolio.

The only parameter we need to select is the size of the subsets.  The authors suggest a subset size equal to n^0.8, where n is the number of securities in the universe.  For the S&P 500, this would correspond to a subset size of approximately 144.

The authors test subset resampling on the following real-world data sets.

  • FF100: 100 Fama and French portfolios spanning July 1963 to December 2004
  • ETF139: 139 ETFs spanning January 2008 to October 2012
  • EQ181:  Individual equities from the Russell Top 200 Index (excluding those stocks with missing data) spanning January 2008 to October 2012
  • SP434:  Individual equities from the S&P 500 Index (excluding those stocks with missing data) spanning September 2001 to August 2013.

As benchmarks, the authors use:

  • 1/N (EW)
  • value-weighted (VW)
  • minimum variance (MV)
  • resampled efficiency (RES) from Michaud (1989)[7]
  • the two-fund portfolio (TZT) from Tu and Zhou (2011)[8], which blends 1/N and classic mean-variance
  • the three-fund portfolio (KZT) from Kan and Zhou (2007)[9], which blends the risk-free asset, classic mean-variance, and minimum variance
  • the four-fund portfolio (TZF) from Tu and Zhou (2011), which blends KZT and 1/N
  • mean-variance using the shrinkage estimator from Ledoit and Wolf (2004) (SKC)
  • on-line passive aggressive mean reversion (PAMR) from Li (2012)[10]

Similar to the machine learning algorithm, subset resampling does very well in terms of risk-adjusted performance.  On three of the four data sets, the Sharpe Ratio of subset resampling is better than that of 1/N by a statistically significant margin.  Additionally, subset resampling has the lowest maximum drawdown in three of the four data sets.  From a practical standpoint, it is also positive to see that the turnover for subset resampling is significantly lower than many of the competing strategies.

 

As we did with the first paper, here are some thoughts that came to mind in reading and re-implementing the subset resampling paper:

  1. As presented, the subset resampling algorithm will be sensitive to the number and types of asset classes in an undesirable way. What do we mean by this?  Suppose we had three uncorrelated asset classes with identical means and standard deviations.  We use subset resampling with subsets of size two to compute a mean-variance portfolio.  The result will be approximately 1/3 of the portfolio in each asset class, which happens to match the true mean-variance optimal portfolio.  Now we add a fourth asset class that also has the same mean and standard deviation but is perfectly correlated to the third asset class.  With this setup, the third and fourth asset classes are one and the same.  As a result, the true mean-variance optimal portfolio will have 1/3 in each of the first and second asset classes and 1/6 in each of the third and fourth (in reality, any solution is optimal as long as the allocations to the third and fourth asset classes sum to 1/3).  However, subset resampling will produce a portfolio that is 25% in each of the four asset classes, an incorrect result.  Note that this is a problem with many heuristic solutions, including the 1/N portfolio.
  2. There are ways that we could deal with the above issue by not sampling uniformly, but this will introduce some more complexity into the approach.
  3. In a mean-variance setting, the subset resampling will dilute the value of our mean estimates. Now, this should be expected when using any shrinkage-like approach, but it is something to at least be aware of. Dilution will be more severe the smaller the size of the subsets.
  4. In terms of computational burden, it can be very helpful to use some “smart” resampling that is able to get a representative sampling with fewer iterations than a naïve approach. Otherwise, subset resampling can take quite a while to run due to the sheer number of optimizations that must be calculated.

Performing Our Own Tests

In this section, we perform our own tests using what we learned from the two papers.  Initially, we performed the test using mean-variance as our optimization of choice with 12-month return as the mean estimate.  We found, however, that the impact of the mean estimate swamped that of the optimizations.  As a result, we repeated the tests, this time building minimum variance portfolios.  This isolates the estimation error relating to the covariance matrix, which we think is more relevant anyway, since few practitioners use sample-based estimates of expected returns. Note that we used the principal component regression version of the machine learning algorithm.

Our dataset was the 49 industry portfolios provided in the Fama and French data library. We tested the following optimization approaches:

  • EW: 1/N equally-weighted portfolio
  • NRP: naïve risk parity where positions are weighted inversely to their volatility, correlations are ignored
  • MV: minimum variance using the sample covariance matrix
  • ZERO: minimum variance using sample covariance matrix shrunk using a shrinkage target where all correlations are assumed to be zero
  • CONSTANT: minimum variance using sample covariance matrix shrunk using a shrinkage target where all correlations are equal to the sample pairwise correlation across all assets in the universe
  • PCA: minimum variance using sample covariance matrix shrunk using a shrinkage target that only keeps the top 10% of eigenvectors by variance explained
  • SSR: subset resampling
  • ML: machine learning with principal component regression

The results are presented below:

Results are hypothetical and backtested and do not reflect any fees or expenses. Returns include the reinvestment of dividends. Results cover the period from 1936 to 2018. Past performance does not guarantee future results.

 

All of the minimum variance strategies deliver lower risk than EW and NRP and outperform on a risk-adjusted basis, although none of the Sharpe Ratio differences are significant at the 5% level. Of the strategies, ZERO (shrinking with a covariance matrix that assumes zero correlation) and SSR (subset resampling) delivered the highest Sharpe Ratios.

 

Conclusion

Portfolio optimization research can be challenging due to the plethora of factors that can influence results, making it hard to generalize results outside of the specific cases tested.  It can be difficult to ascertain whether the conclusions are truly attributable to the optimization processes being tested or some other factors.

That being said, building a robust portfolio optimization engine requires a diligent focus on estimation risk.  Estimation risk is the risk that the inputs to the portfolio optimization process (i.e. expected returns, volatilities, correlations) are imprecisely estimated by sampling from the historical data, leading to suboptimal allocations.

We summarize the results from two recent papers we’ve reviewed on the topic of managing estimation risk.  The first paper relies on techniques from machine learning to find the optimal shrinkage parameters that minimize estimation error by acknowledging the trade-off between bias and variance.  The second paper uses a form of simulation called subset resampling.  In this approach, we repeatedly select a random subset of the universe, optimize over that subset, and then blend the subset results to get the final result.

Both papers report that their methodologies outperform various heuristic and optimization-based benchmarks.  We feel that both the machine learning and subset resampling approaches have merit after making some minor tweaks to deal with real world complexities.

We perform our own tests by building minimum variance portfolios using the 49 Fama/French industry portfolios.  We find that while both outperform equal-weighting on a risk-adjusted basis, the results are not statistically significant at the 5% level.  While this highlights that research results may not translate out of sample, this certainly does not disqualify either method as potentially being useful as a tool to manage estimation risk.

 

 

[1] Paper can be found here: http://faculty.london.edu/avmiguel/DeMiguel-Garlappi-Uppal-RFS.pdf.

[2] Paper can be found here: http://www.ledoit.net/honey.pdf

[3] DeMiguel, Garlappi and Uppal (2007)

[4] Jagannathan and Ma (2003), “Risk reduction in large portfolios: Why imposing the wrong constraints helps.”

[5] Paper can be found here: https://arxiv.org/pdf/1804.01764.pdf.

[6] Paper can be found here: https://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14443

[7] Paper can be found here: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2387669

[8] Paper can be found here: https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=2104&context=lkcsb_research

[9] Paper can be found here: https://www.cambridge.org/core/journals/journal-of-financial-and-quantitative-analysis/article/optimal-portfolio-choice-with-parameter-uncertainty/A0E9F31F3B3E0873109AD8B2C8563393

[10] Paper can be found here: http://research.larc.smu.edu.sg/mlg/papers/PAMR_ML_final.pdf

 

Failing Slow, Failing Fast, and Failing Very Fast

This post is available as a PDF download here.

Summary

  • For most investors, long-term “failure” means not meeting one’s financial objectives.
  • In the portfolio management context, failure comes in two flavors. “Slow” failure results from taking too little risk, while “fast” failure results from taking too much risk.  In his book, Red Blooded Risk, Aaron Brown summed up this idea nicely: “Taking less risk than is optimal is not safer; it just locks in a worse outcome.  Taking more risk than is optimal also results in a worse outcome, and often leads to complete disaster.”
  • A third type of failure, failing very fast, occurs when we allow behavioral biases to compound the impact of market volatility (i.e. panicked selling near the bottom of a bear market).
  • In the aftermath of the global financial crisis, risk management was often used synonymously with risk reduction. In actuality, a sound risk management plan is not just about reducing risk, but rather about calibrating risk appropriately as a means of minimizing the risk of both slow and fast failure.

On the way back from a recent trip, I ran across a fascinating article in Vanity Fair: “The Clock is Ticking: Inside the Worst U.S. Maritime Disaster in Decades.”  The article details the saga of the SS El Faro, a U.S. flagged cargo ship that sunk in October 2015 at the hands of Hurricane Joaquin.  Quoting from the beginning of the article:

“In the darkness before dawn on Thursday, October 1, 2015, an American merchant captain named Michael Davidson sailed a 790-foot U.S.-flagged cargo ship, El Faro, into the eye wall of a Category 3 hurricane on the exposed windward side of the Bahama Islands.  El Faro means “the lighthouse” in Spanish.

 The hurricane, named Joaquin, was one of the heaviest to ever hit the Bahamas.  It overwhelmed and sank the ship.  Davidson and the 32 others aboard drowned. 

They had been headed from Jacksonville, Florida, on a weekly run to San Juan, Puerto Rico, carrying 391 containers and 294 trailers and cars.  The ship was 430 miles southwest of Miami in deep water when it went down.

Davidson was 53 and known as a stickler for safety.  He came from Windham, Maine, and left behind a wife and two college age daughters.  Neither his remains nor those of his shipmates were ever recovered. 

Disasters at sea do not get the public attention that aviation accidents do, in part because the sea swallows the evidence.  It has been reported that a major merchant ship goes down somewhere in the world every two or three days; most ships are sailing under flags of convenience, with underpaid crews and poor safety records. 

The El Faro tragedy attracted immediate attention for several reasons.  El Faro was a U.S.-flagged ship with a respected captain – and it should have been able to avoid the hurricane.  Why didn’t it?  Add to the mystery this simple fact: the sinking of the El Faro was the worst U.S. maritime disaster in three decades.”

From the beginning, Hurricane Joaquin was giving forecasters fits.  A National Hurricane Center release from September 29th said, “The track forecast remains highly uncertain, and if anything, the spread in the track model guidance is larger now beyond 48 hours…”  Joaquin was so hard to predict that FiveThirtyEight wrote an article about it.  The image below shows just how much variation there was in projected paths for the storm as of September 30th.

Davidson knew all of this.  Initially, he had two options.  The first option was the standard course: a 1,265-mile trip directly through open ocean toward San Juan.   The second was the safe play, a less direct route that would use a number of islands as protection from the storm.  This option would add 184 miles and six plus hours to the trip.

Davidson faced a classic risk management problem.  Should he risk failing fast or failing slow?

Failing fast would mean taking the standard course and suffering damage or disaster at the hands of the storm.  In this scenario – which tragically ended up playing out – Davidson paid the fatal price by taking too much risk.

Failing slow, on the other hand, would be playing it safe and taking the less direct route.  The risk here would be wasting the company’s time and money.  By comparison, this seems like the obvious choice.  However, the article suggests that Davidson may have been particularly sensitive to this risk as he had been gunning for a captain position on a new vessel that would soon replace El Faro on the Jacksonville to San Juan route.  In this scenario, Davidson would fail by taking too little risk.

This dichotomy between taking too little risk and failing slow and taking too much risk and failing fast is central to portfolio risk management.

Aaron Brown summed this idea up nicely in his book Red Blooded Risk, where he wrote, “Taking less risk than is optimal is not safer; it just locks in a worse outcome.  Taking more risk than is optimal also results in a worse outcome, and often leads to complete disaster.”

Failing Slow

In the investing context, failing slow happens when portfolio returns are insufficient to generate the growth needed to meet one’s objectives.  No one event causes this type of failure.  Rather, it slowly builds over time.  Think death by a thousand papercuts or your home slowly being destroyed from the inside by termites.

Traditionally, this was probably the result of taking too little risk.  Oversized allocations to cash, which as an asset class has barely kept up with inflation over the last 90 years, are particularly likely to be a culprit in this respect.

Data Source: http://pages.stern.nyu.edu/~adamodar/New_Home_Page/datafile/histretSP.html. Calculations by Newfound Research. Past performance does not guarantee future results.

 

Take your average 60% stock / 40% bond investor as an example.  Historically, such an investor would see a $100,000 investment grow to $1,494,003 over a 30-year horizon. Add a 5% cash allocation to that portfolio and the average end result drops to $1,406,935, an $87k cash drag.  Double the cash bucket to 10% and the average drag increases to nearly $170k.  This pattern continues as each additional 5% cash increment lowers ending wealth by approximately $80k.
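The mechanics of the cash drag can be reproduced with a simple compounding calculation.  The return assumptions below are hypothetical placeholders chosen only to illustrate the arithmetic; they are not the historical figures behind the chart.

```python
def ending_wealth(start, years, risky_return, cash_return, cash_weight):
    """Compound an annually rebalanced blend of a risky portfolio and cash."""
    blended = (1 - cash_weight) * risky_return + cash_weight * cash_return
    return start * (1 + blended) ** years

# Hypothetical annual returns for illustration only
no_cash = ending_wealth(100_000, 30, 0.094, 0.035, 0.00)
five_pct = ending_wealth(100_000, 30, 0.094, 0.035, 0.05)
ten_pct = ending_wealth(100_000, 30, 0.094, 0.035, 0.10)
```

Each incremental 5% allocation to cash clips a small amount off the blended return, but compounded over 30 years that small clip translates into a large gap in ending wealth.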

Data Source: http://pages.stern.nyu.edu/~adamodar/New_Home_Page/datafile/histretSP.html. Calculations by Newfound Research. Past performance does not guarantee future results.

 

Fortunately, there are ways to manage funds earmarked for near-term expenditures or as a safety net without carrying excessive amounts of cash.  For one example, see the Betterment article: Safety Net Funds: Why Traditional Advice Is Wrong.

Unfortunately, today’s investors face a more daunting problem.  Low returns may not be limited to cash.  Below, we present medium term (5 to 10 year) expected returns on U.S. equities, U.S. bonds, and a 60/40 blend from seven different firms/individuals.  The average expected return on the 60/40 portfolio is less than 1% per year after inflation.  Even if we exclude the outlier, GMO, the average expected return for the 60/40 is still only 1.3%.  Heck, even the most optimistic forecast from AQR is downright depressing relative to historical experience.

 

Expected return forecasts are the views of the listed firms, are uncertain, and should not be considered investment advice. Nominal returns are adjusted by subtracting 2.2% assumed inflation.

 

And the negativity is far from limited to U.S. markets.  For example, Research Affiliates forecasts a 5.7% real return for emerging market equities.  This is their highest projected return asset class and it still falls well short of historical experience for the U.S. equity markets, which have returned 6.5% after inflation over the last 90 years.

One immediate solution that may come to mind is just to take more risk.  For example, a 4% real return may still be technically achievable[1]. Assuming that Research Affiliates’ forecasts are relatively accurate, this still requires buying into and sticking with a portfolio that holds around 40% in emerging market securities, more than 20% in real assets/alternatives, and exactly 0% large-cap U.S. equity exposure[2].

This may work for those early in the accumulation phase, but it certainly would require quite a bit of intestinal fortitude.  For those nearing, or in, retirement, the problem is more daunting.  We’ve written quite a bit recently about the problems that low forward returns pose for retirement planning[3][4] and what can be done about it[5][6].

And obviously, one of the main side effects of taking more risk is increasing the portfolio’s exposure to large losses and fast failure, very much akin to Captain Davidson sailing way too close to the eye of the hurricane.

Failing Fast

At its core, failing fast in investing is about realizing large losses at the wrong time.  Think your house burning down or being leveled by a tornado instead of being destroyed slowly by termites.

Note that large losses are a necessary, but not sufficient condition for fast failure[7].  After all, for long-term investors, experiencing a bear market eventually is nearly inevitable.  For example, there has never been a 30-year period in the U.S. equity markets without at least one year-over-year loss of greater than 20%.  79% of historical 30-year periods have seen at least one year-over-year loss greater than 40%.

Fast failure is really about being unfortunate enough to realize a large loss at the wrong time.  This is called “sequence risk” and is particularly relevant for individuals nearing or in the early years of retirement.

We’ve used the following simple example of sequence risk before.  Consider three investments:

  • Portfolio A: -30% return in Year 1 and 6% returns for Years 2 to 30.
  • Portfolio B: 6% returns for Years 1 to 14, a -30% return in Year 15, and 6% returns for Years 16 to 30.
  • Portfolio C: 6% returns in Years 1 to 29 and a -30% return in Year 30.

Over the full 30-year period, all three investments have an identical geometric return of 4.54%.
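We can verify the identical geometric returns and illustrate the withdrawal effect with a short simulation.  The fixed $4,000 annual withdrawal below is our own assumption for illustration; the original example uses its own withdrawal schedule.

```python
def geometric_return(returns):
    """Annualized geometric (compound) return of a return series."""
    growth = 1.0
    for r in returns:
        growth *= 1 + r
    return growth ** (1 / len(returns)) - 1

def ending_wealth_with_withdrawals(returns, start, withdrawal):
    """Apply each year's return, then take a fixed withdrawal."""
    wealth = start
    for r in returns:
        wealth = wealth * (1 + r) - withdrawal
    return wealth

port_a = [-0.30] + [0.06] * 29                 # loss in year 1
port_b = [0.06] * 14 + [-0.30] + [0.06] * 15   # loss in year 15
port_c = [0.06] * 29 + [-0.30]                 # loss in year 30
```

Since the three series contain exactly the same returns in different orders, the buy-and-hold geometric returns are identical; only once withdrawals are layered in does the ordering matter.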

Yet, the experience of investing in each of the three portfolios will be very different for a retiree taking withdrawals[8].  We see that Portfolio C fares the best, ending the 30-year period with 12% more wealth than it began with.  Portfolio B makes it through the period, ending with 61% of the starting wealth, but not without quite a bit more stress.  Portfolio A, however, ends in disaster, running out of money prematurely.

 

One way we can measure sequence risk is to compare historical returns from a particular investment with and without withdrawals.  The larger this gap, the more sequence risk was realized.

We see that sequence risk peaks in periods where large losses were realized early in the 10-year period.  To highlight a few periods:

  • The period ending in 2009 started with the tech bubble and ended with the global financial crisis.
  • The period ending in 1982 started with losses of 14.3% in 1973 and 25.9% in 1974.
  • The period ending in 1938 started off strong with a 43.8% return in 1928, but then suffered four consecutive annual losses as the Great Depression took hold.

Data Source: http://pages.stern.nyu.edu/~adamodar/New_Home_Page/datafile/histretSP.html. Calculations by Newfound Research. Past performance does not guarantee future results.

 

A consequence of sequence risk is that asset classes or strategies with strong risk-adjusted returns, especially those that are able to successfully avoid large losses, can produce better outcomes than investments that may outperform them on a pure return basis.

For example, consider the period from August 2000, when the equity market peaked prior to the popping of the tech bubble, to March 2018.  Over this period, two common risk management tools – U.S. Treasuries (proxied by the Bloomberg Barclays 7-10 Year U.S. Treasury Index and iShares 7-10 Year U.S. Treasuries ETF “IEF”) and Managed Futures (proxied by the Salient Trend Index) – delivered essentially the same return as the S&P 500 (proxied by the SPDR S&P 500 ETF “SPY”).  Both risk management tools have significantly underperformed during the ongoing bull market (16.6% return from March 2009 to March 2018 for SPY compared to 3.1% for IEF and 0.7% for the Salient Trend Index).

Data Source: CSI, Salient. Calculations by Newfound Research. Past performance does not guarantee future results. Returns include no fees except underlying ETF fees. Returns include the reinvestment of dividends.

 

Yet, for investors withdrawing regularly from their portfolio, bonds and managed futures would have been far superior options over the last two decades.  The SPY-only investor would have less than $45k of their original $100k as of March 2018.  On the other hand, the bond and managed futures investors would have grown their account balances by $34k and $29k, respectively.

Data Source: CSI, Salient. Calculations by Newfound Research. Past performance does not guarantee future results. Returns include no fees except underlying ETF fees. Returns include the reinvestment of dividends.

 

Failing Really Fast

Hurricanes are an unfortunate reality of sea travel.  Market crashes are an unfortunate reality of investing.  Both have the potential to do quite a bit of damage on their own.  However, what plays out over and over again in times of crisis is that human errors compound the situation.  These errors turn bad situations into disasters.  We go from failing fast to failing really fast.

In the case of El Faro, the list of errors can be broadly classified into two categories:

  1. Failures to adequately prepare ahead of time. For example, El Faro had two lifeboats, but they were not up to current code and were essentially worthless on a hobbled ship in the midst of a Category 4 hurricane.
  2. Poor decisions in the heat of the moment. Decision making in the midst of a crisis is very difficult.   The Coast Guard and NTSB put most of the blame on Davidson for poor decision making, failure to listen to the concerns of the crew, and relying on outdated weather information.

These same types of failures apply to investing.  Imagine the retiree who sells all of his equity exposure in early 2009 and then sits out the first few years of the bull market, or the retiree who goes all-in on tech stocks in 2000 after finally getting frustrated with hearing how much money his friend had made off of Pets.com.  Taking a 50%+ loss on your equity exposure is bad; panicking and making rash decisions can throw your financial plans off track for good.

Compounding bad events with bad decisions is a recipe for fast failure.  Avoiding this fate means:

  1. Having a plan in place ahead of time.
  2. Systematizing your process if you plan on actively making decisions during a crisis (instead of simply holding). Lay out ahead of time how you will react to various triggers.
  3. Sticking to your plan, even when it may feel a bit uncomfortable.
  4. Diversifying, diversifying, diversifying.

On that last point, the benefits of diversifying your diversifiers cannot be overstated.

For example, take the following four common risk management techniques:

  1. Static allocation to fixed income (60% SPY / 40% IEF blend)
  2. Risk parity (Salient Risk Parity Index)
  3. Managed futures (Salient Trend Index)
  4. Tactical equity with trend-following (binary SPY or IEF depending on 10-month SPY return).

We see that a simple equal-weight blend of the four strategies delivers risk-adjusted returns that are in line with the best individual strategy.  In other words, the power of diversification is so significant that an equal-weight portfolio performs nearly the same as someone who had a crystal ball at the beginning of the period and could foresee which strategy would do the best.

Data Source: CSI, Salient, Bloomberg. Calculations by Newfound Research. Past performance does not guarantee future results. Returns include no fees except underlying ETF fees. Returns include the reinvestment of dividends. Blend is an equal-weight portfolio of the four strategies that is rebalanced on a monthly basis.
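As a rough illustration of the tactical component listed above, the binary 10-month trend rule can be sketched as follows. The zero threshold and the sample return streams are illustrative assumptions, not the exact implementation behind the results.

```python
def trend_signal(monthly_returns, lookback=10):
    """Hold SPY if the trailing compound return is positive, else rotate to IEF."""
    compound = 1.0
    for r in monthly_returns[-lookback:]:
        compound *= 1 + r
    return "SPY" if compound > 1.0 else "IEF"

# Hypothetical return streams for illustration only.
up_trend = [0.01] * 12     # steady gains -> stay in equities
down_trend = [-0.02] * 12  # persistent losses -> rotate to bonds

print(trend_signal(up_trend), trend_signal(down_trend))
```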

 

Achieving Risk Ignition

In the wake of the tech bubble and the global financial crisis, lots of attention has (rightly) been given to portfolio risk management.  Too often, however, we see risk management used as a synonym for risk reduction.  Instead, we believe that risk management is ultimately taking the right amount of risk, not too little or too much.  We call this achieving risk ignition[9] (a phrase we stole from Aaron Brown), where we harness the power of risk to achieve our objectives.

In our opinion, a key part of achieving risk ignition is utilizing strategies that can dynamically adapt the amount of risk in the portfolio to the current market environment.

As an example, take an investor that wants to target 10% volatility using a stock/bond mix.  Using historical data going back to the 1980s, this would require holding 55% in stocks and 45% in bonds.  Yet, our research shows that 20% of that bond position is held simply to offset the worst 3 years of equity returns. With 10-year Treasuries yielding only 2.8%, the cost of re-allocating this 20% of the portfolio from stocks to bonds just to protect against market crashes is significant.
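For intuition, here is a sketch of the volatility-targeting arithmetic: solving for the stock weight in a two-asset mix that hits a target portfolio volatility. The 16% stock volatility, 5% bond volatility, and 0.1 correlation below are hypothetical placeholders, not the historical estimates behind the 55/45 figure above.

```python
import math

def stock_weight_for_target_vol(target, vol_s, vol_b, corr):
    """Solve w^2*vs^2 + (1-w)^2*vb^2 + 2w(1-w)*corr*vs*vb = target^2 for w."""
    a = vol_s ** 2 + vol_b ** 2 - 2 * corr * vol_s * vol_b
    b = 2 * corr * vol_s * vol_b - 2 * vol_b ** 2
    c = vol_b ** 2 - target ** 2
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("target volatility is not attainable with these assets")
    w = (-b + math.sqrt(disc)) / (2 * a)  # take the larger root
    return min(max(w, 0.0), 1.0)          # clamp to a long-only weight

# Hypothetical inputs: 16% stock vol, 5% bond vol, 0.1 correlation.
w = stock_weight_for_target_vol(target=0.10, vol_s=0.16, vol_b=0.05, corr=0.10)
print(round(w, 2))  # prints 0.6 under these assumed inputs
```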

This is why we advocate using tactical asset allocation as a pivot around a strategic asset allocation core.  Let’s continue to use the 55/45 stock/bond blend as a starting point.  We can take 30% of the portfolio and put it into a tactical strategy that has the flexibility to move between 100% stocks and 100% bonds.  We fund this allocation by taking half of the capital (15%) from stocks and the other half from bonds.  Now our portfolio has 40% in stocks, 30% in bonds, and 30% in tactical.  When the market is trending upwards, the tactical strategy will likely be fully invested and the entire portfolio will be tilted 70/30 towards stocks, taking advantage of the equity market tailwinds.  When trends turn negative, the tactical strategy will re-allocate towards bonds and in the most extreme configuration tilt the entire portfolio to a 40/60 stock/bond mix.
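The allocation arithmetic above can be verified with a small sketch (the 55/45 core and 30% sleeve are the figures from the text):

```python
def effective_mix(core_stock=0.55, core_bond=0.45, sleeve=0.30, sleeve_in_stocks=1.0):
    """Overall stock/bond split when a tactical sleeve, funded half from each
    side of the core, allocates `sleeve_in_stocks` of itself to stocks."""
    stocks = (core_stock - sleeve / 2) + sleeve * sleeve_in_stocks
    bonds = (core_bond - sleeve / 2) + sleeve * (1 - sleeve_in_stocks)
    return round(stocks, 2), round(bonds, 2)

print(effective_mix(sleeve_in_stocks=1.0))  # (0.7, 0.3): fully risk-on tilt
print(effective_mix(sleeve_in_stocks=0.0))  # (0.4, 0.6): fully defensive tilt
```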

In this manner, we can use a dynamic strategy to dial the overall portfolio’s risk up and down as market risk ebbs and flows.

Summary

For most investors, failure means not meeting one’s financial objectives.  In the portfolio management context, failure comes in two flavors: slow failure results from taking too little risk and fast failure results from taking too much risk.

While slow failure has typically resulted from allocating too conservatively or holding excessive cash balances, the current low return environment means that even investors doing everything by the book may not be able to achieve the growth necessary to meet their goals.

Fast failure, on the other hand, is always a reality for investors.  Market crashes will happen eventually.  The biggest risk for investors is that they are unlucky enough to experience a market crash at the wrong time.  We call this sequence risk.

A robust risk management strategy should seek to manage the risk of both slow failure and fast failure.  This means not simply seeking to minimize risk, but rather calibrating it to both the objective and the market environment.

 


 

[1] Using Research Affiliates’ asset allocation tool, the efficient portfolio that delivers an expected real return of 4% means taking on estimated annualized volatility of 12%.  This portfolio has more than double the volatility of a 40% U.S. large-cap / 60% intermediate Treasuries portfolio, which not coincidentally returned 4% after inflation going back to the 1920s.

[2] The exact allocations are 0.5% U.S. small-cap, 14.1% foreign developed equities, 24.6% emerging market equities, 12.0% long-term Treasuries, 5.0% intermediate-term Treasuries, 0.8% high yield, 4.5% bank loans, 2.5% emerging market bonds (USD), 8.1% emerging market bonds (local currency), 4.4% emerging market currencies, 3.2% REITs, 8.6% U.S. commercial real estate, 4.2% commodities, and 7.5% private equity.

[3] https://blog.thinknewfound.com/2017/08/impact-high-equity-valuations-safe-retirement-withdrawal-rates/

[4] https://blog.thinknewfound.com/2017/09/butterfly-effect-retirement-planning/

[5] https://blog.thinknewfound.com/2017/09/addressing-low-return-forecasts-retirement-tactical-allocation/

[6] https://blog.thinknewfound.com/2017/12/no-silver-bullets-8-ideas-financial-planning-low-return-environment/

[7] Obviously, there are scenarios where large losses alone can be devastating.  One example is losses that are permanent or that take an investment’s value to zero or negative (e.g. investments that use leverage).  Another is large losses that occur in portfolios that are meant to fund short-term objectives/liabilities.

[8] We assume 4% withdrawals increased for 2% annual inflation.

[9] https://blog.thinknewfound.com/2015/09/achieving-risk-ignition/

The Butterfly Effect in Retirement Planning

This article is available for download as a PDF here

Summary

  • The current low return outlook for stocks and bonds paints a gloomy picture for retirees under common retirement forecasting assumptions.
  • However, assumptions such as net investment returns and retirement spending can have a large impact on forecasted retirement success, even for small changes in parameters.
  • By boosting returns through a combination of broader asset class and strategy diversification, considering lower fee options for passive exposures, and nailing down how retirement spending will evolve over time, we can arrive at retirement success projections that are both more reflective of a retiree’s actual situation and more in line with historical experience.

A few weeks back, we wrote about the potential impact that high core asset valuations – and the associated muted forward return expectations – may have on retirement[1].

In the post, we presented the following visualization:

Historical Wealth Paths for a 4% Withdrawal Rate and 60/40 Stock/Bond Allocation

Source: Shiller Data Library. Calculations by Newfound Research. Credit to Reddit user zaladin for the graph format. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

The horizontal axis (x-axis) represents the year when retirement starts.  The vertical axis (y-axis) represents a given year in history.  The coloring of each cell represents the savings balance at a given point in time.  The meaning of each color is as follows:

  • Green: Current account value is greater than or equal to the initial account value (e.g. an investor starting retirement with $1,000,000 has a current account balance of at least $1,000,000).
  • Yellow: Current account value is between 75% and 100% of the initial account value.
  • Orange: Current account value is between 50% and 75% of the initial account value.
  • Red: Current account value is between 25% and 50% of the initial account value.
  • Dark Red: Current account value is between 0% and 25% of the initial account value.
  • Black: Current account value is zero; the investor has run out of money.

We then recreated the visualization, but with one key modification: we adjusted the historical stock and bond returns downward so that the long-term averages are in line with realistic future return expectations[2] given current valuation levels.  We did this by subtracting the difference between the actual average log return and the forward-looking long-term log return from each year’s return.  With this technique, we capture the effect of subdued average returns while retaining realistic behavior for shorter-term returns.

Historical Wealth Paths for a 4% Withdrawal Rate and 60/40 Stock/Bond Allocation with Current Return Expectations

Source: Shiller Data Library. Calculations by Newfound Research. Credit to Reddit user zaladin for the graph format. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.
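Mechanically, the return adjustment described above works like the sketch below. The sample returns and the 3% forward-looking target are hypothetical placeholders, not the actual Shiller data or our capital market assumptions.

```python
import math

def shift_to_target(returns, target_geo_mean):
    """Shift each log return by a constant so the series' geometric mean
    matches the forward-looking target, preserving year-to-year variation."""
    logs = [math.log(1 + r) for r in returns]
    shift = sum(logs) / len(logs) - math.log(1 + target_geo_mean)
    return [math.exp(l - shift) - 1 for l in logs]

historical = [0.10, -0.05, 0.20, 0.07, -0.12, 0.15]  # hypothetical annual returns
adjusted = shift_to_target(historical, target_geo_mean=0.03)

# By construction, the adjusted series compounds at exactly 3% per year.
geo = math.exp(sum(math.log(1 + r) for r in adjusted) / len(adjusted)) - 1
print(round(geo, 4))
```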

 

One downside of the above visualizations is that they only consider one withdrawal rate / portfolio composition combination.  If we want to see results for withdrawal rates ranging from 1% to 10% in 1% increments and portfolio combinations ranging from 0/100 stocks/bonds to 100/0 stocks/bonds in 20% increments, we would need sixty graphs!

To distill things a bit more, we looked at the historical “success” of various investment and withdrawal strategies.  We evaluated success on three metrics:

  1. Absolute Success Rate (“ASR”): The historical probability that an individual or couple will not run out of money before their retirement horizon ends.
  2. Comfortable Success Rate (“CSR”): The historical probability that an individual or couple will have at least the same amount of money, in real terms, at the end of their retirement horizon compared to what they started with.
  3. Ulcer Index (“UI”): The average pain of the wealth path over the retirement horizon where pain is measured as the severity and duration of wealth drawdowns relative to starting wealth[3].
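As one concrete reading of the third metric, an Ulcer-Index-style measure can be computed as the root-mean-square of drawdowns measured relative to starting wealth, so both the severity and the duration of dips add pain. The exact formula used in the commentary may differ, and the wealth path below is hypothetical.

```python
import math

def ulcer_index(wealth, start):
    """Root-mean-square drawdown relative to starting wealth: deeper and
    longer-lived dips below the initial balance both increase the index."""
    drawdowns = [max(0.0, (start - w) / start) for w in wealth]
    return math.sqrt(sum(d * d for d in drawdowns) / len(drawdowns))

path = [100, 95, 80, 85, 90, 105, 110]  # hypothetical wealth path
print(round(ulcer_index(path, start=100), 4))
```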

As a quick refresher, below we present the ASR for various withdrawal rate / risk profile combinations over a 30-year retirement horizon first using historical returns and then using historical returns adjusted to reflect current valuation levels.

Absolute Success Rate for Various Combinations of Withdrawal Rate and Portfolio Composition – 30 Yr. Horizon

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Absolute Success Rate for Various Combinations of Withdrawal Rate and Portfolio Composition with Average Stock and Bond Returns Equal to Current Expectations – 30 Yr. Horizon

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Overall, our analysis suggested that retirement withdrawal rates that were once safe may now deliver success rates that are no better – or even worse – than a coin flip.

Over the coming weeks, we want to delve a bit deeper into this topic.  Specifically, we are going to explore some key properties of distribution portfolios – portfolios from which investors take regular withdrawals to finance retirement spending – as well as some strategies that investors may consider in order to improve retirement outcomes.

This week we are going to focus on the high degree of sensitivity that retirement planning outcomes can have to initial assumptions.  In upcoming weeks, we will explore other retirement investment topics, including:

  1. The sequence of returns and risk management.
  2. The impact of behavioral finance and investor emotions.
  3. Finding the right portfolio risk profile through retirement.

The Butterfly Effect in Retirement Portfolios

Quoting from a great piece on distribution portfolio theory by James Sandidge[4]:

“The butterfly effect refers to the ability of small changes early in a process that lead to significant impact later.  It gets its name from the idea that a butterfly flapping its wings in Brazil could trigger a chain of events that would culminate in the formation of a tornado in Texas[5].  The butterfly effect applies to distribution portfolios where even small changes early in retirement can have significant impact long-term.” 

One example of the butterfly effect in the context of retirement planning is the impact of small changes in long-term average returns.  These differences could arise from investment outperformance or underperformance, fees, expenses, or taxes.

In the example below, we consider a 60/40 stock/bond investor with a 30-year investment horizon and a 4% target withdrawal rate, adjusted each year for inflation.  We consider three scenarios:

  1. Pessimistic Scenario: Average annual portfolio returns are 100bps below our long-term assumption (e.g. we picked bad managers, allocated assets poorly, paid high fees, etc.).
  2. Base Case Scenario: Average annual portfolio returns are equal to our long-term assumption.
  3. Optimistic Scenario: Average annual portfolio returns are 100bps above our long-term assumption (e.g. we picked good managers, nailed our asset allocation, paid lower than expected fees, etc.).

We see that varying our return assumption by just +/-100bps can swing our probability of fully funding retirement – without decreasing withdrawals below plan – from 48% to 74%.  Similarly, the probability of ending retirement with our original nest egg fully intact ranges from 11% in the pessimistic scenario to 47% in the optimistic scenario.

In the optimistic scenario, the median ending wealth after 30 years is $800k for an initial investment of $1mm.  Not outstanding but certainly nothing to complain about.  In the pessimistic scenario, however, our median ending wealth is zero, meaning the most likely outcome is running out of money!

The Butterfly Effect and Changes to Average Long-Term Return Assumption:
30-Yr. Horizon, 60/40 Allocation, 4% Withdrawals

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.
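A deterministic sketch conveys this sensitivity even without the full simulation machinery behind the figures above. The 5% base return and 2% inflation here are hypothetical placeholders, not the capital market assumptions used in the analysis.

```python
def ending_wealth(annual_return, years=30, start=1_000_000,
                  withdrawal_rate=0.04, inflation=0.02):
    """Ending balance for a constant-return path with inflation-adjusted withdrawals."""
    wealth = start
    withdrawal = start * withdrawal_rate
    for _ in range(years):
        wealth = max(wealth * (1 + annual_return) - withdrawal, 0.0)
        withdrawal *= 1 + inflation
    return wealth

# A +/-100bps shift around a hypothetical 5% base return moves the ending
# balance dramatically in both directions.
for label, r in [("pessimistic", 0.04), ("base", 0.05), ("optimistic", 0.06)]:
    print(label, round(ending_wealth(r)))
```

Under these assumed inputs, the pessimistic path ends with well under half the base case's balance while the optimistic path ends with far more, echoing the butterfly effect described above.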

 

Below, we present one example that is particularly telling: an investor that retired in 1973[6].  We see that a 100bps difference in returns in either direction can literally be the difference between running out of funds (gray), sweating every dollar and cent (orange), or a relatively comfortable retirement (blue).

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

Camouflaged Butterflies: Assumptions in Spending Rate Changes

An example of a secondary input that may sometimes be glossed over, but that nonetheless can have a large impact on outcomes, is the assumption regarding how quickly withdrawals will increase relative to inflation.  Again, we consider three scenarios:

  1. Withdrawals increase at a rate that is 1% slower than inflation (i.e. spending will rise by 2% year-over-year when inflation is 3% – spending falls in real terms).
  2. Withdrawals increase at the same rate of inflation (spending stays constant in real terms).
  3. Withdrawals increase at a rate that is 1% faster than inflation (i.e. spending will rise by 4% year-over-year when inflation is 3% – spending rises in real terms). This is probably an unrealistic scenario, for reasons that we will discuss later, but it still helps illustrate the sensitivity of planning analysis to its inputs.

The Butterfly Effect and Changes to the Spending Growth Assumption:
30-Yr. Horizon, 60/40 Allocation, 4% Withdrawals

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Overall, the results are very similar in magnitude to what we saw when we adjusted the return assumption.

Implications of the Butterfly Effect

The examples above provide clear evidence that retirement success is significantly impacted by both primary and secondary assumptions.  But what does this mean for investors?  We think there are two main implications.

Getting the details right is crucial.    

First, it’s important to get the details right when planning for retirement.  To highlight this, let’s return to the topic of spending.  Many financial calculators assume that spending increases one-for-one with inflation through retirement.  Put differently, this assumes that spending is constant after adjusting for inflation.

Data from the Employee Benefit Research Institute (“EBRI”) suggests that this is generally an erroneous assumption.  Instead, spending tends to decline as retirees age.  Specifically, EBRI found that on average spending declines 20% from age 50-64 to 65-79, 22% from age 65-79 to 80-89, and 12% from age 80-89 to 90+.

(Note: This is obviously a gross oversimplification of actual spending behavior.  At the end of this commentary, we discuss a few interesting research pieces on this topic.  They make clear the importance of customizing spending assumptions to each client’s situation and preferences.)

Source: “Adaptive Distribution Theory” by James B. Sandidge
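To see how a declining-spending assumption plays out, the sketch below grows nominal withdrawals at roughly inflation minus the real decline rate. The 2% inflation figure is a hypothetical placeholder; the 1% annual real decline matches the adjusted scenario used below.

```python
def withdrawal_schedule(initial, years, inflation=0.02, real_decline=0.01):
    """Nominal withdrawals when real spending declines `real_decline` per year."""
    nominal_growth = (1 + inflation) * (1 - real_decline) - 1  # ~ inflation - decline
    schedule = []
    withdrawal = initial
    for _ in range(years):
        schedule.append(withdrawal)
        withdrawal *= 1 + nominal_growth
    return schedule

sched = withdrawal_schedule(40_000, years=30)
real_final = sched[-1] / 1.02 ** 29  # deflate year-30 withdrawal back to today
print(round(sched[-1]), round(real_final))
```

Under these assumptions, the year-30 withdrawal is larger in nominal dollars but roughly 25% smaller in real terms, which is what eases the pressure on the portfolio.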

 

Implementing more realistic spending assumptions makes a material difference in our Absolute Success Rate (“ASR”), Comfortable Success Rate (“CSR”), and Ulcer Index stats.

Below, we recreate our ASR, CSR, and Ulcer Index tables assuming that real spending declines by 1% per year.  We also compare these measures across three scenarios for a 4% withdrawal rate:

  1. Historical return assumptions and constant real spending
  2. Current return assumptions and constant real spending
  3. Current return assumptions and 1% per year decline in real spending

We see that our adjusted spending assumption helps to close the gap between the historical and forward-looking return scenarios.  This is especially true when we look at the ASR.

For example, a 60/40 portfolio and 4% constant real withdrawal rate produced an ASR of 99% across all historical market scenarios.  The success rate dropped all the way to 58% when we adjusted the historical stock and bond returns downward for our future expectations.  Changing to the declining spending path increases the success rate from 58% to 75%.

 

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Absolute Success Rate for Various Combinations of Withdrawal Rate and Portfolio Composition with Average Stock and Bond Returns Equal to Current Expectations and Real Spending Declining by 1% Per Year – 30 Yr. Horizon

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Comfortable Success Rate for Various Combinations of Withdrawal Rate and Portfolio Composition with Average Stock and Bond Returns Equal to Current Expectations and Real Spending Declining by 1% Per Year – 30 Yr. Horizon

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Ulcer Index for Various Combinations of Withdrawal Rate and Portfolio Composition with Average Stock and Bond Returns Equal to Current Expectations and Real Spending Declining by 1% Per Year – 30 Yr. Horizon

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Incremental increases (decreases) in portfolio returns (spending) matter, a lot.

Reducing spending is a very personal topic, so we will focus on some potential ways to grind out some incremental portfolio gains.  (Note: another important topic when constructing withdrawal portfolios is managing sequence of return risk.  We will address this topic in a future post.)

First, it’s important to be strategic, not static.  To us, this means having a thoughtful, forward-looking outlook when setting a strategic asset allocation.  A big part of this is fighting the temptation of home-country bias.

Source: https://personal.vanguard.com/pdf/icrrhb.pdf

 

This tendency to prefer home-country assets not only leaves quite a bit of diversification on the table, but also puts U.S. investors on the wrong side of current equity market valuations.

Source: https://personal.vanguard.com/pdf/icrrhb.pdf

 

Based upon a blended set of capital market assumptions sourced from J.P. Morgan, Blackrock, and BNY Mellon, we see that it’s possible to increase long-term expected returns by between 30bps and 50bps, depending on desired risk profile, by moving beyond U.S. stocks and bonds[7].  Last week we discussed the “weird portfolios” that may be best positioned for the future.

Source: J.P. Morgan, Blackrock, BNY Mellon, Newfound Research. Return forecasts are forward-looking statements based upon the reasonable beliefs of Newfound Research and are not a guarantee of future performance. Forecasts are not representative of any Newfound Research strategy or fund. Forward-looking statements speak only as of the date they are made and Newfound assumes no duty to and does not undertake to update forward-looking statements. Forward looking statements are subject to numerous assumptions, risks, and uncertainties, which change over time. Actual results may differ materially from those anticipated in forward-looking statements. Returns are presented gross of taxes and fees.

Second, we recommend using a hybrid active/passive approach for core exposures given the increasing availability of evidence-based, factor-driven investment strategies.  Now, this sounds great in theory, but with over 300 factors now identified across the global equity markets and the proliferation of “smart beta” ETFs, it is reasonable to wonder how in the world one can have a view of which factors will actually work going forward.  To dig into this a bit deeper, let’s look at one of our favorite examples of factor-based investing.

 

This portfolio, suggested by Vanguard, buys companies whose tickers start with the letters S, M, A, R, or T.  This is not a real portfolio that anyone should invest in; yet it has identified an anomalous outperformance pattern.  On a backtested basis, the S.M.A.R.T. Beta portfolio nearly doubled the annualized return of the S&P 500.

 

In order to determine the validity of this so-called factor, we need to understand:

 

  1. What is the theory that explains why the factor works (provides excess return)? Without a theory for why something works, we cannot possibly form an intelligent view as to whether or not it will work in the future.
  2. How has the factor performed on an out-of-sample basis? This is math speak for the following types of questions: How has the factor performed since its discovery?  How does the factor work with slightly different implementations?  Does the factor perform well in other asset classes and geographies?

 

In the case of the S.M.A.R.T. Beta factor, these questions allow us to quickly dismiss it.  There is obviously no good reason – at least no good reason we can think of off the top of our heads – for why the first letter in a stock’s ticker should drive returns[8].  While we have not tested S.M.A.R.T. Beta across asset classes and geographies, we know that this was simply a tongue-in-cheek example presented by Vanguard trying to get the point across that it’s easy to find something that works in the past, but much harder to find something that works in the future.  We suspect that if we did test the strategy in other countries, as an example, it would probably outperform in some cases and underperform in others.  This lack of robustness would be a clear sign that our level of confidence in this factor going forward should be very low.

So, what factors do meet these criteria (in our view)?  Only four that are applicable to stocks:

  • Value: Buy cheap stocks and sell expensive ones
  • Momentum: Buy outperforming securities and sell underperforming ones
  • Defensive: Buy lower risk/higher quality securities and sell higher risk/lower quality ones
  • Size/Liquidity: Buy smaller/less liquid companies and sell larger/more liquid ones[9]

Data Source: AQR, Calculations by Newfound Research. Value is the HML Devil factor. Momentum is the UMD factor. Defensive is a blend of the BAB and QMJ factors. Size is the SMB factor. Equal Weight is an equally weighted blend of all four factors, rebalanced monthly. Returns include the reinvestment of dividends and are gross of all fees and expenses. Past performance does not guarantee future results.
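With monthly rebalancing back to equal weights, the blend's monthly return is simply the average of the four factors' monthly returns, a detail worth making explicit. The return series below are hypothetical placeholders, not the AQR factor data.

```python
def blend_returns(*factor_series):
    """Monthly returns of an equal-weight blend rebalanced every month."""
    return [sum(month) / len(month) for month in zip(*factor_series)]

# Hypothetical monthly factor returns for illustration only.
value     = [0.010, -0.004, 0.006]
momentum  = [0.008, 0.012, -0.010]
defensive = [0.002, 0.003, 0.004]
size      = [-0.001, 0.005, 0.002]

print(blend_returns(value, momentum, defensive, size))
```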

 

Going back to 1957, an equally-weighted blend of the four factors mentioned above would have generated more than 500bps of annualized excess return before fees and expenses.  Even if we discount future performance by 50% for reduced strategy efficacy and fees, the equal-weight factor portfolio could add nearly 160bps for a 60/40 investor[10].

Third, we recommend looking beyond fixed income for risk management.  Broadly speaking, we divide asset classes and strategies into two categories: return generators and risk mitigators.

Over the last 30+ years, investors have been very fortunate that their primary risk mitigator – fixed income – happened to experience an historic bull market.

Unfortunately, our situation today is much different than the early 1980s.  Current yields are very low by historical standards, implying that fixed income is likely to be a drag on portfolio performance, especially after accounting for inflation.  However, that does not mean that bonds should not still play a key role in all but the most aggressive portfolios.  It simply means that the premium for using bonds as a form of portfolio insurance is high relative to historical standards.  As a result, we advocate looking for complementary risk management tools.

One option here would be to employ a multi-strategy, unconstrained sleeve like we constructed in a recent commentary[11]. When constructed with the right objectives in mind, these types of portfolios can act as an effective buffer to equity market volatility without the cost of large fixed income positions in a low interest rate environment.  Let’s take the Absolute Return strategy that we discussed in that piece.  It was constructed by optimizing for an equal risk contribution across the following seven asset classes and strategies:

  1. U.S. Treasuries: 25%
  2. Low volatility equities: 8%
  3. Trend-based tactical asset allocation: 9%
  4. Value-based tactical asset allocation: 12%
  5. Unconstrained fixed income: 25%
  6. Risk Parity: 9%
  7. Managed Futures: 12%

Now let’s consider our typical 60/40 investor.  Historically, a 25% allocation to this unconstrained sleeve with 18.8% (3/4 of the 25%) taken from fixed income and 6.3% (1/4 of the 25%) taken from equities would have left the investor in the same place as the original 60/40 from a risk perspective.  This holds true whether we measure risk as volatility or maximum drawdown.
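The funding arithmetic above can be written as a small helper. This is a minimal sketch of the reallocation described in the text, with the function name and parameters chosen for illustration.

```python
# Fund an unconstrained sleeve by taking 3/4 of its size from fixed income
# and 1/4 from equities, as described in the text.
def fund_sleeve(equity, bonds, sleeve_size, frac_from_bonds=0.75):
    """Return (equity, bonds, sleeve) weights after funding the sleeve."""
    from_bonds = sleeve_size * frac_from_bonds
    from_equity = sleeve_size * (1 - frac_from_bonds)
    return equity - from_equity, bonds - from_bonds, sleeve_size

# 60/40 portfolio with a 25% sleeve: approximately 53.7% / 21.3% / 25%.
eq, bd, sleeve = fund_sleeve(0.60, 0.40, 0.25)
print(f"equities: {eq:.1%}, bonds: {bd:.1%}, sleeve: {sleeve:.1%}")
```

The resulting weights sum to 100%, matching the 18.8%-from-bonds and 6.3%-from-equities split quoted above (after rounding).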

When we regress the absolute return strategy on world equities and U.S. Treasuries, we get the following results (data for this analysis covers the period from January 1993 to June 2016):

  • A loading to global equities of 0.25
  • A loading to U.S. Treasuries of 0.49
  • Annualized alpha of approximately 2%
  • Annualized residual volatility of 2.2%
  • An R-squared of around 0.77

From the relatively high R-squared, we can conclude that a decent way to think of the absolute return portfolio is as a combination of three positions: 1) a 25% allocation to world stocks, 2) a 49% allocation to U.S. Treasuries, and 3) a 100% allocation to an unconstrained long/short portfolio with historical performance characterized by a 2% excess return and 2.2% volatility.
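A regression of this kind can be run with ordinary least squares. The series below are simulated stand-ins constructed to roughly match the loadings quoted above; the actual analysis uses real strategy, equity, and Treasury return data.

```python
import numpy as np

# Simulated monthly returns standing in for the real data
# (Jan 1993 - Jun 2016 is 282 months).
rng = np.random.default_rng(1)
n = 282
equities = rng.normal(0.006, 0.04, n)
treasuries = rng.normal(0.004, 0.015, n)
strategy = (0.25 * equities + 0.49 * treasuries + 0.02 / 12
            + rng.normal(0, 0.022 / np.sqrt(12), n))

# OLS via least squares: strategy ~ alpha + b_eq*equities + b_ust*treasuries
X = np.column_stack([np.ones(n), equities, treasuries])
betas, *_ = np.linalg.lstsq(X, strategy, rcond=None)
resid = strategy - X @ betas
r2 = 1 - resid.var() / strategy.var()
print(f"alpha (ann.): {betas[0] * 12:.2%}, beta_eq: {betas[1]:.2f}, "
      f"beta_ust: {betas[2]:.2f}, R^2: {r2:.2f}")
```

With real data, the estimated loadings, alpha, and R-squared are the numbers reported in the bullet list above.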

Using this construct, we can get at least a rough idea of what to expect going forward by plugging in our capital market assumptions for world equities and U.S. Treasuries and making a reasonable assumption for what the long/short portfolio can deliver going forward on a net-of-fee basis. Let’s assume, as we did in the factor discussion, that the long/short portfolio only captures around 50% of its historical performance after fees.  This would still imply an expected forward-looking return of 4.1%, compared to an average expected return of 2.5% for U.S. core bonds[12].  For the 60/40 investor, this could mean close to 25bps of incremental return.
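The forward-looking estimate is a simple weighted sum of the regression loadings and the assumed forward returns. The capital market assumptions below are illustrative placeholders, not the actual J.P. Morgan/Blackrock/BNY Mellon figures used in the text.

```python
# Back-of-the-envelope forward return for the absolute return sleeve,
# using the historical regression loadings and illustrative (assumed)
# capital market assumptions.
beta_eq, beta_ust, alpha_hist = 0.25, 0.49, 0.02

exp_equities = 0.06    # assumed forward return for world equities
exp_treasuries = 0.03  # assumed forward return for U.S. Treasuries
alpha_haircut = 0.50   # capture only 50% of historical alpha after fees

exp_sleeve = (beta_eq * exp_equities
              + beta_ust * exp_treasuries
              + alpha_haircut * alpha_hist)
print(f"expected sleeve return: {exp_sleeve:.1%}")
```

With these placeholder inputs, the estimate lands near the 4% area cited in the text; different capital market assumptions will shift the result accordingly.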

Finally, we should seek to reduce fees, all else being equal.  There are four things we think are worth mentioning here.

  1. We need to consider fees holistically. This means looking beyond expense ratios and considering factors like execution costs (e.g. bid/ask spread), commissions, and ticket charges.
  2. The “all else being equal” part is really important. We want to be fee-conscious, not fee-centric.  Just like you probably don’t always buy the cheapest home, clothes, and electronics, we don’t believe in defaulting to the lowest cost investment option in all cases.  We want to find value in the investments we choose.  If market-cap weighted equity exposure costs 5bps and we can get multi-factor exposure for 25bps, we will not eliminate the factor product from consideration just due to higher fees if we believe it can offer more than 20bps in incremental value. Fortunately, the proliferation of passive investment vehicles effectively being offered for free has helped put downward pressure on products throughout the industry.
  3. We have to remember that while there are many merits to a passive, market-cap weighted approach, the rise of this type of investing has largely coincided with upward trends in equity and bond valuations. In other words, the return pie has been very big, and so the name of the game has been capturing as much of the pie as possible, usually by minimizing fees and staying disciplined (after all, a passive approach to investing, like any other approach, only works long-term if we can stick with it, and behavioral science and experience suggest there are real difficulties doing so, especially when markets get volatile).  Today, we are in a fundamentally different situation.  The pie is nearly as small as it’s ever been.  For many investors, even capturing 100% of the pie may not be enough.  Instead, many must search out ways to expand the pie in order to meet their goals.
  4. From a behavioral perspective, there is nothing wrong with channeling our inner Harry Markowitz and going with a hybrid active[13]/passive approach within the same portfolio. Markowitz, who helped revolutionize portfolio construction theory with his landmark paper “Portfolio Selection,” famously explained that when building his own portfolio he knew he should have “…computed the historical covariances of the asset classes and drawn an efficient frontier.”  Instead, he said, “I visualized my grief if the stock market went way up and I wasn’t in it – or if it went way down and I was completely in it.  So, I split my contributions 50/50 between stocks and bonds.”  We are strong advocates for passive, just not for 100% concentration in passive.

Let’s say as an example that by using these techniques, we are able to improve returns by 150bps annually.  What would the impact be on ASR, CSR, and Ulcer Index using our same framework?  For this analysis, we retain our assumption from earlier that real spending declines by 1% per year.
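The declining real spending assumption used throughout this analysis can be sketched as a simple withdrawal schedule. The function and its parameters are illustrative; the actual study applies this schedule within simulations built on the Shiller stock and bond data.

```python
# Withdrawals start at the chosen withdrawal rate and decline 1% per year
# in real (inflation-adjusted) terms over the retirement horizon.
def real_withdrawals(initial_rate, years=30, annual_decline=0.01):
    """Real withdrawal per $1 of starting wealth for each year."""
    return [initial_rate * (1 - annual_decline) ** t for t in range(years)]

w = real_withdrawals(0.04)
print(f"year 1: {w[0]:.4f}, year 30: {w[-1]:.4f}")
```

By year 30, real spending is roughly 25% below its starting level, which is why this assumption materially improves measured success rates relative to constant real spending.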

Absolute Success Rate for Various Combinations of Withdrawal Rate and Portfolio Composition with Average Stock and Bond Returns Equal to Current Expectations Plus 150bps and Real Spending Declining by 1% Per Year – 30 Yr. Horizon

Comfortable Success Rate for Various Combinations of Withdrawal Rate and Portfolio Composition with Average Stock and Bond Returns Equal to Current Expectations Plus 150bps and Real Spending Declining by 1% Per Year – 30 Yr. Horizon

Ulcer Index for Various Combinations of Withdrawal Rate and Portfolio Composition with Average Stock and Bond Returns Equal to Current Expectations Plus 150bps and Real Spending Declining by 1% Per Year – 30 Yr. Horizon

Source: Shiller Data Library. Calculations by Newfound Research. Analysis assumes the reinvestment of dividends. Returns are hypothetical index returns and are gross of all fees and expenses. Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Conclusion: The Sum of All Assumptions in Retirement

Retirement projections are based on many different assumptions including asset class returns, time horizon, allocation strategies, inflation, and how withdrawals evolve over time. Small changes in many of these assumptions can have a large impact on retirement success rates (the Retirement Butterfly Effect).

High valuations of core assets in the U.S. suggest that retirement withdrawal rates that were once safe may now deliver success rates that are no better – or even worse – than a coin flip.  However, by focusing our efforts on refining the assumptions that go into retirement planning, we can arrive at results that do not spell doom and gloom for retirees.

While getting all the details right is ideal, there are specific areas that matter the most.

For returns, what matters is increasing net returns, and there are many knobs to adjust.  Incorporating factor-based strategies and broader diversification are good starting points. Expanding the use of international equity and unconstrained strategy exposure can be simple modifications to traditional U.S. equity and bond heavy portfolios that may give a boost to forward-looking returns.

Fees, expenses, and taxes can be other areas to examine as long as we keep in mind that it is best to be fee/expense/tax-conscious, not fee/expense/tax-centric.  Slight fee or tax inefficiencies can cause a “guaranteed” loss of return, but these effects must be weighed against the potential upside.

For many exposures (e.g. passive and long-only core stock and bond exposure), minimizing cost is certainly appropriate.  However, do not let cost considerations preclude the consideration of strategies or asset classes that can bring unique return generating or risk mitigating characteristics to the portfolio.

These are all ideas that help form the foundation for our QuBe Model Portfolios.

With spending, the assumption that retirees will track inflation with their withdrawals throughout a 30-year retirement is not applicable across the board. Nailing down spending is tough, but improved assumptions can have a big impact on retirement forecasts. A thorough conversation on housing, health care, travel, insurance, and general consumption is critical.

As with any model that produces a forecast, there will always be errors in retirement projections. When asset class returns are strong, as they have been in previous decades, we can comfortably brush many assumptions under the rug. However, with muted future returns, achieving financial goals requires a better understanding of model sensitivities and more diligent research into how to equip portfolios to thrive in such an environment.

 

Appendix: Retiree Spending Behavior

Estimating the True Cost of Retirement[14]

David Blanchett, Head of Retirement Research for Morningstar Investment Management, argues that the common assumptions of a generic replacement rate[15], constant real spending, and a fixed retirement horizon do not accurately capture the highly personalized nature of a retiree’s spending behavior.

Key takeaways include:

  1. From a category perspective, the main changes through retirement are a decline in relative spending on insurance and pensions and an increase in health care spending.

    Source: Blanchett’s Estimating the True Cost of Retirement

  2. Forecasts on spending by category can be used to determine a customized spending inflation rate for a given household.  For example, Blanchett plots general inflation vs. medical inflation.  Using this relationship, we can predict that 2% general inflation would lead to medical cost inflation of approximately 4%.  One theme of many research papers on the topic of retirement spending is that health care planning should be accounted for in a separate line item.  Not only does the future of the health care system have the potential to look much different from the past, but the actual financial impact of health care costs can differ greatly depending on each individual’s insurance situation.  Blanchett also finds that health care spending does not differ materially across income levels.

    Source: Blanchett’s Estimating the True Cost of Retirement

  3. Blanchett finds that spending does decline through retirement and on average follows a “U” pattern whereby spending declines accelerate before age 75 and decelerate afterwards.

    Source: Blanchett’s Estimating the True Cost of Retirement

  4. Blanchett decomposed the population of his dataset into four groups based on spending and net worth.  $30,000 was the threshold for separating spenders into high and low groups.  $400,000 was the threshold for dividing the population by net worth.  He found that households with “matched” spending and net worth (i.e. low spending and low net worth or high spending and high net worth) exhibited the “U” pattern that we saw with the full dataset.  However, households with mismatched spending/net worth behaved differently.  High net worth and low spending households saw spending increase through retirement, although the rate of this increase was faster earlier in retirement.  Conversely, households with high spending and low net worth reduced their spending more aggressively than the other groups.

    Source: Blanchett’s Estimating the True Cost of Retirement
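The category-based approach in point 2 above amounts to a weighted average of category inflation rates using the household's spending shares. The shares and rates below are hypothetical, with health care inflation set at roughly twice general inflation per the relationship Blanchett describes.

```python
# Hypothetical household spending shares and category inflation rates.
spending_shares = {"housing": 0.30, "health care": 0.15, "food": 0.20,
                   "transportation": 0.15, "other": 0.20}
category_inflation = {"housing": 0.02, "health care": 0.04, "food": 0.02,
                      "transportation": 0.02, "other": 0.02}

# Customized spending inflation rate: share-weighted average of the
# category inflation rates.
household_inflation = sum(spending_shares[c] * category_inflation[c]
                          for c in spending_shares)
print(f"household inflation: {household_inflation:.2%}")
```

A household with a larger health care share would see a higher customized inflation rate, which is why the research recommends treating health care as a separate planning line item.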

How Does Household Expenditure Change with Age for Older Americans? [16]

The EBRI study linked above also documents spending reductions through retirement.  It presents very interesting data on the distribution of health care spending by age group.  We see that the distribution widens significantly over time, with the largest increases occurring in the right tail (90th and 95th percentiles of spending).

Source: EBRI

 

Spending in Retirement [17]

In this piece, J.P. Morgan analyzed retirement spending using a unique dataset of 613,000 households that utilize the Chase platform (debit cards, credit cards, mortgage payments, etc.) for the majority of their spending.  The authors found the same general trend of declining spending as in the EBRI and Morningstar pieces.

Spending declines were largest in the transportation, apparel & services, and mortgage categories.  The overall and category-specific patterns were generally consistent across wealth levels.  The researchers were able to classify households into five categories: foodies, homebodies, globetrotters, health care spenders, and snowflakes.  This categorization is relevant because each group can expect to see their spending needs evolve differently over time.  Some key takeaways for each group are:

  1. Foodies
    1. Most common group
    2. Generally frugal
    3. Low housing expenses due to mortgages being paid off and low property tax bills
    4. Tend to spend less as they get older and so an assumption of faster declines in real spending may be appropriate
  2. Homebodies
    1. High share of spending on mortgages, real estate taxes, and ongoing maintenance
    2. May be prudent to assume that expenses track inflation
    3. For planning purposes, it’s important to consider future plans related to housing
  3. Globetrotters
    1. Highest overall spending
    2. More common among households with higher net worth
    3. May be prudent to assume that expenses track inflation
  4. Health care spenders
    1. Medicare-related expenses were the largest share of spending for these households
    2. These expenses may grow faster than inflation.
    3. For further reading, see:
      1. Health care costs in retirement [18]
      2. Guide to Retirement [19]
  5. Snowflakes
    1. These households are more unique and do not fit into one of the other four categories.

[1] https://blog.thinknewfound.com/2017/08/impact-high-equity-valuations-safe-retirement-withdrawal-rates/

[2] Specifically, we use the “Yield & Growth” capital market assumptions from Research Affiliates.  These capital market assumptions assume that there is no valuation mean reversion (i.e. valuations stay the same going forward).  The adjusted average nominal returns for U.S. equities and 10-year U.S. Treasuries are 5.3% and 3.1%, respectively, compared to the historical values of 9.0% and 5.3%.

[3] Normally, the Ulcer Index would be measured using true drawdown from peak, however, we believe that using starting wealth as the reference point may lead to a more accurate gauge of pain.

[4] References to ideas similar to the butterfly effect date back as far as the 1800s.  In academia, the idea is prevalent in the field of chaos theory.

[5] https://www.imca.org/sites/default/files/current-issues/JIC/JIC172_AdaptiveDistributionTheory.pdf

[6] We continue to adjust returns to account for current valuations.  Therefore, this example takes the actual returns for U.S. stocks and bonds from 1973 to 2003 and then adjusts them downward based on the Research Affiliates’ long-term return assumptions.

[7] Potential increases in expected return, based upon the capital market assumptions of the three institutions listed, are actually larger than what we present here.  This results from two aspects of the QuBe investment process.  First, we utilize a simulation-based approach that incorporates downside shocks to the correlation matrix and that accounts for parameter estimate uncertainty.  Second, we consider two behaviorally-based optimizations, one that attempts to smooth the absolute path of returns and another that attempts to smooth the path of returns relative to a common benchmark, which is tilted toward U.S. equities.  Both of these techniques reduce the expected returns generated when we combine the resulting weights with the stated capital market assumptions.

[8] There actually has been research published suggesting evidence that stock tickers can be useful in picking stocks.  For example, “Would a stock by any other ticker smell as sweet?” by Alex Head, Gary Smith, and Julia Wilson find evidence that stocks with “clever” tickers (e.g. Southwest’s choice of LUV to reflect its brand) outperform the broader market.  Their results were robust to the Fama-French 3-factor model.  As a rationale for these results, the authors posited that clever tickers might signal manager ability or that the memorable tickers feed into the behavioral biases of investors.

[9] The size premium is probably the most hotly debated of the four today.  Recent research suggests that the size premium persists once we control for quality (i.e. we want to buy small, high-quality companies, not just small companies).

[10] As we’ve written about in the past, factor portfolios do not have to generate excess returns to justify an allocation in equity portfolios.  Even with zero to slightly negative premiums, moderate allocations to these strategies would have historically led to increased risk-adjusted returns due to the diversification that they provide to market-cap weighted portfolios.

[11] https://blog.thinknewfound.com/2017/07/building-unconstrained-sleeve/

[12] Again using data from J.P. Morgan, Blackrock, and BNY Mellon.

[13] When we say active, we usually (but not always) mean systematic strategies that are factor-based and implemented using a quantitative and rules-based investment process.

[14] Blanchett, David.  2013.  Estimating the True Cost of Retirement.  Working paper, Morningstar Investment Management.  https://corporate.morningstar.com/ib/documents/MethodologyDocuments/ResearchPapers/Blanchett_True-Cost-of-Retirement.pdf

[15] Quoting from Blanchett, “The replacement rate is the percentage of household earnings needed to maintain a similar standard of living during retirement.”

[16] Banerjee, Sudipto.  2014.  How Does Household Expenditure Change with Age for Older Americans? Employee Benefits Research Institute.  Notes 35, no. 9 (September). https://www.ebri.org/pdf/notespdf/Notes.Sept14.EldExp-Only.pdf

[17] Roy, Katherine and Sharon Carson. 2015.  Spending in Retirement.  J.P. Morgan.  https://am.jpmorgan.com/gi/getdoc/1383244966137.

[18] Carson, Sharon and Laurance McGrath. 2016.  Health care costs in retirement.  J.P. Morgan.  https://am.jpmorgan.com/blob-gim/1383331734803/83456/RI_Healthcare%20costs_2016_r4.pdf?segment=AMERICAS_US_ADV&locale=en_US

[19] Roy, Katherine, Sharon Carson, and Lena Rizkallah.  2016.  Guide to Retirement.  J.P. Morgan.  https://am.jpmorgan.com/blob-gim/1383280097558/83456/JP-GTR.pdf

 
