The Research Library of Newfound Research

Author: Nathan Faber

Nathan is a Portfolio Manager at Newfound Research. At Newfound, Nathan is responsible for investment research, strategy development, and supporting the portfolio management team.

Prior to joining Newfound, he was a chemical engineer at URS, a global engineering firm in the oil, natural gas, and biofuels industry where he was responsible for process simulation development, project economic analysis, and the creation of in-house software.

Nathan holds a Master of Science in Computational Finance from Carnegie Mellon University and graduated summa cum laude from Case Western Reserve University with a Bachelor of Science in Chemical Engineering and a minor in Mathematics.

The Limit of Factor Timing

Summary

  • We have shown previously that it is possible to time factors using value and momentum but that the benefit is not large.
  • By constructing a simple model for factor timing, we examine what accuracy would be required to do better than a momentum-based timing strategy.
  • While the accuracy required is not high, finding the system that achieves that accuracy may be difficult.
  • For investors focused on managing the risks of underperformance – both in magnitude and frequency – a diversified factor portfolio may be the best choice.
  • Investors seeking outperformance will have to bear more concentration risk and may be open to more model risk as they forego the diversification among factors.

A few years ago, we began researching factor timing – moving among value, momentum, low volatility, quality, size etc. – with the hope of earning returns in excess not only of the equity market, but also of buy-and-hold factor strategies.

To time the factors, our natural first course of action was to exploit the behavioral biases that may create the factors themselves. We examined value and momentum across the factors and used these metrics to allocate to factors that we expected to outperform in the future.

The results were positive. However, taking into account transaction costs led to the conclusion that investors were likely better off simply holding a diversified factor portfolio.

We then looked at ways to time the factors using the business cycle.

The results in this case were even less convincing and were a bit too similar to a data-mined optimal solution to instill much faith going forward.

But this evidence does not necessarily remove the temptation to take a stab at timing the factors, especially since explicit transaction costs have been slashed for many investors accessing long-only factors through ETFs.

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.

After all, there is a lot to gain by choosing the right factors. For example, in the first 9 months of 2019, the spread between the best- (Quality) and worst- (Value) performing factors was nearly 1,000 basis points (“bps”). One month prior, that spread had been twice as large!

In this research note, we will move away from devising a systematic approach to timing the factors (as AQR asserts, this is deceptively difficult) and instead focus on what a given method would have to overcome to achieve consistent outperformance.

Benchmarking Factor Timing

With all equity factor strategies, the goal is usually to outperform the market-cap weighted equity benchmark.

Since all factor portfolios can be thought of as a market cap weighted benchmark plus a long/short component that captures the isolated factor performance, we can focus our study solely on the long/short portfolio.

Using the common definitions of the factors (from Kenneth French and AQR), we can look at periods over which these self-financing factor portfolios generate positive returns to see if overlaying them on a market-cap benchmark would have added value over different lengths of time.1

We will also include the performance of an equally weighted basket of the four factors (“Blend”).

Source: Kenneth French Data Library, AQR. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes.  Performance assumes the reinvestment of all distributions. Data from July 1957 – September 2019.

Factor outperformance over one-month periods is transient. If the goal is to outperform as often as possible, then the blended portfolio satisfies this requirement, and any timing strategy would have to be accurate enough to overcome this existing spread.
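The hit-rate calculation underlying this analysis can be sketched in a few lines of Python. The return series below are hypothetical toys (the note itself uses monthly long/short factor returns from Kenneth French and AQR), and only two factors are blended for brevity:

```python
import numpy as np

def hit_rate(returns, window):
    """Fraction of rolling `window`-month periods over which a return
    series finishes with a positive cumulative return."""
    returns = np.asarray(returns)
    # Growth index with a leading 1.0 so every window has a start point.
    growth = np.concatenate(([1.0], np.cumprod(1.0 + returns)))
    rolling = growth[window:] / growth[:-window] - 1.0
    return np.mean(rolling > 0)

# Hypothetical toy series; the blend is the equal-weight average of the
# factors (two shown here for brevity; the note uses four).
value    = np.array([0.01, -0.02, 0.03, 0.01, -0.01, 0.02])
momentum = np.array([-0.01, 0.02, 0.01, -0.02, 0.03, 0.01])
blend = (value + momentum) / 2.0
```

Running `hit_rate` with windows of 1, 12, and 36 months mirrors the rolling periods examined in this section.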

The results for the blended portfolio are so much better than the stand-alone factors because the factors have much lower correlations to one another than many other asset classes do, allowing even naïve diversification to add tremendous value.

The blended portfolio also cuts downside risk in terms of returns. If the timing strategy is wrong, and chooses, for example, momentum in an underperforming month, then it could take longer for the strategy to climb back to even. But investors are used to short periods of underperformance and often (we hope) realize that some short-term pain is necessary for long-term gains.

Looking at the same analysis over rolling 1-year periods, we do see some longer periods of factor outperformance. Some examples are quality in the 1980s, value in the mid-2000s, momentum in the 1960s and 1990s, and size in the late-1970s.

However, there are also decent stretches where the factors underperform. For example, the recent decade for value, quality in the early 2010s, momentum sporadically in the 2000s, and size in the 1980s and 1990s. If the timing strategy gets stuck in these periods, then there can be a risk of abandoning it.

Again, a blended portfolio would have addressed many of these underperforming periods, giving up some of the upside with the benefit of reducing the risk of choosing the wrong factor in periods of underperformance.

And finally, if we extend our holding period to three years, which may be used for a slower moving signal based on either value or the business cycle, we see that the diversified portfolio still exhibits outperformance in the largest share of rolling periods and has a strong ratio of upside to downside.

The diversified portfolio stands up to scrutiny against the individual factors, but could a generalized model that times the factors with a certain degree of accuracy lead to better outcomes?

Generic Factor Timing

To construct a generic factor timing model, we will consider a strategy that decides to hold each factor or not with a certain degree of accuracy.

For example, if the accuracy is 50%, then the strategy would essentially flip a coin for each factor. Heads and that factor is included in the portfolio; tails and it is left out. If the accuracy is 55%, then the strategy will hold the factor with a 55% probability when the factor return is positive and not hold the factor with the same probability when the factor return is negative. Just to be clear, this strategy is constructed with look-ahead bias as a tool for evaluation.

All factors included in the portfolio are equally weighted, and if no factors are included, then the return is zero for that period.

This toy model will allow us to construct distributions to see where the blended portfolio of all the factors falls in terms of frequency of outperformance (hit rate), average outperformance, and average underperformance. The following charts show the percentiles of the diversified portfolio for the different metrics and model accuracies using 1,000 simulations.2
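The toy model above can be sketched directly. The factor return matrix below is a hypothetical placeholder, and the strategy deliberately peeks at each period's return, as described:

```python
import numpy as np

def timed_portfolio(factor_returns, accuracy, rng):
    """Look-ahead toy model: each period, each factor is held with
    probability `accuracy` when its return is positive, and held with
    probability 1 - accuracy when its return is negative. Held factors
    are equally weighted; if none are held, the period return is zero."""
    n_periods, n_factors = factor_returns.shape
    correct = rng.random((n_periods, n_factors)) < accuracy
    # Hold when "correct" and the return is positive, or "incorrect"
    # and the return is negative.
    hold = np.where(factor_returns > 0, correct, ~correct)
    weights = hold / np.maximum(hold.sum(axis=1, keepdims=True), 1)
    return (weights * factor_returns).sum(axis=1)

# Hypothetical three-period, two-factor example; a perfectly accurate
# model holds exactly the factors with positive returns.
factor_returns = np.array([[0.02, -0.01],
                           [-0.03, -0.01],
                           [0.01, 0.02]])
rng = np.random.default_rng(0)
perfect = timed_portfolio(factor_returns, 1.0, rng)  # [0.02, 0.0, 0.015]
```

Repeating `timed_portfolio` across many random draws at a given accuracy builds the simulated distributions referenced above.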

In terms of hit rate, the diversified portfolio performs in the top tier of the models over all time periods for accuracies up to about 57%. Even against a model that is 60% accurate, the diversified portfolio was still above the median.

For average underperformance, the diversified portfolio also did very well in the context of these factor timing models. The low correlation between the factors leads to opportunities for the blended portfolio to limit the downside of individual factors.

For average outperformance, the diversified portfolio did much worse than the timing model over all time horizons. We can attribute this also to the low correlation between the factors, as choosing only a subset of factors and equally weighting them often leads to more extreme returns.

Overall, the diversified portfolio manages the risks of underperformance, both in magnitude and in frequency, at the expense of sacrificing outperformance potential. We saw this in the first section when we compared the diversified portfolio to the individual factors.

But if we want to have increased return potential, we will have to introduce some model risk to time the factors.

Checking in on Momentum

Momentum is one model-based way to time the factors. Under our definition of accuracy in the toy model, a 12-1 momentum strategy on the factors has an accuracy of about 56%. While the diversified portfolio exhibited some metrics in line with strategies that were even more accurate than this, it never bore concentration risk: it always held all four factors.
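The note does not spell out how this accuracy figure is computed, but one plausible way to score a 12-1 momentum signal under the toy model's definition is to check how often the sign of the trailing 12-month return (skipping the most recent month) matches the sign of the next month's factor return. A sketch, under that assumption:

```python
import numpy as np

def momentum_accuracy(returns):
    """Share of months where the sign of the 12-1 momentum signal
    (cumulative return over months t-12 through t-2, skipping the most
    recent month) matches the sign of the month-t return."""
    returns = np.asarray(returns)
    hits, total = 0, 0
    for t in range(12, len(returns)):
        # 11 months of returns, excluding the most recent one.
        signal = np.prod(1.0 + returns[t - 12:t - 1]) - 1.0
        hits += int((signal > 0) == (returns[t] > 0))
        total += 1
    return hits / total
```

Applied to the long/short factor series used here, a score near 0.56 would be consistent with the figure quoted above.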

For the hit rate percentiles of the momentum strategy, we see a more subdued response. Momentum does not win as much as the diversified portfolio over the different time periods.

But not winning as much can be fine if you win bigger when you do win.

The charts below show that momentum does indeed have a higher outperformance percentile but with a worse underperformance percentile, especially for 1-month periods, likely due to whipsaw from mean reversion.

While momentum is definitely not the only way to time the factors, it is a good baseline to see what is required for higher average outperformance.

Now, turning back to our generic factor timing model, what accuracy would you need to beat momentum?

Sharpening our Signal

The answer is: not a whole lot. Most of the time, we only need to be about 53% accurate to beat the momentum-based factor timing.

The caveat is that this is the median performance of the simulations. The accuracy figure climbs closer to 60% if we use the 25th percentile as our target.

While these may not seem like extremely high requirements for running a successful factor timing strategy, it is important to observe that not many investors are doing this. True accuracy may be hard to discover, and sticking with the system may be even harder when the true accuracy can never be known.

Conclusion

If you made it this far looking for some rosy news on factor timing or the Holy Grail of how to do it skillfully, you may be disappointed.

However, for most investors looking to generate some modest benefits relative to market-cap equity, there is good news. Any signal for timing factors does not have to be highly accurate to perform well, and in the absence of a signal for timing, a diversified portfolio of the factors can lead to successful results by the metrics of average underperformance and frequency of underperformance.

For those investors looking for higher outperformance, concentration risk will be necessary.

Any timing strategy on low correlation investments will generally forego significant diversification in the pursuit of higher returns.

While this may be the goal when constructing the strategy, we should always pause and determine whether the potential benefits outweigh the costs. Transaction costs may be lower now. However, there are still operational burdens and the potential stress caused by underperformance when a system is not automated or when results are tracked too frequently.

Factor timing may be possible, but timing and tactical rotation may be better suited to scenarios where some of the model risk can be mitigated.

Macro Timing with Trend Following

Summary

  • While it may be tempting to time allocations to active strategies, it is generally best to hold them as long-term allocations.
  • Despite this, some research has shown that there may be certain economic environments where trend following equity strategies are better suited.
  • In this commentary, we replicate this data and find that a broad filter of recessionary periods does indeed show this for certain trend equity strategies but not for the style of trend equity in general.
  • However, further decomposing the business cycle into contractions, recoveries, expansions, and slowdowns using leading economic indicators such as PMI and unemployment does show some promising relationships between the forecasted stage of the business cycle and trend following’s performance relative to buy-and-hold equities.
  • Even if this data is not used to time trend equity strategies, it can be beneficial to investors for setting expectations and providing insight into performance differences.


Systematic active investing strategies are a way to achieve alternative return profiles that are not necessarily present when pursuing standard asset allocation and may therefore play an important role in developing well-diversified portfolios.

But these strategies are best viewed as allocations rather than trades.1 This is a topic we’ve written about a number of times with respect to factor investing over the past several years, citing the importance of weathering short-term pain for long-term gains. For active strategies to outperform, some underperformance is necessary. Or, as we like to say, “no pain, no premium.”

That being said, being tactical in our allocations to active strategies may have some value in certain cases. In one sense, we can view the multi-layered active decisions simply as another active strategy, distinct from the initial one.

An interesting post on Philosophical Economics looked at using a variety of recession indicators (unemployment, earnings growth, industrial production, etc.) as ways to systematically invest in either U.S. equities or a trend following strategy on U.S. equities. If the economic indicator was in a favorable trend, the strategy was 100% invested in equities. If the economic indicator was in an unfavorable trend, the strategy was invested in a trend following strategy applied to equities, holding cash when the market was in a downtrend.

The reasoning behind this strategy is intuitively appealing. Even if a recession indicator flags a likely recession, the market may still have room to run before turning south and warranting capital protection. On the other hand, when the recession indicator was favorable, purely investing in equities avoids some of the whipsaw costs that are inherent in trend following strategies.

In this commentary, we will first look at the general style of trend equity in the context of recessionary and non-recessionary periods and then get a bit more granular to see when trend following has worked historically through the economic cycle of Expansion, Slowdown, Contraction, and Recovery.

Replicating the Data

To get our bearings, we will first attempt to replicate some of the data from the Philosophical Economics post using only the classifications of “recession” and “not-recession”.

Keeping in line with the Philosophical Economics method, we will use whether the economic metric is above or below its 12-month moving average as the recession signal for the next month. We will use market data from the Kenneth French Data Library for the total U.S. stock market returns and the risk-free rate as the cash rate in the equity trend following model.
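A minimal sketch of this timing rule, assuming monthly series of market returns, risk-free returns, a market price level, and an economic indicator (for PMI, higher readings are favorable; for the unemployment rate, pass `favorable_above=False`):

```python
import numpy as np

def macro_timed_returns(mkt_ret, rf_ret, mkt_level, indicator, favorable_above=True):
    """If the indicator is on the favorable side of its 12-month moving
    average, hold equities next month; otherwise hold a 12-month MA trend
    strategy (equities when price > 12-month MA, else the risk-free rate)."""
    mkt_ret, rf_ret = np.asarray(mkt_ret), np.asarray(rf_ret)
    mkt_level, indicator = np.asarray(mkt_level), np.asarray(indicator)
    out = []
    for t in range(12, len(mkt_ret)):
        ind_ma = indicator[t - 12:t].mean()
        if favorable_above:
            favorable = indicator[t - 1] > ind_ma
        else:
            favorable = indicator[t - 1] < ind_ma
        if favorable:
            out.append(mkt_ret[t])  # fully invested in equities
        else:
            # Trend following sleeve: equities in an uptrend, cash otherwise.
            in_uptrend = mkt_level[t - 1] > mkt_level[t - 12:t].mean()
            out.append(mkt_ret[t] if in_uptrend else rf_ret[t])
    return np.array(out)
```

Feeding in the Kenneth French market series and the PMI or unemployment data would reproduce the style of model summarized in the table below.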

The following table shows the results of the trend following timing models using the United States ISM Purchasing Managers Index (PMI) and the Unemployment Rate as indicators.

                        U.S.       12mo MA        12mo MA Trend   12mo MA Trend
                        Equities   Trend Equity   Timing (PMI)    Timing (Unemployment)
Annualized Return       11.3%      11.1%          11.3%           12.2%
Annualized Volatility   14.7%      11.2%          11.9%           12.4%
Maximum Drawdown        50.8%      24.4%          32.7%           30.0%
Sharpe Ratio            0.49       0.62           0.61            0.66

Source: Quandl and U.S. Bureau of Labor Statistics. Calculations by Newfound Research. Results are hypothetical. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results. You cannot invest in an index. Data is from Jan 1948 – Sep 2019.

With the trend timing model, we see an improvement in the absolute returns compared to the trend equity strategy alone. However, this comes at the expense of increasing the volatility and maximum drawdown.

In the case of unemployment, which was the strongest indicator that Philosophical Economics found, there is an improvement in risk-adjusted returns in the timing model.

Still, while there is a benefit, it may not be robust.

If we remove the dependence of the trend following model on a single metric or lookback parameter, the benefit of the macro-timing decreases. Specifically, if we replace our simple 12-month moving average trend equity rule with the ensemble approach utilized in the Newfound Trend Equity Index, we see very different results. This may indicate that one specific variant of trend following did well in this overall model, but the style of trend following might not lend itself well to this application.

                        U.S.       Newfound Trend   Trend Equity Index   Trend Equity Index
                        Equities   Equity Index     Blend (PMI)          Blend (Unemployment)
Annualized Return       11.3%      10.7%            10.9%                10.9%
Annualized Volatility   14.7%      11.1%            11.8%                13.5%
Maximum Drawdown        50.8%      25.8%            36.1%                36.0%
Sharpe Ratio            0.49       0.59             0.58                 0.50

A more robust trend following model may already provide more upside capture during non-recessionary periods but at the expense of more downside capture during recessions. However, we cannot confidently assert that the lower level of down-capture in the single specification of the trend model is not partially due to luck.

If we desire to more thoroughly evaluate the style of trend following, we must get more granular with the economic cycles.

Breaking Down the Economic Cycle

Moving beyond the simple classification of “recession” and “not-recession”, we can follow MSCI’s methodology, which we used here previously, to classify the economic cycle into four primary states: Expansion, Slowdown, Contraction, and Recovery.

We will focus on the 3-month moving average (“MA”) minus the 12-month MA for each indicator we examine according to the decision tree below. In the tree, we use the terms better and worse since a lower unemployment rate and higher PMI values signal a stronger economy.

[Figure: Economic cycle classification decision tree]

There is a decent amount of difference in the classifications using these two indicators, with the unemployment indicator signaling more frequent expansions and slowdowns. This should be taken as evidence that economic regimes are difficult to predict.
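The decision tree itself is not reproduced here, but one plausible reconstruction, consistent with the description above, buckets each month on the sign and month-over-month direction of the 3-month-MA-minus-12-month-MA gap:

```python
import numpy as np

def classify_cycle(indicator, higher_is_better=True):
    """Hypothetical reconstruction: bucket each month into Expansion,
    Slowdown, Contraction, or Recovery from the sign and direction of
    the indicator's 3-month MA minus 12-month MA (sign-flipped when
    lower readings are better, as with the unemployment rate)."""
    x = np.asarray(indicator, dtype=float)
    gap = np.array([x[t - 3:t].mean() - x[t - 12:t].mean()
                    for t in range(12, len(x) + 1)])
    if not higher_is_better:
        gap = -gap
    states = []
    for t in range(1, len(gap)):
        improving = gap[t] > gap[t - 1]  # economy getting "better"
        if gap[t] > 0:
            states.append("Expansion" if improving else "Slowdown")
        else:
            states.append("Recovery" if improving else "Contraction")
    return states
```

The exact branch conditions in the original tree may differ; this sketch only captures the better/worse logic described above.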

Once each indicator is in a given state, the transition probabilities are relatively close.

This agrees with intuition when we consider the cyclical nature of these economic metrics. While not a perfect mathematical relationship, these states generally unfold sequentially without jumps from contractions to expansions or vice versa.
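The transition probabilities themselves can be estimated by simple counting over the monthly state sequence; a sketch:

```python
from collections import Counter

def transition_probabilities(states):
    """Empirical probability of moving from one cycle state to another,
    estimated from a sequence of monthly state labels."""
    counts = Counter(zip(states[:-1], states[1:]))   # observed transitions
    totals = Counter(states[:-1])                    # visits to each state
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}
```

Running this on the classified PMI and unemployment sequences would yield the transition tables referenced above.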

Trend Following in the Economic Cycle

Applying the four-part classification to the economic cycle shows where trend equity outperformed.

                PMI Indicator                  Unemployment Indicator
                U.S. Equities   Trend Equity   U.S. Equities   Trend Equity
Contraction     7.6%            10.3%          1.0%            7.3%
Recovery        12.2%           9.3%           15.4%           15.0%
Expansion       14.3%           14.4%          13.9%           11.3%
Slowdown        7.2%            5.4%           10.5%           8.0%

During contraction phases, regardless of indicators, trend equity outperformed buy-and-hold.

For the PMI indicator, trend equity was able to keep up during expansions, but this was not the case with the unemployment indicator. The reverse of this was true for recoveries: trend following was close to keeping up in the periods denoted by the unemployment indicator but not by the PMI indicator.

For both indicators, trend following underperformed during slowdowns.

This may seem contradictory at first, but these may be periods of more whipsaw as markets try to forecast future states. And since slowdowns typically occur after expansions and before contractions (at least in the idealized model), we may have to bear more of this whipsaw risk for the strategy to be adaptable enough to add value during the contraction.

The following two charts show the longest historical slowdowns for each indicator: the PMI indicator signaled an 11-month slowdown from late 2009 through much of 2010, and the unemployment rate indicator signaled a 16-month slowdown in 1984–85.

In the first slowdown period, the trend equity strategy rode in tandem with equities as they continued to climb and then de-risked when equities declined. Equities quickly rebounded, leaving the trend equity strategy underexposed to the rally.

In the second slowdown period, the trend equity strategy was heavily defensive going into the slowdown. This protected capital initially but then caused the strategy to lag once the market began to increase steadily.

The first period illustrates a time when the trend equity strategy was ready to adapt to changing market conditions and was unfortunately whipsawed. The second period illustrates a time when the trend equity strategy was already adapted to a supposedly oncoming contraction that did not materialize.

Using these historical patterns of performance, we can now explore how a strategy that systematically allocates to trend equity strategies might be constructed.

Timing Trend Following with the Economic Cycle

One simple way to apply a systematic timing strategy for shifting between equities and trend following is to only invest in equities when a slowdown is signaled.

The charts below show the returns and risk metrics for models using the PMI and unemployment rate individually and a model that blends the two allocations.
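A minimal sketch of one such blended allocation for a single month, assuming the cycle states for each indicator have already been classified:

```python
def blended_allocation(state_pmi, state_unemp, equity_ret, trend_ret):
    """Each sleeve holds pure equities when its indicator signals a
    Slowdown (where trend following has historically lagged) and the
    trend equity strategy otherwise; the blend averages the two sleeves."""
    sleeve_pmi = equity_ret if state_pmi == "Slowdown" else trend_ret
    sleeve_unemp = equity_ret if state_unemp == "Slowdown" else trend_ret
    return 0.5 * (sleeve_pmi + sleeve_unemp)
```

The 50/50 sleeve weighting is an assumption for illustration; any blend of the two single-indicator models follows the same pattern.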

[Figure: Growth trend timing model results]

The returns increased slightly in every model relative to buy-and-hold, and the blended model performed consistently well across all metrics.

Blending multiple models generally produces benefits like those shown here, and in an actual implementation, utilizing additional economic indicators may make the strategy even more robust. There may be other ways to boost performance across the economic cycle, and we will explore these ideas in future research.

Conclusion

Should investors rotate in and out of active strategies?

Not in most cases, since the typical driver of such rotation is short-term underperformance, which is a necessary component of active strategies.

However, there may be opportunities to make allocation tweaks based on the economic cycle.

The historical data suggests that a specification-neutral trend equity strategy has outperformed buy-and-hold equities during economic contractions under both economic indicators. Performance during recoveries and expansions was mixed: trend equity kept up with buy-and-hold during expansions denoted by the PMI indicator but not by the unemployment indicator, and the relationship was reversed for recoveries. Under both indicators, trend equity lagged during economic slowdowns, when whipsaw becomes more prevalent.

Based on the most recent PMI data, the current cycle is a contraction, indicating a favorable environment for trend equity under both cycle indicators. However, we should note that December 2018 through March 2019 was also labeled as a contraction according to PMI. Not all models are perfect.

Nevertheless, there may be some evidence that trend following can provide differentiated benefits based on the prevailing economic environment.

While an investor may not use this knowledge to shift around allocations to active trend following strategies, it can still provide insight into performance difference relative to buy-and-hold and set expectations going forward.

Harvesting the Bond Risk Premium

Summary

  • The bond risk premium is the return that investors earn by investing in longer duration bonds.
  • While the most common way that investors can access this return stream is through investing in bond portfolios, bonds often significantly de-risk portfolios and scale back returns.
  • Investors who desire more equity-like risk can tap into the bond risk premium by overlaying bond exposure on top of equities.
  • Through the use of a leveraged ETP strategy, we construct a long-only bond risk premium factor and investigate its characteristics in terms of rebalance frequency and timing luck.
  • By balancing the costs of trading with the risk of equity overexposure, investors can incorporate the bond risk premium as a complementary factor exposure to equities without sacrificing return potential from scaling back the overall risk level unnecessarily.

The discussion surrounding factor investing generally pertains to either equity portfolios or bond portfolios in isolation. We can calculate value, momentum, carry, and quality factors for each asset class and invest in the securities that exhibit the best characteristics of each factor or a combination of factors.

There are also ways to use these factors to shift allocations between stocks and bonds (e.g., trend following or standardizing factor measures based on historical levels). However, we do not typically discuss bonds as their own standalone factor.

The bond risk premium – or term premium – can be thought of as the premium investors earn from holding longer duration bonds as opposed to cash. In a sense, it is a measure of carry. Its theoretical basis is generally seen to be related to macroeconomic factors such as inflation and growth expectations.1

While timing the term premium using factors within bond duration buckets is definitely a possibility, this commentary will focus on the term premium in the context of an equity investor who wants long-term exposure to the factor.

The Term Premium as a Factor

For the term premium, we can take the usual approach and construct a self-financing long/short portfolio of 100% intermediate (7-10 year) U.S. Treasuries that borrows the entire portfolio value at the risk-free rate.
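Mechanically, this self-financing construction is just the bond return net of the financing (cash) rate each period. A minimal sketch, using hypothetical monthly return inputs:

```python
def term_premium_factor(bond_returns, cash_returns):
    """Self-financing long/short term premium: long intermediate
    Treasuries, financed by borrowing at the risk-free (cash) rate."""
    return [b - c for b, c in zip(bond_returns, cash_returns)]

# Hypothetical monthly total returns: Treasuries vs. a 0.2% cash rate
factor = term_premium_factor([0.004, -0.002, 0.006], [0.002, 0.002, 0.002])
```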

This factor, shown in bold in the chart below, has exhibited a much tamer return profile than common equity factors.

Source: CSI Analytics, AQR, and Bloomberg. Calculations by Newfound Research. Data from 1/31/1992 to 6/28/2019. Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

Source: CSI Analytics, AQR, and Bloomberg. Calculations by Newfound Research. Data from 1/31/1992 to 6/28/2019. Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

But over the entire time period, its returns have been higher than those of both the Size and Value factors. Its maximum drawdown has been less than 40% of that of the next best factor (Quality), and it is worth acknowledging that its volatility – which is generally correlated to drawdown for highly liquid assets with non-linear payoffs – has also been substantially lower.

The term premium also has exhibited very low correlation with the other equity factors.

Source: CSI Analytics, AQR, and Bloomberg. Calculations by Newfound Research. Data from 1/31/1992 to 6/28/2019. Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

A Little Free Lunch

Whether we are treating bonds as a factor or not, they are generally the primary way investors seek to diversify equity portfolios.

The problem is that they are also a great way to reduce returns during most market environments through their inherently lower risk.

Anytime that an asset with lower volatility is added to a portfolio, the risk will be reduced. Unless the asset class also has a particularly high Sharpe ratio, maintaining the same level of return is virtually impossible even if risk-adjusted returns are improved.

In a 2016 paper2, Salient broke down this reduction in risk into two components: de-risking and the “free lunch” effect.

The reduction in risk from the free lunch effect is desirable, but the risk reduction from de-risking may or may not be desirable, depending on the investor’s target risk profile.

The following chart shows the volatility breakdown of a range of portfolios of the S&P 500 (IVV) and 7-10 Year U.S. Treasuries (IEF).

Source: CSI Analytics and Bloomberg. Calculations by Newfound Research. Data from 1/31/1992 to 6/28/2019. Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

Moving from an all-equity portfolio to a 50/50 equity/bond portfolio reduces the volatility from 14.2% to 7.4%. But only 150 bps of this reduction is from the free lunch effect that stems from the lower correlation between the two assets (-0.18). The remaining 530 bps of volatility reduction is simply due to lower risk.

In this case, annualized returns were dampened from 9.6% to 7.8%. While the Sharpe ratio climbed from 0.49 to 0.70, an investor seeking higher risk would not benefit without the use of leverage.
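This decomposition falls directly out of the two-asset volatility formula: de-risking is the drop from the equity volatility to the weighted-average volatility, and the free lunch is the further drop from the weighted average down to the actual portfolio volatility. A sketch, with illustrative inputs rather than the exact figures behind the chart:

```python
import math

def vol_decomposition(w_e, w_b, vol_e, vol_b, corr):
    """Split the volatility reduction from blending in a lower-vol asset
    into de-risking (lower weighted-average vol) and the diversification
    'free lunch' (correlation below 1)."""
    port_vol = math.sqrt((w_e * vol_e) ** 2 + (w_b * vol_b) ** 2
                         + 2 * w_e * w_b * corr * vol_e * vol_b)
    weighted_avg = w_e * vol_e + w_b * vol_b
    de_risking = vol_e - weighted_avg     # reduction from lower raw risk
    free_lunch = weighted_avg - port_vol  # reduction from diversification
    return port_vol, de_risking, free_lunch

# Illustrative 50/50 stock/bond mix with correlation of -0.18
p, d, f = vol_decomposition(0.5, 0.5, 0.142, 0.065, -0.18)
```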

Despite the strong performance of the term premium factor, risk-seeking investors (e.g. those early in their careers) are generally reluctant to tap into this factor too much because of the de-risking effect.

How do investors who want to bear risk commensurate with equities tap into the bond risk premium without de-risking their portfolio?

One solution is using leveraged ETPs.

Long-Only Term Premium

By taking a 50/50 portfolio of the 2x Levered S&P 500 ETF (SSO) and the 2x Levered 7-10 Year U.S. Treasury ETF (UST), we can construct a portfolio that has 100% equity exposure and 100% of the term premium factor.3

But managing this portfolio takes some care.

Left alone to drift, the allocations can get very far away from their target 50/50, spanning the range from 85/15 to 25/75. Periodic rebalancing is a must.

Source: CSI Analytics and Bloomberg. Calculations by Newfound Research. Data from 1/31/1992 to 6/28/2019. Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

Of course, now the question is, “How frequently should we rebalance the portfolio?”

This boils down to a balancing act between performance and costs (e.g. ticket charges, tax impacts, operational burden, etc.).

On one hand, we would like to remain as close to the 50/50 allocation as possible to maintain the desired exposure to each asset class. However, this could require a prohibitive amount of trading.

From a performance standpoint, we see improved results with longer holding periods (take note of the y-axes in the following charts; they were scaled to highlight the differences).

Source: CSI Analytics and Bloomberg. Calculations by Newfound Research. Data from 1/31/1992 to 6/28/2019. Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

The returns do not show a definitive pattern based on rebalance frequency, but the volatility decreases with increasing time between rebalances. This seems like it would point to waiting longer between rebalances, which would be corroborated by a consideration of trading costs.

The issues with waiting longer between rebalances are twofold:

  1. Waiting longer is essentially a momentum trade. The better performing asset class garners a larger allocation as time progresses. This can be a good thing – especially in hindsight with how well equities have done – but it allows the portfolio to become overexposed to factors that we are not necessarily intending to exploit.
  2. Longer rebalances are more exposed to timing luck. For example, a yearly rebalance may have done well from a performance perspective, but the short-term performance could vary by as much as 50,000 bps between the best performing rebalance month and the worst! The chart below shows the performance of each iteration relative to the median performance of the 12 different monthly rebalance strategies.

Source: CSI Analytics and Bloomberg. Calculations by Newfound Research. Data from 1/31/1992 to 6/28/2019. Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

As the chart also shows, tranching can help mitigate timing luck. Tranching also gives the returns of the strategies over the range of rebalance frequencies a more discernible pattern, with longer rebalance period strategies exhibiting slightly higher returns due to their higher average equity allocations.
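Mechanically, a tranched strategy splits capital equally at inception across several offset sub-portfolios and lets each follow its own rebalance schedule; the overall NAV is simply the sum of the tranche values. A simplified sketch with hypothetical return inputs:

```python
def tranched_nav(sub_strategy_returns):
    """NAV path of a strategy whose capital is split equally at inception
    across k offset sub-portfolios ("tranches"), each running its own
    rebalance schedule. Input: k same-length lists of periodic returns."""
    k = len(sub_strategy_returns)
    navs = [1.0 / k] * k
    history = []
    for t in range(len(sub_strategy_returns[0])):
        navs = [nav * (1 + rets[t])
                for nav, rets in zip(navs, sub_strategy_returns)]
        history.append(sum(navs))
    return history

# Two hypothetical tranches over two periods
path = tranched_nav([[0.01, 0.02], [0.03, 0.00]])
```

Because each tranche only rebalances on its own schedule, no single rebalance date dominates the strategy's results, which is the source of the timing-luck mitigation.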

Under the assumption that we can tranche any strategy that we choose, we can now compare only the tranched strategies at different rebalance frequencies to address our concern with taking bets on momentum.

Pausing for a minute, we should be clear that we do not actually know what the true factor construction should be; it is a moving target. We are more concerned with robustness than simply trying to achieve outperformance. So we will compare the strategies to the median performance of the monthly-offset annual rebalance strategies from before.

The following charts show the aggregate risk of short-term performance deviations from this benchmark.

The first one shows the aggregate deviations, both positive and negative, and the second focuses on only the downside deviation (i.e. performance that is worse than the median).4

Both charts support a choice of rebalance frequency somewhere in the range of 3-6 months.

Source: CSI Analytics and Bloomberg. Calculations by Newfound Research. Data from 1/31/1992 to 6/28/2019. Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

With the rebalance frequency set based on the construction of the factor, the last part is a consideration of costs.

Unfortunately, this is more situation-specific (e.g. what commissions does your platform charge for trades?).

From an asset manager point-of-view, where we can trade with costs proportional to the size of the trade, execute efficiently, and automate much of the operational burden, tranching is our preferred approach.

We also prefer this approach over simply rebalancing back to the static 50/50 allocation more frequently.

In our previous commentary on constructing value portfolios to mitigate timing luck, we described how tranching frequency and rebalance frequency are distinct decisions: tranching monthly is not the same as rebalancing monthly.

We see the same effect here where we plot the monthly tranched annually rebalanced strategy (blue line) and the strategy rebalanced back to 50/50 every month (orange line).

Source: CSI Analytics and Bloomberg. Calculations by Newfound Research. Data from 1/31/1992 to 6/28/2019. Results are hypothetical.  Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.  

Tranching wins out.

However, since the target for the term premium factor is a 50/50 static allocation, running a simple allocation filter to keep the portfolio weights within a certain tolerance can be a way to implement a more dynamic rebalancing model while reducing costs.

For example, rebalancing when the allocations for SSO and UST were outside a 5% band (i.e. the portfolio was beyond 55/45 or 45/55) achieved better performance metrics than the monthly rebalanced version with an average of only 3 rebalances per year.
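A minimal sketch of such a tolerance-band rule, with the 5% band and the 50/50 target as illustrative parameters:

```python
def simulate_band_rebalance(sso_rets, ust_rets, band=0.05, target=0.5):
    """Track a 50/50 SSO/UST mix, rebalancing back to target only when
    the first asset's weight drifts outside target +/- band.
    Returns the final weights and the number of rebalances triggered."""
    w = [target, 1 - target]
    count = 0
    for rs, ru in zip(sso_rets, ust_rets):
        v = [w[0] * (1 + rs), w[1] * (1 + ru)]  # let weights drift
        total = v[0] + v[1]
        w = [v[0] / total, v[1] / total]
        if abs(w[0] - target) > band:           # drift beyond the band
            w = [target, 1 - target]            # snap back to 50/50
            count += 1
    return w, count
```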

Conclusion

The bond term premium does not have to be reserved for risk-averse investors. Investors desiring portfolios tilted heavily toward equities can also tap into this diversifying return stream as a factor within their portfolio.

Utilizing leveraged ETPs is one way to maintain exposure to equities while capturing a significant portion of the bond risk premium. However, it requires more oversight than investing in other factors such as value, momentum, and quality, which are typically packaged in easy-to-access ETFs.

If a fixed frequency rebalance approach is used, tranching is an effective way to reduce timing risk, especially when markets are volatile. Aside from tranching, we find that, historically, holding periods between 3 and 6 months yield results close in line with the median rolling short-term performance of the individual strategies. Implementing a methodology like this can reduce the risk of poor luck in choosing the rebalance frequency or starting the strategy at an unfortunate time.

If frequent rebalances – like those seen with tranching – are infeasible, a dynamic schedule based on a drift in allocations is also a possibility.

Leveraged ETPs are often seen as risky trading instruments that are not fit for retail investors who are more focused on buy-and-hold systems. However, given the right risk management, these investment vehicles can be a way for investors to access the bond term premium, capture a larger free lunch, and avoid undesired de-risking along the way.

Dynamic Spending in Retirement Monte Carlo

This post is available as a PDF download here.

Summary

  • Many retirement planning analyses rely on Monte Carlo simulations with static assumptions for withdrawals.
  • Incorporating dynamic spending rules can more closely align the simulations with how investors would likely behave during times when the plan looked like it was on a path to failure.
  • Even a modest reduction in withdrawals (e.g. 10%) can have a meaningful impact on reducing failure rates, nearly cutting them in half in a sample simulation.
  • Combining dynamic spending rules with other marginal improvements, such as supplemental income and active risk management, can lead to more robust retirement plans and give investors a better understanding of the variables that are within their realm of control.

Monte Carlo simulations are a prevalent tool in financial planning, especially pertaining to retirement success calculations.

Under a typical framework of normally distributed portfolio returns and constant inflation-adjusted withdrawals, calculating the success of a given retirement portfolio is straightforward. But as with most tools in finance, the art lies both in the assumptions that go into the calculation and in the proper interpretation of the result.

If a client is told they have a 10% chance of running out of money over their projected retirement horizon, what does that mean for them?

They cannot make 9 copies of themselves to live out separate lives, with one copy (hopefully not the original) unfortunately burning through the account prematurely.

They also cannot create 9 parallel universes and ensure they do not choose whichever one does not work out.

We wrote previously how investors follow a single path (You Are Not a Monte-Carlo Simulation). If that path hits zero, the other hypothetical simulation paths don’t mean a thing.

A simulation path is only as valuable as the assumptions that go into creating it, and fortunately, we can make our simulations align more closely with investor behavior.

The best way to interpret the 10% failure rate is to think of it as a 10% chance of having to make an adjustment before it hits zero. Rarely would an investor stand by while their account went to zero. There are circumstances that are entirely out of investor control, but to the extent that there was something they could do to prevent that event, they would most likely do it.

Derek Tharp, on Michael Kitces’ blog, wrote a post a few years ago weighing the relative benefit of implementing small but permanent adjustments vs. large but temporary adjustments to retirement withdrawals and found that making small adjustments and leaving them in place led to greater likelihoods of success over retirement horizons (Dynamic Retirement Spending Adjustments: Small-But-Permanent Vs Large-But-Temporary).

In this week’s commentary, we want to dig a little deeper into some simple path dependent modifications that we can make to retirement Monte-Carlo simulations with the hope of creating a more robust toolset for financial planning.

The Initial Plan

Suppose an investor is 65 and holds a moderate portfolio of 60% U.S. stocks and 40% U.S. Treasuries. From 1871 until mid-2019, this portfolio would have returned an inflation-adjusted 5.1% per year with 10.6% volatility according to Global Financial Data.

Sticking with the rule-of-thumb 4% annual withdrawal of the initial portfolio balance and assuming a 30-year retirement horizon, this yields a predicted failure rate of 8% (plus or minus about 50 bps).

The financial plan is complete.

If you start with $1,000,000, simply withdraw $3,333/month and you should be fine 92% of the time.
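As a rough illustration of this type of calculation, the sketch below simulates annual real returns as i.i.d. normal draws (a simplification relative to a full planning tool) with a constant real withdrawal taken at the start of each year:

```python
import random

def failure_rate(n_sims=10_000, years=30, mu=0.051, sigma=0.106,
                 withdrawal=0.04, seed=0):
    """Share of simulated paths on which a constant real withdrawal
    (a fraction of the initial balance, taken at the start of each
    year) exhausts the portfolio before the horizon ends.
    Assumes i.i.d. normal annual real returns -- a simplification."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_sims):
        balance = 1.0
        for _ in range(years):
            balance -= withdrawal
            if balance <= 0:
                failures += 1
                break
            balance *= 1 + rng.gauss(mu, sigma)
    return failures / n_sims
```

With the 60/40 portfolio's historical real return and volatility as inputs, this kind of simulation produces a single-digit failure rate in the neighborhood of the 8% figure above, though the exact number depends on the return model.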

But what if the portfolio drops 5% in the first month? (It almost did that in October 2018).

The projected failure rate over the next 29 years and 11 months has gone up to 11%. That violates a 10% threshold that may have been a target in the planning process.

Or what if it drops 30% in the first 6 months, like it would have in the second half of 1931?

Now the projected failure rate is a staggering 46%. Retirement success has been reduced to a coin flip.

Admittedly, these are trying scenarios, but these numbers are a key driver for financial planning. If we can better understand the risks and spell out a course of action beforehand, then the risk of making a rash emotion-driven decision can be mitigated.

Aligning the Plan with Reality

When the market environment is challenging, investors can benefit by being flexible. The initial financial plan does not have to be jettisoned; agreed upon actions within it are implemented.

One of the simplest – and most impactful – modifications to make is an adjustment to spending. For instance, an investor might decide at the outset to scale back spending by a set amount when the probability of failure crosses a threshold.

Source: Global Financial Data. Calculations by Newfound.

This reduction in spending would increase the probability of success going forward through the remainder of the retirement horizon.

And if we knew that this spending cut would likely happen if it was necessary, then we can quantify it as a rule in the initial Monte Carlo simulation used for financial planning.

Graphically, we can visualize this process by looking at the probabilities of failure for varying asset levels over time. For example, at 10 years after retirement, the orange line indicates that a portfolio value ~80% of the initial value would have about a 5% failure rate.

Source: Global Financial Data. Calculations by Newfound.

As long as the portfolio value remains above a given line, no adjustment would be needed based on a standard Monte Carlo analysis. Once a line is crossed, the probability of success is below that threshold.

This chart presents a good illustration of sequence risk: the lines are flatter initially after retirement and the slope progressively steepens as time progresses. A large drawdown early on puts the portfolio below the threshold for making an adjustment.

For instance, at 5 years, the portfolio has more than a 10% failure rate if the value is below 86%. Assuming zero real returns, withdrawals alone would have reduced the value to 80%. Positive returns over this short time period would be necessary to feel secure in the plan.

Looking under the hood along the individual paths used for the Monte Carlo simulation, at 5 years, a quarter of them would be in a state requiring an adjustment to spending at this 10% failure level.

Source: Global Financial Data. Calculations by Newfound.

This masks the fact that some of the paths crossing this 10% failure threshold early on had improved again by the 5-year mark. In fact, 75% of the paths were below this 10% failure rate at some point in the first 5 years. Without more appropriate expectations of what these simulations mean, under this model, most investors would have felt like their plan’s failure rate was uncomfortable at some point in the first 5 years after retirement!

Dynamic Spending Rules

If the goal is ultimately not to run out of funds in retirement, the first spending adjustment case can substantially improve those chances (aside from a large negative return in the final periods prior to the last withdrawals).

Each month, we will compare the portfolio value to the 90% success value. If the portfolio is below that cutoff, we will size the withdrawal to improve the odds of success back to that level, if possible.

The benefit of this approach is greatly improved success along the different paths. The cost is forgone income.

But this can mean forgoing a lot of income over the life of the portfolio in a particularly bad state of the world. The worst case in terms of this total forgone income is shown below.

Source: Global Financial Data. Calculations by Newfound.

The portfolio gives up withdrawals totaling 74%, nearly 19 years’ worth. Most of this is given up in consecutive periods during the prolonged drawdown that occurs shortly after retirement.

This is an extreme case that illustrates how large of income adjustments could be required to ensure success under a Monte Carlo framework.

The median case foregoes 9 months of total income over the portfolio horizon, and the worst 5% of cases all give up 30% (7.5 years) of income based off the initial portfolio value.

That is still a bit extreme in terms of potential cutbacks.

As a more realistic scenario that is easier on the pocketbook, we will limit the total annual cutback to 30% of the withdrawal in the following manner:

  • If the current chance of failure is greater than 20%, cut spending by 30%. This equates to reducing the annual withdrawal by $12,000 assuming a $1,000,000 initial balance.
  • If the current chance of failure is between 15% and 20%, cut spending by 20%. This equates to reducing the annual withdrawal by $8,000 assuming a $1,000,000 initial balance.
  • If the current chance of failure is between 10% and 15%, cut spending by 10%. This equates to reducing the annual withdrawal by $4,000 assuming a $1,000,000 initial balance.
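These tiers can be sketched as a simple lookup (hypothetical code; the treatment of exact boundary values is an assumption):

```python
def spending_cut(failure_prob):
    """Tiered spending rule: cut the withdrawal by up to 30% based on
    the current forecast probability of failure. Boundary values are
    assigned to the lower tier -- an assumption of this sketch."""
    if failure_prob > 0.20:
        return 0.30
    if failure_prob > 0.15:
        return 0.20
    if failure_prob > 0.10:
        return 0.10
    return 0.0

base_annual = 40_000  # 4% of a $1,000,000 initial balance
cut_dollars = base_annual * spending_cut(0.18)  # 20% tier applies
```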

These rules still increase the success rate to 99% but substantially reduce the amount of reductions in income.

Looking again at the worst-case scenario, we see that this case still “fails” (even though it lasts another 4.5 years) but that its reduction in income is now less than half of what it was in the extreme cutback case. This pattern is in line with the “lower for longer” reductions that Derek had looked at in the blog post.

Source: Global Financial Data. Calculations by Newfound.

On the 66% of sample paths where there was a cut in spending at some point, the average total cut amounted to 5% of the portfolio (a little over a year of withdrawals spread over the life of the portfolio).

Even moving to an even less extreme reduction regime where only 10% cuts are ever made if the probability of failure increases above 10%, the average reduction in the 66% of cases that required cuts was about 9 months of withdrawals over the 30-year period.

In these scenarios, the failure rate is reduced to 5% (from 8% with no dynamic spending rules).

Source: Global Financial Data. Calculations by Newfound.

Conclusion

Retirement simulations can be a powerful planning tool, but they are only as good as their inputs and assumptions. Making them align as closely with reality as possible is a way to quantify the impact of dynamic spending rules in retirement.

The magnitude of spending reductions necessary to guarantee success of a retirement plan in all potential states of the world is prohibitive. However, even small modifications to spending can have a large impact on success.

For example, reducing withdrawal by 10% when the forecasted failure rate increases above 10% nearly cut the failure rate of the entire plan in half.

But dynamic spending rules do not exist in a vacuum; they can be paired with other marginal improvements to boost the likelihood of success:

  • Seek out higher returns – small increases in portfolio returns can have a significant impact over the 30-year planning horizon.
  • Supplement income – having supplements to income, even small ones, can offset spending during any market environment, improving the success rate of the financial plan.
  • Actively manage risk – managing risk, especially early in retirement, is a key factor in not having to reduce withdrawals in retirement.
  • Plan for more flexibility – having the ability to reduce spending when necessary reduces the need to rely on the portfolio balance when the previous factors are not working.

While failure is certainly possible for investors, a “too big to fail” mentality is much more in line with the reality of retirement.

Even if absolute failure is unlikely, adjustments will likely be a requirement. These can be built into the retirement planning process and can shed light on stress testing scenarios and sensitivity.

From a retirement planning perspective, flexibility is simply another form of risk management.

The Path-Dependent Nature of Perfect Withdrawal Rates

This post is available as a PDF download here.

Summary

  • The Perfect Withdrawal Rate (PWR) is the rate of regular portfolio withdrawals that leads to a zero balance over a given time frame.
  • 4% is the commonly accepted lower bound for safe withdrawal rates, but this is only based on one realization of history and the actual risk investors take on by using this number may be uncertain.
  • Using simulation techniques, we aim to explore how different assumptions match the historical experience of retirement portfolios.
  • We find that simple assumptions commonly used in financial planning Monte Carlo simulations do not seem to reflect as much variation as we have seen in the historical PWR.
  • Including more stress testing and utilizing richer simulation methods may be necessary to successfully gauge that risk in a proposed PWR, especially as it pertains to the risk of failure in the financial plan.

Financial planning for retirement is a combination of art and science. The problem is highly multidimensional, requiring estimates of cash flows, investment returns and risk, taxation, life events, and behavioral effects. Reduction along the dimensions can simplify the analysis, but introduces consequences in the applicability and interpretation of the results. This is especially true for investors who are close to the line between success and failure.

One of the primary simplifying assumptions is the 4% rule. This heuristic was derived using worst-case historical data for portfolio withdrawals under a set of assumptions, such as constant inflation adjusted withdrawals, a fixed mix of stock and bonds, and a set time horizon.

Below we construct a monthly-rebalanced, fixed-mix 60/40 portfolio using the S&P 500 index for U.S. equities and the Dow Jones Corporate Bond index for U.S. bonds. Using historical data from 12/31/1940 through 12/31/2018, we can evaluate the margin for error the 4% rule has historically provided and how much opportunity for higher withdrawal rates was sacrificed in “better” market environments.

Source: Global Financial Data and Shiller Data Library. Calculations by Newfound Research. Returns are backtested and hypothetical. Past performance is not a guarantee of future results. Returns are gross of all fees. Returns assume the reinvestment of all distributions. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

But the history is only a single realization of the world. Risk is hard to gauge.

Perfect Withdrawal Rates

The formula for the perfect withdrawal rate (“PWR”) of a portfolio, assuming an ending value of zero, is relatively simple since it is just a function of portfolio returns: in plain English, it is the final portfolio value divided by a sequence risk term.

The portfolio value in the numerator is the final value of the portfolio over the entire period, assuming no withdrawals. The sequence risk in the denominator is a term that accounts for both the order and magnitude of the returns.

Larger negative returns earlier on in the period increase the sequence risk term and therefore reduce the PWR.

From a calculation perspective, the final portfolio value in the equation is typically described (e.g. when using Monte Carlo techniques) as a log-normal random variable, i.e. the log-returns of the portfolio are assumed to be normally distributed. This type of random variable lends itself well to analytic solutions that do not require numerical simulations.

The sequence risk term, however, is not so friendly to closed-form methods. The path-dependent, additive structure of returns within the sequence risk term means that we must rely on numerical simulations.
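Computed directly from a return path, the numerator is the growth of the portfolio with no withdrawals and the denominator compounds each withdrawal date forward to the horizon. A minimal sketch:

```python
def perfect_withdrawal_rate(returns):
    """PWR: the constant withdrawal, as a fraction of the starting
    balance and taken at the start of each period, that leaves exactly
    zero at the end of the return path."""
    # Numerator: growth of $1 over the path with no withdrawals.
    growth = 1.0
    for r in returns:
        growth *= 1 + r
    # Denominator ("sequence risk"): each withdrawal compounded
    # forward from its withdrawal date to the end of the horizon.
    sequence = 0.0
    for i in range(len(returns)):
        compounded = 1.0
        for r in returns[i:]:
            compounded *= 1 + r
        sequence += compounded
    return growth / sequence
```

As a sanity check, with zero returns over n periods the PWR is exactly 1/n, and moving a large loss from the end of the path to the beginning lowers the PWR, consistent with the sequence-risk intuition above.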

To get a feel for some features of this equation, we can look at the PWR in the context of the historical portfolio return and volatility.

Source: Global Financial Data and Shiller Data Library. Calculations by Newfound Research. Returns are backtested and hypothetical. Past performance is not a guarantee of future results. Returns are gross of all fees. Returns assume the reinvestment of all distributions. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

The relationship is difficult to pin down.

As we saw in the equation shown before, the annualized return of the portfolio does appear to impact the PWR (correlation of 0.51), but there are periods (e.g. those starting in the 1940s) that had higher PWRs with lower returns than in the 1960s. Therefore, investors beginning withdrawals in the 1960s must have had higher sequence risk.

Correlation between annualized volatility and PWR was slightly negative (-0.35).

The Risk in Withdrawal Rates

Since our goal is to assess the risk in the historical PWR with a focus on the sequence risk, we will use the technique of Brownian Bridges to match the return of all simulation paths to the historical return of the 60/40 portfolio over rolling 30-year periods. We will use the historical full-period volatility of the portfolio over the period for the simulation.

This is essentially a conditional PWR risk based on assuming we know the full-period return of the path beforehand.

To more explicitly describe the process, consider a given 30-year period. We begin by computing the full-period annualized return and volatility of the 60/40 portfolio over that period.  We will then generate 10,000 simulations over this 30-year period but using the Brownian Bridge technique to ensure that all of the simulations have the exact same full-period annualized return and intrinsic volatility.  In essence, this approach allows us to vary the path of portfolio returns without altering the final return.  As PWR is a path-dependent metric, we should gain insight into the distribution of PWRs.
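One simple way to implement this pinning (a discrete analogue of the Brownian Bridge) is to simulate log-returns freely and then shift every increment by the same constant so the path ends exactly at the target full-period log-return. A sketch with illustrative parameters:

```python
import math
import random

def bridged_log_returns(total_log_return, n, sigma, rng):
    """Simulate n per-period log-returns with volatility sigma, then
    shift each increment by the same constant so they sum exactly to
    the target full-period log-return (pinning the path's endpoint)."""
    z = [rng.gauss(0.0, sigma) for _ in range(n)]
    shift = (total_log_return - sum(z)) / n
    return [zi + shift for zi in z]

# Example: 360 monthly log-returns pinned so the path ends at +50%
rng = random.Random(7)
path = bridged_log_returns(math.log(1.5), 360, 0.03, rng)
```

For i.i.d. normal increments, conditioning on the endpoint is exactly this constant-shift construction, so the pinned paths vary in shape while sharing the same full-period return.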

The percentile bands for the simulations using this method are shown below with the actual PWR in each period overlaid.

Source: Global Financial Data and Shiller Data Library. Calculations by Newfound Research. Returns are backtested and hypothetical. Past performance is not a guarantee of future results. Returns are gross of all fees. Returns assume the reinvestment of all distributions. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

From this chart, we see two items of note: The percentile bands in the distribution roughly track the historical return over each of the periods, and the actual PWR fluctuates into the left and right tails of the distribution rather frequently.  Below we plot where the actual PWR actually falls within the simulated PWR distribution.

Source: Global Financial Data and Shiller Data Library. Calculations by Newfound Research. Returns are backtested and hypothetical. Past performance is not a guarantee of future results. Returns are gross of all fees. Returns assume the reinvestment of all distributions. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

The actual PWR is below the 5th percentile 12% of the time, below the 1st percentile 4% of the time, above the 95th percentile 11% of the time, and above the 99th percentile 7% of the time.  Had our model been better calibrated, we would expect these frequencies to align with the percentiles: e.g., the PWR should fall below the 5th percentile 5% of the time and above the 99th percentile 1% of the time.
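This calibration check amounts to counting how often the realized value lands outside the simulated percentile bands. A sketch, with synthetic data standing in for the actual PWRs and their per-period simulated distributions:

```python
import numpy as np

def tail_coverage(actual, sims, q_low=0.05, q_high=0.95):
    """Fraction of periods where the realized value falls below the
    q_low quantile (or above the q_high quantile) of its simulated
    distribution.  `actual` is (n_periods,), `sims` is
    (n_periods, n_sims)."""
    lo = np.quantile(sims, q_low, axis=1)
    hi = np.quantile(sims, q_high, axis=1)
    return np.mean(actual < lo), np.mean(actual > hi)

# Under a well-calibrated model both fractions should be close to 5%.
rng = np.random.default_rng(3)
sims = rng.normal(size=(500, 2000))
actual = rng.normal(size=500)
below, above = tail_coverage(actual, sims)
```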

This seems odd until we realize that our model for the portfolio returns was likely too simplistic. We are assuming Geometric Brownian Motion for the returns. And while we are fixing the return over the entire simulation path to match that of the actual portfolio, the path to get there is assumed to have constant volatility and independent returns from one month to the next.

In reality, returns do not always follow these rules. For example, the skew of the monthly returns over the entire history is -0.36 and the excess kurtosis is 1.30. This tendency toward larger magnitude returns and returns that are skewed to the left can obscure some of the risk that is inherent in the PWRs.
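These moments are straightforward to estimate with `scipy.stats`. A sketch, with synthetic normal draws standing in for the actual 60/40 monthly history (for which the post reports skew of -0.36 and excess kurtosis of 1.30):

```python
import numpy as np
from scipy import stats

# Hypothetical monthly returns in place of the actual 60/40 series.
rng = np.random.default_rng(42)
rets = rng.normal(0.006, 0.03, 1000)

skew = stats.skew(rets)
ex_kurt = stats.kurtosis(rets)  # Fisher definition: a normal gives 0
```

For a truly normal sample both statistics hover near zero, which is why the measured -0.36 skew and 1.30 excess kurtosis signal fat, left-leaning tails that the Geometric Brownian Motion model misses.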

Additionally, returns are not totally independent. While this is good for trend following strategies, it can lead to an understatement of risk as we explored in our previous commentary on Accounting for Autocorrelation in Assessing Drawdown Risk.

Over the full period, monthly returns of lags 1, 4, and 5 exhibit autocorrelation that is significant at the 95% confidence level.
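A common rule of thumb flags a lag as significant at the 95% level when its sample autocorrelation exceeds the white-noise band of roughly ±1.96/√N. A sketch of that test:

```python
import numpy as np

def significant_lags(returns, max_lag=12, conf=1.96):
    """Sample autocorrelations and the lags whose magnitude exceeds
    the +/- conf/sqrt(N) white-noise band (the usual 95% rule)."""
    x = np.asarray(returns) - np.mean(returns)
    n = len(x)
    denom = np.dot(x, x)
    acf = np.array([np.dot(x[:-k], x[k:]) / denom
                    for k in range(1, max_lag + 1)])
    band = conf / np.sqrt(n)
    return acf, [k for k, a in enumerate(acf, start=1) if abs(a) > band]

# Example: an AR(1) series with coefficient 0.5 should flag lag 1.
rng = np.random.default_rng(0)
eps = rng.normal(size=2000)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.5 * x[t - 1] + eps[t]
acf, sig = significant_lags(x)
```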


To incorporate some of these effects in our simulations, we must move beyond the simplistic assumption of normally distributed returns.

First, we will fit a skewed normal distribution to the rolling historical data and use that to draw our random variables for each period. This is essentially what was done in the previous section for the normally distributed returns.
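The fit-and-sample step can be sketched with `scipy.stats.skewnorm`; the data below is synthetic, standing in for the rolling 60/40 history:

```python
import numpy as np
from scipy import stats

# Hypothetical monthly returns with pronounced left skew.
rng = np.random.default_rng(1)
hist = stats.skewnorm.rvs(a=-4, loc=0.02, scale=0.04,
                          size=600, random_state=rng)

# Fit the skewed normal by maximum likelihood, then draw simulated
# monthly returns from the fitted distribution.
a, loc, scale = stats.skewnorm.fit(hist)
sims = stats.skewnorm.rvs(a, loc, scale, size=10_000, random_state=rng)
```

A negative fitted shape parameter `a` corresponds to the left skew observed in the historical returns.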

Then, to account for some autocorrelation, we will use the same adjustment to volatility as in the previously referenced commentary on autocorrelation risk. For positive autocorrelations (which we saw in the previous graphs), this results in a higher volatility for the simulations (typically around 10% – 25% higher).
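The standard version of this adjustment scales single-period volatility so that the multi-period variance matches what the measured autocorrelations imply; whether this matches the referenced commentary exactly is an assumption. A sketch of the multiplier:

```python
import numpy as np

def autocorr_vol_multiplier(rho, n):
    """Scale factor for single-period volatility so the n-period
    variance reflects autocorrelation:
    sigma_adj = sigma * sqrt(1 + 2 * sum_k (1 - k/n) * rho_k),
    where rho_k is the lag-k autocorrelation."""
    k = np.arange(1, len(rho) + 1)
    return np.sqrt(1.0 + 2.0 * np.sum((1.0 - k / n) * np.asarray(rho)))

# E.g. a 0.10 lag-1 autocorrelation over a 12-month horizon
# inflates volatility by roughly 9%.
mult = autocorr_vol_multiplier([0.10], 12)
```

Positive autocorrelations always produce a multiplier above 1, consistent with the 10% – 25% volatility increase cited above.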

The two graphs below show the same analysis as before under this modified framework.


The historical PWR now falls more within the bounds of our simulated results.

Additionally, the 5th percentile band now shows that there were periods where a 4% withdrawal rule may not have made the cut.


Conclusion

Heuristics can be a great way to distill complex data into actionable insights, and the perfect withdrawal rate in retirement portfolios is no exception.

The 4% rule is a classic example where we may not be aware of the risk in using it. It is the commonly accepted lower bound for safe withdrawal rates, but this is only based on one realization of history.

The actual risk investors take on by using this number may be uncertain.

Using simulation techniques, we explored how different assumptions match the historical experience of retirement portfolios.

The simple assumptions (expected return and volatility) commonly used in financial planning Monte Carlo simulations do not seem to reflect as much variation as we have seen in the historical PWR. Therefore, relying on these assumptions can be risky for investors who are close to the “go-no-go” point; they do not have much room for failure and will be more likely to have to make cash flow adjustments in retirement.

Utilizing richer simulation methods (e.g. accounting for negative skew and autocorrelation like we did here or using a downside shocking method like we explored in A Shock to the Covariance System) may be necessary to successfully gauge that risk in a proposed PWR, especially as it pertains to the risk of failure in the financial plan.

Having a number to base planning calculations on makes life easier in the moment, but knowing the risk in using that number makes life easier going forward.
