The Research Library of Newfound Research

Author: Nathan Faber

Nathan is a Portfolio Manager at Newfound Research. At Newfound, Nathan is responsible for investment research, strategy development, and supporting the portfolio management team.

Prior to joining Newfound, he was a chemical engineer at URS, a global engineering firm in the oil, natural gas, and biofuels industry where he was responsible for process simulation development, project economic analysis, and the creation of in-house software.

Nathan holds a Master of Science in Computational Finance from Carnegie Mellon University and graduated summa cum laude from Case Western Reserve University with a Bachelor of Science in Chemical Engineering and a minor in Mathematics.

How to Benchmark Trend-Following

This post is available as a PDF download here.

Summary

  • Benchmarking a trend-following strategy can be a difficult exercise in managing behavioral biases.
  • While the natural tendency is often to benchmark equity trend-following to all-equities (e.g. the S&P 500), this does not accurately give the strategy credit for choosing to be invested when the market is going up.
  • A 50/50 portfolio of equities and cash is generally an appropriate benchmark for long/flat trend-following strategies, both for setting expectations and for gauging current relative performance.
  • If we acknowledge that for a strategy to outperform over the long run, it must undergo shorter periods of underperformance, then using this symmetric benchmark can isolate the market environments in which underperformance should be expected.
  • Diversifying risk-management approaches (e.g. pairing strategic allocation with tactical trend-following) can manage events that are unfavorable to one strategy, and benchmarking is a tool to set expectations around the level of risk management necessary in different market environments.

Any strategy that deviates from the most basic is compared to a benchmark. But how do you choose an appropriate benchmark?

The complicated nature of benchmarking can be easily seen by considering something as simple as a value stock strategy.

You may pit the concentrated value manager you currently use against the more diversified value manager you used previously. At that time, you may have compared that value manager to a systematic smart-beta ETF like the iShares S&P 500 Value ETF (ticker: IVE). And if you were invested in that ETF, you might have compared its performance to the S&P 500.

What prevents you from benchmarking them all to the S&P 500? Or from benchmarking the concentrated value strategy to all of the other three?

Benchmark choices are not unique and are highly dependent on what aspect of performance you wish to measure.

Benchmarking is one of the most frequently abused facets of investing. It can be extremely useful when applied in the correct manner, but most of the time, it is simply a hurdle to sticking with an investment plan.

In an ideal world, the only benchmark for an investor would be whether or not they are on track for hitting their financial goals. However, in an industry obsessed with relative performance, choosing a benchmark is a necessary exercise.

This commentary will explore some of the important considerations when choosing a benchmark for trend-following strategies.

The Purpose of a Trend-Following Benchmark

As an investment manager, our goal with benchmarking is to check that a strategy’s performance is in line with our expectations. Performance versus a benchmark can answer questions such as:

  • Is the out- or underperformance appropriate for the given market environment?
  • Is the magnitude of out- or underperformance typical?
  • How is the strategy behaving in the context of other ways of managing risk?

With long/flat trend-following strategies, the appropriate benchmark should gauge when the manager is making correct or incorrect calls in either direction.

Unfortunately, we frequently see long/flat equity trend-following strategies benchmarked to an all-equity index like the S&P 500. This is similar to the coinflip game we outlined in our previous commentary about protecting and participating with trend-following.[1]

The behavioral implications of this kind of benchmarking are summarized in the table below.

The two cases with wrong calls – to move to cash when the market goes up or remain invested when the market goes down – are appropriately labeled, as is the correct call to move to cash when the market is going down. However, when the market is going up and the strategy is invested, it is merely keeping up with its benchmark even though it is behaving just as one would want it to.

To reward the strategy in either correct call case, the benchmark should consist of allocations to both equity and cash.

A benchmark like this can provide objective answers to the questions outlined above.

Deriving a Trend-Following Benchmark

Sticking with the trend-following strategy example we outlined in our previous commentary[2], we can look at some of the consequences of choosing different benchmarks in terms of how much the trend-following strategy deviates from them over time.

The chart below shows the annualized tracking error of the strategy to the range of strategic proportions of equity and cash.

Source: Kenneth French Data Library. Data from July 1926 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees.  Returns assume the reinvestment of all distributions.  This document does not reflect the actual performance results of any Newfound investment strategy or index.  All returns are backtested and hypothetical.  Past performance is not a guarantee of future results.

The benchmark that minimizes the tracking error is a 47% allocation to equities and 53% to cash. This 0.47 is also the beta of the trend-following strategy, so we can think of this benchmark as accounting for the risk profile of the strategy over the entire 92-year period.
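As an illustration of this calculation, below is a minimal sketch of the grid search, assuming monthly return series stored as NumPy arrays; the function and variable names are hypothetical, not Newfound's actual code. The same function could be applied to rolling 12-month windows to produce the rolling analysis that follows.

```python
import numpy as np

def min_tracking_error_mix(strategy_r, equity_r, cash_r, step=0.01):
    """Grid-search the equity/cash blend that minimizes annualized
    tracking error to the strategy (monthly return arrays assumed)."""
    best_w, best_te = 0.0, np.inf
    for w in np.arange(0.0, 1.0 + step, step):
        benchmark_r = w * equity_r + (1.0 - w) * cash_r
        # annualized tracking error: std of monthly active returns * sqrt(12)
        te = np.std(strategy_r - benchmark_r, ddof=1) * np.sqrt(12)
        if te < best_te:
            best_w, best_te = w, te
    return best_w, best_te

# Hypothetical usage; over the full sample the result would be roughly 0.47:
# w, te = min_tracking_error_mix(trend_r, equity_r, tbill_r)
```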

But what if we took a narrower view by constraining this analysis to recent performance?

The chart below shows the equity allocation of the benchmark that minimizes the tracking error to the trend-following strategy over rolling 1-year periods.

Source: Kenneth French Data Library. Data from July 1926 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees.  Returns assume the reinvestment of all distributions.  This document does not reflect the actual performance results of any Newfound investment strategy or index.  All returns are backtested and hypothetical.  Past performance is not a guarantee of future results.

A couple of features stand out here.

First, if we constrain our lookback period to one year, a time-period over which many investors exhibit anchoring bias, then the “benchmark” that we may think we will closely track – the one we are mentally tied to – might be the one that we deviate the most from over the next year.

And secondly, the approximately 50/50 benchmark calculated using the entire history of the strategy is rarely the one that minimizes tracking error over the short term.

The median equity allocation in these benchmarks is 80%, the average is 67%, and the data is highly clustered at the extremes of 100% equity and 100% cash.

Source: Kenneth French Data Library. Data from July 1926 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees.  Returns assume the reinvestment of all distributions. This document does not reflect the actual performance results of any Newfound investment strategy or index.  All returns are backtested and hypothetical.  Past performance is not a guarantee of future results.

The Intuitive Trend-Following Benchmark

Is there a problem in determining a benchmark using the tracking error over the entire period?

One issue is that it is being calculated with the benefit of hindsight. If you had started a trend-following strategy back in the 1930s, you would have arrived at a different equity allocation for the benchmark based on this analysis given the available data (e.g. using data up until the end of 1935 yields an equity allocation of 37%).

To remove this reliance on having a sufficiently long backtest, our preference is to rely more on the strategy’s rules and how we would use it in a portfolio to determine our trend-following benchmarks.

For a trend following strategy that pivots between stocks and cash, a 50/50 benchmark is a natural choice.

It is broad enough to include the assets in the trend-following strategy’s investment universe while being neutral to the calls to be long or flat.

Seeing the 50/50 portfolio emerge as the answer to the tracking-error minimization problem over the entire data set simply provides empirical evidence for its use.

One argument against using a 50/50 blend could focus on the fact that the market is generally up more frequently than it is down, at least historically. While this is true, the magnitude of down moves has often been larger than the magnitude of up moves. Since this strategy is explicitly meant as a risk management tool, accounting for both the magnitude and the frequency is prudent.

Another argument against its use could be the belief that we are entering a different market environment where history will not be an accurate guide going forward. However, given the random nature of market moves coupled with the behavioral tendencies of investors to overreact, herd, and anchor, a benchmark close to a 50/50 is likely still a fitting choice.

Setting Expectations with a Trend-Following Benchmark

Now that we have a benchmark to use, how do we use it to set our expectations?

Neglecting the historical data for the moment, from the ex-ante perspective, it is helpful to decompose a typical market cycle into four different segments and assess how we expect trend-following to behave:

  • Initial decline – Equity markets begin to sell off, and the fully invested trend-following strategy underperforms the 50/50 benchmark.
  • Prolonged drawdown – The trend-following strategy adapts to the decline and moves to cash. The trend-following strategy outperforms.
  • Initial recovery – The trend-following strategy is still in cash and lags the benchmark as prices rebound off the bottom.
  • Sustained recovery – The trend-following strategy reinvests and captures more of the upside than the benchmark.

Of course, this is a somewhat ideal scenario that rarely plays out perfectly. Whipsaw events occur as prices recover (decline) before declining (recovering) again.

But it is important to note how the level of risk relative to this 50/50 benchmark varies over time.

Contrast this with something like an all equity strategy benchmarked to the S&P 500 where the risk is likely to be similar during most market environments.

Now, if we look at the historical data, we can see this borne out in the graph of the drawdowns for trend-following and the 50/50 benchmark.

Source: Kenneth French Data Library. Data from July 1926 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees.  Returns assume the reinvestment of all distributions.  This document does not reflect the actual performance results of any Newfound investment strategy or index.  All returns are backtested and hypothetical.  Past performance is not a guarantee of future results.

In most prolonged and major (>20%) drawdowns, trend-following first underperforms the benchmark, then outperforms, then lags as equities improve, and then outperforms again.

Using the most recent example of the Financial Crisis, we can see the capture ratios versus the benchmark in each regime.

Source: Kenneth French Data Library. Data from October 2007 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees.  Returns assume the reinvestment of all distributions.  This document does not reflect the actual performance results of any Newfound investment strategy or index.  All returns are backtested and hypothetical.  Past performance is not a guarantee of future results.

The underperformance of the trend-following strategy versus the benchmark is in line with expectations based on how the strategy is designed to work.

Another way to use the benchmark to set expectations is to look at rolling returns historically. This gives context for the current out- or underperformance relative to the benchmark.

From this we can see which percentile the current return falls into or check to see how many standard deviations it is away from the average level of relative performance.

Source: Kenneth French Data Library. Data from July 1926 – February 2018. Calculations by Newfound Research. Returns are gross of all fees, including transaction fees, taxes, and any management fees.  Returns assume the reinvestment of all distributions.  This document does not reflect the actual performance results of any Newfound investment strategy or index.  All returns are backtested and hypothetical.  Past performance is not a guarantee of future results.
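To make this percentile check concrete, below is a minimal sketch of the rolling comparison, assuming monthly return series stored as pandas Series; the names are hypothetical.

```python
import numpy as np
import pandas as pd

def rolling_relative_performance(strategy_r, benchmark_r, window=12):
    """Rolling 12-month return spread versus the benchmark, the percentile
    of the latest spread, and its z-score (monthly pandas Series assumed)."""
    strat = (1 + strategy_r).rolling(window).apply(np.prod, raw=True) - 1
    bench = (1 + benchmark_r).rolling(window).apply(np.prod, raw=True) - 1
    spread = (strat - bench).dropna()
    current = spread.iloc[-1]
    percentile = (spread < current).mean()   # share of history below today
    z_score = (current - spread.mean()) / spread.std(ddof=1)
    return spread, current, percentile, z_score
```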

In all this, there are a few important points to keep in mind:

  • Price moves that occur faster than the scope of the trend-following measurement can be one source of the largest underperformance events.
  • In a similar vein, whipsaw is a key risk of trend-following. Highly oscillatory markets will not be favorable to trend-following. In these scenarios, trend-following can underperform even fully invested equities.
  • With percentile analysis, there is always a first time for anything. Having a rich data history covering a variety of market scenarios mitigates this, but setting new percentiles, either on the low end or high end, is always possible.
  • Sometimes a strategy is expected to lag its benchmark in a given market environment. A primary goal of benchmarking is to accurately set these expectations for the potential magnitude of relative performance and to design the portfolio accordingly.

Conclusion

Benchmarking a trend-following strategy can be a difficult exercise in managing behavioral biases. With the tendency to benchmark all equity-based strategies to an all-equity index, investors often set themselves up for a let-down in a bull market with trend-following.

With benchmarking, the focus is often on lagging the benchmark by “too much.” This is what an all-equity benchmark can do to trend-following. However, the issue is symmetric: beating the benchmark by “too much” can also signal either an issue with the strategy or with the benchmark choice. This is why we would not benchmark a long/flat trend-following strategy to cash.

A 50/50 portfolio of equities and cash is generally an appropriate benchmark for long/flat trend-following strategies. This benchmark allows us to measure the strategy’s ability to correctly allocate when equities are both increasing or decreasing.

Too often, investors use benchmarking solely to see which strategy is beating the benchmark by the most. While this can be a use for very similar strategies (e.g. a set of different value managers), we must always be careful not to compare apples to oranges.

A benchmark should not conjure up an image of a dog race where the set of investment strategies are the dogs and the benchmark is the bunny out ahead, always leading the way.

We must always acknowledge that for a strategy to outperform over the long-run, it must undergo shorter periods of underperformance. Diversifying approaches can manage events that are unfavorable to one strategy, and benchmarking is a tool to set expectations around the level of risk management necessary in different market environments.

 

[1] https://blog.thinknewfound.com/2018/05/leverage-and-trend-following/

[2] https://blog.thinknewfound.com/2018/03/protect-participate-managing-drawdowns-with-trend-following/

Should You Dollar-Cost Average?

This post is available as a PDF download here.

Summary

  • Dollar-cost averaging (DCA) versus lump sum investing (LSI) is often a difficult decision fraught with emotion.
  • The historical and theoretical evidence contradicts the notion that DCA leads to better results from a return perspective, and only some measures of risk point to benefits in DCA.
  • Rather than holding cash while implementing DCA, employing a risk managed strategy can lead to better DCA performance even in a muted growth environment.
  • Ultimately, the best solution is the one that gets an investor into an appropriate portfolio, encourages them to stay on track for their long term financial goals, and appropriately manages any behavioral consequences along the way.

Dollar-cost averaging (DCA) is the process of investing equal amounts into an asset or a portfolio over a period of time at regular intervals. It is commonly thought of as a way to reduce the risk of investing at the worst possible time and seeing your investment immediately decline in value.

The most familiar form of dollar-cost averaging is regular investment directed toward retirement accounts. A fixed amount is deducted from each paycheck and typically invested within a 401(k) or IRA. When the securities in the account decline in value, more shares are purchased with the cash, and over the long run, the expectation is to invest at a favorable average price.

For this type of dollar-cost averaging, there is not a lot of input on the investor’s part; the cash is invested as it arrives. The process is involuntary once it is initiated.

A slightly different scenario for dollar-cost averaging happens when an investor has a lump sum to invest: the choice is to either invest it at once (“lump-sum investing”; LSI) or spread the investment over a specified time horizon using DCA.

In this case, the investor has options, and in this commentary we will explore some of the arguments for and against DCA with a lump sum with the intention of reducing timing risk in the market.

 

The Historical Case Against Dollar-Cost Averaging

Despite the conventional wisdom that DCA is a prudent idea, investors certainly have sacrificed a fair amount of return potential by doing it historically.

In their 2012 paper entitled Dollar-Cost Averaging Just Means Taking Risk Later[1], Vanguard looked at LSI versus DCA in the U.S., U.K., and Australia over rolling 10-year periods and found that for a 60/40 portfolio, LSI outperformed DCA about 2/3 of the time in each market.

If we assume that a lump sum is invested in the S&P 500 in equal monthly amounts over 12 months with the remaining balance held in cash earning the risk-free interest rate, we see a similar result over the period from 1926 to 2017.
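For a single 12-month window, that comparison might be sketched as follows; this is a minimal, hypothetical implementation, not Newfound's exact backtest code.

```python
import numpy as np

def lsi_vs_dca_window(equity_r, cash_r, capital=1.0):
    """Compare lump-sum investing to 12-month DCA over a single window.
    equity_r, cash_r: arrays of 12 monthly total returns."""
    # LSI: the full amount rides equities from month one
    lsi_value = capital * np.prod(1 + np.asarray(equity_r))

    # DCA: shift 1/12 of the capital into equities each month;
    # the remaining balance earns the cash (risk-free) rate
    invested, cash, tranche = 0.0, capital, capital / 12
    for e_r, c_r in zip(equity_r, cash_r):
        invested = (invested + tranche) * (1 + e_r)
        cash = (cash - tranche) * (1 + c_r)
    return lsi_value, invested + cash
```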

Why does dollar-cost averaging look so bad?

In our previous commentary on Misattributing Bad Behavior[2], we discussed how the difference between investment return (equivalent to LSI) and investor return (equivalent to DCA) is partly due to the fact that investors are often making contributions in times of positive market returns. Over the 92-year period from 1926 to 2017, the market posted positive returns in 74% of rolling 12-month periods. Holding cash and investing at a later date means forgoing some of these positive returns. From a theoretical basis, this opportunity cost is the equity risk premium: the expected excess return of equities over cash.

In our current example where investors voluntarily choose to dollar-cost average, the same effect is experienced.

Source: Kenneth French Data Library and Robert Shiller Data Library. Calculations by Newfound Research. Results are hypothetical. Past performance does not guarantee future results.

The average outperformance of the LSI strategy was 4.1%, and as expected, there is a strong correlation between how well the market does over the year and the benefit of LSI.

Source: Kenneth French Data Library and Robert Shiller Data Library. Calculations by Newfound Research. Results are hypothetical. Past performance does not guarantee future results.

 

Surely DCA Worked Somewhere

If high equity market returns were the force behind the attractiveness of lump-sum investing in the U.S. (and, as the Vanguard piece showed, in the U.K. and Australia), let's turn to a market where returns were not so strong: Japan. As of the end of 2017, the MSCI Japan index was still nearing its high-water mark set at the end of 1989: a drawdown of 28 years.

Under the same analysis, using the International Monetary Fund’s (IMF) Japanese discount rate as a proxy for the risk-free rate in Japan, DCA only outperforms LSI slightly more than half of the time over the period from 1970 to 2017.

Source: MSCI and Federal Reserve of St. Louis. Calculations by Newfound Research. Results are hypothetical. Past performance does not guarantee future results.

Truncating the time frame to begin in 1989 penalizes DCA even more – perhaps surprisingly, given the negligible average return – with it now outperforming slightly under 50% of the time.

Over the entire time period, there is a similar relationship to the outperformance of LSI versus the performance of the Japanese equity index.

Source: MSCI and Federal Reserve of St. Louis. Calculations by Newfound Research. Results are hypothetical. Past performance does not guarantee future results.

 

The Truth About Dollar-Cost Averaging

Given this empirical evidence, why is dollar-cost averaging still frequently touted as a superior investing strategy?

The claims – many of which come from media outlets – that dollar-cost averaging is predominantly beneficial from a return perspective are false.  It nearly always sacrifices returns, and many examples highlighted in these articles paint pictures of hypothetical scenarios that, while grim, are very isolated and/or unrealistic given the historical data.

Moving beyond the empirical evidence, dollar-cost averaging is theoretically sub-optimal to lump sum investing in terms of expected return.

This was shown to be the case in a mean-variance framework in 1979 by George Constantinides.[3]

His argument was that rather than committing to a set investment schedule based on the initial information in the market, adopting a more flexible approach that adjusts the investment amount based on subsequent market information will outperform DCA.

In the years since, many other hypotheses have been put forward for why DCA should be beneficial – different investor utility functions, prospect theory, and mean reversion in equity returns, among others – and most have been shown to be inadequate to justify DCA.

More recently, Hayley (2012)[4] explains the flaw in many pro-DCA arguments: a cognitive error of assuming that purchasing at a lower average price increases expected returns.

His argument is that purchasing at the average price requires buying equal share amounts each period, and investing the total capital this way at the true average price of a security or portfolio is only possible with perfect foreknowledge of how the price will move. DCA ends up with a lower average purchase price than this equal-share investing strategy.

But if you had perfect foreknowledge of the future prices, you would not choose to invest equal share amounts in the first place!

Thus, the equal share investing plan is a straw man comparison for DCA.

We can see this more clearly when we actually dive into examples that are similar to ones generally presented in favor of DCA.

We will call the equal-share strategy that invests the entire capital amount "ES Hypothetical." This is the strategy that requires knowledge of the price evolution. The more realistic equal-share strategy, which we will call "ES Actual," assumes that prices will remain fixed and purchases the same number of shares in each period as the DCA strategy purchases in the first period. Any remaining capital is invested in the final period, regardless of whether it buys more or fewer shares than desired; the results would still hold if this amount were instead treated as cash (borrowed, if need be), since the analysis ends at this time step.

The following tables show the final account values for 4 simple market scenarios:

  1. Downtrend
  2. Uptrend
  3. Down then up
  4. Up then down

In every scenario, the DCA strategy purchases shares at a lower average cost than the ES Hypothetical strategy and ends up better off, but the true comparison is less clear cut.

The ES Actual and LSI strategies’ average purchase prices and final values may be higher or lower than DCA.

A Comparison of DCA to Equal Share Investing and LSI

Calculations by Newfound Research. All examples are hypothetical.
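To make the four strategies concrete, below is a minimal sketch on a toy monthly price path; the function name, capital amount, and example prices are all illustrative assumptions rather than the exact setup behind the table above.

```python
import numpy as np

def compare_purchase_plans(prices, capital=1200.0):
    """Average purchase price and final value for DCA, LSI, ES Hypothetical,
    and ES Actual on a toy monthly price path."""
    prices = np.asarray(prices, dtype=float)
    n, final = len(prices), prices[-1]

    # DCA: equal dollars each period (average price = harmonic mean)
    dca = ((capital / n) / prices).sum()
    # LSI: all shares bought at the first price
    lsi = capital / prices[0]
    # ES Hypothetical: equal shares sized (with foreknowledge) to spend all
    # capital, so the average price is the arithmetic mean of prices
    es_hyp = capital / prices.mean()
    # ES Actual: repeat the DCA strategy's first-period share count, then
    # invest whatever capital remains at the final price
    per_period = (capital / n) / prices[0]
    remaining = capital - (per_period * prices[:-1]).sum()
    es_act = per_period * (n - 1) + remaining / final

    return {name: {"avg_price": capital / shares, "final_value": shares * final}
            for name, shares in [("DCA", dca), ("LSI", lsi),
                                 ("ES Hypothetical", es_hyp),
                                 ("ES Actual", es_act)]}

# Example: a simple downtrend scenario
# print(compare_purchase_plans([12, 10, 8, 6]))
```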

A More General Comparison of LSI and DCA

In these examples, DCA does outperform LSI half the time, but these examples are extremely contrived.

We can turn to simulations to get a better feel for how often LSI will outperform DCA and by how much under more realistic assumptions of asset price movements.

Using Monte Carlo, we can see how often LSI outperforms DCA for a variety of expected excess returns and volatilities over 12-month periods. Using expected excess returns allows us to neglect the return on cash.

For any positive expected return, LSI is expected to outperform more frequently at all volatility levels. The frequency increases as volatility decreases for a given expected return.

If the expected annual return is negative, then DCA outperforms more frequently.

Calculations by Newfound Research. Results assume Geometric Brownian Motion using the given parameters and compare investing all capital at the beginning of 12 months to investing capital equally at the beginning of each month.
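A minimal sketch of this simulation is below, assuming geometric Brownian motion with the stated annualized excess return and volatility; the names and defaults are hypothetical, and cash returns are ignored as described above.

```python
import numpy as np

def lsi_win_frequency(mu, sigma, n_sims=100_000, months=12, seed=0):
    """Frequency with which LSI beats 12-month DCA under geometric Brownian
    motion with annualized excess return mu and volatility sigma."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / 12
    log_r = rng.normal((mu - 0.5 * sigma**2) * dt, sigma * np.sqrt(dt),
                       size=(n_sims, months))
    growth = np.exp(np.cumsum(log_r, axis=1))   # cumulative price relatives
    lsi = growth[:, -1]
    # each DCA tranche grows from its purchase month to the end of the year
    start_levels = np.hstack([np.ones((n_sims, 1)), growth[:, :-1]])
    dca = (growth[:, -1:] / start_levels).mean(axis=1)
    return (lsi > dca).mean()

# Hypothetical usage: lsi_win_frequency(0.05, 0.15)
```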

Turning now to the actual amount of outperformance, we see a worse picture for DCA.

For more volatile assets, the expected outperformance is in LSI’s favor even at negative expected returns. This is the case despite what we saw before about DCA outperforming more frequently for these scenarios.

Calculations by Newfound Research. Results assume Geometric Brownian Motion using the given parameters and compare investing all capital at the beginning of 12 months to investing capital equally at the beginning of each month.

As interest rates increase, DCA will benefit, assuming that the expected return on equities remains the same (i.e. the expected excess return decreases). However, even if we assume that the cash account could generate an extra 200 bps – generous, given that this would imply cash rates near 4% – LSI would still be expected to outperform DCA by 100 bps in the 15% volatility, 5% expected excess return case.

 

What About Risk?

It is clear that DCA does not generally outperform LSI from a pure return point-of-view, but what about when risk is factored in? After all, part of the reason DCA is so popular is because it is said to reduce the risk of investing at the worst possible time.

Under the same Monte Carlo setup, we can use the ulcer index to quantify this risk. The ulcer index measures the duration and severity of the drawdowns experienced in an investment, where a lower ulcer index value implies fewer and less severe drawdowns.

The chart below shows the median ratio of the LSI ulcer index and the DCA ulcer index. We plot the ratio to better compare the relative riskiness of each strategy.

Calculations by Newfound Research. Results assume Geometric Brownian Motion using the given parameters and compare investing all capital at the beginning of 12 months to investing capital equally at the beginning of each month.
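For reference, below is a minimal sketch of the ulcer index calculation; the path inputs are hypothetical.

```python
import numpy as np

def ulcer_index(values):
    """Ulcer index: root-mean-square of percentage drawdowns from the
    running peak of a value path (lower = fewer/shallower drawdowns)."""
    values = np.asarray(values, dtype=float)
    peaks = np.maximum.accumulate(values)
    drawdowns = (values - peaks) / peaks   # zero or negative
    return np.sqrt(np.mean(drawdowns ** 2))

# Hypothetical per-path risk ratio, as plotted above:
# ratio = ulcer_index(lsi_path) / ulcer_index(dca_path)
```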

As we would expect, since the DCA strategy linearly moves from cash to an investment, the LSI scheme takes on about twice the drawdown risk in many markets.

When the lump sum is invested, the whole investment is subject to the mercy of the market, but if DCA is used, the market exposure is only at its maximum in the last month.[5]

The illustration of this risk alone may be enough to convince investors that DCA meets its objective of smoothing out investment returns. However, at what cost?

Combining the expected outperformance and the risk embodied in the ulcer index shows that LSI is still expected to outperform on a risk adjusted basis between about 35% and 45% of the time.

Calculations by Newfound Research. Results assume Geometric Brownian Motion using the given parameters and compare investing all capital at the beginning of 12 months to investing capital equally at the beginning of each month.

While this is lower than it was from a pure return perspective, it should be taken with a grain of salt.

First, we know from the start that LSI will be more exposed to drawdowns. One possible solution would be to treat a ratio of ulcer indices of 2 (instead of 1) as the base case.

Second, for an investor who is not checking their account monthly, the ulcer index may not mean much. If you only looked at the account value at the beginning and end of the year regardless of whether you did DCA or LSI, then LSI is generally expected to leave the account better off; the intermediate noise does not get “experienced.”

 

When Can DCA Work?

So now that we have shown that DCA is empirically and theoretically suboptimal to LSI, why might you still want to do it?

First, we believe there is still a risk reduction argument that makes sense when accounting for investor behavior. Most research has focused on risk in the form of volatility. We showed previously that focusing more on drawdown risk can lead to better risk-adjusted performance of DCA.

We could also look at the gain-to-pain ratio, defined here as the average outperformance divided by the average underperformance of the LSI strategy.

The following chart shows a sampling of asset-class expected returns and volatilities from Research Affiliates with indifference boundaries for different gain-to-pain ratios. Indifference boundaries show the returns and volatilities with constant gain-to-pain ratios. For a given gain-to-pain ratio (e.g. 1.5 means that you will only accept the risk in LSI if its outperformance over DCA is 50% higher, on average), any asset class points that fall below that line are good candidates for DCA.

The table below shows which asset classes correspond to each region on the chart.

Source: Research Affiliates. Calculations by Newfound Research. Results assume Geometric Brownian Motion using the given parameters and compare investing all capital at the beginning of 12 months to investing capital equally at the beginning of each month.
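Under the same hypothetical simulation setup as before, the gain-to-pain calculation itself is a short function; this is a minimal sketch with hypothetical names.

```python
import numpy as np

def gain_to_pain(lsi_values, dca_values):
    """Average LSI outperformance divided by average LSI underperformance,
    measured across simulated 12-month outcomes."""
    diff = np.asarray(lsi_values) - np.asarray(dca_values)
    gains = diff[diff > 0]
    pains = -diff[diff < 0]
    return gains.mean() / pains.mean()
```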

As the indifference coefficient increases, the benefit of DCA from a gain-to-pain perspective diminishes. For volatile asset classes with lower expected returns (e.g. U.S. equities and long-term U.S. Treasuries), DCA may make sense. For less volatile assets like income-focused funds and assets with higher expected growth like EM equities, LSI may be the route to pursue.

A second reason for using DCA is that there are also some market environments that are actually favorable to DCA. As we saw previously, down-trending markets lead to better absolute performance for DCA and volatility makes DCA more attractive from a drawdown risk perspective even in markets with positive expected returns.

Sideways markets are also good for DCA. So are markets that have a set final return.[6] The more volatility the better for DCA in these scenarios.

The chart below shows the return level below which DCA is favored.  If you are convinced that the market will return less than -0.6% this year, then DCA is expected to outperform LSI.

Calculations by Newfound Research. Results assume Brownian Bridges using the given parameters and compare investing all capital at the beginning of 12 months to investing capital equally at the beginning of each month.

While a set final return may be an unrealistic hope – who knows where the market will be a year from now? – it allows us to translate beliefs for market returns into an investing plan with DCA or LSI.

However, even though the current high-valuation environment has historically low expected returns for stocks and bonds, the returns over the next year may vary widely. The appeal of DCA may be stronger in this environment even though it is sub-optimal to LSI.

Instead of using DCA on its own as a risk management tool – one that may sacrifice too much of the return to be had – we can pair it with other risk management techniques to improve its odds of outperforming LSI.

Finding a DCA Middle Ground

One of the primary drags on DCA performance is the fact that much of the capital is sitting in cash for most of the time.

Is there a way to reduce this cost of waiting to invest?

One initial alternative to cash is to hold the capital in bonds. This is in line with the intuitive notion of beginning in a low risk profile and moving gradually to a higher one. While this improves the frequency of outperformance of DCA historically, it does little to improve the expected outperformance.

Another option is to utilize a risk managed sleeve that is designed to protect capital during market declines and participate in market growth. Using a simple tactical strategy that holds stocks when they are above their 10-month SMA and bonds otherwise illustrates this point, boosting the frequency of outperformance for DCA from 32% to 71%.

Source: Kenneth French Data Library and Robert Shiller Data Library. Calculations by Newfound Research. Results are hypothetical. Past performance does not guarantee future results.

Source: Kenneth French Data Library and Robert Shiller Data Library. Calculations by Newfound Research. Results are hypothetical. Past performance does not guarantee future results.
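The tactical sleeve described above can be sketched in a few lines; this is a minimal, hypothetical version assuming monthly pandas Series of stock and bond returns plus a stock index level, not the exact Newfound implementation.

```python
import pandas as pd

def tactical_sleeve_returns(stock_r, bond_r, stock_index, window=10):
    """Hold stocks when the index sits above its 10-month SMA, bonds
    otherwise; the signal is lagged one month to avoid look-ahead."""
    sma = stock_index.rolling(window).mean()
    in_stocks = (stock_index > sma).shift(1, fill_value=False)
    return stock_r.where(in_stocks, bond_r)
```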

The tactical strategy narrows the distribution of expected outperformance much more than bonds.

Since we know that the tactical strategy did well over this historical period with the benefit of hindsight, we can also look at how it would have done if returns on stocks and bonds were scaled down to match the current expectations from Research Affiliates.[7]

Source: Kenneth French Data Library and Robert Shiller Data Library. Calculations by Newfound Research. Results are hypothetical. Past performance does not guarantee future results.

The frequency of outperformance is still in favor of the tactical strategy, and the distribution of outperformance exhibits trends similar to using the actual historical data.

Going back to the Japanese market example, we also see improvement in DCA using the tactical strategy. The benefit was smaller than in the U.S., but it was enough to make both the frequency and expected outperformance swing in favor of DCA, even for the period from 1989 to 2017.

Source: MSCI and Federal Reserve of St. Louis. Calculations by Newfound Research. Results are hypothetical. Past performance does not guarantee future results. Data from 1970 to 2017.

Deploying the capital immediately into a risk-managed solution does not fully eliminate the risk of underperformance that DCA carries when it holds cash. The cost of using this method is that a tactical strategy can be exposed to whipsaw.

One way to mitigate the cost of whipsaw is to use a more diversified (in terms of process and assets) risk management sleeve.

 

Conclusion

Dollar-cost averaging versus lump sum investing is often a difficult decision fraught with emotion. Losing 10% of an investment right off the bat can be a hard pill to swallow. However, the case against DCA is backed up by empirical evidence and many theoretical arguments.

If a portfolio is deemed optimal based on an investor’s risk preferences and tolerances, then anything else would be suboptimal. But what is optimal on paper is not always the best for an investor who cannot stick with the plan.

Because of this, there are times when DCA can be beneficial. Certain measures of risk that account for drawdowns or the asymmetric psychological impacts of gains and losses point to some benefits for DCA over LSI.

Given that even in this low expected-return market environment, the expected return on cash is still less than that on equities and bonds, a risk-managed solution – or a strategy that has higher expected returns for the amount of risk it takes – may be a better holding place for the capital while implementing a DCA scheme.

It is important to move beyond a myopic view, commonly witnessed in the market, that DCA is best for every situation. Even though LSI may feel like market timing, DCA is simply another form of market timing. With relatively small balances, DCA can also increase commission costs and possibly requires more oversight or leads to higher temptation to check in on a portfolio, resulting in rash decisions.

Ultimately, the best solution is the one that gets an investor into an appropriate portfolio, encourages them to stay on track for their long term financial goals, and appropriately manages any behavioral consequences along the way.

 

[1] https://personal.vanguard.com/pdf/s315.pdf

[2] https://blog.thinknewfound.com/2017/02/misattributing-bad-behavior/

[3] A Note on the Suboptimality of Dollar-Cost Averaging as an Investment Policy, https://faculty.chicagobooth.edu/george.constantinides/documents/JFQA_1979.pdf

[4] Dollar-Cost Averaging: The Role of Cognitive Error, https://www.cass.city.ac.uk/__data/assets/pdf_file/0008/128384/Dollar-Cost-Averaging-09052012.pdf

[5] This is a form of sequence risk. In DCA, the initial returns on the investment do not have the same impact as the final period returns.

[6] Milevsky, Moshe A. and Posner, Steven E., A Continuous-Time Re-Examination of the Inefficiency of Dollar-Cost Averaging (January 1999). SSBFIN-9901. Available at SSRN: https://ssrn.com/abstract=148754

[7] Specifically, we use the “Yield & Growth” capital market assumptions from Research Affiliates.  These capital market assumptions assume that there is no valuation mean reversion (i.e. valuations stay the same going forward).  The adjusted average nominal returns for U.S. equities and 10-year U.S. Treasuries are 5.3% and 3.3%, respectively.

Are Market Implied Probabilities Useful?

This post is available as a PDF download here.

Summary

  • Using historical data from the options market along with realized subsequent returns, we can translate risk-neutral probabilities into real-world probabilities.
  • Market implied probabilities are risk-neutral probabilities derived from the derivatives market. They incorporate both the probability of an event happening and the equilibrium cost associated with it.
  • Since investors have the flexibility of designing their own portfolios, real-world probabilities of event occurrences are more relevant to individuals than are risk-neutral probabilities.
  • With a better handle on the real-world probabilities, investors can make portfolio decisions that are in line with their own goals and risk tolerances, leaving the aggregate decision making to the policy makers.

Market-implied probabilities are just as the name sounds: weights that the market is assigning an event based upon current prices of financial instruments.  By deriving these probabilities, we can gain an understanding of the market’s aggregate forecast for certain events.  Fortunately, the Federal Reserve Bank of Minneapolis provides a very nice tool for visualizing market-implied probabilities without us having to derive them.[1]

For example, say that I am concerned about inflation over the next 5 years. I can see how the probability of a large increase has been falling over time and how the probability of a large decrease has fallen recently, with both currently hovering around 15%.

Historical Market Implied Probabilities of Large Moves in Inflation

Source: Minneapolis Federal Reserve

I can also look at the underlying probability distributions for these predictions, which are derived from the derivatives market, and compare the changes over time.

Market Implied Probability Distributions for Moves in Inflation

Source: Minneapolis Federal Reserve

From this example, we can judge that not only has the market’s implied inflation forecast increased, but the precision has also increased (i.e. lower standard deviation) and the probabilities have been skewed to the left with fatter tails (i.e. higher kurtosis).

Inflation is only one of many variables analyzed.

Also available is implied probability data for the S&P 500, short and long-term interest rates, major currencies versus the U.S. dollar, commodities (energy, metal, and agricultural), and a selection of the largest U.S. banks.

With all the recent talk about low volatility, the data for the S&P 500 over the next 12 months is likely to be particularly intriguing to investors and advisors alike.

Historical Market Implied Probabilities of Large Moves in the S&P 500

Source: Minneapolis Federal Reserve

The current market implied probabilities for both large increases and decreases (i.e. greater than a 20% move) are the lowest they have been since 2007.

Interpreting Market Implied Probabilities

A qualitative assessment of probability is generally difficult unless the difference is large. We can ask ourselves, for example, how we would react if the probability of a large loss jumped from 10% to 15%. We know that the latter case is riskier, but how does that translate into action?

The first step is understanding what the probability actually means.

Probability forecasts in weather are a good example of this necessity. Precipitation forecasts are a combination of forecaster certainty and coverage of the likely precipitation.[2] For example, if there is a 40% chance of rain, it could mean that the forecaster is 100% certain that it will rain in 40% of the area. Or it could mean that they are 40% certain that it will rain in 100% of the area.  Or it could mean that they are 80% certain that it will rain in 50% of the area.

Once you know what the probability even represents, you can have a better grasp on whether you should carry an umbrella.

In the case of market implied probabilities, what we have is the risk-neutral probability. These are the probabilities of an event given that investors are risk neutral; these probabilities factor in both the likelihood of an event and the cost in the given state of the world. These are not the real-world probabilities of the market moving by a given amount. In fact, they can change over time even if the real-world probabilities do not.

To illustrate these differences between a risk-neutral probability and a real-world probability, consider a simple coin flip game. The coin is flipped one time. If it lands on heads, you make $1, and if it lands on tails, you lose $1.

The coin is fair, so the probability for the coin flip is 50%. How much would you pay to play this game?

If your answer is nothing, then you are risk neutral, and the risk-neutral probability is also 50%.

However, risk averse players would say, “you have to pay me to play that game.” In this case, the risk neutral probability of a tails is greater than 50% because of the downside that players ascribe to that event.

Now consider a scenario where a tails still loses $1, but a heads pays out $100.  Chances are that even very risk-averse players would pay close to $1 to play this game.

In this case, the risk neutral probability of a heads would be much greater than 50%.

But in all cases, the actual likelihoods of heads and tails never changed; they still had a 50% real-world probability of occurring.
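To make the arithmetic concrete (a worked example of our own, not from the original game description): ignoring discounting, a quoted price $p$ for this $\pm\$1$ game pins down the risk-neutral probability of heads, $q$:

$$p = q \cdot (+1) + (1 - q) \cdot (-1) \quad \Longrightarrow \quad q = \frac{1 + p}{2}$$

A player who must be paid \$0.10 to play implies $p = -0.10$ and hence $q = 0.45$: a 55% risk-neutral probability of tails, even though the real-world probability of each outcome remains 50%.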

As with the game, investors who operate in the real world are generally risk averse. We pay premiums for insurance-like investments to protect in the states of the world we dread the most. As such, we would expect the risk-neutral probability of a “bad” event (e.g. the market down more than 20%) to be higher than the real-world probability.

Likewise, we would expect the risk-neutral probability of a “good” event (e.g. the market up more than 20%) to be lower than the real-world probability.

How Market Implied Probabilities Are Calculated

Note (or Warning): This section contains some calculus. If that is not of interest, feel free to skip to the next section; you won’t miss anything. For those interested, we will briefly cover how these probabilities are calculated to see what (or who), exactly, in the market implies them.

The options market contains call and put options over a wide array of strike prices and maturities. If we assume that the market is free from arbitrage, we can transform the price of put options into call options through put-call parity.[3]

In theory, if we knew the price of a call option at every strike price, we could calculate the risk-neutral probability distribution, $f_{RN}$, as the (discounted) second derivative of the call price with respect to the strike price:

$$f_{RN}(K) = e^{r(T-t)} \frac{\partial^2 C}{\partial K^2}$$

where $r$ is the risk-free rate, $C$ is the price of a call option, $K$ is the strike price, and $T-t$ is the time to option maturity.

Since options do not exist at every strike price, a curve is fit to the data to make it a continuous function that can be differentiated to yield the probability distribution.
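As a rough numerical sketch (not the Minneapolis Fed's actual procedure), once a smooth call-price curve has been fit, the density can be approximated with finite differences on an evenly spaced strike grid; the names and inputs here are hypothetical.

```python
import numpy as np

def risk_neutral_density(strikes, call_prices, r, tau):
    """Breeden-Litzenberger: f_RN(K) = exp(r * tau) * d2C/dK2, approximated
    here with central second differences on an evenly spaced strike grid."""
    strikes = np.asarray(strikes, dtype=float)
    calls = np.asarray(call_prices, dtype=float)
    dK = strikes[1] - strikes[0]
    curvature = (calls[2:] - 2 * calls[1:-1] + calls[:-2]) / dK ** 2
    return strikes[1:-1], np.exp(r * tau) * curvature
```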

Immediately, we see that the probabilities are set by the options market.

Are Market Implied Probabilities Useful?

Feldman et al. (2015), from the Minneapolis Fed, assert that market-based probabilities are a useful tool for policy makers.[4] Their argument centers on the fact that risk-neutral probabilities encapsulate both the probability of an event occurring – the real-world probability – and the cost/benefit of the event.

Assuming broad access to the options market, households or those acting on behalf of households can express their views on the chances of the event happening and the willingness to exchange cash flows in different states of the world by trading the appropriate options.

In the paper, the authors admit two main pitfalls:

  1. Participation – An issue can arise here since the people whose welfare the policy-makers are trying to consider may not be participating. Others outside the U.S. may also be influencing the probabilities.
  2. Illiquidity – Options do not always trade frequently enough in the fringes of the distribution where investors are usually most concerned. Because of this, any extrapolation must be robust.

However, they also refute many common arguments against using risk-neutral probabilities.

  1. These are not “true” probabilities – The fact that these market implied probabilities are model-independent and derived from household preferences rather than from a statistician’s model, with its own biased assumptions, is beneficial, especially since these market probabilities account for resource availability.
  2. No Household is “Typical” – In equilibrium, all households should be willing to rearrange their cash flows in different states of the world as long as the market is complete. Therefore, a policy-maker aligns their beliefs with those of the households in aggregate by using the market-based probabilities.

We have covered how policymakers often do not forecast very well themselves[5], which Ellison and Sargent argue may be intentional, stating that the FOMC may purposefully forecast incorrectly in order to form policy that is robust to model misspecification.[6]

Where a problem could arise is when an individual investor (i.e. a household) makes a decision for their own portfolio based on these risk-neutral probabilities.

We agree that having a financial market makes a “typical” investor more relevant than the “average fighter pilot” example in our previous commentary.[7]  But what a central governing body uses to make decisions is different from what may be relevant to an individual.

The ability to be flexible is key. In this case, an investor can construct their own portfolio.  It would be like a pilot constructing their own plane.

Getting to Real World Probabilities

Using the method outlined in Vincent-Humphreys and Noss (2012), we can transform risk-neutral probabilities into real-world probabilities, assuming that investor risk preferences are stable over time.[8]

Without getting too deep into the mathematical framework, the basic premise is that if we have a probability density function (PDF) for the risk-neutral probability, fRN, with a cumulative distribution function (CDF), FRN, we can multiply it by a calibration function, C, to obtain the real-world probability density function, fRW.

The beta distribution is a suitable choice for the calibration function.[9]  Using a beta distribution balances being parsimonious – it only has two parameters – with flexibility, since it allows for preserving the risk-neutral probability distribution by simply shifting the mean and adjusting the variance, skew, and kurtosis.
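Written out – hedged as our own notation for this framework rather than a quotation from the paper – the calibration takes the form:

$$F_{RW}(x) = C\big(F_{RN}(x)\big) \quad \Longrightarrow \quad f_{RW}(x) = f_{RN}(x)\, c\big(F_{RN}(x)\big)$$

where $c(u) = u^{j-1}(1-u)^{k-1}/B(j,k)$ is the beta density with parameters $j$ and $k$.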

The beta distribution parameters are calculated using the realized value of whatever the market implied probability represents (e.g. change in the S&P 500, interest rates, commodity prices, etc.) over the subsequent time period.

Deriving the Real-World Probability for a Large Move in the S&P 500 

We have now covered what market-implied probabilities are, how they are calculated, and why they are useful for policy makers.

But individual investors price risk differently based on their own situations and preferences. Because of this, it is helpful to strip off the market-implied costs that are baked into the risk-neutral probabilities. The real-world probabilities could then be used to weigh stress testing scenarios or evaluate the cost of other risk management techniques that align more with investor goals than option strategies.

Using the framework outlined above, we can go through an example of transforming the market implied probabilities of large moves in the S&P 500 into their real-world probabilities.

Statistical aside: The options data starts in 2007, and with 10 years of data, we only have 10 non-overlapping data points, which reduces the power of the maximum likelihood estimate used to fit the beta distribution. However, with options expiring twice a month, we have 24 separate data sets to use for calculating standard errors. Since we are concerned more about the potential differences between the risk-neutral and real-world distributions, we could use the rolling 12-month periods and still see the same trends. As with any analysis with overlapping periods, there can be significant autocorrelation to deal with. By using the 6-month distribution data from the Minneapolis Fed, we could double the number of observations.

Since the Minneapolis Fed calculates the market implied (risk neutral) probability distribution and the summary statistics (numerically), we must first translate it into a functional form to extend the analysis. Based on the data and the summary statistics, the distribution is neither normal nor log-normal. It is left-skewed and has fat tails most of the time.

Market Implied Probability Distributions for Moves in the S&P 500

Source: Minneapolis Federal Reserve

We will assume that the distribution can be parameterized using a skewed generalized t-distribution, which allows for these properties and also encompasses a variety of other distributions including the normal and t-distributions.[10]  It has 5 parameters, which we will fit by matching the moments (mean, variance, skewness and kurtosis) of the distribution along with the 90th percentile value, since that tail of the distribution is generally the smaller of the two.[11]

We can check the fits using the reported median and the 10th percentile values to see how well they match.

Fit Percentile Values vs. Reported Values

Source: Minneapolis Fed.  Calculations by Newfound Research. 

There are instances where the reported distribution is bi-modal and would not be as accurately represented by the skewed generalized t-distribution, but, as the above graph shows, the quantiles where our interest is focused line up decently well.

Now that we have our parameterized risk-neutral distribution for all time periods, the next step is to input the subsequent 12-month S&P 500 return into the CDF calculated at each point in time. While we don’t expect this risk-neutral distribution to necessarily produce a good forecast of the market return, this step produces the data needed to calibrate the beta function.

The graph below shows this CDF result over the rolling periods.

Cumulative Probabilities of Realized 12-month S&P 500 Returns using the Risk-Neutral Distribution from the Beginning of Each Period

Source: Minneapolis Fed and CSI.  Calculations by Newfound Research. 

The persistence of high and low values is evidence of the autocorrelation issue we discussed previously since the periods are overlapping.

The beta distribution function used to transition from the risk-neutral distribution to the real-world distribution has parameters j = 1.64 and k = 1.00 with standard errors of 0.09 and 0.05, respectively.

We can see how this function changes at the endpoints of the 95% confidence intervals for each parameter as a way to assess the uncertainty in the calibration.

Estimated Calibration Functions for 12-month S&P 500 Returns

Source: Minneapolis Fed and CSI.  Calculations by Newfound Research. Data from Jan 2007 to Nov 2017.

When we transform the risk-neutral distribution into the real-world distribution, the calibration function values that are less than 1 in the left tail reduce the probabilities of large market losses.

In the right tail, the calibration estimates show that real-world probabilities could be higher or lower than the risk-neutral probabilities depending on the second parameter’s value in the beta distribution (this corresponds to k being either greater than or less than 1).

With the risk-neutral distribution and the calibrated beta distribution, we now have all the pieces to calculate the real-world distribution at any point in the options data set.
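A minimal sketch of this final transformation is below, using the fitted parameters reported above (j = 1.64, k = 1.00) and assuming the risk-neutral density has been evaluated on an evenly spaced return grid; the names are hypothetical.

```python
import numpy as np
from scipy.stats import beta

def real_world_density(x_grid, f_rn, j=1.64, k=1.00):
    """Reweight a risk-neutral density by the fitted beta calibration
    density evaluated at the risk-neutral CDF, then renormalize."""
    f_rn = np.asarray(f_rn, dtype=float)
    dx = x_grid[1] - x_grid[0]                 # even grid spacing assumed
    F_rn = np.clip(np.cumsum(f_rn) * dx, 0.0, 1.0)
    f_rw = f_rn * beta.pdf(F_rn, j, k)
    return f_rw / (f_rw.sum() * dx)            # integrate back to one
```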

The graph below shows how these functions affect the risk-neutral probability density using the most recent option data. As expected, much more of the density is centered around the mode, and the distribution is skewed to the right, even using the bounds of the confidence intervals (CI) for the beta distribution parameters. 

Risk Neutral and Real-World Probability Densities

Source: Minneapolis Fed and CSI.  Calculations by Newfound Research. Data as of 11/15/17. Past performance is no guarantee of future results.

 

                                     Risk-Neutral   Real-World   Real-World (Range Based on Calibration)
Mean Return                          0.1%           6.4%         4.2% – 8.3%
Probability of a large loss (>20%)   10.6%          2.6%         1.5% – 4.3%
Probability of a large gain (>20%)   1.7%           2.8%         1.7% – 4.4%

 Source: Minneapolis Fed and CSI.  Calculations by Newfound Research. Data as of 11/15/17. Past performance is no guarantee of future results.

Based on this analysis, we see some interesting things occurring.

  • The mean return is considerably higher than the values suggested by many large firms, such as JP Morgan, BlackRock, and Research Affiliates.[12] Those estimates are generally for 7-10 years, so this doesn’t rule out having a good 2018, which is what the options market is showing.
  • The real-world probability of a large loss is considerably lower than the 10.6% risk-neutral probability. If firms like GMO already consider 10% levels to be unreasonably low, this would only strengthen their assessment of complacency in the market.[13]
  • The lower end of the range for the real-world probability of a large gain is in line with the risk-neutral probability, suggesting that investors are seeking out risks (lower risk aversion) in a market with depressed yields on fixed income and low current volatility.

This also shows how looking at market implied probabilities can paint a skewed picture of the chances of an event occurring.

However, we must keep in mind that these real-world probabilities are still derived from the market-implied probabilities. In an efficient market world, all risks would correctly be priced into the market. But we know from the experience during the Financial Crisis that that is not always the case.

Our recommendation is to take all market probabilities with a grain of salt. Just because having a coin land on heads five times in a row has a probability of less than 4% doesn’t mean we should be surprised if it happens once. And coin flipping is something that we know the probability for.

Whether the market probabilities we use are risk-neutral or real-world, there are a lot of assumptions that go into calculating them, and the consequences of being wrong can have a large impact on portfolios. Risk management is important if the event occurs regardless of how likely it is to occur.

As with the weather, a 10% chance of a large loss versus a 4% chance is not a big difference in absolute terms, but a large portfolio loss is likely more devastating than getting rained on a bit should you decide not to bring an umbrella.

Conclusion

Market implied probabilities are risk-neutral probabilities derived from the derivatives market. If we assume that the market is efficient and that there is sufficient investor participation in these markets, then these probabilities can serve as a tool for governing organizations to adjust policy going forward.

However, these probabilities factor in both the actual probability of an event and the perceived cost to investors. Individual investors will attribute their own costs to such events (e.g. a retiree could be much more concerned about a 20% market drop than someone in the beginning of their career).

If individuals want to assess the probability of the event actually happening in order to make portfolio decisions, then they have to focus on the real-world probabilities.  Ultimately, an investor’s cost function associated with market events depends more on their own life circumstances than on the market’s aggregate pricing of risk. While a bad state of the world for an investor can coincide with a bad state of the world for the market (e.g. losing a job when the market tanks), risk in an individual’s portfolio should be managed for the individual, not the “typical household”.

While the real-world probability of an event is typically dependent on an economic or statistical model, we have presented a way to translate the market implied probabilities into real-world probabilities.

With a better handle on the real-world probabilities, investors can make portfolio decisions that are in line with their own goals and risk tolerances.

[1] https://www.minneapolisfed.org/banking/mpd

[2] https://www.weather.gov/ffc/pop

[3] https://en.wikipedia.org/wiki/Put%E2%80%93call_parity

[4] https://www.minneapolisfed.org/~/media/files/banking/mpd/optimal_outlooks_dec22.pdf

[5] https://blog.thinknewfound.com/2015/03/weekly-commentary-folly-forecasting/

[6] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2160157

[7] https://blog.thinknewfound.com/2017/09/the-lie-of-averages/

[8] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2093397

[9] The beta distribution takes arguments between 0 and 1, inclusive, and has a non-decreasing CDF. It was also used in Fackler and King (1990) – https://www.jstor.org/stable/1243146.

[10] https://cran.r-project.org/web/packages/sgt/vignettes/sgt.pdf

[11] Since we have 5 unknown parameters, we have to add in this fifth constraint. We could also have used the 10th percentile value or the median. Whichever we use, we can see how well the other two align with the reported values.

[12] https://interactive.researchaffiliates.com/asset-allocation/

[13] https://www.gmo.com/docs/default-source/research-and-commentary/strategies/asset-allocation/the-s-p-500-just-say-no.pdf

Addressing Low Return Forecasts in Retirement with Tactical Allocation

This post is available for download as a PDF here.

Summary

  • The current return expectations for core U.S. equities and bonds paint a grim picture for the success of the 4% rule in retirement portfolios.
  • While varying the allocation to equities throughout the retirement horizon can provide better results, employing tactical strategies to systematically allocate to equities can more effectively reduce the risk that the sequence of market returns is unfavorable to a portfolio.
  • When a tactical strategy is combined with other incremental planning and portfolio improvements, such as prudent diversification, more accurate spending assessments, tax efficient asset location, and fee-conscious investing, a modest allocation can greatly boost likely retirement success and comfort.

Over the past few weeks, we have written a number of posts on retirement withdrawal planning.

The first was about the potential impact that high core asset valuations – and the associated muted forward return expectations – may have on retirement.

The second was about the surprisingly large impact that small changes in assumptions can have on retirement success, akin to the Butterfly Effect in chaos theory. Retirement portfolios can be very sensitive to assumed long-term average returns and assumptions about how a retiree’s spending will evolve over time.

In the first post, we presented a visualization like the following:

Historical Wealth Paths for a 4% Withdrawal Rate and 60/40 Stock/Bond Allocation
Source: Shiller Data Library.  Calculations by Newfound Research. Analysis uses real returns and assumes the reinvestment of dividends.  Returns are hypothetical index returns and are gross of all fees and expenses.  Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

The horizontal (x-axis) represents the year when retirement starts.  The vertical (y-axis) represents the years post-retirement.  The coloring of each cell represents the savings balance at a given point in time.  The meaning of each color is as follows:

  • Green: Current account value is greater than or equal to the initial account value (e.g. an investor starting retirement with $1,000,000 has a current account balance that is at least $1,000,000).
  • Yellow: Current account value is between 75% and 100% of the initial account value.
  • Orange: Current account value is between 50% and 75% of the initial account value.
  • Red: Current account value is between 25% and 50% of the initial account value.
  • Dark Red: Current account value is between 0% and 25% of the initial account value.
  • Black: Current account value is zero; the investor has run out of money.

We then recreated the visualization, but with one key modification: we adjusted the historical stock and bond returns downward so that the long-term averages are in line with realistic future return expectations[1] given current valuation levels.  We did this by subtracting the difference between the actual average log return and the forward-looking long-term average log return from each year’s return.  With this technique, we capture the effect of subdued average returns while retaining realistic behavior for shorter-term returns.
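As an illustration of this adjustment, here is a minimal sketch; the toy return series is hypothetical, and the 5.3% forward-looking equity average comes from the capital market assumptions cited in the footnote.

```python
# A minimal sketch of the return adjustment described above. The toy
# return series is hypothetical; the 5.3% forward-looking average comes
# from the capital market assumptions cited in the footnote.
import numpy as np

historical_returns = np.array([0.12, -0.05, 0.21, 0.08, -0.18, 0.15])  # toy data
forward_avg_log = np.log(1.053)  # forward-looking long-term average (log) return

log_r = np.log1p(historical_returns)
# Subtract the gap between the historical and forward-looking averages
# from every year, preserving the year-to-year shape of the series.
adjusted = np.expm1(log_r - (log_r.mean() - forward_avg_log))

assert np.isclose(np.log1p(adjusted).mean(), forward_avg_log)
```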

 

Historical Wealth Paths for a 4% Withdrawal Rate and 60/40 Stock/Bond Allocation with Current Return Expectations

Source: Shiller Data Library.  Calculations by Newfound Research. Analysis uses real returns and assumes the reinvestment of dividends.  Returns are hypothetical index returns and are gross of all fees and expenses.  Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

One downside of the above visualizations is that they only consider one withdrawal rate / portfolio composition combination.  If we want to see results for withdrawal rates ranging from 1% to 10% in 1% increments and portfolio compositions ranging from 0/100 stocks/bonds to 100/0 stocks/bonds in 20% increments, we would need sixty graphs!

To distill things a bit more, we looked at the historical “success” of various investment and withdrawal strategies.  We evaluated success on three metrics:

  1. Absolute Success Rate (“ASR”): The historical probability that an individual or couple will not run out of money before their retirement horizon ends.
  2. Comfortable Success Rate (“CSR”): The historical probability that an individual or couple will have at least the same amount of money, in real terms, at the end of their retirement horizon compared to what they started with.
  3. Ulcer Index (“UI”): The average pain of the wealth path over the retirement horizon where pain is measured as the severity and duration of wealth drawdowns relative to starting wealth. [2]
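For concreteness, here is a minimal sketch of one plausible implementation of these three metrics, assuming `paths` is a collection of real wealth paths, each normalized to a starting value of 1.0; per the footnote, the Ulcer Index here measures drawdowns from starting wealth rather than from the running peak.

```python
# A minimal sketch of the three retirement metrics, assuming `paths` is a
# list of real wealth paths normalized to start at 1.0.
import numpy as np

def asr(paths):
    """Share of paths that never hit zero before the horizon ends."""
    return np.mean([np.min(p) > 0 for p in paths])

def csr(paths):
    """Share of paths that end with at least the starting (real) wealth."""
    return np.mean([p[-1] >= p[0] for p in paths])

def ulcer_index(paths):
    """Average root-mean-square drawdown measured from starting wealth."""
    uis = []
    for p in paths:
        dd = np.maximum(1.0 - np.asarray(p) / p[0], 0.0)
        uis.append(np.sqrt(np.mean(dd ** 2)))
    return np.mean(uis)
```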

As a quick refresher, below we present the ASR for various withdrawal rate / risk profile combinations over a 30-year retirement horizon, first using historical returns and then using historical returns adjusted to reflect current valuation levels.  The CSR and Ulcer Index tables illustrate similar effects.

Absolute Success Rate for Various Combinations of Withdrawal Rate and Portfolio Composition – 30 Yr. Horizon

Absolute Success Rate for Various Combinations of Withdrawal Rate and Portfolio Composition with Average Stock and Bond Returns Equal to Current Expectations – 30 Yr. Horizon

Source: Shiller Data Library.  Calculations by Newfound Research.  Analysis uses real returns and assumes the reinvestment of dividends.  Returns are hypothetical index returns and are gross of all fees and expenses.  Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Overall, our analysis suggested that retirement withdrawal rates that were once safe may now deliver success rates that are no better – or even worse – than a coin flip.

The combined conclusion of these two posts is that the near future looks pretty grim for retirees and that an assumption that is slightly off can make the outcome even worse.

Now, we are going to explore a topic that can both mitigate low growth expectations and adapt a retirement portfolio to reduce the risk of a bad planning assumption. But first, some history.

 

How the 4% Rule Started

In 1994, Larry Bierwirth proposed the 4% rule, and William Bengen expanded on the research in the same year.[3], [4]

In the original research, the 4% rule was derived assuming that the investor held a 50/50 stock/bond portfolio, rebalanced annually, withdrew a fixed percentage of the initial balance each year, and increased withdrawals in line with inflation. 4% is the highest withdrawal percentage that never ran out of money over any historical 30-year retirement horizon.
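A minimal sketch of this derivation follows. Because the analysis uses real returns, an inflation-indexed withdrawal is simply a constant fraction of initial wealth; `real_returns` is a hypothetical array of annual real portfolio returns.

```python
# A minimal sketch of the Bengen-style derivation: for each historical
# start year, bisect for the highest inflation-indexed withdrawal rate
# that survives 30 years; the "4% rule" is the minimum across start years.
def survives(returns, rate, horizon=30):
    wealth = 1.0
    for t in range(horizon):
        wealth = wealth * (1.0 + returns[t]) - rate  # real withdrawal is constant
        if wealth <= 0:
            return False
    return True

def max_safe_rate(returns, horizon=30, lo=0.0, hi=0.20, iters=40):
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if survives(returns, mid, horizon):
            lo = mid
        else:
            hi = mid
    return lo

def bengen_rule(real_returns, horizon=30):
    starts = range(len(real_returns) - horizon + 1)
    return min(max_safe_rate(real_returns[s:s + horizon], horizon) for s in starts)
```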

Graphically, the 4% rule is the minimum value shown below.

Maximum Inflation Indexed Withdrawal to Deplete a 60/40 Portfolio Over a 30 Yr. Horizon

Source: Shiller Data Library.  Calculations by Newfound Research.  Analysis uses real returns and assumes the reinvestment of dividends.  Returns are hypothetical index returns and are gross of all fees and expenses.  Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Since its publication, the rule has become common knowledge to nearly all people in the field of finance and many people outside it. While it is a good rule-of-thumb and starting point for retirement analysis, we have two major issues with its broad application:

  1. It assumes that not running out of money is the only goal in retirement without considering implications of ending surpluses, return paths that differ from historical values, or evolving spending needs.
  2. It provides a false sense of security: just because 4% withdrawals never ran out of money in the past, that is not a 100% guarantee that they won’t in the future.

 

For example, if we adjust the stock and bond historical returns using the estimates from Research Affiliates (discussed previously) and replicate the analysis Bengen-style, the safe withdrawal rate is a paltry 2.6%.

 

Maximum Inflation Indexed Withdrawal to Deplete a 60/40 Portfolio Over a 30 Yr. Horizon using Current Return Estimates

Source: Shiller Data Library and Research Affiliates.  Calculations by Newfound Research.  Analysis uses real returns and assumes the reinvestment of dividends.  Returns are hypothetical index returns and are gross of all fees and expenses.  Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

While this paints a grim picture for retirement planning, it’s not likely how one would plan their financial future. If you were to base your retirement planning solely on this figure, you would have to save 54% more for retirement to generate the same amount of annual income as with the 4% rule, holding everything else constant.

In reality, even with the low estimates of forward returns, many of the scenarios had safe withdrawal rates closer to 4%. By putting a multi-faceted plan in place to reduce the risk of the “bad” scenarios, investors can hope for the best while still planning for the worst.

One aspect of a retirement plan can be a time-varying asset allocation scheme.

 

Temporal Risk in Retirement

Conventional wisdom says that equity risk should be reduced as one progresses through retirement. This is what is employed in many “through”-type target date funds that adjust equity exposure beyond the retirement age.

If we heed the “own your age in bonds” rule, then a retiree would decrease their equity exposure from 35% at age 65 to 5% at the end of a 30-year plan horizon.

Unfortunately, this thinking is flawed.

When a newly-minted retiree begins retirement, their success is highly dependent on their first few years of returns because that is when their account values are the largest. As they make withdrawals and are reducing their account values, the impact of a large drawdown in dollar terms is not nearly as large.  This is known as sequence risk.

As a simple example, consider three portfolio paths:

  • Portfolio A: -30% return in Year 1 and 6% returns for every year from Year 2 – Year 30.
  • Portfolio B: 6% returns for every year except for Year 15, in which there is a -30% return.
  • Portfolio C: 6% returns for every year from Year 1 – Year 29 and a -30% return in Year 30.

These returns work out to approximately the expected return on a 60/40 portfolio using Research Affiliates’ Yield & Growth expectations, and the drawdown is approximately in line with the drawdown of a 60/40 portfolio over the past decade.  We will assume 4% annual withdrawals and 2% annual inflation, with the withdrawals indexed to inflation.
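Here is a minimal sketch of the experiment, assuming withdrawals are taken at the end of each year; the original's exact withdrawal timing is an assumption, so the depletion year and ending values may differ slightly from the figures quoted below.

```python
# A minimal sketch of the three sequence-risk paths described above,
# assuming year-end withdrawals of 4% of initial wealth, indexed at 2%
# inflation (the exact withdrawal timing is an assumption).
def simulate(returns, rate=0.04, inflation=0.02):
    wealth, path = 1.0, []
    for t, r in enumerate(returns):
        wealth = wealth * (1.0 + r) - rate * (1.0 + inflation) ** t
        path.append(max(wealth, 0.0))
        if wealth <= 0:
            break
    return path

A = [-0.30] + [0.06] * 29                 # crash in year 1
B = [0.06] * 14 + [-0.30] + [0.06] * 15   # crash in year 15
C = [0.06] * 29 + [-0.30]                 # crash in year 30

for name, rets in zip("ABC", (A, B, C)):
    path = simulate(rets)
    status = f"ends at {path[-1]:.0%}" if path[-1] > 0 else f"depleted in year {len(path)}"
    print(f"Portfolio {name}: {status} of starting wealth")
```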

 

3 Portfolios with Identical Annualized Returns that Occur in Different Orders

Portfolio C fares the best, ending the 30-year period with 12% more wealth than it began with. Portfolio B makes it through, not as comfortably as Portfolio C, but still with 61% of its starting wealth. Portfolio A, however, starts off stressfully for the retiree and runs out of money in year 27.

Sequence risk is a big issue that retirement portfolios face, so how does one combat it with dynamic allocations?

 

The Rising Glide Path in Retirement

Kitces and Pfau (2013) proposed the rising glide path in retirement as a method to reduce sequence risk.[5]  They argued that since retirement portfolios are most exposed to market risk at the beginning of the retirement period, they should start with the lowest equity risk and ramp up as retirement progresses.

Based on Monte Carlo simulations using both capital market assumptions in line with historical values and reduced return assumptions for the current environment, the paper showed that investors can maximize their success rate and minimize their shortfall in bad (5th percentile) scenarios by starting with equity allocations of between 20% and 40% and increasing to 60% to 80% equity allocations through retirement.

We can replicate their analysis using the reduced historical return data, using the same metrics from before (ASR, CSR, and the Ulcer Index) to measure success, comfort, and stress, respectively.

 

Absolute Success Rate for Various Equity Glide Paths with Average Stock and Bond Returns Equal to Current Expectations – 30 Yr. Horizon with a 4% Initial Withdrawal Rate

Comfortable Success Rate for Various Equity Glide Paths with Average Stock and Bond Returns Equal to Current Expectations – 30 Yr. Horizon with a 4% Initial Withdrawal Rate

Ulcer Index for Various Equity Glide Paths with Average Stock and Bond Returns Equal to Current Expectations – 30 Yr. Horizon with a 4% Initial Withdrawal Rate

Source: Shiller Data Library and Research Affiliates.  Calculations by Newfound Research.  Analysis uses real returns and assumes the reinvestment of dividends.  Returns are hypothetical index returns and are gross of all fees and expenses.  Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

Note that the main diagonal in the chart represents static allocations, above the main diagonal represents the decreasing glide paths, and below the main diagonal represents increasing glide paths.

Since these returns are derived from the historical returns for stocks and bonds (again, accounting for a depressed forward outlook), they capture both the sequence of returns and the shifting correlations between stocks and bonds better than a Monte Carlo simulation would. On the other hand, the sample size is limited: we only have about four non-overlapping 30-year periods.

Nevertheless, these data show that there was not a huge benefit or detriment to using either an increasing or decreasing equity glide path in retirement based on these metrics. If we instead look at minimizing expected shortfall in the bottom 10% of scenarios, similar to Kitces and Pfau, we find that a glide path starting at 40% equity and rising to around 80% performs best.
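As a rough sketch of this shortfall metric, the snippet below averages the worst decile of shortfalls, measured in years of income; the exact definition used in the original analysis may differ in detail.

```python
# A minimal sketch of the shortfall metric: in the worst decile of
# historical paths, how many years of income the portfolio fell short of
# funding over the horizon. The exact original definition is an assumption.
import numpy as np

def expected_shortfall_years(depletion_years, horizon=30, pct=10):
    """`depletion_years[i]` = year path i ran out (horizon if it survived)."""
    shortfalls = horizon - np.asarray(depletion_years, dtype=float)
    worst = np.percentile(shortfalls, 100 - pct)
    return shortfalls[shortfalls >= worst].mean()
```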

However, it will still be tough to rest easy with a plan that has an ASR of around 60, a CSR of around 30, and an expected shortfall of 10 years of income.

With these unconvincing results, what can investors do to improve their retirement outcomes through prudent asset allocation?

 

Beyond a Static Glide Path

There is no reason to constrain portfolios to static glide paths. We have said before that the risk of a static allocation varies considerably over time. Simply dictating an equity allocation based on your age does not always make sense regardless of whether that allocation is increasing or decreasing.

If the market has a large drawdown, an investor should want to avoid it regardless of where they are in the retirement journey. Missing drawdowns is always beneficial, as long as enough of the subsequent upside is captured.

In two recent papers, Clare et al. (2017) showed that trend following can boost safe withdrawal rates in retirement portfolios by managing sequence risk. [6],[7]

The million-dollar question is, “how tactical should we be?”

The following charts show the ASR, CSR, and Ulcer index values for static allocations to stocks, bonds, and a simple tactical strategy that invests in stocks when they are above their 10-month simple moving average (SMA) and in bonds otherwise.
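A minimal sketch of this tactical rule, assuming monthly pandas Series of prices and returns sharing one index (the Series names are assumptions):

```python
# A minimal sketch of the 10-month SMA rule described above: hold stocks
# when the index is above its 10-month simple moving average, bonds
# otherwise. `stock_prices`, `stock_returns`, and `bond_returns` are
# assumed monthly pandas Series sharing one DatetimeIndex.
import pandas as pd

def tactical_returns(stock_prices, stock_returns, bond_returns, window=10):
    sma = stock_prices.rolling(window).mean()
    # Trade on last month's signal to avoid look-ahead bias.
    signal = (stock_prices > sma).shift(1).fillna(False).astype(bool)
    return stock_returns.where(signal, bond_returns)
```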

The charts are organized by the minimum and maximum equity exposures along the rows and columns. The charts are symmetric across the main diagonal so that they can be compared to both increasing and decreasing equity glide paths.

The equity allocation is the minimum of the row and column headings, the tactical strategy allocation is the absolute difference between the headings, and the bond allocation is what’s needed to bring the total allocation to 100%.

For example, the 20% and 50% column is a portfolio of 20% equities, 30% tactical strategy, and 50% bonds. It has an ASR of 75, a CSR of 40, and an Ulcer index of 22.
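This mapping from the row/column bounds to portfolio weights can be expressed directly:

```python
# The mapping from the chart's row/column bounds to portfolio weights,
# taken straight from the description above.
def allocation(row_pct, col_pct):
    equity = min(row_pct, col_pct)       # static equity allocation
    tactical = abs(row_pct - col_pct)    # tactical sleeve spans the gap
    bonds = 100 - equity - tactical      # bonds fill the remainder
    return equity, tactical, bonds

print(allocation(20, 50))  # (20, 30, 50): the 20%/50% cell from the example above
```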

 

Absolute Success Rate for Various Tactical Allocation Bounds Paths with Average Stock and Bond Returns Equal to Current Expectations – 30 Yr. Horizon with a 4% Initial Withdrawal Rate

Comfortable Success Rate for Various Tactical Allocation Bounds with Average Stock and Bond Returns Equal to Current Expectations – 30 Yr. Horizon with a 4% Initial Withdrawal Rate

Ulcer Index for Various Tactical Allocation Bounds with Average Stock and Bond Returns Equal to Current Expectations – 30 Yr. Horizon with a 4% Initial Withdrawal Rate

Source: Shiller Data Library and Research Affiliates.  Calculations by Newfound Research.  Analysis uses real returns and assumes the reinvestment of dividends.  Returns are hypothetical index returns and are gross of all fees and expenses.  Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

These charts show that being tactical is extremely beneficial under these muted return expectations and that being highly tactical is even better than being moderately tactical.

So, what’s stopping us from going whole hog with the 100% tactical portfolio?

Well, this is a case where a tactical strategy can reduce the risk of not making it through the 30-year retirement, at the risk of greatly increasing the ending wealth. It may sound counterintuitive to say that ending with too much extra money is a risk, but when our goal is to make it through retirement comfortably, taking undue risks comes at a cost.

For instance, we know that while the tactical strategy may perform well over a 30-year time horizon, it can go through periods of significant underperformance in the short-term, which can lead to stress and questioning of the investment plan. For example, in 1939 and 1940, the tactical strategy underperformed a 50/50 portfolio by 16% and 11%, respectively.

These times can be trying for investors, especially those who check their portfolios frequently.[8] Even the best-laid plan is not worth much if it cannot be adhered to.

Being tactical enough to manage the risk of having to make a major adjustment in retirement while keeping whipsaw, tracking error, and the cost of surpluses in check is key.

 

Sizing a Tactical Sleeve

If the goal is to have the smallest tactical sleeve that boosts the ASR and CSR and reduces the Ulcer Index to acceptable levels in a low expected return environment, we can turn back to the expected shortfall in the bad (10th percentile) scenarios to determine how large a tactical sleeve we should include in the portfolio. The analysis in the previous section showed that being tactical could yield ASRs and CSRs in the 80s and 90s (dark green).  This, however, requires a tactical sleeve between 50% and 70%, depending on the static equity allocation.

Thankfully, we do not have to put the entire burden on being tactical: we can diversify our approaches.  In the commentaries mentioned earlier, we covered a number of topics that can improve retirement results in a low expected return environment.

  • Thoroughly examine and define planning factors such as taxes and the evolution of spending throughout retirement.
  • Be strategic, not static: Have a thoughtful, forward-looking outlook when developing a strategic asset allocation. This means having a willingness to diversify U.S. stocks and bonds with the ever-expanding palette of complementary asset classes and strategies.
  • Utilize a hybrid active/passive approach for core exposures given the increasing availability of evidence-based, factor-driven investment strategies.
  • Be fee-conscious, not fee-centric. For many exposures (e.g. passive and long-only core stock and bond exposure), minimizing cost is certainly appropriate. However, do not let cost considerations preclude the consideration of strategies or asset classes that can bring unique return generating or risk mitigating characteristics to the portfolio.
  • Look beyond fixed income for risk management given low interest rates.
  • Recognize that the whole can be more than the sum of its parts by embracing not only asset class diversification, but also strategy/process diversification.

While each modification might only result in a small, incremental improvement in retirement outcomes, the compounding effect can be very beneficial.

The chart below shows the tactical sleeve size needed to minimize shortfalls/surpluses for a given improvement in annual returns (0bps through 150bps).

 

Tactical Allocation Strategy Size Needed to Minimize 10% Expected Shortfall/Surplus with Average Stock and Bond Returns Equal to Current Expectations for a Range of Annualized Return Improvements  – 30 Yr. Horizon with a 4% Initial Withdrawal Rate

Source: Shiller Data Library and Research Affiliates.  Calculations by Newfound Research.  Analysis uses real returns and assumes the reinvestment of dividends.  Returns are hypothetical index returns and are gross of all fees and expenses.  Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

For a return improvement of 125bps per year over the current forecasts for static U.S. equity and bond portfolios, with a static equity allocation of 50%, including a tactical sleeve of 20% would minimize the shortfall/surplus.

This portfolio essentially pivots around a static 60/40 portfolio, and we can compare the two, giving the same 125bps bonus to the returns for the static 60/40 portfolio.

 

Comparison of a Tactical Allocation Enhanced Portfolio with a Static 60/40 Portfolio with Average Stock and Bond Returns Equal to Current Expectations + 125bps per year   – 30 Yr. Horizon with a 4% Initial Withdrawal Rate

Source: Shiller Data Library and Research Affiliates.  Calculations by Newfound Research.  Analysis uses real returns and assumes the reinvestment of dividends.  Returns are hypothetical index returns and are gross of all fees and expenses.  Results may differ slightly from similar studies due to the data sources and calculation methodologies used for stock and bond returns.

 

In addition to the much more favorable statistics, the tactically enhanced portfolio only has a downside tracking error of 1.1% to the static 60/40 portfolio.
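For reference, here is a minimal sketch of one common downside tracking error calculation; the article does not spell out its exact formula, so this definition is an assumption.

```python
# A minimal sketch of downside tracking error: the annualized
# root-mean-square of only the negative active returns versus the
# benchmark. This common definition is an assumption, not necessarily
# the formula used in the analysis above.
import numpy as np

def downside_tracking_error(portfolio_returns, benchmark_returns, periods_per_year=12):
    active = np.asarray(portfolio_returns) - np.asarray(benchmark_returns)
    downside = np.minimum(active, 0.0)
    return np.sqrt(np.mean(downside ** 2) * periods_per_year)
```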

 

Conclusion: Being Dynamic in Retirement

From this historical analysis, high valuations of core assets in the U.S. suggest a grim outlook for the 4% rule. Predetermined dynamic allocation paths through retirement can help somewhat, but merely specifying an equity allocation based on one’s age loses sight of the changing risk of a given market environment.

The sequence of market returns can have a large impact on retirement portfolios. If a drawdown happens early in retirement, subsequent returns may not be enough to provide the tailwind that they have in the past.

Investors who are able to be fee/expense/tax-conscious and adhere to prudent diversification may be able to incrementally improve their retirement outlook to the point where a modest allocation to a sleeve of tactical investment strategies can get their portfolio back to a comfortable success rate.

Striking a balance between shortfall/surplus risk and the expected experience during the retirement period along with a thorough assessment of risk tolerance in terms of maximum and minimum equity exposure can help dictate how flexible a portfolio should be.

In our QuBe Model Portfolios, we pair allocations to tactically managed solutions with systematic, factor-based strategies to implement these ideas.

While long-term capital market assumptions are a valuable input in an investment process, adapting to shorter-term market movements to reduce sequence risk may be a crucial way to combat market environments where the low return expectations come to fruition.


[1] Specifically, we use the “Yield & Growth” capital market assumptions from Research Affiliates.  These capital market assumptions assume that there is no valuation mean reversion (i.e. valuations stay the same going forward).  The adjusted average nominal returns for U.S. equities and 10-year U.S. Treasuries are 5.3% and 3.1%, respectively, compared to the historical values of 9.0% and 5.3%.

[2] Normally, the Ulcer Index would be measured using true drawdown from peak, however, we believe that using starting wealth as the reference point may lead to a more accurate gauge of pain.

[3] Bierwirth, Larry. 1994. “Investing for Retirement: Using the Past to Model the Future.” Journal of Financial Planning, vol. 7, no. 1 (January): 14-24.

[4] Bengen, William P. 1994. “Determining Withdrawal Rates Using Historical Data.” Journal of Financial Planning, vol. 7, no. 4 (October): 171-180.

[5] Pfau, Wade D. and Kitces, Michael E., Reducing Retirement Risk with a Rising Equity Glide-Path (September 12, 2013). Available at SSRN: https://ssrn.com/abstract=2324930

[6] Clare, A. and Seaton, J. and Smith, P. N. and Thomas, S. (2017). Can Sustainable Withdrawal Rates Be Enhanced by Trend Following? Available at SSRN: https://ssrn.com/abstract=3019089

[7] Clare, A. and Seaton, J. and Smith, P. N. and Thomas, S. (2017) Reducing Sequence Risk Using Trend Following and the CAPE Ratio. Financial Analysts Journal, Forthcoming. Available at SSRN: https://ssrn.com/abstract=2764933

[8] https://blog.thinknewfound.com/2017/03/visualizing-anxiety-active-strategies/

A Closer Look At Growth and Value Indices

In a commentary a few weeks ago entitled Growth Is Not “Not Value,” we discussed a problem in the index construction industry in which growth and value are often treated as polar opposites. This treatment can lead to unexpected portfolio holdings in growth and value portfolios. Specifically, we may end up tilting more toward shrinking, expensive companies in both growth and value indices.

2D Quadrants - What we're really getting

The picture of what we want for each index looks more like this:

2D Quadrants - What we want

The overlap is not a bad thing; it simply acknowledges that a company can be cheap and growing, arguably a very good set of characteristics.

A common way of combining growth and value scores into a single metric is to divide growth ranks by value ranks. As we showed in the previous commentary, many index providers do something similar to this.

Essentially this means that low growth gets lumped in with high value and vice versa.

But how much does this affect the index allocations? Maybe there just are not many companies that get included or excluded based on this process.

Let’s play index provider for a moment.

Using data from Morningstar and Yahoo! Finance at the end of 2015, we can construct growth and value scores for each company in the S&P 500 and see where they fall in the growth/value planes shown above.

To calculate the scores, we will use an approach similar to the one in the last commentary, where the composite growth score is the average of the normalized scores for EPS growth, sales growth, and ROA, and the composite value score is the average of the normalized scores for the P/B, P/S, and P/E ratios.
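A minimal sketch of this scoring and selection scheme follows, using z-scores for normalization; the normalization method and the column names are assumptions.

```python
# A minimal sketch of the scoring scheme described above, using z-scores
# as the normalization (an assumption). Value metrics are inverted so
# that cheaper (lower P/B, P/S, P/E) scores higher.
import pandas as pd

def zscore(s):
    return (s - s.mean()) / s.std()

def composite_scores(df):
    growth = (zscore(df["eps_growth"]) + zscore(df["sales_growth"]) + zscore(df["roa"])) / 3
    value = (zscore(-df["pb"]) + zscore(-df["ps"]) + zscore(-df["pe"])) / 3
    return growth, value

def classify(df, top=1/3):
    growth, value = composite_scores(df)
    # Independent sort: top third by each score; overlap is allowed.
    g_ind = growth >= growth.quantile(1 - top)
    v_ind = value >= value.quantile(1 - top)
    # Ratio sort: one combined metric, growth ranks divided by value ranks.
    ratio = growth.rank() / value.rank()
    g_ratio = ratio >= ratio.quantile(1 - top)
    v_ratio = ratio <= ratio.quantile(top)
    return g_ind, v_ind, g_ratio, v_ratio
```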

The chart below shows the classification when we take an independent approach to selecting growth and value companies based on those in the top third of the ranks.

2D Sort Growth and Value

In each class, 87% of the companies were identified as only being growth or value while 13% of companies were included in both growth and value.

The next chart shows the classifications when we use the ratio of growth to value ranks as a composite score and again select the top third.

1D Sort Growth and Value

Relative to what we saw previously, growth and value now extend further into the non-value (expensive) and non-growth (cheap) realms of the graph, respectively.

There is also no overlap between the two categories, but we are now missing 16% of the companies that we had identified as good growth or value candidates before. On the flip side, 16% of the companies we now include were not identified as growth or value previously in our independent sort.

If we trust our independent growth and value ranking methodologies, the combined growth and value metric leaves out over a third of the companies that were classified as both growth and value. These companies did not appear in either index under the combined scoring scheme.

With the level of diversification in some of these indices, a few companies may not make or break the performance, but leaving out the top ones defeats the purpose of our initial ranking system. As with the NCAA March Madness tournament (won by Corey with a second place finish by Justin), having a high seed may not guarantee superior performance, but it is often a good predictor (since 1979, the champion has only been lower than a 3 seed 5 times).

Based on this analysis, we can borrow the final warning to buyers from the previous commentary:

“when you’re buying value and growth products tracking any of these indices, you’re probably not getting what you expect – or likely want.”

… and say that the words “probably” and “likely” are definitely an understatement for those seeking the best growth and value companies based on this ranking.

