Fragility Case Study: Dual Momentum GEM

By Corey Hoffstein

On January 14, 2019

In Craftsmanship, Momentum, Popular, Portfolio Construction, Risk Management, Trend

Summary

Recent market volatility has caused many tactical models to make sudden and significant changes in their allocation profiles.
Periods such as Q4 2018 highlight model specification risk: the sensitivity of a strategy’s performance to specific implementation decisions.
We explore this idea with a case study, using the popular Dual Momentum GEM strategy and a variety of lookback horizons for portfolio formation.
We demonstrate that the year-to-year performance difference can span hundreds, if not thousands, of basis points between the implementations.
By simply diversifying across multiple implementations, we can dramatically reduce model specification risk and even potentially see improvements in realized metrics such as Sharpe ratio and maximum drawdown.

Introduction

Among do-it-yourself tactical investors, Gary Antonacci’s Dual Momentum is the strategy we tend to see implemented the most. The Dual Momentum approach is simple: by combining both relative momentum and absolute momentum (i.e. trend following), Dual Momentum seeks to rotate into areas of relative strength while preserving the flexibility to shift entirely to safety assets (e.g. short-term U.S. Treasury bills) during periods of pervasive, negative trends.

In our experience, the precise implementation of Dual Momentum tends to vary (with various bells-and-whistles applied) from practitioner to practitioner. The most popular benchmark model, however, is the Global Equities Momentum (“GEM”), with some variation of Dual Momentum Sector Rotation (“DMSR”) a close second.

Recently, we’ve spoken to several members in our extended community who have bemoaned the fact that Dual Momentum kept them mostly aggressively positioned in Q4 2018 and signaled a defensive shift at the beginning of January 2019, at which point the S&P 500 was already in a -14% drawdown (having reached a drawdown of over -19% on December 24th). Several DIYers even decided to override their signal in some capacity, either ignoring it entirely, waiting a few days for “confirmation,” or implementing some sort of “half-and-half” rule where they take a partially defensive stance.

Ignoring the fact that a decision to override a systematic model somewhat defeats the whole point of being systematic in the first place, this sort of behavior highlights another very important truth: there is a significant gap of risk that exists between the long-term supporting evidence of an investment style (e.g. momentum and trend) and the precise strategy with which we attempt to implement it (e.g. Dual Momentum GEM).

At Newfound, we call that gap model specification risk. There is significant evidence supporting both momentum and trend as quantitative styles, but the precise means by which we measure these concepts can lead to dramatically different portfolios and outcomes. When a portfolio’s returns are highly sensitive to its specification – i.e. slight variations in returns or model parameters lead to dramatically different return profiles – we label the strategy as fragile.

In this brief commentary, we will use the Global Equities Momentum (“GEM”) strategy as a case study in fragility.

Global Equities Momentum (“GEM”)

To implement the GEM strategy, an investor merely needs to follow the decision tree below at the end of each month.
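As a rough sketch of that monthly decision, assuming the rules as GEM is commonly described (U.S. equities versus international equities on relative momentum, Treasury bills as the absolute momentum hurdle, aggregate bonds as the safety asset), the logic might look like the following. The function names, asset labels, and the `lookback` parameter are our own illustrative choices and not an official specification.

```python
def trailing_return(prices, lookback):
    """Total return over the last `lookback` months of month-end prices."""
    if len(prices) < lookback + 1:
        raise ValueError("not enough price history for this lookback")
    return prices[-1] / prices[-1 - lookback] - 1.0


def gem_signal(us_ret, intl_ret, tbill_ret):
    """Pick next month's holding from trailing returns (assumed GEM rules)."""
    # Absolute momentum: U.S. equities must beat T-bills to stay in equities.
    if us_ret > tbill_ret:
        # Relative momentum: hold whichever equity region has been stronger.
        return "US_EQUITY" if us_ret >= intl_ret else "INTL_EQUITY"
    # Otherwise move to the safety asset.
    return "AGG_BONDS"


def gem_holding(us_prices, intl_prices, tbill_prices, lookback=12):
    """Apply the decision at month-end using a given lookback in months."""
    return gem_signal(
        trailing_return(us_prices, lookback),
        trailing_return(intl_prices, lookback),
        trailing_return(tbill_prices, lookback),
    )
```

Note that the only thing separating one specification from another in this sketch is the value passed as `lookback`.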

From a practitioner standpoint, there are several attractive features about this model. First, it is based upon the long-run evidence of both trend-following and momentum. Second, it is very easy to model and generate signals for. Finally, it is fairly light-weight from an implementation perspective: only twelve potential rebalances a year (and often far fewer), with the portfolio only holding one ETF at a time.

Despite the evidence that “simple beats complex,” the simplicity of GEM belies its inherent fragility. Below we plot the equity curves for GEM implementations that employ different lookback horizons for measuring trend and momentum, ranging from 6- to 12-months.

Source: CSI Analytics. Calculations by Newfound Research. Returns are backtested and hypothetical. Returns assume the reinvestment of all distributions. Returns are gross of all fees except for underlying ETF expense ratios. None of the strategies shown reflect any portfolio managed by Newfound Research and were constructed solely for demonstration purposes within this commentary. You cannot invest in an index.

We can see a significant dispersion in potential terminal wealth. That dispersion, however, is not necessarily consistent with the notion that one formation period is inherently better than another. While we would argue, ex-ante, that there should be little performance difference between a 9-month and 10-month lookback – they both, after all, capture the notion of “intermediate-term trends” – the former returned just 43.1% over the period while the latter returned 146.1%.

These total return figures further hide the year-to-year disparity that exists. The 9-month model, for example, was not a consistent loser. Below we plot these results, highlighting both the best (blue) and worst (orange) performing specifications. We see that the yearly spread between these strategies can be hundreds-to-thousands of basis points; consider that in 2010, the strategy formed using a 10-month lookback returned 12.2% while the strategy formed using a 9-month lookback returned -9.31%.

Same thesis. Same strategy. Slightly different specification. Dramatically different outcomes. That single year is likely the difference between hired and fired for most advisors and asset managers.

For those bemoaning their 2018 return, note that the 10-month specification would have netted a positive result! That specification turned defensive at the end of October.

Now, some may cry “foul” here. The evidence for trend and momentum is, after all, centuries in length and the efficacy of all these horizons is supported. Surely the noise we see over this ten-year period would average out over the long run, right?

The unfortunate reality is that these performance differences are not expected to mean-revert. The gambler’s fallacy would have us believe that bad luck in one year should be offset by good luck in another and vice versa. Unfortunately, this is not the case. While we would expect, at any given point in time, that each strategy has equal likelihood of experiencing good or bad luck going forward, that luck is expected to occur completely independently from what has happened in the past.

The implication is that performance differences due to model specification are not expected to mean-revert and are therefore expected to be random, but very permanent, return artifacts.¹
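To make the point concrete, here is a toy simulation. It assumes, purely for illustration, that the yearly return gap between two specifications is an independent, zero-mean normal draw with a 5% standard deviation; none of these numbers come from the GEM backtests above. Under that assumption, the spread of the cumulative gap grows with the horizon rather than shrinking back toward zero.

```python
import random
import statistics


def cumulative_gap_spread(n_years, n_trials, annual_noise=0.05, seed=0):
    """Std. dev. across trials of the summed yearly return gap between two
    specifications whose yearly gaps are independent, zero-mean draws."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        # Each year's luck is drawn independently of all previous years.
        gaps.append(sum(rng.gauss(0.0, annual_noise) for _ in range(n_years)))
    return statistics.pstdev(gaps)


spread_1y = cumulative_gap_spread(n_years=1, n_trials=5000)
spread_10y = cumulative_gap_spread(n_years=10, n_trials=5000)
print(f"1-year spread:  {spread_1y:.3f}")
print(f"10-year spread: {spread_10y:.3f}")
```

With independent draws, the ten-year spread should come out at roughly the square root of ten times the one-year spread: the luck accumulates instead of washing out.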

The larger problem at hand is that none of us have a hundred years to invest. In reality, most investors have a few decades. And we act with the temperament of having just a few years. Therefore, bad luck can have very permanent and very scarring effects not only upon our psyche, but upon our realized wealth.

But consider what happens if we try to neutralize the role of model specification risk and luck by diversifying across the seven different models equally (rebalanced annually). We see returns closer in line with the median result, a boost to realized Sharpe ratio, and a reduction in the maximum realized drawdown.
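A minimal sketch of this kind of blend, assuming we already have a monthly return series for each specification, is shown here. The function name and the toy input data are hypothetical; the mechanics (equal weights, reset once every twelve months, allowed to drift in between) are our reading of "equally, rebalanced annually."

```python
def blend_annual_rebalance(model_returns, periods_per_year=12):
    """Blend several return streams with equal weights that are reset once
    per year and allowed to drift in between.

    model_returns: list of equal-length lists of periodic returns, one per model.
    Returns the list of blended portfolio returns, one per period.
    """
    n_models = len(model_returns)
    n_periods = len(model_returns[0])
    values = [1.0 / n_models] * n_models  # dollar value held in each model
    blended = []
    for t in range(n_periods):
        if t % periods_per_year == 0:
            # Annual rebalance: split total wealth equally across models.
            total = sum(values)
            values = [total / n_models] * n_models
        start = sum(values)
        values = [v * (1.0 + model_returns[i][t]) for i, v in enumerate(values)]
        blended.append(sum(values) / start - 1.0)
    return blended


# Toy data: two made-up models over two periods.
toy = [[1.0, 1.0], [0.0, 0.0]]
print(blend_annual_rebalance(toy, periods_per_year=12))  # weights drift
print(blend_annual_rebalance(toy, periods_per_year=1))   # reset every period
```

In practice `model_returns` would hold one series per lookback specification; the toy inputs are only there to show the drift between rebalances.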

These are impressive results given that all we employed was naïve diversification.

Conclusion

The odd thing about strategy diversification is that it guarantees we will be wrong. Each and every year, we will, by definition, allocate at least part of our capital to the worst performing strategy. The potential edge, however, is in being vaguely wrong rather than precisely wrong. The former is annoying. The latter can be catastrophic.

In this commentary we use the popular Dual Momentum GEM strategy as a case study to demonstrate how model specification choices can lead to performance differences that span hundreds, if not thousands, of basis points a year. Unfortunately, we should not expect these performance differences to mean revert. The realizations of good and bad luck are permanent, and potentially very significant, artifacts within our track records.

By simply diversifying across the different models, however, we can dramatically reduce specification risk and thereby reduce strategy fragility.

To be clear, no amount of diversification will protect you from the risk of the style. As we like to say, “risk cannot be destroyed, only transformed.” In that vein, trend following strategies will always incur some sort of whipsaw risk. The question is whether it is whipsaw related to the style as a whole or to the specific implementation.

For example, in the graphs above we can see that Dual Momentum GEM implemented with a 10-month formation period experienced whipsaw in 2011 when few of the other implementations did. This is more specification whipsaw than style whipsaw. On the other hand, we can see that almost all the specifications exhibited whipsaw in late 2015 and early 2016, an indication of style whipsaw, not specification whipsaw.

Specification risk we can attempt to control for; style risk is just something we have to bear.

At Newfound, evidence such as this informs our own trend-following mandates. We seek to diversify ourselves across the axes of what (“what are we investing in?”), how (“how are we making the decisions?”), and when (“when are we making those decisions?”) in an effort to reduce specification risk and provide the greatest style consistency possible.

Is Multi-Manager Diversification Worth It?

By Corey Hoffstein

On January 7, 2019

Summary

Portfolio risk is traditionally quantified by volatility. The benefits of diversification are measured in how portfolio volatility is changed with the addition or subtraction of different investments.
Another measure of portfolio risk is the dispersion in terminal wealth: a measure that attempts to capture the potential difference in realized returns. For example, two equity managers that each hold 30 stock portfolios may exhibit similar volatility levels but will likely have very different realized results.
In this commentary we explore existing literature covering the potential diversification benefits that can arise from combining multiple managers together.
Empirical evidence suggests that in heterogeneous categories (e.g. many hedge fund styles), combining managers can reduce portfolio volatility. Yet even in homogenous categories (e.g. equity style boxes), combining managers can have a pronounced effect on reducing the dispersion in terminal wealth.
Finally, we address the question as to whether manager diversification leads to dilution, arguing that a combination of managers will reduce idiosyncratic process risks but maintain overall style exposure.

Introduction

In their 2014 paper The Free Lunch Effect: The Value of Decoupling Diversification and Risk, Croce, Guinn, and Robinson draw a distinction between the risk reduction effects that occur due to de-risking and those that occur due to diversification benefits.

To illustrate the distinction, the authors compare the volatility of an all equity portfolio versus a balanced stock/bond mix. In the 1984-2014 sample period, they find that the all equity portfolio has an annualized volatility of 15.25% while the balanced portfolio has an annualized volatility of just 9.56%.

Over 75% of this reduction in volatility, however, is due simply to the fact that bonds were much less volatile than stocks over the period. In fact, of the 568-basis-point reduction, only 124 basis points were due to actual diversification benefits.
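The arithmetic behind the "over 75%" figure can be checked directly from the two numbers quoted above. The variable names are ours.

```python
total_reduction_bps = 568    # total volatility reduction cited above
diversification_bps = 124    # portion attributed to diversification

# Whatever is not diversification is de-risking.
derisking_bps = total_reduction_bps - diversification_bps
derisking_share = derisking_bps / total_reduction_bps
diversification_share = diversification_bps / total_reduction_bps

print(f"De-risking:      {derisking_bps} bps ({derisking_share:.1%})")
print(f"Diversification: {diversification_bps} bps ({diversification_share:.1%})")
# De-risking:      444 bps (78.2%)
# Diversification: 124 bps (21.8%)
```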

Why does this matter?

Because de-risking carries none of the benefits of diversification. If there is a commensurate trade-off between expected return and risk, then all we have done is reduced the expected return of our portfolio.¹

Only by combining assets of like volatility (and, it is assumed, like expected return) can we enjoy the free lunch of diversification.

Unfortunately, unless you are willing to apply leverage (e.g. risk parity), the reality of finding such free lunch opportunities across assets is limited. The classic example of inter-asset diversification, though, is taught in Finance 101: as we add more stocks to a portfolio, we drive the contribution of idiosyncratic volatility towards zero.
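The Finance 101 result can be sketched in a few lines under simplifying assumptions that are ours, not the paper's: an equal-weight portfolio, every stock with a market beta of one, and idiosyncratic returns that are uncorrelated with equal volatility. The 15% and 30% inputs are illustrative only.

```python
import math


def portfolio_volatility(n_stocks, market_vol, idio_vol):
    """Volatility of an equal-weight portfolio of n_stocks, assuming each
    stock has beta 1 and uncorrelated idiosyncratic risk of equal size."""
    # Market variance does not diversify; idiosyncratic variance shrinks as 1/N.
    return math.sqrt(market_vol ** 2 + idio_vol ** 2 / n_stocks)


for n in (1, 5, 10, 30, 100):
    vol = portfolio_volatility(n, market_vol=0.15, idio_vol=0.30)
    print(f"{n:>3} stocks: {vol:.2%}")
```

As the number of stocks grows, the result approaches the market volatility input, which is the floor that diversification across stocks cannot remove.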

Yet volatility is only one way to measure risk. If we build a portfolio of 30 stocks and you build a portfolio of 30 stocks, the portfolios may have nearly identical levels of volatility, but we almost assuredly will end up with different realized results. This difference between the expected and the realized is captured by a measure known as terminal wealth dispersion, first introduced by Robert Radcliffe in his book Investment: Concepts, Analysis, Strategy.

This form of risk naturally arises when we select between investment managers. Two managers may both select securities from the same universe using the same investment thesis, but the realized results of their portfolios can be starkly different. In rare cases, the specific choice of one manager over another can even lead to catastrophic results.

The selection of a manager reflects not only an allocation to an asset class, but also reflects an allocation to a process. In this commentary, we ask: how much diversification benefit exists in process diversification?

The Theory Behind Manager Diversification

In Factors from Scratch, the research team at O’Shaughnessy Asset Management (OSAM), in partnership with anonymous blogger Jesse Livermore, digs into the driving elements behind value and momentum equity strategies.

They find that value stocks do tend to exhibit negative EPS growth, but this decay in fundamentals is offset by multiple expansion. In other words, markets do appear to correctly identify companies with contracting fundamentals, but they also exaggerate and over-extrapolate that weakness. The historical edge for the strategy has been that the re-rating – measured via multiple expansion – tends to overcompensate for the contraction in fundamentals.

For momentum, OSAM finds a somewhat opposite effect. The strategy correctly identifies companies with strengthening fundamentals, but during the holding period a valuation contraction occurs as the market recognizes that its outlook might have been too optimistic. Historically, however, the growth outweighed the contraction to create a net positive effect.

These are the true, underlying economic and behavioral effects that managers are trying to capture when they implement value and momentum strategies.

These are not, however, effects we can observe directly in the market; they are effects that we have to forecast. To do so, we have to utilize semi-noisy signals that we believe are correlated. Therefore, every manager’s strategy will be somewhat inefficient at capturing these effects.

For example, there are a number of quantitative measures we may apply in our attempt to identify value opportunities; e.g. price-to-book, price-to-earnings, and EBITDA-to-enterprise-value to name a few. Two different noisy signals might end up with different performance just due to randomness.

This noise between signals is further compounded when we consider all the other decisions that must be made in the portfolio construction process. Two managers may use the same signals and still end up with very different portfolios based upon how the signals are translated into allocations.

Consider this: Morningstar currently² lists 1,217 large-cap value funds in its mutual fund universe and trailing 1-year returns ranged from 1.91% to -22.90%. This is not just a case of extreme outliers, either: the spread between the 10th and 90th percentile returning funds was 871 basis points.

It bears repeating that these are funds that, in theory, are all trying to achieve the same goal: large-cap value exposure.

Yet this result is not wholly surprising to us. In Separating Ingredients and Recipe in Factor Investing we demonstrated that the performance dispersion between different momentum strategy definitions (e.g. momentum measure, look-back length, rebalance frequency, weighting scheme, et cetera) was larger than the performance dispersion between the traditional Fama-French factors themselves in 90% of rolling 1-year periods. As it turns out, intra-factor differences can cause greater dispersion than inter-factor differences.

Without an ex-ante view as to the superiority of one signal, one process, or one fund versus another, it seems prudent for a portfolio to have diversified exposure to a broad range of signals that seem plausibly related to the underlying phenomenon.

Literature Review

While foundational literature on modern portfolio diversification extends back to the 1950s, little has been written in the field of manager diversification. While it is a well-established teaching that a portfolio of 25-40 stocks is typically sufficient to reduce idiosyncratic risk, there is no matching rule for how many managers to combine together.

One of the earliest articles on the topic was written by Edward O’Neal in 1997, titled How Many Mutual Funds Constitute a Diversified Mutual Fund Portfolio?

Published in the Financial Analysts Journal, this article explores risk across two different dimensions: the volatility of returns over time and the dispersion in terminal period wealth. Again, the idea behind the latter measure is that two investors with identical horizons and different investments will achieve different terminal wealth levels, even if those investments have the same volatility.

Exploring equity mutual fund returns from 1986 to 1997, the study adopts a simulation-based approach to constructing portfolios and tracking returns. Multi-manager portfolios of varying sizes are randomly constructed and compared against other multi-manager portfolios of the same size.
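The flavor of this kind of simulation can be sketched as follows. This is not O’Neal’s code or data: the fund returns are synthetic (a shared market return plus a small fund-specific noise term), and every parameter and function name is a hypothetical choice made for illustration.

```python
import random
import statistics


def simulate_fund_returns(n_funds, n_months, rng):
    """Synthetic monthly returns: shared market return plus fund-specific noise."""
    market = [rng.gauss(0.008, 0.04) for _ in range(n_months)]
    return [[m + rng.gauss(0.0, 0.01) for m in market] for _ in range(n_funds)]


def portfolio_stats(funds, picks):
    """Time-series volatility and terminal wealth of an equal-weight blend
    (rebalanced monthly) of the funds indexed by `picks`."""
    n_months = len(funds[0])
    monthly = [sum(funds[i][t] for i in picks) / len(picks) for t in range(n_months)]
    wealth = 1.0
    for r in monthly:
        wealth *= 1.0 + r
    return statistics.pstdev(monthly), wealth


def dispersion_by_size(funds, sizes, n_trials, rng):
    """For each portfolio size, return (average volatility, std. dev. of
    terminal wealth) across randomly drawn multi-fund portfolios."""
    out = {}
    for k in sizes:
        stats = [portfolio_stats(funds, rng.sample(range(len(funds)), k))
                 for _ in range(n_trials)]
        vols = [s[0] for s in stats]
        wealths = [s[1] for s in stats]
        out[k] = (statistics.mean(vols), statistics.pstdev(wealths))
    return out


rng = random.Random(0)
funds = simulate_fund_returns(n_funds=100, n_months=120, rng=rng)
results = dispersion_by_size(funds, sizes=[1, 3, 10], n_trials=300, rng=rng)
for k, (avg_vol, wealth_sd) in results.items():
    print(f"{k:>2} funds: avg monthly vol {avg_vol:.4f}, terminal wealth sd {wealth_sd:.4f}")
```

Because the synthetic funds share most of their return, average volatility should barely move with portfolio size, while the spread in terminal wealth should shrink as funds are added.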

O’Neal finds that while combining managers has little-to-no effect on volatility (manager returns were too homogenous), it had a significant effect upon the dispersion of terminal wealth. To quote the article,

Holding more than a single mutual fund in a portfolio appears to have substantial diversification benefits. The traditional measure of volatility, the time-series standard deviation, is not greatly influenced by holding multiple funds. Measures of the dispersion in terminal-wealth levels, however, which are arguably more important to long-term investors than time-series risk measures, can be reduced significantly. The greatest portion of the reduction occurs with the addition of small numbers of funds. This reduction in terminal-period wealth dispersion is evident for all holding periods studied. Two out of three downside risk measures are also substantially reduced by including multiple funds in a portfolio. These findings are especially important for investors who use mutual funds to fund fixed-horizon investment goals, such as retirement and college savings.

Allocating to three managers instead of just one could reduce the dispersion in terminal wealth by nearly 50%, an effect found to be quite consistent across the different time horizons measured.

In 1999, O’Neal teamed up with L. Franklin Fant to publish Do You Need More than One Manager for a Given Equity Style? Adopting a similar simulation-based approach, Fant and O’Neal explored multi-manager equity portfolios in the context of the style-box framework.

And, as before, they find that taking a multi-manager approach has little effect upon portfolio volatility.

It does, however, again prove to have a significant effect on the deviation in terminal wealth.

To quote the paper,

Regardless of the style category considered, the variability in terminal wealth levels is significantly reduced by using more managers. The first few additional managers make the most difference, as terminal wealth standard deviation declines at a decreasing rate with the number of managers. Concentrating on the variability of periodic portfolio returns fails to document the advantage of using multiple managers within style categories.
Second, some categories benefit more from additional managers than others. Plan sponsors would do well to allocate relatively more managers to the styles that display the greatest diversification benefits. Growth styles and small-cap styles appear to offer the greatest potential for diversification.

In 2002, François-Serge Lhabitant and Michelle Learned pursued a similar vein of research in the realm of hedge funds in their article Hedge Fund Diversification: How Much is Enough? They employ the same simulation-based approach but evaluate diversification effects within the different hedge fund styles.

They find that while diversification does little to affect the expected return for a given style, it does appear to help reduce portfolio volatility: sometimes quite significantly so. This somewhat contradictory result to the prior research is likely due to the fact that hedge funds within a given category exhibit far more heterogeneity in process and returns than do equity managers in the same style box.

(Note that while the graphs below only show the period 1990-1993, the paper explores three time periods: 1990-1993, 1994-1997, and 1998-2001 and finds a similar conclusion in all three).

Perhaps most importantly, however, they find a rather significant reduction in risk characteristics like a portfolio’s realized maximum drawdown.

To quote the article,

We find that naively adding more funds to a portfolio tends to leave returns stable, decrease the standard deviation, and reduce downside risk. Thus, diversification should be increased as long as the marginal benefits of adding a new asset to a portfolio exceeds the marginal cost.
…
If a sample of managers is relatively style pure, then a fewer number of managers will minimize the unsystematic risk of that style. On the contrary, if the sample is really heterogeneous, increasing the number of managers may still provide important diversification benefits.

Taken together, this literature paints an important picture:

Diversifying across managers in the same category will likely do little to reduce portfolio volatility, except in the cases where categories are broad enough to capture many heterogeneous managers.
Diversifying across managers appears to significantly reduce the potential dispersion in terminal wealth.

But why is minimizing “the dispersion of terminal wealth” important? The answer is the same reason why we diversify in the first place: risk management.

The potential for high dispersion in terminal wealth means that we can have dramatically different outcomes based upon the choices we are making, placing significant emphasis on our skill in manager selection. Choosing just one manager is “more right” style thinking rather than our preferred “less wrong.”

But What About Dilution?

The number one response we hear when we talk about manager diversification is: “when we combine managers, won’t we just dilute our exposure back to the market?”

The answer, as with all things, is: “it depends.” For the sake of brevity, we’re just going to leave it with, “no.”

No?

No.

If we identify three managers as providing exposure to value, then it makes little logical sense that somehow a combination of them would suddenly remove that exposure. Subtraction through addition only works if there is a negative involved; i.e. one of the managers would have to provide anti-value exposure to offset the others.

Remember that an active manager’s portfolio can always be decomposed into two pieces: the benchmark and a dollar-neutral long/short portfolio that isolates the active over/under-weights that manager has made.

To “dilute back to the benchmark,” we’d have to identify managers and then weight them such that all of their over/under-weights net out to equal zero.

Candidly, we’d be impressed if you managed to do that. Especially if you combine managers within the same style who should all be, at least directionally, taking similar bets. The dilution that occurs is only across those bets which they disagree on and therefore reflect the idiosyncrasies of their specific process.
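A toy illustration of this decomposition, using a made-up four-stock benchmark and two hypothetical managers who share one bet (overweight A, underweight D) and disagree on another (B versus C):

```python
def active_weights(portfolio, benchmark):
    """Dollar-neutral over/under-weights of a portfolio versus its benchmark."""
    names = set(portfolio) | set(benchmark)
    return {n: portfolio.get(n, 0.0) - benchmark.get(n, 0.0) for n in names}


def blend(portfolios):
    """Equal-weight combination of several managers' portfolios."""
    names = set().union(*portfolios)
    return {n: sum(p.get(n, 0.0) for p in portfolios) / len(portfolios) for n in names}


# All weights below are invented for illustration.
benchmark = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
manager_1 = {"A": 0.40, "B": 0.30, "C": 0.20, "D": 0.10}
manager_2 = {"A": 0.40, "B": 0.20, "C": 0.30, "D": 0.10}

combined_active = active_weights(blend([manager_1, manager_2]), benchmark)
for name in sorted(combined_active):
    print(f"{name}: {combined_active[name]:+.2f}")
```

The shared overweight to A and underweight to D survive the blend in full, while the opposing B and C bets net to zero. Only the disagreements are diluted.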

What a multi-manager implementation allows us to diversify is our selection risk, leading to a return profile more “in-line” with a given style or category. In fact, Lhabitant and Learned (2002) demonstrated this exact notion with a graph that plots the correlation of multi-manager portfolios with their broad category. While somewhat tautological, an increase in manager diversification leads to a return profile closer to the given style than to the idiosyncrasies of those managers.

We can also see this with a practical example. Below we take several available ETFs that implement quantitative value strategies and plot their rolling 52-week return relative to the S&P 500. We also construct a multi-manager index (“MM_IDX”) that is a naïve, equal-weight portfolio. The only wrinkle to this portfolio is that ETFs are not introduced immediately, but rather slowly over a 12-month period.³

Source: CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Returns are total returns (i.e. assume the reinvestment of all distributions) and are gross of all fees except for underlying expense ratios of ETFs. Past performance does not guarantee future results.

We can see that while the multi-manager blend is never the best performing strategy, it is also never the worst. Never the hero; never a zero.

It should be noted that while manager diversification may be able to reduce the idiosyncratic returns that result from process differences, it will not prevent losses (or relative underperformance) of the underlying style itself. In other words, we might avoid the full brunt of losses specific to the Sequoia Fund, but no amount of diversification would prevent the relative drag seen by the quantitative value style in general over the last decade.

We can see this in the graph above by the fact that all the lines generally tend to move together. 2015 was bad for value managers. 2016 was much better. But we can also see that every once in a while, a specific implementation will hit a rough patch that is idiosyncratic to that approach; e.g. IWD in 2017 and most of 2018.

Multi-manager diversification is the tool that allows us to avoid the full brunt of this risk.

Conclusion

Taken together, the research behind manager diversification suggests:

In heterogeneous categories (e.g. many hedge fund styles), manager diversification may reduce portfolio volatility.
In more homogenous categories (e.g. equity style boxes), manager diversification may reduce the dispersion in terminal wealth.
Multi-manager implementations appear to reduce realized portfolio risk metrics such as maximum drawdown. This is likely partially due to the reduction in portfolio volatility, but also due to a reduction in exposure to funds that exhibit catastrophic losses.
Multi-manager implementations do not necessarily “dilute” the portfolio back to market exposure, but rather “dilute” the portfolio back to the style exposure, reducing exposure to idiosyncratic process risk.

For advisors and investors, this evidence may cause a sigh of relief. Instead of having to spend time trying to identify the best manager or the best process, there may be significant advantages to simply “avoiding the brain damage”⁴ and allocating equally among a few. In other words, if you don’t know which low-volatility ETF to pick, just buy a couple and move on with your life.

But what are the cons?

A multi-manager approach may be tax inefficient, as we will need to rebalance allocations back to parity between the exposures.
A multi-manager approach may lead to fund bloat within a portfolio, doubling or tripling the number of holdings we have. While this is merely optical, except possibly in small portfolios, we recognize there exists an aversion to it.
By definition, performance will be middling: the cost of avoiding the full brunt of losers is that we also give up the full benefit of winners. We’re reluctant to label this as a con, as it is arguably the whole point of diversification, but it is worth pointing out that the same behavioral biases that emerge in portfolio reviews of asset allocation will likely re-emerge in reviews of manager selection, especially over short time horizons.

For investment managers, a natural interpretation of this research is that approaches blending different signals and portfolio construction methods together should lead to more consistent outcomes. It should be no surprise, then, that asset managers adopting machine learning are finding significant advantages with ensemble techniques. After all, they invoke the low-hanging fruit of manager diversification.

Perhaps most interesting is that this research suggests that fund-of-funds really are not such bad ideas so long as costs can be kept under control. As the asset management business continues to be more competitive, perhaps there is an edge – and a better client result – found in cooperation.

Dart-Throwing Monkeys and Process Diversification

By Corey Hoffstein

On December 24, 2018

In Portfolio Construction, Risk Management, Weekly Commentary

Summary

This week’s commentary is a short addendum to last week’s piece, attempting to serve as a (very) brief and simplified summary of process diversification.
Volatility is only one way of measuring risk; dispersion in terminal wealth is another.
Using simulations of dart-throwing monkeys, we plot the dispersion in terminal wealth for different levels of portfolio and manager diversification.
We find that increased diversification within a portfolio as well as increased diversification across managers can lead to more consistent portfolio outcomes.

Introduction

In last week’s commentary (What do portfolios and teacups have in common?), we explored at great length the potential benefits of diversification in the domains of what, how, and when.

The crux of our argument is that for investors, return dispersions across time (i.e. “volatility”) can be a potentially misleading risk characteristic and that it is important to consider the potential dispersion in terminal wealth as well.

These are by no means original or unique thoughts. Often the advisors and institutions we work with intuitively understand them: they just have not been presented with the math to justify them.

Therefore, in contrast to last week’s rather expansive note, we aim to keep this week’s note short, simple, and punchy in an effort to drive home how manager / process diversification can help deliver more consistent outcomes.

Dart-Throwing Monkeys

Consider the following experiment.

We begin with thousands and thousands of dart-throwing monkeys. Every month, the monkeys throw their darts at a board that determines how they will be invested for the next month. In this hypothetical scenario, we will assume that the monkeys are investing in different industry groups.¹

Some monkeys are “concentrated managers,” throwing just a single dart and holding that pick for the next month. Other monkeys are more diversified, throwing up to 30 darts each month and equally allocating their portfolio across their investments. Portfolio sizes can be either 1, 5, 10, 15, 20, 25, or 30 equally-allocated investments.

It is our job, as an allocator, to choose different monkeys to invest with. Do we invest with just 1 concentrated monkey manager? Five different diversified managers? How much difference does it really make at the end of the day?

We learn in Finance 101 that once we diversify our portfolio sufficiently, we have eliminated nonsystematic risk. But does that mean we expect the portfolios to necessarily end up in the same place?

As an example, if we pick 10 dart-throwing monkeys who each pick 10 investments per month, how different would we expect our final wealth level to be from another allocator who picks 10 different dart-throwing monkeys who each pick 10 investments per month?
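The experiment above can be sketched in a few lines. This is a minimal illustration rather than the code behind the study: it assumes synthetic, normally distributed monthly returns in place of the industry group data, assumes the allocator rebalances equally across its monkeys every month, and measures dispersion as the standard deviation of log terminal wealth. The function name and parameters are hypothetical.

```python
import numpy as np

def terminal_wealth_dispersion(returns, n_managers, n_picks, n_trials=200, seed=0):
    """Std. dev. of log terminal wealth across randomly formed allocators.

    returns: (n_months, n_assets) array of simple monthly returns.
    Each allocator hires n_managers monkeys; every month each monkey picks
    n_picks distinct assets at random and equal-weights them.
    """
    rng = np.random.default_rng(seed)
    n_months, n_assets = returns.shape
    log_wealth = np.empty(n_trials)
    for trial in range(n_trials):
        total = 0.0
        for month in range(n_months):
            # Each manager's return is the average of its random picks.
            manager_returns = [
                returns[month, rng.choice(n_assets, size=n_picks, replace=False)].mean()
                for _ in range(n_managers)
            ]
            # Assumption: the allocator holds its managers in equal weight.
            total += np.log1p(np.mean(manager_returns))
        log_wealth[trial] = total
    return log_wealth.std()

# Synthetic stand-in for 30 industry groups over 5 years (an assumption).
fake_returns = np.random.default_rng(42).normal(0.008, 0.05, size=(60, 30))
for managers, picks in [(1, 1), (1, 10), (10, 1), (10, 10)]:
    spread = terminal_wealth_dispersion(fake_returns, managers, picks, n_trials=100)
    print(f"{managers:>2} managers x {picks:>2} picks: dispersion {spread:.3f}")
```

Comparing the first and last rows is the comparison posed in the question above: two allocators making the same type of decision, but different specific choices.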

Process Diversification and Terminal Wealth Dispersion

Below we plot the dispersion in terminal wealth² as a function of (1) the number of securities picked by each monkey manager and (2) the number of monkey managers we allocate to.

As an example of how to read this graph, the orange line tells us about portfolios comprised of monkey managers who pick five investments each. As we move from left to right, we learn about the dispersion in terminal wealth based upon the number of managers we allocate to.

We can think of this two ways. First, we can think of it as potential dispersion in results among our peers who make the same type of decision (e.g. picking 5 managers who pick 5 investments each) but different specific choices (e.g. might pick different managers). Second, we can think of this as the dispersion in possible results if we were able to live across infinite universes simultaneously.

Source: Kenneth French Data Library. Calculations by Newfound Research.

Unfortunately, we cannot live across infinite universes and this graph tells us that choosing a single, highly concentrated manager can lead to wildly different outcomes depending upon the manager we select.

As the managers further diversify and we further diversify among managers, this dispersion in potential outcomes decreases.³

Conclusion

The intuition behind these results is simple:

More diversified managers are more likely to overlap in portfolio holdings with one another, and therefore are likely to have more similar returns.
Similarly, as the number of managers we choose goes up, so does the likelihood of overlap in holdings with a peer who also selects the same number of managers.

It is equally valid to interpret this analysis as saying there is greater opportunity for out-performance in taking concentrated bets in highly concentrated managers. We would argue this is more right thinking: the win condition requires both that we pick the right managers and the managers pick the right stocks. While a little bit of diversification can go a long way here in clipping outlier events, the dispersion can still far exceed a more diversified approach.

At Newfound, we prefer the less wrong approach. Allocations to a few diversified managers each taking a different approach can lead to significantly less dispersion in outcomes and, therefore, allow for better financial planning.

What do portfolios and teacups have in common?

By Corey Hoffstein

On December 17, 2018

In Portfolio Construction, Risk Management, Weekly Commentary

This post is available as a PDF download here.

Summary

Portfolio risk is often measured as the variance of returns over time. Another form of risk is the variance of terminal wealth that can arise from small variations in strategy inputs or asset returns.
Strategies or portfolios that are more sensitive to small changes in inputs are inherently “fragile.”
Fragile strategy design makes it difficult to rely upon backtests or historical results in setting forward expectations.
We explore how diversification across the “what,” “how,” and “when,” axes of portfolio construction can help reduce strategy fragility.

Introduction

At Newfound, we spend a lot less time trying to figure out how to be more right than we spend trying to figure out how to be less wrong. One area of particular interest for us is the idea of unintended bets: the exposures in a portfolio we may not even be aware of. And if we knew we had the exposure, we might not even want it.

For example, consider a portfolio that invests in either broad U.S., broad international, or broad emerging market equities based upon valuations. A significant tilt towards non-U.S. assets may be a valuation-driven decision, but for U.S. investors it creates significant exposure to fluctuations in the U.S. dollar versus foreign currencies.

Of course, exposures are not limited only to assets. Exposures may be broader macro-economic, stylistic, thematic, geographic, or even political factors.

These unintended bets can go far beyond explicit and implicit exposures. In our example, the choice of how to measure value may lead to meaningfully different portfolios, despite the same overarching thesis. For example, a naïve CAPE ratio versus adjusting for differences in relative sector composition dramatically alters the view of whether international equities are significantly cheaper than U.S. equities. These potential differences capture what we like to call “model specification risk.”

Finally, we can be subject to unintended bets based upon when the portfolio is re-evaluated and reconstituted. Evaluating valuations in January, for example, may lead to a different decision versus evaluating them in July.

How can we avoid these unintended bets? At Newfound, we believe that the answer falls back to diversification: not only in the traditional sense of what we invest in, but also across how we make decisions and when we make them.

When left uncontrolled, unintended bets can make a strategy incredibly fragile.

What, precisely, does it mean for a strategy to be fragile? A strategy is fragile when small variations of strategy inputs – be it asset returns or other measures – lead to meaningful dispersion in realized results.

Now we want to distinguish between volatility and fragility. Volatility is the dispersion of strategy returns across time, while fragility is the dispersion in end-of-period wealth across variations of the strategy.

As an example, a portfolio that invests only in the S&P 500 is very volatile but not particularly fragile. Given the last ten years of returns for the S&P 500, slight variations in annual returns would not lead to significant dispersion in end-of-period wealth. On the other hand, a strategy that flips a coin every December and invests for the next year in the S&P 500 when it lands on heads or short-term U.S. Treasuries when it lands on tails would have lower expected volatility than the S&P 500 but would be much more fragile. We need simply consider a few scenarios (e.g. all heads or all tails) to understand the potential dispersion such a strategy is subject to.
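To make the coin-flip example concrete, the sketch below enumerates every possible sequence of heads and tails and records the terminal wealth of each. The annual returns used here are hypothetical placeholders, not historical S&P 500 or Treasury figures, and the function name is our own.

```python
from itertools import product

def coin_flip_terminal_wealth(equity_returns, cash_returns):
    """Terminal wealth of $1 for every heads/tails path of the annual coin flip.

    Heads invests in equities for the year; tails invests in cash.
    """
    outcomes = []
    for flips in product((True, False), repeat=len(equity_returns)):
        wealth = 1.0
        for heads, equity, cash in zip(flips, equity_returns, cash_returns):
            wealth *= 1.0 + (equity if heads else cash)
        outcomes.append(wealth)
    return outcomes

# Hypothetical annual returns, for illustration only.
equity = [0.12, -0.08, 0.20, 0.05, -0.15, 0.25, 0.10, 0.02, -0.04, 0.18]
cash = [0.02] * len(equity)

wealth = coin_flip_terminal_wealth(equity, cash)
print(f"paths: {len(wealth)}, worst: {min(wealth):.2f}, best: {max(wealth):.2f}")
```

The gap between the worst and best paths is the fragility: the same process, applied to the same return history, can end in very different places.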

In the remainder of this commentary, we will demonstrate how diversification across the what, how, and when axes can reduce strategy fragility.

The Experiment Setup

Since a large degree of our focus at Newfound is on managing trend equity mandates, we will explore fragility through the lens of the style of measuring trends. For those unfamiliar with the approach, trend equity strategies aim to capture a significant portion of equity market growth while avoiding substantial and prolonged drawdowns through the application of trend following. A naïve implementation of such an idea would be to invest in the S&P 500 when its prior 12-month return has been positive and invest in short-term U.S. Treasuries otherwise.

To learn something about the fragility of a strategy, we are going to have to inject some randomness. After all, no amount of history will tell us about the fragility of a teacup that has spent its entire life sitting on a shelf; we will need to see it fall on the floor to actually learn something.

As with our recent commentary When Simplicity Met Fragility, we will inject randomness by adding white noise to asset returns. Specifically, we will add to daily returns a draw from a random normal distribution with mean 0% and standard deviation 0.025%. Using this slightly altered history, we will then run our investment strategy.

By performing this process a large number of times (10,000 in this commentary), we can explore how the outcome of the strategy is impacted by these slight variations in return history. The greater the dispersion in results, the more fragile the strategy is.
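A rough sketch of this perturb-and-rerun loop is below. It is illustrative only: the daily return series is synthetic, cash is assumed to earn 0%, a month is approximated as 21 trading days, and the trend rule is the naïve 12-month version described earlier. Only the noise standard deviation (0.025%, or 0.00025 in decimal terms) comes from the text; the function names are hypothetical.

```python
import numpy as np

def trend_log_wealth(daily_returns, lookback=252, hold=21):
    """Log terminal wealth of a naive long/flat trend rule.

    Every `hold` days, invest for the next `hold` days if the trailing
    `lookback`-day compounded return is positive; otherwise hold cash
    (assumed to earn 0% in this sketch).
    """
    log_r = np.log1p(daily_returns)
    cum = np.concatenate([[0.0], np.cumsum(log_r)])
    total = 0.0
    for start in range(lookback, len(log_r), hold):
        if cum[start] - cum[start - lookback] > 0:
            end = min(start + hold, len(log_r))
            total += cum[end] - cum[start]
    return total

def perturbed_outcomes(daily_returns, n_sims=1000, noise_sd=0.00025, seed=0):
    """Log terminal wealth of each perturbed run, relative to the median run."""
    rng = np.random.default_rng(seed)
    outcomes = np.array([
        trend_log_wealth(daily_returns + rng.normal(0.0, noise_sd, len(daily_returns)))
        for _ in range(n_sims)
    ])
    return outcomes - np.median(outcomes)

# Synthetic ten-year daily history (an assumption, not market data).
history = np.random.default_rng(7).normal(0.0003, 0.01, 2520)
spread = perturbed_outcomes(history, n_sims=200)
print(f"5th to 95th percentile spread: {np.percentile(spread, 95) - np.percentile(spread, 5):.3f}")
```

The wider the spread of these relative outcomes, the more fragile the strategy under this definition.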

To demonstrate how diversification across the three different axes can affect fragility, we will start with a naïve trend equity strategy – investing in broad U.S. equities using a single trend model that is rebalanced on a monthly basis – and vary the three components in isolation.

The What

The “what” axis simply asks, “what are we invested in?”

How can our choice of “what” affect fragility? Consider a slight variation to our coin-flip strategy from before. Instead of flipping a single coin, we will now flip two coins. The first coin determines whether we invest 50% of the portfolio in either the S&P 500 or short-term U.S. Treasuries, while the second coin determines whether we invest the other 50% of the portfolio in either the Russell 1000 or short-term U.S. Treasuries.

In our single coin example, each year we expected to invest in the S&P 500 50% of the time and in short-term U.S. Treasuries 50% of the time. With two coins, we now expect to be fully invested 25% of the time, partially invested 50% of the time, and divested 25% of the time.

Let’s take this notion to further limits. Consider now flipping 100 coins where each determines the allocation decision for 1% of our portfolio, where heads leads to an investment in a large-cap U.S. equity portfolio and tails means invest in short-term U.S. Treasuries. Now being fully invested or divested is a vanishingly small probability event; in fact, for a given year there is a 95% chance that your allocation to equities falls between 40% and 60%.¹
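These probabilities follow directly from the binomial distribution for fair, independent coins. A short check, with a helper name of our own choosing:

```python
from math import comb

def prob_heads_between(n_coins, low, high):
    """Probability that the number of heads among n fair coins is in [low, high]."""
    return sum(comb(n_coins, k) for k in range(low, high + 1)) / 2 ** n_coins

# Two coins: fully invested, partially invested, fully divested.
print(prob_heads_between(2, 2, 2), prob_heads_between(2, 1, 1), prob_heads_between(2, 0, 0))

# One hundred coins: allocation between 40% and 60%, and fully invested.
print(prob_heads_between(100, 40, 60))
print(prob_heads_between(100, 100, 100))
```

For 100 coins the exact probability of landing between 40 and 60 heads is a little above the 95% quoted, at roughly 96%.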

Even though we’ve applied the exact same process to each investment, diversifying across more investments has dramatically reduced the fragility of our coin-flipping strategy.

Now let’s translate this from the theoretical to the practical. We will begin with a simple trend following strategy that invests in the underlying asset when prior 12-1 month returns have been positive or invests in the risk-free rate, re-evaluating the trend at the end of each month.
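As a sketch of this rule, the function below assumes one common convention for “12-1 month” returns: the compounded return over the prior twelve months, excluding the most recent month. The exact convention used in our calculations is not spelled out here, so treat the indexing as an assumption; the function name is hypothetical.

```python
import numpy as np

def trend_12_1_returns(asset_returns, cash_returns):
    """Monthly strategy returns for a long/flat 12-1 month trend rule.

    The position for month t is decided at the end of month t-1 using
    months t-12 through t-2 (the most recent month is skipped).
    The strategy holds cash until twelve months of history exist.
    """
    asset = np.asarray(asset_returns, dtype=float)
    cash = np.asarray(cash_returns, dtype=float)
    strategy = cash.copy()
    for t in range(12, len(asset)):
        formation = asset[t - 12 : t - 1]  # eleven months: t-12 .. t-2
        momentum = np.prod(1.0 + formation) - 1.0
        if momentum > 0:
            strategy[t] = asset[t]
    return strategy
```

Applying the same function to each industry group and averaging the resulting return streams gives the equally-weighted, multi-asset variants.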

To explore the impact of diversifying our what, we will implement this strategy five different ways:

A single in-or-out decision on broad U.S. equities.
Applied across 5 equally-weighted U.S. equity industry groups.
Applied across 12 equally-weighted U.S. equity industry groups.
Applied across 30 equally-weighted U.S. equity industry groups.
Applied across 48 equally-weighted U.S. equity industry groups.

The graph below plots the distribution of log difference in terminal wealth against the median outcome for each of these five approaches. Lines within each “violin” show the 25^th, 50^th, and 75^th percentiles.

The graph clearly demonstrates that by increasing our exposure across the “what” axis, the dispersion in terminal wealth is dramatically reduced.

Source: Kenneth French Data Library. Calculations by Newfound Research.

But why is reduced dispersion in terminal wealth necessarily better?

It implies a greater consistency in outcome, which is not only important for setting forward expectations, but is also important for evaluating past performance (whether backtested or live). This evidence tells us that if we are evaluating a trend equity strategy that employs a single model to make in-or-out decisions on broad U.S. equities on a monthly basis, it will be nearly impossible to tell whether the realized results are in line with reasonable expectations or overly optimistic (we can probably guess that they aren’t overly pessimistic, as those sorts of returns typically aren’t marketed).

To justify a concentration in the “what” axis, we would have to demonstrate that the worst-case scenarios would still represent a meaningful improvement in expected terminal wealth versus a more diversified approach.

It should be noted that our experiment design prohibits dispersion from ever being fully eliminated, as we are injecting randomness into past returns. Even if no strategy is applied, there will be some inherent dispersion in final wealth. For example, below we plot the dispersion that occurs simply from adding randomness to past returns with a buy-and-hold approach.

Increasing the number of assets in the portfolio inherently reduces dispersion for buy-and-hold because diversification helps drive the expected impact of the injected randomness towards its mean: zero. With only one asset, on the other hand, outlier events are free to wreak havoc on results.

Source: Kenneth French Data Library. Calculations by Newfound Research.

Note that adding a strategy on top of buy-and-hold can exacerbate the fragility issue, making diversification that much more important.

The How

The “how” axis asks, “how are we making investment decisions?”

Many investors are already somewhat familiar with diversification along the “how” axis, often diversifying their active exposures across multiple managers who might have similar investment mandates but slightly different processes.

We like to call this “process diversification” and think of it as akin to the parable of the blind men and the elephant. Each blind man touches a different part of the elephant and pronounces his belief in what he is touching based upon his isolated view. The blind man touching the leg, for example, might think he is touching a sturdy tree while the blind man touching the tail might believe he is grabbing a rope.

None is correct in isolation but taken together we may gain a more well-rounded picture.

Similarly, two managers may claim to invest based upon valuations, but the manner in which they do so gives them a very different picture of where value can be found.

The idea of process diversification was explored in the 1999 paper “Do You Need More than One Manager for a Given Equity Style?” by Franklin Fant and Edward O’Neal. Fant and O’Neal found that while a multi-manager approach does very little for return variability across time (i.e. portfolio volatility), it does a lot for end-of-period wealth variability. They find this to be true across almost all equity style box categories. In other words: taking a multi-manager approach can reduce fragility.

Let us return to our prior coin flip example. Instead of making a choice to invest in the S&P 500 based upon a coin-flip, however, we will combine a number of different signals. For example, we might flip a coin, roll a die, measure the weather, and look at the second hand of a clock. Each signal gives us some sort of in-or-out decision, and we average these decisions together to get our allocation. As with before, as we incorporate more signals, we decrease the probability that we end up with extreme allocations, leading to a more consistent terminal wealth distribution.

Again, we should stress here that the objective is not just outright elimination of dispersion in terminal wealth. After all, if that were our sole pursuit, we could simply stuff our money under our mattress. Rather, assuming we will be implementing some active investment strategy that we hope has a positive long-term expected return, our aim should be to reduce the dispersion in terminal wealth for that strategy.

Of course, in investing we would not expect the processes to be entirely independent. With trend following, for example, most popular models are actually mathematically linked to one another, and therefore generate signals that are highly correlated. Nevertheless, even modest diversification can have meaningful benefits with respect to strategy fragility.

To explore the impact of diversification along the how axis, we implement our trend following strategy six different ways. Each invests in broad U.S. equities and rebalances monthly but differs in the number of trend-following models employed.²
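One simple way to combine several models is to average their in-or-out signals into a single allocation. The sketch below uses trailing total return signals with different lookbacks as stand-ins; these lookbacks are an assumption for illustration and are not the specific models used in our analysis.

```python
import numpy as np

def ensemble_allocation(monthly_returns, lookbacks=(6, 9, 12)):
    """Equity allocation as the average of several in-or-out trend signals.

    Each signal is 1 if the trailing compounded return over its lookback
    (in months) is positive, and 0 otherwise.
    """
    r = np.asarray(monthly_returns, dtype=float)
    if len(r) < max(lookbacks):
        raise ValueError("not enough history for the longest lookback")
    cum = np.concatenate([[0.0], np.cumsum(np.log1p(r))])
    latest = len(r)
    signals = [1.0 if cum[latest] - cum[latest - lb] > 0 else 0.0 for lb in lookbacks]
    return float(np.mean(signals))

# Six poor months followed by six good ones: the models disagree.
print(ensemble_allocation([-0.10] * 6 + [0.03] * 6))
```

When the models disagree, the allocation lands between the extremes rather than depending entirely on whichever single model happened to be chosen.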

The results are plotted below.

Source: Kenneth French Data Library. Calculations by Newfound Research.

Again, we can see that increased diversification across the how axis dramatically reduces dispersion in terminal wealth. Our takeaway is largely the same: without an ex-ante view as to which particular model (or group of models) is best (i.e. a view of how to be more right), diversification can lead to greater consistency in results. We will be less wrong.

A subtler conclusion of this analysis is that it should be very, very difficult to conclude that one model is better than another. We can see that if we risk selecting just one model to govern our process, seemingly minor variations in historical returns can lead to dramatically different terminal wealth results, as evidenced by the bulging distribution. Inverting this line of thinking, we should also be suspicious of any analysis that seeks to demonstrate the superiority of a given model using a single backtest. For example, even if a 12-1 month total return model performs better than a 10-month moving average model on historical S&P 500 returns, we should be highly skeptical as to the robustness of the conclusion that the 12-1 model is best.

The When

The “when” axis asks, “when are we making our investment decision?”

This is an oft overlooked question in public markets, but it is commonly addressed in the world of private equity and venture capital. Due to the illiquid nature of those markets, investors will often attempt to diversify their business cycle risk by establishing positions in multiple funds over time, giving them exposure to different “vintages.” The idea here is simple: the opportunity set available at different points in time can vary and if we allocate all of our earmarked capital to a particular year, we may miss out on later opportunities.

Consider our original coin-flipping example where we flipped a single coin every December to determine whether we would buy the S&P 500 or hold our capital in short-term Treasuries. But why was it necessary that we make the decision in December? Why not July? Or January? Or September?

While we would not expect there to be point-in-time risk for coin flipping, we can still consider the net effect of a vintage-based allocation methodology. Here we will assume that we flip a coin each month and rebalance 1/12^th of our capital based upon the result.

Again, the probability of allocating to the extremes (100% invested or 100% divested) is dramatically reduced (each has approximately a 0.02% chance of occurring) and we reduce strategy fragility to any specific coin flip.

But just how impactful is this notion? Below we plot the rolling 1-year total return difference between two 60% S&P 500 / 40% 5-year U.S. Treasury fixed-mix portfolios, with one being rebalanced in February and one in August. Even for this highly simplified example, we can see that the total return spread between the two portfolios blows out to over 700 basis points in March 2010 due to the fact that the February portfolio rebalanced back into equities at nearly the exact bottom of the crisis.

Source: Global Financial Data. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.

To increase diversification across the “when” axis, we want to increase the number of vintages we deploy. For our trend following example, we will assume that the portfolio allocates between broad U.S. equities and the risk-free rate based upon a single model, but with an increasing number of evenly-spaced vintages. Again, we will run 10,000 simulations that each slightly perturb historical U.S. equity market returns and compare the terminal wealth variation for approaches that employ a different number of vintages.
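The mechanics of evenly-spaced vintages can be sketched as follows. This is a simplified illustration: it assumes a 21-trading-day rebalance cycle, a precomputed daily in-or-out signal, and that each tranche acts on the signal observed on its own rebalance day. The function name and parameters are hypothetical.

```python
import numpy as np

def tranche_allocations(daily_signal, n_tranches, period=21):
    """Daily equity weight when capital is split across evenly spaced tranches.

    daily_signal: array of 0/1 trend signals observed at each day's close.
    Each tranche refreshes its position every `period` days on its own
    offset and holds that position until its next refresh.
    """
    signal = np.asarray(daily_signal, dtype=float)
    offsets = [round(k * period / n_tranches) for k in range(n_tranches)]
    weights = np.zeros(len(signal))
    for offset in offsets:
        held = 0.0  # tranches start in cash until their first refresh
        for day in range(len(signal)):
            if day >= offset and (day - offset) % period == 0:
                held = signal[day]
            weights[day] += held / n_tranches
    return weights

# The signal turns on at day 10; compare one tranche with three.
signal = np.array([0] * 10 + [1] * 20)
print(tranche_allocations(signal, 1)[[14, 21, 28]])
print(tranche_allocations(signal, 3)[[14, 21, 28]])
```

With a single tranche the portfolio is either fully in or fully out, and the timing of the switch depends entirely on where the rebalance date happens to fall. With three tranches the allocation steps in gradually.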

We can see in the graph below that, as with the other axes of diversification, as we increase the number of vintages employed, the variance decreases. While the 25^th and 75^th percentiles do not decrease as dramatically as for the other axes, we can see that the extreme variations are reined in substantially when we move from 1 monthly tranche to 4 weekly tranches.

Source: Kenneth French Data Library. Calculations by Newfound Research.

Conclusion

We see two critical conclusions from this analysis:

To develop confidence in achieving our objective we have to consider our sensitivity to unintended bets that may be included within the portfolio.

Fragility makes it incredibly difficult to distinguish between luck and skill, particularly as strategy fragility increases. This is true for both backtested and live performance.

To conclude our analysis, below we present a graph that combines diversification across all three axes. We again run 10,000 samples, randomly perturbing returns. For each sample, we then run four variations:

A single, randomly selected model run in broad U.S. equities that is rebalanced monthly.
A random selection of 3 models run on 5 industry groups in 2 bi-weekly tranches.
A random selection of 6 models run on 12 industry groups in 4 weekly tranches.
A random selection of 9 models run on 30 industry groups in 20 daily tranches.

It should come as no surprise that as we increase the amount of diversification across all three axes, the dispersion in terminal wealth is dramatically reduced.³

Source: Kenneth French Data Library. Calculations by Newfound Research.

It is also important to note that while our analysis focused on trend following strategies, this same line of thinking applies across all investment approaches. As an example, consider a quantitative value manager who buys the top five cheapest stocks, as measured by price-to-book, in the S&P 500 each December and then holds them for the next year. Questions worth pondering are:

What does it say about our conviction when the 6^th stock in the list is incredibly close to the 5^th stock?
What happens if some of our measures of book value are incorrect (or even just outdated)?
How different would the portfolio look if we ranked on another value measure (e.g. price-to-earnings)?
How different would the opportunity set be if we ranked every June versus every December?

While low levels of diversification across the what, how, and when axes are not necessarily an indicator that a model is inherently fragile, they should be a red flag that more effort is required to demonstrate that it is not fragile.

Measuring the Benefit of Diversification

By Nathan Faber

On November 5, 2018

In Portfolio Construction, Risk Management, Weekly Commentary

This post is available as a PDF download here.

Summary

The benefits of diversification are often touted, but many investors feel disappointed in diversified portfolios because of the dispersion in performance of the individual holdings.
In the context of three different unconstrained sleeves, we look at a way to measure and visualize the benefit (or detriment) of diversification based on achieving different objectives.
Through this lens, we get a picture of how good or bad the results might have been, which can lead to confidence either in the robustness of the allocation or in the need to take a different approach.
Since we only experience one path of history, it is difficult to assess the benefit of diversification unless we consider what could have happened.
We believe that taking a systematic approach does not fully remove the art of the analysis but can remove some of the behavioral biases that make sticking with a portfolio difficult in the first place.

Introduction

Diversification is a standard risk management tool in any portfolio. Reducing the impact of idiosyncratic risks in individual investments by holding a suite of stocks, asset classes, strategies, etc. produces a smoother investment ride most of the time and reduces the risk of negative surprises.

But in a world where we only experience one outcome out of the multitude of possibilities, gauging the benefit of diversification is difficult. It is even hard to do in hindsight, not so much because we can’t but more often that we won’t. The results already happened.

Over a single time period with no rebalancing, a diversified portfolio will underperform the best asset that it holds. This is a mathematical fact when there is any dispersion in the returns of the assets and it is why we have said that diversification will always disappoint. Our natural behavioral tendencies can often get the better of us, despite the fact that diversification might be doing a great job, especially when examined through the appropriate lens and measured in the context of what could have happened.

Last summer, we published a presentation entitled Building an Unconstrained Sleeve. In it, we looked at ways to combine traditional and non-traditional assets and strategies to target specific objectives: equity hedging, absolute return, and equity-like with downside management.

Now that we have 15 months of subsequent data for all the underlying strategies, we want to revisit that piece and explore the benefit of diversification in the context of hindsight.

A Recap of the Process

As a quick refresher, we included seven strategies and asset classes in the construction of our unconstrained sleeves:

Long/flat trend-following equities
Minimum volatility equities
Macro trend-following (managed futures)
Macro risk parity
Macro value
Macro income
Intermediate U.S. Treasuries

While these strategies are surely not exhaustive, they cover a range of factors (value, momentum, low volatility, etc.) and a global set of asset classes (equities, bonds, commodities, and currencies) commonly included in unconstrained sleeves. They were also selected because many of these strategies are conveniently packaged as ETFs or mutual funds, making the resulting sleeves more easily implementable.

Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results. Index returns are total returns and are gross of all fees.

Over the 15 months, world equity was by far the best performer and the spread between best-performing and worst-performing positions exceeded 20 percentage points. If you wanted high returns – and going back to our statement about how diversification will always disappoint – you could have just held world equities and been quite content.

But putting ourselves back in June 2017, we did not know a priori that simply holding equities would have generated the highest returns. Looking at this type of chart in November 2008 would have led to a very different emotional conclusion.

The aim of our original study was to develop unconstrained sleeves that would meet their objectives regardless of how the future played out. Therefore, we employed a simulation-based method that aimed to preserve some of the unique correlation structure between the strategies across different market environments and reduce the risk of overfitting to a single realization of history. With this approach, we constructed portfolios that targeted three different objectives that investors might be interested in:

Equity hedge – designed to offset significant equity losses.
Absolute return – designed to create a stable and consistent return stream in all environments.
Equity-like – designed to capture significant equity upside with reduced downside.

(Note: Greater detail about portfolio construction process, strategy descriptions, and performance attributes of each strategy can be found in our original presentation.)

But were our constructed portfolios successful in achieving their objectives out-of-sample? To analyze this question, as well as explore the benefits/detractors of diversification for each objective, we will calculate the distribution of what could have happened. The hope is that each strategy would perform well relative to all other possible portfolios that could have been chosen for the sleeve.

Saying exactly what portfolios we could have chosen is where a little art comes into play. For example, in the equity-like strategies, it is difficult to say that a 100% bond portfolio would have ever been a viable option and therefore may not be an apt out-of-sample comparison.

However, since our original process did not have any specific override for these intuitive constraints, and since we do not wish to assert after-the-fact which portfolios would have been rejected, we will allow the entire potential allocation space to be fair game in our comparison.

There are a number of ways to sample the set of allocations over the 7 asset classes that could have formed the portfolios for each sleeve. Perhaps the most obvious choice would be to sample uniformly over the possible allocations. The issue to balance in this case is coverage of the space (a 6-dimensional simplex) with the number of samples. To be 95% confident that we sampled an allocation above 95% for only a single asset class would require nearly 200 million samples. We have used modified Sobol sequences in the past to ensure coverage of more of the space with fewer points. However, in the current case, to mimic the rounding that is often found in portfolio allocations, we will use a lattice of points spaced 2.5% apart covering the entire space. This requires just under 10 million points in the simulations.
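Both counts can be reproduced with a few lines of arithmetic. The lattice size is a stars-and-bars count of the ways to split forty 2.5% increments among seven asset classes. The sample count assumes allocations drawn uniformly over the simplex, in which case the probability that one given asset class exceeds a weight c is (1 - c) raised to the number of asset classes minus one. The helper names are our own.

```python
from math import ceil, comb, log, log1p

def lattice_points(n_assets, step):
    """Number of long-only, fully invested portfolios on a grid of `step` weights."""
    n_steps = round(1.0 / step)
    return comb(n_steps + n_assets - 1, n_assets - 1)

def uniform_samples_needed(n_assets, threshold, confidence):
    """Uniform simplex draws needed to see one given asset above `threshold`
    at least once with probability `confidence`."""
    p_single_draw = (1.0 - threshold) ** (n_assets - 1)
    return ceil(log(1.0 - confidence) / log1p(-p_single_draw))

print(lattice_points(7, 0.025))
print(uniform_samples_needed(7, 0.95, 0.95))
```

The first figure is 9,366,819 points and the second is roughly 192 million draws, in line with the numbers quoted above.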

Equity Hedge

This sleeve was designed to offset significant equity losses by limiting downside capture. The resulting optimized portfolio was relatively concentrated in two main positions that historically have exhibited low-to-negative correlations to equities and exhibited potential crisis alpha during significant and prolonged drawdowns.

Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research.

The down capture of this portfolio during the out-of-sample period was 0.44. This result falls in the 70^th percentile (that is, better than 70% of the other sample portfolios, where lower down capture is better) when compared to the 10 million other possible portfolios we could have originally selected. Not surprisingly, the 100% intermediate-term Treasury portfolio had the best down capture (-0.05) over the out-of-sample period. Of the portfolios with better down capture, Intermediate Treasuries and Macro – Income were generally the highest allocations.
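Down capture can be defined in several ways. The sketch below uses one common definition, the ratio of compounded portfolio return to compounded benchmark return over the periods in which the benchmark fell. This definition is an assumption and may differ in detail from the one used in our calculations.

```python
import numpy as np

def down_capture(portfolio_returns, benchmark_returns):
    """Compounded portfolio return divided by compounded benchmark return,
    measured only over periods where the benchmark return was negative."""
    portfolio = np.asarray(portfolio_returns, dtype=float)
    benchmark = np.asarray(benchmark_returns, dtype=float)
    down = benchmark < 0
    if not down.any():
        raise ValueError("benchmark has no down periods")
    portfolio_down = np.prod(1.0 + portfolio[down]) - 1.0
    benchmark_down = np.prod(1.0 + benchmark[down]) - 1.0
    return portfolio_down / benchmark_down
```

Under this definition a value below 1 means the portfolio lost less than the benchmark in down periods, and a negative value means it gained while the benchmark fell.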

This does not come as much of a surprise to anyone who has followed the managed futures space for the last 15 months. The category largely remains in a multi-year drawdown (having peaked in early 2014), and it has also done little to offset the rapid sell-offs seen in equities in 2018. Therefore, with the full benefit of hindsight, any allocation to Macro – Trend in the original portfolio would have been a detriment to realizing our out-of-sample objective.

Yet even with this lackluster performance, an out-of-sample realized 70^th percentile result over a short, 15-month horizon is a result to be pleased with.

Absolute Return

This sleeve was designed to seek a stable and consistent return stream in all market environments. We aimed to accomplish this by utilizing a risk parity approach. As expected, this sleeve holds all asset classes and is very well diversified across them.

Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research.

To measure the success of the risk parity over the live period, we will look at the Gini coefficient for each of the ten million potential portfolios we could have initially selected. The Gini coefficient quantifies the equality of the distribution, with a value of 1 representing 100% concentration and 0 representing perfect equality.
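As a sketch of the calculation, the code below computes each position's share of portfolio variance and then the Gini coefficient of those shares. Two assumptions are worth flagging: the risk measure is variance-based contribution to risk, and the Gini coefficient is rescaled by n / (n - 1) so that full concentration in one of n positions gives exactly 1, matching the description above. Function names are hypothetical.

```python
import numpy as np

def risk_contributions(weights, cov):
    """Each position's share of total portfolio variance (shares sum to 1)."""
    w = np.asarray(weights, dtype=float)
    marginal = np.asarray(cov, dtype=float) @ w
    return w * marginal / (w @ marginal)

def gini(values, normalize=True):
    """Gini coefficient of non-negative values.

    With normalize=True the result is scaled so that complete concentration
    in a single item gives exactly 1.
    """
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    ranks = np.arange(1, n + 1)
    g = np.sum((2 * ranks - n - 1) * x) / (n * np.sum(x))
    return g * n / (n - 1) if normalize else g

# Two uncorrelated assets weighted by inverse volatility: equal risk shares.
cov = np.diag([0.04, 0.01])
shares = risk_contributions([1 / 3, 2 / 3], cov)
print(shares, gini(shares))
```

Note that this form of the Gini coefficient is only meaningful when all risk contributions are non-negative.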

The Gini coefficient of the actual portfolio was 0.25 which was in the 99.8^th percentile of possible outcomes (i.e. highly diversified on a relative basis). Here, the percentile estimate is padded by the fact that many of the simulated portfolios (e.g. the 100% ones) would clearly not be close to equal risk contribution.

Did our original portfolio achieve its out-of-sample goal? Here, we can evaluate success as to whether the realized contribution to risk of each exposure was close to equivalent; i.e. did we actually achieve risk parity as desired? We can see below that indeed we did, with the main exception of Macro – Trend, which was the most volatile asset class over the period.

Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research.
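The realized contribution to risk referenced above can be sketched as follows. This assumes the standard variance decomposition, in which each exposure's share is its weight times its marginal contribution to portfolio variance, divided by total portfolio variance; the function name and the plain nested-list covariance input are illustrative choices, not a description of the actual tooling used.

```python
def risk_contribution_shares(weights, cov):
    """Share of portfolio variance attributable to each exposure.

    weights: list of portfolio weights.
    cov: covariance matrix as a list of rows, estimated over the period
         being evaluated. The returned shares sum to 1.
    """
    n = len(weights)
    # Marginal contribution of each exposure: the i-th entry of (cov @ weights).
    marginal = [sum(cov[i][j] * weights[j] for j in range(n)) for i in range(n)]
    variance = sum(weights[i] * marginal[i] for i in range(n))
    if variance <= 0:
        raise ValueError("portfolio variance must be positive")
    return [weights[i] * marginal[i] / variance for i in range(n)]
```

A risk parity portfolio is one where these shares are approximately equal. For two uncorrelated exposures with 10% and 20% volatility, for example, weights of 2/3 and 1/3 give each a 50% share, whereas equal weights give 20% and 80%.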

Over the sample space of potential portfolios, the portfolio with the minimum out-of-sample Gini coefficient (0.08) was tilted toward the less volatile and more diversifying asset classes (Intermediate Treasuries and Macro – Income). Even so, due to the limited granularity of the sampled portfolios, the risk contribution of Macro – Income was still half of that for each of the other strategies.

It is also worth noting how similar this solution is – generated with the complete benefit of hindsight – to our originally constructed portfolio.

Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research.

Equity-like with Downside Management

This sleeve was designed in an effort to capture equity market growth while managing the risk of severe and prolonged drawdowns. It was tilted toward the equity-like exposures with a split among risk management styles (trend, minimum volatility, macro strategies, etc.). The allocation to U.S. Treasuries is very small.

Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research.

For this portfolio, we have two variables to analyze: the up capture relative to global equities and the Ulcer index, a measure of the severity and duration of drawdowns. In the construction of the sleeve, the target was to keep the Ulcer index less than 25% of the value for global equities. The joint distribution of these quantities over the live period is shown below with the actual values over the live period for the sleeve indicated.

The realized Ulcer level was 68% of that of world equity – a far cry from the 25% that the portfolio was optimized for – and was in the 42^nd percentile while the up capture of 0.60 was in the 93^rd percentile.
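For reference, the Ulcer index is the root-mean-square of percentage drawdowns from the running peak, so it penalizes both deep and long-lasting drawdowns. The sketch below computes it from a price (or cumulative value) series and expresses it relative to a benchmark; the helper names and the default 25% threshold argument are illustrative, and the benchmark is assumed to have experienced some drawdown over the window.

```python
import math


def ulcer_index(prices):
    """Root-mean-square percentage drawdown from the running peak."""
    if not prices:
        raise ValueError("prices must not be empty")
    peak = prices[0]
    total_squared = 0.0
    for price in prices:
        peak = max(peak, price)
        drawdown_pct = 100.0 * (price - peak) / peak
        total_squared += drawdown_pct ** 2
    return math.sqrt(total_squared / len(prices))


def relative_ulcer(portfolio_prices, benchmark_prices):
    """Portfolio Ulcer index as a fraction of the benchmark's Ulcer index."""
    return ulcer_index(portfolio_prices) / ulcer_index(benchmark_prices)


def meets_ulcer_target(portfolio_prices, benchmark_prices, target=0.25):
    """True if the relative Ulcer index is below the target fraction."""
    return relative_ulcer(portfolio_prices, benchmark_prices) < target
```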

With the explicit goal of achieving a relative Ulcer level, a comparison against the entire potential allocation space of 10 million portfolios is not appropriate. Therefore, we reduce the set of 10 million comparative portfolios to only those that would have given a relative Ulcer index less than 25% compared to world equities, eliminating approximately 40% of possible portfolios.

The distributions of allocations to each of the strategies in the acceptable subset are shown below. We can see that the more diversifying strategies take on a larger range of allocations.

Interestingly, looking only over this subset of the original 10 million portfolios improves the out-of-sample up capture of our originally constructed portfolio to the 99^th percentile but does not change the percentile of the Ulcer index over the live period. Why is this?

The correlation of the relative Ulcer index over the live period with that over the historical period is only 0.1, indicating that, at first glance, the out-of-sample data did not line up with our expectations. However, this makes sense when we recall that the optimization was carried out using data from much more extreme market environments (think 2001 and 2008). It is a good reminder that optimizing for a certain parameter value does not mean you will realize it over the live data.

Higher up-capture typically goes hand-in-hand with a higher Ulcer index, as higher return often requires bearing more risk. Therefore, one way to standardize our measures across the potential set of portfolios is to calculate the ratio of up-capture to the Ulcer index. With this transformation, the risk-adjusted up capture falls in the 87^th percentile over the set of sample allocations, indicating a very high realized risk-adjusted return.
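The standardization described above can be sketched in a few lines. The inputs are assumed to be precomputed (up capture, relative Ulcer index) pairs for the actual portfolio and for each sampled alternative; the function names are illustrative.

```python
def risk_adjusted_up_capture(up_capture, relative_ulcer):
    """Up capture per unit of relative Ulcer index."""
    if relative_ulcer <= 0:
        raise ValueError("relative Ulcer index must be positive")
    return up_capture / relative_ulcer


def percentile_higher_is_better(actual, candidates):
    """Percent of candidates whose ratio is strictly below the actual ratio.

    actual and each candidate are (up_capture, relative_ulcer) pairs.
    """
    actual_ratio = risk_adjusted_up_capture(*actual)
    ratios = [risk_adjusted_up_capture(u, r) for u, r in candidates]
    beaten = sum(1 for ratio in ratios if ratio < actual_ratio)
    return 100.0 * beaten / len(ratios)
```

Dividing by the Ulcer index puts a high-capture, high-drawdown portfolio and a low-capture, low-drawdown portfolio on a comparable footing before ranking them.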

Conclusion

We only experience one path of the world and do not know the infinite alternate course history could have taken. But it is exactly this infinitude of alternate states that diversification is meant to address.

Diversification generally has no apparent benefit unless we envision what could have happened. Unfortunately, our innate natures make this difficult. We do not often value our realized path in this context. After all, none of these alternate states actually happened, so it is difficult to picture what we did not experience.

A quantitative approach can yield a systematic way to evaluate the benefit (or detriment) of diversification. This way, we are not relying as much on intuition – how did our performance feel? – and are looking through a more objective lens at our initial decisions.

In the examples using the Unconstrained Sleeves, diversification focused on more than just returns. The objectives that initially went into the portfolio construction were the parameters of interest.

Taking a systematic approach does not fully remove the art of the analysis, as was evident in the construction of the potential sample of portfolios used in the comparisons, but having a process can remove some of the behavioral biases that make sticking with a portfolio difficult in the first place.
