The Research Library of Newfound Research

Tag: process diversification

Dart-Throwing Monkeys and Process Diversification

This post is available as a PDF download here.


  • This week’s commentary is a short addendum to last week’s piece, attempting to serve as a (very) brief and simplified summary of process diversification.
  • Volatility is only one way of measuring risk; dispersion in terminal wealth is another.
  • Using simulations of dart-throwing monkeys, we plot the dispersion in terminal wealth for different levels of portfolio and manager diversification.
  • We find that increased diversification within a portfolio as well as increased diversification across managers can lead to more consistent portfolio outcomes.


In last week’s commentary (What do portfolios and teacups have in common?), we explored at great length the potential benefits of diversification in the domains of what, how, and when.

The crux of our argument is that for investors, return dispersions across time (i.e. “volatility”) can be a potentially misleading risk characteristic and that it is important to consider the potential dispersion in terminal wealth as well.

These are by no means original or unique thoughts.  Often the advisors and institutions we work with intuitively understand them: they just have not been presented with the math to justify them.

Therefore, in contrast to last week’s rather expansive note, we aim to keep this week’s note short, simple, and punchy in an effort to drive home how manager / process diversification can help deliver more consistent outcomes.

Dart-Throwing Monkeys

Consider the following experiment.

We begin with thousands and thousands of dart-throwing monkeys.  Every month, the monkeys throw their darts at a board that determines how they will be invested for the next month.  In this hypothetical scenario, we will assume that the monkeys are investing in different industry groups.[1]

Some monkeys are “concentrated managers,” throwing just a single dart and holding that pick for the next month.  Other monkeys are more diversified, throwing up to 30 darts each month and equally allocating their portfolio across their investments.  Portfolio sizes can be either 1, 5, 10, 15, 20, 25, or 30 equally-allocated investments.

It is our job, as an allocator, to choose different monkeys to invest with.  Do we invest with just 1 concentrated monkey manager? Five different diversified managers? How much difference does it really make at the end of the day?

We learn in Finance 101 that once we diversify our portfolio sufficiently, we have eliminated nonsystematic risk.  But does that mean we expect the portfolios to necessarily end up in the same place?

As an example, if we pick 10 dart-throwing monkeys who each pick 10 investments per month, how different would we expect our final wealth level to be from another allocator who picks 10 different dart-throwing monkeys who each pick 10 investments per month?
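The experiment can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the calculation behind the commentary: asset returns are synthetic (a shared market move plus asset-specific noise) rather than industry-group data, allocations are rebalanced equally across managers every month, and dispersion is measured as the standard deviation of terminal wealth. All function names and parameters are hypothetical.

```python
import random
import statistics

def make_asset_returns(n_months, n_assets, rng):
    """One shared history of monthly returns (synthetic, for illustration only)."""
    history = []
    for _ in range(n_months):
        market = rng.gauss(0.008, 0.04)  # assumed common market move
        history.append([market + rng.gauss(0.0, 0.05) for _ in range(n_assets)])
    return history

def terminal_wealth(history, n_managers, n_picks, rng):
    """Grow $1 split equally (rebalanced monthly) across monkey managers."""
    wealth = 1.0
    n_assets = len(history[0])
    for asset_returns in history:
        manager_returns = []
        for _ in range(n_managers):
            # Each monkey throws n_picks darts and equal-weights the picks.
            darts = rng.sample(range(n_assets), n_picks)
            manager_returns.append(sum(asset_returns[i] for i in darts) / n_picks)
        wealth *= 1.0 + sum(manager_returns) / n_managers
    return wealth

def wealth_dispersion(history, n_managers, n_picks, n_allocators, rng):
    """Standard deviation of terminal wealth across simulated allocators."""
    outcomes = [
        terminal_wealth(history, n_managers, n_picks, rng)
        for _ in range(n_allocators)
    ]
    return statistics.pstdev(outcomes)

rng = random.Random(0)
history = make_asset_returns(n_months=120, n_assets=30, rng=rng)
concentrated = wealth_dispersion(history, 1, 1, 300, rng)
diversified = wealth_dispersion(history, 10, 10, 300, rng)
print(f"1 manager x 1 pick:     {concentrated:.3f}")
print(f"10 managers x 10 picks: {diversified:.3f}")
```

Every simulated allocator lives through the same return history, so the only source of dispersion is which monkeys were chosen and where their darts landed.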

Process Diversification and Terminal Wealth Dispersion

Below we plot the dispersion in terminal wealth[2] as a function of (1) the number of securities picked by each monkey manager and (2) the number of monkey managers we allocate to.

As an example of how to read this graph, the orange line tells us about portfolios comprised of monkey managers who pick five investments each.  As we move from left to right, we learn about the dispersion in terminal wealth based upon the number of managers we allocate to.

We can think of this in two ways.  First, we can think of it as potential dispersion in results among our peers who make the same type of decision (e.g. picking 5 managers who pick 5 investments each) but different specific choices (e.g. might pick different managers). Second, we can think of this as the dispersion in possible results if we were able to live across infinite universes simultaneously.

Source: Kenneth French Data Library. Calculations by Newfound Research.


Unfortunately, we cannot live across infinite universes and this graph tells us that choosing a single, highly concentrated manager can lead to wildly different outcomes depending upon the manager we select.

As the managers further diversify and we further diversify among managers, this dispersion in potential outcomes decreases.[3]


The intuition behind these results is simple:

  • More diversified managers are more likely to overlap in portfolio holdings with one another, and therefore are likely to have more similar returns.
  • Similarly, as the number of managers we choose goes up, so does the likelihood of overlap in holdings with a peer who also selects the same number of managers.

It is equally valid to interpret this analysis as saying there is greater opportunity for out-performance in taking concentrated bets in highly concentrated managers.  We would argue this is “more right” thinking: the win condition requires both that we pick the right managers and that the managers pick the right stocks.  While a little bit of diversification can go a long way here in clipping outlier events, the dispersion can still far exceed that of a more diversified approach.

At Newfound, we prefer the “less wrong” approach.  Allocations to a few diversified managers each taking a different approach can lead to significantly less dispersion in outcomes and, therefore, allow for better financial planning.



Measuring Process Diversification in Trend Following



  • We prefer to think about diversification in a three-dimensional framework: what, how, and when.
  • The “how” axis covers the process with which an investment decision is made.
  • There are a number of models that trend-followers might use to capture a trend. For example, trend-followers might employ a time-series momentum model, a price-minus moving average model, or a double moving average cross-over model.
  • Beyond multiple models, each model can have a variety of parameterizations. For example, a time-series momentum model can just as equally be applied with a 3-month formation period as an 18-month period.
  • In this commentary, we attempt to measure how much diversification opportunity is available by employing multiple models with multiple parameterizations in a simple long/flat trend-following process.

When investors talk about diversification, they typically mean across different investments.  Do not just buy a single stock, for example; buy a basket of stocks in order to diversify away the idiosyncratic risk.

We call this “what” diversification (i.e. “what are you buying?”) and believe this is only one of three meaningful axes of diversification for investors.  The other two are “how” (i.e. “how are you making your decision?”) and “when” (i.e. “when are you making your decision?”).  In recent years, we have written a great deal about the “when” axis, and you can find a summary of that research in our commentary Quantifying Timing Luck.

In this commentary, we want to discuss the potential benefits of diversifying across the “how” axis in trend-following strategies.

But what, exactly, do we mean by this?  Consider that there are a number of ways investors can implement trend-following signals.  Some popular methods include:

  • Prior total returns (“time-series momentum”)
  • Price-minus-moving-average (e.g. price falls below the 200-day moving average)
  • Moving-average double cross-over (e.g. the 50-day moving average crosses the 200-day moving average)
  • Moving-average change-in-direction (e.g. the 200-day moving average slope turns positive or negative)

As it turns out, these varying methodologies are actually cousins of one another.  Recent research has established that these models can, more or less, be thought of as different weighting schemes of underlying returns.  For example, a time-series momentum model (with no skip month) derives its signal by averaging daily log returns over the lookback period equally.
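The "different weighting schemes" idea can be checked numerically. The sketch below assumes signals are computed on log prices (an assumption made here so that price differences are log returns) and uses a synthetic random walk. It shows that a price-minus-moving-average signal can be rebuilt exactly as a weighted sum of past log returns, with weights that decline linearly with age, while time-series momentum weights every return in the window equally. Function names are hypothetical.

```python
import random

def tsmom_signal(log_returns, n):
    """Time-series momentum: equal-weighted sum of the last n log returns."""
    return sum(log_returns[-n:])

def price_minus_ma_signal(log_prices, n):
    """Latest log price minus its n-period simple moving average."""
    window = log_prices[-n:]
    return log_prices[-1] - sum(window) / n

def price_minus_ma_from_returns(log_returns, n):
    """Same signal rebuilt from returns alone.

    The return j periods back (j = 0 is the most recent) receives
    weight (n - 1 - j) / n, so recent returns count the most.
    """
    return sum((n - 1 - j) / n * log_returns[-1 - j] for j in range(n - 1))

# Synthetic log-price random walk (illustrative only).
rng = random.Random(1)
log_prices = [0.0]
for _ in range(250):
    log_prices.append(log_prices[-1] + rng.gauss(0.0003, 0.01))
log_returns = [b - a for a, b in zip(log_prices, log_prices[1:])]

n = 200
direct = price_minus_ma_signal(log_prices, n)
rebuilt = price_minus_ma_from_returns(log_returns, n)
print(abs(direct - rebuilt) < 1e-9)
```

The two signals therefore differ only in how much emphasis they place on recent versus distant returns, which is why they tend to agree most of the time.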

With this common base, a number of papers over the last decade have found significant relationships between the varying methods.  For example:


  • Bruder, Dao, Richard, and Roncalli (2011): Moving-average-double-crossover is just an alternative weighting scheme for time-series momentum.
  • Marshall, Nguyen and Visaltanachoti (2014): Time-series momentum is related to moving-average-change-in-direction.
  • Levine and Pedersen (2015): Time-series-momentum and moving-average cross-overs are highly related; both methods perform similarly on 58 liquid futures contracts.
  • Beekhuizen and Hallerbach (2015): Mathematically linked moving averages with prior returns.
  • Zakamulin (2015): Price-minus-moving-average, moving-average-double-cross-over, and moving-average-change-of-direction can all be interpreted as a computation of a weighted moving average of momentum rules.


As we have argued in past commentaries, we do not believe any single method is necessarily superior to another.  In fact, it is trivial to evaluate these methods over different asset classes and time-horizons and find an example that proves that a given method provides the best result.

Without a crystal ball, however, and without any economic interpretation why one might be superior to another, the choice is arbitrary.  Yet the choice will ultimately introduce randomness into our results: a factor we like to call “process risk.”  A question we should ask ourselves is, “if we have no reason to believe one is better than another, why would we pick one at all?”

We like to think of it this way: ex-post, we will know whether the return over a given period is positive or negative.  Ex-ante, all we have is a handful of trend-following signals that are forecasting that direction.  If, historically, all of these trend signals have been effective, then there may be no reason to necessarily believe one over another.

Combining them, in many ways, is sort of like trying to triangulate on the truth. We have a number of models that all look at the problem from a slightly different perspective and, therefore, provide a slightly different interpretation.  A (very) loose analogy might be using the collective information from a number of cell towers in an effort to pinpoint the geographic location of a cellphone.

We may believe that all of the trend models do a good job of identifying trends over the long run, but most will prove false from time-to-time in the short-run. By using them together, we can potentially increase our overall confidence when the models agree and decrease our confidence when they do not.

With all this in mind, we want to explore the simple question: “how much potential benefit does process diversification bring us?”

The Setup

To answer this question, we first generate a number of long/flat trend-following strategies that invest in a broad U.S. equity index or the risk-free rate (both provided by the Kenneth French database and ranging from 1926 to 2018). There are 48 strategy variations in total, constructed through a combination of three different processes – time-series momentum, price-minus-moving-average, and moving-average double cross-over – and 16 different lookback periods (from the approximate equivalent of 3-to-18 months).

We then treat each of the 48 variations as its own unique asset.
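As a rough illustration of what one such variation might look like, the sketch below builds a long/flat return stream from a time-series momentum rule. It assumes the signal is the sign of the trailing compound equity return over `lookback` periods and that the position is set using only data available before each period; the function name, inputs, and toy numbers are hypothetical, and the commentary's exact construction may differ.

```python
def long_flat_returns(equity_returns, cash_returns, lookback):
    """Hold equities when the trailing equity return is positive, else cash.

    The signal for period t uses returns from t - lookback up to t - 1,
    so no information from period t itself leaks into the decision.
    """
    strategy = []
    for t in range(lookback, len(equity_returns)):
        trailing = 1.0
        for r in equity_returns[t - lookback:t]:
            trailing *= 1.0 + r
        in_market = trailing > 1.0
        strategy.append(equity_returns[t] if in_market else cash_returns[t])
    return strategy

# Toy data: six periods of equity returns and a flat 1% cash return.
equity = [0.10, 0.10, -0.05, -0.20, -0.10, 0.30]
cash = [0.01] * 6
example = long_flat_returns(equity, cash, lookback=2)
print(example)
```

Repeating this for each model and lookback produces one return series per variation, and those series can then be treated like the returns of separate assets.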

To measure process diversification, we are going to use the concept of “independent bets.” The greater the number of independent bets within a portfolio, the greater the internal diversification. Below are a couple examples outlining the basic intuition for a two-asset portfolio:

  • If we have a portfolio holding two totally independent assets with similar volatility levels, a 50% allocation to each would maximize our diversification. Intuitively, we have equally allocated across two unique bets.
  • If we have a portfolio holding two totally independent assets with similar volatility levels, a 90% allocation to one asset and a 10% allocation to another would lead us to a highly concentrated bet.
  • If we have a portfolio holding two highly correlated assets, no matter the allocation split, we have a large, concentrated bet.
  • If we have a portfolio of two assets with disparate volatility levels, we will have a large concentrated bet unless the lower volatility asset comprises the vast majority of the portfolio.

To measure this concept mathematically, we are going to use the fact that the square of the “diversification ratio” of a portfolio is equal to the number of independent bets that portfolio is taking.[1]
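The two-asset examples above can be reproduced with a short calculation. The sketch below uses the standard definition of the diversification ratio, the weighted average of asset volatilities divided by portfolio volatility, and squares it. The volatilities and correlations are made-up inputs chosen only to mirror the four cases, and the function name is hypothetical.

```python
import math

def independent_bets(weights, vols, corr):
    """Squared diversification ratio of a portfolio."""
    n = len(weights)
    weighted_vol = sum(w * s for w, s in zip(weights, vols))
    variance = sum(
        weights[i] * weights[j] * vols[i] * vols[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    return (weighted_vol / math.sqrt(variance)) ** 2

uncorrelated = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.95], [0.95, 1.0]]  # assumed "highly correlated" pair

cases = {
    "independent, 50/50": ([0.5, 0.5], [0.2, 0.2], uncorrelated),
    "independent, 90/10": ([0.9, 0.1], [0.2, 0.2], uncorrelated),
    "highly correlated, 50/50": ([0.5, 0.5], [0.2, 0.2], correlated),
    "disparate vol, 50/50": ([0.5, 0.5], [0.2, 0.02], uncorrelated),
    "disparate vol, low-vol heavy": ([1 / 11, 10 / 11], [0.2, 0.02], uncorrelated),
}
for label, (w, s, c) in cases.items():
    print(f"{label}: {independent_bets(w, s, c):.2f} bets")
```

Only the equal split between independent, similar-volatility assets and the split tilted heavily toward the low-volatility asset reach two bets; the other cases stay close to one.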

Diversifying Parameterization Risk

Within process diversification, the first variable we can tweak is the formation period of our trend signal.  For example, if we are using a time-series momentum model that simply looks at the sign of the total return over the prior period, the length of that period may have a significant influence in the identification of a trend.  Intuition tells us that shorter formation periods might identify short-term trends as well as react to long-term trend changes more quickly but may be more sensitive to whipsaw risk.

To explore the diversification opportunities available to us simply by varying our formation parameterization, we build equal-weight portfolios comprised of two strategies at a time, where each strategy utilizes the same trend model but a different parameterization.  We then measure the number of independent bets in that combination.

We run this test for each trend following process independently.  As an example, we compare using a shorter lookback period with a longer lookback period in the context of time-series momentum in isolation. We will compare across models in the next section.

In the graphs below, L0 through L15 represent the lookback periods, with L0 being the shortest lookback period and L15 representing the longest lookback period.

As we might suspect, the largest increase in available bets arises from combining shorter formation periods with longer formation periods.  This makes sense, as they represent the two horizons that share the smallest proportion of data and therefore have the least “information leakage.” Consider, for example, a time-series momentum signal that has a 4-month lookback and one with an 8-month lookback. At all times, 50% of the information used to derive the latter model is contained within the former model.  While the technical details are subtler, we would generally expect that the more informational overlap, the less diversification is available.

We can see that by combining short- and long-term lookbacks, the total number of bets the portfolio is taking increases from 1.0 to approximately 1.2.

This may not seem like a significant lift, but we should remember Grinold and Kahn’s Fundamental Law of Active Management:

Information Ratio = Information Coefficient x SQRT(Independent Bets)

Assuming the information coefficient stays the same, an increase in the number of independent bets from 1.0 to 1.2 increases our information ratio by approximately 10%.  Such is the power of diversification.
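The arithmetic is easy to verify. In the sketch below the information coefficient of 0.05 is an arbitrary placeholder, since it cancels out of the ratio, and the function name is hypothetical.

```python
import math

def information_ratio(information_coefficient, independent_bets):
    """Fundamental Law of Active Management: IR = IC x sqrt(bets)."""
    return information_coefficient * math.sqrt(independent_bets)

ic = 0.05  # placeholder value; it cancels in the comparison below
lift = information_ratio(ic, 1.2) / information_ratio(ic, 1.0) - 1.0
print(f"Information ratio lift: {lift:.1%}")
```

Because the information ratio scales with the square root of the number of bets, a 20% increase in bets translates into a lift of a little under 10%.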

Another interesting way to approach this data is by allowing an optimizer to attempt to maximize the diversification ratio.  In other words, instead of only looking at naïve, equal-weight combinations of two processes at a time, we can build a portfolio from all available lookback variations.

Doing so may provide two interesting insights.

First, we can see how the optimizer might look to combine different variations to maximize diversification.  Will it barbell long and short lookbacks, or is there benefit to including medium lookbacks? Will the different processes have different solutions?  Second, by optimizing over the full history of data, we can find an upper limit threshold to the number of independent bets we might be able to capture if we had a crystal ball.
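To make the optimization concrete, here is a toy sketch that maximizes the number of independent bets across three strategies (short, medium, and long lookbacks) using a brute-force grid search over long-only weights. The correlation matrix and volatilities are invented for illustration, and the grid search stands in for whatever optimizer was actually used; none of these numbers come from the commentary's data.

```python
import math

def independent_bets(weights, vols, corr):
    """Squared diversification ratio of a portfolio."""
    n = len(weights)
    weighted_vol = sum(w * s for w, s in zip(weights, vols))
    variance = sum(
        weights[i] * weights[j] * vols[i] * vols[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    return (weighted_vol / math.sqrt(variance)) ** 2

def grid_search_max_bets(vols, corr, steps=20):
    """Try every long-only, fully invested 3-asset weight mix on a grid."""
    best_weights, best_bets = None, 0.0
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            weights = (i / steps, j / steps, (steps - i - j) / steps)
            bets = independent_bets(weights, vols, corr)
            if bets > best_bets:
                best_weights, best_bets = weights, bets
    return best_weights, best_bets

# Assumed inputs: short and long lookbacks are the least correlated pair.
vols = [0.15, 0.15, 0.15]
corr = [
    [1.0, 0.9, 0.7],  # short
    [0.9, 1.0, 0.9],  # medium
    [0.7, 0.9, 1.0],  # long
]
best_weights, best_bets = grid_search_max_bets(vols, corr)
equal_bets = independent_bets([1 / 3, 1 / 3, 1 / 3], vols, corr)
print(best_weights, round(best_bets, 3), round(equal_bets, 3))
```

With these particular inputs the search barbells the short and long lookbacks and gives the medium lookback no weight. Real strategy returns need not behave this way, and a different correlation structure could easily leave room for medium-term horizons.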

A few takeaways from the graphs above:

  • Almost all of the processes barbell short and long lookback horizons to maximize diversification.
  • The optimizer finds value, in most cases, in introducing medium-term lookback horizons as well. We can see for Time-Series MOM, the significant weights are placed on L0, L1, L6, L10, and L15.  While not perfectly spaced or equally weighted, this still provides a strong cross-section of available information.  Double MA Cross-Over, on the other hand, finds value in weighting L0, L8, and L15.
  • While the optimizer increases the number of independent bets in all cases versus a naïve, equal-weight approach, the pickup is not incredibly dramatic. At the end of the day, a crystal ball does not find a meaningfully better solution than our intuition may provide.

Diversifying Model Risk

Similar to the process taken in the above section, we will now attempt to quantify the benefits of cross-process diversification.

For each trend model, we will calculate the number of independent bets available by combining it with another trend model but hold the lookback period constant. As an example, we will combine the shortest lookback period of the Time-Series MOM model with the shortest lookback period of the MA Double Cross-Over.

We plot the results below of the number of independent bets available through a naïve, equal-weight combination.

We can see that model combinations can lift the number of independent bets by 0.05 to 0.1.  Not as significant as the theoretical lift from parameter diversification, but not totally insignificant.

Combining Model and Parameterization Diversification

We can once again employ our crystal ball in an attempt to find an upper limit to the diversification available to trend followers, as well as the process / parameterization combinations that will maximize this opportunity.  Below, we plot the results.

We see a few interesting things of note:

  • The vast majority of models and parameterizations are ignored.
  • Time-Series MOM is heavily favored as a model, receiving nearly 60% of the portfolio weight.
  • We see a spread of weight across short, medium, and long-term lookbacks. Short-term is heavily favored, with Time-Series MOM L0 and Price-Minus MA L0 together approaching 45% of model weight.
  • All three models are, ultimately, incorporated, with approximately 10% being allocated to Double MA Cross-Over, 30% to Price-Minus MA, and 60% to Time-Series MOM.

It is worth pointing out that naively allocating equally across all 48 models creates 1.18 independent bets while the full-period crystal ball generated 1.29 bets.

Of course, having a crystal ball is unrealistic.  Below, we look at a rolling window optimization that looks at the prior 5 years of weekly returns to create the most diversified portfolio.  To avoid plotting a graph with 48 different components, we have plotted the results two ways: (1) clustered by process and (2) clustered by lookback period.

Using the rolling window, we see similar results as we saw with the crystal ball. First, Time-Series MOM is largely favored, often peaking well over 50% of the portfolio weights.  Second, we see that a barbelling approach is frequently employed, balancing allocations to the shortest lookbacks (L0 and L1) with the longest lookbacks (L14 and L15).  Mid-length lookbacks are not outright ignored, however, and L5 through L11 combined frequently make up 20% of the portfolio.

Finally, we can see that the rolling number of bets is highly variable over time, but optimization frequently creates a meaningful impact over an equal-weight approach.[2]


In this commentary, we have explored the idea of process diversification.  In the context of a simple long/flat trend-following strategy, we find that combining strategies that employ different trend identification models and different formation periods can lead to an increase in the number of independent bets taken by the portfolio.

As it specifically pertains to trend-following, we see that diversification appears to be maximized by allocating across a number of lookback horizons, with an optimizer putting a particular emphasis on barbelling shorter and longer lookback periods.

We also see that incorporating multiple processes can increase available diversification as well.  Interestingly, the optimizer did not equally diversify across models.  This may be due to the fact that these models are less independent from one another than they might seem.  For example, Zakamulin (2015) demonstrated that these models can all be decomposed into a different weighted average of the same general momentum rules.

Finding process diversification, then, might require moving to a process that may not have a common basis.  For example, trend followers might consider channel methods or a change in basis (e.g. constant volume bars instead of constant time bars).
