Author: Corey Hoffstein

Corey is co-founder and Chief Investment Officer of Newfound Research.

Corey holds a Master of Science in Computational Finance from Carnegie Mellon University and a Bachelor of Science in Computer Science, cum laude, from Cornell University.

You can connect with Corey on LinkedIn or Twitter.

Dart-Throwing Monkeys and Process Diversification

By Corey Hoffstein

On December 24, 2018

In Portfolio Construction, Risk Management, Weekly Commentary

This post is available as a PDF download here.

Summary

This week’s commentary is a short addendum to last week’s piece, attempting to serve as a (very) brief and simplified summary of process diversification.
Volatility is only one way of measuring risk; dispersion in terminal wealth is another.
Using simulations of dart-throwing monkeys, we plot the dispersion in terminal wealth for different levels of portfolio and manager diversification.
We find that increased diversification within a portfolio as well as increased diversification across managers can lead to more consistent portfolio outcomes.

Introduction

In last week’s commentary (What do portfolios and teacups have in common?), we explored at great length the potential benefits of diversification in the domains of what, how, and when.

The crux of our argument is that for investors, return dispersions across time (i.e. “volatility”) can be a potentially misleading risk characteristic and that it is important to consider the potential dispersion in terminal wealth as well.

These are by no means original or unique thoughts. Often the advisors and institutions we work with intuitively understand them: they just have not been presented with the math to justify them.

Therefore, in contrast to last week’s rather expansive note, we aim to keep this week’s note short, simple, and punchy in an effort to drive home how manager / process diversification can help deliver more consistent outcomes.

Dart-Throwing Monkeys

Consider the following experiment.

We begin with thousands and thousands of dart-throwing monkeys. Every month, the monkeys throw their darts at a board that determines how they will be invested for the next month. In this hypothetical scenario, we will assume that the monkeys are investing in different industry groups.¹

Some monkeys are “concentrated managers,” throwing just a single dart and holding that pick for the next month. Other monkeys are more diversified, throwing up to 30 darts each month and equally allocating their portfolio across their investments. Portfolio sizes can be either 1, 5, 10, 15, 20, 25, or 30 equally-allocated investments.

It is our job, as an allocator, to choose different monkeys to invest with. Do we invest with just 1 concentrated monkey manager? Five different diversified managers? How much difference does it really make at the end of the day?

We learn in Finance 101 that once we diversify our portfolio sufficiently, we have eliminated nonsystematic risk. But does that mean we expect the portfolios to necessarily end up in the same place?

As an example, if we pick 10 dart-throwing monkeys who each pick 10 investments per month, how different would we expect our final wealth level to be from another allocator who picks 10 different dart-throwing monkeys who each pick 10 investments per month?

Process Diversification and Terminal Wealth Dispersion

Below we plot the dispersion in terminal wealth² as a function of (1) the number of securities picked by each monkey manager and (2) the number of monkey managers we allocate to.

As an example of how to read this graph, the orange line tells us about portfolios comprised of monkey managers who pick five investments each. As we move from left to right, we learn about the dispersion in terminal wealth based upon the number of managers we allocate to.

We can think of this in two ways. First, we can think of it as the potential dispersion in results among our peers who make the same type of decision (e.g. picking 5 managers who pick 5 investments each) but different specific choices (e.g. picking different managers). Second, we can think of this as the dispersion in possible results if we were able to live across infinite universes simultaneously.

Source: Kenneth French Data Library. Calculations by Newfound Research.

Unfortunately, we cannot live across infinite universes and this graph tells us that choosing a single, highly concentrated manager can lead to wildly different outcomes depending upon the manager we select.

As the managers further diversify and we further diversify among managers, this dispersion in potential outcomes decreases.³
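For readers who want to tinker, the experiment can be sketched in a few lines of Python. This is a minimal sketch under loudly stated assumptions, not the code behind the graph: the return history is synthetic (a shared market move plus group-specific noise) rather than the Kenneth French industry data, each monkey throws distinct darts, the allocator equally weights managers every month, and dispersion is measured as the standard deviation of terminal log-wealth. The function names are hypothetical.

```python
import math
import random
import statistics


def make_returns(n_months, n_groups, rng):
    """Synthetic monthly returns: a shared market move plus group-specific noise."""
    history = []
    for _ in range(n_months):
        market = rng.gauss(0.008, 0.04)
        history.append([market + rng.gauss(0.0, 0.04) for _ in range(n_groups)])
    return history


def terminal_log_wealth(history, n_managers, n_picks, rng):
    """Terminal log-wealth for one allocator who equally weights n_managers monkeys."""
    n_groups = len(history[0])
    total = 0.0
    for month in history:
        portfolio_return = 0.0
        for _ in range(n_managers):
            # Assumption: each monkey re-throws n_picks distinct darts every month.
            darts = rng.sample(range(n_groups), n_picks)
            portfolio_return += sum(month[g] for g in darts) / n_picks
        total += math.log1p(portfolio_return / n_managers)
    return total


def dispersion(history, n_managers, n_picks, n_allocators, rng):
    """Standard deviation of terminal log-wealth across independent allocators."""
    outcomes = [
        terminal_log_wealth(history, n_managers, n_picks, rng)
        for _ in range(n_allocators)
    ]
    return statistics.stdev(outcomes)


rng = random.Random(42)
history = make_returns(n_months=60, n_groups=30, rng=rng)
concentrated = dispersion(history, n_managers=1, n_picks=1, n_allocators=200, rng=rng)
diversified = dispersion(history, n_managers=10, n_picks=10, n_allocators=200, rng=rng)
print(f"1 manager x 1 pick:     {concentrated:.3f}")
print(f"10 managers x 10 picks: {diversified:.3f}")
```

Because every allocator faces the same return history, all of the dispersion comes from the darts. Varying `n_managers` and `n_picks` traces out the two dimensions discussed above.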

Conclusion

The intuition behind these results is simple:

More diversified managers are more likely to overlap in portfolio holdings with one another, and therefore are likely to have more similar returns.
Similarly, as the number of managers we choose goes up, so does the likelihood of overlap in holdings with a peer who also selects the same number of managers.

It is equally valid to interpret this analysis as saying there is greater opportunity for out-performance in taking concentrated bets in highly concentrated managers. We would argue this is “more right” thinking: the win condition requires both that we pick the right managers and that the managers pick the right stocks. While a little bit of diversification can go a long way here in clipping outlier events, the dispersion can still far exceed that of a more diversified approach.

At Newfound, we prefer the “less wrong” approach. Allocations to a few diversified managers each taking a different approach can lead to significantly less dispersion in outcomes and, therefore, allow for better financial planning.

What do portfolios and teacups have in common?

By Corey Hoffstein

On December 17, 2018

In Portfolio Construction, Risk Management, Weekly Commentary

This post is available as a PDF download here.

Summary

Portfolio risk is often measured as the variance of returns over time. Another form of risk is the variance of terminal wealth that can arise from small variations in strategy inputs or asset returns.
Strategies or portfolios that are more sensitive to small changes in inputs are inherently “fragile.”
Fragile strategy design makes it difficult to rely upon backtests or historical results in setting forward expectations.
We explore how diversification across the “what,” “how,” and “when,” axes of portfolio construction can help reduce strategy fragility.

Introduction

At Newfound, we spend a lot less time trying to figure out how to be more right than we spend trying to figure out how to be less wrong. One area of particular interest for us is the idea of unintended bets: the exposures in a portfolio we may not even be aware of. And if we knew we had the exposure, we might not even want it.

For example, consider a portfolio that invests in either broad U.S., broad international, or broad emerging market equities based upon valuations. A significant tilt towards non-U.S. assets may be a valuation-driven decision, but for U.S. investors it creates significant exposure to fluctuations in the U.S. dollar versus foreign currencies.

Of course, exposures are not limited only to assets. Exposures may be broader macro-economic, stylistic, thematic, geographic, or even political factors.

These unintended bets can go far beyond explicit and implicit exposures. In our example, the choice of how to measure value may lead to meaningfully different portfolios, despite the same overarching thesis. For example, a naïve CAPE ratio versus adjusting for differences in relative sector composition dramatically alters the view of whether international equities are significantly cheaper than U.S. equities. These potential differences capture what we like to call “model specification risk.”

Finally, we can be subject to unintended bets based upon when the portfolio is re-evaluated and reconstituted. Evaluating valuations in January, for example, may lead to a different decision versus evaluating them in July.

How can we avoid these unintended bets? At Newfound, we believe that the answer falls back to diversification: not only in the traditional sense of what we invest in, but also across how we make decisions and when we make them.

When left uncontrolled, unintended bets can make a strategy incredibly fragile.

What, precisely, does it mean for a strategy to be fragile? A strategy is fragile when small variations of strategy inputs – be it asset returns or other measures – lead to meaningful dispersion in realized results.

Now we want to distinguish between volatility and fragility. Volatility is the dispersion of strategy returns across time, while fragility is the dispersion in end-of-period wealth across variations of the strategy.

As an example, a portfolio that invests only in the S&P 500 is very volatile but not particularly fragile. Given the last ten years of returns for the S&P 500, slight variations in annual returns would not lead to significant dispersion in end-of-period wealth. On the other hand, a strategy that flips a coin every December and invests for the next year in the S&P 500 when it lands on heads or short-term U.S. Treasuries when it lands on tails would have lower expected volatility than the S&P 500 but would be much more fragile. We need simply consider a few scenarios (e.g. all heads or all tails) to understand the potential dispersion such a strategy is subject to.

In the remainder of this commentary, we will demonstrate how diversification across the what, how, and when axes can reduce strategy fragility.

The Experiment Setup

Since a large degree of our focus at Newfound is on managing trend equity mandates, we will explore fragility through the lens of the style of measuring trends. For those unfamiliar with the approach, trend equity strategies aim to capture a significant portion of equity market growth while avoiding substantial and prolonged drawdowns through the application of trend following. A naïve implementation of such an idea would be to invest in the S&P 500 when its prior 12-month return has been positive and invest in short-term U.S. Treasuries otherwise.

To learn something about the fragility of a strategy, we are going to have to inject some randomness. After all, no amount of history will tell us about the fragility of a teacup that has spent its entire life sitting on a shelf; we will need to see it fall on the floor to actually learn something.

As with our recent commentary When Simplicity Met Fragility, we will inject randomness by adding white noise to asset returns. Specifically, we will add to daily returns a draw from a random normal distribution with mean 0% and standard deviation 0.025%. Using this slightly altered history, we will then run our investment strategy.

By performing this process a large number of times (10,000 in this commentary), we can explore how the outcome of the strategy is impacted by these slight variations in return history. The greater the dispersion in results, the more fragile the strategy is.

To demonstrate how diversification across the three different axes can affect fragility, we will start with a naïve trend equity strategy – investing in broad U.S. equities using a single trend model that is rebalanced on a monthly basis – and vary the three components in isolation.
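The perturbation procedure described above can be sketched as follows. This is an illustrative sketch only, with several assumptions that are not from our actual implementation: the base history is synthetic, a "month" is 21 trading days, the trend model is a plain trailing 12-month return, the risk-free asset returns 0%, and fragility is summarized as the standard deviation of terminal log-wealth. We also use 200 trials rather than 10,000 to keep the run time short.

```python
import math
import random
import statistics

DAYS_PER_MONTH = 21  # assumption: treat every 21 trading days as one month


def trend_log_wealth(daily_returns, lookback_months=12):
    """Terminal log-wealth of a long/flat strategy re-evaluated at each month end."""
    cumulative = [0.0]
    for r in daily_returns:
        cumulative.append(cumulative[-1] + math.log1p(r))
    lookback = lookback_months * DAYS_PER_MONTH
    last_start = len(daily_returns) - DAYS_PER_MONTH
    log_wealth = 0.0
    for start in range(lookback, last_start + 1, DAYS_PER_MONTH):
        trailing = cumulative[start] - cumulative[start - lookback]
        if trailing > 0:  # positive trend: hold equities for the next month
            log_wealth += cumulative[start + DAYS_PER_MONTH] - cumulative[start]
        # otherwise hold the risk-free asset, assumed here to return 0%
    return log_wealth


def fragility(daily_returns, n_trials, noise_sd, rng):
    """Dispersion of terminal log-wealth across slightly perturbed histories."""
    outcomes = []
    for _ in range(n_trials):
        noisy = [r + rng.gauss(0.0, noise_sd) for r in daily_returns]
        outcomes.append(trend_log_wealth(noisy))
    return statistics.stdev(outcomes)


rng = random.Random(7)
# Synthetic stand-in for ten years of daily equity returns.
base_history = [rng.gauss(0.0003, 0.01) for _ in range(DAYS_PER_MONTH * 12 * 10)]
spread = fragility(base_history, n_trials=200, noise_sd=0.00025, rng=rng)
print(f"Std. dev. of terminal log-wealth across perturbed histories: {spread:.3f}")
```

The larger the printed figure, the more the outcome depends on noise that is tiny relative to daily market moves.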

The What

The “what” axis simply asks, “what are we invested in?”

How can our choice of “what” affect fragility? Consider a slight variation to our coin-flip strategy from before. Instead of flipping a single coin, we will now flip two coins. The first coin determines whether we invest 50% of the portfolio in either the S&P 500 or short-term U.S. Treasuries, while the second coin determines whether we invest the other 50% of the portfolio in either the Russell 1000 or short-term U.S. Treasuries.

In our single coin example, each year we expected to invest in the S&P 500 50% of the time and in short-term U.S. Treasuries 50% of the time. With two coins, we now expect to be fully invested 25% of the time, partially invested 50% of the time, and divested 25% of the time.

Let’s take this notion to further limits. Consider now flipping 100 coins where each determines the allocation decision for 1% of our portfolio, where heads leads to an investment in a large-cap U.S. equity portfolio and tails leads to an investment in short-term U.S. Treasuries. Now being fully invested or divested is an infinitesimally small probability event; in fact, for a given year there is a 95% chance that your allocation to equities falls between 40% and 60%.¹

Even though we’ve applied the exact same process to each investment, diversifying across more investments has dramatically reduced the fragility of our coin-flipping strategy.
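The coin-flip arithmetic can be checked directly with the binomial distribution. The helper below is a small sketch (the function name is ours, purely for illustration) that assumes fair, independent coins.

```python
from math import comb


def prob_heads_between(n_coins, low, high):
    """Probability that the number of heads among n fair coins lies in [low, high]."""
    favorable = sum(comb(n_coins, k) for k in range(low, high + 1))
    return favorable / 2 ** n_coins


# Two coins: fully invested, partially invested, fully divested.
print(prob_heads_between(2, 2, 2), prob_heads_between(2, 1, 1), prob_heads_between(2, 0, 0))

# One hundred coins: chance the equity allocation lands between 40% and 60%.
p_middle = prob_heads_between(100, 40, 60)
# Chance of being fully invested (all heads).
p_fully_invested = prob_heads_between(100, 100, 100)
print(f"P(40% to 60% equities) = {p_middle:.4f}")
print(f"P(fully invested)      = {p_fully_invested:.3e}")
```

The exact calculation comes out slightly above the rounded 95% figure, which corresponds to the familiar two-standard-deviation rule of thumb.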

Now let’s translate this from the theoretical to the practical. We will begin with a simple trend following strategy that invests in the underlying asset when prior 12-1 month returns have been positive or invests in the risk-free rate, re-evaluating the trend at the end of each month.

To explore the impact of diversifying our what, we will implement this strategy five different ways:

A single in-or-out decision on broad U.S. equities.
Applied across 5 equally-weighted U.S. equity industry groups.
Applied across 12 equally-weighted U.S. equity industry groups.
Applied across 30 equally-weighted U.S. equity industry groups.
Applied across 48 equally-weighted U.S. equity industry groups.

The graph below plots the distribution of log difference in terminal wealth against the median outcome for each of these five approaches. Lines within each “violin” show the 25th, 50th, and 75th percentiles.

The graph clearly demonstrates that by increasing our exposure across the “what” axis, the dispersion in terminal wealth is dramatically reduced.

Source: Kenneth French Data Library. Calculations by Newfound Research.

But why is reduced dispersion in terminal wealth necessarily better?

It implies a greater consistency in outcome, which is not only important for setting forward expectations, but is also important for evaluating past performance (whether backtested or live). This evidence tells us that if we are evaluating a trend equity strategy that employs a single model to make in-or-out decisions on broad U.S. equities on a monthly basis, it will be nearly impossible to tell whether the realized results are in line with reasonable expectations or overly optimistic (we can probably guess that they aren’t overly pessimistic, as those sorts of returns typically aren’t marketed).

To justify a concentration in the “what” axis, we would have to demonstrate that the worst-case scenarios would still represent a meaningful improvement in expected terminal wealth versus a more diversified approach.

It should be noted that our experiment design prohibits dispersion from ever being fully eliminated, as we are injecting randomness into past returns. Even if no strategy is applied, there will be some inherent dispersion in final wealth. For example, below we plot the dispersion that occurs simply from adding randomness to past returns with a buy-and-hold approach.

Increasing the number of assets in the portfolio inherently reduces dispersion for buy-and-hold because diversification helps drive the expected impact of the injected randomness towards its mean: zero. With only one asset, on the other hand, outlier events are free to wreak havoc on results.

Source: Kenneth French Data Library. Calculations by Newfound Research.

Note that adding a strategy on top of buy-and-hold can exacerbate the fragility issue, making diversification that much more important.

The How

The “how” axis asks, “how are we making investment decisions?”

Many investors are already somewhat familiar with diversification along the “how” axis, often diversifying their active exposures across multiple managers who might have similar investment mandates but slightly different processes.

We like to call this “process diversification” and think of it as akin to the parable of the blind men and the elephant. Each blind man touches a different part of the elephant and pronounces his belief in what he is touching based upon his isolated view. The blind man touching the leg, for example, might think he is touching a sturdy tree while the blind man touching the tail might believe he is grabbing a rope.

None is correct in isolation but taken together we may gain a more well-rounded picture.

Similarly, two managers may claim to invest based upon valuations, but the manner in which they do so gives them a very different picture of where value can be found.

The idea of process diversification was explored in the 1999 paper “Do You Need More than One Manager for a Given Equity Style?” by Franklin Fant and Edward O’Neal. Fant and O’Neal found that while a multi-manager approach does very little for return variability across time (i.e. portfolio volatility), it does a lot for end-of-period wealth variability. They find this to be true across almost all equity style box categories. In other words: taking a multi-manager approach can reduce fragility.

Let us return to our prior coin flip example. Instead of making a choice to invest in the S&P 500 based upon a coin-flip, however, we will combine a number of different signals. For example, we might flip a coin, roll a die, measure the weather, and look at the second hand of a clock. Each signal gives us some sort of in-or-out decision, and we average these decisions together to get our allocation. As with before, as we incorporate more signals, we decrease the probability that we end up with extreme allocations, leading to a more consistent terminal wealth distribution.

Again, we should stress here that the objective is not just outright elimination of dispersion in terminal wealth. After all, if that were our sole pursuit, we could simply stuff our money under our mattress. Rather, assuming we will be implementing some active investment strategy that we hope has a positive long-term expected return, our aim should be to reduce the dispersion in terminal wealth for that strategy.

Of course, in investing we would not expect the processes to be entirely independent. With trend following, for example, most popular models are actually mathematically linked to one another, and therefore generate signals that are highly correlated. Nevertheless, even modest diversification can have meaningful benefits with respect to strategy fragility.

To explore the impact of diversification along the how axis, we implement our trend following strategy six different ways. Each invests in broad U.S. equities and rebalances monthly but differs in the number of trend-following models employed.²

The results are plotted below.

Source: Kenneth French Data Library. Calculations by Newfound Research.

Again, we can see that increased diversification across the how axis dramatically reduces dispersion in terminal wealth. Our takeaway is largely the same: without an ex-ante view as to which particular model (or group of models) is best (i.e. a view of how to be more right), diversification can lead to greater consistency in results. We will be less wrong.

A subtler conclusion of this analysis is that it should be very, very difficult to conclude that one model is better than another. We can see that if we risk selecting just one model to govern our process, seemingly minor variations in historical returns can lead to dramatically different terminal wealth results, as evidenced by the bulging distribution. Inverting this line of thinking, we should also be suspicious of any analysis that seeks to demonstrate the superiority of a given model using a single backtest. For example, just because a 12-1 month total return model performs better than a 10-month moving average model on historical S&P 500 returns, we should be highly skeptical as to the robustness of the conclusion that the 12-1 model is best.

The When

The “when” axis asks, “when are we making our investment decision?”

This is an oft-overlooked question in public markets, but it is commonly addressed in the world of private equity and venture capital. Due to the illiquid nature of those markets, investors will often attempt to diversify their business cycle risk by establishing positions in multiple funds over time, giving them exposure to different “vintages.” The idea here is simple: the opportunity set available at different points in time can vary and if we allocate all of our earmarked capital to a particular year, we may miss out on later opportunities.

Consider our original coin-flipping example where we flipped a single coin every December to determine whether we would buy the S&P 500 or hold our capital in short-term Treasuries. But why was it necessary that we make the decision in December? Why not July? Or January? Or September?

While we would not expect there to be point-in-time risk for coin flipping, we can still consider the net effect of a vintage-based allocation methodology. Here we will assume that we flip a coin each month and rebalance 1/12th of our capital based upon the result.

Again, the probability of allocating to the extremes (100% invested or 100% divested) is dramatically reduced (each has approximately a 0.02% chance of occurring) and we reduce strategy fragility to any specific coin flip.
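The figures for the monthly-tranche coin flip follow from simple arithmetic, sketched below under the assumption of fair, independent flips with the allocation equal to the average of the twelve most recent results. The variable names are illustrative.

```python
import math

n_tranches = 12

# All twelve tranches land the same way (all heads, or equally, all tails).
p_extreme = 0.5 ** n_tranches

# Standard deviation of the equity allocation, as a fraction of the portfolio.
# A single fair coin has variance 0.25; averaging 12 independent flips divides it by 12.
single_flip_sd = math.sqrt(0.25)
tranched_sd = math.sqrt(0.25 / n_tranches)

print(f"P(100% invested) = {p_extreme:.4%}")
print(f"Allocation std. dev., single annual flip: {single_flip_sd:.3f}")
print(f"Allocation std. dev., 12 monthly tranches: {tranched_sd:.3f}")
```

The extreme outcomes become rare, and the typical swing in the allocation shrinks by a factor of the square root of the number of tranches.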

But just how impactful is this notion? Below we plot the rolling 1-year total return difference between two 60% S&P 500 / 40% 5-year U.S. Treasury fixed-mix portfolios, with one being rebalanced in February and one in August. Even for this highly simplified example, we can see that the total return spread between the two portfolios blows out to over 700 basis points in March 2010 due to the fact that the February portfolio rebalanced back into equities at nearly the exact bottom of the crisis.

Source: Global Financial Data. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.

To increase diversification across the “when” axis, we want to increase the number of vintages we deploy. For our trend following example, we will assume that the portfolio allocates between broad U.S. equities and the risk-free rate based upon a single model, but with an increasing number of evenly-spaced vintages. Again, we will run 10,000 simulations that each slightly perturb historical U.S. equity market returns and compare the terminal wealth variation for approaches that employ a different number of vintages.

We can see in the graph below that, as with the other axes of diversification, as we increase the number of vintages employed, the variance decreases. While the 25th and 75th percentiles do not decrease as dramatically as for the other axes, we can see that the extreme variations are reined in substantially when we move from 1 monthly tranche to 4 weekly tranches.

Source: Kenneth French Data Library. Calculations by Newfound Research.

Conclusion

We see two critical conclusions from this analysis:

To develop confidence in achieving our objective we have to consider our sensitivity to unintended bets that may be included within the portfolio.

Fragility makes it incredibly difficult to distinguish between luck and skill, particularly as strategy fragility increases. This is true for both backtested and live performance.

To conclude our analysis, below we present a graph that combines diversification across all three axes. We again run 10,000 samples, randomly perturbing returns. For each sample, we then run four variations:

A single, randomly selected model run in broad U.S. equities that is rebalanced monthly.
A random selection of 3 models run on 5 industry groups in 2 bi-weekly tranches.
A random selection of 6 models run on 12 industry groups in 4 weekly tranches.
A random selection of 9 models run on 30 industry groups in 20 daily tranches.

It should come as no surprise that as we increase the amount of diversification across all three axes, the dispersion in terminal wealth is dramatically reduced.³

Source: Kenneth French Data Library. Calculations by Newfound Research.

It is also important to note that while our analysis focused on trend following strategies, this same line of thinking applies across all investment approaches. As an example, consider a quantitative value manager who buys the top five cheapest stocks, as measured by price-to-book, in the S&P 500 each December and then holds them for the next year. Questions worth pondering are:

What does it say about our conviction when the 6th stock in the list is incredibly close to the 5th stock?
What happens if some of our measures of book value are incorrect (or even just outdated)?
How different would the portfolio look if we ranked on another value measure (e.g. price-to-earnings)?
How different would the opportunity set be if we ranked every June versus every December?

While low levels of diversification across the what, how, and when axes are not necessarily an indicator that a model is inherently fragile, they should be a red flag that more effort is required to demonstrate that it is not fragile.

When Simplicity Met Fragility

By Corey Hoffstein

On October 29, 2018

In Craftsmanship, Portfolio Construction, Risk Management, Weekly Commentary

This post is available as a PDF download here.

Summary

Research suggests that simple heuristics are often far more robust than more complicated, theoretically optimal solutions.
Taken too far, we believe simplicity can actually introduce significant fragility into an investment process.
Using trend equity as an example, we demonstrate how using only a single signal to drive portfolio allocations can make a portfolio highly sensitive to the impact of randomness, clouding our ability to determine the difference between skill and luck.
We demonstrate that a slightly more complicated process that combines signals significantly reduces the portfolio’s sensitivity to randomness.
We believe that the optimal level of simplicity is found at the balance of diversification benefit and introduced estimation risk. When a more complicated process can introduce meaningful diversification gain into a strategy or portfolio with little estimation risk, it should be considered.

Introduction

In the world of finance, simple can be surprisingly robust. DeMiguel, Garlappi, and Uppal (2005)¹, for example, demonstrate that a naïve, equal-weight portfolio frequently delivers higher Sharpe ratios, higher certainty-equivalent returns, and lower turnover out-of-sample than competitive “optimal” allocation policies. In one of our favorite papers, Haldane (2012)² demonstrates that simplified heuristics often outperform more complicated algorithms in a variety of fields.

Yet taken to an extreme, we believe that simplicity can have the opposite effect, introducing extreme fragility into an investment strategy.

As an absurd example, consider a highly simplified portfolio that is 100% allocated to U.S. equities. Introducing bonds into the portfolio may not seem like a large mental leap but consider that this small change introduces an axis of decision making that brings with it a number of considerations. The proportion we allocate between stocks and bonds requires, at the very least, estimates of an investor’s objectives, risk tolerances, market outlook, and confidence levels in these considerations.³

Despite this added complexity, few investors would consider an all-equity portfolio to be more “robust” by almost any reasonable definition of robustness.

Yet this is precisely the type of behavior we see all too often in tactical portfolios – and particularly in trend equity strategies – where investors follow a single signal to make dramatic allocation decisions.

So Close and Yet So Far

To demonstrate the potential fragility of simplicity, we will examine several trend-following signals applied to broad U.S. equities:

Price minus the 10-month moving average
12-1 month total return
13-minus-34-week exponential moving average cross-over

Below we plot over time the distance each of these signals is from turning off. Whenever the line crosses over the 0% threshold, it means the signal has flipped direction, with negative values indicating a sell and positive values indicating a buy.

In orange we highlight those periods where the signal is within 1% of changing direction. We can see that for each signal there are numerous occasions where the signal was within this threshold but avoided flipping direction. Similarly, we can see a number of scenarios where the signal just breaks the 0% threshold only to revert back shortly thereafter. In the former case, the signal has often just managed to avoid whipsaw, while in the latter it has usually become unfortunately subject to it.
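To make the "distance from turning off" concrete, here is one way the three signals might be computed. This is a sketch under assumptions: the exact formulas are not spelled out above, so we assume each distance is expressed as a percentage ratio, that the price series stands in for a total-return index, and that the exponential moving average uses the common smoothing factor 2 / (span + 1). The function names and sample data are hypothetical.

```python
def sma_distance(monthly_prices, window=10):
    """Latest price relative to its trailing 10-month simple moving average."""
    recent = monthly_prices[-window:]
    return monthly_prices[-1] / (sum(recent) / len(recent)) - 1.0


def total_return_12_1(monthly_prices):
    """Return from 12 months ago to 1 month ago (skips the most recent month)."""
    return monthly_prices[-2] / monthly_prices[-13] - 1.0


def ema(values, span):
    """Exponential moving average seeded with the first observation."""
    alpha = 2.0 / (span + 1.0)
    level = values[0]
    for value in values[1:]:
        level += alpha * (value - level)
    return level


def ema_cross_distance(weekly_prices, fast=13, slow=34):
    """13-week EMA relative to the 34-week EMA."""
    return ema(weekly_prices, fast) / ema(weekly_prices, slow) - 1.0


def near_flip(distance, threshold=0.01):
    """True when a signal is within 1% of changing direction."""
    return abs(distance) < threshold


# Synthetic, steadily rising index levels purely for illustration.
monthly = [100.0 * 1.01 ** i for i in range(24)]
weekly = [100.0 * 1.002 ** i for i in range(104)]

signals = {
    "price minus 10-month SMA": sma_distance(monthly),
    "12-1 month total return": total_return_12_1(monthly),
    "13/34-week EMA cross": ema_cross_distance(weekly),
}
for name, distance in signals.items():
    side = "buy" if distance > 0 else "sell"
    print(f"{name}: {distance:+.2%} ({side}, near flip: {near_flip(distance)})")
```

A positive distance corresponds to a buy and a negative distance to a sell; `near_flip` marks the "close call" periods highlighted in orange.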

Source: Kenneth French Data Library. Calculations by Newfound Research.

Is the avoidance of whipsaw representative of the “skill” of the signals while the realization of whipsaw is just bad luck? Or might it be that the avoidance of whipsaw is often just as much luck as the realization of whipsaw is poor skill? How can we determine what is skill and what is luck when there are so many “close calls” and “just hits”?

What is potentially confusing for investors new to this space is that academic literature and practitioner evidence find that these highly simplified approaches are surprisingly robust across a variety of investment vehicles, geographies, and time periods. What we must stress, however, is that evidence of general robustness is not evidence of specific robustness; i.e. there is little evidence suggesting that a single approach applied to a single instrument over a specific time horizon will be particularly robust.

What Randomness Tells Us About Fragility

To emphasize the potential fragility of utilizing a single in-or-out signal to drive our allocation decisions, we run a simple test:

Begin with daily market returns
Add a small amount of white noise (mean 0%; standard deviation 0.025%) to daily market returns
Calculate a long/flat trend equity strategy using 12-1 month momentum signals⁴
Calculate the rolling 12-month return of the strategy minus the alternate market history return.
Repeat 1,000 times to generate 1,000 slightly alternate histories.

The design of this test aims to deduce how fragile a strategy is via the introduction of randomness. By measuring 12-month rolling relative returns versus the modified benchmarks, we can compare the 1,000 slightly alternate histories to one another in an effort to determine the overall stability of the strategy itself.

Now bear with us, because while the next graph is a bit difficult to read, it succinctly captures the thrust of our entire thesis. At each point in time, we first calculate the average 12-month relative return of all 1,000 strategies. This average provides a baseline of expected relative strategy performance.

Next, we calculate the maximum and minimum 12-month relative performance and subtract the average. This spread – which is plotted in the graph below – aims to capture the potential return differential around the expected strategy performance due to randomness. Or, put another way, the spread captures the potential impact of luck in strategy results due only to slight changes in market returns.
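The test and the spread calculation can be sketched roughly as follows. This is not our production code: the base history is synthetic, a "month" is assumed to be 21 trading days, the flat position is assumed to earn 0%, and only 200 alternate histories are generated to keep the example quick. All names are illustrative.

```python
import math
import random

DAYS_PER_MONTH = 21  # assumption: 21 trading days per month


def monthly_log_returns(daily_returns):
    """Compound daily simple returns into monthly log returns."""
    logs = [math.log1p(r) for r in daily_returns]
    last = len(logs) - DAYS_PER_MONTH
    return [sum(logs[i:i + DAYS_PER_MONTH]) for i in range(0, last + 1, DAYS_PER_MONTH)]


def rolling_relative_returns(daily_returns):
    """Rolling 12-month return of a long/flat 12-1 momentum strategy minus the market."""
    market = monthly_log_returns(daily_returns)
    strategy = []
    for t in range(12, len(market)):
        signal = sum(market[t - 12:t - 1])  # 12 months ago to 1 month ago
        strategy.append(market[t] if signal > 0 else 0.0)  # flat earns 0% (assumed)
    market = market[12:]  # align the market with the strategy's live months
    relative = []
    for t in range(12, len(strategy) + 1):
        strategy_12m = math.expm1(sum(strategy[t - 12:t]))
        market_12m = math.expm1(sum(market[t - 12:t]))
        relative.append(strategy_12m - market_12m)
    return relative


rng = random.Random(0)
base = [rng.gauss(0.0003, 0.01) for _ in range(DAYS_PER_MONTH * 12 * 8)]

paths = []
for _ in range(200):
    noisy = [r + rng.gauss(0.0, 0.00025) for r in base]  # white noise: mean 0%, sd 0.025%
    paths.append(rolling_relative_returns(noisy))

n_points = len(paths[0])
upper, lower = [], []
for t in range(n_points):
    values = [path[t] for path in paths]
    average = sum(values) / len(values)
    upper.append(max(values) - average)  # best-case luck around the average
    lower.append(min(values) - average)  # worst-case luck around the average

print(f"Largest upside spread:   {max(upper):+.2%}")
print(f"Largest downside spread: {min(lower):+.2%}")
```

Each alternate history is benchmarked against its own perturbed market, so the spread isolates how much the strategy's relative result depends on the noise alone.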

Source: Kenneth French Data Library. Calculations by Newfound Research.

We can see that the spread frequently exceeds 5% and sometimes even exceeds 10%. Thus, a tiny bit of injected randomness has a massive effect upon our realized results. Using a single signal to drive our allocation appears particularly fragile, and success or failure over the short run can largely be dictated by the direction the random winds blow.

A backtest based upon a single signal may look particularly good, but this evidence suggests we should dampen our confidence as the strategy may actually have just been the accidental beneficiary of good fortune. In this situation, it is nearly impossible to identify skill from luck when in a slightly alternate universe we may have had substantially different results. After all, good luck in the past can easily turn into misfortune in the future.

Now let us perform the same exercise again using the same random sequences we generated. But rather than using a single signal to drive our allocation we will blend the three trend-following approaches above to determine the proportional amount of equities the portfolio should hold.⁵ We plot the results below using the same scale in the y-axis as the prior plot.

Source: Kenneth French Data Library. Calculations by Newfound Research.

We can see that our more complicated approach actually exhibits a significant reduction in the effects of randomness, with outlier events significantly decreased and far more symmetry in both positive and negative impacts.
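
The blending step can be sketched as follows. This is a minimal illustration that assumes each trend model emits a simple on/off signal and that the signals are equally weighted; the function name and its inputs are hypothetical.

```python
def blended_equity_weight(signals):
    """Proportion of the portfolio to hold in equities.

    signals: iterable of booleans, one per trend model, where True means the
    model currently reads the trend as positive (assumed signal convention).
    With equal weighting, the allocation is the fraction of models that are "on".
    """
    signals = list(signals)
    if not signals:
        raise ValueError("at least one signal is required")
    return sum(1 for s in signals if s) / len(signals)

# Two of three hypothetical models are positive, so hold two thirds in equities.
example_weight = blended_equity_weight([True, False, True])
```

A single-signal strategy can only jump between 0% and 100% equities, so one noisy reading flips the whole allocation. With three signals, one noisy reading moves the allocation by a third.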

Below we plot the actual spreads themselves. We can see that the spread from the combined signal approach is lower than the single signal approach on a fairly consistent basis. In the cases where the spread is larger, it is usually because the sensitivity is arising from either the 10-month SMA or 13-minus-34-week EWMA signals. Were spreads for single signal strategies based upon those approaches plotted, they would likely be larger during those time periods.

Source: Kenneth French Data Library. Calculations by Newfound Research.

Conclusion

So, where is the balance? How can we tell when simplicity creates robustness and when simplicity introduces fragility? As we discussed in our article A Case Against Overweighting International Equity, we believe the answer is diversification versus estimation risk.

In our case above, each trend signal is just a model: an estimate of what the underlying trend is. As with all models, it is imprecise and our confidence level in any individual signal at any point in time being correct may actually be fairly low. We can wrap this all together by simply saying that each signal is actually shrouded in a distribution of estimation risk. But by combining multiple trend signals, we exploit the benefits of diversification in an effort to reduce our overall estimation risk.

Thus, while we may consider a multi-model approach less transparent and more complicated, that added layer of complication serves to increase internal diversification and reduce estimation risk.

It should not go overlooked that the manner in which the signals were blended represents a model with its own estimation risk. Our choice to simply equally-weight the signals indicates a zero-confidence position in views about relative model accuracy and relative marginal diversification benefits among the models. Had we chosen a more complicated method of combining signals, it is entirely possible that the realized estimation risk could overwhelm the diversification gain we aimed to benefit from in the first place. Or, conversely, that very same added estimation risk could be entirely justified if we could continue to meaningfully improve diversification benefits.

If we return to our original example of a 100% equity portfolio versus a blended stock-bond mix, the diversification versus estimation risk trade-off becomes obvious. Introducing bonds into our portfolio creates such a significant diversification gain that the estimation risk is often an insignificant consideration. The same might not be true, however, in a tactical equity portfolio.

Research and empirical evidence suggest that simplicity is surprisingly robust. But we should be skeptical of simplicity for the sake of simplicity when it forgoes low-hanging diversification opportunities, lest we make our portfolios and strategies unintentionally fragile.

Attack of the Clone: Lessons from Replicating Long/Short Equity

By Corey Hoffstein

On October 22, 2018

In Risk & Style Premia, Weekly Commentary

Summary

In this commentary we attempt to identify the sources of performance in long/short equity strategies.
Using Kalman Filtering, we attempt to replicate the Credit Suisse Long/Short Liquid Index with a set of common factors designed to capture equity beta, regional, and style tilts.
We find that as a category, long/short equity managers make significant changes to their equity beta and regional tilts over time.
Year-to-date, we find that tilts towards foreign developed equities, emerging market equities, and the value premium have been the most significant detractors from index performance.
We believe that the consistent relative out-performance of U.S. equities against international peers has removed an important alpha source for long/short equity managers when they are benchmarked against U.S. equities.

Please note that analysis performed in this commentary is only through 8/31/2018 despite a publishing date of 10/22/2018 due to data availability.

Introduction

Since 4/30/1994, the Credit Suisse Long/Short Equity Hedge Fund (“CS L/S EQHF”) Index has returned 9.0% annualized with an 8.8% annualized volatility and a maximum drawdown of just 22%. While the S&P 500 has bested it on an absolute return basis – returning 10.0% annualized – it has done so with considerably more risk, exhibiting 14.4% annualized volatility and a maximum drawdown of 51%. Capturing 90% of the long-term annualized return of the S&P 500 with only 60% of the volatility and less than half the maximum drawdown is an astounding feat. Particularly because this is not the performance of a single star manager, but the blended returns of dozens of managers.

Yet absolute performance in this category has languished as of late. While the S&P 500 has returned an astounding 13.5% annualized over the last five years, the CS L/S EQHF Index has only returned 5.6% annualized. Of course, returns are only part of the story, but this performance is in stark contrast to the relative performance experienced during the 2003-2007 bull market. From 12/31/2003 to 12/31/2007, the average rolling 1-year performance difference between the S&P 500 and the CS L/S EQHF Index was less than 1 basis point whereas the average rolling 1-year performance differential from 12/31/2010 to 12/31/2017 was 877 basis points. Year-to-date performance in 2018 has been no exception to this trend. The CS L/S EQHF Index is up just 2.1% compared to a positive 9.7% for the S&P 500, with several popular strategies faring far worse.

Now, before we dive any deeper, we want to address the obvious: comparing long/short equity returns against the S&P 500 is foolish. The long-term beta of the category is less than 0.5, so it should not come as a surprise that absolute returns have languished during a period where vanilla U.S. equity beta has been one of the best performing asset classes. Nevertheless, while the CS L/S EQHF typically exhibited higher risk-adjusted returns than equity beta from 1994 through 2011, the reverse has been true since 2012.

Identifying precisely why both absolute and relative risk-adjusted performance has declined over the last several years can be difficult, as the category as a whole is incredibly varied in nature. Consider this index definition from Credit Suisse:

The Credit Suisse Long/Short Equity Hedge Fund Index is a subset of the Credit Suisse Hedge Fund Index that measures the aggregate performance of long/short equity funds. Long/short equity funds typically invest in both long and short sides of equity markets, generally focusing on diversifying or hedging across particular sectors, regions or market capitalizations. Managers typically have the flexibility to shift from value to growth; small to medium to large capitalization stocks; and net long to net short. Managers can also trade equity futures and options as well as equity-related securities and debt or build portfolios that are more concentrated than traditional long-only equity funds.

The wide degree of flexibility means that we would expect significant dispersion in individual strategy performance. Examining a broad index may still be useful, however, as we may be able to decipher the large muscle movements that have driven common performance. In order to do so, we have to get under the hood and try to replicate the index using common factor exposures.

Figure 1: Credit Suisse Long/Short Equity Indices

Data from 12/1993-8/2018

	Annualized Return	Annualized Volatility	Sharpe Ratio
Credit Suisse Long/Short Hedge Fund Index	8.6%	8.9%	0.68
Credit Suisse Long/Short Liquid Index	7.7%	9.4%	0.60
Credit Suisse AllHedge Long/Short Equity Index	3.6%	8.0%	0.29

Source: Kenneth French Data Library and Credit Suisse. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results.

Replicating Long/Short Equity Returns

To gain a better understanding of what is driving long/short equity returns, we attempt to construct a strategy that replicates the returns of the Credit Suisse Long/Short Liquid Index (“CS L/S LAB”). We have selected this index because return data is available on a daily basis, unlike many other long/short equity indexes which only provide monthly returns.

It is worth noting that this index is itself a replicating index, attempting to track the CS L/S EQHF Index using liquid instruments. In other words, we’re attempting a rather meta experiment: replicating a replicator. This may introduce unintended noise into our effort, but we feel that the benefit of daily index level data more than offsets this risk.

Based upon the category description above, we pre-construct several long/short indices that aim to isolate equity beta, regional tilts, and style tilt effects. To capture beta, we construct the following long/short index:

Long S&P 500 / Short Cash: The excess returns offered by U.S. large-cap equities

To capture regional, size, and industry effects, we construct the following long/short indexes:

Long Russell 2000 / Short S&P 500: Relative performance of small-cap equities versus large-cap equities
Long MSCI EAFE / Short S&P 500: Relative performance of international developed equities versus U.S. equities
Long MSCI EM / Short S&P 500: Relative performance of emerging market equities versus U.S. equities
Long Nasdaq 100 / Short S&P 500: Relative performance of “concentrated” large-cap equities versus broad large-cap equities¹

To capture certain style premia, we construct the following long/short indexes:

Long Russell 1000 Value / Short Russell 1000 Growth: Relative performance of large-cap value versus large-cap growth.²
Long High Momentum / Short Low Momentum: Relative performance of recent winners versus recent losers.

All long/short indexes are assumed to be dollar-neutral in construction and are rebalanced on a monthly basis.
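
One way to build such a dollar-neutral, monthly rebalanced series from daily returns is sketched below. The function name, the month labels, and the convention of quoting profit and loss per $1 of notional in each leg are our assumptions for illustration.

```python
def long_short_pnl(long_returns, short_returns, month_ids):
    """Daily P&L of a dollar-neutral long/short index, per $1 of notional per leg.

    Both legs are reset to equal notional whenever the month label changes and
    are allowed to drift within the month (assumed rebalancing convention).
    """
    pnl = []
    long_value = short_value = 1.0
    previous_month = None
    for r_long, r_short, month in zip(long_returns, short_returns, month_ids):
        if month != previous_month:
            long_value = short_value = 1.0  # monthly rebalance back to dollar-neutral
            previous_month = month
        pnl.append(long_value * r_long - short_value * r_short)
        long_value *= 1.0 + r_long
        short_value *= 1.0 + r_short
    return pnl

# Toy example: two days in one month, then the first day of the next month.
example_pnl = long_short_pnl(
    [0.01, 0.02, -0.01],
    [0.00, 0.01, 0.01],
    ["2018-01", "2018-01", "2018-02"],
)
```

Within a month, the daily values sum to the long leg's monthly return minus the short leg's monthly return.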

A simple way of implementing index tracking is through a rolling-window regression. In such an approach, the returns of the CS L/S LAB Index are regressed against the returns of the long/short portfolios. The factor loadings would then reflect the weights of the replicating portfolio.
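
A rolling-window regression of this kind might look like the sketch below. It assumes ordinary least squares with no intercept and uses synthetic data. The function name and window length are illustrative choices.

```python
import numpy as np

def rolling_regression_weights(index_returns, factor_returns, window):
    """Estimate replicating weights with a rolling least-squares fit.

    index_returns: shape (n_obs,); factor_returns: shape (n_obs, n_factors).
    Row t of the output holds the loadings fit on the `window` observations
    ending at t; rows without a full window are left as NaN.
    """
    y = np.asarray(index_returns, dtype=float)
    X = np.asarray(factor_returns, dtype=float)
    n_obs, n_factors = X.shape
    weights = np.full((n_obs, n_factors), np.nan)
    for end in range(window, n_obs + 1):
        beta, *_ = np.linalg.lstsq(X[end - window:end], y[end - window:end], rcond=None)
        weights[end - 1] = beta
    return weights

# Synthetic check: an "index" built from two factors with fixed weights.
rng = np.random.default_rng(1)
factors = rng.normal(scale=0.01, size=(300, 2))
index = factors @ np.array([0.5, -0.2])
weights = rolling_regression_weights(index, factors, window=60)
```

Because the synthetic index has constant weights and no noise, every full window recovers them exactly. Real exposures that shift within the window would be blurred together.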

In practice, the problem with such an approach is that achieving statistical significance requires a number of observations far in excess of the number of factors. Were we to use monthly returns, for example, we might need to employ upwards of three years of data. Yet, as we know from the introductory description of the long/short equity category, these strategies are likely to change their exposures rapidly, even on an aggregate scale. One potential solution is to employ weekly or daily returns. Yet even when this data is available, we must still determine the appropriate rolling window length as well as consider how to handle statistically insignificant explanatory variables and perform model selection.

With this in mind, we elected to utilize an approach called Kalman Filtering. This algorithm is designed to produce estimates for a series of unknown variables based upon a series of inputs that may contain statistical noise or other inaccuracies. The benefit of this model is that we need not specify a lookback window: the model dynamically updates for each new observation based upon how well the model fits the data and how noisy the algorithm believes the data to be.

As it pertains to the problem at hand, we set up our unknown variables to be the weights of the replicating factors in our portfolio. We feed the algorithm the daily returns of these factors and set it to solve for the weights that will minimize the tracking distance to the daily returns of the CS L/S LAB Index. In Figure 2 we plot the cumulative returns of the CS L/S LAB Index and our Kalman Tracker portfolio. We can see that while the Kalman Tracker does not perfectly capture the magnitude of the moves exhibited by the CS L/S LAB Index, it does generally capture the shape and significant transitions within the index. While not a perfect replica, this may be a “good enough” approximation for us to glean some information from the underlying exposures.

Figure 2: Credit Suisse Long/Short Liquid Index and Hypothetical Kalman Tracker

Source: Kenneth French Data Library, Credit Suisse, and CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results. Index returns are total returns and are gross of all fees except for underlying ETF expense ratios of ETFs utilized by the Kalman Tracker. The Kalman Tracker does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purpose of this commentary.
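
For readers unfamiliar with the technique, the sketch below shows one common way to set up a Kalman filter for time-varying regression weights. It is not the exact specification used for the Kalman Tracker: the random-walk assumption for the weights, the two noise variances, the starting values, and the synthetic data are all our assumptions for illustration.

```python
import numpy as np

def kalman_tracking_weights(index_returns, factor_returns, state_var=1e-8, obs_var=1e-4):
    """Filter time-varying factor weights, assuming they follow a random walk.

    state_var: assumed variance of day-to-day changes in each weight.
    obs_var: assumed variance of the tracking noise in the index return.
    Returns an array of shape (n_obs, n_factors) with the filtered weights.
    """
    y = np.asarray(index_returns, dtype=float)
    X = np.asarray(factor_returns, dtype=float)
    n_obs, n_factors = X.shape
    w = np.zeros(n_factors)            # starting guess for the weights
    P = np.eye(n_factors)              # uncertainty around that guess
    Q = state_var * np.eye(n_factors)
    history = np.empty((n_obs, n_factors))
    for t in range(n_obs):
        x = X[t]
        P = P + Q                      # predict: weights may have drifted
        error = y[t] - x @ w           # tracking error of the current weights
        S = x @ P @ x + obs_var        # variance of that tracking error
        K = P @ x / S                  # Kalman gain: how much to trust the error
        w = w + K * error              # update the weights
        P = P - np.outer(K, x) @ P     # update the uncertainty
        history[t] = w
    return history

# Synthetic check: a noiseless "index" with constant weights of 0.6 and -0.3.
rng = np.random.default_rng(2)
toy_factors = rng.normal(size=(500, 2))
toy_index = toy_factors @ np.array([0.6, -0.3])
filtered = kalman_tracking_weights(toy_index, toy_factors)
```

The ratio of the two variances plays the role that the window length plays in a rolling regression. A larger `state_var` relative to `obs_var` lets the weights adapt more quickly to new observations.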

The Time-Varying Exposures of Long/Short Equity

In Figure 3 we plot the underlying factor weights of our replicating strategy over time, specifically magnifying year-to-date exposures.

Figure 3: Underlying Exposure Weights for Kalman Tracker

We can see several effects:

Factor exposures do indeed exhibit significant time-varying behavior. For example, prior to 2008 there was a large tilt towards foreign-developed equities, whereas post-2008 exposure remained largely U.S. focused.
Beta exposure is time-varying. While there is latent beta exposure in the long/short factors, we can approximate overall beta exposure by simply isolating S&P 500 exposure. In March 2008, exposure peaked at 72% and then was cut quickly throughout the year. By January 2009, the index was net short. Post-crisis, exposure was rebuilt back to nearly 70% by September 2011, but has been declining since. Exposure currently sits at 28%. Has all this equity timing been valuable? In Figure 4 we plot the cumulative return of the index’s long-term average beta exposure and the cumulative return from beta timing. We can see that beta timing has, over the long run, been neither a significant contributor nor detractor from performance. Yet crisis-period returns suggest that long/short equity strategies may employ convex trading strategies (e.g. trend-following or constant proportion portfolio insurance).
Size, value, and momentum tilts are not particularly significant in magnitude, with the exception of value during the 2008 crisis. Interestingly, exposure to value was negative during that time period, implying that the index was long growth and short value. Concentrated large-cap exposure has been a rather consistent bet in the post-2008 era, reflecting a tilt towards growth.
Regional bets have been largely absent post-2008, at least with respect to their pre-2008 magnitude. We think it is important to pause and acknowledge the impact that benchmarking can have on perceived value add. Consider Figure 5 where we plot the cumulative returns of regional tilts towards international developed and emerging markets. We can see that prior to 2008, a tilt away from U.S. equities was successful in both cases, and after 2011 both were a losing bet. In the post-2011 environment, if a manager successfully makes the call to tilt towards U.S. equities, an entirely U.S. equity benchmark will effectively nullify the impact since the bet is already fully encapsulated in the benchmark! In other words, by choice of benchmark we have eliminated a source of value-add for the manager. Had we elected a global equity benchmark instead, the manager’s flexibility could potentially create value in both environments.
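
The split between average beta exposure and beta timing discussed above can be expressed as a simple decomposition. The sketch below assumes the exposure for each period is known before that period's return is earned and uses the full-sample average as the "long-term average" exposure; the function name and numbers are hypothetical.

```python
def beta_timing_attribution(exposures, market_excess_returns):
    """Split the return from market exposure into a static and a timing piece.

    exposures[t] is the S&P 500 weight held going into period t (assumed).
    Returns (average_exposure, static_returns, timing_returns), where
    static + timing equals exposure * market return in every period.
    """
    exposures = list(exposures)
    returns = list(market_excess_returns)
    average_exposure = sum(exposures) / len(exposures)
    static_returns = [average_exposure * r for r in returns]
    timing_returns = [(w - average_exposure) * r for w, r in zip(exposures, returns)]
    return average_exposure, static_returns, timing_returns

# Toy example: exposure was cut ahead of a loss, so timing added value.
avg_beta, static_part, timing_part = beta_timing_attribution(
    [0.4, 0.2, 0.3], [0.01, -0.02, 0.01]
)
```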

Figure 4: Cumulative Returns of Kalman Tracker’s Long-Term Average S&P 500 Exposure and Time-Varying Exposure

Source: Kenneth French Data Library, Credit Suisse, and CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results.

Figure 5: Cumulative Returns of Regional Tilts

Source: CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results.

What has driven performance in 2018? We see three primary components.

Entering the year, the index carried a nearly 40% allocation to equity beta. While exposure declined to about 33% by the end of January, it was rapidly cut down to just 20% after the first week of February. By mid-March this position was rebuilt to approximately 30%. We estimate that average beta exposure has been a 3.4% contributor to year-to-date returns, while market timing has been a -0.3% detractor.
After Q1, there was an increase in exposure to MSCI EAFE, MSCI EM, and value tilts. We estimate that these tilts have been -1.9%, -2.3%, and -1.1% detractors from performance, respectively. It is possible that these tilts all reflect the same underlying bet towards global value. Or it may be the case that the global tilts reflect a bet on a weakening dollar. We should not hesitate to remember that these figures are all statistically derived, so an equally valid possibility is that they are entirely wrong in the first place. It is worth noting that the value tilt – which is expressed as long Russell 1000 Value and short Russell 1000 Growth – does neutralize some of the sector tilts expressed in the concentrated large-cap position discussed in the next bullet. The true net effect may not actually be a tilt towards value within the index, but rather just a reduction in the tilt towards growth.
The largest positive contributor to returns year-to-date has been the concentrated large-cap tilt. Implemented as long Nasdaq 100 / short S&P 500, this tilt largely expresses a bet on information technology, telecommunication services, and consumer discretionary sectors. Specifically, year-to-date it represents a significant overweight towards individual names like Apple, Amazon, Microsoft, Google, and Facebook.

Conclusion

Has long/short equity lost its mojo?

By replicating index performance using liquid factors, we can extract the common drivers of performance. What we found was that pre-2008 performance was largely driven by equity beta and a significant tilt towards foreign developed equities.

After 2011, regional tilts were losing bets. Fortunately, we can see that such tilts were significantly reduced – if not outright removed – from the index composition. Nevertheless, if we benchmark to a U.S. equity index (even if properly risk-adjusted), the accuracy of this trade will be entirely discounted because it is fully embedded in the index itself. In other words, by benchmarking against U.S. equities, the best a manager can do during a period when U.S. equities outperform is keep up with the index. Consider that year-to-date the MSCI ACWI has returned just 3.5%: much closer to the 2.1% of the CS L/S EQHF Index quoted in the introduction.

We can also see a significant tilt towards concentrated U.S. equities in the post-crisis era. This trade captured the relative performance of sectors like technology, telecommunication services, and consumer discretionary and from 12/31/2009 to 8/31/2018 returned 4.5% annualized.

Taken together, it is hard to argue that aggregate timing skill is not being displayed in the long/short equity category. We simply have to use the right measuring stick and not expect the timing to work over every shorter-term period.

Of course, this analysis should all be taken with a grain of salt. Our replicating index is by no means a perfect fit (though it is a very good fit from 2012 onward) and it is entirely possible that we selected the wrong set of explanatory features. Furthermore, we have only analyzed one index. The performance of the Credit Suisse Long/Short Liquid Index is not identical to that of the HFRI Equity Hedge Index, the Wilshire BRI Long/Short Equity Index, or the Morningstar Global Long/Short Index. Analysis using those indices may very well lead to different conclusions. Finally, the mathematics of this exercise does not make the factor tea-leaves any easier to decipher: we are ultimately attempting to create a narrative where one need not apply.

It is worth acknowledging that our analysis is categorical about an asset class where investors have little ability to make an indexed investment. Rather, allocation to long/short equity is still dominated by individual manager selection. This means that investor mileage will vary considerably and that our analysis herein may not apply to any specific manager. After all, we are attempting to analyze aggregate results and it is impossible to unscramble eggs.

Yet it does raise the question: if the aggregate category has such attractive features and can be tracked well with liquid factors, why have trackers not taken off as a popular – and much lower cost – solution for investors looking to index their long/short equity exposure? Another potential solution may be for investors to unbundle and rebuild. For example, we find that the beta exposure of $1 invested in the long/short category can be captured efficiently by $0.5 of trend equity exposure, freeing up $0.5 for other high-conviction alpha strategies.

Diversifying core equity exposure is a goal of many investors. Long/short equity provides one way to do this. In addition to potentially highlighting some of the performance drivers for long/short equity, this replication exercise shows that there may be other, more transparent, ways to achieve this goal.

A Carry-Trend-Hedge Approach to Duration Timing

By Corey Hoffstein

On October 15, 2018

In Carry, Risk & Style Premia, Trend, Weekly Commentary

Summary

In this paper we discuss simple rules for timing exposure to 10-year U.S. Treasuries.
We explore signals based upon the slope of the yield curve (“carry”), prior returns (“trend”), and prior equity returns (“hedge”).
We construct long/short implementations of each strategy covering the time period of 1962-2018.
We find that all three methods improve both total and risk-adjusted returns when compared to long-only exposure to excess bond returns.
Naïve combination of both strategies and signals appears to improve realized risk-adjusted returns, promoting the benefits of process diversification.

Introduction

In this strategy brief, we discuss three trading rules for timing exposure to duration. Specifically, we seek to time the excess returns generated from owning 10-year U.S. Treasury bonds over short rates. This piece is meant as a companion to our prior, longer-form explorations Duration Timing with Style Premia and Timing Bonds with Value, Momentum, and Carry. In contrast, the trading rules herein are simplistic by design in an effort to highlight the efficacy of the signals.

We explore three different signals in this piece:

The slope of the yield curve (“term spread”);
Prior realized excess bond returns; and
Prior realized equity market returns.

In contrast to prior studies, we do not consider traditional value measures, such as real yields, or explicit estimates of the bond risk premium, as they are less easily calculated. Nevertheless, the signals studied herein capture a variety of potential influences upon bond markets, including inflation shocks, economic shocks, policy shocks, marginal utility shocks, and behavioral anomalies.

The strategies based upon our signals are implemented as dollar-neutral long/short portfolios that go long a constant maturity 10-year U.S. Treasury bond index and short a short-term U.S. Treasury index (assumed to be a 1-year index prior to 1982 and a 3-month index thereafter). We compare these strategies to a “long-only” implementation that is long the 10-year U.S. Treasury bond index and short the short-term U.S. Treasury index in order to capture the excess realized return associated with duration.

Implementing our strategies as dollar-neutral long/short portfolios allows them to be interpreted in a variety of useful manners. For example, one obvious interpretation is an overlay implemented on an existing bond portfolio using Treasury futures. However, another interpretation may simply be to guide investors as to whether to extend or contract their duration exposure around a more intermediate-term bond portfolio (e.g. a 5-year duration).

At the end of the piece, we explore the potential diversification benefits achieved by combining these strategies in both an integrated (i.e. signal combination) and composite (i.e. strategy combination) fashion.

Slope of the Yield Curve

In past research on timing duration, we considered explicit measures of the bond risk premium as well as valuation. In Duration Timing with Style Premia we used a simple signal based upon real yield, which had the problem of being predominately long over the last several decades. In Timing Bonds with Value, Momentum, and Carry we compared a de-trended real yield against recent levels in an attempt to capture more short-term valuation fluctuations.

In both of these prior research pieces, we also explicitly considered the slope of the yield curve as a predictor of future excess bond returns. One complicating factor to carry signals is that rate steepness simultaneously captures both the expectation of rising short rates as well as an embedded risk premium. In particular, evidence suggests that mean-reverting rate expectations dominate steepness when short rates are exceptionally low or high. Anecdotally, this may be due to the fact that the front end of the curve is determined by central bank policy while the back end is determined by inflation expectations.

Thus, despite being a rather blunt measure, steepness may simultaneously be related to business cycles, credit cycles and monetary policy cycles. To quote Ilmanen (2011):

A steep [yield curve] coincides with high unemployment rate (correlation +0.45) and predicts fast economic growth. [Yield curve] countercyclicality may explain its ability to predict near-term bond and stock returns: high required premia near business cycle troughs result in a steep [yield curve], while low required premia near business cycle peaks result in an inverted [yield curve].

Therefore, while estimates of real yield may seek to be explicit measures of value, we may consider carry to be an ancillary measure as well, as a high carry tends to be associated with a high term premium. In Figure 1 we plot the annualized next-month excess bond return based upon the quartile (using the prior 10 years of information) that the term spread falls into. We can see a significant monotonic improvement from the 1st to the 4th quartiles, indicating that higher levels of carry, relative to the past, are positive indicators of future returns.

Therefore, we construct our carry strategy as follows:

At the end of each month, calculate the term spread between 10- and 1-year U.S. Treasuries.
Calculate the realized percentile of this spread by comparing it against the prior 10 years of daily term spread measures.
If the carry score is in the top two thirds, go long excess bond returns. If the carry score is in the bottom third, go short excess bond returns.
Trade at the close of the 1st trading day of the month.
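
The signal portion of these steps can be sketched as below. The function name, the way ties are counted in the percentile, and the treatment of a score sitting exactly on the one-third boundary are our assumptions; execution timing is left out.

```python
def carry_position(current_spread, spread_history):
    """Map the current term spread to a long (+1) or short (-1) position.

    spread_history: prior daily term spreads, e.g. the trailing 10 years.
    The carry score is the share of history at or below the current spread
    (assumed percentile convention). Returns (position, carry_score).
    """
    history = list(spread_history)
    if not history:
        raise ValueError("spread_history must not be empty")
    carry_score = sum(1 for s in history if s <= current_spread) / len(history)
    position = 1 if carry_score >= 1 / 3 else -1  # bottom third goes short
    return position, carry_score

# Toy history of term spreads in percentage points.
toy_spreads = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
steep_curve = carry_position(2.2, toy_spreads)  # high relative to history
flat_curve = carry_position(0.2, toy_spreads)   # low relative to history
```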

Returns for this strategy are plotted in Figure 2. Our research suggests that the backtested results of this model can be significantly improved through the use of longer holding periods and portfolio tranching. Another potential improvement is to scale exposure linearly to the current percentile. We will leave these implementations as exercises to readers.

Figure 1

Source: Kenneth French Data Library, Federal Reserve of St. Louis. Calculations by Newfound Research. Returns are backtested and hypothetical. Return data relies on hypothetical indices and is exclusive of all fees and expenses. Returns assume the reinvestment of all distributions. The Carry Long/Short strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary. It is not possible to invest in an index. Past performance does not guarantee future results.

Figure 2

Data from 1972-2018

	Annualized Return	Annualized Volatility	Sharpe Ratio
Long Only	2.1%	7.6%	0.27
CARRY L/S	2.6%	7.7%	0.33

Trend in Bond Returns

Momentum, in both its relative and absolute (i.e. “trend”) forms, has a long history among both practitioners and academics (see our summary piece Two Centuries of Momentum).

The literature covering momentum in bond returns, however, varies in precisely what prior returns matter. There are three primary categories: (1) change in bond yields (e.g. Ilmanen (1997)), (2) total return of individual bonds (e.g. Kolanovic and Wei (2015) and Brooks and Moskowitz (2017)), and (3) total return of bond indices (or futures) (e.g. Asness, Moskowitz, and Pedersen (2013), Durham (2013), and Hurst, Ooi, Pedersen (2014)).

In our view, the approaches have varying trade-offs:

While empirical evidence suggests that nominal interest rates can exhibit secular trends, rate evolution is most frequently modeled as mean-reversionary. Our research suggests that very short-term momentum can be effective, but leads to a significant amount of turnover.
The total return of individual bonds makes sense if we plan on running a cross-sectional bond model (i.e. identifying individual bonds), but is less applicable if we want to implement with a constant maturity index.
The total return of a bond index may capture past returns that are attributable to securities that have been recently removed.

We think it is worth noting that the latter two methods can capture yield curve effects beyond shift, including roll return, steepening and curvature changes. In fact, momentum in general may even be able to capture other effects such as flight-to-safety and liquidity (supply-demand) factors.

In this piece, we elect to measure momentum as an exponentially-weighted average of prior log returns of the total return excess between long and short bond indices. We measure this average at the end of each month and go long duration when it is positive and short duration when it is negative. In Figure 4 we plot the results of this method based upon a variety of lookback periods that approximate 1-, 3-, 6-, and 12-month formation periods.

Figure 3

	MOM 21	MOM 63	MOM 126	MOM 252
MOM 21	1.00	0.87	0.65	0.42
MOM 63	0.87	1.00	0.77	0.53
MOM 126	0.65	0.77	1.00	0.76
MOM 252	0.42	0.53	0.76	1.00

We see varying success in the methods, with only MOM 63 and MOM 252 exhibiting better risk-adjusted return profiles than long-only exposure. Despite this long-term success, we can see that MOM 63 remains in a drawdown that began in the early 2000s, highlighting the potential risk of relying too heavily on a specific measure or formation period. In Figure 3 we calculate the correlation between the different momentum strategies. As we found in Measuring Process Diversification in Trend Following, diversification opportunities appear to be available by mixing both short- and long-term formation periods.

With this in mind, we elect for the following momentum implementation:

At the end of each month, calculate both a 21- and 252-day exponentially-weighted moving average of realized daily excess log returns.
When both signals are positive, go long duration; when both signals are negative, go short duration; when signals are mixed, stay flat.
Rebalance at the close of the next trading day.
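
A minimal sketch of the signal logic in these steps follows. It assumes the 21- and 252-day lookbacks are interpreted as spans with a smoothing factor of 2 / (span + 1), which is one common convention and not necessarily the one used in the backtest; the function names are hypothetical.

```python
import math

def ewma(values, span):
    """Exponentially-weighted moving average with smoothing 2 / (span + 1)."""
    alpha = 2.0 / (span + 1.0)
    average = values[0]
    for value in values[1:]:
        average = alpha * value + (1.0 - alpha) * average
    return average

def momentum_position(long_bond_returns, short_rate_returns, fast_span=21, slow_span=252):
    """Return +1 (long duration), -1 (short duration), or 0 (flat).

    Inputs are daily simple returns of the long and short bond indices; the
    signal is computed on their daily excess log returns.
    """
    excess_log = [
        math.log1p(r_long) - math.log1p(r_short)
        for r_long, r_short in zip(long_bond_returns, short_rate_returns)
    ]
    fast = ewma(excess_log, fast_span)
    slow = ewma(excess_log, slow_span)
    if fast > 0 and slow > 0:
        return 1
    if fast < 0 and slow < 0:
        return -1
    return 0  # signals disagree, so stay flat
```

Requiring agreement between a fast and a slow average means the strategy steps aside when the two lookbacks disagree, such as after a sharp reversal in a long-running trend.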

The backtested results of this strategy are displayed in Figure 5.

As with carry, we find that there are potential craftsmanship improvements that can be made with this strategy. For example, implementing with four tranches and weekly rebalances appears to significantly improve backtested risk-adjusted returns. Furthermore, there may be benefits that can be achieved by incorporating other means of measuring trends as well as other lookback periods (see Diversifying the What, When, and How of Trend Following and Measuring Process Diversification in Trend Following).

Figure 4

Data from 1963-2018

	Annualized Return	Annualized Volatility	Sharpe Ratio
Long Only	1.5%	7.3%	0.21
MOM 21	1.4%	7.5%	0.19
MOM 63	1.8%	7.4%	0.25
MOM 126	1.3%	7.4%	0.18
MOM 252	1.9%	7.4%	0.26

Source: Kenneth French Data Library, Federal Reserve of St. Louis. Calculations by Newfound Research. Returns are backtested and hypothetical. Return data relies on hypothetical indices and is exclusive of all fees and expenses. Returns assume the reinvestment of all distributions. The Momentum strategies do not reflect any strategies offered or managed by Newfound Research and were constructed exclusively for the purposes of this commentary. It is not possible to invest in an index. Past performance does not guarantee future results.

Figure 5

Data from 1963-2018

	Annualized Return	Annualized Volatility	Sharpe Ratio
Long Only	1.5%	7.2%	0.21
MOM L/S	1.7%	6.3%	0.28

Source: Kenneth French Data Library, Federal Reserve of St. Louis. Calculations by Newfound Research. Returns are backtested and hypothetical. Return data relies on hypothetical indices and is exclusive of all fees and expenses. Returns assume the reinvestment of all distributions. The Momentum Long/Short strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary. It is not possible to invest in an index. Past performance does not guarantee future results.

Safe-Haven Premium

Stocks and bonds generally exhibit a positive correlation over time. One thesis for this long-term relationship is the present value model, which argues that declining yields, and hence increasing bond prices, increase the value of future discounted cash flows and therefore the fair value of equities. Despite this long-term relationship, shocks in economic growth, inflation, and even monetary policy can overwhelm the discount rate thesis and create a regime-varying correlation structure.

For example, empirical evidence suggests that high quality bonds can exhibit a safe haven premium during periods of economic stress. Using real equity prices as a proxy for wealth, Ilmanen (1995) finds that “wealth-dependent relative risk aversion appears to be an important source of bond return predictability.” Specifically, inverse wealth is a significant positive predictor of future excess bond returns at both world and local (U.S., Canada, Japan, Germany, France, and United Kingdom) levels. Ilmanen (2003) finds that, “stock-bond correlations are more likely to be negative when inflation is low, growth is slow, equities are weak, and volatility is high.”

To capitalize on this safe-haven premium, we derive a signal based upon prior equity returns. Specifically, we utilize an exponentially weighted average of prior log returns to estimate the underlying trend of equities. We then compare this estimate to a 10-year rolling window of prior estimates, calculating the current percentile.

In Figure 6 we plot the annualized excess bond return for the following month, assuming signals are generated at the close of each month and trades are placed at the close of the following trading day. We can see several effects. First, next-month returns for 1st-quartile equity momentum (i.e. very poor equity returns) tend to be significantly higher than those of other quartiles. Second, excess bond returns in the month following very strong equity returns tend to be poor. We would posit that these two effects are two sides of the same coin: the safe-haven premium during 1st-quartile periods and an unwind of the premium in 4th-quartile periods. Finally, we can see that 2nd- and 3rd-quartile returns tend to be positive, in line with the generally positive excess bond return over the measured period.

In an effort to isolate the safe-haven premium, we construct the following strategy:

At the end of each month, calculate an equity momentum measure by taking a 63-day exponentially weighted average of prior daily log-returns.
Calculate the realized percentile of this momentum measure by comparing it against the prior 10-years of daily momentum measures.
If the momentum score is in the bottom quartile, go long excess bond returns. If the momentum score is in the top quartile, go short excess bond returns. Otherwise, remain flat.
Trade at the close of the 1st trading day of the month.
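The percentile and quartile steps above might be sketched as follows. The function names are hypothetical, and the percentile convention (share of prior observations at or below the current value) and the handling of values exactly on a quartile boundary are assumptions; the commentary does not specify them.

```python
def percentile_rank(current, history):
    """Fraction of prior momentum readings at or below the current reading."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return sum(1 for value in history if value <= current) / len(history)


def safe_haven_position(current_momentum, momentum_history):
    """Position in excess bond returns given equity momentum and its prior readings.

    momentum_history is assumed to hold the prior 10 years of daily
    63-day momentum measures (roughly 2,520 observations).
    """
    percentile = percentile_rank(current_momentum, momentum_history)
    if percentile <= 0.25:
        return 1   # bottom quartile (weak equities): long excess bond returns
    if percentile > 0.75:
        return -1  # top quartile (strong equities): short excess bond returns
    return 0       # middle quartiles: remain flat


# Hypothetical history of momentum readings, for illustration only.
history = [i / 1000.0 for i in range(-50, 50)]
weak = safe_haven_position(-0.045, history)
strong = safe_haven_position(0.045, history)
middle = safe_haven_position(0.0, history)
```

With this toy history, the weak reading falls in the bottom quartile (long), the strong reading in the top quartile (short), and the middle reading leaves the strategy flat.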

Returns for this strategy are plotted in Figure 7. As expected based upon the quartile design, the strategy only spends 24% of its time long, 23% of its time short, and the remainder of its time flat. Despite this even split in time, approximately two-thirds of the strategy’s return comes from the periods when the strategy is long.

Figure 6

Source: Kenneth French Data Library, Federal Reserve of St. Louis. Calculations by Newfound Research. Returns are backtested and hypothetical. Return data relies on hypothetical indices and is exclusive of all fees and expenses. Returns assume the reinvestment of all distributions. The Equity Momentum Long/Short strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary. It is not possible to invest in an index. Past performance does not guarantee future results.

Figure 7

Data from 1962-2018

	Annualized Return	Annualized Volatility	Sharpe Ratio
Long Only	1.5%	7.2%	0.21
Equity Mom L/S	1.9%	5.7%	0.34

Source: Kenneth French Data Library, Federal Reserve of St. Louis. Calculations by Newfound Research. Returns are backtested and hypothetical. Return data relies on hypothetical indices and is exclusive of all fees and expenses. Returns assume the reinvestment of all distributions. The Equity Momentum Long/Short strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary. It is not possible to invest in an index. Past performance does not guarantee future results.

Combining Signals

Despite trading the same underlying instrument, variation in strategy construction means that we can likely benefit from process diversification in constructing a combined strategy. Figure 8 quantifies the available diversification by measuring full-period correlations among the strategies from joint inception (1972). We can also see that the strategies exhibit low correlation to the Long Only implementation, suggesting that they may introduce diversification benefits to a strategic duration allocation as well.

Figure 8

	LONG ONLY	CARRY L/S	MOM L/S	EQ MOM L/S
LONG ONLY	1.00	0.42	0.33	-0.09
CARRY L/S	0.42	1.00	0.40	-0.09
MOM L/S	0.33	0.40	1.00	-0.13
EQ MOM L/S	-0.10	-0.10	-0.19	1.00

We explore two different implementations of a diversified strategy. In the first, we simply combine the three strategies in equal-weight, rebalancing on a monthly basis. This implementation can be interpreted as three sleeves of a larger portfolio construction. In the second implementation, we combine underlying long/short signals. When the net signal is positive, the strategy goes 100% long duration and when the signal is negative, it goes 100% short. This can be thought of as an integrated approach that takes a majority-rules voting approach. Results for these strategies are plotted in Figure 9. We note the substantial increase in the backtested Sharpe Ratio of these diversified approaches in comparison to their underlying components outlined in prior sections.
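To make the difference between the two implementations concrete, here is a minimal sketch. The function names and sample inputs are hypothetical; it assumes each underlying strategy reports a signal of +1, 0, or -1, and that a net signal of zero leaves the integrated strategy flat.

```python
def sleeve_return(strategy_returns):
    """Equal-weight (sleeve) return for one month: the average of the strategy returns."""
    return sum(strategy_returns) / len(strategy_returns)


def integrated_position(signals):
    """Majority-rules position: +1 if the net signal is positive, -1 if negative, else 0."""
    net_signal = sum(signals)
    return (net_signal > 0) - (net_signal < 0)


# Hypothetical monthly returns for the carry, momentum, and equity momentum strategies.
combined = sleeve_return([0.010, -0.020, 0.040])

# Hypothetical signals in the same order: two long votes outweigh one short vote.
integrated = integrated_position([1, 1, -1])
```

The sleeve approach scales exposure with the degree of agreement among the strategies, while the integrated approach always takes a full position whenever the net vote is non-zero. That difference is consistent with the higher volatility of the Integrated L/S results reported in Figure 9.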

It is important to note that despite strong total and risk-adjusted returns, the strategies spend only approximately 54% of their time net-long duration, with 19% of their time spent flat and 27% of their time spent short. While slightly biased long, this breakdown provides evidence that strategies are not simply the beneficiaries of a bull market in duration over the prior several decades.

Figure 9

Data from 1972-2018

	Annualized Return	Annualized Volatility	Sharpe Ratio
Long Only	2.1%	7.6%	0.27
Combined L/S	2.5%	4.3%	0.58
Integrated L/S	3.5%	7.1%	0.49

Source: Kenneth French Data Library, Federal Reserve of St. Louis. Calculations by Newfound Research. Returns are backtested and hypothetical. Return data relies on hypothetical indices and is exclusive of all fees and expenses. Returns assume the reinvestment of all distributions. Neither the Combined Long/Short nor the Integrated Long/Short strategy reflects any strategy offered or managed by Newfound Research; both were constructed exclusively for the purposes of this commentary. It is not possible to invest in an index. Past performance does not guarantee future results.

Conclusion

In this research brief, we continued our exploration of duration timing strategies. We aimed to implement several signals that were simple by construction. Specifically, we evaluated the impact of term spread, prior excess bond returns, and prior equity returns on next month’s excess bond returns. Despite their simplicity, we find that all three signals can potentially offer investors insight for tactical timing decisions.

While we believe that significant craftsmanship improvements can be made in all three strategies, low-hanging improvements may simply come from combining the approaches. We find a meaningful improvement in Sharpe Ratio by naively combining these strategies in both a sleeve-based and integrated signal fashion.

Bibliography

Asness, Clifford S. and Moskowitz, Tobias J. and Pedersen, Lasse Heje, Value and Momentum Everywhere (June 1, 2012). Chicago Booth Research Paper No. 12-53; Fama-Miller Working Paper. Available at SSRN: https://ssrn.com/abstract=2174501 or http://dx.doi.org/10.2139/ssrn.2174501

Brooks, Jordan and Moskowitz, Tobias J., Yield Curve Premia (July 1, 2017). Available at SSRN: https://ssrn.com/abstract=2956411 or http://dx.doi.org/10.2139/ssrn.2956411

Durham, J. Benson, Momentum and the Term Structure of Interest Rates (December 1, 2013). FRB of New York Staff Report No. 657. Available at SSRN: https://ssrn.com/abstract=2377379 or http://dx.doi.org/10.2139/ssrn.2377379

Hurst, Brian and Ooi, Yao Hua and Pedersen, Lasse Heje, A Century of Evidence on Trend-Following Investing (June 27, 2017). Available at SSRN: https://ssrn.com/abstract=2993026 or http://dx.doi.org/10.2139/ssrn.2993026

Ilmanen, Antti, Time-Varying Expected Returns in International Bond Markets, Journal of Finance, Vol. 50, No. 2, 1995, pp. 481-506.

Ilmanen, Antti, Forecasting U.S. Bond Returns, Journal of Fixed Income, Vol. 7, No. 1, 1997, pp. 22-37.

Ilmanen, Antti, Stock-Bond Correlations, Journal of Fixed Income, Vol. 13, No. 2, 2003, pp. 55-66.

Ilmanen, Antti, Expected Returns: An Investor’s Guide to Harvesting Market Rewards. John Wiley, 2011.

Kolanovic, Marko, and Wei, Zhen, Momentum Strategies Across Asset Classes (April 2015). Available at https://www.cmegroup.com/education/files/jpm-momentum-strategies-2015-04-15-1681565.pdf
