The Dumb (Timing) Luck of Smart Beta

On November 18, 2019

In Craftsmanship, Defensive, Momentum, Popular, Portfolio Construction, Risk & Style Premia, Value, Weekly Commentary

This post is available as a PDF download here.

Summary

In past research notes we have explored the impact of rebalance timing luck on strategic and tactical portfolios, even using our own Systematic Value methodology as a case study.
In this note, we generate empirical timing luck estimates for a variety of specifications for simplified value, momentum, low volatility, and quality style portfolios.
Relative results align nicely with intuition: higher concentration and less frequent rebalancing lead to increasing levels of realized timing luck.
For more reasonable specifications – e.g. 100 stock portfolios rebalanced semi-annually – timing luck ranges between 100 and 400 basis points depending upon the style under investigation, suggesting a significant risk of performance dispersion due only to when a portfolio is rebalanced and nothing else.
The large magnitude of timing luck suggests that any conclusions drawn from performance comparisons between smart beta ETFs or against a standard style index may be spurious.

We’ve written about the concept of rebalance timing luck a lot. It’s a cowbell we’ve been beating for over half a decade, with our first article going back to August 7th, 2013.

As a reminder, rebalance timing luck is the performance dispersion that arises from the choice of a particular rebalance date (e.g. semi-annual rebalances that occur in June and December versus March and September).

We’ve empirically explored the impact of rebalance timing luck as it relates to strategic asset allocation, tactical asset allocation, and even used our own Systematic Value strategy as a case study for smart beta. All of our results suggest that it has a highly non-trivial impact upon performance.

This summer we published a paper in the Journal of Index Investing that proposed a simple solution to the timing luck problem: diversification. If, for example, we believe that our momentum portfolio should be rebalanced every quarter – perhaps as an optimal balance of cost and signal freshness – then we proposed splitting our capital across the three portfolios that spanned different three-month rebalance periods (e.g. JAN-APR-JUL-OCT, FEB-MAY-AUG-NOV, MAR-JUN-SEP-DEC). This solution is referred to either as “tranching” or “overlapping portfolios.”

The paper also derived a formula for estimating timing luck ex-ante, with a simplified representation of:

L = T / (2F) x S

Where L is the timing luck measure, T is turnover rate of the strategy, F is how many times per year the strategy rebalances, and S is the volatility of a long/short portfolio that captures the difference of what a strategy is currently invested in versus what it could be invested in if the portfolio was reconstructed at that point in time.

Without numbers, this equation still informs some general conclusions:

Higher turnover strategies have higher timing luck.
Strategies that rebalance more frequently have lower timing luck.
Strategies with a less constrained universe will have higher timing luck.

Bullet points 1 and 3 may seem similar but capture subtly different effects. This is likely best illustrated with two examples on different extremes. First consider a very high turnover strategy that trades within a universe of highly correlated securities. Now consider a very low turnover strategy that is either 100% long or 100% short U.S. equities. In the first case, the highly correlated nature of the universe means that differences in specific holdings may not matter as much, whereas in the second case the perfect inverse correlation means that small portfolio differences lead to meaningfully different performance.

L, in and of itself, is a bit tricky to interpret, but effectively attempts to capture the potential dispersion in performance between a particular rebalance implementation choice (e.g. JAN-APR-JUL-OCT) versus a timing-luck-neutral benchmark.
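As a rough illustration of how the three inputs interact, the sketch below assumes the simplified form L = T / (2F) x S from the paper. The function name and the sample inputs are hypothetical and are not figures from this study.

```python
def timing_luck(turnover, rebalances_per_year, ls_volatility):
    """Ex-ante timing luck estimate, assuming the form L = T / (2F) * S."""
    if rebalances_per_year <= 0:
        raise ValueError("rebalances_per_year must be positive")
    return turnover / (2.0 * rebalances_per_year) * ls_volatility

# Hypothetical inputs: 100% annual turnover, 10% long/short volatility.
annual = timing_luck(1.0, 1, 0.10)      # rebalanced once per year
quarterly = timing_luck(1.0, 4, 0.10)   # rebalanced four times per year
print(f"annual: {annual:.4f}, quarterly: {quarterly:.4f}")
```

Holding turnover and S fixed, quadrupling the rebalance frequency cuts the estimate to a quarter, which is the second general conclusion above expressed in numbers.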

After half a decade, you’d think we’ve spilled enough ink on this subject.

But given that just about every single major index still does not address this issue, and since our passion for the subject clearly verges on fever pitch, here comes some more cowbell.

Equity Style Portfolio Definitions

In this note, we will explore timing luck as it applies to four simplified smart beta portfolios based upon holdings of the S&P 500 from 2000-2019:

Value: Sort on earnings yield.
Momentum: Sort on prior 12-1 month returns.
Low Volatility: Sort on realized 12-month volatility.
Quality: Sort on average rank-score of ROE, accruals ratio, and leverage ratio.

Quality is a bit more complicated only because the quality factor has far less consistency in accepted definition. Therefore, we adopted the signals utilized by the S&P 500 Quality Index.

For each of these equity styles, we construct portfolios that vary across two dimensions:

Number of Holdings: 50, 100, 150, 200, 250, 300, 350, and 400.
Frequency of Rebalance: Quarterly, Semi-Annually, and Annually.

For the different rebalance frequencies, we also generate portfolios that represent each possible rebalance variation of that mix. For example, Momentum portfolios with 50 stocks that rebalance annually have 12 possible variations: a January rebalance, February rebalance, et cetera. Similarly, there are 12 possible variations of Momentum portfolios with 100 stocks that rebalance annually.
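The enumeration of rebalance variations can be sketched as follows. This is a minimal illustration that assumes evenly spaced rebalances on a monthly grid; the helper name is hypothetical.

```python
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

def rebalance_variations(rebalances_per_year):
    """List every distinct month schedule for an evenly spaced rebalance frequency."""
    if 12 % rebalances_per_year != 0:
        raise ValueError("frequency must divide 12 evenly")
    step = 12 // rebalances_per_year  # months between rebalances
    return ["-".join(MONTHS[offset + k * step] for k in range(rebalances_per_year))
            for offset in range(step)]

for label, freq in [("Annual", 1), ("Semi-annual", 2), ("Quarterly", 4)]:
    schedules = rebalance_variations(freq)
    print(label, len(schedules), schedules)
```

Annual rebalancing yields 12 variations, semi-annual yields 6, and quarterly yields 3, for each Style x Holding combination.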

By explicitly calculating the rebalance date variations of each Style x Holding x Frequency combination, we can construct an overlapping portfolios solution. To estimate empirical annualized timing luck, we calculate the standard deviation of monthly return dispersion between the different rebalance date variations of the overlapping portfolio solution and annualize the result.
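One possible reading of this calculation is sketched below. It assumes the overlapping portfolio's monthly return is the equal-weight average of the variations' returns and pools the differences across all variations before annualizing; both are simplifying assumptions for illustration, and the sample returns are made up.

```python
import math
import statistics

def empirical_timing_luck(monthly_returns):
    """monthly_returns: one row per month, one return per rebalance variation."""
    diffs = []
    for row in monthly_returns:
        overlap = sum(row) / len(row)           # assumed overlapping-portfolio return
        diffs.extend(r - overlap for r in row)  # each variation versus the overlap
    # Pooled standard deviation of monthly dispersion, annualized.
    return statistics.pstdev(diffs) * math.sqrt(12)

# Hypothetical returns for three variations over three months.
sample = [[0.010, 0.014, 0.006],
          [-0.020, -0.015, -0.025],
          [0.030, 0.030, 0.030]]
luck = empirical_timing_luck(sample)
print(f"annualized timing luck: {luck:.4f}")
```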

Empirical Timing Luck Results

Before looking at the results plotted below, we would encourage readers to hypothesize as to what they expect to see. Perhaps not in absolute magnitude, but at least in relative magnitude.

For example, based upon our understanding of the variables affecting timing luck, would we expect an annually rebalanced portfolio to have more or less timing luck than a quarterly rebalanced one?

Should a more concentrated portfolio have more or less timing luck than a less concentrated variation?

Which factor has the greatest risk of exhibiting timing luck?

Source: Sharadar. Calculations by Newfound Research.

To create a sense of scale across the styles, below we isolate the results for semi-annual rebalancing for each style and plot it.

Source: Sharadar. Calculations by Newfound Research.

In relative terms, there is no great surprise in these results:

More frequent rebalancing limits the risk of portfolios changing significantly between rebalance dates, thereby decreasing the impact of timing luck.
More concentrated portfolios exhibit larger timing luck.
Faster-moving signals (e.g. momentum) tend to exhibit more timing luck than more stable, slower-moving signals (e.g. low volatility).

What is perhaps the most surprising is the sheer magnitude of timing luck. Consider that the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality portfolios all hold 100 securities and are rebalanced semi-annually. Our study suggests that timing luck for such approaches may be as large as 2.5%, 4.4%, 1.1%, and 2.0% respectively.

But what does that really mean? Consider the realized performance dispersion of different rebalance date variations of a Momentum portfolio that holds the top 100 securities in equal weight and is rebalanced on a semi-annual basis.

Source: Sharadar. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.

The 4.4% estimate of annualized timing luck is a measure of dispersion between each underlying variation and the overlapping portfolio solution. If we isolate two sub-portfolios and calculate rolling 12-month performance dispersion, we can see that the difference can be far larger, as one might exhibit positive timing luck while the other exhibits negative timing luck. Below we do precisely this for the APR-OCT and MAY-NOV rebalance variations.

In fact, since these variations are identical in every which way except for the date on which they rebalance, a portfolio that is long the APR-OCT variation and short the MAY-NOV variation would explicitly capture the effects of rebalance timing luck. If we assume the rebalance timing luck realized by these two portfolios is independent (which our research suggests it is), then the volatility of this long/short is approximately the rebalance timing luck estimated above scaled by the square-root of two.

Derivation: For variations v_i and v_j and overlapping-portfolio solution V, then:
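One way to write this step out, assuming each variation's timing luck relative to V has the same volatility L and that the two are uncorrelated (the symbols L and sigma are used here for illustration):

```latex
\begin{aligned}
v_i - v_j &= (v_i - V) - (v_j - V) \\
\operatorname{Var}(v_i - v_j) &= \operatorname{Var}(v_i - V) + \operatorname{Var}(v_j - V)
  - 2\operatorname{Cov}(v_i - V,\; v_j - V) \\
&\approx L^2 + L^2 - 0 = 2L^2 \\
\sigma(v_i - v_j) &\approx \sqrt{2}\,L
\end{aligned}
```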

Thus, if we are comparing two identically-managed 100-stock momentum portfolios that rebalance semi-annually, our 95% confidence interval for performance dispersion due to timing luck is +/- 12.4% (2 x SQRT(2) x 4.4%).

Even for more diversified, lower turnover portfolios, this remains an issue. Consider a 400-stock low-volatility portfolio that is rebalanced quarterly. Empirical timing luck is still 0.5%, suggesting a 95% confidence interval of +/- 1.4% (2 x SQRT(2) x 0.5%).
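The arithmetic behind these intervals is simple enough to check directly. The sketch below uses a multiplier of 2 for the 95% level, as in the text; the function name is hypothetical.

```python
import math

def dispersion_interval(timing_luck, z=2.0):
    """Half-width of the interval for the return gap between two variations."""
    return z * math.sqrt(2.0) * timing_luck

print(f"{dispersion_interval(0.044):.1%}")  # 100-stock semi-annual momentum -> 12.4%
print(f"{dispersion_interval(0.005):.1%}")  # 400-stock quarterly low volatility -> 1.4%
```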

S&P 500 Style Index Examples

One critique of the above analysis is that it is purely hypothetical: the portfolios studied above aren’t really those offered in the market today.

We will take our analysis one step further and replicate (to the best of our ability) the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality indices. We then create different rebalance schedule variations for each. Note that the S&P 500 Low Volatility index rebalances quarterly, so there are only three possible rebalance variations to compute.

We see a meaningful dispersion in terminal wealth levels, even for the S&P 500 Low Volatility index, which appears at first glance in the graph to have little impact from timing luck.

Index	Minimum Terminal Wealth	Maximum Terminal Wealth
Enhanced Value	$4.45	$5.45
Momentum	$3.07	$4.99
Low Volatility	$6.16	$6.41
Quality	$4.19	$5.25

We should further note that there does not appear to be one set of rebalance dates that does significantly better than the others. For Value, FEB-AUG looks best while JUN-DEC looks the worst; for Momentum it’s almost precisely the opposite.

Furthermore, we can see that even seemingly closely related rebalances can have significant dispersion: consider MAY-NOV and JUN-DEC for Momentum. Here is a real doozy of a statistic: at one point, the MAY-NOV implementation for Momentum is down 50.3% while the JUN-DEC variation is down just 13.8%.

These differences are even more evident if we plot the annual returns for each strategy’s rebalance variations. Note, in particular, the extreme differences in Value in 2009, Momentum in 2017, and Quality in 2003.

Conclusion

In this study, we have explored the impact of rebalance timing luck on the results of smart beta / equity style portfolios.

We empirically tested this impact by designing a variety of portfolio specifications for four different equity styles (Value, Momentum, Low Volatility, and Quality). The specifications varied by concentration as well as rebalance frequency. We then constructed all possible rebalance variations of each specification to calculate the realized impact of rebalance timing luck over the test period (2000-2019).

In line with our mathematical model, we generally find that those strategies with higher turnover have higher timing luck and those that rebalance more frequently have less timing luck.

The sheer magnitude of timing luck, however, may come as a surprise to many. For reasonably concentrated portfolios (100 stocks) with semi-annual rebalance frequencies (common in many index definitions), annual timing luck ranged from roughly 1% to 4%, which translates to a 95% confidence interval in annual performance dispersion of about +/-3% to +/-12.5% (2 x SQRT(2) times the timing luck estimate).

The sheer magnitude of timing luck calls into question our ability to draw meaningful relative performance conclusions between two strategies.

We then explored more concrete examples, replicating the S&P 500 Enhanced Value, Momentum, Low Volatility, and Quality indices. In line with expectations, we find that Momentum (a high turnover strategy) exhibits significantly higher realized timing luck than a lower turnover strategy rebalanced more frequently (i.e. Low Volatility).

For these four indices, the amount of rebalance timing luck leads to a staggering level of dispersion in realized terminal wealth.

“But Corey,” you say, “this only has to do with systematic factor managers, right?”

Consider that most of the major equity style benchmarks are managed with annual or semi-annual rebalance schedules. Good luck to anyone trying to identify manager skill when your benchmark might be realizing hundreds of basis points of positive or negative performance luck a year.

Timing Luck and Systematic Value

By Corey Hoffstein

On July 29, 2019

In Craftsmanship, Risk & Style Premia, Value, Weekly Commentary

Summary

We have shown many times that timing luck – when a portfolio chooses to rebalance – can have a large impact on the performance of tactical strategies.
However, fundamental strategies like value portfolios are susceptible to timing luck, as well.
Once the rebalance frequency of a strategy is set, we can mitigate the risk of choosing a poor rebalance date by diversifying across all potential variations.
In many cases, this mitigates the risk of realizing poor performance from an unfortunate choice of rebalance date while achieving a risk profile similar to the top tier of potential strategy variations.
By utilizing strategies that manage timing luck, investors can more accurately assess whether performance differences arise from luck or skill.

On August 7th, 2013 we wrote a short blog post titled The Luck of Rebalance Timing. That means we have been prattling on about the impact of timing luck for over six years now (with apologies to our compliance department…).

(For those still unfamiliar with the idea of timing luck, we will point you to a recent publication from Spring Valley Asset Management that provides a very approachable introduction to the topic.¹)

While most of our earliest studies related to the impact of timing luck in tactical strategies, over time we realized that timing luck could have a profound impact on just about any strategy that rebalances on a fixed frequency. We found that even a simple fixed-mix allocation of stocks and bonds could see annual performance spreads exceeding 700bp due only to the choice of when they rebalanced in a given year.

In seeking to generalize the concept, we derived a formula that would estimate how much timing luck a strategy might have. The details of the derivation can be found in our paper recently published in the Journal of Index Investing, but the basic formula is:

L = T / (2F) x S

Here T is strategy turnover, F is how many times per year the strategy rebalances, and S is the volatility of a long/short portfolio capturing the difference between what the strategy is currently invested in versus what it could be invested in.

We’re biased, but we think the intuition here works out fairly nicely:

The higher a strategy’s turnover, the greater the impact of our choice of rebalance dates. For example, if we have a value strategy that has 50% turnover per year, an implementation that rebalances in January versus one that rebalances in July might end up holding very different securities. On the other hand, if the strategy has just 1% turnover per year, we don’t expect the differences in holdings to be very large and therefore timing luck impact would be minimal.
The more frequently we rebalance, the lower the timing luck. Again, this makes sense as more frequent rebalancing limits the potential difference in holdings of different implementation dates. Consider once more a value strategy with 50% turnover. If our portfolio rebalances every other month, there are two potential implementations: one that rebalances January, March, May, etc. and one that rebalances February, April, June, etc. We would expect the difference in portfolio holdings to be much more limited than in the case where we rebalance only annually.²
The last term, S, is most easily explained with an example. If we have a portfolio that can hold either the Russell 1000 or the S&P 500, we do not expect there to be a large amount of performance dispersion regardless of when we rebalance or how frequently we do so. The volatility of a portfolio that is long the Russell 1000 and short the S&P 500 is so small, it drives timing luck near zero. On the other hand, if a portfolio can hold the Russell 1000 or be short the S&P 500, differences in holdings due to different rebalance dates can lead to massive performance dispersion. Generally speaking, S is larger for more highly concentrated strategies with large performance dispersion in their investable universe.

Timing Luck in Smart Beta

To date, we have not meaningfully tested timing luck in the realm of systematic equity strategies.³ In this commentary, we aim to provide a concrete example of the potential impact.

A few weeks ago, however, we introduced our Systematic Value portfolio, which seeks to deliver concentrated exposure to the value style while avoiding unintended process and timing luck bets.

To achieve this, we implement an overlapping portfolio process. Each month we construct a concentrated deep value portfolio, selecting just 50 stocks from the S&P 500. However, because we believe the evidence suggests that value is a slow-moving signal, we aim for a holding period of 3-to-5 years. To that end, our capital is divided across the prior 60 months of portfolios.⁴
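A minimal sketch of how capital might be divided across trailing monthly portfolios is shown below. It assumes each monthly snapshot receives an equal share and ignores weight drift between snapshots; the tickers and the function name are hypothetical.

```python
from collections import defaultdict

def overlapping_weights(snapshots, window=60):
    """Average the most recent `window` monthly portfolios into one set of weights."""
    recent = snapshots[-window:]
    blended = defaultdict(float)
    for portfolio in recent:                  # each portfolio maps ticker -> weight
        for ticker, weight in portfolio.items():
            blended[ticker] += weight / len(recent)
    return dict(blended)

# Hypothetical three-month history of two-stock portfolios.
history = [{"AAA": 0.5, "BBB": 0.5},
           {"AAA": 0.5, "CCC": 0.5},
           {"CCC": 0.5, "DDD": 0.5}]
weights = overlapping_weights(history, window=3)
print(weights)
```

A stock that keeps being selected month after month accumulates weight, while a stock that drops out of the signal is phased out gradually rather than sold all at once.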

Which all means that we have monthly snapshots of deep value⁵ portfolios going back to November 2012, providing us data to construct all sorts of rebalance variations.

The Luck of Annual Rebalancing

Given our portfolio snapshots, we will create annually rebalanced portfolios. With monthly portfolios, there are twelve variations we can construct: a portfolio that reconstitutes each January; one that reconstitutes each February; a portfolio that reconstitutes each March; et cetera.

Below we plot the equity curves for these twelve variations.

Source: CSI Analytics. Calculations by Newfound Research. Results are hypothetical. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Past performance is not an indicator of future results.

We cannot stress enough that these portfolios are all implemented using a completely identical process. The only difference is when they run that process. The annualized returns range from 9.6% to 12.2%. And those two portfolios with the largest disparity rebalanced just a month apart: January and February.

To avoid timing luck, we want to diversify when we rebalance. The simplest way of achieving this goal is through overlapping portfolios. For example, we can build portfolios that rebalance annually, but allocate to two different dates. One portfolio could place 50% of its capital in the January rebalance index and 50% in the July rebalance index.

Another variation could place 50% of its capital in the February index and 50% in the August index.⁶ There are six possible variations, which we plot below.

The best performing variation (January and July) returned 11.7% annualized, while the worst (February and August) returned 9.7%. While the spread has narrowed, it would be dangerous to confuse 200bp annualized for alpha instead of rebalancing luck.

We can go beyond just two overlapping portfolios, though. Below we plot the three variations that contain four overlapping portfolios (January-April-July-October, February-May-August-November, and March-June-September-December). The best variation now returns 10.9% annualized while the worst returns 10.1% annualized. We can see how overlapping portfolios are shrinking the variation in returns.

Finally, we can plot the variation that employs 12 overlapping portfolios. This variation returns 10.6% annualized; almost perfectly in line with the average annualized return of the underlying 12 variations. No surprise: diversification has neutralized timing luck.
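The averaging at work here can be sketched as follows. The return data is made up purely for illustration, the blend assumes equal weights that are reset every period, and the helper names are hypothetical.

```python
def tranche_groups(n_tranches):
    """Evenly spaced groups of rebalance months (0 = January) for n overlapping portfolios."""
    step = 12 // n_tranches
    return [[offset + k * step for k in range(n_tranches)] for offset in range(step)]

def blend(variation_returns, months):
    """Equal-weight the period returns of the chosen rebalance-month variations."""
    n_periods = len(variation_returns[months[0]])
    return [sum(variation_returns[m][t] for m in months) / len(months)
            for t in range(n_periods)]

# Hypothetical per-period returns for the twelve annual variations (not real data).
variation_returns = {m: [0.010 + 0.001 * ((m * 5) % 12), 0.020 - 0.001 * ((m * 7) % 12)]
                     for m in range(12)}

def spread(n_tranches):
    """Gap between the best and worst blend, measured on summed period returns."""
    totals = [sum(blend(variation_returns, g)) for g in tranche_groups(n_tranches)]
    return max(totals) - min(totals)

for n in (1, 2, 4, 12):
    print(n, "overlapping portfolio(s): spread", round(spread(n), 6))
```

Because each four-portfolio blend is itself an average of two-portfolio blends, the spread cannot widen as more evenly spaced tranches are added, and it collapses to zero when all twelve are combined.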

But besides being “average by design,” how can we measure the benefits of diversification?

As with most ensemble approaches, we see a reduction in realized risk metrics. For example, below we plot the maximum realized drawdown for annual variations, semi-annual variations, quarterly variations, and the monthly variation. While the dispersion is limited to just a few hundred basis points, we can see that the diversification embedded in the monthly variation is able to reduce the bad luck of choosing an unfortunate rebalance date.
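For reference, maximum drawdown can be computed from a return series as in the sketch below; the sample returns are hypothetical.

```python
def max_drawdown(returns):
    """Largest peak-to-trough loss of the equity curve implied by periodic returns."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)                 # running high-water mark
        worst = min(worst, wealth / peak - 1.0)  # deepest loss from that mark
    return worst

print(f"{max_drawdown([0.10, -0.20, 0.05, -0.10, 0.30]):.1%}")  # -24.4%
```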

Just Rebalance More Frequently?

One of the major levers in the timing luck equation is how frequently the portfolio is rebalanced. However, we firmly believe that while rebalancing frequency impacts timing luck, timing luck should not be a driving factor in our choice of rebalance frequency.

Rather, rebalance frequency choices should be a function of the speed at which our signal decays (e.g. fast-changing signals such as momentum versus slow-changing signals like value) versus implementation costs (e.g. explicit trading costs, market impact, and taxes). Only after this choice is made should we seek to limit timing luck.

Nevertheless, we can ask the question, “how does rebalancing more frequently impact timing luck in this case?”

To answer this question, we will evaluate quarterly-rebalanced portfolios. The distinction here from the quarterly overlapping portfolios above is that the entire portfolio is rebalanced each quarter rather than only a quarter of the portfolio. Below, we plot the equity curves for the three possible variations.

The best performing variation returns 11.7% annualized while the worst returns 9.7% annualized, for a spread of 200 basis points. This is actually larger than the spread we saw with the three quarterly overlapping portfolio variations, and is likely due to the fact that turnover within the portfolios increased meaningfully.

While we can see that increasing the frequency of rebalancing can help, in our opinion the choice of rebalance frequency should be distinct from the choice of managing timing luck.

Conclusion

In our opinion, there are at least two meaningful conclusions here:

The first is for product manufacturers (e.g. index issuers) and is rather simple: if you’re going to have a fixed rebalance schedule, please implement overlapping portfolios. It isn’t hard. It is literally just averaging. We’re all better off for it.

The second is for product users: realize that performance dispersion between similarly-described systematic strategies can be heavily influenced by when they rebalance. The excess return may really just be a phantom of luck, not skill.

The solution to this problem, in our opinion, is to either: (1) pick an approach and just stick to it regardless of perceived dispersion, accepting the impact of timing luck; (2) hold multiple approaches that rebalance on different days; or (3) implement an approach that accounts for timing luck.

We believe the first approach is easier said than done. And without a framework for distinguishing between timing luck and alpha, we’re largely making arbitrary choices.

The second approach is certainly feasible but has the potential downside of requiring more holdings as well as potentially forcing an investor to purchase an approach they are less comfortable with. For example, while blending IWD (Russell 1000 Value), RPV (S&P 500 Pure Value), VLUE (MSCI U.S. Enhanced Value), and QVAL (Alpha Architect U.S. Quantitative Value) may create a portfolio that rebalances on many different dates (annual in May; annual in December; semi-annual in May and November; and quarterly, respectively), it also introduces significant process differences. That said, research suggests that investors may benefit from further manager/process diversification.

For investors with conviction in a single strategy implementation, the last approach is certainly the best. Unfortunately, as far as we are aware, there are only a few firms that actively implement overlapping portfolios (including Newfound Research, O’Shaughnessy Asset Management, AQR, and Research Affiliates). Until more firms adopt this approach, timing luck will continue to loom large.

What do portfolios and teacups have in common?

By Corey Hoffstein

On December 17, 2018

In Portfolio Construction, Risk Management, Weekly Commentary

Summary

Portfolio risk is often measured as the variance of returns over time. Another form of risk is the variance of terminal wealth that can arise from small variations in strategy inputs or asset returns.
Strategies or portfolios that are more sensitive to small changes in inputs are inherently “fragile.”
Fragile strategy design makes it difficult to rely upon backtests or historical results in setting forward expectations.
We explore how diversification across the “what,” “how,” and “when” axes of portfolio construction can help reduce strategy fragility.

Introduction

At Newfound, we spend a lot less time trying to figure out how to be more right than we spend trying to figure out how to be less wrong. One area of particular interest for us is the idea of unintended bets: the exposures in a portfolio we may not even be aware of. And if we knew we had the exposure, we might not even want it.

For example, consider a portfolio that invests in either broad U.S., broad international, or broad emerging market equities based upon valuations. A significant tilt towards non-U.S. assets may be a valuation-driven decision, but for U.S. investors it creates significant exposure to fluctuations in the U.S. dollar versus foreign currencies.

Of course, exposures are not limited only to assets. Exposures may be broader macro-economic, stylistic, thematic, geographic, or even political factors.

These unintended bets can go far beyond explicit and implicit exposures. In our example, the choice of how to measure value may lead to meaningfully different portfolios, despite the same overarching thesis. For example, a naïve CAPE ratio versus adjusting for differences in relative sector composition dramatically alters the view of whether international equities are significantly cheaper than U.S. equities. These potential differences capture what we like to call “model specification risk.”

Finally, we can be subject to unintended bets based upon when the portfolio is re-evaluated and reconstituted. Evaluating valuations in January, for example, may lead to a different decision versus evaluating them in July.

How can we avoid these unintended bets? At Newfound, we believe that the answer falls back to diversification: not only in the traditional sense of what we invest in, but also across how we make decisions and when we make them.

When left uncontrolled, unintended bets can make a strategy incredibly fragile.

What, precisely, does it mean for a strategy to be fragile? A strategy is fragile when small variations of strategy inputs – be it asset returns or other measures – lead to meaningful dispersion in realized results.

Now we want to distinguish between volatility and fragility. Volatility is the dispersion of strategy returns across time, while fragility is the dispersion in end-of-period wealth across variations of the strategy.

As an example, a portfolio that invests only in the S&P 500 is very volatile but not particularly fragile. Given the last ten years of returns for the S&P 500, slight variations in annual returns would not lead to significant dispersion in end-of-period wealth. On the other hand, a strategy that flips a coin every December and invests for the next year in the S&P 500 when it lands on heads or short-term U.S. Treasuries when it lands on tails would have lower expected volatility than the S&P 500 but would be much more fragile. We need simply consider a few scenarios (e.g. all heads or all tails) to understand the potential dispersion such a strategy is subject to.

In the remainder of this commentary, we will demonstrate how diversification across the what, how, and when axes can reduce strategy fragility.

The Experiment Setup

Since a large degree of our focus at Newfound is on managing trend equity mandates, we will explore fragility through the lens of the style of measuring trends. For those unfamiliar with the approach, trend equity strategies aim to capture a significant portion of equity market growth while avoiding substantial and prolonged drawdowns through the application of trend following. A naïve implementation of such an idea would be to invest in the S&P 500 when its prior 12-month return has been positive and invest in short-term U.S. Treasuries otherwise.

To learn something about the fragility of a strategy, we are going to have to inject some randomness. After all, no amount of history will tell us about the fragility of a teacup that has spent its entire life sitting on a shelf; we will need to see it fall on the floor to actually learn something.

As with our recent commentary When Simplicity Met Fragility, we will inject randomness by adding white noise to asset returns. Specifically, we will add to daily returns a draw from a random normal distribution with mean 0% and standard deviation 0.025%. Using this slightly altered history, we will then run our investment strategy.

By performing this process a large number of times (10,000 in this commentary), we can explore how the outcome of the strategy is impacted by these slight variations in return history. The greater the dispersion in results, the more fragile the strategy is.
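The perturbation procedure can be sketched as below. The synthetic return history, the toy trend rule (with cash assumed to earn 0%), and the reduced trial count are all assumptions made for illustration; only the noise standard deviation of 0.025% is taken from the text.

```python
import random
import statistics

def terminal_wealth(returns):
    wealth = 1.0
    for r in returns:
        wealth *= 1.0 + r
    return wealth

def toy_trend(returns, lookback=20):
    """Hold the asset only when the trailing `lookback`-day return sum is positive."""
    out = []
    for t in range(len(returns)):
        signal = sum(returns[max(0, t - lookback):t])  # uses data before day t only
        out.append(returns[t] if signal > 0 else 0.0)
    return out

def perturbed_outcomes(history, strategy, n_trials=200, noise_sd=0.00025, seed=0):
    """Re-run `strategy` on noisy copies of history; return terminal wealth per trial."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_trials):
        noisy = [r + rng.gauss(0.0, noise_sd) for r in history]
        outcomes.append(terminal_wealth(strategy(noisy)))
    return outcomes

history_rng = random.Random(1)
history = [history_rng.gauss(0.0003, 0.01) for _ in range(500)]  # synthetic daily returns
buy_and_hold = perturbed_outcomes(history, lambda rets: rets)
trend = perturbed_outcomes(history, toy_trend)
print("buy-and-hold dispersion:", statistics.pstdev(buy_and_hold))
print("toy trend dispersion:", statistics.pstdev(trend))
```

The standard deviation of terminal wealth across trials is one simple way to summarize fragility: the wider it is, the more the outcome hinges on tiny changes to the input history.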

To demonstrate how diversification across the three different axes can affect fragility, we will start with a naïve trend equity strategy – investing in broad U.S. equities using a single trend model that is rebalanced on a monthly basis – and vary the three components in isolation.

The What

The “what” axis simply asks, “what are we invested in?”

How can our choice of “what” affect fragility? Consider a slight variation to our coin-flip strategy from before. Instead of flipping a single coin, we will now flip two coins. The first coin determines whether we invest 50% of the portfolio in either the S&P 500 or short-term U.S. Treasuries, while the second coin determines whether we invest the other 50% of the portfolio in either the Russell 1000 or short-term U.S. Treasuries.

In our single coin example, each year we expected to invest in the S&P 500 50% of the time and in short-term U.S. Treasuries 50% of the time. With two coins, we now expect to be fully invested 25% of the time, partially invested 50% of the time, and divested 25% of the time.

Let’s take this notion to further limits. Consider now flipping 100 coins where each determines the allocation decision for 1% of our portfolio, where heads leads to an investment in a large-cap U.S. equity portfolio and tails leads to an investment in short-term U.S. Treasuries. Now being fully invested or fully divested is a vanishingly unlikely event; in fact, for a given year there is a roughly 95% chance that your allocation to equities falls between 40% and 60%.¹
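These probabilities follow from the binomial distribution and can be checked exactly. The sketch assumes fair, independent coins; the function name is hypothetical.

```python
from math import comb

def prob_heads_between(n_coins, low, high):
    """Exact probability that the number of heads lands in [low, high] for fair coins."""
    return sum(comb(n_coins, k) for k in range(low, high + 1)) / 2 ** n_coins

# Two coins: fully invested, partially invested, fully divested.
print(prob_heads_between(2, 2, 2), prob_heads_between(2, 1, 1), prob_heads_between(2, 0, 0))

# One hundred coins: equity allocation between 40% and 60%.
p_mid = prob_heads_between(100, 40, 60)
print(f"{p_mid:.1%}")
```

With 100 coins the probability of landing between 40 and 60 heads comes out slightly above 95%, while the probability of all heads is 1 in 2^100.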

Even though we’ve applied the exact same process to each investment, diversifying across more investments has dramatically reduced the fragility of our coin-flipping strategy.

Now let’s translate this from the theoretical to the practical. We will begin with a simple trend following strategy that invests in the underlying asset when prior 12-1 month returns have been positive and otherwise invests in the risk-free rate, re-evaluating the trend at the end of each month.

To explore the impact of diversifying our what, we will implement this strategy five different ways:

A single in-or-out decision on broad U.S. equities.
Applied across 5 equally-weighted U.S. equity industry groups.
Applied across 12 equally-weighted U.S. equity industry groups.
Applied across 30 equally-weighted U.S. equity industry groups.
Applied across 48 equally-weighted U.S. equity industry groups.
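A rough sketch of how the rule might be applied across several equally-weighted groups follows. We assume here that the "12-1 month" return means the compounded return from twelve months ago to one month ago (skipping the most recent month), and that monthly returns arrive as a months-by-groups array; both are assumptions of this example, as are the function names.

```python
import numpy as np


def trend_signal_12_1(monthly_returns):
    """1.0 where the compounded return from 12 months ago to 1 month ago
    is positive, else 0.0. Rows are months, columns are assets.
    The first 12 rows have no signal and are left as NaN."""
    r = np.asarray(monthly_returns, dtype=float)
    n_months, n_assets = r.shape
    signal = np.full((n_months, n_assets), np.nan)
    for t in range(12, n_months):
        # Months t-12 .. t-2 inclusive: 11 months, skipping month t-1.
        window = r[t - 12:t - 1]
        signal[t] = (np.prod(1.0 + window, axis=0) - 1.0 > 0).astype(float)
    return signal


def equity_allocation(monthly_returns):
    """Equal-weight the in-or-out decisions across assets for each month."""
    return trend_signal_12_1(monthly_returns).mean(axis=1)


# Synthetic returns for 5 hypothetical industry groups (illustration only).
fake_returns = np.random.default_rng(0).normal(0.006, 0.05, size=(120, 5))
allocation = equity_allocation(fake_returns)
```

With a single asset the allocation can only be 0% or 100%; with five groups it moves in 20% steps, which is the mechanical source of the reduced fragility.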

The graph below plots the distribution of log difference in terminal wealth against the median outcome for each of these five approaches. Lines within each “violin” show the 25th, 50th, and 75th percentiles.

The graph clearly demonstrates that by increasing our exposure across the “what” axis, the dispersion in terminal wealth is dramatically reduced.

Source: Kenneth French Data Library. Calculations by Newfound Research.

But why is reduced dispersion in terminal wealth necessarily better?

It implies a greater consistency in outcome, which is not only important for setting forward expectations, but is also important for evaluating past performance (whether backtested or live). This evidence tells us that if we are evaluating a trend equity strategy that employs a single model to make in-or-out decisions on broad U.S. equities on a monthly basis, it will be nearly impossible to tell whether the realized results are in line with reasonable expectations or overly optimistic (we can probably guess that they aren’t overly pessimistic, as those sorts of returns typically aren’t marketed).

To justify a concentration in the “what” axis, we would have to demonstrate that the worst-case scenarios would still represent a meaningful improvement in expected terminal wealth versus a more diversified approach.

It should be noted that our experiment design prohibits dispersion from ever being fully reduced, as we are injecting randomness into past returns. Even if no strategy is applied, there will be some inherent dispersion in final wealth. For example, below we plot the dispersion that occurs simply from adding randomness to past returns with a buy-and-hold approach.

Increasing the number of assets in the portfolio inherently reduces dispersion for buy-and-hold because diversification helps drive the expected impact of the injected randomness towards its mean: zero. With only one asset, on the other hand, outlier events are free to wreak havoc on results.
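The averaging effect is easy to see numerically. The sketch below assumes the injected noise is independent across assets (an assumption of this example) and measures the standard deviation of the equal-weight average noise for different asset counts; under independence it should shrink roughly with the square root of the number of assets.

```python
import numpy as np


def avg_noise_std(n_assets, noise_std=0.00025, n_days=20_000, seed=0):
    """Empirical std of the equal-weight average of independent daily noise."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_std, size=(n_days, n_assets))
    return float(noise.mean(axis=1).std())


# Asset counts mirror the industry group counts used in this section.
noise_by_count = {n: avg_noise_std(n) for n in (1, 5, 12, 30, 48)}
```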

Source: Kenneth French Data Library. Calculations by Newfound Research.

Note that adding a strategy on top of buy-and-hold can exacerbate the fragility issue, making diversification that much more important.

The How

The “how” axis asks, “how are we making investment decisions?”

Many investors are already somewhat familiar with diversification along the “how” axis, often diversifying their active exposures across multiple managers who might have similar investment mandates but slightly different processes.

We like to call this “process diversification” and think of it as akin to the parable of the blind men and the elephant. Each blind man touches a different part of the elephant and pronounces his belief in what he is touching based upon his isolated view. The blind man touching the leg, for example, might think he is touching a sturdy tree while the blind man touching the tail might believe he is grabbing a rope.

None is correct in isolation but taken together we may gain a more well-rounded picture.

Similarly, two managers may claim to invest based upon valuations, but the manner in which they do so gives them a very different picture of where value can be found.

The idea of process diversification was explored in the 1999 paper “Do You Need More than One Manager for a Given Equity Style?” by Franklin Fant and Edward O’Neal. Fant and O’Neal found that while a multi-manager approach does very little for return variability across time (i.e. portfolio volatility), it does a lot for end-of-period wealth variability. They find this to be true across almost all equity style box categories. In other words: taking a multi-manager approach can reduce fragility.

Let us return to our prior coin flip example. Instead of making a choice to invest in the S&P 500 based upon a coin-flip, however, we will combine a number of different signals. For example, we might flip a coin, roll a die, measure the weather, and look at the second hand of a clock. Each signal gives us some sort of in-or-out decision, and we average these decisions together to get our allocation. As before, as we incorporate more signals, we decrease the probability that we end up with extreme allocations, leading to a more consistent terminal wealth distribution.

Again, we should stress here that the objective is not just outright elimination of dispersion in terminal wealth. After all, if that were our sole pursuit, we could simply stuff our money under our mattress. Rather, assuming we will be implementing some active investment strategy that we hope has a positive long-term expected return, our aim should be to reduce the dispersion in terminal wealth for that strategy.

Of course, in investing we would not expect the processes to be entirely independent. With trend following, for example, most popular models are actually mathematically linked to one another, and therefore generate signals that are highly correlated. Nevertheless, even modest diversification can have meaningful benefits with respect to strategy fragility.
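To get a feel for how much correlation erodes the benefit, consider the standard result for the variance of an average of equally correlated variables. The sketch below is a stylized illustration: it assumes every pair of signals shares the same correlation and that each in-or-out signal has a standard deviation of 0.5 (a fair coin), neither of which comes from the models used in this commentary.

```python
from math import sqrt


def avg_signal_std(n_signals, corr, signal_std=0.5):
    """Std of the average of n equally correlated signals.

    Var(mean) = signal_std^2 * (corr + (1 - corr) / n_signals)
    """
    variance = signal_std ** 2 * (corr + (1.0 - corr) / n_signals)
    return sqrt(variance)


# Compare independent signals with highly correlated ones (hypothetical 0.8).
table = {
    (k, rho): avg_signal_std(k, rho)
    for k in (1, 3, 9)
    for rho in (0.0, 0.8)
}
```

Even at a pairwise correlation of 0.8, the standard deviation of the blended allocation falls as models are added, though it can never drop below `signal_std * sqrt(corr)`.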

To explore the impact of diversification along the how axis, we implement our trend following strategy six different ways. Each invests in broad U.S. equities and rebalances monthly but differs in the number of trend-following models employed.²

The results are plotted below.

Source: Kenneth French Data Library. Calculations by Newfound Research.

Again, we can see that increased diversification across the how axis dramatically reduces dispersion in terminal wealth. Our takeaway is largely the same: without an ex-ante view as to which particular model (or group of models) is best (i.e. a view of how to be more right), diversification can lead to greater consistency in results. We will be less wrong.

A subtler conclusion of this analysis is that it should be very, very difficult to conclude that one model is better than another. We can see that if we risk selecting just one model to govern our process, seemingly minor variations in historical returns can lead to dramatically different terminal wealth results, as evidenced by the bulging distribution. Inverting this line of thinking, we should also be suspect of any backtest that seeks to demonstrate the superiority of a given model using a single backtest. For example, just because a 12-1 month total return model performs better than a 10-month moving average model on historical S&P 500 returns, we should be highly skeptical as to the robustness of the conclusion that the 12-1 model is best.

The When

The “when” axis asks, “when are we making our investment decision?”

This is an oft overlooked question in public markets, but it is commonly addressed in the world of private equity and venture capital. Due to the illiquid nature of those markets, investors will often attempt to diversify their business cycle risk by establishing positions in multiple funds over time, giving them exposure to different “vintages.” The idea here is simple: the opportunity set available at different points in time can vary and if we allocate all of our earmarked capital to a particular year, we may miss out on later opportunities.

Consider our original coin-flipping example where we flipped a single coin every December to determine whether we would buy the S&P 500 or hold our capital in short-term Treasuries. But why was it necessary that we make the decision in December? Why not July? Or January? Or September?

While we would not expect there to be point-in-time risk for coin flipping, we can still consider the net effect of a vintage-based allocation methodology. Here we will assume that we flip a coin each month and rebalance 1/12th of our capital based upon the result.

Again, the probability of allocating to the extremes (100% invested or 100% divested) is dramatically reduced (each has approximately a 0.02% chance of occurring) and we reduce strategy fragility to any specific coin flip.
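A quick check of that figure, plus a small simulation of the monthly-vintage allocation, is sketched below. It assumes fair, independent monthly coin flips; the function name and simulation size are our own choices for illustration.

```python
import numpy as np

# All 12 monthly flips land heads (the same probability applies to all tails).
P_EXTREME = 0.5 ** 12


def simulate_vintage_allocations(n_years, seed=0):
    """Equity weight when each of 12 monthly flips controls 1/12th of capital."""
    rng = np.random.default_rng(seed)
    flips = rng.integers(0, 2, size=(n_years, 12))  # 1 = heads = equities
    return flips.mean(axis=1)


allocations = simulate_vintage_allocations(200_000)
share_fully_invested = float((allocations == 1.0).mean())
```

`P_EXTREME` works out to about 0.024%, in line with the approximate 0.02% quoted above.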

But just how impactful is this notion? Below we plot the rolling 1-year total return difference between two 60% S&P 500 / 40% 5-year U.S. Treasury fixed-mix portfolios, with one being rebalanced in February and one in August. Even for this highly simplified example, we can see that the total return spread between the two portfolios blows out to over 700 basis points in March 2010 due to the fact that the February portfolio rebalanced back into equities at nearly the exact bottom of the crisis.

Source: Global Financial Data. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.

To increase diversification across the “when” axis, we want to increase the number of vintages we deploy. For our trend following example, we will assume that the portfolio allocates between broad U.S. equities and the risk-free rate based upon a single model, but with an increasing number of evenly-spaced vintages. Again, we will run 10,000 simulations that each slightly perturb historical U.S. equity market returns and compare the terminal wealth variation for approaches that employ a different number of vintages.
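One way to sketch evenly-spaced vintages in code is shown below. This is a simplified illustration, assuming a daily in-or-out signal is already available and that each tranche simply holds the signal observed on its most recent rebalance day; the function and parameter names are hypothetical.

```python
import numpy as np


def tranche_weights(signal, period, n_tranches):
    """Average equity weight across evenly spaced tranches.

    signal: daily in-or-out targets (1.0 or 0.0).
    period: days between rebalances for each tranche (e.g. 21).
    n_tranches: number of vintages, offset by period / n_tranches days.
    """
    signal = np.asarray(signal, dtype=float)
    days = np.arange(len(signal))
    weights = np.zeros((n_tranches, len(signal)))
    for k in range(n_tranches):
        offset = (k * period) // n_tranches
        # Index of the most recent rebalance day for this tranche.
        last_rebalance = ((days - offset) // period) * period + offset
        # Before a tranche's first rebalance, fall back to the first observation.
        last_rebalance = np.clip(last_rebalance, 0, None)
        weights[k] = signal[last_rebalance]
    return weights.mean(axis=0)


# Hypothetical signal that flips every 30 days, traded with 1 versus 4 tranches.
example_signal = (np.arange(252) // 30) % 2
single = tranche_weights(example_signal, period=20, n_tranches=1)
four = tranche_weights(example_signal, period=20, n_tranches=4)
```

With one tranche the weight jumps between 0% and 100%; with four it moves in 25% steps as each vintage catches up to the signal.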

We can see in the graph below that, as with the other axes of diversification, as we increase the number of vintages employed, the variance decreases. While the 25th and 75th percentiles do not decrease as dramatically as for the other axes, we can see that the extreme variations are reined in substantially when we move from 1 monthly tranche to 4 weekly tranches.

Source: Kenneth French Data Library. Calculations by Newfound Research.

Conclusion

We see two critical conclusions from this analysis:

To develop confidence in achieving our objective we have to consider our sensitivity to unintended bets that may be included within the portfolio.

Fragility makes it incredibly difficult to distinguish between luck and skill, particularly as strategy fragility increases. This is true for both backtested and live performance.

To conclude our analysis, below we present a graph that combines diversification across all three axes. We again run 10,000 samples, randomly perturbing returns. For each sample, we then run four variations:

A single, randomly selected model run in broad U.S. equities that is rebalanced monthly.
A random selection of 3 models run on 5 industry groups in 2 bi-weekly tranches.
A random selection of 6 models run on 12 industry groups in 4 weekly tranches.
A random selection of 9 models run on 30 industry groups in 20 daily tranches.

It should come as no surprise that as we increase the amount of diversification across all three axes, the dispersion in terminal wealth is dramatically reduced.³

Source: Kenneth French Data Library. Calculations by Newfound Research.

It is also important to note that while our analysis focused on trend following strategies, this same line of thinking applies across all investment approaches. As an example, consider a quantitative value manager who buys the top five cheapest stocks, as measured by price-to-book, in the S&P 500 each December and then holds them for the next year. Questions worth pondering are:

What does it say about our conviction when the 6th stock in the list is incredibly close to the 5th stock?
What happens if some of our measures of book value are incorrect (or even just outdated)?
How different would the portfolio look if we ranked on another value measure (e.g. price-to-earnings)?
How different would the opportunity set be if we ranked every June versus every December?

While low levels of diversification across the what, how, and when axes are not necessarily an indicator that a model is inherently fragile, they should be a red flag that more effort is required to demonstrate that it is not fragile.

Quantifying Timing Luck

By Corey Hoffstein

On January 22, 2018

In Craftsmanship, Risk Management, Weekly Commentary

This blog post is available as a PDF download here.

Summary

When two managers implement identical strategies, but merely choose to rebalance on different days, we call variance between their returns “timing luck.”
Timing luck can easily be overcome by using a method of overlapping portfolios, but few firms do this in practice.
We believe the magnitude of timing luck impact is much larger than most believe, particularly in tactical strategies.
We derive a model to estimate the impact of timing luck, using only values that can be easily estimated from portfolios implemented without the overlapping portfolio technique.
We find that timing luck looms large in many different types of strategies.

As a pre-emptive warning, this week’s commentary is a math derivation. We think it is a very relevant derivation – one which we have not seen before – but a derivation nonetheless. If math is not your thing, this might be one to skip.

If math is your thing: consider this a request for comments. The derivation here will be a rather informal sketch, and we think there are other improvements still lingering.

What is “Timing Luck?”

The basic concept of timing luck is that when we choose to rebalance can have a profound impact on our performance results. For example, if we rebalance an investment strategy once a month, the choice to rebalance at the end of the month will lead to different performance than had we elected to rebalance mid-month.

We call this performance differential “timing luck,” and we believe it is an overlooked, non-negligible portfolio construction risk.

As an example, consider a simple stock/cash timing model that rebalances monthly, investing in a broad U.S. equity index when its 12-1 month return is positive, and a constant maturity 1-year U.S. Treasury index otherwise. Depending on which day of the month you choose to rebalance (we will assume 21 variations to represent 21 trading days), your results may be dramatically different.

Source: Kenneth French Data Library, Federal Reserve of St. Louis. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.

The best performing strategy had an annualized return of 11.1%, while the worst returned just 9.6%. Compounded over 55 years, that 150 basis point (“bps”) differential leads to an astounding difference in final wealth. With a standard deviation between 50-year annualized returns of 0.42%, the 1-year annualized estimate of performance variation due to timing luck is 314bps!

Again, an identical process is employed: the only difference between these results is the choice of what day of the month to rebalance.

That small choice, and the good luck or misfortune it realizes, can easily be the difference between “hired” and “fired.”

Is There a Solution to Timing Luck?

In the past, we have argued that overlapping portfolios can be utilized to minimize the impact of timing luck. The idea of overlapping portfolios is as follows: given an investment process and a holding period, we can invest across multiple managers that invest utilizing the same process but have offset holding periods.[1]

For example, below each manager has a four time-step holding period, and we utilize four managers to minimize timing luck from a single implementation.

The proof that this approach minimizes timing luck is as follows.

Assume that we have N managers, all following an identical investment process with identical holding period, but whose rebalance points are offset from one another by one period.

Consider that at any point in time, we can define the portfolio of Manager #2 to be the portfolio of Manager #1 plus a dollar-neutral long/short portfolio that captures the differences in holdings between them. Similarly, Manager #3’s portfolio can be thought of as Manager #2’s portfolio plus a dollar-neutral long/short portfolio. This continues in a circular manner, where Manager #1’s portfolio can be thought of as Manager #N’s portfolio plus a dollar-neutral long/short.

Given that the managers all follow an identical process, we would expect them to have the same long-term expected return. Thus, the expected return of the dollar-neutral long/short portfolios is zero.

However, the variance of the dollar-neutral long/short portfolios captures the risk of timing luck.

In allocating capital between the N portfolios, our goal is to minimize timing luck. Put another way, we want to find the allocation that results in the minimum variance portfolio of the long/short portfolios. Fortunately, there is a simple, closed form solution for calculating the minimum variance portfolio:

$$
w = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^{T}\Sigma^{-1}\mathbf{1}}
$$

Here, w is our solution (an Nx1 vector of weights), Sigma ($\Sigma$) is the covariance matrix, and $\mathbf{1}$ is an Nx1 vector of 1s. To solve this equation, we need the covariance matrix between the long/short portfolios. Since each portfolio is employing an identical process, we can assume that each of the long/short portfolios should have equal variance. Without loss of generality, we can assume variances are equal to 1 and replace our covariance matrix, Sigma, with a correlation matrix, C.

The correlations between long/short portfolios will largely depend on the process in question and the amount of overlap between portfolios. That said, because each manager runs an identical process, we would expect the long-term correlation between Portfolio #2’s long/short and Portfolio #1’s long/short to be identical to the correlation between Portfolio #3’s long/short and Portfolio #2’s. Similarly, the correlation between Portfolio #3’s and Portfolio #1’s long/shorts should be the same as the correlation between Portfolio #N’s and Portfolio #2’s.

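Following this logic (and remembering the circular nature of the rebalances), we can ignore exact numbers and fill in a correlation matrix using variables, where $\rho_j$ denotes the correlation between long/shorts whose rebalances are j offsets apart:

$$
C = \begin{pmatrix}
1 & \rho_1 & \rho_2 & \cdots & \rho_2 & \rho_1 \\
\rho_1 & 1 & \rho_1 & \cdots & \rho_3 & \rho_2 \\
\rho_2 & \rho_1 & 1 & \cdots & \rho_4 & \rho_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
\rho_1 & \rho_2 & \rho_3 & \cdots & \rho_1 & 1
\end{pmatrix}
$$

This correlation matrix has two special properties. First, being a correlation matrix, it is symmetric. Second, it is circulant: each row is rotated one element to the right of the preceding row. A special property of a symmetric circulant matrix is that its inverse – in this case $C^{-1}$ – is also symmetric circulant. Because every row of a circulant matrix sums to the same value, this property guarantees that $C^{-1}\mathbf{1}$ is equal to $k\mathbf{1}$ for some constant k.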

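Which means we can re-write our minimum variance solution as:

$$
w = \frac{k\mathbf{1}}{\mathbf{1}^{T}k\mathbf{1}}
$$

Since the constant will cancel out, we are left with:

$$
w = \frac{\mathbf{1}}{\mathbf{1}^{T}\mathbf{1}} = \frac{1}{N}\mathbf{1}
$$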

Thus, our optimal solution is an equal-weight allocation to all N portfolios.
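The equal-weight result can be verified numerically. The sketch below builds a small symmetric circulant correlation matrix and applies the closed-form minimum variance solution; the correlation values are hypothetical, chosen only so that the matrix is positive definite, and NumPy is assumed to be available.

```python
import numpy as np


def min_variance_weights(cov):
    """Closed-form minimum variance weights: inv(cov) @ 1, scaled to sum to 1."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)
    return raw / raw.sum()


def symmetric_circulant(first_row):
    """Build a circulant matrix by rotating first_row one step right per row."""
    first_row = np.asarray(first_row, dtype=float)
    return np.array([np.roll(first_row, i) for i in range(len(first_row))])


# Hypothetical correlations for N = 4 offset portfolios.
C = symmetric_circulant([1.0, 0.5, 0.2, 0.5])
w = min_variance_weights(C)
```

Whatever values are used in the first row (so long as the matrix stays symmetric and invertible), the resulting weights come out to 1/N each.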

Highlighted in gold below, we can see the result of this approach using the same stock/cash example as before. Specifically, the gold portfolio uses each of the 21 variations as a different sub-portfolio.

Source: Kenneth French Data Library. Calculations by Newfound Research. Past performance is not an indicator of future results. Performance is backtested and hypothetical. Performance figures are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes. Performance assumes the reinvestment of all distributions.

While we have a solution for timing luck, a question that lingers is: “how much will timing luck affect my particular strategy?”

The Setup

We assume an active investment strategy with constant portfolio variance (S²), constant and continuous annualized turnover (T; e.g. 0.5 for 50% annual turnover), and consistent rebalances at discrete frequency (f; e.g. 1/12 for monthly).

We will also assume that the portfolio contains no static components. This allows us to interpret 100% turnover as meaning that the entire portfolio was turned over, rather than that 50% of the portfolio was turned over twice.

To quantify the magnitude of timing luck, we will calculate the variance of a dollar-neutral, long/short portfolio that is long a discrete implementation (i.e. rebalancing at a fixed interval) of this strategy (D) and short the theoretically optimal infinite overlapping portfolio implementation (M – for “meta”).

As before, the expected return of this long/short is zero, but its variance captures the return differences created by timing luck.

Differences between the Discrete and Continuous Portfolios

The long/short portfolio is defined as (D – M). However, we would expect the holdings of D to overlap with the holdings of M. How much they overlap will depend on both portfolio turnover and rebalance frequency.

Assume, for a moment, that M does not have infinite overlapping portfolios, but a finite number N, each uniformly spaced across the holding period.

If we assume 100% turnover over the holding period that is continuous, we would expect the first overlapping portfolio, implemented at t=1/N, to have a fraction (1 – 1/N) of its holdings identical to D (i.e. not “turned over”). On the other hand, the portfolio implemented at t = (N-1)/N will have just 1/N of its holdings identical to D.

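Thus, we can say that if M contains N discrete overlapping portfolios, and turnover over a single holding period is T·f rather than 100%, we can expect M and D to overlap by:

$$
\frac{1}{N}\sum_{i=0}^{N-1}\left(1 - \frac{i}{N}\,Tf\right)
$$

Which we can reduce,

$$
1 - \frac{N-1}{2N}\,Tf
$$

If we take the limit as N goes to infinity – i.e. we have infinite overlapping portfolios – then we are simply left with:

$$
1 - \frac{Tf}{2}
$$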

Thus, the overlap we expect between our discretely implemented portfolio, D, and the portfolio with infinite overlapping portfolios, M, is a simple function of the expected turnover during the holding period.
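A small numerical check of this limit is sketched below. It assumes, as our reading of the argument above, that the holdings shared with D decay linearly so that the sub-portfolio implemented a fraction i/N of the way through the holding period shares 1 – (i/N)·T·f of its holdings with D. The function names are hypothetical.

```python
def expected_overlap(n_portfolios, turnover, frequency):
    """Average overlap between D and N evenly spaced sub-portfolios of M,
    assuming holdings decay linearly at turnover * frequency per period."""
    tf = turnover * frequency
    total = sum(1.0 - tf * i / n_portfolios for i in range(n_portfolios))
    return total / n_portfolios


def limiting_overlap(turnover, frequency):
    """Limit as the number of overlapping portfolios goes to infinity."""
    return 1.0 - turnover * frequency / 2.0


# 100% annual turnover with annual rebalancing, for increasing N.
overlaps = {n: expected_overlap(n, 1.0, 1.0) for n in (4, 12, 252, 100_000)}
```

As N grows, the finite-N overlap converges on one minus half of the turnover experienced during the holding period.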

We can then define our long/short portfolio:

$$
D - M = D - \left[\left(1 - \frac{Tf}{2}\right)D + \frac{Tf}{2}\,Q\right]
$$

Where Q is the portfolio of holdings in M that are not in D.

We should pause here, for a moment, as this is where our assumption of “no static portfolio elements” becomes relevant. We defined (1) to be the amount M and D overlap. Technically, if we allow securities to be sold and then repurchased, (1) represents a lower limit to how much M and D overlap. As an absurd example, consider a portfolio that creates 100% turnover by buying and selling the same 1% of the portfolio 100 times. Thus, Q in (6) need not necessarily be unique from D; part of D could be contained in Q.

By assuming that no part of the portfolio is static, we are assuming that over the (very) long run, the average turnover experience over a holding period does not include repurchase of sold securities, and thus (1) is the amount of overlap and D and Q are independent holdings.

This assumption is likely fairer for traditionally active portfolios that focus on security selection, but potentially less realistic for tactical strategies that often sell and re-purchase the same exposure. More on this later.

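Defining,

$$
a = \frac{Tf}{2}
$$

We can re-write,

$$
D - M = a\,(D - Q)
$$

Solving for Timing Luck

We can then solve for the variance of the long/short portfolio,

$$
\mathrm{Var}(D - M) = a^{2}\,\mathrm{Var}(D - Q)
$$

Expanding:

$$
\mathrm{Var}(D - M) = a^{2}\,\mathrm{Var}(D) + a^{2}\,\mathrm{Var}(Q) - 2a^{2}\,\mathrm{Cov}(D, Q)
$$

As D and Q both represent viable allocation schemes for the portfolio, we will assume that they share the same long-term portfolio variance, S². This assumption may be fair, over the long run, for traditional stock-selection portfolios, but likely less fair for highly tactical portfolios that can meaningfully shift their portfolio risk exposures.

Thus,

$$
\mathrm{Var}(D - M) = 2a^{2}S^{2}\left(1 - \mathrm{Corr}(D, Q)\right)
$$

Replacing back our definition for a, we are left with:

$$
\mathrm{Var}(D - M) = \frac{T^{2}f^{2}S^{2}}{2}\left(1 - \mathrm{Corr}(D, Q)\right)
$$

Or, that the annualized volatility due to timing luck (L) is:

$$
L = \frac{TfS}{2}\sqrt{2\left(1 - \mathrm{Corr}(D, Q)\right)}
$$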

What is Corr(D,Q)?

The least easily interpreted – or calculated – term in our equation is the correlation between our discrete portfolio, D, and the non-overlapping securities found in the infinite overlapping portfolios implementation, Q.

The intuitive interpretation here is that when the securities held in our discrete portfolio are highly correlated to those that are not held but the optimal strategy recommends we hold, then we would expect the difference to have less impact. On the other hand, if those securities are negatively correlated, then the discrete rebalance choice could lead to significant additional volatility.

Estimating this value, however, may be difficult to do empirically.

One potential answer is to use the intra-portfolio correlation (“IPC”) of an equal-weight portfolio of representative assets or securities. The intuition here is that we expect each asset to experience, on average, an equivalent amount of turnover due to our assumption that there are no static positions in the portfolio.

Thus, taking the IPC of an equal-weight portfolio of representative securities allows us to express the view that while we do not know which securities will be different at any given point in time, we expect over the long-run that all securities will be “missing” with equal frequency and magnitude, and therefore the IPC is representative of the long-term correlation between D and Q.
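As a rough sketch, the IPC can be computed as the weighted average of the off-diagonal pairwise correlations. That definition is a common one but is an assumption of this example, as are the function name and the hypothetical correlation matrix; NumPy is assumed to be available.

```python
import numpy as np


def intra_portfolio_correlation(corr, weights=None):
    """Weighted average of the off-diagonal pairwise correlations.

    corr: square correlation matrix of representative securities.
    weights: portfolio weights; defaults to equal weight.
    """
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=float)
    pair_weights = np.outer(w, w)
    off_diag = ~np.eye(n, dtype=bool)
    return float((pair_weights * corr)[off_diag].sum() / pair_weights[off_diag].sum())


# Hypothetical three-asset correlation matrix.
example_corr = np.array([
    [1.0, 0.6, 0.2],
    [0.6, 1.0, 0.4],
    [0.2, 0.4, 1.0],
])
ipc = intra_portfolio_correlation(example_corr)
```

For an equal-weight portfolio this reduces to the simple average of the pairwise correlations, which is the value we would plug in for Corr(D,Q).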

Estimating Timing Luck in our Stock/Cash Tactical Strategy

The assumptions required for our estimate of timing luck may work well with traditional security selection portfolios (or, at least, quantitative implementations of factors like value, momentum, defensive etc.), but will it work with tactical portfolios?

Using our prior stock/cash example, let’s estimate the expected magnitude of timing luck. Using one of the discrete implementations, we estimate that turnover is 67% per year. Our rebalance frequency is monthly (1/12) and the intra-portfolio correlation between stocks and bonds is assumed to be 0%. Finally, the long-term volatility of the strategy is about 12.2%.

Using these figures, we estimate:

$$
L = \frac{0.67 \times \frac{1}{12} \times 12.2\%}{2}\sqrt{2\,(1 - 0)} \approx 0.48\%
$$

This is a somewhat disappointing result, as we had calculated earlier that the actual timing luck was 314bps. Our estimate is less than 1/6th of the actual figure!

Part of the problem may be that many of the assumptions we outlined are violated with our example tactical strategy. We think the bigger problem is that our estimates for these variables, when using a highly tactical strategy, are simply wrong.

In our equation, we assumed that turnover would be continuous. This is because we are using turnover as a proxy for the decay speed of our alpha signal.

What does this mean? As an example, value strategies rely on value signals that tend to decay slowly. When a stock is identified as being a value stock, it tends to stay that way for some time. Therefore, if you build a portfolio off of these signals, you would expect low turnover. Momentum signals, on the other hand, tend to decay more quickly. A stock that is labeled as high momentum this month may no longer be high momentum in three months’ time. Thus, momentum strategies tend to be high turnover.

This relationship does not necessarily hold for tactical strategies.

In our tactical example, we rebalance monthly because we believe the time-series momentum has a short forecast horizon. However, with only two assets, the strategy can go years without turnover. Worse, the same strategy might miss a signal because it is only sampling in a discrete manner and therefore understate true turnover in a continuous framework.

If we were to look at the turnover of a tactical strategy implemented with the same rules but rebalanced daily, we would see a turnover rate over 300%. This would increase our estimate up to 215bps. Still well below the realized 314bps, but certainly high enough to raise eyebrows about the impact of timing luck in tactical portfolios not implemented using overlapping portfolios.
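The two estimates can be reproduced with a short function. The sketch below assumes the derivation resolves to L = (T·f/2)·S·sqrt(2·(1 – Corr(D,Q))); that closed form is our reading of the model, and it lands close to the 215bps figure quoted above when turnover is set to 300%. The function name is hypothetical.

```python
from math import sqrt


def timing_luck(turnover, frequency, vol, corr):
    """Annualized timing luck volatility.

    L = (T * f / 2) * S * sqrt(2 * (1 - Corr(D, Q)))
    turnover: annualized turnover T (e.g. 0.67 for 67%).
    frequency: rebalance frequency f in years (e.g. 1/12 for monthly).
    vol: annualized strategy volatility S.
    corr: correlation between held and not-yet-held securities.
    """
    return (turnover * frequency / 2.0) * vol * sqrt(2.0 * (1.0 - corr))


monthly = 1.0 / 12.0
low_turnover = timing_luck(0.67, monthly, 0.122, 0.0)   # discrete-implementation turnover
high_turnover = timing_luck(3.00, monthly, 0.122, 0.0)  # daily-rebalanced turnover
```

Because the estimate is linear in turnover, understating turnover by a factor of four or five understates timing luck by the same factor.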

We should also remember that timing luck is determined by the difference in holdings between the discrete strategy and the meta strategy. We had assumed that the portfolios D and Q would have the same volatility, but in a strategy that shifts between stocks and bonds, this most certainly is not the case. This means that long-run volatility in such a tactical strategy can actually be misleadingly low.

Consider the situation when the tactical strategy goes to cash based upon a short-lived signal; i.e. the meta strategy will not build a significant cash position. The realized volatility of the strategy will dampen the perceived timing luck, when in reality the volatility difference between the two portfolios is quite large.

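In our specific tactical example, we know that when D is stocks, Q is bonds and vice versa. With this insight, we can re-write equation (10):

$$
\mathrm{Var}(D - M) = \left(\frac{Tf}{2}\right)^{2}\left[\mathrm{Var}(\text{stocks}) + \mathrm{Var}(\text{bonds}) - 2\,\mathrm{Cov}(\text{stocks}, \text{bonds})\right]
$$

Which we can simplify as:

$$
\mathrm{Var}(D - M) = \left(\frac{Tf}{2}\right)^{2}\mathrm{Var}(\text{stocks} - \text{bonds})
$$

Which is simply a constant times the variance of a portfolio that is 100% long stocks and -100% short bonds (or vice versa; the variance will be the same).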

If we use this equation and the variance of a long/short stock/bond portfolio and our prior estimate of 300% turnover, we get an estimate of timing luck volatility of 191bps.
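A sketch of this two-asset variant follows. The volatility and correlation inputs in the example are hypothetical placeholders, not the figures behind the 191bps estimate, and the scaling constant T·f/2 is our reading of the derivation above.

```python
from math import sqrt


def long_short_vol(vol_a, vol_b, corr):
    """Volatility of a portfolio 100% long asset A and 100% short asset B."""
    return sqrt(vol_a ** 2 + vol_b ** 2 - 2.0 * corr * vol_a * vol_b)


def tactical_timing_luck(turnover, frequency, vol_a, vol_b, corr):
    """Timing luck volatility: (T * f / 2) times the long/short volatility."""
    return (turnover * frequency / 2.0) * long_short_vol(vol_a, vol_b, corr)


# Hypothetical inputs: 15% stock vol, 2% bond vol, zero correlation,
# 300% turnover, monthly rebalancing.
example = tactical_timing_luck(3.0, 1.0 / 12.0, 0.15, 0.02, 0.0)
```

The appeal of this form is that it replaces the strategy's own (potentially dampened) volatility with the volatility of the spread between what is held and what could have been held.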

Note that using this concept, there may be a more generic solution that is possible using some measure of active variance (likely scaled by active share).

Conclusion

In this piece we have demonstrated the potentially massive impact of timing luck, addressed how to solve for it, and derived a model that can be used to estimate the magnitude of timing luck risk in strategies that do not employ an overlapping portfolios technique.

While our derived approach is not perfect – as we saw in its application with our tactical example – we believe it is an important step forward in being able to quantify the potential risk that timing luck creates.

[1] In reality, we probably wouldn’t hire a different manager to implement the same strategy with different rebalance timing even if we could find such managers. A more feasible solution would be for a single manager to run different sleeves implementing each rebalance iteration.
