We've written about timing luck many, many, many times before. But in each post, we've discussed strategies that make active decisions on some fixed time basis. Most often, these were strategies making monthly decisions about whether to be in or out of the market.
October 2014 highlights the problem with fixed-time sampling. For most tactical investors, choosing a month-end or month-beginning rebalance period worked in their favor, as they avoided any potential whipsaw from the intra-month 7% drawdown. But the disparity between intra-month volatility and close-to-close volatility highlighted how the choice of when to measure portfolio statistics could have a very large impact even on strategic asset allocations. From the 16th of September to the 16th of October, the S&P 500 tumbled 6.65%; from the 30th of September to the 31st of October, the market climbed 2.35%. Using monthly returns from the ETF SPY, the first of those returns represents a -1.79 standard deviation move and the second a 0.46 standard deviation move.
So we thought it would be worth exploring further whether timing luck – which may affect estimates of expected return, volatility, and correlation – can play a significant role in strategic portfolio construction.
To keep the test simple, we'll just use a portfolio of stocks (ETF: SPY) and bonds (ETF: AGG) and perform a standard mean-variance optimization to calculate efficient frontiers. To capture the effects of timing luck, we'll estimate a month to be 21 trading days and create 21 different "monthly" return streams, each one starting on a different day. From this "monthly" data, we'll calculate expected returns and covariances as input to our efficient frontier calculation (nothing complicated – just sample means and covariances).
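The offset sampling described above can be sketched roughly as follows. This is a minimal illustration, not the code behind the original test: the function names `offset_monthly_returns` and `sample_inputs` are hypothetical, and the price series is synthetic random data standing in for SPY and AGG closes.

```python
import numpy as np

def offset_monthly_returns(prices, period=21):
    """Build one "monthly" return stream per starting offset.

    prices: array of shape (days, assets) holding daily closing prices.
    Returns a list of `period` arrays, each of shape (months, assets).
    """
    prices = np.asarray(prices, dtype=float)
    streams = []
    for offset in range(period):
        # Take every 21st price, starting on a different day each time.
        sampled = prices[offset::period]
        streams.append(sampled[1:] / sampled[:-1] - 1.0)
    return streams

def sample_inputs(returns):
    """Plain sample mean vector and covariance matrix of a return stream."""
    mu = returns.mean(axis=0)
    sigma = np.cov(returns, rowvar=False)
    return mu, sigma

# Synthetic stand-in for three years of daily data (assumption: 756 days,
# two assets); real SPY/AGG prices would be loaded here instead.
rng = np.random.default_rng(0)
daily = rng.normal(0.0003, 0.01, size=(756, 2))
prices = 100.0 * np.cumprod(1.0 + daily, axis=0)

streams = offset_monthly_returns(prices)
inputs = [sample_inputs(r) for r in streams]
```

Each entry of `inputs` is one (expected return, covariance) pair, so feeding them one at a time into the same optimizer yields 21 frontiers that differ only in where the "month" was defined to start.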
Using the last three years of data, we find that changing how you define a month can have a dramatic effect on the efficient frontier we build.
But perhaps the disparity in the curves is just an artifact of the time period selected or even the number of samples in the period. Given that each curve is built from 36 samples, one or two outliers could cause significant divergence. So we re-ran the same test using the prior 6 years of data.
Interestingly, we still get a fair bit of dispersion, but it seems to be in the level of the curve and less in the slope and curvature.
Finally, we tried the prior 9 years.
We see a fairly tightly clustered bundle of curves except for one extreme outlier.
So what's the takeaway here? Well, 9 years of monthly data gives us 108 data points; perhaps enough to expect that, no matter how you sample, you'll capture just about every type of market dynamic, so that when you sample becomes less important. Except, of course, you just so happen to get that one weird outlier.
Many firms use far less than 9 years. In this sort of statistical portfolio construction, there is an underlying assumption that the future will look like the near past, and so less data, or exponentially-weighted data, is used to help overweight nearer-term market dynamics.
Clearly, how and when we choose to split up our time series into independent samples can meaningfully impact even strategic portfolio construction. So what can we do? Well, I am certainly not the first to draw this conclusion. In a certain sense, all we've really done is slightly perturb our inputs via resampling, which is the basis of the resampled efficient frontier. By averaging these samples together, we may get a more robust estimate of the "true" efficient frontier.
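One way to read "averaging these samples together" is to average the frontier portfolio weights, rank by rank, across the 21 offset samples. The sketch below does that for a long-only two-asset case. It is an assumption-laden illustration rather than the method used for the results above or the full resampled-frontier procedure, which typically draws simulated inputs instead of offset samples. The helper names are hypothetical and the prices are synthetic.

```python
import numpy as np

def min_variance_weight(sigma):
    """Long-only weight on asset 0 that minimizes two-asset variance."""
    denom = sigma[0, 0] + sigma[1, 1] - 2.0 * sigma[0, 1]
    w = (sigma[1, 1] - sigma[0, 1]) / denom
    return float(np.clip(w, 0.0, 1.0))

def frontier_weights(mu, sigma, n_points=11):
    """Weights along the efficient frontier, from the minimum-variance
    portfolio up to 100% in the higher-expected-return asset."""
    w_min = min_variance_weight(sigma)
    w_max = 1.0 if mu[0] >= mu[1] else 0.0
    w0 = np.linspace(w_min, w_max, n_points)
    return np.column_stack([w0, 1.0 - w0])

# Synthetic stand-in for daily stock and bond prices (assumption).
rng = np.random.default_rng(0)
daily = rng.normal([0.0004, 0.0001], [0.01, 0.003], size=(756, 2))
prices = 100.0 * np.cumprod(1.0 + daily, axis=0)

all_weights = []
for offset in range(21):
    # One "monthly" stream per starting day, as in the test above.
    sampled = prices[offset::21]
    rets = sampled[1:] / sampled[:-1] - 1.0
    mu, sigma = rets.mean(axis=0), np.cov(rets, rowvar=False)
    all_weights.append(frontier_weights(mu, sigma))

# Average the k-th frontier portfolio across all 21 offsets.
averaged = np.mean(all_weights, axis=0)
```

Averaging weights by rank, rather than averaging the curves themselves, keeps every averaged portfolio fully invested and long-only, since each is a convex combination of portfolios that already satisfy those constraints.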
October should stand as a wake-up call to anyone doing allocation of any form: when you make your decisions and how you split your data can have a dramatic effect on your results.