This post is available as a PDF download here.
Summary
- In this commentary we attempt to identify the sources of performance in long/short equity strategies.
- Using Kalman Filtering, we attempt to replicate the Credit Suisse Long/Short Liquid Index with a set of common factors designed to capture equity beta, regional, and style tilts.
- We find that as a category, long/short equity managers make significant changes to their equity beta and regional tilts over time.
- Year-to-date, we find that tilts towards foreign developed equities, emerging market equities, and the value premium have been the most significant detractors from index performance.
- We believe that the consistent relative out-performance of U.S. equities against international peers has removed an important alpha source for long/short equity managers when they are benchmarked against U.S. equities.
Please note that analysis performed in this commentary is only through 8/31/2018 despite a publishing date of 10/22/2018 due to data availability.
Introduction
Since 4/30/1994, the Credit Suisse Long/Short Equity Hedge Fund (“CS L/S EQHF”) Index has returned 9.0% annualized with an 8.8% annualized volatility and a maximum drawdown of just 22%. While the S&P 500 has bested it on an absolute return basis – returning 10.0% annualized – it has done so with considerably more risk, exhibiting 14.4% annualized volatility and a maximum drawdown of 51%. Capturing 90% of the long-term annualized return of the S&P 500 with only 60% of the volatility and less than half the maximum drawdown is an astounding feat, particularly because this is not the performance of a single star manager but the blended returns of dozens of managers.
Yet absolute performance in this category has languished as of late. While the S&P 500 has returned an astounding 13.5% annualized over the last five years, the CS L/S EQHF Index has only returned 5.6% annualized. Of course, returns are only part of the story, but this performance is in stark contrast to the relative performance experienced during the 2003-2007 bull market. From 12/31/2003 to 12/31/2007, the average rolling 1-year performance difference between the S&P 500 and the CS L/S EQHF Index was less than 1 basis point whereas the average rolling 1-year performance differential from 12/31/2010 to 12/31/2017 was 877 basis points. Year-to-date performance in 2018 has been no exception to this trend. The CS L/S EQHF Index is up just 2.1% compared to a positive 9.7% for the S&P 500, with several popular strategies faring far worse.
Now, before we dive any deeper, we want to address the obvious: comparing long/short equity returns against the S&P 500 is foolish. The long-term beta of the category is less than 0.5, so it should not come as a surprise that absolute returns have languished during a period where vanilla U.S. equity beta has been one of the best performing asset classes. Nevertheless, while the CS L/S EQHF typically exhibited higher risk-adjusted returns than equity beta from 1994 through 2011, the reverse has been true since 2012.
Identifying precisely why both absolute and relative risk-adjusted performance has declined over the last several years can be difficult, as the category as a whole is incredibly varied in nature. Consider this index definition from Credit Suisse:
The Credit Suisse Long/Short Equity Hedge Fund Index is a subset of the Credit Suisse Hedge Fund Index that measures the aggregate performance of long/short equity funds. Long/short equity funds typically invest in both long and short sides of equity markets, generally focusing on diversifying or hedging across particular sectors, regions or market capitalizations. Managers typically have the flexibility to shift from value to growth; small to medium to large capitalization stocks; and net long to net short. Managers can also trade equity futures and options as well as equity-related securities and debt or build portfolios that are more concentrated than traditional long-only equity funds.
The wide degree of flexibility means that we would expect significant dispersion in individual strategy performance. Examining a broad index may still be useful, however, as we may be able to decipher the large muscle movements that have driven common performance. In order to do so, we have to get under the hood and try to replicate the index using common factor exposures.
Figure 1: Credit Suisse Long/Short Equity Indices
Data from 12/1993-8/2018
Index | Annualized Return | Annualized Volatility | Sharpe Ratio
Credit Suisse Long/Short Hedge Fund Index | 8.6% | 8.9% | 0.68
Credit Suisse Long/Short Liquid Index | 7.7% | 9.4% | 0.60
Credit Suisse AllHedge Long/Short Equity Index | 3.6% | 8.0% | 0.29
Source: Kenneth French Data Library and Credit Suisse. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results.
Replicating Long/Short Equity Returns
To gain a better understanding of what is driving long/short equity returns, we attempt to construct a strategy that replicates the returns of the Credit Suisse Long/Short Liquid Index (“CS L/S LAB”). We have selected this index because return data is available on a daily basis, unlike many other long/short equity indexes which only provide monthly returns.
It is worth noting that this index is itself a replicating index, attempting to track the CS L/S EQHF Index using liquid instruments. In other words, we’re attempting a rather meta experiment: replicating a replicator. This may introduce unintended noise into our effort, but we feel that the benefit of daily index level data more than offsets this risk.
Based upon the category description above, we pre-construct several long/short indices that aim to isolate equity beta, regional tilts, and style tilt effects. To capture beta, we construct the following long/short index:
- Long S&P 500 / Short Cash: The excess returns offered by U.S. large-cap equities
To capture regional, size, and industry effects, we construct the following long/short indexes:
- Long Russell 2000 / Short S&P 500: Relative performance of small-cap equities versus large-cap equities
- Long MSCI EAFE / Short S&P 500: Relative performance of international developed equities versus U.S. equities
- Long MSCI EM / Short S&P 500: Relative performance of emerging market equities versus U.S. equities
- Long Nasdaq 100 / Short S&P 500: Relative performance of “concentrated” large-cap equities versus broad large-cap equities1
To capture certain style premia, we construct the following long/short indexes:
- Long Russell 1000 Value / Short Russell 1000 Growth: Relative performance of large-cap value versus large-cap growth.2
- Long High Momentum / Short Low Momentum: Relative performance of recent winners versus recent losers.
All long/short indexes are assumed to be dollar-neutral in construction and are rebalanced on a monthly basis.
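As a rough sketch of the construction above, the return of a dollar-neutral long/short index can be approximated as the return of the long leg minus the return of the short leg. The function name and the sample returns are hypothetical, and the sketch assumes the two legs are reset to equal dollar amounts every period, which is a simplification of the monthly rebalancing described here.

```python
def long_short_returns(long_leg, short_leg):
    """Per-period return of a portfolio that is +$1 long and -$1 short.

    Simplifying assumption: both legs are reset to equal dollar size
    every period, so the return is simply the difference of the legs.
    """
    if len(long_leg) != len(short_leg):
        raise ValueError("both legs must cover the same periods")
    return [l - s for l, s in zip(long_leg, short_leg)]


# Hypothetical returns for illustration only (not actual index data).
sp500 = [0.010, -0.020, 0.015]
cash = [0.001, 0.001, 0.001]
eafe = [0.005, -0.010, 0.020]

beta_factor = long_short_returns(sp500, cash)   # Long S&P 500 / Short Cash
eafe_factor = long_short_returns(eafe, sp500)   # Long MSCI EAFE / Short S&P 500
```

Building every regional and style factor against a common short leg keeps each series interpretable as a relative bet rather than as additional equity beta.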
A simple way of implementing index tracking is through a rolling-window regression. In such an approach, the returns of the CS L/S LAB Index are regressed against the returns of the long/short portfolios. The factor loadings would then reflect the weights of the replicating portfolio.
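A minimal sketch of such a rolling-window regression is shown here in plain Python. It is illustrative only: the helper names, the absence of an intercept, and the sample data are all assumptions, and a production implementation would more likely lean on a numerical library.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(xs, ys):
    """Least-squares factor loadings (no intercept) of ys on the rows of xs."""
    k = len(xs[0])
    xtx = [[sum(r[i] * r[j] for r in xs) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


def rolling_loadings(xs, ys, window):
    """One loading vector for each full window of `window` observations."""
    return [ols(xs[t - window:t], ys[t - window:t])
            for t in range(window, len(ys) + 1)]


# Hypothetical factor returns; the "index" is built with loadings 0.5 and -0.2.
xs = [[0.01, 0.02], [0.02, -0.01], [-0.01, 0.03],
      [0.03, 0.01], [0.00, -0.02], [-0.02, 0.01]]
ys = [0.5 * a - 0.2 * b for a, b in xs]
loadings = rolling_loadings(xs, ys, window=4)
```

Each vector in `loadings` plays the role of the replicating portfolio weights for the window that ends on that date.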
In practice, the problem with such an approach is that achieving statistical significance requires a number of observations far in excess of the number of factors. Were we to use monthly returns, for example, we might need to employ upwards of three years of data. Yet, as we know from the introductory description of the long/short equity category, these strategies are likely to change their exposures rapidly, even on an aggregate scale. One potential solution is to employ weekly or daily returns. Yet even when this data is available, we must still determine the appropriate rolling window length as well as consider how to handle statistically insignificant explanatory variables and perform model selection.
With this in mind, we elected to utilize an approach called Kalman Filtering. This algorithm is designed to produce estimates for a series of unknown variables based upon a series of inputs that may contain statistical noise or other inaccuracies. The benefit of this model is that we need not specify a lookback window: the model dynamically updates for each new observation based upon how well the model fits the data and how noisy the algorithm believes the data to be.
As it pertains to the problem at hand, we set up our unknown variables to be the weights of the replicating factors in our portfolio. We feed the algorithm the daily returns of these factors and set it to solve for the weights that will minimize the tracking distance to the daily returns of the CS L/S LAB Index. In Figure 2 we plot the cumulative returns of the CS L/S LAB Index and our Kalman Tracker portfolio. We can see that while the Kalman Tracker does not perfectly capture the magnitude of the moves exhibited by the CS L/S LAB Index, it does generally capture the shape and significant transitions within the index. While not a perfect replica, this may be a “good enough” approximation for us to glean some information from the underlying exposures.
Figure 2: Credit Suisse Long/Short Liquid Index and Hypothetical Kalman Tracker
Source: Kenneth French Data Library, Credit Suisse, and CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results. Index returns are total returns and are gross of all fees except for underlying ETF expense ratios of ETFs utilized by the Kalman Tracker. The Kalman Tracker does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purpose of this commentary.
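To make the mechanics of the tracker more concrete, the sketch below shows one common way to set up a Kalman filter for time-varying regression weights. The model choices are assumptions and are not taken from the commentary: the weights are assumed to follow a random walk, and the noise parameters `q` and `r` are fixed, hypothetical inputs rather than quantities estimated from the data. The sample data is stylized and exists only to show the filter adapting to a change in exposure.

```python
def kalman_weights(xs, ys, q, r, p0=1.0):
    """Track time-varying weights w_t in y_t = x_t . w_t + noise.

    Assumed model: weights follow a random walk with variance q per step,
    observation noise has variance r, and the initial weights are zero
    with variance p0.
    """
    k = len(xs[0])
    w = [0.0] * k
    p = [[p0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    path = []
    for x, y in zip(xs, ys):
        # Predict: the weight estimate is unchanged, its uncertainty grows.
        for i in range(k):
            p[i][i] += q
        # Update: move the weights toward explaining the latest index return.
        px = [sum(p[i][j] * x[j] for j in range(k)) for i in range(k)]
        s = sum(x[i] * px[i] for i in range(k)) + r    # innovation variance
        gain = [v / s for v in px]
        miss = y - sum(x[i] * w[i] for i in range(k))  # one-step tracking error
        w = [w[i] + gain[i] * miss for i in range(k)]
        p = [[p[i][j] - gain[i] * px[j] for j in range(k)] for i in range(k)]
        path.append(w[:])
    return path


# Stylized, hypothetical data: the true weight on the first factor drops
# from 0.7 to 0.3 halfway through; the second factor stays at 0.1.
xs = [[0.01, 0.01] if t % 2 == 0 else [0.01, -0.01] for t in range(200)]
true_w = [[0.7, 0.1] if t < 100 else [0.3, 0.1] for t in range(200)]
ys = [x[0] * w[0] + x[1] * w[1] for x, w in zip(xs, true_w)]

path = kalman_weights(xs, ys, q=1e-5, r=1e-8)
print(path[99], path[199])
```

The ratio of `q` to `r` controls how quickly the estimated weights respond to new observations: a larger `q` lets the weights move faster, while a larger `r` treats more of each day's tracking error as noise.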
The Time-Varying Exposures of Long/Short Equity
In Figure 3 we plot the underlying factor weights of our replicating strategy over time, specifically magnifying year-to-date exposures.
Figure 3: Underlying Exposure Weights for Kalman Tracker
We can see several effects:
- Factor exposures do indeed exhibit significant time-varying behavior. For example, prior to 2008 there was a large tilt towards foreign-developed equities, whereas post-2008 exposure remained largely U.S. focused.
- Beta exposure is time-varying. While there is latent beta exposure in the long/short factors, we can approximate overall beta exposure by simply isolating S&P 500 exposure. In March 2008, exposure peaked at 72% and then was cut quickly throughout the year. By January 2009, the index was net short. Post-crisis, exposure was rebuilt back to nearly 70% by September 2011, but has been declining since. Exposure currently sits at 28%. Has all this equity timing been valuable? In Figure 4 we plot the cumulative return of the index’s long-term average beta exposure and the cumulative return from beta timing. We can see that beta timing has, over the long run, been neither a significant contributor nor detractor from performance. Yet crisis-period returns suggest that long/short equity strategies may employ convex trading strategies (e.g. trend-following or constant proportion portfolio insurance).
- Size, value, and momentum tilts are not particularly significant in magnitude, with the exception of value during the 2008 crisis. Interestingly, exposure to value was negative during that time period, implying that the index was long growth and short value. Concentrated large-cap exposure has been a rather consistent bet in the post-2008 era, reflecting a tilt towards growth.
- Regional bets have been largely absent post-2008, at least with respect to their pre-2008 magnitude. We think it is important to pause and acknowledge the impact that benchmarking can have on perceived value add. Consider Figure 5 where we plot the cumulative returns of regional tilts towards international developed and emerging markets. We can see that prior to 2008, a tilt away from U.S. equities was successful in both cases, and after 2011 both were a losing bet. In the post-2011 environment, if a manager successfully makes the call to tilt towards U.S. equities, an entirely U.S. equity benchmark will effectively nullify the impact since the bet is already fully encapsulated in the benchmark! In other words, by choice of benchmark we have eliminated a source of value-add for the manager. Had we elected a global equity benchmark instead, the manager’s flexibility could potentially create value in both environments.
Figure 4: Cumulative Returns of Kalman Tracker’s Long-Term Average S&P 500 Exposure and Time-Varying Exposure
Source: Kenneth French Data Library, Credit Suisse, and CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results.
Figure 5: Cumulative Returns of Regional Tilts
Source: CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results.
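The split between average beta exposure and beta timing discussed above can be sketched as follows. This is an illustrative decomposition under stated assumptions: the function name and sample numbers are hypothetical, the average is taken over the full sample, and each weight is assumed to be the exposure held going into the period in which the return is earned.

```python
def split_beta_contribution(weights, factor_returns):
    """Split beta-driven returns into a static piece and a timing piece.

    static: what the long-term average exposure would have earned.
    timing: what deviations from that average exposure added or subtracted.
    """
    avg = sum(weights) / len(weights)
    static = [avg * r for r in factor_returns]
    timing = [(w - avg) * r for w, r in zip(weights, factor_returns)]
    return avg, static, timing


# Hypothetical exposures and S&P 500 excess returns, for illustration only.
weights = [0.6, 0.4, 0.2]
excess_returns = [0.01, -0.02, 0.01]
avg, static, timing = split_beta_contribution(weights, excess_returns)
```

By construction the two pieces sum to the total beta contribution in every period, so any cumulative gap between them reflects only the timing of the exposure changes.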
What has driven performance in 2018? We see three primary components.
- Entering the year, the index carried a nearly 40% allocation to equity beta. While exposure declined to about 33% by the end of January, it was rapidly cut down to just 20% after the first week of February. By mid-March this position was rebuilt to approximately 30%. We estimate that average beta exposure has been a 3.4% contributor to year-to-date returns, while market timing has been a -0.3% detractor.
- After Q1, there was an increase in exposure to MSCI EAFE, MSCI EM, and value tilts. We estimate that these tilts have been -1.9%, -2.3%, and -1.1% detractors from performance, respectively. It is possible that these tilts all reflect the same underlying bet towards global value. Or it may be the case that the global tilts reflect a bet on a weakening dollar. We should also remember that these figures are all statistically derived, so an equally valid possibility is that they are entirely wrong in the first place. It is worth noting that the value tilt – which is expressed as long Russell 1000 Value and short Russell 1000 Growth – does neutralize some of the sector tilts expressed in the concentrated large-cap position discussed in the next bullet. The true net effect may not actually be a tilt towards value within the index, but rather just a reduction in the tilt towards growth.
- The largest positive contributor to returns year-to-date has been the concentrated large-cap tilt. Implemented as long Nasdaq 100 / short S&P 500, this tilt largely expresses a bet on information technology, telecommunication services, and consumer discretionary sectors. Specifically, year-to-date it represents a significant overweight towards individual names like Apple, Amazon, Microsoft, Google, and Facebook.
Conclusion
Has long/short equity lost its mojo?
By replicating index performance using liquid factors, we can extract the common drivers of performance. What we found was that pre-2008 performance was largely driven by equity beta and a significant tilt towards foreign developed equities.
After 2011, regional tilts were losing bets. Fortunately, we can see that such tilts were significantly reduced – if not outright removed – from the index composition. Nevertheless, if we benchmark to a U.S. equity index (even if properly risk-adjusted), the accuracy of this trade will be entirely discounted because it is fully embedded in the index itself. In other words, by benchmarking against U.S. equities, the best a manager can do during a period when U.S. equities outperform is keep up with the index. Consider that year-to-date the MSCI ACWI has returned just 3.5%: much closer to the 2.1% of the CS L/S EQHF Index quoted in the introduction.
We can also see a significant tilt towards concentrated U.S. equities in the post-crisis era. This trade captured the relative performance of sectors like technology, telecommunication services, and consumer discretionary and from 12/31/2009 to 8/31/2018 returned 4.5% annualized.
Taken together, it is hard to argue that aggregate timing skill is not being displayed in the long/short equity category. We simply have to use the right measuring stick and not expect the timing to work over every shorter-term period.
Of course, this analysis should all be taken with a grain of salt. Our replicating index is by no means a perfect fit (though it is a very good fit from 2012 onward) and it is entirely possible that we selected the wrong set of explanatory features. Furthermore, we have only analyzed one index. The performance of the Credit Suisse Long/Short Liquid Index is not identical to that of the HFRI Equity Hedge Index, the Wilshire BRI Long/Short Equity Index, or the Morningstar Global Long/Short Index. Analysis using those indices may very well lead to different conclusions. Finally, the mathematics of this exercise does not make the factor tea-leaves any easier to decipher: we are ultimately attempting to create a narrative where one need not apply.
It is worth acknowledging that our analysis is categorical about an asset class where investors have little ability to make an indexed investment. Rather, allocation to long/short equity is still dominated by individual manager selection. This means that investor mileage will vary considerably and that our analysis herein may not apply to any specific manager. After all, we are attempting to analyze aggregate results and it is impossible to unscramble eggs.
Yet it does raise the question: if the aggregate category has such attractive features and can be tracked well with liquid factors, why have trackers not taken off as a popular – and much lower cost – solution for investors looking to index their long/short equity exposure? Another potential solution may be for investors to unbundle and rebuild. For example, we find that the beta exposure of $1 invested in the long/short category can be captured efficiently by $0.50 of trend equity exposure, freeing up $0.50 for other high-conviction alpha strategies.
Diversifying core equity exposure is a goal of many investors. Long/short equity provides one way to do this. In addition to potentially highlighting some of the performance drivers for long/short equity, this replication exercise shows that there may be other, more transparent, ways to achieve this goal.
When Simplicity Met Fragility
By Corey Hoffstein
On October 29, 2018
In Craftsmanship, Portfolio Construction, Risk Management, Weekly Commentary
Summary
Introduction
In the world of finance, simple can be surprisingly robust. DeMiguel, Garlappi, and Uppal (2005)1, for example, demonstrate that a naïve, equal-weight portfolio frequently delivers higher Sharpe ratios, higher certainty-equivalent returns, and lower turnover out-of-sample than competitive “optimal” allocation policies. In one of our favorite papers, Haldane (2012)2 demonstrates that simplified heuristics often outperform more complicated algorithms in a variety of fields.
Yet taken to an extreme, we believe that simplicity can have the opposite effect, introducing extreme fragility into an investment strategy.
As an absurd example, consider a highly simplified portfolio that is 100% allocated to U.S. equities. Introducing bonds into the portfolio may not seem like a large mental leap but consider that this small change introduces an axis of decision making that brings with it a number of considerations. The proportion we allocate between stocks and bonds requires, at the very least, estimates of an investor’s objectives, risk tolerances, market outlook, and confidence levels in these considerations.3
Despite this added complexity, few investors would consider an all-equity portfolio to be more “robust” by almost any reasonable definition of robustness.
Yet this is precisely the type of behavior we see all too often in tactical portfolios – and particularly in trend equity strategies – where investors follow a single signal to make dramatic allocation decisions.
So Close and Yet So Far
To demonstrate the potential fragility of simplicity, we will examine several trend-following signals applied to broad U.S. equities:
Below we plot over time the distance each of these signals is from turning off. Whenever the line crosses over the 0% threshold, it means the signal has flipped direction, with negative values indicating a sell and positive values indicating a buy.
In orange we highlight those periods where the signal is within 1% of changing direction. We can see that for each signal there are numerous occasions where the signal was within this threshold but avoided flipping direction. Similarly, we can see a number of scenarios where the signal just breaks the 0% threshold only to revert back shortly thereafter. In the former case, the signal has often just managed to avoid whipsaw, while in the latter it has usually become unfortunately subject to it.
Source: Kenneth French Data Library. Calculations by Newfound Research.
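One plausible way to measure how far a signal is from flipping is sketched below for a price-versus-moving-average rule. This is an assumption for illustration: the exact signal definitions and distance measure used for the plots are not reproduced here, and the function names, window length, and prices are hypothetical.

```python
def sma_distance(prices, window):
    """Percent distance of price from its simple moving average.

    Positive values mean price is above the average (signal on);
    a sign change means the signal has flipped.
    """
    out = []
    for t in range(window - 1, len(prices)):
        avg = sum(prices[t - window + 1:t + 1]) / window
        out.append(prices[t] / avg - 1.0)
    return out


def near_flip(distances, band=0.01):
    """Flag observations where the signal is within `band` of changing direction."""
    return [abs(d) < band for d in distances]


# Hypothetical prices, for illustration only.
prices = [100.0, 102.0, 104.0, 103.0]
distances = sma_distance(prices, window=3)
flags = near_flip(distances)
```

Counting how often `flags` is true without a subsequent sign change gives a rough tally of the "close calls" discussed above.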
Is the avoidance of whipsaw representative of the “skill” of the signals while the realization of whipsaw is just bad luck? Or might it be that the avoidance of whipsaw is often just as much luck as the realization of whipsaw is poor skill? How can we determine what is skill and what is luck when there are so many “close calls” and “just hits”?
What is potentially confusing for investors new to this space is that academic literature and practitioner evidence finds that these highly simplified approaches are surprisingly robust across a variety of investment vehicles, geographies, and time periods. What we must stress, however, is that evidence of general robustness is not evidence of specific robustness; i.e. there is little evidence suggesting that a single approach applied to a single instrument over a specific time horizon will be particularly robust.
What Randomness Tells Us About Fragility
To emphasize the potential fragility of utilizing a single in-or-out signal to drive our allocation decisions, we run a simple test:
The design of this test aims to deduce how fragile a strategy is via the introduction of randomness. By measuring 12-month rolling relative returns versus the modified benchmarks, we can compare the 1,000 slightly alternate histories to one another in an effort to determine the overall stability of the strategy itself.
Now bear with us, because while the next graph is a bit difficult to read, it succinctly captures the thrust of our entire thesis. At each point in time, we first calculate the average 12-month relative return of all 1,000 strategies. This average provides a baseline of expected relative strategy performance.
Next, we calculate the maximum and minimum 12-month relative performance and subtract the average. This spread – which is plotted in the graph below – aims to capture the potential return differential around the expected strategy performance due to randomness. Or, put another way, the spread captures the potential impact of luck in strategy results due only to slight changes in market returns.
Source: Kenneth French Data Library. Calculations by Newfound Research.
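The spread calculation described above can be sketched in a few lines. The sketch covers only the final step, taking the simulated 12-month relative returns as given; the function name and the tiny three-path sample are hypothetical stand-ins for the 1,000 alternate histories.

```python
def luck_spread(relative_returns):
    """Max and min relative return minus the cross-simulation average.

    relative_returns[s][t] is the 12-month relative return of simulated
    strategy s at time t. Returns one (upper, lower) pair per time t.
    """
    n = len(relative_returns)
    spreads = []
    for t in range(len(relative_returns[0])):
        column = [path[t] for path in relative_returns]
        avg = sum(column) / n
        spreads.append((max(column) - avg, min(column) - avg))
    return spreads


# Three hypothetical simulated paths observed at two dates.
sims = [[0.02, -0.01],
        [0.05, 0.00],
        [-0.01, 0.04]]
spreads = luck_spread(sims)
```

A wide gap between the upper and lower values at a given date indicates that small perturbations alone were enough to produce very different realized results.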
We can see that the spread frequently exceeds 5% and sometimes even exceeds 10%. Thus, a tiny bit of injected randomness has a massive effect upon our realized results. Using a single signal to drive our allocation appears particularly fragile and success or failure over the short run can largely be dictated by the direction the random winds blow.
A backtest based upon a single signal may look particularly good, but this evidence suggests we should dampen our confidence as the strategy may actually have just been the accidental beneficiary of good fortune. In this situation, it is nearly impossible to identify skill from luck when in a slightly alternate universe we may have had substantially different results. After all, good luck in the past can easily turn into misfortune in the future.
Now let us perform the same exercise again using the same random sequences we generated. But rather than using a single signal to drive our allocation we will blend the three trend-following approaches above to determine the proportional amount of equities the portfolio should hold.5 We plot the results below using the same scale in the y-axis as the prior plot.
Source: Kenneth French Data Library. Calculations by Newfound Research.
We can see that our more complicated approach actually exhibits a significant reduction in the effects of randomness, with outlier events significantly decreased and far more symmetry in both positive and negative impacts.
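A minimal sketch of an equal-weight blend is shown here, assuming each signal is a simple in-or-out flag and the equity allocation is the fraction of signals that are currently on. The function name and the example flags are hypothetical.

```python
def blended_allocation(signals):
    """Equity allocation as the fraction of trend signals that are 'on'."""
    if not signals:
        raise ValueError("at least one signal is required")
    return sum(1.0 if s else 0.0 for s in signals) / len(signals)


# Hypothetical reading: two of three trend signals are positive.
allocation = blended_allocation([True, True, False])
```

With three signals, a single flip moves the allocation by one third rather than from fully invested to fully out, which is one reason a near-miss on any one signal has a smaller effect on results.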
Below we plot the actual spreads themselves. We can see that the spread from the combined signal approach is lower than the single signal approach on a fairly consistent basis. In the cases where the spread is larger, it is usually because the sensitivity is arising from either the 10-month SMA or 13-minus-34-week EWMA signals. Were spreads for single signal strategies based upon those approaches plotted, they would likely be larger during those time periods.
Source: Kenneth French Data Library. Calculations by Newfound Research.
Conclusion
So, where is the balance? How can we tell when simplicity creates robustness and when simplicity introduces fragility? As we discussed in our article A Case Against Overweighting International Equity, we believe the answer is diversification versus estimation risk.
In our case above, each trend signal is just a model: an estimate of what the underlying trend is. As with all models, it is imprecise and our confidence level in any individual signal at any point in time being correct may actually be fairly low. We can wrap this all together by simply saying that each signal is actually shrouded in a distribution of estimation risk. But by combining multiple trend signals, we exploit the benefits of diversification in an effort to reduce our overall estimation risk.
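As a stylized illustration of this diversification effect, suppose (purely as an assumption) that each of the N signals carries an estimation error with the same variance \sigma^2 and a common pairwise correlation \rho. The variance of the error in an equal-weight average of the signals is then:

```latex
\operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N}\varepsilon_i\right)
  = \sigma^2\left(\frac{1}{N} + \frac{N-1}{N}\,\rho\right)
```

With N = 3 and \rho = 0.5, for example, the blended error variance is two-thirds of a single signal's variance. The benefit shrinks as the signals become more correlated and disappears entirely when \rho = 1.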
Thus, while we may consider a multi-model approach less transparent and more complicated, that added layer of complication serves to increase internal diversification and reduce estimation risk.
It should not go overlooked that the manner in which the signals were blended represents a model with its own estimation risk. Our choice to simply equal-weight the signals indicates a zero-confidence position in views about relative model accuracy and relative marginal diversification benefits among the models. Had we chosen a more complicated method of combining signals, it is entirely possible that the realized estimation risk could overwhelm the diversification gain we aimed to benefit from in the first place. Or, conversely, that very same added estimation risk could be entirely justified if we could continue to meaningfully improve diversification benefits.
If we return to our original example of a 100% equity portfolio versus a blended stock-bond mix, the diversification versus estimation risk trade-off becomes obvious. Introducing bonds into our portfolio creates such a significant diversification gain that the estimation risk is often an insignificant consideration. The same might not be true, however, in a tactical equity portfolio.
Research and empirical evidence suggest that simplicity is surprisingly robust. But we should be skeptical of simplicity for the sake of simplicity when it forgoes low-hanging diversification opportunities, lest we make our portfolios and strategies unintentionally fragile.