The benefits of diversification are often touted, but many investors feel disappointed in diversified portfolios because of the dispersion in performance of the individual holdings.
In the context of three different unconstrained sleeves, we look at a way to measure and visualize the benefit (or detriment) of diversification based on achieving different objectives.
Through this lens, we get a picture of how good or bad the results might have been, which can lead to confidence either in the robustness of the allocation or in the need to take a different approach.
Since we only experience one path of history, it is difficult to assess the benefit of diversification unless we consider what could have happened.
We believe that taking a systematic approach does not fully remove the art of the analysis but can remove some of the behavioral biases that make sticking with a portfolio difficult in the first place.
Introduction
Diversification is a standard risk management tool in any portfolio. Reducing the impact of idiosyncratic risks in individual investments by holding a suite of stocks, asset classes, strategies, etc. produces a smoother investment ride most of the time and reduces the risk of negative surprises.
But in a world where we only experience one outcome out of the multitude of possibilities, gauging the benefit of diversification is difficult. It is hard to do even in hindsight, not so much because we can’t but because we won’t: the results have already happened.
Over a single time period with no rebalancing, a diversified portfolio will underperform the best asset that it holds. This is a mathematical fact when there is any dispersion in the returns of the assets and it is why we have said that diversification will always disappoint. Our natural behavioral tendencies can often get the better of us, despite the fact that diversification might be doing a great job, especially when examined through the appropriate lens and measured in the context of what could have happened.
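The "mathematical fact" above can be written out in one line. This is a minimal sketch that assumes long-only weights that sum to one and are held fixed over the period; the symbols $w_i$, $R_i$, and $R_p$ are our own notation and do not come from the original presentation.

```latex
R_p \;=\; \sum_{i=1}^{n} w_i R_i \;\le\; \Big(\sum_{i=1}^{n} w_i\Big) \max_{j} R_j \;=\; \max_{j} R_j,
\qquad w_i \ge 0,\quad \sum_{i=1}^{n} w_i = 1 .
```

The inequality is strict whenever some asset with $w_i > 0$ returns less than the best asset, which is exactly the case of dispersion among the holdings.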
Last summer, we published a presentation entitled Building an Unconstrained Sleeve. In it, we looked at ways to combine traditional and non-traditional assets and strategies to target specific objectives: equity hedging, absolute return, and equity-like with downside management.
Now that we have 15 months of subsequent data for all the underlying strategies, we want to revisit that piece and explore the benefit of diversification in the context of hindsight.
A Recap of the Process
As a quick refresher, we included seven strategies and asset classes in the construction of our unconstrained sleeves:
Long/flat trend-following equities
Minimum volatility equities
Macro trend-following (managed futures)
Macro risk parity
Macro value
Macro income
Intermediate U.S. Treasuries
While these strategies are surely not exhaustive, they cover a range of factors (value, momentum, low volatility, etc.) and a global set of asset classes (equities, bonds, commodities, and currencies) commonly included in unconstrained sleeves. They were also selected because many of these strategies are conveniently packaged as ETFs or mutual funds, making the resulting sleeves more easily implementable.
Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results. Index returns are total returns and are gross of all fees.
Over the 15 months, world equity was by far the best performer and the spread between best-performing and worst-performing positions exceeded 20 percentage points. If you wanted high returns – and going back to our statement about how diversification will always disappoint – you could have just held world equities and been quite content.
But putting ourselves back in June 2017, we did not know a priori that simply holding equities would have generated the highest returns. Looking at this type of chart in November 2008 would have led to a very different emotional conclusion.
The aim of our original study was to develop unconstrained sleeves that would meet their objectives regardless of how the future played out. Therefore, we employed a simulation-based method that aimed to preserve some of the unique correlation structure between the strategies across different market environments and reduce the risk of overfitting to a single realization of history. With this approach, we constructed portfolios that targeted three different objectives that investors might be interested in:
Equity hedge – designed to offset significant equity losses.
Absolute return – designed to create a stable and consistent return stream in all environments.
Equity-like – designed to capture significant equity upside with reduced downside.
(Note: Greater detail about portfolio construction process, strategy descriptions, and performance attributes of each strategy can be found in our original presentation.)
But were our constructed portfolios successful in achieving their objectives out-of-sample? To analyze this question, as well as explore the benefits and detriments of diversification for each objective, we will calculate the distribution of what could have happened. The hope is that each sleeve would perform well relative to all other possible portfolios that could have been chosen for it.
Saying exactly what portfolios we could have chosen is where a little art comes into play. For example, in the equity-like strategies, it is difficult to say that a 100% bond portfolio would have ever been a viable option and therefore may not be an apt out-of-sample comparison.
However, since our original process did not have any specific override for these intuitive constraints, and since we do not wish to assert after-the-fact which portfolios would have been rejected, we will allow the entire potential allocation space to be fair game in our comparison.
There are a number of ways to sample the set of allocations over the 7 asset classes that could have formed the portfolios for each sleeve. Perhaps the most obvious choice would be to sample uniformly over the possible allocations. The issue to balance in this case is coverage of the space (a 6-dimensional simplex) with the number of samples. To be 95% confident that we sampled an allocation above 95% for only a single asset class would require nearly 200 million samples. We have used modified Sobol sequences in the past to ensure coverage of more of the space with fewer points. However, in the current case, to mimic the rounding that is often found in portfolio allocations, we will use a lattice of points spaced 2.5% apart covering the entire space. This requires just under 10 million points in the simulations.
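The two counts quoted above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, it assumes long-only allocations that sum to 100%, and it reads "a single asset class" as one pre-specified asset, for which the weight under uniform sampling on the simplex exceeds a threshold t with probability (1 - t)^(n - 1).

```python
import math


def lattice_point_count(n_assets: int, step: float) -> int:
    """Number of long-only allocations on a grid of `step` that sum to 100%."""
    units = round(1.0 / step)  # e.g. 40 increments of 2.5%
    # Stars and bars: ways to split `units` increments among `n_assets` buckets.
    return math.comb(units + n_assets - 1, n_assets - 1)


def lattice_weights(n_assets: int, units: int):
    """Yield each allocation as a tuple of integer multiples of the step."""
    if n_assets == 1:
        yield (units,)
        return
    for first in range(units + 1):
        for rest in lattice_weights(n_assets - 1, units - first):
            yield (first,) + rest


def uniform_samples_needed(n_assets: int, threshold: float, confidence: float) -> int:
    """Uniform draws needed to see one chosen asset above `threshold` at least once."""
    # Under a uniform distribution on the simplex, P(w_i > t) = (1 - t)^(n - 1).
    p_hit = (1.0 - threshold) ** (n_assets - 1)
    # Solve 1 - (1 - p_hit)^n >= confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p_hit))


if __name__ == "__main__":
    print(lattice_point_count(7, 0.025))
    print(uniform_samples_needed(7, 0.95, 0.95))
```

With 7 assets and a 2.5% step, the lattice count works out to 9,366,819 portfolios, consistent with "just under 10 million," and the uniform-sampling requirement comes out at roughly 190 million draws.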
Equity Hedge
This sleeve was designed to offset significant equity losses by limiting downside capture. The resulting optimized portfolio was relatively concentrated in two main positions that historically have exhibited low-to-negative correlations to equities and exhibited potential crisis alpha during significant and prolonged drawdowns.
Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research.
The down capture of this portfolio during the out-of-sample period was 0.44. This result falls in the 70th percentile (that is, better than 70% of the other sample portfolios, where a lower down capture is better) when compared to the 10 million other portfolios we could have originally selected. Not surprisingly, the 100% intermediate-term Treasury portfolio had the best down capture (-0.05) over the out-of-sample period. Of the portfolios with better down capture, Intermediate Treasuries and Macro – Income were generally the highest allocations.
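For readers who want to reproduce this kind of ranking, here is a minimal sketch. It assumes one common definition of down capture (the compounded portfolio return divided by the compounded benchmark return over the periods in which the benchmark fell); the exact definition used in the original study is not stated here, and the function names are hypothetical.

```python
import math


def down_capture(portfolio_returns, benchmark_returns):
    """Compounded portfolio return over benchmark down periods, per unit of benchmark loss."""
    pairs = [(p, b) for p, b in zip(portfolio_returns, benchmark_returns) if b < 0]
    if not pairs:
        raise ValueError("benchmark has no down periods")
    port_growth = math.prod(1.0 + p for p, _ in pairs)
    bench_growth = math.prod(1.0 + b for _, b in pairs)
    return (port_growth - 1.0) / (bench_growth - 1.0)


def percentile_better(value, candidates, lower_is_better=True):
    """Percentage of candidate portfolios that `value` beats."""
    if lower_is_better:
        beaten = sum(1 for c in candidates if value < c)
    else:
        beaten = sum(1 for c in candidates if value > c)
    return 100.0 * beaten / len(candidates)


if __name__ == "__main__":
    # Toy return series for illustration only; not data from the study.
    port = [0.01, -0.02, 0.03, -0.01]
    bench = [0.02, -0.04, 0.01, -0.02]
    print(down_capture(port, bench))
    print(percentile_better(0.44, [0.10, 0.50, 0.60, 0.90]))
```

In the full analysis, `candidates` would hold the down capture of every lattice portfolio, so the percentile says how many of the portfolios we could have chosen did worse.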
This does not come as much of a surprise to anyone who has followed the managed futures space for the last 15 months. The category largely remains in a multi-year drawdown (having peaked in early 2014), and it has also done little to offset the rapid sell-offs seen in equities in 2018. Therefore, with the full benefit of hindsight, any allocation to Macro – Trend in the original portfolio would have been a detriment to realizing our out-of-sample objective.
Yet even with this lackluster performance, an out-of-sample realized 70th percentile result over a short, 15-month horizon is a result to be pleased with.
Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results. Index returns are total returns and are gross of all fees.
Absolute Return
This sleeve was designed to seek a stable and consistent return stream in all market environments. We aimed to accomplish this by utilizing a risk parity approach. As expected, this sleeve holds all asset classes and is very well diversified across them.
Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research.
To measure the success of the risk parity over the live period, we will look at the Gini coefficient for each of the ten million potential portfolios we could have initially selected. The Gini coefficient quantifies the equality of the distribution, with a value of 1 representing 100% concentration and 0 representing perfect equality.
The Gini coefficient of the actual portfolio was 0.25, which was in the 99.8th percentile of possible outcomes (i.e. highly diversified on a relative basis). Here, the percentile estimate is padded by the fact that many of the simulated portfolios (e.g. the 100% ones) would clearly not be close to equal risk contribution.
Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results. Index returns are total returns and are gross of all fees.
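As a rough illustration of the measure, the sketch below computes each holding's share of portfolio variance and then the Gini coefficient of those shares. Two things are assumptions on our part: that the Gini coefficient is applied to risk contributions (suggested by the risk parity objective), and that it is rescaled by n / (n - 1) so that full concentration in one holding gives exactly 1. The function names and the toy covariance matrix are hypothetical.

```python
def risk_contributions(weights, cov):
    """Each holding's share of total portfolio variance (shares sum to 1)."""
    n = len(weights)
    marginal = [sum(cov[i][j] * weights[j] for j in range(n)) for i in range(n)]
    variance = sum(weights[i] * marginal[i] for i in range(n))
    return [weights[i] * marginal[i] / variance for i in range(n)]


def gini(values, normalize=True):
    """Gini coefficient of non-negative values; 0 means perfectly equal."""
    x = sorted(values)
    n = len(x)
    total = sum(x)
    g = 2.0 * sum((i + 1) * v for i, v in enumerate(x)) / (n * total) - (n + 1.0) / n
    # Assumption: rescale so that a single non-zero value maps to exactly 1.
    return g * n / (n - 1) if normalize else g


if __name__ == "__main__":
    # Toy two-asset example with uncorrelated assets; not data from the study.
    rc = risk_contributions([0.5, 0.5], [[0.04, 0.0], [0.0, 0.01]])
    print(rc, gini(rc))
```

Negative risk contributions, which can occur with strongly diversifying holdings, would need separate handling because the Gini coefficient is only well behaved for non-negative inputs.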
Did our original portfolio achieve its out-of-sample goal? Here, we can evaluate success as to whether the realized contribution to risk of each exposure was close to equivalent; i.e. did we actually achieve risk parity as desired? We can see below that indeed we did, with the main exception of Macro – Trend, which was the most volatile asset class over the period.
Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research.
Over the sample space of potential portfolios, the portfolio with the minimum out-of-sample Gini coefficient (0.08) was tilted toward the less volatile and more diversifying asset classes (Intermediate Treasuries and Macro – Income). Even so, due to the limited granularity of the sampled portfolios, the risk contribution of Macro – Income was still half of that for each of the other strategies.
It is also worth noting how similar this solution is – generated with the complete benefit of hindsight – to our originally constructed portfolio.
Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research.
Equity-like with Downside Management
This sleeve was designed in an effort to capture equity market growth while managing the risk of severe and prolonged drawdowns. It was tilted toward the equity-like exposures with a split among risk management styles (trend, minimum volatility, macro strategies, etc.). The allocation to U.S. Treasuries is very small.
Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research.
For this portfolio, we have two variables to analyze: the up capture relative to global equities and the Ulcer index, a measure of the severity and duration of drawdowns. In the construction of the sleeve, the target was to keep the Ulcer index less than 25% of the value for global equities. The joint distribution of these quantities over the live period is shown below with the actual values over the live period for the sleeve indicated.
Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results. Index returns are total returns and are gross of all fees.
The realized Ulcer level was 68% of that of world equity – a far cry from the 25% that the portfolio was optimized for – and was in the 42nd percentile while the up capture of 0.60 was in the 93rd percentile.
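Both measures are straightforward to compute from a return series. The sketch below uses the standard Ulcer index definition (the root mean square of percentage drawdowns from the running peak) and the same compounding convention for up capture as is commonly used for down capture. Whether the original study used exactly these conventions is an assumption, and the function names are hypothetical.

```python
import math


def ulcer_index(returns):
    """Root mean square of drawdowns from the running peak of the equity curve."""
    level, peak, squared = 1.0, 1.0, 0.0
    for r in returns:
        level *= 1.0 + r
        peak = max(peak, level)
        drawdown = level / peak - 1.0  # zero at a new high, negative otherwise
        squared += drawdown * drawdown
    return math.sqrt(squared / len(returns))


def up_capture(portfolio_returns, benchmark_returns):
    """Compounded portfolio return over benchmark up periods, per unit of benchmark gain."""
    pairs = [(p, b) for p, b in zip(portfolio_returns, benchmark_returns) if b > 0]
    if not pairs:
        raise ValueError("benchmark has no up periods")
    port_growth = math.prod(1.0 + p for p, _ in pairs)
    bench_growth = math.prod(1.0 + b for _, b in pairs)
    return (port_growth - 1.0) / (bench_growth - 1.0)


def relative_ulcer(portfolio_returns, benchmark_returns):
    """Portfolio Ulcer index as a fraction of the benchmark's."""
    return ulcer_index(portfolio_returns) / ulcer_index(benchmark_returns)


if __name__ == "__main__":
    # Toy return series for illustration only; not data from the study.
    port = [0.01, -0.02, 0.03]
    bench = [0.02, -0.04, 0.05]
    print(up_capture(port, bench), relative_ulcer(port, bench))
```

Because the Ulcer index squares each drawdown and averages over time, it penalizes both deep and long-lasting drawdowns, which is why it suits a downside management objective.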
With the explicit goal of achieving a relative Ulcer level, a comparison against the entire potential allocation space of 10 million portfolios is not appropriate. Therefore, we reduce the set of 10 million comparative portfolios to only those that would have given a relative Ulcer index less than 25% compared to world equities, eliminating approximately 40% of possible portfolios.
The distributions of allocations to each of the strategies in the acceptable subset are shown below. We can see that the more diversifying strategies take on a larger range of allocations.
Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results. Index returns are total returns and are gross of all fees.
Interestingly, looking only over this subset of the original 10 million portfolios improves the out-of-sample up capture of our originally constructed portfolio to the 99th percentile but does not change the percentile of the Ulcer index over the live period. Why is this?
The correlation of the relative Ulcer index over the live period with that over the historical period is only 0.1, indicating that, at first glance, the out-of-sample data did not line up with our expectations. However, this makes sense when we recall that the optimization was carried out using data from much more extreme market environments (think 2001 and 2008). It is a good reminder that optimizing for a certain parameter value does not mean you will get it over the live data.
Higher up capture typically goes hand-in-hand with a higher Ulcer index, as higher return often requires bearing more risk. Therefore, one way to standardize our measures across the potential set of portfolios is to calculate the ratio of up capture to the Ulcer index. With this transformation, the risk-adjusted up capture falls in the 87th percentile over the set of sample allocations, indicating a very high realized risk-adjusted up capture.
Source: St. Louis Federal Reserve, MSCI, Salient, HFRI, CSI Analytics. Calculations by Newfound Research. It is not possible to invest in an index. Past performance does not guarantee future results. Index returns are total returns and are gross of all fees.
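The filtering and re-ranking steps in this section can be sketched as follows. The sleeve's live up capture (0.60) and relative Ulcer index (0.68) are the realized values quoted above; everything else, including the candidate portfolios, the sleeve's historical relative Ulcer value, and all names, is hypothetical and for illustration only.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    hist_rel_ulcer: float   # relative Ulcer index over the historical period
    live_up_capture: float  # up capture over the live period
    live_rel_ulcer: float   # relative Ulcer index over the live period


def acceptable(candidates: List[Candidate], max_hist_rel_ulcer: float = 0.25) -> List[Candidate]:
    """Keep only portfolios that met the Ulcer target on historical data."""
    return [c for c in candidates if c.hist_rel_ulcer < max_hist_rel_ulcer]


def capture_per_ulcer(c: Candidate) -> float:
    """Risk-adjusted up capture: live up capture per unit of live relative Ulcer index."""
    return c.live_up_capture / c.live_rel_ulcer


def percentile_rank(value: float, population: List[float]) -> float:
    """Percentage of the population strictly below `value`."""
    return 100.0 * sum(1 for v in population if v < value) / len(population)


if __name__ == "__main__":
    sleeve = Candidate(0.24, 0.60, 0.68)  # 0.24 is a placeholder historical value
    toy = [
        Candidate(0.10, 0.20, 0.40),
        Candidate(0.20, 0.45, 0.60),
        Candidate(0.30, 0.90, 0.95),  # fails the historical Ulcer target
        Candidate(0.22, 0.70, 0.70),
        Candidate(0.15, 0.30, 0.50),
    ]
    subset = acceptable(toy)
    ratios = [capture_per_ulcer(c) for c in subset]
    print(percentile_rank(capture_per_ulcer(sleeve), ratios))
```

The same `percentile_rank` helper can be pointed at the full lattice or at the acceptable subset, which is how the percentile of a single metric can shift when the comparison set changes.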
Conclusion
We only experience one path of the world and do not know the infinite alternate courses history could have taken. But it is exactly this infinitude of alternate states that diversification is meant to address.
Diversification generally has no apparent benefit unless we envision what could have happened. Unfortunately, our innate nature makes this difficult. We do not often value our realized path in this context. After all, none of these alternate states actually happened, so it is difficult to picture what we did not experience.
A quantitative approach can yield a systematic way to evaluate the benefit (or detriment) of diversification. This way, we are not relying as much on intuition – how did our performance feel? – and are looking through a more objective lens at our initial decisions.
In the examples using the Unconstrained Sleeves, diversification focused on more than just returns. The objectives that initially went into the portfolio construction were the parameters of interest.
Taking a systematic approach does not fully remove the art of the analysis, as was evident in the construction of the potential sample of portfolios used in the comparisons, but having a process can remove some of the behavioral biases that make sticking with a portfolio difficult in the first place.
Measuring the Benefit of Diversification
By Nathan Faber
On November 5, 2018
In Portfolio Construction, Risk Management, Weekly Commentary