This post is available as a PDF download here.
Summary
- In this research note, we continue our exploration of credit.
- Rather than test a quantitative signal, we explore credit changes through the lens of statistical decomposition.
- As with the Treasury yield curve, we find that changes in the credit spread curve can be largely explained by Level, Slope, and Curvature (so long as we adjust for relative volatility levels).
- We construct stylized portfolios to reflect these factors, adjusting position weights such that they contribute an equal amount of credit risk. We then neutralize interest rate exposure such that the return of these portfolios represents credit-specific information.
- We find that the Level trade suggests little-to-no realized credit premium over the last 25 years, and Slope suggests no realized premium of junk-minus-quality within credit either. However, results may be largely affected by idiosyncratic events (e.g. LTCM in 1998) or unhedged risks (e.g. sector differences in credit indices).
In this week’s research note, we continue our exploration of credit with a statistical decomposition of the credit spread curve. Just as the U.S. Treasury yield curve plots yields versus maturity, the credit spread curve plots excess yield versus credit quality, providing us insight into how much extra return we demand for the risks of declining credit quality.
Source: Federal Reserve of St. Louis; Bloomberg. Calculations by Newfound Research.
Our goal in analyzing the credit spread curve is to gain a deeper understanding of the principal drivers behind its changes. In doing so, we hope to potentially gain intuition and ideas for trading signals between low- and high-quality credit.
To begin our analysis, we must first construct our credit spread curve. We will use the following index data to represent our different credit qualities.
- Aaa: Bloomberg U.S. Corporate Aaa Index (LCA3TRUU)
- Aa: Bloomberg U.S. Corporate Aa Index (LCA2TRUU)
- A: Bloomberg U.S. Corporate A Index (LCA1TRUU)
- Baa: Bloomberg U.S. Corporate Baa Index (LCB1TRUU)
- Ba: Bloomberg U.S. Corporate HY Ba Index (BCBATRUU)
- B: Bloomberg U.S. Corporate HY B Index (BCBHTRUU)
- Caa: Bloomberg U.S. Corporate HY Caa Index (BCAUTRUU)
Unfortunately, we cannot simply plot the yield-to-worst for each index, as a spread captures only the excess yield. This raises the question: excess to what? As we want to isolate the credit component of the yield, we need to remove the duration-equivalent Treasury rate.
Plotting the duration of each credit index over time, we can immediately see why incorporating this duration data will be important. Not only do durations vary meaningfully over time (e.g. Aaa durations varying between 4.95 and 11.13), but they also deviate across quality (e.g. Caa durations currently sit near 3.3 while Aaa durations are north of 11.1).
Source: Bloomberg.
To calculate our credit spread curve, we must first calculate the duration-equivalent Treasury bond yield for each index at each point in time. For each credit index at each point in time, we use the historical Treasury yield curve to numerically solve for the Treasury maturity that matches the credit index’s duration. We then subtract that matching rate from the credit index’s reported yield-to-worst to estimate the credit spread.
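As a rough illustration of this duration-matching step, the sketch below solves for the Treasury maturity whose duration equals a credit index's duration and then computes the spread. It assumes, for simplicity, that each Treasury is a semiannual par bond, that the curve is linearly interpolated between tenors, and that duration rises with maturity over the search range. The curve nodes, the index duration of 7.0, and the 4.5% yield-to-worst are made-up inputs, and the function names are hypothetical.

```python
from bisect import bisect_right


def interp_yield(maturity, tenors, yields):
    """Linearly interpolate the curve; flat extrapolation outside the nodes."""
    if maturity <= tenors[0]:
        return yields[0]
    if maturity >= tenors[-1]:
        return yields[-1]
    i = bisect_right(tenors, maturity)
    t0, t1 = tenors[i - 1], tenors[i]
    w = (maturity - t0) / (t1 - t0)
    return yields[i - 1] + w * (yields[i] - yields[i - 1])


def par_bond_duration(maturity, y):
    """Modified duration of a semiannual par bond with yield y (decimal)."""
    if y == 0:
        return maturity
    return (1.0 - (1.0 + y / 2.0) ** (-2.0 * maturity)) / y


def duration_matched_yield(target_duration, tenors, yields, tol=1e-8):
    """Bisect on maturity until the par-bond duration hits the target."""
    def gap(m):
        return par_bond_duration(m, interp_yield(m, tenors, yields)) - target_duration

    lo, hi = tenors[0], tenors[-1]
    if gap(lo) > 0 or gap(hi) < 0:
        raise ValueError("target duration is outside the range of the curve")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    maturity = 0.5 * (lo + hi)
    return maturity, interp_yield(maturity, tenors, yields)


# Hypothetical inputs: curve nodes (years, decimal yields), index duration, index YTW.
tenors = [1.0, 2.0, 5.0, 10.0, 30.0]
yields = [0.020, 0.022, 0.025, 0.028, 0.031]
maturity, matched_rate = duration_matched_yield(7.0, tenors, yields)
credit_spread = 0.045 - matched_rate
```

Repeating this for each index at each date would yield a spread history per credit quality.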
We plot the spreads over time below.
Source: Federal Reserve of St. Louis; Bloomberg. Calculations by Newfound Research.
Statistical Decomposition: Eigen Portfolios
With our credit spreads in hand, we can now attempt to extract the statistical drivers of change within the curve. One method of achieving this is to:
- Calculate month-to-month differences in the curve.
- Calculate the correlation matrix of the differences.
- Calculate an eigenvalue decomposition of the correlation matrix.
Stopping after just the first two steps, we can begin to see some interesting visual patterns emerge in the correlation matrix.
- There is not a monotonic decline in correlation between credit qualities. For example, Aaa is not more highly correlated to Aa than it is to Ba, and A is more correlated to B than it is to Aa.
- Aaa appears to behave rather uniquely.
- Baa, Ba, B, and to a lesser extent Caa, appear to visually cluster in behavior.
- Ba, B, and Caa do appear to have more intuitive correlation behavior, with correlations increasing as credit qualities get closer.
Step 3 might seem foreign for those unfamiliar with the technique, but in this context eigenvalue decomposition has an easy interpretation. The process will take our universe of credit indices and return a universe of statistically independent factor portfolios, where each portfolio is made up of a combination of credit indices.
As our eigenvalue decomposition was applied to the correlation matrix of credit spread changes, the factors will explain the principal vectors of variance in credit spread changes. We plot the weights of the first three factors below.
Source: Federal Reserve of St. Louis; Bloomberg. Calculations by Newfound Research.
For anyone who has performed an eigenvalue decomposition on the yield curve before, three familiar components emerge.
We can see that Factor #1 applies nearly equal-weights across all the credit indices. Therefore, we label this factor “level” as it represents a level shift across the entire curve.
Factor #2 declines in weight from Aaa through Caa. Therefore, we label this factor “slope,” as it controls steepening and flattening of the credit curve.
Factor #3 appears as a barbell: negative weights in the wings and positive weights in the belly. Therefore, we call this factor “curvature,” as it will capture convexity changes in the curve.
Together, these three factors explain 80% of the variance in credit spread changes. Interestingly, the 4th factor – which brings variance explained up to 87.5% – also looks very much like a curvature trade, but places zero weight on Aaa and barbells Aa/Caa against A/Baa. We believe this serves as further evidence as to the unique behavior of Aaa credit.
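For readers who want to experiment, the three-step decomposition (differences, correlation matrix, eigenvalue decomposition) can be sketched in a few lines of NumPy. The data here is synthetic, driven by one common shock plus noise, and stands in for the actual spread history; the function name is hypothetical. Note that the sign of each eigenvector is arbitrary, so a factor and its negative describe the same trade.

```python
import numpy as np


def eigen_portfolios(spreads):
    """spreads: (T, N) array of monthly spread levels, one column per rating.

    Returns (weights, explained), where column k of `weights` is factor k and
    `explained` is the share of variance each factor accounts for.
    """
    changes = np.diff(spreads, axis=0)          # step 1: month-to-month differences
    corr = np.corrcoef(changes, rowvar=False)   # step 2: correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)     # step 3: eigendecomposition (ascending)
    order = np.argsort(eigvals)[::-1]           # reorder so the largest comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvecs, eigvals / eigvals.sum()


# Synthetic stand-in for seven credit-quality spread series (an assumption).
rng = np.random.default_rng(0)
common_shock = rng.normal(size=(300, 1))
simulated_changes = common_shock + 0.5 * rng.normal(size=(300, 7))
simulated_spreads = np.cumsum(simulated_changes, axis=0)

weights, explained = eigen_portfolios(simulated_spreads)
cumulative_explained = np.cumsum(explained)
```

In this toy data the first factor loads with the same sign on every series, which is the "level" pattern; `cumulative_explained` is the analogue of the variance-explained figures quoted above.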
Tracking Credit Eigen Portfolios
As we mentioned, each factor is constructed as a combination of exposure to our Aaa-Caa credit universe; in other words, they are portfolios! This means we can track their performance over time and see how these different trades behave in different market regimes.
To avoid overfitting and estimation risk, we decided to simplify the factor portfolios into more stylized trades, whose weights are plotted below (though ignore, for a moment, the actual weights, as they are meant only to represent relative weighting within the portfolio and not absolute level). Note that the Level trade has a cumulative positive weight while the Slope and Curvature trades sum to zero.
To actually implement these trades, we need to account for the fact that each credit index will have a different level of credit duration.
Akin to duration, which measures a bond’s sensitivity to interest rate changes, credit duration measures a bond’s sensitivity to changes in its credit spread. As with Treasuries, we need to adjust the weights of our trades to account for this difference in credit durations across our indices.
For example, if we want to place a trade that profits in a steepening of the Treasury yield curve, we might sell 10-year US Treasuries and buy 2-year US Treasuries. However, we would not buy and sell the same notional amount, as that would leave us with a significantly negative duration position. Rather, we would scale each leg such that their durations offset. In the end, this causes us to buy significantly more 2s than we sell 10s.
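To make the arithmetic concrete, here is a small sketch of the sizing calculation for the steepener. The durations of 1.9 and 8.8 for the 2-year and 10-year notes are illustrative assumptions, not figures from our data, and the function name is hypothetical.

```python
def offsetting_notional(sold_notional, sold_duration, bought_duration):
    """Notional to buy so the bought leg's dollar duration offsets the sold leg's."""
    return sold_notional * sold_duration / bought_duration


# Assumed durations: 10-year note = 8.8, 2-year note = 1.9.
sold_10s = 100.0
bought_2s = offsetting_notional(sold_10s, sold_duration=8.8, bought_duration=1.9)

# Net dollar duration of the combined position (zero up to rounding).
net_dollar_duration = bought_2s * 1.9 - sold_10s * 8.8
```

Under these assumed durations, we would buy roughly $463 of 2s for every $100 of 10s sold.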
To continue, therefore, we must calculate credit spread durations.
Without this data on hand, we employ a statistical approach. Specifically, we take monthly total return data and subtract yield return and impact from interest rate changes (employing the duration-matched rates we calculated above). What is left over is an estimate of return due to changes in credit spreads. We then regress these returns against changes in credit spreads to calculate credit spread durations, which we plot below.
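A minimal sketch of this estimate is below. It assumes the yield return is approximated by the prior month's yield-to-worst divided by 12, the rate impact by the prior month's duration times the change in the duration-matched rate, and that a single regression (with intercept) is run over the whole sample. A rolling window would be needed to produce a time series. All names and inputs are hypothetical, and the synthetic returns are built with a known spread duration of 6 so the recovery can be checked.

```python
def spread_return(total_ret, ytw_prev, duration_prev, d_rate):
    """Return left over after removing carry and the interest-rate impact."""
    carry = ytw_prev / 12.0                  # assumed monthly yield return
    rate_impact = -duration_prev * d_rate    # first-order duration effect
    return total_ret - carry - rate_impact


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def credit_spread_duration(total_rets, ytws_prev, durations_prev, d_rates, d_spreads):
    """Spread duration is the negative of the regression slope."""
    residuals = [
        spread_return(r, y, d, dr)
        for r, y, d, dr in zip(total_rets, ytws_prev, durations_prev, d_rates)
    ]
    return -ols_slope(d_spreads, residuals)


# Synthetic monthly data constructed with a true spread duration of 6.0.
d_spreads = [0.0010, -0.0005, 0.0020, -0.0015, 0.0003]
d_rates = [0.0005, -0.0010, 0.0002, 0.0007, -0.0003]
ytws_prev = [0.040] * 5
durations_prev = [7.0] * 5
total_rets = [
    y / 12.0 - d * dr - 6.0 * ds
    for y, d, dr, ds in zip(ytws_prev, durations_prev, d_rates, d_spreads)
]
estimate = credit_spread_duration(total_rets, ytws_prev, durations_prev, d_rates, d_spreads)
```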
Source: Federal Reserve of St. Louis; Bloomberg. Calculations by Newfound Research.
The results are a bit of a head scratcher. Unlike duration in the Treasury curve, which typically increases monotonically across maturities, we get a very different effect here. Aaa credit spread duration is 10.7 today while Caa credit spread duration is 2.8. How is that possible? Why is lower-quality credit not more sensitive to credit changes than higher-quality credit?
Here we run into a very interesting empirical result in credit spreads: spread change is proportional to spread level. Thus, a true “level shift” rarely occurs in the credit space; e.g. a 1bp change in the front-end of the credit spread curve may actually manifest as a 10bp change in the back end. Therefore, the lower credit spread duration of the back end of the curve is offset by larger changes.
There is some common-sense intuition to this effect. Credit has a highly non-linear return component: defaults. If we enter an economic environment where we expect an increase in default rates, it tends to happen in a non-linear fashion across the curve. To offset the larger increase in defaults in lower quality credit, investors will demand larger corresponding credit spreads.
(Side note: this is why we saw that the Baa–Aaa spread did not appear to mean-revert as cleanly as the log-difference of spreads did in last week’s commentary, Value and the Credit Spread.)
While our credit spread durations may be correct, we still face a problem: weighting such that each index contributes equal credit spread duration will create an outsized weight to the Caa index.
DTS Scaling
Fortunately, some very smart folks thought about this problem many years ago. Recognizing the stability of relative spread changes, Dor, Dynkin, Hyman, Houweling, van Leeuwen, and Penninga (2007) recommend the measure of duration times spread (“DTS”) for credit risk.
With a more appropriate measure of credit sensitivity, we can now scale our stylized factor portfolio weights such that each position contributes an equal level of DTS. This will have two effects: (1) the relative weights in the portfolios will change over time, and (2) the notional size of the portfolios will change over time.
We scale each position such that (1) they contribute an equal level of DTS to the portfolio and (2) each leg of the portfolio has a total DTS of 500bps. The Level trade, therefore, represents a constant 500bps of DTS risk over time, while the Slope and Curvature trades represent 0bps, as the long and short legs net out.
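The scaling rule can be sketched as follows. This is a simplified reading: positions are described only by their sign (long, short, or flat), each leg's 500bps of DTS is split equally across its positions, and DTS is measured as spread duration times spread in basis points. The three ratings, their spread durations and spreads, and the direction of the slope trade are hypothetical inputs chosen for illustration.

```python
def dts_scaled_weights(signs, spread_durations, spreads_bps, leg_dts_bps=500.0):
    """Notional weights such that every position in a leg contributes equal DTS
    and each leg's total DTS equals leg_dts_bps."""
    dts = {k: spread_durations[k] * spreads_bps[k] for k in signs}
    n_long = sum(1 for s in signs.values() if s > 0)
    n_short = sum(1 for s in signs.values() if s < 0)
    weights = {}
    for name, sign in signs.items():
        if sign > 0:
            weights[name] = (leg_dts_bps / n_long) / dts[name]
        elif sign < 0:
            weights[name] = -(leg_dts_bps / n_short) / dts[name]
        else:
            weights[name] = 0.0
    return weights


# Hypothetical inputs: DTS is 500, 1050, and 2400 for Aaa, Baa, and Caa.
spread_durations = {"Aaa": 10.0, "Baa": 7.0, "Caa": 3.0}
spreads_bps = {"Aaa": 50.0, "Baa": 150.0, "Caa": 800.0}

level = dts_scaled_weights({"Aaa": 1, "Baa": 1, "Caa": 1}, spread_durations, spreads_bps)
slope = dts_scaled_weights({"Aaa": 1, "Baa": 0, "Caa": -1}, spread_durations, spreads_bps)

level_dts = sum(level[k] * spread_durations[k] * spreads_bps[k] for k in level)
slope_dts = sum(slope[k] * spread_durations[k] * spreads_bps[k] for k in slope)
```

Because DTS moves with spreads, rerunning this each month changes both the relative weights and the total notional, which are the two effects described above.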
One problem still remains: interest rate risk. As we plotted earlier in this piece, the credit indices have time-varying – and sometimes substantial – interest rate exposure. This creates an unintended bet within our portfolios.
Fortunately, unlike the credit curve, true level shift does empirically apply in the Treasury yield curve. Therefore, to simplify matters, we construct a 5-year zero-coupon bond, which provides us with a constant duration instrument. At each point in time, we calculate the net duration of our credit trades and use the 5-year ZCB to neutralize the interest rate risk. For example, if the Level portfolio has a duration of 1, we would take a -20% notional position in the 5-year ZCB.
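The hedge sizing is a one-line calculation, sketched below. It assumes the 5-year ZCB has a duration of exactly 5, as in the example above; the single-position portfolio is a made-up input chosen so that net duration equals 1, and the function name is hypothetical.

```python
def zcb_hedge_notional(weights, durations, zcb_duration=5.0):
    """Notional ZCB position that offsets the portfolio's net duration."""
    net_duration = sum(weights[k] * durations[k] for k in weights)
    return -net_duration / zcb_duration


# Hypothetical portfolio with a net duration of 0.125 * 8.0 = 1.0.
hedge = zcb_hedge_notional({"A": 0.125}, {"A": 8.0})
```

Here `hedge` comes out to -0.2, the -20% notional position in the example.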
Source: Federal Reserve of St. Louis; Bloomberg. Calculations by Newfound Research.
Some things we note when evaluating the portfolios over time:
- In all three portfolios, notional exposure to higher credit qualities is substantially larger than to lower credit qualities. This captures the meaningfully higher exposure that lower credit quality indices have to credit risk than higher quality indices.
- The total notional exposure of each portfolio varies dramatically over time as market regimes change. In tight spread environments, DTS is low, and therefore notional exposures increase. In wide spread environments – like 2008 – DTS levels expand dramatically and therefore only a little exposure is necessary to achieve the same risk target.
- 2014 highlights a potential problem with our approach: as Aaa spreads reached just 5bps, DTS dipped as low as 41bps, causing a significant swing in notional exposure to maintain the same DTS contribution.
Conclusion
The fruit of all our labor is the graph plotted below, which shows the growth of $1 in our constant DTS, stylized credit factor portfolios.
What can we see?
First and foremost, constant credit exposure has not provided much in the last 25 years until recently. It would appear that investors did not demand a high enough premium for the risks that were realized over the period, which include the 1998 LTCM blow-up, the burst of the dot-com bubble, and the 2008 recession.
From 12/31/2008 lows through Q1 2019, however, a constant 500bps DTS exposure generated a 2.0% annualized return with 2.4% annualized volatility, reflecting a nice annual premium for investors willing to bear the credit risk.
Slope captures the high-versus-low-quality trade. We can see that junk meaningfully out-performed quality in the 1990s, after which there really did not appear to be a meaningful difference in performance until 2013 when oil prices plummeted and high yield bond prices collapsed. This result does highlight a potential problem in our analysis: the difference in sector composition of the underlying indices. High yield bonds had an outsized reaction compared to higher quality investment grade credit due to more substantial exposure to the energy sector, leading to a lop-sided reaction.
What is also interesting about the Slope trade is that the market did not seem to price a meaningful premium for holding low-quality credit over high-quality credit.
Finally, we can see that the Curvature (“barbell versus belly”) trade was rather profitable for the first decade, before deflating pre-2008 and going on a mostly-random walk ever since. However, as mentioned when the curvature trade was initially introduced, the 4th factor in our decomposition also appeared to reflect a similar trade but shorts Aa and Caa versus a long position in A and Baa. This trade has been a fairly consistent money-loser since the early 2000s, indicating that a barbell of high quality (just not Aaa) and junk might do better than the belly of the curve.
It is worth pointing out that these trades represent a significant amount of compounding estimation – from duration-matching Treasury rates to credit spread durations – which also means a significant risk of compounding estimation error. Nevertheless, we believe there are a few takeaways worth exploring further:
- The Level trade appears highly regime dependent (in positive and negative economic environments), suggesting a potential opportunity for on/off credit trades.
- The 4th factor is a consistent loser, suggesting a potential structural tilt that can be made by investors by holding quality and junk (e.g. QLTA + HYG) rather than the belly of the curve (LQD). Implementing this in a long-only fashion would require more substantial analysis of duration trade-offs, as well as a better intuition as to why the returns are emerging as they are.
- Finally, a recognition that maintaining a constant credit risk level requires reducing notional exposure as spreads go up, as spread changes are proportional to spread levels. This is an important consideration for strategic asset allocation.
Ensemble Multi-Asset Momentum
By Corey Hoffstein
On July 22, 2019
Summary
Early in the 2010s, a suite of index-linked products came to market that raised billions of dollars. These products – offered by just about every major bank – sought to simultaneously exploit the diversification benefits of modern portfolio theory and the potential for excess returns from the momentum anomaly.
While each index has its own bells and whistles, they generally follow the same approach:
And despite their differences, we can see in plotting their returns below that these indices generally share a common return pattern, indicating a common, driving style.
Source: Bloomberg.
Frequent readers will know that “monthly rebalance” is an immediate red flag for us here at Newfound: an indicator that timing luck is likely lurking nearby.
Replicating Multi-Asset Momentum
To test the impact of timing luck, we replicate a simple multi-asset momentum strategy based upon available index descriptions.
We rebalance the portfolio at the end of each month. Our optimization process seeks to identify the portfolio with a realized volatility less than 5% that would have maximized returns over the prior six months, subject to a number of position and asset-level limits. If the 5% volatility target is not achievable, the target is increased by 1% until a portfolio can be constructed that satisfies our constraints.
We use the following ETFs and asset class limits:
As a naïve test for timing luck, rather than assuming the index rebalances at the end of each month, we will simply assume the index rebalances every 21 trading days. In doing so, we can construct 21 different variations of the index, each representing the results from selecting a different rebalance date.
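Constructing the variations amounts to offsetting the rebalance schedule by one trading day at a time. A minimal sketch, with trading days represented by integer indices and a hypothetical function name:

```python
def rebalance_schedules(n_days, period=21):
    """One schedule per offset: variant k rebalances on days k, k + period, ..."""
    return [list(range(offset, n_days, period)) for offset in range(period)]


# Roughly one year of trading days (252 is an assumption for illustration).
schedules = rebalance_schedules(252)
```

Each of the 21 lists holds the day indices on which one variant of the index rebalances; between those days the variant simply holds its last set of weights.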
Source: CSI Analytics; Calculations by Newfound Research. Results are backtested and hypothetical. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes, with the exception of underlying ETF expense ratios. Past performance is not an indicator of future results.
As expected, the choice of rebalance date has a meaningful impact. Annualized returns range from 4.7% to 5.5%, Sharpe ratios range from 0.6 to 0.9, and maximum drawdowns range from 9.9% to 20.8%.
On a year-by-year basis, the only thing that is consistent is the large spread between the worst and best-performing rebalance date. On average, the yearly spread exceeds 400 basis points.
| Min | Max |
|---|---|
| -9.91% | 0.85% |
| 2.36% | 4.59% |
| 6.46% | 9.65% |
| 3.31% | 10.15% |
| 6.76% | 10.83% |
| 3.42% | 6.13% |
| 5.98% | 10.60% |
| -5.93% | -2.51% |
| 4.18% | 8.45% |
| 9.60% | 11.62% |
| -6.00% | -2.53% |
| 5.93% | 10.01% |
* Partial year starting 7/22/2018
We’ve said it in the past and we’ll say it again: timing luck can be the difference between hired and fired. And while we’d rather be on the side of good luck, the lack of control means we’d rather just avoid this risk altogether.
If it isn’t nailed down for a reason, diversify it
The choice of when to rebalance is certainly not the only free variable of our multi-asset momentum strategy. Without an explicit view as to why a choice is made, our preference is always to diversify so as to avoid specification risk.
We will leave the constraints (e.g. volatility target and weight constraints) well enough alone in this example, but we should consider the process by which we’re measuring past returns as well as the horizon over which we’re measuring it. There is plenty of historical efficacy to using prior 6-month total returns for momentum, but no lack of evidence supporting other lookback horizons or measurements.
Therefore, we will use three models of momentum: prior total return, the distance of price from its moving average, and the distance of a short-term moving average from a longer-term moving average. We will vary the parameterization of these signals to cover horizons ranging from 3- to 15-months in length.
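The three signal families can be sketched as simple functions of a monthly price series. The exact definitions below (ratios minus one, simple moving averages, and the choice of a 3-month short window in the example) are assumptions for illustration, and the price series is synthetic.

```python
def total_return_signal(prices, lookback):
    """Prior total return over `lookback` months."""
    if len(prices) < lookback + 1:
        raise ValueError("not enough price history")
    return prices[-1] / prices[-1 - lookback] - 1.0


def simple_moving_average(prices, window):
    """Average of the most recent `window` prices."""
    if len(prices) < window:
        raise ValueError("not enough price history")
    return sum(prices[-window:]) / window


def price_minus_ma_signal(prices, lookback):
    """Distance of the latest price from its moving average."""
    return prices[-1] / simple_moving_average(prices, lookback) - 1.0


def ma_cross_signal(prices, short_window, long_window):
    """Distance of a short-term moving average from a longer-term one."""
    short_ma = simple_moving_average(prices, short_window)
    long_ma = simple_moving_average(prices, long_window)
    return short_ma / long_ma - 1.0


# Synthetic series rising 1% per month, with 16 month-end prices.
prices = [100.0 * 1.01 ** i for i in range(16)]
signals = {
    "total_return_12m": total_return_signal(prices, 12),
    "price_minus_ma_12m": price_minus_ma_signal(prices, 12),
    "ma_cross_3m_12m": ma_cross_signal(prices, 3, 12),
}
```

On a steadily rising series all three signals are positive, though they differ in magnitude; varying the lookback from 3 to 15 months produces the family of specifications.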
We will also vary which day of the month the portfolio rebalances on.
By varying the signal, the lookback horizon, and the rebalance date, we can generate hundreds of different portfolios, all supported by the same theoretical evidence but having slightly different realized results due to their particular specification.
Our robust portfolio emerges by calculating the weights for all these different variations and averaging them together, in many ways creating a virtual strategy-of-strategies.
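The averaging step itself is straightforward. A minimal sketch, assuming each specification produces a dictionary of asset weights and that all specifications are weighted equally; the asset names and weights are placeholders, and the function name is hypothetical.

```python
def ensemble_weights(spec_weights):
    """Equal-weight average of the portfolios from every specification.

    spec_weights: list of dicts mapping asset name to weight; an asset missing
    from a specification is treated as a zero weight.
    """
    assets = sorted({asset for weights in spec_weights for asset in weights})
    n_specs = len(spec_weights)
    return {
        asset: sum(weights.get(asset, 0.0) for weights in spec_weights) / n_specs
        for asset in assets
    }


# Three placeholder specifications (e.g. different signals or rebalance dates).
specifications = [
    {"stocks": 0.6, "bonds": 0.4},
    {"stocks": 0.2, "bonds": 0.8},
    {"gold": 1.0},
]
ensemble = ensemble_weights(specifications)
```

If every underlying portfolio is fully invested, the average is fully invested as well.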
Below we plot the result of this ensemble approach as compared to a random sample of the underlying specifications. We can see that while there are specifications that do much better, there are also those that do much worse. By employing an ensemble approach, we forgo the opportunity for good luck and avoid the risk of bad luck. Along the way, though, we may pick up some diversification benefits: the Sharpe ratio of the ensemble approach fell in the top quartile of specifications and its maximum drawdown was in the bottom quartile (i.e. lower drawdown).
Source: CSI Analytics; Calculations by Newfound Research. Results are backtested and hypothetical. Results assume the reinvestment of all distributions. Results are gross of all fees, including, but not limited to, manager fees, transaction costs, and taxes, with the exception of underlying ETF expense ratios. Past performance is not an indicator of future results.
Conclusion
In this commentary, we again demonstrate the potential risk of needless specification and the potential power of diversification.
Using a popular multi-asset momentum model as our example, we again find a significant amount of timing luck lurking in a monthly rebalance specification. By building a virtual strategy-of-strategies, we are able to manage this risk by partially rebalancing our portfolio on different days.
We go a step further, acknowledging that process represents another axis of risk. Specifically, we vary both how we measure momentum and the horizon over which it is measured. Through the variation of rebalance days, model specifications, and lookback horizons, we generate over 500 different strategy specifications and combine them into a virtual strategy-of-strategies to generate our robust multi-asset momentum model.
As with prior commentaries, we find that the robust model is able to effectively reduce the risk of both specification and timing luck. But perhaps most importantly, it was able to harvest the benefits of diversification, realizing a Sharpe ratio in the top quartile of specifications and a maximum drawdown in the lowest quartile.