This post is available as a PDF download here.
Summary
- We continue our exploration of quantitative signals in fixed income.
- We use a measure of credit curve steepness as a valuation signal for timing exposure between corporate bonds and U.S. Treasuries.
- The value signal generates a 0.84% annualized return from 1950 to 2019 but is highly regime dependent with meaningful drawdowns.
- Introducing a naïve momentum strategy significantly improves the realized Sharpe ratio and drawdown profile, but does not reduce the regime-based nature of the returns.
- With a combined return of just 1.0% annualized, this strategy may not prove effective after appropriate discounting for hindsight bias, costs, and manager fees. The signal itself, however, may be useful in other contexts.
In the last several weeks, we have been exploring the application of quantitative signals to fixed income.
- In Tactical Credit we explored trend-following strategies with high yield bonds.
- In Quantitative Styles and Multi-Sector Bonds we built off a prior piece (Navigating Municipal Bonds with Factors) and explored the cross-sectional application of momentum, value, carry, reversal, and volatility signals in a broad fixed income universe.
- In Time Series Signals and Multi-Sector Bonds we explored the same momentum, value, carry, and reversal signals as market timing signals.
Recent cross-sectional studies also build off of further research we’ve done in the past on applying trend, value, carry, and explicit measures of the bond risk premium as duration timing mechanisms (see Duration Timing with Style Premia; Timing Bonds with Value, Momentum, and Carry; and A Carry-Trend-Hedge Approach to Duration Timing).
Broadly, our studies have found:
- Value (measured as deviation from real yield), momentum (prior 12-month returns), and carry (yield-to-worst) were all profitable factors in cross-sectional municipal bond sector long/short portfolios.
- Value (measured as deviation from real yield), trend (measured as prior return), and carry (measured as term spread + roll yield) have historically been effective timing signals for U.S. duration exposure.
- Prior short-term equity returns proved to be an effective signal for near-term returns in U.S. Treasuries (related to the “flight-to-safety premium”).
- Short-term trend proved effective for high yield bond timing, but the results were largely determined by performance in 2000-2003 and 2008-2009. While the strategy appeared to still be able to harvest relative carry between high-yield bonds and core fixed income in other environments, a significant proportion of returns came from avoiding large drawdowns in high yield.
- Short-term cross-sectional momentum (prior total returns), value (z-score of loss-adjusted yield-to-worst), carry (loss-adjusted yield-to-worst), and 3-year reversals all appeared to offer robust signals for relative selection in fixed income sectors. The time period covered in the study, however, was limited and mostly within a low-inflation regime.
- Application of momentum, value, carry, and reversal as timing signals proved largely ineffective for generating excess returns.
In this week’s commentary, we want to further contribute to research by introducing a value timing signal for credit.
Finding Value in Credit
Identifying a value signal requires some measure or proxy of an asset’s “fair” value. What can make identifying value in credit so difficult is that there are a number of moving pieces.
Conceptually, credit spreads should be a function of default rates, recovery rates, and aggregate risk appetite, which makes determining whether spreads are cheap or expensive rather complicated. Prior literature typically tackles the problem with one of three major categories of models:
- Econometric: “Fair value” of credit spreads is modeled through a regression that typically explicitly accounts for default and recovery rates. Inputs are often related to economic and market variables, such as equity market returns, 10-year minus 2-year spreads, corporate leverage, and corporate profitability. Bottom-up analysis may use metrics such as credit quality, maturity, supply, and liquidity.
- Merton Model: Based upon the idea that bondholders have sold a put on a company’s asset value. Therefore, options pricing models can be used to calculate a credit spread. Inputs include the total asset value, asset volatility, and leverage of the firm under analysis.
- Spread Signal: A simple statistical model derived from credit spreads themselves. For example, a rolling z-score of option-adjusted spreads or deviations from real yield. Other models (e.g. Haghani and Dewey (2016)) have used spread plus real yield versus a long-run constant (e.g. “150 basis points”).
The first method requires a significant amount of economic modeling. The second approach requires a significant amount of extrapolation from market data. The third method, while computationally (and intellectually) less intensive, requires a meaningful historical sample that realistically needs to cover at least one full market cycle.
While attractive for its simplicity, there are a number of factors that complicate the third approach.
First, if spreads are measured against U.S. Treasuries, the metric may be polluted by information related to Treasuries due to their idiosyncratic behavior (e.g. scarcity effects and flight-to-safety premiums). Structural shifts in default rates, recovery rates, and risk appetites may also cause a problem, as spreads may appear unduly thin or wide compared to past regimes.
In light of this, in this piece we will explore a similarly simple-to-calculate spread signal, but one that hopefully addresses some of these shortcomings.
Baa vs. Aaa Yields
In order to adjust for these problems, we propose looking at the steepness of the credit curve itself by comparing prime / high-grade yields versus lower-medium grade yields. For example, we could compare Moody’s Seasoned Aaa Corporate Bond Yield and Moody’s Seasoned Baa Corporate Bond Yield. In fact, we will use these yields for the remainder of this study.
We may be initially inclined to measure the steepness of the credit curve by taking the simple difference in yields (i.e. the Baa yield minus the Aaa yield), which we plot below.
Source: Federal Reserve of St. Louis. Calculations by Newfound Research.
We can find a stronger mean-reverting signal, however, if we calculate the log-difference in yields.
Source: Federal Reserve of St. Louis. Calculations by Newfound Research.
We believe this transformation is appropriate for two reasons. First, the log transformation helps control for the highly heteroskedastic and skewed nature of credit spreads.
Second, it helps capture both the steepness and the level of the credit curve simultaneously. For example, a 50-basis-point premium when Aaa yield is 1,000 basis points is very different from a 50-basis-point premium when Aaa yield is 100 basis points. In the former case, investors may not feel any pressure to bear excess risk to achieve their return objectives, and therefore a 50-basis-point spread may be quite thin. In the latter case, 50 basis points may represent a significant step-up in relative return level in an environment where investors have either low default expectations, high recovery expectations, high risk appetite, or some combination thereof.
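To make the level effect concrete, here is a minimal sketch of the log-difference measure applied to the two cases described above. The function name is ours and purely illustrative; the only inputs are the hypothetical 50-basis-point premiums from the example.

```python
import math

def log_steepness(baa_yield: float, aaa_yield: float) -> float:
    """Log-difference between Baa and Aaa yields (both in the same unit)."""
    return math.log(baa_yield) - math.log(aaa_yield)

# The same 50-basis-point premium at two different yield levels (in bps).
high_level = log_steepness(1050.0, 1000.0)  # Aaa yield at 1,000 bps
low_level = log_steepness(150.0, 100.0)     # Aaa yield at 100 bps

print(f"Aaa at 1,000 bps: {high_level:.4f}")
print(f"Aaa at   100 bps: {low_level:.4f}")
```

The simple difference is identical in both cases, while the log-difference is several times larger in the low-yield case, which is the distinction the transformation is meant to capture.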
Another way of interpreting our signal is that it informs us about the relative decisions investors must make about their expected dispersion in terminal wealth.
Constructing the Value Strategy
With our signal in hand, we can now attempt to time credit exposure. When our measure signals that the credit curve is historically steep, we will take credit risk. When our signal indicates that the curve is historically flat we will avoid it.
Specifically, we will construct a dollar-neutral long/short portfolio using the Dow Jones Corporate Bond Index (“DJCORP”) and a constant maturity 5-year U.S. Treasury index (“FV”). We will calculate a rolling z-score of our steepness measure and go long DJCORP and short FV when the z-score is positive and place the opposite trade when the z-score is negative.
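As a rough sketch of the signal logic, the snippet below computes a rolling z-score of the log-steepness measure and maps its sign to a position. The yields are made up, the 4-month window is only for illustration (the study uses multi-year formation periods), and including the current observation in the window is our assumption rather than something the text specifies.

```python
import math
from statistics import mean, stdev

def rolling_zscore(values, window):
    """Z-score of each value versus the trailing `window` observations
    (current observation included). None until enough history exists."""
    scores = []
    for i in range(len(values)):
        if i + 1 < window:
            scores.append(None)
            continue
        sample = values[i + 1 - window : i + 1]
        sd = stdev(sample)
        scores.append((values[i] - mean(sample)) / sd if sd > 0 else 0.0)
    return scores

def position(z):
    """+1: long corporates / short Treasuries; -1: the opposite trade."""
    if z is None:
        return 0
    return (z > 0) - (z < 0)

# Hypothetical monthly Baa and Aaa yields, in percent.
baa = [5.0, 5.1, 5.3, 5.6, 5.4, 5.2]
aaa = [4.2, 4.2, 4.3, 4.3, 4.3, 4.3]
steepness = [math.log(b) - math.log(a) for b, a in zip(baa, aaa)]

z = rolling_zscore(steepness, window=4)
positions = [position(s) for s in z]
print(positions)
```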
In line with prior studies, we will apply an ensemble approach. Portfolios are reformed monthly using formation periods ranging from 3-to-6 years with holding periods ranging from 1-to-6 months. Portfolio weights for the resulting strategy are plotted below.
Source: Federal Reserve of St. Louis and Global Financial Data. Calculations by Newfound Research.
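One plausible way to build such an ensemble is sketched below. We assume (this is not spelled out in the text) that a holding period of h months is implemented as h overlapping sub-portfolios, so the held position is the average of the last h monthly signals, and that the final weight is the equal-weight average across every formation/holding combination. The signal values and the two formation periods shown are purely hypothetical.

```python
def held_position(signal, t, holding):
    """Average of the signals formed over the last `holding` months,
    i.e. a set of overlapping sub-portfolios."""
    recent = signal[max(0, t - holding + 1) : t + 1]
    return sum(recent) / len(recent)

def ensemble_weight(signals_by_formation, t, holding_periods):
    """Equal-weight average over every (formation, holding) combination."""
    cells = [
        held_position(signal, t, h)
        for signal in signals_by_formation.values()
        for h in holding_periods
    ]
    return sum(cells) / len(cells)

# Hypothetical +1/-1 signals by month for two formation periods (in months).
signals = {
    36: [1, 1, -1, -1],
    72: [1, 1, 1, -1],
}

for t in range(4):
    print(t, ensemble_weight(signals, t, holding_periods=(1, 2)))
```

Averaging across specifications means the net weight moves gradually between -1 and +1 instead of flipping all at once when any single z-score changes sign.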
We should address the fact that while both corporate bond yield and index data are available back to the 1930s, we have truncated our study to ignore dates prior to 12/1949 to normalize for a post-war period. It should be further acknowledged that the Dow Jones Corporate Bond Index used in this study did not technically exist until 2002. Prior to that date, the index return tracks a Dow Jones Bond Aggregate, which was based upon four sub-indices: high-grade rails, second-grade rails, public utilities, and industries. This average existed from 1915 to 1976, at which point it was replaced with a new average because the number of railway bonds was no longer sufficient to maintain it.
Below we plot the returns of our long/short strategy.
Source: Federal Reserve of St. Louis and Global Financial Data. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all management fees, transaction fees, and taxes, but net of underlying fund fees. Total return series assumes the reinvestment of all distributions.
The strategy has an annualized return of 0.84% with a volatility of 3.89%, generating a Sharpe ratio of 0.22. Of course, long-term return statistics belie investor and manager experience, with this strategy exhibiting at least two periods of decade-plus-long drawdowns. In fact, the strategy really has just four major return regimes: 1950 to 1970 (-0.24% annualized), 1970 to 1987 (2.59% annualized), 1987 to 2002 (-0.33% annualized), and 2002 to 2019 (1.49% annualized).
Try the strategy out in the wrong environment and we might be in for a lot of pain.
Momentum to the Rescue?
It is no secret that value and momentum go together like peanut butter and jelly. Instead of tweaking our strategy to death in order to improve it, we may just find opportunity in combining it with a negatively correlated signal.
Using an ensemble model, we construct a dollar-neutral long/short momentum strategy that compares prior total returns of DJCORP and FV. Rebalanced monthly, the portfolios use formation periods ranging from 9-to-15 months and holding periods ranging from 1-to-6 months.
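A minimal sketch of the momentum signal follows. It compares trailing total returns of the two indices and averages the resulting sign across formation periods; using every monthly lookback from 9 to 15 months is our assumption, the index levels are synthetic, and the holding-period overlay is omitted for brevity.

```python
def trailing_return(levels, t, months):
    """Total return over the prior `months` months, from index levels."""
    return levels[t] / levels[t - months] - 1.0

def momentum_signal(corp, tsy, t, months):
    """+1 if corporates out-returned Treasuries over the lookback, -1 if not."""
    if t < months:
        return 0  # not enough history
    edge = trailing_return(corp, t, months) - trailing_return(tsy, t, months)
    return (edge > 0) - (edge < 0)

def momentum_ensemble(corp, tsy, t, formations=range(9, 16)):
    """Average signal across formation periods of 9 through 15 months."""
    signals = [momentum_signal(corp, tsy, t, m) for m in formations]
    return sum(signals) / len(signals)

# Synthetic total return index levels (not actual DJCORP or FV data).
corp = [100 * 1.005 ** i for i in range(16)]
tsy = [100 * 1.003 ** i for i in range(16)]

print(momentum_ensemble(corp, tsy, t=15))
```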
Below we plot the growth of $1 in our value strategy, our momentum strategy, and a 50/50 combination of the two strategies that is rebalanced monthly.
Source: Federal Reserve of St. Louis and Global Financial Data. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all management fees, transaction fees, and taxes, but net of underlying fund fees. Total return series assumes the reinvestment of all distributions.
The first thing we note is – even without calculating any statistics – the meaningful negative correlation we see in the equity curves of the value and momentum strategies. This should give us confidence that there is the potential for significant improvement through diversification.
The momentum strategy returns 1.11% annualized with a volatility of 3.92%, generating a Sharpe ratio of 0.29. The 50/50 combination strategy, however, returns 1.03% annualized with a volatility of just 2.16% annualized, resulting in a Sharpe ratio of 0.48.
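As a back-of-the-envelope check on the diversification benefit, we can back out the correlation implied by the reported volatilities. This sketch treats the monthly-rebalanced 50/50 mix as a fixed-weight portfolio and computes the Sharpe ratio as return divided by volatility, which is what the reported figures appear to reflect; both are simplifying assumptions on our part.

```python
import math

def combo_vol(vol_a, vol_b, corr, w_a=0.5):
    """Volatility of a fixed-weight two-asset portfolio."""
    w_b = 1.0 - w_a
    var = (w_a * vol_a) ** 2 + (w_b * vol_b) ** 2 + 2 * w_a * w_b * corr * vol_a * vol_b
    return math.sqrt(var)

def implied_corr(vol_a, vol_b, vol_combo, w_a=0.5):
    """Correlation that reconciles the two stand-alone vols with the combined vol."""
    w_b = 1.0 - w_a
    excess = vol_combo ** 2 - (w_a * vol_a) ** 2 - (w_b * vol_b) ** 2
    return excess / (2 * w_a * w_b * vol_a * vol_b)

# Reported annualized volatilities (%): value, momentum, 50/50 combination.
rho = implied_corr(3.89, 3.92, 2.16)
sharpe = 1.03 / 2.16  # reported combination return / volatility

print(f"implied correlation: {rho:.2f}")
print(f"combination Sharpe:  {sharpe:.2f}")
```

Under these assumptions the implied correlation comes out clearly negative, consistent with the visual impression from the equity curves.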
While we still see significant regime-driven behavior, the negative regimes now come at a far lower cost.
Conclusion
In this study we introduce a simple value strategy based upon the steepness of the credit curve. Specifically, we calculated a rolling z-score on the log-difference between Moody’s Seasoned Baa and Aaa yields. We interpreted a positive z-score as a historically steep credit curve and therefore likely one that would revert. Similarly, when z-scores were negative, we interpreted the signal as a flat credit curve, and therefore a period during which taking credit risk is not well compensated.
Employing an ensemble approach, we generated a long/short strategy that would buy the Dow Jones Corporate Bond Index and short 5-year U.S. Treasuries when credit appeared cheap and place the opposite trade when credit appeared expensive. We found that this strategy returned 0.84% annualized with a volatility of 3.89% from 1950 to 2019.
Unfortunately, our value signal generated significantly regime-dependent behavior with decade-long drawdowns. This not only causes us to question the statistical validity of the signal, but also the practicality of implementing it.
Fortunately, a naively constructed momentum signal provides ample diversification. While a combination strategy is still highly regime-driven, the drawdowns are significantly reduced. Not only do returns meaningfully improve compared to the stand-alone value signal, but the Sharpe ratio more-than-doubles.
Unfortunately, our study leveraged a long/short construction methodology. While this isolates the impact of active returns, long-only investors must cut return expectations of the strategy in half, as a tactical timing model can only half-implement this trade without leverage. A long-only switching strategy, then, would only be expected to generate approximately 0.5% annualized excess return above a 50% Dow Jones Corporate Bond Index / 50% 5-Year U.S. Treasury index portfolio.
And that’s before adjustments for hindsight bias, trading costs, and manager fees.
Nevertheless, more precise implementation may lead to better results. For example, our indices neither perfectly matched the credit spreads we evaluated, nor did they match each other’s durations. Furthermore, while this particular implementation may not survive costs, this signal may still provide meaningful information for other credit-based strategies.
Decomposing the Credit Curve
By Corey Hoffstein
On July 8, 2019
In Credit, Risk & Style Premia, Weekly Commentary
Summary
In this week’s research note, we continue our exploration of credit with a statistical decomposition of the credit spread curve. Just as the U.S. Treasury yield curve plots yields versus maturity, the credit spread curve plots excess yield versus credit quality, providing us insight into how much extra return we demand for the risks of declining credit quality.
Source: Federal Reserve of St. Louis; Bloomberg. Calculations by Newfound Research.
Our goal in analyzing the credit spread curve is to gain a deeper understanding of the principal drivers behind its changes. In doing so, we hope to potentially gain intuition and ideas for trading signals between low- and high-quality credit.
To begin our analysis, we must first construct our credit spread curve. We will use the following index data to represent our different credit qualities.
Unfortunately, we cannot simply plot the yield-to-worst for each index, as spread captures the excess yield. Which raises the question: excess to what? As we want to isolate the credit component of the yield, we need to remove the duration-equivalent Treasury rate.
Plotting the duration of each credit index over time, we can immediately see why incorporating this duration data will be important. Not only do durations vary meaningfully over time (e.g. Aaa durations varying between 4.95 and 11.13), but they also deviate across quality (e.g. Caa durations currently sit near 3.3 while Aaa durations are north of 11.1).
Source: Bloomberg.
To calculate our credit spread curve, we must first calculate the duration-equivalent Treasury bond yield for each index at each point in time. For each credit index at each point in time, we use the historical Treasury yield curve to numerically solve for the Treasury maturity that matches the credit index’s duration. We then subtract that matching rate from the credit index’s reported yield-to-worst to estimate the credit spread.
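The duration-matching step can be sketched as a simple root-finding problem. The version below makes several assumptions that the text does not specify: yields are linearly interpolated between curve points, each Treasury is treated as a semiannual-pay bond priced at par (so modified duration has a closed form), and the matching maturity is found by bisection. The curve, the target duration, and the 5.20% yield-to-worst are hypothetical.

```python
def interp_yield(curve, maturity):
    """Linearly interpolate a yield from (maturity_years, yield) points."""
    pts = sorted(curve)
    if maturity <= pts[0][0]:
        return pts[0][1]
    if maturity >= pts[-1][0]:
        return pts[-1][1]
    for (m0, y0), (m1, y1) in zip(pts, pts[1:]):
        if m0 <= maturity <= m1:
            return y0 + (y1 - y0) * (maturity - m0) / (m1 - m0)

def par_bond_duration(maturity, yld):
    """Modified duration of a semiannual-pay bond priced at par (yld > 0)."""
    return (1.0 - (1.0 + yld / 2.0) ** (-2.0 * maturity)) / yld

def duration_matched_yield(curve, target_duration, tol=1e-10):
    """Bisect on maturity until the par bond's duration hits the target."""
    maturities = [m for m, _ in curve]
    lo, hi = min(maturities), max(maturities)

    def gap(maturity):
        return par_bond_duration(maturity, interp_yield(curve, maturity)) - target_duration

    if gap(lo) > 0 or gap(hi) < 0:
        raise ValueError("target duration is outside the range spanned by the curve")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    maturity = 0.5 * (lo + hi)
    return maturity, interp_yield(curve, maturity)

# Hypothetical Treasury curve (maturity in years, yield as a decimal).
curve = [(1, 0.020), (2, 0.022), (5, 0.025), (10, 0.030), (30, 0.035)]
maturity, matched_yield = duration_matched_yield(curve, target_duration=8.0)
credit_spread = 0.052 - matched_yield  # hypothetical 5.20% yield-to-worst
print(f"maturity {maturity:.2f}y, yield {matched_yield:.4%}, spread {credit_spread:.4%}")
```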
We plot the spreads over time below.
Source: Federal Reserve of St. Louis; Bloomberg. Calculations by Newfound Research.
Statistical Decomposition: Eigen Portfolios
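With our credit spreads in hand, we can now attempt to extract the statistical drivers of change within the curve. One method of achieving this is to:
1. Calculate the period-over-period changes in the credit spreads.
2. Calculate the correlation matrix of those changes.
3. Calculate an eigenvalue decomposition of the correlation matrix.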
Stopping after just the first two steps, we can begin to see some interesting visual patterns emerge in the correlation matrix.
Step 3 might seem foreign for those unfamiliar with the technique, but in this context eigenvalue decomposition has an easy interpretation. The process will take our universe of credit indices and return a universe of statistically independent factor portfolios, where each portfolio is made up of a combination of credit indices.
As our eigenvalue decomposition was applied to the correlation matrix of credit spread changes, the factors will explain the principal vectors of variance in credit spread changes. We plot the weights of the first three factors below.
Source: Federal Reserve of St. Louis; Bloomberg. Calculations by Newfound Research.
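For readers who want to experiment, the decomposition of the correlation matrix of spread changes can be sketched with NumPy as follows. The data here are synthetic (a single common shock plus noise across six hypothetical indices), so only the mechanics carry over, not the weights or the variance explained reported in this note.

```python
import numpy as np

def eigen_portfolios(spreads):
    """Factor weights and variance explained from a (dates x indices) spread array."""
    changes = np.diff(spreads, axis=0)            # period-over-period spread changes
    corr = np.corrcoef(changes, rowvar=False)     # correlation across indices
    eigvals, eigvecs = np.linalg.eigh(corr)       # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]             # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Eigenvector signs are arbitrary; flip so each factor's weights sum >= 0.
    signs = np.where(eigvecs.sum(axis=0) < 0, -1.0, 1.0)
    return eigvecs * signs, eigvals / eigvals.sum()

# Synthetic spreads: one common shock plus idiosyncratic noise (not real data).
rng = np.random.default_rng(0)
n_dates, n_indices = 300, 6
shocks = rng.normal(size=(n_dates, 1)) + 0.5 * rng.normal(size=(n_dates, n_indices))
spreads = 2.0 + 0.01 * np.cumsum(shocks, axis=0)

weights, explained = eigen_portfolios(spreads)
print(np.round(weights[:, 0], 2))   # first factor: same-signed weights ("level")
print(np.round(explained, 3))
```

Each column of `weights` is one factor portfolio, and `explained` gives the share of total variance attributable to each factor.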
For anyone who has performed an eigenvalue decomposition on the yield curve before, three familiar components emerge.
We can see that Factor #1 applies nearly equal-weights across all the credit indices. Therefore, we label this factor “level” as it represents a level shift across the entire curve.
Factor #2 declines in weight from Aaa through Caa. Therefore, we label this factor “slope,” as it controls steepening and flattening of the credit curve.
Factor #3 appears as a barbell: negative weights in the wings and positive weights in the belly. Therefore, we call this factor “curvature,” as it will capture convexity changes in the curve.
Together, these three factors explain 80% of the variance in credit spread changes. Interestingly, the 4th factor – which brings variance explained up to 87.5% – also looks very much like a curvature trade, but places zero weight on Aaa and barbells Aa/Caa against A/Baa. We believe this serves as further evidence as to the unique behavior of Aaa credit.
Tracking Credit Eigen Portfolios
As we mentioned, each factor is constructed as a combination of exposure to our Aaa-Caa credit universe; in other words, they are portfolios! This means we can track their performance over time and see how these different trades behave in different market regimes.
To avoid overfitting and estimation risk, we decided to simplify the factor portfolios into more stylized trades, whose weights are plotted below (though ignore, for a moment, the actual weights, as they are meant only to represent relative weighting within the portfolio and not absolute level). Note that the Level trade has a cumulative positive weight while the Slope and Curvature trades sum to zero.
To actually implement these trades, we need to account for the fact that each credit index will have a different level of credit duration.
Akin to duration, which measures a bond’s sensitivity to interest rate changes, credit duration measures a bond’s sensitivity to changes in its credit spread. As with Treasuries, we need to adjust the weights of our trades to account for this difference in credit durations across our indices.
For example, if we want to place a trade that profits in a steepening of the Treasury yield curve, we might sell 10-year US Treasuries and buy 2-year US Treasuries. However, we would not buy and sell the same notional amount, as that would leave us with a significantly negative duration position. Rather, we would scale each leg such that their durations offset. In the end, this causes us to buy significantly more 2s than we sell 10s.
To continue, therefore, we must calculate credit spread durations.
Without this data on hand, we employ a statistical approach. Specifically, we take monthly total return data and subtract yield return and impact from interest rate changes (employing the duration-matched rates we calculated above). What is left over is an estimate of return due to changes in credit spreads. We then regress these returns against changes in credit spreads to calculate credit spread durations, which we plot below.
Source: Federal Reserve of St. Louis; Bloomberg. Calculations by Newfound Research.
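The statistical estimate of credit spread duration can be sketched as a two-step calculation: strip the yield and interest rate components out of total return, then regress the residual on spread changes. The helper names, the first-order rate approximation (minus duration times rate change, ignoring convexity), and the inclusion of an intercept are our assumptions; the numbers are invented so that the true answer is known.

```python
def credit_spread_return(total_return, yield_return, rate_duration, rate_change):
    """Residual return attributed to spread changes.
    Rate impact is approximated as -duration x change in the matched rate."""
    rate_impact = -rate_duration * rate_change
    return total_return - yield_return - rate_impact

def spread_duration(spread_returns, spread_changes):
    """Negative of the OLS slope (with intercept) of returns on spread changes."""
    n = len(spread_changes)
    mean_x = sum(spread_changes) / n
    mean_y = sum(spread_returns) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(spread_changes, spread_returns))
    var = sum((x - mean_x) ** 2 for x in spread_changes)
    return -cov / var

# One hypothetical month: carry of 0.4%, rates +10bps, spreads +20bps.
residual = credit_spread_return(
    total_return=-0.016, yield_return=0.004, rate_duration=6.0, rate_change=0.001
)

# Invented monthly spread changes with returns generated by a duration of 7.
changes = [0.0010, -0.0020, 0.0015, -0.0005, 0.0030]
returns = [-7.0 * c for c in changes]
print(residual, spread_duration(returns, changes))
```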
The results are a bit of a head scratcher. Unlike duration in the Treasury curve, which typically increases monotonically across maturities, we get a very different effect here. Aaa credit spread duration is 10.7 today while Caa credit spread duration is 2.8. How is that possible? Why is lower-quality credit not more sensitive to credit changes than higher-quality credit?
Here we run into a very interesting empirical result in credit spreads: spread change is proportional to spread level. Thus, a true “level shift” rarely occurs in the credit space; e.g. a 1bp change in the front-end of the credit spread curve may actually manifest as a 10bp change in the back end. Therefore, the lower credit spread duration of the back end of the curve is offset by larger changes.
There is some common-sense intuition to this effect. Credit has a highly non-linear return component: defaults. If we enter an economic environment where we expect an increase in default rates, it tends to happen in a non-linear fashion across the curve. To offset the larger increase in defaults in lower quality credit, investors will demand larger corresponding credit spreads.
(Side note: this is why we saw that the Baa–Aaa spread did not appear to mean-revert as cleanly as the log-difference of yields did in last week’s commentary, Value and the Credit Spread.)
While our credit spread durations may be correct, we still face a problem: weighting such that each index contributes equal credit spread duration will create an outsized weight to the Caa index.
DTS Scaling
Fortunately, some very smart folks thought about this problem many years ago. Recognizing the stability of relative spread changes, Dor, Dynkin, Hyman, Houweling, van Leeuwen, and Penninga (2007) recommend the measure of duration times spread (“DTS”) for credit risk.
With a more appropriate measure of credit sensitivity, we can now scale our stylized factor portfolio weights such that each position contributes an equal level of DTS. This will have two effects: (1) the relative weights in the portfolios will change over time, and (2) the notional size of the portfolios will change over time.
We scale each position such that (1) they contribute an equal level of DTS to the portfolio and (2) each leg of the portfolio has a total DTS of 500bps. The Level trade, therefore, represents a constant 500bps of DTS risk over time, while the Slope and Curvature trades represent 0bps, as the long and short legs net out.
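The scaling rule can be sketched as follows. DTS is computed as spread duration times spread, and each position in a leg is sized so that it contributes an equal share of a 500bps leg total. The spread durations and spreads below are hypothetical, and the two-index legs are only an illustration of a slope-style trade, not the stylized weights used in this note.

```python
def dts(spread_duration, spread_bps):
    """Duration times spread, in bps of DTS per unit of notional."""
    return spread_duration * spread_bps

def scale_leg(leg, leg_dts_bps=500.0):
    """Notional weights so each index contributes equal DTS and the leg totals
    `leg_dts_bps`. `leg` maps name -> (spread duration, spread in bps)."""
    per_position = leg_dts_bps / len(leg)
    return {name: per_position / dts(dur, spr) for name, (dur, spr) in leg.items()}

def leg_dts(leg, weights):
    """Total DTS contributed by a leg at the given notional weights."""
    return sum(weights[name] * dts(dur, spr) for name, (dur, spr) in leg.items())

# Hypothetical (spread duration, spread in bps) for each index.
high_quality = {"Aaa": (10.0, 50.0), "Aa": (7.0, 70.0)}
low_quality = {"B": (4.0, 400.0), "Caa": (3.0, 800.0)}

hq_weights = scale_leg(high_quality)
lq_weights = scale_leg(low_quality)

# One leg held long and the other short: the net DTS is zero either way.
net_dts = leg_dts(high_quality, hq_weights) - leg_dts(low_quality, lq_weights)
print(hq_weights, lq_weights, net_dts)
```

Because spreads and spread durations move over time, re-running the scaling each period changes both the relative weights and the total notional, which are the two effects noted above.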
One problem still remains: interest rate risk. As we plotted earlier in this piece, the credit indices have time-varying – and sometimes substantial – interest rate exposure. This creates an unintended bet within our portfolios.
Fortunately, unlike the credit curve, true level shift does empirically apply in the Treasury yield curve. Therefore, to simplify matters, we construct a 5-year zero-coupon bond, which provides us with a constant duration instrument. At each point in time, we calculate the net duration of our credit trades and use the 5-year ZCB to neutralize the interest rate risk. For example, if the Level portfolio has a duration of 1, we would take a -20% notional position in the 5-year ZCB.
Source: Federal Reserve of St. Louis; Bloomberg. Calculations by Newfound Research.
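The interest rate hedge reduces to a one-line calculation, sketched here under the assumption that the zero-coupon bond's duration equals its maturity. The weights and durations in the example are hypothetical and chosen so the net duration is 1, matching the case described above.

```python
def net_duration(weights, durations):
    """Notional-weighted interest rate duration of the credit positions."""
    return sum(weights[name] * durations[name] for name in weights)

def zcb_hedge_notional(portfolio_duration, zcb_maturity=5.0):
    """Notional in the zero-coupon bond that offsets the portfolio's duration.
    Assumes the ZCB's duration equals its maturity."""
    return -portfolio_duration / zcb_maturity

# Hypothetical notional weights and interest rate durations.
weights = {"Aaa": 0.05, "Baa": 0.10}
durations = {"Aaa": 10.0, "Baa": 5.0}

duration = net_duration(weights, durations)
hedge = zcb_hedge_notional(duration)
print(duration, hedge)
```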
Some things we note when evaluating the portfolios over time:
Conclusion
The fruit of all our labor is the graph plotted below, which shows the growth of $1 in our constant DTS, stylized credit factor portfolios.
What can we see?
First and foremost, constant credit exposure has not provided much in the last 25 years until recently. It would appear that investors did not demand a high enough premium for the risks that were realized over the period, which include the 1998 LTCM blow-up, the burst of the dot-com bubble, and the 2008 recession.
From 12/31/2008 lows through Q1 2019, however, a constant 500bps DTS exposure generated a 2.0% annualized return with 2.4% annualized volatility, reflecting a nice annual premium for investors willing to bear the credit risk.
Slope captures the high-versus-low-quality trade. We can see that junk meaningfully out-performed quality in the 1990s, after which there really did not appear to be a meaningful difference in performance until 2013 when oil prices plummeted and high yield bond prices collapsed. This result does highlight a potential problem in our analysis: the difference in sector composition of the underlying indices. High yield bonds had an outsized reaction compared to higher quality investment grade credit due to more substantial exposure to the energy sector, leading to a lop-sided reaction.
What is also interesting about the Slope trade is that the market did not seem to price a meaningful premium for holding low-quality credit over high-quality credit.
Finally, we can see that the Curvature (“barbell versus belly”) trade was rather profitable for the first decade, before deflating pre-2008 and going on a mostly-random walk ever since. However, as mentioned when the curvature trade was initially introduced, the 4th factor in our decomposition also appeared to reflect a similar trade, but one that shorts Aa and Caa versus a long position in A and Baa. This trade has been a fairly consistent money-loser since the early 2000s, indicating that a barbell of high quality (just not Aaa) and junk might do better than the belly of the curve.
It is worth pointing out that these trades represent a significant amount of compounding estimation – from duration-matching Treasury rates to credit spread durations – which also means a significant risk of compounding estimation error. Nevertheless, we believe there are a few takeaways worth exploring further: