Summary
- Bond timing has been difficult for the past 35 years as interest rates have declined, especially since bonds started the period with high coupons.
- With low current rates and higher durations, the stage may be set for systematic, factor-based bond investing.
- Strategies such as value, momentum, and carry have done well historically, especially on a risk-adjusted basis.
- Diversifying across these three strategies and employing prudent leverage takes advantage of differences in the processes and the information contained in their joint decisions.
This commentary is a slight revisit of and update to a commentary we wrote last summer, Duration Timing with Style Premia[1]. The models we use here are similar in nature but have been updated with further details and discussion, warranting a new piece.
Historically Speaking, This is a Bad Idea
Let’s just get this out of the way up front: the results of this study are probably not going to look great.
Since interest rates peaked in September 1981, the excess return of a constant maturity 10-year U.S. Treasury bond index has been 3.6% annualized with only 7.3% volatility and a maximum drawdown of 16.4%. In other words, about as close to a straight line up and to the right as you can get.
Source: Federal Reserve of St. Louis. Calculations by Newfound Research.
With the benefit of hindsight, this makes sense. As we demonstrated in Did Declining Rates Actually Matter?[2], the vast majority of bond index returns over the last 30+ years have been a result of the high average coupon rate. High average coupons kept duration suppressed, meaning that changes in rates produced less volatile movements in bond prices.
Source: Federal Reserve of St. Louis. Calculations by Newfound Research.
Ultimately, we estimate that roll return and benefits from downward shifts in the yield curve only accounted for approximately 30% of the annualized return.
Put another way, whenever you got “out” of bonds over this period, there was a very significant opportunity cost you were experiencing in terms of foregone interest payments, which accounted for 70% of the total return.
If we use this excess return as our benchmark, we’ve made the task nearly impossible for ourselves. Consider that if we are making “in or out” tactical decisions (i.e. no leverage or shorting) and our benchmark is fully invested at all times, we can only outperform due to our “out” calls. Relative to the long-only benchmark, we get no credit for correct “in” calls since correct “in” calls mean we are simply keeping up with the benchmark. (Note: Broadly speaking, this highlights the problems with applying traditional benchmarks to tactical strategies.) In a period of consistently positive returns, our “out” calls must be very accurate, in fact probably unrealistically accurate, to be able to outperform.
When you put this all together, we’re basically asking, “Can you create a tactical strategy that can only outperform based upon its calls to get out of the market over a period of time when there was never a good time to sell?”
The answer, barring some serious data mining, is probably, “No.”
This Might Now be a Good Idea
Yet this idea might have legs.
Since the 10-year rate peaked in 1981, the duration of a constant maturity 10-year U.S. bond index has climbed from 4.8 to 8.7. In other words, bonds are now roughly 1.8x as sensitive to changes in interest rates as they were 35 years ago.
If we decompose bond returns in the post-crisis era, we can see that shifts in the yield curve have played a large role in year-to-year performance. The simple intuition is that as coupons get smaller, they are less effective as cushions against rate volatility.
Higher durations and lower coupons are a potential double whammy when it comes to fixed income volatility.
Source: Federal Reserve of St. Louis. Calculations by Newfound Research.
With rates low and durations high, strategies like value, momentum, and carry may afford us more risk-managed access to fixed income.
Timing Bonds with Value
Following the standard approach taken in most literature, we will use real yields as our measure of value. Specifically, we will estimate real yield by taking the current 10-year U.S. Treasury rate minus the 10-year forecasted inflation rate from Philadelphia Federal Reserve’s Survey of Professional Forecasters.[3]
To come up with our value timing signal, we will compare real yield to a 3-year exponentially weighted average of real yield.
Here we need to be a bit careful. With a secular decline in real yields over the last 30 years, comparing current real yield against a trailing average of real yield will almost surely lead to an overvalued conclusion, as the trailing average will likely be higher.
Thus, we need to de-trend twice. We first subtract real yield from the trailing average, and then subtract this difference from a trailing average of differences. Note that if there is no secular change in real yields over time, this second step should have zero impact. What this is measuring is the deviation of real yields relative to any linear trend.
After both of these steps, we are left with an estimate of how far our real rates are away from fair value, where fair value is defined by our particular methodology rather than any type of economic analysis. When real rates are below our fair value estimate, we believe they are overvalued and thus expect rates to go up. Similarly, when rates are above our fair value estimate, we believe they are undervalued and thus expect them to go down.
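For concreteness, the double de-trending procedure can be sketched in a few lines of code. This is a minimal illustration, assuming a monthly real-yield series and a 3-year (36-month) exponentially weighted window; the function and parameter names are ours, not a description of a production implementation.

```python
import pandas as pd

def value_signal(real_yield: pd.Series, span: int = 36) -> pd.Series:
    """Double de-trended value score from a monthly real-yield series.

    Positive score: real yields sit above fair value, i.e. bonds look
    undervalued and we expect rates to fall. Negative score: overvalued.
    """
    # Step 1: subtract real yield from its trailing 3-year exponentially
    # weighted average.
    difference = real_yield.ewm(span=span).mean() - real_yield

    # Step 2: subtract this difference from a trailing average of
    # differences; absent any secular drift in real yields, this second
    # step has zero impact.
    return difference.ewm(span=span).mean() - difference
```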
Source: Federal Reserve of St. Louis. Philadelphia Federal Reserve. Calculations by Newfound Research.
Before we can use this valuation measure as our signal, we need to take one more step. In the graph above, we see that the deviation from fair value in September 1993 was approximately the same as it was in June 2003: -130bps (implying that rates were 130bps below fair value and therefore bonds were overvalued). However, in 1993, rates were at about 5.3%, while in 2003 rates were closer to 3.3%. Furthermore, duration was about 0.5 higher in 2003 than it was in 1993.
In other words, a -130bps deviation from fair value does not mean the same thing in all environments.
One way of dealing with this is by forecasting the actual bond return over the next 12 months, including any coupons earned, by assuming real rates revert to fair value (and taking into account any roll benefits due to yield curve steepness). This transformation leaves us with an actual forecast of expected return.
We need to be careful, however, as our question of whether to invest or not is not simply based upon whether the bond index has a positive expected return. Rather, it is whether it has a positive expected return in excess of our alternative investment. In this case, that is “cash.” Here, we will proxy cash with a constant maturity 1-year U.S. Treasury index.
Thus, we need to net out the expected return from the 1-year position, which is just its yield.[4]
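Putting the pieces together, the forecast might be sketched as follows. This is a stylized, duration-only approximation (it ignores convexity), and the input names are illustrative assumptions rather than our exact methodology.

```python
def expected_excess_return(y10: float, dur10: float, fair_gap: float,
                           roll: float, y1: float) -> float:
    """Stylized 12-month excess return forecast for the 10-year index.

    y10      -- current 10-year yield (coupon income over the year)
    dur10    -- duration of the 10-year index
    fair_gap -- real-yield deviation from fair value; positive = undervalued
    roll     -- expected roll return from yield-curve steepness
    y1       -- 1-year yield, the expected return on our cash proxy
    """
    # Price gain if real rates revert to fair value: duration times the gap.
    revaluation = dur10 * fair_gap

    # Coupon plus revaluation plus roll, net of the cash alternative.
    return y10 + revaluation + roll - y1
```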
Source: Federal Reserve of St. Louis. Philadelphia Federal Reserve. Calculations by Newfound Research.
While the differences here are subtle, had our alternative position been something like a 5-year U.S. Treasury Index, we may see much larger swings as the impact of re-valuation and roll can be much larger.
Using this total expected return, we can create a simple timing model that goes long the 10-year index and short cash when expected excess return is positive and short the 10-year index and long cash when expected excess return is negative. As we are forecasting our returns over a 1-year period, we will employ a 1-year hold with 52 overlapping portfolios to mitigate the impact of timing luck.
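Mechanically, the overlapping portfolios simply average the most recent 52 weekly signals: each week, 1/52nd of capital is rebalanced to the current +1/-1 signal and held for a year. A minimal sketch, assuming a weekly signal series:

```python
import pandas as pd

def overlapping_weight(signal: pd.Series, n_tranches: int = 52) -> pd.Series:
    """Aggregate weight from n overlapping tranches of a +1/-1 weekly signal.

    Because each tranche holds its position for n_tranches weeks, the
    portfolio's net weight is just the trailing average of recent signals,
    which smooths the strategy and mitigates timing luck.
    """
    return signal.rolling(n_tranches).mean()
```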
We plot the results of the strategy below.
Source: Federal Reserve of St. Louis. Philadelphia Federal Reserve. Calculations by Newfound Research. Results are hypothetical and backtested. Past performance is not a guarantee of future results. Returns are gross of all fees (including management fees, transaction costs, and taxes). Returns assume the reinvestment of all income and distributions.
The value strategy return matches the 10-year index excess return nearly exactly (2.1% vs 2.0%) with just 70% of the volatility (5.0% vs 7.3%) and 55% of the max drawdown (19.8% versus 36.2%).
Timing Bonds with Momentum
For all the hoops we had to jump through with value, the momentum strategy will be fairly straightforward.
We will simply look at the trailing 12-1 month total return of the index versus the alternative (e.g. the 10-year index vs. the 1-year index) and invest in the security that has outperformed and short the other. For example, if the 12-1 month total return for the 10-year index exceeds that of the 1-year index, we will go long the 10-year and short the 1-year, and vice versa.
Since momentum tends to decay quickly, we will use a 1-month holding period, implemented with four overlapping portfolios.
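A minimal sketch of the signal, assuming monthly total return index levels for each leg (the 12-1 construction skips the most recent month):

```python
import pandas as pd

def momentum_signal(tr_10y: pd.Series, tr_1y: pd.Series) -> pd.Series:
    """12-1 month momentum: +1 = long 10-year / short 1-year, -1 = reverse."""
    # Trailing 12-month return, skipping the most recent month.
    mom_10y = tr_10y.shift(1) / tr_10y.shift(12) - 1
    mom_1y = tr_1y.shift(1) / tr_1y.shift(12) - 1

    # Long whichever leg has outperformed, short the other.
    return (mom_10y > mom_1y).astype(float) * 2 - 1
```

The four overlapping portfolios would then average this signal in the same manner as the value strategy's tranches.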
Source: Federal Reserve of St. Louis. Philadelphia Federal Reserve. Calculations by Newfound Research. Results are hypothetical and backtested. Past performance is not a guarantee of future results. Returns are gross of all fees (including management fees, transaction costs, and taxes). Returns assume the reinvestment of all income and distributions.
(Note that this backtest starts earlier than the value backtest because it only requires 12 months of returns to create a trading signal versus 6 years of data – 3 for the value anchor and 3 to de-trend the data – for the value score.)
Compared to the buy-and-hold approach, the momentum strategy increases annualized return by 0.5% (1.7% versus 1.2%) while closely matching volatility (6.7% versus 6.9%) and having less than half the drawdown (20.9% versus 45.7%).
Of course, it cannot be ignored that the momentum strategy has largely gone sideways since the early 1990s. In contrast to how we created our bottom-up value return expectation, this momentum approach is a very blunt instrument. In fact, using momentum this way means that returns due to differences in yield, roll yield, and re-valuation are all captured simultaneously. We can really think of decomposing our momentum signal as:
10-Year Return – 1-Year Return = (10-Year Yield – 1-Year Yield) + (10-Year Roll – 1-Year Roll) + (10-Year Shift – 1-Year Shift)
Our momentum score is indiscriminately assuming momentum in all the components. Yet when we actually go to put on our trade, we do not need to assume momentum will persist in the yield and roll differences: we have enough data to measure them explicitly.
With this framework, we can isolate momentum in the shift component by removing yield and roll return expectations from total returns.
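As a sketch (again assuming monthly series; the variable names are hypothetical, with the yield and roll series representing each index's monthly income and roll return components), we strip the measurable components out of realized returns before computing momentum:

```python
import pandas as pd

def shift_momentum_signal(ret_10y: pd.Series, yld_10y: pd.Series,
                          roll_10y: pd.Series, ret_1y: pd.Series,
                          yld_1y: pd.Series) -> pd.Series:
    """Momentum computed only on the curve-shift component of returns."""
    # Remove the components we can measure directly; the 1-year index's
    # roll is assumed to be approximately zero over our horizon.
    shift_10y = ret_10y - yld_10y - roll_10y
    shift_1y = ret_1y - yld_1y

    # Compound the shift components and compare 12-1 month momentum.
    cum_10y = (1 + shift_10y).cumprod()
    cum_1y = (1 + shift_1y).cumprod()
    mom_10y = cum_10y.shift(1) / cum_10y.shift(12) - 1
    mom_1y = cum_1y.shift(1) / cum_1y.shift(12) - 1
    return (mom_10y > mom_1y).astype(float) * 2 - 1
```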
Source: Federal Reserve of St. Louis. Calculations by Newfound Research.
Ultimately, the difference in signals is minor for our use of 10-year versus 1-year, though it may be far less so in cases like trading the 10-year versus the 5-year. The actual difference in resulting performance, however, is more pronounced.
Source: Federal Reserve of St. Louis. Philadelphia Federal Reserve. Calculations by Newfound Research. Results are hypothetical and backtested. Past performance is not a guarantee of future results. Returns are gross of all fees (including management fees, transaction costs, and taxes). Returns assume the reinvestment of all income and distributions.
Ironically, by doing worse mid-period, the adjusted momentum long/short strategy appears to be more consistent in its return from the early 1990s through present. We’re certain this is more noise than signal, however.
Timing Bonds with Carry
Carry is the return we earn by simply holding the investment, assuming everything else stays constant. For a bond, this would be the yield-to-maturity. For a constant maturity bond index, this would be the coupon yield (assuming we purchase our bonds at par) plus any roll yield we capture.
Our carry signal, then, will simply be the difference in yields between the 10-year and 1-year rates plus the difference in expected roll return.
For simplicity, we will assume roll over a 1-year period, which makes the expected roll of the 1-year bond zero. Thus, this really becomes, more or less, a signal to be long the 10-year when the yield curve is positively sloped, and long the 1-year when it is negatively sloped.
As we are forecasting returns over the next 12-month period, we will use a 12-month holding period and implement with 52 overlapping portfolios.
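A stylized sketch of the carry calculation follows. Over a 1-year horizon, the 10-year bond ages into a 9-year bond, so its expected roll can be approximated by the 9-year duration times the 10s-9s yield spread; the 9-year inputs here are illustrative assumptions.

```python
def carry_signal(y10: float, y1: float, y9: float, dur9: float) -> int:
    """+1 = long 10-year / short 1-year when carry is positive, else -1.

    y10, y1 -- current 10-year and 1-year yields
    y9      -- current 9-year yield (what the 10-year rolls down to)
    dur9    -- duration at the 9-year point
    """
    roll = dur9 * (y10 - y9)      # price gain from rolling down the curve
    carry = (y10 - y1) + roll     # yield pickup plus expected roll
    return 1 if carry > 0 else -1
```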
Source: Federal Reserve of St. Louis. Philadelphia Federal Reserve. Calculations by Newfound Research. Results are hypothetical and backtested. Past performance is not a guarantee of future results. Returns are gross of all fees (including management fees, transaction costs, and taxes). Returns assume the reinvestment of all income and distributions.
Again, were we comparing the 10-year versus the 5-year instead of the 10-year versus the 1-year, the roll can have a large impact. If the curve is fairly flat between the 5- and 10-year rates, but gets steep between the 5- and the 1-year rates, then the roll expectation from the 5-year can actually overcome the yield difference between the 5- and the 10-year rates.
Building a Portfolio of Strategies
With three separate methods of timing bonds, we can likely benefit from process diversification by constructing a portfolio of the approaches. The simplest method is to give each strategy an equal share. Below we plot the results.
Source: Federal Reserve of St. Louis. Philadelphia Federal Reserve. Calculations by Newfound Research. Results are hypothetical and backtested. Past performance is not a guarantee of future results. Returns are gross of all fees (including management fees, transaction costs, and taxes). Returns assume the reinvestment of all income and distributions.
Indeed, by looking at per-strategy performance, we can see a dramatic jump in Information Ratio and an exceptional reduction in maximum drawdown. In fact, the maximum drawdown of the equal weight approach is below that of any of the individual strategies, highlighting the potential benefit of diversifying away conflicting investment signals.
| Strategy | Annualized Return | Annualized Volatility | Information Ratio | Max Drawdown |
| --- | --- | --- | --- | --- |
| 10-Year Index Excess Return | 2.0% | 7.3% | 0.27 | 36.2% |
| Value L/S | 2.0% | 5.0% | 0.41 | 19.8% |
| Momentum L/S | 1.9% | 6.9% | 0.27 | 20.9% |
| Carry L/S | 2.5% | 6.6% | 0.38 | 20.1% |
| Equal Weight | 2.3% | 4.0% | 0.57 | 10.2% |
Source: Federal Reserve of St. Louis. Philadelphia Federal Reserve. Calculations by Newfound Research. Results are hypothetical and backtested. Past performance is not a guarantee of future results. Returns are gross of all fees (including management fees, transaction costs, and taxes). Returns assume the reinvestment of all income and distributions. Performance measured from 6/1974 to 1/2018, representing the full overlapping investment period of the strategies.
One potential way to improve upon the portfolio construction is by taking into account the actual covariance structure among the strategies (correlations shown in the table below). We can see that, historically, momentum and carry have been fairly positively correlated while value has been independent, if not slightly negatively correlated. Therefore, an equal-weight approach may not be taking full advantage of the diversification opportunities presented.
| | Value L/S | Momentum L/S | Carry L/S |
| --- | --- | --- | --- |
| Value L/S | 1.0 | -0.2 | -0.1 |
| Momentum L/S | -0.2 | 1.0 | 0.6 |
| Carry L/S | -0.1 | 0.6 | 1.0 |
To avoid making any assumptions about the expected returns of the strategies, we will construct a portfolio where each strategy contributes equally to the overall risk profile (“ERC”). So as to avoid look-ahead bias, we will use an expanding window to compute our covariance matrix (seeding with at least 5 years of data). While the weights vary slightly over time, the result is a portfolio where the average weights are 43% value, 27% momentum, and 30% carry.
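For reference, ERC weights can be derived from a covariance matrix via a standard log-barrier formulation: minimizing 1/2 w'Cw - (1/n) sum(log w) equalizes each strategy's risk contribution. A minimal sketch, not our exact optimizer:

```python
import numpy as np
from scipy.optimize import minimize

def erc_weights(cov: np.ndarray) -> np.ndarray:
    """Equal risk contribution weights for a given covariance matrix.

    Minimizing 0.5 * w'Cw - (1/n) * sum(log(w)) yields w_i * (Cw)_i equal
    for all i (equal risk contributions); we then normalize to sum to 1.
    """
    n = cov.shape[0]

    def objective(w: np.ndarray) -> float:
        return 0.5 * w @ cov @ w - np.log(w).sum() / n

    res = minimize(objective, np.full(n, 1.0 / n),
                   bounds=[(1e-8, None)] * n)
    w = res.x
    return w / w.sum()
```

In our backtest, the covariance matrix feeding this calculation is estimated on an expanding window, seeded with at least 5 years of data, to avoid look-ahead bias.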
The ERC approach matches the equal-weight approach in annualized return, but reduces annualized volatility from 4.2% to 3.8%, thereby increasing the information ratio from 0.59 to 0.64. The maximum drawdown also falls from 10.2% to 8.7%.
A second step we can take is to try to use the “collective intelligence” of the strategies to set our risk budget. For example, we can have our portfolio target the long-term volatility of the 10-year Index Excess Return, but scale this target between 0-2x depending on how invested we are.
For example, if the strategies are, in aggregate, only 20% invested, then our target volatility would be 0.4x that of the long-term volatility. If they are 100% invested, though, then we would target 2x the long-term volatility. When the strategies are providing mixed signals, we will simply target the long-term volatility level.
Unfortunately, such an approach requires going beyond 100% notional exposure, often requiring 2x – if not 3x – leverage when current volatility is low. That makes this system less useful in the context of “bond timing” since we are now placing a bet on current volatility remaining constant and saying that our long-term volatility is an appropriate target.
One way to limit the leverage is to increase how much we are willing to scale our risk target, but truncate our notional exposure at 100% per leg. For example, we can scale our risk target between 0-4x. This may seem very risky (indeed, an asymmetric bet), but since we are clamping our notional exposure to 100% per leg, we should recognize that we will only hit that risk level if current volatility is greater than 4x that of the long-term average and all the strategies recommend full investment.
With a little mental arithmetic, this approach is equivalent to saying: “multiply the weights by 4x and then scale based on current volatility relative to historical volatility.” By clamping weights between -100% and +100%, the volatility targeting really does not come into play until current volatility is 4x that of long-term volatility. In effect, we leg into our trades more quickly, but de-risk when volatility spikes to abnormally high levels.
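The rule can be sketched as below. Here `raw_weight` is the aggregate signal of the three strategies (between -1 and +1), and the 4x scale and per-leg clamp are the choices described above; this is a simplified illustration, not our exact implementation.

```python
import numpy as np

def scaled_weight(raw_weight: float, target_vol: float,
                  current_vol: float, max_scale: float = 4.0) -> float:
    """Volatility-scaled timing weight, clamped to +/-100% per leg.

    Equivalent to multiplying the raw weight by max_scale, scaling by the
    ratio of long-term to current volatility, and truncating at 100%, so
    the volatility target only binds when current volatility exceeds
    max_scale times its long-term level.
    """
    desired_vol = abs(raw_weight) * max_scale * target_vol
    notional = desired_vol / current_vol
    return float(np.sign(raw_weight)) * min(notional, 1.0)
```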
Source: Federal Reserve of St. Louis. Philadelphia Federal Reserve. Calculations by Newfound Research. Results are hypothetical and backtested. Past performance is not a guarantee of future results. Returns are gross of all fees (including management fees, transaction costs, and taxes). Returns assume the reinvestment of all income and distributions.
Compared to the buy-and-hold model, the variable risk ERC model increases annualized returns by 90bps (2.4% to 3.3%), reduces volatility by 260bps (7.6% to 5.0%), doubles the information ratio (0.31 to 0.66) and halves the maximum drawdown (30% to 15%).
There is no magic to the choice of “4” above: it is just an example. In general, we can say that as the number goes higher, the strategy will approach a binary in-or-out system and the volatility scaling will have less and less impact.
Conclusion
Bond timing has been hard for the past 35 years as interest rates have declined. Small current coupons do not provide nearly the cushion against rate volatility that investors are used to, and lower rates also mean that bonds carry higher durations.
These two factors are a potential double whammy when it comes to fixed income volatility.
This can open the door for systematic, factor-based bond investing.
Value, momentum, and carry strategies have all historically outperformed a buy-and-hold bond strategy on a risk adjusted basis despite the bond bull market. Diversifying across these three strategies and employing prudent leverage takes advantage of differences in the processes and the information contained in their joint decisions.
We should point out that in the application of this approach, there were multiple periods of time in the backtest where the strategy went years without being substantially invested. A smooth, nearly 40-year equity curve tells us very little about what it is actually like to sit on the sidelines during these periods and we should not underestimate the emotional burden of using such a timing strategy.
Even with low rates and high rate movement sensitivity, bonds can still play a key role within a portfolio. Going forward, however, it may be prudent for investors to consider complementary risk-management techniques within their bond sleeve.
[1] https://blog.thinknewfound.com/2017/06/duration-timing-style-premia/
[2] https://blog.thinknewfound.com/2017/04/declining-rates-actually-matter/
[3] Prior to the availability of the 10-year inflation estimate, the 1-year estimate is utilized; prior to the 1-year inflation estimate availability, the 1-year GDP price index estimate is utilized.
[4] This is not strictly true, as it largely depends on how the constant maturity indices are constructed. For example, if they are rebalanced on a monthly basis, we would expect that re-valuation and roll would have impact on the 1-year index return. We would also have to alter the horizon we are forecasting over as we are assuming we are rolling into new bonds (with different yields) more frequently.
Factor Fimbulwinter
By Corey Hoffstein
On June 11, 2018
In Norse mythology, Fimbulvetr (commonly referred to in English as “Fimbulwinter”) is a great and seemingly never-ending winter. It continues for three seasons – long, horribly cold years that stretch on longer than normal – with no intervening summers. It is a time of bitterly cold, sunless days where hope is abandoned and discord reigns.
This winter-to-end-all-winters is eventually punctuated by Ragnarok, a series of events leading up to a great battle that results in the ultimate death of the major gods, destruction of the cosmos, and subsequent rebirth of the world.
Investment mythology is littered with Ragnarok-styled blow-ups and we often assume the failure of a strategy will manifest as sudden catastrophe. In most cases, however, failure may more likely resemble Fimbulwinter: a seemingly never-ending winter in performance with returns blown to-and-fro by the harsh winds of randomness.
Value investors can attest to this. In particular, the disciples of price-to-book have suffered greatly as of late, with “expensive” stocks having outperformed “cheap” stocks for over a decade. The academic interpretation of the factor sits nearly 25% below its prior high-water mark, seen in December 2006.
Predictably, a large number of articles have been written about the death of the value factor. Some question the factor itself, while others simply argue that price-to-book is a broken implementation.
But are these simply retrospective narratives, driven by a desire to explain a result that has defied our expectations? Consider: if price-to-book had exhibited positive returns over the last decade, would nearly as many investors be explaining why it is no longer a relevant metric?
To be clear, we believe that many of the arguments proposed for why price-to-book is no longer a relevant metric are quite sound. The team at O’Shaughnessy Asset Management, for example, wrote a particularly compelling piece that explores how changes to accounting rules have led book value to become a less relevant metric in recent decades.[1]
Nevertheless, we think it is worth taking a step back, considering an alternate course of history, and asking ourselves how it would impact our current thinking. Often, we look back on history as if it were the obvious course. “If only we had better prior information,” we say to ourselves, “we would have predicted the path!”[2] Rather, we find it more useful to look at the past as just one realized path of the many that could have happened, none of which were preordained. Randomness happens.
With this line of thinking, the poor performance of price-to-book can just as easily be explained by a poor roll of the dice as it can be by a fundamental break in applicability. In fact, several potential truths are consistent with performance over the last decade: the factor may never have worked, with its past success being mere luck; the factor may have worked in the past but is now broken; or the factor may still work, with the last decade being simply an unlucky draw.
The problem at hand is two-fold: (1) the statistical evidence supporting most factors is considerable and (2) the decade-to-decade variance in factor performance is substantial. Taken together, you run into a situation where a mere decade of underperformance likely cannot undo the previously established significance. Just as frustrating is the opposite scenario. Consider that these two statements are not mutually exclusive: (1) price-to-book is broken, and (2) price-to-book generates positive excess return over the next decade.
In investing, factor return variance is large enough that the proof is not in the eating of the short-term return pudding.
The small-cap premium is an excellent example of the difficulty in discerning, in real time, the integrity of an established factor. The anomaly has failed to establish a meaningful new high since it was originally published in 1981. Only in the last decade – nearly 30 years later – have the tides of the industry finally seemed to turn against it as an established anomaly and potential source of excess return.
Thirty years.
The remaining broadly accepted factors – e.g. value, momentum, carry, defensive, and trend – have all been demonstrated to generate excess risk-adjusted returns across a variety of economic regimes, geographies, and asset classes, creating a great depth of evidence supporting their existence. What evidence, then, would make us abandon faith from the Church of Factors?
To explore this question, we ran a simple experiment for each factor. Our goal was to estimate how long it would take to determine that a factor was no longer statistically significant.
Our assumption is that the salient features of each factor’s return pattern will remain the same (i.e. autocorrelation, conditional heteroskedasticity, skewness, kurtosis, et cetera), but the forward average annualized return will be zero since the factor no longer “works.”
Towards this end, we ran the following experiment:
1. De-mean the factor’s historical monthly returns, so that the factor no longer “works” while the other salient features of the series are preserved.
2. Simulate forward returns by randomly sampling 12-month periods from this de-meaned history.
3. After each simulated year, re-evaluate the statistical significance of the factor over its full (realized plus simulated) history.
4. Record how many years pass until the factor is no longer statistically significant.
For each factor, we ran this test 10,000 times, creating a distribution that tells us how many years into the future we would have to wait until we were certain, from a statistical perspective, that the factor is no longer significant.
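A simplified sketch of one simulation path is below. For brevity, it substitutes a plain t-statistic threshold for the Bayesian inference procedure we actually employed; the structure (de-mean, sample 12-month blocks, re-test annually) is the same, and the parameter names are illustrative.

```python
import numpy as np

def years_until_insignificant(monthly_returns: np.ndarray,
                              max_years: int = 200,
                              crit_t: float = 2.0,
                              seed=None) -> int:
    """One simulated path of a 'dead' factor: how many years until the
    full return history loses statistical significance?"""
    rng = np.random.default_rng(seed)

    # The factor no longer "works": zero expected return going forward,
    # but salient features (autocorrelation, fat tails) are preserved by
    # sampling actual de-meaned 12-month blocks of history.
    demeaned = monthly_returns - monthly_returns.mean()
    history = list(monthly_returns)

    for year in range(1, max_years + 1):
        start = rng.integers(0, len(demeaned) - 12)
        history.extend(demeaned[start:start + 12])  # append one simulated year

        x = np.asarray(history)
        t_stat = x.mean() / (x.std(ddof=1) / np.sqrt(len(x)))
        if t_stat < crit_t:         # no longer statistically significant
            return year
    return max_years
```

Running this function 10,000 times per factor produces the distribution whose median we report.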
Sixty-seven years.
Based upon this experiment, sixty-seven years is the median number of years we will have to wait until we officially declare price-to-book (“HML,” as it is known in the literature) to be dead.[3] At the risk of being morbid, we’re far more likely to die before the industry finally sticks a fork in price-to-book.
We perform this experiment for a number of other factors – including size (“SMB” – “small-minus-big”), quality (“QMJ” – “quality-minus-junk”), low-volatility (“BAB” – “betting-against-beta”), and momentum (“UMD” – “up-minus-down”) – and see much the same result. It will take decades before sufficient evidence mounts to dethrone these factors.
Now, it is worth pointing out that these figures for a factor like momentum (“UMD”) might be a bit skewed due to the design of the test. If we examine the long-run returns, we see a fairly docile return profile punctuated by sudden and significant drawdowns (often called “momentum crashes”).
Since a large proportion of the cumulative losses are contained in these short but pronounced drawdown periods, demeaning the time-series ultimately means that the majority of 12-month periods actually exhibit positive returns. In other words, by selecting random 12-month samples, we actually expect a high frequency of those samples to have a positive return.
For example, using this process, 49.1%, 47.6%, 46.7%, and 48.8% of rolling 12-month periods are positive for the HML, SMB, QMJ, and BAB factors, respectively. For UMD, that number is 54.7%. Furthermore, if you drop the worst 5% of rolling 12-month periods for UMD, the average positive period is 1.4x larger than the average negative period. Taken together, not only are you more likely to select a positive 12-month period, but outside of the rare (<5%) crash periods, the positive periods you pick will be, on average, 1.4x larger than the negative ones.
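These figures can be reproduced, at least in spirit, with a few lines applied to a de-meaned monthly factor series. This is a hypothetical sketch, using summed rather than compounded 12-month returns:

```python
import pandas as pd

def rolling_12m_asymmetry(demeaned: pd.Series) -> tuple:
    """Share of positive rolling 12-month returns, and the size of the
    average positive period relative to the average negative period."""
    roll = demeaned.rolling(12).sum().dropna()
    pos, neg = roll[roll > 0], roll[roll <= 0]
    return len(pos) / len(roll), pos.mean() / abs(neg.mean())
```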
The process of the test was selected to incorporate the salient features of each factor. However, in the case of momentum, it may lead to somewhat outlandish results.
Conclusion
While an evidence-based investor should be swayed by the weight of the data, the simple fact is that most factors are so well established that most current practitioners will likely go their entire careers without experiencing evidence substantial enough to dismiss any of the anomalies.
Therefore, in many ways, there is a certain faith required to use them going forward. Yes, these are ideas and concepts derived from the data. Yes, we have done our best to test their robustness out-of-sample across time, geographies, and asset classes. Yet we must also admit that there is a non-zero probability, however small it is, that these are false positives: a fact we may not have sufficient evidence to address until several decades hence.
And so a bit of humility is warranted. Factors will not suddenly stand up and declare themselves broken. And those that are broken will still appear to work from time-to-time.
Indeed, the death of a factor will be more Fimbulwinter than Ragnarok: not so violent as to be the end of days, but enough to cause pain and frustration among investors.
Addendum
We have received a large number of inbound notes about this commentary, which fall along two primary lines of questioning. We want to address these points.
How were the tests impacted by the Bayesian inference process?
The results of the tests within this commentary are rather astounding. We did seek to address some of the potential flaws of the methodology we employed, but by and large we feel the overarching conclusion remains on a solid foundation.
While we only presented the results of the Bayesian inference approach in this commentary, as a check we also tested two other approaches, each designed to isolate the effect of a different component of our test.
What we found was that while the reported figures changed, the overall magnitude did not. In other words, the median death-date of HML may not have been 67 years, but the order of magnitude remained much the same: decades.
Stepping back, these results were somewhat of a foregone conclusion. We would not expect an effect that has been found statistically significant over a hundred-year period to unravel in just a few years. Furthermore, due to randomness alone, we would expect a number of scenarios in which the statistical strength actually continues to build.
Why are we defending price-to-book?
The point of this commentary was not to defend price-to-book as a measure. Rather, it was to bring up a larger point.
As a community, quantitative investors often leverage statistical significance as a defense for the way we invest.
We think that is a good thing. We should look at the weight of the evidence. We should be data driven. We should try to find ideas that have proven to be robust over decades of time and when applied in different markets or with different asset classes. We should want to find strategies that are robust to small changes in parameterization.
Many quants (ourselves included) would argue, however, that there also needs to be a why. Why does this factor work? Without the why, we run the risk of glorified data mining. With the why, we can choose for ourselves whether we believe the effect will continue going forward.
Of course, there is nothing that prevents the why from being pure narrative fallacy. Perhaps we have simply weaved a story into a pattern of facts.
With price-to-book, one might argue we have done the exact opposite. The effect, technically, remains statistically significant and yet plenty of ink has been spilled as to why it shouldn’t work in the future.
The question we must answer, then, is, “when does statistical significance apply and when does it not?” How can we use it as a justification in one place and completely ignore it in others?
Furthermore, if we are going to rely on hundreds of years of data to establish significance, how can we determine when something is “broken” if the statistical evidence does not support it?
Price-to-book may very well be broken. But that is not the point of this commentary. The point is simply that the same tools we use to establish and defend factors may prevent us from tearing them down.