Summary
- Defensive equity strategies are composed of stocks that lose less than the market during bear markets while keeping up with the market during bull markets.
- Coarse sorts on metrics such as volatility, beta, value, and momentum lead to diversified portfolios but have mixed results in terms of their defensive characteristics, especially through different crisis periods that may favor one metric over another.
- Using non-linear machine learning techniques is a desirable way to identify certain combinations of factors that lead to better defensive equity strategies over multiple periods.
- By applying techniques such as random forests and gradient boosting to two sample defensive equity metrics, we find that machine learning does not add significant value over a low volatility sort, given the features included in the model.
- While this by no means rules out the benefits of machine learning techniques, it shows that a blanket application of them is not a panacea for investing during crisis periods.
There is no shortage of hypotheses as to what characteristics define a stock that will outperform in a bear market. Some argue that value stocks should perform well, given their relative valuation buffer (the “less far to fall” argument). Some argue for a focus on balance sheet strength while others argue that cash flow is the ultimate lifeblood of a company and should be prioritized. There are even arguments for industry preferences based upon economic cyclicality.
Each recession and crisis is unique, however, and therefore the characteristics of stocks that fare best will likely change. For example, the dot-com run-up caused a large number of real-economy businesses to be sorted into the “cheap” bucket of the value factor. These companies also tended to have higher quality earnings and lower beta / volatility than the dot-com stocks.
Common sense would indicate that unconstrained value may be a natural hedge against large, speculative bubbles, but we need only look to 2008 – a credit and liquidity event – to see that value is not a panacea for every type of crisis.
It is for this reason that some investors prefer to take their cues from market-informed metrics such as beta, volatility, momentum, or trading volume.
Regardless of approach, there are some philosophical limitations we should consider when it comes to expectations with defensive equity portfolios. First, if we were able to identify an approach that could avoid market losses, then we would expect that strategy to also have negative alpha.1 If this were not the case, we could construct an arbitrage.
Therefore, in designing a defensive equity portfolio, our aim should be to provide ample downside protection against market losses while minimizing the relative upside participation cost of doing so.
Traditional linear sorts – such as buying the lowest volatility stocks – are coarse by design. They aim to robustly capture a general truth and hedge missed subtleties through diversification. For example, while some stocks deserve to be cheap and some stocks are expensive for good reason, naïve value sorts will do little to distinguish them from those that are unjustifiably cheap or rich.
For a defensive equity portfolio, however, this coarseness may not only reduce effectiveness, but it may also increase the implicit cost. Therefore, in this note we implement non-linear techniques in an effort to more precisely identify combinations of characteristics that may create a more effective defensive equity strategy.
The Strategy Objective
To start, we must begin by defining precisely what we mean by a “defensive equity strategy.” What are the characteristics that would make us label one security as defensive and another as not? Or, potentially better, is there a characteristic that allows us to rank securities on a gradient of defensiveness?
This is not a trivial decision, as our entire exercise will attempt to maximize the probability of correctly identifying securities with this characteristic.
As our goal is to find those securities which provide the most protection during equity market routs but bleed the least during equity market rallies, we chose a metric that scored how closely a stock’s return reflected the payoff of a call option on the S&P 500 over the next 63 trading days (approximately 3 months).
In other words, if the S&P 500 is positive over the next 63 trading days, the score of a security is equal to the squared difference between its return and the S&P 500’s return. If the market’s return is negative, the score of a security is simply its squared return.
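For clarity, a minimal sketch of this scoring rule is below; the function name, the use of daily return series, and the compounding convention are our own assumptions rather than a description of the exact implementation.

```python
import numpy as np
import pandas as pd

def call_replication_score(stock_returns: pd.Series, market_returns: pd.Series,
                           horizon: int = 63) -> pd.Series:
    """Score how closely a stock's forward `horizon`-day return tracks the payoff
    of a call option on the market (lower scores are more defensive)."""
    # Compound daily returns into forward `horizon`-day returns.
    fwd_stock = (1 + stock_returns).rolling(horizon).apply(np.prod, raw=True).shift(-horizon) - 1
    fwd_market = (1 + market_returns).rolling(horizon).apply(np.prod, raw=True).shift(-horizon) - 1

    # Market up: penalize deviation from the market's return.
    # Market down: penalize any movement away from a 0% payoff.
    score = np.where(fwd_market > 0, (fwd_stock - fwd_market) ** 2, fwd_stock ** 2)
    return pd.Series(score, index=stock_returns.index)
```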
To determine whether this metric reflects the type of profile we want, we can create a long/short portfolio. Each month we rank securities by their scores and select the quintile with the lowest scores. Securities are then weighted by their market capitalization. Securities are held for three months and the portfolio is implemented with three tranches. The short leg of the portfolio is the market rather than the highest quintile, as we are explicitly trying to identify defense against the market.
To create a scalable solution, we restrict our investable universe to those in the top 1,000 securities by market capitalization.
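A rough sketch of this construction, under our own simplifying assumptions (monthly panels of scores, market caps, and forward returns indexed by rebalance date, with hypothetical names throughout), might look like the following.

```python
import pandas as pd

def defensive_long_short(scores: pd.DataFrame, market_caps: pd.DataFrame,
                         returns: pd.DataFrame, market_returns: pd.Series,
                         n_tranches: int = 3) -> pd.Series:
    """Long the cap-weighted lowest-score quintile of the top 1,000 stocks,
    short the market, averaged across three overlapping monthly tranches."""
    dates = list(scores.index)
    tranches = {}
    for date in dates:
        universe = market_caps.loc[date].nlargest(1000).index       # top 1,000 by market cap
        ranks = scores.loc[date, universe].rank(pct=True)
        selected = ranks[ranks <= 0.20].index                       # lowest-score quintile
        caps = market_caps.loc[date, selected]
        tranches[date] = caps / caps.sum()                          # market-cap weights

    monthly = []
    for i in range(1, len(dates)):
        # The live portfolio is the average of the three most recent tranches.
        live = [tranches[d] for d in dates[max(0, i - n_tranches): i]]
        weights = pd.concat(live, axis=1).fillna(0.0).mean(axis=1)
        # `returns.loc[dates[i]]` is assumed to be each stock's return over the month ending at dates[i].
        long_ret = (weights * returns.loc[dates[i]].reindex(weights.index).fillna(0.0)).sum()
        monthly.append(long_ret - market_returns.loc[dates[i]])     # short leg is the market
    return pd.Series(monthly, index=dates[1:])
```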
We plot the performance below.
Source: Sharadar Fundamentals. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all fees including, but not limited to, management fees, transaction fees, and taxes. Returns assume the reinvestment of all distributions.
We can see that the strategy is relatively flat during bull markets (1998-2000, 2003-2007, 2011-2015, 2016-2018), but rallies during bear markets and sudden market shocks (2000-2003, 2008, 2011, 2015/2016, Q4 2018, and 2020).
Interestingly, despite having no sector constraints and not explicitly targeting tracking error at the portfolio level, the resulting portfolio ends up well diversified across sectors, though it does appear to make significant short-term jumps in sector weights. We can also see an increasing tilt towards Technology over the last 3 years in the portfolio. In recent months, positions in Financials and Industrials have been almost outright eliminated.
Source: Sharadar Fundamentals. Calculations by Newfound Research.
Of course, this metric is explicitly forward looking. We’re using a crystal ball to peer into the future and identify those stocks that track the best on the way up and protect the best on the way down. Our goal, then, is to use a variety of company and security characteristics to accurately forecast this score.
We will include a variety of characteristics and features, including:
- Size: Market Capitalization.
- Valuation: Book-to-Price, Earnings-to-Price, Free Cash Flow-to-Price, Revenue-to-EV, and EBITDA-to-EV.
- Momentum: 12-1 Month Return and 1-Month Return.
- Risk: Beta, Volatility, Idiosyncratic Volatility, and Ulcer Index.
- Quality: Accruals, ROA, ROE, CFOA, GPOA, Net Margin, Asset Turnover, Leverage, and Payout Ratio.
- Growth: Internal Growth Rate, EPS Growth, Revenue Growth.
These 24 features are all cross-sectionally ranked at each point in time. We also include dummy variables for each security to represent sector inclusion as well as whether the company has positive Net Income and whether the company has positive Operating Cash Flow.
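A pandas sketch of this feature preparation is below; the column names and data layout are hypothetical.

```python
import pandas as pd

def build_features(raw: pd.DataFrame) -> pd.DataFrame:
    """raw: one row per (date, ticker) with the 24 characteristic columns,
    a 'sector' column, plus 'net_income' and 'operating_cash_flow'."""
    characteristic_cols = [c for c in raw.columns
                           if c not in ("date", "ticker", "sector", "net_income", "operating_cash_flow")]

    features = raw.copy()
    # Cross-sectional percentile rank of each characteristic at each date.
    features[characteristic_cols] = raw.groupby("date")[characteristic_cols].rank(pct=True)

    # Sector dummies plus profitability flags.
    features = pd.concat([features, pd.get_dummies(raw["sector"], prefix="sector")], axis=1)
    features["positive_net_income"] = (raw["net_income"] > 0).astype(int)
    features["positive_cfo"] = (raw["operating_cash_flow"] > 0).astype(int)
    return features.drop(columns=["sector", "net_income", "operating_cash_flow"])
```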
Note that we are not including any market regime characteristics, such as information about market returns, volatility, interest rates, credit spreads, sentiment, or monetary or fiscal policy. Had we included such features, our resulting model may end up as a factor switching approach, changing which characteristics it selects based upon the market environment. This may be an interesting model in its own right, but our goal herein is simply to design a static, non-linear factor sort.
Random Forests
Our first approach will be to apply a random forest algorithm, which is an ensemble learning method. The approach uses a training data set to build a number of individual decision trees whose results are then re-combined to create the ultimate decision. By training each tree on a subset of data and considering only a subset of features for each node, we can create trees that may individually have high variance, but as an aggregate forest reduce variance without necessarily increasing bias.
As an example, this means that one tree may be built using a mixture of low volatility and quality features, while another may be built using valuation and momentum features. Each tree is able to model a non-linear relationship, but by restricting tree depth and building trees using random subsets of data and features, we can prevent overfitting.
There are a number of hyperparameters that can be set to govern the model fit. For example, we can set the maximum depth of the individual trees as well as the number of trees we want to fit. Fitting hyperparameters is an art unto itself, and rather than go down the rabbit hole of tuning them via cross-validation, we did our best to select reasonable values. We elected to train the model on 50% of our data (March 1998 to March 2009), with a total of 100 trees each with a maximum depth of 2.
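A minimal scikit-learn sketch of this setup follows; the column names (e.g. target_score), the random seed, and the use of max_features='sqrt' to approximate the per-split feature subsetting are our own assumptions.

```python
from sklearn.ensemble import RandomForestRegressor

# `features`: output of build_features with a 'target_score' column (the forward score) appended.
train = features[features["date"] <= "2009-03-31"]   # first half of the sample
test = features[features["date"] > "2009-03-31"]     # out-of-sample period

X_cols = [c for c in train.columns if c not in ("date", "ticker", "target_score")]

model = RandomForestRegressor(
    n_estimators=100,       # 100 trees...
    max_depth=2,            # ...each restricted to a maximum depth of 2
    max_features="sqrt",    # consider a random subset of features at each split
    random_state=42,
)
model.fit(train[X_cols], train["target_score"])

# Out-of-sample ranking: lower predicted score implies a more defensive security.
test = test.assign(predicted_score=model.predict(test[X_cols]))
```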
The results of the exercise are plotted below.
Source: Sharadar Fundamentals. Calculations by Newfound Research.
The strategy does appear to provide defensive properties both in- and out-of-sample, with meaningful returns generated in 2000-2002, 2008, Q3 and Q4 of 2011, June 2015 through June 2016, and Q4 2018.
We can see that the allocations also express a number of static sector concentrations (e.g. Consumer Defensive) as well as some cyclical changes (e.g. Financials pre- and post-2007).
We can also gain insight into how the portfolio composition changes by looking at the weighted characteristic scores of the long leg of the portfolio over time.
Source: Sharadar Fundamentals. Calculations by Newfound Research.
It is important to remember that characteristics are cross-sectionally ranked across stocks. For some characteristics, higher is often considered better (e.g. a higher earnings-to-price ratio is considered cheaper), whereas for other factors lower is better (e.g. lower volatility is considered less risky).
We can see that some characteristics are static tilts: higher market capitalization, positive operating cash flow, positive net income, and lower risk characteristics. Other characteristics are more dynamic. By 12/2008, the portfolio has tilted heavily towards high momentum stocks. A year later, the portfolio has tilted heavily towards low momentum stocks.
What is somewhat difficult to disentangle is whether these static and dynamic effects are due to the non-linear model we have developed, or whether it’s simply that applying static tilts results in the dynamic tilts. For example, if we only applied a low volatility tilt, is it possible that the momentum tilts would emerge naturally?
Unfortunately, the answer appears to be the latter. If we plot a long/short portfolio that goes long the bottom quintile of stocks ranked on realized 1-year volatility and short the broad market, we see a very familiar equity curve.
Source: Sharadar Fundamentals. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all fees including, but not limited to, management fees, transaction fees, and taxes. Returns assume the reinvestment of all distributions.
It would appear that the random forest model effectively identified the benefits of low volatility securities. And while out-of-sample performance does appear to provide more ample defense during 2011, 2015-2016, and 2018 than the low volatility tilt, it also has significantly greater performance drag.
Gradient Boosting
One potential improvement we might consider is to apply a gradient boosting model. Rather than simply building our decision trees independently in parallel, we can build them sequentially so that each new tree focuses on the data points the existing ensemble has handled poorly (in the regression setting, each new tree is fit to the current residuals).
Rather than just generalizing to a low-volatility proxy, gradient boosting may allow our decision tree process to pick up on greater subtleties and conditional relationships in the data. For comparison purposes, we’ll assume the same maximum tree depth and number of trees as the random forest method.
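Continuing the earlier sketch, the swap to gradient boosting is only a few lines in scikit-learn; the learning rate and seed are our own choices.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# `train` and `X_cols` as defined in the random forest sketch above.
gbm = GradientBoostingRegressor(
    n_estimators=100,     # same number of trees as the random forest
    max_depth=2,          # same maximum tree depth
    learning_rate=0.1,    # shrinkage applied to each sequential tree's contribution
    random_state=42,
)
gbm.fit(train[X_cols], train["target_score"])

# Relative feature importances (e.g. volatility vs. momentum, free cash flow yield, payout ratio).
importances = pd.Series(gbm.feature_importances_, index=X_cols).sort_values(ascending=False)
```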
In initially evaluating the importance of features, it does appear that low volatility remains a critical factor, but other characteristics – such as momentum, free cash flow yield, and payout ratio – are close seconds. This may be a hint that gradient boosting was able to identify more subtle relationships.
Unfortunately, in evaluating the sector characteristics over time, we see a very similar pattern, though sectors like Technology do receive a meaningfully higher allocation under this methodology than under the random forest approach.
Source: Sharadar Fundamentals. Calculations by Newfound Research.
If we compare long/short portfolios, we find little meaningful difference from our past results. Our model simply seems to identify a (historically less effective) low volatility model.
Source: Sharadar Fundamentals. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all fees including, but not limited to, management fees, transaction fees, and taxes. Returns assume the reinvestment of all distributions.
Re-Defining Defensiveness
When we set out on this problem, we made a key decision to define a stock’s defensiveness by how closely it is able to replicate the payoff of a call option on the S&P 500. What if we had elected another definition, though? For example, we could define defensive stocks as those that minimize the depth and frequency of drawdowns using a measure like the Ulcer Index.
Below we replicate the above tests but use forward 12-month Ulcer Index as our target score (or, more precisely, a security’s forward 12-month cross-sectional Ulcer Index rank).
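For reference, the Ulcer Index is the root-mean-square of percentage drawdowns from the running high over the measurement window; a sketch of the forward-looking version is below, with the 252-day approximation of 12 months and the function names being our own conventions.

```python
import numpy as np
import pandas as pd

def forward_ulcer_index(prices: pd.Series, horizon: int = 252) -> pd.Series:
    """Ulcer Index over the next `horizon` trading days: the root-mean-square of
    percentage drawdowns from the running high within that window."""
    def _ulcer(window: np.ndarray) -> float:
        drawdowns = window / np.maximum.accumulate(window) - 1.0
        return np.sqrt(np.mean(drawdowns ** 2))

    # Shift backwards so each date holds the Ulcer Index of the *following* year.
    return prices.rolling(horizon).apply(_ulcer, raw=True).shift(-horizon)

# Cross-sectional percentile rank used as the target score (lower = shallower drawdowns).
# ulcer = price_panel.apply(forward_ulcer_index)   # price_panel: columns are tickers
# target = ulcer.rank(axis=1, pct=True)
```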
We again begin by constructing an index that has perfect foresight, buying a market-capitalization weighted portfolio of securities that rank in the lowest quintile of forward 12-month Ulcer Index. We see a very different payoff profile than before, with strong performance exhibited in both bull and bear markets.
By focusing on forward 12-month scores rather than 3-month scores, we also see a far steadier sector allocation profile. Interestingly, we still see meaningful sector tilts, with sectors like Technology, Financials, and Consumer Defensives coming in and out of favor over time.
Source: Sharadar Fundamentals. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all fees including, but not limited to, management fees, transaction fees, and taxes. Returns assume the reinvestment of all distributions.
We again use a gradient boosted random forest model to try to model our target scores. We find that five of the top six most important features are price return related, either measuring return or risk.
Despite the increased emphasis on momentum, the resulting long/short index still echoes a naïve low-volatility sort. This is likely because negative momentum and high volatility have become reasonably correlated proxies for one another in recent years.
While returns appear improved from prior attempts, the out-of-sample performance (March 2009 and onward) is almost identical to that of the low-volatility long/short.
Source: Sharadar Fundamentals. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all fees including, but not limited to, management fees, transaction fees, and taxes. Returns assume the reinvestment of all distributions.
Conclusion
In this research note we sought to apply machine learning techniques to factor portfolio construction. Our goal was to exploit the ability of machine learning models to model non-linear relationships, hoping to come up with a more nuanced definition of a defensive equity portfolio.
In our first test, we defined a security’s defensiveness by how closely it was able to replicate the payoff of a call option on the S&P 500 over rolling 63-day (approximately 3-month) periods. If the market was up, we wanted to pick stocks that closely matched the market’s performance; if the market was down, we wanted to pick stocks that minimized drawdown.
After pre-engineering a set of features to capture both company and stock dynamics, we first turned to a random forest model. We chose this model as the decision tree structure would allow us to model conditional feature dynamics. By focusing on generating a large number of shallow trees we aimed to avoid overfitting while still reducing overall model variance.
Training the model on data from March 1998 through March 2009, we found that the results strongly favored companies exhibiting positive operating cash flow, positive earnings, and low realized risk characteristics (e.g. volatility and beta). Unfortunately, the model did not appear to provide any meaningful advantage versus a simple linear sort on volatility.
We then turned to applying gradient boosting to our random forest. This approach builds trees in sequence such that each tree seeks to improve upon the last. We hoped that such an approach would allow the random forest to build more nuance than simply scoring on realized volatility.
Unfortunately, the results remained largely the same.
Finally, we decided to change our definition of defensiveness by focusing on the depth and frequency of drawdowns with the Ulcer Index. Again, after re-applying the gradient boosted random forest model, we found little difference in realized results versus a simple sort on volatility (especially out-of-sample).
One answer for these similar results may be that our objective function is highly correlated to volatility measures. For example, if stocks follow a geometric Brownian motion process, those with higher levels of volatility should have deeper drawdowns. And if the best predictor of future realized volatility is past realized volatility, then it is no huge surprise that the models ultimately fell back towards a naïve volatility sort.
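A quick simulation illustrates the point; the drift, horizon, and volatility levels below are arbitrary illustrative choices, not parameters from our study.

```python
import numpy as np

rng = np.random.default_rng(0)

def average_max_drawdown_gbm(sigma: float, mu: float = 0.07, years: float = 1.0,
                             steps: int = 252, n_paths: int = 10_000) -> float:
    """Average maximum drawdown of simulated geometric Brownian motion paths."""
    dt = years / steps
    shocks = rng.normal((mu - 0.5 * sigma ** 2) * dt, sigma * np.sqrt(dt), (n_paths, steps))
    paths = np.exp(np.cumsum(shocks, axis=1))
    drawdowns = paths / np.maximum.accumulate(paths, axis=1) - 1.0
    return drawdowns.min(axis=1).mean()

# Higher volatility implies a deeper average drawdown (illustrative values only).
for sigma in (0.10, 0.20, 0.40):
    print(f"sigma={sigma:.0%}: average max drawdown ~ {average_max_drawdown_gbm(sigma):.1%}")
```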
Interestingly, value, quality, and growth characteristics seemed largely ignored. We see two potential reasons for this.
The first possibility is that they were simply subsumed by low volatility with respect to our objective. If this were the case, however, we would see little feature importance placed upon them, but would still expect their weighted average characteristic scores within our portfolios to be higher (or lower). While this is true for select features (e.g. payout ratio), the importance of others appears largely cyclical (e.g. earnings-to-price). In fact, during the fallout of the dot-com bubble, weighted average value scores remained between 40 and 70.
The second reason is that the fundamental drivers behind each market sell-off are different. Factors tied to company metrics (e.g. valuation, quality, or growth), therefore, may be ill-suited to navigate different types of sell-offs. For example, value was the natural antithesis to the speculative dot-com bubble. However, during the recent COVID-19 crisis, it has been the already richly priced technology stocks that have fared the best. Factors based upon security characteristics (e.g. volatility, returns, or volume) may be better suited to dynamically adjust to market changes.
While our results were rather lackluster, we should acknowledge that we have really only scratched the surface of machine learning techniques. Furthermore, our results are intrinsically linked to how we’ve defined our problem and the features we engineered. A more thoughtful target score or a different set of features may lead to substantially different results.
Option-Based Trend Following
By Nathan Faber
On June 23, 2020
In Risk Management, Trend, Weekly Commentary
Summary
- The non-linear payoff of trend following strategies has many similarities to options strategies, and by way of analogy, we can often gain insight into which market environments will favor trend following and why.
In our previous research piece, Straddles and Trend Following, we looked at purchasing straddles – that is, a call option and a put option – with a strike price tied to the anchor price of the trend following model. For example, if the trend following model invested in equities when the return over the past 12 months was positive, for a security that was at $100 12 months ago and is at $120 today, we would purchase a call and a put option with a strike price of $100. In this case, the call would be 20% in-the-money (ITM) and the put would be out-of-the-money (OTM).
In essence, this strategy acted like an insurance policy where the payout was tied to a reversion in the trend signal, and the premium paid when the trend signal was strong was small.
This concept of insurance is an important discussion topic in trend following strategies. The risk we must manage in these types of strategies, either directly through insurance or some other indirect means like diversification, is whipsaw.
In this commentary, we will construct an options strategy that is similar to a trend following strategy. The option strategy will pay a premium up-front to avoid whipsaw. By comparing this strategy to trend following that bears the full risk of whipsaw, we can set a better practical bound for how much investors should expect to pay or earn for bearing this risk.
Methodology and Data
For this analysis, we will use the S&P 500 index for equity returns, the 1-year LIBOR rate as the risk-free rate, and options data on the S&P 500 (SPX options).
To bridge the gap between practice and abstraction, we will utilize a volatility surface calibrated to real option data to price options. We will constrain our SPX options to $5 increments and interpolate total implied variance to get prices for options that were either illiquid or not included in the data set.
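As a sketch of what that interpolation might look like for a single expiration (a real surface calibration would typically also interpolate in log-moneyness and across expiries), consider:

```python
import numpy as np

def interpolated_vol(strikes: np.ndarray, implied_vols: np.ndarray,
                     target_strike: float, tau: float) -> float:
    """Linearly interpolate total implied variance (sigma^2 * tau) across strikes
    to recover a volatility for an illiquid or missing strike.

    `strikes` must be sorted in ascending order; `tau` is time to expiration in years.
    """
    total_variance = implied_vols ** 2 * tau
    w = np.interp(target_strike, strikes, total_variance)
    return np.sqrt(w / tau)
```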
For the most part, we will stick to options that expire on the third Friday of each month and will mention when we deviate from that assumption.
The long/short trend equity strategy looks at total returns of equities over 12 months. If this return is positive, the strategy invests in equities for the following month. If the return is negative, the strategy shorts equities for the following month and earns the risk-free rate on the cash. The strategy is rebalanced monthly on the options expiration dates.
For the option-based trend strategy, on each rebalance date, we will purchase a 1-month call if the trend signal is positive or a put if the trend signal is negative. We will purchase all options at-the-money (ATM) and hold them to expiration. The strategy is fully cash-collateralized. Any premium is paid on the options roll date, interest is earned on the remaining account balance, and the option payout is realized on the next roll date.
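A stylized sketch of the option leg's accounting is below. Here, option_premium is a placeholder for pricing off the calibrated volatility surface, the price series stands in for total return, and the 253-day lookback approximates the 12-month window; none of these names come from the original implementation.

```python
import pandas as pd

def option_trend_strategy(spot: pd.Series, roll_dates, option_premium, rf_rate: pd.Series) -> pd.Series:
    """On each roll date, buy a 1-month ATM call if the trailing 12-month return
    is positive, otherwise a 1-month ATM put; hold to expiration, fully
    cash-collateralized, earning the risk-free rate on the remaining balance."""
    nav = [1.0]
    for t0, t1 in zip(roll_dates[:-1], roll_dates[1:]):
        history = spot.loc[:t0]
        signal_up = history.iloc[-1] / history.iloc[-253] - 1 > 0    # trailing ~12-month return
        strike = spot.loc[t0]                                        # at-the-money
        premium = option_premium(t0, t1, strike, "call" if signal_up else "put")

        payoff = max(spot.loc[t1] - strike, 0.0) if signal_up else max(strike - spot.loc[t1], 0.0)

        units = nav[-1] / strike                                     # option notional sized to current NAV
        dt = (t1 - t0).days / 365.0
        cash = nav[-1] - premium * units                             # premium paid on the roll date
        nav.append(cash * (1 + rf_rate.loc[t0] * dt) + payoff * units)  # interest plus option payout
    return pd.Series(nav, index=list(roll_dates))
```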
Why are we now using ATM options when previous research used ITM and OTM options, potentially deeply ITM or OTM?
Here we are looking to isolate the cost of whipsaw in the premium paid for the option while earning a payout that is close to that of the underlying in the event that our trend signal is correct. If we utilized OTM options, then our premium would be lower but we would realize smaller gains if the underlying followed the trend. ITM options would have downside exposure before the protection kicked in.
We are also not using straddles since we do not want to pay extra premium for the chance to profit off a whipsaw. The underlying assumption here is that there is value in the trend following signal. Either strategy is able to capitalize on that (i.e. it’s the control variable); the strategies primarily differ in their treatment of whipsaw costs.
The High Cost of ATM Options
The built-in whipsaw protection in the options does not come cheap. The chart below shows the long/short trend following strategy, the option-based trend strategy, and the ratio of the two (dotted).
Source: DiscountOptionsData.com. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all fees including, but not limited to, management fees, transaction fees, and taxes. Returns assume the reinvestment of all distributions.
During normal market environments and even in prolonged equity-market drawdown periods like 2008, trend following outperformed the option-based strategy. Earning the full return on the underlying equity is generally beneficial.
However, something that is “generally beneficial” can be erased very quickly. In March 2020, the trend following strategy reverted back to the level of the option-based strategy. If you had only looked at cumulative returns over those 15 years, you would not be able to tell much difference between the two.
The following chart highlights these tail effects.
Source: DiscountOptionsData.com. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all fees including, but not limited to, management fees, transaction fees, and taxes. Returns assume the reinvestment of all distributions.
In most months, the option-based strategy forfeits its ~1.5% premium for the ATM option. The 75th percentile cutoff is 2.2% and the 90th percentile cutoff is 2.9%. These premiums have occasionally spiked to 6-7%.
While these premiums are not always forfeited without some offsetting gain, they are always paid relative to the trend following strategy.
A 3% whipsaw event in trend following should not come as a surprise given the typical up-front cost of the option strategy.
Source: DiscountOptionsData.com. Calculations by Newfound Research.
But What About a 30% Whipsaw?
Now that’s a good question.
Up until March 2020, for the 15 years prior, the largest whipsaws relative to the options strategy were 12-13%. This is the epitome of tail risk, and it can be disheartening to think that now that we have seen 30% underperformance, we should probably expect more at some point in the (hopefully very distant) future.
However, a richer sample set can shed some light on this very poor performance.
Let’s relax our assumption that we roll the options and rebalance the trend strategies on the third Friday of the month and instead allow rebalances and rolls on any day in the month. Since we are dealing with one-month options, this is not beyond implementation since there are typically options listed that expire on Monday, Wednesday, and Friday.
The chart below shows all of these option strategies and how large an effect roll/rebalance timing luck can have.
Source: DiscountOptionsData.com. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all fees including, but not limited to, management fees, transaction fees, and taxes. Returns assume the reinvestment of all distributions.
With timing luck in both the options strategies and trend following, there can be large effects when the luck cuts opposite ways.
The worst returns between rebalances of trend following relative to each options strategy highlight how bad the realized path in March 2020 truly was.
Source: DiscountOptionsData.com. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all fees including, but not limited to, management fees, transaction fees, and taxes. Returns assume the reinvestment of all distributions.
In many of the trend following and option strategy pairs, the worst underperformance of trend following over any monthlong period was around 10%.
Returning to the premise that the options strategies are analogous to trend following, we see the same effects of timing luck that we have explored in previous research: effects that make comparing variants of the same strategy or similar strategies more nuanced. Whether an option strategy is used for research, benchmarking, or active investing, the implications of this timing luck should be taken into account.
But even without taking a multi-model approach at this point to the options strategy, can we move toward a deeper understanding of when it may be an effective way to offset some of the risk of whipsaw?
I’d Gladly Pay You Tuesday for a Whipsaw Risk Today
With the two extremes of paying for whipsaw up front with options and being fully exposed to whipsaw through trend following, perhaps there is a way to tailor this whipsaw risk profile. If the risk of whipsaw is elevated but the cost of paying for the insurance is cheap, then the options strategy may be favorable. On the other hand, if option premiums are high, trend following may more efficiently capture the market returns.
The price of the options (or their implied volatilities) is a natural place to start investigating this topic since it encapsulates the premium for whipsaw insurance. The problem is that it may not be a reliable signal if there is no barrier to efficiency in the options market, either behavioral or structural.
Comparing the ATM option implied volatilities with the trend signal (12-month trailing returns), we see a negative correlation, which indicates that the options-based strategy will have a higher hurdle rate of return in strongly downtrending market environments.
Source: DiscountOptionsData.com. Calculations by Newfound Research.
But this is only one piece of the puzzle.
Do these implied volatilities relate to the forward 1-month returns for the S&P 500?
Based on the above scatterplot: not really. However, since we are merely inserting implied volatility between the trend following signal and the forward return, and we believe that trend following works over the long run, we must believe there is some relationship between implied volatility and forward returns.
While this monthly trend following signal has historically been directionally correct over the next month roughly 60% of the time, that says nothing about the magnitude of the returns that follow the signal.
Without looking too much into the data to avoid overfitting a model, we will set a simple cutoff of 20% implied volatility. If options cost more than that, we will utilize trend following. If they cost less, we will invest in the options strategy.
We will also compare it to a 50/50 blend of the two.
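A minimal sketch of the switching rule and the blend is below, assuming monthly return series for both strategies aligned on the roll dates, with implied volatility observed at the prior roll; the names are hypothetical.

```python
import pandas as pd

def switching_strategy(trend_rets: pd.Series, option_rets: pd.Series,
                       atm_implied_vol: pd.Series, vol_cutoff: float = 0.20) -> pd.Series:
    """Hold the option-based strategy for the next month when ATM implied
    volatility is below the cutoff (insurance is cheap); otherwise hold the
    trend following strategy."""
    use_options = atm_implied_vol.shift(1) < vol_cutoff   # signal observed at the prior roll
    return option_rets.where(use_options, trend_rets)

def blend_50_50(trend_rets: pd.Series, option_rets: pd.Series) -> pd.Series:
    """A naive 50/50 blend of the two return streams, rebalanced each period."""
    return 0.5 * trend_rets + 0.5 * option_rets
```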
Source: DiscountOptionsData.com. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all fees including, but not limited to, management fees, transaction fees, and taxes. Returns assume the reinvestment of all distributions.
The switching strategy (gray line) worked well until around 2013, when option prices were cheap but the risk of whipsaw went unrealized. It did navigate 2015, 2016, and Q4 2018 better than trend following.
When viewed in a broader context of a portfolio, since these are alternative strategies, it does not take a huge allocation to make a difference. These strategies manage equity risk, so we can pair them with an allocation to the S&P 500 (SPY) and see how the aggregate statistics are affected over the period from 2005 to April 2020.
The chart below plots the efficient frontiers of allocations ranging from 100% SPY (the point of convergence on the right of the graph) to 40% SPY (on the left of the graph), with the remainder allocated to the risk-management strategy.
The Sharpe ratio is maximized at a 35% allocation to the switching strategy, a 25% allocation to the option-based strategy, and 10% for the trend following strategy.
Source: DiscountOptionsData.com. Calculations by Newfound Research. Returns are hypothetical and backtested. Returns are gross of all fees including, but not limited to, management fees, transaction fees, and taxes. Returns assume the reinvestment of all distributions.
Conclusion
In this research note, we explored the link between trend following and options strategies using 1-month ATM put and call options, depending on the sign of the trend.
The cost of ATM options is generally around 1.5% of the portfolio value, but the fact that this cost can spike upwards of 9% should justify larger whipsaws in trend following strategies. Very large whipsaws, like in March 2020, not only show that the cost can be seemingly unbounded but also that there is significant exposure to timing luck based upon the option roll dates.
Then, we moved on to investigating a simple way to allocate between the two strategies based upon the cost of the options. When the options were cheap, we used that strategy, and when they were expensive, we invested in the trend following strategy. A modest allocation is enough to make a difference in the realized efficient frontier.
Deciding whether to pay the whipsaw insurance premium up front, bear the full risk of a whipsaw, or land somewhere in between is largely up to investor preferences. It is risky to have a large downside potential, but the added benefit of no premiums can be enough to offset the risk.
An implied volatility threshold was a rather crude signal for assessing the risk of whipsaw and the price of insuring against it. Further research into one or multiple signals, and a robust process for aggregating them into an investment decision, is needed to make more definitive statements on when trend following is better than options or vice versa. The extent to which whipsaw can be mitigated while still maintaining the potential to earn diversified returns is likely limited, but the optimal blend of trend following and options can be a beneficial guideline for investors to weather both sudden and prolonged drawdowns.