The Research Library of Newfound Research

Author: Corey Hoffstein

Corey is co-founder and Chief Investment Officer of Newfound Research.

Corey holds a Master of Science in Computational Finance from Carnegie Mellon University and a Bachelor of Science in Computer Science, cum laude, from Cornell University.

You can connect with Corey on LinkedIn or Twitter.

Decomposing Trend Equity

This post is available as a PDF download here.

Summary

  • We introduce the simple arithmetic of portfolio construction where a strategy can be broken into a strategic allocation and a self-financing trading strategy.
  • For long/flat trend equity strategies, we introduce two potential decompositions.
  • The first decomposition resembles equity exposure with a put-option overlay.  The second resembles a 50% equity / 50% cash allocation with a 50% overlay to a straddle-like long/short strategy.
  • By evaluating the return profile of the active trading strategy in both decompositions, we can gain a better understanding of how we expect the strategy to perform in different environments.
  • In both cases, we can see that trend equity can be thought of as a strategic allocation to equities – seeking to benefit from the equity risk premium – plus an alternative strategy that seeks to harvest benefits from the trend premium.

The Simple Arithmetic of Portfolio Construction

In our commentary A Trend Equity Primer, we introduced the concept of trend equity, a category of strategies that aim to harvest the long-term benefits of the equity risk premium while managing downside risk through the application of trend following.  In this brief follow-up piece, we aim to provide further transparency into the behavior of trend equity strategies by decomposing this category of strategies into component pieces.

First, what do we mean by “decompose”?

As it turns out, the arithmetic of portfolios is fairly straightforward.  Consider this simple scenario: we currently hold a portfolio consisting entirely of asset A and want to hold a portfolio that is 50% A and 50% of some asset B.  What do we do?

Figure 1

No, this is not a trick question.  The straightforward answer is that we sell 50% of our exposure in A and buy 50% of our exposure in B.  As it turns out, however, this is entirely equivalent to holding our portfolio constant and simply going short 50% exposure in A and using the proceeds to purchase 50% notional portfolio exposure in B (see Figure 2).  Operationally, of course, these are very different things.  Thinking about the portfolio in this way, however, can be constructive to truly understanding the implications of the trade.

The difference in performance between our new portfolio and our old portfolio will be entirely captured by the performance of this long/short overlay. This tells us, for example, that the new portfolio will outperform the old portfolio when asset B outperforms asset A, since the long/short portfolio effectively captures the spread in performance between asset B and asset A.

Figure 2: Portfolio Arithmetic – Long/Short Overlay

Relative to our original portfolio, the long/short represents our active bets.  A slightly more nuanced view of this arithmetic requires scaling our active bets such that each leg is equal to 100%, and then only implementing a portion of that overlay.  It is important to note that the overlay is “dollar-neutral”: in other words, the dollars allocated to the short leg and the long leg add up to zero.  This is also called “self-funding” because it is presumed that we would enter the short position and then use the cash generated to purchase our long exposure, allowing us to enter the trade without utilizing any capital.

Figure 3: Portfolio Arithmetic – Scaled Long/Short Overlay

In our prior example, a portfolio that is 50% long B and 50% short A is equivalent to 50% exposure to a portfolio that is 100% long B and 100% short A.  The benefit of taking this extra step is that it allows us to decompose our trade into two pieces: the active bets we are making and the sizing of these bets.
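The arithmetic above is easy to verify numerically.  Below is a minimal sketch using hypothetical single-period returns for assets A and B; the specific values are assumptions for illustration only.

```python
# A quick numeric check of the overlay arithmetic: trading from 100% A to
# 50% A / 50% B is equivalent to holding A plus 50% of a dollar-neutral
# long/short overlay (100% long B / 100% short A).
r_a, r_b = 0.04, 0.10  # hypothetical single-period returns

# Direct approach: sell half of A, buy half a position in B.
r_new = 0.5 * r_a + 0.5 * r_b

# Overlay approach: keep the old portfolio, add 50% of the scaled overlay.
overlay = r_b - r_a                     # return of the 100/100 long/short
r_overlay_view = r_a + 0.5 * overlay    # old portfolio plus the active bet

print(round(r_new, 6), round(r_overlay_view, 6))  # both 0.07
```

The two views differ operationally but are identical in return, which is what lets us separate the active bets from their sizing.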

Decomposing Trend Equity

Trend equity strategies are those strategies that seek to combine structural exposure to equities with the potential benefits of an active trend-following trading strategy.  A simple example of such a strategy is a “long/flat” strategy that invests in large-cap U.S. equities when the measured trend in large-cap U.S. equities is positive and otherwise invests in short-term U.S. Treasuries (or any other defensive asset class).

An obvious question with a potentially non-obvious answer is, “how do we benchmark such a strategy?”  This is where we believe decomposition can be informative.  Our goal should be to decompose the portfolio into two pieces: the strategic benchmark allocation and a dollar-neutral long/short trading strategy that captures the manager’s active bets.

For long/flat trend equity strategies, we believe there are two obvious decompositions, which we outline in Figure 4.

Figure 4

Decomposition                             | Strategic Position                | Trend Strategy (Positive Trend)       | Trend Strategy (Negative Trend)
Strategic + Flat/Short Trend Strategy     | 100% Equity                       | No Position                           | -100% Equity / +100% ST US Treasuries
Strategic + 50% Long/Short Trend Strategy | 50% Equity / 50% ST US Treasuries | +100% Equity / -100% ST US Treasuries | -100% Equity / +100% ST US Treasuries

Equity + Flat/Short

The first decomposition achieves the long/flat strategy profile by assuming a strategic allocation that is allocated to U.S. equities.  This is complemented by a trading strategy that goes short large-cap U.S. equities when the trend is negative, investing the available cash in short-term U.S. Treasuries, and does nothing otherwise.

The net effect is that when trends are positive, the strategy remains fully invested in large-cap U.S. equities.  When trends are negative, the overlay nets out exposure to large-cap U.S. equities and leaves the portfolio exposed only to short-term U.S. Treasuries.
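This period-by-period equivalence can be checked with a few lines of Python; the returns and trend signals below are purely hypothetical.

```python
# Verify: 100% equity + flat/short overlay = long/flat strategy, every period.
eq   = [0.02, -0.03, 0.05, -0.04]     # hypothetical large-cap equity returns
cash = [0.002, 0.002, 0.002, 0.002]   # hypothetical short-term Treasury returns
up   = [True, False, True, False]     # trend signal at the start of each period

long_flat  = [e if s else c for e, c, s in zip(eq, cash, up)]        # the strategy
flat_short = [0.0 if s else c - e for e, c, s in zip(eq, cash, up)]  # the overlay
combined   = [e + o for e, o in zip(eq, flat_short)]                 # equity + overlay

ok = all(abs(a - b) < 1e-12 for a, b in zip(combined, long_flat))
print(ok)  # True
```

When the trend is negative, the overlay's short equity leg cancels the strategic equity position, leaving only the Treasury return, exactly as described above.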

In Figure 5, we plot the return profile of a hypothetical flat/short large-cap U.S. equity strategy.

Figure 5: A Flat/Short U.S. Equity Strategy

Source: Newfound Research.  Return data relies on hypothetical indices and is exclusive of all fees and expenses.  Returns assume the reinvestment of all dividends.  Flat/Short Equity shorts U.S. Large-Cap Equity when the prior month has a negative 12-1 month total return, investing available capital in 3-month U.S. Treasury Bills.  The strategy assumes zero cost of shorting.  The Flat/Short Equity strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary.  It is not possible to invest in an index.  Past performance does not guarantee future results.

The flat/short strategy has historically achieved a payoff structure that looks very much like a put option: positive returns during significantly negative return regimes, and (on average) slight losses otherwise.  Of course, unlike a put option where the premium paid is known upfront, the flat/short trading strategy pays its premium in the form of “whipsaw” resulting from trend reversals.  These head-fakes cause the strategy to “short low” and “cover high,” creating realized losses.
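The whipsaw cost described above can be made concrete with a toy example (all numbers hypothetical): the trend signal turns negative after a decline, the overlay shorts equity, and the market promptly rebounds before the signal can flip back.

```python
# Toy whipsaw: the overlay "shorts low" after a drawdown, then the market
# rebounds, forcing it to "cover high" at a realized loss.
cash_rate = 0.001   # monthly T-bill return while short
rebound   = 0.05    # equity return during the reversal month
whipsaw   = cash_rate - rebound   # overlay return in the rebound month
print(round(whipsaw, 3))  # -0.049
```

This is the trend-following analogue of a put option's premium, except the size and timing of the payment are unknown in advance.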

Our expectation for future returns, then, is a combination of the two underlying strategies:

  • 100% Strategic Equity: We should expect to earn, over the long run, the equity risk premium at the risk of large losses due to economic shocks.
  • 100% Flat/Short Equity: Empirical evidence suggests that we should expect a return profile similar to a put option, with negative returns in most environments and the potential for large, positive returns during periods where large-cap U.S. equities exhibit large losses.  Historically, the premium for the trend-following “put option” has been significantly less than the premium for buying actual put options.  As a result, hedging with trend-following has delivered higher risk-adjusted returns.  Note, however, that trend-following is rarely helpful in protecting against sudden losses (e.g. October 1987) like an actual put option would be.

Taken together, our long-term return expectation should be the equity risk premium minus the whipsaw costs of the flat/short strategy. The drag in return, however, is payment for the expectation that significant left-tail events will be meaningfully offset.  In many ways, this decomposition lends itself nicely to thinking of trend equity as a “defensive equity” allocation.

Figure 6: Combination of U.S. Large-Cap Equities and a Flat/Short Trend-Following Strategy

Source: Newfound Research.  Return data relies on hypothetical indices and is exclusive of all fees and expenses.  Returns assume the reinvestment of all dividends.  Flat/Short Equity shorts U.S. Large-Cap Equity when the prior month has a negative 12-1 month total return, investing available capital in 3-month U.S. Treasury Bills.  The strategy assumes zero cost of shorting.   The Flat/Short Equity strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary.  It is not possible to invest in an index.  Past performance does not guarantee future results.

50% Equity/50% Cash + 50% Long/Short

The second decomposition achieves the long/flat strategy profile by assuming a strategic allocation that is 50% large-cap U.S. equities and 50% short-term U.S. Treasuries.  The overlaid trend strategy now goes both long and short U.S. equities depending upon the underlying trend signal, taking the offsetting position in short-term U.S. Treasuries to keep the dollar-neutral profile of the overlay.

One difference in this approach is that to achieve the desired long/flat return profile, only 50% exposure to the long/short strategy is required.  As before, the net effect is such that when trends are positive, the portfolio is invested entirely in large-cap U.S. equities (as the short-term U.S. Treasury positions cancel out), and when trends are negative, the portfolio is entirely invested in short-term U.S. Treasuries.
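The same numeric check works for the second decomposition; again, the return figures are hypothetical.

```python
# Verify: 50% equity / 50% cash + 50% of the long/short overlay = long/flat.
eq, cash = 0.03, 0.002   # hypothetical per-period returns

for trend_up in (True, False):
    strategic  = 0.5 * eq + 0.5 * cash                     # 50/50 strategic sleeve
    long_short = (eq - cash) if trend_up else (cash - eq)  # dollar-neutral overlay
    combined   = strategic + 0.5 * long_short              # only 50% of the overlay
    target     = eq if trend_up else cash                  # the long/flat profile
    assert abs(combined - target) < 1e-12

print("decompositions match")
```

Note that only half the overlay is needed because the strategic sleeve already starts halfway between the two target positions.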

In Figure 7, we plot the return profile of a hypothetical long/short large-cap U.S. equity strategy.

Figure 7: A Long/Short Equity Trend-Following Strategy

Source: Newfound Research.  Return data relies on hypothetical indices and is exclusive of all fees and expenses.  Returns assume the reinvestment of all dividends.  Long/Short Equity goes long U.S. Large-Cap Equity when the prior month has a positive 12-1 month total return, shorting an equivalent amount in 3-month U.S. Treasury Bills.  When the prior month has a negative 12-1 month total return, the strategy goes short U.S. Large-Cap Equity, investing available capital in 3-month U.S. Treasury Bills.  The strategy assumes zero cost of shorting.   The Long/Short Equity strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary.  It is not possible to invest in an index.  Past performance does not guarantee future results.

We can see the traditional “smile” associated with long/short trend-following strategies.  With options, this payoff profile is reminiscent of a straddle, a strategy that combines a position in a put and a call option to profit in both extremely positive and negative environments.  The premium paid to buy these options causes the strategy to lose money in more normal environments.  We see a similar result with the long/short trend-following approach.

As before, our expectation for future returns is a combination of the two underlying strategies:

  • 50% Equity / 50% Cash: We should expect to earn, over the long run, about half the equity risk premium, but only expect to suffer about half the losses associated with equities.
  • 50% Long/Short Equity: The “smile” payoff associated with trend following should increase exposure to equities in the positive tail and help offset losses in the negative tail, at the cost of whipsaw during periods of trend reversals.

Taken together, we should expect equity up-capture exceeding 50% in strongly trending years, a down-capture less than 50% in strongly negatively trending years, and a slight drag in more normal environments.  We believe that this form of decomposition is most useful when investors are planning to fund their trend equity from both stocks and bonds, effectively using it as a risk pivot within their portfolio.

In Figure 8, we plot the combined return profile of the two component pieces. Note that it is identical to Figure 6.

Figure 8

Source: Newfound Research.  Return data relies on hypothetical indices and is exclusive of all fees and expenses.  Returns assume the reinvestment of all dividends.  Long/Short Equity goes long U.S. Large-Cap Equity when the prior month has a positive 12-1 month total return, shorting an equivalent amount in 3-month U.S. Treasury Bills.  When the prior month has a negative 12-1 month total return, the strategy goes short U.S. Large-Cap Equity, investing available capital in 3-month U.S. Treasury Bills.  The strategy assumes zero cost of shorting.   The Long/Short Equity strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary.  It is not possible to invest in an index.  Past performance does not guarantee future results.

Conclusion

In this commentary, we continued our exploration of trend equity strategies. To gain a better sense of how we should expect trend equity strategies to perform, we introduced the basic arithmetic of portfolio construction, which we then used to decompose trend equity into a strategic allocation plus a self-funded trading strategy.

In the first decomposition, we break trend equity into a strategic, passive allocation in large-cap U.S. equities plus a self-funding flat/short trading strategy. The flat/short strategy sits in cash when trends in large-cap U.S. equities are positive and goes short large-cap U.S. equities when trends are negative.  In isolating the flat/short trading strategy, we see a return profile that is reminiscent of the payoff of a put option, exhibiting negative returns in positive market environments and large gains during negative market environments.

For investors planning on utilizing trend equity as a form of defensive equity, this decomposition is appropriate.  It clearly demonstrates that we should expect returns that are less than passive equity during almost all market environments, with the exception being extreme negative tail events, where the trading strategy aims to hedge against significant losses.  While we would expect to be able to measure manager skill by the amount of drag created relative to equities during positive markets (i.e. the “cost of the hedge”), we can see from the hypothetical example in Figure 5 that there is considerable variation year-to-year, making short-term analysis difficult.

In our second decomposition, we break trend equity into a strategic portfolio that is 50% large-cap U.S. equity / 50% short-term U.S. Treasury plus a self-funding long/short trading strategy.  If the flat/short trading strategy was similar to a put option, the long/short trading strategy is similar to a straddle, exhibiting profit in the wings of the return distribution and losses near the middle.

This particular decomposition is most relevant to investors who plan on funding their trend equity exposure from both stocks and bonds, allowing the position to serve as a risk pivot within their overall allocation.  The strategic contribution provides partial exposure to the equity risk premium, but the trading strategy aims to add value in both tails, demonstrating that trend equity can potentially increase returns in both strongly positive and strongly negative environments.

In both cases, we can see that trend equity can be thought of as a strategic allocation to equities – seeking to benefit from the equity risk premium – plus an alternative strategy that seeks to harvest benefits from the trend premium.

In this sense, trend equity strategies help investors achieve capital efficiency.  Allocating to an alternative return premium, in this case trend, does not require allocating away from the strategic, long-only portfolio.  Rather, exposure to both the strategic holdings and the trend-following alternative strategy can be gained in the same package.

A Trend Equity Primer

This post is available as a PDF download here.

Summary

  • Trend-following strategies exploit the fact that investors exhibit behavioral biases that cause trends to persist.
  • While many investment strategies have a concave payoff profile that reaps small rewards at the risk of large losses, trend-following strategies exhibit a convex payoff profile, one that pays small premiums with the potential of a large reward.
  • By implementing a trend-following strategy on equities, investors can tap into both the long-term return premium from holding equities and the convex payoff profile associated with trend following.
  • There are multiple ways to include a trend-following equity strategy in a portfolio, and the method of incorporation will affect the overall risk and return expectations in different market environments.
  • As long as careful consideration is given to whipsaw, hedging ability, and implementation costs, trend-following equity can be a potentially useful diversifier in most traditionally allocated portfolios.

A Balance of Risks

Most investors – individual and institutional alike – live in the balance of two risks: failing slow and failing fast.  Most investors are familiar with the latter: the risk of large and sudden drawdowns that can permanently impair an investor’s lifestyle or ability to meet future liabilities.  Slow failure, on the other hand, occurs when an investor fails to grow their portfolio at a speed sufficient to offset inflation and withdrawals.

Investors have traditionally managed these risks through asset allocation, balancing exposure to growth-oriented asset classes (e.g. equities) with more conservative, risk-mitigating exposures (e.g. cash or bonds).  How these assets are balanced is typically governed by where an investor falls in their investment lifecycle and which risk has the greatest impact upon the probability of their future success.

For example, younger investors who have a large proportion of their future wealth tied up in human capital often have very little risk of failing fast, as they are not presently relying upon withdrawals from their investment capital. Evidence suggests that the risk of fast failure peaks for pre- and early-retirees, whose future lifestyle will be largely predicated upon the amount of capital they are able to maintain into early retirement.  Later-stage retirees, on the other hand, once again become subject to the risk of failing slow, as longer lifespans put greater pressure upon the initial retirement capital to last.

Trend equity strategies seek to address both risks simultaneously by maintaining equity exposure when trends are positive and de-risking the portfolio when trends are negative.  Empirical evidence suggests that such strategies may allow investors to harvest a significant proportion of the long-term equity risk premium while significantly reducing the impact of severe and prolonged drawdowns.

The Potential Hedging Properties of Trend Following

When investors buy stocks and bonds, they are exposing themselves to “systematic risk factors.”  These risk factors are the un-diversifiable uncertainties associated with any investment. For bearing these risks, investors expect to earn a reward.  For example, common equity is generally considered to be riskier than fixed income because it is subordinate in the capital structure, does not have a defined payout, and does not have a defined maturity date.  A rational investor would only elect to hold stocks over bonds, then, if they expected to earn a return premium for doing so.

Similarly, the historical premiums associated with many active investment strategies are also assumed to be risk-based in nature.  For example, quantitatively cheap stocks have historically outperformed expensive ones, an anomaly called the “value factor.”  Cheap stocks may be trading cheaply for a reason, however, and the potential excess return earned from buying them may simply be the premium required by investors to bear the excess risk.

In many ways, an investor bearing risk can be thought of as an insurer, expecting to collect a premium over time for their willingness to carry risk that other investors are looking to offload.  The payoff profile for premiums generated from bearing risk, however, is concave in nature: the investor expects to collect a small premium over time but is exposed to potentially large losses (see Figure 1).  This approach is often called being “short volatility,” as the manifestation of risk often coincides with large (primarily negative) swings in asset values.

Even the process of rebalancing a strategic asset allocation can create a concave payoff structure.  By reallocating back to a fixed mixture of assets, an investor sells assets that have recently outperformed and buys assets that have recently underperformed, benefiting when the relative performance of investments mean-reverts over time.

When taken together, strategically allocated portfolios – even those with exposure to alternative risk premia – tend to combine a series of concave payoff structures. This implies that a correlation-based diversification scheme may not be sufficient for managing left-tail risk during bad times, as a collection of small premiums may not offset large losses.

In contrast, trend-following strategies “cut their losers short and let their winners run” by design, creating a convex payoff structure (see Figure 1).  Whereas concave strategies can be thought of as collecting an expected return premium for bearing risk, a convex payoff can be thought of as expecting to pay an insurance premium in order to hedge risk.  This implies that while concave payoffs benefit from stability, convex payoffs benefit from instability, potentially helping hedge portfolios against large losses at the cost of smaller negative returns during normal market environments.

Figure 1: Example Concave and Convex Payoff Structures; Profit in Blue and Loss in Orange

Source: Newfound Research.  For illustrative purposes only and not representative of any Newfound Research product or investment.

What is Trend Equity?

Trend equity strategies rely upon the empirical evidence that performance tends to persist in the short-run: positive performance tends to beget further positive performance and negative performance tends to beget further negative performance.  The theory behind the evidence is that behavioral biases exhibited by investors lead to the emergence of trends.

In an efficient market, changes in the underlying value of an investment should be met by an immediate, commensurate change in the price of that investment. The empirical evidence of trends suggests that investors may not be entirely efficient at processing new information.  Behavioral theory suggests that investors anchor their views on prior beliefs, causing price to underreact to new information.  As price continues to drift towards fair value, herding behavior occurs, causing price to overreact and extend beyond fair value.  Combined, these effects cause a trend.

Trend equity strategies seek to capture this potential inefficiency by systematically investing in equities when they are exhibiting positively trending characteristics and divesting when they exhibit negative trends.  The potential benefit of this approach is that it can try to exploit two sources of return: (1) the expected long-term risk premium associated with equities, and (2) the convex payoff structure typically associated with trend-following strategies.

As shown in Figure 2, a hypothetical implementation of this strategy on large-cap U.S. equities has historically matched the long-term annualized return while significantly reducing exposure to both tails of the distribution.  This is quantified in Figure 3, which demonstrates a significant reduction in both the skew and kurtosis (“fat-tailedness”) of the return distribution.

Figure 2

Figure 3

                  | U.S. Large-Cap Equities | Trend Equity
Annualized Return | 11.1%                   | 11.6%
Volatility        | 16.9%                   | 11.3%
Skewness          | -1.4                    | 0.0
Excess Kurtosis   | 2.2                     | -1.0

 Source: Newfound Research.  Return data relies on hypothetical indices and is exclusive of all fees and expenses.  Returns assume the reinvestment of all dividends.  Trend Equity invests in U.S. Large-Cap Equity when the prior month has a positive 12-1 month total return and in 3-month U.S. Treasury Bills otherwise.  The Trend Equity strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary.  It is not possible to invest in an index.  Past performance does not guarantee future results.
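As a rough illustration, the trend rule described in the note above can be sketched in Python on simulated data.  The exact 12-1 convention used here (signal formed at the prior month-end, skipping the most recent month) is an assumption, as implementations vary, and the simulated series is not real index data.

```python
import numpy as np
import pandas as pd

# Simulated monthly total-return series (hypothetical, for illustration only).
rng = np.random.default_rng(0)
eq = pd.Series(rng.normal(0.008, 0.045, 240))  # large-cap equity proxy
tb = pd.Series(0.002, index=eq.index)          # 3-month T-bill proxy

# 12-1 momentum observed at the prior month-end: cumulative growth from
# month t-13 through t-2 (one common convention among several).
growth = (1 + eq).cumprod()
mom_12_1 = growth.shift(2) / growth.shift(13) - 1

# Long/flat rule: hold equity on a positive trend, T-bills otherwise.
trend_equity = eq.where(mom_12_1 > 0, tb).iloc[13:]

print("annualized vol (equity):      ", round(eq.iloc[13:].std() * 12 ** 0.5, 3))
print("annualized vol (trend equity):", round(trend_equity.std() * 12 ** 0.5, 3))
```

Even on random data, the long/flat rule spends part of the time in the low-volatility asset, which is the mechanical source of the volatility reduction reported in Figure 3.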

Implementing Trend Equity

With trend equity seeking to benefit from both the long-term equity risk premium and the convex payoff structure of trend-following, there are two obvious examples of how it can be implemented in the context of an existing strategic portfolio. The preference as to the approach taken will depend upon an investor’s goals.

Investors seeking to reduce risk in their portfolio may prefer to think of trend equity as a form of dynamically hedged equity, replacing a portion of their traditional equity exposure.  In this case, when trend equity is fully invested, the portfolio will match the original allocation profile; when the trend equity strategy is divested, the portfolio will be significantly underweight equity exposure.  The intent of this approach is to match the long-term return profile of equities with less realized risk.

On the other hand, investors seeking to increase their returns may prefer to treat trend equity as a pivot within their portfolio, funding the allocation by drawing upon both traditional stock and bond exposures.  In this case, when fully invested, trend equity will create an overweight to equity exposure within the portfolio; when divested, it will create an underweight.  The intent of this approach is to match the long-term realized risk profile of a blended stock/bond mix while enhancing long-term returns.

To explore these two options in the context of an investor’s lifecycle, we echo the work of Freccia, Rauseo, and Villalon (2017).  Specifically, we will begin with a naïve “own-your-age” glide path, which allocates a proportion of capital to bonds equivalent to the investor’s age.  We assume the split between domestic and international exposures is 60/40 and 70/30 respectively for stocks and bonds, selected to approximate the split between domestic and international exposures found in Vanguard’s Target Retirement Funds.

An investor seeking to reduce exposure to negative equity tail events could fund trend equity exposure entirely from their traditional equity allocation. Applying the own-your-age glide path over the horizon of June 1988 to June 2018, carving out 30% of U.S. equity exposure for trend equity (e.g. an 11.7% allocation for a 35-year-old investor and an 8.1% allocation for a 55-year-old investor) would have offered the same long-term return profile while reducing annualized volatility and the maximum drawdown experienced.

For an investor seeking to increase return, funding a position in trend equity from both U.S. equities and U.S. bonds may be a more applicable approach.  Again, applying the own-your-age glide-path from June 1988 to June 2018, we find that replacing 50% of existing U.S. equity exposure and 30% of existing U.S. bond exposure with trend equity would have offered a nearly identical long-term volatility profile while increasing long-term annualized returns.
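The carve-out percentages quoted above follow directly from the own-your-age arithmetic; a minimal sketch follows (the function names are ours, not Newfound's).

```python
def us_equity_weight(age, domestic_split=0.60):
    """Own-your-age glide path: bonds = age%, stocks = (100 - age)%,
    with 60% of the stock sleeve allocated domestically."""
    return (100 - age) / 100 * domestic_split

def trend_equity_carveout(age, carve=0.30):
    """Fund trend equity from 30% of the U.S. equity sleeve."""
    return us_equity_weight(age) * carve

# A 35-year-old holds 65% stocks, of which 39% is U.S. equity;
# a 30% carve-out of that sleeve is 11.7% of the total portfolio.
print(round(100 * trend_equity_carveout(35), 1))  # 11.7
print(round(100 * trend_equity_carveout(55), 1))  # 8.1
```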

Figure 4

Source: Newfound Research.  For illustrative purposes only and not representative of any Newfound Research product or investment.


Figure 5: Hypothetical Portfolio Statistics, June 1988 – June 2018

                 | Original Glidepath | Same Return, Decrease Risk | Increase Return, Same Risk
Annual Return    | 8.20%              | 8.25%                      | 8.60%
Volatility       | 8.58%              | 8.17%                      | 8.59%
Maximum Drawdown | -28.55%            | -24.71%                    | -23.80%
Sharpe Ratio     | 0.61               | 0.64                       | 0.65

 Source: Newfound Research.  Return data relies on hypothetical indices and is exclusive of all fees and expenses.  Returns assume the reinvestment of all dividends.  Trend Equity invests in U.S. Large-Cap Equity when the prior month has a positive 12-1 month total return and in 3-month U.S. Treasury Bills otherwise.  The Trend Equity strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary.  It is not possible to invest in an index.  Past performance does not guarantee future results.


Figure 6: Own-Your-Age Glide Paths Including Trend Equity

Source: Newfound Research.  For illustrative purposes only and not representative of any Newfound Research product or investment.  Allocation methodologies described in the preceding section.

A Discussion of Trade-Offs

At Newfound Research, we champion the philosophy that “risk cannot be destroyed, only transformed.”  While we believe that a convex payoff structure – like that empirically found in trend-following strategies – can introduce beneficial diversification into traditionally allocated portfolios, we believe any overview is incomplete without a discussion of the potential trade-offs of such an approach.

The perceived trade-offs will be largely dictated by how trend equity is implemented by an investor.  As in the last section, we will consider two cases: first the investor who replaces their traditional equity exposure, and second the investor that funds an allocation from both stocks and bonds.

In the first case, we believe that the convex payoff example displayed in Figure 1 is important to keep in mind.  Traditionally, convex payoffs tend to pay a premium during stable environments.  When this payoff structure is combined with traditional long-only equity exposure to create a trend equity strategy, our expectation should be a return profile that lags behind traditional equity returns during calm market environments.

This is evident in Figure 7, which plots hypothetical rolling 3-year annualized returns for both large-cap U.S. equities and a hypothetical trend equity strategy. Figure 8 also demonstrates this effect, plotting rolling 1-year returns of a hypothetical trend equity strategy against large-cap U.S. equities, highlighting in orange those years when trend equity underperformed.

For the investor looking to employ trend equity as a means of enhancing return by funding exposure from both stocks and bonds, long-term risk statistics may be misleading.  It is important to keep in mind that at any given time, trend equity can be fully invested in equity exposure.  While evidence suggests that trend-following strategies may be able to act as an efficient hedge when market downturns are gradual, they are typically inefficient when prices collapse suddenly.

In both cases, it is important to keep in mind that the convex payoff premium associated with trend equity strategies is not consistent, nor is the payoff guaranteed. In practice, the premium arises from losses that arrive during periods of trend reversals, an effect popularly referred to as “whipsaw.”  A trend equity strategy may go several years without experiencing whipsaw, seemingly avoiding paying any premium, then suddenly experience several whipsaw events back-to-back.  Investors who allocate immediately before a series of whipsaw events may be dismayed, but we believe that these are the costs necessary to access the convex payoff opportunity and should be considered on a multi-year, annualized basis.

Finally, it is important to remember that trend following is an active strategy: beyond management fees, investors should also consider the impact of transaction costs and taxes.

Figure 7

Source: Newfound Research.  Return data relies on hypothetical indices and is exclusive of all fees and expenses. Returns assume the reinvestment of all dividends.  Trend Equity invests in U.S. Large-Cap Equity when the prior month has a positive 12-1 month total return and in 3-month U.S. Treasury Bills otherwise.  The Trend Equity strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary.  It is not possible to invest in an index.  Past performance does not guarantee future results.

Figure 8

Source: Newfound Research.  Return data relies on hypothetical indices and is exclusive of all fees and expenses. Returns assume the reinvestment of all dividends.  Trend Equity invests in U.S. Large-Cap Equity when the prior month has a positive 12-1 month total return and in 3-month U.S. Treasury Bills otherwise.   The Trend Equity strategy does not reflect any strategy offered or managed by Newfound Research and was constructed exclusively for the purposes of this commentary.  It is not possible to invest in an index.  Past performance does not guarantee future results.

Conclusion

In this primer, we have introduced trend equity, an active strategy that seeks to provide investors with exposure to the equity risk premium while mitigating the impacts of severe and prolonged drawdowns.  The strategy aims to achieve this objective by blending exposure to equities with the convex payoff structure traditionally exhibited by trend-following strategies.

We believe that such a strategy can be a particularly useful diversifier for most strategically allocated portfolios, which tend to be exposed to the concave payoff profile of traditional risk factors.  While relying upon correlation may be sufficient in normal market environments, we believe that the potential premiums collected can be insufficient to offset large losses generated during bad times.  It is during these occasions that we believe a convex payoff structure, like that empirically found in trend equity, can be a particularly useful diversifier.

We explored two ways in which investors can incorporate trend equity into a traditional profile depending upon their objective.  Investors looking to reduce realized risk without necessarily sacrificing long-term return can fund their trend equity exposure with their traditional equity allocation.  Investors looking to enhance returns while maintaining the same realized risk profile may be better off funding exposure from both traditional stock and bond allocations.

Finally, we discussed the trade-offs associated with incorporating trend equity into an investor’s portfolio, including (1) the lumpy and potentially large nature of whipsaw events, (2) the inability to hedge against sudden losses, and (3) the costs associated with managing an active strategy.  Despite these potential drawbacks, we believe that trend-following equity can be a potentially useful diversifier in most traditionally allocated portfolios.


Trade Optimization

Trade optimization is a more technical topic than we usually cover in our published research.  Therefore, this note relies heavily on mathematical notation and assumes readers have a basic understanding of optimization.  Accompanying the commentary is code written in Python, meant to provide concrete examples of how these ideas can be implemented.  The Python code leverages the PuLP optimization library.

Readers not proficient in these areas may still benefit from reading the Introduction and evaluating the example outlined in Section 5.

Summary

  • In practice, portfolio managers must account for the real-world implementation costs associated with trading portfolios, both explicit (e.g. commission) and implicit (e.g. bid/ask spread and market impact).
  • Managers often implement trade paring constraints that may limit the number of allowed securities, the number of executed trades, the size of a trade, or the size of holdings. These constraints can turn a well-formed convex optimization into a discrete problem.
  • In this note, we explore how to formulate trade optimization as a Mixed-Integer Linear Programming (“MILP”) problem and implement an example in Python.

0. Initialize Python Libraries

import pandas
import numpy

from pulp import *

import scipy.optimize

1. Introduction

In the context of portfolio construction, trade optimization is the process of managing the transactions necessary to move from one set of portfolio weights to another. These optimizations can play an important role both in the cases of rebalancing as well as in the case of a cash infusion or withdrawal.  The reason for controlling these trades is to try to minimize the explicit (e.g. commission) and implicit (e.g. bid/ask spread and impact) costs associated with trading.

Two approaches are often taken to trade optimization:

  1. Trading costs and constraints are explicitly considered within portfolio construction. For example, a portfolio optimization that seeks to maximize exposure to some alpha source may incorporate explicit measures of transaction costs or constrain the number of trades that are allowed to occur at any given rebalance.
  2. Portfolio construction and trade optimization occur in a two step process. For example, a portfolio optimization may take place that creates the “ideal” portfolio, ignoring consideration of trading constraints and costs. Trade optimization would then occur as a second step, seeking to identify the trades that would move the current portfolio “as close as possible” to the target portfolio while minimizing costs or respecting trade constraints.

These two approaches will not necessarily arrive at the same result. At Newfound, we prefer the latter approach, as we believe it creates more transparency in portfolio construction. Embedding trade optimization within portfolio optimization can also introduce complicated constraints that risk making the optimization infeasible.  Furthermore, the separation of portfolio optimization and trade optimization allows us to target the same model portfolio across all strategy implementations, but vary when and how different portfolios trade depending upon account size and costs.

For example, a highly tactical strategy implemented as a pooled vehicle with a large asset base and penny-per-share commissions can likely afford to execute a much higher number of trades than an investor trying to implement the same strategy with $250,000 and $7.99 ticket charges. While implicit and explicit trading costs will create a fixed drag upon strategy returns, failing to implement each trade as dictated by a hypothetical model will create tracking error.

Ultimately, the goal is to minimize the fixed costs while staying within an acceptable distance (e.g. turnover distance or tracking error) of our target portfolio. Often, this goal is expressed by a portfolio manager with a number of semi-ad-hoc constraints or optimization targets. For example:

  • Asset Paring. A constraint that specifies the minimum or maximum number of securities that can be held by the portfolio.
  • Trade Paring. A constraint that specifies the minimum or maximum number of trades that may be executed.
  • Level Paring. A constraint that establishes a minimum level threshold for securities (e.g. securities must be at least 1% of the portfolio) or trades (e.g. all trades must be larger than 0.5%).

Unfortunately, these constraints often turn the portfolio optimization problem from continuous to discrete, which makes the process of optimization more difficult.

2. The Discreteness Problem

Consider the following simplified scenario. Given our current, drifted portfolio weights w_{old} and a new set of target model weights w_{target}, we want to minimize the number of trades we need to execute to bring our portfolio within some acceptable turnover threshold level, \theta. We can define this as the optimization problem:

\begin{aligned} & \text{minimize} & & \sum\limits_{i} 1_{|t_i| > 0} \\ & \text{subject to} & & \sum\limits_{i} |w_{target, i} - (w_{old, i} + t_i)| \le 2 * \theta \\ & & & \sum\limits_{i} t_i = 0 \\ & \text{and} & & t_i \ge -w_{old,i} \end{aligned}
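Before turning to solvers, it may help to evaluate these two quantities directly. Below is a minimal sketch (with hypothetical three-asset weights) of the objective being minimized and the turnover distance being constrained:

```python
import numpy as np

def trade_count(t, tol=1e-8):
    # the objective: number of non-zero trades, i.e. the sum of 1_{|t_i| > 0}
    return int(np.sum(np.abs(t) > tol))

def turnover_distance(w_target, w_old, t):
    # the constraint quantity: L1 distance to the target after trading, halved
    return np.sum(np.abs(w_target - (w_old + t))) / 2.0

# hypothetical three-asset portfolio
w_old = np.array([0.25, 0.50, 0.25])
w_target = np.array([0.40, 0.40, 0.20])

t = np.array([0.15, -0.15, 0.0])  # trade only two assets; trades sum to zero
print(trade_count(t))                         # 2
print(turnover_distance(w_target, w_old, t))  # ~0.05: just within theta = 0.05
```

The optimizer's job is to find the `t` with the smallest `trade_count` whose `turnover_distance` stays below the threshold.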

Unfortunately, as we will see below, simply trying to throw this problem into an off-the-shelf convex optimizer, as is, will lead to some potentially odd results. And we have not even introduced any complex paring constraints!

2.1 Example Data

# setup some sample data
tickers = "amj bkln bwx cwb emlc hyg idv lqd \
           pbp pcy pff rem shy tlt vnq vnqi vym".split()

w_target = pandas.Series([float(x) for x in "0.04095391 0.206519656 0 \
                      0.061190655 0.049414401 0.105442705 0.038080766 \
                      0.07004622 0.045115708 0.08508047 0.115974239 \
                      0.076953702 0 0.005797291 0.008955226 0.050530852 \
                      0.0399442".split()], index = tickers)

w_old = pandas.Series([float(x) for x in \
                   "0.058788745 0.25 0 0.098132817 \
                    0 0.134293993 0.06144967 0.102295438 \
                    0.074200473 0 0 0.118318536 0 0 \
                    0.04774768 0 0.054772649".split()], \
                      index = tickers)

n = len(tickers)

w_diff = w_target - w_old

2.2 Applying a Naive Convex Optimizer

The example below demonstrates the numerical issues associated with attempting to solve discrete problems with traditional convex optimizers.  Using the portfolio and target weights established above, we run a naive optimization that seeks to minimize the number of trades necessary to bring our holdings within a 5% turnover threshold from the target weights.

# Try a naive optimization with SLSQP

theta = 0.05
theta_hat = theta + w_diff.abs().sum() / 2.

def _fmin(t):
    return numpy.sum(numpy.abs(t) > 1e-8)

def _distance_constraint(t):
    return theta_hat - numpy.sum(numpy.abs(t)) / 2.

def _sums_to_zero(t):
    # trades must sum to zero (self-financing)
    return numpy.sum(t)

t0 = w_diff.copy()

bounds = [(-w_old[i], 1) for i in range(0, n)]

result = scipy.optimize.fmin_slsqp(_fmin, t0, bounds = bounds, \
                                   eqcons = [_sums_to_zero], \
                                   ieqcons = [_distance_constraint], \
                                   disp = -1)

result =  pandas.Series(result, index = tickers)

Note that the trades we received are simply w_{target} - w_{old}, which was our initial guess for the optimization.  The optimizer didn’t optimize.

What’s going on? Well, many off-the-shelf optimizers – such as the Sequential Least Squares Programming (SLSQP) approach applied here – will attempt to solve this problem by first estimating the gradient of the problem to decide which direction to move in search of the optimal solution. To achieve this numerically, small perturbations are made to the input vector and their influence on the resulting output is calculated.

In this case, small changes are unlikely to create an influence in the problem we are trying to minimize. Whether the trade is 5% or 5.0001% will have no influence on the *number* of trades executed. So the first derivative will appear to be zero and the optimizer will exit.
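This flat-gradient behavior is easy to verify numerically. The sketch below (using a hypothetical trade vector) shows that a finite-difference gradient estimate of the trade-count objective is exactly zero in every direction:

```python
import numpy as np

def n_trades(t, tol=1e-8):
    # the discrete objective: count of non-zero trades
    return np.sum(np.abs(t) > tol)

t = np.array([0.05, -0.03, -0.02])  # hypothetical trade vector
eps = 1e-6

# perturbing any coordinate by eps leaves the trade count unchanged,
# so every finite-difference gradient estimate is exactly zero
grads = [float((n_trades(t + eps * e) - n_trades(t)) / eps) for e in np.eye(3)]
print(grads)  # [0.0, 0.0, 0.0]
```

With a zero gradient everywhere (except exactly at the discontinuities), a gradient-based optimizer has no information about which direction improves the objective and simply stops.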

Fortunately, with a bit of elbow grease, we can turn this problem into a mixed integer linear programming (“MILP”) problem, a class that comes with its own set of efficient optimization tools (in this article, we will use the PuLP library for the Python programming language). A MILP is a category of optimization problems that take the standard form:

\begin{aligned} & \text{minimize} & & c^{T}x + h^{T}y \\ & \text{subject to} & & Ax + Gy \le b \\ & \text{and} & & x \in \mathbb{Z}^{n} \end{aligned}

Here b is a vector and A and G are matrices. Don’t worry too much about the form.

The important takeaway is that we need: (1) to express our minimization problem as a linear function and (2) express our constraints as a set of linear inequalities.

But first, for us to take advantage of linear programming tools, we need to eliminate our absolute values and indicator functions and somehow transform them into linear constraints.

3. Linear Programming Transformation Techniques

3.1 Absolute Values

Consider an optimization of the form:

\begin{aligned} & \text{minimize} & & \sum\limits_{i} |x_i| \\ & \text{subject to} & & ... \end{aligned}

To get rid of the absolute value function, we can rewrite the function as a minimization of a new variable, \psi.

\begin{aligned} & \text{minimize} & & \sum\limits_{i} \psi_i \\ & \text{subject to} & & \psi_i \ge x_i \\ & & & \psi_i \ge -x_i \\ & \text{and} & & ... \end{aligned}

The combination of constraints makes it such that \psi_i \ge |x_i|. When x_i is positive, \psi_i is bound by the first constraint; when x_i is negative, by the second. Since the optimization seeks to minimize the sum of each \psi_i, and we know \psi_i will be non-negative, the optimizer will reduce \psi_i until it equals |x_i|, its minimum possible value.

Below is an example of this trick in action. Our goal is to minimize the absolute value of some variables x_i. We apply bounds on each x_i to allow the problem to converge on a solution.

lp_problem = LpProblem("Absolute Values", LpMinimize)

x_vars = []
psi_vars = []

bounds = [[1, 7], [-10, 0], [-9, -1], [-1, 5], [6, 9]]

print("Bounds for x: ")
print(pandas.DataFrame(bounds, columns = ["Left", "Right"]))

for i in range(5):
    x_i = LpVariable("x_" + str(i), None, None)
    x_vars.append(x_i)
    
    psi_i = LpVariable("psi_" + str(i), None, None)
    psi_vars.append(psi_i)
    
lp_problem += lpSum(psi_vars), "Objective"

for i in range(5):
    lp_problem += psi_vars[i] >= -x_vars[i]
    lp_problem += psi_vars[i] >= x_vars[i]
    
    lp_problem += x_vars[i] >= bounds[i][0]
    lp_problem += x_vars[i] <= bounds[i][1]
    
lp_problem.solve()

print("\nx variables")
print(pandas.Series([x_i.value() for x_i in x_vars]))

print("\npsi Variables (|x|):")
print(pandas.Series([psi_i.value() for psi_i in psi_vars]))
Bounds for x: 
   Left  Right
0     1      7
1   -10      0
2    -9     -1
3    -1      5
4     6      9

x variables
0    1.0
1    0.0
2   -1.0
3    0.0
4    6.0
dtype: float64

psi Variables (|x|):
0    1.0
1    0.0
2    1.0
3    0.0
4    6.0
dtype: float64

3.2 Indicator Functions

Consider an optimization problem of the form:

\begin{aligned} & \text{minimize} & & \sum\limits_{i} 1_{x_i > 0} \\ & \text{subject to} & & ... \end{aligned}

We can re-write this problem by introducing a new variable, y_i, and adding a set of linear constraints:

\begin{aligned} & \text{minimize} & & \sum\limits_{i} y_i \\ & \text{subject to} & & x_i \le A*y_i\\ & & & y_i \ge 0 \\& & & y_i \le 1 \\ & & & y_i \in \mathbb{Z} \\ & \text{and} & & ... \end{aligned}

Note that the last three constraints, when taken together, tell us that y_i \in \{0, 1\}. The constant A should be a large value, bigger than any value x_i can take. Let’s assume A = max(x) + 1.

Let’s first consider what happens when x_i \le 0. In such a case, y_i can be set to zero without violating any constraints. When x_i is positive, however, for x_i \le A*y_i to be true, it must be the case that y_i = 1.

What prevents y_i from equalling 1 in the case where x_i \le 0 is the goal of minimizing the sum of y_i, which will force y_i to be 0 whenever possible.

Below is a sample problem demonstrating this trick, similar to the example described in the prior section.

lp_problem = LpProblem("Indicator Function", LpMinimize)

x_vars = []
y_vars = []

bounds = [[-4, 1], [-3, 5], [-6, 1], [1, 7], [-5, 9]]

A = 11    

print("Bounds for x: ")
print(pandas.DataFrame(bounds, columns = ["Left", "Right"]))

for i in range(5):
    x_i = LpVariable("x_" + str(i), None, None)
    x_vars.append(x_i)
    
    y_i = LpVariable("ind_" + str(i), 0, 1, LpInteger)
    y_vars.append(y_i)
    
lp_problem += lpSum(y_vars), "Objective"

for i in range(5):
    lp_problem += x_vars[i] >= bounds[i][0]
    lp_problem += x_vars[i] <= bounds[i][1]
    
    lp_problem += x_vars[i] <= A * y_vars[i]
    
lp_problem.solve()

print("\nx variables")
print(pandas.Series([x_i.value() for x_i in x_vars]))

print("\ny Variables (Indicator):")
print(pandas.Series([y_i.value() for y_i in y_vars]))
Bounds for x: 
   Left  Right
0    -4      1
1    -3      5
2    -6      1
3     1      7
4    -5      9

x variables
0   -4.0
1   -3.0
2   -6.0
3    1.0
4   -5.0
dtype: float64

y Variables (Indicator):
0    0.0
1    0.0
2    0.0
3    1.0
4    0.0
dtype: float64

3.3 Tying the Tricks Together

A problem arises when we try to tie these two tricks together, as both tricks rely upon the minimization function itself. The \psi_i are dragged to the absolute value of x_i because we minimize over them. Similarly, y_i is dragged to zero when the indicator should be off because we are minimizing over it.

What happens, however, if we want to solve a problem of the form:

\begin{aligned} & \text{minimize} & & \sum\limits_{i} 1_{|x_i| > 0} \\ & \text{subject to} & & ... \end{aligned}

One way of trying to solve this problem is by using our tricks and then combining the objectives into a single sum.

\begin{aligned} & \text{minimize} & & \sum\limits_{i} (y_i + \psi_i) \\ & \text{subject to} & & \psi_i \ge x_i \\ & & & \psi_i \ge -x_i \\ & & & x_i \le A*y_i\\ & & & y_i \ge 0 \\ & & & y_i \le 1 \\ & & & y_i \in \mathbb{Z} \\ & \text{and} & & ... \end{aligned}

By minimizing over the sum of both variables, \psi_i is forced towards |x_i| and y_i is forced to zero when \psi_i = 0.

Below is an example demonstrating this solution, again similar to the examples discussed in prior sections.

lp_problem = LpProblem("Absolute Values", LpMinimize)

x_vars = []
psi_vars = []
y_vars = []

bounds = [[-7, 3], [7, 8], [5, 9], [1, 4], [-6, 2]]

A = 11    

print("Bounds for x: ")
print(pandas.DataFrame(bounds, columns = ["Left", "Right"]))

for i in range(5):
    x_i = LpVariable("x_" + str(i), None, None)
    x_vars.append(x_i)
    
    psi_i = LpVariable("psi_" + str(i), None, None)
    psi_vars.append(psi_i)
    
    y_i = LpVariable("ind_" + str(i), 0, 1, LpInteger)
    y_vars.append(y_i)
    
    
lp_problem += lpSum(y_vars) + lpSum(psi_vars), "Objective"

for i in range(5):
    lp_problem += x_vars[i] >= bounds[i][0]
    lp_problem += x_vars[i] <= bounds[i][1]
    
for i in range(5):
    lp_problem += psi_vars[i] >= -x_vars[i]
    lp_problem += psi_vars[i] >= x_vars[i]
    
    lp_problem += psi_vars[i] <= A * y_vars[i]
    
lp_problem.solve()

print("\nx variables")
print(pandas.Series([x_i.value() for x_i in x_vars]))

print("\npsi Variables (|x|):")
print(pandas.Series([psi_i.value() for psi_i in psi_vars]))

print("\ny Variables (Indicator):")
print(pandas.Series([y_i.value() for y_i in y_vars]))
Bounds for x: 
   Left  Right
0    -7      3
1     7      8
2     5      9
3     1      4
4    -6      2

x variables
0    0.0
1    7.0
2    5.0
3    1.0
4    0.0
dtype: float64

psi Variables (|x|):
0    0.0
1    7.0
2    5.0
3    1.0
4    0.0
dtype: float64

y Variables (Indicator):
0    0.0
1    1.0
2    1.0
3    1.0
4    0.0
dtype: float64

4. Building a Trade Minimization Model

Returning to our original problem,

\begin{aligned} & \text{minimize} & & \sum\limits_{i} 1_{|t_i| > 0} \\ & \text{subject to} & & \sum\limits_{i} |w_{target, i} - (w_{old, i} + t_i)| \le 2 * \theta \\ & & & \sum\limits_{i} t_i = 0 \\ & \text{and} & & t_i \ge -w_{old,i} \end{aligned}

We can now use the tricks we have established above to re-write this problem as:

\begin{aligned} & \text{minimize} & & \sum\limits_{i} (\phi_i + \psi_i + y_i) \\ & \text{subject to} & & \psi_i \ge t_i \\ & & & \psi_i \ge -t_i \\ & & & \psi_i \le A*y_i \\ & & & \phi_i \ge (w_{target,i} - (w_{old,i} + t_i))\\ & & & \phi_i \ge -(w_{target,i} - (w_{old,i} + t_i)) \\ & & & \sum\limits_{i} \phi_i \le 2 * \theta \\ & & & \sum\limits_{i} t_i = 0 \\ & \text{and} & & t_i \ge -w_{old,i} \end{aligned}

While there are a large number of constraints present, in reality there are just a few key steps going on. First, our key variable in question is t_i. We then use our absolute value trick to create \psi_i = |t_i|. Next, we use the indicator function trick to create y_i, which tells us whether each position is traded or not. Ultimately, this is the variable we are trying to minimize.

Next, we have to deal with our turnover constraint. Again, we invoke the absolute value trick to create \phi_i, and restate our turnover constraint as a sum of \phi’s.

Et voila?

As it turns out, not quite.

Consider a simple two-asset portfolio. The current weights are [0.25, 0.75] and we want to get these weights within 0.05 of [0.5, 0.5] (using the L^1 norm – i.e. the sum of absolute values – as our definition of “distance”).

Let’s consider the solution [0.475, 0.525]. At this point, \phi = [0.025, 0.025] and \psi = [0.225, 0.225]. Is this solution “better” than [0.5, 0.5]? At [0.5, 0.5], \phi = [0.0, 0.0] and \psi = [0.25, 0.25]. From the optimizer’s viewpoint, these are equivalent solutions. Within this region, there are an infinite number of possible solutions.
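We can verify this degeneracy directly. Using the two-asset example above, both candidate trade vectors produce an identical combined objective value:

```python
import numpy as np

w_old = np.array([0.25, 0.75])
w_target = np.array([0.50, 0.50])

def combined_objective(t):
    psi = np.abs(t)                       # psi_i = |t_i|
    phi = np.abs(w_target - (w_old + t))  # phi_i = distance to target
    y = (psi > 0).astype(float)           # y_i = trade indicator
    return psi.sum() + phi.sum() + y.sum()

# both candidates score identically (~2.5): any reduction in phi is
# exactly offset by an increase in psi, so the optimizer is indifferent
print(combined_objective(np.array([0.225, -0.225])))
print(combined_objective(np.array([0.250, -0.250])))
```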

Yet if we are willing to let some of our tricks “fail,” we can find a solution. If we want to get as close as possible, we effectively want to minimize the sum of \psi’s. The infinite solutions problem arises when we simultaneously try to minimize the sum of \psi’s and \phi’s, which offset each other.

Do we actually need the values of \psi to be correct? As it turns out: no. All we really need is that \psi_i is positive when t_i is non-zero, which will then force y_i to be 1. By minimizing on y_i, \psi_i will still be forced to 0 when t_i = 0.

So if we simply remove \psi_i from the minimization, we will end up reducing the number of trades as far as possible and then reducing the distance to the target model as much as possible given that trade level.

\begin{aligned} & \text{minimize} & & \sum\limits_{i} (\phi_i + y_i) \\ & \text{subject to} & & \psi_i \ge t_i \\ & & & \psi_i \ge -t_i \\ & & & \psi_i \le A*y_i \\ & & & \phi_i \ge (w_{target,i} - (w_{old,i} + t_i))\\ & & & \phi_i \ge -(w_{target,i} - (w_{old,i} + t_i)) \\ & & & \sum\limits_{i} \phi_i \le 2 * \theta \\ & & & \sum\limits_{i} t_i = 0 \\ & \text{and} & & t_i \ge -w_{old,i} \end{aligned}

As a side note, because the sum of \phi’s will at most equal 2 and the sum of y’s can equal the number of assets in the portfolio, the optimizer will get more minimization bang for its buck by focusing on reducing the number of trades first before reducing the distance to the target model. This priority can be adjusted by multiplying \phi_i by a sufficiently large scalar in our objective.

theta = 0.05

trading_model = LpProblem("Trade Minimization Problem", LpMinimize)

t_vars = []
psi_vars = []
phi_vars = []
y_vars = []

A = 2
    
for i in range(n):
    t = LpVariable("t_" + str(i), -w_old[i], 1 - w_old[i]) 
    t_vars.append(t)
    
    psi = LpVariable("psi_" + str(i), None, None)
    psi_vars.append(psi)

    phi = LpVariable("phi_" + str(i), None, None)
    phi_vars.append(phi)
    
    y = LpVariable("y_" + str(i), 0, 1, LpInteger) #set y in {0, 1}
    y_vars.append(y)

    
# objective: minimize the number of trades (y) plus the turnover distance (phi)
trading_model += lpSum(phi_vars) + lpSum(y_vars), "Objective"
            
for i in range(n):
    trading_model += psi_vars[i] >= -t_vars[i]
    trading_model += psi_vars[i] >= t_vars[i]
    trading_model += psi_vars[i] <= A * y_vars[i]
    
for i in range(n):
    trading_model += phi_vars[i] >= -(w_diff[i] - t_vars[i])
    trading_model += phi_vars[i] >= (w_diff[i] - t_vars[i])
    
# Make sure our trades sum to zero
trading_model += (lpSum(t_vars) == 0)

# Set our trade bounds
trading_model += (lpSum(phi_vars) / 2. <= theta)

trading_model.solve()

results = pandas.Series([t_i.value() for t_i in t_vars], index = tickers)

print "Number of trades: " + str(sum([y_i.value() for y_i in y_vars]))

print "Turnover distance: " + str((w_target - (w_old + results)).abs().sum() / 2.)
Number of trades: 12.0
Turnover distance: 0.032663284500000014

5. A Sector Rotation Example

As an example of applying trade paring, we construct a sample sector rotation strategy.  The investment universe consists of the nine Select Sector ETFs (XLB, XLE, XLF, XLI, XLK, XLP, XLU, XLV, and XLY).  The sectors are ranked by their 12-1 month total returns and the portfolio holds the four top-ranking ETFs in equal weight.  To reduce timing luck, we apply a four-week tranching process.
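The ranking rule can be sketched as follows, using randomly generated prices as stand-in data for the actual ETF total-return series and omitting the four-week tranching step:

```python
import numpy as np
import pandas as pd

# randomly generated monthly "total return index" levels stand in
# for actual sector ETF data
np.random.seed(0)
tickers = ["XLB", "XLE", "XLF", "XLI", "XLK", "XLP", "XLU", "XLV", "XLY"]
prices = pd.DataFrame(
    100 * np.cumprod(1 + np.random.normal(0.005, 0.04, (14, 9)), axis=0),
    columns=tickers,
)

# 12-1 month total return: from 12 months ago through 1 month ago
momentum = prices.iloc[-2] / prices.iloc[-13] - 1

# hold the four top-ranked sectors in equal weight
weights = pd.Series(0.25, index=momentum.nlargest(4).index)
print(weights)
```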

We construct three versions of the strategy.

  • Naive: A version which rebalances back to hypothetical model weights on a weekly basis.
  • Filtered: A version that rebalances back to hypothetical model weights when drifted portfolio weights exceed a 5% turnover distance from target weights.
  • Trade Pared: A version that applies trade paring to rebalance back to within a 1% turnover distance from target weights when drifted weights exceed a 5% turnover distance from target weights.

The equity curves and per-year trade counts are plotted for each version below.  Note that the equity curves do not account for any implicit or explicit trading costs.

Data Source: CSI. Calculations by Newfound Research. Past performance does not guarantee future results.  All returns are hypothetical index returns. You cannot invest directly in an index and unmanaged index returns do not reflect any fees, expenses, sales charges, or trading expenses.  Index returns include the reinvestment of dividends.  No index is meant to measure any strategy that is or ever has been managed by Newfound Research.   The indices were constructed by Newfound in August 2018 for purposes of this analysis and are therefore entirely backtested and not investment strategies that are currently managed and offered by Newfound.

For the reporting period covering full years (2001 – 2017), the trade filtering process alone reduced the average number of annual trades by 40.6% (from 255.7 to 151.7).  The added trade paring process reduced the number of trades another 50.9% (from 151.7 to 74.5), for a total reduction of 70.9%.

6. Possible Extensions & Limitations

There are a number of extensions that can be made to this model, including:

  • Accounting for trading costs. Instead of minimizing the number of trades, we could minimize the total cost of trading by multiplying each trade against an estimate of cost (including bid/ask spread, commission, and impact).
  • Forcing accuracy. There may be positions for which greater drift can be permitted and others where drift is less desirable. This can be achieved by adding specific constraints to our \phi_i variables.

Unfortunately, there are also a number of limitations. The first set arises because we are formulating our optimization as a linear program. This means that quadratic constraints or objectives, such as tracking error constraints, are forbidden. The second set is due to the complexity of the optimization problem itself. While the problem may be technically solvable, instances containing a large number of securities and constraints may not be solvable in a practical amount of time.

6.1 Non-Linear Constraints

In the former case, we can choose to move to a mixed integer quadratic programming framework. Or, we can also employ multi-step heuristic methods to find feasible, though potentially non-optimal, solutions.

For example, consider the case where we wish our optimized portfolio to fall within a certain tracking error constraint of our target portfolio. Prior to optimization, the marginal contribution to tracking error can be calculated for each asset and the total current tracking error can be calculated. A constraint can then be added such that the current tracking error minus the sum of weighted marginal contributions must be less than the tracking error target. After the optimization is complete, we can determine whether our solution meets the tracking error constraint.

If it does not, we can use our solution as our new w_{old}, re-calculate our tracking error and marginal contribution figures, and re-optimize. This iterative process approximates a gradient descent.
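The linearization step can be sketched with a hypothetical two-asset covariance matrix. Because tracking error is homogeneous of degree one in the active weights, the first-order approximation is exact along a ray from the current solution and close for nearby portfolios:

```python
import numpy as np

# hypothetical two-asset covariance matrix
sigma = np.array([[0.04, 0.01],
                  [0.01, 0.02]])

def tracking_error(d):
    # d: active weights (current portfolio minus target portfolio)
    return np.sqrt(d @ sigma @ d)

d = np.array([0.10, -0.10])
te = tracking_error(d)

# marginal contribution to tracking error: the gradient of TE at d
mcte = sigma @ d / te

# linearized tracking error at a nearby candidate portfolio; this linear
# expression is what can be constrained inside the MILP
d_new = np.array([0.08, -0.08])
te_linear = te + mcte @ (d_new - d)

print(tracking_error(d_new), te_linear)  # the approximation is exact here
```

After solving with the linear constraint, the true tracking error of the solution is re-checked, and the process repeats if the constraint is violated.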

In the example below, we introduce a covariance matrix and seek to target a solution whose tracking error is less than 0.25%.

covariance_matrix = [[ 3.62767735e-02,  2.18757921e-03,  2.88389154e-05,
         7.34489308e-03,  1.96701876e-03,  4.42465667e-03,
         1.12579361e-02,  1.65860525e-03,  5.64030644e-03,
         2.76645571e-03,  3.63015800e-04,  3.74241173e-03,
        -1.35199744e-04, -2.19000672e-03,  6.80914121e-03,
         8.41701096e-03,  1.07504229e-02],
       [ 2.18757921e-03,  5.40346050e-04,  5.52196510e-04,
         9.03853792e-04,  1.26047511e-03,  6.54178355e-04,
         1.72005989e-03,  3.60920296e-04,  4.32241813e-04,
         6.55664695e-04,  1.60990263e-04,  6.64729334e-04,
        -1.34505970e-05, -3.61651337e-04,  6.56663689e-04,
         1.55184724e-03,  1.06451898e-03],
       [ 2.88389154e-05,  5.52196510e-04,  4.73857357e-03,
         1.55701811e-03,  6.22138578e-03,  8.13498400e-04,
         3.36654245e-03,  1.54941008e-03,  6.19861236e-05,
         2.93028853e-03,  8.70115005e-04,  4.90113403e-04,
         1.22200026e-04,  2.34074752e-03,  1.39606650e-03,
         5.31970717e-03,  8.86435533e-04],
       [ 7.34489308e-03,  9.03853792e-04,  1.55701811e-03,
         4.70643696e-03,  2.36059044e-03,  1.45119740e-03,
         4.46141908e-03,  8.06488179e-04,  2.09341490e-03,
         1.54107719e-03,  6.99000273e-04,  1.31596059e-03,
        -2.52039718e-05, -5.18390335e-04,  2.41334278e-03,
         5.14806453e-03,  3.76769305e-03],
       [ 1.96701876e-03,  1.26047511e-03,  6.22138578e-03,
         2.36059044e-03,  1.26644146e-02,  2.00358907e-03,
         8.04023724e-03,  2.30076077e-03,  5.70077091e-04,
         5.65049374e-03,  9.76571021e-04,  1.85279450e-03,
         2.56652171e-05,  1.19266940e-03,  5.84713900e-04,
         9.29778319e-03,  2.84300900e-03],
       [ 4.42465667e-03,  6.54178355e-04,  8.13498400e-04,
         1.45119740e-03,  2.00358907e-03,  1.52522064e-03,
         2.91651452e-03,  8.70569737e-04,  1.09752760e-03,
         1.66762294e-03,  5.36854007e-04,  1.75343988e-03,
         1.29714019e-05,  9.11071171e-05,  1.68043070e-03,
         2.42628131e-03,  1.90713194e-03],
       [ 1.12579361e-02,  1.72005989e-03,  3.36654245e-03,
         4.46141908e-03,  8.04023724e-03,  2.91651452e-03,
         1.19931947e-02,  1.61222907e-03,  2.75699780e-03,
         4.16113427e-03,  6.25609018e-04,  2.91008175e-03,
        -1.92908806e-04, -1.57151126e-03,  3.25855486e-03,
         1.06990068e-02,  6.05007409e-03],
       [ 1.65860525e-03,  3.60920296e-04,  1.54941008e-03,
         8.06488179e-04,  2.30076077e-03,  8.70569737e-04,
         1.61222907e-03,  1.90797844e-03,  6.04486114e-04,
         2.47501106e-03,  8.57227194e-04,  2.42587888e-03,
         1.85623409e-04,  2.91479004e-03,  3.33754926e-03,
         2.61280946e-03,  1.16461350e-03],
       [ 5.64030644e-03,  4.32241813e-04,  6.19861236e-05,
         2.09341490e-03,  5.70077091e-04,  1.09752760e-03,
         2.75699780e-03,  6.04486114e-04,  2.53455649e-03,
         9.66091919e-04,  3.91053383e-04,  1.83120456e-03,
        -4.91230334e-05, -5.60316891e-04,  2.28627416e-03,
         2.40776877e-03,  3.15907037e-03],
       [ 2.76645571e-03,  6.55664695e-04,  2.93028853e-03,
         1.54107719e-03,  5.65049374e-03,  1.66762294e-03,
         4.16113427e-03,  2.47501106e-03,  9.66091919e-04,
         4.81734656e-03,  1.14396535e-03,  3.23711266e-03,
         1.69157413e-04,  3.03445975e-03,  3.09323955e-03,
         5.27456576e-03,  2.11317800e-03],
       [ 3.63015800e-04,  1.60990263e-04,  8.70115005e-04,
         6.99000273e-04,  9.76571021e-04,  5.36854007e-04,
         6.25609018e-04,  8.57227194e-04,  3.91053383e-04,
         1.14396535e-03,  1.39905835e-03,  2.01826986e-03,
         1.04811491e-04,  1.67653296e-03,  2.59598793e-03,
         1.01532651e-03,  2.60716967e-04],
       [ 3.74241173e-03,  6.64729334e-04,  4.90113403e-04,
         1.31596059e-03,  1.85279450e-03,  1.75343988e-03,
         2.91008175e-03,  2.42587888e-03,  1.83120456e-03,
         3.23711266e-03,  2.01826986e-03,  1.16861730e-02,
         2.24795908e-04,  3.46679680e-03,  8.38606091e-03,
         3.65575720e-03,  1.80220367e-03],
       [-1.35199744e-04, -1.34505970e-05,  1.22200026e-04,
        -2.52039718e-05,  2.56652171e-05,  1.29714019e-05,
        -1.92908806e-04,  1.85623409e-04, -4.91230334e-05,
         1.69157413e-04,  1.04811491e-04,  2.24795908e-04,
         5.49990619e-05,  5.01897963e-04,  3.74856789e-04,
        -8.63113243e-06, -1.51400879e-04],
       [-2.19000672e-03, -3.61651337e-04,  2.34074752e-03,
        -5.18390335e-04,  1.19266940e-03,  9.11071171e-05,
        -1.57151126e-03,  2.91479004e-03, -5.60316891e-04,
         3.03445975e-03,  1.67653296e-03,  3.46679680e-03,
         5.01897963e-04,  8.74709395e-03,  6.37760454e-03,
         1.74349274e-03, -1.26348683e-03],
       [ 6.80914121e-03,  6.56663689e-04,  1.39606650e-03,
         2.41334278e-03,  5.84713900e-04,  1.68043070e-03,
         3.25855486e-03,  3.33754926e-03,  2.28627416e-03,
         3.09323955e-03,  2.59598793e-03,  8.38606091e-03,
         3.74856789e-04,  6.37760454e-03,  1.55034038e-02,
         5.20888498e-03,  4.17926704e-03],
       [ 8.41701096e-03,  1.55184724e-03,  5.31970717e-03,
         5.14806453e-03,  9.29778319e-03,  2.42628131e-03,
         1.06990068e-02,  2.61280946e-03,  2.40776877e-03,
         5.27456576e-03,  1.01532651e-03,  3.65575720e-03,
        -8.63113243e-06,  1.74349274e-03,  5.20888498e-03,
         1.35424275e-02,  5.49882762e-03],
       [ 1.07504229e-02,  1.06451898e-03,  8.86435533e-04,
         3.76769305e-03,  2.84300900e-03,  1.90713194e-03,
         6.05007409e-03,  1.16461350e-03,  3.15907037e-03,
         2.11317800e-03,  2.60716967e-04,  1.80220367e-03,
        -1.51400879e-04, -1.26348683e-03,  4.17926704e-03,
         5.49882762e-03,  7.08734925e-03]]

covariance_matrix = pandas.DataFrame(covariance_matrix,
                                     index = tickers,
                                     columns = tickers)
theta = 0.05
target_te = 0.0025

w_old_prime = w_old.copy()

# calculate the difference from the target portfolio
# and use this difference to estimate tracking error 
# and marginal contribution to tracking error ("mcte")
z = (w_old_prime - w_target)
te = numpy.sqrt(z.dot(covariance_matrix).dot(z))
mcte = (z.dot(covariance_matrix)) / te

while True:
    w_diff_prime = w_target - w_old_prime

    trading_model = LpProblem("Trade Minimization Problem", LpMinimize)

    t_vars = []
    psi_vars = []
    phi_vars = []
    y_vars = []

    A = 2

    for i in range(n):
        t = LpVariable("t_" + str(i), -w_old_prime[i], 1 - w_old_prime[i]) 
        t_vars.append(t)

        psi = LpVariable("psi_" + str(i), None, None)
        psi_vars.append(psi)

        phi = LpVariable("phi_" + str(i), None, None)
        phi_vars.append(phi)

        y = LpVariable("y_" + str(i), 0, 1, LpInteger) #set y in {0, 1}
        y_vars.append(y)


    # add our objective: minimize the total distance from the target
    # portfolio (phi) plus the number of trades (y)
    trading_model += lpSum(phi_vars) + lpSum(y_vars), "Objective"

    for i in range(n):
        trading_model += psi_vars[i] >= -t_vars[i]
        trading_model += psi_vars[i] >= t_vars[i]
        trading_model += psi_vars[i] <= A * y_vars[i]

    for i in range(n):
        trading_model += phi_vars[i] >= -(w_diff_prime[i] - t_vars[i])
        trading_model += phi_vars[i] >= (w_diff_prime[i] - t_vars[i])

    # Make sure our trades sum to zero
    trading_model += (lpSum(t_vars) == 0)
    
    # Set tracking error limit
    #    delta(te) = mcte * delta(z) 
    #              = mcte * ((w_old_prime + t - w_target) - 
    #                        (w_old_prime - w_target)) 
    #              = mcte * t
    #    te + delta(te) <= target_te
    #    ==> delta(te) <= target_te - te
    trading_model += (lpSum([mcte.iloc[i] * t_vars[i] for i in range(n)]) \
                              <= (target_te - te))

    # Set our trade bounds
    trading_model += (lpSum(phi_vars) / 2. <= theta)

    trading_model.solve()
    
    # update our w_old' with the current trades
    results = pandas.Series([t_i.value() for t_i in t_vars], index = tickers)
    w_old_prime = (w_old_prime + results)
    
    z = (w_old_prime - w_target)
    te = numpy.sqrt(z.dot(covariance_matrix).dot(z))
    mcte = (z.dot(covariance_matrix)) / te
    
    if te < target_te:
        break
        
print("Tracking error: " + str(te))

# since w_old' is an iterative update,
# the current trades only reflect the updates from
# the prior w_old'.  Thus, we need to calculate
# the trades by hand
results = (w_old_prime - w_old)
n_trades = (results.abs() > 1e-8).astype(int).sum()

print("Number of trades: " + str(n_trades))

print("Turnover distance: " + str((w_target - (w_old + results)).abs().sum() / 2.))
Tracking error: 0.0016583319880074485
Number of trades: 13
Turnover distance: 0.01624453350000001

6.2 Time Constraints

For time feasibility, heuristic approaches can be employed in an effort to rapidly converge upon a “close enough” solution. For example, Rong and Liu (2011) discuss “build-up” and “pare-down” heuristics.

The basic algorithm of “pare-down” is:

  1. Start with a trade list that includes every security
  2. Solve the optimization problem in its unconstrained format, allowing trades to occur only for securities in the trade list.
  3. If the solution meets the necessary constraints (e.g. maximum number of trades, trade size thresholds, tracking error constraints, etc), terminate the optimization.
  4. Eliminate from the trade list a subset of securities based upon some measure of trade utility (e.g. violation of constraints, contribution to tracking error, etc).
  5. Go to step 2.
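As a rough illustration, the pare-down loop can be sketched in Python. The “solve” step below is a stand-in that simply trades each listed security fully to target (and ignores the sum-to-zero trade constraint), and the trade-utility measure is just absolute trade size; a real implementation would call the MILP optimizer developed earlier.

```python
import numpy as np

def pare_down(w_old, w_target, max_trades):
    # Step 1: start with a trade list that includes every security
    trade_list = list(range(len(w_old)))
    while True:
        # Step 2: stand-in "solve" -- trade each listed name fully to target
        trades = {i: w_target[i] - w_old[i] for i in trade_list}
        # Step 3: terminate if the constraint (max number of trades) is met
        if sum(abs(t) > 1e-8 for t in trades.values()) <= max_trades:
            return trades
        # Step 4: eliminate the security with the smallest trade utility
        # (here, simply the smallest absolute trade size); go to step 2
        trade_list.remove(min(trade_list, key=lambda i: abs(trades[i])))

w_old = np.array([0.40, 0.30, 0.20, 0.10])
w_target = np.array([0.25, 0.25, 0.25, 0.25])
print(pare_down(w_old, w_target, max_trades=2))
```

With these toy weights, the two smallest trades are pared away and only the two largest deviations are traded.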

The basic algorithm of “build-up” is:

  1. Start with an empty trade list
  2. Add a subset of securities to the trade list based upon some measure of trade utility.
  3. Solve the optimization problem in its unconstrained format, allowing trades to occur only for securities in the trade list.
  4. If the solution meets the necessary constraints (e.g. maximum number of trades, trade size thresholds, tracking error constraints, etc), terminate the optimization.
  5. Go to step 2.

These two heuristics can even be combined in an integrated fashion. For example, a binary search approach can be employed, where the initial trade list is filled with 50% of the tradable securities. Depending upon the success or failure of the resulting optimization, a pare-down or build-up approach can then be taken to either prune or expand the trade list.
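One hypothetical sketch of that integrated approach (not from the source): rank securities by trade utility so the trade list is always a prefix of the ranking, then binary-search on the list size, paring down after a successful solve and building up after a failure. This assumes feasibility is monotone in list size; `feasible` and `utility` are placeholder callables standing in for the constrained optimization and the utility measure.

```python
def adaptive_trade_list(securities, feasible, utility):
    # Rank securities by trade utility so the list is always a prefix
    ranked = sorted(securities, key=utility, reverse=True)
    best = None
    lo, hi = 0, len(ranked)
    while lo < hi:
        k = (lo + hi) // 2            # first pass starts near 50% of the universe
        trade_list = ranked[:k]
        if feasible(trade_list):
            best = trade_list         # success: try paring down further
            hi = k
        else:
            lo = k + 1                # failure: build the list back up
    return best
```

For example, with ten securities and a feasibility rule requiring at least four names, the search settles on the four highest-utility securities.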

7. Conclusion

In this research note we have explored the practice of trade optimization, which seeks to implement portfolio changes in as few trades as possible.  While a rarely discussed detail of portfolio management, trade optimization has the potential to eliminate unnecessary trading costs – both explicit and implicit – that can be a drag on realized investor performance.

Constraints within the practice of trade optimization typically fall into one of three categories: asset paring, trade paring, and level paring.  Asset paring restricts the number of securities the portfolio can hold, trade paring restricts the number of trades that can be made, and level paring restricts the size of positions and trades.  Introducing these constraints often turns the optimization into a discrete problem, making it much more difficult to solve with traditional convex optimizers.

With this in mind, we introduced mixed-integer linear programming (“MILP”) and explored a few techniques that can be utilized to transform non-linear functions into a set of linear constraints.  We then combined these transformations to develop a simple trade optimization framework that can be solved using MILP optimizers.

To offer numerical support in the discussion, we created a simple momentum-based sector rotation strategy.  We found that naive turnover-filtering helped reduce the number of trades executed by 50%, while explicit trade optimization reduced the number of trades by 70%.

Finally, we explored how our simplified framework could be further extended to account for both non-linear functional constraints (e.g. tracking error) and operational constraints (e.g. managing execution time).

The paring constraints introduced by trade optimization often lead to problems that are difficult to solve.  However, when we consider that the cost of trading is a very real drag on the results realized by investors, we believe that the solutions are worth pursuing.

 

Measuring Process Diversification in Trend Following

This post is available as a PDF download here.

Summary­

  • We prefer to think about diversification in a three-dimensional framework: what, how, and when.
  • The “how” axis covers the process with which an investment decision is made.
  • There are a number of models that trend-followers might use to capture a trend. For example, trend-followers might employ a time-series momentum model, a price-minus moving average model, or a double moving average cross-over model.
  • Beyond multiple models, each model can have a variety of parameterizations. For example, a time-series momentum model can just as equally be applied with a 3-month formation period as an 18-month period.
  • In this commentary, we attempt to measure how much diversification opportunity is available by employing multiple models with multiple parameterizations in a simple long/flat trend-following process.

When investors talk about diversification, they typically mean across different investments.  Do not just buy a single stock, for example; buy a basket of stocks in order to diversify away the idiosyncratic risk.

We call this “what” diversification (i.e. “what are you buying?”) and believe this is only one of three meaningful axes of diversification for investors.  The other two are “how” (i.e. “how are you making your decision?”) and “when” (i.e. “when are you making your decision?”).  In recent years, we have written a great deal about the “when” axis, and you can find a summary of that research in our commentary Quantifying Timing Luck.

In this commentary, we want to discuss the potential benefits of diversifying across the “how” axis in trend-following strategies.

But what, exactly, do we mean by this?  Consider that there are a number of ways investors can implement trend-following signals.  Some popular methods include:

  • Prior total returns (“time-series momentum”)
  • Price-minus-moving-average (e.g. price falls below the 200-day moving average)
  • Moving-average double cross-over (e.g. the 50-day moving average crosses the 200-day moving average)
  • Moving-average change-in-direction (e.g. the 200-day moving average slope turns positive or negative)

As it turns out, these varying methodologies are actually cousins of one another.  Recent research has established that these models can, more or less, be thought of as different weighting schemes of underlying returns.  For example, a time-series momentum model (with no skip month) derives its signal by averaging daily log returns over the lookback period equally.
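This equivalence is easy to verify numerically: the sum (and hence the equal-weighted average) of daily log returns telescopes to the log of the total return, so the two formulations always agree in sign. A minimal sketch with simulated prices (the drift and volatility figures are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(rng.normal(0.0005, 0.01, 252)))

# Time-series momentum: sign of the trailing total return
tsmom = np.sign(prices[-1] / prices[0] - 1)

# Equivalent view: equal-weighted average of daily log returns,
# which telescopes to log(P_T / P_0) / T
avg_log_ret = np.diff(np.log(prices)).mean()

print(tsmom, np.sign(avg_log_ret))  # the signs always match
```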

With this common base, a number of papers over the last decade have found significant relationships between the varying methods.  For example:

 

  • Bruder, Dao, Richard, and Roncalli (2011): Moving-average-double-crossover is just an alternative weighting scheme for time-series momentum.
  • Marshall, Nguyen, and Visaltanachoti (2014): Time-series momentum is related to moving-average-change-in-direction.
  • Levine and Pedersen (2015): Time-series momentum and moving-average cross-overs are highly related; both methods perform similarly on 58 liquid futures contracts.
  • Beekhuizen and Hallerbach (2015): Mathematically linked moving averages with prior returns.
  • Zakamulin (2015): Price-minus-moving-average, moving-average-double-cross-over, and moving-average-change-of-direction can all be interpreted as the computation of a weighted moving average of momentum rules.

 

As we have argued in past commentaries, we do not believe any single method is necessarily superior to another.  In fact, it is trivial to evaluate these methods over different asset classes and time-horizons and find an example that proves that a given method provides the best result.

Without a crystal ball, however, and without any economic interpretation why one might be superior to another, the choice is arbitrary.  Yet the choice will ultimately introduce randomness into our results: a factor we like to call “process risk.”  A question we should ask ourselves is, “if we have no reason to believe one is better than another, why would we pick one at all?”

We like to think of it this way: ex-post, we will know whether the return over a given period is positive or negative.  Ex-ante, all we have is a handful of trend-following signals forecasting that direction.  If, historically, all of these trend signals have been effective, then there may be no reason to believe one over another.

Combining them, in many ways, is sort of like trying to triangulate on the truth. We have a number of models that all look at the problem from a slightly different perspective and, therefore, provide a slightly different interpretation.  A (very) loose analogy might be using the collective information from a number of cell towers in an effort to pinpoint the geographic location of a cellphone.

We may believe that all of the trend models do a good job of identifying trends over the long run, but most will prove false from time-to-time in the short-run. By using them together, we can potentially increase our overall confidence when the models agree and decrease our confidence when they do not.

With all this in mind, we want to explore the simple question: “how much potential benefit does process diversification bring us?”

The Setup

To answer this question, we first generate a number of long/flat trend following strategies that invest in either a broad U.S. equity index or the risk-free rate (both provided by the Kenneth French database and ranging from 1926 to 2018). There are 48 strategy variations in total, constructed through a combination of three different processes – time-series momentum, price-minus-moving-average, and moving-average double cross-over – and 16 different lookback periods (from the approximate equivalent of 3-to-18 months).

We then treat each of the 48 variations as its own unique asset.

To measure process diversification, we are going to use the concept of “independent bets.” The greater the number of independent bets within a portfolio, the greater the internal diversification. Below are a couple examples outlining the basic intuition for a two-asset portfolio:

  • If we have a portfolio holding two totally independent assets with similar volatility levels, a 50% allocation to each would maximize our diversification.  Intuitively, we have equally allocated across two unique bets.
  • If we have a portfolio holding two totally independent assets with similar volatility levels, a 90% allocation to one asset and a 10% allocation to another would lead us to a highly concentrated bet.
  • If we have a portfolio holding two highly correlated assets, no matter the allocation split, we have a large, concentrated bet.
  • If we have a portfolio of two assets with disparate volatility levels, we will have a large concentrated bet unless the lower volatility asset comprises the vast majority of the portfolio.

To measure this concept mathematically, we are going to use the fact that the square of the “diversification ratio” of a portfolio is equal to the number of independent bets that portfolio is taking.1
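As a concrete sketch (with our own toy numbers, not figures from the commentary): the diversification ratio is the weighted-average volatility divided by the portfolio volatility, and its square gives the number of independent bets. Two uncorrelated, equal-volatility assets held 50/50 produce exactly two bets, while a 90/10 split concentrates the portfolio.

```python
import numpy as np

def independent_bets(w, cov):
    vols = np.sqrt(np.diag(cov))
    # Diversification ratio: weighted-average vol over portfolio vol
    dr = (w @ vols) / np.sqrt(w @ cov @ w)
    return dr ** 2  # squared DR = number of independent bets

cov = np.diag([0.04, 0.04])  # two uncorrelated assets, 20% vol each

print(round(independent_bets(np.array([0.5, 0.5]), cov), 4))  # → 2.0
print(round(independent_bets(np.array([0.9, 0.1]), cov), 4))  # well below 2
```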

Diversifying Parameterization Risk

Within process diversification, the first variable we can tweak is the formation period of our trend signal.  For example, if we are using a time-series momentum model that simply looks at the sign of the total return over the prior period, the length of that period may have a significant influence in the identification of a trend.  Intuition tells us that shorter formation periods might identify short-term trends as well as react to long-term trend changes more quickly but may be more sensitive to whipsaw risk.

To explore the diversification opportunities available to us simply by varying our formation parameterization, we build equal-weight portfolios comprised of two strategies at a time, where each strategy utilizes the same trend model but a different parameterization.  We then measure the number of independent bets in that combination.

We run this test for each trend following process independently.  As an example, we compare using a shorter lookback period with a longer lookback period in the context of time-series momentum in isolation. We will compare across models in the next section.

In the graphs below, L0 through L15 represent the lookback periods, with L0 being the shortest lookback period and L15 representing the longest lookback period.

As we might suspect, the largest increase in available bets arises from combining shorter formation periods with longer formation periods.  This makes sense, as they represent the two horizons that share the smallest proportion of data and therefore have the least “information leakage.” Consider, for example, a time-series momentum signal that has a 4-month lookback and one with an 8-month lookback. At all times, half of the information used to derive the latter model is also used by the former.  While the technical details are subtler, we would generally expect that the more informational overlap, the less diversification is available.

We can see that by combining short- and long-term lookbacks, the total number of bets the portfolio is taking increases from 1.0 to approximately 1.2.

This may not seem like a significant lift, but we should remember Grinold and Kahn’s Fundamental Law of Active Management:

Information Ratio = Information Coefficient x SQRT(Independent Bets)

Assuming the information coefficient stays the same, an increase in the number of independent bets from 1.0 to 1.2 increases our information ratio by approximately 10%.  Such is the power of diversification.
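The arithmetic behind that figure is simple, assuming a constant information coefficient (the 0.05 here is an arbitrary placeholder, and it cancels out of the ratio):

```python
import math

ic = 0.05  # arbitrary information coefficient, assumed constant
ir_before = ic * math.sqrt(1.0)   # one independent bet
ir_after = ic * math.sqrt(1.2)    # 1.2 independent bets

lift = ir_after / ir_before - 1
print(f"{lift:.1%}")  # → 9.5%
```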

Another interesting way to approach this data is by allowing an optimizer to attempt to maximize the diversification ratio.  In other words, instead of only looking at naïve, equal-weight combinations of two processes at a time, we can build a portfolio from all available lookback variations.

Doing so may provide two interesting insights.

First, we can see how the optimizer might look to combine different variations to maximize diversification.  Will it barbell long and short lookbacks, or is there benefit to including medium lookbacks? Will the different processes have different solutions?  Second, by optimizing over the full history of data, we can find an upper limit threshold to the number of independent bets we might be able to capture if we had a crystal ball.
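To illustrate what such an optimizer is doing (with made-up correlations, not the actual strategy covariances), a brute-force search over a three-variant toy problem shows the most-diversified weights underweighting a redundant, highly correlated pair:

```python
import numpy as np

# Toy covariance for three trend-strategy variants: two highly
# correlated short-lookback variants plus one long-lookback variant
vols = np.array([0.10, 0.10, 0.10])
corr = np.array([[1.0, 0.9, 0.2],
                 [0.9, 1.0, 0.2],
                 [0.2, 0.2, 1.0]])
cov = np.outer(vols, vols) * corr

def bets(w):
    # squared diversification ratio = number of independent bets
    return ((w @ vols) / np.sqrt(w @ cov @ w)) ** 2

# Naive grid search over the simplex, in place of a proper optimizer
grid = np.linspace(0, 1, 101)
best_w, best_bets = None, -np.inf
for w1 in grid:
    for w2 in grid:
        if w1 + w2 <= 1:
            w = np.array([w1, w2, 1 - w1 - w2])
            b = bets(w)
            if b > best_bets:
                best_w, best_bets = w, b

# The search overweights the uncorrelated variant relative to the pair
print(best_w, round(best_bets, 3))
```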

A few takeaways from the graphs above:

  • Almost all of the processes barbell short and long lookback horizons to maximize diversification.
  • The optimizer finds value, in most cases, in introducing medium-term lookback horizons as well. We can see for Time-Series MOM, the significant weights are placed on L0, L1, L6, L10, and L15.  While not perfectly spaced or equally weighted, this still provides a strong cross-section of available information.  Double MA Cross-Over, on the other hand, finds value in weighting L0, L8, and L15.
  • While the optimizer increases the number of independent bets in all cases versus a naïve, equal-weight approach, the pickup is not incredibly dramatic. At the end of the day, a crystal ball does not find a meaningfully better solution than our intuition may provide.

Diversifying Model Risk

Similar to the process taken in the above section, we will now attempt to quantify the benefits of cross-process diversification.

For each trend model, we will calculate the number of independent bets available by combining it with another trend model but hold the lookback period constant. As an example, we will combine the shortest lookback period of the Time-Series MOM model with the shortest lookback period of the MA Double Cross-Over.

We plot the results below of the number of independent bets available through a naïve, equal-weight combination.

We can see that model combinations can lift the number of independent bets by 0.05 to 0.1.  Not as significant as the theoretical lift from parameter diversification, but not totally insignificant.

Combining Model and Parameterization Diversification

We can once again employ our crystal ball in an attempt to find an upper limit to the diversification available to trend followers, as well as the process / parameterization combinations that will maximize this opportunity.  Below, we plot the results.

We see a few interesting things of note:

  • The vast majority of models and parameterizations are ignored.
  • Time-Series MOM is heavily favored as a model, receiving nearly 60% of the portfolio weight.
  • We see a spread of weight across short, medium, and long-term weights. Short-term is heavily favored, with Time-Series MOM L0 and Price-Minus MA L0 approaching nearly 45% of model weight.
  • All three models are, ultimately, incorporated, with approximately 10% being allocated to Double MA Cross-Over, 30% to Price-Minus MA, and 60% to Time-Series MOM.

It is worth pointing out that naively allocating equally across all 48 models creates 1.18 independent bets while the full-period crystal ball generated 1.29 bets.

Of course, having a crystal ball is unrealistic.  Below, we look at a rolling window optimization that looks at the prior 5 years of weekly returns to create the most diversified portfolio.  To avoid plotting a graph with 48 different components, we have plotted the results in two ways: (1) clustered by process and (2) clustered by lookback period.

Using the rolling window, we see similar results as we saw with the crystal ball. First, Time-Series MOM is largely favored, often peaking well over 50% of the portfolio weights.  Second, we see that a barbelling approach is frequently employed, balancing allocations to the shortest lookbacks (L0 and L1) with the longest lookbacks (L14 and L15).  Mid-length lookbacks are not outright ignored, however, and L5 through L11 combined frequently make up 20% of the portfolio.

Finally, we can see that the rolling number of bets is highly variable over time, but optimization frequently creates a meaningful impact over an equal-weight approach.2

Conclusion

In this commentary, we have explored the idea of process diversification.  In the context of a simple long/flat trend-following strategy, we find that combining strategies that employ different trend identification models and different formation periods can lead to an increase in the independent number of bets taken by the portfolio.

As it specifically pertains to trend-following, we see that diversification appears to be maximized by allocating across a number of lookback horizons, with an optimizer putting a particular emphasis on barbelling shorter and longer lookback periods.

We also see that incorporating multiple processes can increase available diversification as well.  Interestingly, the optimizer did not equally diversify across models.  This may be because these models are less independent from one another than they might seem.  For example, Zakamulin (2015) demonstrated that these models can all be decomposed into different weighted averages of the same general momentum rules.

Finding process diversification, then, might require moving to a process that may not have a common basis.  For example, trend followers might consider channel methods or a change in basis (e.g. constant volume bars instead of constant time bars).

Momentum’s Magic Number

This post is available as a PDF download here.

Summary­

  • In HIMCO’s May 2018 Quantitative Insight, they publish a figure that suggests the optimal holding length of a momentum strategy is a function of the formation period.
  • Specifically, the result suggests that the optimal holding period is one selected such that the formation period plus the holding period is equal to 14-to-18 months: a somewhat “magic” result that makes little intuitive, statistical, or economic sense.
  • To investigate this result, we construct momentum strategies for country indices as well as industry groups.
  • We find similar results, with performance peaking when the formation period plus the holding period is equal to 12-to-14 months.
  • While lacking a specific reason why this effect exists, it suggests that investors looking to leverage shorter-term momentum signals may benefit from longer investment horizons, particularly when costs are considered.

A few weeks ago, we came across a study published by HIMCO on momentum investing1.  Contained within this research note was a particularly intriguing exhibit.

Source: HIMCO Quantitative Insights, May 2018

What this figure demonstrates is that the excess cumulative return for U.S. equity momentum strategies peaks as a function of both formation period and holding period.  Specifically, the returns appear to peak when the sum of the formation and holding period is between 14-18 months.

For example, if you were to form a portfolio based upon trailing 6-1 momentum – i.e. ranking on the prior 6-month total returns and skipping the most recent month (labeled in the figure above as “2_6”) – this evidence suggests that you would want to hold such a portfolio for 8-to-12 months (labeled in the figure above as 14-to-18 months since the beginning of the uptrend).

Which is a rather odd conclusion.  Firstly, we would intuitively expect that we should employ holding periods that are shorter than our formation periods.  The notion here is that we want to use enough data to harvest information that will be stationary over the next, smaller time-step.  So, for example, we might use 36 months of returns to create a covariance matrix that we might hold constant for the next month (i.e. a 36-month formation period with a 1-month hold).  Given that correlations are non-stable, we would likely find the idea of using 1-month of data to form a correlation matrix we hold for the next 36-months rather ludicrous.

And, yet, here we are in a similar situation, finding that if we use a formation period of 5 months, we should hold our portfolio steady for the next 8-to-10 months.  And this is particularly weird in the world of momentum, which we typically expect to be a high turnover strategy.  How in the world can having a holding period longer than our formation period make sense when we expect information to quickly decay in value?

Perhaps the oddest thing of all is the fact that all these results center around 14-18 months.  It would be one thing if the conclusion was simply, “holding for six months after formation is optimal”; here the conclusion is that the optimal holding period is a function of formation period.  Nor is the conclusion something intuitive, like “the holding period should be half the formation period.”

Rather, the result – that the holding period should be 14-to-18 months minus the length of the formation period – makes little intuitive, statistical, or economic sense.

Out-of-Sample Testing with Countries and Sectors

In an effort to explore this result further, we wanted to determine whether similar results were found when cross-sectional momentum was applied to country indices and industry groups.

Specifically, we ran three tests.

In the first, we constructed momentum portfolios using developed country index returns (U.S. dollar denominated; net of withholding taxes) from MSCI.  The countries included in the test are: Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Hong Kong, Ireland, Israel, Italy, Japan, Netherlands, New Zealand, Norway, Portugal, Singapore, Spain, Sweden, Switzerland, the United Kingdom, and the United States of America.  The data extends back to 12/1969.

In the second, we constructed momentum portfolios using the 12 industry group data set from the Kenneth French Data Library.  The data extends back to 7/1926.

In the third, we constructed momentum portfolios using the 49 industry group data set from the Kenneth French Data Library.  The data extends back to 7/1926.

For each data set, we ran the same test:

  • Vary formation periods from 5-1 to 12-1 months.
  • Vary holding periods from 1-to-26 months.
  • Using this data, construct dollar-neutral long/short portfolios that go long, in equal-weight, the top third ranking holdings and go short, in equal-weight, the bottom third.
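The third step can be sketched as follows, using hypothetical momentum scores for nine assets (the scores below are invented for illustration):

```python
import numpy as np

# Hypothetical trailing 6-1 momentum scores for nine assets
scores = np.array([0.12, -0.03, 0.08, 0.01, -0.07, 0.15, 0.04, -0.01, 0.06])

order = np.argsort(scores)          # ascending rank by momentum score
n = len(scores) // 3                # size of each third
weights = np.zeros(len(scores))
weights[order[-n:]] = 1.0 / n       # long the top third, equal weight
weights[order[:n]] = -1.0 / n       # short the bottom third, equal weight

print(weights.sum())                # dollar-neutral: sums to ~0
```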

Note that for holding periods exceeding 1 month, we employed an overlapping portfolio construction process.
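A minimal sketch of that overlapping construction (our own illustration, not the source's code): each month a new tranche is formed, and the live portfolio equal-weights the most recent `holding_period` tranches.

```python
import numpy as np

def overlapping_weights(monthly_targets, holding_period):
    # monthly_targets: (T, N) array; row t is the portfolio formed at month t.
    # The overlapping portfolio at month t averages the `holding_period`
    # most recent tranches.
    T, N = monthly_targets.shape
    out = np.full((T, N), np.nan)   # undefined until enough tranches exist
    for t in range(holding_period - 1, T):
        out[t] = monthly_targets[t - holding_period + 1: t + 1].mean(axis=0)
    return out

# Two assets; the monthly target alternates fully between them
targets = np.array([[1.0, 0.0], [0.0, 1.0]] * 3)
print(overlapping_weights(targets, holding_period=2))
```

With a 2-month hold, the alternating targets blend into a steady 50/50 portfolio once both tranches are live.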

Below we plot the results.

Source: MSCI and Kenneth French Data Library. Calculations by Newfound Research. Past performance is not a predictor of future results.  All information is backtested and hypothetical and does not reflect the actual strategy managed by Newfound Research.  Performance is net of all fees except for underlying ETF expense ratios.  Returns assume the reinvestment of all dividends, capital gains, and other earnings.

 

While the results are not as clear as those published by HIMCO, we still see an intriguing effect: returns peak as a function of both formation and holding period. For the country strategy, returns appear to peak when the sum of the formation and holding periods is between 12-14 months, indicating that an investor using 5-1 month signals would want to hold for 7 months while an investor using 12-1 signals would only want to hold for 1 month.

For the industry data, the results are less clear.  Where the HIMCO and country results exhibited a clear “peak,” the industry results simply seem to “decay slower.”  In particular, we can see in the results for the 12-industry group test that almost all strategies peak with a 1-month holding period.  However, they all appear to fall off rapidly, and uniformly, after the time where formation plus holding period exceeds 16 months.

While less pronounced, it is worth pointing out that this result is achieved without the consideration of trading costs or taxes.  So, while the 5-1, 12-industry group strategy return may peak with a 1-month hold, we can see that it later forms a second peak at a 9-month hold (“14 months since beginning uptrend”).  Given that we would expect a nine-month hold to exhibit considerably less trading, analysis that includes trading cost estimates may exhibit even greater peakedness in the results.

Does the Effect Persist for Long-Only Portfolios?

In analyzing factors, it is often important to try to determine whether a given result is arising from an effect found in the long leg or the short leg.  After all, most investors implement strategies in a long-only capacity.  While long-only strategies are, technically, equal to a benchmark plus a dollar-neutral long/short portfolio2, the long/short portfolio rarely reflects the true factor definition.
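The identity referenced above is just weight arithmetic and can be verified directly. Below is a minimal sketch with a hypothetical four-asset example (the weights are purely illustrative): subtracting the benchmark weights from the long-only weights leaves a dollar-neutral active portfolio, but one whose short positions are capped at the benchmark weights.

```python
import numpy as np

# Hypothetical four-asset example: an equal-weight benchmark and a
# long-only tilt that overweights the top-ranked assets.
benchmark = np.array([0.25, 0.25, 0.25, 0.25])
long_only = np.array([0.40, 0.35, 0.15, 0.10])

# The implied active bet is the difference in weights...
active = long_only - benchmark  # [0.15, 0.10, -0.10, -0.15]

# ...which sums to zero (up to float rounding): a dollar-neutral long/short.
# Note, however, that no short can exceed the benchmark weight (-0.25 here),
# so this implied long/short rarely matches the "true" factor construction.
```

This cap on the short leg is precisely why a long-only implementation's embedded long/short portfolio can behave differently from the textbook factor definition.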

Therefore, we want to evaluate long-only construction to determine whether the same result holds, or whether it is a feature of the short-leg.

Source: MSCI and Kenneth French Data Library. Calculations by Newfound Research.

We find incredibly similar results.  Again, country indices appear to peak between 12-to-14 months after the beginning of the uptrend.  Industry group results, while not as strong, still appear fairly flat until 12-to-14 months after the beginning of the uptrend.  Taken together, it appears that this result holds for long-only portfolio implementations as well.

Conclusion

Traditionally, momentum is considered a high turnover factor.  Relative ranking of recent returns can vary substantially over time and our intuition would lead us to expect that the shorter the horizon we use to measure returns, the shorter the time we expect the relative ranking to persist.

Yet recent research published by HIMCO suggests this intuition may not hold. Rather, they find that momentum portfolio performance tends to peak 14-to-18 months after the beginning of the measured uptrend. In other words, a portfolio formed on prior 5-month returns should be held between 9-to-13 months, while a portfolio formed on the prior 12 months of returns should only be held 2-to-6 months.

This result is rather counter-intuitive, as we would expect that shorter formation periods would require shorter holding periods.

We test this result out-of-sample, constructing momentum portfolios using country indices, 12-industry group indices, and 49-industry group indices. We find a similar result in this data. We then further test whether the result is an artifact found only in long/short implementations or whether this information is useful for long-only investors.  Indeed, we find very similar results for long-only implementations.

Precisely why this result exists is still up in the air.  One argument is that the trade-off ultimately centers on win rate versus the size of winners.  If relative momentum tends to persist for only 12-to-18 months total, then using a 12-month formation period may give us a higher win rate but reduce the size of the winners we pick.  Conversely, using a shorter formation period may reduce the number of winners we pick correctly (i.e., a lower win rate), but those we do pick have further to run. Selecting a formation period and a holding period such that their sum equals approximately 14 months may simply be a heuristic for finding the balance of win rate and win size that maximizes return.
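The win-rate-versus-win-size trade-off above is just expected-return arithmetic. The numbers below are purely illustrative, not estimates from the data; they simply show how two signals with different hit rates can land near the same expected return.

```python
def expected_return(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Per-period expected return given a hit rate and average win/loss sizes."""
    return win_rate * avg_win - (1 - win_rate) * avg_loss


# Illustrative only: a long-formation signal wins more often but catches less
# of each move; a short-formation signal wins less often but enters earlier.
long_formation = expected_return(win_rate=0.60, avg_win=0.04, avg_loss=0.03)   # 1.2%
short_formation = expected_return(win_rate=0.50, avg_win=0.06, avg_loss=0.03)  # 1.5%
```

Under these made-up parameters, a 10-point drop in win rate is nearly offset by larger average winners, which is the balance the ~14-month formation-plus-holding heuristic may be finding.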
