*This post is available as a PDF download here.*

# Summary

- Portfolio construction decisions tell us about more than just our objective: they tell us about our beliefs.
- In practice, our beliefs extend beyond views of returns, volatilities, and correlations; we also hold views about our ability to measure these concepts and our confidence in those measures.
- We explore the use of data “transformations” – functions applied to data that manipulate how much information is retained and how much is discarded – in the context of a mean-variance optimization applied within a sector-rotation strategy.
- Over our backtest period, we find that discarding *all* information about prior returns leads to results that maximize realized Sharpe ratios, suggesting that prior return information is irrelevant.
- Statistically, however, differences in Sharpe ratios are insignificant, once again suggesting that the potential to create more consistent investor returns lies in the adoption of process diversification within portfolio construction.

**Introduction**

There are articles that, after we read them, we say, “we wish we had written that.” Recently, our friends over at ReSolve Asset Management published one such article titled *Portfolio Optimization: A General Framework for Portfolio Choice*, wherein they rebuild the theoretical foundation behind different optimization techniques based upon implied assumptions and beliefs.

The whole paper is worth a read, but the following decision tree should really be taped above the computers of anyone constructing portfolios.

*Source: ReSolve Asset Management. Reprinted with permission.*

What makes this diagram so powerful, in our opinion, is that it connects different portfolio schemes together based upon our beliefs, both in what we know and how confident we are.

The choice of employing a maximum diversification portfolio, for example, is not just an expression of our desire to increase diversification within our portfolio. The choice necessarily implies that we have active views on both risk and correlations and believe that the market compensates investors for total risk borne. If we do not actually hold these views, then choosing maximum diversification for our portfolio construction technique may be sub-optimal.

(The notion of approaching portfolio construction from “first principles” is a topic we discussed with ReSolve’s Chief Investment Officer, Adam Butler, on our podcast; listen here and here.)

When it comes to portfolio construction, however, having a belief is one thing. One’s *confidence* in that belief is another. For example, I may believe in the momentum anomaly and therefore hold active views on future returns based upon recent realized returns. How *confident* I am in those views will ultimately dictate how and to what extent the views influence my portfolio construction. If I can go one step further and actually quantify my confidence, then I can attempt to directly account for it.

These are implementation topics we have discussed in the past (see our commentary *Combining Tactical Views with Black-Litterman and Entropy Pooling*), but we wanted to explore how implementation decisions in portfolio construction (so-called “craftsmanship”) can have an outsized impact over the short term.

**Measure Twice, Transform Once**

By way of example, we will construct a sector rotation portfolio. Specifically, every day we will measure the 12-1 month total return for each of the primary equity sectors as well as estimate their covariance using exponentially-weighted daily returns. With this data in hand, we will calculate the Sharpe-optimal portfolio.
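The inputs described above can be sketched roughly as follows. This is an illustration only: the 21-day month, the 63-day half-life, the synthetic price data, and the unconstrained solution scaled to unit gross exposure are all assumptions of ours, since the text does not specify the decay rate or any portfolio constraints.

```python
import numpy as np

def momentum_12_1(prices, days_per_month=21):
    """Return from 12 months ago to 1 month ago (skips the latest month)."""
    start = prices[-12 * days_per_month - 1]
    end = prices[-days_per_month - 1]
    return end / start - 1.0

def ew_covariance(returns, halflife=63):
    """Exponentially weighted covariance of daily returns (rows are days)."""
    n = returns.shape[0]
    decay = 0.5 ** (1.0 / halflife)
    weights = decay ** np.arange(n - 1, -1, -1)  # most recent day has weight 1
    weights /= weights.sum()
    centered = returns - weights @ returns       # remove the weighted mean
    return (centered * weights[:, None]).T @ centered

def max_sharpe_weights(mu, cov):
    """Unconstrained Sharpe-optimal direction, scaled to unit gross exposure."""
    raw = np.linalg.solve(cov, mu)
    return raw / np.abs(raw).sum()

# Synthetic stand-in for sector data: 300 days, 4 hypothetical sectors.
rng = np.random.default_rng(0)
daily = rng.normal(0.0004, 0.01, size=(300, 4))
prices = np.cumprod(1.0 + daily, axis=0)

mu = momentum_12_1(prices)
cov = ew_covariance(daily)
w = max_sharpe_weights(mu, cov)
```

In the transform examples that follow, only the `mu` input changes; the covariance estimate is left untouched.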

Let’s pause for a moment and consider the implications of our decisions.

1. We hold active views on volatility, correlations, and returns;
2. We believe we can quantify these views; and
3. We are 100% confident in our estimates.

While the diagram above is a guide for (1), it does not outright address (2) or (3). Consider, for example, if we held active views on volatility, correlations, and returns, but had *zero* confidence in our return forecasts (say because we picked them out of thin air). While the mean-variance optimization would still be the theoretically correct choice for what we believe, it is pragmatically incorrect; we would be better off saying we have no view on returns.

The trouble lies in the middle zone, where we have less than 100% confidence in our estimates. This is usually the case, as our estimates are exactly that: estimates. The precision of a dozen decimal places belies the fact that the figures are actually shrouded in a probability distribution.

What if, for example, we believed that our estimates for returns were accurate relative to one another (i.e. ordinally correct) but incorrect in their precision? In other words, we’re generally confident that the rank order of returns will be preserved, but not very confident in the dispersion between returns (e.g. we believe that Apple will outperform Google but have no idea if Apple will return 5% and Google 4% or if Apple will return 5% and Google -50%).

One potential solution might be to “transform” the data.

Transformations are merely functions applied to the data that help isolate a particular feature or characteristic. With respect to the expected return data in question here, there are two important features: level and dispersion. Level tells us the average return, while dispersion tells us how close or far the data points are from that average.

It should be noted that in mean-variance optimization, the result is invariant to the scale of the data. In other words, we can multiply our expected returns by any positive constant and the resulting optimal portfolio will not change. This is important in terms of thinking about the transformations below, as information about level is only eliminated in the case where the data is explicitly de-meaned.
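A small numerical check of this point, under assumptions of our own: an unconstrained Sharpe-optimal solution proportional to $\Sigma^{-1}\mu$, a hypothetical covariance matrix with a common 0.5 correlation, and weights scaled to unit gross exposure. Scaling the expected returns leaves the weights unchanged, while de-meaning them does not.

```python
import numpy as np

def max_sharpe_weights(mu, cov):
    """Unconstrained Sharpe-optimal direction, scaled to unit gross exposure."""
    raw = np.linalg.solve(cov, mu)
    return raw / np.abs(raw).sum()

# Hypothetical inputs: five assets, common correlation of 0.5.
mu = np.array([0.20, 0.19, 0.18, 0.17, 0.16])
vols = np.array([0.15, 0.18, 0.20, 0.22, 0.25])
corr = np.full((5, 5), 0.5) + 0.5 * np.eye(5)
cov = np.outer(vols, vols) * corr

w_base = max_sharpe_weights(mu, cov)
w_scaled = max_sharpe_weights(3.0 * mu, cov)            # scale only
w_demeaned = max_sharpe_weights(mu - mu.mean(), cov)    # level removed

print(np.allclose(w_base, w_scaled))    # scaling: same portfolio
print(np.allclose(w_base, w_demeaned))  # de-meaning: different portfolio
```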

For example, calculating the cross-sectional z-score of expected returns will explicitly de-mean the data. However, the second step – dividing by the standard deviation of the data – merely scales the information, which has no effect. Therefore, z-scoring eliminates level information, but preserves dispersion information. Rank ordering our data, on the other hand, preserves level information (since we can simply scale the data to have a new mean) but eliminates some, though not all, information about the dispersion.

Below we list several potential transforms, as well as general thoughts about how they affect level and dispersion. By no means is it a complete list.

**The Identity Transform**

This transform simply multiplies the data by one, preserving all information.

**Z-Score Transform**

This transform calculates a cross-sectional z-score across the data. By first subtracting the mean of the data, all level information is eliminated. All dispersion information, however, is retained.

The effect of this transformation, in the case of mean-variance optimization, is that assets with returns below the mean will now only be included in the portfolio if they have beneficial diversification properties. As an example, consider an optimization over five asset classes with either of the following two sets of expected returns:

| | Asset A | Asset B | Asset C | Asset D | Asset E |
|---|---|---|---|---|---|
| Set 1 | 20.0% | 19.0% | 18.0% | 17.0% | 16.0% |
| Set 2 | 12.6% | 6.3% | 0.0% | -6.3% | -12.6% |

If we z-score either set of data, we obtain the same values. Assuming the same covariance structure, our optimized portfolio would therefore have the same allocations in both cases.
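As a quick check of the z-score logic, the sketch below standardizes one vector of expected returns and a second, hypothetical vector built from it by removing the mean and rescaling. Using the sample standard deviation (`ddof=1`) is our assumption; because the optimization is scale-invariant, the choice does not affect the resulting portfolio.

```python
import numpy as np

def zscore_transform(x):
    """Cross-sectional z-score: subtract the mean, divide by the sample std."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

mu_a = np.array([0.20, 0.19, 0.18, 0.17, 0.16])

# Hypothetical second set: same spacing pattern, but centered on zero
# and rescaled so that one standard deviation equals 10%.
mu_b = 0.10 * zscore_transform(mu_a)

z_a = zscore_transform(mu_a)
z_b = zscore_transform(mu_b)

print(np.round(mu_b, 3))      # the zero-mean set
print(np.allclose(z_a, z_b))  # both sets carry the same z-scores
```

Since both sets collapse to the same standardized values, any optimizer fed those values with the same covariance matrix has no way to tell them apart.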

**Logistic Transform**

The logistic transform retains information about level but reduces information about dispersion.

The growth rate of the logistic function plays an important role in the amount of dispersion information retained. In fact, in many ways, a logistic transform can be thought of as a more generalized function for transforms discussed below. When the growth rate is zero, all dispersion information is eliminated. When growth rates are small, the transform approaches a rank transformation (discussed below) except in the extreme tails. High growth rates cause the logistic function to approach a step function (discussed below).

For middle-ground growth rates, relative scale of outliers is compressed while the scale between data around the mean is made more uniform.
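A minimal sketch of a logistic transform follows. The text does not say where the curve is centered, so placing the midpoint at the cross-sectional median is purely our assumption, as are the growth values used below.

```python
import numpy as np

def logistic_transform(x, growth, midpoint=None):
    """Squash values through a logistic curve with the given growth rate."""
    x = np.asarray(x, dtype=float)
    if midpoint is None:
        midpoint = np.median(x)  # assumed centering; not specified in the text
    return 1.0 / (1.0 + np.exp(-growth * (x - midpoint)))

prior_returns = np.array([0.20, 0.19, 0.18, 0.17, 0.16])

flat = logistic_transform(prior_returns, growth=0.0)      # every value is 0.5
middle = logistic_transform(prior_returns, growth=50.0)   # smooth compression
steep = logistic_transform(prior_returns, growth=1000.0)  # close to a step

print(flat)
print(np.round(middle, 3))
print(np.round(steep, 3))
```

With a growth rate of zero every asset receives the same value, and with a very high growth rate the output separates into values near one, one half at the midpoint, and values near zero.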

**Rank Transform**

The rank transform replaces values with their ascending ranks (from 1 to N), preserving the order of the dispersion information, but eliminating any relative scale. In our specific example, this transform reflects the view that the rank order of performance is predictive, but the amount of relative outperformance between securities is not.
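The rank transform is simple to sketch. In the version below, ties are broken by position, which is our own choice. The two input vectors mirror the Apple and Google example from earlier: very different dispersion, identical ranks.

```python
import numpy as np

def rank_transform(x):
    """Replace each value with its ascending rank, from 1 to N."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")  # ties broken by original position
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    return ranks

# Order of assets: [Apple, Google]
narrow_gap = np.array([0.05, 0.04])   # Apple 5%, Google 4%
wide_gap = np.array([0.05, -0.50])    # Apple 5%, Google -50%

print(rank_transform(narrow_gap))
print(rank_transform(wide_gap))
```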

**Rank & Bin Transform**

The bin transform places data into (generally) equally-spaced bins and ranks the bins (from 1 to the number of bins), maintaining some “broad-strokes” information about relative dispersion, but eliminating information about dispersion within bins.
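One way to sketch the binning step on its own is with equal-width bins spanning the minimum to the maximum of the data. The edge handling here (the maximum falls into the top bin, and identical inputs all land in bin 1) is our assumption rather than something the text specifies.

```python
import numpy as np

def bin_transform(x, n_bins):
    """Assign each value to one of n_bins equal-width bins, numbered from 1."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo or n_bins == 1:
        return np.ones(len(x))           # no dispersion information remains
    edges = np.linspace(lo, hi, n_bins + 1)
    # Use interior edges only, so the maximum lands in the top bin.
    return np.digitize(x, edges[1:-1]) + 1.0

prior_returns = np.array([0.16, 0.17, 0.18, 0.19, 0.20, 0.21])

print(bin_transform(prior_returns, n_bins=3))
print(bin_transform(prior_returns, n_bins=1))
```

Values that share a bin become indistinguishable to the optimizer, and a single bin removes return information entirely.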

**Rank & Step Transform**

The step transform can be seen as a filter. After rank ordering the estimates, those below a certain rank are zeroed out while those above are given an equal non-zero value (e.g. 1). This has the effect of eliminating dispersion information as well as relative rank beyond the threshold level.

Under our optimization framework, “top N” momentum strategies (those that rank securities, choose the top N, and equally weight them) can be formulated as a mean-variance optimal portfolio with active views on returns, volatilities, and correlations where a step transform is applied to prior returns, and both volatilities and correlations are identical across securities.
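The equivalence can be illustrated in the simplest case, where identical volatilities are paired with a common correlation of zero. The percentile convention and the inputs are our assumptions. With a nonzero common correlation, the unconstrained solution would also short the excluded assets, so a long-only constraint (not implemented here) would be needed to recover the equal-weight result.

```python
import numpy as np

def step_transform(x, cutoff):
    """1.0 if an asset's rank percentile is at or above `cutoff`, else 0.0."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    pct = np.empty(len(x))
    pct[order] = np.arange(len(x)) / len(x)  # percentiles 0, 1/N, ..., (N-1)/N
    return (pct >= cutoff).astype(float)

prior_returns = np.array([0.20, 0.19, 0.18, 0.17, 0.16])
views = step_transform(prior_returns, cutoff=0.5)  # keeps the top two of five

# Identical volatilities (20%) and identical (zero) correlations.
cov = 0.04 * np.eye(5)
raw = np.linalg.solve(cov, views)
weights = raw / raw.sum()

print(views)
print(weights)
```

A cutoff at the 0th percentile keeps every asset and assigns them all the same value, which removes all return information.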

It should be noted that several transforms can also be chained together. For example, a bin transform taken on its own may lead to a large number of data points being clustered in the same bin depending upon the dispersion of the data. By first taking a rank transform, the bin transform will preserve relative rank information across different percentiles of the data.
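The clustering problem and the effect of chaining can be seen with a single outlier. Both helper functions below are our own minimal versions of the transforms (equal-width bins, ties broken by position), not a reference implementation.

```python
import numpy as np

def rank_transform(x):
    """Replace each value with its ascending rank, from 1 to N."""
    x = np.asarray(x, dtype=float)
    ranks = np.empty(len(x))
    ranks[np.argsort(x, kind="stable")] = np.arange(1, len(x) + 1)
    return ranks

def bin_transform(x, n_bins):
    """Assign each value to one of n_bins equal-width bins, numbered from 1."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo or n_bins == 1:
        return np.ones(len(x))
    edges = np.linspace(lo, hi, n_bins + 1)
    return np.digitize(x, edges[1:-1]) + 1.0

# Five similar values and one outlier.
prior_returns = np.array([0.01, 0.02, 0.03, 0.04, 0.05, 0.50])

bin_only = bin_transform(prior_returns, n_bins=3)
rank_then_bin = bin_transform(rank_transform(prior_returns), n_bins=3)

print(bin_only)       # the outlier pushes everything else into bin 1
print(rank_then_bin)  # ranking first spreads assets evenly across bins
```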

**Transformers, Roll Out**

To explore the impact of these different transformations, we apply them to our calculated expected returns and use the result as the input to our mean-variance optimization. For each different variation, we then calculate the full-period realized Sharpe ratio. Below, we plot the Sharpe ratios in sorted order.

For relevant transforms, labels are accompanied by their parameterization. For clarification:

- Parameterization of the logistic transform specifies the growth parameter.
- Parameterization of the rank-bin transform specifies the number of bins employed.
- Parameterization of the step transform specifies the percentile cutoff.

It would appear, at first glance, that applying these transforms may be able to make significant improvements in risk-adjusted returns.

As we look closer, though, we see something rather curious: the methods at the far right seem to be dominated by the fact that they almost entirely eliminate information about dispersion (including both rank and relative scale). A single ranked bin, a step function at the 0th percentile, and a logistic function with a growth parameter of zero all effectively apply the same transform: every expected return takes on the same value, whatever that value may be.

Per the decision tree above, if all returns are identical, then the mean-variance optimization basically becomes a minimum-variance optimization.
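This collapse is easy to verify numerically for the unconstrained, fully invested case (the covariance matrix below is hypothetical). When every expected return is the same positive number, the Sharpe-optimal weights are proportional to $\Sigma^{-1}\mathbf{1}$, which is exactly the minimum-variance solution.

```python
import numpy as np

# Hypothetical covariance: five assets with a common correlation of 0.5.
vols = np.array([0.15, 0.18, 0.20, 0.22, 0.25])
corr = np.full((5, 5), 0.5) + 0.5 * np.eye(5)
cov = np.outer(vols, vols) * corr

def fully_invested(raw):
    """Scale raw weights so that they sum to one."""
    return raw / raw.sum()

flat_views = np.full(5, 0.07)  # identical expected returns; the level is arbitrary
w_flat = fully_invested(np.linalg.solve(cov, flat_views))
w_minvar = fully_invested(np.linalg.solve(cov, np.ones(5)))

print(np.allclose(w_flat, w_minvar))
```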

In fact, it seems that the more specific information we eliminate about returns, the better the Sharpe ratio gets. Does this imply that, for the test at hand, to maximize our out-of-sample Sharpe ratio we are better off eliminating information about returns? Does this mean that momentum does not matter for risk-adjusted returns?

Not so fast. We should remember that these Sharpe ratios are themselves estimates and, therefore, are shrouded in their own probability distribution. If we actually perform the tests required to determine whether differences in Sharpe ratios are statistically significant, we find that they are not.

Let us repeat that again: none of the Sharpe ratios are statistically distinguishable from one another. We cannot reject, with any meaningful confidence, the hypothesis that these results are entirely due to randomness. The realized outperformance of the minimum variance approach could reflect a superior investment process or be due entirely to luck.
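The commentary does not state which statistical test was used. As one illustrative possibility, the sketch below builds a paired bootstrap confidence interval for the difference between two Sharpe ratios. It resamples days independently, which ignores any autocorrelation in returns, and it runs on synthetic data, so it is an assumption-laden stand-in rather than the test behind the results above.

```python
import numpy as np

def sharpe(returns):
    """Per-period Sharpe ratio (not annualized, zero risk-free rate)."""
    return returns.mean() / returns.std(ddof=1)

def bootstrap_sharpe_diff(ret_a, ret_b, n_boot=2000, level=0.95, seed=0):
    """Bootstrap confidence interval for sharpe(ret_a) - sharpe(ret_b)."""
    ret_a = np.asarray(ret_a, dtype=float)
    ret_b = np.asarray(ret_b, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(ret_a)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # same days for both strategies
        diffs[i] = sharpe(ret_a[idx]) - sharpe(ret_b[idx])
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(diffs, [tail, 1.0 - tail])
    return lo, hi

# Synthetic daily returns for two hypothetical strategy variations.
rng = np.random.default_rng(1)
common = rng.normal(0.0004, 0.008, size=1000)
strategy_a = common + rng.normal(0.0, 0.004, size=1000)
strategy_b = common + rng.normal(0.0, 0.004, size=1000)

lo, hi = bootstrap_sharpe_diff(strategy_a, strategy_b)
print(f"95% interval for the Sharpe difference: [{lo:.4f}, {hi:.4f}]")
```

If the interval contains zero, the two Sharpe ratios cannot be distinguished at that confidence level.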

At first this may seem depressing or frustrating. We would argue it is *enlightening*. If the backtested dispersion between these results is high, but they are statistically indistinguishable, then we have found yet another place where diversification can work its magic!

By choosing a single transform and parameterization, we risk selecting the wrong specification. Had we started with the belief that rank and dispersion information is useful and applied no transform, we would have had one of the worst possible realized Sharpe ratios. However, the future may unfold quite differently. Based on the chart above, we could choose to completely forgo the rank and dispersion information only to find that it holds quite a bit of meaningful information going forward.

The wonderful thing about diversification is that it allows us to fully admit our ignorance of the future and build an approach that is robust to all potential outcomes. By taking an ensemble approach – building portfolios for each transform and then averaging the results – we end up with a realized Sharpe ratio that approaches the average and, our research suggests, often even exceeds it.
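A minimal sketch of the ensemble idea is shown below: build one portfolio per transform and average the weights. The set of transforms, the inputs, and the unconstrained solution scaled to unit gross exposure are all illustrative assumptions of ours.

```python
import numpy as np

def max_sharpe_weights(mu, cov):
    """Unconstrained Sharpe-optimal direction, scaled to unit gross exposure."""
    raw = np.linalg.solve(cov, mu)
    return raw / np.abs(raw).sum()

def zscore(x):
    return (x - x.mean()) / x.std(ddof=1)

def rank(x):
    ranks = np.empty(len(x))
    ranks[np.argsort(x, kind="stable")] = np.arange(1, len(x) + 1)
    return ranks

# A small, hypothetical menu of transforms applied to the same views.
transforms = {
    "identity": lambda x: x,
    "zscore": zscore,
    "rank": rank,
    "flat": lambda x: np.ones(len(x)),  # discards all return information
}

prior_returns = np.array([0.12, 0.03, -0.05, 0.08, 0.01])
vols = np.array([0.15, 0.18, 0.20, 0.22, 0.25])
corr = np.full((5, 5), 0.3) + 0.7 * np.eye(5)
cov = np.outer(vols, vols) * corr

per_transform = {
    name: max_sharpe_weights(f(prior_returns), cov)
    for name, f in transforms.items()
}
ensemble = np.mean(list(per_transform.values()), axis=0)

print(np.round(ensemble, 3))
```

Because offsetting positions across the individual portfolios net out, the ensemble's gross exposure can be lower than that of any single portfolio.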

**Conclusion**

Portfolio construction says just as much about our beliefs as it does our objective. The choice of mean-variance versus risk-parity versus equal-weight is not just a technical decision: it is one founded deeply in our beliefs about return, volatility, and correlation.

In practice, we must also consider the strength of our beliefs. We may hold active views on returns, volatilities, and correlations, but may not have much confidence in our ability to quantify them. This confidence can be addressed through approaches such as shrinkage, Black-Litterman or Entropy Pooling, or we can address them by transforming our data.

Common transformations will attempt to retain the salient features of the data while reducing the potentially harmful impact of estimation error. In the example explored in this commentary – a simple sector rotation strategy – we applied transformations to our expected returns, which were derived from prior returns. The salient features of the data were the level and dispersion of the returns.

Each transform utilized retained different amounts of information about these features. Over the backtest, we found that the transforms that eliminated the most information, effectively devolving the process into a minimum-variance optimization, had the highest backtested Sharpe ratios. At face value, this might suggest that information in returns is too noisy to be valuable for the process we employed. It is important to note, however, that none of the Sharpe ratios were distinguishable from a statistical perspective, meaning that the dominance of the minimum-variance approach may be merely due to randomness.

We believe that the important takeaway, once again, is that diversification is important. While we may believe that level and dispersion information embedded in prior returns is useful for portfolio construction, our test suggests that there may be sufficiently long periods where the signal is dominated by noise and future results may suffer. Just as with diversifying across asset classes, diversifying across transforms allows us to embrace the uncertainty of what the future holds.
