This blog post will likely be slightly more mathematical than my past posts, though I am hoping to keep it high level enough that it is still just as accessible. I will be using matrix notation in the mathematics, but will step through each part of the calculation to explain it in plain English — so please stick with me!
In this post, I want to provide an intuitive framework for understanding how unconstrained mean-variance optimization (MVO) finds the optimal solution for the maximum Sharpe ratio portfolio. With that intuition in hand, we can understand why unconstrained mean-variance optimization can be dangerous and unstable due to sampling noise in estimating expected excess returns, variances, and correlations.
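First, we will begin with our definition of the Sharpe ratio, the expected excess return per unit of risk:
$$S(w) = \frac{w^T \mu - r_f}{\sqrt{w^T \Sigma w}}$$
where $w$ is our vector of portfolio weights (summing to one), $\mu$ is a vector of expected returns for our assets, $\Sigma$ is our variance-covariance matrix, and $r_f$ is the risk-free rate.
Assuming that the excess return of the minimum-risk, fully invested portfolio is positive, the weights that solve for the maximum Sharpe ratio portfolio have the closed-form solution:
$$w^* = \frac{\Sigma^{-1}(\mu - r_f \mathbf{1})}{\mathbf{1}^T \Sigma^{-1}(\mu - r_f \mathbf{1})}$$
where $\mathbf{1}$ is a vector of ones.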
This solution does not provide much intuition; the solution for our weights is a set of simultaneous equations built around the relationships of expected returns, variances, and correlations. Fantastically unclear!
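To make the closed-form solution concrete, here is a minimal NumPy sketch. The three assets, their expected returns, volatilities, correlations, and the risk-free rate are all made-up numbers for illustration, and the helper names `max_sharpe_weights` and `sharpe` are my own.

```python
import numpy as np

def max_sharpe_weights(mu, cov, rf=0.0):
    """Fully invested maximum Sharpe ratio weights from the closed form.

    Assumes the minimum-risk, fully invested portfolio has a positive
    excess return, so that the normalizing sum below is positive.
    """
    excess = np.asarray(mu, dtype=float) - rf
    # Solve cov @ x = excess instead of forming the inverse explicitly.
    raw = np.linalg.solve(np.asarray(cov, dtype=float), excess)
    return raw / raw.sum()

# Hypothetical inputs, for illustration only.
mu = np.array([0.08, 0.05, 0.03])      # expected returns
vols = np.array([0.18, 0.10, 0.05])    # volatilities
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
cov = np.outer(vols, vols) * corr      # variance-covariance matrix
rf = 0.01                              # risk-free rate

w = max_sharpe_weights(mu, cov, rf)

def sharpe(weights):
    """Sharpe ratio of a fully invested portfolio under the inputs above."""
    return (weights @ mu - rf) / np.sqrt(weights @ cov @ weights)

print("weights:", np.round(w, 4))
print("Sharpe ratio:", round(float(sharpe(w)), 4))
```

Any other fully invested portfolio built from the same inputs should have a Sharpe ratio no higher than the one printed here.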
To gain some understanding of how this relationship plays out, let’s consider a fairly trivial case. Let’s assume that we know for certain that the returns of our assets are entirely independent of each other (which implies that their correlations are zero). This means that our variance-covariance matrix is now simply a matrix with variance terms on the diagonal and zeros on the off-diagonals. This makes the computation of the inverse variance-covariance matrix, $\Sigma^{-1}$, a trivial procedure: the diagonal terms need only be replaced by their reciprocals. The numerator of the closed-form solution, the inverse variance-covariance matrix multiplied by the vector of expected excess returns,
$$\Sigma^{-1}(\mu - r_f \mathbf{1})$$
then becomes a vector of excess returns divided by variances. The denominator,
$$\mathbf{1}^T \Sigma^{-1}(\mu - r_f \mathbf{1})$$
is simply the sum of those values. Therefore, our optimal weights for independent assets are proportional to their relative expected excess returns per unit risk, where “risk” is measured as return variance.
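The independent-asset case is easy to check numerically. The sketch below uses three hypothetical assets with made-up expected excess returns and variances. It builds the diagonal variance-covariance matrix, inverts it by taking reciprocals of the diagonal, and normalizes.

```python
import numpy as np

# Hypothetical inputs, for illustration only.
excess = np.array([0.06, 0.02, 0.04])        # expected excess returns
variances = np.array([0.04, 0.0025, 0.01])   # return variances

# Independent assets: variances on the diagonal, zeros elsewhere.
cov = np.diag(variances)

# Inverting a diagonal matrix only requires reciprocals of the diagonal.
inv_cov = np.diag(1.0 / variances)

# Numerator of the closed form: excess return per unit of variance.
ratios = inv_cov @ excess

# Denominator: the sum of those ratios, so the weights sum to one.
weights = ratios / ratios.sum()

print("excess return / variance:", ratios)
print("weights:", np.round(weights, 4))
```

Note that the asset with the lowest variance receives the largest weight here even though its expected excess return is the smallest.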
Another way to think about this solution is that in the case of independent assets, mean-variance optimization will first leverage every asset so that its variance is equal to the maximum variance, and then set weights in proportion to the leveraged expected returns. So if we consider the case where we have two assets, SPY and AGG, whose returns are independent, the weights for the maximum Sharpe ratio portfolio of these two assets would be found by:
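- Coming up with our scaling factor: the multiple that will scale AGG’s variance to be in line with SPY’s. Writing $\sigma^2$ for return variance, the solution is $k = \sigma^2_{SPY} / \sigma^2_{AGG}$.
- Scaling AGG’s expected excess return, $\mu_{AGG} - r_f$, by this factor to get $k(\mu_{AGG} - r_f)$.
- Weighting SPY and AGG in proportion to their expected excess return terms.
Our solution is:
$$w_{SPY} = \frac{\mu_{SPY} - r_f}{(\mu_{SPY} - r_f) + k(\mu_{AGG} - r_f)}, \qquad w_{AGG} = \frac{k(\mu_{AGG} - r_f)}{(\mu_{SPY} - r_f) + k(\mu_{AGG} - r_f)}$$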
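The three steps above can be sketched in a few lines of Python. The expected excess returns and variances below are placeholder values I picked for illustration, not estimates for SPY or AGG. The final lines cross-check the result against the "excess return divided by variance" rule from the independent-asset case.

```python
# Hypothetical annualized inputs, for illustration only.
spy_excess, spy_var = 0.06, 0.0225   # 15% volatility
agg_excess, agg_var = 0.02, 0.0016   # 4% volatility

# Step 1: the multiple that brings AGG's variance in line with SPY's.
scale = spy_var / agg_var

# Step 2: scale AGG's expected excess return by that factor.
agg_scaled_excess = scale * agg_excess

# Step 3: weight in proportion to the (scaled) expected excess returns.
total = spy_excess + agg_scaled_excess
w_spy = spy_excess / total
w_agg = agg_scaled_excess / total

# Cross-check: weights proportional to excess return / variance.
spy_ratio = spy_excess / spy_var
agg_ratio = agg_excess / agg_var
w_spy_check = spy_ratio / (spy_ratio + agg_ratio)

print(f"scaling factor: {scale:.4f}")
print(f"SPY weight: {w_spy:.4f}, AGG weight: {w_agg:.4f}")
```

Both routes give the same weights, which is the point: the scaling procedure is just a restatement of the closed-form solution for independent assets.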
Now we have developed a fairly good intuition for how unconstrained MVO works for independent assets; how can we translate this to the non-trivial case where we have assets with a non-zero correlation structure?
Fortunately, statistics provides us with a tool, Principal Component Analysis (PCA), for translating a non-zero correlation structure into a zero correlation structure. PCA will take an $N \times N$ variance-covariance matrix of our assets and create a set of $N$ linearly uncorrelated principal portfolios (made up of our assets; sometimes referred to in the literature as eigen-portfolios) and their corresponding variances.
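The expected excess return for each principal portfolio is simply the weighted sum of its assets’ expected excess returns, using the principal portfolio’s weights. So if our first principal portfolio has weights:
$$v_1 = (v_{1,1}, v_{1,2}, \ldots, v_{1,N})^T$$
then its expected excess return is defined by:
$$v_1^T(\mu - r_f \mathbf{1}) = \sum_{i=1}^{N} v_{1,i}\,(\mu_i - r_f)$$
where $\mu_i - r_f$ is the expected excess return of asset $i$.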
With the expected excess return, variance, and the statistical guarantee that the returns from these principal portfolios are linearly-uncorrelated, we can easily apply our above base-case intuition: leverage the principal portfolios until they have equal variance and then the optimal weights are the relative proportion of their leveraged expected excess returns.
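Here is a small sketch of that procedure, again with made-up inputs. It uses an eigendecomposition of the variance-covariance matrix (via `numpy.linalg.eigh`) as the PCA step, applies the independent-asset rule to the principal portfolios, and maps the result back to asset weights. One caveat: the eigenvectors returned by NumPy have unit length rather than weights that sum to one, so the principal portfolios here are only defined up to scale, which does not affect the final normalized weights.

```python
import numpy as np

# Hypothetical inputs, for illustration only.
excess = np.array([0.07, 0.04, 0.02])   # expected excess returns
vols = np.array([0.18, 0.10, 0.05])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
cov = np.outer(vols, vols) * corr

# PCA step: each column of eigvecs holds one principal portfolio's weights,
# and eigvals holds the corresponding variances (in ascending order).
eigvals, eigvecs = np.linalg.eigh(cov)

# Expected excess return of each principal portfolio.
pp_excess = eigvecs.T @ excess

# The principal portfolios are uncorrelated: their covariance is diagonal.
pp_cov = eigvecs.T @ cov @ eigvecs

# Independent-asset rule: allocate in proportion to excess return / variance.
pp_alloc = pp_excess / eigvals

# Map the principal portfolio allocations back to the original assets.
raw = eigvecs @ pp_alloc
w_pca = raw / raw.sum()

# Direct closed-form solution for comparison.
raw_direct = np.linalg.solve(cov, excess)
w_direct = raw_direct / raw_direct.sum()

print("via principal portfolios:", np.round(w_pca, 4))
print("via closed form:         ", np.round(w_direct, 4))
```

The two sets of weights agree, so the principal portfolio view is another way of reading the same closed-form solution rather than a different optimizer.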
So what, then, is the intuition behind the principal portfolios? This one is a bit more difficult to answer. Traditionally, the intuition behind each principal portfolio is that it represents some sort of independent risk factor. What those risk factors are depends on the assets in the portfolio we are exploring. When the assets are all U.S. domestic equities, the principal portfolio with the largest variance is typically a systematic risk factor; the rest of the portfolios typically correspond highly to sector or industry factors. When the portfolio is made of global assets, we tend to find equity risk factors, interest rate risk factors, commodity factors — and frequently geographic risk factors.
Now that we have a fairly “simple” framework for understanding how unconstrained MVO creates a solution for correlated assets, we can discuss the risks. The biggest risk comes from the fact that in reality, we never know the true variance-covariance structure, but rather must construct a sample estimate, E, from historical returns. Random Matrix Theory (RMT), a branch of probability that deals with matrix-shaped random variables, tells us that when we perform a principal component analysis on E, the composition of the principal portfolios with the lowest variance is likely dominated by sampling error. In other words: they’re noise.
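A quick simulation gives a feel for how much sampling error alone can distort the estimated variances. In this sketch, the 20 assets, the 60 observations, and the random seed are arbitrary choices of mine. The simulated assets are truly independent with unit variance, so every true principal portfolio has a variance of exactly one, and any spread in the sample eigenvalues is pure estimation noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n_assets, n_obs = 20, 60   # arbitrary sizes, for illustration only

# True model: independent assets with unit variance, so the true
# variance-covariance matrix is the identity and every eigenvalue is 1.
returns = rng.standard_normal((n_obs, n_assets))

# Sample estimate E of the variance-covariance matrix (assets in columns).
E = np.cov(returns, rowvar=False)

# Variances of the sample principal portfolios, in ascending order.
sample_eigvals = np.linalg.eigvalsh(E)

print("true variance of every principal portfolio: 1.0")
print("smallest sample variance:", round(float(sample_eigvals[0]), 3))
print("largest sample variance: ", round(float(sample_eigvals[-1]), 3))
```

Even though nothing distinguishes one asset from another in the true model, the sample estimate reports some principal portfolios as far less risky than others.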
Remember that unconstrained MVO will leverage the low-variance portfolios until they have the same variance as the highest-variance portfolio. If a low-variance principal portfolio has a high relative expected excess return, applying leverage can cause it to balloon and dominate the portfolio. This is why unconstrained MVO is often considered unstable: from one sampling period to the next, measurement noise can end up dominating portfolio composition.
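To illustrate the instability, the sketch below draws two independent return samples from the same hypothetical true model and runs unconstrained MVO on each. To isolate the effect of covariance noise, I hold the expected excess returns fixed at their true values and only estimate the variance-covariance matrix. All of the parameters (10 assets, 60 observations, a 0.5 correlation between every pair, the seed) are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_assets, n_obs = 10, 60   # arbitrary sizes, for illustration only

# Hypothetical true model: equal volatilities, 0.5 pairwise correlation.
vols = np.full(n_assets, 0.05)
corr = np.full((n_assets, n_assets), 0.5)
np.fill_diagonal(corr, 1.0)
true_cov = np.outer(vols, vols) * corr
excess = np.linspace(0.004, 0.008, n_assets)   # held fixed (assumed known)

def mvo_weights(excess, cov):
    """Unconstrained, fully invested maximum Sharpe ratio weights."""
    raw = np.linalg.solve(cov, excess)
    return raw / raw.sum()

def weights_from_one_sample():
    """Estimate the covariance from a fresh sample, then optimize."""
    returns = rng.multivariate_normal(np.zeros(n_assets), true_cov, size=n_obs)
    sample_cov = np.cov(returns, rowvar=False)
    return mvo_weights(excess, sample_cov)

w_true = mvo_weights(excess, true_cov)
w_a = weights_from_one_sample()
w_b = weights_from_one_sample()

print("true-model weights:", np.round(w_true, 2))
print("sample A weights:  ", np.round(w_a, 2))
print("sample B weights:  ", np.round(w_b, 2))
print("largest change between samples:", round(float(np.abs(w_a - w_b).max()), 2))
```

The two samples come from exactly the same distribution, so any difference between the two sets of weights is attributable to estimation noise in the variance-covariance matrix.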