The Research Library of Newfound Research


Capital Efficiency in Multi-factor Portfolios

This blog post is available as a PDF here.

Summary

  • The debate over the best way to build a multi-factor portfolio – mixed or integrated – rages on.
  • Last week we explored whether the argument that integrated portfolios are more capital efficient than mixed portfolios held up in realized return data for several multi-factor ETFs.
  • This week we explore whether integrated portfolios are more capital efficient than mixed portfolios in theory.  We find that, under some broad assumptions, they definitively are.
  • We find that for specific implementations, mixed portfolios can be more efficient, but it requires a higher degree of concentration in security selection.

This commentary is highly technical, relying on both probability theory and calculus, and requires rendering a significant number of equations.  Therefore, it is only available as a PDF download.

For those less inclined to read through mathematical proofs, the important takeaway is this: under some broad assumptions, integrated multi-factor portfolios are provably more capital efficient (i.e. more factor exposure per dollar) than mixed approaches.

Is That Leverage in My Multi-Factor ETF?

This blog post is available as a PDF here.

Summary

  • The debate over the best way to build a multi-factor portfolio – mixed or integrated – rages on.
  • FTSE Russell published a video supporting their choice of an integrated approach, arguing that by using the same dollar to target multiple factors at once, their portfolio makes more efficient use of capital than a mixed approach.
  • We decompose the returns of several mixed and integrated multi-factor portfolios and find that integrated portfolios do not necessarily create more capital efficient allocations to factor exposures than their mixed peers.

 

A colleague sent us a video this week from FTSE Russell, titled Factor Indexing: Avoiding exposure to nothing.

In the video, FTSE Russell outlines their argument for why they prefer an integrated – or composite – multi-factor index construction methodology over a mixed one.

As a reminder, a mixed approach is one in which a portfolio is built for each factor individually, and those portfolios are combined as sleeves to create a multi-factor portfolio.  An integrated approach is one in which securities are selected that have high scores across multiple factors, simultaneously.

The primary argument held forth by integration advocates is that in a mixed approach, securities selected for one factor may have negative loadings on another, effectively diluting factor exposures.

For example, the momentum stock sleeve in a mixed approach may, unintentionally, have a negative loading on the value factor.  So, when combined with the value sleeve, it dilutes the portfolio’s overall value exposure.

This is a topic we’ve written about many, many times before, and we think the argument ignores a few key points that we have addressed in those prior commentaries.

FTSE Russell did, however, put forth an interesting new argument.  The argument was this: an integrated approach is more capital efficient because the same dollar can be utilized for exposure to multiple factors.

 

$1, Two Exposures

To explain what FTSE Russell means, we’ll use a very simple example.

Consider the recently launched REX Gold Hedged S&P 500 ETF (GHS) from REX Shares.  The idea behind this ETF is to provide more capital efficient exposure to gold for investors.

Previously, to include gold, most retail investors would have to explicitly carve out a slice of their portfolio and allocate to a gold fund.  So, for example, an investor who held 100% in the SPDR S&P 500 ETF (“SPY”) could carve out 5% and buy the SPDR Gold Trust ETF (“GLD”).

The “problem” with this approach is that while it introduces gold, it also dilutes the equity exposure.

GHS overlays the equity exposure with gold futures, providing exposure to both.  So now instead of carving out 5% for GLD, an investor can carve out 5% for GHS.  In theory, they retain their 100% notional exposure to the S&P 500, but get an additional 5% exposure to gold (well, gold futures, at least).

So does it work?

One way to check is by trying to regress the returns of GHS onto the returns of SPY and GLD.  In effect, this tries to find the portfolio of SPY and GLD that best explains the returns of GHS.
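For the quantitatively inclined, a minimal sketch of this kind of return-based regression is shown below.  It assumes daily adjusted closing prices have already been pulled (e.g. from Yahoo! Finance) into a DataFrame; the column names and library choice are our own and not part of the original analysis.

```python
# A minimal sketch of the return-based regression described above.
# Assumes `prices` is a DataFrame of daily adjusted closes with columns "GHS", "SPY", and "GLD".
import pandas as pd
import statsmodels.api as sm

def factor_loadings(prices: pd.DataFrame) -> pd.Series:
    """Regress GHS returns on SPY and GLD returns; return the intercept and betas."""
    returns = prices.pct_change().dropna()
    X = sm.add_constant(returns[["SPY", "GLD"]])  # intercept plus the two explanatory return series
    model = sm.OLS(returns["GHS"], X).fit()
    return model.params

# Example usage:
# betas = factor_loadings(prices)
# total_notional = betas[["SPY", "GLD"]].sum()  # e.g. 0.75 + 0.88 = 1.63
```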

[Chart: regression of GHS returns on SPY and GLD]

Source: Yahoo! Finance.  Calculations by Newfound Research.

 

What we see is that the portfolio that best describes the returns of GHS is 0.75 units of SPY and 0.88 units of GLD.

So not necessarily the perfect 1:1 we were hoping for, but a single dollar invested in GHS is like having a $1.63 portfolio in SPY and GLD.

Note: This is the same math that goes into currency-hedged equity portfolios, which is why we do not generally advocate using them unless you have a view on the currency.  For example, $1 invested in a currency-hedged European equity ETF is effectively the same as having $1 invested in un-hedged European equities and shorting $1 notional exposure in EURUSD.  You’re effectively layering a second, highly volatile, bet on top of your existing equity exposure.

This is the argument that FTSE Russell is making for an integrated approach.  By looking for stocks that have simultaneously strong exposure to multiple factors at once, the same dollar can tap into multiple excess return streams.  Furthermore, theoretically, the more factors included in a mixed portfolio, the less capital efficient it becomes.

Does it hold true, though?

 

The Capital Efficiency of Mixed and Integrated Multi-Factor Approaches 

Fortunately, there is a reasonably easy way to test the veracity of this claim: run the same regression we did on GHS, but on multi-factor ETFs using a variety of explanatory factor indices.

Here is a quick outline of the Factors we will utilize:

Factor | Source | Description
Market – RF | Fama/French | Total U.S. stock market return, minus t-bills
HML Devil | AQR | Value premium
SMB | Fama/French | Small-cap premium
UMD | AQR | Momentum premium
QMJ | AQR | Quality premium
BAB | AQR | Anti-beta premium
LV-HB | Newfound | Low-volatility premium

Note: Academics and practitioners have yet to settle on whether there is an anti-beta premium (where stocks with low betas outperform those with high betas) or a low-volatility premium (where stocks with low volatilities outperform those with high volatilities).   While similar, these are different factors.  However, as far as we are aware, there are no reported long-short low-volatility factors that are publicly available.  We did our best to construct one using a portfolio that is long one share of SPLV and short one share of SPHB, rebalanced monthly.

We will test a number of mixed-approach ETFs and a number of integrated-approach ETFs as well.

Of those in the mixed group, we will use Global X’s Scientific Beta U.S. ETF (“SCIU”) and Goldman Sachs’ ActiveBeta US Equity ETF (“GSLC”).

In the integrated group, we will use John Hancock’s Multifactor Large Cap ETF (“JHML”), JPMorgan’s Diversified Return US Equity ETF (“JPUS”), iShares’ Edge MSCI Multifactor USA ETF (“LRGF”), and FlexShares’ Morningstar U.S. Market Factor Tilt ETF (“TILT”).

We’ll also show the factor loadings for the SPDR S&P 500 ETF (“SPY”).

If the argument from FTSE Russell holds true, we would expect to see that the factor loadings for the mixed approach portfolios should be significantly lower than those of the integrated approach portfolios.  Since SCIU and GSLC each target four unique factors under the hood, and NFFPI targets five, we would expect their loadings to be roughly 1/5th to 1/4th of those found in the integrated approaches.

The results:

[Chart: factor loadings for the multi-factor ETFs and SPY]

Source: AQR, Kenneth French Data Library, and Yahoo! Finance.  Calculations by Newfound Research.

 

Before we dig into these, it is worth pointing out two things:

  • Factor loadings should be thought of on both an absolute and a relative basis. For example, while GSLC has almost no loading on the size premium (SMB), the S&P 500 has a negative loading on that factor.  So compared to the large-cap benchmark, GSLC has a significantly higher relative loading.
  • Not all of these loadings are statistically significant at a 95% level.

So do integrated approaches actually create more internal leverage?  Let’s look at the total notional factor exposure for each ETF:

[Chart: total notional factor exposure for each ETF]

Source: AQR, Kenneth French Data Library, and Yahoo! Finance.  Calculations by Newfound Research.

 

It does, indeed, look like the integrated approaches have more absolute notional factor exposure.  Only SCIU appears to keep up – and it was the mixed ETF that had the most statistically non-significant loadings!

But, digging deeper, we see that not all factor exposure is good factor exposure.  For example, JPUS has significantly negative loadings on UMD and QMJ, which we would expect to be a performance drag.

Looking at the sum of factor exposures, we get a different picture.

[Chart: sum of factor exposures for each ETF]

Source: AQR, Kenneth French Data Library, and Yahoo! Finance.  Calculations by Newfound Research.

 

Suddenly the picture is not so clear.  Only TILT seems to be the runaway winner, and that may be because it holds a simpler multi-factor mandate of only small-cap and value tilts.

 

Conclusion

The theory behind FTSE Russell’s argument for preferring an integrated multi-factor approach makes sense: by targeting multiple factors with the same stock, we can theoretically create implicit leverage with our money.

Unfortunately, this theory did not hold up in the numbers.

Why?  We believe there are two potential reasons.

  • First, selecting for a factor in a mixed approach does not mean avoiding other factors. For example, while unintentional, a sleeve selecting for value could contain a small-cap bias or a quality bias.
  • Second, in an integrated approach, preferring securities with high loadings on multiple factors simultaneously may mean passing over securities with extremely high loadings on a single factor. This may create a dilutive effect that offsets the benefit of capital efficiency.

In addition, we have concerns as to whether the integrated approach may degrade some of the very significant diversification benefits that can be harvested by combining factors.

Ultimately, while an interesting theoretical argument, we do not believe that capital efficiency is a justified reason for preferring the opaque complexity of an integrated approach over the simplicity of a mixed one.

 

Client Talking Points

  • At the cutting edge of investment research, there is often disagreement on the best way to build portfolios.
  • While a strongly grounded theoretical argument is necessary, it does not suffice: results must also be evident in empirical data.
  • To date, the argument that an integrated approach of building a multi-factor portfolio is more capital efficient than the simpler mixed approach does not prove out in the data.

Multi-Factor: Mix or Integrate?

This blog post is available as a PDF here.

Summary

  • Recently a paper was published by AQR where the authors advocate for an integrated approach to multi-factor portfolios, preferring securities that exhibit strong characteristics across all desired factors instead of a mixed approach, where securities are selected based upon extreme exposure to a single characteristic.
  • We believe the integrated approach fails to acknowledge the impact of the varying lengths over which different factors mature, ultimately leading to a portfolio more heavily influenced by higher turnover factors.

The Importance of Factor Maturity
Cliff Asness, founder of AQR, recently published a paper titled My Factor Philippic.  This paper was written in response to the recently popularized article How Can “Smart Beta” Go Horribly Wrong? which was co-authored by Robert Arnott, co-founder of Research Affiliates.

Arnott argues that many popular factors are currently historically overvalued and, furthermore, that the historical excess return offered by some recently popularized factors can be entirely explained by rising valuation trends in the last 30 years.
Caveat emptor, warns Arnott: valuations always matter.

Much to our delight (after all, who doesn’t like to see two titans of industry go at it?), Asness disagrees.

One of the primary arguments laid out by Asness is that valuation is a meaningless predictor for factors with high turnover.

The intuition behind this argument is simple: while valuations may be a decent predictor of forward annualized returns for broad markets over the next 5-to-10 years, the approach only works because the basket of securities stays mostly constant.  For example, valuations for U.S. equities may be a good predictor because we expect the vast majority of the basket of U.S. equities to stay constant over the next 5-to-10 years.

The same is not true for many factors.  For example, let’s consider a high turnover factor like momentum.

Valuations of a momentum basket today are a poor predictor of annualized returns of a momentum strategy over the next 5-to-10 years because the basket of securities held could be 100% different three months from now.

Unless the same securities are held in the basket, valuation headwinds or tailwinds will not necessarily be realized.

For the same reason, valuation is also poor as an explanatory variable of factor returns.  Asness argues that Arnott’s warning of valuation being the secret driver of factor returns is unwarranted in high turnover factors.

Multi-Factor: Mix or Integrate?
On July 2nd, Fitzgibbons, Friedman, Pomorski, and Serban (FFPS) – again from AQR – published a paper titled Long-Only Style Investing: Don’t Just Mix, Integrate.  

The paper attempts to conclude the current debate about the best way to build multi-factor portfolios.  The first approach is to mix, where a portfolio is built by combining stand-alone factor portfolios.  The second approach is to integrate, where a portfolio is built by selecting securities that have simultaneously strong exposure to multiple factors at once.

A figure from the paper does a good job of illustrating the difference.  Below, a hypothetical set of stocks is plotted based upon their current valuation and momentum characteristics.

[Figure: scatter plots from the AQR paper, comparing value-only, mixed, and integrated selection]

In the top left, a portfolio of deep value stocks is selected.  In the top right, the mix approach is demonstrated, where the deepest value and the highest momentum stocks are selected.

In the bottom left, the integrated approach is demonstrated, where the securities simultaneously exhibiting strong valuation and momentum characteristics are selected.

Finally, in the bottom right we can see how these two approaches differ: with yellow securities being those only found in the mix portfolio and blue securities being found only in the integrated portfolio.

It is worth noting that the ETF industry has yet to make up its mind on the right approach.

GlobalX and Goldman Sachs prefer the mix approach in their ETFs (SCIU / GSLC) while JPMorgan and iShares prefer the integrate approach (JPUS / LRGF).

The argument made by those taking the integrated approach is that they are looking for securities with well-rounded exposures rather than those with extreme singular exposures.  Integrators argue that this approach helps them avoid holding securities that might cancel each other out.  If we look back towards the mix example above (top right), we can see that many securities selected due to strength in one factor are actually quite poor in the other.

Integrators claim that this inefficiency can create a drag in the mix portfolio.  Why hold something with strong momentum if it has a very poor valuation score that is only going to offset it?

We find it somewhat ironic that FFPS and Asness both publish for AQR, because we would argue that Asness’s argument points out the fundamental flaw in the theory outlined by integrators.  Namely: the horizons over which the premia mature differ.

Therefore, a strong positive loading in a factor like momentum is not necessarily offset by a strong negative loading in a factor like value.  Furthermore, by integrating we run the risk of the highest turnover factor actually dominating the integrated selection process.

Data
In the rest of this commentary, we will be using industry data from the Kenneth French data library.  For momentum scores, we calculate trailing 12-minus-1 month total returns and then calculate cross-sector z-scores.[1]  For valuation scores, we calculate a normalized 5-year dividend yield score and then calculate cross-sector z-scores.[2]
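As a rough sketch (and only a sketch: the column layout, the exact 12-1 momentum definition, and the way we normalize the 5-year dividend yield below are our assumptions, not a replication of the original calculations), the scoring might look something like this:

```python
# Minimal sketch: cross-sector z-scores for momentum and valuation.
# `total_return_index` holds a monthly total-return index per industry (columns = industries);
# `dividend_yield` holds a monthly dividend yield series per industry.
import pandas as pd

def cross_sector_zscore(scores: pd.DataFrame) -> pd.DataFrame:
    """Standardize each date's scores across industries (row-wise z-score)."""
    return scores.sub(scores.mean(axis=1), axis=0).div(scores.std(axis=1), axis=0)

def momentum_zscores(total_return_index: pd.DataFrame) -> pd.DataFrame:
    # 12-1 momentum: trailing 12-month return, skipping the most recent month
    momentum = total_return_index.shift(1) / total_return_index.shift(12) - 1
    return cross_sector_zscore(momentum)

def valuation_zscores(dividend_yield: pd.DataFrame) -> pd.DataFrame:
    # yield relative to its trailing 5-year (60-month) average, then z-scored across sectors
    normalized_yield = dividend_yield / dividend_yield.rolling(60).mean()
    return cross_sector_zscore(normalized_yield)
```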

Do Factor Premia Actually Mature at Different Time Periods?
In his paper, Asness referenced the turnover of a factor portfolio as an important variable.  We prefer to think of high turnover factors as factors whose premium matures more quickly.

For example, if we buy a stock because it has high relative momentum, our expectation is that we will likely hold it for longer than a day, but likely much shorter than a year.  Therefore, a strategy built off relative momentum will likely have high turnover because the premium matures quickly.

On the other hand, if we buy a value stock, our expectation is that we will have to hold it for up to several years for valuations to adequately reverse.  This means that the value premium takes longer to mature – and the strategy will likely have lower turnover.

We can see this difference in action by looking at how valuation and momentum scores change over time.

[Chart: momentum and valuation z-score changes over time for the non-durables (NoDur) industry]

We see similar pictures for other industries.  Yet, looks can be deceiving and the human brain is excellent at finding patterns where there are none (especially when we want to see those patterns).  Can we actually quantify this difference?

One way is to try to build a model that incorporates both the randomness of movement and how fast these scores mean-revert.  Fitting our data to this model would tell us about how quickly each premium matures.

One such model is called an Ornstein-Uhlenbeck process (“OU process”).  An OU process follows the following stochastic differential equation:

$$ dz_t = \theta \left( \mu - z_t \right) dt + \sigma \, dW_t $$

To translate this into English using an example: the change in value z-score from one period to the next can be estimated as a “magnetism” back to fair value plus some randomness.  In the equation, theta tells us how strong this magnetism is, mu tells us what fair value is, and sigma tells us how much influence the randomness has.
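To make the dynamics concrete, here is a minimal discretized simulation of such a process.  The parameter values in the usage example are purely illustrative.

```python
# Minimal sketch: simulate a discretized Ornstein-Uhlenbeck process,
#   dz = theta * (mu - z) * dt + sigma * dW
import numpy as np

def simulate_ou(theta: float, mu: float, sigma: float,
                z0: float = 0.0, dt: float = 1.0 / 12.0,
                n_steps: int = 120, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    z = np.empty(n_steps + 1)
    z[0] = z0
    for t in range(n_steps):
        drift = theta * (mu - z[t]) * dt                      # "magnetism" back toward fair value
        shock = sigma * np.sqrt(dt) * rng.standard_normal()   # randomness
        z[t + 1] = z[t] + drift + shock
    return z

# Illustrative usage: a fast-reverting series versus a slow-reverting one
# fast = simulate_ou(theta=1.1, mu=0.0, sigma=1.4)
# slow = simulate_ou(theta=0.1, mu=0.0, sigma=0.4)
```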

For our momentum and valuation z-scores, we would expect mu to be near-zero, as over the long-run we would not expect a given sector to exhibit significantly more or less relative momentum or relative cheapness/richness than peer sectors.

Given that we also believe that the momentum premium is realized over a shorter horizon, we would also expect that theta – the strength of the magnetism, also called the speed of mean reversion – will be higher.  Since that strength of magnetism is higher, we will also need sigma – the influence of randomness – to be larger to overcome it.

So how do the numbers play out?[3]

For the momentum z-scores:

Industry | Theta | Mu | Sigma
NoDur | 0.97 | 0.02 | 1.00
Durbl | 1.00 | 0.03 | 1.63
Manuf | 1.22 | -0.03 | 0.96
Enrgy | 0.98 | 0.06 | 1.69
HiTec | 1.04 | 0.03 | 1.49
Telcm | 1.15 | -0.07 | 1.52
Shops | 1.22 | 0.03 | 1.24
Hlth | 0.84 | 0.11 | 1.39
Utils | 1.48 | -0.09 | 1.61
Other | 1.18 | -0.09 | 1.13
Average | 1.10 | 0.00 | 1.36

For the valuation z-scores:

Industry | Theta | Mu | Sigma
NoDur | 0.11 | -0.20 | 0.34
Durbl | 0.08 | 0.58 | 0.49
Manuf | 0.13 | 0.01 | 0.37
Enrgy | 0.07 | 0.19 | 0.40
HiTec | 0.09 | 0.23 | 0.33
Telcm | 0.07 | 0.03 | 0.38
Shops | 0.11 | -0.15 | 0.36
Hlth | 0.05 | -0.47 | 0.36
Utils | 0.06 | -0.35 | 0.40
Other | 0.11 | -0.01 | 0.37
Average | 0.08 | -0.01 | 0.38

We can see results that echo our expectations: the speed of mean-reversion is significantly lower for value than for momentum.  In fact, the average theta for value is less than 1/10th of that for momentum.

The math behind an OU-process also lets us calculate the half-life of the mean-reversion, allowing us to translate the speed of mean reversion to a more interpretable measure: time.

The half-life for momentum z-scores is 0.27 years, or about 3.28 months.  The half-life for valuation z-scores is 3.76 years, or about 45 months.  These values more or less line up with our intuition about turnover in momentum versus value portfolios: we expect to hold momentum stocks for a few months but value stocks for a few years.

Another way to analyze this data is by looking at how long the relative ranking of a given industry group stays consistent in its valuation or momentum metric.  Based upon our data, we find that valuation ranks stayed constant for an average of approximately 120 trading days, while the average length of time an industry group held a consistent momentum rank was only just over 50 days.

Maturity’s Influence on Integration
The scatter plots drawn by FFPS are deceiving because they only show a single point in time.  What they fail to show is how the locations of the dots change over time.

With the expectation that momentum scores will change more rapidly than valuation scores, we would expect to see points move more rapidly up and down along the Y-axis than we would see them move left and right along the X-axis.
Given this, our hypothesis is that changes in the integration score (defined below) are driven more significantly by changes in the momentum score.

To explore this, we create an integration score, which is simply the sum of the valuation and momentum z-scores.  Those industries in the top 30% of integration scores at any time are held by the integrated portfolio.

To distill the overall impact of momentum score changes versus valuation score changes, we need to examine the absolute value of these changes.  For example, if the momentum score change was +0.5 and the valuation score change was -0.5, the overall integration score change is 0.  Both momentum and value, in this case, contributed equally (or, contributed 50% each), to the overall score change.

So a simple formula for measuring the relative percentage contribution to score change is:

$$ \text{Momentum Contribution} = \frac{\left| \Delta \text{Momentum Score} \right|}{\left| \Delta \text{Momentum Score} \right| + \left| \Delta \text{Valuation Score} \right|} $$

If value and momentum score changes contributed equally, we would expect the average contribution to equal 50%.
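A minimal sketch of this calculation follows (the series names are assumptions); a value of 0.5 means momentum and valuation contributed equally to a period’s change.

```python
# Minimal sketch: momentum's share of each period's integration-score change,
# measured on the absolute value of the changes.
import pandas as pd

def momentum_contribution(momentum_z: pd.Series, valuation_z: pd.Series) -> pd.Series:
    d_mom = momentum_z.diff().abs()
    d_val = valuation_z.diff().abs()
    return (d_mom / (d_mom + d_val)).dropna()  # 0.5 = equal contribution

# average_contribution = momentum_contribution(momentum_z, valuation_z).mean()
```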

The average contribution based upon our analysis is 72.18% (with a standard error of 0.24%).  The interquartile range is 59.02% to 91.19% and the median value is 79.47%.

Put simply: momentum score changes are a much more significant contributor to integration score changes than valuation score changes are.

We find that this effect is increased when we examine only periods when an industry is added or deleted from the integrated portfolio.  In these periods, the average contribution climbs to 78.46% (with a standard error of 0.69%), with an interquartile range of 70.28% to 94.46% and a median value of 85.57%.

Changes in the momentum score contribute much more significantly than value score changes.

Integration: More Screen than Tilt?
The objective of the integrated portfolio approach is to find securities with the best blend of characteristics.

In reality, because one set of characteristics changes much more slowly, certain securities can be sidelined for prolonged periods of time.

Let’s consider a simplified example.  Every year, the 10 industry groups are assigned a random, but unique, value score between 1 and 10.

Similarly, every month, the 10 industry groups are assigned a random, but unique, momentum score between 1 and 10.

The integration score for each industry group is calculated as the sum of these two scores.  Each month, the top 3 scoring industry groups are held in the integrated portfolio.

What is the probability of an industry group being in the integrated portfolio, in any given month, if it has a value score of 1?  What about 2?  What about 10?
Numerical simulation gives us the following probabilities:

[Chart: probability of monthly inclusion in the integrated portfolio, by value score]

So if these are the probabilities of an industry group being selected in a given month given a certain value score, what is the probability of an industry group not being selected into the integrated portfolio at all during the year it has a given value score?

[Chart: probability of never being included in the integrated portfolio during the year, by value score]

If an industry group starts the year with a value score of 1, there is a 99.1% probability that it will never be selected into the integrated portfolio all year.
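This toy experiment can be reproduced with a short Monte Carlo simulation along the lines of the sketch below.  The setup details (in particular, how ties in the integration score are broken) are our own assumptions, so the exact figures it produces will differ from the chart above.

```python
# Minimal sketch: estimate the probability that an industry with a given annual value score
# is never selected into the top-3 integrated portfolio during a 12-month year.
import numpy as np

def prob_never_selected(value_score: int, n_industries: int = 10, n_months: int = 12,
                        n_trials: int = 20_000, seed: int = 0) -> float:
    rng = np.random.default_rng(seed)
    other_values = np.array([v for v in range(1, n_industries + 1) if v != value_score])
    never = 0
    for _ in range(n_trials):
        selected = False
        for _ in range(n_months):
            momentum = rng.permutation(np.arange(1, n_industries + 1))  # unique scores 1..10
            scores = np.concatenate(([value_score + momentum[0]],       # our industry is index 0
                                     other_values + momentum[1:]))
            if 0 in np.argsort(scores)[-3:]:  # top 3 integration scores held; ties broken arbitrarily
                selected = True
                break
        never += not selected
    return never / n_trials

# prob_never_selected(1)  # estimated probability for the lowest value score
```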

Conclusion
While we believe this topic deserves a significantly deeper dive (one which we plan to perform over the coming months), we believe the cursory analysis highlights a very important point: an integrated approach runs a significant risk of being more heavily influenced by higher turnover factors.  While FFPS believe there are first order benefits to the integrated approach, we think the jury is still out and that those first order effects may actually be simply due to an increased exposure to higher turnover factors.  Until a more substantial understanding of the integrated approach is established, we continue to believe that a mixed approach is prudent.  After all, if we don’t understand how a portfolio is built and the source of the returns it generates, how can we expect to manage risk?


[1] Z-scoring standardizes, on a relative basis, what would otherwise be arbitrary values.
[2] We use yield versus historical as our measure for valuation as a matter of convenience.  However, there are two theoretical arguments justifying this choice.  First, the most common measure of value is book-to-market (B/M), which assumes that fair valuation of a company is its book value.  Another such model is the dividend discount model.  If we assume a constant growth rate of dividends and a constant cost of capital for the company, then book value should be proportional to total dividends, or, equivalently, book-to-market proportional to dividend yield.  Similarly, if you assume a constant long-term payout ratio, dividends per share are proportional to earnings per share, which makes yield inversely proportional to price-to-earnings, a popular valuation ratio.
[3] We used maximum likelihood estimation to calculate these figures.

Math Tests, Birthdays, O.J. Simpson, and Texas Sharpshooters

This blog post is available as a PDF here.

Summary

  • Humans are not terribly good at accurately assessing probability and dealing with randomness.  This may stem from these skills having little evolutionary value.
  • We exhibit hindsight and confirmation bias.  We weave narratives around completely random events.  We assess probabilities using the wrong context.
  • These biases make performance evaluation difficult since security, asset class, and strategy returns have a random component.
  • Appreciating the role of randomness in portfolio results may quell some of the constant disappointment with diversification.

Humans are generally really, really bad at dealing with randomness and probability.  In a 2008 article, Michael Shermer labeled this difficulty “folk numeracy.”  He described “folk numeracy” as follows:

“Folk numeracy is our natural tendency to misperceive and miscalculate probabilities, to think anecdotally instead of statistically, and to focus on and remember short-term trends and small-number runs.  We notice a short stretch of cool days and ignore the long-term global-warming trend.  We note the consternation with the recent downturn in the housing and stock markets, forgetting the half-century upward-pointing trend line.  Sawtooth data trend lines, in fact, are exemplary of folk numeracy: our senses are geared to focus on each tooth’s up or down angle, whereas the overall direction of the blade is nearly invisible to us.”

From an evolutionary perspective, none of this should be too surprising.  Consider a person living in a jungle full of dangerous predators.  He hears a rustle behind him.  Is there any value in calculating the probability that the rustle was caused by a dangerous predator?  Of course there isn’t.  The cost of a false positive – fleeing a harmless situation – is tiny compared to the cost of a false negative – staying put since the odds are the situation is benign, only to be killed by a tiger.   

As Shermer’s article points out, our problems with probability are multi-faceted.  Below, we’ll discuss a few of the main issues. 

 

Narratives

Humans tend to weave narratives around the events they experience, even if those events are the result of pure randomness. 

Consider a high school math student.  After studying diligently for a ten-question test, she has a firm grasp on 90% of the material.  On average, this means that she should be an A student.  However, luck will play a pretty big role in her success or failure.  She will do better when the questions selected by the teacher overlap closely with her knowledge and worse when they do not.  

The probability of a C or lower is 7.0%.  While this outcome would be unlucky for someone that knows 90% of the material, we see that it certainly is not out of the question.
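The 7% figure follows from a simple binomial calculation, assuming each of the ten questions is answered correctly with 90% probability, independently, and that a C or lower means seven or fewer correct answers:

```python
# Probability of a C or lower (7 or fewer correct out of 10) for a student who knows 90% of the material
from scipy.stats import binom

p_c_or_lower = binom.cdf(7, 10, 0.9)  # P(7 or fewer correct) ≈ 0.070
```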

[Chart: probability distribution of test scores for a student who knows 90% of the material]

Say the student does indeed get a C.  Everyone involved is likely to start to weave a narrative to explain the disappointing grade.  The student might start to lose confidence, worrying that she was not as prepared as she thought.  The parents might think the student is slacking in her studies or distracted by friends or extracurricular activities.  She might get scolded, grounded, or forced to go to a tutor.  All of this storytelling occurs despite the fact that the score was nothing more than a bit of bad luck. 

 

Context

In other instances, humans get half of the way there.  They stop to consider probabilities and the role of randomness, but because they use the wrong context they still end up far from the mark. 

The Birthday Paradox is a prime example of this.  Say that you are in a room with 22 other individuals and are asked to estimate the probability that at least two people in the room share the same birthday. 

Posed with this question, most people guess that the probability is quite low.  In fact, the true probability of at least two people sharing a birthday is 50.7%.  Why?  Incorrect context. 

[Chart: the Birthday Paradox – probability of a shared birthday as the group size grows]

The actual question posed was: What is the probability that at least two people in the room share the same birthday?

However, most people answer a slightly different question: What is the probability that someone in the room shares a birthday with me?

The answer to this second question is quite low at 5.9%.  Most people get the right answer to the wrong question.  Their context is off. 
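Both probabilities are straightforward to verify with a few lines of arithmetic for a room of 23 people:

```python
import numpy as np

n = 23  # you plus 22 others

# P(at least two people in the room share a birthday)
p_any_shared = 1 - np.prod([(365 - k) / 365 for k in range(n)])  # ≈ 0.507

# P(at least one of the other 22 shares *your* birthday)
p_shares_mine = 1 - (364 / 365) ** (n - 1)  # ≈ 0.059
```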

Another interesting example of incorrect context occurred during the O.J. Simpson trial, which has been in the spotlight recently thanks to two recent television series. 

A key aspect of the trial involved past domestic violence perpetrated by Simpson against Nicole Brown.  One of Simpson’s defense attorneys, Alan Dershowitz, argued against the relevance of this evidence by saying that fewer than 1 in 2,500 abusers go on to murder their spouses.

Dershowitz wasn’t necessarily incorrect, but he provided the jury with the wrong context (which, perhaps, was his intention, given that he was defending Simpson).  In probability terms, he claimed that the probability that a man murders his wife, given that he abused her in the past, is less than 1 in 2,500.

But this ignores the key fact that someone did in fact murder Brown.

The jury should be interested in a slightly different probability: the probability that a man murders his wife given that he abused her and that she has been murdered.  Using some additional data and a powerful tool called Bayes’ theorem, we find that this probability is actually around 89%.  89% may not be high enough to convict by itself, but would be powerful when combined with other evidence of guilt. 
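For the curious, the flavor of the Bayesian update looks like the sketch below.  The base rates used here are illustrative assumptions chosen to make the arithmetic transparent (roughly the figures used in published discussions of the Dershowitz claim), not the exact inputs behind the 89% cited above.

```python
# Illustrative Bayesian update: P(husband is the murderer | he abused her AND she was murdered).
# Assumed base rates, for illustration only:
p_murdered_by_abuser = 1 / 2500    # abused woman murdered by her abuser in a given year
p_murdered_by_other = 1 / 20000    # woman murdered by someone other than her abuser in a given year

p_abuser_given_murder = p_murdered_by_abuser / (p_murdered_by_abuser + p_murdered_by_other)
# ≈ 0.89
```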

 

Hindsight Bias and Confirmation Bias

From Wikipedia:

“Hindsight bias is the inclination, after an event has occurred, to see the event as having been predictable, despite there having been little or no objective basis for predicting it.”

“Confirmation bias is the tendency to search for, interpret, favor, and recall information in a way that confirms one’s preexisting beliefs or hypotheses, while giving disproportionately less consideration to alternative possibilities.”

The Texas Sharpshooter Fallacy illustrates both of these biases. 

Consider a cowboy randomly shooting at the side of a barn.  This cowboy is a terrible shot and so bullet holes are randomly dispersed on the barn’s surface.  Inevitably, some clusters start to develop as the cowboy takes more and more shots.  The cowboy locates the densest clusters and draws bull’s-eyes around them.  In the future, the cowboy uses the barn as evidence of his sharpshooting prowess. 

[Image: cowboy painting a bull’s-eye around a cluster of bullet holes]

The Texas Sharpshooter Fallacy is more than an entertaining story.  It’s pervasive in many walks of life.  Journalist David McRaney pointed out a number of examples in an article on the topic.  For example, he wrote:

“In World War II, Londoners took notice when bombing raids consistently missed certain neighborhoods.  People began to believe German spies lived in the spared buildings.  They didn’t.  Analysis afterwards by psychologists Daniel Kahneman and Amos Tversky showed the bombing strike patterns were random.”

 

Diversification Disappoints

So what in the world does any of this have to do with investing? 

A few weeks back we explained why diversification, often called the only free lunch on Wall Street, will always disappoint.

In our opinion, much of this frustration stems from the probability-related issues discussed above.  Diversification is valuable precisely because the future is uncertain, which means it should be evaluated with probability and randomness in mind. 

As an example, consider an investor that spreads her capital equally among ten different mutual funds (labeled A through J).  Her performance, both on a position-by-position and aggregate basis, is summarized in the following periodic table of returns. 

[Table: simulated annual returns for Funds A through J and the aggregate portfolio]

These are random, simulated returns generated assuming that each Fund’s returns are independently and identically distributed normal random variables.

 

One of the first things the investor will probably notice is the performance of Fund J.  Fund J not only is the worst performing over the full period, but also has the dubious distinction of underperforming the overall portfolio in all five years.  As a result, Fund J detracted from portfolio performance each and every year.

The investor draws a bull’s-eye right on Fund J, just like the Texas Sharpshooter. 

[Table: simulated annual returns for Funds A through J, with Fund J highlighted]

These are random, simulated returns generated assuming that each Fund’s returns are independently and identically distributed normal random variables.

 

Then the investor thinks, “Wow, what are the odds that a Fund underperforms for five consecutive years?  Probably pretty low.”

And she isn’t wrong.  The probability of Fund J underperforming for five years in a row is only 3.125%.  However, her context is wrong, just like in the Birthday Paradox and O.J. Simpson examples.  The more relevant question is: “What is the probability that at least one of my ten funds underperforms for five straight years?”  Using some simplifying assumptions, the answer is around 28%, far from a rare event[1]. 

As a quick mathematical aside, it’s worth noting that this probability will actually increase as more funds are added to the portfolio.  More diversified portfolios have greater odds of having long-term underperformers.  With twenty funds, the probability increases to nearly 50% – equivalent to a coin flip.  With thirty funds, the probability is above 60%, so it’s actually more likely than not that we have at least one fund underperform for 5 years in a row.
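Under the simplifying assumptions in the footnote, these figures are easy to verify:

```python
# Probability that at least one of N funds underperforms the portfolio five years in a row,
# assuming each fund independently has a 50/50 chance of underperforming in any given year.
def prob_at_least_one_laggard(n_funds: int, n_years: int = 5) -> float:
    p_single = 0.5 ** n_years              # 3.125% for a single fund over five years
    return 1 - (1 - p_single) ** n_funds

# prob_at_least_one_laggard(10)  # ≈ 0.27 (the "around 28%" above)
# prob_at_least_one_laggard(20)  # ≈ 0.47
# prob_at_least_one_laggard(30)  # ≈ 0.61
```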

With this incorrect probability assessment in hand, the investor weaves a narrative as to why Fund J underperformed.  Maybe there was a change in portfolio manager or the Fund’s process is fundamentally broken.  Maybe the Fund is too big or too small.  This is no different than searching for explanations as to why your A-student got a C on a math test. 

To be clear, we are not suggesting that investors put their collective heads in the sand and chalk up all performance variation to randomness.  A diligent parent won’t ignore their children’s grades and a diligent investor won’t stop trying to understand the sources of outperformance and underperformance just because randomness and luck happen to play a significant role in results, especially when sample sizes are small. 

However, we do believe that the best investors appreciate the starring role that randomness plays in investment results.  Gaining this appreciation may soothe some of the constant disappointment with diversification. 

 

[1] We assume that each fund’s annual returns are independently and identically distributed normal random variables and that there is no serial correlation from year to year.  We use these assumptions for all of the hypothetical portfolio analysis.

What are Growth and Value?

This commentary is available as a PDF here.

Summary

  • Growth and value have intuitive definitions, but there are many ways to quantify each.
  • As with broad factors, such as value, momentum, and dividend growth, the specific metrics used to describe growth and value may fall in and out of favor, depending on the market environment.
  • Taking a diversified approach to quantifying value and growth can lead to more consistent performance over time.

In our commentary a few weeks ago, we pointed out a key flaw that many index providers have in their growth and value style indices. The industry norm lumps “low value” in with “growth” and “low growth” in with “value” when, in reality, growth and value are independent characteristics of companies. The result is that many of the growth and value ETFs that track these indices are not giving investors what they expect – or what they want.

Final index construction aside, let’s go down to a more fundamental level: what are growth and value in the first place, and how do we measure them?

Intuitively, growth refers to companies that are growing and are expected to continue growing, and value refers to companies that are currently cheap relative to their fair price.

Simple enough.

But a quick survey of index providers finds that the characteristics they use to measure a stock’s growth and value characteristics vary across the board:

Growth Characteristics:

  • Long-term forward earnings per share (EPS) growth rate (CRSP, MSCI, Russell)
  • Short-term forward EPS growth rate (CRSP, MSCI)
  • Current internal growth rate (MSCI)
  • Long-term historical EPS growth trend (CRSP, MSCI, S&P)
  • Long-term historical sales per share growth trend (CRSP, MSCI, Russell, S&P)
  • 12-month price change (S&P)
  • Investment-to-assets ratio (CRSP)
  • Return on assets, ROA (CRSP)

Value Characteristics:

  • Book-to-price ratio (CRSP, MSCI, S&P, Russell)
  • Forward earnings-to-price ratio (CRSP, MSCI)
  • Earnings-to-price ratio (CRSP, S&P)
  • Sales-to-price ratio (CRSP, S&P)
  • Dividend yield (CRSP, MSCI)

Only one metric on each list is common to all four index providers (Sales per share growth trend for growth and book-to-price ratio for value).

So who is right?

We can test the performance of many of these metrics using data readily available online. The forward-looking growth data are more difficult to find historically, but general financial statement data is available on Morningstar’s website.

To keep matters simple, we will look at three metrics for each of growth and value. For growth: 3-year EPS growth, 3-year sales per share growth, and ROA. For value: the P/E, P/S, and P/B ratios.

And to keep things as realistic as possible, we will evaluate the stocks in the S&P 500 as they stood at the end of 2014. Relative to the current set of companies in the S&P 500, we added back in some companies that dropped out of the S&P 500 (mainly energy and materials companies) in 2015. Some mergers and acquisitions also make getting data for the companies more difficult. For example, Covidien was bought by Medtronic, AT&T bought DirecTV, and Kraft merged with Heinz. Since we will be focusing on relative performance differences rather than on absolute ones, we will simply reconstruct a proxy S&P 500 index using the data that is available. In all, our universe contains 481 companies.

Using the fundamental data from December 2014, we can sort based on each metric and select the top 160 companies (about one-third of the universe) and see how that “value” or “growth” portfolio would have performed in 2015. Within each portfolio, we equally weight for simplicity. Results are compared to an equal-weight benchmark to control for any out- or underperformance arising from the equal-weight allocation methodology as opposed to stock selection.
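A minimal sketch of this portfolio-formation procedure is below.  The function and variable names are our own; the original analysis used fundamental data from Morningstar and prices from Yahoo! Finance.

```python
# Minimal sketch: equal-weight the top third of the universe by a metric and compare
# the sleeve's return to an equal-weight benchmark of the full universe.
import pandas as pd

def metric_excess_return(metric: pd.Series, returns_2015: pd.Series,
                         top_n: int = 160, higher_is_better: bool = True) -> float:
    """metric: value per ticker as of Dec 2014; returns_2015: each ticker's 2015 total return."""
    ranked = metric.sort_values(ascending=not higher_is_better)
    selected = ranked.head(top_n).index
    sleeve_return = returns_2015.loc[selected].mean()  # equal weight within the sleeve
    benchmark_return = returns_2015.mean()             # equal-weight benchmark
    return sleeve_return - benchmark_return

# e.g. metric_excess_return(eps_growth_3y, returns_2015, higher_is_better=True)
#      metric_excess_return(pe_ratio, returns_2015, higher_is_better=False)
```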

There is significant variation during the year depending on which metric was used.

[Chart: 2015 relative performance of growth portfolios formed on EPS growth, sales growth, and ROA]

Source: Data from Yahoo! Finance and Morningstar, calculations by Newfound

[Chart: 2015 relative performance of value portfolios formed on P/E, P/S, and P/B]

Source: Data from Yahoo! Finance and Morningstar, calculations by Newfound

For growth, all of the portfolios tracked each other until mid-March, when the portfolio formed on sales growth began to diverge.  The portfolios formed on EPS growth and ROA continued to track each other until mid-June.  At that point, the ROA portfolio rallied hard, eclipsing the sales growth portfolio in the fourth quarter of 2015.

On the value front, the P/S ratio led through most of the year before falling back to the pack in the Fall. The P/E and P/B portfolios ended the year in very similar places, with the P/S portfolio eking out a ~65bp benefit over the other two portfolios.

 

Which Metric to Choose

One year is hardly enough data to make a sound judgment as to which metric is the best for selecting growth and value stocks. As we have said many times before, even though we may know a factor (e.g. value) has outperformed in the past and is likely to do so in the future based on behavioral evidence, stating whether that factor will outperform in any given year is tough.

Likewise, deciding which measure of a factor will outperform in a given year is also difficult. Even with value companies, a metric like P/E ratio may not work well when companies with strong sales experience short-term earnings shocks or when companies are able to inflate earnings based on accounting allowances. The P/B ratio may not work well in periods when service oriented companies, which rely on intangible human capital as a large driver of growth, are being rewarded in the market.

Let’s take a closer look at some popular ways of quantifying the value factor.

“Value”, as it stands in academic literature, is commonly measured using the P/B ratio. This is what the famous Fama-French Three Factor Model uses as its basis for calculating the value factor, high-minus-low (HML).

However, using data from Kenneth French going back to 1951, we can see that, for long-only portfolios, those formed both on P/E and P/S actually beat the portfolio formed on P/B both on an absolute and risk-adjusted basis.

[Table: performance of long-only portfolios formed on P/B, P/E, and P/S, using Kenneth French data back to 1951]

Furthermore, AQR showed in their 2014 paper, “The Devil in HML’s Details,” that not only does the metric matter, but the method of calculating the metric matters, as well. While Fama and French calculated HML using book value data that was lagged by 6 months to ensure that data would be available, they also lagged price data by the same amount. The AQR paper proposed using the most recent price data for calculating P/B ratios and showed that their method was superior to the standard lagged-price method because using more current price data better captures the relationship between value and momentum.

The P/S and P/E ratios used in the table above are also calculated using lagged price data. Based on AQR’s research, we expect that those results might also be improved by using the current price data.

 

Different Measures of Factors May Ebb and Flow

We should be careful not to rush to judgment though. The fact that P/B has underperformed the other value metrics does not mean we should drop it entirely. It is helpful to remember that individual factors can go through periods of significant underperformance. The same is true for different ways of measuring a single factor. For example, over rolling 12-month periods, the return difference between portfolios formed on P/B, P/S, and P/E – all “value” metrics – has often been in excess of 2000bp!

Put bluntly: your mileage may vary dramatically depending on which value metric you choose.

[Chart: rolling 12-month return spread among value portfolios formed on P/B, P/S, and P/E]

Source: Data from Kenneth French Data Library, calculations by Newfound

With our 2015 example, we saw that P/S resulted in the best performing portfolio, but as we said before, different measures tend to cycle unpredictably. We can see which ones have been in favor historically by comparing each individual portfolio to the average of all three portfolios.

[Chart: relative performance of each single-metric value portfolio versus the average of all three]

Source: Data from Kenneth French Data Library, calculations by Newfound

The fact that many index providers combine multiple metrics into a composite growth or value score is an acknowledgement of this unpredictability.

Averaging the different value portfolios would have led to a fraction of outperforming periods on par with the best individual portfolios, higher average outperformance than the P/S portfolio, and lower average underperformance than all three individual portfolios.

[Chart: one-year periods]

[Chart: rolling performance]

Source: Data from Kenneth French Data Library, calculations by Newfound

If you read our previous commentary about multi-factor portfolio construction, you’ll notice that the averaging we did above is approach #1 (the “or” method). In effect, we are investing in companies that have either low P/S, P/B, or P/E ratios. One way to implement this would be to form portfolios based on each metric and then average the allocations into a final value portfolio.

In practice, most index providers score companies based on each selected metric, normalize the scores, and then average them (sometimes using different weightings). The portfolio is then formed using this composite score. This is more in line with approach #2 from the commentary (the “and” approach), which favors companies that have some degree of combined strength across multiple metrics.
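To make the distinction concrete, here is a rough sketch of the two construction methods for a single factor measured three ways.  The variable names are illustrative, and we assume each metric has already been converted to a z-score where higher means cheaper.

```python
# Minimal sketch: "or" (average of single-metric portfolios) vs. "and" (composite score) construction.
import pandas as pd

def or_portfolio(metric_zscores: dict[str, pd.Series], top_n: int) -> pd.Series:
    """Average the equal-weight allocations of one portfolio per metric."""
    weights = []
    for z in metric_zscores.values():
        picks = z.nlargest(top_n).index
        w = pd.Series(0.0, index=z.index)
        w.loc[picks] = 1.0 / top_n
        weights.append(w)
    return sum(weights) / len(weights)

def and_portfolio(metric_zscores: dict[str, pd.Series], top_n: int) -> pd.Series:
    """Select on the average (composite) z-score across metrics."""
    composite = sum(metric_zscores.values()) / len(metric_zscores)
    picks = composite.nlargest(top_n).index
    w = pd.Series(0.0, index=composite.index)
    w.loc[picks] = 1.0 / top_n
    return w

# value_scores = {"P/B": pb_z, "P/E": pe_z, "P/S": ps_z}   # illustrative inputs
# mix_weights = or_portfolio(value_scores, top_n=160)
# integrated_weights = and_portfolio(value_scores, top_n=160)
```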

While we used value and momentum in the commentary to illustrate why using the “and” approach is problematic in multi-factor portfolios, using this approach isn’t as bad when attempting to identify a single factor. The problem with value and momentum stemmed from the difference in time that each factor took to mature. Using the “and” approach introduced drag from the shorter maturity factor.

If there is no convincing argument that an individual growth or value measure takes longer to mature than another (for instance, does P/S normalize faster than P/B?), then taking the “and” approach is not likely to result in a worse outcome. In this case, where we are simply trying to identify growth or value, we care more about the predictive nature of each metric that goes into forming the portfolio.

The index providers vary considerably in regards to what characteristics they look at and how they weight them to arrive at a final portfolio. If you believe that the P/B ratio is the best determinant of company value then you will get the purest exposure with Russell. If you think return on assets is an important contributing factor to company growth, CRSP’s index will be more in line with your view.

However, if you are like us and concede that while there are many ways to quantify growth and value, no one method can outperform over every single period, a diversified approach may be your best option.

