

  • Research Affiliates published a new piece of research exploring mutual fund returns over the last 25 years and the implied ability for managers to capture popular factor premiums published by the academic community.
  • They argue that several factors accepted in academia may not be implementable after real life frictions (e.g. transaction costs, cost of shorting, missed trades, et cetera).
  • Their research finds significant shortfall between published factor premiums and realized factor premiums.
  • We believe that a significant proportion of the shortfall can be explained by estimation error in their process and, therefore, we should refrain from drawing firm conclusions until more evidence is published.
  • As a larger point, we acknowledge the inherent biases people exhibit to defer to authority and accept as truth the things they read outside of their own areas of expertise.
  • Financial research is, in many ways, like a backtest: it is rarely published unless it is interesting and supports the firm’s existing products or viewpoint. It should, therefore, be met with much the same skepticism.

Research Affiliates published a new piece this week titled The Incredible Shrinking Factor Return (Unabridged) (henceforth AKW for the authors’ last names: Arnott, Kalesnik, and Wu).[1]  A rather hefty and wonkish read, the ultimate conclusion they draw is that factor returns actually realized by investment managers are far below those purported to be available by long/short factor research.

While they do not provide evidence as to why the shortfall exists, they do put forth several potential reasons, including:

  • The long/short portfolios ignore trading and transaction costs.
  • The true cost of shorting may be prohibitively high.
  • Trades may be missed (e.g. shorts may not be available).
  • Fund fees may significantly erode captured alpha.

The huge disparity between realized and theoretical factor returns put forth in this piece calls into question the entire benefit of factor-based investing, particularly for high-turnover factors like momentum, which may bear the brunt of the implementation cost.

Finding new ways to question broadly accepted beliefs is a laudable pursuit.  We believe, however, that there is reason to remain skeptical about these particular results.


Understanding the Research Affiliates Method

AKW employ a “two-stage” regression approach.

First, using fund data from the Morningstar Direct Mutual Fund Database, they compute monthly returns from January 1991 to December 2016.  For each fund, they regress the monthly returns against four factors: Market, Value, Size, and Momentum.  This regression identifies how much exposure the fund has had to each factor over the period.

This is a fairly standard way of creating a “factor lens” through which to identify where fund performance is derived.

The second stage of the process can be a bit confusing if you’ve never come across it before.

Each month, they take the actual fund returns and regress them against the factor betas they just calculated in the prior step.  The result is estimated returns for each factor for that month.  Or, more strictly, estimated returns for the factor as captured by the funds.

For each factor, the long-term average of these estimated monthly returns is compared to the long-term average of the true factor performance.  The difference between the two long-term averages is the “implementation shortfall.”
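
The two stages described above can be sketched in a few lines of NumPy.  This is only an illustration of the general technique, not AKW's actual code: the function name, the synthetic inputs, and the choice to include an intercept in both regressions are our assumptions.

```python
import numpy as np

def two_stage_factor_capture(fund_returns, factor_returns):
    """Two-stage regression sketch (hypothetical helper, not AKW's code).

    fund_returns:   (T, N) array of monthly fund returns
    factor_returns: (T, K) array of monthly factor returns
    Returns (betas, implied): (N, K) full-period betas and (T, K)
    fund-implied factor returns.
    """
    T, N = fund_returns.shape

    # Stage 1: one time-series regression per fund (intercept + K factors).
    X = np.column_stack([np.ones(T), factor_returns])
    coef, *_ = np.linalg.lstsq(X, fund_returns, rcond=None)   # (K + 1, N)
    betas = coef[1:].T                                        # (N, K)

    # Stage 2: one cross-sectional regression per month of fund returns
    # on the stage-1 betas (the intercept here is an assumption).
    B = np.column_stack([np.ones(N), betas])
    lam, *_ = np.linalg.lstsq(B, fund_returns.T, rcond=None)  # (K + 1, T)
    implied = lam[1:].T                                       # (T, K)
    return betas, implied


# Synthetic inputs, purely for illustration (not Morningstar or French data).
rng = np.random.default_rng(0)
T, N, K = 120, 200, 4
factors = rng.normal(0.005, 0.03, size=(T, K))
true_betas = rng.normal(0.0, 0.15, size=(N, K))
true_betas[:, 0] += 1.0                       # market betas centered on 1
funds = factors @ true_betas.T + rng.normal(0.0, 0.01, size=(T, N))

betas, implied = two_stage_factor_capture(funds, factors)
# "Implementation shortfall": true average minus fund-implied average.
shortfall = factors.mean(axis=0) - implied.mean(axis=0)
print("implementation shortfall by factor:", shortfall)
```

Note that the stage-two inputs are themselves estimates from stage one, which is where the estimation-error concern enters.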

Now, by AKW’s own admission, there are many potential shortcomings in this methodology, not the least of which is the assumption that factor exposure is constant over time.  Furthermore, they note that the second stage of the regression will bias estimates downward due to estimation error in the betas.

But just how much downward bias might be created?


Estimation Error in Betas

To explore how big a problem factor beta estimation might be, we wanted to take a look at an example fund that was live over the period and see how meaningfully full-period beta estimates differed from short-run, rolling estimates.  We chose to look at the Vanguard Wellington Fund (VWELX).

Factor               Full Period   Minimum Rolling Estimate   Maximum Rolling Estimate
Market (“MKTexRF”)    0.59          0.44                       0.87
Size (“SMB”)         -0.14         -0.29                       0.02
Value (“HML”)         0.18         -0.18                       0.46
Momentum (“UMD”)     -0.02         -0.20                       0.06

Source: Yahoo! Finance and Kenneth French Data Library.  Calculations by Newfound Research.

With this one example, we can see that style drift should not go overlooked as a potentially significant source of error in estimating betas and, therefore, a source of error in the two-stage regression process.
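
A comparison of full-period and rolling betas like the one above can be sketched as follows.  The data here is synthetic (a hypothetical fund whose exposure to one factor drifts over time), and the 36-month window is an assumption chosen for illustration; it is not the VWELX data or necessarily the window used for the figures above.

```python
import numpy as np

def ols_betas(y, X):
    """OLS slopes of y on the columns of X, with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[1:]

def rolling_betas(y, X, window=36):
    """Betas re-estimated over every contiguous `window`-month span."""
    starts = range(len(y) - window + 1)
    return np.array([ols_betas(y[s:s + window], X[s:s + window]) for s in starts])

# Synthetic data: a fund whose second-factor exposure drifts over time.
rng = np.random.default_rng(1)
T, K = 312, 4
factors = rng.normal(0.0, 0.03, size=(T, K))
beta_path = np.tile([0.6, 0.2, -0.1, 0.0], (T, 1))
beta_path[:, 1] = np.linspace(-0.2, 0.4, T)   # style drift in one factor
fund = (beta_path * factors).sum(axis=1) + rng.normal(0.0, 0.005, T)

full = ols_betas(fund, factors)
roll = rolling_betas(fund, factors, window=36)
for k in range(K):
    print(f"factor {k}: full={full[k]:.2f} "
          f"min={roll[:, k].min():.2f} max={roll[:, k].max():.2f}")
```

The full-period estimate for the drifting factor is a single number, while the rolling estimates span a wide range around it.  That gap is the kind of error a full-period beta carries into the second stage.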



Quant Note: Why Does Estimation Error Matter?

A fact of regression is that increasing estimation error in the “independent” variables creates what is known as attenuation: a pull of estimates towards zero.  Why?  Consider a simple example:

Y = BX + E

Basic regression assumes that we know X (the “independent” variable) with certainty and that Y (the “dependent” variable) might have some estimation noise.  Our goal is then to identify B.  A critical requirement is that X and E, the error term, be uncorrelated.  Y is allowed to have estimation error because it can simply be swept into E.

If X has measurement noise, however, we end up with a scenario where the observed X’s are correlated with the error term.  Suppose the true relationship is:

Y = BX* + E

Where X* is the true, unobserved value.  What we actually measure is X, which equals X* plus estimation noise e (assumed to be normally distributed with zero mean and independent of both X* and E):

X = X* + e

Re-arranging to X* = X – e and substituting into the true relationship, we get an equation in terms of the measured X:

Y = B(X – e) + E

Y = BX + (E – Be)

If B > 0, there is a negative correlation between X and (E – Be); if B < 0, there is a positive correlation.  Therefore, if the true B > 0, then the estimator for B will be biased downward.  If the true B < 0, the estimator for B will be biased upwards.

Which means that estimation error in the independent variables creates a pull towards zero.
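
A quick numerical check of this pull towards zero, as a minimal sketch under assumed parameters (a true B of 1 and a true-X scale of 0.15, both chosen only for illustration).  For a single regressor with independent noise, the standard textbook result is that the estimated slope shrinks by the factor var(X*) / (var(X*) + var(e)), so noise on the same scale as the signal should roughly halve the estimate.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
true_b = 1.0
signal_sd = 0.15                                # assumed scale of the true X*

x_true = rng.normal(0.0, signal_sd, n)          # X*, never observed directly
y = true_b * x_true + rng.normal(0.0, 0.05, n)  # Y = B X* + E

def ols_slope(x, y):
    """Slope of a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

slopes = {}
for noise_sd in (0.0, 0.075, 0.15):
    x_obs = x_true + rng.normal(0.0, noise_sd, n)   # X = X* + e
    slopes[noise_sd] = ols_slope(x_obs, y)
    theory = true_b * signal_sd**2 / (signal_sd**2 + noise_sd**2)
    print(f"noise sd {noise_sd:.3f}: estimated B = {slopes[noise_sd]:.3f}, "
          f"attenuation formula = {theory:.3f}")
```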


Introducing a Simulation-Based Approach

We should start by saying that we generally frown upon simulation-based approaches.  All too often, simulations require so many unrealistic assumptions that they do little to give us statistical confidence in the results.

That said, where simulations can be useful is in situations where we want to try to isolate a particular effect.  In this case, with AKW’s two-step regression, we want to isolate the impact of estimation error in the initial betas.  What simulations can do, in this case, is give us a baseline to measure the final results against.

For example, in this case we can ask: is a nearly 50% shortfall in market beta reasonably explained by estimation error? 

To isolate and test the potential impact of this bias, we created a simulation-based replica of AKW’s process.  The process works as follows:

  • For 1000 hypothetical funds, we create a random set of factor betas. We assume these betas are the true betas and are constant over time.  The betas are assumed to be normally distributed with mean zero and standard deviation of 0.15.  This means that 95% of factor betas should fall within +/- 0.3.  For the Market beta, we shift the mean to be around 1 (so 95% of market beta exposures will fall within 0.7 and 1.3).[2]
  • Using these known betas, we create the fund returns by multiplying the betas against the factor returns. This is important because it means that our betas are not estimates: they are known and true.  For convenience we ignore idiosyncratic risk.
  • We then introduce varying levels of error (distributed normally with varying levels of scale) on top of our betas to create “beta estimates.”
  • We perform stage two of AKW’s regression process using the fund returns and the noisy beta estimates to extract the implied realized factor returns and compare those against the true factor returns.
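
The steps above can be sketched roughly as follows.  This is not the code behind our results: the factor returns are synthetic stand-ins (with hypothetical means and volatilities) rather than the Kenneth French data, the function and variable names are illustrative, and the intercept in the stage-two regression is an assumption.

```python
import numpy as np

def simulate_capture(factors, error_ratio, n_funds=1000, beta_sd=0.15, seed=0):
    """Ratio of fund-implied to true average factor returns.

    error_ratio is the scale of the beta estimation error divided by
    the scale of the true betas (beta_sd).
    """
    rng = np.random.default_rng(seed)
    T, K = factors.shape

    # True, constant betas: N(0, beta_sd), with the market beta centered on 1.
    betas = rng.normal(0.0, beta_sd, size=(n_funds, K))
    betas[:, 0] += 1.0

    # Fund returns built directly from the true betas (no idiosyncratic risk).
    fund_returns = factors @ betas.T                        # (T, n_funds)

    # Noisy "beta estimates".
    noisy = betas + rng.normal(0.0, error_ratio * beta_sd, size=betas.shape)

    # Stage two: monthly cross-sectional regressions on the noisy betas
    # (intercept included; this is an assumption).
    design = np.column_stack([np.ones(n_funds), noisy])
    coef, *_ = np.linalg.lstsq(design, fund_returns.T, rcond=None)
    implied = coef[1:].T                                    # (T, K)
    return implied.mean(axis=0) / factors.mean(axis=0)


# Synthetic monthly factor returns; the means and vols are hypothetical.
rng = np.random.default_rng(7)
T = 312
target_means = np.array([0.006, 0.002, 0.003, 0.005])
vols = np.array([0.045, 0.03, 0.03, 0.05])
raw = rng.normal(0.0, vols, size=(T, 4))
factors = raw - raw.mean(axis=0) + target_means   # pin the sample means

for ratio in (0.0, 0.33, 0.67, 1.0):
    capture = simulate_capture(factors, ratio)
    print(f"error scale / beta scale = {ratio:.2f}: capture ratio = {np.round(capture, 2)}")
```

A capture ratio of 1 means the funds appear to earn the full factor premium; values below 1 are a shortfall created purely by noise in the betas, since this setup has no trading costs or fees.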


That’s a Big Bias

To calibrate this process, we first assume there is no error in estimating the betas.  If built correctly, the implied realized factor capture exactly matches the true factor returns over time.

Before we analyze results, there is another expectation we can set based upon the data as well.

In the last section we mentioned that the error will create a pull towards zero.  E.g. if the real factor return is +1%, then we would expect a regression result between 0% and 1%.  Similarly, if the real factor return is -1%, then we would expect a regression result between -1% and 0%.

While all of the factors exhibit a positive long-run average, the percentage of months that are positive versus negative differs for each.  For example, 60%+ of months for the Market and Momentum factors are positive, while for Size and Value the figures are 52% and 49%, respectively.

While estimation error will pull both positive and negative results towards zero, Market and Momentum exhibit a higher frequency of positive results than negative, and therefore should exhibit a greater negative drag.[3]

Which is exactly what we see in the AKW results, as well as our simulated results.

[Table: simulation results by Scale of Errors / Scale of Beta (0.15)]

The question that remains is whether we believe estimation error could have the same distribution scale as the betas themselves.  While we should not draw too much evidence from a single data point, we can look back to our VWELX example for guidance.

In the prior VWELX example, the standard deviation in differences between full-period and rolling estimates ranges between 0.05 (Momentum) and 0.16 (Value).  In the above table, that would put the figures between 33% and just north of 100%: areas where significant downward bias exists.

That said, our results do not fully refute AKW’s evidence.

  • AKW test their results using look-back regressions instead of full-period regressions and find similar results. However, they appear to use an expanding window process, instead of a rolling window process, whereby as much historical data is utilized as possible.  In later periods, this will lead to estimates that asymptotically approach the full-period beta.
  • The plots provided by AKW of factor returns captured by managers versus their observed returns exhibit linear relationships with slopes between 0.95 and 1.01. Downward biases created by estimation error would reduce the slope of these lines significantly.
  • AKW report that the Size premium captured by managers exceeds the theoretical long/short premium.

Nevertheless, style drift and estimation error in beta may be significant contributors to the results put forth by AKW and, in our opinion, call into question the ability to draw any meaningful conclusion from their results at this time.


Addressing our Own Biases

We think it is worth taking a moment to take a step back and address a larger issue: biases when it comes to reading published research.

At Newfound, we have the quantitative aptitude to examine the hypothesis put forth by AKW.  Many industry professionals do not.  All too often, research is accepted outright by those who cannot examine it, rather than met with a healthy dose of skepticism.  Why is that the case?

In a 1993 study by Daniel Gilbert, Romin Tafarodi, and Patrick Malone – titled “You Can’t Not Believe Everything You Read” – experiments found evidence that comprehension first requires an initial belief that is then followed by critical analysis.

In other words, if you read something, your natural inclination is to accept it as true.

We’re personally partial to a variant called the Murray Gell-Mann Amnesia effect, popularized by Michael Crichton in his 2002 essay, “Why Speculate?”:

Media carries with it a credibility that is totally undeserved. You have all experienced this, in what I call the Murray Gell-Mann Amnesia effect. (I call it by this name because I once discussed it with Murray Gell-Mann, and by dropping a famous name I imply greater importance to myself, and to the effect, than it would otherwise have.)

Briefly stated, the Gell-Mann Amnesia effect works as follows. You open the newspaper to an article on some subject you know well. In Murray’s case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward-reversing cause and effect. I call these the “wet streets cause rain” stories. Paper’s full of them.

In any case, you read with exasperation or amusement the multiple errors in a story-and then turn the page to national or international affairs, and read with renewed interest as if the rest of the newspaper was somehow more accurate about far-off Palestine than it was about the story you just read. You turn the page, and forget what you know.

That is the Gell-Mann Amnesia effect. I’d point out it does not operate in other arenas of life. In ordinary life, if somebody consistently exaggerates or lies to you, you soon discount everything they say. In court, there is the legal doctrine of falsus in uno, falsus in omnibus, which means untruthful in one part, untruthful in all.

But when it comes to the media, we believe against evidence that it is probably worth our time to read other parts of the paper. When, in fact, it almost certainly isn’t. The only possible explanation for our behavior is amnesia.

We’d argue these effects are compounded by our inherent authority bias, whereby we attribute greater accuracy to the opinion of an authority figure.

Rob Arnott and company are, without a doubt, authority figures.

Last year they made waves by arguing not only that we should be worried about a violent factor crash, but also that many largely accepted factors were nothing but the result of sloppy data-mining.  (We’ll refrain from pointing out the irony that many of the factors identified as being data-mined are offered as RAFI indices…)

The big danger here is that, for many, what is published by big firms filled with PhDs – like Research Affiliates – is taken as gospel.

Cliff Asness, co-founder of AQR, took up the mantle to challenge Arnott’s claims, but Arnott has largely ignored the critiques.

Perhaps more importantly, Asness’s rebukes have been far more poorly covered in the media than Arnott’s initial outlandish claims.  For many, this creates a false perception that Arnott’s view is not only unchallenged, but broadly accepted by the industry.

We want to be clear: we are not saying Research Affiliates is doing anything malicious.  The reality is that we’re all human.  We all make mistakes.  We prefer to defer to Hanlon’s razor: “Don’t assume bad intentions over neglect and misunderstanding.”

That said, published research in finance is often like a backtest: rarely do you see any that does not support the firm’s products or existing views.

All of that is only to say that we must be aware of our own biases, which make it difficult for us not to blindly believe the things we read.  If we don’t understand the process behind something, it does little harm to apply a healthy dose of skepticism.


[2] We choose these levels largely based on anecdotal experience of running factor regressions.

[3] With this broad statement, we are (perhaps unfairly) assuming that monthly returns will be normally distributed.  Significant skew or kurtosis in these figures could meaningfully change these assumptions.


Corey is co-founder and Chief Investment Officer of Newfound Research, a quantitative asset manager offering a suite of separately managed accounts and mutual funds. At Newfound, Corey is responsible for portfolio management, investment research, strategy development, and communication of the firm's views to clients. Prior to offering asset management services, Newfound licensed research from the quantitative investment models developed by Corey. At peak, this research helped steer the tactical allocation decisions for upwards of $10bn. Corey holds a Master of Science in Computational Finance from Carnegie Mellon University and a Bachelor of Science in Computer Science, cum laude, from Cornell University. You can connect with Corey on LinkedIn or Twitter.