Nassim Taleb, author of the Incerto series and, most famously, The Black Swan, is out with a new paper called Error, Dimensionality, and Predictability. You can get a copy here.
To quote some of the scary bits ...
From the abstract:
We show how adding random variables from any distribution makes the total error (from initial measurement of probability) diverge; it grows in a convex manner. There is a point in which adding a single variable doubles the total error.
Higher dimensional systems – if unconstrained – become eventually totally unpredictable in the presence of the slightest error in measurement regardless of the probability distribution of the individual components.
And from the first page of the paper:
In fact errors are so convex that the contribution of a single additional variable could increase the total error more than the previous one. The nth variable brings more errors than the combined previous n-1 variables!
The point has some importance for “prediction” in complex domains, such as ecology or in any higher dimensional problem (economics). But it also thwarts predictability in domains deemed “classical” and not complex, under enlargement of the space of variables.
This paper especially caught my eye after Ilya Kipnis reached out with the following (elided) tweet(s):
This is why we use heuristics and other "suboptimal" methods -- because we know that our estimates have error. When designing a bridge, you can obtain estimates with much higher accuracy due to the laws of physics. In finance, you have assumption risk vs. estimation risk. Go too far on one or the other and you'll get bitten, ergo optimization is different.
— Ilya Kipnis (@QuantStratTradR) July 10, 2015
We couldn't agree with Ilya more. Estimation risk is one of the most dangerous and least discussed sources of error in model research. Everyone discusses assumptions, but few like to admit that a slightly wrong model with the right inputs may be better than the right model with slightly wrong inputs.
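To make the "wrong model, right inputs" point concrete, here is a minimal sketch of our own (it is not from Taleb's paper or Ilya's tweets). It assumes two hypothetical assets with the same volatility and a 0.9 correlation, and every number in it is made up for illustration. It compares a naive equal-weight heuristic against a textbook mean-variance optimizer that is fed expected returns that are each off by 0.6 percentage points, and scores both against the "true" parameters.

```python
import math

def tangency_weights(mu, rho):
    """Fully invested weights proportional to inverse(covariance) @ mu.

    Assumes two assets that share the same volatility and have
    correlation `rho`; in that 2x2 case the shared volatility cancels
    out once the weights are normalized to sum to one.
    """
    raw = (mu[0] - rho * mu[1], mu[1] - rho * mu[0])
    total = raw[0] + raw[1]
    return (raw[0] / total, raw[1] / total)

def sharpe(weights, mu, sigma, rho):
    """Return / volatility of a two-asset portfolio (risk-free rate of 0)."""
    ret = weights[0] * mu[0] + weights[1] * mu[1]
    var = sigma ** 2 * (
        weights[0] ** 2 + weights[1] ** 2 + 2 * rho * weights[0] * weights[1]
    )
    return ret / math.sqrt(var)

# Hypothetical "true" world (illustrative numbers only).
TRUE_MU = (0.052, 0.048)
SIGMA, RHO = 0.20, 0.90

# Slightly wrong inputs: each expected return is off by 0.6 percentage points.
NOISY_MU = (0.046, 0.054)

w_true_opt = tangency_weights(TRUE_MU, RHO)   # right model, right inputs
w_noisy = tangency_weights(NOISY_MU, RHO)     # right model, wrong inputs
w_equal = (0.5, 0.5)                          # "wrong" model, no inputs needed

# Every portfolio is scored against the TRUE parameters.
sharpe_true_opt = sharpe(w_true_opt, TRUE_MU, SIGMA, RHO)
sharpe_noisy = sharpe(w_noisy, TRUE_MU, SIGMA, RHO)
sharpe_equal = sharpe(w_equal, TRUE_MU, SIGMA, RHO)

print("noisy-input weights:", w_noisy)
print("Sharpe (true optimum):", sharpe_true_opt)
print("Sharpe (equal weight):", sharpe_equal)
print("Sharpe (optimizer, noisy inputs):", sharpe_noisy)
```

Under these assumptions the small input error is enough to push the optimizer into a leveraged long/short position (roughly -26% / 126%), and the equal-weight heuristic ends up with the higher true Sharpe ratio of the two. The ranking depends on the numbers chosen, so treat it as an illustration of the mechanism rather than a general result.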
The subtext, in our opinion, to Taleb's paper and Ilya's tweets is that increased model complexity can lead to significant errors because of estimation risk – and the impact grows non-linearly with a linear increase in complexity.
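A toy calculation shows what "non-linear growth from a linear increase in complexity" can look like. This is our own simplification and not the derivation in Taleb's paper: we assume a model whose output is the product of n inputs, each overstated by the same relative error `eps` (a hypothetical 5% here). The total relative error is then (1 + eps)^n - 1, and each added input contributes more error than the one before it.

```python
def compounded_error(eps, n):
    """Relative error of a product of n inputs, each overstated by eps.

    Simplifying assumption: errors are multiplicative and all point the
    same way, so this is a worst-case style illustration.
    """
    return (1.0 + eps) ** n - 1.0

eps = 0.05          # hypothetical 5% error per input
max_inputs = 20

# Total error for 0, 1, ..., max_inputs inputs.
totals = [compounded_error(eps, n) for n in range(max_inputs + 1)]

# Extra error contributed by the n-th input, for n = 1 .. max_inputs.
marginals = [totals[n] - totals[n - 1] for n in range(1, max_inputs + 1)]

for n in (1, 5, 10, 20):
    print(f"n={n:2d}  total error={totals[n]:.4f}  "
          f"linear guess={n * eps:.4f}  marginal={marginals[n - 1]:.4f}")
```

The "linear guess" column is simply n times eps, which is what you would expect if errors merely added up. The gap between it and the compounded total widens as n grows, which is the convexity we are pointing at.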
We cannot overstate our philosophy that a focus on simplicity is key to being robust to complexity – especially when building quantitative models in financial markets.