
# Summary

- Over the past few years, ReSolve Asset Management has progressively worked to develop new and exciting rules for the March Madness bracket tournament.
- While the stakes may be much lower than in investing, many of the lessons we have learned translate well to portfolio construction and strategy development.
- Knowing the rules, diversifying appropriately, developing robust models, making wise assumptions, and playing in the moment all come into play in both arenas.
- Achieving financial goals is generally a longer-lasting concern for investors than winning a bracket challenge, but by obeying a few simple principles, these goals can be more attainable with less stress along the way.

With the Olympics now behind us, March Madness is the next big sporting event. If you’re like me, the question at the front of your mind is: what type of devilry will ReSolve Asset Management cook up this year for the tournament?

*Well, we asked them and unfortunately found out that they would not be putting it on again this year. However, they graciously gave their blessing for Newfound to do it. Stay tuned to our blog!*

When ReSolve began the competition in 2014, their main goals were to address some of the problems that typically plague March Madness bracket competitions. Primarily, they aimed to:

- Encourage a larger sample size of teams to be picked rather than just straight chalk bets.
- Reduce the risk of legacy errors in brackets. For example, Syracuse (10) in the Final Four in 2016 after Middle Tennessee (15) defeated Michigan State (2) in the first round.

The rules have been fine-tuned over the years, and two years ago, all of us at Newfound went into the competition in full force.

This contest is definitely not your normal office bracket pool.[1] In 2016, participants allocated a fully-invested, long-only portfolio to the teams in the bracket. Each team was awarded a certain number of points-per-win (PPW), with fewer points awarded to the teams that had a higher likelihood of winning the tournament.

Rather than encouraging the participants to lean solely on the chalk picks, as is oftentimes the case when filling out a bracket under more standard rules, there were compelling reasons to allocate to underdog teams. Even if they won only one game and were subsequently knocked out of competition, the potential points earned were enough to compensate for the risk taken.

That year, both Corey[2] and Justin[3] published their methodologies on our blog, with Corey taking a resampled simulation-based approach, and Justin taking a mixed multi-factor approach.

Last year, the competition rules changed slightly.[4] The format looked more familiar, with participants now filling out an actual bracket. However, the point total awarded for each team was the reciprocal of its cumulative probability of making it to that round, capped at 200 points. Points were awarded only for the final round that a team reached.
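That scoring rule can be sketched in a few lines. The function name and the probabilities below are illustrative, not from the contest:

```python
# Hypothetical sketch of the 2017 scoring rule: a team's point total is the
# reciprocal of its cumulative probability of reaching its final round,
# capped at 200. The probabilities below are made up for illustration.

def round_points(prob_of_reaching_round, cap=200.0):
    """Points awarded for the final round a team reaches."""
    return min(1.0 / prob_of_reaching_round, cap)

# A favorite reaching the Final Four (hypothetical 40% cumulative chance):
print(round_points(0.40))   # -> 2.5
# A long shot reaching the Sweet Sixteen (hypothetical 0.2% chance):
print(round_points(0.002))  # -> 200.0 (capped)
```

The cap matters: without it, extreme long shots would dominate every payoff calculation.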

As we gear up for the 2018 challenge, what are some important considerations that we’ve learned over the past years’ tournaments and how do they translate to real-world problems investors commonly face?

**Know the Rules Before You Play**

Perhaps the biggest deciding factor is to thoroughly understand the rules before you submit your final choices.

With the 2016 March Madness rules, employing a naïve method of allocating in inverse proportion to the PPW for each team would have netted you a guaranteed 5.34 points: every team’s weight times its PPW was the same constant, and the bracket always produces the same number of games, so the portfolio earned the same total no matter who won. That was your risk-free rate.

At the other extreme, for anyone who wanted to put all their chips on their favorite team, the naïve method would have beaten out 59% of those possible choices.
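Why the naïve allocation locks in a score can be seen in a toy sketch: with weights inversely proportional to PPW, every game pays the same amount, and a single-elimination bracket always has a fixed number of games. The PPW values and the 8-team field below are hypothetical; the real 2016 figures came from the contest itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical points-per-win values for a toy 8-team bracket.
ppw = np.array([0.5, 0.8, 1.2, 2.0, 3.5, 5.0, 8.0, 12.0])

# Naive allocation: weight each team in inverse proportion to its PPW, so
# that weight * PPW is the same constant for every team.
weights = (1.0 / ppw) / np.sum(1.0 / ppw)

def portfolio_points(wins_per_team):
    return float(np.sum(weights * ppw * wins_per_team))

# An 8-team single-elimination bracket always has exactly 7 games, so any
# outcome with 7 total wins (valid bracket or not) scores identically.
outcomes = [rng.multinomial(7, np.ones(8) / 8) for _ in range(5)]
scores = [portfolio_points(o) for o in outcomes]
print([round(s, 6) for s in scores])  # all equal: the "risk-free" total
```

The guaranteed total is simply (number of games) divided by the sum of the reciprocal PPWs, which is how a figure like 5.34 falls out of the full 63-game bracket.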

If we opted for a middle ground between guaranteeing points and going all in on one team, there were some sweet spots in the playing field.

As Justin described in his post, some teams had a much more attractive payout than others.

**March Madness 2016: Expected Point Total vs. Standard Deviation of Point Total**

*Data from FiveThirtyEight, Calculations by Newfound Research.*

Since the payoffs were calculated based on the probability of winning the entire tournament, finding the teams that could get a lot of points for just winning a few games was key.

The structure of the tournament gave Oregon (1) a disproportionate payout relative to the other number 1 seeds. Whether Oregon was destined to get beaten by Kansas (1), Villanova (2), Oklahoma (2), or a lower seed, Oregon was going to earn more points for winning two games than any of the other number one seeded teams would get for winning 3 games. Using this information to your advantage made sense.

In constructing an actual portfolio, thoroughly understanding how an investment works and the risks associated with it is very important. Once you understand why you are doing something and what could go wrong, you can use that knowledge to your advantage.

As we saw with the recent death of XIV[5], a blind investment would have been an unfortunate loss, but a systematically rebalanced portfolio would have been a consistent outperformer.

**Diversification Can Be Very Disappointing**

Given that you understand the rules of the game, defining a winning strategy is the next step.

In the 2016 tournament, the highest scoring entry would have gone to anyone who had allocated their full portfolio to Syracuse (10) even though they did not win the championship. But hindsight is 20/20, and choosing a number 10 seed was probably unlikely, unless you were an Orange.

*Data from FiveThirtyEight, Calculations by Newfound Research.*

The top seeded teams did not ultimately garner the most points, although they were solid performers in this tournament. To have a strategy that is more robust to upsets, many entries opted for diversification to minimize the risk of losing.

But diversification can be very disappointing in hindsight.[6]

For instance, being a front-runner and allocating 25% to each number 1 seed resulted in 8.11 points, but the benefit dropped off as diversification was increased by adding lower seeds: the expanded equal weighting eventually fell below the risk-free point total. Even diversifying far enough to include Syracuse was not enough to salvage the point total substantially.

*Data from FiveThirtyEight, Calculations by Newfound Research.*

Moving on to a real-world investment topic, allocations to higher expected return assets must be large enough to have a meaningful impact on the portfolio value, but not so large that they tank the whole thing in bad periods. This necessitates diversification across both assets and processes.

2017 was a hard year for asset class diversification if you were mentally anchored to equity performance. Equities across the board posted 20%+ returns while commodities and fixed income returned low single-digit figures. Using diversifying risk-management processes such as trend-following allows the portfolio to tilt in years like this.

But even with that, comparing a diversified portfolio to the best performing asset class over a given period can lead to feelings of failure. Unlike in March Madness, where the goal is to win, the goal with investing should be to meet your own goals. Even though a 60/40 portfolio was never the best performer in a year, it led to a much smoother experience over time, likely reducing the risk of abandoning the investment plan.

*Note: The dark line shows the location of the return of a 60% SPY / 40% AGG portfolio.*

Diversification will not always look like a benefit after the returns are realized, but acknowledging its role in reducing the regret of choosing the wrong asset class is pivotal.

What you use to diversify and how you do it is key.

**You Can’t Win In Hindsight**

Wishful anecdotes abound about buying Apple in 2003, Bitcoin in 2011, or nearly any equity ETF in March 2009. However, the only return that matters is the one you actually realized.

As I geared up for March Madness 2017, I revisited my model from 2016 and realized that using a different measure of risk was more appropriate. I didn’t care about minimizing volatility as much as I cared about minimizing the risk of getting beaten by my coworkers (or worse yet, brothers).

Adjusting my simulation to minimize the chance of losing to a competing bracket in the worst 10% of cases led to some nice-looking results and a score of 8.96. With it, I would have won the 2016 March Madness bracket tournament at Newfound… a year too late, in 2017…

We see similar results when constructing portfolios using historical data. If we use the previous year’s returns and correlations as inputs into a Sharpe ratio maximization, we find that the performance of this “optimal” portfolio over the subsequent year – shown as the red dotted line in the chart below – can leave a lot to be desired.
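A minimal sketch of that look-back optimization, on synthetic data standing in for the previous year's returns (the asset count, seed, and function name are assumptions, and the closed-form tangency solution here ignores the long-only and leverage constraints a real portfolio would impose):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic daily returns for four hypothetical asset classes stand in
# for the previous year's realized data.
lookback = rng.normal(0.0003, 0.01, size=(252, 4))

def tangency_weights(rets):
    """Unconstrained maximum-Sharpe (tangency) weights: w proportional to
    inv(cov) @ mean, normalized to sum to one. Long-only and leverage
    constraints are ignored in this sketch."""
    mu = rets.mean(axis=0)
    cov = np.cov(rets, rowvar=False)
    raw = np.linalg.solve(cov, mu)
    return raw / raw.sum()

w = tangency_weights(lookback)
print(np.round(w, 3))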

Dwelling on what could have happened is only valuable insofar as it leads you to make a better decision in the future. Pining away about investments that have risen when you weren’t invested or fallen when you were holding them takes time and mental energy that could be used for more productive endeavors.

**An Overfit Model is Not A Robust One**

My productive endeavor of choice at the time was to develop a model for the new 2017 bracket competition rules. The process was as follows:

- Generate a large number of random brackets.
- Assume that each bracket was the true bracket and calculate the score of every other bracket given this assumption.
- Choose the bracket from the top 1% of average scores that had the highest score over the chalk bracket.

This is somewhat similar to leave-one-out cross-validation in statistics, and it performed decently well on my test set: the 2016 bracket. My “backtest” looked good enough, so I went with it.
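The steps above can be sketched in miniature. Everything below is a toy version: the scoring is simplified to one point per correct game, the counts are made up, and the final selection step is one reading of the rule.

```python
import numpy as np

rng = np.random.default_rng(1)

# A "bracket" here is just a vector of 63 game picks (1 = favorite wins);
# the real contest's round structure and reciprocal-probability points
# are omitted, and all counts below are made up for illustration.
n_games, n_brackets = 63, 500
brackets = (rng.random((n_brackets, n_games)) < 0.7).astype(int)
chalk = np.ones(n_games, dtype=int)  # every favorite wins

# Step 2: treat each random bracket as the true outcome and score all
# others against it (toy scoring: one point per correctly picked game).
scores = (brackets[:, None, :] == brackets[None, :, :]).sum(axis=2)
avg_scores = scores.mean(axis=1)

# Step 3: keep the top 1% by average score, then (one reading of the rule)
# pick the entry that scores highest if the chalk bracket comes true.
top = np.argsort(avg_scores)[-n_brackets // 100:]
vs_chalk = (brackets[top] == chalk).sum(axis=1)
best = brackets[top[np.argmax(vs_chalk)]]
print(int(best.sum()), "of 63 favorites picked in the selected bracket")
```

Averaging a bracket's score over many simulated "truths" rewards robustness, while the chalk tiebreaker keeps the entry from drifting too far from the likeliest outcome.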

However, after Rounds 1 and 2 of the tournament, my model looked more like an overfit piece of junk. This made sense given the extremely small sample size used to develop the model.

Overfitting in finance is a very common problem, one which we have written about before.[7] I would be willing to bet that most people have never seen a bad backtest unless they calculated it themselves. Since we know that every strategy cannot always outperform, some of these backtests are necessarily bogus.

Imagine that eight years ago, I presented you with the following strategy, calculated over the 4 years prior, from February 2006 to February 2010.

On the 6th day of the month, you invest in SPY and either sell after 8 trading days or when a 2% trailing stop is hit using daily closing prices. It looks pretty good and has a Sharpe ratio of 0.78 compared to SPY’s -0.08 over the same period.

*Source: CSI. Calculations by Newfound. Data is hypothetical and does not represent any Newfound index or strategy. Past performance is no guarantee of future results.*

Little did you know that these parameters came from trying 7,140 different combinations of parameters: holding periods ranging from 5 to 21 days, trailing stop losses ranging from 1% to 20%, and entry points on any day of the month (assuming 21 trading days in each month).
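That kind of sweep is easy to reproduce in outline. The sketch below runs the same grid on a synthetic price series (the random-walk prices, seed, and function names are assumptions standing in for the real SPY data), and keeping only the best in-sample combination is exactly the trap being described:

```python
import itertools
import numpy as np

rng = np.random.default_rng(7)

# Synthetic daily closes stand in for SPY; the article used real price data.
prices = (100 * np.cumprod(1 + rng.normal(0.0003, 0.01, 504))).tolist()

def trade_return(start, hold_days, stop_pct):
    """One trade: buy at `start`, sell after `hold_days` closes or as soon
    as a trailing stop of `stop_pct` off the running peak is hit."""
    entry = peak = prices[start]
    last = min(start + hold_days, len(prices) - 1)
    for i in range(start + 1, last + 1):
        peak = max(peak, prices[i])
        if prices[i] <= peak * (1 - stop_pct):
            return prices[i] / entry - 1
    return prices[last] / entry - 1

def in_sample_sharpe(day, hold, stop):
    # One trade per 21-day "month" at the chosen entry day.
    rets = [trade_return(m + day, hold, stop / 100)
            for m in range(0, len(prices) - 43, 21)]
    return np.mean(rets) / (np.std(rets) + 1e-12)

# The search space: entry on any of 21 trading days, holding periods of
# 5-21 days, stops of 1%-20% -> 21 * 17 * 20 = 7,140 combinations.
grid = list(itertools.product(range(21), range(5, 22), range(1, 21)))

# Keeping only the best in-sample combination is the overfitting trap.
best = max(grid, key=lambda g: in_sample_sharpe(*g))
print("in-sample 'best' (entry day, hold days, stop %):", best)
```

With 7,140 tries on pure noise, some combination will always look like skill in-sample.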

Here’s what the same strategy would have looked like over the subsequent 4 years, now with a Sharpe ratio of 0.87 compared to SPY’s 1.22.

*Source: CSI. Calculations by Newfound. Data is hypothetical and does not represent any Newfound index or strategy. Past performance is no guarantee of future results.*

Not nearly as impressive, and I probably would have been fired as a manager if you were expecting outperformance.

But wait! I did some more research and have an *even better* strategy where you invest on day 19 of each month and hold for 20 days with a 7% trailing stop. It has a Sharpe ratio of 1.63 compared to SPY’s 1.22 over the period from February 2010 to February 2014.

*Source: CSI. Calculations by Newfound. Data is hypothetical and does not represent any Newfound index or strategy. Past performance is no guarantee of future results.*

I’m sure you can guess the result. Luckily for you, with the low volatility and the fact that the strategy was basically fully invested, the next four years from February 2014 to February 2018 did not look too bad. You essentially tracked the market.

However, if we tack on fees and transaction costs for a strategy like this, you would not have seen the type of performance represented by the backtest.

Naturally, we could continue this process for as long as we want, with a further parameter adjustment leading to a strategy with a Sharpe ratio of nearly 2.2 over the past 4 years, compared to a measly 1.2 for SPY.

Trying many iterations of a strategy, selecting the “best”, and constantly tweaking backtests is a perilous trap. Any degree of robustness quickly flies out the window.

One way to avoid this in the first place is to dive deeper into the data, if possible. For example, if you know how many sets of parameters were tested and have the results from each trial, you can run tests to calculate metrics such as the probability of backtest overfitting (PBO) and the probability of a negative out-of-sample (OOS) Sharpe ratio.[8]
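A simplified sketch of the combinatorially symmetric cross-validation (CSCV) idea behind the PBO estimate in [8] follows; the block count, the skill-free random trials, and the function names are all assumptions for illustration. With no true skill, the in-sample winner carries no information out of sample, so the estimate should hover around one half.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)

# Toy trial data: daily returns of 50 skill-free random "strategies".
trials = rng.normal(0.0, 0.01, size=(240, 50))

def _sharpe(r):
    return r.mean(axis=0) / (r.std(axis=0) + 1e-12)

def pbo(trials, n_blocks=8):
    """Simplified CSCV estimate of the probability of backtest overfitting:
    split the sample into blocks, form every half/half in-sample vs.
    out-of-sample partition, pick the best trial in-sample, and record
    how it ranks out-of-sample."""
    n_trials = trials.shape[1]
    blocks = np.array_split(trials, n_blocks, axis=0)
    logits = []
    for is_idx in itertools.combinations(range(n_blocks), n_blocks // 2):
        oos_idx = [i for i in range(n_blocks) if i not in is_idx]
        is_sharpe = _sharpe(np.vstack([blocks[i] for i in is_idx]))
        oos_sharpe = _sharpe(np.vstack([blocks[i] for i in oos_idx]))
        winner = np.argmax(is_sharpe)
        # Relative OOS rank of the in-sample winner, mapped to a logit.
        rank = np.sum(oos_sharpe <= oos_sharpe[winner])
        omega = rank / (n_trials + 1)
        logits.append(np.log(omega / (1 - omega)))
    # PBO = share of partitions where the IS winner is below the OOS median.
    return float(np.mean(np.array(logits) <= 0))

est = pbo(trials)
print("estimated PBO:", round(est, 2))
```

Feeding in the actual trial results from a parameter sweep, rather than toy noise, is what produces figures like the 81% and 84% quoted below.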

For the first strategy, the PBO was 81% with an 85% probability of a negative OOS Sharpe ratio. For the second iteration, the PBO was 84% with a 23% probability of a negative OOS Sharpe ratio.

Of course, it always pays to ask why parameters are what they are and whether the model is trying to capture a source of return that has a sound basis for continuing in the future (e.g. momentum, value, quality, etc.). Common sense can go a long way in avoiding an overfit model.

**A Model is Only as Good as the Assumptions that Go Into It**

So what was the problem with my March Madness 2017 bracket?

It turned out to be bad assumptions.

I assumed that the points awarded for each team were known at the beginning of the tournament and would not be altered in the future based on who the teams beat, who they were subsequently playing, injuries, playing location, etc. I assumed that the locked-up capital that was my bracket was invested according to a known set of rules when there were, in reality, tactical shifts in the underlying that I could not capitalize on after locking in my bets.

But that’s exactly how investing works, isn’t it? As prices and fundamentals change over time, expectations also change. We allocate to new value stocks that have become cheaper or allocate away from momentum stocks that have run their course. If interest rates increase, we may look to allocate more to bonds as duration declines and income is higher.

This is not to say that we should be tactical with our entire portfolio. After all, there is extreme value in buying and holding strategic investments that we believe will earn a premium over our time horizon as long as we can weather the short-term fluctuations. Systematic tactical shifts should serve more as a pivot within a larger strategic framework.

**Conclusion: 2018 March Madness**

Overall, my 2017 bracket would have fared better had my assumptions and interpretation of the rules been more accurate. My consolation is knowing that I have a data point in favor of my model (or at least not enough evidence to reject it).

So what does this all mean for this year’s tournament?

- Study the rules. Is it standard or more bespoke? What areas can be exploited for gain? ReSolve has mentioned previously that they would like to include some elements of prospect theory, so how would that change your approach?
- If you develop a model, ask what could be wrong with it. We are dealing with a small sample size here, and overfitting is an easy trap.
- Take a gamble and be thankful that there is not a lot of weight on it. If you put all your eggs in one basket, you will probably be disappointed. But with the ability to submit multiple entries (essentially like do-overs) and generally low stakes, why not take a risk when some bragging rights may be all you have to lose?

As the rules change again this year, we encourage all of our readers to participate. It’s always nice when we have a larger sample to test our strategies against!

*May the best quant win!*

As for your portfolio:

- Ensure that you understand the investments you hold, why you hold them and what to expect under different market scenarios.
- Pursue true diversification in both asset classes and processes.
- Rely on models that are both theoretically and empirically sound, but don’t place all your trust in a single methodology. In order for something to work in the long run, it must sometimes exhibit short term underperformance.
- Acknowledge that your assumptions may be wrong, and temper hubris with humility. Risk can be present in a variety of forms, and although you can only plan so much, identifying weaknesses beforehand can provide some peace-of-mind if those risks are realized.
- Develop a plan and stick to it. Constant tweaking will almost always do more harm than good. The best portfolio the previous year, quarter, month, week, etc. will likely not be the best over the subsequent period.

March Madness is a short-lived tournament that will not likely have a large impact on your life. But the factors that may lead to a winning strategy under some of the more-interesting bracket rules translate into other areas. Achieving financial goals is generally a longer-lasting concern for investors, and by obeying a few simple principles, these goals can be more attainable with less stress along the way.

[1] http://www.investresolve.com/blog/youre-invited-a-march-madness-challenge-where-skill-prevails/

[2] https://blog.thinknewfound.com/2016/03/coreys-take-building-portfolio-resolves-march-madness-challenge/

[3] https://blog.thinknewfound.com/2016/03/justins-take-building-portfolio-resolves-march-madness-challenge/

[4] http://www.investresolve.com/blog/psa-your-ncaa-march-madness-rules-are-garbage-do-this-instead/

[5] https://blog.thinknewfound.com/2018/02/rip-xiv/

[6] https://blog.thinknewfound.com/2016/05/diversification-will-always-disappoint/

[7] http://www.thinknewfound.com/wp-content/uploads/2013/10/Backtesting-with-Integrity.pdf

[8] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2326253. If you would like to explore these types of overfit strategies further, the authors have an online tool at http://datagrid.lbl.gov/backtest/.
