I recently came across a graph similar to the following while doing some market research.

Figure 1. Source: Yahoo! Finance. Analysis by Newfound Research. Data from January 1951 – December 2015.

The argument was that the markets are getting more volatile. While this certainly looks to be the case based on the upward trend of the histogram, let’s investigate the data more thoroughly and ask ourselves some questions about its presentation.

What is the underlying data?

This is one of the first questions we need to ask. If the underlying data is not applicable to our situation, then any graphical manipulations are not going to mean much.

In this case, the data set is the S&P 500. While the graph does not state whether it is price return or total return, we can ascertain that it is price return since daily total return data only goes back to 1988, and the graph goes back to the 50s.

Is price return appropriate? Normally, these types of analyses should be done on a total return basis since that is the return that investors actually get. For instance, say you have an investment that distributes a special 4% dividend. On the ex-dividend date, the price of the asset will drop by 4%. This distribution would have no impact on the total return. If we assume that the total return for the day was negligible, this dividend would make the asset look like it had a >3% move based on price return.
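To make the arithmetic concrete, here is a toy sketch of that scenario (all of the numbers are hypothetical):

```python
# A hypothetical stock closes at 100.00, pays a special 4.00 dividend,
# and closes the ex-dividend day at 96.00 in an otherwise quiet market.
prev_close = 100.00
dividend = 4.00
ex_date_close = 96.00

price_return = ex_date_close / prev_close - 1                # -4.0%
total_return = (ex_date_close + dividend) / prev_close - 1   #  0.0%

print(f"Price return: {price_return:+.2%}")   # shows up as a >3% "move"
print(f"Total return: {total_return:+.2%}")   # shows no move at all
```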

For this graph, price return is fine to use since the frequency and magnitude of the S&P 500 dividends are not likely to distort the data in the graph. Using price return lets us go back further in time, which is beneficial.

Is the graph logically constructed?

This is one of the key questions to ask since there are many ways to massage data to tell a desired story.

A bar graph is a relatively simple way to show a trend. The graph above just counts the number of >3% absolute moves by decade and shows a rising trend. But upon closer inspection, we find that one of the buckets is actually a “decade”…

This is one of the key problems with bar graphs and histograms, one which we have written about before in reference to a graph we came across in a J.P. Morgan white paper. Having inconsistent bucket sizes can skew the results.
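Before tinkering with the buckets, it is worth pinning down how such a count is produced in the first place. A minimal sketch, assuming the daily closing prices sit in a pandas Series indexed by date (the function and variable names are my own, not from the original analysis):

```python
import pandas as pd

def big_moves_by_decade(prices: pd.Series, threshold: float = 0.03) -> pd.Series:
    """Count daily moves larger than `threshold` in absolute value, per decade."""
    daily_returns = prices.pct_change().dropna()
    big_move = daily_returns.abs() > threshold
    decade = (daily_returns.index.year // 10) * 10   # e.g. 1987 -> 1980
    return big_move.groupby(decade).sum()
```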

The last bar in the graph is actually a 15-year period from 2001-2015. If we break this out, the graph now looks like this.

Figure 2. Source: Yahoo! Finance. Analysis by Newfound Research. Data from January 1951 – December 2015.

This certainly looks different, but you’ll notice we have the same problem of uneven bucketing in the last bar since it is now only half of a decade. One way to make this more accurate would be to scale the last bucket up by a factor of 2 to account for the reduced time period.

This is akin to annualizing returns over partial-year periods. You can run into trouble if there is reason to believe that the extrapolated portion of the period will look significantly different from the portion we have already experienced. For instance, if we only had data for 1 year of the decade, multiplying that by ten may not be very credible, but as long as you know what has been done to arrive at the final product, you can judge for yourself.
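In code, the adjustment is nothing more than a per-bucket multiplication. A sketch building on the counting function above (again, the names and inputs are hypothetical):

```python
import pandas as pd

def scale_to_full_decade(counts: pd.Series, years_in_bucket: pd.Series) -> pd.Series:
    """Scale each bucket's count up to a full ten years of data."""
    # A bucket covering 2011-2015 has years_in_bucket == 5, so its count
    # is multiplied by 10 / 5 = 2, matching the adjustment in the chart below.
    return counts * (10.0 / years_in_bucket)
```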

Figure 3. Source: Yahoo! Finance. Analysis by Newfound Research. Data from January 1951 – December 2015. The 5-year period from 2011-2015 has been multiplied by a factor of 2 to make it consistent with the decade data.

So what is your conclusion now? While it seems like the market has gotten more volatile than in the 50s and 60s, the 2000s look like more of an anomaly than part of a trend.

Am I missing any information?

Now that we have fixed the construction problems with the graph, is there any other data we should consider before drawing our final conclusion?

Since the graph already goes to 2015, we can’t go much further into the future. The S&P 500 data begins in 1950, but maybe there is a way to go back further in time.

Using the market return data from the Kenneth French Data Library, we can extend back to the beginning of the 30s – two more decades.
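For anyone who wants to reproduce this, the daily factor file can be pulled programmatically, for example with pandas-datareader. A sketch, with the dataset and column names assumed to be those of the standard Fama-French daily research factors file:

```python
import pandas_datareader.data as web

# Pull the daily research factors, which include the market return back to 1926.
# The dataset key and column labels below are assumptions based on the standard file.
ff = web.DataReader("F-F_Research_Data_Factors_daily", "famafrench",
                    start="1926-07-01", end="2015-12-31")[0]

# Factors are reported in percent; Mkt-RF + RF gives the total market return.
mkt = (ff["Mkt-RF"] + ff["RF"]) / 100.0

decade = (mkt.index.year // 10) * 10
print((mkt.abs() > 0.03).groupby(decade).sum())
```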

Figure 4. Source: Kenneth French Data Library. Analysis by Newfound Research. Data from January 1931 – December 2015. The 5-year period from 2011-2015 has been multiplied by a factor of 2 to make it consistent with the decade data.

Sometimes a fresh view of the data is also enlightening. We can plot the number of >3% market moves over rolling 1-year periods to see how consistent the volatility was during the decades.
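The rolling count itself is straightforward to compute. A sketch, assuming `daily_returns` is a pandas Series of daily market returns in decimal form (such as the `mkt` series above):

```python
import pandas as pd

def rolling_big_move_count(daily_returns: pd.Series,
                           threshold: float = 0.03,
                           window: int = 252) -> pd.Series:
    """Count >threshold absolute daily moves over a trailing ~1-year (252 trading day) window."""
    return (daily_returns.abs() > threshold).rolling(window).sum()
```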

Figure 5. Source: Kenneth French Data Library. Analysis by Newfound Research. Data from July 1926 – April 2016.

The Takeaways

In the field of finance, we often have to rely on data provided by others. Validating it to the extent possible is a crucial step that could save you a headache down the road when you either have to retract a false statement or deal with results contrary to what you initially expected.

Three simple steps can make the task easier:

  • When evaluating any analysis, make sure the underlying data is applicable to the situation at hand.
  • Look for ways that statistics and visualization can be misleading (uneven bucketing, inconsistent scaling or sizing, no statements of uncertainty, etc.).
  • Corroborate any conclusions with a more extensive analysis using similar data or by visualizing it a different way.

As for this specific graph, I do not think it shows that the markets are getting more volatile. The 2000s were a more volatile time, but the 1930s were even more so. We wrote a commentary a while ago about the markets getting faster and came to a similar conclusion: we are within normal volatility ranges but may have an altered perspective.

While being skeptical of the research we read can be good, I intentionally did not use the word “skeptical” in the title of this post since it implies a doubtful predisposition. Many times, data does not lie and can trump our firmly held beliefs, which is a good thing. I would be more skeptical of the humans presenting the data than of the data itself. By being discerning in our approach to data, we can more accurately judge its applicability and meaning, whether good, bad, or, unfortunately, occasionally ugly.

Nathan is a Vice President at Newfound Research, a quantitative asset manager offering a suite of separately managed accounts and mutual funds. At Newfound, Nathan is responsible for investment research, strategy development, and supporting the portfolio management team.

Prior to joining Newfound, he was a chemical engineer at URS, a global engineering firm in the oil, natural gas, and biofuels industry where he was responsible for process simulation development, project economic analysis, and the creation of in-house software.

Nathan holds a Master of Science in Computational Finance from Carnegie Mellon University and graduated summa cum laude from Case Western Reserve University with a Bachelor of Science in Chemical Engineering and a minor in Mathematics.