Prior to October, whenever our models required a rebalance, we assumed that it occurred at the opening price on the following day in our hypothetical index calculation.
Making this assumption is not necessarily bad, especially when evaluating the performance of a new strategy you are testing or of a strategy you are trading yourself. In these cases, either the accuracy of the calculation is not critical, or you have full control of the trading. Problems arise when the strategy is traded by many different investors on a variety of platforms throughout the course of the day. Suddenly, having an execution price that is more representative of what is actually happening matters more since these investors will inevitably compare their performance against the index.
We see many firms assume that the trades are implemented at the closing price used to generate the trading signals in the first place. While assuming a trade at the opening price isn’t perfect, it is much, much better than this method. If we are striving for realism, using the closing price is a step in the wrong direction. How could investors trade at the close if they didn’t know the allocations beforehand?
But even if investors can see the allocations before the market opens, most of them are not going to trade at the opening price.*
Individual account dispersion around the index is to be expected. However, systematic underperformance is not good when you are running a business…and any kind of systematic difference, either good or bad, is not good from a statistical perspective.
As we said in the previous post, we would need intraday tick-by-tick data to use the true TWAP, so our method is an approximation. However, based on the limited amount of intraday data we can get from Google Finance, we can see how close our method comes to this “true” value (see note below).
The following chart shows the root mean square error (RMSE) of the opening price versus the true TWAP and the RMSE of our TWAP estimate versus the true TWAP, over the period from 10/22/2015 to 11/10/2015. It covers all the ETFs used in our core products (U.S. large and small-cap sectors, global sectors, income-focused ETFs, alternatives, and fixed income) along with some common benchmarks such as SPY, IJR, and ACWI.
Any point below the orange line means that our TWAP estimate comes closer, on average, to the true TWAP calculated using the Google data than the opening price does. On average, the RMSE against the true TWAP fell by 59% when the TWAP estimate replaced the opening price as the assumed execution price. In absolute terms, the RMSE was generally less than 0.5% for the TWAP estimate. If we treat the RMSE as a standard deviation and assume that the deviations are normally distributed, we would expect 95% of our TWAP estimates to be within ±1% of the true TWAP (roughly two standard deviations).
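For readers who want to reproduce this kind of comparison, here is a minimal sketch of the arithmetic. It assumes the RMSE is computed on relative (percentage) deviations from the true TWAP, which is our reading of the figures above; the function name and the sample prices are hypothetical and are not the data behind the chart.

```python
import math

def rmse_pct(assumed_prices, true_prices):
    """RMSE of the relative deviation (assumed / true - 1) across days."""
    if not assumed_prices or len(assumed_prices) != len(true_prices):
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(a / t - 1.0) ** 2 for a, t in zip(assumed_prices, true_prices)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical daily values for a single ETF (illustrative only).
true_twap = [100.0, 101.0, 99.5, 100.5]
open_px = [99.0, 102.0, 100.5, 100.0]
twap_est = [100.2, 100.8, 99.7, 100.4]

rmse_open = rmse_pct(open_px, true_twap)
rmse_est = rmse_pct(twap_est, true_twap)

# Fractional reduction in RMSE from switching to the TWAP estimate.
reduction = 1.0 - rmse_est / rmse_open

# Under a normal assumption, about 95% of deviations fall within
# 1.96 standard deviations, treating the RMSE as the standard deviation.
band_95 = 1.96 * rmse_est
```

With an RMSE of 0.5%, the same band works out to roughly ±1%, which is where the figure in the text comes from.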
Out of the 78 ETFs on the chart, 5 had a TWAP estimate that was, on average, farther from the true TWAP than the opening price was. The calculation of our TWAP estimate assumes that the duration of each price move is proportional to its distance, which is not always the case.
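To make the proportionality assumption concrete, here is a rough sketch. The exact formula is in the previous post and is not reproduced here, so treat this as an illustration only: it assumes the price travels in straight lines from open to high to low to close (the ordering is our assumption for this example) and that the time spent on each leg is proportional to the distance covered. The function name is hypothetical.

```python
def twap_estimate(open_px, high, low, close):
    """Time-weighted average along an assumed open -> high -> low -> close path.

    Each leg is a straight line, so its average price is its midpoint, and
    its share of the day is assumed proportional to the distance it covers.
    """
    path = [open_px, high, low, close]
    legs = list(zip(path[:-1], path[1:]))
    total_distance = sum(abs(end - start) for start, end in legs)
    if total_distance == 0:
        # Price never moved, so every point on the path equals the open.
        return open_px
    weighted = sum(abs(end - start) * (start + end) / 2.0 for start, end in legs)
    return weighted / total_distance

# A day that spikes to 110 and comes straight back to 100:
spike_day = twap_estimate(100.0, 110.0, 100.0, 100.0)  # 105.0
```

In the spike example the estimate lands halfway up the spike, because the sketch assumes the move to the high took as long as the move back down. If the spike actually lasted only a few minutes, the true TWAP would sit much closer to 100, which is the kind of miss described above.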
The five ETFs above the line were LALT, HDG, BWZ, PSCC, and PSCU. Of these, PSCC and PSCU are the clearest offenders while the others were just barely over. The chart below shows PSCC on 10/29/15.
Assuming that the price made a smooth transition from open to high clearly did not hold on that day. Still, the TWAP estimate was generally much closer to the true TWAP across every asset class, especially in those with very liquid ETFs.
So if we assume that investors trade throughout the day and that TWAP is a better representation of this behavior, then our TWAP estimate did a better job over this time period than simply using the opening price.
What if we assume that investor trades are better represented by the volume-weighted average price (VWAP) instead of TWAP? VWAP puts more emphasis on prices during periods of higher volume and may be a better indicator when trades are a significant fraction of the daily volume.
From the intraday data, we can also calculate a true VWAP and compare our estimate to that.
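As a rough sketch of what the "true" values look like when computed from intraday bars, the snippet below assumes equally spaced bars, each with one price and one volume. The bar layout, function names, and numbers are hypothetical and do not reflect the format of any particular data source.

```python
def twap(bars):
    """Simple average of bar prices; assumes bars are equally spaced in time."""
    if not bars:
        raise ValueError("no bars supplied")
    return sum(price for price, _ in bars) / len(bars)

def vwap(bars):
    """Average of bar prices weighted by each bar's volume."""
    total_volume = sum(volume for _, volume in bars)
    if total_volume == 0:
        raise ValueError("total volume is zero")
    return sum(price * volume for price, volume in bars) / total_volume

# Hypothetical (price, volume) bars: a thinly traded spike to 20.0.
bars = [(10.0, 100), (20.0, 100), (12.0, 800)]

day_twap = twap(bars)  # 14.0
day_vwap = vwap(bars)  # 12.6
```

The thinly traded spike pulls the TWAP up much more than the VWAP, since VWAP gives that bar only a tenth of the day's weight.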
Once again, over this period, our TWAP estimate came closer to this value on average than the opening price did. We still see the same tickers over the line, since the daily highs that skewed the TWAP estimate were low-volume trades.
In ETF portfolios with dispersed trading, TWAP is likely to be a better estimate of execution than VWAP. In particular, when many people use block trades, it matters a lot more what time you executed (TWAP) as opposed to what prices others received (VWAP).
Ultimately, whatever value we use for execution when calculating model performance will never be right for every investor. There will be scenarios in which the bulk of our clients fall on one side or the other, but reducing the likelihood of systematic performance differences stemming from an unrealistic assumption is always a goal.
As far as how much this process minimizes differences between individual account and index performance, only time will tell.
* Maybe that is a good thing considering what happened in the market on August 24th.
Note: This TWAP value is also an approximation since the quality of Google’s intraday data is up for debate. It is incomplete (the intraday volumes are generally less than the full-day volume), but there are not many easily accessible sources of intraday data available to the general public. Stooq provides data at 5-minute intervals in an easily downloadable format, but it is also incomplete on the volume. NetFonds also has data available, but only for a subset of ETFs. You get what you pay for…