# Gauss’s Fat Tails

September 24, 2010

The term “fat tails” is thrown around with what I consider reckless abandon. Most times I find that people use it without appreciating what it really means, and then they draw the wrong conclusions. So I’m going to take a stab at explaining what I think a correct interpretation is. The first and most prevalent wrong conclusion is that all quants and financial models underestimate the risk of extreme events. Wrong – there are plenty of ways to reasonably and accurately model tails. The second is that all models use the Normal Distribution. Wrong – there’s a host of distributions that can be and are used. A corollary to that wrong assumption is that VaR (Value-at-Risk), in particular, always uses the Normal Distribution, which is also wrong. VaR makes no assumptions about what distribution is used, but I’ll have an entirely different set of posts about VaR. This post will be part one of what may be a series of posts on the topic of fat tails; this one will be limited to examining actual data and comparing it to the most commonly used distribution, the Normal curve.

Let’s start with a look at the daily returns of the S&P 500 Index. I gathered the prices of the S&P 500 from Oct 20, 1976 (an arbitrary date – nothing special was going on in the markets at that time) using the Investor Analytics database, which is a cleansed composite of a number of different data providers. That means we make sure that the data is good. The daily return is computed by taking the percent change from the prior day’s price (I’m lying – we actually calculate log returns, but that’s a small technical difference). The daily returns are shown in Figure 1 for each day from Oct 20, 1976 to Sep 20, 2010:
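The percent-change vs. log-return distinction can be made concrete in a few lines. A minimal sketch (the prices below are made up for illustration; the actual Investor Analytics data set isn’t public):

```python
import numpy as np

# Hypothetical closing prices -- illustrative only, not the post's data set.
prices = np.array([100.0, 101.5, 99.8, 100.2])

# Percent change from the prior day's price:
simple = np.diff(prices) / prices[:-1]

# Log returns, as actually used in the post: ln(P_t / P_{t-1})
log_ret = np.diff(np.log(prices))

# For small daily moves the two nearly coincide, since log(1 + r) ~ r.
print(np.max(np.abs(simple - log_ret)))
```

For moves of a percent or two the two definitions differ only in the fourth decimal place, which is why the post calls it a small technical difference.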

The very first thing you’ll probably notice is that really really bad day back in 1987. On October 19, 1987 the S&P lost 22.90% in a single day. Another thing you’ll notice is that most days are very boring: 1% swings up and down. You’ll also notice that the volatility of the returns comes in bunches – after the volatility of the late 1980’s comes a “quiet time” during the early 1990’s, followed by a clear zone of increased volatility near the turn of the century/millennium, followed by another quiet time that was rudely interrupted by 2008. So from this graph, we can tell that the US equity market’s volatility comes in clumps – with periods of high volatility interspersed with periods of low volatility.
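The “clumps” of volatility described above can be quantified with a rolling standard deviation. A sketch using synthetic two-regime returns (the regime parameters are assumptions, stand-ins for the real series):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic volatility regimes: a quiet stretch (~0.5% daily moves)
# followed by a turbulent one (~2% daily moves).  Parameters are assumptions.
returns = np.concatenate([rng.normal(0.0, 0.005, 500),
                          rng.normal(0.0, 0.020, 500)])

# A 60-day rolling standard deviation makes the regime change obvious.
window = 60
rolling_vol = np.array([returns[i:i + window].std()
                        for i in range(len(returns) - window + 1)])

print(rolling_vol.min(), rolling_vol.max())
```

On real S&P data the same rolling-window calculation traces out exactly the quiet-1990s / turbulent-2000s pattern visible in Figure 1.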

Another way to look at the exact same data is to put it into a histogram, which does ‘nothing more’ than bin up the returns to see how often different returns occur. Figure 2 shows the same exact data in histogram form: this time, the range of returns is shown on the x-axis and the number of days with each return is shown on the y-axis.

As before, most days are really boring: the bin right around zero holds the greatest number of days. I limited this graph to returns between -10% and +10% because that’s where the severe majority of returns lies. In fact, only 1 of the 8,633 days in this data set has a return worse than -10%, and we’ve already talked about that really really bad day. This histogram reveals a bunch of other information that the time series doesn’t reveal: first, it tells us how much less likely the bad days are than the good days. Just look at how few days have returns worse than a 5% loss. A second thing it tells us, remarkably, is that returns are symmetric: a big return is just about as likely as the same sized loss. This is one of those things that a professor of mine, Dr. Soven, used to call “a minor miracle.” In physics, symmetries are very important because they’re a big clue about the underlying structure and dynamics of what’s causing them. More on that in future posts.
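The binning step itself is simple. Here’s a sketch using a fat-tailed stand-in series (Student’s t draws scaled to ~1% moves, since the real S&P file isn’t reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)

# Fat-tailed stand-in for the 8,633 daily returns (assumed, not the real data).
returns = rng.standard_t(df=3, size=8633) * 0.01

# Bin into half-percent buckets from -10% to +10%, as in Figure 2.
edges = np.arange(-0.10, 0.1001, 0.005)
counts, _ = np.histogram(returns, bins=edges)

# The bucket nearest zero should hold the most days.
peak = np.argmax(counts)
print(edges[peak], edges[peak + 1], counts[peak])
```

The histogram is literally just `np.histogram`: count how many days fall in each bucket, then plot the counts.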

In finance parlance, the parts of this histogram distribution that are worse than, say, a loss of about 3% (or, symmetrically, the parts that are better than, say, a gain of about 3%) are called the “tails” of the distribution. I’m picking that number totally arbitrarily – the tails might begin at 2% or at 4%. The point is that the tails are the less frequent, larger moves; where they begin is fuzzy, their shapes are fuzzy, and exactly where they end is fuzzy. Just to get a better appreciation of the tails, let’s zoom in on the left-hand tail for this S&P 500 data set.

Figure 3 shows the loss tail in greater detail, with the number of small losses like -1% or -2% literally off the scale. You can see that the frequency of these extreme losses drops rather quickly – while a 5% loss happened about 20 times since 1977, losses of 9% or 10% are much rarer, having happened only a handful of times. Interestingly, we’ve never had a day where the S&P lost 11%. Or 12%. Or 13%. Those huge losses have just not happened. Not even once. So is it safe to assume that we don’t have to worry about such big losses? Is it safe to assume they have no chance of happening? Or is it better to assume that we just haven’t seen them yet, and that given enough time, we will?

What about that really really bad day – way out there at -23%, all by itself? Some statisticians would call that an “outlier” – an abnormality which is clearly outside the expected range and therefore, somehow, ignorable. Everybody else calls it a crash. Modeling such events is a big problem. Plenty of future posts on that topic.

The benefit of a histogram like this is that it calls your attention to how often certain things happen and shows you how much more likely common things (like small returns) are than the rare events (like large gains or large losses), and the shape of that relationship. But histograms only tell part of the story, and they hide some important information. One of the downsides to looking at a histogram is that it totally mixes up time – you have no idea if those 9% losses all took place one day after another or if they were years apart. And that little detail can matter – a lot.

So what does all this have to do with fat tails? Well, it has to do with the modeling that’s done to try to replicate the return distribution (the shape of the whole histogram shown in Figure 2). Financial engineers analyze the data and come up with an equation that fits the data as closely as possible, and the most common choice of equation is called the Normal Distribution or the Gaussian Distribution, named after the German mathematician Carl Friedrich Gauss. The result looks like what’s shown in Figure 4.

In Figure 4, the data are shown as orange dots and the Normal curve as a black line. Let’s ignore the red lines for now. I’ll go into the mathematics of the Normal curve in a different post, but for now just take a look at how incredibly well this one relatively simple equation matches the data. I mean, really, it’s beautiful. It reproduces the peak and matches a severe majority of the data really well. For this data set, 95% of the returns are in the band between -2.1% and +2.1%, shown as the red vertical lines in the figure. And in that band, the black curve describes the data really well. The fact that it underpredicts the data points outside that band, which is especially obvious between 2% and 4% in this plot, is what’s called a “fat tail.” The data’s tail is fatter than the theoretical curve’s tail. In other words, the calculation underpredicts the frequency of large losses. There are two points I want to hammer home about this: 1) the Normal curve works remarkably well for 95% of the data, and 2) from this graph, you cannot tell how badly it underpredicts the tails. Let’s take them one at a time.
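The “fit” behind a figure like this can be as simple as estimating the mean and standard deviation from the data and drawing the Gaussian with those two parameters. A sketch with a stand-in fat-tailed series (assumed data, not the post’s):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
returns = rng.standard_t(df=3, size=8633) * 0.01  # fat-tailed stand-in data

# "Fitting a Normal" = estimating its two parameters from the data.
mu, sigma = returns.mean(), returns.std()

# The fitted curve's central 95% band (+/- 1.96 sigma), the analogue of
# the red vertical lines at -2.1% / +2.1% in Figure 4.
lo, hi = norm.ppf([0.025, 0.975], loc=mu, scale=sigma)
inside = np.mean((returns > lo) & (returns < hi))
print(lo, hi, inside)
```

Even for a fat-tailed series, roughly 95% of the observations land inside the fitted Normal’s central band, which is exactly the sense in which the Gaussian fit looks beautiful in the middle of the histogram.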

**The Normal curve is great for 95% of the data**. Another way of saying this is “under normal circumstances, the Normal curve is a good description of the data.” Well, duh. That’s exactly what you expect of something called “Normal.” But I’m serious: this curve is good at this job so long as you’re limiting yourself to “most of the time,” and most can be as big as 95% of the time. **The problem is when you want to know about that remaining 5%.** What happens 96% of the time? 98% of the time? There, as you can tell from this graph, the curve underpredicts the actual data. Using it will underpredict the real likelihood of those losses. But, how bad is this underprediction? Can I get away with using it? From Figure 4, it doesn’t look like it’s off by much.

Let’s take another look at the same data, but this time using a logarithmic scale. Figure 5 shows the same data as Figure 4, but on a log scale. This means that every major tick on the y-axis doesn’t represent an additional amount (as in “add to the previous amount”), but rather an additional factor (as in multiply). Each tick is 10 times bigger (if reading up) or 10 times smaller (if reading down). This is a common trick scientists use – if you really want to see whether a theoretical curve matches data, use a log plot. It picks up the differences really nicely. Think of it as a magnifying glass: you see things you can’t see otherwise. And in this case, what you see is *BAD*. The Normal curve actually falls off a LOT faster than the data does. Look at that tail — the data show relatively constant probabilities of 1/100% or 1/50% or something like that, but the theoretical curve drops like a rock. Take another look at Figure 4 (above), and notice that you just can’t tell if the data matches the curve – they’re both so small, you just can’t tell. But in the log plot of Figure 5, you can see quite clearly that the prediction is *WAY* off.
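Why the log scale is such a magnifying glass: the logarithm of the Normal density falls off quadratically, so tail values span dozens of orders of magnitude that all look identically like zero on a linear axis. A sketch (the ~1.07% sigma is an assumption, back-solved from the ±2.1% 95% band quoted above):

```python
import numpy as np
from scipy.stats import norm

sigma = 0.021 / 1.96  # ~1.07% daily sigma (assumed from the 95% band)

for r in (-0.02, -0.05, -0.10):
    d = norm.pdf(r, scale=sigma)
    # On a linear axis the -5% and -10% densities both read as "zero" next
    # to the peak, yet they differ by roughly 14 orders of magnitude --
    # only a log axis reveals the gap.
    print(f"{r:+.0%}  log10(density) = {np.log10(d):.1f}")
```

This is why Figure 4 looks fine in the tails while Figure 5 shows the prediction is way off: the linear plot compresses the entire mismatch into an invisible sliver above zero.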

Just for fun (and yes, this is fun for me), I replotted that data so we can see just what the theoretical Normal curve prediction is for a 5% down day. That plot, Figure 6, tells a very bad story. The probability according to the data of a 5% down day is about 1/10%, or about 1 day in 1,000. That’s about once every 2.7 years. [One of the difficulties with interpreting such low statistics is that the very next worse point, at -5.5%, has a probability of 1/100%, or once every 10,000 days = 27 years, but the next point jumps back to 1/10% (1 day every 2.7 years). So, the probability of a 5% down day is somewhere in the range of 1/100% to maybe 1/10%]. What’s the curve predict? Well, 1/10000000000% or 1 day in 1,000,000,000,000 days. That’s 1 day in every 2.7 billion years. Earth is about 4.5 billion years old. So that means that we should have had only 2 such days since the Earth formed. After you recover from thinking about this fact, take a look at the shape of the curve in the plot and how much lower it’s got to go before it manages to predict any of the even worse returns, like -7% or -9%. Actually, the Normal curve predicts that we should never have seen even a 6% down day since the Big Bang, 14 billion years ago. Pause. Breathe. Recover. What do I conclude from this? This: using the Normal curve in these tails isn’t just wrong – it’s absurdly stupidly wrong.
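That back-of-envelope calculation is easy to reproduce, though the answer is exquisitely sensitive to the fitted sigma, which isn’t quoted in the post. With an assumed sigma near 0.7% (roughly what a fit to the central bulk of the histogram might yield), the Normal tail probability of a -5% day lands in the one-day-in-a-trillion territory described above:

```python
import numpy as np
from scipy.stats import norm

sigma = 0.0072  # assumed daily sigma from fitting the bulk of the histogram

# Normal probability of a single day losing 5% or more.
p = norm.cdf(-0.05, loc=0.0, scale=sigma)

# Convert "1 day in 1/p days" to years, using calendar days as the post does.
years_between = (1.0 / p) / 365.0
print(p, years_between)
```

Nudging sigma from 0.7% to 1.1% changes the answer by several orders of magnitude, which only reinforces the point: in the far tail, the Gaussian prediction is both absurdly small and absurdly fragile.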

Things to take away from this whole post: **the Normal curve is actually really really good at modeling financial returns inside the 95% line**. I mean that. And it’s not a trivial thing. That means it works most of the time. Seriously. And we shouldn’t throw that out just because it doesn’t do a good job in the tails. After all, “most of the time” happens, well, most of the time. And the Normal curve is good for giving us a distribution that works “most of the time.” The other point is that for those unlikely events outside the 95% band, the Normal curve isn’t just a little inaccurate. It’s fundamentally and totally wrong. Put another way, **you cannot use the normal curve to model risk in the tails. Period.**

This probably doesn’t change the point of the discussion, but shouldn’t the changes in stock prices be more like lognormal rather than Gaussian (or the logarithm of the stock prices be Gaussian)?

Perhaps the tails arise from the fact that the distribution is really a superposition of Gaussians with different sigmas, indicating the changes in volatility. It would be highly interesting to see the sigma distribution by taking time bins and doing Gaussian fits to them. The optimal width of the time bin (upper limit) may be determined by a chi-squared test of Gaussian fits to the tails of the histogram. Intriguing!
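The commenter’s conjecture is easy to sanity-check in simulation: draw each day’s sigma from a regime distribution, then draw a Gaussian return with that sigma, and the resulting mixture has positive excess kurtosis, i.e. fatter-than-Gaussian tails. A sketch with made-up regime parameters:

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(3)
n = 200_000

# Each day picks a volatility regime (quiet vs. volatile -- assumed values),
# then draws a Gaussian return with that day's sigma: the commenter's
# "superposition of Gaussians with different sigmas".
sigmas = rng.choice([0.005, 0.020], size=n, p=[0.8, 0.2])
returns = rng.normal(0.0, sigmas)

# A single Gaussian has excess kurtosis 0; this superposition does not.
print(kurtosis(returns))
```

So a scale mixture of Normals does produce fat tails, consistent with the volatility clustering visible in Figure 1.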