Why is the normal distribution of such importance to statistical models?

Status
Not open for further replies.

JJChicken

Diamond Member
Apr 9, 2007
6,165
16
81
Question: Why is the normal distribution of such importance to statistical models?

I was first introduced to this brain-wringing distribution back in statistics 101. I was told that many real-life phenomena are approximated well by this model and that it is used quite frequently across a variety of disciplines. That was that; I didn't think any more about the topic at the time.

Fast forward to today: I am in the process of restarting my journey into the wonderful world of mathematics, specifically within the space of probability measures (pun unintended :p). Being stuck on a question where I have to take a normal random variable X and transform it to Z=e^X, I ask:

Why? Why must I stay up late working on convoluted integrations because of this normal distribution? Why is the normal distribution of such importance to statistical models?

Is it something to do with its tails, its bell-shaped curve, etc.? Is it by happenstance that the distribution approximates many phenomena which clump around the middle and taper off at either end?

Or (hopefully) is there some greater elegance and mathematical purity to this model, independent of its applicability to the real world? Just like how e and pi seem like specific definitions to overcome specific issues in maths (e.g. pi to get the area of a circle) but happen to intertwine in many ways in other areas that would not have been apparent at their initial construction.
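
For the Z=e^X transformation mentioned above, the integration can usually be sidestepped by going through the CDF. A sketch of that route, assuming X is normal with mean \mu and standard deviation \sigma (symbols chosen here for illustration; the original problem statement is not given) and writing \Phi for the standard normal CDF:

```latex
% Assumption: X ~ N(\mu, \sigma^2); \mu and \sigma are illustrative symbols.
\begin{align*}
F_Z(z) &= P(e^X \le z) = P(X \le \ln z)
        = \Phi\!\left(\frac{\ln z - \mu}{\sigma}\right), \qquad z > 0, \\
f_Z(z) &= \frac{d}{dz} F_Z(z)
        = \frac{1}{z\,\sigma\sqrt{2\pi}}
          \exp\!\left(-\frac{(\ln z - \mu)^2}{2\sigma^2}\right), \qquad z > 0, \\
E[Z]   &= E[e^X] = \exp\!\left(\mu + \tfrac{1}{2}\sigma^2\right).
\end{align*}
```

The resulting Z is what is usually called a log-normal random variable; the extra 1/z factor comes from differentiating \ln z.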
 

JJChicken

Diamond Member
Apr 9, 2007
6,165
16
81
Well, looks like I found my answer in the Central Limit Theorem.

Happy to hear other thoughts as well.
 

f4phantom2500

Platinum Member
Dec 3, 2006
2,284
1
0
Think about it. Just look at the histogram of ProportionOfHeads. First, notice how, as you go left or right from the median, 0.5, you get about an equal drop-off in each direction. This means that, for data represented by this type of distribution, values become less and less likely the farther they are from the median, about equally in either direction. Distributions like this tend to describe quantities that cluster around a typical value. Say you're measuring the heights of an elementary school class. All of those kids are about the same height; human genetics are pretty consistent, so there is a fairly narrow range of heights for people at a similar level of physical maturity.

Now think about the expiration date on your milk. The likelihood of it going bad drops off much faster as you go back before the date than it does as you go past the date. That's because the date is set using such a model: they determine the average shelf life of their milk, then print a date far earlier than that, so you are far more likely to get milk that lasts beyond the date than milk that spoils early.
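
The histogram referred to above is not reproduced in this thread. A rough sketch of how one might generate something like it, assuming "ProportionOfHeads" means the fraction of heads in a batch of fair coin flips (the function names, batch size, and number of experiments below are made up for illustration):

```python
import random
from collections import Counter


def proportion_of_heads(flips, rng):
    """Flip a fair coin `flips` times and return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips


def simulate(experiments=2000, flips=100, seed=0):
    """Repeat the coin-flip experiment and collect the proportions."""
    rng = random.Random(seed)
    return [proportion_of_heads(flips, rng) for _ in range(experiments)]


def text_histogram(values, width=50):
    """Print one row of '#' marks per observed proportion (2 decimals)."""
    counts = Counter(round(v, 2) for v in values)
    peak = max(counts.values())
    for value in sorted(counts):
        bar = "#" * max(1, counts[value] * width // peak)
        print(f"{value:.2f} {bar}")


if __name__ == "__main__":
    text_histogram(simulate())
```

The bars should pile up around 0.50 and thin out roughly symmetrically on both sides, which is the drop-off described in the post.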
 

CycloWizard

Lifer
Sep 10, 2001
12,348
1
81
It's often assumed that everything follows a normal distribution because it's simply too time consuming/expensive to collect huge numbers of data points to map other distributions. Since the preponderance of evidence supports the basic assumption that most things are normally distributed, we use that as a rule of thumb until someone demonstrates otherwise. The Central Limit Theorem is basically just a way for us to estimate the parameters (mean and standard deviation/variance) of a normal distribution with a small number of samples, since it's obviously impossible to ever collect the infinitely many data points to exactly map the full distribution space.
 

TecHNooB

Diamond Member
Sep 10, 2005
7,458
1
76
CycloWizard said:
"It's often assumed that everything follows a normal distribution because it's simply too time consuming/expensive to collect huge numbers of data points to map other distributions. Since the preponderance of evidence supports the basic assumption that most things are normally distributed, we use that as a rule of thumb until someone demonstrates otherwise. The Central Limit Theorem is basically just a way for us to estimate the parameters (mean and standard deviation/variance) of a normal distribution with a small number of samples, since it's obviously impossible to ever collect the infinitely many data points to exactly map the full distribution space."

This. And Gaussian functions are very friendly/easy to use.
 

esun

Platinum Member
Nov 12, 2001
2,214
0
0
The central limit theorem is the primary answer, as you suggested. Another reason is that for jointly Gaussian random variables, uncorrelated implies independent. That makes calculations immensely easier in many contexts.
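
One caveat worth seeing numerically: the "uncorrelated implies independent" property needs the variables to be jointly Gaussian, not just individually Gaussian. A minimal sketch of the standard counterexample, where Y is X with a random sign flip (the names and sample size are illustrative assumptions): both X and Y are standard normal and uncorrelated, yet |Y| always equals |X|.

```python
import math
import random
import statistics


def sample_pairs(n=50_000, seed=1):
    """Return (x, y) pairs with x ~ N(0, 1) and y = s * x, s = +/-1 at random."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        sign = rng.choice((-1.0, 1.0))
        pairs.append((x, sign * x))
    return pairs


def correlation(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    pairs = sample_pairs()
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    print("corr(X, Y)     =", correlation(xs, ys))
    # The magnitudes are identical, so they are perfectly correlated.
    print("corr(|X|, |Y|) =", correlation([abs(x) for x in xs],
                                          [abs(y) for y in ys]))
```

The first correlation should come out near zero while the second is essentially one, so X and Y are uncorrelated but clearly not independent.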
 

bwanaaa

Senior member
Dec 26, 2002
739
1
81
And it is curious how the Cauchy distribution looks so much like the normal distribution, yet because of its heavier tails its mean and variance are undefined. And apparently, this is why the stock market crash was not predicted. Mandelbrot argued that market price changes follow heavy-tailed stable distributions (the family the Cauchy belongs to), but various models (Black-Scholes) assume normally distributed log returns. (And they won the Nobel for this mistake!)

The wonks of Wall Street have since blamed physicists and rocket scientists for this tragedy, even though it is they, the propaganda ministers, who made billions using the model while it worked.

This is another example of the misuse of the fruits of mathematicians' and physicists' labors.
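
The point about the Cauchy distribution's undefined mean can be checked with a small simulation. This is only a sketch under stated assumptions: standard Cauchy draws are generated by the inverse-CDF formula tan(pi * (u - 0.5)), the spread of batch means is measured by the interquartile range, and the helper names and batch counts are invented for illustration.

```python
import math
import random
import statistics


def normal_draw(rng):
    """One draw from the standard normal distribution."""
    return rng.gauss(0.0, 1.0)


def cauchy_draw(rng):
    """One draw from the standard Cauchy distribution via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def iqr_of_batch_means(draw, batch_size, batches=400, seed=2):
    """Interquartile range of the means of `batches` independent batches."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([draw(rng) for _ in range(batch_size)])
        for _ in range(batches)
    ]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1


if __name__ == "__main__":
    for size in (10, 100, 1000):
        print(size,
              "normal:", iqr_of_batch_means(normal_draw, size),
              "cauchy:", iqr_of_batch_means(cauchy_draw, size))
```

For the normal draws the spread of the batch means should shrink as the batch size grows; for the Cauchy draws it should not, because the average of standard Cauchy variables is itself standard Cauchy.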
 

C1

Platinum Member
Feb 21, 2008
2,394
114
106
CycloWizard said:
"The Central Limit Theorem is basically just a way for us to estimate the parameters (mean and standard deviation/variance) of a normal distribution with a small number of samples, since it's obviously impossible to ever collect the infinitely many data points to exactly map the full distribution space."

The CLT also has a significant technical & philosophical implication: the sampling distribution of the mean of RVs derived from a mix of processes, NO MATTER WHAT THEIR DISTRIBUTIONS (as long as their variances are finite), approaches the Gaussian (normal). Try it yourself & see!
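
One way to "try it yourself", as a hedged sketch: draw each value from a deliberately non-normal mix of three processes (uniform, exponential, and a 0/1 coin, all chosen arbitrarily here), average many of them, and check how much of the resulting sampling distribution falls within one and two standard deviations of its center. For a normal distribution those fractions are about 0.683 and 0.954. All names and parameters below are illustrative assumptions.

```python
import random
import statistics


def draw_mixed(rng):
    """One draw from an arbitrary mix of three non-normal processes."""
    kind = rng.randrange(3)
    if kind == 0:
        return rng.uniform(0.0, 1.0)      # flat
    if kind == 1:
        return rng.expovariate(1.0)       # skewed, long right tail
    return float(rng.random() < 0.2)      # 0/1 with P(1) = 0.2


def sample_means(sample_size, experiments=5000, seed=3):
    """Means of `experiments` independent samples of size `sample_size`."""
    rng = random.Random(seed)
    return [
        statistics.fmean([draw_mixed(rng) for _ in range(sample_size)])
        for _ in range(experiments)
    ]


def fraction_within(values, k):
    """Fraction of values within k standard deviations of their mean."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(abs(v - mu) <= k * sd for v in values) / len(values)


if __name__ == "__main__":
    for size in (1, 5, 200):
        means = sample_means(size)
        print(size, fraction_within(means, 1), fraction_within(means, 2))
```

With a sample size of 1 the fractions reflect the lumpy mixture itself; as the sample size grows they should drift toward the normal values.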
 