Quick statistical question - number of data points relation to error

dullard

Elite Member
May 21, 2001
24,998
3,326
126
Suppose I have 10,000 data points and this results in a total error of +- 15 (units are not necessary for this question, but if you really want to know it is microvolts). Now suppose instead I use 100 data points. What is the error I should expect?

It has been ~ 6 years since my statistics class and I think the formula is that error is proportional to n^(-0.5). If that is true then the error with 100 data points would be 10 times greater than with 10,000 data points, giving +- 150.
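For anyone who wants to check the arithmetic, here is a minimal sketch of that scaling. It assumes the error really is proportional to n^(-0.5), which is exactly the point in question. The function name `scaled_error` is made up for illustration.

```python
import math

def scaled_error(known_error, known_n, new_n):
    """Scale an averaging error from known_n samples to new_n samples,
    ASSUMING the error is proportional to n ** -0.5."""
    return known_error * math.sqrt(known_n / new_n)

# +- 15 at 10,000 points, rescaled to 100 points
error_at_100 = scaled_error(15.0, 10_000, 100)
print(error_at_100)  # 150.0
```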

Can anyone confirm this? Is my memory correct or am I way off?

Yes I can google. I'm busy and don't have time to read through a bunch of websites.
 
Jan 18, 2001
14,465
1
0
Hmmm... are you using the term 'total error' yourself, or is that from whatever you are looking at?

I recall two types of statistical error.

Standard Deviation is the amount of variance within a sample.

In both cases, increasing the sample size will decrease the measure of variance.

On a more practical level, why do you care? What exactly do you want to do?
 

dullard

Elite Member
May 21, 2001
24,998
3,326
126
OK, now to my poor vocab skills: scratch the words 'total error' if that has a different meaning to you and replace it with the word 'accuracy'. I am reading a thermocouple with a PCI computer card. The PCI card has a claimed error of 0.015% +- 15 uV, based on a sample size of 10,000 samples. For practical reasons I cannot read the thermocouple voltage 10,000 times. Instead I want to read it ~100 to 200 times. How confident can I be in the average of 100 readings?

Some thermocouples are about 60 uV/°C - meaning an error of 15 uV is 0.25°C. Other thermocouples have a reading of about 7.5 uV/°C - meaning an error of 15 uV is 2°C. I need to know exactly what the expected temperature error is, as 2°C is an unacceptable error.

If my formula is correct, then moving from 10,000 samples to 100 samples will increase the voltage reading inaccuracy by a factor of 10, to +- 150 uV. That is 2.5°C even with the most sensitive thermocouples, which is past the 2°C error.
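To put numbers on it, here is a rough sketch that converts a voltage error into a temperature error for the two sensitivities mentioned above. It assumes the whole +- 15 uV behaves like random noise that scales with 1/sqrt(n), and it ignores the 0.015% term. Both are assumptions, not something the card's spec confirms. The helper name `temperature_error` is made up.

```python
def temperature_error(voltage_error_uv, sensitivity_uv_per_c):
    """Convert a voltage error (uV) to a temperature error (deg C)."""
    return voltage_error_uv / sensitivity_uv_per_c

# Sensitivities quoted in the post, in uV per deg C
sensitivities = {"sensitive (60 uV/C)": 60.0, "insensitive (7.5 uV/C)": 7.5}

# 15 uV is the claimed error at 10,000 samples; 150 uV is the ASSUMED
# error at 100 samples if the 1/sqrt(n) scaling holds.
voltage_errors = {10_000: 15.0, 100: 150.0}

results = {}
for label, sens in sensitivities.items():
    for n, err_uv in voltage_errors.items():
        results[(label, n)] = temperature_error(err_uv, sens)
        print(f"{label}, n={n}: +- {results[(label, n)]:.2f} C")
```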
 

Bekker

Golden Member
Sep 6, 2000
1,330
0
0
If I recall my stats correctly, random sampling error is measured by Z * S / sqrt(n), where Z is the value associated with the confidence level you desire and S is the standard deviation of the variable estimated. If so, leaving the other terms constant in the equation, a sample of 10,000 would result in S / sqrt(10,000) = S / 100, while a sample of 100 would result in S / sqrt(100) = S / 10. This might or might not have a substantial influence on your error, because if S is very small the result will be very small in either case. Usually we decide the error we are willing to tolerate and then compute the required sample size. Have no idea if this helped at all. Good luck.

Bekker
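The formula above, and the "pick the error, then compute the sample size" step, can be sketched roughly like this. The standard deviation `s_hypothetical` is a made-up number purely for illustration (the thread never gives S), and the function names are hypothetical too. It assumes independent, roughly normal readings.

```python
import math
from statistics import NormalDist

def z_value(confidence):
    """Two-sided Z for a confidence level, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)

def margin_of_error(z, s, n):
    """Random sampling error: Z * S / sqrt(n)."""
    return z * s / math.sqrt(n)

def required_sample_size(z, s, target_error):
    """Smallest n with Z * S / sqrt(n) <= target_error."""
    return math.ceil((z * s / target_error) ** 2)

z = z_value(0.95)
s_hypothetical = 100.0  # uV, ASSUMED value, not from the thread

me_100 = margin_of_error(z, s_hypothetical, 100)
me_10000 = margin_of_error(z, s_hypothetical, 10_000)
print(me_100 / me_10000)  # ratio of the two margins of error

n_needed = required_sample_size(z, s_hypothetical, 15.0)
print(n_needed)
```

Whatever S actually is, the ratio between the two margins depends only on the sample sizes, which is why the factor of 10 comes out regardless.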
 

Reel

Diamond Member
Jul 14, 2001
4,484
0
76
I think in pure statistical methods, the distribution matters. A rough estimate that tends to work is just take the square root of the number of samples. I think it is only accurate for a Poisson distribution but close for others. I don't know if I am fully accurate because my statistical skills have always been a little shaky.
 

Lynx516

Senior member
Apr 20, 2003
272
0
0
Random noise is always normally distributed. I could dig up the info you need from my DSP book but I cannot find it at the mo
 

dullard

Elite Member
May 21, 2001
24,998
3,326
126
Originally posted by: Lynx516
Random noise is always normally distributed. I could dig up the info you need from my DSP book but I cannot find it at the mo
So is my formula correct for a normal distribution?
 

Bekker

Golden Member
Sep 6, 2000
1,330
0
0
If you look at the equation I gave you above, it would seem that your comment that the error would be 10 times as great is correct.
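RSe = Z * S / sqrt(n); therefore, for any given Z level with S constant, the error scales with 1/sqrt(n). The sq rt of 10,000 is 100 and the square root of 100 is 10, so the error with 100 samples is 100/10 = 10 times the error with 10,000 samples.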
 

TuxDave

Lifer
Oct 8, 2002
10,572
3
71
When you are doing your measurements, are you taking the rms of your errors? Or are you taking the upper and lower bounds of your error?
 
Aug 14, 2003
44
0
0
Resurrected thread. :)

By 'total error' I'll assume you mean 'margin of error'. (?)

So it must be known, or we have good reason to assume, that the microvolt readings are approximately normally distributed?? I have no idea about microvolts. :)

The general expression for upper and lower bounds on an estimate of a 'true' average, of something that is normally distributed, is: sample ave +- z*SE, where SE is the standard error, and z is from a standard normal distribution and is a function of how 'confident' you want to be. In this case, SE = s/sqrt(n), where s is the standard deviation and n is the sample size.

Assuming z and s are fixed (so z*s = k, just some constant), which might not be realistic, you first have:

ME = zs/sqrt(10000) = k/sqrt(10000) = 15, so k = 1500

and then taking another sample, you get that

ME = k/sqrt(100) = 1500/10 = 150.

So it does look like the margin of error is inversely proportional to the sqrt(n) as some have mentioned. :)
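If anyone wants to see this empirically rather than trust the algebra, here is a small simulation sketch. It assumes independent, normally distributed noise with a fixed standard deviation (`sigma` below is an arbitrary illustrative value, not a measured one), which may or may not match the real card.

```python
import random
import statistics

def spread_of_mean(n, trials, sigma, rng):
    """Standard deviation of the sample mean, estimated over many trials
    of n simulated normal readings each."""
    means = []
    for _ in range(trials):
        readings = [rng.gauss(0.0, sigma) for _ in range(n)]
        means.append(statistics.fmean(readings))
    return statistics.stdev(means)

rng = random.Random(12345)  # fixed seed so reruns give the same numbers
sigma = 100.0               # ASSUMED noise level, arbitrary units
trials = 200

spread_100 = spread_of_mean(100, trials, sigma, rng)
spread_10000 = spread_of_mean(10_000, trials, sigma, rng)

# Theory says this ratio should be near sqrt(10000 / 100) = 10
ratio = spread_100 / spread_10000
print(f"ratio of spreads: {ratio:.2f}")
```

With only 200 trials the ratio will wobble around 10 rather than hit it exactly; more trials tighten it at the cost of run time.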