# DAMN you OCZ

Discussion in 'Memory and Storage' started by borisvodofsky, Jun 25, 2012.

Actually, by the statistics, to get a 95% confidence level with a +/-10% margin of error for a population of 7 million people, you need a sample size of 96.

It goes up sharply if you need 95% confidence at +/-5%; for that you'd need a sample size of 384.

All this does not account for heavily skewed proportions. If the rate being measured is around the 10% level, you only need 139 to reach 95% confidence at +/-5%. At a 5% rate, 73.
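For anyone who wants to check those figures, here is a minimal sketch. It assumes the usual normal-approximation formula n = z^2 * p * (1 - p) / E^2 with z = 1.96 for 95% confidence, and a population large enough (7 million qualifies) that no finite-population correction is needed. The function name `sample_size` is just made up for illustration.

```python
def sample_size(p, margin, z=1.96):
    """Approximate sample size needed to estimate a proportion p
    to within +/- margin, at the confidence level implied by z.
    Assumption: normal approximation, very large population."""
    return z ** 2 * p * (1 - p) / margin ** 2

# (expected proportion, margin of error) pairs from the post above.
# p = 0.5 is the worst case, used when nothing is known about the rate.
cases = [(0.50, 0.10), (0.50, 0.05), (0.10, 0.05), (0.05, 0.05)]

for p, margin in cases:
    n = sample_size(p, margin)
    print(f"p={p:.2f}, margin=+/-{margin:.2f}: n = {n:.2f}")
```

The raw values come out at roughly 96.04, 384.16, 138.30 and 72.99, which is where the 96, 384, 139 and 73 quoted above come from after rounding.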

There are theorems that are proved in probability and statistics for just the sort of thing that we are discussing. But to understand the proofs, you need to understand the subject matter. Some people can pick it up just by reading a textbook or Wikipedia article. But based on your attitude, I guess you would need to take a formal course in order to understand it.

Suffice it to say that it can be proved that, given a sample of size N from a population in which each part either fails or does not (so the number of failures follows a binomial distribution), the failure rate of the sample has probability X of lying within a certain interval [a, b] around the population failure rate. You choose X yourself depending on what level of confidence you desire (90%, 95%, 99%, etc.) and compute [a, b] from it. You can look it up under "sampling distribution of the mean" and "binomial distribution" for the details.

No need for the peanut gallery to chime in, especially with hugely useless posts.

I'm a network engineer, not a mathematician. That said, let's see where and from whom they got this sample, how large it was, and what the overall group size is.

I looked around for a suitably simple tutorial on sample estimation and confidence intervals so that those who are not familiar with the subject could understand it. I found a pretty good tutorial aimed at high school AP students:

http://stattrek.com/estimation/confidence-interval-proportion.aspx?tutorial=ap

Hopefully a "network engineer" should have enough math and logic background to follow it.

Now to take a specific example, let's look at the return rates for a given manufacturer, and say the sample size is 500 (the minimum allowed for inclusion in the BeHardware study). Also assume that the total population of parts produced by the manufacturer is much more than 500 (at least 10 times more) and that the sample had 10 returns, for a return rate of 10/500 = 2%.

Then just follow the recipe given in the link above. We assume it is a random sample (there is no reason to believe that the manufacturer cherry-picked parts to send to this etailer), and there were at least 10 (or 5) failures in it. The population size is much larger than 500, so we can use the approximate formula for the standard error:

SE = sqrt( p * q / n ) = sqrt( 0.02 * 0.98 / 500 ) = 0.00626

It is convenient to go for a 95% confidence interval (since margin of error is equal to about 2 SEs, or more precisely 1.96 SE)

Then ME = 1.96 * 0.00626 = 0.0123

So the 95% confidence interval is 0.02 +/- 0.0123
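The same recipe as a short script, for anyone who wants to plug in other numbers. This is only a sketch of the normal-approximation interval used above; the function name `proportion_ci` is my own invention, and the 10 returns out of 500 are the hypothetical example from this post, not real data.

```python
import math

def proportion_ci(failures, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.
    Assumes a random sample drawn from a much larger population."""
    p = failures / n                   # sample return rate
    se = math.sqrt(p * (1 - p) / n)    # standard error
    me = z * se                        # margin of error
    return p, se, me, (p - me, p + me)

p, se, me, (low, high) = proportion_ci(10, 500)
print(f"p = {p:.4f}, SE = {se:.5f}, ME = {me:.4f}")
print(f"95% CI: {low:.4f} to {high:.4f}")
```

In percentage terms the interval runs from roughly 0.8% to 3.2%.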

Not an especially narrow interval, but that is why the BeHardware study chose 500 as the minimum sample size. Larger samples will have narrower intervals. Also, larger return rates will have intervals with widths proportionally smaller compared to the return rate.
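Both of those claims are easy to see numerically. A rough sketch, using the same standard-error formula as above; the sample sizes and return rates below are arbitrary values picked for illustration, and `margin_of_error` is a made-up helper name.

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a proportion (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# Larger samples give narrower intervals (return rate fixed at 2%).
for n in (500, 2000, 8000):
    print(f"n={n}: ME = {margin_of_error(0.02, n):.4f}")

# Larger return rates give intervals that are narrower relative
# to the rate itself (sample size fixed at 500).
for p in (0.01, 0.02, 0.05):
    relative = margin_of_error(p, 500) / p
    print(f"p={p:.2f}: ME / p = {relative:.2f}")
```

Quadrupling the sample size halves the margin of error, since the margin shrinks with the square root of n.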

What the math is saying, though, is that it matters not where they got the sample or how large the overall group is, because if the variation and trend are small, the confidence will be high.

When the trend is such a low number (0-15%), the required sample size is relatively small as well.

Summing up.

100+ samples
1 product
<15% trend
equals
95+-5% confidence

Can we NOT get into a semantic argument over whether statistics is mathematics or applied mathematics? :thumbsdown:

Very true, I know this chick who's a statistician, she's HORRIBLE at math. And she wanted to be an actuary, HAHAHAHAHAHA, maybe if she grinded 9-10 combinatorics books.

This isn't a semantic argument, statistics is very very very very different from real math.

railgun, give up bro, the stat guy won this one.

Can we NOT get into an argument about the meaning of "semantic"? :thumbsdown:

