- Jun 13, 2004
I wasn't sure where else to ask this, although I admit this is probably an odd place for it.
I haven't used any statistics for about 10 years. I want to run a simple analysis for a project, but I'm unsure where to start. I have about 100,000 data points and would like to use a sample of the data to extrapolate to the entire group. As I remember it, I first need to work out how large a sample I need in order to make a good assessment. The data is in an alphabetically sorted list. Would it be statistically sound to take every Z-th record from the list, with Z determined by how many records I need?
Z = X / N, where X is the total number of data points and N is the sample size required
I'll try to give a general idea of the types of data. The rows below are not the exact data being used, but they are representative of what is in the database. I just want to make some simple assertions, such as 30% of the people are in Math, 22% are rank A, or 75% of the people are active.
Code:
   a      b  c        d         e
1  North  A  English  Active    Boat A
2  South  B  Science  Active    Boat B
3  South  C  Math     Inactive  Boat C
4  North  A  Math     Active    Boat A
5  North  D  English  Inactive  Boat C
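To make the every-Z-th idea concrete, here is a minimal Python sketch of what I have in mind. It assumes the records are already loaded as a list of dicts. The column names (`region`, `rank`, `subject`, `status`, `boat`) and the function names are placeholders I made up, not the real schema. Taking every Z-th row from a sorted list only behaves like a random sample if the sort order is unrelated to the fields being measured, so the sketch starts from a random offset rather than always from the first row.

```python
import random
from collections import Counter

def systematic_sample(records, n, seed=None):
    """Take roughly n records by stepping through the list every Z-th row."""
    if n <= 0 or not records:
        return []
    step = max(1, len(records) // n)  # Z = X / N, rounded down
    rng = random.Random(seed)
    start = rng.randrange(step)       # random offset within the first interval
    return records[start::step]

def proportions(sample, field):
    """Share of the sample taking each value of `field`."""
    counts = Counter(row[field] for row in sample)
    total = len(sample)
    return {value: count / total for value, count in counts.items()}

# Made-up rows mirroring the layout above (hypothetical column names).
rows = [
    {"region": "North", "rank": "A", "subject": "English", "status": "Active", "boat": "Boat A"},
    {"region": "South", "rank": "B", "subject": "Science", "status": "Active", "boat": "Boat B"},
    {"region": "South", "rank": "C", "subject": "Math", "status": "Inactive", "boat": "Boat C"},
    {"region": "North", "rank": "A", "subject": "Math", "status": "Active", "boat": "Boat A"},
    {"region": "North", "rank": "D", "subject": "English", "status": "Inactive", "boat": "Boat C"},
]

sample = systematic_sample(rows, n=5, seed=0)
print(proportions(sample, "subject"))
```

Because the step is rounded down, the sample can come out slightly larger than N when X is not an exact multiple of N.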
If someone could point me in the right direction on some basic theory or tests, I would be very appreciative. I looked on Wikipedia, but the tests I found there seemed considerably more complex than what I think is a very simple problem calls for. I would consider myself fairly decent at mathematics. For background, I have taken courses from stats and trig through Calc II, Calc III, and Differential Equations, so I won't be totally clueless if you point me somewhere.
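The closest thing I can recall is the textbook sample-size formula for estimating a proportion, n0 = z^2 * p * (1 - p) / e^2, with a finite population correction for a list of known size. Below is a rough Python sketch of it, not a claim that this is the right test for my case. It assumes a simple random sample and the normal approximation, and it defaults to the worst case p = 0.5. The function name `sample_size` and its parameters are my own placeholders.

```python
import math
from statistics import NormalDist

def sample_size(population, margin=0.05, confidence=0.95, p=0.5):
    """Approximate sample size for estimating a proportion.

    Assumes simple random sampling and the normal approximation;
    p = 0.5 is the most conservative guess for the true proportion.
    """
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2   # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)      # finite population correction
    return math.ceil(n)

population = 100_000
n = sample_size(population)   # +/- 5 points at 95% confidence
step = population // n        # the Z from the formula above
print(n, step)
```

Tightening the margin is what drives the sample size up: halving `margin` roughly quadruples n0, while the population size matters very little once it is in the tens of thousands.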
I'm sorry for the long rant, but thanks for your time!