Figuring Out the Probability of A Number Sequence

clamum

Lifer
Feb 13, 2003
26,255
403
126
So it's been about 20 years since I had stats class, but I was asked by someone if I could possibly help with figuring out the probability of a certain scenario.

There are 7 columns, and each column can hold a number from 0 to 7,550.

So for example, the numbers might be: 255 - 35 - 1 - 0 - 0 - 0 - 0

What I'm trying to do is figure out the probability of two lines having the same numbers.

So I think this would be P(total) = P(line1) * P(line2)

And P(line1) or P(line2) would be the probability of a unique line, say that example one above. Since each column can have number 0 - 7,550, and there's 7 columns that have numbers, I think it would be P(line) = 7551 ^ 7 = 1.39968817e27

So that's the probability of a single unique line. So for the probability of two lines having the same numbers, it would be: P(line1) * P(line2) = 1.39968817e27 * 1.39968817e27 ≈ 1.9591e54

Does that seem right? Don't have my old stats book to check if I'm modeling this right haha, and I dunno if I used the 'e' function right on the calculator.
 

sactoking

Diamond Member
Sep 24, 2007
7,516
2,716
136
You don't need to do the last step. If you're checking that line 2 mirrors line 1 then the probability of getting any specific number combination on line 1 is irrelevant. For your purposes line 1 being 0-0-0-0-0-0-0 has the same relevance as it being 7550-7550-7550-7550-7550-7550-7550 or anything else in between.

It's the same as asking what the chances are that two dice rolled independently show the same number. The answer is 1 in 6. The first die roll just sets the target number for the second roll.
 

clamum

So you're saying it's just 1/(7551 ^ 7)? Haha I forgot the "1/" in the OP.

I was about to reply with "that doesn't seem right" cause I thought it would've meant the probability of having one particular line was equal to having 100 of that same line... But yeah, that's the case isn't it?

Thanks for the replies!
 

sactoking

Yes. If your scenario is "what are the odds that two lines are XXXX?" then your original calculation is correct because YOU set the target combination and now have to hit it twice. But in the scenario you described the first line sets the target so you only need to hit it once more.
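To make the difference between the two scenarios concrete, here is a minimal sketch that enumerates every outcome of two dice. The `TARGET` value is an arbitrary number chosen for illustration, not something from the original question.

```python
from fractions import Fraction
from itertools import product

SIDES = 6
# All 36 ordered (first roll, second roll) pairs, each equally likely.
rolls = list(product(range(1, SIDES + 1), repeat=2))


def probability(event):
    """Exact probability of an event over the 36 equally likely rolls."""
    hits = sum(1 for roll in rolls if event(roll))
    return Fraction(hits, len(rolls))


# Scenario A: the first die sets the target, the second has to match it.
p_match = probability(lambda r: r[0] == r[1])

# Scenario B: YOU pick the target up front and both dice have to hit it.
TARGET = 4  # arbitrary illustrative choice
p_both_hit_target = probability(lambda r: r[0] == TARGET and r[1] == TARGET)

print(p_match)            # 1/6
print(p_both_hit_target)  # 1/36
```

The same reasoning carries over to the 7-column lines, just with 7551^7 possible lines in place of 6 faces.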
 

Paperdoc

Platinum Member
Aug 17, 2006
2,298
273
126
Look at it this way. Posters above are correct to say that whatever the FIRST line is does not come into the calcs - it is just the target. So examine the probs on the SECOND line.

There are 7551 possible numbers for the first column, but only ONE of these matches what is above it. So prob of matching in first column is 1/7551. In fact, prob of the one in SECOND column matching the one above it also is 1/7551. NOTE that any of those 7551 numbers can be in any cell - numbers CAN be re-used in the problem as stated. So prob that the numbers match in BOTH first and second columns is (1/7551)^2. Third through seventh columns is the same, so you DO get your answer: (1/7551)^7, which is the same as 1 / (7551^7).
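The column-by-column argument can be checked with exact arithmetic. This is just a sketch, assuming every cell is independent and uniformly distributed over 0 through 7,550; the constant names are made up for the example.

```python
from fractions import Fraction

VALUES_PER_COLUMN = 7551  # 0 through 7,550 inclusive
COLUMNS = 7

# Chance that one cell on the second line matches the cell above it.
p_column = Fraction(1, VALUES_PER_COLUMN)

# Independent columns multiply, so the whole line matches with (1/7551)^7.
p_line = p_column ** COLUMNS

print(p_line.denominator)  # the exact integer 7551^7
print(float(p_line))       # roughly 7.14e-28
```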
 

clamum

Thanks guys, appreciate the help.

So 1/(7551^7) is the probability of two lines having a particular series of numbers in the 7 columns... that means that it's also the probability of, say, 55 lines all having some particular series of numbers too? That's kinda weird to grasp but I do seem to remember that concept from stats and thought it was strange at first back then too.
 

BurnItDwn

Lifer
Oct 10, 1999
26,072
1,553
126
Did you confirm that all 7 columns are random numbers?
If this is for real data, they likely will not be random.
 

clamum

Yeah the numbers are theoretically random. I mean if they were seeded by a simple algorithm they're likely not truly random but it was assumed they were random.
 

sactoking

The probability of all 55 lines having the same 7 numbers in order (sequence) would be 1/(7551^378).

The first line sets the target sequence. From that point forward each cell of the remaining 54 lines can be viewed as an independent trial to hit the corresponding target number.

EXAMPLE: Line 1 sets the target sequence as 1-2-3-4-5-6-7. Line 2 cell 1 has a 1/7551 chance of hitting the target number (1). Line 2 cell 2 has a 1/7551 chance of hitting the target number (2). The cumulative chance of hitting the targets in Line 2 cells 1 and 2 is 1/7551 * 1/7551, or 1/(7551^2). Line 3 cell 1 has a 1/7551 chance of hitting the target number (1). The cumulative chance of hitting the targets in all three cells is 1/(7551^3).

Because each cell is independent of the others the probability is one in (number of possible outcomes ^ number of trials).
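As a rough sketch of that rule, the helper below (a hypothetical function, not anything from a library) counts the independent trials as (lines - 1) * columns and returns the exact probability. It assumes uniform, independent cells, as in the rest of the thread.

```python
from fractions import Fraction


def all_lines_match(lines, columns=7, values=7551):
    """Probability that every line equals the first one.

    Assumes each cell is independent and uniform over `values` outcomes.
    The first line only sets the target, so it contributes no trials.
    """
    trials = (lines - 1) * columns
    return Fraction(1, values ** trials)


print(all_lines_match(2) == Fraction(1, 7551 ** 7))     # True
print(all_lines_match(55) == Fraction(1, 7551 ** 378))  # True
```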
 

Paperdoc

For your "55 line" query I assume you mean ALL 55 lines will be the same in every column.
For TWO lines the probability was 1/7551^7. For a THIRD line ALSO to match the first the probability likewise is 1/7551^7. Thus the probability that BOTH the 2nd and 3rd lines will match the first is 1/7551^7 * 1/7551^7, or 1/7551^(7*2) = 1/7551^14. For N lines to ALL be the same, it will be 1/7551^(7*(N-1)). Plug 55 in there and you get 1/7551^378. I doubt there are 7551^378 (roughly 10^1466) grains of sand in the universe, but I have not tried to look that one up on the internet. And there certainly are NOT that many bits in any computer RAM, so don't try to calculate it in Excel!