2 puzzles for you

Daverino · Oct 19, 2012

I can get puzzle one down from 7.88861E-31 to 5.14164E-24

Something tells me that's not the best case

Cerb · Oct 19, 2012

Well, isn't that some WTFery. I got it...but not by careful mathematical reasoning, or anything. I had been doing basic partitioning, pretty much, based just on the first locker's, or set of lockers', contents. I've got 31% at 3M trials, and counting, now.

Now that I got that, then Googled it, I'm left like

. I don't get it. Somehow, it partitions the range of choices, but I don't really get how.

Mr. Pedantic · Oct 19, 2012

Holy shit.

Googled it, and I have to say, I would literally have never figured that out in a million years. I don't understand how the optimal strategy is any better than picking lockers at random.

iCyborg · Oct 19, 2012

eLiu said:
Yeah in an interview, I don't think anyone expects you to get as far as proving the strategy is optimal. But for the level that these interviews are testing for, being able to find the solution (typically at least with a couple hints) and analyze the result (i.e., show what the probability of success is--which just requires some discrete math/probability) is reasonable to expect. I saw this problem in a software dev (new grad) interview in SF/south bay area (like Google, Dropbox, FB, etc. but not actually one of those 3). I mean if companies are offering 120k+ salary to new grads, the interviewees better be smart and so there will usually be some tougher questions to gauge that.

Personally I'm not sure if I like this problem as an interview problem. Most people I've shown this problem to end up having more of an "aha!" moment. There are definitely reasonable mathematical/algorithmic approaches to apply & reach the solution, but I find it difficult to give hints that aren't too strong while still being helpful.

Sometimes it seems to me that these types of puzzles are more about testing someone's background than problem solving skills. Some of you might know about this: IBM runs a monthly puzzle challenge (google 'IBM ponder this') where every month they post a puzzle/problem and people send solutions. Some of the puzzles seem technical or just not appealing, so I don't bother every month, but many are quite amusing. And I also feel that often it depends on 'background knowledge'. Two cases to illustrate:
1) Last month was a nice problem, basically a 2-player game. I was trying to formulate a hypothesis based on a couple examples and prove by induction, but it didn't work (the hypothesis was wrong, it turns out there's only a conjecture for the sequence, so induction has very slim to no chance of success). Ultimately, I couldn't solve the problem. Looking at the solution, they noticed that this is a game theory problem whose solution is described by Sprague-Grundy theorem. If you know this theorem, it's not hard to solve the problem. If you don't, unless you're a Sheldon Cooper type of a guy, it's gonna be very tough to solve it from scratch.
2) In June 2012 they also had a nice problem, and I recognized it as an example of Nim game. Nim game has a neat solution, you can check on wiki, and it's easy to apply it to IBM's problem. But if you look at how algorithm for solving Nim goes, you'll see that it's not something that one would come up with easily. If I had not known about Nim, I wouldn't have been able to solve it.

This brings me to the problem 2). I still don't see how to move from 1 as the worst case, and I tried a couple approaches, but they don't work for all inputs. Nor can I see how XOR would help. But, related to the above digression, I can't help to feel that someone well versed in Shannon-type of information theory and theorems would be much better equipped to tackle this.

iCyborg · Oct 19, 2012

Mr. Pedantic said:
Holy shit.

Googled it, and I have to say, I would literally have never figured that out in a million years. I don't understand how the optimal strategy is any better than picking lockers at random.

I know. And problem 2) is supposedly harder

Mr. Pedantic · Oct 19, 2012

iCyborg said:
Sometimes it seems to me that these types of puzzles are more about testing someone's background than problem solving skills. Some of you might know about this: IBM runs a monthly puzzle challenge (google 'IBM ponder this') where every month they post a puzzle/problem and people send solutions. Some of the puzzles seem technical or just not appealing, so I don't bother every month, but many are quite amusing. And I also feel that often it depends on 'background knowledge'. Two cases to illustrate:
1) Last month was a nice problem, basically a 2-player game. I was trying to formulate a hypothesis based on a couple examples and prove by induction, but it didn't work (the hypothesis was wrong, it turns out there's only a conjecture for the sequence, so induction has very slim to no chance of success). Ultimately, I couldn't solve the problem. Looking at the solution, they noticed that this is a game theory problem whose solution is described by Sprague-Grundy theorem. If you know this theorem, it's not hard to solve the problem. If you don't, unless you're a Sheldon Cooper type of a guy, it's gonna be very tough to solve it from scratch.
2) In June 2012 they also had a nice problem, and I recognized it as an example of Nim game. Nim game has a neat solution, you can check on wiki, and it's easy to apply it to IBM's problem. But if you look at how algorithm for solving Nim goes, you'll see that it's not something that one would come up with easily. If I had not known about Nim, I wouldn't have been able to solve it.

This brings me to the problem 2). I still don't see how to move from 1 as the worst case, and I tried a couple approaches, but they don't work for all inputs. Nor can I see how XOR would help. But, related to the above digression, I can't help to feel that someone well versed in Shannon-type of information theory and theorems would be much better equipped to tackle this.

It's also a test of attitude - do you fold and give up easily, or do you keep going at it, even if you're completely wrong? On a more oblique slant, it's also a test of exposure - maybe the assumption is that their ideal employee would seek out this kind of problem as a challenge, so if you are motivated, passionate, and inspired by this kind of work, you should have seen this kind of problem before.

Cerb · Oct 19, 2012

Mr. Pedantic said:
Holy shit.

Googled it, and I have to say, I would literally have never figured that out in a million years. I don't understand how the optimal strategy is any better than picking lockers at random.

I would have never figured it out either, and was going for 0.0000000x or so, so that it could easily be shown by experimentation (that would still be an amazing improvement v. random). I still haven't figured it out. I just thought of it as something different to try, and the code is pretty short and easy.

eLiu · Oct 19, 2012

Pertaining to the value of puzzle 1 as an interview problem...
Yes, most tech interview questions assume you have a certain level of background. I mean I was asked things like "determine if a binary tree is a BST" or "give a dynamic data structure that can return the median in O(1) time" or "find whether a given string of characters (no spaces) can be partitioned into sub-words given a dictionary: e.g., TheHappyCow can be split, Shortest cannot. Note you cannot use a greedy strategy: unlocked cannot be split into 'unlock' and 'ed' ". All of these assume things you'd see in an algorithms class or after reading a book like CLRS (Intro to Algos). Going into a tech interview and saying you don't know what a BST is or what a hash table is will usually reflect poorly on you (but not always--if the interviewer can give you the definition & you run with it immediately).

But none of them are like direct applications of just 1 data structure or algorithm. They usually require some reduction first and then the application of 1 or 2 relevant ideas. These questions are typically aimed at trying to explore how you think. Only rockstar candidates will sail through every problem without having to think; most people will get hung up. And then you back and forth with the interviewer, describing your thought processes & ideas. The interviewer will offer some commentary (e.g., counter-examples) and help guide you along. You can ask for hints/help if you get really stuck. Solving the problem isn't always critical; they really want to see you struggle with a bite-sized problem and how your problem solving process works.

I'll explain problem 1 a bit more in a spoilers... maybe it'll be helpful to someone. But the first problem is really meant for someone with a strong CS background (at least an advanced undergrad algorithms course)

Hint for those who don't want a solution:

In two solutions discussed so far (each opens a pre-determined list of lockers; each chooses 50 random lockers), the prisoners are never looking at the numbers in the lockers that they open. Is there some additional information that we've been ignoring? Why or why not?

And

Can this problem be represented with a graph?

SOLUTION:

First, this problem is probably nearly impossible if you've never seen graphs before. So I'm going to assume that you have... read wiki otherwise.

Quick summary of the solution: each prisoner begins by opening the locker corresponding to his number. So prisoner 2 opens the 2nd locker. Then he looks at the number inside, and opens that locker. So if prisoner 2 opens locker 2 & sees 50, he then opens locker 50. Repeat until he finds the number 2 or he dies. Every prisoner does this. That's the strategy; now I'll try to build some intuition for it.

Ok, let's number the lockers from 1 to 100 (say left-to-right). Inside each locker is a number, 1-100. Let each locker be a node in a graph. Suppose locker i has a number j in it. Then add a (directed!) edge from i to j in our graph. Note that it is possible for a node to have an edge to itself (i.e., happens if locker 10 has the number 10 in it).

This graph has the same structure as a linked list with back-links and with possibly multiple heads. (Back-link being when a later node points back to an earlier node). Multiple heads is easy to see: suppose every locker contains its own number (1 has 1, 2 has 2, etc). Then the graph is 100 independent nodes that just point to themselves.

Try picking a few random orders of numbers from 1-10 and drawing this graph.

Now consider the cycle decomposition of our graph. To form this, enumerate all the cycles in the graph. For each cycle, list out the vertices in that cycle. The cycle decomposition is a set of subsets of graph vertices, where each subset of vertices forms a cycle in the original graph. Note that since there are *no duplicate* numbers, each vertex is a member of ONLY one cycle. Some examples:
A) Suppose each locker contains its own number. Then the cycle decomposition has 100 subsets; each subset is only 1 vertex.
B) Suppose each locker contains the number of its immediate right neighbor (with 100 wrapping back to 1). The cycle decomposition has one member subset, and that subset has size 100.

You should see now that having each prisoner start with his own locker is important. Think back to the fact that each vertex is a member of only one cycle. The prisoners cannot start in a random place b/c he cannot guarantee that any other locker is in the same cycle as his locker. (Consider the case where every locker has its own number in it.)

Explained a different way, say I'm prisoner 1. I'm looking for the number 1. Locker 1 (in the graph) is a member of exactly one cycle. Thus as long as I start on vertex of this cycle, I will eventually find the number 1. Why? Because SOME vertex in the cycle MUST contain the number 1! (By the definition of cycle.) The only locker that prisoner 1 knows is on the cycle he's looking for is locker 1. He could pick randomly, but then he'd have to be lucky enough to find another member of his cycle (which could be impossible if locker 1's cycle has size 1).

Ok, so now that we know that each locker is a member of exactly one cycle, under what conditions can the prisoners always succeed? Well, if there are NO cycles with size greater than N/2, the prisoners will always win. Why?

Every prisoner knows how to find the cycle containing his number: he starts with the locker with the same number as him (prisoner i opens locker i first). Thus if we guarantee that no cycle has length > N/2, then all the prisoners will succeed. If the max cycle size is N/2, then no matter where you start in the cycle, you will open at most N/2 (=50) lockers before finding your own number. So the probability of success revolves around computing how many orderings of the numbers 1-100 do not have cycles of length > N/2.

Now with a counting (permutations) argument, we can show that the probability that this happens approaches 1 - ln(2) = 30.6% as N->infinity. That's actually pretty amazing: not only can the prisoners succeed nearly 1/3 of the time, but the number of lockers & prisoners is irrelevant!

I'm not going to repeat the details of the proof here b/c there's nothing conceptually interesting there. Google if you want to see it or try and work it out.

Why is this better than having each prisoner choose 50 random lockers? Well, in that naive solution, the prisoners are acting with 0 information.

BUT there's actually A LOT of information in this problem! In particular, there are 100 lockers, and they contain numbers from 1-100. Thus there is an underlying graph structure (really even simpler than a general graph since each node has exactly 1 outgoing and exactly 1 incoming edge). And taking advantage of this structure is the key. So even though the prisoners cannot talk to each other, there is a wealth of information that they can derive on their own. You should now see why things like order of prisoners and whatnot totally don't matter since every prisoner truly does act 100% alone.

This is incidentally why I like this problem. The answer forces you to really explore a multitude of CS tools. Even better, it's a graph problem. There are SO MANY real-world scenarios that can (and should be) reasoned about with graphs. They can show relations, states, transport, the list is endless. I'll quote Skiena, author of "The Algorithm Design Manual" (great book btw):
"The key to solving many algorithmic problems is to think of them in terms of graphs. Graph theory provides a language for talking about the properties of relationships, and it is amazing how often messy applied problems have a simple description and solution in terms of classical graph properties."

If the lockers contained a random number from 1 to say 2*N, then this strategy would fail.

Here's a similar problem that uses the same technique: say I give a list of N numbers in the range [1,N-1] with exactly 1 duplicate (i.e., each number is in there at least once). Find me the duplicate in linear time with constant memory (so you cannot sort & binary search; you also cannot hash). One approach is to XOR all the numbers in the list together, then XOR the result with all the numbers in the range [1,N-1]. Since x^x = 0, and the duplicate is the only number that would be XOR'd an odd number of times, you'll find it.

But can you think of a way to do it using an approach similar to the solution to puzzle 1?

As for the 2nd problem,

iCyborg said:
This brings me to the problem 2). I still don't see how to move from 1 as the worst case, and I tried a couple approaches, but they don't work for all inputs. Nor can I see how XOR would help. But, related to the above digression, I can't help to feel that someone well versed in Shannon-type of information theory and theorems would be much better equipped to tackle this.

The general form of this problem is given a (2^N - 1)-bit message from alice to bob, how many bits of information can you transmit to Bob by flipping at most 1 bit of Alice's msg?

Comment: I don't think either of Shannon's main theorems can be applied to this. They dealt with compression bounds and noisy channels. Maybe you can do something with the compression bound... not sure. That said, I've had minimal exposure to information theory so I really don't know. Regardless, anyone who's seen info theory at all before should probably guess the answer is related to log2(numBits), which is accurate; see next spoiler box. Not that knowing this helps you much...

Hint 1: The answer is N bits, i.e., log2(numBits + 1). So with Alice sending a 31 bit msg, I can always send 5 bits.

We already know how to solve this for N=1. Try it for N=2. How many bits can you send?

Hint 2: For CS/Info-theory guys, try reasoning w/the Hamming Distance. If you don't know what that is, then consider this: Alice's message is 3 bits. If Bob can flip at most 1 bit, starting from the initial message, how many different messages can Bob create?

NOTE: This technique readily solves N=2 and you can hack out N=3, but it gets progressively harder to generalize. You could write a computer program, but there is a much simpler way for any N. But maybe there are patterns in the smaller N that might help? The point of this exercise is to show you that you can definitely send more than 1 bit.

Hint 3: If you're still stuck after Hint 2, consider that you could have multiple 3-bit messages map into the same 2-bit messages. These can be statically arranged with Bob beforehand. For example, you could tell Bob that receiving 000 or 111 both mean 10.

Cerb · Oct 19, 2012

Explained a different way, say I'm prisoner 1. I'm looking for the number 1. Locker 1 (in the graph) is a member of exactly one cycle. Thus as long as I start on vertex of this cycle, I will eventually find the number 1. Why? Because SOME vertex in the cycle MUST contain the number 1! (By the definition of cycle.) The only locker that prisoner 1 knows is on the cycle he's looking for is locker 1. He could pick randomly, but then he'd have to be lucky enough to find another member of his cycle (which could be impossible if locker 1's cycle has size 1).

I still don't get that, though--

how, with the lockers randomized, his own number has a better chance. FI, #1's search could very well go like so: 1=5, 5=12, 12=90, 90=90--notice, that cycle does not contain his number--or like so: 8=53, 53=2, 2=8; or like so: 72=50, 50=3, 3=1. However, randomizing the first locker makes the chances go to pot, again.

Mr. Pedantic · Oct 19, 2012

Regarding problem 1:

I don't understand how prisoner i can guarantee that cycle i is on the same cycle as the locker with the number i inside.

EDIT: It appears Cerb has the same problem as me

Venix · Oct 19, 2012

Cerb said:
I still don't get that, though--

how, with the lockers randomized, his own number has a better chance. FI, #1's search could very well go like so: 1=5, 5=12, 12=90, 90=90--notice, that cycle does not contain his number--or like so: 8=53, 53=2, 2=8; or like so: 72=50, 50=3, 3=1. However, randomizing the first locker makes the chances go to pot, again.

Your example is not possible. Lockers 12 and 90 both contain 90.

Since each number appears only once, only the first locker opened can contain its own number.

Cerb · Oct 19, 2012

Venix said:
Your example is not possible. Lockers 12 and 90 both contain 90.

Since each number appears only once, only the first locker opened can contain its own number.

I rationally knew they were shuffled, but the distinction wasn't making its way as far as it needed to. Being shuffled, all lockers and their numbers can be treated as having been swapped. Having been swapped, no matter how matter how many times they've been swapped, they will always create cycles. Also, due to the equivalence to being swapped, no cycles will cross paths. Each path taken will either be a subset of some cycle, or never make it to detecting the cycle.

eLiu · Oct 20, 2012

Cerb said:
I still don't get that, though--

how, with the lockers randomized, his own number has a better chance. FI, #1's search could very well go like so: 1=5, 5=12, 12=90, 90=90--notice, that cycle does not contain his number--or like so: 8=53, 53=2, 2=8; or like so: 72=50, 50=3, 3=1. However, randomizing the first locker makes the chances go to pot, again.

First, I'm going to explain how you count cycles of size k with k > N/2. I wrote this up and then realized I don't actually need it to answer your question, but maybe it'll be helpful to others.

ASSUMING k > N/2,
With N numbers, the number of cycles with length EXACTLY k is: (n C k) * (k-1)! * (n-k)!. (n C k) means n Choose k; this is the number of ways to pick the k cycle members out of N. Then there are (k-1)! ways to arrange these k cycle members and (n-k)! ways to arrange the remaining n-k numbers.

You might wonder why it's (k-1)! instead of k! ways to arrange the cycle members. A k-length cycle actually has k-1 degrees of freedom in it. To see, this, consider a cycle of length 2: A contains B, B contains A. There's only ONE way to order this. You can think of arranging a k-length cycle as fixing 1 node and moving the other k-1; this is entirely equivalent b/c you always have to be able to get back to the fixed node. If you counted k!, then you'd include all the k-1 orderings where the fixed node is not in a cycle.

If you simplify that, you get: n!/k

You might also wonder why this screws up for k <= N/2. The formula overcounts in that scenario. This should be obvious if you set k=1. It would claim n! ways to get a cycle with length exactly 1. But this cannot be the case b/c there are n! possible arrangements in total, and you can get cycles of length n (which have 0 cycles of length 1). So what happened? The (n-k)! term is the problem. With k > N/2, there can only be ONE cycle of such size. With k<=N/2, the remaining n-k terms could have arrangements that are ALSO cycles of size k. And so you will count those repeatedly and over-estimate.

Ok, now to address your question. This is actually a pretty subtle point and I didn't even realize how subtle it is until I started trying to explain it.

First, the short summary: if each inmate randomizes his first locker choice (instead of picking his own), the result is actually WORSE than everyone picking 50 random lockers. (This is assuming that you can open the same locker more than once and have it count against you.)

If you add the condition that inmates detecting a cycle will choose a new random locker not already visited (so you don't have inmates that get stuck opening the same lockers over and over), then you do better, but the probability of success will still go to 0 as n->infinity.

If that doesn't make 100% sense, then read on b/c I'll explore the possibilities.

First, let's consider the model where inmates choose an initial locker at random and if they end up in the wrong cycle, they die. (i.e., they keep opening the same doors over and over.) I'm analyzing this b/c it is MUCH easier and motivates the obvious modification of keeping track of which lockers you opened & choosing a new one at random if you find a cycle.

With this strategy, an inmate succeeds only if he randomly picks a locker that is at most 50 lockers away from the one his number lies in. (If he picks a locker in a completely different cycle, I call that distance infinity away). I want to consider a few examples of how having each inmate randomize his first locker choice performs.

--Let's say inmate i is going and his number sits in a cycle of length 1 (i.e., locker i contains i). Then 99/100 times, he fails since he MUST pick locker i to succeed.
--Now consider the number of inmate i laying in a cycle of size 50. Here he succeeds 50% of the time since ending up in his cycle = success, failure otherwise.
--Now consider the number of inmate i laying in a cycle of size 75. Here he succeeds 50% of the time since only 50 lockers total lie within a distance of 50 trials to the correct locker.
--Now consider there only being 1 cycle of size 100. Same as 75, 50%.

This is easy to generalize. If inmate i's number is in a cycle of size k such that k < n/2, then he succeeds with probability k/n. Otherwise he succeeds with probability 50%. Thus the BEST a single inmate can do is 50%, which is the same as picking 50 lockers at random. But sometimes, he can do WORSE!

INTUITION 1: The intuition here is that by picking at random, the inmates will end up looking at lockers that have no possibility of leading to their number. So these trials are completely wasted.

And we need one last step b/c the above analysis only applies to ONE inmate. When each inmate picks 50 lockers at random, we established that the probability of success is 0.5^100. Here again, the inmates are acting independently. And in fact they do WORSE than 0.5^100, because the probability of success AT MOST 50% (but often less!)

INTUITION 2: Following intuition 1, needing all 100 inmates to do it really screws us b/c they will all end up looking repeatedly at lockers that will not help their search.

INTUITION 3: Thus the reason the optimal strategy is better is because it (generally) correlates the success of one inmate to another! (This is shocking b/c the inmates cannot give each other any information!) Why? Say inmate i opens his locker & conducts the process, finding his number after 50 tries. He's now seen the numbers of 49 other inmates. If he could tell those inmates to do something, what would he tell them? He might say "start at my number, look inside, go to that locker, open it, repeat." Then those 49 other inmates would find their numbers within 50 tries too. BUT he doesn't have to do this! Using the optimal procedure, those 49 other inmates will ALL find their numbers successfully without having to care about what the other inmates did. So even though the inmates can't talk, almost the end result is almost as good.

The opposite of this is when the success of inmates is decorrelated (BAD). When decorrelated, inmates must all act independently. And since there is no information, for any given arrangement, we can compute the expected success rate of each individual inmate, c < 1. Since they act independently, the total success is c^n, which goes to 0 as n->infinity.

Ok, now let's look at the modification. If I find a cycle (and not yet found my number), I pick a new locker at random that I haven't opened. For example, say I'm prisoner 1 and I get:
4=50, 50=2, 2=4. Now I've opened 3 lockers and I know 4, 50, and 2 don't contain my number. So I'll pick a new locker from the remaining options.

This is clearly better than just dying if you encounter a cycle b/c it lets you at least use all 50 tries. Whether this is better than picking 50 lockers at random, I'm not sure. I kind of suspect that it'll be the same, actually. I'm not sure how to prove this simply but it seems to work out for all combinations of 1, 2, and 3 cycles. Let's look at 2 cycles briefly. Say we have one cycle of length k and one of length n-k. Without loss of generality, let k < n/2. Then the k inmates with numbers in the k-cycle succeed with probability k/n. The n-k inmates in the other group succeed with probability
So the chance that a random inmate succeeds is k/n*k/n + (k/n*(n/2 - k)/(n-k) + (n-k)/n*(n/2)/(n-k)))*(n-k)/n = 1/2 = 50%.
You can get the same result with 3 cycles. So it's possible that this ends up being the same as choosing 50 lockers at random but I haven't proved it.

But that's pretty irrelevant insofar as the actual probability of success with the modified method STILL goes like c^n, where c < 1 (and I'm pretty sure c = 1/2). Thus as n grows, the probability of success goes to 0.

With the correct solution, the probability of success goes to a constant, 1-ln(2), as n goes to infinity. Again, the intuition here is that with the random choice, you will end up with a lot of inmates exploring sets of lockers that cannot possibly contain their number (when cycle size is small), and finding their number with a small probability when cycle size is large.

Does that answer your question?

Mr. Pedantic said:
Regarding problem 1:

I don't understand how prisoner i can guarantee that cycle i is on the same cycle as the locker with the number i inside.

EDIT: It appears Cerb has the same problem as me

Actually, Cerb's concern is different from yours.

He's questioning why starting with your own locker gives the best chances as opposed to starting with a random locker. You're questioning why starting with your own locker means you're guaranteed to find your number (but possibly not within 50 tries).

So let's look at your question. First, read below in the blurbs following "And finally, to anyone...". That establishes why each locker is a member of only 1 cycle. I had a different explanation earlier but I like Cerb's more so I killed mine.

Once you understand that, the answer to your question is straightforward. Say I am prisoner i and I start at locker i. Locker i is a member of exactly one cycle. A cycle is like:
i->b->c->d->i
So we observe: being a member of a cycle means that I MUST be able to get back to my starting location. If I start in locker i, by traveling along the cycle, I will eventually get back to locker i. This means that some other locker contains i. Hence by starting at locker i, inmate i is guaranteed to find his number (although it might need more than 50 tries if the cycle is longer than 50).

And finally, to anyone questioning WHY each locker is a member of exactly 1 cycle, Cerb saved me the trouble of explaining it:

Cerb said:
Being shuffled, all lockers and their numbers can be treated as having been swapped. Having been swapped, no matter how matter how many times they've been swapped, they will always create cycles. Also, due to the equivalence to being swapped, no cycles will cross paths. Each path taken will either be a subset of some cycle, or never make it to detecting the cycle.

In particular, consider first the arrangement where each locker contains its own number. We have 100 cycles. Then perform swaps to get to the desired random arrangement. Initially, swapping 2 lockers merges their cycles. Swapping again splits their cycles into 2 independent, smaller cycles. This is true in general; swapping numbers btwn lockers will either merge or split cycles, but always each locker is a member of only 1 cycle since it started out that way.

Mr. Pedantic · Oct 20, 2012

eLiu said:
Actually, Cerb's concern is different from yours.

If I start in locker i, by traveling along the cycle, I will eventually get back to locker i. This means that some other locker contains i. Hence by starting at locker i, inmate i is guaranteed to find his number (although it might need more than 50 tries if the cycle is longer than 50).

This is what I don't get.

I get that if I start at locker i, I will eventually get back to locker i.
I get that some other locker contains i.
I don't get how the cycle starting at locker i is guaranteed to have i inside one of the lockers.

Also:

So when you say that the inmates have a 30% chance, is that solely due to having cycles longer than N/2? i.e. if all the cycles are shorter than N/2, is the chance of success 100?

postmortemIA · Oct 20, 2012

Mr. Pedantic said:
This is what I don't get.

I get that if I start at locker i, I will eventually get back to locker i.
I get that some other locker contains i.
I don't get how the cycle starting at locker i is guaranteed to have i inside one of the lockers.

because that's only way to close the cycle. simply because there is only one i in the locker. otherwise cycle wouldn't get closed, as as you open other lockers only when you find their numbers, so i remains as only one that can close cycle

as for why is important to always open its own locker first, it eliminates worst case scenario of cycle being 1 for all lockers - when each number matches the locker.

Mr. Pedantic · Oct 20, 2012

Oooooooh!!!!!!!!!

But wouldn't that mean for a cycle length > N/2, the chance of finding your own number be 0? Because your number will always be the last number you get before you go back to your own locker, so if the cycle has 51 lockers and you only get to open 50...

iCyborg · Oct 20, 2012

Mr. Pedantic said:
Oooooooh!!!!!!!!!

But wouldn't that mean for a cycle length > N/2, the chance of finding your own number be 0? Because your number will always be the last number you get before you go back to your own locker, so if the cycle has 51 lockers and you only get to open 50...

Yes, it does mean that, and this is used to calculate the odds: otherwise you'd need to consider cycles of length>N/2 as well in the calculation.

Mr. Pedantic · Oct 20, 2012

iCyborg said:
Yes, it does mean that, and this is used to calculate the odds: otherwise you'd need to consider cycles of length>N/2 as well in the calculation.

So that means that the probability involved is really the probability of there being at least one chain longer than 50 lockers?

Schmide · Oct 20, 2012

I didn't read the above solutions. They all die.

The chance of any one dude living is 50%, however since they can't communicate and their chance of succeeding is dependent on all of them succeeding it equals flipping a coin 100 times and getting the same result 100 times.

0.5 ^ 100 = 1 in 1,267,650,600,228,229,401,496,703,205,376

Edit:

However if you remove a locker for a successful pass you get a 100C50 or 1 in 100,891,344,545,564,193,334,812,497,256 chance.

iCyborg · Oct 20, 2012

Schmide said:
I didn't read the above solutions. They all die.

You should. If independent, it would be 0.5^100, but they can arrive with a strategy that gives them >30% chance.

iCyborg · Oct 20, 2012

Mr. Pedantic said:
So that means that the probability involved is really the probability of there being at least one chain longer than 50 lockers?

Correct. You compute the probability P that there are no cycles longer than N/2. The answer is 1-P I believe.

Schmide · Oct 20, 2012

iCyborg said:
You should. If independent, it would be 0.5^100, but they can arrive with a strategy that gives them >30% chance.

If they can communicate yes. They can't communicate so no.

Edit:

If the all live all die rule was not to apply they could easily come up with a strategy to limit their losses. I assume the only thing they could see during the exercise is the success of any one individual. With this it becomes a 100c50 problem as you eliminate any repeat checking of successful lockers.

iCyborg · Oct 20, 2012

eLiu said:
Here's a similar problem that uses the same technique: say I give a list of N numbers in the range [1,N-1] with exactly 1 duplicate (i.e., each number is in there at least once). Find me the duplicate in linear time with constant memory (so you cannot sort & binary search; you also cannot hash). One approach is to XOR all the numbers in the list together, then XOR the result with all the numbers in the range [1,N-1]. Since x^x = 0, and the duplicate is the only number that would be XOR'd an odd number of times, you'll find it.

But can you think of a way to do it using an approach similar to the solution to puzzle 1?

I would never think of XOR here, nor a way similar to puzzle 1, but I can think of a different and very simple approach: The sum of all numbers 1 to N-1 is T = N*(N-1)/2. Now, do a single pass through the list to get their sum S. The duplicate is S-T

eLiu said:
As for the 2nd problem,

The general form of this problem is given a (2^N - 1)-bit message from alice to bob, how many bits of information can you transmit to Bob by flipping at most 1 bit of Alice's msg?

Comment: I don't think either of Shannon's main theorems can be applied to this. They dealt with compression bounds and noisy channels. Maybe you can do something with the compression bound... not sure. That said, I've had minimal exposure to information theory so I really don't know. Regardless, anyone who's seen info theory at all before should probably guess the answer is related to log2(numBits), which is accurate; see next spoiler box. Not that knowing this helps you much...

Hint 1: The answer is N bits, i.e., log2(numBits + 1). So with Alice sending a 31 bit msg, I can always send 5 bits.

I didn't refer to Shannon exclusively, but generally some sort of information theory background. Like Hamming distance

I actually tried to see how to send 2 bits with length 4 (one would think that 31 instead of 32 from problem statement would give me a clue to go with 3, but, eh...), but I still don't see a way. I can create 4 messages, but I don't see a way to put them in some classes so Bob can know what we sent without knowing Alice's message, it becomes messy, I feel like I could do it with a PC and some sort of backtracking algorithm, but I didn't see some nice pattern.

iCyborg · Oct 20, 2012

Schmide said:
If they can communicate yes. They can't communicate so no.

Edit:

They can communicate about anything *before* they are taken to the 'locker opening' phase. I'll repeat the example I already gave for N=2 and opening 1 box: they can decide that inmate 1 opens box 1, and inmate 2 opens box 2. No communication is needed after that, and their chance of survival is 50%, because if inmate 1 passes (50% chance), inmate 2 will pass too. If independent, it would be 0.5^2 = 25%.
Seriously, try reading at least the first couple of posts where the problem is clarified, as the same questions were already answered.

eLiu · Oct 20, 2012

Mr. Pedantic said:
This is what I don't get.

I get that if I start at locker i, I will eventually get back to locker i.
I get that some other locker contains i.
I don't get how the cycle starting at locker i is guaranteed to have i inside one of the lockers.

Errr, I'm not sure what is unclear here. It sounds like you agree to a few things:
1) locker i is a member of exactly ONE cycle
2) being a member of a cycle, starting at locker i means I can get back to locker i

In the solution, the only way I can decide to open a new locker is by reading the number in an existing locker. This is how we form the edges of our graph.

Mr. Pedantic said:
Also:

So when you say that the inmates have a 30% chance, is that solely due to having cycles longer than N/2? i.e. if all the cycles are shorter than N/2, is the chance of success 100?

Yes. If the warden were a nice guy and he guaranteed that there were no cycles of length> N/2, the prisoners will always succeed. And yeah, if your locker is in a cycle of length L, you always have to open L lockers. So if the cycle length is 51, you die, as you said.

iCyborg said:
Yes, it does mean that, and this is used to calculate the odds: otherwise you'd need to consider cycles of length>N/2 as well in the calculation.

I think you stated that backwards. The calculation ONLY considers cycles of length>N/2. You count how many there are and divide by the total number of arrangements, n! to get the probability that an arrangement has a cycle of length > N/2. 1 - thatnumber is the answer. At the start of my last post, I explain how to compute the number of arrangements having exactly one cycle of length > N/2 and also why computing the number of cycles of length < N/2 is hard & why you cannot use the same approach.

Mr. Pedantic said:
So that means that the probability involved is really the probability of there being at least one chain longer than 50 lockers?

Yes. One key observation is that in a given arrangement, there can only be ONE cycle of length > N/2. This is b/c of the uniqueness property--each locker is a member of exactly one cycle. This is extremely important b/c otherwise it'd be difficult if not impossible to count them.

But yes, computing the probability revolves around (quoting from my earlier SOLUTION: post)

Ok, so now that we know that each locker is a member of exactly one cycle, under what conditions can the prisoners always succeed? Well, if there are NO cycles with size greater than N/2, the prisoners will always win. Why?

Then the probability of success is 1 - Probability(there exists a cycle of length > N/2). In the start of my last post, I showed that number of arrangements that have a cycle of length k > N/2 is n!/k. The number of total arrangements is n!. So the probability of having a cycle of length k > n/2 is:
1/n! * sum(n!/k, k ranges from n/2+1 to n) <= sum(1/k, k ranges from n/2+1 to infinity) = ln(2).
Hence the answer, 1 - ln(2).

You should re-read my version of the solution (search for SOLUTION: in this thread) and take a look at how we actually count cycles (described in the beginning of my last post). Sounds like things are making more sense to you now.

Schmide said:
I didn't read the above solutions. They all die.

Nope. You can find a strategy where they succeed with a double digit percentage of success. For any problem size even (as long as it's N prisoners opening N/2 lockers, N even.)

0.5^100 is the naive solution for this problem and you can do way better. Yes, the prisoners cannot communicate. But that doesn't mean there is no structure to the problem that they cannot exploit individually (so much structure in fact that the result is almost as good as being able to communicate).

2 puzzles for you

Platinum Member

Elite Member

Diamond Member

Golden Member

Golden Member

Diamond Member

Elite Member

Diamond Member

Elite Member

Diamond Member

Golden Member

Elite Member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Golden Member

Diamond Member

Diamond Member

Golden Member

Golden Member

Diamond Member

Golden Member

Golden Member

Diamond Member