First, the short summary: if each inmate randomizes his first locker choice (instead of picking his own), the result is actually WORSE than everyone picking 50 random lockers. (This is assuming that you can open the same locker more than once and have it count against you.)
If you add the condition that inmates detecting a cycle will choose a new random locker not already visited (so you don't have inmates that get stuck opening the same lockers over and over), then you do better, but the probability of success will still go to 0 as n->infinity.
If that doesn't make 100% sense, then read on b/c I'll explore the possibilities.
First, let's consider the model where inmates choose an initial locker at random and if they end up in the wrong cycle, they die. (i.e., they keep opening the same doors over and over.) I'm analyzing this b/c it is MUCH easier and motivates the obvious modification of keeping track of which lockers you opened & choosing a new one at random if you find a cycle.
With this strategy, an inmate succeeds only if he randomly picks a locker that is at most 50 lockers away from the one his number lies in. (If he picks a locker in a completely different cycle, I call that distance infinity away). I want to consider a few examples of how having each inmate randomize his first locker choice performs.
--Let's say inmate i is going and his number sits in a cycle of length 1 (i.e., locker i contains i). Then 99/100 times, he fails since he MUST pick locker i to succeed.
--Now consider the number of inmate i laying in a cycle of size 50. Here he succeeds 50% of the time since ending up in his cycle = success, failure otherwise.
--Now consider the number of inmate i laying in a cycle of size 75. Here he succeeds 50% of the time since only 50 lockers total lie within a distance of 50 trials to the correct locker.
--Now consider there only being 1 cycle of size 100. Same as 75, 50%.
This is easy to generalize. If inmate i's number is in a cycle of size k such that k < n/2, then he succeeds with probability k/n. Otherwise he succeeds with probability 50%. Thus the BEST a single inmate can do is 50%, which is the same as picking 50 lockers at random. But sometimes, he can do WORSE!
INTUITION 1: The intuition here is that by picking at random, the inmates will end up looking at lockers that have no possibility of leading to their number. So these trials are completely wasted.
And we need one last step b/c the above analysis only applies to ONE inmate. When each inmate picks 50 lockers at random, we established that the probability of success is 0.5^100. Here again, the inmates are acting independently. And in fact they do WORSE than 0.5^100, because the probability of success AT MOST 50% (but often less!)
INTUITION 2: Following intuition 1, needing all 100 inmates to do it really screws us b/c they will all end up looking repeatedly at lockers that will not help their search.
INTUITION 3: Thus the reason the optimal strategy is better is because it (generally) correlates the success of one inmate to another! (This is shocking b/c the inmates cannot give each other any information!) Why? Say inmate i opens his locker & conducts the process, finding his number after 50 tries. He's now seen the numbers of 49 other inmates. If he could tell those inmates to do something, what would he tell them? He might say "start at my number, look inside, go to that locker, open it, repeat." Then those 49 other inmates would find their numbers within 50 tries too. BUT he doesn't have to do this! Using the optimal procedure, those 49 other inmates will ALL find their numbers successfully without having to care about what the other inmates did. So even though the inmates can't talk, almost the end result is almost as good.
The opposite of this is when the success of inmates is decorrelated (BAD). When decorrelated, inmates must all act independently. And since there is no information, for any given arrangement, we can compute the expected success rate of each individual inmate, c < 1. Since they act independently, the total success is c^n, which goes to 0 as n->infinity.
Ok, now let's look at the modification. If I find a cycle (and not yet found my number), I pick a new locker at random that I haven't opened. For example, say I'm prisoner 1 and I get:
4=50, 50=2, 2=4. Now I've opened 3 lockers and I know 4, 50, and 2 don't contain my number. So I'll pick a new locker from the remaining options.
This is clearly better than just dying if you encounter a cycle b/c it lets you at least use all 50 tries. Whether this is better than picking 50 lockers at random, I'm not sure. I kind of suspect that it'll be the same, actually. I'm not sure how to prove this simply but it seems to work out for all combinations of 1, 2, and 3 cycles. Let's look at 2 cycles briefly. Say we have one cycle of length k and one of length n-k. Without loss of generality, let k < n/2. Then the k inmates with numbers in the k-cycle succeed with probability k/n. The n-k inmates in the other group succeed with probability
So the chance that a random inmate succeeds is k/n*k/n + (k/n*(n/2 - k)/(n-k) + (n-k)/n*(n/2)/(n-k)))*(n-k)/n = 1/2 = 50%.
You can get the same result with 3 cycles. So it's possible that this ends up being the same as choosing 50 lockers at random but I haven't proved it.
But that's pretty irrelevant insofar as the actual probability of success with the modified method STILL goes like c^n, where c < 1 (and I'm pretty sure c = 1/2). Thus as n grows, the probability of success goes to 0.
With the correct solution, the probability of success goes to a constant, 1-ln(2), as n goes to infinity. Again, the intuition here is that with the random choice, you will end up with a lot of inmates exploring sets of lockers that cannot possibly contain their number (when cycle size is small), and finding their number with a small probability when cycle size is large.
Does that answer your question?