Also, the CPU doesn't "look ahead 100 cycles"; that makes no sense. All out-of-order execution and ILP happen after decode. For x86 that would be uops in the uop queue.
Yes. Do note that there very often are 100 cycles' worth of instructions (or more) in the uop queue. Consider a situation where there is a miss through all caches to RAM. That's roughly 90ns on modern Intel CPUs, give or take a bit depending on how good your DRAM is. If the CPU is running at 4GHz and decoding ~3 instructions per cycle on average, it can decode >1000 instructions while waiting on data for that one miss. The hard limits on how far ahead it can get are the ROB and the physical register files (PRFs). Put differently: whenever there is a miss to memory, the CPU will fetch, decode, and issue instructions until one of [ROB (224 entries), scalar PRF (180), vector PRF (168)] runs out of room (numbers for Coffee Lake).
Even misses to L3 give the CPU enough time to fill up half the queues. Misses are common enough (and usually happen before major computation -- you have to fetch data before you can work on it) that unless your code has plenty of badly predicted branches, the normal operating condition of a CPU is executing from queues that are closer to full than empty, allowing a lot of OoO execution despite the in-order frontend and retirement.
Of course, if you are doing things like tree traversal and branching on every node, the queues get cleared every time the predictor fails, and you are always running with nearly empty queues. That's one more reason why in practice trees usually lose to hash tables, even if hash tables technically have to do more work.