OK, I'll briefly describe how the workflow of a modern game is structured. Basically, the CPU has to feed the GPU data (primitives, lighting parameters, shaders, etc.) to process (transform, clip, rasterize, per-vertex/per-fragment shading, etc.). The CPU needs time to do its work, and so does the GPU. A bottleneck arises when one has to wait for the other. If the CPU takes too long, the GPU finishes its work and idles. On the other hand, if the GPU is still busy, the CPU may have to wait for it at a synchronization point. That failure to "finish jobs at the same time" is the bottleneck.
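If numbers help, here's a rough sketch of that idea in Python. It assumes a simplified steady state where the CPU prepares the next frame while the GPU draws the current one, so the slower side sets the frame time. The function name `frame_stats` and the millisecond figures are made up for illustration, not taken from any real profiler.

```python
def frame_stats(cpu_ms: float, gpu_ms: float) -> dict:
    """Toy model: CPU and GPU work in parallel on consecutive frames."""
    # The slower of the two sets the pace for every frame.
    frame_ms = max(cpu_ms, gpu_ms)
    if cpu_ms > gpu_ms:
        bottleneck = "CPU"
    elif gpu_ms > cpu_ms:
        bottleneck = "GPU"
    else:
        bottleneck = "balanced"
    return {
        "frame_ms": frame_ms,
        "cpu_idle_ms": frame_ms - cpu_ms,  # time the CPU spends waiting on the GPU
        "gpu_idle_ms": frame_ms - gpu_ms,  # time the GPU spends waiting on the CPU
        "bottleneck": bottleneck,
    }


if __name__ == "__main__":
    # Hypothetical timings: CPU needs 20 ms per frame, GPU only 8 ms.
    print(frame_stats(cpu_ms=20.0, gpu_ms=8.0))
    # And the reverse: CPU needs 5 ms, GPU needs 16 ms.
    print(frame_stats(cpu_ms=5.0, gpu_ms=16.0))
```

In the first case the GPU sits idle for 12 ms of every frame, so a faster GPU buys you nothing; in the second it's the CPU that ends up waiting.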
Imagine a producer/consumer scenario. The producer produces goods so that the consumer can consume them. Take the Xbox 360 at launch: MS couldn't produce enough units to satisfy demand (in the US, at least). That's like a CPU that can't produce enough "work/goods" for the GPU. The reverse situation is best illustrated by the Japanese Xbox 360 launch, where units were available but buyers didn't keep up, like a GPU that can't consume work as fast as the CPU hands it over. The bus and stuff in between? Think of that as the shipping routes that trucks and cargo ships take to deliver Xboxes from the factory to the store shelves. If the trucking is really slow, there might be a launch or resupply delay, causing the stores/GPU to wait.
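To play with the analogy, here's a small tick-based simulation, again only a sketch. The factory stands in for the CPU, the trucks for the bus, and the store for the GPU. All names, rates, and buffer sizes (`simulate`, `depot_cap`, `shelf_cap`) are invented for the example, and real hardware queues behave in far more complicated ways.

```python
def simulate(factory_rate, truck_rate, store_rate, ticks,
             depot_cap=10, shelf_cap=10):
    """Factory (CPU) -> trucks (bus) -> store (GPU), with bounded buffers."""
    depot = shelf = sold = 0
    factory_blocked = store_starved = 0
    for _ in range(ticks):
        # Store/GPU: consume whatever is on the shelf, up to its rate.
        take = min(store_rate, shelf)
        if take < store_rate:
            store_starved += 1      # store wanted more than was available
        shelf -= take
        sold += take
        # Trucks/bus: move units from the depot to the shelf.
        moved = min(truck_rate, depot, shelf_cap - shelf)
        depot -= moved
        shelf += moved
        # Factory/CPU: produce into the depot if there is room.
        made = min(factory_rate, depot_cap - depot)
        if made < factory_rate:
            factory_blocked += 1    # factory had to hold back output
        depot += made
    return sold, factory_blocked, store_starved


if __name__ == "__main__":
    # Fast factory and fast store, but only one truckload per tick.
    sold, blocked, starved = simulate(5, 1, 5, ticks=100)
    print(f"sold={sold} factory_blocked={blocked} store_starved={starved}")
```

With slow trucks, the factory keeps stalling on a full depot while the store keeps running out of stock, and total sales can never exceed what the trucks deliver. Swap the rates around to see the other two cases.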
Now, does that make things a little more clear?