pm wrote a post about yield a while back. Unfortunately, I don't remember most of it, but I do remember this example:
Take a few grains of sand and toss them onto a wafer. The sand represents defects. If you have bigger dies, there's a higher chance of a given die being defective; with smaller dies, any individual die is less likely to be hit by a defect.
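If you want to play with the sand analogy, here's a rough sketch of it in Python. Everything in it is an assumption I made up for illustration: the wafer is a square instead of a circle, defects land uniformly at random, any die hit by a defect is dead, and the names (`simulate_yield`, `wafer_size`, `die_size`, `num_defects`) are mine. It's a toy, not how a fab actually models yield.

```python
import random

def simulate_yield(wafer_size, die_size, num_defects, seed=0):
    """Toss num_defects 'grains of sand' onto a square wafer and return
    the fraction of dies that were not hit by any grain.

    Assumptions (all hypothetical): square wafer, square dies,
    uniformly random defects, one hit kills a die.
    """
    rng = random.Random(seed)  # fixed seed so both layouts see the same toss
    dies_per_side = wafer_size // die_size
    total_dies = dies_per_side * dies_per_side

    hit_dies = set()
    for _ in range(num_defects):
        x = rng.uniform(0, wafer_size)
        y = rng.uniform(0, wafer_size)
        col = int(x // die_size)
        row = int(y // die_size)
        # Ignore grains that land on the leftover edge strip (or exactly on the edge).
        if col < dies_per_side and row < dies_per_side:
            hit_dies.add((row, col))

    return (total_dies - len(hit_dies)) / total_dies

# Same wafer, same 20 defects, two different die sizes.
big_die_yield = simulate_yield(wafer_size=100, die_size=20, num_defects=20)
small_die_yield = simulate_yield(wafer_size=100, die_size=5, num_defects=20)
print("big dies:  ", big_die_yield)
print("small dies:", small_die_yield)
```

With 20 defects, the small-die layout can lose at most 20 of its 400 dies, while the big-die layout only has 25 dies to begin with, so the same toss takes out a much larger fraction of them.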
Now, to address the rest of your question...
Caches are sort of a special case, because there are things you can do so that even with a defect in the cache, it still works (and you don't even have to do something like disabling half of it).
There are actually a lot of factors that affect yield. There's a somewhat complicated interplay between the manufacturing process and the chip's design. I'm making some simplifications here to keep this readable.
A given manufacturing process can produce some things more reliably than others (for example, the polysilicon transistor gates usually print much better if they're running in one direction than in the other). Using the shapes that print well helps yield. Also, there's a lot of variation in the process. The transistors on one die will be a little different from the transistors on another die on the same wafer; they may be even more different on a different wafer. The chip fab will usually tell the chip designers what the variation is - for example, they might say, "90% of transistors will be within 10% of the nominal speed. 95% will be within 20%. 99% will be within 50%. 99.9% will be within 75%". Now, it's easy to build a small, fast circuit that works with nominal transistors, but if you want any decent yields, you have to make sure it works "most of the time". You have to pick some cutoff where you say, "if the transistors are more than x% off the target, the circuit won't work". If you do a very aggressive design, it might not work when the transistors are more than 10% off nominal, which means you're going to have to throw out a lot of dies. If you do a more conservative design, you'll get better yields. You can't design too conservatively either, because that tends to make things bigger (and then the grain-of-sand explanation tells you why that's bad) and/or slower.
It's very tempting to use super-aggressive circuits, because they're just so fast/small (both good attributes). One example is called a "pulsed latch". It's a design for a storage element that has great properties - until the manufacturing process changes too much. Pulsed latches are very sensitive to the relative delays of things, and as the fab improves the process, things get faster - but not everything improves by the same amount. If the important ratios change too much, the pulsed latch won't store data properly. So as the fab tunes the process and its "nominal" attributes get faster, the variation discussed above pushes more and more dies outside the range where the latch works, and those dies are nonfunctional.
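To make the cutoff idea concrete, here's a small Python sketch that takes the example numbers from the made-up fab quote above and looks up how many transistors fall inside a given design tolerance. The table values come straight from that quote; the function name `fraction_within` and the step-function lookup are my own assumptions. It only gives a conservative lower bound between the quoted points, and it is a per-transistor figure, which is a big simplification compared to real per-die yield.

```python
# (tolerance as a fraction of nominal speed, fraction of transistors within it)
# Values are the example numbers from the hypothetical fab quote above.
VARIATION_TABLE = [
    (0.10, 0.90),
    (0.20, 0.95),
    (0.50, 0.99),
    (0.75, 0.999),
]

def fraction_within(tolerance):
    """Lower bound on the fraction of transistors a design can live with,
    given that it still works when transistors are up to `tolerance`
    off nominal. Returns 0.0 below the first quoted point, because the
    quote says nothing about tolerances tighter than 10%.
    """
    covered = 0.0
    for limit, fraction in VARIATION_TABLE:
        if tolerance >= limit:
            covered = fraction
    return covered

aggressive = fraction_within(0.10)    # works only within 10% of nominal
conservative = fraction_within(0.50)  # works within 50% of nominal
print("aggressive design covers at least", aggressive)
print("conservative design covers at least", conservative)
```

The aggressive design covers 90% of transistors and the conservative one covers 99%, which is the yield-versus-speed/size trade-off in a nutshell.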
Getting back to the part about some things being produced more reliably than others, "vias" tend to be flaky. They're the connections between the layers of metal, and they're just difficult to create. To address this, a given connection is often made with multiple vias, so that if one of them fails, the others still conduct current. A design that uses just one via for each connection is likely to have lower yield than a design that uses multiple vias for backups.
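Here's a quick back-of-the-envelope sketch of why the backup vias help, in Python. The failure probability and connection count are numbers I invented for illustration, and it assumes each via fails independently of the others (real defects can be correlated, so treat this as optimistic). The function names are mine too.

```python
def connection_yield(via_fail_prob, vias_per_connection):
    """Probability that a connection works: it only fails if every one
    of its vias fails (assumes independent via failures)."""
    return 1.0 - via_fail_prob ** vias_per_connection

def chip_via_yield(via_fail_prob, vias_per_connection, num_connections):
    """Probability that every connection on the chip works."""
    return connection_yield(via_fail_prob, vias_per_connection) ** num_connections

# Hypothetical numbers: one-in-a-million via failures, a million connections.
VIA_FAIL_PROB = 1e-6
NUM_CONNECTIONS = 1_000_000

single = chip_via_yield(VIA_FAIL_PROB, 1, NUM_CONNECTIONS)
double = chip_via_yield(VIA_FAIL_PROB, 2, NUM_CONNECTIONS)
print("one via per connection: ", single)   # about 0.37
print("two vias per connection:", double)   # very close to 1
```

With these made-up numbers, a single via per connection loses well over half the chips to via failures alone, while doubling up the vias makes that loss negligible.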