Thought I would "throw in" on this one.
I think I began overclocking around 2003 with Pentium Northwood processors. After that -- Conroes, Kentsfields and Wolfdales. Now, it's my Sandy Bridge. Usually, I'd build one or two systems a year for myself and the fam-damn-ily. Almost always, the OC system would be my own.
We've examined Intel's specs for several generations of processors here on the forum, using the specs to guide our OC exercises.
Of course, Intel "bins" their own processors. They are guaranteed to run within spec, but VID varies among the production runs.
As someone already said, there are other voltages and tweaks that can make the difference between a good OC and one that is either "so-so" or a disappointment. And as someone else said, people who publish their results -- often early in the processor life-cycle -- may either exaggerate or use a lower standard of stability out of sheer eagerness.
Me? I just buy the boxed processor new. I pick a motherboard in the mid- to high-end price range with useful features and good components. I always pick a board that has ample phase-power design, solid state components, and ample BIOS features. A low-end motherboard may come with a BIOS that only allows certain voltages to be tweaked. So if I OC'd my mother's system, which has an E6700 Wolfdale and a $90 mATX Gigabyte board, I might only be able to bump the speed up to 3.4 GHz.
I've had decent luck with the Northwoods, the E6600 and Kentsfield Q6600, the E8400 and E8600 -- and even the budget E2140 through E2180 C2D's. I skipped the first-gen i7/i5 cores (socket-1366). The Sandy system is the first and only one I've fiddled with since LGA-775.
So while I still feel compelled to overclock, tweak and tune, I build systems less often than I used to. There's nothing like a good, fast, stable OC'd system with a long expectation of usefulness.
Even with my own cautious limitations and early stability tests, I learned from our colleague IDontCare that there are some minor voltage adjustments needed to clean up errors that lower the GFLOP results of LinX tests.
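For anyone who wants to check their own logs for this, here's a rough sketch of the idea in Python: compare the GFLOPS of each pass against the median and flag the ones that sag. The function name, the 3% tolerance and the sample numbers are all my own assumptions for illustration, not anything LinX reports or recommends, and you'd have to copy the per-pass figures out of the LinX results yourself.

```python
from statistics import median

def flag_slow_passes(gflops, tolerance=0.03):
    """Return the indices of passes whose GFLOPS fall more than
    `tolerance` (a fraction, assumed 3% here) below the median pass."""
    mid = median(gflops)
    cutoff = mid * (1 - tolerance)
    return [i for i, g in enumerate(gflops) if g < cutoff]

# Made-up per-pass GFLOPS figures, typed in by hand for illustration only.
passes = [45.1, 45.3, 44.9, 41.2, 45.0]
print(flag_slow_passes(passes))
```

A pass that gets flagged isn't proof of instability by itself, but a run where every pass stays close to the median is what I'd want to see before calling a voltage setting done.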
The long and the short of it: I've never returned a CPU.