After seeing some talk about this, I'm rather tempted to at least delid and apply better TIM. Although, my only question is which TIM? Everyone seems to suggest CLU, but I've already got Prolimatech PK-3 lying around. IDC, your tests showed that CLU was comparable to NT-H1, and other benchmarks show NT-H1 as being comparable to PK-3. So, it might just be worthwhile to use PK-3 instead of spending even more on CLU.
I'm not sure which removal technique (razor vs. vise) I'd use. I've seen quite a few comments about how easy and painless delidding is with the vise method on a newer (Ivy Bridge or Haswell) processor.
What I learned from my tests is that there is a very real risk of using too little (yes, too little) TIM on the bare silicon die, which results in worse thermal performance than just adding a little more TIM and letting the excess squeeze out under the mounting pressure.
It very much is a Goldilocks-type situation in which having either too little or too much TIM can result in equally poor temperatures. What you really want to strive for is the middle ground where everything is just right.