O/C and HD errors...need advice

ScottGridley

Junior Member
Jan 28, 2011
2
0
0
Hi All,
Thanks in advance for reading this - I've got an odd issue with a recent build, and I'm looking for some help diagnosing the root cause.

Stats on machine:
i7-950 on an ASUS Sabertooth X58, 8GB PC1333 OCZ Sniper RAM, Toshiba 1TB 3Gb/s SATA drive, 1000W Rosewill PSU, Zotac 1GB GTX 460.

Currently overclocked to 4114 MHz @ 1.334v. It seems very stable, but the box occasionally restarts or boots slowly. For example, I was just running a Prime95 blend torture test; it ran for about 10 hours without an error, with the CPU at ~67°C, max ~73°C. Suddenly, the PC rebooted. The only errors in the logs (including the Prime95 results txt) are disk error 7 in Event Viewer (several apparent bad sectors) and a kernel power error (but that's probably there from the reboot).

So the question is: are there really bad sectors on the drive, which somehow cause a reboot during a torture test, or is there some underlying instability in the overclock that is showing up as an apparent drive error?

The real experiment would be to swap HDs but I am not particularly looking forward to a reinstall of OS, apps, games, etc...

Does anyone have experience where an O/C caused disk errors or is this really a bad disk?

Thoughts?
 

PreferLinux

Senior member
Dec 29, 2010
420
0
0
Probably your PSU. Rosewill are not very good, actually. I have had similar behaviour before, and it was the PSU.
 

ScottGridley

Junior Member
Jan 28, 2011
2
0
0
Thanks for the input, all. Much appreciated.

I've made progress and have eliminated the disk errors (I hope! - can only say that until it happens again, right?).

I don't believe it's the RAM at this point, as I've tortured it a fair bit and still get 0 errors on Prime95. I'm going to give IBT a shot over the weekend to really test it out...

I backed off the QPI voltage a bit, as it had somehow gotten a little far from my CPU voltage :\ (now 1.40v and 1.336v, respectively), and pulled the QPI link speed back to 6254 MT/s (about 6.25 GT/s). That seemed to do the trick.

I'll let you all know if the problem creeps back and I isolate a specific component, but all looks good currently. [crossed fingers emoticon]
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,286
16,123
136
RAM or PSU, or a combination of the two. I didn't see what speed you are running the RAM at, but in my experience you shouldn't even run your RAM at its rated speed when overclocking.
 
Nov 26, 2005
15,194
403
126
A copy and paste from SpikeSoldier:

"basically with nehalems according to intel, you never want vdimm - vtt = .5v or more.

i.e.

vtt 1.21v
vdimm 1.65v

1.65 - 1.21 = .44 ok


vtt 1.21v
vdimm 1.75v

1.75 - 1.21 = .54 too high


vtt 1.33v
vdimm 1.75v

1.75 - 1.33 = .42 ok


this difference of vtt and vdimm voltage is what causes the memory controller on nehalems to become 'fragile' to vdimm increases when using custom voltages."

Again, props to SpikeSoldier for this :thumbsup:
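
For anyone who wants to sanity-check their own settings against the rule quoted above, here is a minimal Python sketch. It only restates the arithmetic from the quote: the 0.5v limit is the figure SpikeSoldier gives, and the function names (`vdimm_vtt_delta`, `is_within_limit`) are made up for illustration, not taken from any tool. Treat it as a rough check, not as an Intel specification.

```python
# Rough check of the "vdimm - vtt must stay under 0.5v" rule quoted above.
# The limit value comes from the quoted post; the helper names are hypothetical.

MAX_DELTA_V = 0.5  # quoted limit: a difference of 0.5v or more is too high


def vdimm_vtt_delta(vdimm, vtt):
    """Return vdimm - vtt in volts, rounded to avoid float noise."""
    return round(vdimm - vtt, 3)


def is_within_limit(vdimm, vtt, limit=MAX_DELTA_V):
    """True when the difference is strictly below the quoted limit."""
    return vdimm_vtt_delta(vdimm, vtt) < limit


if __name__ == "__main__":
    # The three (vtt, vdimm) pairs from the quoted examples.
    for vtt, vdimm in [(1.21, 1.65), (1.21, 1.75), (1.33, 1.75)]:
        delta = vdimm_vtt_delta(vdimm, vtt)
        verdict = "ok" if is_within_limit(vdimm, vtt) else "too high"
        print(f"vtt {vtt:.2f}v, vdimm {vdimm:.2f}v: delta {delta:.2f}v -> {verdict}")
```

Plug in the vtt (QPI/DRAM voltage) and vdimm values from your own BIOS to see which side of the line you are on.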