Unusual Q6600 OC system and thoughts

metaforest

Junior Member
Apr 8, 2008
7
0
0
This looked like a good place to drop anchor for a few and see what I can learn.

I am a noob on OC, and may have made life more challenging by attempting to stuff a fairly aggressive system into a 2U rack case....

The system had first boot 3 days ago and I am now evaluating what it can be pushed out to.

Initial attempts to push the FSB beyond 404 were a total fail, even after dropping the memory strap to the 800MHz range... it just won't pass memtest86.

I'm not running Windoz on this box, so the usual apps and verification don't really apply to this setup. Currently I am running Ubuntu Server 7.10, and as I write this the system has been running mprime for over two hours without a hitch. An earlier run today crapped out after 3 hours, so after some tinkering I decided to jack up the FSB voltage to 1.26 and lower the FSB to 403. I have reason to believe that this last tweak will get me to a stable system.

I have seen some noise about running at frequencies slightly over 400FSB, and I am not clear on what the lower-total-bandwidth issue actually is... My memory is slightly OC'd with this strapping, but it passes memtest and doesn't seem to be overheating.
If it turns out that running at 400FSB provides better overall bandwidth then I can focus on reducing temps for a 3.60 GHz OC.

While I have no doubt I could have built this system in a standard case, with just air cooling and probably with fewer headaches, I did manage to stuff the whole thing into a 2U case.... Water cooling was mandatory; there's no room in the case for large heatsinks or quiet 120mm fans.

The only fans in use inside the case are the internal power supply fan and a 40mm fan foam-taped next to one end of the memory sticks for cross-flow cooling. The SB, NB, CPU, HDD, and both regulator banks on the ASUS P5E have water blocks. The radiator is a Koolance 1KW bare unit mounted externally. Originally the plan was to use simple fanless convection cooling, but under early testing the coolant temp was getting too high, so a single 120mm fan provides enough additional flow through the radiator to keep the coolant at a good working temp.

As you might guess there's no room in a 2U case for a decent graphics card for this intentionally headless system, so a $7 ATI Rage 8MB PCI provides the essential console video services.


 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,410
16,259
136
Welcome to the forums !

As for your OC, 3.6 is very good (I do see you have water cooling)
 

Syzygies

Senior member
Mar 7, 2008
229
0
0
Yes, 3.6 GHz is very nice. I stopped at 3.2 GHz on air (I did try 3.6 GHz); the same reasoning says that going higher would bring more headaches than extra performance.

I'm also running Ubuntu Server, for scientific work (math, actually), and I too wondered about the 400 FSB recommendations. We've clearly been studying the same articles.

I take these statements as summaries of collected experience, not gospel. I've run many trials on computations that matter to me, and used linear regressions to attempt to make sense of the data (I'm not yet ready to post what I've found). Yes, there is an effect there, but it may not be dominant for any particular person. Another effect that stuck out in my data: A memory multiplier of 2.4 does take a hit, compared to 2 or 2.5 or 3.

Once one settles on a cpu clock speed (3.2 GHz or 3.6 GHz, in our cases), there is a finite point cloud of realistic options for how we set the rest of our board: FSB, memory divider, tRD, CAS, etc. General assertions from collected experience are instead statements that make sense if that point cloud were a solid body. Then one would simply follow the dictums and know which corner of the solid body to choose. Alas, in our particular situation, we might not have a point in our point cloud close enough to that corner to follow this advice! Instead, a very different point might win, for reasons quite peculiar to our circumstances.

(This is a classic mental image for those who know it, "linear programming" versus "integer programming". This is what the "Operations Research" engineering department does, at any university. It also goes by the shorter name of "optimization". Big surprise, that's what we're trying to do, it makes sense we could learn from the people who make a living doing this!)

The good news is that it is very easy to simply list all the points in our personal point cloud, once we have chosen a target cpu speed, e.g. 3.6 GHz, then run timing trials for a computation that personally matters to us, to tune our systems.

A quirk here is that nudging the clock speed, e.g. down to 3.59 GHz, might change the point cloud exposing a new point that offers better memory performance, offsetting the loss in cpu speed. But cpu clock speed is by far the dominant term, you're messing around with the rest of this to gain a small fraction of extra performance. Admit it, it's a hobby/obsession/call it what you will. You could just pick values that work.
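As a rough sketch of what "listing the point cloud" could look like: the snippet below enumerates combinations of CPU multiplier, FSB, and memory multiplier that land near a target CPU clock. It uses the relations implied in this thread (CPU clock = multiplier x FSB, DDR rate = FSB x memory multiplier). The function name, the FSB search range, the 6x to 9x multiplier range, and the DDR ceiling are all assumptions chosen for illustration, not values from any board's BIOS; timings such as tRD and CAS are left out.

```python
from itertools import product


def point_cloud(target_mhz, tolerance_mhz=20,
                cpu_multipliers=(6, 7, 8, 9),        # assumed range; 9x is the Q6600 cap
                fsb_values_mhz=range(266, 451),      # assumed FSB search range
                memory_multipliers=(2.0, 2.4, 2.5, 3.0),
                max_ddr=1100):                       # assumed ceiling for the memory
    """Return every (cpu_mult, fsb, mem_mult, cpu_mhz, ddr) near the target clock."""
    points = []
    for cpu_mult, fsb, mem_mult in product(cpu_multipliers,
                                           fsb_values_mhz,
                                           memory_multipliers):
        cpu_mhz = cpu_mult * fsb      # CPU clock in MHz
        ddr = fsb * mem_mult          # effective DDR rate
        if abs(cpu_mhz - target_mhz) <= tolerance_mhz and ddr <= max_ddr:
            points.append((cpu_mult, fsb, mem_mult, cpu_mhz, ddr))
    # Highest CPU clock first, then highest memory rate.
    return sorted(points, key=lambda p: (p[3], p[4]), reverse=True)


if __name__ == "__main__":
    for cpu_mult, fsb, mem_mult, cpu_mhz, ddr in point_cloud(3600):
        print(f"{cpu_mult}x {fsb} FSB, memory x{mem_mult}: "
              f"{cpu_mhz} MHz CPU, DDR {ddr:.0f}")
```

Each row is one candidate to boot, stress test, and time with a computation that matters to you. Widening `tolerance_mhz` is the "nudge the clock speed" quirk: it lets slightly slower CPU clocks into the list in case one of them exposes a better memory setting.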
 

metaforest

Well it's not stable yet :frown:

And though that 3.65 GHz with slightly over clocked memory looks good, it really just doesn't add much practical value...

It appears that I am experiencing a thermal issue. The system will run fine for about an hour or two **under heavy loads** and then spontaneously hardware-reset.

I have been trying to figure out what is going sideways. Monitoring lm-sensors seems to indicate that one or more of the coretemp values are getting to 70+C at the die during stress testing.
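For anyone wanting to log this instead of eyeballing it, here is a minimal sketch that pulls the per-core readings out of text in the style printed by the `sensors` command and flags cores at or above a limit. The sample text is illustrative only (the exact layout varies between lm-sensors versions), and the function names and the 70C limit are assumptions for this example.

```python
import re

# Illustrative sample only; real `sensors` output differs between versions.
SAMPLE_SENSORS_OUTPUT = """\
coretemp-isa-0000
Core 0:      +73.0°C  (high = +80.0°C, crit = +100.0°C)
coretemp-isa-0001
Core 1:      +67.0°C  (high = +80.0°C, crit = +100.0°C)
coretemp-isa-0002
Core 2:      +70.0°C  (high = +80.0°C, crit = +100.0°C)
coretemp-isa-0003
Core 3:      +63.0°C  (high = +80.0°C, crit = +100.0°C)
"""

# Matches lines such as "Core 0:   +73.0°C" or "Core 0:   +73 C" at line start.
CORE_LINE = re.compile(r"^Core\s+(\d+):\s*\+?(-?\d+(?:\.\d+)?)\s*°?C", re.MULTILINE)


def core_temps(sensors_text):
    """Map core number to its reported temperature in degrees C."""
    return {int(core): float(temp) for core, temp in CORE_LINE.findall(sensors_text)}


def cores_at_or_above(sensors_text, limit_c=70.0):
    """Return the sorted core numbers whose reading is at or above limit_c."""
    temps = core_temps(sensors_text)
    return sorted(core for core, temp in temps.items() if temp >= limit_c)


if __name__ == "__main__":
    print(core_temps(SAMPLE_SENSORS_OUTPUT))
    print(cores_at_or_above(SAMPLE_SENSORS_OUTPUT))
```

In practice the text would come from running `sensors` periodically during the stress test and appending the parsed values to a log, so the last readings before a reset are preserved.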

The MOBO CPU Temp is Peaking at around 58 before the system reboots. The MOBO temp is a cool 25 to 30... not much heat hanging about when the water carries it all off :thumbsup:

I'm running sys_basher with four threads.... at the point where the memory tests begin bank pair tests (forcing slightly faster memory transactions) the whole system reboots... This is not a kernel panic, it's hardware-level, as if the power on the system had just been cycled.

I've backed the system down to 3.60 GHz 400 FSB and found that the issue still happens... it just takes a little longer to get there....

And if I kick off a four-thread sys_basher run while mprime is running, mprime errors out within a couple of minutes.

Now from doing further reading it may be that I have crummy thermal grease messing with my thermal conductivity hence the spiky thermal activity at the die..

I'm also considering probing my PSU...

Any thoughts?
 

Markfw

I think 3.65 is too high. With several of my systems, the heat was fine and the voltages kept getting turned up, but it just didn't want to run that fast. Turn it down a few notches and see what happens. I did, and it solved my reboot problems.
 

metaforest

It might be wise of me to run it stock for an extended stress cycle to see if I might have a gremlin scampering around in the case. Running it up to 3.65 was largely a stab at the sky, anyway.

So after I prove the build, I'll give it another go in the 3.6GHz range. I also feel pretty lucky here... I have a b0 stepping.... so far I have yet to see any other OC confirms of b0 above about 3.4GHz...
 

Syzygies

Originally posted by: metaforest
Any thoughts?
Some of my 400 FSB overclocks worked at first, but don't work now. I'm baffled, but I found a different "sweet spot" for my 3.2 GHz overclock, 9x 356 FSB with a memory multiplier of 3, 1068 DDR at relaxed 5-5-5-18 timings. Because the Q6600 can't use a multiplier over 9x, you're forced to a 400 FSB to reach 3.6 GHz.

I'm on a different board, but what inexplicably remains stable for me is 400 FSB with a 2.5 memory multiplier, for 1000 DDR at 5-5-5-18 timings. This is more or less equivalent to the stock 800 DDR at 4-4-4-15 timings, yet it's stable. As I said, I'm baffled. It's either my memory or my MCH; one or the other drifted a bit while burning in.

My timings are a hair better with a 356 FSB, and I'm now suspicious of my mobo at 400 FSB, so I'm sticking with 356 FSB. But my "thoughts" are to try all combinations including relaxed, overclocked memory, looking for a stable combo.
 

metaforest

Well RL intrudes as usual... I'm looking at a stable system in the 3.4199 GHz range give or take a strap.

I am confident some better thermal paste will push it back up by a comfortable margin...

It may also have been unwise to design a single-threaded cooling system.... but hey, it was a heck of a lot more cost effective... and if I get a 1.425x improvement over stock, maybe it was worth it for the education :}

Per previous question it passes muster stock so I'm not dealing with any gremlins save my own :)

It's still stuffed into a 2U case :D


 

metaforest

Confirmed run @ 3.51 GHz

I'll have to find a location to post the piccy with the Prime95 Results

But for them that doesn't need the proof it was 8+ hours on the default setting that generates the most heat.

The fix was offsetting vdroop and adding a second 120mm fan to the radiator to get coolant equilibrium down around 31C.

I'm a little disappointed that the VCore supply on the ASUS P5E droops ~0.11V under load... And that means I'm running Idle at VCore=1.47 >_<
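The droop arithmetic, as a small sketch: with about 0.11V of droop, an idle reading of 1.47V implies roughly 1.36V under load. That load figure is derived here from the two numbers above, not measured, and the function names are made up for illustration.

```python
def load_vcore(idle_vcore_v, droop_v):
    """Estimated Vcore under load, given the idle Vcore and the droop."""
    return idle_vcore_v - droop_v


def idle_vcore_needed(target_load_vcore_v, droop_v):
    """Idle Vcore that has to be set so the load Vcore lands on target."""
    return target_load_vcore_v + droop_v


if __name__ == "__main__":
    # Numbers from the post: ~0.11V droop, 1.47V at idle.
    print(round(load_vcore(1.47, 0.11), 2))
    print(round(idle_vcore_needed(1.36, 0.11), 2))
```

The second function is the annoying direction: whatever voltage the chip needs under load, the idle setting has to sit a full droop above it.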

OCCT helped me a lot to dial in good values.

Now I just have to get the temps down a bit before I get true stability under OCCT
(my one hour run under OCCT passed, but the screen update/data recording thread crashed 40 minutes into the run... so I consider that to be a big, fat, hairy, FAIL)

And I don't think I'll get it to run at 3.6GHz.... I think it would with a bit more VCore.... However, I suspect the integrated heat spreader on the chip package just can't move the heat off the die fast enough. If I assume the hype about AS5 is true, an additional 5 - 7C ΔT is not going to do enough to keep the die below the redline for this particular chip.

Sooo Checkmate at 3.51GHz

*edit* after reading the thread on lapping.... maybe I can get that ΔT.... maybe not...

 

metaforest

Well the new system holds fire @ 3.6GHz now that I have applied Arctic Silver 5 to the CPU and NB.

Per the advice above I am going to back it off to 3.51 GHz so that it remains stable even on a hot day.

OCCT stats:

2 hrs mixed mode testing:

Core temps: 73 67 70 63 (peaks@load)
MOBO CPU Temp : 52
MOBO temp : 37

Coolant temp: 34

Vcore: 1.48 (idle) 1.42 (load)
FSB = 400MHz

NOTES: Significant PSU sag in 5V, 3.3V and 12V under load...
Arctic Silver was a huge win: it dropped CPU core temps by a conservative 15C @ 3.51 GHz...

Earlier runs at 3.51 GHz were stable, but the cores were running at around 68C average.
After AS5 the cores were averaging 48C and peaking at 52C.... much better than I expected.

Adding a second 120mm fan to the radiator was mandatory with the improved CPU and NB heat conduction from AS5. Without the second fan I was seeing thermal runaway: I terminated tests early when the coolant temp rose above 50C and was still climbing. That didn't happen before applying AS5. Before AS5 the coolant would hold at 38C under load.
Three fans would be "bomb proof" per Koolance kits... go figure :D

Measuring coolant temp at the HDD proved very useful, as it is the first thermal device to see coolant after the radiator. The exception to that is at cold idle: the HDD heated up faster than the CPU, giving the odd result of a CPU at 13 when the HDD was reading 20. Under load the HDD temp tracked the coolant temp within 2C. (How do I know that? I also added an in-line temp sensor read from a DVM, which is not really part of the built system... pain in the butt, really, as I had to CALCULATE the temps by hand from the resistance.) Go figure...
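The hand calculation could be scripted along these lines. The post does not say what kind of sensor this is, so the sketch ASSUMES, purely for illustration, an NTC thermistor following the Beta equation, with a nominal 10 kOhm at 25C and Beta = 3950 K. Those constants are placeholders; the real ones would come from the sensor's datasheet.

```python
import math


def ntc_temp_c(resistance_ohm, r0_ohm=10_000.0, t0_c=25.0, beta_k=3950.0):
    """Convert a thermistor resistance to degrees C with the Beta equation.

    1/T = 1/T0 + ln(R/R0) / Beta, with T and T0 in kelvin.
    The defaults (10 kOhm at 25C, Beta 3950 K) are assumed placeholder values.
    """
    t0_k = t0_c + 273.15
    inv_t = 1.0 / t0_k + math.log(resistance_ohm / r0_ohm) / beta_k
    return 1.0 / inv_t - 273.15


if __name__ == "__main__":
    # Hypothetical DVM readings in ohms, just to exercise the function.
    for reading in (12_000.0, 10_000.0, 8_000.0, 6_500.0):
        print(f"{reading:8.0f} ohm -> {ntc_temp_c(reading):5.1f} C")
```

For an NTC part the resistance falls as the coolant warms up, so a reading below the nominal resistance means a temperature above 25C.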

Lessons learned:

1. Put the CPU on its own cooling circuit with 10mm hoses, and set the rest up on 6mm. Current system uses 6mm hoses for the entire system except the radiator, which is 10mm.

2. AS5 mandatory... 'nuff said

3. Memory cooling via air is good enough.... attempts to water cool the RAM created serious plumbing fit issues (read: Koolance FAIL) and were not effective at all for this system.

4. OCCT 2.0.0a provides high quality information about the progress of the OC process. HOWEVER, OCCT does not seem to be as picky about subtle ALU errors as Prime95.... so... I advise that stability is best cross-checked with Prime95 after a successful OCCT run. Bottom line: if you pass an OCCT 2 hour run you have it 95% dialed in; Prime95 will prove it. [IMO]

5. While some components from Koolance are kickass (the CPU-300 block, for example), their other aux blocks and their mechanical hardware are more trouble than they are worth. Their plumbing hardware is (mostly) first rate, so I have mixed feelings about going with Koolance for this build. I'll be looking at other options for future builds.

6. Next time no ASUS... the P5E is ok... but I could have done a LOT better for the same price, with more features. This PC HW thang is new to me, so I have a lot to learn about reviews and word-on-the-street for this stuff...

7. While this system runs stable at 3.6GHz.... I'll run it at 3.51 to avoid the headaches and the saggy PSU (I don't have many options for a replacement in a 2U compatible form factor).

Shout outs:
Special thanks to Syzygies for the philosophical/light touch input on my project.... It helped a lot, thanks :D