Prime Small FFT and Large FFT tests run overnight but...

Dadofamunky

Platinum Member
Jan 4, 2005
2,184
0
0
Following IDC's advice from elsewhere, I came up with a pretty good collection of OC settings and tested them heavily on the system in my sig, using Prime 95 Small FFT to push the CPU OC, and Prime Large FFT to push the surrounding glue logic/northbridge.

I get strong overnight runs (3X so far) on Small FFT, but all have errored out at 12-13 hours.

On Large FFTs, it's all over the map. I've gotten two runs that errored out at 8 hours, and a couple more that errored out at about 30 minutes!

The thing runs like a tank now that I'm running XP 64-bit. Can't crash this thing running my normal workload to save my life. I'm just hoping to find the settings that will enable me to 'pass' Large FFT testing along with Small FFT testing.

My settings are:

VCore 1.3750
9x/460 FSB
TRFC 42 (highest it will go on this mobo)
FSB Strap to NB - Auto
Clock Overcharging Mode - Auto
CPU Voltage Reference - 0.63x
Voltage Damper disabled
PLL voltage - Auto
FSB Termination Voltage - Auto
NB Voltage - 1.41V
NB Voltage Reference - 0.67x
Southbridge voltage - Auto
Temps at idle - 36-37C
Temps under load - 60-62C
C1E and EIST disabled
Voltage at idle - 1.336V
Voltage in load - 1.272/1.28V
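
For anyone reading along, the gap between the BIOS VCore and the two measured voltages above can be worked out quickly. This is only a sketch of the arithmetic using the numbers from the list; the variable and function names are made up for illustration and do not come from any monitoring tool.

```python
# Voltages taken from the settings list above (volts).
vcore_bios = 1.3750
v_idle = 1.336
v_load_readings = (1.272, 1.28)


def drop_mv(reference_v, measured_v):
    """Difference between a reference voltage and a measured one, in millivolts."""
    return round((reference_v - measured_v) * 1000)


idle_offset = drop_mv(vcore_bios, v_idle)
load_drops = [drop_mv(vcore_bios, v) for v in v_load_readings]
idle_to_load = [drop_mv(v_idle, v) for v in v_load_readings]

print(f"Idle offset from BIOS setting: {idle_offset} mV")
print(f"Load drop from BIOS setting: {min(load_drops)}-{max(load_drops)} mV")
print(f"Idle-to-load sag: {min(idle_to_load)}-{max(idle_to_load)} mV")
```

The idle-to-load sag is the part that the Voltage Damper setting would be expected to affect.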

I think in some ways I'm limited by my motherboard.

Anyway, some of these settings are a complete mystery, particularly the Voltage Reference settings and the Clock Overcharging Mode. (Of course, the Asus manual is no help either.) This makes it difficult, if not outright risky, to play with some of these settings, as I'm unsure of their effects. I've also been reading Gillbot's sticky with great interest, but it doesn't fully answer my issue.

So, without spending another month testing individual settings combos, can anyone help me solve this problem?
 

dust

Golden Member
Oct 13, 2008
1,339
2
71
You still have a lot of volts on auto. That's the first thing you should change IMO. Once you find the lowest volts possible at stock speed (you can probably leave the GTL volts on auto at the beginning), try to push a mild overclock only by upping the FSB and see what's the maximum you can achieve with the stock voltage on everything. Beyond that you have to raise the volts in small increments, step by step, running Prime all the way up.

The sticky on this forum also suggests that various Prime errors can point to one or another voltage (FSB, DRAM, NB, Vcc), depending on how many cores are failing, which cores are failing, etc.

It is a long process I'm afraid, but you have to go through it. I really don't believe you'll get your rig stable with your volts on auto.
 

roid450

Senior member
Sep 4, 2008
858
0
0
Yeah, you really should be manually setting all those Auto settings; Auto is a lot of times not enough. Like Drivenbyvoltage said, it's long to read through the sticky, but it pays off. At first I started reading and gave up after 5 minutes. Then I came back and dedicated myself to the walkthrough and reading more. It paid off after a few weeks of tinkering.
 

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
Originally posted by: Dadofamunky
VCore 1.3750

Voltage Damper disabled
PLL voltage - Auto

Voltage at idle - 1.336V
Voltage in load - 1.272/1.28V

Temps under load - 60-62C

C1E and EIST disabled
Temps at idle - 36-37C

Enable voltage damper and your small FFT fails will go away, provided you don't see too much heat being generated (in which case you may find yourself backing down the Vcc in the BIOS with Vdamper enabled, which is a *good* thing).

Also hopefully you can re-enable C1E and EIST once you get this rig prime stable, that will bring down your idle temps.

Originally posted by: Dadofamunky
CPU Voltage Reference - 0.63x

NB Voltage Reference - 0.67x

Are you manually specifying these reference multipliers? Any reason you don't set them to AUTO in the BIOS? I leave these to auto and don't touch them.

Originally posted by: Dadofamunky
TRFC 42 (highest it will go on this mobo)
FSB Strap to NB - Auto

NB Voltage - 1.41V

Your NB may be too hot, or too low in voltage, or both for your 460MHz FSB, thus the large FFT errors.

What is your Vdimm? And what is your ram clocked to? Large FFT errors can be MCH (memory controller hub) located in your NB, can be FSB (located in your NB) or can be ram (located in the dimms).
 

BonzaiDuck

Lifer
Jun 30, 2004
15,730
1,457
126
I can't speak with any authority about this, but if -- as you say -- errors are " . . . all over the map . . . " it seems uncharacteristic of hardware failure alone. I'd look at the temperature factor, such as Idontcare suggests.

Here I also have my own question, because I never found any mention of it. He's using PRIME95, and some of his test runs are long enough that you think the elapsed time would be sufficient.

How many iterations of IntelBurnTest or Linpack are considered to guarantee rock-solid stability? The IBT program urges "at least five . . . " I've thought that 20 or 25 would be more than sufficient, without over-cooking the silly-cone. But I really haven't seen mention of anything confidently substantive about an adequate number of iterations.

It only seems intuitive to me that the sheer amount of RAM -- 8GB -- adds another factor that appears as "random" failure.
 

Dadofamunky

Originally posted by: Idontcare

What is your Vdimm? And what is your ram clocked to? Large FFT errors can be MCH (memory controller hub) located in your NB, can be FSB (located in your NB) or can be ram (located in the dimms).

VDIMM is set to 2.1V, which is the recommended for this particular RAM at DDR2-1000 speeds. At DDR2-800, G.Skill states the sticks can run at 1.8V. RAM is now DDR2-920.

I think the voltage reference settings are probably better left to Auto.

Originally posted by: roid450
Yeah, you really should be manually setting all those Auto settings; Auto is a lot of times not enough. Like Drivenbyvoltage said, it's long to read through the sticky, but it pays off. At first I started reading and gave up after 5 minutes. Then I came back and dedicated myself to the walkthrough and reading more. It paid off after a few weeks of tinkering.

I've gone through the sticky already; it was helpful, but I suspect again that my mobo limits me to some degree.
 

Gillbot

Lifer
Jan 11, 2001
28,830
17
81
Got a fan on your NB sink? If not, you may want to add one and get one on your ram too.
 

Idontcare

Originally posted by: Dadofamunky
Originally posted by: Idontcare

What is your Vdimm? And what is your ram clocked to? Large FFT errors can be MCH (memory controller hub) located in your NB, can be FSB (located in your NB) or can be ram (located in the dimms).

VDIMM is set to 2.1V, which is the recommended for this particular RAM at DDR2-1000 speeds. At DDR2-800, G.Skill states the sticks can run at 1.8V. RAM is now DDR2-920.

OK, thanks for the added info.

Gillbot's suggestion is worth considering. Maybe even for just the duration of doing some tests you could "ghetto" style a case fan with some zip-ties or wire-ties to direct some moving air towards your NB heatsink and ram.

So here's my recommendation: we need to isolate your errors as being due to FSB (NB), ram, or CPU.

We do this piecemeal-wise, and we do it in this order: Ram first, NB/FSB second, CPU last.

Here's what you do: first, find your ram's max clockspeed and the Vdimm it seems able to operate at. We do this while keeping both the CPU and the FSB/NB at speeds we know aren't going to cause errors, i.e. put that CPU and FSB back to stock in all aspects.

Then start changing your ram multiplier (1:1, 5:6, etc) only while using memtest86+ and large FFT prime 95 to find the max stable overclock for the heat and Vdimm you are willing to push them with.

Hopefully you find them to be stable at 2.1V and DDR2-1000, meaning you have margin when you are clocking them at 920MHz. At this point you know when your ram will be holding you back so you don't waste time going there with FSB overclocks.
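
A rough sketch of the divider arithmetic behind stage one, for reference. It assumes the usual convention that the DDR2 rating is twice the memory clock and that the divider is written FSB:DRAM; the list of dividers and the DDR2-1000 limit are illustrative placeholders, not values read from this board or measured on these sticks.

```python
from fractions import Fraction


def ddr2_rating(fsb_mhz, fsb_to_dram):
    """Effective DDR2 rating for an FSB clock and an FSB:DRAM divider such as '5:6'."""
    fsb_part, dram_part = (int(x) for x in fsb_to_dram.split(":"))
    mem_clock = Fraction(fsb_mhz) * dram_part / fsb_part
    return float(mem_clock * 2)  # DDR transfers twice per memory clock


def headroom_table(fsb_mhz, max_stable_rating, dividers=("1:1", "5:6", "4:5", "2:3")):
    """One row per divider: (divider, resulting rating, margin to the tested limit)."""
    rows = []
    for divider in dividers:
        rating = ddr2_rating(fsb_mhz, divider)
        rows.append((divider, rating, max_stable_rating - rating))
    return rows


# 460 MHz FSB, with DDR2-1000 assumed to be the highest rating that passed testing.
for divider, rating, margin in headroom_table(460, 1000):
    status = "ok" if margin >= 0 else "beyond tested limit"
    print(f"{divider}: DDR2-{rating:.0f} ({margin:+.0f} margin, {status})")
```

With these assumptions only 1:1 keeps the ram inside its tested range at 460MHz FSB, which is why the later stages pin the ram at 1:1.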

Second, we find the FSB/NB max clocks and voltage. We don't want the CPU getting overclocked and throwing instability into this testing, so leave everything about the CPU on AUTO except the multiplier; intentionally set that to the lowest value the BIOS allows (should be 6x). Set your ram multiplier to 1:1 (called sync in some BIOSes) to ensure the ram is not going to be unstable, and set your Vdimm to the value that was stable at the highest clocks you found in the stage one testing above.

Now overclock your FSB clockspeed, increasing it in sizable chunks (50MHz or so) every iteration and then test system stability with Large FFT. Don't waste your time with small FFT or memtest here. Once you find the FSB clockspeed and voltage you are comfortable operating while knowing the rig is stable then you are done with the second stage.

Hopefully you find your FSB is stable up to 480-500MHz given that you want to operate at 460MHz. If the FSB isn't stable at 460 then you've isolated your problem right there.
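
The stage two stepping can be sketched as a simple loop. This is a minimal illustration, not a tool: the `passes_large_fft` callback stands in for actually rebooting at the new FSB and running Large FFT, and the 330MHz starting point and the simulated 480MHz limit are placeholders.

```python
def highest_passing_fsb(start_mhz, step_mhz, ceiling_mhz, passes_large_fft):
    """Step the FSB upward and return the last clock that passed, or None."""
    last_good = None
    fsb = start_mhz
    while fsb <= ceiling_mhz:
        if not passes_large_fft(fsb):
            break  # first failure ends the climb
        last_good = fsb
        fsb += step_mhz
    return last_good


# Stand-in for a real test run: pretend this board is stable up to 480 MHz.
def simulated_large_fft(fsb_mhz):
    return fsb_mhz <= 480


target = 460
limit = highest_passing_fsb(330, 50, 550, simulated_large_fft)
print(f"Last passing FSB: {limit} MHz")
if limit is not None and limit >= target:
    print("Target FSB is inside the tested range")
else:
    print("FSB/NB looks like the bottleneck")
```

Coarse steps only bracket the limit between the last pass and the first fail, so a finer pass near that point is still needed before settling on a number.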

Third we find the CPU max clocks and voltage. Again we don't want the ram or the NB to be the cause of any instabilities we uncover, so set ram multiplier to 1:1 and Vdimm to the value you needed to reach peak ram overclocks as well as set all your NB/FSB settings (except actual FSB clockspeed) to the values you determined you needed in stage 2 above to be stable while hitting those peak FSB clocks.

Now set your CPU multiplier manually to its max value and start testing at elevated FSB clocks, checking stability with small FFT only. Presumably memtest and large FFT are passing at these lower FSB clockspeeds, based on them passing in stages one and two above.

As you increase FSB you will naturally need to manually increase CPU Vcc. Make sure you have the Voltage Damper enabled (not AUTO, not disabled). Step through the same FSB increments you used in stage 2 above.

At some point in these three stages you are going to find something in your rig goes unstable below its current overclocked clockspeeds. It is a laborious process, but the orthogonal testing nature allows you to isolate the weakest link component that is holding back your OC.
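
To keep the three stages straight, they can be written down as data. This is just one way to summarize the procedure above; the class and function names are invented for the sketch, and the `max_stable` numbers are hypothetical results, not measurements from this rig. The targets are the 9x460 overclock with the ram at DDR2-920.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    vary: str          # the one thing being pushed in this stage
    hold_fixed: tuple  # everything pinned so it cannot cause the errors
    tests: tuple       # stress tests used in this stage


PLAN = (
    Stage("1 (RAM)", "ram multiplier and Vdimm",
          ("CPU at stock", "FSB/NB at stock"),
          ("memtest86+", "Prime95 Large FFT")),
    Stage("2 (FSB/NB)", "FSB clock and NB voltage",
          ("CPU multiplier at minimum", "ram at 1:1", "Vdimm from stage 1"),
          ("Prime95 Large FFT",)),
    Stage("3 (CPU)", "FSB clock and Vcc at max multiplier",
          ("ram at 1:1", "Vdimm from stage 1", "NB/FSB settings from stage 2",
           "Voltage Damper enabled"),
          ("Prime95 Small FFT",)),
)


def limiting_components(max_stable, target):
    """Names of components whose measured limit is below what the target OC needs."""
    return [name for name, limit in max_stable.items() if limit < target[name]]


# Targets: DDR2 rating, FSB MHz, CPU MHz (9 x 460 = 4140).
target = {"RAM": 920, "FSB/NB": 460, "CPU": 4140}
# Hypothetical stage results, only to show how the comparison reads.
max_stable = {"RAM": 1000, "FSB/NB": 450, "CPU": 4200}

for stage in PLAN:
    print(f"Stage {stage.name}: vary {stage.vary}; test with {', '.join(stage.tests)}")
print("Holding back the OC:", limiting_components(max_stable, target))
```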
 

Dadofamunky

Originally posted by: Gillbot
Got a fan on your NB sink? If not, you may want to add one and get one on your ram too.

Unfortunately, there isn't room for one in my case, being a uATX unit. It's another reason why if at some point I decide to go quad-core, I'd give up and go with an ATX case and mobo.

Originally posted by: Idontcare
So here's my recommendation: we need to isolate your errors as being due to FSB (NB), ram, or CPU.

I'm going to start going through this process tonight. Gak. It's a good thing the BIOS has Profile settings. At least this also gives me a good reason to use the newest version of MemTest.

Originally posted by: Idontcare
Maybe even for just the duration of doing some tests you could "ghetto" style a case fan with some zip-ties or wire-ties to direct some moving air towards your NB heatsink and ram.

LOL. Already done that. Still in the process of using a Dremel on the case so I can make it look a little less ghetto. :) I can also recommend a decent fan controller to anyone who doesn't already have one (not you, IDC.)
 

Gillbot

Originally posted by: Dadofamunky
Originally posted by: Gillbot
Got a fan on your NB sink? If not, you may want to add one and get one on your ram too.

Unfortunately, there isn't room for one in my case, being a uATX unit. It's another reason why if at some point I decide to go quad-core, I'd give up and go with an ATX case and mobo.

There's always room for one. You can get a small 40-60mm one and just set it in the area to create a breeze of air over the components.
 

Dadofamunky

Or get a little 25mm one and lay it on top of the NB heatsink...

I've gotten the Small FFT to run continuously for 15-16 hours, after enabling C1E again. I think that side of things is set. Temps are a bit better too.

On the Large FFT side, I've bumped the NB voltage up a couple notches to 1.45V. I checked Intel's ICH9R documentation and it appears that the max voltage is 1.55V with the nominal being 1.45. So now the Large FFT has run for an hour and is going into an overnight run attempt. Perhaps running 4 DIMMs just makes the NB require a little more juice than normal. We shall see... The next step should this fail is to go through the isolation testing. Which will take weeks. :)
 

Idontcare

Originally posted by: Dadofamunky
Perhaps running 4 DIMMs just makes the NB require a little more juice than normal. We shall see...

Yes, that is absolutely true. The more dimms you populate, the higher the capacitance (electrically speaking) of the dimm/MCH/FSB setup, which in turn means your voltage oscillations (peak to trough) during periods of memory access are going to be that much larger. That reduces your signal/noise ratio and, at some point, makes the system unstable.

The solution is to increase the voltage and/or reduce the temperatures.

Originally posted by: Dadofamunky
The next step should this fail is to go through the isolation testing. Which will take weeks. :)

Isolation testing can take weeks but it need not take weeks. Perhaps I left the wrong impression. When you do isolation testing you only worry about quick-and-dirty stability tests in the early testing, as you want to quickly iterate towards your hardware's limitations.

For instance, with the CPU testing (stage 3), the first thing I'd do at stock is drop that Vcc a few tenths of a volt and test with small FFT in 5-10min durations until I get Vcc so low that I get a worker death (not BSOD) within 5-10min. Then bump Vcc one notch; if it passes small FFT for 5-10min (really I just wait to see if it passes test3 of the small FFT test suite), bump your FSB up and start again.

This will give you a rough feeling for how much Vcore your chip needs as you scale to higher GHz, it represents a lower-limit to what you would need if you wanted to be 24hr prime95 stable.
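
The quick-and-dirty Vcc mapping can be sketched like this. Everything here is a stand-in: `passes_short_test` represents a 5-10min small FFT run at a given FSB and Vcc, the simulated chip's voltage requirement is invented, and the 10mV step is a placeholder for whatever notch size the BIOS offers. Voltages are kept in whole millivolts to avoid floating-point surprises.

```python
def min_vcore_curve(fsb_points, multiplier, start_mv, step_mv,
                    passes_short_test, floor_mv, ceiling_mv):
    """For each FSB, find the lowest Vcc (in mV) that still passes the short test."""
    curve = []
    vcc = start_mv
    for fsb in fsb_points:
        # Raise Vcc until this FSB passes at all (needed after each FSB bump).
        while not passes_short_test(fsb, vcc) and vcc + step_mv <= ceiling_mv:
            vcc += step_mv
        if not passes_short_test(fsb, vcc):
            break  # nothing passes below the ceiling, so stop mapping here
        # Walk down one notch at a time while the next lower value still passes.
        while vcc - step_mv >= floor_mv and passes_short_test(fsb, vcc - step_mv):
            vcc -= step_mv
        curve.append((fsb * multiplier, vcc))  # (CPU clock in MHz, min Vcc in mV)
    return curve


# Invented chip behaviour: needs 1000 mV at 333 MHz FSB plus 2 mV per extra MHz.
def simulated_small_fft(fsb_mhz, vcc_mv):
    return vcc_mv >= 1000 + 2 * (fsb_mhz - 333)


curve = min_vcore_curve([333, 383, 433, 483], 9, 1200, 10,
                        simulated_small_fft, 900, 1400)
for cpu_mhz, vcc_mv in curve:
    print(f"{cpu_mhz} MHz: min Vcc about {vcc_mv / 1000:.3f} V")
```

Plotting those pairs gives the kind of lower-bound curve described here; the values found this way are a floor for short runs, not a 24hr-stable setting.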

Here's an example: [graph: Min Vcore for smallFFT stable operation]

With this data you can then see for yourself at what point is your chip simply requiring too much Vcc to operate at a given GHz and this helps "close in" on an acceptable peak GHz. You can generate a graph like this in a matter of hours, it doesn't need to be uber precise, you are just mapping out the general lower-bound for your specific chip.

It's only when you begin to get serious about the specific GHz you want to operate at that you then start in with the "does it pass 6hrs at 4GHz?" if no then drop it back to 3.8GHz but keep same Vcore and test again, if it doesn't pass 3.8GHz then try 3.6GHz at same Vcc again, and work backwards at this point.

A full isolation workup shouldn't take you more than 2-3 days. It can take longer if you are more thorough at every iteration of the testing. The graph I show above was actually generated with 12hr stability tests for every data point because I wanted to be extra anal about it, the extra precision in the data did not help me get to a stable 4GHz overclock any faster though.
 

Dadofamunky

***UPDATE*** Bumped the NB voltage up to 1.47V. This is the only change I've made other than previously enabling C1E. Now the system passes Prime for 16 hours on both Small FFT and Large FFT. Mission appears to be accomplished. I'm running one more overnight run of Large FFT to ensure it wasn't an anomaly.

Temps are 34-35 idle and 58-60 under load. I'm sure in summer the temps will be higher.

Linpack absolutely crashes my system. Then again, I don't really care in that particular case.

I'm still going to go through the isolation process at some point; I want to get the Vcore down a notch or two. This has been very helpful!