After 4 months of stability, suddenly instability

arcas

Platinum Member
Apr 10, 2001
2,155
2
0
When I assembled this rig ~5 months ago, I spent the first month trying to find the limits of the components. To find the memory's limit, I set the CPU multiplier as low as possible and turned up the memory speeds until memtest86 would fail. To get the CPU limit, I turned the LDT multiplier down and boosted the CPU speed until it couldn't run prime for an hour.

Then I subtracted about 10% from the memory and CPU limits (300 MHz --> 280 MHz) to be "safe" and ran memtest86 for a week followed by prime on both cores for a week. Seemed okay. Temps stayed below 50°C with prime on both cores. System has run 24/7 ever since with no anomalies.
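For anyone checking the margin arithmetic, here is a minimal sketch. It assumes the 300 MHz figure is the measured limit and 280 MHz is the setting actually used; the function names are made up for illustration.

```python
def derated(limit_mhz: float, margin: float) -> float:
    """Clock left after taking a fractional safety margin off a measured limit."""
    return limit_mhz * (1.0 - margin)


def margin_taken(limit_mhz: float, chosen_mhz: float) -> float:
    """Fraction of the measured limit given up by running at chosen_mhz."""
    return (limit_mhz - chosen_mhz) / limit_mhz


limit = 300.0  # assumed measured limit, from the post
full_ten_percent = derated(limit, 0.10)        # what a full 10% margin would give
actual_margin = margin_taken(limit, 280.0)     # margin implied by 300 -> 280

print(f"10% off {limit:.0f} MHz: {full_ten_percent:.0f} MHz")
print(f"300 -> 280 MHz is a {actual_margin * 100:.1f}% margin")
```

Under those assumptions, 300 MHz to 280 MHz works out to a bit under 7%, so "about 10%" is on the generous side.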

Yesterday I saw a couple of odd segfaults, which usually suggest either a memory or CPU problem. Ran prime just to see and, sure enough, it failed within a couple of minutes. Repeated the test a few times. So far, all the failures have occurred on Core #2. Temps and voltages still look good.

Lowered the FSB to 275 and retested. Prime failed within minutes.

Lowered FSB to 250. Prime has been running for the last 90 minutes with no issues. Yet.

It's damned annoying that the rig would suddenly decide to ****** itself.

 

Luckyboy1

Senior member
Mar 13, 2006
934
0
0
Well, I would have taken 20% off myself, but that's a judgement call based on experience and my belief values and may not reflect reality in any way.

That said, parts tend to "burn in" over time and become a bit less capable.

I know that people are going to ask you to check for dust bunnies, but do we always remember to check the power supply for them as well? They say dust settles, but you'd never know that looking at most power supplies which are at the top of the case.

Then check your voltages under load and have a program that will keep a record of highs and lows as you run the stress tests.
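If you don't have a monitoring tool that records highs and lows, the idea is simple enough to sketch. This is only an illustration: `read_sensors` is a hypothetical callable you would have to supply for your own board, and the canned readings below are invented just so the example runs.

```python
import time
from typing import Callable, Dict, Tuple


def track_extremes(
    read_sensors: Callable[[], Dict[str, float]],
    samples: int,
    interval_s: float = 1.0,
) -> Dict[str, Tuple[float, float]]:
    """Poll read_sensors() and keep the (lowest, highest) value seen per sensor."""
    extremes: Dict[str, Tuple[float, float]] = {}
    for i in range(samples):
        for name, value in read_sensors().items():
            low, high = extremes.get(name, (value, value))
            extremes[name] = (min(low, value), max(high, value))
        if i < samples - 1:
            time.sleep(interval_s)  # wait between polls, not after the last one
    return extremes


# Invented readings standing in for a real sensor source.
canned = iter([
    {"vcore": 1.40, "12v": 12.05},
    {"vcore": 1.36, "12v": 11.90},
    {"vcore": 1.41, "12v": 12.01},
])
result = track_extremes(lambda: next(canned), samples=3, interval_s=0)
for name, (low, high) in result.items():
    print(f"{name}: low={low} high={high}")
```

Run it alongside the stress test and look at the lows in particular, since a rail that sags under load won't show up in an idle reading.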

I've been told by the air cooling crowd... or at least a few of them, that it is a good idea to re-apply the thermal compound once every 3 months. I dunno, sounds like overkill, but who knows!

You may also want to hang a fan in front of the system RAM. From what you've told me, you haven't done the testing to find out whether it's the RAM, the CPU, or something else freaking out.

Big time edit!...

To expand on what to do in order to find out if it's the RAM or the CPU/mommaboard side of life...

Underclock the RAM and test. Also, loosen up the RAM timings at the same time. Then take your clock speed past offending speeds and if it doesn't skip a beat, it's the RAM freaking out and not the CPU/mommaboard side of life.
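The decision logic in that test can be sketched roughly like this. `stress_passes` is a hypothetical stand-in for actually running the stress test at the offending clock speed, and the last branch (still failing with relaxed RAM points at the CPU/motherboard side) is the implied converse rather than something stated outright.

```python
from typing import Callable


def isolate(stress_passes: Callable[[bool], bool]) -> str:
    """Decide which side to suspect.

    stress_passes(ram_relaxed) should run the stress test at the offending
    clock speed and return True if it passes. ram_relaxed=True means the RAM
    is underclocked with loosened timings.
    """
    if stress_passes(False):
        return "failure not reproduced at this speed"
    if stress_passes(True):
        # Relaxing the RAM made the failure go away.
        return "RAM suspected"
    # Still failing with the RAM taken out of the picture (assumed converse).
    return "CPU/motherboard side suspected"


# Example: a rig that only passes once the RAM is relaxed.
verdict = isolate(lambda ram_relaxed: ram_relaxed)
print(verdict)
```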
 

Insidious

Diamond Member
Oct 25, 2001
7,649
0
0
I agree with the post above:

Re-isolate your components and re-test their respective maximums to find the "offender".

Perhaps you will be led to a cure

:cool: