Help: SoB client fails w/ odd error: PROBLEM SOLVED

Woodie

Platinum Member
Mar 27, 2001
2,747
0
0
Preface: We lost power the other morning, and since then SOB hasn't been running for squat. :(

Here's the error message in the SoB.log:
[Sat May 31 20:35:55 2003] internal computation error [mismatched sums]! check your memory/processor. test will restart in 5 minutes.
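
For anyone who wants to count how often this error shows up, here is a minimal sketch. It assumes every log entry follows the bracketed-timestamp layout of the line above and that timestamps use English day and month names; the function name and the sample string are made up for illustration and are not part of the SoB client.

```python
import re
from datetime import datetime

# Assumes the "[timestamp] message" layout seen in the log line quoted above.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<msg>.*)$")

def find_sum_errors(log_text):
    """Return the timestamps of every 'mismatched sums' entry in log_text."""
    hits = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and "mismatched sums" in m.group("msg"):
            # Timestamp format inferred from the single example line.
            hits.append(datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y"))
    return hits

sample = ("[Sat May 31 20:35:55 2003] internal computation error "
          "[mismatched sums]! check your memory/processor. "
          "test will restart in 5 minutes.")
errors = find_sum_errors(sample)
print(len(errors), errors[0].isoformat())
```

A cluster of these timestamps a few minutes apart would match the "test will restart in 5 minutes" behaviour described in the message.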

There's a whole series of them, and my system hasn't been very stable today. There have been a couple of freezes, and many reboots. I've been fiddling w/ the BIOS settings, trying to recover them, since the power outage toasted all of my previous settings. I'm up to a 166 FSB x 13 on an XP 2100+. (See Bomber in my rigs).

The system has been stable for the last 2 hours, playing CS w/o issues. Temps in the low 30s for CPU, IGP and system. I tried to restart the service, but it immediately goes into standby mode, and generates another error as above.

I suspect there's a memory setting, but I'm not sure what to change/set. I have a pair of 256MB DDR2700 sticks (Corsair), running at 333MHz, according to the BIOS.

Any ideas?

 

Electrode

Diamond Member
May 4, 2001
6,063
2
81
1. The fact that this happened after a power outage makes me immediately think of the PSU. If you have a spare, try it.
2. Usually, I get that error message from SB due to an unstable overclock. Try backing down to rated speed.
 

ProviaFan

Lifer
Mar 17, 2001
14,993
1
0
The memory timings might need to be relaxed a bit, but I doubt that's the problem. Just out of curiosity, try a few minutes of the Prime95 torture test to see if the problems affect more programs than just SoB. Also, stick in a Memtest86 diagnostic CD or floppy, reboot, and give that a run (it's great for detecting the slightest instability in a memory subsystem). I'd suggest trying the PSU replacement after the stability tests and o/c-ing tweaks, because it's much more time consuming and just a general PITA (unless you have a hot swappable PSU ;)).
 

YellowRose

Senior member
Apr 22, 2003
247
0
0
The problem with the SOB client is your OC of the XP 2100+. Normally the XP 2100+ runs at 1.73GHz, but you have OC'ed it to in excess of 2.1GHz. This is the problem. Go back to stock speed and your client will run fine.
 

MathGuy

Member
Mar 27, 2003
43
0
0
If it now seems to be otherwise stable, another thing you might try is clearing the SB cache (change the user name to "anonymous" then immediately change it back). It's *possible* that the cache is semi-nonsensical due to the previous errors and that's causing the SB-specific grief.
 

SinfulWeeper

Diamond Member
Sep 2, 2000
4,567
11
81
I only get that error from two things.

1. Unstable OC
2. Too aggressive memory timings, if the RAM cannot handle them.

Bring your computer back down to stock speed and gradually work up to an OC that it can handle.
Overall though, I think it is the CPU and not the memory. Corsair makes some good memory. Just try lowering your multiplier to say... 11.5 and see if you get the same problem.
 

Confused

Elite Member
Nov 13, 2000
14,166
0
0
I was getting that error when I was pushing my overclock too far. It would be fine at 11.5x182 for 2.09GHz, but it would give that error at 11.5x183, and with any multiplier at that FSB or higher, unless I relaxed the memory timings right back from 5-2-2-2.5 to 7-3-3-3, which obviously I didn't want to do!!

Try a lower multiplier and same FSB, then see how high you can get your FSB with lower multiplier, as a faster FSB will speed up the overall system more than just increasing the multiplier :)


Garry
 

sciencewhiz

Diamond Member
Jun 30, 2000
5,885
8
81
Just because your processor runs "stable", in that it doesn't cause Windows or another program to crash, doesn't mean that it isn't making computation errors that affect SOB.

Here is what I would do, in order:

1) Go back to stock speed and try SOB. If that fails, look for hardware issues.
2) Run Prime95 and Memtest86 at the overclocked speed. If those fail, back off on the overclock, because other things will eventually fail.
3) Follow Garry's suggestion and lower the multiplier and increase the FSB.
4) Find another project that doesn't fail.
 

Woodie

Platinum Member
Mar 27, 2001
2,747
0
0
Thanks for the thoughts:beer:...I'll work on the box tonight. In the meantime, more questions/info!

PSU is last guess at this point, because: it's 3 months old, Antec True-power, PITA to remove!

Thank-you MathGuy, I didn't know how to get new numbers for the client. I'll try that first, since it's the easiest, and see what happens.

O/C is the next most likely culprit, I was wondering about CPU vs. RAM
Old setup (100% stable on SoB)
CPU was o/ced at 13 x 160, default voltage (auto)
RAM was 1:1 w/ FSB, default voltage.
New Setup (Pretty stable, but not 100%. SoB not working)
CPU o/ced at 13 x 166, default voltage (auto)
RAM at 1:1 w/ FSB, default voltage (2.6) (Tried 2.7, but it wouldn't post).

Memory timings are "aggressive", something like 7,3,3,CAS 2.5. I'm not familiar enough with those to feel confident setting them manually. Can anyone suggest what those timings should be?
 

ProviaFan

Lifer
Mar 17, 2001
14,993
1
0
Ok, so you're overclocking more than you were last time, and you're wondering why there's a problem now when there wasn't a problem then. The answer seems pretty obvious. :eek: ;)
 

Woodie

Platinum Member
Mar 27, 2001
2,747
0
0
ProviaFan wrote: "Ok, so you're overclocking more than you were last time, and you're wondering why there's a problem now when there wasn't a problem then. The answer seems pretty obvious."

Well, yes. :eek: But before, when I o/ced to this level (a measly 6 MHz more), the system wouldn't post. Now it does, and seems stable for the most part. The exception being memory intensive stuff, like SoB. If I can knock back either the memory speed or the CPU speed, then I'm doing better.

Yes, I know I could go back to stock, but I don't WANT to.
 

deerslayer

Lifer
Jan 15, 2001
10,153
0
76
I used to get that message when I tried to go past 166 x 10.5 on my 1600+ :(

Great way to find out if it was capable of that speed though!
 

Baldy18

Diamond Member
Oct 30, 2000
5,038
0
0
Not 6MHz more, but 13 x 6MHz, or 78MHz more, which can definitely make a big difference in CPU stability. :)
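
To spell out that arithmetic, here is a quick sketch using the two setups Woodie listed (13 x 160 and 13 x 166). The function name is just for illustration.

```python
def cpu_mhz(fsb_mhz, multiplier):
    """Core clock in MHz is the FSB clock times the CPU multiplier."""
    return fsb_mhz * multiplier

old = cpu_mhz(160, 13)  # the setup that was 100% stable on SoB
new = cpu_mhz(166, 13)  # the setup that throws the error
print(old, new, new - old)  # 2080 2158 78
```

So a 6MHz bump on the FSB is a 78MHz bump on the core, and it also raises the RAM clock when the memory runs 1:1 with the FSB.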
 

Woodie

Platinum Member
Mar 27, 2001
2,747
0
0
Results: (so far) still looking for help.

It's not an SoB problem...the cpu will stay at 100% for 3-5 minutes, and then die w/ the corrupt math error.
So, I tried to back down the memory timings. Well, they already are at 7-3-3-2.5 so that idea's shot.
Next, to lower the multiplier from 13. I set it in BIOS, it remembers it, but the cpu still comes up at the 13x multiplier!! Confirmed by MBM and CPUid. This is very odd. So, I set the mult from 12.5 down to 12. Same behaviour. SoB continues to abend after a few minutes.

I'm going to go ahead and lower the FSB, and see where that takes me, but I'd really rather not. Anyone have other suggestions?
 

Confused

Elite Member
Nov 13, 2000
14,166
0
0
Lower the FSB. My computer would give me that error at 183MHz FSB, but at 182 it will be fine! Even if I lowered the multiplier right down, it would still fail at 183!


Garry
 

SinfulWeeper

Diamond Member
Sep 2, 2000
4,567
11
81
Lower your voltage back down to stock (provided you're adding more voltage). Then lower the multiplier to 9. Raise the FSB to 166 (I know it's only a 1.5GHz test).

If it passes, keep on raising the multiplier by .5 until you see the error. Once you see it, just raise the vcore in the smallest increments at that multiplier until it passes (1.85 would be my max, but some people have no fear of overvolting :p)
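
The procedure above can be sketched roughly like this. It is only an illustration: `is_stable` is a hypothetical stand-in for actually running SoB or Prime95 at a given setting, the 25mV step and the 1.60V starting point are assumptions (the BIOS decides the real step size), and `toy_chip` is an invented model, not real hardware behaviour. Voltages are kept in millivolts to avoid floating-point rounding.

```python
def walk_multiplier(is_stable, fsb=166, start_mult=9.0, max_mult=13.0,
                    stock_mv=1600, max_mv=1850, step_mv=25):
    """Raise the multiplier in 0.5 steps; on a failure, add vcore in small steps.

    is_stable(fsb, mult, vcore_mv) is a hypothetical callback for a test run.
    Returns the last (mult, vcore_mv) that passed, or None if nothing passed.
    """
    mult, mv, best = start_mult, stock_mv, None
    while mult <= max_mult:
        if is_stable(fsb, mult, mv):
            best = (mult, mv)
            mult += 0.5              # passed, so try the next multiplier
        elif mv + step_mv <= max_mv:
            mv += step_mv            # failed, so add a little vcore and retry
        else:
            break                    # out of voltage headroom at this multiplier
    return best

def toy_chip(fsb, mult, mv):
    # Invented model: each extra mV of vcore buys 1MHz over a 2000MHz ceiling.
    return fsb * mult <= 2000 + (mv - 1600)

result = walk_multiplier(toy_chip)
print(result)
```

In practice each call to `is_stable` would be hours of testing, not a function call, which is why starting from a low multiplier at the target FSB narrows things down faster than guessing.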

Remember. Just because a lot of people get good OC's out of chips does not mean all XP2100+'s are good ones. OC is a toss-up. I have yet to get a nice sized OC on any of my systems.
I have a Malay P4 SL6S2 mVID CPU that people 'regularly' get to 185+ FSB from the stepping/week/plant. Mine only caps out at 170 :(:|, it's simply the nature of the beast. And while I can boot into Windows and play games and do misc work at 178 FSB, it is not stable at all above 170.
Yet those few things never show the problem. SoB is running your CPU nearly to its max. Like Prime95, it will expose instability in time if there is any, though in some cases not.
I have very strict stability testing for my OCs, and it entails over 96 hours of testing. If the chip fails any of those tests, the OC is not stable at all, even if the chip 'plays' fine.
My testing consists of all the following.

Prime95 in 6-hour loops on every 1MHz FSB increase.
Memtest86 for 10 loops for every 2MHz of FSB over the rated stock speed.
PiFast on every single 1MHz increment of the FSB.
3DMark2001 running alongside SoB at normal priority and a repeating M$ PowerPoint presentation, all in a 10-hour loop, provided the CPU passed the above 3.

Also, I am only taking guesses on how to help you with your OC. I am not an AMD fan at all :rolleyes:. But I thought I'd help a fellow TeAm DC'er out. The peeps in the CPU forums can help you much better than I can. AMDs compared to P4s for OCs :p. It's like saying that since Chevys and Fords are both vehicles, what works on one should work on the other. In reality, all that would interchange is perhaps the rubber on the rims :confused: and a few nuts and bolts :).
 

deerslayer

Lifer
Jan 15, 2001
10,153
0
76
You need to lower the FSB. If your front side bus speed is too high, SOB will give you that error. That's how I can always tell my overclock is too high. It will boot fine, but it won't run SOB at all.
 

Confused

Elite Member
Nov 13, 2000
14,166
0
0
SW, 96 hours! :Q


My overclocking/testing strategy:

FSB = Stock
Fail = false
Do
    If (SOB = Fail) Then
        If (Voltage < 1.85) Then
            Voltage++
        Else
            FSB--
            Fail = true
        End If
    Else
        FSB++
    End If
Loop Until (Fail = true)
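
For anyone who wants that loop in runnable form, here is a rough Python sketch of the same idea. `passes_sob` is a hypothetical stand-in for "run SoB for a while and see if it errors", the 133MHz stock FSB, 1.60V start, 25mV step and the `fsb_limit` safety cap are all assumptions added so the example terminates, and `toy_system` is an invented model rather than real hardware behaviour.

```python
def find_max_fsb(passes_sob, stock_fsb=133, stock_mv=1600,
                 max_mv=1850, step_mv=25, fsb_limit=250):
    """Raise the FSB 1MHz at a time; on failure add voltage, then back off.

    passes_sob(fsb, vcore_mv) is a hypothetical callback for a real test run.
    Returns (fsb, vcore_mv) for the highest FSB that passed.
    """
    fsb, mv = stock_fsb, stock_mv
    while True:
        if passes_sob(fsb, mv):
            if fsb >= fsb_limit:
                break              # safety cap reached while still passing
            fsb += 1               # FSB++
        elif mv < max_mv:
            mv += step_mv          # Voltage++
        else:
            fsb -= 1               # FSB--, back to the last setting that passed
            break
    return fsb, mv

def toy_system(fsb, mv):
    # Invented model: 160MHz is fine at stock voltage, each 25mV buys 1MHz more.
    return fsb <= 160 + (mv - 1600) // 25

result = find_max_fsb(toy_system)
print(result)
```

One thing the sketch makes visible: the voltage is never lowered again, so the final setting may carry more vcore than the last passing FSB strictly needs.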


Garry
 

Woodie

Platinum Member
Mar 27, 2001
2,747
0
0
Thanks to all for the continued support.

I dropped the FSB to 160, and all is working fine. I didn't have time to fiddle any more last night, but perhaps tonight I can try raising the FSB a bit.
At least I'm getting some production out of the box!
Additional comments:
I have left the CPU voltage at: Auto, and MBM is reporting about 1.6, so clearly I have some headroom there.
The RAM voltage is still default at 2.6.

Lowering the multiplier doesn't seem to be working. =:eek: I don't understand why. I *may* try some other multiplier settings (I only tried 12.5 and 12.0) to see if I can get the BIOS to actually set the multiplier correctly. Any suggestions? This is an MSI K7N2G board (BOMBER in MyRigs).

Sinful Weeper: :Q You're crazy! (in a good way!) I can't spend anywhere near that much time doing this! I'm just looking for some extra K/sec.
Lynx: Is the SoB failure a CPU issue or a memory issue? I'm suspecting it's a CPU issue, because the CPU is more tightly bound to the FSB than the RAM timing.
Confused: I'll give your methodology a try. It's just a slow process, because SoB fails somewhere between 3 and 5 minutes after starting, so I have to wait for it. :(
 

Confused

Elite Member
Nov 13, 2000
14,166
0
0
My method I find works best when you are just using the PC. I just keep an eye on the temps, and if they seem low, then it's normally because SB has stopped. Don't sit around watching SB trying to fail, as I have had it happen after a few hours, let alone a few minutes!!!

I have found that my CPU will go slightly higher than where it will fail, but the memory will fail if it goes any higher. I did try 215 FSB and 183FSB, for 2.15 GHz, and it will fail, but knock that down to 213 and 182, and it will be fine, and anything lower on the memory, it will be fine, so it is probably the memory, but maybe also the CPU :) My experiences have been with the memory topping out (I have a stick of Samsung PC2700)


Garry
 

Woodie

Platinum Member
Mar 27, 2001
2,747
0
0
Thanks to all for your help. I've got the system running stable, and more importantly, SoB crunching well. (see my SoB stats page) :D:D

I ended up taking the FSB up 1MHz at a time, until SoB started to fail at 166 (which CPUid says is 167 FSB). Upped the voltage to 1.625, and *voila*! :D I may play w/ it to see how much further I can get, but I'm comfortable with this level for right now.

Thanks again for all your help. Meet me in Boston at Jillians next month, and I'll buy you a :beer:! If you can't make it, then have a few right now: :beer::beer::beer:
:D