Advice needed on flaky CPU

AluminumStudios

Senior member
Sep 7, 2001
628
0
0
I have an AMD Athlon XP 2200+ (Tbred-A) on an Epox 8K3A+ mobo in an Antec case with excellent ventilation (the entire inside stays nice and cool). I have a Thermaltake Volcano 9 HSF with the fan at full speed. The core temperature under load can hit 63-64 deg C. (Please note this is a core reading; those of you with temps in the 30s are probably getting an external thermal diode reading.)

I think my CPU is boogered up. At stock speed and voltage, with my one stick of Mushkin DDR333 (CL 2.5, normal stuff running at "normal" settings in the BIOS), my system is unable to pass Prime95. It reports CPU rounding errors. Lately my system has become unstable too, with random reboots (no blue screen or anything, just BANG - reboot).

After I come back from a reboot I have a single event log entry from savedump with a bugcheck code 0x000001e (might have the wrong number of zeros; I can't look because I'm making an image of that machine right now). I searched for "bugcheck" and the hex number on the MS KB and found two hits that seemed relevant. One was a potentially defective font, which I removed from my system. The problem still occurred. The other was a hardware problem that the MS KB article didn't go into detail about. It said the L2 cache could potentially be at fault.

My system was 100% rock-solid stable with all of the same hardware when I had a Palomino Athlon XP 2000+ (with a Volcano 7 HSF; it was running near the same temps I see now). I gave my XP 2000+ to a relative, though, and bought myself the 2200+.

I'm beginning to think that my tbred-a is trash.

I haven't run a great deal of other diagnostic software because I don't have time - I do a lot of video editing and rendering on my system, so I need it to be stable above all else. I was trying to render the Mystery Anime Theater 3000 movie for an animation convention that I was helping with this weekend, and I almost didn't make it because of these bloody spontaneous reboots and having to restart the 7 hr rendering job over and over...

So, do people think it's likely that my CPU is to blame? My system does pass one pass of memtest86 (I haven't taken the time to run more passes). I'm on a budget; that's why I bought an Athlon to begin with (which I now think was a bloody huge mistake). Is there any way to definitively prove it's the CPU and get it replaced? I bought it in December from Newegg. Would I be better served by getting a slow Tbred-B and overclocking ... ? I'm tempted just to get another Palomino 2000+ because my system ran GREAT with that. I can't drop any slower than the 2000+ level, though, because of the video work I do, and I don't want to spend more than $100 ...

Hmm, what to do ...
 

optimistic

Diamond Member
Apr 29, 2001
3,006
0
0
First impressions... did you remove the plastic film protecting the base of your heatsink? Secondly, are you using any form of thermal compound or pad?

And 60+ C is not a safe operating temperature. I don't care how far off your onboard diode reading is, you're supposed to be running under 55C MAX (this temp is NOT an "external thermal diode" reading). My Athlon system would get unstable and give me all kinds of errors above 50C on my old AK31a board.

~30C is supposed to be your idle CPU temp, and 45-50C is supposed to be your "core reading" as you put it.
 

AluminumStudios

Senior member
Sep 7, 2001
628
0
0
I'm not a n00b ... my heat sink is installed properly. I've reinstalled it with both generous and meager amounts of silicone compound, then micronized silver thermal compound, with the same results each time.

I've NEVER gotten my 2000 or 2200 below 60 under load. I've read on other forums that many Epox 8K3A+ owners have extraordinarily high temp readings. I did temporarily change my BIOS to one that reads from the socket thermistor and had much lower temps, more on par with other people's ... but I upgraded my BIOS again after that and it went back to reading internal temps.

If you have an old board, then you're probably getting a socket thermistor reading ...

I haven't been able to do anything to get the temp down. I crank my Volcano 9+ up to wind-tunnel levels and the temp remains high. My heat sink doesn't feel hot to the touch, however, which is strange. My old PIII's heat sink will almost leave grill marks on your fingers. I do have good mating between the heat sink and CPU though. Each time I take it off, it looks as if the core was pressing hard against the heat sink, which flattened whatever thermal compound I was using (silver or silicone) into an ultra-thin layer ....
 

Gatsby

Golden Member
Nov 6, 1999
1,588
0
0
RMA the CPU. I had a similar problem with a machine check exception.
It turned out the CPU was bad. I replaced it with an Intel, and the same system is running 300 MHz faster than stock (2.4B) on the stock heatsink fan. Perfectly stable.
 

mechBgon

Super Moderator, Elite Member
Oct 31, 1999
30,699
1
0
For the motherboard you have, the temp is completely normal, as you say.

My A7V333-RAID has begun to behave in a similar fashion under heavy CPU/memory load, and the issue persists despite using two different CPUs (one an AthlonXP 1600+ and the other a Duron 1GHz) and different memory modules. Under light load, it still does random reboots once in a while. I don't know what might have brought this on (dirty utility power perhaps?) but I'm pretty sure it's the motherboard itself at fault.

If you want a quick-&-dirty replacement board for a good price, consider an Asus A7N266-VM if you can live with one AGP slot and three PCI slots. They are very reliable (I have 20 at work, with three more in the pipeline), they have very high PCI-bus performance, and they work well with Win2000 or WinXP. Max RAM is 1GB (2 x 512MB) and they're not a tweaker's board, but they're a good buy for ~$70 and they'll take 266MHz-bus CPUs up to a 2600+ with the latest BIOS.

edit: BTW, do you have a line-interactive UPS for your work system? That's always a good thing to have when possible.
 

AtomicDude512

Golden Member
Feb 10, 2003
1,067
0
0
Originally posted by: Gatsby
RMA the CPU. I had a similar problem with a machine check exception.
It turned out the CPU was bad. I replaced it with an Intel, and the same system is running 300 MHz faster than stock (2.4B) on the stock heatsink fan. Perfectly stable.

Except you don't have to replace it with an Intel.
 

moonshinemadness

Platinum Member
Jan 28, 2003
2,254
1
0
My system won't even think about running over 55C. Reseat the CPU, reinstall the heatsink and fan, then leave your case side off. I don't know much about this kind of thing, and I don't know whether a CPU will run without Level 2 cache, but if it will (you'll have to find out from somewhere else), disable it in the BIOS and see if you still have the problems. You really need to find out whether a CPU will run without L2 cache, though. People on here should know.
 

AluminumStudios

Senior member
Sep 7, 2001
628
0
0
moonshinemadness - you must have missed where I said I redid my heat sink multiple times with multiple types of thermal compound.

You also missed where I said I have a very nice Antec case with GREAT ventilation and everything inside the system is cool. In fact, things start to warm up when I run it with the case open because that messes up its great airflow.

 

ScrapSilicon

Lifer
Apr 14, 2001
13,625
0
0
Originally posted by: mechBgon
For the motherboard you have, the temp is completely normal, as you say.

My A7V333-RAID has begun to behave in a similar fashion under heavy CPU/memory load, and the issue persists despite using two different CPUs (one an AthlonXP 1600+ and the other a Duron 1GHz) and different memory modules. Under light load, it still does random reboots once in a while. I don't know what might have brought this on (dirty utility power perhaps?) but I'm pretty sure it's the motherboard itself at fault.

If you want a quick-&-dirty replacement board for a good price, consider an Asus A7N266-VM if you can live with one AGP slot and three PCI slots. They are very reliable (I have 20 at work, with three more in the pipeline), they have very high PCI-bus performance, and they work well with Win2000 or WinXP. Max RAM is 1GB (2 x 512MB) and they're not a tweaker's board, but they're a good buy for ~$70 and they'll take 266MHz-bus CPUs up to a 2600+ with the latest BIOS.

edit: BTW, do you have a line-interactive UPS for your work system? That's always a good thing to have when possible.

grab a cheap $30 Duron to determine if everything runs well..while you are RMAing the 2200+ if needed..or a XP2100+ TBred 'B' from newegg for $88(OEM $4 extended to 1 yr at checkout box I believe) or $95 (Retail..3yr from amd ;))
 

bfonnes

Senior member
Aug 10, 2002
379
0
0
Originally posted by: optimistic
First impressions... did you remove the plastic film protecting the base of your heatsink? Secondly, are you using any form of thermal compound or pad?

And 60+ C is not a safe operating temperature. I don't care how far off your onboard diode reading is, you're supposed to be running under 55C MAX (this temp is NOT an "external thermal diode" reading). My Athlon system would get unstable and give me all kinds of errors above 50C on my old AK31a board.

~30C is supposed to be your idle CPU temp, and 45-50C is supposed to be your "core reading" as you put it.

Dude, settle. He didn't call you a newb... He just pointed out a fact, and that is that your CPU should never be running above 50C if you can help it. That is the point when the CPU gets flaky, and yes, it sounds like a CPU problem, 99.9% sure.
 

AluminumStudios

Senior member
Sep 7, 2001
628
0
0
Most people with Epox 8K3A+ boards have VERY high temp readings. My XP 2000+ (Palomino) and now this POS Tbred-A both regularly stayed above 60 deg C no matter what I did. Every time I open my mouth about my CPU, people want to know the temp, then give me that same "is your heat sink properly installed" stuff. I'm tired of addressing it.

Anyway, it's cold out today so I opened my windows and flooded my room with cold air. My system temp is currently 25 c and my cpu is 52 c and mostly idling and my system has spontaneously rebooted twice today ...



 

mechBgon

Super Moderator, Elite Member
Oct 31, 1999
30,699
1
0
Originally posted by: AluminumStudios
Most people with Epox 8K3A+ boards have VERY high temp readings. My XP 2000+ (Palomino) and now this POS Tbred-A both regularly stayed above 60 deg C no matter what I did. Every time I open my mouth about my CPU, people want to know the temp, then give me that same "is your heat sink properly installed" stuff. I'm tired of addressing it.
* 8K3A+: real core temperature reading directly from CPU diode
* most other boards: estimated temperature reading, usually with a candy coating so people don't freak out

Aside from the 8K3A's, the only other consumer board I know of that reports the core-diode reading to the end user is the Gigabyte 7VRX revision 2. Asus A7V333's can be tricked into reporting it via MBM5, but the calibration mysteriously dropped a whole bunch around the time the 1008 BIOS came out. 42C at core under full load... riiiiiiiiiiiiiiight, Asus...
 

ScrapSilicon

Lifer
Apr 14, 2001
13,625
0
0
Originally posted by: AluminumStudios
Most people with Epox 8K3A+ boards have VERY high temp readings. My XP 2000+ (Palomino) and now this POS Tbred-A both regularly stayed above 60 deg C no matter what I did. Every time I open my mouth about my CPU, people want to know the temp, then give me that same "is your heat sink properly installed" stuff. I'm tired of addressing it.

Anyway, it's cold out today so I opened my windows and flooded my room with cold air. My system temp is currently 25 c and my cpu is 52 c and mostly idling and my system has spontaneously rebooted twice today ...
Turn off auto reboot on error in Start > Control Panel > System > Advanced > Startup and Recovery > Settings to get an old-style BSOD, possibly with some error codes.
Your temps are fine. It's possible, though, that the core has developed a very fine hairline fracture.
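For anyone who would rather script that setting than click through the dialog: as far as I know, the "Automatically reboot" checkbox corresponds to the `AutoReboot` DWORD under the `CrashControl` registry key. Below is a minimal Python sketch that only builds the text of a .reg file for it. The function name `crash_control_reg` is made up for illustration, and saving and importing the file (and the reboot afterwards) is left to you.

```python
def crash_control_reg(auto_reboot: bool) -> str:
    """Return .reg file text that sets the AutoReboot crash option.

    Assumption: the 'Automatically reboot' checkbox maps to this value.
    auto_reboot=False means the machine should stop on the blue screen.
    """
    value = 1 if auto_reboot else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl]",
        f'"AutoReboot"=dword:{value:08x}',
        "",
    ]
    # .reg files conventionally use CRLF line endings.
    return "\r\n".join(lines)


# Show the text that would disable the automatic reboot.
print(crash_control_reg(auto_reboot=False))
```

Double-check the generated text before importing it, since a bad registry import is harder to undo than unticking a checkbox.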
Originally posted by: AluminumStudios
Thanks for the reply ScrapSilicon. I'd seen that auto-reboot thing before but forgot about it.

Hmm "System must be rebooted for changes to take effect." Ayecarumba!

Hopefully it'll blue screen and I can get some useful info!

As far as the cpu goes, I'm thinking along the lines of micro-damage too. I don't know about a crack, but it's possible there's a flaky transistor that likes to not work right under certain conditions triggering some kind of error then catastrophic exception ....

My event log entries look like this:

The computer has rebooted from a bugcheck. The bugcheck was: 0x0000001e (0xc000001d, 0xbfda0d30, 0x81849044, 0x81b0b544). Microsoft Windows 2000 [v15.2195]. A dump was saved in: E:\winnt\Minidump\Mini033003-01.dmp.

The 2nd, 3rd, and 4th hex numbers in parentheses seem to change ... the MS KB had two articles that referred to bugcheck 0x0000001e. One was about a corrupt font, which I removed; the other said it could indicate a hardware problem, possibly L2 cache. And since all of this started after I replaced my previous CPU in this system, which WAS rock solid, AND this CPU can't pass the Prime95 torture test for 24 hours, I'm thinking the CPU is scr3w3d. It was an OEM part and the 30-day warranty is long gone ... :(

yw..but :(
 

AluminumStudios

Senior member
Sep 7, 2001
628
0
0
Thanks for the reply ScrapSilicon. I'd seen that auto-reboot thing before but forgot about it.

Hmm "System must be rebooted for changes to take effect." Ayecarumba!

Hopefully it'll blue screen and I can get some useful info!

As far as the cpu goes, I'm thinking along the lines of micro-damage too. I don't know about a crack, but it's possible there's a flaky transistor that likes to not work right under certain conditions triggering some kind of error then catastrophic exception ....

My event log entries look like this:

The computer has rebooted from a bugcheck. The bugcheck was: 0x0000001e (0xc000001d, 0xbfda0d30, 0x81849044, 0x81b0b544). Microsoft Windows 2000 [v15.2195]. A dump was saved in: E:\winnt\Minidump\Mini033003-01.dmp.

The 2nd, 3rd, and 4th hex numbers in parentheses seem to change ... the MS KB had two articles that referred to bugcheck 0x0000001e. One was about a corrupt font, which I removed; the other said it could indicate a hardware problem, possibly L2 cache. And since all of this started after I replaced my previous CPU in this system, which WAS rock solid, AND this CPU can't pass the Prime95 torture test for 24 hours, I'm thinking the CPU is scr3w3d. It was an OEM part and the 30-day warranty is long gone ... :(
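To make the event log line easier to read, here is a small Python sketch that pulls the bugcheck code and its four parameters out of that text. The lookup tables are deliberately tiny and cover only codes relevant to this thread: bugcheck 0x1E is KMODE_EXCEPTION_NOT_HANDLED, and its first parameter is the NTSTATUS code of the unhandled exception, where 0xC000001D is STATUS_ILLEGAL_INSTRUCTION. The function name `parse_bugcheck` and the regular expression are illustrative assumptions, not part of any Microsoft tool.

```python
import re

# Only the codes that matter for this thread; extend as needed.
BUGCHECK_NAMES = {0x1E: "KMODE_EXCEPTION_NOT_HANDLED"}
NTSTATUS_NAMES = {
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}

# Matches "bugcheck was: 0x... (0x..., 0x..., 0x..., 0x...)".
PATTERN = re.compile(r"bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)")

EVENT_TEXT = (
    "The computer has rebooted from a bugcheck. The bugcheck was: "
    "0x0000001e (0xc000001d, 0xbfda0d30, 0x81849044, 0x81b0b544). "
    "Microsoft Windows 2000 [v15.2195]."
)


def parse_bugcheck(text):
    """Return a dict describing the bugcheck, or None if none is found."""
    match = PATTERN.search(text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    exception = None
    if code == 0x1E and params:
        # For 0x1E, parameter 1 is the exception code that went unhandled.
        exception = NTSTATUS_NAMES.get(params[0], "unknown")
    return {
        "code": code,
        "name": BUGCHECK_NAMES.get(code, "unknown"),
        "params": params,
        "exception": exception,
    }


info = parse_bugcheck(EVENT_TEXT)
print(info["name"], info["exception"], hex(info["params"][1]))
```

For 0x1E the second parameter is the address where the exception occurred, which fits the observation that the later numbers change between crashes while the first stays at 0xc000001d. An illegal-instruction exception at varying kernel addresses is at least consistent with the instruction stream being corrupted by hardware (CPU, cache, or memory), though on its own it does not prove which part is at fault.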