sudden reboots after 1 month rock-solid

Qythyx

Member
Feb 6, 2000
87
0
0
I have a Soltek SL-75DRV5 and recently started having some problems. The problem is that my computer will suddenly reboot. I bought my MB about 1 month ago and it was running fine until 2 days ago. During this time the computer was turned on 24 hours every day. Then, 2 days ago, it started rebooting spontaneously. It would reboot, run for 5 or 10 minutes, and then reboot again, continuously.

MotherBoard Monitor reports all of the voltages as normal and the CPU temperature as between 48 and 51 degrees C. So, I think that temperature is not a problem. I also checked the temperature reading in the BIOS and it said the same thing, about 48 degrees. BTW, my CPU is an Athlon XP 1800+.

I also checked the BIOS setting for automatic temperature shutdown but it was set to 70 degrees so I don't think this is the problem because it is only 50 degrees or so. I even tried turning off this BIOS setting but the problem still occurred.

Usually when my computer runs I'm running Prime95 (which really uses the CPU a lot). When I run with this program running my temperature is around 51 degrees. If I turn off this program so that my CPU is only 0% or 1% used then the temperature goes down to around 48 degrees. BUT, I think that 48 degrees is quite hot for the CPU when it is not used.

I also noticed that if I run my computer with Prime95 not running then my problem is OK and the computer does not reboot. BUT, this is very strange because I ran Prime95 for 1 month with no problems. Why is it now suddenly a problem?

Also, I will admit that the weather's been a little warmer here but only a few degrees. It's not like it's the middle of summer. The room temperature is still around 25 degrees C if not cooler. As for my case I have 2 intake and 2 outtake (including the power supply) fans. I'm pretty sure there is enough ventilation. Again I point to the fact that it was running fine for 1 month.

Any ideas????
 

Duvie

Elite Member
Feb 5, 2001
16,215
0
71
I am sorry, but I find it hard to believe a swing of only 48C idle to 51C load on an AMD chip running Prime95...that swing should be much greater than that, period. Maybe the sensor is misreporting????

What OS are you running??? WinXP has a setting that by default makes the computer restart on a fatal error, the kind we would commonly see as a BSOD....

Look at the power supply starting to drop rails....

Look at backing off the RAM timings....

a bunch of things to look at basically in the early stages...
 

Qythyx

I agree completely. I thought the swing should be much more as well. My MB has another temperature monitor that I have stuck to the top of one of my HDs and it reads around 28 C consistently so I know the inside ambient temperature of the case should be no more than that.

As for the CPU temp, when I first reboot the temp starts off a bit lower, maybe 43 or 44 but quickly goes up to 48 and stays there. Then when I start Prime95 it jumps to 50 or 51 as I said. I don't think it is a reporting problem because, when I was at the BIOS, i.e., not using the CPU, I opened the case and felt the CPU heatsink and it was quite warm even though the fan was blowing lots of air through it.

I did do some voltage checking via MBM. I had MBM log the readings to a CSV file every 10 seconds, and looking at that log after a reboot, there were no anomalies. Of course it is possible a power dropoff was too sudden for MBM to log. My PSU is a brand new 420W monster, though, so I think I'm ok on that end.
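
A minimal sketch of how a log like this could be scanned for out-of-range readings instead of eyeballing it. The column names (`time`, `+3.3V`, `+5V`, `+12V`), the sample rows, and the 5% tolerance band are all assumptions for illustration; they are not MBM's actual export format or anyone's real readings.

```python
import csv
import io

# Nominal rail voltages. The +/-5% band is an assumed tolerance for illustration.
NOMINAL = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}
TOLERANCE = 0.05


def find_anomalies(csv_text):
    """Return (timestamp, rail, value) for every reading outside the band."""
    anomalies = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for rail, nominal in NOMINAL.items():
            value = float(row[rail])
            if abs(value - nominal) > TOLERANCE * nominal:
                anomalies.append((row["time"], rail, value))
    return anomalies


# Hypothetical log excerpt (made-up numbers, hypothetical column layout).
SAMPLE_LOG = """time,+3.3V,+5V,+12V
10:00:00,3.41,4.99,12.34
10:00:10,3.40,4.70,12.30
10:00:20,3.41,4.98,11.20
"""

for hit in find_anomalies(SAMPLE_LOG):
    print(hit)
```

A scan like this only catches a sag that lasts long enough to land in a logged sample, which is why the logging interval matters as much as the thresholds.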

My OS is Windows XP but I don't think it is BSOD. First, I don't see any BSOD message. Second, when the system comes back up it doesn't say anything about reporting the previous crash log, implying there was no recorded crash.

I guess the biggest hint right now that something is weird is the fact that the idle CPU is still running at around 45 to 48 C. When I get home later today I'll boot the computer (after it has been off all day) and check from the BIOS to make sure that it does hover at those temps when idle.
 

Duvie

I agree at ten second intervals that may not spot the problem....


Also, I remember when running Prime that I didn't see my max temp for quite a few tests in the program....It would plateau at 45C on my P4 1.8@2.4, hold there for many tests, then all of a sudden shoot up and plateau again at 50C, where it would stay for hours or days depending on when I shut it off....My Asus Probe records at 1 second intervals and I saw that at the same point where the temps jumped there was a voltage drop, or the CPU drawing more power off of my 12V rail....may look into that...


Also, 420 watts doesn't mean crap if the power supply does not have a strong combined 3.3V and 5V rail...something the Athlon needs...I had a 350 watt start dropping after 1 month on a 1.1 T-bird about a year ago....
 

Buz2b

Diamond Member
Jun 2, 2001
4,619
0
0
Patient: "Doctor, Doctor, it hurts when I do this!"
Doctor: "Then don't do that!" ;)
For starters, shut down Prime95. Let the system run for a couple of days (if it will). Assuming it does then you either have a voltage problem as mentioned or a heat problem. Your temps are too high but not alarmingly so. If it runs with Prime95 off, then take the case cover off, point a small fan towards the CPU area and try running it with Prime95 turned on. If it stays up, you will know it is the temps. If it still reboots then you need to look at the voltages.
Also, as mentioned, if XP is set up to not display the BSOD, then you would never see it or the message. You have to change the setting so that it does not reboot and instead shows the BSOD error message. You'll have to look around to find the settings. That might help.
 

Qythyx

You're right about the PS. I don't have the spec with me now but I do know that the combined 3.3V/5V rating is well above the recommended values. That was one thing I specifically looked at when buying it.

As for the temperature, again what you say makes sense. My understanding of Prime95 is that it does a number of different things: factoring, checking primeness, and a few others I think. Perhaps some of these phases abuse the CPU more than others.

I will change MBM to log at 1sec intervals and check the temp and the voltages. I'm not sure what else I can do now, though. Do you have any other diagnostic ideas?

--Rainer
 

Duvie

The only time I had a problem with prime95 rebooting on its own in winxp with error log on and not reporting an error....was related to power not temp...
 

Qythyx

Good suggestion Buz2b. I'll try running more without Prime95. I have run for about 12 hours w/o it and w/o problems. I'll also double check the BSOD setting.

Duvie, how did you fix your power problem? As I said, I'm pretty sure my PS is ok. Is there anything else that could cause power drops?
 

Duvie

Well, my power issue is a bit more serious...I fixed it by reducing the vcore on my OC....however I have diagnosed the issue as a damaged resistor on the mobo...my real fix will come later when I replace the board with a newer, more advanced version.

Remember one thing though: Prime95 is a great test of the stability of your OC and system....it is only doing you a favor by pointing out that you have issues...it saves you from corrupted data or future damage to hardware...


Random reboots IMO have always been one of 2 things....power or heat!!!!
 

Qythyx

My biggest concern is that 1) it is a brand new MB that I got because my last one was unstable and 2) I'm NOT overclocking.

I'm really hoping there is something I can do to fix this other than stop running Prime95. Not because Prime95 is so important to me but because I want a really stable system, not one that I always have to worry is going to start crashing again.
 

Buz2b



<< START-RUN and entering regedit. Navigate to [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl] . In this look for value of "AutoReboot". Setting its value to 1 will activate Autoreboot. Setting the value to 0 should enable the BSOD error messages. >>

That should enable the BSOD error messages and at least give us a clue as to the trouble.
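
For anyone who would rather check the value without clicking through regedit, here is a minimal sketch using Python's standard `winreg` module. The key path and the `AutoReboot` name come from the quoted instructions; the helper names `describe_autoreboot` and `read_autoreboot` are made up for this example, and the registry read only works on Windows.

```python
import sys

# Key path taken from the quoted instructions (under HKEY_LOCAL_MACHINE).
CRASH_CONTROL = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def describe_autoreboot(value):
    """Translate the AutoReboot value into plain words."""
    if value == 1:
        return "auto-reboot on: a crash restarts the machine immediately"
    if value == 0:
        return "auto-reboot off: a crash leaves the blue screen on display"
    return "unexpected value: %r" % (value,)


def read_autoreboot():
    """Read AutoReboot from the registry (Windows only)."""
    import winreg  # standard library, but only present on Windows
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL) as key:
        value, _value_type = winreg.QueryValueEx(key, "AutoReboot")
    return value


if sys.platform == "win32":
    print(describe_autoreboot(read_autoreboot()))
else:
    # Not on Windows: just demonstrate how a value of 1 would be interpreted.
    print(describe_autoreboot(1))
```

This only reads the setting. Changing it is still best done the way the quote describes.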
I think you also need to reexamine your dedication to Prime95. If your system runs absolutely fine while not running Prime95, then what makes you think it is unstable and unreliable? Some systems can be affected adversely by one or two settings in the BIOS, for instance, and the only way it would ever be "exposed" is in Prime95. That doesn't mean that the system is unstable or unreliable by default. Try the temp test I suggested and activate the BSOD in XP. My guess is temps but it also could be a problem caused by a recently installed driver or program. Have you tried setting the BIOS at FailSafe or default settings to see if the trouble is duplicated? You might try that and only change the FSB so the CPU runs at its correct speed. Leave the rest of the settings alone. If it works OK, then start changing the BIOS settings back a few at a time to see if you can determine the "stress area".
 

Qythyx

Ok, so I started my system after it being off all day and the temperature rose up to 42 C. After letting it run all night it remained at that temp. I then set MBM to log every 1 second and started Prime95. It has now been about 2 hours and the system is still up and the temperature is around 49 C. It is still morning so as the day heats up the temperature will probably rise a few degrees.
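
With 1-second logging, a sudden reboot should show up as a gap in the log timestamps, and the rows just before the gap are the ones worth reading. A minimal sketch of finding those gaps, assuming the timestamps have already been pulled out of the log as `HH:MM:SS` strings (the format, the sample times, and the 5x slack factor are assumptions, not MBM's documented behaviour):

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected=timedelta(seconds=1), slack=5):
    """Return (last_before, first_after) pairs where logging paused
    for longer than slack * expected."""
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        if curr - prev > expected * slack:
            gaps.append((prev, curr))
    return gaps


# Hypothetical timestamps: logging stops at 10:00:02 and resumes at 10:01:30.
raw = ["10:00:00", "10:00:01", "10:00:02", "10:01:30", "10:01:31"]
times = [datetime.strptime(t, "%H:%M:%S") for t in raw]

gaps = find_gaps(times)
for last_before, first_after in gaps:
    print("log stopped after", last_before.time(), "and resumed at", first_after.time())
```

The voltage and temperature values in the last few rows before each gap are the closest thing the log has to a snapshot of the moment of the reset.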

I checked my BSOD settings and they were not set to automatically reboot so my previous resets were not BSODs.

I'll keep you posted as Prime95 continues to run.

BTW, Buz2b, I mostly agree with you that I could just stop running Prime95 to avoid the problem. My concern is that someday in the future I'll have a program that, like Prime95, produces instability but that I really want to run. Let's say for the sake of argument that when Doom 2000 comes out it uses the CPU enough to cause similar crashes. I'd like to find any way I can now to make my system reliably stable.
 

Buz2b



<< checked my BSOD settings and they were not set to automatically reboot so my previous resets were not BSODs. >>

Good chance that you are either getting an electrical (Power supply) problem or it is the temps. Have you checked to see how hot the RAM is? I'm not sure if somewhere in this thread you posted the brand/type of memory but that might also be an area to look at. Don't forget about the idea of putting a fan on the open case to see if that stops the reboots. That would confirm that it was a temp issue anyway.

Prime95 abuses a system more than a game would. I doubt there would be a comparison. However, I don't mean to imply that having this trouble should be something to just ignore. I only meant that by running Prime95 right now you are subjecting both your system and yourself to some needless abuse. ;) It really proves nothing except that under extremes, your system is not as stable as it should be for some reason. That same thing might bug me also. Did you take a look at your BIOS settings also? Specifically look at the areas dealing with the video card and memory timings. There's an excellent BIOS guide here.
 

Boonesmi

Lifer
Feb 19, 2001
14,448
1
81
im curious... has the ambient temp gone up?

Meaning, has the room your PC is in gone up much in temp with spring/summer getting here?


**** edit *** hehe nevermind i finally saw your last paragraph !!!
 

Qythyx

Thanks for more good ideas Buz2b.

I'll check the RAM temp when I get home tonight. Maybe I'll move the HD probe to the RAM so I can log it too. I don't remember the RAM brand off hand but it is DDR 333 so I imagine it could be hot.

I also haven't tried the fan on the open case idea yet but that's because I don't have a fan readily available. Also, at the moment I can't reproduce the problem. Prime95's been running for more than 6 hours now and the CPU temp is 48. Today is a cooler day, though, so that might explain why it's not getting up to 51 C like it was before. I guess I could stop a few of my fans to force a higher temp and see if that generates a reboot. What's the highest CPU temp that is safe before doing real damage?

Thanks.
 

Jik47

Member
Jun 9, 2001
65
0
0
Hey Guys,

I am having the rebooting problem with a MSI motherboard and Windows XP PRO. Matter of fact in the last hour it has rebooted Four times sitting there doing nothing. I am curious where that setting is for the auto reboot as I am getting those error messages after it does reboot. Thanks.....
 

Buz2b



<< I am curious where that setting is for the auto reboot as I am getting those error messages after it does reboot >>

Look up to the 5th post above yours. I quote the instructions there.

Edit:
Qythyx--As a curiosity, what is your DDR voltage running at? This review of that board mentioned a problem with the RAM voltage. That might produce some anomalies at high stress. Might want to take a look.
 

Qythyx

Thanks for the DDR Voltage pointer Buz2b. According to MBM it is running at 2.47 V, well below the 2.7 setting. Soltek did reply to my emails with info that it is possible that the shutdown is related to the ABS function and they have released a new BIOS version giving more access to this functionality.

Interestingly my system has now been up about 26 hours with Prime95 running the whole time. Current data is:

CPU temp: 49 C
Core 0: 1.78 V
Core 1 (DDR): 2.47 V
+3.3 V: 3.41 V
+12 V: 12.34 V
+5 V: 4.99 V
-12 V: -12.36 V
-5 V: -5.16 V

I'm wondering if those numbers are dangerously off spec? Percentage wise, the DDR is the worst.
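
The percentage claim is easy to check with a few lines of arithmetic. In this sketch the readings are the ones listed above, each rail's nominal value is taken to be its label, and the DDR reading is compared against the 2.7 V setting mentioned earlier in this post. Whether any of these deviations is actually out of spec is not something the sketch decides.

```python
# (name, measured volts, nominal volts). Nominal = the rail's label;
# DDR nominal is the 2.7 V setting mentioned in the post.
READINGS = [
    ("DDR", 2.47, 2.70),
    ("+3.3", 3.41, 3.30),
    ("+12", 12.34, 12.00),
    ("+5", 4.99, 5.00),
    ("-12", -12.36, -12.00),
    ("-5", -5.16, -5.00),
]


def percent_deviation(measured, nominal):
    """Signed deviation from nominal, as a percentage of nominal's magnitude."""
    return (measured - nominal) / abs(nominal) * 100.0


# Largest absolute deviation first.
ranked = sorted(
    READINGS,
    key=lambda r: abs(percent_deviation(r[1], r[2])),
    reverse=True,
)

for name, measured, nominal in ranked:
    print(f"{name:>4}: {percent_deviation(measured, nominal):+.1f}%")
```

Against a 2.7 V target the DDR line comes out roughly 8.5% low, while every other rail is within about 3.5% of its label, so the DDR reading is indeed the outlier by this measure.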

I haven't had a chance to check RAM temp yet, hopefully tonight.
 

Qythyx

According to Soltek the DDR voltage is supposed to be 2.5, not 2.7. That makes my 2.47V pretty close to spec. I'm not sure why the referenced review said they were expecting 2.7.

Thanks for the confirmation on my values, Boonesmi.
 

Buz2b



<< I'm not sure why the referenced review said they were expecting 2.7 >>

I'm not sure either. I didn't really pay attention to that when I first saw the article. Upon rereading it, maybe they were trying to achieve better stability at higher clock speeds and tried to increase the voltage to the DDR to assist in this. That's the only thing I can think of. Oh well, it was a "shot in the dark" anyway. BTW, have you tried to increase the voltage to the DDR? I'm just curious if what they report is true.


<< my system has now been up about 26 hours with Prime95 running the whole time >>

Are you doing anything else to the system while it is running Prime95? I mean other than web browsing; something that would "break the cycles" a bit more and put some different stress on it.
 

Qythyx

First, in response to the latest question about doing anything else besides Prime95: mostly my system is just doing Prime95. Sometimes I also play games; recently this has been Diablo II (how many times will I need to finish this game before I get tired of it?).

Here's the latest info...

I downloaded and installed the latest BIOS from Soltek. It adds access to the ABS II functionality that is in the latest Athlons. This allows me to set the ABS II shutdown temp and view the current temp. The default shutdown was at 75 C and my idle temp was around 69 C, so obviously I wasn't far below the limit. I increased the shutdown to 80 and we'll see what happens.

I also moved my temp probe that was on my HD and stuck it to the outside of my CPU heatsink (i.e., where it is not being cooled by the fan). I now have 3 CPU temperatures:

* ABS II temp - the sensor built into the CPU - idle at around 69 C and I can't view it via MBM
* CPU core temp - the sensor on the MB under the CPU - idle at around 44 and peaks at around 51
* heatsink temp - the sensor taped to the heatsink - idle at around 35 and peaks at around 42
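
A rough back-of-the-envelope check of the headroom these readings imply. It uses only numbers from this post. The one assumption, flagged in the code, is that the on-die temperature rises under load by at least as much as the socket sensor does, which has not actually been measured here.

```python
SHUTDOWN_C = 75      # default ABS II shutdown temperature
DIE_IDLE_C = 69      # ABS II (on-die) reading at idle
SOCKET_IDLE_C = 44   # motherboard sensor under the CPU, idle
SOCKET_LOAD_C = 51   # same sensor at its peak


def estimate(shutdown, die_idle, socket_idle, socket_load):
    """Return (idle headroom, socket rise under load, estimated die temp under load)."""
    headroom_idle = shutdown - die_idle
    load_rise = socket_load - socket_idle
    # ASSUMPTION: the die warms by at least as much as the socket sensor.
    est_die_load = die_idle + load_rise
    return headroom_idle, load_rise, est_die_load


headroom_idle, load_rise, est_die_load = estimate(
    SHUTDOWN_C, DIE_IDLE_C, SOCKET_IDLE_C, SOCKET_LOAD_C
)

print("idle headroom below shutdown:", headroom_idle, "C")
print("socket sensor rise under load:", load_rise, "C")
print("estimated die temp under load:", est_die_load, "C")
print("estimate exceeds shutdown limit:", est_die_load > SHUTDOWN_C)
```

If that assumption holds, a 7 C rise on top of a 69 C idle would cross the 75 C default, which would be consistent with resets that only happen under Prime95. It is an estimate, not a measurement.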

So here're my new questions:
1. Does anyone else have an Athlon 1800+ who can see the ABS II temp? What do you have? Is 69 C too hot?
2. Is 44 C too hot for the core temp of an 1800+?
3. If these are too hot, what might be the cause? The heatsink is warm to the touch so I guess that means there is reasonable contact between the heatsink and the CPU. Also the heatsink fan is working fine. It is the standard heatsink and fan combo that comes with the CPU from AMD.

Thanks.
 

Buz2b



<< It is the standard heatsink and fan combo that comes with the CPU from AMD >>

First, not running an XP CPU, I'll wait and let someone else answer if those temps are too high. They seem a bit high to me though. I was under the impression that the XP series ran cooler than the standard Athlons. Second, have you taken a close look at the HSF to see if it is lined up properly on the CPU? Sometimes it can be slightly "tilted" on the die and lose efficiency. You might also want to try removing the HSF, cleaning off the remnants of the thermal paste, reapplying the paste and then reinstalling the HSF. Make sure of a good flat contact. Just because you "feel" that the HS is warm doesn't mean it is transferring heat as it should. Another idea would be to get a better HSF. Yes, the "stock" ones are "OK" and keep the temps well within spec. However, sometimes you just need to get the temps down a bit or some CPUs will not be as stable as they should. I ran across that with a 1.33 CPU once. Had the stock HSF and ran in the upper 50's. I experienced some mild instability. The only thing I changed was getting a good HSF which brought the temps down to around 42-45C. No more problems at all. This was not an overclocked CPU either.
But, let's hope someone can shed a bit of light on your temps and how they rate with other XP cpu's.