
CPU overheat auto shutoff

l337viton

Junior Member
Aug 9, 2007
5
0
0
Do pretty much all modern server CPUs shut off once a temperature threshold is reached?

We have a situation where keeping the server alive for as long as possible, even if it overheats, is more important than the server surviving. Granted, I don't want to smoke CPUs for the heck of it, but those are the requirements for this application.

So in other words, if for some reason the server overheats (the CPU overheats), it needs to run until it burns out and not shut off when it reaches an "unsafe" temperature.

I am open to either Intel Xeons or AMD Opterons. However I do not believe Intel processors will be up to the job. From my research:

**To address catastrophic failures as well as transitory thermal events, Intel processor engineers have integrated thermal protection and monitoring features in Intel's latest processors to protect your investment.**

http://www.intel.com/cd/ids/de...ng/downloads/54118.htm

I can't presently find any information from AMD, so I'm posting to see if anyone else has run into this.

thanks!

 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
So wait... if your cooling solution fails, you want the server to keep running for the additional 3 seconds before it burns out as opposed to having the chip slow down until the thermal event passes?
 

l337viton

Junior Member
Aug 9, 2007
5
0
0
Originally posted by: TuxDave
So wait... if your cooling solution fails, you want the server to keep running for the additional 3 seconds before it burns out as opposed to having the chip slow down until the thermal event passes?

Yes and no...

If the cooling solution were to fail (most likely a dead fan), the heatsink would still be able to provide some cooling, so I doubt it would only run for an additional 3 seconds :)

Now, if the CPU slows down, that is OK (say it goes from 2.8GHz to 2.0GHz), as long as it doesn't shut off.

For example, let's say a CPU has a recommended operating temperature of 30C-60C (I think that's pretty common/accurate). The way I understand it, once an Intel Xeon reaches 61C (or whatever the catastrophic temp is), it shuts off automatically. Nothing you can do about it.

I just need to know if the same thing will happen with an AMD Opteron. My knowledge of old AMD procs was that the thermal shutdown was a BIOS thing that could be turned off. Just trying to get confirmation of that for the latest batch of Opterons.

Thanks!
 

a123456

Senior member
Oct 26, 2006
885
0
0
I think you'll have issues with this. Pretty much all processors, AMD and Intel, made in the last few years have thermal protection built in. Well, at first, it'll just throttle like crazy by slowing down the CPU but keep running assuming there's some semblance of cooling on there.

The last CPU that I remember that would melt a motherboard if it overheated was maybe 10+ years ago? Some of the older Intel and AMD ones would melt.

How exactly are you planning to overheat your computers anyway? Assuming no overclocking, most CPUs should be okay with the stock cooler. If you're at the melting point, the extra time you get probably isn't more than a few minutes anyway.
 

Modelworks

Lifer
Feb 22, 2007
16,240
7
76
With poor/no cooling it would be a bad cycle for everything involved including the motherboard. The cpu would heat up, cool down, heat up, cool down and they do that normally, but not to the extremes that trigger the internal throttling.

The only thing that would stop the CPU completely would be a motherboard setting to shut down in case of fan failure; some boards have that setting.
 

l337viton

Junior Member
Aug 9, 2007
5
0
0
Well, the idea behind this is as follows:

These servers will be capturing tons of sensory data. Data that needs to be processed to assist in making life and death decisions. The servers will not be installed in a data center; they will in fact be vehicle mounted (or that is the idea that is being kicked around). Should the vehicle be damaged, the server may also become damaged (cooling being just one concern at this point). Obviously the servers will be ruggedized, but because of the role they play they need to run until the last transistor bites the dust :)

Note, due to the nature of the application, multiple redundant systems will be used.

Anyway.. sounds like the CPUs will shut down automatically anyway... guess this idea might just get shelved in the end.

On another note, is www.amd.com offline right now? Can't seem to get it to load up :(
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Originally posted by: l337viton
Originally posted by: TuxDave
So wait... if your cooling solution fails, you want the server to keep running for the additional 3 seconds before it burns out as opposed to having the chip slow down until the thermal event passes?

Yes and no...

If the cooling solution were to fail (most likely a dead fan), the heatsink would still be able to provide some cooling, so I doubt it would only run for an additional 3 seconds :)

Now, if the CPU slows down, that is OK (say it goes from 2.8GHz to 2.0GHz), as long as it doesn't shut off.

For example, let's say a CPU has a recommended operating temperature of 30C-60C (I think that's pretty common/accurate). The way I understand it, once an Intel Xeon reaches 61C (or whatever the catastrophic temp is), it shuts off automatically. Nothing you can do about it.

I just need to know if the same thing will happen with an AMD Opteron. My knowledge of old AMD procs was that the thermal shutdown was a BIOS thing that could be turned off. Just trying to get confirmation of that for the latest batch of Opterons.

Thanks!

So from your presentation:
The catastrophic thermal protection mechanism was an "all or nothing" capability without pre-notification of a thermal problem. Once the preset core temperature was reached, the processor was shutdown. With the introduction of Pentium 4 and Intel Xeon processors, a new thermal protection mechanism was introduced, allowing for the processor to automatically control the processor temperature before reaching the catastrophic shutdown temperature but at the expense of temporarily reducing processor performance.

So the xeons will throttle first before hitting the catastrophic temperature and if it was a simple cpu fan problem, the clock throttling should be enough to keep the temperature from hitting the catastrophic temperature. If that isn't enough and the temperature hits the catastrophic temperature, it's bound to die pretty fast without shutdown.
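
To make the two stages concrete, here is a minimal sketch of the throttle-first, shutdown-last behaviour described in that presentation. The threshold values and the names `thermal_action`, `THROTTLE_TEMP_C` and `CATASTROPHIC_TEMP_C` are made up for illustration; real trip points are set per processor and are not taken from any datasheet here.

```python
from enum import Enum


class ThermalAction(Enum):
    NORMAL = "normal"
    THROTTLE = "throttle"
    SHUTDOWN = "shutdown"


# Hypothetical thresholds for illustration only; real values differ per CPU.
THROTTLE_TEMP_C = 70.0
CATASTROPHIC_TEMP_C = 100.0


def thermal_action(core_temp_c,
                   throttle_at=THROTTLE_TEMP_C,
                   shutdown_at=CATASTROPHIC_TEMP_C):
    """Return what a two-stage thermal protection scheme would do."""
    # Stage 2: the catastrophic trip point always wins.
    if core_temp_c >= shutdown_at:
        return ThermalAction.SHUTDOWN
    # Stage 1: trade performance for temperature before it gets that far.
    if core_temp_c >= throttle_at:
        return ThermalAction.THROTTLE
    return ThermalAction.NORMAL


if __name__ == "__main__":
    for temp in (55.0, 85.0, 101.0):
        print(temp, thermal_action(temp).value)
```

The point of the model is only that throttling sits between normal operation and shutdown, so a dead fan should land in the middle band long before the last one.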
 

l337viton

Junior Member
Aug 9, 2007
5
0
0
Originally posted by: TuxDave
So the xeons will throttle first before hitting the catastrophic temperature and if it was a simple cpu fan problem, the clock throttling should be enough to keep the temperature from hitting the catastrophic temperature. If that isn't enough and the temperature hits the catastrophic temperature, it's bound to die pretty fast without shutdown.

Yup, pretty sure Intel is gonna be a no go, wondering about AMD now...
 

dbcooper1

Senior member
May 22, 2008
594
0
76
If it's just gathering data and writing it to disk, just use an Atom or something low power and passively cooled like that. I don't think it would burn up unless the heatsink fell off for some reason.
 

l337viton

Junior Member
Aug 9, 2007
5
0
0
Originally posted by: dbcooper1
If it's just gathering data and writing it to disk, just use an Atom or something low power and passively cooled like that. I don't think it would burn up unless the heatsink fell off for some reason.

Excellent idea! However, the server will be ingesting (2) 10 gigabit channels worth of data and writing it to 24 drives in a RAID array (exact RAID configuration TBD). Not sure the Atom will be up to the task.
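
For a rough sense of scale, here is a back-of-the-envelope sketch of that data rate. It assumes both links run at full line rate, decimal units, and writes spread evenly over all 24 drives with no RAID parity or mirroring overhead (the RAID level is still TBD, so the real per-drive figure would be higher). The function name `required_throughput` is just for illustration.

```python
def required_throughput(channels=2, gbit_per_channel=10.0, drives=24):
    """Return (total MB/s, MB/s per drive) for the assumed ingest rate."""
    total_gbit_per_s = channels * gbit_per_channel
    # Decimal units: 1 Gbit = 1000 Mbit, and 8 bits per byte.
    total_mb_per_s = total_gbit_per_s * 1000 / 8
    # Assumes an even stripe across all drives with no redundancy overhead.
    per_drive_mb_per_s = total_mb_per_s / drives
    return total_mb_per_s, per_drive_mb_per_s


if __name__ == "__main__":
    total, per_drive = required_throughput()
    print(f"{total:.0f} MB/s total, {per_drive:.1f} MB/s per drive")
    # prints "2500 MB/s total, 104.2 MB/s per drive"
```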

thanks!
 

dbcooper1

Senior member
May 22, 2008
594
0
76
Originally posted by: l337viton
If it's just gathering data and writing it to disk, just use an Atom or something low power and passively cooled like that. I don't think it would burn up unless the heatsink fell off for some reason.

Excellent idea! However, the server will be ingesting (2) 10 gigabit channels worth of data and writing it to 24 drives in a RAID array (exact RAID configuration TBD). Not sure the Atom will be up to the task.

thanks!

How about the 330? And all that data is funneled through the PCI bus? Given that it's mobile, the larger concern might be cooling and shock mounting all those 10k rpm drives. Hope it has redundant power supplies; how big is this thing?
 

Modelworks

Lifer
Feb 22, 2007
16,240
7
76
I would use embedded processors for data capture. There are some great ones that can capture data faster than any x86 processor out there and write it to disk using less power and producing far less heat.
And be more reliable doing the task.
 

Billb2

Diamond Member
Mar 25, 2005
3,035
70
86
Intels throttle and AMDs shut off and there's nothing you can do about either.

If this is a "big bucks" project (or when it's commercialized), maybe AMD or Intel would help with some proprietary CPUs. Money talks, ya know!
 

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,225
126
Originally posted by: l337viton
If it's just gathering data and writing it to disk, just use an Atom or something low power and passively cooled like that. I don't think it would burn up unless the heatsink fell off for some reason.

Excellent idea! However, the server will be ingesting (2) 10 gigabit channels worth of data and writing it to 24 drives in a RAID array (exact RAID configuration TBD). Not sure the Atom will be up to the task.

thanks!

You're getting two 10 gigabit channels worth of data, from something in a vehicle? That's some mobile broadband plan you've got going!

Edit: N/m, it must be some sort of sat. dish.
 

faxon

Platinum Member
May 23, 2008
2,109
1
81
i skimmed the thread. one thing i have seen in most modern CPUs and motherboards is the ability to choose the temperatures at which your system will throttle down or shut off. on my Q9650 i have it set to shut off at 80C right now (which i only ever hit on a hot day with linpack), but you could probably set it higher, and enable thermal throttling options in bios so it will drop the CPU multiplier and voltage to their lowest stable settings until temperatures stabilize. the chances of this happening with proper cooling though are slim to none. even in a 1U server, with heatpipes you can channel the heat just about wherever you need it to go relatively easily and vent it out the back of the server rack just fine. i would check what cooling your cpu will have access to before building your system. if it's going into a tower box, even a low profile cooler like the thermalright AXP-140 (http://www.thermalright.com/ne...oduct_cpu_axp140.html) will be more than enough to keep a CPU running at stock speeds from overheating, unless you are pumping hot air directly into the case from an oven or something.
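
if you want to see how close you actually get to whatever limit you set, here is a rough sketch for logging temperatures. it assumes a Linux box that exposes the kernel's thermal zones under /sys/class/thermal (each `thermal_zoneN/temp` file holds millidegrees Celsius); nothing in this thread says which OS is being used, so treat that as an assumption. the helper names `read_zone_temps` and `zones_over` are made up for the example, and the 80C limit is just the number from my own setup.

```python
from pathlib import Path


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees C} for every readable thermal zone."""
    temps = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            # The kernel reports the temperature in millidegrees Celsius.
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
        temps[zone.name] = millideg / 1000.0
    return temps


def zones_over(temps, limit_c):
    """List the zones at or above the given limit in degrees C."""
    return [name for name, temp_c in temps.items() if temp_c >= limit_c]


if __name__ == "__main__":
    current = read_zone_temps()
    print(current)
    print("at or over 80C:", zones_over(current, 80.0))
```

this only watches the temperature; it does not change any throttle or shutdown threshold, which stays a bios/firmware matter.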

ed: also, if you are sticking this in a 4U server rack that's already in a loud room, consider investing in some super high capacity delta industrial fans for case cooling. those fans are loud as hell but depending on which ones you get, you can get ones which move as much as 300CFM.
 

Zensal

Senior member
Jan 18, 2005
740
0
0
I think I remember a video that had a P4 running with the heatsink off. It just kept throttling down to a crawl but never died. Don't know if the newer Xeons will do that though.