Why are there such discrepancies in CPU temperature readings?

beatle

Diamond Member
Apr 2, 2001
5,661
5
81
It amazes me that after years of having on-die temperature diodes, we're still guessing how hot the core of a CPU is. Granted, there is some method to the madness that is CPU temperature readings, but it's not consistent across motherboards. Seems to me like the standard for reading the value on the cpu temperature diode would be passed on to motherboard manufacturers so that they can calibrate their sensors accordingly. Wouldn't the value be consistent at least within a chip generation? The boards have no problems with running x86 code - something far more complex than taking the value reported by the diode and turning it into a proper temperature reading.

Instead, we find motherboards that read the same chip, with the same cooling and the same ambient temperature, as having temperatures that differ by up to 10C! Is there some technical roadblock here? Is the true temperature not known because there is no benchmark on which to base the reading? Are manufacturers trying to edge each other out by claiming that their board will make your chip run cooler? Are manufacturers just too incompetent/lazy to provide an accurate solution? I'm guessing the latter two are not true, but the former is possible. What do you think?
 

jagec

Lifer
Apr 30, 2004
24,442
6
81
because making a highly accurate on-die thermistor would cost a little more, and is not a priority for CPU manufacturers.
 

beatle

Diamond Member
Apr 2, 2001
5,661
5
81
If it's the on-die thermistor that's inaccurate, wouldn't we see these inaccuracies as similar readings across all boards?
 

Calin

Diamond Member
Apr 9, 2001
3,112
0
0
I think there might be white noise in the reading process. So a mainboard might take more readings, or faster readings, and average the result over a longer period, which would help reduce the variations.
(Just an idea; I don't know if this is implemented.)
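
A minimal sketch of the averaging idea, assuming the read noise is Gaussian with a 2 C standard deviation around a true 60 C (both numbers are invented for illustration, and `read_sensor`, `averaged_reading`, and `spread` are hypothetical helpers, not any board's firmware). It only shows that averaging more samples narrows the random scatter. Averaging would not remove a fixed offset between two boards.

```python
import random
import statistics


def read_sensor(true_temp_c, noise_sd_c, rng):
    """Simulate one noisy reading (assumed model: Gaussian noise)."""
    return rng.gauss(true_temp_c, noise_sd_c)


def averaged_reading(true_temp_c, noise_sd_c, n_samples, rng):
    """Average n_samples raw readings into one reported value."""
    samples = [read_sensor(true_temp_c, noise_sd_c, rng) for _ in range(n_samples)]
    return statistics.fmean(samples)


def spread(n_samples, trials=2000, seed=0):
    """Standard deviation of the reported value across many simulated reports."""
    rng = random.Random(seed)
    reports = [averaged_reading(60.0, 2.0, n_samples, rng) for _ in range(trials)]
    return statistics.pstdev(reports)


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        print(f"{n:3d} samples per report -> spread {spread(n):.2f} C")
```

Under this model the scatter should shrink roughly with the square root of the number of samples averaged.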
 

Peter

Elite Member
Oct 15, 1999
9,640
1
0
Inaccuracy is one thing. The other, much bigger factor is: They're cheating. There is so much hype around the importance of low CPU temperature readings, partially because that's what people are used to from back when we didn't have CPU internal probes. The outcome is that quite a few mainboard BIOSes display a LOWER number than what's actually being measured - and because they do the same thing in the ACPI thermal zone object, you'll get the same "adjusted" reading later, when a tool like Sandra queries it.

There's no reason to panic when your CPU temperature reads 75C, but for the aforementioned reasons users do, and the support hotline gets swamped. Impractical solution: Try to convince users they're OK. That doesn't work, not even if you slap them with datasheets repeatedly. Practical solution: Cook the reading. (Pun intended)
 

Matthias99

Diamond Member
Oct 7, 2003
8,808
0
0
Originally posted by: beatle
If it's the on-die thermistor that's inaccurate, wouldn't we see these inaccuracies as similar readings across all boards?

Yes, if they were 'inaccurate' in that, for example, every chip read exactly 10 degrees C lower than the 'true' temperature. However, one might read 10 degrees C too cold, one might read 10 degrees C too hot (the swing is probably not THAT large, but I hope you get the drift). The sensors are basically designed to be accurate enough to tell if your HSF is not attached, or the CPU is drastically overheating for some reason -- NOT to provide super-accurate temperature readings for tweakers.

I'm not sure how much 'cheating' there really is (wouldn't that defeat the whole point of having a thermal shutdown limit???), but that may be a factor as well.
 

rgwalt

Diamond Member
Apr 22, 2000
7,393
0
0
Another issue is calibration. For sensors to be at all accurate, they need to be calibrated. In fact, each individual sensor manufactured will give slightly different readings, so they must be calibrated one by one. Unfortunately, there isn't a good way to calibrate a CPU thermal diode that I know of, and BIOSes don't offer the ability to change the calibration settings, so we are stuck with what they give us.

Ryan
 

f95toli

Golden Member
Nov 21, 2002
1,547
0
0
You only need to calibrate sensors if you want very accurate readings (maybe ±0.5 degree). There are even some "real" temperature sensors, like the LS10 (a Si diode) from Lakeshore, that are accurate within ±0.5K (or it might be ±1K) between 4K and 350K even without calibration; you just use a standard curve.

The spread between sensors should not be more than 2-3K or so in this range (say between 50 and 90 degrees C) unless there is something very wrong with the fab.
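
As a rough sketch of what "just use a standard curve" could look like in software: measure the diode's forward voltage and interpolate along a shared voltage-to-temperature table. The table below is invented for illustration (a straight 2 mV per degree slope), not taken from any datasheet, and `temp_from_voltage` is a hypothetical helper.

```python
# Hypothetical standard curve: (forward voltage in volts, temperature in C).
# These points are made up for the example, not real sensor data.
STANDARD_CURVE = [
    (0.50, 100.0),
    (0.54, 80.0),
    (0.58, 60.0),
    (0.62, 40.0),
    (0.66, 20.0),
]


def temp_from_voltage(volts, curve=STANDARD_CURVE):
    """Linearly interpolate a temperature from the shared standard curve."""
    points = sorted(curve)  # ascending voltage
    if not points[0][0] <= volts <= points[-1][0]:
        raise ValueError("voltage is outside the standard curve")
    for (v0, t0), (v1, t1) in zip(points, points[1:]):
        if v0 <= volts <= v1:
            fraction = (volts - v0) / (v1 - v0)
            return t0 + fraction * (t1 - t0)
    raise ValueError("no matching curve segment")  # unreachable given the range check


if __name__ == "__main__":
    print(f"nominal sensor:   {temp_from_voltage(0.580):.1f} C")
    # A sensor whose voltage sits 4 mV off nominal reads about 2 degrees off
    # when the same uncalibrated curve is applied to it.
    print(f"off-nominal unit: {temp_from_voltage(0.584):.1f} C")
```

With a shared curve, any unit-to-unit variation in the diode shows up directly as a temperature error, which is the spread being discussed here.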



 

jagec

Lifer
Apr 30, 2004
24,442
6
81
Originally posted by: Matthias99
I'm not sure how much 'cheating' there really is (wouldn't that defeat the whole point of having a thermal shutdown limit???), but that may be a factor as well.

The thermal shutdown limit (at least on Pentiums) is built into the chip, yes? It's only the reading that gets passed to Windows that gets cooked.
 

Matthias99

Diamond Member
Oct 7, 2003
8,808
0
0
Originally posted by: jagec
Originally posted by: Matthias99
I'm not sure how much 'cheating' there really is (wouldn't that defeat the whole point of having a thermal shutdown limit???), but that may be a factor as well.

The thermal shutdown limit (at least on Pentiums) is built into the chip, yes? It's only the reading that gets passed to Windows that gets cooked.

P4s actually *throttle* themselves at high temps, but I do not believe they can shut down the system because of high temperatures (for instance, high ambient temperatures, or if the CPU is not throttling for some reason) -- the MB has to do that.

On AMD boards (at least with the AthlonXP; I'm *pretty* sure the A64 works like this as well), thermal management is done by the MB. They are supposed to cut the power if the CPU temp gets above some ridiculous threshold like 90-100 degrees C.

It is possible that a motherboard would use the 'true' reading for maintaining its thermal limits, but pass a lower reading on to Windows. However, this would be a very bad behavior if you intended to use an OS-level utility for thermal management rather than the tools in the BIOS.
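
A toy model of that scenario, assuming a board that compares the raw reading against its cutoff threshold but subtracts a fixed amount before reporting to the OS. The `Board` class, the 10 C offset, and the 95 C and 70 C thresholds are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Board:
    """Toy board: protects on the raw reading, reports a lowered one."""
    cutoff_c: float          # hypothetical hardware cutoff threshold
    display_offset_c: float  # hypothetical amount subtracted before reporting

    def hardware_cutoff(self, raw_c: float) -> bool:
        """Board-level protection sees the unadjusted reading."""
        return raw_c >= self.cutoff_c

    def reported_temp(self, raw_c: float) -> float:
        """Value handed to the OS and to monitoring tools."""
        return raw_c - self.display_offset_c


def os_utility_alarm(reported_c: float, limit_c: float) -> bool:
    """An OS-level tool can only act on what the board reports."""
    return reported_c >= limit_c


if __name__ == "__main__":
    board = Board(cutoff_c=95.0, display_offset_c=10.0)
    raw = 78.0
    shown = board.reported_temp(raw)
    print(f"raw {raw:.0f} C, reported {shown:.0f} C")
    print("hardware cutoff tripped:", board.hardware_cutoff(raw))
    print("OS utility alarm at 70 C:", os_utility_alarm(shown, 70.0))
```

In this example the OS utility stays silent at a real 78 C because it only sees 68 C, which is the failure mode described above.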
 

Peter

Elite Member
Oct 15, 1999
9,640
1
0
Re the P4: Yes they can. They have a two-stage reaction to thermal failure: First, throttle down to 50% operation / 50% standstill mode. If still overheating, and getting past "critical" temperature, signal this to the CPU voltage regulation circuit for an immediate cutoff.

32-bit Athlon boards _can_ implement the same thing. The extra bit over a P4 system is a temperature monitoring chip (which the P4 boards have anyway, just for actual readback of the temperature). The BIOS would set this up for the "throttle" and "cutoff" threshold temperatures; the reaction to the actual failure would then again be pure hardware, no software intervention required. Here, the south bridge does the throttling when signalled to do so by the monitoring chip, and the monitoring chip would signal the VRM to cut off.
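
The two-stage reaction can be sketched as a simple threshold comparison. This is only an illustration of the logic, which on real boards is done in hardware; the `thermal_action` function and the 70 C and 100 C values are placeholders, not figures from any datasheet.

```python
from enum import Enum


class Action(Enum):
    NORMAL = "normal"
    THROTTLE = "throttle"  # stage 1: alternate between running and standstill
    CUTOFF = "cutoff"      # stage 2: tell the voltage regulator to cut power


def thermal_action(temp_c, throttle_c, cutoff_c):
    """Pick the reaction for a temperature, given the two programmed thresholds."""
    if throttle_c >= cutoff_c:
        raise ValueError("throttle threshold must be below the cutoff threshold")
    if temp_c >= cutoff_c:
        return Action.CUTOFF
    if temp_c >= throttle_c:
        return Action.THROTTLE
    return Action.NORMAL


if __name__ == "__main__":
    # Placeholder thresholds, as a BIOS might program them into a monitoring chip.
    for temp in (55.0, 72.0, 101.0):
        print(f"{temp:5.1f} C -> {thermal_action(temp, 70.0, 100.0).value}")
```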

I haven't had insight into how the AMD64 CPUs handle this.

Thermal management, even when supervised by Windows, runs through the ACPI "Thermal Zone" objects provided by the board's BIOS. These can do whatever they please. It only starts going wrong when non-ACPI 3rd party tools are being used that read the monitoring chips directly.
 

Matthias99

Diamond Member
Oct 7, 2003
8,808
0
0
Originally posted by: Peter
Re the P4: Yes they can. They have a two-stage reaction to thermal failure: First, throttle down to 50% operation / 50% standstill mode. If still overheating, and getting past "critical" temperature, signal this to the CPU voltage regulation circuit for an immediate cutoff.

32-bit Athlon boards _can_ implement the same thing. The extra bit over a P4 system is a temperature monitoring chip (which the P4 boards have anyway, just for actual readback of the temperature). The BIOS would set this up for the "throttle" and "cutoff" threshold temperatures; the reaction to the actual failure would then again be pure hardware, no software intervention required. Here, the south bridge does the throttling when signalled to do so by the monitoring chip, and the monitoring chip would signal the VRM to cut off.

I haven't had insight into how the AMD64 CPUs handle this.

Thermal management, even when supervised by Windows, runs through the ACPI "Thermal Zone" objects provided by the board's BIOS. These can do whatever they please. It only starts going wrong when non-ACPI 3rd party tools are being used that read the monitoring chips directly.

Interesting -- I didn't realize P4s had those extra capabilities. Good stuff!
 

Peter

Elite Member
Oct 15, 1999
9,640
1
0
The Pentium Pro had that already, IIRC. Through all the revisions and reshapings of PPro, PII and PIII, there have been numerous errata on the THRMTRIP# signal line that essentially prevented board makers from implementing the emergency cutoff ... but on the P4 it works correctly - and on the previous Intels, one could still have the external monitoring chip do that exactly as you would with an Athlon design.
 

PhotoLab

Junior Member
Feb 6, 2005
11
0
0
<<<Thermal management, even when supervised by Windows, runs through the ACPI "Thermal Zone" objects>>>

See:
Advanced Configuration & Power Interface
ADVANCED CONFIGURATION AND POWER INTERFACE SPECIFICATION
Revision 3.0, September 2, 2004
http://www.acpi.info/spec.htm

Long story short...

Win98 -PMPI FanThrottleToleranceAc/Dc Not Implemented Under Win98
Article ID : 189803

Windows 98 Does Not Support ACPI Passive Cooling Mode
Article ID : 189091


On P4 CPUs, there is throttling going on that will be controlled via WinXP, and most likely by your BIOS throttling setting.

If you go back to around 2000 and look at some of those P4 motherboard comparisons, I have this feeling that Tom [et al.] blew it when a motherboard was reported to have lower scores {e.g., MSI 6339}.

Anybody follow me???
 

Peter

Elite Member
Oct 15, 1999
9,640
1
0
The "passive cooling" (throttling the CPU speed down) is engaged when the "active cooling" (CPU fan) is going at full blast and the CPU temperature reaches a critical threshold nonetheless.

The P4 does that via its internal hardware, you don't even need ACPI for it. This is why they don't fry when you forget to put a heatsink on.
 

stevty2889

Diamond Member
Dec 13, 2003
7,036
8
81
My question is: if the temp readings are inaccurate, are the inaccurate readings used to kick in the thermal throttling? So if a temp of 70C is being reported, but the actual temp is lower, the thermal throttling kicks in anyway? Is there any way to disable the thermal throttling, while allowing the critical temp shutdown to still work?
 

Matthias99

Diamond Member
Oct 7, 2003
8,808
0
0
Originally posted by: stevty2889
My question is: if the temp readings are inaccurate, are the inaccurate readings used to kick in the thermal throttling?

I would assume so, unless the CPU itself is knowingly reporting an inaccurate temperature to the MB.

So if a temp of 70C is being reported, but the actual temp is lower, the thermal throttling kicks in anyway?

I'm not sure that they start throttling at 70C, but presumably, yes, the CPU will throttle whenever it "thinks" it is at a high enough temperature. In theory, if the thermal diode was defective and constantly reported a very high temperature, the CPU would keep throttling itself (or shut down), even if it was quite cool.

Is there any way to disable the thermal throttling, while allowing the critical temp shutdown to still work?

AFAIK, there is no way to disable or modify the thermal throttling. This is done in hardware, entirely inside the CPU. You couldn't turn it off even if you wanted to.

Basically, the temperature readings are accurate enough for what they're supposed to do. They're built for making sure the chip isn't melting down, not providing laboratory accuracy for comparative measurements between systems.
 

PhotoLab

Junior Member
Feb 6, 2005
11
0
0
>>My question is if the temp readings are inaccurate<<

Here is some tidbit info:

In the Intel® Pentium® 4 Processor Datasheet (423 pin), Intel notes:

7.3.1 Thermal Diode
The Pentium 4 processor incorporates an on-die thermal diode. A thermal sensor located on the
system board may monitor the die temperature of the Pentium 4 processor for thermal
management/long term die temperature change purposes. Table 35 and Table 36 provide the diode
parameter and interface specifications. This thermal diode is separate from the Thermal Monitor's
thermal sensor and cannot be used to predict the behavior of the Thermal Monitor.



7.3 Thermal Monitor
Thermal Monitor is a new feature found in the Pentium 4 processor which allows system designers
to design lower cost thermal solutions, without compromising system integrity or reliability. By
using a factory-tuned, precision on-die thermal sensor, and a fast acting thermal control circuit
(TCC), the processor, without the aid of any additional software or hardware, can keep the
processors' die temperature within factory specifications under typical real world operating
conditions. Thermal Monitor thus allows the processor and system thermal solutions to be designed
much closer to the power envelopes of real applications, instead of being designed to the much
higher maximum theoretical processor power envelopes.

Thermal Monitor controls the processor temperature by modulating the internal processor core
clocks.

<<<Is there any way to disable the thermal throttling, while allowing the critical temp shutdown to still work? >>>

On Intel's P4s, critical shutdown will always work. To disable thermal throttling would require programming skills, unless a utility program exists.

 

PhotoLab

Junior Member
Feb 6, 2005
11
0
0
<<<AFAIK, there is no way to disable or modify the thermal throttling. This is done in hardware, entirely inside the CPU. You couldn't turn it off even if you wanted to. >>>

Throttling can be controlled, but...from Intel:

"If automatic mode is disabled the processor will be operating out of specification and cannot be guaranteed to provide reliable results. Regardless of enabling of the automatic or On-Demand modes, in the event of a catastrophic cooling failure, the processor will automatically shut down when the silicon has reached a temperature of approximately 135 °C. At this point the system bus signal THERMTRIP# will go active and stay active until the processor has cooled down and RESET# has been initiated. THERMTRIP# activation is independent of processor activity and does not generate any bus cycles."
 

PhotoLab

Junior Member
Feb 6, 2005
11
0
0
<AFAIK, there is no way to disable or modify the thermal throttling>

The ACPI spec is what handles this process by talking to the BIOS, but Microsoft didn't get their act together on the Win98 OS.

WinXP most likely does...have not checked it to make sure. Also, the motherboard's BIOS must have the correct code...hence the motherboard BIOS and the OS work together...

If you are running a program like Flight Sim, it ramps the CPU up to about 100%. I was messing with a fan on the motherboard and touched the CPU fan last night...guess what, destroyed several fan blades. I was going to check out the thermal aspects, but shiit happens.

 

stevty2889

Diamond Member
Dec 13, 2003
7,036
8
81
Well, my thermal throttling is definitely kicking in at 70C. At 69C it stops; at 70C+ it kicks in. I am waiting on my water cooling kit now, because it is completely ridiculous that I have throttling kicking in even if I underclock to 3.2GHz. THERMTRIP# is what I believe shuts down the CPU, and I think PROCHOT# is what is used for the throttling. Since they are on separate pins, I was wondering if by isolating that pin, you could disable the thermal throttling. If my temp reading is inaccurate and that reading is what is causing the thermal throttling to kick in, it would be quite nice to be able to disable it, since it keeps this CPU from performing at 100% at speeds even below stock.
 

BEL6772

Senior member
Oct 26, 2004
225
0
0
Originally posted by: stevty2889
THERMTRIP# is what I believe shuts down the CPU, and I think PROCHOT# is what is used for the throttling. Since they are on separate pins, I was wondering if by isolating that pin, you could disable the thermal throttling.

Nope. The CPU clocks are controlled by the CPU. The thermal sensors that provide the data to the thermal protection (clock throttling) circuitry are calibrated in the factory. Once they're set, that's it. The CPU will forevermore begin throttling at the point that the internal comparators are calibrated to.
 

stevty2889

Diamond Member
Dec 13, 2003
7,036
8
81
Well I guess it's a good thing I am going to water cooling, cause otherwise this damn chip won't stop throttling itself.
 

PhotoLab

Junior Member
Feb 6, 2005
11
0
0
<<<Well my thermal throttling is definatly kicking in at 70c>>>

Check out Intel's doc on 423 pin P4

ftp://download.intel.com/design/Pentium4/datashts/24919805.pdf

See:
7.3 Thermal Monitor
7.3.1 Thermal Diode
8.4.1 Boxed Processor Cooling Requirements
8.4.2 Variable Speed Fan

There are three concepts here:

1. Active Cooling
2. Passive Cooling
3. Intel's CPU Control

Your OS is supposed to take care of items 1 & 2, and Intel takes care of item 3.



 

PhotoLab

Junior Member
Feb 6, 2005
11
0
0
<<Originally posted by: stevty2889
Thermtrip# is what I believe shuts down the CPU>>

<<Nope. The CPU clocks are controlled by the CPU>>
===============================

First poster is correct..."The processor protects itself from catastrophic overheating by use of an internal thermal sensor. This sensor is set well above the normal operating temperature to ensure that there are no false trips. The processor will stop all execution when the junction temperature exceeds approximately 135°C."

Second poster is talking about something else..."Automatic mode is required for the processor to operate within specifications and must first be enabled via BIOS....When TCC is enabled, and a high temperature situation exists (i.e. TCC is active), the clocks will be modulated by alternately turning the clocks off and on at a 50% duty cycle. Clocks will not be off more than 3 µs when TCC is active."

There is more; do read up on the specs. See the Intel doc mentioned before:

ftp://download.intel.com/design/Pentium4/datashts/24919805.pdf

As noted before, there are several factors [active, passive, CPU] and, of course, your motherboard, BIOS, and OS involved in this cooling process.

Do note, if you are doing speed testing, if your CPU is in 50% duty cycle, guess what, you get bad scores.
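
A back-of-the-envelope sketch of why a 50% duty cycle hurts scores, assuming a purely CPU-bound benchmark whose run time scales with clock on-time (real workloads also depend on memory and I/O, so treat this as a rough bound). The 3.2 GHz figure is the clock mentioned earlier in the thread; the 60 s run time and the function names are hypothetical.

```python
def effective_clock_ghz(nominal_ghz, duty_cycle):
    """Average clock rate when the clocks are on for duty_cycle of the time."""
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    return nominal_ghz * duty_cycle


def cpu_bound_runtime_s(unthrottled_s, duty_cycle):
    """Estimated run time of a CPU-bound task under clock modulation (assumed linear)."""
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    return unthrottled_s / duty_cycle


if __name__ == "__main__":
    print(f"effective clock: {effective_clock_ghz(3.2, 0.5):.1f} GHz")
    print(f"60 s benchmark becomes about {cpu_bound_runtime_s(60.0, 0.5):.0f} s")
```

Under that assumption, a chip throttling at a 50% duty cycle behaves like one running at half its nominal clock for as long as the throttling lasts.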