Turboboost limitation

netmonk

Junior Member
Nov 13, 2015
Good afternoon. I'm new here, so apologies if I break any rules with my first post.

My main concern is finding studies on the limitations of Turbo Boost.

Let's say I have Turbo Boost enabled on my server, a 16-core machine (model name: Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz), and that usually:
- only 5 or 6 cores are used at 100%, by polling processes bound to distinct cores (one process per core, stuck at 100% CPU, while true do ... style)

- C-states are forced to 0 in the BIOS and on the Linux boot command line (intel_idle.max_cstate=0 processor.max_cstate=0 idle=poll)
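A quick way to confirm that the running kernel really picked up those boot options is to look at /proc/cmdline. The following is only a minimal sketch: the helper names (parse_cmdline, missing_options, check_running_kernel) are made up for illustration, and it does not handle quoted values containing spaces.

```python
# Sketch: check whether the kernel was booted with the C-state options
# quoted in the post. The helper names here are hypothetical.
WANTED = {
    "intel_idle.max_cstate": "0",
    "processor.max_cstate": "0",
    "idle": "poll",
}

def parse_cmdline(text):
    """Split a kernel command line into a {name: value} dict."""
    params = {}
    for token in text.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params

def missing_options(text, wanted=WANTED):
    """Return the wanted options that are absent or set to another value."""
    params = parse_cmdline(text)
    return {k: v for k, v in wanted.items() if params.get(k) != v}

def check_running_kernel(path="/proc/cmdline"):
    """Read the live command line (Linux only) and report what is missing."""
    with open(path) as f:
        return missing_options(f.read())
```

Calling check_running_kernel() on the server should return an empty dict if all three options are active. It says nothing about the BIOS setting, which has to be checked separately.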

If I add more processes in this context, cpupower monitor returns:
Code:
              |Nehalem                    || SandyBridge        || Mperf              
PKG |CORE|CPU | C3   | C6   | PC3  | PC6  || C7   | PC2  | PC7  || C0   | Cx   | Freq 
   0|   0|   0|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00||  3.28| 96.72|  3292
   0|   1|   1|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00||  1.22| 98.78|  3291
   0|   2|   2|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00||  0.55| 99.45|  3292
   0|   3|   3|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3292
   0|   4|   4|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3292
   0|   5|   5|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3292
   0|   6|   6|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3292
   0|   7|   7|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3292
   1|   0|   8|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3292
   1|   1|   9|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3292
   1|   2|  10|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3292
   1|   3|  11|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00||  0.56| 99.44|  3292
   1|   4|  12|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3292
   1|   5|  13|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3292
   1|   6|  14|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3292
   1|   7|  15|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00||  0.36| 99.64|  3293

I would like to know whether Turbo Boost has limitations that show up in the high percentiles, because my KPIs get much worse when I bind a new 100% CPU process to an idle core.

So even if cpupower reports 99.99% of the time in C0 at about 3.29 GHz, can it happen that, in the remaining 0.01%, Turbo Boost really lowers the power of the core, enough to push my application KPI in the high percentiles to almost 3x my observed latency?
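To make the concern concrete, here is a small worked example with made-up numbers (not measurements from this server): if only 0.01% of requests take 3x longer, the mean barely moves, while a high enough percentile jumps to the slow value. The percentile helper uses the nearest-rank definition.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical latencies in microseconds: 100,000 requests,
# of which 10 (0.01%) take 3x longer than the rest.
samples = [10.0] * 99_990 + [30.0] * 10

mean = sum(samples) / len(samples)
p999 = percentile(samples, 99.9)
p99999 = percentile(samples, 99.999)
print(f"mean={mean:.3f} p99.9={p999} p99.999={p99999}")
```

With these numbers the mean stays near 10 us and p99.9 is still 10 us, but p99.999 is 30 us. An averaged counter such as the C0 or Freq column behaves like the mean here, so it cannot rule out rare slow events.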

Are there any papers on Turbo Boost performance under heavy load?
Is there any tool better than the standard Linux tools for monitoring Turbo Boost at a finer time scale (i.e. microseconds or nanoseconds)?

Thank you for your attention and your replies.
If I have made any mistakes, I will correct them.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
Your CPU might be throttling based on power usage. This helps prevent overheating of the CPU or any associated components. I understand it might be possible to adjust or turn off TDP throttling, but I'm not sure, and if you do you should carefully monitor your CPU's temperature.
 

sm625

Diamond Member
May 6, 2011
To find the turbo boost limits (or rather its steps or increments) you need to run your own processes, because they differ depending on the code you're running. Whatever your "polling" processes are doing, they are probably not using very many execution units. Thus the CPU is able to run in a turbo boost state even with 6 cores loaded. If your processes were heavily weighted with AVX instructions then I'd expect the clocks to drop to stock speeds. If you reduce your thread count to 1 or 2 you should see your clock speed increase. Eventually, if you keep adding more load, the turbo boost will drop out. That is the only way you're going to know where the cutoff points are.
 

TheRyuu

Diamond Member
Dec 3, 2005
Is turbo boost not limited when C-states are disabled? Or is that a Windows/mobo thing?

I was under the impression that to get the most out of it the advanced C-states had to be enabled (C3/C6/C7). Also, idle=poll seems a little crazy, but I've no idea what you're doing, so it may in fact be OK if you know what you're doing (I can't say I do with Linux). There were some issues with the Intel power driver on Linux which led to increased latencies (so it was better to disable that and just use the ACPI default one). I think it had to do with how it handled clock speed and C-states; I'm not sure if they fixed that (I don't think it's applicable here with your settings, but it may be if you decide to use more balanced power settings).

I'm also not entirely sure what you're asking. Is it whether turbo boost can somehow dramatically increase latency for something? I doubt it, unless there's some serious throttling going on (maybe there is, considering idle=poll; not sure if that would cause it).
 

netmonk

Junior Member
Nov 13, 2015
Thank you all for your replies.

In fact, at the application level, our internal indicators show that we might have lost some CPU power, as if running at a lower frequency.

But when I run cpupower from time to time, I still get the same frequency as before.

So my question is: can increasing the load on the CPU (the number of cores running at 100%) have an impact on Turbo Boost that leads to processing at a lower frequency, yet stays invisible to standard monitoring tools because those losses happen at the microsecond level, for example?

If such behaviour exists, maybe it has been quantified and studied, perhaps with a threshold effect above a specific load level on the system.

This is one of the areas I'm investigating to determine the root cause of the increased latency in our application.
 

myocardia

Diamond Member
Jun 21, 2003
Intel's turbo is 100% dependent upon the number of cores being loaded. The more cores you're using, the lower the speed they all run at. Use 100% of your cores (at a high percentage of use), and you will have zero turbo taking place. This is not a fault; it is working as it was designed to work.
 

netmonk

Junior Member
Nov 13, 2015
Totally understood.

But do we have charts of the relationship between load and frequency?
Has it been studied scientifically somewhere, with factual conclusions?
 

Ansau

Member
Oct 15, 2015
Turbo Boost and frequencies are determined by:
- Type of workload
- Number of active cores
- Estimated current consumption
- Estimated power consumption
- Processor temperature
I think it's pretty much impossible to know exactly when it activates or when it changes, since that would require the access to the internal coding of the several algorithms that make turboboost work.

Btw, the MAXIMUM turbo boost bins over base clock for your CPU are 4/4/4/5/5/7/7/9 (8 cores used being the first number and 1 being the last).
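Those bins can be turned into maximum frequencies with simple arithmetic. This is only a sketch under two assumptions: each bin is worth 100 MHz, and the eight values run from 8 active cores (first) down to 1 active core (last). The base clock is taken from the model name in the first post.

```python
BASE_MHZ = 2900                    # from the model name: 2.90GHz
BIN_MHZ = 100                      # assumed frequency step per turbo bin
BINS = [4, 4, 4, 5, 5, 7, 7, 9]    # assumed order: 8 active cores first, 1 last

def max_turbo_mhz(active_cores, bins=BINS, base=BASE_MHZ, step=BIN_MHZ):
    """Upper bound on the core clock for a given number of active cores."""
    if not 1 <= active_cores <= len(bins):
        raise ValueError("active_cores must be between 1 and len(bins)")
    return base + step * bins[len(bins) - active_cores]

for n in range(1, len(BINS) + 1):
    print(f"{n} active core(s): up to {max_turbo_mhz(n)} MHz")
```

Under these assumptions the ceiling is 3800 MHz with one active core and 3300 MHz with all eight active, which is close to the 3292 MHz in the cpupower output. These are maximums only; the other factors in the list above can keep the real clock lower.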
 

unclewebb

Member
May 28, 2012
Ansau said: "I think it's pretty much impossible to know exactly when it activates or when it changes, ..."

Intel CPUs use high performance timers that run at billions of cycles per second so it is very easy to precisely determine exactly how much turbo boost the CPU is using at any moment in time.

Intel published a paper back in November 2008, when the first generation Core i CPUs were released, that explains exactly how software can use these timers to measure turbo boost. This document is now difficult to find on the Intel website, but you can download it from here.

Intel® Turbo Boost Technology in Intel® Core™ Microarchitecture (Nehalem) Based Processors
http://files.shareholder.com/downlo...C8-A433-E28F64CB8EF2/TurboBoostWhitePaper.pdf

Intel is still using these timers and registers in their latest 6th Gen Skylake processors. Monitoring software can sample these timers hundreds or thousands of times per second if it wants to.
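On Linux, a similar measurement can be sketched by hand, assuming the timers referred to are the IA32_APERF and IA32_MPERF counters (the same pair behind the Mperf columns of cpupower monitor). The sketch below reads them through the msr driver, which needs root and a loaded msr module. The function names are made up for illustration, and base_mhz=2900 assumes MPERF ticks at the nominal 2.90 GHz clock.

```python
import os
import struct
import time

IA32_MPERF = 0xE7   # ticks at a fixed reference rate while the core is in C0
IA32_APERF = 0xE8   # ticks at the actual core clock while the core is in C0
MASK64 = 0xFFFFFFFFFFFFFFFF

def read_msr(cpu, reg):
    """Read one 64-bit MSR via /dev/cpu/<cpu>/msr (root + `modprobe msr`)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)

def effective_mhz(aperf0, mperf0, aperf1, mperf1, base_mhz):
    """Average clock between two samples: base * delta(APERF) / delta(MPERF)."""
    d_aperf = (aperf1 - aperf0) & MASK64   # tolerate counter wrap-around
    d_mperf = (mperf1 - mperf0) & MASK64
    if d_mperf == 0:
        raise ValueError("no MPERF ticks in this interval")
    return base_mhz * d_aperf / d_mperf

def sample(cpu, interval_s, base_mhz=2900):
    """Average effective frequency of one CPU over interval_s seconds."""
    a0, m0 = read_msr(cpu, IA32_APERF), read_msr(cpu, IA32_MPERF)
    time.sleep(interval_s)
    a1, m1 = read_msr(cpu, IA32_APERF), read_msr(cpu, IA32_MPERF)
    return effective_mhz(a0, m0, a1, m1, base_mhz)
```

For example, sample(3, 0.001) would give the average clock of CPU 3 over roughly one millisecond. The result is still an average over the interval, and each read costs a system call, so this is nowhere near nanosecond resolution.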

If you are using Windows, give ThrottleStop a try. The multiplier data it provides is a little more detailed compared to most other monitoring tools in Windows. When you first start ThrottleStop, it is in monitoring mode so no worries.

http://i.imgur.com/O1ManUs.jpg

ThrottleStop 8.00
https://www.sendspace.com/file/p1q40a
 

netmonk

Junior Member
Nov 13, 2015
unclewebb said: "Intel CPUs use high performance timers that run at billions of cycles per second so it is very easy to precisely determine exactly how much turbo boost the CPU is using at any moment in time."

The problem is the polling frequency. If you query the CPU every second, it is impossible to see a microsecond-long throttle. If you query it every microsecond, it adds so much load that it interferes with the production application and even alters the measurement results.
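A rough calculation shows the first half of that trade-off. The numbers are hypothetical: a single 100 microsecond dip from the 3292 MHz seen in cpupower down to the nominal 2900 MHz, averaged over sampling windows of different lengths. The cost of polling itself is not modelled.

```python
def windowed_average_mhz(high_mhz, low_mhz, dip_s, window_s):
    """Average frequency reported for one sampling window that contains
    a single dip of dip_s seconds from high_mhz down to low_mhz."""
    if not 0 <= dip_s <= window_s:
        raise ValueError("the dip must fit inside the window")
    return (high_mhz * (window_s - dip_s) + low_mhz * dip_s) / window_s

HIGH_MHZ = 3292    # frequency seen in the cpupower output
LOW_MHZ = 2900     # hypothetical throttled frequency (nominal clock)
DIP_S = 100e-6     # hypothetical 100 microsecond dip

for window_s in (1.0, 1e-3, 100e-6):
    avg = windowed_average_mhz(HIGH_MHZ, LOW_MHZ, DIP_S, window_s)
    print(f"window {window_s:g} s -> {avg:.2f} MHz")
```

With a 1 s window the dip moves the average by about 0.04 MHz (roughly 3291.96 MHz), far below what the tool displays. With a 1 ms window it shows up as about 3252.8 MHz, and only a window as short as the dip itself reports the full drop.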

Thank you for the documentation, it is very interesting. And I'm running Linux, not Windows, where the tools are not so versatile! :)