
Some Cinebench R15 FX-8xxx module/thread scaling results

A few comments:
  • Each test was repeated 3x to check score consistency.
  • The Windows scheduler still sometimes fails to prioritize cores properly (the order should be 0,2,4,6, then 1,3,5,7). With a second monitor I can watch core load while gaming, and only sometimes are 0,2,4,6 the loaded cores. I haven't paid attention to which games, but mainly UT3.
  • Cinebench R15 overrides core affinity settings, which affects benchmark performance. To get around this (if you'd like to repeat the test), keep the 'affinity' window open with the settings you want, and click 'OK' after you start the bench run. Then verify in Task Manager or SpeedFan that the proper cores are being loaded: 0,2,4,6 should be pegged at 100% and 1,3,5,7 almost completely inactive.
  • 16.4% [!!!] threading performance improvement when scheduling four threads to 0,2,4,6 instead of 0,1,2,3.
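For anyone repeating the test with a tool that takes an affinity bitmask rather than checkboxes, here is a minimal Python sketch of how the two core selections above map to masks. The helper name `affinity_mask` is made up for illustration, and it assumes the usual convention that bit N stands for logical core N.

```python
def affinity_mask(cores):
    """Build a bitmask with one bit set per logical core index (bit N = core N)."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must be non-negative")
        mask |= 1 << core
    return mask


# One thread per module: cores 0,2,4,6
print(hex(affinity_mask([0, 2, 4, 6])))  # 0x55
# Two threads per module on two modules: cores 0,1,2,3
print(hex(affinity_mask([0, 1, 2, 3])))  # 0xf
```

Since Cinebench R15 resets affinity on its own, a mask like this would still have to be applied after the run starts, same as with the manual method.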

jmG0suT.png


4 modules, 4 threads occupying cores 0,2,4,6
pACa5Os.png



2 modules, 4 threads occupying cores 0,1,2,3
sfr56Xu.png



4 modules, 8 threads occupying cores 0,1,2,3,4,5,6,7
zDfwYB6.png
 
@AtenRa's numbers, why is scaling greater than linear with core count? I suppose there's some overhead?
 
Because the results are flawed. It's impossible to have scaling greater than unity.

So that means Hardware.fr is flawed in favour of Intel, since they have a bench that overscales with a 4670K. You are perhaps right after all; I should find a site that has less favourable results for Intel's chips...
 
Because the results are flawed. It's impossible to have scaling greater than unity.

Untrue. Everything in the background sucks up measurable CPU resources: user input, GUI, network stack, antivirus... All of it contributes to CPU usage, even during light CPU loads. This was readily apparent when Android phones went from single to dual core and from dual to quad core within the same generation of chips. The OS and all background tasks can drag a single thread down.

It gets less apparent the more cores a CPU has, but it's still possible. Since the (patched) scheduler likes to use cores 0,2,4,6 first, who's to say those background tasks didn't nerf the single-threaded/low-threaded results?
 
@AtenRa's numbers, why is scaling greater than linear with core count? I suppose there's some overhead?

I have asked the same question and got this answer from Dresdenboy,

Well, the discrepancy between significant 2T/1M and 2T/2M scaling compared to still a ~761% scaling with 8T might be caused by Windows' scheduling and missed turbo opportunities (no sleeping cores with low thread count). It even looks like more than 100% scaling per thread if there is a 1T:1M ratio ;-)

Are CB threads hopping around? Another effect might be the shared L2s being filled with data from former threads, so that there are some "data already here" or "lucky prefetch" opportunities.
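To put numbers on "more than 100% scaling per thread", here is a small Python sketch. It reads the ~761% figure as a 7.61x speedup over a single thread, which is an interpretation, and the scores 100 and 761 are hypothetical placeholders picked only to reproduce that ratio. The function name `scaling` is likewise made up.

```python
def scaling(score_1t, score_nt, threads):
    """Return (speedup, per-thread efficiency) of an n-thread run vs. a 1-thread run."""
    if score_1t <= 0 or threads <= 0:
        raise ValueError("score_1t and threads must be positive")
    speedup = score_nt / score_1t
    return speedup, speedup / threads


# Hypothetical scores: 100 for 1T, 761 for 8T (i.e. ~761% scaling).
speedup, efficiency = scaling(100.0, 761.0, 8)
print(f"{speedup:.2f}x total, {efficiency:.1%} per thread")

# Per-thread efficiency above 1.0 is what "greater than linear" would look like.
print("superlinear" if efficiency > 1.0 else "sublinear or linear")
```

With real numbers, plug in the measured 1T score and the nT score for each thread/module combination and compare the per-thread efficiencies.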
 
I have asked the same question and got this answer from Dresdenboy,

In heavily threaded, repeatable workloads this is not surprising. When the same or extremely similar instructions are executed over time, the cache likely becomes really efficient, especially when the CPU has a large amount of cache. In a server I'd expect this to be very important.
 
Jeez people, it's not rocket science. 1+1 = 2. You can't increase the number of cores by 1 and get a result of > 2.

If you think it's possible then you just proved Amdahl wrong and should publish your findings in a journal stat. You're about to become famous.
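For reference, Amdahl's law as it is usually stated can be sketched in a few lines of Python. This is the textbook formula only: `p` is the parallel fraction of the work and `n` the number of cores, and the model assumes the parallel part speeds up by exactly `n`. The function name is just for illustration.

```python
def amdahl_speedup(p, n):
    """Textbook Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


# Perfectly parallel work on 2 cores hits the upper bound of 2x.
print(amdahl_speedup(1.0, 2))
# With 95% parallel work, 8 cores stay well below 8x.
print(amdahl_speedup(0.95, 8))
```

Under this model the speedup never exceeds `n`, which is the "1+1 = 2" point above.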
 
Jeez people, it's not rocket science. 1+1 = 2. You can't increase the number of cores by 1 and get a result of > 2.

If you think it's possible then you just proved Amdahl wrong and should publish your findings in a journal stat. You're about to become famous.

Example,

One core gets you only 89% efficiency because of stalls, misses, etc.
But with two cores you get 92% efficiency per core, because you fetch more threads and those are stored in the L1-L2-L3 caches, giving you higher per-core efficiency and higher scaling than with one core. 😉
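Working the example through with its own numbers, as a rough Python sketch (the 89% and 92% figures are the illustrative values from the example, not measurements, and `relative_scaling` is a made-up helper):

```python
def relative_scaling(cores, eff_multi, eff_single):
    """Throughput of `cores` cores at eff_multi each, relative to one core at eff_single."""
    if eff_single <= 0:
        raise ValueError("eff_single must be positive")
    return cores * eff_multi / eff_single


ratio = relative_scaling(2, 0.92, 0.89)
print(f"{ratio:.3f}x")  # 2.067x
```

So if per-core efficiency really did rise from 89% to 92%, the measured 1-core to 2-core ratio would come out slightly above 2x.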
 
Example,

One core gets you only 89% efficiency because of stalls, misses, etc.
But with two cores you get 92% efficiency per core, because you fetch more threads and those are stored in the L1-L2-L3 caches, giving you higher per-core efficiency and higher scaling than with one core. 😉

By default, when running 1 thread, that thread gets the entire L3 to itself. There is no extra stuff stored there.
 
Example,

One core gets you only 89% efficiency because of stalls, misses, etc.
But with two cores you get 92% efficiency per core, because you fetch more threads and those are stored in the L1-L2-L3 caches, giving you higher per-core efficiency and higher scaling than with one core. 😉

Wink all you want. You're still wrong.
 
My results are with a Bulldozer FX8150 on pre-patch Win 7, not with Vishera 😉

I posted it because there's also an FX8150 displayed.

What the numbers do not show is that most benches do not max out an 8C CPU. Hardware.fr's software suite manages to load the Intel i5/i7 at 100% and 90-100% respectively.

With only 80-90% of its throughput used on average in these benches, the FX8350 manages to beat the i5 4670K by 15% while lagging the 4770K by only 8%, so it still has some grunt left in most applications.
 
So you want me to prove a negative, while you supposedly have evidence that Amdahl got it wrong.

And you're the one posting rolling eyes.

Seriously, you should stop now.
 
So you want me to prove a negative, while you supposedly have evidence that Amdahl got it wrong.

And you're the one posting rolling eyes.

Seriously, you should stop now.

I'm sure that you don't fully understand what Amdahl's law is all about, but I'll stop here. I gave you an example; if you want to believe that one core always has 100% efficiency, then you don't really understand how CPUs work.
 