some Cinebench r15 FX-8xxx module/thread scaling results

Dec 30, 2004
12,554
2
76
few comments:
  • each test repeated 3x to check score consistency
  • Windows Scheduler is still not always prioritizing cores properly (priority should be in the following order: 0,2,4,6, then 1,3,5,7). (With a second monitor, I can watch while gaming. Sometimes 0,2,4,6 are the loaded cores. I haven't paid attention to which games, but mainly UT3.)
  • Cinebench r15 overrides core affinity settings, which affects benchmark performance. To get around this (if you'd like to repeat the test), have the 'affinity' window open with the settings you want, and click 'OK' after you start the bench run. Verify the proper cores are being loaded in Task Manager or SpeedFan. 0,2,4,6 should be pegged at 100% and 1,3,5,7 almost completely inactive.
  • 16.4% [!!!] threading performance improvement when scheduling four threads to 0,2,4,6 instead of 0,1,2,3
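
For anyone repeating the test, the core lists above map onto affinity bitmasks (bit N set means logical core N is allowed). A minimal sketch in Python, assuming the numbering used in this post where each module exposes an even/odd pair of cores; the helper name `affinity_mask` is made up for illustration:

```python
def affinity_mask(cores):
    """Return an affinity bitmask with one bit set per logical core index."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask

# One thread per module: one core in each of the 4 modules.
one_thread_per_module = affinity_mask([0, 2, 4, 6])
# Two modules fully loaded: both cores of modules 0 and 1.
two_modules_packed = affinity_mask([0, 1, 2, 3])

print(hex(one_thread_per_module))  # 0x55
print(hex(two_modules_packed))     # 0xf
```

The hex value (without the `0x`) is the kind of mask `start /affinity` takes on the Windows command line, though as noted above Cinebench may override it once the run starts, so the loaded cores still need checking.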

jmG0suT.png


4 modules, 4 threads occupying cores 0,2,4,6
pACa5Os.png



2 modules, 4 threads occupying cores 0,1,2,3
sfr56Xu.png



4 modules, 8 threads occupying cores 0,1,2,3,4,5,6,7
zDfwYB6.png
 

AtenRa

Lifer
Feb 2, 2009
14,000
3,357
136
This may help as well, from my Bulldozer CMT scaling numbers.

Turbo off for all CPUs.

nq7gib.jpg


go0mqkj
 

Yuriman

Diamond Member
Jun 25, 2004
5,530
141
106
@AtenRa's numbers, why is scaling greater than linear with core count? I suppose there's some overhead?
 

Abwx

Lifer
Apr 2, 2011
10,847
3,297
136
Because the results are flawed. It's impossible to have scaling greater than unity.

It means that Hardware.fr is flawed in favour of Intel, since they have a bench like this that overscales with a 4670K. You are perhaps right after all; I should find a site that has less good results with Intel's chips...
 
Apr 20, 2008
10,161
984
126
Because the results are flawed. It's impossible to have scaling greater than unity.

Untrue. Everything in the background sucks up measurable CPU resources: user input, GUI, network stack, antivirus... All of it contributes to CPU usage, even during light CPU loads. This was readily apparent when Android phones went from single to dual core and dual to quad core while using the same generation of chips. The OS and all background tasks can drag a single thread down.

It gets less apparent the more cores a CPU has, but it's still possible. Since the scheduler likes to (is patched to) use cores 0,2,4,6 first, who is to say that those background tasks didn't nerf the single-threaded/low-threaded results?
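
A toy model of that effect, with made-up numbers: assume background tasks eat a fixed 5% of one core no matter how many threads the benchmark runs. The 5% figure and the function name are assumptions for illustration, not measurements.

```python
def apparent_scaling(threads, background_load=0.05):
    """Ratio of n-thread to 1-thread throughput when a fixed background
    load (as a fraction of one core) is taken away from the benchmark."""
    single = 1.0 - background_load      # the 1T run loses the full background share
    multi = threads - background_load   # the nT run loses the same absolute amount
    return multi / single

for n in (2, 4, 8):
    print(n, round(apparent_scaling(n), 3))
```

Each ratio comes out slightly above n in this model. The hardware is not scaling better than linear; the single-thread baseline is just depressed.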
 

AtenRa

Lifer
Feb 2, 2009
14,000
3,357
136
@AtenRa's numbers, why is scaling greater than linear with core count? I suppose there's some overhead?

I have asked the same question and got this answer from Dresdenboy,

Well, the discrepancy between significant 2T/1M and 2T/2M scaling compared to still a ~761% scaling with 8T might be caused by Windows' scheduling and missed turbo opportunities (no sleeping cores with low thread count). It even looks like more than 100% scaling per thread if there is a 1T:1M ratio ;-)

Are CB threads hopping around? Another effect might be the shared L2s being filled with data from former threads, so that there are some "data already here" or "lucky prefetch" opportunities.
 
Apr 20, 2008
10,161
984
126
I have asked the same question and got this answer from Dresdenboy,

In heavily threaded, repeatable workloads this is not surprising. Over time, running the same or extremely similar instructions, the cache likely becomes really efficient, especially when the CPU has a large amount of cache. In a server I'd expect this to be very important.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Jeez people, it's not rocket science. 1+1 = 2. You can't increase the number of cores by 1 and get a result of > 2.

If you think it's possible then you just proved Amdahl wrong and should publish your findings in a journal stat. You're about to become famous.
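
For reference, Amdahl's law as a short sketch. The function name and the sample parallel fractions are illustrative; the formula itself takes per-core throughput on the parallel part as the same in the 1-core and n-core runs.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

for p in (0.90, 0.99, 1.00):
    print(p, round(amdahl_speedup(p, 8), 2))
```

Even with a fully parallel workload (p = 1.0), the formula tops out at n for n cores.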
 

AtenRa

Lifer
Feb 2, 2009
14,000
3,357
136
Jeez people, it's not rocket science. 1+1 = 2. You can't increase the number of cores by 1 and get a result of > 2.

If you think it's possible then you just proved Amdahl wrong and should publish your findings in a journal stat. You're about to become famous.

Example,

One core gets you only 89% efficiency because of stalls, misses etc.
But with two cores you get 92% efficiency per core because you fetch more threads and those are stored in the L1-L2-L3 caches, giving you higher per-core efficiency and higher scaling than one core. ;)
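
Working through those numbers (the 89% and 92% are the illustrative figures from this example, not measurements, and `scaling_factor` is a made-up helper name):

```python
def scaling_factor(cores, eff_multi, eff_single):
    """Throughput of `cores` cores at eff_multi each, relative to one core at eff_single."""
    return cores * eff_multi / eff_single

print(round(scaling_factor(2, 0.92, 0.89), 3))  # 2.067
```

So under those assumed efficiencies, two cores would score about 2.07x the single-core result.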
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
Example,

One core gets you only 89% efficiency because of stalls, misses etc.
But with two cores you get 92% efficiency per core because you fetch more threads and those are stored in the L1-L2-L3 caches, giving you higher per-core efficiency and higher scaling than one core. ;)

By default, when running 1 thread, that thread gets the entire L3 to itself. There is no extra stuff stored there.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Example,

One core gets you only 89% efficiency because of stalls, misses etc.
But with two cores you get 92% efficiency per core because you fetch more threads and those are stored in the L1-L2-L3 caches, giving you higher per-core efficiency and higher scaling than one core. ;)

Wink all you want. You're still wrong.
 

Abwx

Lifer
Apr 2, 2011
10,847
3,297
136
My results are with Bulldozer FX8150 on pre-patched Win 7, not with Vishera ;)

I posted it because there's also an FX8150 displayed...

What the numbers do not show is that most benches do not max out an 8C CPU; the Hardware.fr software suite manages to max out the Intel i5/i7 at 100% and 90-100% respectively.

With only 80-90% of its throughput used on average in these benches, the FX8350 manages to beat the i5 4670K by 15% while lagging the 4770K by only 8%; it still has some grunt left in most applications.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
So you want me to prove a negative, while you supposedly have evidence that Amdahl got it wrong.

And you're the one posting rolling eyes.

Seriously, you should stop now.
 

AtenRa

Lifer
Feb 2, 2009
14,000
3,357
136
So you want me to prove a negative, while you supposedly have evidence that Amdahl got it wrong.

And you're the one posting rolling eyes.

Seriously, you should stop now.

I'm sure that you don't fully understand what Amdahl's law is all about, but I'll stop here. I gave you an example; if you want to believe that one core always has 100% efficiency, then you don't really understand how CPUs work.