some Cinebench r15 FX-8xxx module/thread scaling results

Phynaz

Lifer

Apr 7, 2015

#26

All you have posted is a non sequitur. Made-up CPU inefficiencies still can't get you greater scaling than Amdahl's law allows.

It's good you are going to stop. You should have stopped a while ago. Before you had to resort to ad hominem attacks rather than posting proof.
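
For reference, a minimal sketch of Amdahl's law as it is usually stated: with a fraction p of the work parallelizable and n threads, the speedup is 1 / ((1 - p) + p / n), which never exceeds n. The function name and the sample values of p are illustrative only and are not taken from anyone's benchmark runs.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Ideal speedup under Amdahl's law for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


if __name__ == "__main__":
    # Illustrative parallel fractions only (assumptions, not measurements).
    for p in (0.90, 0.99, 1.00):
        print(f"p={p:.2f}: 8 threads -> {amdahl_speedup(p, 8):.2f}x")
```

The bound assumes each thread runs at the same per-thread speed as the single-threaded baseline, so it says nothing about a baseline that was itself measured too low.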

Last edited: Apr 7, 2015

AtenRa

Lifer

Apr 7, 2015

#27

I know I have to stop when I see the other person wears blinkers.

TheELF

Diamond Member

Apr 7, 2015

#28

I guess he's arguing that single-core speed is actually faster than his own results would suggest (cache misses and stuff, you know), because, you know... Cinebench is totally crappy software that everyone has used for the last how-many years to benchmark CPUs only because it's totally pro-Intel biased, and not because it's one of the few programs that actually use all of the available resources of a CPU.

But then again, if you look at the hyper-threaded Intels, 1 core + 0 cores = more than one core, so everything's possible.

Phynaz

Lifer

Apr 7, 2015

#29

AtenRa said:
I know I have to stop when I see the other person wears blinkers.

Oh, now I get why you are so invested. I just went and looked, those are your personal benchmark results.

jhu

Lifer

Apr 7, 2015

#30

AtenRa said:
This may help as well, from my Bulldozer CMT scaling numbers.

Turbo off for all CPUs.

Those scaling numbers appear off for FX. From going 1 to 8 threads, I'm getting 680% scaling on Povray and 640% for Blender Cycles.

Your Phenom II scaling numbers show greater than 1:1 for some reason. That seems off since my Povray and Blender results (not shown) show near 1:1 scaling.
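
To put those percentages on a common footing, here is a small sketch that converts a total scaling figure into per-thread efficiency. It assumes that "680% scaling" means 6.8x the 1-thread throughput; the function name is made up for illustration.

```python
def per_thread_efficiency(scaling_percent: float, threads: int) -> float:
    """Throughput per thread relative to the 1-thread run (1.0 = perfect 1:1)."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return (scaling_percent / 100.0) / threads


if __name__ == "__main__":
    # Inputs are the FX figures quoted above for 1 -> 8 threads.
    for name, pct in (("Povray", 680.0), ("Blender Cycles", 640.0)):
        print(f"{name}: {per_thread_efficiency(pct, 8):.2f} of 1:1 per thread")
```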

AtenRa

Lifer

Apr 8, 2015

#31

jhu said:
Those scaling numbers appear off for FX. From going 1 to 8 threads, I'm getting 680% scaling on Povray and 640% for Blender Cycles.

Your Phenom II scaling numbers show greater than 1:1 for some reason. That seems off since my Povray and Blender results (not shown) show near 1:1 scaling.

You tested on Linux; I used Win 7 64-bit.

Secondly, not every application is the same.

Also, CPU frequencies remained constant at base level by disabling Turbo in all three processors.

POV-Ray 3.7 RC (Balcony Project at 1024x768, AA 0.3)

jhu

Lifer

Apr 8, 2015

#32

That still doesn't make any sense. The only way scaling would be > 1:1 is if all the background tasks were also stuck on the same core during the 1-core test, thereby artificially lowering the true single-core performance.

Look at the Povray results again. The anomalous result is the 1 core result at 2279 pps. The 2, 4, and 6 results all have about 2400 pps/core.

x264 results look about right with slightly < 1:1 scaling.
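
A quick sketch of that arithmetic, using only the two figures quoted above (2279 pps for the 1-core run and roughly 2400 pps per core for the multi-core runs). The function name is illustrative.

```python
def apparent_per_core_scaling(multi_pps_per_core: float, single_core_pps: float) -> float:
    """Ratio of per-core throughput in a multi-core run to the 1-core baseline."""
    if single_core_pps <= 0:
        raise ValueError("single_core_pps must be positive")
    return multi_pps_per_core / single_core_pps


if __name__ == "__main__":
    ratio = apparent_per_core_scaling(2400.0, 2279.0)
    # A ratio above 1.0 means the 1-core baseline is lower than the per-core
    # rate seen in the multi-core runs, which shows up as > 1:1 scaling.
    print(f"apparent per-core scaling: {ratio:.3f}")
```

With these numbers the 1-core result sits about 5% below the per-core rate of the other runs, which is the size of the > 1:1 effect.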

Last edited: Apr 8, 2015

Abwx

Lifer

Apr 8, 2015

#33

In Cinebench 11.5, Techreport got 1.03 with Turbo, which works out to 0.927 at 3.6 GHz.

The single-core result for this setup should be 0.89 if we compare its 8-core score of 5.79 with Techreport's 6.03. It looks like the single-thread score is not what it should be, hence the apparent scaling issue.
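
The estimate above can be reproduced with a simple proportion. This sketch assumes, as the post does, that the single-thread score should track the 8-core score in the same ratio as Techreport's numbers; the function name is illustrative.

```python
def expected_single_thread(ref_single: float, ref_multi: float, own_multi: float) -> float:
    """Scale a reference single-thread score by the ratio of multi-thread scores."""
    if ref_multi <= 0:
        raise ValueError("ref_multi must be positive")
    return ref_single * own_multi / ref_multi


if __name__ == "__main__":
    # 0.927 and 6.03 are the Techreport figures quoted above;
    # 5.79 is the 8-core score of the setup under discussion.
    estimate = expected_single_thread(0.927, 6.03, 5.79)
    print(f"expected single-thread score: {estimate:.2f}")
```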

soccerballtux

Lifer

Apr 8, 2015

#34

FWIW, I can consistently get 99 single-core on core 2 in R15 versus 96 on core 0. There may be other threads locked to core 0.
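
For anyone who wants to repeat a pinned single-core run, CPU affinity is commonly expressed as a bitmask with one bit per logical core. A minimal sketch of building such a mask follows; the helper name is made up, and how the mask gets applied (Task Manager, a launcher, or an API call) is left out because it depends on the OS and tool.

```python
from typing import Iterable


def affinity_mask(cores: Iterable[int]) -> int:
    """Build a bitmask with bit N set for each logical core N in `cores`."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core indices must be non-negative")
        mask |= 1 << core
    return mask


if __name__ == "__main__":
    # Core 0 versus core 2, the two cases compared above.
    print(hex(affinity_mask([0])))
    print(hex(affinity_mask([2])))
```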

Phynaz

Lifer

Apr 9, 2015

#35

AtenRa seems to have gone missing.

Last edited: Apr 9, 2015

svenge

Senior member

Apr 9, 2015

#36

Phynaz said:
AtenRa has seemed to have gone missing.

Perhaps someone ought to file a missing child report with AMD then?

Last edited: Apr 9, 2015

Phynaz

Lifer

Apr 10, 2015

#37

Maybe. I think he just wants this thread to get buried.

Jorge_Orwell

Banned
RBM schmuckley

Apr 10, 2015

#38

soccerballtux said:
few comments:

each test repeated 3x to check score consistency

Windows Scheduler still not properly prioritizing cores (priority should be in the following order: 0,2,4,6, then 1,3,5,7) sometimes. (With second monitor, can watch while gaming. Sometimes 0,2,4,6 are the loaded cores. Haven't paid attention to which games but mainly UT3)

Cinebench r15 overrides core affinity settings affecting benchmark performance. To get around this (if you'd like to repeat the test), have the 'affinity' window open with the settings you want, and click 'OK' after you start the bench run. Verify the proper cores are being loaded in Task Manager or SpeedFan. 0,2,4,6 should be pegged at 100% and 1,3,5,7 almost completely inactive.

16.4% [!!!] threading performance improvement when scheduling four threads to 0,2,4,6 instead of 0,1,2,3

4 modules, 4 threads occupying cores 0,2,4,6

2 modules, 4 threads occupying cores 0,1,2,3

4 modules, 4 threads occupying cores 0,1,2,3,4,5,6,7

What's your score without 12 Chrome windows open and running a server?

TheELF

Diamond Member

Apr 11, 2015

#39

Jorge_Orwell said:
What's your score without 12 Chrome windows open and running a server?

CPU usage 1%

So it shouldn't have affected the outcome at all.

Jorge_Orwell

Banned
RBM schmuckley

Apr 11, 2015

#40

TheELF said:
CPU usage 1%

So it shouldn't have affected the outcome at all.

+1 Firefox.
There may not have been any RAM left to run it.

soccerballtux

Lifer

Apr 11, 2015

#41

Jorge_Orwell said:
+1 Firefox.
There may not have been any RAM left to run it.

32 GB

Dresdenboy

Golden Member

Apr 13, 2015

#42

Phynaz said:
Because the results are flawed. It's impossible to have scaling greater than unity.

This is true for old static architectures with constant clock frequency and no shared components except memory bus.

With a shared cache and threads working on the same data, they could help each other by acting as intelligent prefetchers (not by predicting, but by doing the real thing). That happens at multiple levels: shared L3$, shared L2$, shared I$ and, in the case of HT, shared D$.

Erenhardt

Diamond Member

Apr 13, 2015

#43

Could you guys include 3M, 3-thread results?

frozentundra123456

Lifer

Apr 13, 2015

#44

Phynaz said:
So you want me to prove a negative, while you supposedly have evidence that Amdahl got it wrong.

And you're the one posting rolling eyes.

Seriously, you should stop now.

I think it is basically a matter of semantics. I would agree that if only one process is active, you can't get more than linear scaling. However, in a CPU, especially one with shared resources like Vishera, other resources may come into play, so the scaling could effectively be more than 1:1, or less than 1:1 as well.

jhu

Lifer

Apr 13, 2015

#45

Dresdenboy said:
This is true for old static architectures with constant clock frequency and no shared components except memory bus.

With a shared cache and threads working on the same data, they could help each other by acting as intelligent prefetchers (not by predicting, but by doing the real thing). That happens at multiple levels: shared L3$, shared L2$, shared I$ and, in the case of HT, shared D$.

Possibly. However, as I and others have pointed out, the 1-core test most likely underestimates the true single-thread performance (look at the Phenom II results; the only strange one is the 1-core result), probably because all OS threads are also on the same core.

jhu

Lifer

Apr 13, 2015

#46

frozentundra123456 said:
I think it is basically a matter of semantics. I would agree that if only one process is active, you can't get more than linear scaling. However, in a CPU, especially one with shared resources like Vishera, other resources may come into play, so the scaling could effectively be more than 1:1, or less than 1:1 as well.

See the Phenom II results. They're also > 1:1 scaling. Even the Intel results (HT off) show this, but to a much smaller degree. The anomaly is the 1 core result. The multicore results show the expected per core performance scaling.

Last edited: Apr 13, 2015

Dresdenboy

Golden Member

Apr 13, 2015

#47

jhu said:
Possibly. However, as I and others have pointed out, the 1-core test most likely underestimates the true single-thread performance (look at the Phenom II results; the only strange one is the 1-core result), probably because all OS threads are also on the same core.

This could also be the case. Other SC/MC scaling results might be interesting here.

A single core run would still render the whole image and thus run longer, averaging out the OS thread effects. But if the single thread isn't pinned to one CPU it might constantly jump between cores and the cache contents have to be flushed (if not written through) or reloaded all the time.

jhu

Lifer

Apr 13, 2015

#48

Dresdenboy said:
This could also be the case. Other SC/MC scaling results might be interesting here.

A single core run would still render the whole image and thus run longer, averaging out the OS thread effects. But if the single thread isn't pinned to one CPU it might constantly jump between cores and the cache contents have to be flushed (if not written through) or reloaded all the time.

My results don't show this anomalous 1 thread result (didn't pin the thread to one CPU), so it doesn't appear to be an issue.

jhu

Lifer

Apr 13, 2015

#49

Here's what I get:

Povray 3.7, Windows 8.1, Phenom II 1090T

1 thread @ 3.6 GHz
Render Time:
Photon Time: 0 hours 0 minutes 2 seconds (2.968 seconds)
using 4 thread(s) with 2.983 CPU-seconds total
Radiosity Time: No radiosity
Trace Time: 0 hours 20 minutes 29 seconds (1229.596 seconds)
using 1 thread(s) with 1226.734 CPU-seconds total

6 threads @ 3.3 GHz
Render Time:
Photon Time: 0 hours 0 minutes 2 seconds (2.562 seconds)
using 9 thread(s) with 2.967 CPU-seconds total
Radiosity Time: No radiosity
Trace Time: 0 hours 3 minutes 50 seconds (230.688 seconds)
using 6 thread(s) with 1345.419 CPU-seconds total

1 thread: 59.2 pps/core/GHz
6 threads: 57.4 pps/core/GHz

As you can see, IPC goes down slightly as expected. It shouldn't go up. If it does, that means something else is going on.
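
The pps/core/GHz figures can be reproduced from the trace times. This sketch assumes a 512x512 render (262,144 pixels), which is not stated in the post but matches the posted figures; the function name is illustrative.

```python
PIXELS = 512 * 512  # assumed render size; not stated in the post


def pps_per_core_ghz(pixels: int, trace_seconds: float, threads: int, clock_ghz: float) -> float:
    """Pixels per second, normalized by thread count and clock speed."""
    if trace_seconds <= 0 or threads < 1 or clock_ghz <= 0:
        raise ValueError("trace_seconds, threads and clock_ghz must be positive")
    return pixels / trace_seconds / threads / clock_ghz


if __name__ == "__main__":
    # Trace times and clocks are the ones listed in the render logs above.
    one = pps_per_core_ghz(PIXELS, 1229.596, 1, 3.6)
    six = pps_per_core_ghz(PIXELS, 230.688, 6, 3.3)
    print(f"1 thread:  {one:.1f} pps/core/GHz")
    print(f"6 threads: {six:.1f} pps/core/GHz")
    print(f"6-thread efficiency vs 1 thread: {six / one:.3f}")
```

Normalizing by clock matters here because the 1-thread run was at 3.6 GHz and the 6-thread run at 3.3 GHz, so raw wall-clock speedup alone would understate the per-core comparison.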

AtenRa

Lifer

Apr 13, 2015

#50

jhu said:
Here's what I get:

Povray 3.7, Windows 8.1, Phenom II 1090T

1 thread @ 3.6 GHz
Render Time:
Photon Time: 0 hours 0 minutes 2 seconds (2.968 seconds)
using 4 thread(s) with 2.983 CPU-seconds total
Radiosity Time: No radiosity
Trace Time: 0 hours 20 minutes 29 seconds (1229.596 seconds)
using 1 thread(s) with 1226.734 CPU-seconds total

6 threads @ 3.3 GHz
Render Time:
Photon Time: 0 hours 0 minutes 2 seconds (2.562 seconds)
using 9 thread(s) with 2.967 CPU-seconds total
Radiosity Time: No radiosity
Trace Time: 0 hours 3 minutes 50 seconds (230.688 seconds)
using 6 thread(s) with 1345.419 CPU-seconds total

1 thread: 59.2 pps/core/GHz
6 threads: 57.4 pps/core/GHz

As you can see, IPC goes down slightly as expected. It shouldn't go up. If it does, that means something else is going on.

Now do the same without Turbo for the single thread. Also, I disabled cores through the BIOS, so when you see a single core or thread, it means only a single core/thread was usable in the system. The same goes for Bulldozer modules/cores and Intel cores/HT.
