I know I have to stop when I see that the other person is wearing blinkers.
This may help as well, from my Bulldozer CMT scaling numbers.
Turbo off for all CPUs.
![]()
![]()
AtenRa seems to have gone missing.
A few comments:
- each test repeated 3x to check score consistency
- Windows Scheduler still does not always prioritize cores properly (priority should be in the following order: 0,2,4,6, then 1,3,5,7). With a second monitor, I can watch while gaming: sometimes 0,2,4,6 are the loaded cores. I haven't paid attention to which games, but mainly UT3.
- Cinebench R15 overrides core affinity settings, which affects benchmark performance. To get around this (if you'd like to repeat the test), have the 'affinity' window open with the settings you want, and click 'OK' after you start the bench run. Verify that the proper cores are being loaded in Task Manager or SpeedFan. 0,2,4,6 should be pegged at 100% and 1,3,5,7 almost completely inactive.
- 16.4% [!!!] threading performance improvement when scheduling four threads to 0,2,4,6 instead of 0,1,2,3
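For anyone who wants to repeat the affinity test without clicking through Task Manager, here is a minimal sketch of how the two core sets translate into the hex affinity masks that Windows' `start /affinity` expects. The helper name `affinity_mask` is made up for illustration, and it assumes logical cores are numbered 0 to 7 with each module holding a pair (0/1, 2/3, and so on), as described above.

```python
def affinity_mask(cores):
    """Return a bitmask with one bit set per logical core index."""
    mask = 0
    for core in cores:
        mask |= 1 << core  # bit n selects logical core n
    return mask


# One thread per module (assumed layout: cores 0/1 share module 0, etc.)
one_per_module = affinity_mask([0, 2, 4, 6])
# Both cores of the first two modules
two_modules = affinity_mask([0, 1, 2, 3])

print(f"0,2,4,6 -> {one_per_module:X}")  # 55
print(f"0,1,2,3 -> {two_modules:X}")     # F
```

The resulting value would be passed as `start /affinity 55 <program>` from a command prompt. As noted above, Cinebench may still override the setting, so checking the per-core load afterwards is still needed.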
![]()
4 modules, 4 threads occupying cores 0,2,4,6
![]()
2 modules, 4 threads occupying cores 0,1,2,3
![]()
4 modules, 8 threads occupying cores 0,1,2,3,4,5,6,7
![]()
What's your score without 12 Chrome windows open and running a server?
CPU usage 1%
So it shouldn't have affected the outcome at all.
+1 firefox.
There may not have been any ram left to run it.
Because the results are flawed. It's impossible to have scaling greater than unity.
So you want me to prove a negative, while you supposedly have evidence that Amdahl got it wrong.
And you're the one posting rolling eyes.
Seriously, you should stop now.
This is true for old static architectures with constant clock frequency and no shared components except memory bus.
With a shared cache and threads working on the same data, they could help each other as intelligent prefetchers (by not predicting but doing the real thing). That happens at multiple levels: shared L3$, shared L2$, shared I$ and, in the case of HT, shared D$.
I think it is basically a matter of semantics. I would agree that if only one process is active, you can't get more than linear scaling. However, in a CPU, especially one with shared resources like Vishera, other resources may come into play, so the scaling could effectively be more than 1:1, or less than 1:1 as well.
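For reference, the bound being argued over is Amdahl's law. A minimal sketch is below; the function name and the sample fractions are only illustrative. Note the assumption built into the formula: each core delivers the same performance no matter how many others are active, which is exactly the point in dispute for designs with shared caches or changing clocks.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Amdahl's law: speedup over one core for a fixed workload.

    Assumes per-core performance is constant regardless of how many
    cores are busy (no turbo, no shared-cache effects).
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


# Under this model the speedup never exceeds the core count.
for p in (0.50, 0.95, 1.00):
    print(f"p={p:.2f}: 4 cores -> {amdahl_speedup(p, 4):.2f}x")
```

Measured scaling above the core count does not contradict the formula itself; it would mean the constant-per-core-performance assumption does not hold for that test, or that the single-core baseline was measured too low.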
Possibly. However, as I and others have pointed out, most likely the 1-core test underestimates the true single-thread performance (look at the Phenom II results; the only strange one is the 1-core result), probably because all OS threads are also on the same core.
This could also be the case. Other SC/MC scaling results might be interesting here.
A single core run would still render the whole image and thus run longer, averaging out the OS thread effects. But if the single thread isn't pinned to one CPU it might constantly jump between cores and the cache contents have to be flushed (if not written through) or reloaded all the time.
Here's what I get:
Povray 3.7, Windows 8.1, Phenom II 1090T
1 thread @ 3.6 GHz
Render Time:
Photon Time: 0 hours 0 minutes 2 seconds (2.968 seconds)
using 4 thread(s) with 2.983 CPU-seconds total
Radiosity Time: No radiosity
Trace Time: 0 hours 20 minutes 29 seconds (1229.596 seconds)
using 1 thread(s) with 1226.734 CPU-seconds total
6 threads @ 3.3 GHz
Render Time:
Photon Time: 0 hours 0 minutes 2 seconds (2.562 seconds)
using 9 thread(s) with 2.967 CPU-seconds total
Radiosity Time: No radiosity
Trace Time: 0 hours 3 minutes 50 seconds (230.688 seconds)
using 6 thread(s) with 1345.419 CPU-seconds total
1 thread: 59.2 pps/core/GHz
6 threads: 57.4 pps/core/GHz
As you can see, IPC goes down slightly as expected. It shouldn't go up. If it does, that means something else is going on.
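The two pps/core/GHz figures can be reproduced from the trace times with a few lines. This sketch assumes the render is 512x512 pixels; that resolution is not stated in the log and is only an assumption, chosen because the arithmetic then matches the posted numbers. The function name is made up for illustration.

```python
PIXELS = 512 * 512  # assumed render resolution, not stated in the log


def pps_per_core_ghz(trace_seconds, threads, ghz, pixels=PIXELS):
    """Pixels per second, normalized by thread count and clock speed."""
    return pixels / trace_seconds / threads / ghz


one = pps_per_core_ghz(1229.596, threads=1, ghz=3.6)
six = pps_per_core_ghz(230.688, threads=6, ghz=3.3)

print(f"1 thread:  {one:.1f} pps/core/GHz")
print(f"6 threads: {six:.1f} pps/core/GHz")
print(f"6-thread per-core throughput relative to 1 thread: {six / one:.3f}")
```

Only the trace time is used here; the photon phase lasts a couple of seconds in both runs and is ignored.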