First ever look at core IPC and Blender benchmark

Abwx · Dec 20, 2016

Arachnotronic said:
Intel says it's 5.5% faster on average per clock than Haswell.

According to Intel, that figure includes gains due to new instructions, so in practice the improvement is smaller than that.

Dannotech · Dec 20, 2016

dfk7677 said:
Strange thing is, when I did the same rendering in Windows, I got 107.9sec with ~1.4IPC...
(IPC calculated by PerfMonitor2)

That seems pretty low. That's about what I get on my Skylake i5 without HT. My Ivy Bridge laptop that I'm running on right now was doing ~1.6 IPC as measured by PerfMonitor2.

I guess my first question would be: are you running in a VM? In that case the performance counters might also be virtualized, and you might be seeing thread IPC instead of core IPC. I don't actually understand how performance counters work in a VM, so I can't provide any insight. What's your take?

dfk7677 · Dec 20, 2016

The i5 4590 does not have HT; it is a Haswell 4c/4t. So I guess ~1.4 is normal. The Linux result is very strange, unless the Linux build has optimizations that are not in the Windows build (???).

Both Win and Linux are installed on their own, they are not VMs.

Dannotech · Dec 20, 2016

dfk7677 said:
i5 4590 does not have HT, it is a haswell 4c/4t. So I guess ~1.4 is normal. The linux result is very strange. Except if the build for Linux has optimizations that are not in the Windows build (???).

Both Win and Linux are installed on their own, they are not VMs.

Yes, I suppose the mix of instructions may be different for the Linux version, depending on compiler options. Strange that the Windows version is that much slower. gg Microsoft.

Nothingness · Dec 21, 2016

dfk7677 said:
i5 4590 does not have HT, it is a haswell 4c/4t. So I guess ~1.4 is normal. The linux result is very strange. Except if the build for Linux has optimizations that are not in the Windows build (???).

Both Win and Linux are installed on their own, they are not VMs.

This has much worse results than your run on Linux: http://www.realworldtech.com/forum/?threadid=163466&curpostid=163575

Code:

perf stat ../blender -b RyzenGraphic_27.blend -E CYCLES -x 1 -o foo -f 1

CPU: i7-4600U

____678664.078975 task-clock (msec) # 3.420 CPUs utilized
__________126 886 context-switches # 0.187 K/sec
____________1 685 cpu-migrations # 0.002 K/sec
__________305 255 page-faults # 0.450 K/sec
1 470 647 042 574 cycles # 2.167 GHz
1 671 800 359 058 instructions # 1.14 insns per cycle
__147 770 835 671 branches # 217.738 M/sec
____3 032 929 583 branch-misses # 2.05% of all branches

_____198.412894034 seconds time elapsed
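
For reference, the derived figures on the right-hand side of the perf output can be recomputed from the raw counters. A minimal Python sketch, using only the numbers printed above (the variable names are my own):

```python
# Raw counters copied from the perf stat output above (i7-4600U run).
task_clock_ms = 678664.078975
cycles = 1_470_647_042_574
instructions = 1_671_800_359_058
branches = 147_770_835_671
branch_misses = 3_032_929_583
elapsed_s = 198.412894034

task_clock_s = task_clock_ms / 1000

# Instructions per cycle, aggregated over every thread of the process.
ipc = instructions / cycles
# Average clock while the threads were actually running.
avg_ghz = cycles / task_clock_s / 1e9
# CPU time consumed per second of wall time.
cpus_utilized = task_clock_s / elapsed_s
# Share of branches that were mispredicted, in percent.
branch_miss_pct = 100 * branch_misses / branches

print(f"IPC            {ipc:.2f}")
print(f"Avg clock      {avg_ghz:.3f} GHz")
print(f"CPUs utilized  {cpus_utilized:.3f}")
print(f"Branch misses  {branch_miss_pct:.2f}%")
```

As far as I understand, perf sums cycles over all the threads it measured, so on a Hyper-Threaded part like this one the 1.14 is closer to a per-thread IPC than a per-core one.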

dfk7677 · Dec 21, 2016

Nothingness said:

This has much worse results than your run on Linux: http://www.realworldtech.com/forum/?threadid=163466&curpostid=163575

Code:

perf stat ../blender -b RyzenGraphic_27.blend -E CYCLES -x 1 -o foo -f 1

CPU: i7-4600U

____678664.078975 task-clock (msec) # 3.420 CPUs utilized
__________126 886 context-switches # 0.187 K/sec
____________1 685 cpu-migrations # 0.002 K/sec
__________305 255 page-faults # 0.450 K/sec
1 470 647 042 574 cycles # 2.167 GHz
1 671 800 359 058 instructions # 1.14 insns per cycle
__147 770 835 671 branches # 217.738 M/sec
____3 032 929 583 branch-misses # 2.05% of all branches

_____198.412894034 seconds time elapsed

That run probably used the old file from the New Horizon site. I can attest that there were 2 different files: I still have the previous version, and it takes 92sec versus 66sec for the latest one (using the command from the quoted post).

The difference is in the number of instructions executed. For the old file I get 2,557B, as the OP showed in the video; for the new one I get 1,914B. The IPC is the same for both files.
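
A quick sanity check of that claim, as a rough Python sketch. It assumes "B" means billions of instructions and that both runs used the same machine at the same clock, so instructions per second is proportional to IPC:

```python
# Figures quoted in the post: instruction counts in billions, times in seconds.
old_instr_b, old_time_s = 2557, 92
new_instr_b, new_time_s = 1914, 66

# How much more work and time the old file needs.
instr_ratio = old_instr_b / new_instr_b
time_ratio = old_time_s / new_time_s

# Billions of instructions retired per second, summed over all cores.
# At a fixed clock and core count this is proportional to IPC.
old_rate = old_instr_b / old_time_s
new_rate = new_instr_b / new_time_s

print(f"instruction ratio {instr_ratio:.2f}, time ratio {time_ratio:.2f}")
print(f"old file {old_rate:.1f} B/s, new file {new_rate:.1f} B/s")
```

The two rates land within a few percent of each other, which fits the render time scaling with instruction count rather than with a change in IPC.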

Update:
I did the rendering (with the newest file) using the SIMD version of Blender (2.78.4) provided in this post: https://forums.anandtech.com/threads/summit-ridge-zen-benchmarks.2482739/page-149#post-38641613

I got ~63sec (even better than Linux). PerfMonitor2 reported an IPC of ~1.75, which I suppose is under-reported (a value >2 would be more appropriate for that time).

Headfoot · Dec 21, 2016

Multi-socket results are not a good basis for comparison at all versus a single-socket demo of Ryzen. Multi-socket has the massive bottleneck of inter-socket communication for memory, CPU, etc., whether NUMA-aware or not.

Unless you're trying to compare platforms (fabric and all), the "IPC" you found is not a per core IPC figure because it incorporates all the limitations of intersocket fabric. Single socket 8c/16t is the best and only accurate comparison. Everything else takes way too many steps of approximation.

Pilum · Dec 23, 2016

dfk7677 said:
Update:
I did the rendering (with the newest file) using the SIMD version of Blender (2.78.4) provided in this post: https://forums.anandtech.com/threads/summit-ridge-zen-benchmarks.2482739/page-149#post-38641613

I got ~63sec (even better than linux). PerfMonitor2 reported an IPC of ~1.75, which I suppose is under-reported (a value >2 would be more appropriate for that time).

No, that sounds right. SIMD=Single Instruction Multiple Data, i.e. a single instruction will perform several arithmetic operations at once, in contrast to scalar code, where one instruction performs a single arithmetic operation. Thus SIMD code needs fewer instructions to perform the same number of calculations, so IPC will generally go down while performance goes up.
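
To put numbers on that, here is a toy throughput model in Python. Everything in it is hypothetical: the amount of work, the 4-wide SIMD width and both IPC values are made up for illustration, and real code is never fully vectorizable.

```python
def run_cycles(operations, ops_per_instruction, ipc):
    """Return (instructions, cycles) for a simple throughput model."""
    instructions = operations / ops_per_instruction
    cycles = instructions / ipc
    return instructions, cycles

OPS = 1_000_000  # hypothetical amount of arithmetic work

# Scalar build: one operation per instruction, higher IPC.
scalar_instr, scalar_cycles = run_cycles(OPS, ops_per_instruction=1, ipc=2.0)
# SIMD build: four operations per instruction, lower IPC.
simd_instr, simd_cycles = run_cycles(OPS, ops_per_instruction=4, ipc=1.75)

speedup = scalar_cycles / simd_cycles

print(f"scalar: {scalar_instr:.0f} instructions, {scalar_cycles:.0f} cycles")
print(f"SIMD:   {simd_instr:.0f} instructions, {simd_cycles:.0f} cycles")
print(f"speedup {speedup:.2f}x despite the lower IPC")
```

The SIMD run finishes in fewer cycles even though its IPC is lower, because it retires far fewer instructions for the same work.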
