CMT vs SMT - Bulldozer vs Gulftown Scaling

BlueBlazer · Oct 20, 2011

CMT vs SMT >>> AMD Bulldozer vs. Intel Gulftown Scaling at OpenBenchmarking.org

Multi-threading technology head to head!

Idontcare · Oct 20, 2011

I'm rather impressed with how well the $250 8150 held in there compared to a $1000 990X.

BlueBlazer · Oct 20, 2011

The end results do matter (that's what benchmarks measure). The thing is, the scaling is very similar for both despite the throughput claims. Perhaps rather than CMT, AMD should be looking at SMT, which is less complex and takes up less die area.

The odd one out is CLOMP.......

CLOMP is the C version of the Livermore OpenMP benchmark developed to measure OpenMP overheads and other performance impacts due to threading in order to influence future system designs. This particular test profile configuration is currently set to look at the OpenMP static schedule speed-up across all available CPU cores using the recommended test configuration.

Bulldozer seems to level off at 4 threads. Could be "core"/thread scheduling impact (second "core"/thread of each module being used)? :hmm:

Also, there's a weird anomaly for the Gulftown between 6 and 8 threads on some of these tests.

Martimus · Oct 20, 2011

I would think that SMT is more complicated to implement.

Tuna-Fish · Oct 20, 2011

Martimus said:
I would think that SMT is more complicated to implement.

Why? Register renaming (which all modern x86 cpus have) takes you halfway there. If you can rename the flags register too, you basically don't have to do any changes to the execution units to support SMT.

Vesku · Oct 20, 2011

Full 12 threads 990x, full 8 threads FX-8150

Test              Speedup % 990x    Speedup % FX-8150
                  (full HT)         (full module)
C-Ray                  5.18              83.35
Smallpt               38.16              83.65
GraphicsMagick        39.39              44.68
GraphicsMagick        46.88              62.50
7-Zip                 30.82              91.54
x264                  21.75              88.44
NAS Parallel          -4.38              68.60
NAS Parallel          49.06              86.60
NAS Parallel         -20.92              59.29
NAS Parallel           9.25              67.91
NAS Parallel           9.69              65.28
CLOMP                 25.00              -2.77
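
For anyone wanting to reproduce this kind of table, here is a minimal Python sketch of how a "speedup %" figure could be computed from two benchmark results. It assumes the percentage is the gain from one thread per core (or module) to two threads per core (or module); the thread does not state the exact formula. The function name and the sample scores are hypothetical and are not taken from the OpenBenchmarking.org data.

```python
def speedup_percent(base_score, loaded_score, higher_is_better=True):
    """Percent gain when going from the base run to the fully loaded run.

    base_score   -- result with one thread per core/module (assumed baseline)
    loaded_score -- result with two threads per core/module
    For time-based results (lower is better) the ratio is inverted.
    """
    if base_score <= 0 or loaded_score <= 0:
        raise ValueError("scores must be positive")
    if higher_is_better:
        ratio = loaded_score / base_score
    else:
        ratio = base_score / loaded_score
    return (ratio - 1.0) * 100.0


# Hypothetical scores, for illustration only.
print(round(speedup_percent(100.0, 125.0), 2))                        # 25.0
print(round(speedup_percent(60.0, 40.0, higher_is_better=False), 2))  # 50.0
```

A negative value, like some of the NAS Parallel rows, simply means the fully loaded run was slower than the baseline.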

BlueBlazer · Oct 20, 2011

Vesku said:
Full 12 threads 990x, full 8 threads FX-8150

For scaling, I'm looking at 8 threads on the Bulldozer versus 8 threads on the Gulftown, since we do not have 12 thread Bulldozer sample data (thus how Bulldozer scales beyond 8 threads is unknown, waiting for Interlagos on that one). Deltas are similar on a few of those tests (the ones without the Gulftown anomalies)

Vesku · Oct 20, 2011

I was contrasting HT gains with CMT gains. I must say HT has come a long way since the P4 implementation, with some solid gains. CMT doesn't have as much room to improve, but it's an interesting approach to running more threads.

Is it possible to disable cores on a 2600K? Someone with access to comparable 2600K and 8150 systems could then test everything from 1 core without HT and 1 module with 1 thread, all the way up to 4 cores with HT and 4 modules with 2 threads per module.

jhu · May 5, 2014

Vesku said:
I was contrasting HT gains with CMT gains. I must say HT has come a long way since the P4 implementation, with some solid gains.

I get about a 25% increase with HT on both my Pentium 4 (Northwood) and my Ivy Bridge.

Arkaign · May 6, 2014

LOL I was like '8150? Gulftown?' wtf?

/checks date

AH okay then.

HurleyBird · May 6, 2014

BlueBlazer said:
For scaling, I'm looking at 8 threads on the Bulldozer versus 8 threads on the Gulftown, since we do not have 12 thread Bulldozer sample data (thus how Bulldozer scales beyond 8 threads is unknown, waiting for Interlagos on that one). Deltas are similar on a few of those tests (the ones without the Gulftown anomalies)

That's problematic. For eight threads we know that BD is fully loaded, but what does that mean for Gulftown? It might mean that the benchmark is running on four cores with two threads per core, or it could be running on all six cores with only two cores fully loaded via SMT -- or five with three fully loaded. Depending on how the threads are managed you could be looking at a huge difference in scaling.

A better comparison for CMT vs. SMT would be one Intel core vs. one BD module, or at least compare products that have the same number of total threads. It can be difficult to tell if the benchmarks are favouring loading up physical cores before logical ones, or the other way around. 6/12 vs 4/8 just compounds the issue. Too much random noise.
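
The placement ambiguity described above can be enumerated with a short Python sketch. It assumes 2-way SMT (or two cores per module for CMT) and ignores which specific cores are chosen; the function name is hypothetical.

```python
def smt_placements(threads, cores):
    """List the ways `threads` software threads can occupy `cores` physical
    cores when each core can hold at most two threads.

    Returns tuples of (cores_used, cores_with_two_threads, cores_with_one_thread).
    """
    placements = []
    for doubled in range(threads // 2 + 1):
        single = threads - 2 * doubled   # threads that get a core to themselves
        used = doubled + single          # physical cores touched in total
        if used <= cores:
            placements.append((used, doubled, single))
    return placements


# 8 threads on a 6-core Gulftown: three distinct loadings are possible.
print(smt_placements(8, 6))  # [(6, 2, 4), (5, 3, 2), (4, 4, 0)]

# 8 threads on a 4-module FX-8150: only one loading, every module full.
print(smt_placements(8, 4))  # [(4, 4, 0)]
```

The three Gulftown results match the cases listed in the post: all six cores with two doubled up, five cores with three doubled up, or four cores fully loaded. Bulldozer has only one possible loading at eight threads.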

rvborgh · May 6, 2014

I wonder how a 12-core Opteron 8439SE/2439SE system would have done against Gulftown...

ShintaiDK · May 6, 2014

SMT is almost free to implement in terms of die size. The same can hardly be said of AMD's CMT.

Idontcare · May 6, 2014

ShintaiDK said:
SMT is almost free to implement in terms of die size. The same can hardly be said of AMD's CMT.

CMT with Intel's beefy Haswell cores would be something of interest, CMT with otherwise weakened and lackluster cores not so much.

jhu · May 6, 2014

Arkaign said:
LOL I was like '8150? Gulftown?' wtf?

/checks date

AH okay then.

This somehow ended up on the front page. Fooled by date too.

Lepton87 · May 6, 2014

HurleyBird said:
That's problematic. For eight threads we know that BD is fully loaded, but what does that mean for Gulftown? It might mean that the benchmark is running on four cores with two threads per core, or it could be running on all six cores with only two cores fully loaded via SMT -- or five with three fully loaded. Depending on how the threads are managed you could be looking at a huge difference in scaling.

That's a non-issue. Windows is HT-aware and will always load physical cores first and logical cores second, so with 8 threads it will always load 6 physical cores and 2 logical cores. An older, unpatched version of Windows like Windows XP may load 4 physical cores and 4 logical cores, but that doesn't happen on newer versions of Windows.
BTW, if you don't trust the Windows scheduler, you can always assign core affinity manually.
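
As a rough illustration of assigning affinity manually, here is a minimal Python sketch. Note the assumptions: it uses `os.sched_setaffinity`, which Python only provides on Linux, so on Windows it does nothing and you would use Task Manager or `start /affinity` instead. The helper name `pin_to_cpus` is hypothetical, and which logical CPU numbers share a physical core depends on the system, so check the topology before picking IDs.

```python
import os


def pin_to_cpus(cpus):
    """Restrict the current process to the given logical CPU IDs.

    Returns the resulting affinity set, or None where the call is not
    available (e.g. Windows or macOS builds of Python).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, set(cpus))  # pid 0 means the calling process
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        original = os.sched_getaffinity(0)
        # Pin to the lowest-numbered CPU we are currently allowed to use.
        print(pin_to_cpus({min(original)}))
        # Restore the original mask so later work is not restricted.
        os.sched_setaffinity(0, original)
    else:
        print("CPU affinity control is not available through os on this platform")
```

Pinning the benchmark to a known set of logical CPUs removes the guesswork about whether eight threads landed on six cores or four.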
