- Nov 25, 2008
CMT vs SMT >>> AMD Bulldozer vs. Intel Gulftown Scaling at OpenBenchmarking.org
Multi-threading technology head to head!
Bulldozer seems to level off at 4 threads. Could be "core"/thread scheduling impact (second "core"/thread of each module being used)? :hmm:
CLOMP is the C version of the Livermore OpenMP benchmark developed to measure OpenMP overheads and other performance impacts due to threading in order to influence future system designs. This particular test profile configuration is currently set to look at the OpenMP static schedule speed-up across all available CPU cores using the recommended test configuration.
I would think that SMT is more complicated to implement.
For scaling, I'm looking at 8 threads on the Bulldozer versus 8 threads on the Gulftown, since we do not have 12-thread Bulldozer sample data (so how Bulldozer scales beyond 8 threads is unknown; waiting for Interlagos on that one). Deltas are similar on a few of those tests (the ones without the Gulftown anomalies).
Full 12 threads 990x, full 8 threads FX-8150
I was contrasting HT with CMT gains. I must say HT has come a long way since the P4 implementation, with some solid gains.
SMT is almost free to implement in terms of die size. The same can hardly be said of AMD's CMT.
LOL I was like '8150? Gulftown?' wtf?
/checks date
AH okay then.
That's problematic. For eight threads we know that BD is fully loaded, but what does that mean for Gulftown? It might mean that the benchmark is running on four cores with two threads per core, or it could be running on all six cores with only two cores fully loaded via SMT -- or five with three fully loaded. Depending on how the threads are managed you could be looking at a huge difference in scaling.
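To make the ambiguity concrete, here is a rough Python sketch that lists the ways eight threads could land on a six-core, 2-way SMT chip. The function name `smt_placements` is made up for illustration, and it assumes every used core runs either one or two threads; it says nothing about what the OS scheduler actually did in these benchmark runs.

```python
def smt_placements(threads, cores):
    """List (cores_used, fully_loaded_cores, single_thread_cores) options.

    Assumes 2-way SMT: each used core runs one or two threads.
    """
    options = []
    for used in range(1, cores + 1):
        # Feasible only if the threads fit on `used` cores with none left idle.
        if used <= threads <= 2 * used:
            doubled = threads - used      # cores running two threads
            single = used - doubled       # cores running one thread
            options.append((used, doubled, single))
    return options


# Eight threads on a six-core Gulftown (hypothetical placement enumeration).
for used, doubled, single in smt_placements(8, 6):
    print(f"{used} cores used: {doubled} with two threads, {single} with one")
```

Under those assumptions this gives the same three cases as above: four cores all doubled up, five cores with three doubled up, or six cores with two doubled up. Which one a given run used would need to be checked with thread pinning or affinity data.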
