AMD Zen supports CMT and SMT


SarahKerrigan

Senior member
Oct 12, 2014
735
2,036
136
It was just SMT+CMP. The CMT comes from Chip MultiThreading reference.

T1 was non-SMT; it used fine-grained multithreading. T2/T3 was an oddity; I'd classify it somewhere between SMT and CMT. Significant logic (complete integer pipelines) was partitioned, two to a core, with each pipeline allocated to a group of four threads. As a result, two threads could issue per cycle, but it certainly wasn't a conventional SMT design.
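
To make the "two pipelines, four threads each" arrangement concrete, here is a toy sketch in Python. It assumes plain round-robin selection inside each thread group and ignores stalls entirely; the function name and thread numbering are made up for illustration and are not taken from any Sun documentation.

```python
from collections import deque

def simulate_issue(cycles, groups=2, threads_per_group=4):
    """Toy model: each integer pipeline owns one thread group and
    issues from exactly one thread of that group per cycle."""
    # Hypothetical numbering: group 0 holds threads 0-3, group 1 holds 4-7.
    queues = [
        deque(range(g * threads_per_group, (g + 1) * threads_per_group))
        for g in range(groups)
    ]
    schedule = []
    for _ in range(cycles):
        issued = []
        for q in queues:
            tid = q.popleft()   # pick the next thread in this group
            issued.append(tid)
            q.append(tid)       # move it to the back (round-robin assumption)
        schedule.append(tuple(issued))
    return schedule

schedule = simulate_issue(4)
print(schedule)
```

Every cycle issues exactly two threads, one per pipeline, and a thread never issues on the other group's pipeline. That fixed partitioning is what separates it from a conventional SMT core, where threads compete for the same issue slots.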

Oracle no longer uses any related design, and hasn't since T4, which is a more conventional multithreaded OoO core.

Freescale, on the other hand, uses a very CMT-esque design in their e6500 core, although that's a very different target workload from anything AMD is targeting. An e6500 core has a pair of integer "virtual cores", with dedicated decode, issue, and execution logic, with a lot of similarity to the previous-generation e5500 core. Floating-point and VMX logic is shared.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
Did any of you clowns read the original Japanese article? It doesn't mention SMT or CMT at all...

Yeah exactly.... nobody at AMD ever said anything like that at all. It's all reading between the lines at best and pure speculation at worst.
 

DooKey

Golden Member
Nov 9, 2005
1,811
458
136
Wow, the ADF is really reaching now. Speculation or lies............I guess time and an official review will tell the tale.
 

DrMrLordX

Lifer
Apr 27, 2000
22,696
12,651
136
Did any of you clowns read the original Japanese article? It doesn't mention SMT or CMT at all...

The Register article linked by the OP doesn't mention them, either.

There are a few tidbits in there (the original Japanese article) I hadn't heard before, though, not that they are related to the OP (per se).

For example, the article (yay Google translate) seems to indicate that Nolan and Amur will be fully HSA-compliant:

In generation of this 2015, x86 is "Puma + core", ARM will be equipped with the "Cortex-A57 core" of low-power, equipped with a complete HSA support, such as part of the non-core common design the GCN graphics of AMD the city go.

That factoid may seem trivial to many, but it's interesting to see the progression of available HSA APUs. First Kaveri, then Carrizo, then Nolan/Amur . . . HSA is being pushed onto successively lower-power devices. Berlin, meanwhile, is still MIA in retail channels. Funny how Mr. Hayashi seems to ramble on about Berlin and Seattle as if they were shipping in non-trivial numbers in 2014.
 

NTMBK

Lifer
Nov 14, 2011
10,411
5,677
136
Yeah, not surprised they got HSA; they had to redesign the entire uncore to let them swap out ARM and x86 cores.

I suspect that Nolan and Amur are mostly tech demonstrators for Zen/K12, derisking that project by developing the uncore design using an off the shelf ARM core and a tried and tested x86 design. Lets them iron out the details before the main event.
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
There is a reason: CMT is a failed concept that nobody but AMD in the entire industry bothered to pursue.

I think most Scientists / Researchers would disagree with you.

For scientific computing, most researchers seem to prefer all 8 projects running/finishing at a consistent speed (FX-8320e) -- versus 4 projects finishing quickly, with 4 projects dragging along like a slug on the "hyper" threads (Core i7).

My alma mater just purchased a cluster of water cooled FX's instead of i7's because the scientific computations were completed so erratically on the tested i7's.

Just saying.... Intel isn't the end all and be all that the fanboys on this forum make it out to be. CMT and SMT combined has the potential to be very good for many specialized tasks.
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
April Fools' Day was 2 days ago. Unless you use a very loose definition of "not that far away".

Technically, he might not be wrong. If you were to compare an entry-level Haswell Celeron G1830 to a Steamroller Athlon X4 860K -- Passmark only separates them by a mere 76 points in single threaded performance. Seems like a fair comparison, too -- since there is only a $20 difference in price between those 2 chips.

It's clearly no contest once you break into the i5 Haswells or above (versus Steamrollers) -- but it really depends on which Haswell he was referring to.... If he was only referring to Celerons or even the low-end Pentiums, then sure.... It is a true statement.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
I think most Scientists / Researchers would disagree with you.

For scientific computing, most researchers seem to prefer all 8 projects running/finishing at a consistent speed (FX-8320e) -- versus 4 projects finishing quickly, with 4 projects dragging along like a slug on the "hyper" threads (Core i7).

Not that it would matter anyway, as scientists/researchers are too small of a niche to warrant a product specifically developed for them (hence CMT designs being a failure as a product), but I'm rather curious about what these scientists of yours think about the Xeon line of processors. You know, 8 projects is so 2011.

But since you are bringing your anecdotal evidence as a proof of authority, could you at least explain what kind of projects these scientists of yours work on? Because the scientists/researchers I know in our company (the guys working on geological data and chemical simulations) don't really go beyond Xeon/Tesla for their stuff, and they seem to be quite happy with them. In fact, AMD processors would be way too anemic compared to the heavyweight workstations and servers we have in our labs.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Technically, he might not be wrong. If you were to compare an entry-level Haswell Celeron G1830 to a Steamroller Athlon X4 860K -- Passmark only separates them by a mere 76 points in single threaded performance. Seems like a fair comparison, too -- since there is only a $20 difference in price between those 2 chips.

It's clearly no contest once you break into the i5 Haswells or above (versus Steamrollers) -- but it really depends on which Haswell he was referring to.... If he was only referring to Celerons or even the low-end Pentiums, then sure.... It is a true statement.

The context was IPC.
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
The context was IPC.

No, it wasn't. You are incorrect, sir.

AtenRa said: "Even Steam Roller is not that far away than Haswell in ST"

ST was a reference to Single Threaded performance. Passmark scores agree with what he is saying.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106

NTMBK

Lifer
Nov 14, 2011
10,411
5,677
136
Why does IPC in and of itself matter? It's overall perf/W that I care about; how they get there is just an implementation detail. If Bulldozer clocked at 5GHz at 95W and offered good results, I would not care how much higher Intel's IPC is. But of course it doesn't, so hey.
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
Not that it would matter anyway, as scientists/researchers are too small of a niche to warrant a product specifically developed for them (hence CMT designs being a failure as a product), but I'm rather curious about what these scientists of yours think about the Xeon line of processors. You know, 8 projects is so 2011.

They considered Xeon and Opteron as well -- but in the end, FX 8320e offered the most impressive performance for the allotted grant money. The staggering cost of current Xeons ended up being the deal-breaker (they came close to purchasing the E5 2697 instead).

It was for the math department (integer calculations). They ended up with a cluster of 6 FX-8320e's -- running 48 projects simultaneously on Red Hat.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Why does IPC in and of itself matter? It's overall perf/W that I care about; how they get there is just an implementation detail. If Bulldozer clocked at 5GHz at 95W and offered good results, I would not care how much higher Intel's IPC is. But of course it doesn't, so hey.

Unless something radically changes, IPC is all that matters, given the ~4GHz barrier we have had for the last 10 years.
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
He replied to IPC. Context is IPC.

And how reliable is Passmark as a measure of actual performance? Not to mention it's user-supplied samples with whatever variation.

Again, I don't think so.

Just because he replied to a separate post -- the actual context is how the author intended it in his specific comment. From how he wrote it, there was nothing about IPC intended in the comment as far as I can tell. I'm sure he can clarify it better, but I'm pretty sure he was making a general point that Steam Roller provides very good Single Threaded Performance (in relation to Haswell).... Nothing more and nothing less. I believe you are injecting IPC into a comment where it wasn't intended.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Again, I don't think so.

Just because he replied to a separate post -- the actual context is how the author intended it in his specific comment. From how he wrote it, there was nothing about IPC intended in the comment as far as I can tell. I'm sure he can clarify it better, but I'm pretty sure he was making a general point that Steam Roller provides very good Single Threaded Performance (in relation to Haswell).... Nothing more and nothing less. I believe you are injecting IPC into a comment where it wasn't intended.

I don't think so. Strawberry cake is almost as good. (Don't even bother to show me any sales charts)

If context doesn't matter then the reply doesn't matter.
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
I don't think so. Strawberry cake is almost as good. (Don't even bother to show me any sales charts)

If context doesn't matter then the reply doesn't matter.

I guess it does bother you that re-writing other people's thoughts doesn't work.

That's the bottom line -- you're trying to twist someone else's comments to suit your narrative.

There is nothing wrong with someone making a general observation (which is most likely what Atenra actually did). Not everything has to be related. I'm pretty sure if Atenra was talking about IPC -- he would have actually mentioned it.
http://www.cpu-world.com/news_2013/2013062001_Intel_expands_Xeon_Phi_co-processor_lineup.html
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
I think most Scientists / Researchers would disagree with you.

For scientific computing, most researchers seem to prefer all 8 projects running/finishing at a consistent speed (FX-8320e) -- versus 4 projects finishing quickly, with 4 projects dragging along like a slug on the "hyper" threads (Core i7).

My alma mater just purchased a cluster of water cooled FX's instead of i7's because the scientific computations were completed so erratically on the tested i7's.

Just saying.... Intel isn't the end all and be all that the fanboys on this forum make it out to be. CMT and SMT combined has the potential to be very good for many specialized tasks.

They considered Xeon and Opteron as well -- but in the end, FX 8320e offered the most impressive performance for the allotted grant money. The staggering cost of current Xeons ended up being the deal-breaker (they came close to purchasing the E5 2697 instead).

It was for the math department (integer calculations). They ended up with a cluster of 6 FX-8320e's -- running 48 projects simultaneously on Red Hat.

Meh. Maybe in that case.

However, the vast majority of scientific computation is FP and that is really where intel excels.

If you are under strict budget requirements and performance/efficiency isn't terribly important AMD makes sense.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
They considered Xeon and Opteron as well -- but in the end, FX 8320e offered the most impressive performance for the allotted grant money. The staggering cost of current Xeons ended up being the deal-breaker (they came close to purchasing the E5 2697 instead).

(...)

It was for the math department (integer calculations). They ended up with a cluster of 6 FX-8320e's -- running 48 projects simultaneously on Red Hat.

Is this real, or are you making this stuff up? Because from what you are saying, those 48 projects were granted around $100-$150 each. I don't think this kind of shoestring budget can be used to define which processor is good for scientific calculation. This is a real tail-end scenario.
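
For anyone who wants to check the arithmetic, here it is spelled out as a small Python sketch. The 6 chips and 48 projects come from the quoted post, and the $100-$150 per project is the estimate above; the variable and function names are just illustrative.

```python
CHIPS = 6               # FX-8320e chips in the cluster (from the quoted post)
THREADS_PER_CHIP = 8    # one project per hardware thread: 48 / 6

projects = CHIPS * THREADS_PER_CHIP

def total_budget(per_project_usd):
    """Total spend implied by a given per-project figure."""
    return projects * per_project_usd

low, high = total_budget(100), total_budget(150)
per_chip_low, per_chip_high = low / CHIPS, high / CHIPS
print(projects, low, high, per_chip_low, per_chip_high)
```

Under those figures the whole cluster works out to $4,800-$7,200, or $800-$1,200 per chip including everything around it.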
 

TheELF

Diamond Member
Dec 22, 2012
4,027
753
126
My alma mater just purchased a cluster of water cooled FX's instead of i7's because the scientific computations were completed so erratically on the tested i7's.

This cannot happen.
There is a thing called thread migration: each time a thread is rescheduled, it gets assigned to a new core (real or virtual). No matter how many or how few threads you run, each one will get the same amount of CPU time, so there is no way for them to finish at erratically different times.
Even if they went out of their way to use affinity and force each process/thread onto a different core, the scheduler would still give each thread an equal amount of CPU time.
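
Here is a toy model of that argument in Python. It is not how any real OS scheduler is implemented: it assumes 8 threads on 8 logical cores, a strict rotation at every time slice, and it labels logical cores 4-7 as the hypothetical "hyper" siblings.

```python
def simulate_migration(n_threads=8, n_cores=8, slices=800):
    """Rotate every thread to the next logical core at each time slice
    and record how many slices each thread spent on each core."""
    time_on = [[0] * n_cores for _ in range(n_threads)]
    for s in range(slices):
        for t in range(n_threads):
            core = (t + s) % n_cores   # assumed strict round-robin migration
            time_on[t][core] += 1
    return time_on

time_on = simulate_migration()
totals = [sum(row) for row in time_on]
# Fraction of each thread's time spent on the assumed sibling cores 4-7.
sibling_share = [sum(row[4:]) / sum(row) for row in time_on]
print(totals)
print(sibling_share)
```

In this model every thread gets the same 800 slices and spends exactly half of them on the sibling cores, so none of them is stuck on a "slow" logical core for the whole run.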