64 core EPYC Rome （Zen2）Architecture Overview？

moinmoin · Nov 12, 2018

amd6502 said:
On an unrelated topic, does anyone know if chiplets can be tested and crudely binned before being used in MCM, other than by having computers look at it with microscopes?

Personally I think this is one of the primary reasons the whole control fabric with its 1000s of sensors exists in modern AMD designs: It gives them the ability to self-check all possible qualities of the chip(let) before selecting for specific use cases/products. (Aside from it being a great help for debugging during development as well as constant monitoring during production use.)

Glo. · Nov 12, 2018

Oh, how funny this thread appears to be.

"No way, AMD will offer 16C/32T! Threadripper sales will be cannibalized! There is not enough memory bandwidth to feed those cores! Power consumption way too high! They have to have high MSRP, users on AM4 don't need 16 cores!"

16C/32T CPU on 7 nm process will be 95W TDP. 16C/32T will be 500$ at best, because of SKU segmentation. Threadripper 3 will have 32 cores. Zen 2 CPUs will start from 8C/8T, because if the core dies are fully working, there is no reason for AMD to sandbag them in any way.

IRobot23 · Nov 12, 2018

itsmydamnation said:
Explain Rome then

4k pins

itsmydamnation said:
If we are just going to make up numbers i could easily do the same to support my position.

So you have no idea how cache coherency/memory accesses work on the new chiplet+I/O die. Second we already have an idea, for most workloads sweet FA. For gaming it very quickly hits minimal returns (see The Stilt's mem clock + low latency timings benchmark data).

Nobody said that the doubled L3 cache.
No, games won't see any difference from more cache or an L4... they need low-latency DDR4.

F-Rex · Nov 12, 2018

amd6502 said:
On an unrelated topic, does anyone know if chiplets can be tested and crudely binned before being used in MCM, other than by having computers look at it with microscopes?

Usually, every chip on every wafer is tested because you don't want to ship a DOA part to your customer. Bad for business. But final wafer test has a not insignificant cost per chip (depending on what your chip is doing in real life, the test can take time). You test only so many chips at a time, so when you have several thousand small chips per wafer, testing can take a week per wafer.
So if your yield is above 99% and your customer is OK with it, you can ship without testing or, more usually, with only a tested subsample.
To be clear, AMD and Intel test all their chips. Always.
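
A rough back-of-the-envelope sketch of that trade-off, in Python. Every number here (chips per wafer, how many chips the tester probes at once, seconds per chip) is a made-up assumption for illustration, not a foundry or AMD figure; the function names are hypothetical too.

```python
def wafer_test_hours(chips_per_wafer, sites_in_parallel, seconds_per_chip):
    """Hours to test one wafer when only `sites_in_parallel` chips are probed at a time."""
    # Ceiling division: a partially filled last pass still costs a full pass.
    passes = -(-chips_per_wafer // sites_in_parallel)
    return passes * seconds_per_chip / 3600


def expected_bad_parts_shipped(chips_shipped, yield_fraction):
    """Expected number of defective parts shipped if nothing is tested."""
    return chips_shipped * (1.0 - yield_fraction)


# Assumed example: 5000 small chips per wafer, 4 probed at once, 8 minutes each.
hours = wafer_test_hours(chips_per_wafer=5000, sites_in_parallel=4, seconds_per_chip=480)
bad = expected_bad_parts_shipped(chips_shipped=5000, yield_fraction=0.99)
print(f"{hours:.1f} h per wafer, ~{bad:.0f} bad parts per wafer if shipped untested")
```

With those assumed inputs the test time lands close to a week per wafer, which is the kind of cost that makes subsample testing tempting when yield is very high.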

dnavas · Nov 12, 2018

Vattila said:
It is not that I am disappointed with AMD. I am full of admiration for what Lisa Su has achieved so far, as she has led the company from deathbed to sustained profits.

But I set my expectations based on what I see would be a reasonable business plan. Planning to lose against 14nm Skylake/CFL at a time when 10nm Ice Lake was expected to be in the market, with Ice Lake having architectural improvements, increased density (core count), lower power and higher performance — as originally projected for Intel's 10nm process — that is not a sound business plan.

I rather felt that the Ryzen2 release was disappointing. They didn't really do anything with 12nm -- seems to me they mostly just fixed their control plane and called it a day. It feels like people were pulled off of that effort and concentrated on some other project ... like Zen2. The latest rumor is that Zen2 isn't being delivered to consumers until Q3, probably Q4. Zen2 is being sampled ... but barely, given what we know of how it fared in lab tests. They didn't use their favorite benchmark, they emphasized that they're meeting their deadlines. Yet they've lost at least one HPC deal because they can't deliver in the timeframe required. Pulling those threads together makes it sound like they are running late. Because I'm forced to guess, I'd say that this is GF fall-out -- Ryzen 2 team pulled to run a Zen2 backup plan at TSMC, and now they're a half year back of their original plans. Those aren't the only threads, of course -- could be just that consumer is pushed out so that they can do a single die and tackle latency. Who knows.

But as I'm forced to guess, my suspicion is that we're at least two quarters away from a release. The AMD folks doing demos wouldn't even commit to first half of 2019. "2019". Yet, the current parts are already beating Intel. So yes, I expect really good things. Maybe not really great latency, but reasonably high frequencies and significant IPC gains. I don't expect them soon is all.

csbin · Nov 12, 2018

Naples, Rome, Milan, Zen 4: An Interview with AMD CTO, Mark Papermaster

https://www.anandtech.com/show/1357...-4-an-interview-with-amd-cto-mark-papermaster

IC: With all the memory controllers on the IO die we now have a unified memory design such that the latency from all cores to memory is more consistent?

MP: That’s a nice design – I commented on improved latency and bandwidth. Our chiplet architecture is a key enablement of those improvements.

coercitiv · Nov 12, 2018

csbin said:
MP: That’s a nice design – I commented on improved latency and bandwidth. Our chiplet architecture is a key enablement of those improvements.

The next answer is even more relevant:
MP: We haven’t provided the specifications yet, but the architecture is aimed at providing a generational improvement in overall latency to memory. The architecture with the central IO chip provides a more uniform latency and it is more predictable.

Atari2600 · Nov 12, 2018

Doesn't give much away does he?

TheGiant · Nov 12, 2018

After this answer I am expecting something like the Skylake-SP/X jump: a better server architecture (the money is there) and lower gaming IPC.

coercitiv · Nov 12, 2018

Atari2600 said:
Doesn't give much away does he?

He was quite close to screaming at the audience towards the end of his presentation on Zen 2. Whether it was acted or not didn't even matter; the desire to share more than he was allowed was clearly there.

exquisitechar · Nov 12, 2018

The amazing @kokhua predicts on Twitter that there is no L4$. I believe him.

Topweasel · Nov 12, 2018

dnavas said:
I rather felt that the Ryzen2 release was disappointing. They didn't really do anything with 12nm -- seems to me they mostly just fixed their control plane and called it a day. It feels like people were pulled off of that effort and concentrated on some other project ... like Zen2. The latest rumor is that Zen2 isn't being delivered to consumers until Q3, probably Q4. Zen2 is being sampled ... but barely, given what we know of how it fared in lab tests. They didn't use their favorite benchmark, they emphasized that they're meeting their deadlines. Yet they've lost at least one HPC deal because they can't deliver in the timeframe required. Pulling those threads together makes it sound like they are running late. Because I'm forced to guess, I'd say that this is GF fall-out -- Ryzen 2 team pulled to run a Zen2 backup plan at TSMC, and now they're a half year back of their original plans. Those aren't the only threads, of course -- could be just that consumer is pushed out so that they can do a single die and tackle latency. Who knows.

But as I'm forced to guess, my suspicion is that we're at least two quarters away from a release. The AMD folks doing demos wouldn't even commit to first half of 2019. "2019". Yet, the current parts are already beating Intel. So yes, I expect really good things. Maybe not really great latency, but reasonably high frequencies and significant IPC gains. I don't expect them soon is all.

Zen+ never had much on it to begin with. AMD had 3 choices: no new releases in 2018; a slight bump in numbers to prevent a major backslide in sales; or delaying their Zen 2 development by running a full-blown dev process for Zen+, which couldn't go chiplet. They couldn't really increase core size since it was already a bit large to begin with, so they would only be working on the arch. How much better could it be? And would it be worth having Zen 2 be just a die-shrink version of Zen+ and waiting till 2020 with Zen 3 to change to chiplets?

Saylick · Nov 12, 2018

I think this line of questioning was the most interesting, considering it is Ian asking the same questions we have on the forums, namely whether the cores on a chiplet are all on the same CCX and whether there's some sort of all-inclusive LLC on the IO die to mitigate the latency when a core on one chiplet needs to grab data from a core on another chiplet. Perhaps I'm reading into it too much, but does the fact that Mark immediately stops Ian mid-question yet still states that AT has the right questions hint that Ian's speculations are true?

IC: When one core wants to access the cache of another core, it could have two latencies: when both cores are on the same chiplet, and when the cores are on different chiplets. How is that managed with a potentially bifurcated latency?

MP: I think you’re trying to reconstruct the detailed diagrams that we’ll show you at the product announcement!

IC: Under the situation where we now have a uniform main memory architecture, for on-chip compared to chip-to-chip there is still a near and a far latency…

MP: I know exactly where you’re going and as always with AnandTech it’s the right question! I can honestly say that we’ll share this info with the full product announcement.

coercitiv · Nov 12, 2018

Glo. said:
Oh, how funny this thread appears to be.

"No way, AMD will offer 16C/32T! Threadripper sales will be cannibalized! There is not enough memory bandwidth to feed those cores! Power consumption way too high! They have to have high MSRP, users on AM4 don't need 16 cores!"

16C/32T CPU on 7 nm process will be 95W TDP. 16C/32T will be 500$ at best, because of SKU segmentation. Threadripper 3 will have 32 cores. Zen 2 CPUs will start from 8C/8T, because if the core dies are fully working, there is no reason for AMD to sandbag them in any way.

Yeah, Zen 2 comes at you fast:

Glo. said:
LOLOLOLOLOLOL.

Move on, nothing to see here.

And guys remember, Matisse IS 8C/16T design. That is one thing sure, at this moment.

Good thing covering the entire spectrum with predictions means one can never be wrong, am I right? Am I right or am I right? Right!

turtile · Nov 12, 2018

dnavas said:
But as I'm forced to guess, my suspicion is that we're at least two quarters away from a release. The AMD folks doing demos wouldn't even commit to first half of 2019. "2019". Yet, the current parts are already beating Intel. So yes, I expect really good things. Maybe not really great latency, but reasonably high frequencies and significant IPC gains. I don't expect them soon is all.

One of AMD's employees said that they won't release more information on the specifics until Q1 or Q2 so late Q2 is probably the earliest we will see this with Q3 being most likely.

Glo. · Nov 12, 2018

coercitiv said:
Good thing covering the entire spectrum with predictions means one can never be wrong, am I right? Am I right or am I right? Right!

And Matisse is NOT the 8C/16T Chiplet design?

moinmoin · Nov 12, 2018

I mentioned the mental exercise in the Next Horizon thread already, following part of the answer:

coercitiv said:
MP: We haven’t provided the specifications yet, but the architecture is aimed at providing a generational improvement in overall latency to memory. The architecture with the central IO chip provides a more uniform latency and it is more predictable.

Makes me expect no more CCX structure. Instead every core is directly handled by the central IO chip. L3$ may even be situated on the IOC instead of the chiplets to create a truly global unified L3$ across all cores.

dnavas · Nov 12, 2018

Topweasel said:
Zen+ never had much on it to begin with. [...] How much better could it be?

Well, they could have fixed any number of issues with memory compatibility. I don't think we would have gotten a whole new front-end, but there were definitely problems worth addressing that wouldn't have cost a lot of silicon -- just time/iterations.

Here's an interesting quote, on a couple of levels

IC: With the FP units now capable of doing 256-bit on their own, is there a frequency drop when 256-bit code is run, similar to when Intel runs AVX2?

MP: No, we don’t anticipate any frequency decrease.

So, first, WRONG-Navas, no lower frequency AVX. :shrug: Ok, fine

But also, a hint about how far Rome really is away at this point "we don't anticipate" -- they don't know. I mean, this is probably *engineer* "know" but still. They don't know what quarter they're shipping, they don't know what frequency they're going to hit, they won't talk about IPC increase targets, and they won't commit to what they may or may not have to do about alleviating potential hot spots. Sounds to me like they have design targets, those design targets appear achievable, are well under way (controlled demos) and look to be less than a year away. But bad things can (and do) happen along the way. Does that sound right to you (all)?

coercitiv · Nov 12, 2018

dnavas said:
a hint about how far Rome really is away at this point "we don't anticipate" -- they don't know. I mean, this is probably *engineer* "know" but still.

Do you reckon they had clocks set in stone for Ryzen back in Nov 2016? And that was with a launch scheduled for Q1 2017.

dnavas · Nov 12, 2018

moinmoin said:
L3$ may even be situated on the IOC instead of the chiplets to create a truly global unified L3$ across all cores.

Aren't those chiplets a little big for a stormtr ...err... an L3-cacheless design? Cache also is one of those features that scales really well, so it'd be a shame not to have it on 7nm. The L2/L3 cache design is also already lower latency than Intel's -- I'd hate to see the L3 latency rise to something that looks like off-chip. So, unless those latencies have improved dramatically, I'm dubious. Where they need help is on the memory access and prefetch side. And who knows what they're doing for cross-chiplet caching. I suspect that's why there's so much speculation about L4$.

dnavas · Nov 12, 2018

coercitiv said:
Do you reckon they had clocks set in stone for Ryzen back in Nov 2016? And that was with a launch scheduled for Q1 2017.

So, I'm not someone who has gone through a hardware launch -- my engineer hat has to read "I don't know." And you've written a leading, semi-rhetorical question, so my forum hat reads "interesting question" and my contrary/grumpy-old-man hat screams "YES!" :> Maybe answer with my own question -- were there spins done (completed) between November and Q1?

Okay, that's a weasel response. 1Q2017 felt release-date driven. My best guess is that they shipped with whatever they had by whatever internal drop-dead date existed. But Intel pulled a groin muscle, so AMD is no longer feeling pressured to release by a certain quarter. The good news is they're going to work to "get it right." The bad news is "it'll ship when it's ready." My read is that that is not any time soon.

Another interesting note from the interview -- they showed Rome to support their GPU launch (which is imminent). I don't want to start a war, but AMD is ... let's say they're hurting on the ATI side. My hope is that the strategy to reinvigorate their CPU side is going to reach their GPU team, but thus far....

Beemster · Nov 12, 2018

Saylick said:
IC: Under the situation where we now have a uniform main memory architecture, for on-chip compared to chip-to-chip there is still a near and a far latency…

A 256MB eDRAM L4 buffer fits very nicely into the perceived extra 150mm^2 on the hub chip.

Topweasel · Nov 12, 2018

dnavas said:
Well, they could have fixed any number of issues with memory compatibility. I don't think we would have gotten a whole new front-end, but there were definitely problems worth addressing that wouldn't have cost a lot of silicon -- just time/iterations.

I told you the three options: 1. A dedicated team to develop the CPU and a delayed Zen 2, with it being a shrunk Zen+. 2. Not doing anything. 3. Release a slightly refined process version to prevent losing too many sales this year.

As much as anyone would have liked to see a bigger improvement this gen, you can go back to quotes I said in March 2017. It was always clear that Zen+ was about treading water and trying not to lose the market share they just got while they worked towards Zen 2 and on. Spending any more time on what became Zen+ was always going to be a waste of design effort, because it wasn't going to be a move to chiplets and therefore would cost them engineering time for Zen 2. Which means AMD wouldn't have had any time to prep what is Zen 2 for 7nm. Which means that for 2019 AMD would be releasing a die-shrunk Zen+. So yeah, you get your single-year increase fixing a small component or two of the die for that one year. But now you get 3 years of minimal improvements from AMD instead of just the one.

teejee · Nov 12, 2018

dnavas said:
But also, a hint about how far Rome really is away at this point "we don't anticipate" -- they don't know. I mean, this is probably *engineer* "know" but still. They don't know what quarter they're shipping, they don't know what frequency they're going to hit, they won't talk about IPC increase targets, and they won't commit to what they may or may not have to do about alleviating potential hot spots. Sounds to me like they have design targets, those design targets appear achievable, are well under way (controlled demos) and look to be less than a year away. But bad things can (and do) happen along the way. Does that sound right to you (all)?

I'm pretty sure they know most of this but why should they tell us before the launch?
For example, IPC is almost 100% known at RTL freeze (logical design), which was a long time ago for Matisse.
Some uncertainty always exists, though, concerning for example the exact frequency, since it depends on the statistical distribution of some properties of the dies.
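
To illustrate why the final clock stays uncertain until enough dies have been measured, here is a minimal Python sketch. It assumes, purely for illustration, that the maximum stable clock of a die follows a normal distribution; the mean, spread and target clocks are invented numbers, and `fraction_meeting_clock` is a hypothetical helper, not anything AMD has described.

```python
from statistics import NormalDist


def fraction_meeting_clock(mean_ghz, stdev_ghz, target_ghz):
    """Share of dies whose max stable clock is at least `target_ghz`.

    Assumes a normal distribution of per-die max clocks (an illustrative
    simplification, not measured data).
    """
    return 1.0 - NormalDist(mu=mean_ghz, sigma=stdev_ghz).cdf(target_ghz)


# Invented example: dies average 4.2 GHz with a 0.1 GHz standard deviation.
for target in (4.0, 4.2, 4.4):
    share = fraction_meeting_clock(mean_ghz=4.2, stdev_ghz=0.1, target_ghz=target)
    print(f"{target:.1f} GHz: {share:.1%} of dies qualify")
```

Under this assumption, a small shift in the rated clock changes the share of qualifying dies a lot, so the shipping frequency can only be fixed once the real distribution is known.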

IRobot23 · Nov 12, 2018

I don't think the chiplet design will offer lower latency. It might offer lower avg. latency (65-75ns) with TR, but with Naples and lower clocks probably 100ns.

Personally I don't need a PC for work anymore, mostly hobby... and games. So 144Hz-240Hz is all I want, and hopefully R3000 will be a good upgrade from the R7 1700.
