64 core EPYC Rome （Zen2）Architecture Overview？

Olikan · Nov 20, 2018

This patent seems to fit exactly what AMD explained about the Zen2 front-end changes...
https://patents.justia.com/patent/10127044

Trumpstyle · Nov 21, 2018

raghu78 said:
Firstly, you have not understood how AMD CPUs are designed. AMD's CPU physical design is targeted at achieving high clock frequencies, so plain TSMC N7 is not a choice at all. AMD Zen CPUs had a max turbo boost of 4 GHz. Zen+ CPUs had a max single-core turbo of 4.35 GHz. The mobile Ryzen 2700U had a max turbo of 3.8 GHz. AMD is likely to target max clock frequencies at least on par with or higher than Zen+ for their 7nm Zen 2. So the only option is N7 HPC. AMD has already confirmed that all their 7nm CPUs and GPUs are using N7 HPC. This was confirmed by Ashraf Eassa of The Motley Fool on Twitter, but Ashraf deactivated his Twitter account a couple of months back.

btw I was one of the earliest people to propose that the Rome IO die could have L4 cache. But I think the chances of that are slim to none. Firstly, for a significant amount of L4 cache (say 256 MB) AMD would need to go with 14HP and eDRAM. I think that process is not suitable for low-cost, high-volume designs. The Rome IO die needs to be low cost and low complexity, so it's most likely based on the mature GF 14LPP node. Moreover, if you look at the Zeppelin die and move all the IO and memory controller circuitry to a single die, you would end up quite close to the 420 sq mm die size. AMD has probably spent some die area to maintain some cache information about the data stored in the L3 of each chiplet, so that a chiplet can quickly look up whether some data is in the L3 of another chiplet. But that's about it.

AMD's 8-core chiplet die is the basic building block for all of its 7nm products: server CPUs, desktop CPUs, desktop/notebook APUs, and next-gen console APUs (PS5/XB2). BTW, AMD's move to chiplets is not only for servers. I expect almost every AMD design at 7nm to incorporate chiplets. AMD's move is very logical, as it's easier to yield smaller dies and you can match chiplets with similar characteristics to build SKUs across the product stack. The modularity and reusability of chiplets dictate that 8 cores is the right choice. Here is how I see the 7nm designs from AMD:

Rome - 8 x 8=64C, 8MC, 128 PCI-E 4.0 lanes
Threadripper - 8 x 8=64C, 8MC, 128 PCI-E 4.0 lanes
Ryzen - 2 x 8=16C, 2MC, 32 PCI-E 4.0 lanes
Ryzen APU - 1 x 8 = 8C + Navi GPU chiplet 20 CU + 4 GB HBM2 cache, 2MC, 32 PCI-E 4.0 lanes
PS5/XB2 - 1 x 8= 8C + Navi GPU chiplet 80 CU, 256 or 384 bit GDDR6.
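
The core counts in the list above all follow from the same "chiplets x 8 cores" rule. A minimal sketch of that arithmetic, assuming the chiplet counts speculated in the post (none of these are confirmed specs, and the names `LINEUP` and `total_cores` are made up for illustration):

```python
# Cores per chiplet, as proposed in the post (8-core chiplet as the building block).
CORES_PER_CHIPLET = 8

# Number of CPU chiplets per product, exactly as speculated in the list above.
LINEUP = {
    "Rome": 8,
    "Threadripper": 8,
    "Ryzen": 2,
    "Ryzen APU": 1,
    "PS5/XB2": 1,
}


def total_cores(chiplets: int) -> int:
    """Total CPU cores for a package built from identical 8-core chiplets."""
    return chiplets * CORES_PER_CHIPLET


core_counts = {name: total_cores(n) for name, n in LINEUP.items()}

for name, cores in core_counts.items():
    print(f"{name}: {LINEUP[name]} x {CORES_PER_CHIPLET} = {cores}C")
```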

Here is how I see AMD's Navi product stack

Ryzen 7nm APU - 20CU, 1280 sp.
Navi 12 - 40CU , 2560 sp, 128 bit GDDR6 or 256 bit GDDR5X.
Navi 10 (PS5 GPU) - 80CU, 5120 sp, 256 bit GDDR6.
Navi 20 - 120CU, 7680 sp, 384 bit GDDR6.
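
Every shader count in this list is the CU count times 64, the same ratio GCN uses. A quick sketch of that check, assuming Navi keeps 64 sp per CU (that is an assumption implied by the numbers above, not something AMD has stated here):

```python
# Stream processors per CU: the ratio implied by every entry in the list above.
SP_PER_CU = 64

# CU counts as guessed in the post; speculation, not announced products.
navi_guess = {
    "Ryzen 7nm APU": 20,
    "Navi 12": 40,
    "Navi 10": 80,
    "Navi 20": 120,
}

shader_counts = {name: cu * SP_PER_CU for name, cu in navi_guess.items()}

for name, sp in shader_counts.items():
    print(f"{name}: {navi_guess[name]} CU -> {sp} sp")
```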

I think Navi will be a good architecture and address long standing problems and drawbacks with GCN like scalability, perf and area efficiency, perf per CU, perf per sp. In fact I am optimistic because Sony is very aggressive with their PS5 graphics performance goals and Navi is heavily influenced by PS5's perf targets and design goals.

This post makes a lot of sense. I just want to add two things. We don't know if Navi 20 is real; from what I could find, it originated from the site Fudzilla. And we know that Navi 11 exists from a leaked roadmap; I assume Navi 11 is a mobile/desktop APU, as Vega11 was.

About Navi 12, I'm guessing you got that information from wccftech; we don't know how accurate that is.

Yotsugi · Nov 21, 2018

Trumpstyle said:
And we know that Navi 11 exists from a leaked roadmap; I assume Navi 11 is a mobile/desktop APU, as Vega11 was.

RR iGPU is called RAVEN.
Vega11 is a canned midrange part.

Hitman928 · Nov 21, 2018

Bondrewd said:
RR iGPU is called RAVEN.
Vega11 is a canned midrange part.

Vega11 was never announced or on any official roadmap as far as I know, just a bunch of rumors. The only thing that was ever official was the mobile line.

AMD said:
AMD Ryzen™ 5 2400G with Radeon™ RX Vega 11 Graphics

https://www.amd.com/en/products/apu/amd-ryzen-5-2400g

Yotsugi · Nov 21, 2018

Hitman928 said:
You should tell AMD that:

https://www.amd.com/en/products/apu/amd-ryzen-5-2400g

Marketing names have no relation to their die naming schemes.
VEGA10 is Vega56/64/WX8200/9100.
VEGA11 is ded.
RAVEN is RR iGPU bins.
VEGA12 is Vega16/20.
VEGA20 is Mi50/Mi60.

Hitman928 · Nov 21, 2018

Bondrewd said:
Marketing names have no relation to their die naming schemes.
VEGA10 is Vega56/64/WX8200/9100.
VEGA11 is ded.
RAVEN is RR iGPU bins.
VEGA12 is Vega16/20.
VEGA20 is Mi50/Mi60.

Edited my post above.

Yotsugi · Nov 21, 2018

Hitman928 said:
Edited my post above.

VEGA11 appeared on leaked ROCm roadmaps, together with everything else that made it out alive.

Hitman928 · Nov 21, 2018

Bondrewd said:
VEGA11 appeared on leaked ROCm roadmaps, together with everything else that made it out alive.

I couldn't find any such leaks outside of some super blurry rumor articles and it wasn't ever a part of AMD's ROCm presentations.

Yotsugi · Nov 21, 2018

Hitman928 said:
I couldn't find any such leaks outside of some super blurry rumor articles and it wasn't ever a part of AMD's ROCm presentations.

https://videocardz.com/65521/amd-vega-10-and-vega-20-slides-revealed
Good ole' roadmap.

Trumpstyle · Nov 21, 2018

Bondrewd said:
Marketing names have no relation to their die naming schemes.
VEGA10 is Vega56/64/WX8200/9100.
VEGA11 is ded.
RAVEN is RR iGPU bins.
VEGA12 is Vega16/20.
VEGA20 is Mi50/Mi60.

I think you're overthinking the code names.
Vega 10 = 64CU
Vega 20 = 64CU (refresh)
Vega 11 = 11CU desktop/mobile apu
Polaris10 = 36CU
Polaris11 = 16CU
Polaris12 = 10CU

So new Navi will look like this:
Navi10 = 80CU
Navi11= 11CU desktop/mobile apu
Navi12 = 40CU
Navi20= maybe exists but I think it will be a refresh/rebrand of Navi10.

It all makes perfect sense if the Wccftech leak is true.

Hitman928 · Nov 21, 2018

Bondrewd said:
https://videocardz.com/65521/amd-vega-10-and-vega-20-slides-revealed
Good ole' roadmap.

So a blurry picture on a rumor article. . .

Yotsugi · Nov 21, 2018

Trumpstyle said:
Vega 20 = 64CU (refresh)

That's no refresh, it's a very different product.

Trumpstyle said:
Vega 11 = 11CU desktop/mobile apu

Stop being stupid.
Go read kernel patches or something.

Trumpstyle said:
Navi20= maybe exists but I think it will be a refresh/rebrand of Navi10.

That's another HPC product aka Mi-Next.

Hitman928 said:
So a blurry picture on a rumor article. . .

Leaked roadmaps are no rumors.

Hitman928 · Nov 21, 2018

Bondrewd said:
Leaked roadmaps are no rumors.

Who says it's a leaked roadmap and not a fake? Did AMD tell you this or the internet?

Yotsugi · Nov 21, 2018

Hitman928 said:
Who says it's a leaked roadmap and not a fake? Did AMD tell you this or the internet?

Yes, AMD told me this by releasing literally everything sans VEGA11.

Hitman928 · Nov 21, 2018

Bondrewd said:
Yes, AMD told me this by releasing literally everything sans VEGA11.

Most of the boards in the pic were either already released or officially announced by the time this "leak" happened, so that's not saying much. Also, not everything was released, unless you can show me where to buy a Vega 10x2 which was supposed to release last year.

Lastly, look at the pictures in the roadmap. Clearly someone just used Paint or some other primitive photo editor to put new labels on pre-existing card pictures. I mean, look at Navi 10: you can still see the old FirePro label underneath "Navi 10". This was a fake, and a bad one at that.

Yotsugi · Nov 21, 2018

Hitman928 said:
unless you can show me where to buy a Vega 10x2 which was supposed to release last year.

Their shiny new Vega10-based dual GPU board is here.
https://www.amd.com/en/press-releas...raphics-card-delivers-accelerated-performance

Hitman928 said:
This was a fake, and a bad one at that.

That's called an "internal roadmap" and you surely don't even remotely know how this silly industry works.

Hitman928 · Nov 21, 2018

Bondrewd said:
Their shiny new Vega10-based dual GPU board is here.
https://www.amd.com/en/press-releas...raphics-card-delivers-accelerated-performance

Fair enough, I hadn't seen that announcement. It still doesn't change my opinion of the roadmap. AMD has been doing dual-GPU server/workstation boards for many generations now, so it wouldn't be that hard to predict.

Bondrewd said:
That's called an "internal roadmap" and you surely don't even remotely know how this silly industry works.

I work in this silly industry, so yes, I do know how it works. I have friends who have worked for AMD, Nvidia, and Intel. I've never seen an internal roadmap that looks like that. You can believe whatever you want, but I think to everyone else it's obviously a fake.

Yotsugi · Nov 21, 2018

Hitman928 said:
I work in this silly industry

Proof-me-up, sempai.

krumme · Nov 21, 2018

If I remember correctly, a Jaguar core including L2 is 3.5 mm2 on 28nm, or approx. 28 mm2 for 8 cores. That is a bit less than half of a Zen2 CPU cluster.
Now, I don't know if Sony or MS would have shifted some mm2 from GPU to CPU, but anyway 73 mm2 for the CPU seems too high a budget, even if you have 350 mm2 or so in total for a single SoC.
Also, if I remember correctly, 2 of those 8 Jaguar cores were dedicated to the system.
With Zen2, surely 1 core will cover that and then some. That means that when building with an 8C Zen2 cluster you can harvest dies with 7 working cores, which makes more economic sense.
Add a more mature process, be it 7nm+ or 5nm, and it seems this CPU comes within reach for use in a console.
I would just use the CPU chiplet as it is, off the shelf, and add another IO die specific to the consoles.
I feel the days of the single SoC for consoles are perhaps over.
The Lego age has begun.
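
For anyone who wants to check the area arithmetic in this post, here is a rough sketch. All inputs are the from-memory figures quoted above (3.5 mm2 per Jaguar core, 73 mm2 for the Zen2 CPU cluster, roughly 350 mm2 for the whole SoC), so treat the results as ballpark only; the variable names are made up for illustration.

```python
# Figures quoted from memory in the post above; approximate, not measured.
JAGUAR_CORE_MM2 = 3.5      # one Jaguar core incl. L2 on 28nm
JAGUAR_CORES = 8
ZEN2_CLUSTER_MM2 = 73.0    # CPU area budget mentioned in the post
SOC_BUDGET_MM2 = 350.0     # "350mm2 or so" total for a single SoC

# Area of the 8-core Jaguar CPU block.
jaguar_cluster_mm2 = JAGUAR_CORE_MM2 * JAGUAR_CORES

# How the Jaguar block compares to the Zen2 cluster ("a bit less than half").
jaguar_vs_zen2 = jaguar_cluster_mm2 / ZEN2_CLUSTER_MM2

# Share of the total SoC budget a 73 mm2 CPU block would take.
cpu_share_of_soc = ZEN2_CLUSTER_MM2 / SOC_BUDGET_MM2

print(f"8 Jaguar cores: {jaguar_cluster_mm2:.1f} mm2")
print(f"Jaguar / Zen2 cluster: {jaguar_vs_zen2:.2f}")
print(f"CPU share of SoC: {cpu_share_of_soc:.1%}")
```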

Hitman928 · Nov 21, 2018

Bondrewd said:
Proof-me-up, sempai.

I'm not going to dox myself so what kind of proof do you want?

Topweasel · Nov 21, 2018

Bondrewd said:
Yes, AMD told me this by releasing literally everything sans VEGA11.

Ok. Here is where the confusion is, so you can stop the name-calling.

AMD has internal code names for products. Vega 10 was the large Vega die that made it into production.

In the past, AMD tended to use numbers to separate big dies from little dies. But I haven't seen anything to lead me to believe that Vega 11 existed as a code name, or that, outside Vega mobile (can't remember its code name), AMD ever planned a "small" Vega. It was always a Fury replacement and never a replacement for Polaris.

AMD did change something with Vega: naming the shipping product by CU count. Vega 10 became Vega 64 and Vega 56. Vega 11 exists on fully featured Raven Ridge chips as the GPU portion of the die.

There is a Vega 11. It isn't and never was what you think it was.

Abwx · Nov 21, 2018

raghu78 said:
btw I was one of the earliest people to propose that the Rome IO die could have L4 cache. But I think the chances of that are slim to none. Firstly, for a significant amount of L4 cache (say 256 MB) AMD would need to go with 14HP and eDRAM. I think that process is not suitable for low-cost, high-volume designs. The Rome IO die needs to be low cost and low complexity, so it's most likely based on the mature GF 14LPP node. Moreover, if you look at the Zeppelin die and move all the IO and memory controller circuitry to a single die, you would end up quite close to the 420 sq mm die size.

8 cores + 16MB L3 take 44mm2 according to AMD's stated density improvement. If the L3 is extended to 32MB, it would require 52mm2 on the chiplet (and this would lead to 256MB of total L3). I don't know what the remaining 20mm2 are used for, as this seems like a lot for IF.

FTR, 256 MB would require 256mm2 if implemented in the I/O die, and even using 12nm wouldn't shrink it below 223mm2.
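
The per-MB costs implied by these numbers can be backed out with a few lines. This is only a sketch of the arithmetic, using the estimates from the post as inputs; the chiplet total is derived from "52mm2 plus 20mm2 remaining" rather than from any official figure.

```python
# Estimates quoted in the post (mm2); the poster's numbers, not measurements.
CHIPLET_8C_16MB_MM2 = 44.0     # 8 cores + 16MB L3 on 7nm
CHIPLET_8C_32MB_MM2 = 52.0     # 8 cores + 32MB L3 on 7nm
REMAINING_MM2 = 20.0           # area the post cannot account for
IO_DIE_256MB_14NM_MM2 = 256.0  # 256 MB in the I/O die on 14nm
IO_DIE_256MB_12NM_MM2 = 223.0  # same cache on 12nm
CHIPLETS = 8

# Extra area per extra MB of L3 on the chiplet.
l3_mm2_per_mb_7nm = (CHIPLET_8C_32MB_MM2 - CHIPLET_8C_16MB_MM2) / (32 - 16)

# Total L3 across the package if every chiplet carries 32MB.
total_l3_mb = CHIPLETS * 32

# Chiplet size implied by "52mm2 used + 20mm2 remaining".
implied_chiplet_mm2 = CHIPLET_8C_32MB_MM2 + REMAINING_MM2

# Per-MB cost in the I/O die and the 14nm -> 12nm shrink factor implied above.
l3_mm2_per_mb_14nm = IO_DIE_256MB_14NM_MM2 / 256
scale_12nm = IO_DIE_256MB_12NM_MM2 / IO_DIE_256MB_14NM_MM2

print(f"7nm L3 cost: {l3_mm2_per_mb_7nm} mm2/MB, total L3: {total_l3_mb} MB")
print(f"implied chiplet size: {implied_chiplet_mm2} mm2")
print(f"14nm cost: {l3_mm2_per_mb_14nm} mm2/MB, 12nm factor: {scale_12nm:.3f}")
```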

jpiniero · Nov 21, 2018

Going to 256-bit units is going to increase the core size a bit.

Tuna-Fish · Nov 21, 2018

Abwx said:
8 cores + 16MB L3 take 44mm2 according to AMD's stated density improvement. If the L3 is extended to 32MB, it would require 52mm2 on the chiplet (and this would lead to 256MB of total L3). I don't know what the remaining 20mm2 are used for, as this seems like a lot for IF.

Density improvement is different for logic and SRAM, and SRAM shrank a lot more than logic. Based on published numbers, a high-density SRAM bitcell should take 0.42x the space on TSMC 7nm of what one took on a GF 14nm process.
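
To put the 0.42x figure in concrete terms, here is a tiny sketch. The 16 mm2 input is a made-up example size, not a number from AMD or TSMC, and the factor covers bitcells only; a real cache also has tags, control logic and wiring that shrink less.

```python
# Bitcell area scaling from GF 14nm to TSMC 7nm, as quoted in the post above.
SRAM_BITCELL_SCALE = 0.42


def scaled_sram_area(area_14nm_mm2: float) -> float:
    """Bitcell-only area estimate on TSMC 7nm for an array sized on GF 14nm."""
    return area_14nm_mm2 * SRAM_BITCELL_SCALE


# Hypothetical example: a 16 mm2 bitcell array on 14nm (illustrative size only).
example_7nm_mm2 = scaled_sram_area(16.0)
print(f"16.0 mm2 on 14nm -> {example_7nm_mm2:.2f} mm2 on 7nm (bitcells only)")
```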

Atari2600 · Nov 21, 2018

ub4ty said:
I'd love to reduce my current systems down to one massively contained solution.

Nice until something breaks. Then you're down rather than operating at half capacity.
