
Speculation: Ryzen 4000 series/Zen 3

Page 49 - AnandTech community

Topweasel

Diamond Member
Oct 19, 2000
5,327
1,525
136
Laptop is the smallest worry for AMD at this point. Intel has allowed them to push TDP in the server space quite a bit. Efficiency matters because density matters to AMD. But having removed themselves from any kind of 4S space, density isn't even a worry unless they are shooting for 2S 128c 1U systems at little to no performance loss. Clocks stay about even, maybe increasing or decreasing slightly depending on how the process develops or on tweaks to the uarch for increased clocks.

Then we look at Zen vs. Zen+. We didn't get a new EPYC even though we got a new TR. If anything, the improvements in Zen+ should have worked more wonders on the efficiency side and allowed a dramatic increase in clocks on fully spec'ed 32c EPYCs. Here we have no additional cores, maybe better efficiency, stagnant clocks, yet there is a Milan on the roadmap.

There is a balance here. But performance must increase. Otherwise, if they didn't make a Zen+ EPYC even though TR2 used the same tooling they would use for EPYC, then they wouldn't be pushing out a new EPYC chip now. These can't all be right. Either Zen 3 has higher IPC (enough of an impact on sales to justify the validation), more cores, or increased clocks (with enough impact on sales to make it worth the validation).
 
  • Like
Reactions: spursindonesia

uzzi38

Golden Member
Oct 16, 2019
1,166
1,972
96
There is a balance here. But performance must increase. Otherwise if they they didn't make a Zen+ Epyc even thought they used the same tooling they would use for EPYC on TR2. Then they wouldn't be pushing out a new EPYC chip. All these can't be right. Either Zen 3 has higher IPC (enough to impact sales enough to do the validation), more cores, or increased clocks (with enough impact in sales to make it worth validation).
Erm, just a quick point, but a Zen+ EPYC would only be different in process node. The 'IPC' gain of 3% that Zen+ Ryzen had was because of a change that made it to both Threadripper and EPYC, but not Ryzen afaik.

Main reason there wasn't a Zen+ EPYC was because there basically was nothing else to offer.
 

Topweasel

Diamond Member
Oct 19, 2000
5,327
1,525
136
Erm, just a quick point, but a Zen+ EPYC would only be different in process node. The 'IPC' gain of 3% that Zen+ Ryzen had was because of a change that made it to both Threadripper and EPYC, but not Ryzen afaik.

Main reason there wasn't a Zen+ EPYC was because there basically was nothing else to offer.
Yeah, that's kind of the point: Zen 3 has to offer EPYC something for it to be worth it.

The 3-6% boost was in max clocks. But there may actually have been a larger increase in base clocks for EPYC, considering how well Zen's efficiency improves as you lower clocks. Let's say that all-core boost clocks increased 7-8% under Zen+, plus them pushing more than 180 W through the socket (like they did with TR2). So again I point to the lack of a Zen+ EPYC as somewhat of a barometer of what to expect with Zen 3. There has to be something halfway decent on the server end, and efficiency can be part of it. But it still has to be pretty decent for AMD to make Milan rather than just doing a desktop/TR part and eventually a laptop part.

Also, the fact that Renoir is Zen 2 and not Zen 3 should tell us something. If prioritizing laptops were a major push for Zen 3, then we wouldn't have to wait 6-8 months (or more) for the crown of the development. The priority for AMD in design is really obvious: Server>Desktop>HEDT>Laptop. Most of that is because of the lag time on the APUs, and an eventual move to MCM will help that, but server still comes first and the rest of the lineup priority follows the shared component tree. A laptop chip doesn't have any shared components.
 

uzzi38

Golden Member
Oct 16, 2019
1,166
1,972
96
Yeah kind of the point there, Zen 3 has to offer Epyc something if it was worth it.

The 3-6% boost was in max clocks. But there may have actually been a larger increase on base clocks for EPYC considering how well Zen's efficiency increases as you lower clocks. Lets say that that all core boost clocks increased 7-8% under Zen+, plus them pushing more then 180w through the socket (like they did with TR2). So again I point to the lack of a Zen+ EPYC as somewhat of a barometer of what to expect with Zen 3. There has to be something halfway decent on the Server end and efficiency can be part of it. But its still has to be pretty decent for AMD to make Milan rather then just doing a Desktop/TR and eventually a laptop.

Also the fact that Renoir is Zen 2 and not Zen 3 should also tell us something. If prioritizing laptops was a major push for Zen 3. They we wouldn't have to wait 6-8 months (or more) for crown of the development. Priority for AMD in design is really obvious. Server>Desktop>HDET>Laptop. Most of that is because of the lag time on the APU's and a eventual move to MCM will help that, but still Server comes first and the rest of the lineup priority follows the shared component tree. Something that a laptop chip doesn't have any shared components.
APUs won't move over to MCM for a while still. Not viable for multiple reasons.

Also, there's no need for Zen 3 next year. Renoir will be competitive without it - far more than Picasso was.
 

amrnuke

Senior member
Apr 24, 2019
886
1,152
96
It is a speculation thread, doesn't this seem overly harsh?
He said "YT channel Moors laws not dead was right about Zen2 TR timing so we can expect SMT4 for Zen3 too. I would say the probability raised from 60% to 80% now."

That's not really speculation, he is judging the merits of an incorrect prediction to be somehow valid. It makes no sense. I suggested a self-ban half-jokingly.

I don't know enough to know the art of possible, but it isn't outside my capabilities to judge deliveries. Although nVidia was fond of referring to cores as either "low latency" or "high throughput", it's been interesting to watch AMD add throughput to their low latency cores, at a relative (to competition) cost of latency (for low core-count use-cases). Zen3 is not a higher core-count product, so either they're working on increasing their throughput in other ways (in which case something like what Rich has described isn't complete insanity), or they're going to tackle latency (in which case it's a distraction). From what I've gleaned, SMT4 was something looked at, but not adopted :shrug:
Correct. SMT4 is "possible". But it is largely rejected because AMD already tried this before (Greyhound) and couldn't get power consumption under control. That's a big issue. Plus you have to redesign so much on the front end. Overall resource overhead is very costly. The branch predictor, TLBs, register file and so on have to be shared among 4 threads instead of just 2, and in order to avoid bottlenecks you have to restructure and expand all of that. Base power consumption is going to be higher, and overall power draw and (more importantly, given Zen2 heat sensitivity) heat density are going to be higher. Also, more threads almost certainly means more cache misses requiring a trip back to memory. The overall effect of going to SMT4 is higher energy consumption, higher heat, and slower peak per-thread performance, in order to produce marginally improved overall capacity (~25% gain over SMT2 for doubling threads per core from 2 to 4). The issue is that unless your core is stalled/overloaded there will be no benefit, just higher resting power consumption.

Given that SMT4 may produce benefits in the server workspace, I think it's silly to rule it out entirely for the server marketplace. But given AMD's focus on IPC gains I really fail to see why they would even consider SMT4 in the mainstream / HEDT space given the (likely) worsened per-thread performance.
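
To put rough numbers on the per-thread point above, here is a minimal sketch. It assumes the ~25% figure means total core throughput rises to 1.25x going from SMT2 to SMT4, and that a fully loaded core splits its throughput evenly among threads. Both are simplifying assumptions for illustration, not measurements.

```python
def per_thread_share(total_throughput: float, threads: int) -> float:
    """Average throughput each thread sees if the core's total is split evenly."""
    return total_throughput / threads

# Normalise the SMT2 core's total throughput to 1.0 (illustrative baseline).
smt2_total = 1.0
# Assumption taken from the post: SMT4 adds roughly 25% on top of SMT2.
smt4_total = smt2_total * 1.25

smt2_per_thread = per_thread_share(smt2_total, 2)
smt4_per_thread = per_thread_share(smt4_total, 4)

# Per-thread throughput at SMT4 relative to SMT2 with every thread busy.
ratio = smt4_per_thread / smt2_per_thread
print(f"per-thread throughput at SMT4 vs SMT2: {ratio:.3f}x")
```

Under those assumptions each thread gets about five eighths of what it would get on an SMT2 core, which is the "worsened per-thread performance" in a fully loaded case.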
 

dnavas

Senior member
Feb 25, 2017
304
154
86
Yeah kind of the point there, Zen 3 has to offer Epyc something if it was worth it.
Yeah, that's how I read your post first time too :)
Server wouldn't bother shipping something 5% better -- there's something there there.
Mind you, doesn't have to be strictly IPC. There are features for high-end servers that are as important -- security, error-detection/correction, etc. I've had to debug two post-error-correction bit flips -- it isn't fun. I mean, it kind of is, but, :ugh: sort of scary.

If prioritizing laptops was a major push for Zen 3....
Yep, APU is clearly not being prioritized. I'm really more than ok with that -- Intel's focus on laptop efficiency (which I presume they did to feed Apple) has been more than just a sore point with me. I had just assumed it was part of "the plan."

See Anandtech's review on the 3950X, for most people the bandwidth is fine. If you need more, then fear not, as AMD will without a doubt work on IF further. They have to if they want to get it to EMIB level of power efficiency. With Zen 2 they hugely improved bandwidth and reduced power per bit by 27%, there'll be more with Zen 3.
Yes, that's about how I see it as well. 3950x is at the high-end of balanced. That implies to me that further work on widening ALU throughput is contra-indicated absent other improvements. I also agree that some of that "work brought forward in Zen2" was the IF work -- don't forget the asynchronous work they did as well as bandwidth and power efficiency! I assume there will be more changes and improvements along those lines.

The reported reorganization of L3 is interesting, and it's possible that, if done well, marginal changes in L3 might contribute more than marginal increases in cache size or memory bandwidth. I think there are changes in memory model that can improve uncontended parallel access. ARM (which has been referenced in this thread before :>) has a weaker memory model, though not as weak as Alpha's. There's only so much you can do without changing the programming model so much that you invalidate your codebase (and as all of the OOO execution interactions with caching have shown, speculative optimization is fraught), but, as I say, it'll be interesting to see just how far they go and where they choose to tweak.
 
  • Like
Reactions: lightmanek

Topweasel

Diamond Member
Oct 19, 2000
5,327
1,525
136
APUs won't move over to MCM for a while still. Not viable for multiple reasons.

Also, there's no need for Zen 3 next year. Renoir will be competitive without it - far more than Picasso was.
Well, I question "a while", but I do get that AMD's current implementation of MCM does not work in their favor. I also think Renoir, especially if it has 8 cores, is going to be fantastic. My point was in response to the Zen 3 development. Prioritizing mobile performance within the uarch, outside its benefit to server power usage, doesn't make sense as a primary development angle for Zen 3. You aren't prioritizing laptops if it's going to be almost a year after the rest of the lineup before you see the payoff. If anything, what I think would happen is that development branches between mobile and server uarch design, and mobile absorbs desktop as it closes in on core count, a la SL/SL-X.
 

DrMrLordX

Lifer
Apr 27, 2000
16,506
5,482
136
@DrMrLordX: Are you psychopath?
Sure pal, sure. Good on everyone else for being able to converse around this . . . response.

You try piece by piece to prove that my prediction of Zen3 being new architecture was wrong.... day after AMD's server leader Norrod confirmed that Zen3 is actualy complete new architecture. It's not my fault you cannot read between the lines.
Norrod said exactly nothing. You're grasping at straws that do not even exist, per @amrnuke 's post.

TR2 is TR based on Zen2.
Only in your world.

Threadripper = 1950X and family
Threadripper2 = 2990WX and family/2950X and family
Threadripper3 = 3970X and family, later to be joined by the 3990X and who-knows-what-else

You are blind by hate.
Not really. I personally don't want to jump down the same rabbit hole. Watching you push the same rumours over and over again is also a bit tiresome.

YT channel Moors laws not dead was right about Zen2 TR timing
Again, that is totally not accurate. Six days from now is not 2020. Your argument has been Terminated! Hasta la vista, baby.
 

Veradun

Senior member
Jul 29, 2016
510
538
136
No, he's essentially pointing out that the Zen1->Zen2 increase will not happen.


Zen1 left some low hanging fruit on the table, which was implemented in Zen2; giving Zen2 a bigger than average boost. I'm not pissing on Zen3, restating the reasonable expectations set out by Norrod.
No, he says that Zen2 saw a bigger than usual gain for an ***evolutionary*** upgrade and that Zen3 is not evolutionary.
 

Veradun

Senior member
Jul 29, 2016
510
538
136
Well, Ryzen 4700X is going to knock the i9 9900k off the top step in ST and widen the lead in MT. It’s going to be a proper bashing. And, as an aside, I hope the new chipset for next year runs cooler so that the fans can be dropped. Kinda wish my PC wasn’t dying - I’d squeeze a bit more life out of it for a chance at Zen3.
I'm ok with B550 boards having a PCIe3 CPU-to-PCH link and a PCIe4 x16 slot and M.2 slot.
 

Topweasel

Diamond Member
Oct 19, 2000
5,327
1,525
136
No, he says that Zen2 saw a bigger than usual gain for an ***evolutionary*** upgrade and that Zen3 is not evolutionary.
Yeah, it's easy to see why some people read what he says as "don't expect so much because Zen 2 was so much better than one could hope for" before talking about Zen 3. But he is saying that.

Zen 1 - Zen 2 was a larger upgrade than one would expect, considering the lack of change in architecture.
Zen 2 - Zen 3 is an actual architecture change (he says from the ground up, but I assume it really means taking the Zen design elements and applying different design philosophies to them).

In the end I could be wrong; it might be better to compare this to something like Sunny Cove vs. Ice Lake and such: Zen is more the complete architectural design (modular and scalable, using IF as a building block), while the heart of Zen 3 could be a completely new uarch.

Also something else to consider: Zen in all its glory is a server core design. As AMD's reach into the data center becomes larger, we shouldn't always assume that their "performance increase" means our performance increase. We sit back with SL-X and question its design: lower IPC than SL, heavy investment in AVX-512. For the companies that use AVX-512, that work pays huge dividends, even if 90% of the users of the CPU don't use the functionality and for all intents and purposes it is just wasted space.
 

amrnuke

Senior member
Apr 24, 2019
886
1,152
96
No, he says that Zen2 saw a bigger than usual gain for an ***evolutionary*** upgrade and that Zen3 is not evolutionary.
What's more surprising is that he considers the chiplet design, moving to 7nm, and a 15% IPC gain to be evolutionary and not revolutionary.

This says to me that they consider Zen to have been revolutionary, and nothing since. If Zen3 is going to be revolutionary, it'll be interesting to see what they come up with.
 

Thunder 57

Golden Member
Aug 19, 2007
1,547
1,457
136
Some people were expecting that Zen3 will be something new.
There was several hints like:
  1. chief architect eng is Mike T. Clark (who was leading Zen1 - there is no better person to lead big challenges of new uarch)
  2. AMD Zen2 optimization document revealing Zen2 is last 17h Family... suggesting Zen3 will be new Family number, new uarch
  3. Zen3 is coming way early after Zen2 .... suggesting parallel development rather than successor
  4. process node 7nm EUV which is way too complicated when compatible and easier and cheaper nodes like N7P or N6 are available (only completely new uarch makes sense from economy point of view, especially with easy direct path to 5nm EUV).
  5. Lisa Su mentioned after Zen2 launch that she doesn't leave AMD to IBM because the best will come in future.
  6. Lisa Su mentioned AMD is focusing on architecture development more than on process node.
  7. Apple A12/A13 Vortex CPU core with 6xALUs showed that there is huge potential for IPC gain going wider core
  8. first Zen3 leak on May from YT Moors Law Dead, suggesting Zen3 is SMT4, and Zen2 Treadripper about end 2019/early 2020 and maybe canceled in favor of Zen3 while AMD concentrating Zen2 chiplets for server EPYC.... a lot of haters verbally attacked this guy however his sources was right, TR2 will have 2020 availability rather than promised Oct. Honestly I wouldn't be surprised Zen3 will have SMT4 feature as his source was right.
  9. SPECint2006 hints some possibility of future IPC gain:
    • Zen1 - 39.07 (4.1GHz? -> 9.53 pts/GHz)
    • Zen2 - 49.02 (4.5GHz? -> 10.89) .... +14% over Zen1
    • 9900K - 54.28 (5GHz -> 10.86)
    • A10 - approx 28? (2.3GHz -> 12.17) ... +27% over Zen1
    • A11 - 36.93 (2.4 GHz -> 15.38) ... +61% over Zen1 ... +26% over A10 (1st gen 6xALU CPU)
    • A12 - 44.92 (2.5 GHz -> 17.98) ... +89% over Zen1 ... +48% over A10 (2nd gen 6xALU CPU) ... +17% over A11
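
The pts/GHz and percentage figures in the list above can be re-derived with a few lines. This is only a sketch that reuses the numbers exactly as listed; the clocks marked "?" and the "approx" A10 score are the original poster's estimates, and the helper names are made up for illustration.

```python
# (SPECint2006 score, clock in GHz) as given in the list above.
scores = {
    "Zen1":  (39.07, 4.1),
    "Zen2":  (49.02, 4.5),
    "9900K": (54.28, 5.0),
    "A10":   (28.0, 2.3),
    "A11":   (36.93, 2.4),
    "A12":   (44.92, 2.5),
}

def pts_per_ghz(name: str) -> float:
    """Score normalised by clock, a crude stand-in for IPC."""
    score, ghz = scores[name]
    return score / ghz

def gain_pct(name: str, baseline: str) -> float:
    """Percentage gain in pts/GHz of `name` over `baseline`."""
    return (pts_per_ghz(name) / pts_per_ghz(baseline) - 1.0) * 100.0

for name in scores:
    print(f"{name:6s} {pts_per_ghz(name):6.2f} pts/GHz  "
          f"{gain_pct(name, 'Zen1'):+6.1f}% vs Zen1")
```

Recomputed values can differ from the list in the last digit, since some of the listed figures appear to be truncated rather than rounded.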


Just haters are so blind they cannot read between the lines....
Funny how all haters are surprised or silent now :D
There is video evidence that Zen 3 is SMT2. You: Zen is SMT4.

@DrMrLordX: Are you psychopath? You try piece by piece to prove that my prediction of Zen3 being new architecture was wrong.... day after AMD's server leader Norrod confirmed that Zen3 is actualy complete new architecture. It's not my fault you cannot read between the lines.

The same is Soresu. TR2 is TR based on Zen2. I like numbering in proper order, single digit is uarch generation. Jesus Christ, I'm not wonder why you guys cannot put all the puzzles toghether. You are blind by hate.

And last thing. YT channel Moors laws not dead was right about Zen2 TR timing so we can expect SMT4 for Zen3 too. I would say the probability raised from 60% to 80% now.
Are you a buffoon? You can't even get your TR generations right. Name calling isn't so nice, now is it? BTW, I only asked if you were a buffoon because of the psychopath comment and to try to make a point. Can we try to be a bit more civil?

Also, you are not some genius that can put puzzle pieces together surrounded by bozos who can't tie their own shoes. Also, @DrMrLordX is rather kind to AMD, certainly not a hater.

It is a speculation thread, doesn't this seem overly harsh?
Doesn't calling someone a psychopath and implying we are all idiots seem a bit harsh?
 

dnavas

Senior member
Feb 25, 2017
304
154
86
Doesn't calling someone a psychopath and implying were all idiots seem a bit harsh?
Drat, I guess I did that to myself. Yes, many people have said terrible things to many other people in this sub-thread. I prefer to avoid wrestling with pigs, though perhaps in this instance it might have looked like excusing behavior I did not. My apologies.
 

Thunder 57

Golden Member
Aug 19, 2007
1,547
1,457
136
Drat, I guess I did that to myself. Yes, many people have said terrible things to many other people in this sub-thread. I prefer to avoid wrestling with pigs, though perhaps in this instance it might have looked like excusing behavior I did not. My apologies.
No need to apologize. I didn't mean to call you out or anything. I just want to see more civility like I said. When you have a user that gets upset when people disagree with their predictions, it gets annoying. It's a speculation thread. One's opinion is not fact.
 
  • Like
Reactions: dnavas

moinmoin

Golden Member
Jun 1, 2017
1,949
2,229
106
The only promise I can remember being given related to AM4's 2020 CPU compatibility.
While that's true I feel the general expectation was that AMD would only change socket compatibility for any of them once forced by DDR5. This still appears to be true for AM4 and SP3.

What's more surprising is that he considers the chiplet design, moving to 7nm, and a 15% IPC gain to be evolutionary and not revolutionary.
They are talking about architectural changes being more important for them than nodes. Aside from the improvements from going to 7nm and chiplets, as well as evolutionary improvements in the core design, Zen 2 was mostly an architectural re-organisation, moving the uncore onto a dedicated die.

The reported reorganization of l3 is interesting, and it's possible that if done well, marginal changes in l3 might contribute more than marginal increases in cache size or memory bandwidth.
Compared to other CPUs, Zen chips have an insane amount of l3$. Its use is not particularly efficient, its primary purpose supposedly being to hide the latency induced by the architectural topology.

A 64-core Epyc has a whopping 256MB of l3$. But that huge amount is split into 64 x 4MB per-core slices, which form 16 x 16MB per-CCX groups, which in turn form 4 x 64MB groups, each close to a dual-channel IMC.

The way Zen 2 unified several NUMA and PCIe root complexes into one IOD, Zen 3 may explore ways to unify the huge l3$ into something that's actually (write-)accessible as a whole for every core.
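
For anyone who wants to check the split described above, a small sketch of the arithmetic follows. The totals come from the post; the even grouping of four CCXs per IMC quadrant is an assumption made here so that the numbers line up.

```python
TOTAL_L3_MB = 256       # 64-core Epyc, as stated in the post
CORES = 64
CORES_PER_CCX = 4
CCX_PER_QUADRANT = 4    # assumption: 16 CCXs spread evenly over 4 IMC groups

slice_mb = TOTAL_L3_MB // CORES               # per-core slice
ccx_count = CORES // CORES_PER_CCX            # number of CCXs
ccx_mb = slice_mb * CORES_PER_CCX             # L3 shared inside one CCX
quadrants = ccx_count // CCX_PER_QUADRANT     # groups near a dual-channel IMC
quadrant_mb = ccx_mb * CCX_PER_QUADRANT       # L3 sitting near each IMC

print(f"{CORES} x {slice_mb}MB slices, {ccx_count} x {ccx_mb}MB CCXs, "
      f"{quadrants} x {quadrant_mb}MB quadrants")
```

The point of the breakdown is that the 256MB headline figure is never available to a single core as one pool.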
 

jamescox

Member
Nov 11, 2009
138
254
136
While that's true I feel the general expectation was that AMD would only change socket compatibility for any of them once forced by DDR5. This still appears to be true for AM4 and SP3.


They are talking about architectural changes being more important for them than nodes. Aside improvements from going 7nm and chiplets as well as evolutionary improvements in the core design, Zen 2 was mostly an architectural re-organisation, moving the uncore onto a dedicated die.


Compared to other CPUs Zen chips have an insane amount of l3$. Its use is not particularly efficient, it primary purpose supposedly being hiding latency induced by the architectural topography.

A 64 cores Epyc has a whopping 256MB of l3$. But that huge amount is split into 64 x 4MB per core slices, into 16 x 16MB per CCX groups, into 4 x 64MB close to a dual channel IMC each.

The way Zen 2 unified several NUMA and PCIe root complexes into one IOD, Zen 3 may explore ways to unify the huge l3$ into something that's actually (write-)accessible as a whole for every core.
I don’t think it will be directly accessible except within the core cluster, but it will be a single core cluster per cpu die. That is a big change. Current Zen2 only allows direct access to the 16 MB by the 4 cores in each CCX even though they have 256 MB total on 64-core Epyc.

That isn’t important for many well threaded applications, but some benefit from a much larger, unified last level cache. For Zen 3, they are talking 32 MB to start and possibly larger for dedicated server chips. Keep in mind that current top end Xeons have 38.5 MB. If AMD has Zen 3 with more than 32 MB per CCX, then that is yet another area where Epyc will take the performance crown.

I am currently expecting that the chiplet architecture will remain mostly unchanged. They may make another spin of the IO dies to take advantage of power savings on whatever is GlobalFoundries' latest low power process. They must have completely reworked the cache architecture at a minimum though. Hopefully they have increased cache bandwidth and possibly increased AVX throughput again. I don’t really expect the basic ALU set-up to change. That would require completely reworking the scheduler, and the scheduler would possibly grow in size and complexity significantly (non-linear with the number of units). I would expect mostly cache improvements, but that includes things like prefetch and trace caches and such. They may have made a bunch more tweaks in other places also. I don’t expect a radical redesign except for the caches though. I had expected them to stick with the 4 core cluster for a while yet, but it seems 7nm+ must make the 8 core CCX reasonable.

For mobile, I expect that they will release an APU marketed as Ryzen 4000, but it may be roughly Zen 2 based with some improvements. Note that as far as cpu cores go, mobile and server aren’t really that different. They both prioritize power consumption more than desktop parts. It is possible that they will leapfrog a generation (mobile from amd has been a generation behind desktop and server) and make Ryzen 4000 APUs actually based on Zen 3 architecture. They have already said that they are focusing on efficiency with Zen 3.
 

jamescox

Member
Nov 11, 2009
138
254
136
They have been clear in stating the Zen 2 and 3 and so on were distinct architectures rather then some tweaks here or there or a process change. Not all of that means major changes to the core design. Zen 2 didn't change much about how the core worked compared to Zen 1 but the rest of the architecture had massive changes to how the dies connected to each other and the rest of the system. Zen 3 isn't going to be built from the ground up, it will probably look like Zen for the most part. But I look at it this way.

Zen 1 was about creating a competitive die and connecting them.
Zen 2 was about taking the dies and perfecting the connection for balanced and quick performance.
Zen 3 will probably about designing the core arch around the idea of using the now perfected connection. Probably room for improvement if the cores, ccx's (if they are needed) and CDD's are designed with IO die and IF multi-die connectivity from the beginning.
Zen 1 was one die to rule them all. It was the same die from a $100 part all of the way up to Epyc processors costing thousands of dollars. It is very expensive to tape out a design, so one modular die was incredibly efficient design wise. It had quite a bit of waste in actual silicon though. Every die had a massive infinity fabric router with many ports and 4 IFOP links that went completely unused in AM4 parts. It also increased latency over directly connected memory.

Zen 2 fixes that. With the cores separated from the IO, the desktop part doesn’t really have anything that it doesn’t need. They can make specialized IO dies for different markets at a more reasonable cost. The cost to tape out a 7 nm chip is going to be high, especially with EUV in the next generation. The IO die designs can be very modular; the desktop IO die is essentially 1/4 the Epyc IO die. The ThreadRipper IO die is 1/2 the Epyc IO die (if it is different). Taping out a chip on 14 or 12 nm presumably costs a lot less than the 7 nm die. They only make one kind of 7 nm cpu die, the same cpu chiplet that gets used everywhere, so they have really only needed to do one 7 nm tape out for Zen 2. I have seen some things indicating that the ThreadRipper IO die is actually not a salvaged Epyc IO die; I am not sure if that is true yet though. That would make sense if they are going to make a push into the workstation market with a ThreadRipper variant. While 14 nm is cheaper than 7 nm, the Epyc IO die is still a lot of silicon. People forget that it is near the size of a high end gpu on 14/12 nm, which people are still “happily” paying Nvidia a lot of money for. There are still parts being made on much older processes for things where performance isn’t critical. 14/12 nm still has some demand.

Since AMD has the cash flow, we are going to see more and more specialized dies from them. This generation it seems to mostly be the IO die. We probably have a single chip APU coming soon also, but it is unclear if that is going to be Zen 2, Zen 3, or something in between. With the next generation, we may see an even higher variety of IO dies. We are probably going to get some specialized cpu dies in that generation also. Perhaps multiple cache size variants; we already expect at least 2 given the 32+ MB comment, but there could be more for really high end designs, like some of their supercomputer design wins. I am curious as to whether there will be a smaller 16 MB variant for lower end desktop also. They may fill out the lower end with a really powerful 8 core APU instead. That could have lower memory latency with everything on one die.
 

moinmoin

Golden Member
Jun 1, 2017
1,949
2,229
106
I don’t think it will be directly accessible except within the core cluster, but it will be a single core cluster per cpu die. That is a big change. Current Zen2 only allows direct access to the 16 MB by the 4 cores in each CCX even though they have 256 MB total on 64-core Epyc.
We don't know the changes in Zen 3 yet.

The current state up to Zen 2 is that the cores have read access to the whole l3$, but obviously the latency is by far the best within a CCX. Write access on the other hand is currently only to each core's own 4MB slice, even within a CCX. The result is the potential for a lot of redundant data in l3$ across the package.

@DisEnchantment posted several cache-related patents that may point to transparent cache management that could allow cores write access to virtually the whole l3$, with the system (SCF) preemptively moving the data to where it's expected to be required, to reduce the inherent latency. For this to work, the current per-core branch predictor (TAGE, introduced in Zen 2) would get a much bigger role on a system level, potentially predicting and (as local branch predictor) managing loads per core and moving cache data across the package accordingly. These changes would need bigger changes on an architectural level than before, which may be what Norrod was talking about.
 

naukkis

Senior member
Jun 5, 2002
365
208
116
The current state up to Zen 2 is that the cores have read access to the whole l3$, but obviously the latency is by far the best within a CCX. Write access on the other hand is currently only to each core's own 4MB slice, even within a CCX. The result is the potential for a lot of redundant data in l3$ across the package.
A CCX has access to its own L3 only. When an L3 miss happens, the memory controller is queried and serves the memory request. And write access is to the whole CCX L3; data is interleaved on the low-order address bits across every L3 slice. AMD doesn't even try to keep a core's own data in its own L3 slice.
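
A toy illustration of what low-order-bit interleaving means in practice. This is a sketch only: the 64-byte line size, the four slices, and the simple modulo mapping are all assumptions for illustration, and the actual hash AMD uses is not stated in this thread.

```python
LINE_BYTES = 64   # assumed cache-line size
NUM_SLICES = 4    # assumed: one L3 slice per core in a 4-core CCX

def l3_slice(addr: int) -> int:
    """Hypothetical mapping: pick the slice from the lowest address bits
    above the cache-line offset."""
    return (addr // LINE_BYTES) % NUM_SLICES

# Consecutive cache lines touched by a single core rotate over all slices,
# so no slice ends up "owned" by the core that wrote the data.
base = 0x1000
slices = [l3_slice(base + i * LINE_BYTES) for i in range(8)]
print(slices)
```

With a mapping like this, which slice holds a line depends only on the address, not on which core in the CCX wrote it.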
 
  • Like
Reactions: moinmoin

inf64

Diamond Member
Mar 11, 2011
3,066
1,789
136
My expectation for Zen3:
- IPC uplift ~20% (based on Norrod's comments this will be a true generational leap, as the core is based on a new uarchitecture; there will be radical changes in the pipeline structure, reduced latencies for instructions, increased ALU/FPU/AGU counts, and other structures enlarged to accommodate the changes)
- clocks 5-7% higher
- core count 25% higher (10 chiplets x 8 cores; CCX is 8 cores sharing huge L3 cache)
- power draw roughly the same or slightly higher than Rome
- *possibility* of SMT4, not excluding it yet

This will put a stop to any gain Tigerlake/Icelake will make over the SKL/X core. I think that the only advantage Intel will have is going to be AVX512 and that's it. Everything else will be roflstomped by Milan.
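
For what the expectations above would compound to, here is a rough sketch. It assumes the IPC, clock, and core-count gains multiply cleanly and that throughput scales perfectly with core count, which real workloads rarely do, so treat the result as an upper bound on the guess rather than a prediction.

```python
ipc_gain = 1.20                  # ~20% IPC uplift from the list
clock_gain = (1.05, 1.07)        # 5-7% higher clocks
core_gain = (10 * 8) / 64        # 10 chiplets x 8 cores vs. 64 cores today

# Single-thread estimate: IPC times clock.
per_core = tuple(ipc_gain * c for c in clock_gain)
# Whole-socket estimate, assuming perfect scaling across cores.
per_socket = tuple(p * core_gain for p in per_core)

print(f"per-core:   {per_core[0]:.2f}x to {per_core[1]:.2f}x")
print(f"per-socket: {per_socket[0]:.2f}x to {per_socket[1]:.2f}x")
```

Under those assumptions the list works out to roughly a quarter more per-core performance and a bit under 60% more per socket, at about the same power.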
 
