It's not like AMD gets anywhere near that $300/$500 from retail customers. Retailers get a big chunk, distributors take a chunk...
The 8x 6-core configuration lists for $5000. AMD sells the desktop parts for $300/$550.
Sorry, I was in the hospital for surgery and am behind on posts, but I can prove my point pretty easily:
@eek2121
Sorry, but most of your post is bollocks. Servers and also mobile are much more important to AMD than DIY. Everything else is just wishful thinking on your part.
Furthermore it is a FACT that the IFOP interconnect needs much more energy compared to monolithic solutions. This is why everyone and their dog is talking about silicon interconnects.
Zen 1 has 64KB of L1 instruction cache instead of the 32KB that the rest of the line has.
The whole Zen family so far hasn't seen a change in its L1$ quantity of 64KiB. The only change so far was how its use is partitioned in Zen 2. So far we heard Zen 4 will increase L2$. But an L1$ change is more delicate, so I'd deem it very unlikely an optimization step like Zen 4c will contain a big change there.
SPR is most like Zen 1 MCM in its approach to partitioning, so it lags the leading edge by quite a bit. And EMIB is described as a generation behind EFB, and may have some yield losses.
While the ability is applaudable, I'm personally not sure Intel really deserves kudos for their approach. PVC especially appears to be overengineered, and SPR doesn't strike me as particularly elegant either.
Any more info on this chance of upgrade? Upgrade as in more nodes?
Another slide from Hyperion Research: El Capitan full system delivery in late 2022. This system uses a variant of Zen4 with MI300. I think there is a chance El Capitan will be upgraded.
Just by looking at AMD presentations, Zen 4 desktop seems way below the priority of Genoa. There was one slide showing the server roadmap, with Zen 4 Genoa on it; the same presentation for desktop ended with 2021 and Zen 3 (it did not even show Zen 4 at all). So desktop is a low priority, which IMO is a mistake that AMD is making...
I hit up a couple of the usual leak sites and didn't see this. Twitter leaker?
Sorry, I don't know what SLC means in this context. Single Level Cache?
Did AMD mention using an SLC?
That is why I prefaced with a reduction in FP compute power compared with the vanilla Zen4 core, speculated here. If that change is on the table, then an L1 amount change might also be, as both mean changing the floor plan of the core, unlike changes in the L2 and L3 cache, which are adjacent structures that can be expanded or contracted with relatively little design effort.
But an L1$ change is more delicate, so I'd deem it very unlikely an optimization step like Zen 4c will contain a big change there.
Adding more cores and reducing IF clockspeed would be disastrous.
Honestly, if you needed 16 cores in a 45W envelope, it is doable now. You just have to drop the fclk.
SLC is usually System Level Cache - an alternative Last Level Cache that can be utilized by more than one logic group. For example, the SLC on Apple SoCs can be used by the CPU cores, iGPU, and um other stuff too.
Sorry, I don't know what SLC means in this context. Single Level Cache?
What is the point you are trying to sell? That IFoP needs less energy when slowed down? Well, that is why energy consumption of interconnects is measured in pJ/bit. The point is: At the same speed IFoP will consume at least 10x more energy than the same interconnect on die. This is due to physics. Silicon bridges will bring down that gap by a lot.
On my motherboard, with fixed DDR4 3600, no changes to CPU frequencies:
FCLK 1800 MHz, package power: 48W
FCLK 933 MHz, package power: 28.5W
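For anyone who wants the arithmetic on those two readings spelled out, here is a minimal sketch. It assumes the whole difference in package power comes from the FCLK change, which the post itself does not establish, and the function name is made up for illustration.

```python
def fclk_power_delta(high_w: float, low_w: float) -> tuple[float, float]:
    """Return (watts saved, percent of the higher reading saved)."""
    saved = high_w - low_w
    return saved, 100.0 * saved / high_w

# Readings quoted in the post: FCLK 1800 MHz -> 48 W, FCLK 933 MHz -> 28.5 W
saved_w, saved_pct = fclk_power_delta(48.0, 28.5)
print(f"{saved_w:.1f} W saved ({saved_pct:.1f}% of package power)")
```

That is 19.5 W, or roughly 41% of the package power at the higher FCLK, for this one board and this one load condition.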
It was claimed that chiplet based CPUs in general couldn’t possibly be a thing because a chiplet based solution uses too much power. I pointed out that a dynamic fclk and other optimizations could very much allow for a true mobile version of the 5950, or any other chiplet based CPU. Just because AMD hasn’t doesn’t mean they can’t. Could you target the ultralight 15W with a chiplet approach? Not as implemented. 35-54W however? Absolutely.
What is the point you are trying to sell? That IFoP needs less energy when slowed down? Well, that is why energy consumption of interconnects is measured in pJ/bit. The point is: At the same speed IFoP will consume at least 10x more energy than the same interconnect on die. This is due to physics. Silicon bridges will bring down that gap by a lot.
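Since the pJ/bit argument keeps coming up, here is a rough sketch of how a pJ/bit figure turns into watts at a given traffic level. Every number in it is an assumption for illustration, not a measurement: the traffic rate and the 2 pJ/bit on-package cost are placeholders, and the on-die cost is simply set to one tenth of that to mirror the "at least 10x" claim in the post. The function name is hypothetical too.

```python
def link_power_watts(pj_per_bit: float, gbytes_per_s: float) -> float:
    """Power spent moving data: energy per bit times bits per second."""
    bits_per_s = gbytes_per_s * 1e9 * 8
    return pj_per_bit * 1e-12 * bits_per_s

# All three inputs are assumed values, chosen only to show the scaling.
TRAFFIC_GBS = 50.0               # hypothetical sustained traffic in GB/s
ON_PACKAGE_PJ = 2.0              # hypothetical cost of an on-package link
ON_DIE_PJ = ON_PACKAGE_PJ / 10   # the 10x ratio claimed in the post

on_package_w = link_power_watts(ON_PACKAGE_PJ, TRAFFIC_GBS)
on_die_w = link_power_watts(ON_DIE_PJ, TRAFFIC_GBS)
print(f"on-package: {on_package_w:.2f} W, on-die: {on_die_w:.2f} W")
```

With these made-up inputs the on-package link comes to about 0.8 W against about 0.08 W on die. Note this only covers energy spent actually moving bits; it says nothing about what the fabric draws while idle.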
In case there is confusion:
Hoping for same socket as AM4 for Rembrandt!
Right, as I wrote, with Zen 2 AMD re-partitioned the whole of L1$, halving the L1 instruction cache and doubling the µOP cache.
Zen 1 has 64KB of L1 instruction cache instead of the 32KB that the rest of the line has.
Hm, for PS5 Sony had Zen 2 customized with a heavily cut FP capability, so that's indeed a possibility. L1$ is yet another level lower though, imo. Changes there are usually long planned and simulated for new core designs, as changes in balance change everything about cores.
That is why I prefaced with a reduction in FP compute power compared with the vanilla Zen4 core, speculated here. If that change is on the table, then an L1 amount change might also be, as both mean changing the floor plan of the core, unlike changes in the L2 and L3 cache, which are adjacent structures that can be expanded or contracted with relatively little design effort.
If that could be done as easily as said, it would have happened already. Every increase in cache size is bound to increase latency. It's all a huge balancing act.
Doubling or tripling the L1 cache while keeping the same latency measured in cycles would provide a pretty sizeable IPC gain, like the 3D cache will do with L3 cache, but unlike a boost in L3 cache it will probably affect the vast majority of applications.
That only helps you when you're not doing anything. Once you put load on the CPU and bring the IF clock back up to where it needs to be, your power consumption jumps. And when you've got a limited power budget, suddenly bumping up the share of PPT you commit to interconnect potentially strangles performance elsewhere in the CPU. Depending on your limits, of course.
EDIT: A low fclk is not disastrous. The key is only ramping fclk when it is needed, and idling it when not.
The question is, “can you afford to burn 10W”. The answer is: yes. We are stuck thinking in terms of Zen3; Raphael will have a GPU on the IO die. I postulate that the power improvements ported from Cezanne combined with a 6nm IO die shrink along with (lp)DDR5 support will significantly reduce power consumption. 45W Raphael will not only exist, but it will thrive.
That only helps you when you're not doing anything. Once you put load on the CPU and bring the IF clock back up to where it needs to be, your power consumption jumps. And when you've got a limited power budget, suddenly bumping up the share of PPT you commit to interconnect potentially strangles performance elsewhere in the CPU. Depending on your limits, of course.
Yes, the atrociously high idle power consumption of existing IF implementations is something AMD would need to tackle to bring the I/O die (or similar) strategy to mobile. And maybe they'll fix that with Raphael-H. But if your power constraints are 65W or lower . . . do you really want to burn an extra 10W (or more) on interconnect? That's going to choke the rest of the chip, even in bursty loads.
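To put the "burn 10W" question in proportion, here is a quick sketch of what share of the package budget that would be. The 10 W figure and the 45 W and 65 W budgets are the ones mentioned in the thread; treating the interconnect cost as a flat 10 W regardless of load is a simplifying assumption, and the function name is made up.

```python
def interconnect_share(interconnect_w: float, budget_w: float) -> float:
    """Percentage of a package power budget consumed by the interconnect."""
    return 100.0 * interconnect_w / budget_w

# 10 W is the figure argued over in the thread; 45 W and 65 W are the budgets mentioned.
shares = {budget: interconnect_share(10.0, budget) for budget in (45.0, 65.0)}
for budget, pct in shares.items():
    print(f"{budget:.0f} W budget: {pct:.1f}% goes to interconnect")
```

That works out to roughly 22% of a 45 W budget and about 15% of a 65 W budget under the flat 10 W assumption.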
Yeah, I messed up here, Zen+ 3D is what I was thinking! Coming out in January 2022!
In case there is confusion:
With the release of ADL-S I suspect some of the info surrounding the B2 stepping, Zen3D, etc. may be stale.
- Rembrandt is AM5.
- Rembrandt desktop launches Q3 of next year or later.
- Rembrandt mobile is rumored for a CES launch. Given AMD cadence, that seems likely.
- AM4 will receive Zen3D along with a new Zen 3 stepping, which, depending on who you believe (AMD or the leakers), may or may not have clock speed improvements.
- That is it for AM4.
I realize this and it only helps my argument, but for the sake of simplicity I chose to ignore it. Even without considering that, it requires a significant discount to make desktop parts favorable even at their list price.
It's not like AMD gets anywhere near that $300/$500 from retail customers. Retailers get a big chunk, distributors take a chunk...
The Zen 4c core is almost certainly a completely new floor plan. The cores will be smaller. The only way they could keep the floor plan would be if the cores are half the size (edit: accidental post; meant exactly half and they share the L2). They are likely making use of the denser design libraries (edit), so it seems like it needs to be a completely different floor plan.
Sorry, I don't know what SLC means in this context. Single Level Cache?
That is why I prefaced with a reduction in FP compute power compared with the vanilla Zen4 core, speculated here. If that change is on the table, then an L1 amount change might also be, as both mean changing the floor plan of the core, unlike changes in the L2 and L3 cache, which are adjacent structures that can be expanded or contracted with relatively little design effort.
The Zen4c cores are rumored to be used as "little" cores in Zen5, and maybe beyond, especially on lower power and cost applications, like the Zen2 core is still used now. Given this longevity, it might make sense for AMD to expend the extra engineering effort optimizing it.
Doubling or tripling the L1 cache while keeping the same latency measured in cycles would provide a pretty sizeable IPC gain, like the 3D cache will do with L3 cache, but unlike a boost in L3 cache it will probably affect the vast majority of applications.
The 6 nm IO die and updated interfaces could reduce power usage significantly, so I think a chiplet based mobile part will actually do quite well. It is just one link compared to the massive number on the Epyc IO die. I assume that a major goal of the new Genoa IO die was to reduce power consumption. The really low power design will be a stacked solution, but that comes later. They will still probably use monolithic die and stacked solutions for lower power parts. Different types of stacking have been used in mobile for a while. Some of the new chip stacking tech can get close to a monolithic die for power consumption, so all mobile parts will probably be in a stacked package of some kind going forward. It would be great if we can get something like a 16 core processor (possibly Zen 4c based cores), a reasonable GPU, and a stack of HBM2E cache. A single stack is 16 GB now. If you had some LPDDR5 or even a ridiculously fast SSD to back it up, then that is plenty.
The question is, “can you afford to burn 10W”. The answer is: yes. We are stuck thinking in terms of Zen3; Raphael will have a GPU on the IO die. I postulate that the power improvements ported from Cezanne combined with a 6nm IO die shrink along with (lp)DDR5 support will significantly reduce power consumption. 45W Raphael will not only exist, but it will thrive.
Also, 45W Raphael does exist, whether anyone wants it to or not.
Looking at its block diagram, it seems unlikely to be reduced, considering the increased speed of the LCLK and SHUB clock domains, the added new MP-based IO controller, two IO hubs and so on, unless some revolutionary power-saving tech was implemented.
I assume that a major goal of the new Genoa IO die was to reduce power consumption.
My thoughts: New floor plan obviously. New balance of existing elements (like the changes between Zen and Zen 2) maybe. New core design less likely. Denser design library rather unlikely since Zen 2 already used the library focused on density and there is no indication AMD approached Zen 3 and Zen 4 differently. What I expect to change is that the longer time is being used to simulate and optimize the existing design in a denser layout, but with all ingredients being the same (so known quantities, important to optimize the hell out of them). The existing designs use the dense library to have a fine grid pattern on which to space out the transistors. The APUs were efforts to reduce the spacing again afterward, Zen 4c should be in line with that, with a high margin market added to the effort.
The Zen 4c core is almost certainly a completely new floor plan. The cores will be smaller. The only way they could keep the floor plan would be if the cores are half the size. They are likely making use of the denser design libraries.
As the name says, HBM is high bandwidth memory. It lacks the low latency to be really useful as cache.
and a stack of HBM2E cache.
The current IODs essentially are always full on. In the APUs AMD tweaked the uncore to use lower power modes where and whenever it makes sense. The Genoa/Raphael IOD will be the first new IOD where such IO power saving techniques will be implemented for server/desktop packages as well. The Raphael-H rumors/leaks point to that step happening at least for the Raphael cIOD, and there is no reason it wouldn't be implemented in the Genoa sIOD as well then.
Looking at its block diagram, it seems unlikely to be reduced, considering the increased speed of the LCLK and SHUB clock domains, the added new MP-based IO controller, two IO hubs and so on, unless some revolutionary power-saving tech was implemented.
If all they have (I hope not) is already implemented in Cezanne, then it's not much, to say the least (having both Cezanne and Vermeer, I may share my thoughts regarding fabric power efficiency, if you want). Rather, I hope they use a specifically optimized 6nm process for the IODs and similar-purpose circuitry. Besides, there are some very innovative power-saving techniques patented in the last 3 or more years that hopefully were already applied at the design stage.
The current IODs essentially are always full on. In the APUs AMD tweaked the uncore to use lower power modes where and whenever it makes sense. The Genoa/Raphael IOD will be the first new IOD where such IO power saving techniques will be implemented for server/desktop packages as well. The Raphael-H rumors/leaks point to that step happening at least for the Raphael cIOD, and there is no reason it wouldn't be implemented in the Genoa sIOD as well then.
I suspect Zen 4c, and possibly other derivatives of it, are going to be around for a while, so it may have more radical changes. It might be mostly process tech optimizations, but it seems like they are also going to cut some stuff out. Giant vector FP units are almost entirely unused in a wide range of servers. There is a big difference between a regular server and an HPC machine. I have wondered if it might be radically different with some number of cores sharing L2 and possibly FP units.
My thoughts: New floor plan obviously. New balance of existing elements (like the changes between Zen and Zen 2) maybe. New core design less likely. Denser design library rather unlikely since Zen 2 already used the library focused on density and there is no indication AMD approached Zen 3 and Zen 4 differently. What I expect to change is that the longer time is being used to simulate and optimize the existing design in a denser layout, but with all ingredients being the same (so known quantities, important to optimize the hell out of them). The existing designs use the dense library to have a fine grid pattern on which to space out the transistors. The APUs were efforts to reduce the spacing again afterward, Zen 4c should be in line with that, with a high margin market added to the effort.
Okey dokey. We'll see how that works out.
The question is, “can you afford to burn 10W”. The answer is: yes.
That's not particularly relevant to IF link power consumption.
We are stuck thinking in terms of Zen3; Raphael will have a GPU on the IO die.
Phoenix will probably do it all better though. Or at least more efficiently.
I postulate that the power improvements ported from Cezanne combined with a 6nm IO die shrink along with (lp)DDR5 support will significantly reduce power consumption. 45W Raphael will not only exist, but it will thrive.
Given AMD is still prefixing the cores with “Zen4”, I am going to assume that they are simply using/taking advantage of high density libraries and possibly cutting cache. Maybe I am reading too much into it, however.
My thoughts: New floor plan obviously. New balance of existing elements (like the changes between Zen and Zen 2) maybe. New core design less likely. Denser design library rather unlikely since Zen 2 already used the library focused on density and there is no indication AMD approached Zen 3 and Zen 4 differently. What I expect to change is that the longer time is being used to simulate and optimize the existing design in a denser layout, but with all ingredients being the same (so known quantities, important to optimize the hell out of them). The existing designs use the dense library to have a fine grid pattern on which to space out the transistors. The APUs were efforts to reduce the spacing again afterward, Zen 4c should be in line with that, with a high margin market added to the effort.
Rembrandt, Genoa, and Raphael all have some new power saving tech. AMD actually confirmed this quite recently, I believe.
If all they have (I hope not) is already implemented in Cezanne, then it's not much, to say the least (having both Cezanne and Vermeer, I may share my thoughts regarding fabric power efficiency, if you want). Rather, I hope they use a specifically optimized 6nm process for the IODs and similar-purpose circuitry. Besides, there are some very innovative power-saving techniques patented in the last 3 or more years that hopefully were already applied at the design stage.
AMD, like any company in this field, likely has dozens, if not hundreds of variations in flight at any given time. This allows them to be more agile and respond to marketplace changes.
I suspect Zen 4c, and possibly other derivatives of it, are going to be around for a while, so it may have more radical changes. It might be mostly process tech optimizations, but it seems like they are also going to cut some stuff out. Giant vector FP units are almost entirely unused in a wide range of servers. There is a big difference between a regular server and an HPC machine. I have wondered if it might be radically different with some number of cores sharing L2 and possibly FP units.
As for HBM cache, I mentioned a few posts ago that it is not great for a CPU cache due to the DRAM latency. The HBM is mostly for the integrated GPU. AMD supports virtual memory on their GPUs, which allows the system memory to essentially act as swap space. Doing the same thing with an APU-like device with HBM swapped out to DDR5 system memory would be great. We probably will not get something like that unless it is a device made for many chiplets, like separate CPU, IO, GPU, and HBM chiplets. They could possibly put an HBM interface on an IO die with an integrated GPU, but that doesn’t seem that likely.
I think that moving forward, AMD will use monolithic dies for <40W chips and a mix of monolithic and chiplet based approaches for 45W chips.
Okey dokey. We'll see how that works out.