Speculation: Ryzen 3000 series


What will Ryzen 3000 for AM4 look like?



Caveman

Platinum Member
Nov 18, 1999
2,525
33
91
Thanks for the estimation... 99% of my need for speed is to support a flight simulation habit (DCS, IL-2 BOS, X-Plane 11, P3D). Unfortunately, most are not optimized for multi-core operation, though that is (finally) beginning to change. Most simulators are also adopting Vulkan, which according to my rudimentary knowledge is supposed to help simulators perform better on multi-core machines. If the 3000 series performs near or better than the 9900 for roughly the same price, it may be my next rig, and I'd just look at the multi-core advantages of the AMD architecture as a "bonus". My last AMD was an Athlon about 17 years ago and I loved it.

Is my basic understanding of Vulkan at all accurate - that Vulkan is designed to make better use of multi-core CPUs?
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
There will be extra speed thanks to the IF link between the 2 core dies, since that link doesn't have to carry DDR4 data.
It will also be interesting to know whether this link stays at 100GB/s when you halve the speed, so that you get two links at 50GB/s and one at 100GB/s.

I also made this image below to show the similarity with a 'crippled' 16 core threadripper 2000.

There won't be any direct link between dies.
 

Kedas

Senior member
Dec 6, 2018
355
339
136
There won't be any direct link between dies.
According to this info it will https://www.techpowerup.com/253954/...reveals-new-options-for-overclocking-tweaking
CAKE, or "coherent AMD socket extender" received an additional setting, namely "CAKE CRC performance Bounds". AMD is implementing IFOP (Infinity Fabric On Package,) or the non-socketed version of IF, in three places on the "Matisse" MCM. The I/O controller die has 100 GB/s IFOP links to each of the two 8-core chiplets, and another 100 GB/s IFOP link connects the two chiplets to each other. For multi-socket implementations of "Zen 2," AMD will provide NUMA node controls, namely "NUMA nodes per socket," with options including "NPS0", "NPS1", "NPS2", "NPS4" and "Auto".

AdoredTV also hinted at this as a possibility at the end of January, when they examined the first pictures.
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
Thanks for the estimation... 99% of my need for speed is to support a flight simulation habit (DCS, IL-2 BOS, X-Plane 11, P3D). Unfortunately, most are not optimized for multi-core operation, though that is (finally) beginning to change. Most simulators are also adopting Vulkan, which according to my rudimentary knowledge is supposed to help simulators perform better on multi-core machines. If the 3000 series performs near or better than the 9900 for roughly the same price, it may be my next rig, and I'd just look at the multi-core advantages of the AMD architecture as a "bonus". My last AMD was an Athlon about 17 years ago and I loved it.

Is my basic understanding of Vulkan at all accurate - that Vulkan is designed to make better use of multi-core CPUs?
Yes. DX12 and Vulkan are made to be able to use more cores and have 10 to 30 times the draw-call capability of a poorly threaded DX11 game. It depends on the game engine, but DX12 and Vulkan are the foundation that makes it possible.
For flight sims, and especially RTS games, that can mean a potentially huge uplift.
I think the consequence is that we will see games become throughput limited rather than, e.g., memory-latency limited like the old Arma.
Intel and Nvidia have tried for years to keep the old standards to protect their business. Then we got Mantle, and we have yet to see the full effect. It has only just started.
The new APIs put a lot of stress on the programming of the engine, so we have seen loads of disappointing DX12 games, but more and more good results are starting to come.
Good because we were stuck. Glad it's over.
To hell with monopolies.
 

Kedas

Senior member
Dec 6, 2018
355
339
136
So on Rome you would have only links between the 2 dies close to each other?
For Rome they all have a direct connection to the I/O die. What happens with the extra connection on the die I have no idea; with 64 cores on 8 dies, each die can only connect to 1 of the other 7, so it's not a very big advantage. Maybe it costs too much power for the advantage it gives with more than 2 dies. So only for 12-16 core EPYC2 (or Ryzen 3000).
Although if you have only 4 dies (32 cores) on EPYC2, you could connect both IF links of each die to the I/O die, increasing bandwidth.
 

moinmoin

Diamond Member
Jun 1, 2017
4,956
7,675
136
If nothing else, two IFOP links per chiplet could be built-in redundancy, increasing the yield. And if they do that anyway, making use of both in some relatively lower-quantity products may seem like a good idea.
 

amd6502

Senior member
Apr 21, 2017
971
360
136
If nothing else, two IFOP links per chiplet could be built-in redundancy, increasing the yield. And if they do that anyway, making use of both in some relatively lower-quantity products may seem like a good idea.

Maybe the chiplets inherit much from Zeppelin and two dies can actually pair to stay coherent as a whole, just as two CCXs on the same die can stay coherent.

RR supposedly was not too different from Zeppelin, and used the same fabric.

https://fuse.wikichip.org/news/1596/hot-chips-30-amd-raven-ridge/3/



Raven Ridge just doesn't have more than 16 lanes on the die.
The I/O die will certainly have transistor concerns, but it can be made to the AM4 spec, unlike Zeppelin, which wastes loads of I/O when used in Ryzen. So it needs its memory controllers, 24 PCIe lanes, the new IF links and 4 USB3 lanes, and loses the four old IF lanes, various GPIO functions (socket interconnect, for example) and maybe the 10GbE (expect them to keep a couple for future embedded parts). I think the ~120mm^2 I/O die is a bit large unless that space is used for something new, which I hope is a token iGPU, but as someone else has mentioned, the amount of fixed-function GPU parts is quite large.

Well it makes perfect sense to reduce the lanes when RR was focused much on mobile. They really wanted to minimize the uncore whose size is related to the idle wattage. (Okay, it did have a larger non-mobile role for higher margin desktop parts like the 2400g... so it was like 70% mobile and ~30% desktop focused.)

Even on desktop, I actually don't know what the need for all these PCIe lanes is. For people who don't use dual dGPU setup (which for APU users is very uncommon), I can't imagine running short on lanes.
 

DrMrLordX

Lifer
Apr 27, 2000
21,643
10,860
136
Even on desktop, I actually don't know what the need for all these PCIe lanes is. For people who don't use dual dGPU setup (which for APU users is very uncommon), I can't imagine running short on lanes.

MultiGPU is not so common anymore. Power users that want multiple NVMe devices might want the lanes. Also, my local ISP sells 10 Gbps fibre service to the oddballs that want to pay for it. You're gonna need lanes for that if you want to go big on bandwidth. APU users probably don't fall into any of those categories.
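
For a rough sense of how many lanes a 10 Gbps NIC actually ties up, here is a back-of-the-envelope sketch. It uses the standard per-lane signalling rates and line encodings for PCIe 2.0, 3.0 and 4.0, ignores packet/protocol overhead (so real throughput is somewhat lower), and assumes links come in the usual x1/x2/x4/x8/x16 widths. The function name `lanes_needed` is just made up for illustration.

```python
import math

# Approximate usable bandwidth per PCIe lane in Gbit/s after line encoding.
# Gen 2: 5 GT/s with 8b/10b; Gen 3 and Gen 4: 8 and 16 GT/s with 128b/130b.
# Packet/protocol overhead is ignored, so these are upper bounds.
LANE_GBPS = {
    2: 5.0 * 8 / 10,
    3: 8.0 * 128 / 130,
    4: 16.0 * 128 / 130,
}

def lanes_needed(link_gbps: float, gen: int) -> int:
    """Smallest standard link width (x1..x16) whose bandwidth covers link_gbps."""
    raw = math.ceil(link_gbps / LANE_GBPS[gen])
    for width in (1, 2, 4, 8, 16):
        if width >= raw:
            return width
    raise ValueError("more than 16 lanes required")

for gen in (2, 3, 4):
    print(f"PCIe {gen}.0: x{lanes_needed(10.0, gen)} for a 10 Gbps NIC")
```

On those assumptions a 10 Gbps card wants an x4 link on PCIe 2.0 but only x2 on PCIe 3.0, which is why it matters whether the spare lanes hang off the CPU or off the chipset.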
 

tomatosummit

Member
Mar 21, 2019
184
177
116
Maybe the chiplets inherit much from Zeppelin and two dies can actually pair to stay coherent as a whole, just as two CCXs on the same die can stay coherent.

RR supposedly was not too different from Zeppelin, and used the same fabric.

https://fuse.wikichip.org/news/1596/hot-chips-30-amd-raven-ridge/3/

Well it makes perfect sense to reduce the lanes when RR was focused much on mobile. They really wanted to minimize the uncore whose size is related to the idle wattage. (Okay, it did have a larger non-mobile role for higher margin desktop parts like the 2400g... so it was like 70% mobile and ~30% desktop focused.)

Even on desktop, I actually don't know what the need for all these PCIe lanes is. For people who don't use dual dGPU setup (which for APU users is very uncommon), I can't imagine running short on lanes.

The Hot Chips presentation on RR showed its internals quite well. It uses a network-on-chip design of the kind that is popular these days, with data travelling through a series of crossbars around the various clock domains on the die. It's easy to see from that how modular AMD's designs are, as they've been publicly stating for years now. But it was certainly a mobile-first design. The one I find funny is the Athlon desktop parts, which have even fewer PCIe lanes available. Bringing 4+2+2(4) to the gfx, M.2 and PCH confused me when I plugged one into an Asus 470 Prime and found the M2_1 slot could no longer boot. I really think they should have put an iGPU in the desktop I/O die just to increase the target market for the part, especially with RR now falling behind even Intel's 6-core i5s. Perhaps the laptop-based I/O die has something different, like reduced PCIe lanes and an LPDDR4 memory controller along with GPU parts, maybe at 7nm as well.

You might be right about how the dies pair up to stay coherent; that way it can look the same to software as EPYC 1 does.

MultiGPU is not so common anymore. Power users that want multiple NVMe devices might want the lanes. Also, my local ISP sells 10 Gbps fibre service to the oddballs that want to pay for it. You're gonna need lanes for that if you want to go big on bandwidth. APU users probably don't fall into any of those categories.

The problem isn't the number of lanes but how accessible they are. The X chipsets have two x8 slots and x4 for the M.2 slot. Everything else is PCIe 2 and shared across the PCH, including various SATA and USB. If we could bifurcate the main GPU lanes to 8+4+4, the usability of a high-end board would increase. I wish board vendors would at least use some switches to enable this, as they charge so much for such meagre increases apart from PWM and flashing lights. Even if PCIe 4 is coming, there are still no products that can use it, so most items available now and in the near future, which will be cheaper than first-generation PCIe 4 parts, will work at PCIe 3 speeds, negating that advantage.
 

fleshconsumed

Diamond Member
Feb 21, 2002
6,483
2,352
136
I'm dumping all my Intel rigs and replacing them with AMD as soon as I can. Somebody just purchased my 4770K on eBay for $180, and I just got a Ryzen 1600 for $80 from Microcenter to replace it. I get a more scalable platform with more modern features and fewer security holes for less than I can sell my 5-year-old CPU for on eBay. Nuts.

Can't wait till Ryzen 3000 release a couple of months from now so that I can buy 3300G for my HTPC and a couple of B550 motherboards to completely migrate to AMD.
 

lightmanek

Senior member
Feb 19, 2017
387
754
136
I'm dumping all my Intel rigs and replacing them with AMD as soon as I can. Somebody just purchased my 4770K on eBay for $180, and I just got a Ryzen 1600 for $80 from Microcenter to replace it. I get a more scalable platform with more modern features and fewer security holes for less than I can sell my 5-year-old CPU for on eBay. Nuts.

Can't wait till Ryzen 3000 release a couple of months from now so that I can buy 3300G for my HTPC and a couple of B550 motherboards to completely migrate to AMD.


This sounds like a made-up story, but I too managed to sell my 4-core Intel CPU and buy a brand-new Ryzen 1700 for the same amount! I get it, some components are collectors' items, but for a fairly modern, high-volume 4th-gen i7 to still fetch a good amount is illogical :)
Good for us!

I will have to wait a bit longer for my 7nm Ryzen toy, as I upgraded to Threadripper a couple of months ago. Again, I sold my then-current 2700X for £310 and bought a brand-new Threadripper 1920X for £348! And I hit the silicon lottery jackpot, as my TR can do 4.2GHz on all cores and 4GHz at only 1.24V!

I wonder how much fun the next-gen Threadripper will be to tweak, with the added I/O die and chiplets.
 

Kedas

Senior member
Dec 6, 2018
355
339
136
What if the IF on the core die isn't limited to 2 links but has 3 or even 4 to connect to other core dies, mainly for TR or EPYC?

I'm not selling my 4-core Intel; I will move it to the other room to control my 3D printer when I upgrade to an AMD workhorse with 12-16 cores, maybe TR, depending on how low the power usage is when it's just used for surfing.
 

Veradun

Senior member
Jul 29, 2016
564
780
136
What if the IF on the core die isn't limited to 2 links but has 3 or even 4 to connect to other core dies, mainly for TR or EPYC?

It's basically impossible that they chose to link compute dies directly. First of all, avoiding that is the whole point of an I/O die, and you would need 8 IF links per compute die to connect all of them in a Rome setup, wasting a huge amount of power and die area in the process.

IF is already doubled compared to Zen 1.
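
To put numbers on the link-count argument, here is a small sketch comparing a full mesh (every compute die linked to every other one plus the I/O die) with a plain star through the I/O die. It only counts links and assumes one link per connection; the function names are made up for illustration and say nothing about how AMD actually wires the package.

```python
def links_per_die_full_mesh(n_compute_dies: int) -> int:
    # One link to each of the other compute dies, plus one to the I/O die.
    return (n_compute_dies - 1) + 1

def total_links_full_mesh(n_compute_dies: int) -> int:
    # Every pair of compute dies gets a link, and each die also links to the I/O die.
    pairs = n_compute_dies * (n_compute_dies - 1) // 2
    return pairs + n_compute_dies

def total_links_star(n_compute_dies: int) -> int:
    # Each compute die has exactly one link, to the I/O die.
    return n_compute_dies

for n in (2, 8):
    print(
        f"{n} dies: full mesh = {links_per_die_full_mesh(n)} links per die, "
        f"{total_links_full_mesh(n)} total; star = 1 link per die, "
        f"{total_links_star(n)} total"
    )
```

With 8 compute dies the full mesh needs 8 links on every die and 36 in total, against 8 in total for the star. With only 2 dies the mesh comes to 3 links, which is the count the TechPowerUp piece quoted earlier gives for "Matisse".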
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
What if the IF on the core die isn't limited to 2 links but has 3 or even 4 to connect to other core dies, mainly for TR or EPYC?

Because that is the point of the I/O die. IF needs a lot of die space and power, so you want to limit the number of such links.
 

Kedas

Senior member
Dec 6, 2018
355
339
136
Because that is the point of the I/O die. IF needs a lot of die space and power, so you want to limit the number of such links.
The point of the I/O die is to move the external I/O circuits onto a lower-cost process, since they don't shrink well anyway; it is not to disconnect the core dies, which are split up for yield/cost. One way to offset the disadvantages of separate dies is cache and good connections between them.

But you are right that it would probably take up too much power with many fast connections.
Or they would have to find a way to consume energy only when data is being transferred over IF, since the amount of data moved around is the same whether the connections are very fast or merely fast.
 

amd6502

Senior member
Apr 21, 2017
971
360
136
So it looks like we might get 7nm products in the Ryzen 3000 series after all in the form of quadcore salvage with modest clocks and 8 threads enabled.

wccf breaks ES leak ( 2DS104BBM4GH2_38/34_N )

https://wccftech.com/amd-ryzen-3000-zen-2-cpu-msi-meg-x570-creation-motherboard-spotted/

The actual final consumer part may be a few hundred MHz higher than the ES. So I figure 4c/8t at 4GHz might either be the new Ryzen 3 or the bottom Ryzen 5, excluding high-CU-count APUs.
 

Topweasel

Diamond Member
Oct 19, 2000
5,436
1,654
136
So it looks like we might get 7nm products in the Ryzen 3000 series after all in the form of quadcore salvage with modest clocks and 8 threads enabled.

wccf breaks ES leak ( 2DS104BBM4GH2_38/34_N )

https://wccftech.com/amd-ryzen-3000-zen-2-cpu-msi-meg-x570-creation-motherboard-spotted/

The actual final consumer part may be a few hundred MHz higher than the ES. So I figure 4c/8t at 4GHz might either be the new Ryzen 3 or the bottom Ryzen 5, excluding high-CU-count APUs.

Ugh. That's a slippery slope they might not have wanted to get involved in. I wonder if the 4c dies were really that numerous that it made it worth it. The problem is that the lower guys on the totem pole always sell the best, within reason (maybe the very bottom doesn't), and there is a market for really fast 4c CPUs if this clocks as high as we think. By putting that sku out there they will inevitably have to start harvesting a lot of better dies to fill the volume that this chip might require. On one hand, any bonus sales are great; on the other hand, long term it could affect the availability of chips that would use the 8c-capable chiplets.
 

DrMrLordX

Lifer
Apr 27, 2000
21,643
10,860
136
By putting that sku out there they will inevitably have to start harvesting a lot of better dies to fill the volume that this chip might require.

No, they don't. If there are supply constraints on the 4c SKUs then so be it. People will just have to buy something else. Look at what Intel did some time ago with the G4560. People really wanted that part (for a while) but supplies were low, so they had to buy something else if they were going to buy anything at all. It's not like Intel lost many sales from G4560 shortages. Prices on the G4560 went up until it was no longer so attractive.

If history is any indicator of what AMD will do, the highest-end parts will launch first, followed by cheaper offerings later (which is what they did with Ryzen). I doubt we'll see anything below 8c on launch day.
 

amd6502

Senior member
Apr 21, 2017
971
360
136
I wonder if the 4c dies were really that numerous that it made it worth it.

Percentage-wise the number is probably small, but I think between Epyc and Ryzen there are going to be a huge number of 7nm dies, and that would add up to a large number of quad dies that it would be a shame to throw in a landfill. So for one SKU I think it's well worth it.

And they can also adjust this percentage by raising/lowering the bar for the frequencies that they set in their Ryzen 7 (and Ryzen 9) lineup.

These quads are also going to be supplemented by Picasso APUs, and furthermore, consumers also have the option of the Pinnacle equivalents, which would be the 2600X and 2600. I can actually see the Pinnacle die being rebinned into two SKUs of the 3000 lineup as two 65-95W cTDP parts, rebadged as Ryzen 5s, something like a 3550 (for a hexacore with performance between a 2600X and a 2600) and a 3590 (for the octacore). As for the question of enough supply of the 4c/8t 7nm parts: (as DrMrLordX said) if the demand for quads is still too great, so be it. Or you simply raise the price closer to the 7nm hexacore equivalents, or let the market figure it out.
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
And they can also adjust this percentage by raising/lowering the bar for the frequencies that they set in their Ryzen 7 (and Ryzen 9) lineup.

Exactly. The chiplet doesn't need to have 4 defective cores; it may just be a generally bad chip that needs too much juice. You solve the issue by disabling 4 cores without halving the TDP at the same time. You don't want these chiplets in an EPYC or high-core-count Ryzen SKU due to TDP constraints.
 

Veradun

Senior member
Jul 29, 2016
564
780
136
The point of the I/O die is to move the external I/O circuits onto a lower-cost process, since they don't shrink well anyway; it is not to disconnect the core dies, which are split up for yield/cost. One way to offset the disadvantages of separate dies is cache and good connections between them.

But you are right that it would probably take up too much power with many fast connections.
Or they would have to find a way to consume energy only when data is being transferred over IF, since the amount of data moved around is the same whether the connections are very fast or merely fast.
Since it doesn't shrink that well, you don't want 8 IF links on a compute die to reach the I/O die and each of the other compute dies.

So that's a no no no.

Add that the topology allows more compute dies to be added if needed, just by moving from this I/O die to a new one without a new compute die. For example, they can do the optimization of the I/O die at 7nm without TTM pressure (covered by the 14nm I/O die) and get enough package space to raise the bar to 80c for EPYC xxx3, without any need to redesign the compute die.