Info 64MB V-Cache on 5XXX Zen3 Average +15% in Games


Kedas

Senior member
Dec 6, 2018
355
339
136
Well, now we know how they will bridge the long wait until Zen4 on AM5 in Q4 2022.
Production of V-Cache starts at the end of this year, which is too early for Zen4, so this is certainly coming to AM4.
Lisa said +15% is "like an entire architectural generation".
 
  • Like
Reactions: Tlh97 and Gideon

Joe NYC

Golden Member
Jun 26, 2021
1,948
2,288
106
Different markets. The TR Pro's competitive product from Intel is Xeon W. The regular TR is Core X. A 16 core TR isn't going to sell much now that the mainstream platform goes up to that.

This goes back to previous versions of Threadripper. The TRx4 platform has more memory channels and more PCIe lanes, which is something people looking for an HEDT platform would find useful, even if they don't want any more than 16 cores.
 

jpiniero

Lifer
Oct 1, 2010
14,599
5,218
136
This goes back to previous versions of Threadripper. The TRx4 platform has more memory channels and more PCIe lanes, which is something people looking for an HEDT platform would find useful, even if they don't want any more than 16 cores.

Theoretical customer base is too small.

Regular Threadripper is behind Epyc, Ryzen and now TR Pro. So if AMD is able to use up all the chiplets with those 3 products, there isn't going to be any room for it.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
With AM4 going up to 16 cores, and the possibility of AM5 having 24 cores, the low end TR market could be more easily covered with just a decent HEDT chipset for the socket. It's not rocket science to just connect the x16 PCIe link for the GPU to a PLX chip that can spread it across 64 lanes. With PCIe 4.0, that's enough bandwidth to feed 4 x8 PCIe 3.0 connected cards well enough for almost every application that needs multiple GPUs in a desktop or device machine.
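A rough sketch of the bandwidth arithmetic behind that claim. It assumes the nominal per-lane rates (8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0) and 128b/130b encoding, and it ignores packet and protocol overhead, so real throughput will be lower. The function name `link_gbytes_per_s` is made up for illustration.

```python
# Approximate usable PCIe bandwidth per link, after 128b/130b line encoding.
# Protocol overhead (packet headers, flow control) is ignored on purpose.
GT_PER_S = {3: 8.0, 4: 16.0}  # nominal raw transfer rate per lane, in GT/s


def link_gbytes_per_s(gen: int, lanes: int) -> float:
    """Return the approximate one-direction bandwidth of a link in GB/s."""
    encoding_efficiency = 128 / 130
    return GT_PER_S[gen] * encoding_efficiency / 8 * lanes


uplink = link_gbytes_per_s(gen=4, lanes=16)         # x16 PCIe 4.0 from the CPU
downstream = 4 * link_gbytes_per_s(gen=3, lanes=8)  # four x8 PCIe 3.0 cards

print(f"uplink:     {uplink:.1f} GB/s")
print(f"downstream: {downstream:.1f} GB/s")
```

Under these assumptions both figures come out at about 31.5 GB/s, so a x16 PCIe 4.0 uplink only just covers four x8 PCIe 3.0 cards running flat out, with nothing left over for anything else behind the switch.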
 

zir_blazer

Golden Member
Jun 6, 2013
1,165
408
136
With AM4 going up to 16 cores, and the possibility of AM5 having 24 cores, the low end TR market could be more easily covered with just a decent HEDT chipset for the socket. It's not rocket science to just connect the x16 PCIe link for the GPU to a PLX chip that can spread it across 64 lanes. With PCIe 4.0, that's enough bandwidth to feed 4 x8 PCIe 3.0 connected cards well enough for almost every application that needs multiple GPUs in a desktop or device machine.
Nope. PCIe switches are so ridiculously expensive that it isn't even funny. That is what makes TR and EPYC so attractive. Plus you are still bottlenecked by the switch uplink, so the only thing that a switch can do is fanout.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
I am given to understand that the big expense for PLX switches is the fact that they have low volume and expensive IP. With the negotiating power of an AMD, you'd think that they would get reasonable terms...

Essentially, isn't the existing X570 a PCIe switch at heart? It takes an inbound x4 PCIe 4.0 link and spreads it among a bunch of internal PCIe devices, a few x4 links (NVMe), a handful of PCIe slots, etc. Why not scale up the inbound link to x16, and the outbound links to 4 x8 ones that can be joined for 2 x16 ones or something like that?
 

Joe NYC

Golden Member
Jun 26, 2021
1,948
2,288
106
Theoretical customer base is too small.

Regular Threadripper is behind Epyc, Ryzen and now TR Pro. So if AMD is able to use up all the chiplets with those 3 products, there isn't going to be any room for it.

It would just be the demise of the TRx40 platform, which IMO would have had merit if AMD had released lower core count CPUs for it.
 

Insert_Nickname

Diamond Member
May 6, 2012
4,971
1,691
136
Essentially, isn't the existing X570 a PCIe switch at heart? It takes an inbound x4 PCIe 4.0 link and spreads it among a bunch of internal PCIe devices, a few x4 links (NVMe), a handful of PCIe slots, etc. Why not scale up the inbound link to x16, and the outbound links to 4 x8 ones that can be joined for 2 x16 ones or something like that?

That goes for all AMD AM4/TR4 "chipsets". They're essentially glorified PCIe switches, since Ryzen already has an integrated FCH on die.
 

Joe NYC

Golden Member
Jun 26, 2021
1,948
2,288
106
That goes for all AMD AM4/TR4 "chipsets". They're essentially glorified PCIe switches, since Ryzen already has an integrated FCH on die.

But with TR4, you can have access to PCIe lanes directly, depending on how the mobo is laid out.

So you could have multiple full M.2 slots connected directly to CPU, instead of having to go through a chipset.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
Yes, that's a distinct advantage. Unfortunately, the cost associated with that is high. I suspect that the low end Threadrippers were dropped for this generation due to a combination of opportunity cost and production capacity. There are only so many CCDs to go around, and AMD can sell all the WX rippers they can make, so why bother with the cheaper ones? Also, for those shopping the cheaper ones, most are after the I/O capabilities over the single thread performance, which is the main thing that Zen3 brought to the table.

A workstation class chipset for AM4/5 could bring most of the I/O capabilities to market without the added opportunity cost of selling low end rippers.
 

eek2121

Platinum Member
Aug 2, 2005
2,930
4,026
136
I think that AMD massively screwed up the TR lineup since they switched sockets with Zen 2, then released TR PRO matching most of EPYC's features. Too much product overlap going on there. If I were a Zen 2 TR user I would be angry that AMD left me without an upgrade path, but it makes more sense to unify everything behind TR PRO than to have Socket 754/939 coexisting all over again.
The supposed cancellation of Zen 3 Threadripper for DIY is pretty sad. I do hope AM5 gets the core counts up.

I do wonder if this is due to the B2 stepping.
Nope. PCIe switches are so ridiculously expensive that it isn't even funny. That is what makes TR and EPYC so attractive. Plus you are still bottlenecked by the switch uplink, so the only thing that a switch can do is fanout.

Relatively, yes, but overall? Not really. There are cheaper ones, and they can also do them in house. The issue is that OEMs lose their mind if they have to pay even a penny for an extra thing on the BOM.

The last time this got brought up I did some digging and found a few affordable options, albeit PCIE3. PCIE4 might be trickier due to signaling. Would have to do more digging.
 

Insert_Nickname

Diamond Member
May 6, 2012
4,971
1,691
136
But with TR4, you can have access to PCIe lanes directly, depending on how the mobo is laid out.

So you could have multiple full M.2 slots connected directly to CPU, instead of having to go through a chipset.

AM4 has a dedicated PCIe slot/M.2 port directly from the CPU. If you split the x16 for graphics, you can use 1 or 2 additional (for 3 total) M.2 SSDs depending on board layout using adaptors. If you're using a 4000 or 5000 series APU, you can even dedicate the x16 graphics slot to M.2 SSDs, for a total of 5 directly from the CPU.
 

zir_blazer

Golden Member
Jun 6, 2013
1,165
408
136
Relatively, yes, but overall? Not really. There are cheaper ones, and they can also do them in house. The issue is that OEMs lose their mind if they have to pay even a penny for an extra thing on the BOM.

The last time this got brought up I did some digging and found a few affordable options, albeit PCIE3. PCIE4 might be trickier due to signaling. Would have to do more digging.
PCIe switches were quite common BEFORE Avago purchased PLX and tripled prices overnight. There was a certain monopoly with these, although I think Microsemi, Marvell and ASMedia are now into them too, since the greed of Avago (now Broadcom) left room for plenty of profit.

AMD certainly already does them in house, but not in a dedicated PCIe Switch fashion. You have a gigantic 128 lanes switch on EPYC, plus a lot of other stuff on the IO die. Actually, I only consider the ASMedia AM4 Chipsets (Everything but X570) glorified PCIe Switches since they also happen to have USB and SATA Controllers, too, whereas X570 shares the design with the Zen 2/3 IO die and you have a truckload of IO that is not used when in Chipset mode (Like the DDR4 Memory Controller). Being a Chipset is not even the primary use of that design.

If possible, I prefer a bigger socket with more PCIe ports over putting in a big PCIe switch to fan out more ports. If you want to have a lot of stuff plugged in simultaneously, a switch does the job, but if you want to USE a lot of stuff plugged in simultaneously, you bottleneck the uplink. We have known that since the day people tried to RAID 0 NVMe SSDs plugged into chipset PCIe ports. And if we go to prices, having a moderately sized PCIe switch already puts you in the Threadripper or low end EPYC price ballpark. So why bother with switches in the first place?
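The uplink bottleneck can be put in numbers with a small sketch. It assumes every link runs at the same PCIe generation, so lanes can be compared directly, and that the switch shares the uplink evenly when every device is busy. Both function names and the two-drive layout are hypothetical.

```python
def oversubscription(uplink_lanes, downstream_lanes):
    """Ratio of total downstream lanes to uplink lanes (same PCIe generation assumed)."""
    return sum(downstream_lanes) / uplink_lanes


def fair_share(uplink_lanes, downstream_lanes):
    """Lanes' worth of uplink bandwidth each device gets when all are busy at once."""
    ratio = max(1.0, oversubscription(uplink_lanes, downstream_lanes))
    return [lanes / ratio for lanes in downstream_lanes]


# Hypothetical layout: two x4 NVMe drives in RAID 0 behind a x4 chipset uplink.
drives = [4, 4]
print(oversubscription(4, drives))  # 2.0
print(fair_share(4, drives))        # [2.0, 2.0]
```

In this layout each drive ends up with about half of its own link width whenever both are active, which is why striping drives behind the same uplink gains little in sequential throughput.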
 
  • Like
Reactions: Tlh97 and moinmoin

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
Why bother? You're going to have a chipset anyway. Why not just expand its functionality? AMD is after money, just like any other business that's not a non-profit. They are reserving the Threadripper for the higher end SKUs for now, but there is still a market for HEDT, just a small one. Why not go after that market with a high end chipset? It is doubtful that the boards will cost that much different than the existing TR4 boards do, at the very least due to having to route all those PCIe lanes. It will be LESS expensive to route memory channels because there will be just two of them (yes, yes, DDR5 is two sub channels each...). It will be less expensive because the VRM and power routing of the board will be simpler around the processor due to AM4/AM5 drawing a lot less. It will be more expensive due to the more expensive chipset. I think that, at worst, it's a wash.

And, yes, there certainly is the possibility of bottlenecking at the uplink for the chipset. This is why I propose that they use the x16 PCIe 4.0 link from the CPU for the uplink. That's a TON of bandwidth, and would only start to really be a concern with 3+ x16 PCIe 3.0 cards that are operating at near 100% bus utilization. We know that that's an astonishingly rare situation, unless you're intentionally running a whole bunch of 4 X M.2 NVME cards, fully populated, with RAID-0 on all drives, at maximum utilization on the regular, and if you're doing that, why are you only using an HEDT board in the first place?!?! Using the x16 as the uplink leaves the old x4 uplink free to drive an m.2 NVME port, and you still have the one that's always been there to drive a second one! Plus, AM5 seems to add the possibility of an additional x4 link for some SKUs.

The foundation of an HEDT platform is there. It's up to AMD to choose to use it.
 
  • Like
Reactions: lobz

zir_blazer

Golden Member
Jun 6, 2013
1,165
408
136
Why bother? You're going to have a chipset anyway. Why not just expand its functionality? AMD is after money, just like any other business that's not a non-profit. They are reserving the Threadripper for the higher end SKUs for now, but there is still a market for HEDT, just a small one. Why not go after that market with a high end chipset? It is doubtful that the boards will cost that much different than the existing TR4 boards do, at the very least due to having to route all those PCIe lanes. It will be LESS expensive to route memory channels because there will be just two of them (yes, yes, DDR5 is two sub channels each...). It will be less expensive because the VRM and power routing of the board will be simpler around the processor due to AM4/AM5 drawing a lot less. It will be more expensive due to the more expensive chipset. I think that, at worst, it's a wash.
You don't NEED to have a Chipset, since Zen can be used as a fully standalone SoC (Like EPYC, and EPYC Embedded/Ryzen Embedded: No Chipset). Since first Zen, there is enough built in IO to actually drive a mATX sized Motherboard with nothing else but a NIC PHY and a Super I/O. I did a Thread about that. Actually, with a slightly bigger Socket with more USBs, you could actually kiss the Chipset goodbye for mATX sized Form Factors, and maybe just add a B550/X570 level Chipset to fanout a few more devices on ATX size, and still cater to 98% or so of the userbase.
I prefer a slightly bigger Processor with dedicated ports over multiplexing them at the Chipset level. I find that suboptimal given that AMD already has almost all the IO it actually needs on the Processor package.
 

biostud

Lifer
Feb 27, 2003
18,251
4,764
136
How much extra will Zen3 give in MT compared to Zen2 in the same power envelope? Zen3 was a major success for the Ryzen platform as it finally beat Intel in gaming, which relies heavily on ST performance. But would Zen3 be a major improvement for the Threadripper platform?
 

DrMrLordX

Lifer
Apr 27, 2000
21,632
10,845
136
How much extra will zen3 give in MT compared to zen2 in the same power envelope.

It's not really hard to answer that question, since both Matisse and Vermeer obey the same power limits on AM4: 142W for 105W TDP chips and 88W for 65W TDP chips. Find a workload that allows both Matisse and Vermeer of the same core count to reach those power limits and compare performance.
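As a sketch of that comparison: the two power limits quoted above line up with the commonly cited AM4 rule of thumb that the package power limit (PPT) is 1.35 times the rated TDP. The factor is an assumption on my part, the benchmark scores are placeholders rather than measurements, and both function names are made up.

```python
# Assumption: the AM4 package power limit (PPT) is 1.35 x the rated TDP,
# which reproduces the 142 W and 88 W figures quoted above.
PPT_FACTOR = 1.35


def ppt_watts(tdp_watts):
    """Return the package power limit in watts, rounded to the nearest watt."""
    return round(tdp_watts * PPT_FACTOR)


def mt_uplift_percent(score_new, score_old):
    """Percent gain of the new chip over the old one at the same power limit."""
    return (score_new / score_old - 1) * 100


print(ppt_watts(105))  # 142
print(ppt_watts(65))   # 88

# Placeholder multi-thread scores for two same-core-count chips at the same PPT.
vermeer_score, matisse_score = 11500, 10000
print(f"{mt_uplift_percent(vermeer_score, matisse_score):.1f}% MT uplift")
```

Because both chips hit the same limit, the score ratio is also the performance-per-watt ratio, as long as the workload really does hold both at the cap.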
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
When you compare mobile chips, the MT uplift from Zen3 over Zen2 is barely there. Witness the 5700U vs. the 5800U in benchmarks. They are restricted to similar power envelopes and are in similar environments, and the 5800U has a unified CCX with double the L3 cache, yet the gains in MT benchmarks are often under 5%, and even regress in one or two cases. Zen3 was a single thread improvement at the cost of increased power draw. That power draw increase comes back to bite it in high count MT tests as it hits power and thermal limits. There's also likely a wall with respect to RAM bandwidth as both have the same capabilities there.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
You don't NEED to have a Chipset, since Zen can be used as a fully standalone SoC (Like EPYC, and EPYC Embedded/Ryzen Embedded: No Chipset). Since first Zen, there is enough built in IO to actually drive a mATX sized Motherboard with nothing else but a NIC PHY and a Super I/O. I did a Thread about that. Actually, with a slightly bigger Socket with more USBs, you could actually kiss the Chipset goodbye for mATX sized Form Factors, and maybe just add a B550/X570 level Chipset to fanout a few more devices on ATX size, and still cater to 98% or so of the userbase.
I prefer a slightly bigger Processor with dedicated ports over multiplexing them at the Chipset level. I find that suboptimal given that AMD already has almost all the IO it actually needs on the Processor package.

It has been a definite bummer that AMD hasn't exposed more of the capabilities of the Ryzen Dies to end users through a more capable socket, though, I can understand it for cost reasons. Their embedded products are quite excellent for what they provide and expose.

But, for HEDT uses on desktops, a chipset for the existing, already established AM4 socket is more than enough to cover the majority of the market. While I agree that a larger, more capable socket would be more ideal, I don't think that's even a possibility due to cost. Retaining the same socket and just repurposing the traces seems like it would be the path of least resistance.
 

Hitman928

Diamond Member
Apr 15, 2012
5,282
7,915
136
When you compare mobile chips, the MT uplift from Zen3 over Zen2 is barely there. Witness the 5700U vs. the 5800U in benchmarks. They are restricted to similar power envelopes and are in similar environments, and the 5800U has a unified CCX with double the L3 cache, yet the gains in MT benchmarks are often under 5%, and even regress in one or two cases. Zen3 was a single thread improvement at the cost of increased power draw. That power draw increase comes back to bite it in high count MT tests as it hits power and thermal limits. There's also likely a wall with respect to RAM bandwidth as both have the same capabilities there.

Laptop chip comparisons are always difficult, as power limits, cooling, and boost behavior are all model dependent. If you look at Epyc Milan vs. Rome, the Milan chips are at least 15% faster at the same TDPs with equal core counts in benches that scale across all available cores. In cases where not all cores are loaded, it is usually even faster.

Edit:
From Phoronix:
https://www.phoronix.com/scan.php?page=article&item=epyc-7003-linux-perf&num=5
7713 and 7742 both are 225W TDP chips.
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
The geometric mean improvement of like for like, 1P, Milan in a well ventilated server case was 12%. The measured IPC improvement of Zen3 over Zen2, on average, was around 12%. When allowed to stretch its legs, ZEN3 is a decent improvement. When thermally constrained and power limited, it is less impressive.
 
  • Like
Reactions: lobz

Hitman928

Diamond Member
Apr 15, 2012
5,282
7,915
136
The geometric mean improvement of like for like, 1P, Milan in a well ventilated server case was 12%. The measured IPC improvement of Zen3 over Zen2, on average, was around 12%. When allowed to stretch its legs, ZEN3 is a decent improvement. When thermally constrained and power limited, it is less impressive.

Where are you getting your data?
 

Hitman928

Diamond Member
Apr 15, 2012
5,282
7,915
136
The geometric mean is at the end of the article that YOU linked.

The average IPC is from several of the widely published zen3 reviews out there.

Geometric mean at the end of the article takes in many tests that can be memory and IO bound and shouldn't be used to compare CPU performance in an absolute way which is why I pointed specifically to the HPC section which relies heavily on actual compute power.
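To illustrate why a suite-wide geometric mean can understate a CPU-side gain, here is a minimal sketch. The ratios are invented for illustration and are NOT taken from the Phoronix article. It uses `statistics.geometric_mean`, which needs Python 3.8 or newer.

```python
from statistics import geometric_mean

# Hypothetical new/old performance ratios, not real benchmark results.
compute_bound = [1.18, 1.20, 1.16]    # tests that scale with core throughput
memory_io_bound = [1.02, 1.05, 1.00]  # tests limited by memory or storage

overall = geometric_mean(compute_bound + memory_io_bound)
compute_only = geometric_mean(compute_bound)

print(f"whole suite:  +{(overall - 1) * 100:.1f}%")
print(f"compute only: +{(compute_only - 1) * 100:.1f}%")
```

Every test that is not limited by the CPU pulls the suite-wide figure toward 1.0, so the two numbers answer different questions: how much faster the whole platform is on that test mix, versus how much faster the cores are.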

IPC is more like 19%, right in line with what AMD advertised; I don't know where you're getting 12% from. Zen3 does use more power with the increase in IPC, which has been known since the models were announced because of the slightly lower base clocks. The actual increase, even in power/thermal constrained environments, is not 0-5% as you indicated, though. It is more like 15%, with an obviously wide range depending on workload.

Edit:

Quote from AnandTech's Zen3 review article:
IPC wise, looking at a histogram of all SPEC workloads, we’re seeing a median of 18.86%, which is very near AMD’s proclaimed 19% figure, and an average of 21.38% - although if we discount libquantum that average does go down to 19.12%. AMD’s marketing numbers are thus pretty much validated as they’ve exactly hit their proclaimed figure with the new Zen3 microarchitecture.
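The gap between the median and the average in that quote is what a single outlier does to a mean. A small sketch with invented per-workload gains (these are not the SPEC results from the review):

```python
from statistics import mean, median

# Hypothetical per-workload IPC gains in percent; the last entry plays the outlier.
gains = [12.0, 15.0, 18.0, 19.0, 20.0, 22.0, 60.0]

print(f"median:                  {median(gains):.2f}%")
print(f"average:                 {mean(gains):.2f}%")
print(f"average without outlier: {mean(gains[:-1]):.2f}%")
```

The median barely reacts to the one large value while the average is dragged upward, which is why the review reports the average both with and without libquantum.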
 

LightningZ71

Golden Member
Mar 10, 2017
1,627
1,898
136
Where are most of those gains though? Zen3's biggest gains were in AVX2, especially AES operations, and single threaded benchmarks that didn't live well in Zen2's smaller CCX L3. Not everything lives in the AVX2 world, and in highly threaded, low crosstalk situations, the 8 core CCX isn't really a factor. If you check ServeTheHome's review, they note the trade off of higher power draw for higher clocks in like for like parts as well.

Yes, in the HPC loads, you saw bigger uplifts. In everything else, where a lot of the rest of the market lives, the gains were more modest and came from improvements in memory throughput due to clock sync improvements, and other tweaks outside of the core.

I'm not taking a dump on Zen3. It's certainly a nice improvement over Zen2, but some of its biggest gains are quite situational, and in desktop and mobile situations, its MT gains are far more modest.