Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


H433x0n

Golden Member
Mar 15, 2023
1,209
1,572
96
Coding games, not playing them ^
It’d absolutely work fine on a Zen 5 CPU that has both a normal CCD and a CCD with dense cores.

There are a ton of games that utilize e-cores with great performance (The Last of Us & Jedi Survivor are recent ones). Both of those games have e-cores running tasks that aren’t latency critical (streaming assets, decompression, etc). In a modern multithreaded game, each thread is given a priority and then handled by the scheduler appropriately.
 

inf64

Diamond Member
Mar 11, 2011
3,865
4,549
136
It’d absolutely work fine on a Zen 5 CPU that has both a normal CCD and a CCD with dense cores.

There are a ton of games that utilize e-cores with great performance (The Last of Us & Jedi Survivor are recent ones). Both of those games have e-cores running tasks that aren’t latency critical (streaming assets, decompression, etc). In a modern multithreaded game, each thread is given a priority and then handled by the scheduler appropriately.
The thing with Zen5c is that it's the same IPC core with less L3, running at a lower clock. E-cores, on the other hand, are way weaker than P-cores IPC-wise, while also having less cache and running at lower clocks. Zen5 hybrid chips would run MTed games much better than any Intel hybrid chip would.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,102
136
The thing with Zen5c is that it's the same IPC core with less L3, running at a lower clock. E-cores, on the other hand, are way weaker than P-cores IPC-wise, while also having less cache and running at lower clocks. Zen5 hybrid chips would run MTed games much better than any Intel hybrid chip would.
Depends entirely on how ST vs MT sensitive those lower priority tasks are. And for that matter, how Intel's E-cores vs AMD's dense cores evolve over time.
 

H433x0n

Golden Member
Mar 15, 2023
1,209
1,572
96
The thing with Zen5c is that it's the same IPC core with less L3, running at a lower clock. E-cores, on the other hand, are way weaker than P-cores IPC-wise, while also having less cache and running at lower clocks. Zen5 hybrid chips would run MTed games much better than any Intel hybrid chip would.
I'm not making a claim that the Intel e-cores are more performant. I'm just saying that having a CCD with Zen 5C cores wouldn't wreck its ability to be a solid gaming CPU since there are games out right now that can take advantage of a heterogeneous CPU.
 
  • Like
Reactions: Tlh97 and inf64

DrMrLordX

Lifer
Apr 27, 2000
22,065
11,695
136
Why would it close that door? There's nothing about gaming that demands the same performance on every thread, and there are even some games today that will make use of Intel's E-cores.
If the 7950X3D has taught us anything, it's that allowing threads to stray onto the slower CCD makes the entire game slower. If there are 16-32 dense cores running at half the clocks (or worse) as the main CCD, the last thing you're gonna want is any game from 2023 mistakenly using that second CCD! Unless something major changes soon in how game engines work, I don't expect that situation to change much.
 
  • Like
Reactions: Tlh97

moinmoin

Diamond Member
Jun 1, 2017
5,064
8,032
136
The faster CCD won't reach max frequency when all cores are maxed out though. Ideally at that point the slower CCD fits right in, likely managing a higher frequency for additional cores, possibly even for the same amount of cores in case of dense Zen.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,102
136
If the 7950X3D has taught us anything, it's that allowing threads to stray onto the slower CCD makes the entire game slower. If there are 16-32 dense cores running at half the clocks (or worse) as the main CCD, the last thing you're gonna want is any game from 2023 mistakenly using that second CCD! Unless something major changes soon in how game engines work, I don't expect that situation to change much.
Huh? The 7950X3D performs sub-optimally when either the game's performance critical threads are split across multiple CCDs (die to die communication overhead), or the game is statically bound to the wrong CCD. Changing one die to dense cores doesn't impact that at all. You still want the game primarily bound to one CCD, and it's actually simpler as that one is clearly faster in ST under all circumstances.

And in practice, the issue you describe doesn't happen on Intel hybrid systems, despite being rather more complex. You don't see any game inexplicably running on E-cores, do you? No reason to believe AMD would have it worse.

Moreover, the top SKUs are not really for gaming. If you just want to game, get the 9800X3D or whatever and call it a day. The top SKUs are good for productivity, and E-peen, both of which will benefit from a hybrid approach.
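
As a rough illustration of the "keep the game primarily bound to one CCD" point above, here is a minimal Python sketch that works out which logical CPUs belong to one CCD and optionally restricts a process to them. The topology is an assumption made for the example (8 cores per CCD, 2 threads per core, logical CPUs numbered contiguously per CCD); real enumeration differs between operating systems and chips. `ccd_cpus` and `pin_to_ccd` are hypothetical helper names, and the pinning part only does anything where Python exposes `os.sched_setaffinity` (Linux).

```python
import os


def ccd_cpus(ccd_index, cores_per_ccd=8, threads_per_core=2):
    """Return the logical CPU ids assumed to belong to one CCD.

    Assumption: logical CPUs are numbered contiguously, CCD by CCD,
    and every CCD has the same core count. Real systems may differ.
    """
    per_ccd = cores_per_ccd * threads_per_core
    start = ccd_index * per_ccd
    return set(range(start, start + per_ccd))


def pin_to_ccd(pid, ccd_index):
    """Restrict a process to one CCD. Returns False if nothing was changed."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # affinity API not exposed on this platform
    # Only keep CPUs that this machine actually lets us run on.
    wanted = ccd_cpus(ccd_index) & os.sched_getaffinity(0)
    if not wanted:
        return False
    os.sched_setaffinity(pid, wanted)
    return True
```

Calling `pin_to_ccd(0, 0)` would confine the current process to the first CCD under these assumptions. This is the manual version of what a scheduler or a tool like Process Lasso does, not a description of how Windows actually handles it.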
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
26,171
15,326
136
Huh? The 7950X3D performs sub-optimally when either the game's performance critical threads are split across multiple CCDs (die to die communication overhead), or the game is statically bound to the wrong CCD. Changing one die to dense cores doesn't impact that at all. You still want the game primarily bound to one CCD, and it's actually simpler as that one is clearly faster in ST under all circumstances.

And in practice, the issue you describe doesn't happen on Intel hybrid systems, despite being rather more complex. You don't see any game inexplicably running on E-cores, do you? No reason to believe AMD would have it worse.

Moreover, the top SKUs are not really for gaming. If you just want to game, get the 9800X3D or whatever and call it a day. The top SKUs are good for productivity, and E-peen, both of which will benefit from a hybrid approach.
The 13900k, for example, cannot touch my 7950x's in productivity for what I do. Not to mention the 7950x can use avx-512, which I also use. Your comment only applies to the types of applications that the hybrid approach works better on. NOT ALL PRODUCTIVITY APPS. Please keep your Intel specific comments in Intel threads.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,102
136
The 13900k, for example, cannot touch my 7950x's in productivity for what I do. Not to mention the 7950x can use avx-512, which I also use. Your comment only applies to the types of applications that the hybrid approach works better on. NOT ALL PRODUCTIVITY APPS. Please keep your Intel specific comments in Intel threads.
There is nothing Intel-specific about what I wrote. I've shown you benchmarks before, and will not waste the time to repeat myself. Productivity apps make good use of hybrid, and there's no reason to believe that would not apply to an AMD hybrid offering.

But clearly you just wanted an excuse to go on your typical rant. I'd be shocked if you could go a single comment without mentioning your obsessive hatred of Intel.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,102
136
Only it isn't just the game being statically bound to the wrong CCD, it can also be the OS allowing threads to wander between CCDs.
That's just a scheduling problem, and a more difficult one for the 7950X3D than for a theoretical hybrid chip at that. Having two CCDs which are both best at ST, just under different workload conditions, is much harder to deal with than knowing that one will always be faster. We see that in practice games never just get pinned to the E-cores on Intel hybrid chips, so I think Windows scheduling is already where it needs to be in that regard.
If even a few threads wander to a high-density, low-clock CCD on a Zen5 desktop chip then performance could crater.
As mentioned, it shouldn't be any worse than what we see today. If the OS allows a thread to wander, it would presumably be a perf-insensitive one. The hit from the other CCD seems to be more about the cross-CCD latency than clock speed. We might get one or two games that need patching, but I doubt it'd be anything significant.
Traditionally they have been in the x86 market.
Yes, the top SKU usually puts out the top gaming numbers, but the difference is almost entirely clock speed, which can be normalized by overclocking, or AMD just making different SKU decisions. Either way, the gap is negligible vs a single CCD 8c chip. If you have a 4090 and money to burn, sure, go for it, but for your average gamer, even enthusiast, grab the 8c X3D and you'll never have to worry about it. Though I am assuming that AMD fixes the v-cache frequency penalty sooner or later.
If AMD joins Intel in the box of "8c is enough" and instead encourages the proliferation of many slow cores in one form or another, then it's going to discourage a certain amount of innovation in that regard.
Is that not the current reality anyway? The 8c CCD boundary is a very similar problem to Intel's 8 P-core limitation. Long term, I'm sure game devs will find ways to use all that compute sitting idle. And when you think about it, density vs peak performance might be an appealing tradeoff for consoles...
 
  • Like
Reactions: Tlh97 and moinmoin

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
26,171
15,326
136
There is nothing Intel-specific about what I wrote. I've shown you benchmarks before, and will not waste the time to repeat myself. Productivity apps make good use of hybrid, and there's no reason to believe that would not apply to an AMD hybrid offering.

But clearly you just wanted an excuse to go on your typical rant. I'd be shocked if you could go a single comment without mentioning your obsessive hatred of Intel.
Since AMD has no hybrid chips, you must be talking about Intel. And with Bergamo, they have made it clear that they will NOT be going with a hybrid approach.

As far as my "hatred of Intel" goes, that's not true. I just would never buy or even use their chips vs AMD (currently) as they are slower and use more power. Those are facts, not opinion.
 

eek2121

Diamond Member
Aug 2, 2005
3,100
4,398
136
They have not
AMD did, in fact, say that Bergamo was server only. They also did say they would not be taking the same approach as Intel. Neither of those statements rules out a hybrid approach, but the few serious rumors I have seen thus far indicate AMD is sticking with 8-16 cores on the desktop. They are even reusing the naming scheme.

Best guess for Zen 5 9000 series (desktop):

Healthy IPC increase, same clocks or slight clock regression, decent performance uplift of around 15-25% at similar power limits. Improved GPU. AI Instructions. Better power management. New features.

Motherboard chipsets and BIOS maturity much improved.

Mobile? Potentially a different story. Mobile could very well be a monolithic hybrid design. 4-8 high frequency cores, 8-16 lower clocking small cores.

This is also the only scenario that doesn’t drastically increase costs for AMD, and mobile could use more cores for both efficiency and performance reasons.

On the desktop you have a 170W power limit, so 16 big cores works great. You could never do that effectively with a 15-30W power limit, however, so on mobile we could see hybrid. The leaks also line up best with this scenario.

Oh and a bold prediction: Arrow Lake will be a huge step up for Intel in terms of perf/watt and will be an okay perf uplift, but Zen 5 will be faster thanks to a healthy IPC improvement.

Regarding your comments about @Markfw , uncalled for. IMO. Many of us are pretty neutral about who makes our hardware, as long as it performs great. Intel is significantly trailing AMD in perf/watt and so many of us won’t use them because of that. If they flipped tomorrow and started winning that race again, or they released a chip that significantly outperformed AMD, many of us would probably buy it. Same reason I bought a 4090: NVIDIA has a better product at the high end. AMD needs to step up the GPU game if they want to win me over.

Finally, sometimes folks here are super pessimistic about Intel’s ability to execute because they have overpromised and under-delivered so many times. There is nothing wrong with that. Results are proven with actions, not words.
 

Geddagod

Golden Member
Dec 28, 2021
1,296
1,368
106
AMD did, in fact, say that Bergamo was server only.
No one here is contesting that?
They also did say they would not be taking the same approach as Intel.
Yes.
Neither of those statements rules out a hybrid approach,
So what @Markfw said was false, and was exactly what I disagreed with. Nothing about Bergamo has shown that AMD is not going with a hybrid approach. Does Intel's Sierra Forest prove that Intel isn't going with a hybrid approach with Arrow Lake, then?
AMD has not made it clear they ruled out a hybrid approach.
A lot of news sites are clickbaiting (cough wccftech) readers with this in their titles- but let's look at the exact quote from AMD:
"One is the notion that P-Cores and E-Cores that the competition uses is not the approach that we plan on taking at all"
Well that makes it cut and dry right? Look at the sentence right after it:
" Because I think the reality is that when you get to the point of having core types with different ISA capabilities or IPC or things like that, it makes it very complicated to ensure that the right workloads are scheduled on the right cores, consistently"
Different ISA and different IPC are both problems AMD avoids with Zen 4C. So it's obvious what AMD is referring to by the statement that they won't use "P+E-cores like the competition" isn't a nod to the use of P+E cores together in the products, but rather the development and design of the P+E cores themselves.
AMD mentions that using P+E cores in desktop is a harder argument to make, but there are more obvious places, and places where we will see a hybrid approach more quickly - laptops.
Mobile? Potentially a different story. Mobile could very well be a monolithic hybrid design. 4-8 high frequency cores, 8-16 lower clocking small cores.
Why wait for Zen 5? Apparently Zen 4 mobile will have Zen 4C cores as well.
Arrow Lake will be a huge step up for Intel in terms of perf/watt and will be an okay perf uplift, but Zen 5 will be faster thanks to a healthy IPC improvement.
That igor leak for ARL makes it look like Zen 5 will be a good 10-20% faster in ST and MT
Regarding your comments about @Markfw , uncalled for.
All I said was "lol" :)
But I suggest you look through Markfw's comment history on Intel threads or about Intel.
Finally, sometimes folks here are super pessimistic about Intel’s ability to execute because they have overpromised and under-delivered so many times. There is nothing wrong with that. Results are proven with actions, not words.
There's a difference between pessimism and outright fanboying, and I suspect many of us here on Anandtech noticed Markfw cross that line... numerous times, in a repetitive fashion that has become recognizable as a pattern.
But, in respect to both Markfw and Anandtech's rules (as I very recently got a ban hammer for a day for cussing lol) I won't talk about his antics beyond this response for a bit.
 

Abwx

Lifer
Apr 2, 2011
11,557
4,349
136
There's a difference between pessimism and outright fanboying, and I suspect many of us here on Anandtech noticed Markfw cross that line... numerous times, in a repetitive fashion that has become recognizable as a pattern.
But, in respect to both Markfw and Anandtech's rules (as I very recently got a ban hammer for a day for cussing lol) I won't talk about his antics beyond this response for a bit.

You should look within a longer time frame, at a time he was using exclusively Intel CPUs for years and not a single AMD one IIRC...
 
  • Like
  • Love
Reactions: Ajay and Markfw

moinmoin

Diamond Member
Jun 1, 2017
5,064
8,032
136
That's a decision made by AMD when establishing clockspeed and power targets for each CCD under specific utilization scenarios.
How did you manage to bring Intel's outdated turbo tables into a Zen thread? The only frequency hardcoded in today's Zen chips should be the upper limit, everything else depends on cooling headroom.
 

DrMrLordX

Lifer
Apr 27, 2000
22,065
11,695
136
Is that not the current reality anyway? The 8c CCD boundary is a very similar problem to Intel's 8 P-core limitation. Long term, I'm sure game devs will find ways to use all that compute sitting idle. And when you think about it, density vs peak performance might be an appealing tradeoff for consoles...

Intel has introduced this problem by going with an 8P core limit, instead of chasing higher counts of P cores the way AMD did up through the 7950X. If AMD takes the same path then there will be no real alternative. For the foreseeable future, we'll be stuck on 8 main cores and then low clock/low performance core spam. AMD has already leaned into that approach with the 7950X3D. Now whether you think game+scheduler will have a harder time balancing threads on a 7950X3D than on a hypothetical Zen5 with dense cores is purely speculation, but I do think having a dense CCD with low clocks could be just as problematic since it will have the same ISA etc. and will be indistinguishable from the main CCD except when it comes to clocks. So the scheduler won't necessarily know what's going on until it teases a few threads onto CCD1. That could lead to the infamous ping-ponging of threads moving between CCDs.

Hopefully Zen5 will do more to address interconnect speeds/latencies, making cache locality less of an issue for inter-CCD workloads (such as future games needing/wanting more than 8c).

How did you manage to bring Intel's outdated turbo tables into a Zen thread?

I didn't. See below.

The only frequency hardcoded in today's Zen chips should be the upper limit, everything else depends on cooling headroom.

Based on what we know from messing with PBO + Curve Optimizer, AMD has the capability to set whatever power/clockspeed target they want on a per-core or per-CCD basis. There's a lot of room for fine-grain control here. If AMD wanted to keep CCD0's clockspeed target high(er) when utilizing CCD1 then they could probably do so, at the cost of clocks on CCD1. Out-of-the-box they don't do so, and they still haven't offered a "gamer" mode where they attempt such behavior (instead, they let you just disable the second CCD). Again there's no free lunch, but they certainly have the controls available to shift power budget towards CCD0. Of course taking that behavior too far could lead to the same things I'm speculating could happen on future Zen5 products with a dense CCD: CCD1's clocks could become low enough that moving threads there would tank overall performance. In which case AMD might be better off raising power budget a bit for gaming workloads, but only for CCD0. As it stands, most 2 CCD Zen CPUs do not reach their full power budget when running games.
 
  • Like
Reactions: Tlh97

moinmoin

Diamond Member
Jun 1, 2017
5,064
8,032
136
Based on what we know from messing with PBO + Curve Optimizer, AMD has the capability to set whatever power/clockspeed target they want on a per-core or per-CCD basis. There's a lot of room for fine-grain control here. If AMD wanted to keep CCD0's clockspeed target high(er) when utilizing CCD1 then they could probably do so, at the cost of clocks on CCD1. Out-of-the-box they don't do so, and they still haven't offered a "gamer" mode where they attempt such behavior (instead, they let you just disable the second CCD). Again there's no free lunch, but they certainly have the controls available to shift power budget towards CCD0. Of course taking that behavior too far could lead to the same things I'm speculating could happen on future Zen5 products with a dense CCD: CCD1's clocks could become low enough that moving threads there would tank overall performance. In which case AMD might be better off raising power budget a bit for gaming workloads, but only for CCD0. As it stands, most 2 CCD Zen CPUs do not reach their full power budget when running games.
I think we are talking about different things. The part you are talking about is essentially binning: of course different cores and different CCDs have different qualities. Of course they are set accordingly; you don't want processes to end up on weak cores or the weak CCD if better cores/CCD are available.

What I was talking about is different voltage/frequency curves. Mobile cores, and likely also the dense Zen cores, have a lower voltage/frequency curve that ends in a wall at a rather low frequency: going any higher kills all efficiency, but below that point essentially all frequencies are more efficient (Intel's E-cores showcase the same behaviour). Normal Zen cores have a higher voltage/frequency curve with a gentler slope at the upper end, making higher frequencies somewhat more feasible. On desktop those higher frequencies are mostly used. But if all cores of a strong CCD try to max out their frequency, power consumption (= heat) can hit the upper limit instead, throttling the frequency of all cores again. Depending on how both curves turn out, it's possible that, given a specific TDP, a non-dense CCD with all 8 cores loaded may well run at a lower frequency than 8 cores of a dense CCD. That's all.
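
To put toy numbers on the curve argument above, here is a small Python sketch. Everything in it is an assumption made up for illustration, not measured data: per-core power is modelled as k * V^2 * f, each core type gets a linear voltage/frequency relation, and the "wall" is approximated as a hard frequency cap. The coefficients for the hypothetical `CLASSIC` and `DENSE` cores are invented, and `max_freq` is just a helper name for this example.

```python
from dataclasses import dataclass

K = 1.6  # invented switching-power factor, W per (V^2 * GHz), same for both cores


@dataclass(frozen=True)
class CoreType:
    v0: float      # voltage intercept in volts (invented)
    slope: float   # extra volts needed per GHz (invented)
    f_max: float   # frequency wall in GHz (invented)

    def power(self, f_ghz):
        """Toy per-core power in watts at a given frequency."""
        volts = self.v0 + self.slope * f_ghz
        return K * volts * volts * f_ghz


# Hypothetical curves: the dense core needs less voltage but hits a wall early.
CLASSIC = CoreType(v0=0.70, slope=0.10, f_max=5.5)
DENSE = CoreType(v0=0.55, slope=0.12, f_max=3.6)


def max_freq(core, n_cores, budget_w, step=0.01):
    """Highest all-core frequency (GHz) that fits in the power budget."""
    f = core.f_max
    while f > 0 and n_cores * core.power(f) > budget_w:
        f -= step  # walk down from the wall until the budget is met
    return round(max(f, 0.0), 2)


if __name__ == "__main__":
    for budget in (30, 120):
        print(budget, "W:",
              "classic", max_freq(CLASSIC, 8, budget),
              "dense", max_freq(DENSE, 8, budget))
```

With these made-up curves, the 8 dense cores sustain a higher all-core clock than the 8 classic cores at the tight budget, while at the generous budget both sit at their caps and the classic CCD is well ahead. Whether real Zen 5 and Zen 5c curves cross like this is exactly the open question.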
 

dacostafilipe

Senior member
Oct 10, 2013
772
244
116
Nothing about Bergamo has shown that AMD is not going with a hybrid approach

Let's say they put Zen4 and Zen4c in an APU. Would it really be hybrid? Isn't "the hybrid" we talk about more of something like ARM big.LITTLE? If the cores are 100% similar in terms of functionality but only differ in terms of performance (frequency, cache, ...) could we really call it hybrid? We already have CPUs where single cores boost/perform better than others ... where does hybrid start?
 

dr1337

Senior member
May 25, 2020
417
691
136
We see that in practice games never just get pinned to the E-cores on Intel hybrid chips, so I think Windows scheduling is already where it needs to be in that regard.
There are many benchmarks for games that run a lot better with E-cores off, WoW being a very prominent example. Frankly I've seen nothing good to be said by either vendor about their relationship with Windows thread scheduling. I am not sure why you keep posting this rhetoric that there is some AMD scheduling problem when it literally crops up in Intel chips all the time too.

Here's the deal: even with the few games that hate the 7950x3d, the v-cache 16c chip is still faster on average in games than the base 7950x. Just as Intel has better performance with e-cores 95% of the time, so does AMD with the asymmetric CCDs. I think there legitimately would be a customer base for a 7990x3d/8990x3d with 8+16 with v-cache on the 8c die only, even if a handful of games need Process Lasso for maximum performance.

The people considering a 24c/32t 14/13900k are most likely looking for raw thread count first and foremost, with strong gaming performance second. AMD could sell 48 threads in a halo gaming SKU, and I have zero doubt it would chart better than a normal 8+8 chip even in games, all the while being an MT monster.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,102
136
There are many benchmarks for games that run a lot better with E-cores off
That is certainly not what Raptor Lake benchmarks show.
I am not sure why you keep posting this rhetoric that there is some AMD scheduling problem when it literally crops up in Intel chips all the time too.
Huh? I was explaining why hybrid would not be an issue for gaming, if AMD decides to go that direction. I think 8+16 would be the natural evolution for AMD's halo product.