Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


Abwx

Lifer
Apr 2, 2011
11,884
4,873
136
My fault, I should always take AMD marketing at their word. They have surely never lied before!
Top tier cope smh.

I suggest that you keep the thread free of what is obviously some bad feeling of yours against AMD.

For the time being, these are the available numbers; if you have others that contradict them, post them. In the meantime, don't expect any other answer from me.
 

Geddagod

Golden Member
Dec 28, 2021
1,524
1,620
106
I suggest that you keep the thread free of what is obviously some bad feeling of yours against AMD.

For the time being, these are the available numbers; if you have others that contradict them, post them. In the meantime, don't expect any other answer from me.
🤡
How is saying "wait for official reviews, don't believe first-party slides" controversial now?
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,881
3,311
146
How many Zen 4 or Zen 5 products will actually run at 1 or 2 watts per core? Maybe some handhelds and some ultra-thin-and-lights? The 80-110 watt results will be the most interesting.
At ~35 watts of core power, if you run EXPO or tune your RAM you need about 65-70 W PPT to reach even that level. We're only getting this much out of Zen 4/5 because of JEDEC speeds/stock settings/low FCLK.
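To make the arithmetic explicit, here's a minimal sketch of that point, assuming a hypothetical ~27 W of SoC/uncore power at EXPO/high-FCLK settings (that figure is my assumption, not a measured value):

```python
# Minimal sketch: how much of the PPT budget is left for the cores once the
# uncore (SoC/IO die) takes its share. The ~27 W uncore figure is an assumed
# placeholder for EXPO/high-FCLK operation, not a measured value.
def core_power_budget(ppt_w: float, uncore_w: float = 27.0) -> float:
    """Return the wattage left over for the CPU cores out of a given PPT."""
    return max(ppt_w - uncore_w, 0.0)

for ppt in (65, 70, 88):
    cores = core_power_budget(ppt)
    print(f"PPT {ppt:>3} W -> ~{cores:.0f} W for the cores "
          f"(~{cores / 8:.1f} W per core on an 8-core part)")
```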
 

Geddagod

Golden Member
Dec 28, 2021
1,524
1,620
106
I will say this though: after the recent slide release, it looks like Zen 5 has, overall, the most changes and the largest core-structure size uplift of any generation since Zen 1.
And yet the performance gains are whelming, and perf/watt still doesn't look that great either...
Whether this is due to diminishing returns or falls on the design team, who knows, but it appears M4's P-core is still the winner of the 2024 core wars.
I would like to see Zen 5C on N3 vs M4 though, but since Zen 5C on N3 is only available in Turin Dense, I suspect we will pretty much never get the data we actually want to see.
 

Saylick

Diamond Member
Sep 10, 2012
4,033
9,454
136
I will say this though: after the recent slide release, it looks like Zen 5 has, overall, the most changes and the largest core-structure size uplift of any generation since Zen 1.
And yet the performance gains are whelming, and perf/watt still doesn't look that great either...
Whether this is due to diminishing returns or falls on the design team, who knows, but it appears M4's P-core is still the winner of the 2024 core wars.
I would like to see Zen 5C on N3 vs M4 though, but since Zen 5C on N3 is only available in Turin Dense, I suspect we will pretty much never get the data we actually want to see.
Based on the Mike Clark interview, it appears they widened the core but it’s not fully optimized, so while on paper there should be more improvement, it’s not realized just yet.
 

exquisitechar

Senior member
Apr 18, 2017
722
1,019
136
Wow, that's extremely surprising... are we 100% sure it's correct? Lol. I haven't seen it reported elsewhere...
I heard a long while ago that it's N4X, and that that's the case because N4X is different from TSMC's previous X nodes and far more viable. Not completely sure it's true, though.
 

Geddagod

Golden Member
Dec 28, 2021
1,524
1,620
106
Based on the Mike Clark interview, it appears they widened the core but it’s not fully optimized, so while on paper there should be more improvement, it’s not realized just yet.
They said they will optimize it with Zen 6, so we will see. I wonder, though, whether the node itself will bring any real perf/watt gains, since N4P already seems to be pretty close to N3 in perf/watt, except maybe at the very, very low end of the curve.
I heard a long while ago that it's N4X, and that that's the case because N4X is different from TSMC's previous X nodes and far more viable. Not completely sure it's true, though.
It's very, very likely it's not N4X.
 

StinkyPinky

Diamond Member
Jul 6, 2002
6,973
1,276
126
So I'm confused. AMD themselves said the 7800X3D would still be the best gaming CPU, but now the leaks suggest the 9700X is on par and clearly superior elsewhere. So why would AMD undersell their own new CPUs? There must be more to the story; the full reviews can't come soon enough.
 

gdansk

Diamond Member
Feb 8, 2011
4,567
7,679
136
So I'm confused. AMD themselves said the 7800X3D would still be the best gaming CPU, but now the leaks suggest the 9700X is on par and clearly superior elsewhere. So why would AMD undersell their own new CPUs? There must be more to the story; the full reviews can't come soon enough.
As usual it will depend on the selection of games.

And now they may be cherry picking.
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,692
136
So I'm confused. AMD themselves said the 7800X3D would still be the best gaming CPU, but now the leaks suggest the 9700X is on par and clearly superior elsewhere. So why would AMD undersell their own new CPUs? There must be more to the story; the full reviews can't come soon enough.
There will be games where the X3D has a big advantage because the game is very cache-sensitive. Those games will drag the average down quite a bit.
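As a rough illustration of that averaging effect (the numbers below are invented, not leaked benchmark results):

```python
# Hypothetical per-game ratios (7800X3D vs 9700X, >1.0 favors the X3D).
# Most titles roughly tie; two cache-sensitive outliers strongly favor X3D.
from math import prod

ratios = [1.00, 1.01, 0.99, 1.02, 1.00, 1.25, 1.30]
geomean = prod(ratios) ** (1 / len(ratios))
print(f"Overall geomean advantage for the X3D: {(geomean - 1) * 100:.1f}%")
# A couple of big outliers turn an otherwise even matchup into a clear
# average win, even though most individual games are a wash.
```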
 

Geddagod

Golden Member
Dec 28, 2021
1,524
1,620
106
No good can come of AMD referring to their AVX blocks as "AI datapaths." Ew ew ew.
be so fcking fr dude, I noticed that too 💀
So I'm confused. AMD themselves said the 7800X3D would still be the best gaming CPU, but now the leaks suggest the 9700X is on par and clearly superior elsewhere. So why would AMD undersell their own new CPUs? There must be more to the story; the full reviews can't come soon enough.
I'm guessing they will be around the same: ARL roughly ties Zen 5, and Zen 5 X3D clean sweeps with a ~15-20% lead.
I wanna know if the clustered decode of Zen 5 is still useful without SMT?
As @coercitiv said, yes, they actually answered that in the interview; imma SS Chips and Cheese's article here:
[Screenshot from the Chips and Cheese interview article]
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,692
136
Quick question: the guy who tested the Strix Point ES and noted structure reductions (notably the uOP cache), do we know if this is true? The slide for Zen 5 calls out a 6K figure, which matches Zen 4. I wonder how he came to the conclusion that the Strix version had a reduction in that structure?

edit: It seems that Zen 4 has a 6,912-op size, which is indeed a bit better than Zen 5's 6K
 

yuri69

Senior member
Jul 16, 2013
677
1,215
136
Based on the Mike Clark interview, it appears they widened the core but it’s not fully optimized, so while on paper there should be more improvement, it’s not realized just yet.
o_O
AMD's Zen line has been running two leapfrogging teams to produce pairs of cores: a new design and its refinement. Zen 1 + Zen 2; Zen 3 + Zen 4; Zen 5 + Zen 6. So producing an initial core implementing new ideas, followed by its close refinement, has been the way AMD has operated since Zen 1.

However, in this case the new design is rather weak:
* Zen 1 scored a 52% IPC gain, following the low-IPC Family 15h.
* Zen 3 scored 19% with a modest core-area investment.
* Zen 5 scores 16% following the previously weakest generational IPC gain, while blowing up the core area quite a bit and being released 21 months after its predecessor.

It's sad AMD went all-in on AVX-512 with Zen 5. Zen 4's approach could have stuck with us for more than a single generation.
 
  • Like
Reactions: exquisitechar

dttprofessor

Member
Jun 16, 2022
163
45
71
be so fcking fr dude, I noticed that too 💀

I'm guessing they will be around the same: ARL roughly ties Zen 5, and Zen 5 X3D clean sweeps with a ~15-20% lead.

As @coercitiv said, yes, they actually answered that in the interview; imma SS Chips and Cheese's article here:
[Screenshot from the Chips and Cheese interview article]
When every thread is in use, each thread gets just one 4-wide decoder (going from 2×4 to 1×4), so shouldn't per-thread IPC decline?
 

SarahKerrigan

Senior member
Oct 12, 2014
735
2,036
136
When every thread is in use, each thread gets just one 4-wide decoder (going from 2×4 to 1×4), so shouldn't per-thread IPC decline?

Cumulative iso-clock perf will improve. That's the point of SMT.

Per-thread iso-clock perf will be lower for an SMT thread than for a single thread. That is not some amazing revelation. It's a fact of effectively all SMT implementations.
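A toy model of that trade-off, with a made-up stall probability rather than anything measured on Zen 5:

```python
# Toy model of the SMT trade-off described above: each thread runs slower
# than it would alone, but the two together get more done per cycle.
# The 40% stall chance is an invented number, not a Zen 5 measurement.
import random

rng = random.Random(42)
CYCLES = 100_000
STALL_P = 0.4  # probability a thread has no work ready in a given cycle

# One thread alone: it uses the issue slot whenever it isn't stalled.
alone = sum(1 for _ in range(CYCLES) if rng.random() > STALL_P)

# Two SMT threads sharing the same slot: any ready thread may use it.
t0 = t1 = 0
for _ in range(CYCLES):
    ready0, ready1 = rng.random() > STALL_P, rng.random() > STALL_P
    if ready0 and ready1:
        # Both ready: arbitrate (randomly here, for simplicity).
        if rng.random() < 0.5:
            t0 += 1
        else:
            t1 += 1
    elif ready0:
        t0 += 1
    elif ready1:
        t1 += 1

print(f"single thread:  {alone / CYCLES:.2f} work/cycle")
print(f"SMT per-thread: {t0 / CYCLES:.2f} and {t1 / CYCLES:.2f}")
print(f"SMT combined:   {(t0 + t1) / CYCLES:.2f} work/cycle")
```

The per-thread numbers come out lower than the single-thread case, while the combined number comes out higher, which is exactly the distinction being made above.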
 

Saylick

Diamond Member
Sep 10, 2012
4,033
9,454
136
Quick question: The guy who tested Strix Point ES and noted structure reductions (notably uOP cache), do we know if this is true ? The slide for Zen 5 calls out 6K figure ,which matches Zen 4. I wonder how he got to the conclusion that Strix version had a reduction of that structure?

edit: It seems that Zen 4 has 6,912 Ops size, which is indeed a bit better than Zen 5's 6K
C&C measured Zen 4's mop cache as 6.75k entries. Reducing it to 6k isn't a crazy reduction; seems like with a better decoder you don't have to rely on the mop cache as heavily. Also, if you can grab more mops from the mop cache, perhaps you don't need to keep so many stored in-flight?

Via C&C:
George Cozma: Now that brings me down to the micro-op cache. Again, in your diagrams you show 2 by 6 wide for the micro-op cache. Now in our conversations you have said that it’s a dual ported 6 wide op cache. What exactly does that mean for the throughput at any given time?

Mike Clark: So, it means that you know in in the best-case scenario we’re accessing the op cache with two fetch addresses and they both hit and we pull out the maximum we can pull out for any hit is 6 instructions and that if they both hit we can then deliver 12 instructions in one cycle out of the op cache.

Now we can’t always build 6 per entry. We don’t always hit or have them properly aligned so they can always hit in the op cache. So that’s why we actually, if you think about it, we’re 8 wide dispatch and be like, well, why would you grab 12 [ops] if you can only dispatch 8 [ops] but 12 [ops] is, you know the maximum and so it has to be a balance point that we pull more than we can, because sometimes we are inefficient and we can’t get all the instructions we want.
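As a hedged toy model of the balance Mike Clark describes (the hit pattern below is invented, not AMD's real behavior): over-fetching up to 12 mops on good cycles buffers an 8-wide dispatch through the lean cycles.

```python
# Toy model: why pulling up to 12 mops/cycle from the op cache helps keep an
# 8-wide dispatch fed, even though 12 > 8. Hit pattern is invented.
import random

def simulate(max_fetch: int, cycles: int = 100_000, seed: int = 1) -> float:
    rng = random.Random(seed)
    queue = dispatched = 0
    for _ in range(cycles):
        # Both op-cache ports hit with full entries -> 12 mops; partial or
        # misaligned entries -> fewer; a miss -> 0 this cycle.
        queue += min(rng.choice([12, 12, 10, 6, 0]), max_fetch)
        issued = min(8, queue)   # dispatch is capped at 8 mops/cycle
        queue -= issued
        dispatched += issued
    return dispatched / cycles

print(f"fetch up to 12 mops/cycle: {simulate(12):.2f} dispatched/cycle")
print(f"fetch capped at 8:         {simulate(8):.2f} dispatched/cycle")
```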
 

yuri69

Senior member
Jul 16, 2013
677
1,215
136
I think it was done because the gains will be bigger in the enterprise/AI space, and that is where the money currently is, not desktop.
Desktop is a super-niche, no doubt. However, I'm not sure about the server market.

IMO the largest part is still hyperscalers providing instances to run classic workloads like web servers, JVM apps, various microservice implementations, Lambdas, etc. None of those need massive SIMD.

What needs SIMD is HPC/scientific workloads and engineering.

AI is better handled by even more specialized accelerators like Intel's AMX. So going after the growing markets would be better served by SKUs featuring some of these...
 
  • Like
Reactions: exquisitechar