Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


eek2121

Platinum Member
Aug 2, 2005
2,933
4,028
136
It is highly unlikely that the move to N4P has brought about clock regressions. There might be some on the design side, but due to the node? No.
Indeed, improvements in N4P actually should increase clocks, not drop them.

I am not expecting much of a regression, if any, for desktop, at least. I will retract or double down on that statement if we get more details about the chip and the details reveal core sizes of desktop chips.

I have been a bit out of the loop, however, and haven’t been following the real leaks.

Life has been a bit tough.
 

soresu

Platinum Member
Dec 19, 2014
2,718
1,917
136
It is highly unlikely that the move to N4P has brought about clock regressions. There might be some on the design side, but due to the node? No.
Design/fab node was my point, as in heftier core regressing clock without a dramatically better fab node.

N4P should be significantly better than N5, but enough to absorb the shift to 6+ wide core?

Colour me doubtful for the moment, though I'll be extremely happy to be proven an apostate 🙏
 

DisEnchantment

Golden Member
Mar 3, 2017
1,622
5,892
136
Looking at this again, vs Z4:
  • +2 rename/dispatch
  • +2 ALUs
  • +1 LD/cycle
  • 512b FP width
  • 64B LD/ST queues
  • 48K L1D
  • OOO structures increased
  • Usual generational architectural improvements scattered around
  • New BP with larger BTBs -> zero bubble conditional branches sounded like the patent I listed before where a second BP scans the other conditional branch
  • Decode width unknown; doubtful it goes beyond 6 wide, if they increase it at all.
  • uop cache unknown
  • "2 basic block fetch" --> Does this mean 2x fetch and decode blocks akin to Tremont?
Does not seem terribly bloated; indeed, it seems akin to the Z2 -> Z3 evolution. The unknowns, however, do seem like the kind of big-ticket items. I think the zero bubble conditional branch could be tied to the "2 basic block fetch".

Low Power core
  • The low power core option probably does not have the 512b FP pipes or 64B LD/ST queues (they mentioned FP 512 variants, which would mean 512b pipes and data structures are not standard across all cores)
  • Denser node/efficiency optimized libs as usual
  • Cache reduction as usual
  • If the 2x basic block fetch is akin to what I described, they could clock gate the second fetch block aggressively for mobile
However, a major departure from the Zen 3/4 series is the return to Zen 2 style unified schedulers for INT and FP. It would be interesting to see latencies with Zen 5.

Leak is confirmed by GCC patch.


For now,
6x ALU
4x AGU
512b FP width
2x FP stores/cycle


Code:
;; Integer unit 6 ALU pipes.
(define_cpu_unit "znver5-ieu0" "znver5_ieu")
(define_cpu_unit "znver5-ieu1" "znver5_ieu")
(define_cpu_unit "znver5-ieu2" "znver5_ieu")
(define_cpu_unit "znver5-ieu3" "znver5_ieu")
(define_cpu_unit "znver5-ieu4" "znver5_ieu")
(define_cpu_unit "znver5-ieu5" "znver5_ieu")

;; As of now we have taken based on znver4, We need to revist once znver5 information
(define_cpu_unit "znver5-bru0" "znver5_ieu")
(define_reservation "znver5-ieu" "znver5-ieu0|znver5-ieu1|znver5-ieu2|znver5-ieu3|znver5-ieu4|znver5-ieu5")

;; 4 AGU pipes in znver5
(define_cpu_unit "znver5-agu0" "znver5_agu")
(define_cpu_unit "znver5-agu1" "znver5_agu")
(define_cpu_unit "znver5-agu2" "znver5_agu")
(define_cpu_unit "znver5-agu3" "znver5_agu")

More to digest. Decode width and costs are the same for now; they will surely change them again. The number of FP pipes is also the same.
Some improvements in scheduling costs are there, but they are definitely not final.

Quite surprised by the 4x decode. I hope there is a future update and that it is not still at 4-wide decode.
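
One rough way to read the unit definitions in the snippet: the `znver5-ieu` reservation lets an ALU op take any one of the six `ieu` units, so in this model up to six independent ALU ops can start in the same cycle. Below is a minimal Python sketch of that back-of-envelope view. It is a toy, not how GCC's scheduling automaton actually works; the function name `cycles_to_issue` and the one-op-per-unit-per-cycle rule are assumptions made purely for illustration. The Zen 4 counts (4 ALU, 3 AGU) are the ones cited elsewhere in this thread.

```python
import math

# Unit counts: znver5 from the posted snippet (6 ALU, 4 AGU),
# znver4 from the figures cited in this thread (4 ALU, 3 AGU).
UNITS = {
    "znver4": {"alu": 4, "agu": 3},
    "znver5": {"alu": 6, "agu": 4},
}

def cycles_to_issue(core, n_alu_ops, n_agu_ops):
    """Lower bound on cycles needed to issue fully independent ops.

    Hypothetical simplifications: every op occupies one unit for one
    cycle, and decode width, dependencies and latencies are ignored.
    """
    units = UNITS[core]
    alu_cycles = math.ceil(n_alu_ops / units["alu"])
    agu_cycles = math.ceil(n_agu_ops / units["agu"])
    # ALU and AGU ops issue in parallel, so the slower group sets the bound.
    return max(alu_cycles, agu_cycles)

for core in UNITS:
    print(core, cycles_to_issue(core, n_alu_ops=24, n_agu_ops=12))
```

Since the sketch ignores the front end entirely, it only bounds what the back end could absorb; with decode still listed as 4 wide, the real-world gain would presumably be smaller.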
 

Bigos

Member
Jun 2, 2019
131
295
136
Initial Zen 5 support in GCC.

I am not sure we get to learn anything new from this, but I will still list the differences against Zen 4 from the patch.

New instruction sets: AVXVNNI, MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, PREFETCHI

The instruction cost table appears to be "copied from Zen 4", but I have noticed a few differences:
  • cost of a divide/mod for
    • QI: 12 -> 10
    • HI: 13 -> 11
    • SI: 13 (no change)
    • DI: 18 -> 16
    • Other: 18 -> 16
  • cost of DIVSS instruction: 13 -> 10
  • cost of DIVSD instruction: 13 (no change)
  • cost of SQRTSS instruction: 15 -> 14
  • cost of SQRTSD instruction: 21 -> 20
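
To put the cost changes listed above in proportion, here is a small Python sketch that computes the relative change for each entry. The numbers are copied from the list; note that these are GCC cost-table entries used by the compiler for instruction selection, not measured latencies, and the helper name `relative_change` is just illustrative.

```python
# (Zen 4 cost, Zen 5 cost) pairs exactly as listed in the post.
COSTS = {
    "div/mod QI": (12, 10),
    "div/mod HI": (13, 11),
    "div/mod SI": (13, 13),
    "div/mod DI": (18, 16),
    "div/mod other": (18, 16),
    "DIVSS": (13, 10),
    "DIVSD": (13, 13),
    "SQRTSS": (15, 14),
    "SQRTSD": (21, 20),
}

def relative_change(old, new):
    """Percentage change from old to new (negative means cheaper)."""
    return (new - old) / old * 100.0

for name, (zen4, zen5) in COSTS.items():
    print(f"{name:14s} {zen4:2d} -> {zen5:2d}  ({relative_change(zen4, zen5):+.1f}%)")
```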
The CPU model has the following changes since Zen 4 (though it also cites assuming things from Zen 4 so is probably not entirely correct yet):
  • Integer unit 4 -> 6 ALU pipes.
  • 3 -> 4 AGU pipes
  • 1 -> 2 FP store pipelines
    • If I understand the model correctly, these are 256 bit and can be combined to handle a 512 bit store
  • cmov/setcc can be handled by all integer ALUs (only 2 in Zen 4)
  • FP shuffles can be handled by 3 FP pipelines (only 2 in Zen 4)
  • some other FP operations are handled by different pipelines now
  • FP add latency (I think?) 3 -> 2
  • most AVX-512 operations take only 1 pipeline slot (2 in Zen 4)
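
As a quick sanity check on the FP store item above, this Python sketch works out the peak store bandwidth it would imply. The 256-bit pipe width is only the tentative reading given in the list ("if I understand the model correctly"), and the one-store-per-pipe-per-cycle rule is an assumption, so treat the output as arithmetic on those premises rather than a confirmed figure.

```python
# Store-pipe counts from the list above; the 256-bit width is the
# poster's own tentative reading of the model, not a confirmed figure.
FP_STORE_PIPES = {"znver4": 1, "znver5": 2}
PIPE_WIDTH_BITS = 256  # assumption

def store_bytes_per_cycle(core):
    """Peak FP store bytes per cycle if each pipe handles one store per cycle."""
    return FP_STORE_PIPES[core] * PIPE_WIDTH_BITS // 8

def stores_512b_per_cycle(core):
    """Peak 512-bit (64-byte) stores per cycle under the same assumption."""
    return store_bytes_per_cycle(core) / 64

for core in FP_STORE_PIPES:
    print(core, store_bytes_per_cycle(core), "B/cycle,",
          stores_512b_per_cycle(core), "x 512b stores/cycle")
```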
Phoronix article: https://www.phoronix.com/news/AMD-Zen-5-Znver-5-GCC

Link to the patch: https://gcc.gnu.org/pipermail/gcc-patches/attachments/20240210/b2991675/attachment-0001.obj
 

randomhero

Member
Apr 28, 2020
183
249
116
Now that is some real microarch pron :p
Me want more! :yum:
 

deasd

Senior member
Dec 31, 2013
525
796
136
First GCC patch release before product launch is rare from AMD


However, AMD tends to release such updates even after the release of the new product. That said, AMD is actually speeding up its enablement by introducing the first Zen 5 patches even before the product launch.

The first Zen 4 patch was in Oct. 2022.

So this time it is a weird move from AMD, to say the least. It feels like the product is ready but the launch and any leaks are being suppressed intentionally.
 
Jul 27, 2020
16,724
10,707
106
So this time it is a weird move from AMD, to say the least. It feels like the product is ready but the launch and any leaks are being suppressed intentionally.
Could be a lot of unsold Zen 4 stock. If we start seeing a fire sale on Zen 4 parts in a month or two, that theory might turn out to be correct.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,627
14,618
136
MTL is decent, but it barely keeps Intel in the game for laptops. Everything else is won by AMD. If Zen 5 really is a big leap, then Intel may finally lose some real market share. They have already lost a lot, but this could cost them a lot more still.
 
Jul 27, 2020
16,724
10,707
106
I just hope AMD has enough Zen 5 stock accumulated to tilt the desktop CPU marketshare in their favor and make the Arrow Lake launch sound like a bunch of crickets croaking in the dead of night.

Intel needs this kick in the nuts to get their survival instincts going in full gear and do something phenomenal to keep the competition strong.
 

moinmoin

Diamond Member
Jun 1, 2017
4,969
7,722
136
The fact that they are still releasing new Zen3 AM4 chips shows that things are still not so rosy in AM5 land
Still completely different markets to target imo.

That is my expectation based on the quality of his posts here.
Personally I prefer talk about confirmed data (e.g. in driver code), discussions and educated guesses based on patents, and essentially randomly guessed numbers to all be clearly separated. So he keeps the quality of his posts high by not mixing them.

They all do, but DTCO makes all the difference.
Technically it's no longer the original off-the-shelf node if it's co-optimized though, is it?
 