Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment · Sep 29, 2022

Speculate at will

eek2121 · Feb 10, 2024

DrMrLordX said:
It is highly unlikely that the move to N4P has brought about clock regressions. There might be some on the design side, but due to the node? No.

Indeed, improvements in N4P actually should increase clocks, not drop them.

I am not expecting much of a regression, if any, at least for desktop. I will retract or double down on that statement once we get more details about the chip, particularly the core sizes of the desktop chips.

I have been a bit out of the loop, however, and haven’t been following the real leaks.

Life has been a bit tough.

igor_kavinski · Feb 10, 2024

eek2121 said:
I have been a bit out of the loop, however, and haven’t been following the real leaks.

That's the problem. There don't seem to be any.

soresu · Feb 10, 2024

DrMrLordX said:
It is highly unlikely that the move to N4P has brought about clock regressions. There might be some on the design side, but due to the node? No.

Design/fab node was my point, as in a heftier core regressing clocks without a dramatically better fab node.

N4P should be significantly better than N5, but enough to absorb the shift to a 6+ wide core?

Colour me doubtful for the moment, though I'll be extremely happy to be proven an apostate 🙏

DisEnchantment · Feb 10, 2024

DisEnchantment said:
Looking at this again, vs Z4:

+2 rename/dispatch

+2 ALUs

+1 LD/cycle

512b FP width

64B LD/ST queues

48K L1D

OOO structures increased

Usual generational architectural improvements scattered around

New BP with larger BTBs -> zero bubble conditional branches sounded like the patent I listed before where a second BP scans the other conditional branch

Decode width unknown; doubtful it is going to be beyond 6 wide, if they increase it at all.

uop cache unknown

"2 basic block fetch" -> ~~Does this mean 2x fetch and decode blocks akin to Tremont?~~

Does not seem terribly bloated; indeed, it seems akin to the Z2 -> Z3 evolution. The unknowns, however, do seem like the kind of big ticket items. I think the zero bubble conditional branch could be tied to the "2 basic block fetch".

Low Power core

Probably the low power core option lacks the 512b FP pipes and 64B LD/ST queues (they mentioned FP 512 variants, which would mean 512b pipes and data structures are not standard across all cores)

Denser node/efficiency optimized libs as usual

Cache reduction as usual

~~If the 2x basic block fetch is akin to what I described, they could clock gate the second fetch block aggressively for mobile~~

However, a major departure from the Zen 3/4 series is the unified schedulers for INT and FP, back to Zen 2 style. Would be interesting to see latencies with Zen 5.

Leak is confirmed by GCC patch.

https://gcc.gnu.org/pipermail/gcc-patches/attachments/20240210/b2991675/attachment-0001.obj

For now,
6x ALU
4x AGU
512b FP width
2x FP stores/cycle

Code:

;; Integer unit 6 ALU pipes.
(define_cpu_unit "znver5-ieu0" "znver5_ieu")
(define_cpu_unit "znver5-ieu1" "znver5_ieu")
(define_cpu_unit "znver5-ieu2" "znver5_ieu")
(define_cpu_unit "znver5-ieu3" "znver5_ieu")
(define_cpu_unit "znver5-ieu4" "znver5_ieu")
(define_cpu_unit "znver5-ieu5" "znver5_ieu")

;; As of now we have taken based on znver4, We need to revist once znver5 information
(define_cpu_unit "znver5-bru0" "znver5_ieu")
(define_reservation "znver5-ieu" "znver5-ieu0|znver5-ieu1|znver5-ieu2|znver5-ieu3|znver5-ieu4|znver5-ieu5")

;; 4 AGU pipes in znver5
(define_cpu_unit "znver5-agu0" "znver5_agu")
(define_cpu_unit "znver5-agu1" "znver5_agu")
(define_cpu_unit "znver5-agu2" "znver5_agu")
(define_cpu_unit "znver5-agu3" "znver5_agu")

More to digest. Decode width and costs are the same for now. They will change them again for sure. The number of FP pipes is also the same.
Some improvements in scheduling costs are there, but definitely not final.

Quite surprised by the 4x decode. I hope there is a future update and that it is not still at 4-wide decode.
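For anyone who wants to tally the units without reading the machine description by eye, here is a minimal sketch that counts the `define_cpu_unit` declarations quoted above. The snippet text is copied from the post; the function name `count_units` and the regex are my own illustration and not part of GCC. Note that `znver5-bru0` is declared in the `znver5_ieu` automaton, so it is counted as its own kind here rather than as a seventh ALU.

```python
import re
from collections import Counter

# Unit declarations as quoted in the post (GCC machine-description syntax).
MD_SNIPPET = '''
(define_cpu_unit "znver5-ieu0" "znver5_ieu")
(define_cpu_unit "znver5-ieu1" "znver5_ieu")
(define_cpu_unit "znver5-ieu2" "znver5_ieu")
(define_cpu_unit "znver5-ieu3" "znver5_ieu")
(define_cpu_unit "znver5-ieu4" "znver5_ieu")
(define_cpu_unit "znver5-ieu5" "znver5_ieu")
(define_cpu_unit "znver5-bru0" "znver5_ieu")
(define_cpu_unit "znver5-agu0" "znver5_agu")
(define_cpu_unit "znver5-agu1" "znver5_agu")
(define_cpu_unit "znver5-agu2" "znver5_agu")
(define_cpu_unit "znver5-agu3" "znver5_agu")
'''

# Matches: (define_cpu_unit "<unit name>" "<automaton name>")
UNIT_RE = re.compile(r'\(define_cpu_unit\s+"([^"]+)"\s+"([^"]+)"\)')


def count_units(md_text):
    """Count declared units by kind (hypothetical helper, not a GCC tool)."""
    counts = Counter()
    for unit_name, _automaton in UNIT_RE.findall(md_text):
        # "znver5-ieu3" -> "ieu3" -> "ieu"
        kind = unit_name.split("-", 1)[1].rstrip("0123456789")
        counts[kind] += 1
    return dict(counts)


if __name__ == "__main__":
    for kind, n in count_units(MD_SNIPPET).items():
        print(f"{kind}: {n}")
```

This only reflects what the preliminary patch declares; as noted, parts of the model are still carried over from znver4.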

Bigos · Feb 10, 2024

Initial Zen 5 support in GCC.

I am not sure we get to learn anything new from this, but I will still list the differences against Zen 4 from the patch.

New instruction sets: AVXVNNI, MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, PREFETCHI

The instruction cost table appears to be "copied from Zen 4", but I have noticed a few differences:

cost of a divide/mod for
- QI: 12 -> 10
- HI: 13 -> 11
- SI: 13 (no change)
- DI: 18 -> 16
- Other: 18 -> 16
cost of DIVSS instruction: 13 -> 10
cost of DIVSD instruction: 13 (no change)
cost of SQRTSS instruction: 15 -> 14
cost of SQRTSD instruction: 21 -> 20
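To put those cost changes in proportion, here is a small sketch that turns the listed values into percentage changes. The numbers are exactly the ones above; the names `COSTS`, `relative_change` and `summarize` are just for illustration. Keep in mind these are compiler cost-table entries from a preliminary patch, not measured latencies.

```python
# (Zen 4 value, Zen 5 value) as listed above from the GCC cost table.
COSTS = {
    "div QI": (12, 10),
    "div HI": (13, 11),
    "div SI": (13, 13),
    "div DI": (18, 16),
    "div other": (18, 16),
    "DIVSS": (13, 10),
    "DIVSD": (13, 13),
    "SQRTSS": (15, 14),
    "SQRTSD": (21, 20),
}


def relative_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


def summarize(costs):
    """Map each entry to its percentage change, rounded to one decimal."""
    return {
        name: round(relative_change(old, new), 1)
        for name, (old, new) in costs.items()
    }


if __name__ == "__main__":
    for name, pct in summarize(COSTS).items():
        print(f"{name:10s} {pct:+.1f}%")
```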

The CPU model has the following changes since Zen 4 (though it also cites assuming things from Zen 4 so is probably not entirely correct yet):

Integer unit 4 -> 6 ALU pipes.
3 -> 4 AGU pipes
1 -> 2 FP store pipelines
- If I understand the model correctly, these are 256 bit and can be combined to handle a 512 bit store
cmov/setcc can be handled by all integer ALUs (only 2 in Zen 4)
FP shuffles can be handled by 3 FP pipelines (only 2 in Zen 4)
some other FP operations are handled by different pipelines now
FP add latency (I think?) 3 -> 2
most AVX-512 operations take only 1 pipeline slot (2 in Zen 4)
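As a quick sanity check on the pipe counts, the sketch below computes the scaling factors and the combined FP store width. The 256-bit width per store pipe is my reading of the model as stated above, so treat it as an assumption; the names `PIPES`, `scaling` and `peak_fp_store_bits` are illustrative only.

```python
# (Zen 4, Zen 5 per the preliminary patch), as listed above.
PIPES = {"ALU": (4, 6), "AGU": (3, 4), "FP store": (1, 2)}

# Assumption from my reading of the model: each FP store pipe is 256 bits wide.
FP_STORE_PIPE_BITS = 256


def scaling(pipes):
    """Return the Zen 5 / Zen 4 pipe-count ratio for each unit type."""
    return {name: new / old for name, (old, new) in pipes.items()}


def peak_fp_store_bits(num_pipes, pipe_bits=FP_STORE_PIPE_BITS):
    """Bits that can be stored per cycle if all store pipes are used together."""
    return num_pipes * pipe_bits


if __name__ == "__main__":
    for name, ratio in scaling(PIPES).items():
        print(f"{name}: {ratio:.2f}x")
    print("FP store bits/cycle:", peak_fp_store_bits(PIPES["FP store"][1]))
```

Under that assumption, two pipes together cover one 512-bit store per cycle, which matches the "combined" reading above.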

Phoronix article: https://www.phoronix.com/news/AMD-Zen-5-Znver-5-GCC

Link to the patch: https://gcc.gnu.org/pipermail/gcc-patches/attachments/20240210/b2991675/attachment-0001.obj

randomhero · Feb 10, 2024

Bigos said:
Initial Zen 5 support in GCC.

I am not sure we get to learn anything new from this, but I will still list the differences against Zen 4 from the patch.

New instruction sets: AVXVNNI, MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, PREFETCHI

The instruction cost table appears to be "copied from Zen 4", but I have noticed a few differences:

cost of a divide/mod for

QI: 12 -> 10

HI: 13 -> 11

SI: 13 (no change)

DI: 18 -> 16

Other: 18 -> 16

cost of DIVSS instruction: 13 -> 10

cost of DIVSD instruction: 13 (no change)

cost of SQRTSS instruction: 15 -> 14

cost of SQRTSD instruction: 21 -> 20

The CPU model has the following changes since Zen 4 (though it also cites assuming things from Zen 4 so is probably not entirely correct yet):

Integer unit 4 -> 6 ALU pipes.

3 -> 4 AGU pipes

1 -> 2 FP store pipelines

If I understand the model correctly, these are 256 bit and can be combined to handle a 512 bit store

cmov/setcc can be handled by all integer ALUs (only 2 in Zen 4)

FP shuffles can be handled by 3 FP pipelines (only 2 in Zen 4)

some other FP operations are handled by different pipelines now

FP add latency (I think?) 3 -> 2

most AVX-512 operations take only 1 pipeline slot (2 in Zen 4)

Phoronix article: https://www.phoronix.com/news/AMD-Zen-5-Znver-5-GCC

Link to the patch: https://gcc.gnu.org/pipermail/gcc-patches/attachments/20240210/b2991675/attachment-0001.obj

Now that is some real microarch pron

Me want more!

deasd · Feb 10, 2024

First GCC patch release before product launch is rare from AMD

AMD begins Zen5 enablement for GNU Compiler Collection (GCC), new AVX instructions added - VideoCardz.com

AMD Zen5 patches for GCC AMD has begun work on enabling the Zen5 microarchitecture through the GNU Compiler Collection (GCC). First patches were shared through a mailing list titled “Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model”. One could have expected Zen5...

videocardz.com

However, AMD tends to release such updates even after the release of the new product. That said, AMD is actually speeding up its enablement by introducing the first Zen5 patches even before the product launch.

The first Zen 4 patch came in Oct. 2022:

AMD Rolls Out GCC Enablement for "Zen 4" Processors with Zenver4 Target, Enables AVX-512 Instructions

AMD earlier this week released basic enablement for the GNU Compiler Collections (GCC), which extend "Zen 4" microarchitecture awareness. The "basic enablement patch" for the new Zenver4 target is essentially similar to Zenver3, but with added support for the new AVX-512 instructions, namely...

www.techpowerup.com

So this time it is a weird move from AMD, to say the least. It feels like the product is ready but the launch and any leaks are being suppressed intentionally.

DrMrLordX · Feb 10, 2024

soresu said:
N4P should be significantly better than N5, but enough to absorb the shift to a 6+ wide core?

Remember that AMD never uses an off-the-shelf node from TSMC.

adroc_thurston · Feb 10, 2024

DrMrLordX said:
Remember that AMD never uses an off-the-shelf node from TSMC.

They all do, but DTCO makes all the difference.

igor_kavinski · Feb 10, 2024

deasd said:
So this time it is a weird move from AMD, to say the least. It feels like the product is ready but the launch and any leaks are being suppressed intentionally.

Could be a lot of unsold Zen 4 stock. If we start seeing a fire sale on Zen 4 parts in a month or two, that theory might turn out to be correct.

Glo. · Feb 10, 2024

So it begins...

Markfw · Feb 10, 2024

Glo. said:
View attachment 93605

So it begins...

adroc said this and nobody seemed to believe him.

Glo. · Feb 10, 2024

Markfw said:
adroc said this and nobody seemed to believe him.

Did YOU believe Adroc?

Markfw · Feb 10, 2024

Glo. said:
Did YOU believe Adroc?

I was pretty sure he was right, and I said so, but reserved final judgment until it released.

adroc_thurston · Feb 10, 2024

Glo. said:
View attachment 93605

So it begins...

who

igor_kavinski · Feb 10, 2024

Intel shareholders better sell their stock ASAP and buy AMD stock!

Glo. · Feb 10, 2024

igor_kavinski said:
Intel shareholders better sell their stock ASAP and buy AMD stock!

What's an Intel Shareholder?

Markfw · Feb 10, 2024

MTL is decent, but barely keeps Intel in the game for laptops. Everything else is won by AMD. If Zen 5 really is a big leap, then Intel may finally lose some real market share. They have already lost a lot, but this could cost them a lot more still.

igor_kavinski · Feb 10, 2024

I just hope AMD has enough Zen 5 stock accumulated to tilt the desktop CPU market share in their favor and make the Arrow Lake launch sound like a bunch of crickets chirping in the dead of night.

Intel needs this kick in the nuts to get their survival instincts going in full gear and do something phenomenal to keep the competition strong.

Tup3x · Feb 10, 2024

I expect 20% better performance. If it manages to beat that, then that would be a nice surprise. I'd rather keep my expectations low and not end up disappointed than the other way round.

igor_kavinski · Feb 10, 2024

Zen 5 could be AMD's M1 moment (minus, probably, the power efficiency, coz it's a monster core). Intel's equivalent moment isn't expected till Nova Lake. Far, far away...

moinmoin · Feb 10, 2024

soresu said:
The fact that they are still releasing new Zen3 AM4 chips shows that things are still not so rosy in AM5 land

Still completely different markets to target imo.

igor_kavinski said:
That is my expectation based on the quality of his posts here.

Personally, I prefer that talk about confirmed data (e.g. in driver code), discussions and educated guesses based on patents, and essentially randomly guessed numbers all be kept clearly separate. So he keeps the quality of his posts high by not mixing them.

adroc_thurston said:
They all do, but DTCO makes all the difference.

Technically it's no longer the original off-the-shelf node if it's co-optimized though, is it?

DrMrLordX · Feb 10, 2024

moinmoin said:
Technically it's no longer the original off-the-shelf node if it's co-optimized though, is it?

That was my thinking, but the distinction might be there.

Fjodor2001 · Feb 10, 2024

Glo. said:
View attachment 93605

So it begins...

Were there any numbers or other info posted along with that, or was it just some general statement about Zen5?

blackangus · Feb 10, 2024

Glo. said:
What's an Intel Shareholder?

An Intel Employee =)
