> It is highly unlikely that the move to N4P has brought about clock regressions. There might be some on the design side, but due to the node? No.
Indeed, improvements in N4P actually should increase clocks, not drop them.
> I have been a bit out of the loop, however, and haven’t been following the real leaks.
That's the problem. There don't seem to be any.
> It is highly unlikely that the move to N4P has brought about clock regressions. There might be some on the design side, but due to the node? No.
Design/fab node was my point, as in a heftier core regressing clocks without a dramatically better fab node.
Looking at this again vs Z4, it does not seem terribly bloated and indeed seems akin to the Z2 -> Z3 evolution. The unknowns, however, do seem like the kind of big ticket items. I think the zero bubble conditional branch could be tied to the "2 basic block fetch".
- +2 rename/dispatch
- +2 ALUs
- +1 LD/cycle
- 512b FP width
- 64B LD/ST queues
- 48K L1D
- OOO structures increased
- Usual generational architectural improvements scattered around
- New BP with larger BTBs -> zero bubble conditional branches sounded like the patent I listed before where a second BP scans the other conditional branch
- Decode width unknown, doubtful it is going to be beyond 6 wide, if they increase it at all.
- uop cache unknown
- "2 basic block fetch" -> Does this mean 2x fetch and decode blocks akin to Tremont?
However, a major departure from the Zen 3/4 series is the unified schedulers for INT and FP, back to Zen 2 style. It would be interesting to see latencies with Zen 5.
Low Power core
- The low power core option probably does not have 512b FP pipes or 64B LD/ST queues (they mentioned FP 512 variants, which would mean 512b pipes and data structures are not standard across all cores)
- Denser node/efficiency optimized libs as usual
- Cache reduction as usual
If the 2x basic block fetch is akin to what I described, they could clock gate the second fetch block aggressively for mobile.
;; Integer unit 6 ALU pipes.
(define_cpu_unit "znver5-ieu0" "znver5_ieu")
(define_cpu_unit "znver5-ieu1" "znver5_ieu")
(define_cpu_unit "znver5-ieu2" "znver5_ieu")
(define_cpu_unit "znver5-ieu3" "znver5_ieu")
(define_cpu_unit "znver5-ieu4" "znver5_ieu")
(define_cpu_unit "znver5-ieu5" "znver5_ieu")
;; As of now we have taken based on znver4, We need to revist once znver5 information
(define_cpu_unit "znver5-bru0" "znver5_ieu")
(define_reservation "znver5-ieu" "znver5-ieu0|znver5-ieu1|znver5-ieu2|znver5-ieu3|znver5-ieu4|znver5-ieu5")
;; 4 AGU pipes in znver5
(define_cpu_unit "znver5-agu0" "znver5_agu")
(define_cpu_unit "znver5-agu1" "znver5_agu")
(define_cpu_unit "znver5-agu2" "znver5_agu")
(define_cpu_unit "znver5-agu3" "znver5_agu")
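For anyone who wants to tally the units without counting by hand, here is a minimal Python sketch that parses the `define_cpu_unit` lines quoted above. The regex and the helper names (`count_by_automaton`, `count_by_kind`) are my own assumptions for illustration and are not part of GCC; it only handles the simple two-string form shown in the snippet.

```python
import re
from collections import Counter

# Unit definitions copied from the znver5 snippet quoted above.
MD_SNIPPET = '''
;; Integer unit 6 ALU pipes.
(define_cpu_unit "znver5-ieu0" "znver5_ieu")
(define_cpu_unit "znver5-ieu1" "znver5_ieu")
(define_cpu_unit "znver5-ieu2" "znver5_ieu")
(define_cpu_unit "znver5-ieu3" "znver5_ieu")
(define_cpu_unit "znver5-ieu4" "znver5_ieu")
(define_cpu_unit "znver5-ieu5" "znver5_ieu")
(define_cpu_unit "znver5-bru0" "znver5_ieu")
(define_reservation "znver5-ieu" "znver5-ieu0|znver5-ieu1|znver5-ieu2|znver5-ieu3|znver5-ieu4|znver5-ieu5")
;; 4 AGU pipes in znver5
(define_cpu_unit "znver5-agu0" "znver5_agu")
(define_cpu_unit "znver5-agu1" "znver5_agu")
(define_cpu_unit "znver5-agu2" "znver5_agu")
(define_cpu_unit "znver5-agu3" "znver5_agu")
'''

# Matches only the simple (define_cpu_unit "unit" "automaton") form.
UNIT_RE = re.compile(r'\(define_cpu_unit\s+"([^"]+)"\s+"([^"]+)"\)')


def count_by_automaton(md_text):
    """Count how many units are declared in each automaton."""
    return dict(Counter(automaton for _unit, automaton in UNIT_RE.findall(md_text)))


def count_by_kind(md_text):
    """Count units by name with the trailing index stripped (ieu0 -> ieu)."""
    kinds = (re.sub(r"\d+$", "", unit) for unit, _automaton in UNIT_RE.findall(md_text))
    return dict(Counter(kinds))


if __name__ == "__main__":
    print(count_by_automaton(MD_SNIPPET))
    print(count_by_kind(MD_SNIPPET))
```

On the quoted snippet this gives 7 units in the `znver5_ieu` automaton (the 6 ALU pipes plus `znver5-bru0`) and 4 in `znver5_agu`.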
Now that is some real microarch pron
Initial Zen 5 support in GCC.
I am not sure we get to learn anything new from this, but I will still list the differences against Zen 4 from the patch.
New instruction sets: AVXVNNI, MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, PREFETCHI
The instruction cost table appears to be "copied from Zen 4", but I have noticed a few differences:
- cost of a divide/mod for:
  - QI: 12 -> 10
  - HI: 13 -> 11
  - SI: 13 (no change)
  - DI: 18 -> 16
  - Other: 18 -> 16
- cost of DIVSS instruction: 13 -> 10
- cost of DIVSD instruction: 13 (no change)
- cost of SQRTSS instruction: 15 -> 14
- cost of SQRTSD instruction: 21 -> 20
Phoronix article: https://www.phoronix.com/news/AMD-Zen-5-Znver-5-GCC
The CPU model has the following changes since Zen 4 (though it also cites assuming things from Zen 4 so is probably not entirely correct yet):
- Integer unit 4 -> 6 ALU pipes.
- 3 -> 4 AGU pipes
- 1 -> 2 FP store pipelines
  - If I understand the model correctly, these are 256 bit and can be combined to handle a 512 bit store
- cmov/setcc can be handled by all integer ALUs (only 2 in Zen 4)
- FP shuffles can be handled by 3 FP pipelines (only 2 in Zen 4)
- some other FP operations are handled by different pipelines now
- FP add latency (I think?) 3 -> 2
- most AVX-512 operations take only 1 pipeline slot (2 in Zen 4)
Link to the patch: https://gcc.gnu.org/pipermail/gcc-patches/attachments/20240210/b2991675/attachment-0001.obj
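To put the cost-table changes listed above in perspective, here is a small Python sketch that turns them into relative changes. The numbers are the ones from the post; the names `COSTS`, `relative_change` and `summarize` are made up for illustration. Note that these are entries in GCC's cost table, so treating them as exact cycle counts would be an assumption.

```python
# (Zen 4 cost, Zen 5 cost) pairs as listed in the post above.
COSTS = {
    "div/mod QI": (12, 10),
    "div/mod HI": (13, 11),
    "div/mod SI": (13, 13),
    "div/mod DI": (18, 16),
    "div/mod other": (18, 16),
    "DIVSS": (13, 10),
    "DIVSD": (13, 13),
    "SQRTSS": (15, 14),
    "SQRTSD": (21, 20),
}


def relative_change(old, new):
    """Percentage change from old to new (negative means cheaper)."""
    return (new - old) / old * 100.0


def summarize(costs):
    """Map each entry name to its percentage change, rounded to one decimal."""
    return {name: round(relative_change(old, new), 1)
            for name, (old, new) in costs.items()}


if __name__ == "__main__":
    for name, pct in summarize(COSTS).items():
        print(f"{name:>14}: {pct:+.1f}%")
```

The biggest relative drop in the list is DIVSS (13 -> 10, about 23% lower), while the SI divide and DIVSD entries are unchanged.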
However, AMD tends to release such updates even after the release of the new product. That said, AMD is actually speeding up its enablement by introducing the first Zen 5 patches even before the product launch.
> N4P should be significantly better than N5, but enough to absorb the shift to 6+ wide core?
Remember that AMD never uses an off-the-shelf node from TSMC.
> Remember that AMD never uses an off-the-shelf node from TSMC.
They all do, but DTCO makes all the difference.
> So this time it is a weird move from AMD, to say the least. It feels like the product is ready but the launch and any leaks are being suppressed intentionally.
Could be a lot of unsold Zen 4 stock. If we start seeing a fire sale on Zen 4 parts in a month or two, that theory might turn out to be correct.
adroc said this and nobody seems to have believed him.
> adroc said this and nobody seems to have believed him.
Did YOU believe Adroc?
> Did YOU believe Adroc?
I was pretty sure he was right, and I said so, but reserved final judgment until it released.
who
> Intel shareholders better sell their stock ASAP and buy AMD stock!
What's an Intel Shareholder?
> The fact that they are still releasing new Zen3 AM4 chips shows that things are still not so rosy in AM5 land
Still completely different markets to target imo.
> That is my expectation based on the quality of his posts here.
Personally, I prefer talk about confirmed data in e.g. driver code, discussions and educated guesses based on patents, and essentially randomly guessed numbers to all be clearly separated. So he keeps the quality of his posts high by not mixing them.
> They all do, but DTCO makes all the difference.
Technically it's no longer the original off-the-shelf node if it's co-optimized though, is it?
> Technically it's no longer the original off-the-shelf node if it's co-optimized though, is it?
That was my thinking, but the distinction might be there.
Were there any numbers or other info posted along with that, or was it just some general statement about Zen5?
> What's an Intel Shareholder?
An Intel Employee =)