Compared to Zen2, the average IPC increase is about 50-55%.
Zen5 has the most powerful and most modern BPU (branch prediction unit) of any x86 core.
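For a sense of scale, here is a rough Python sketch that turns the cumulative figure into an average per-generation gain. It assumes the 50-55% refers to Zen5 and is spread evenly over the three steps Zen2 -> Zen3 -> Zen4 -> Zen5, which is a simplification; the function name `per_generation_gain` is just illustrative.

```python
def per_generation_gain(total_gain: float, steps: int = 3) -> float:
    """Geometric-mean gain per step for a cumulative gain (0.50 means +50%).

    Assumption: the gain compounds evenly across `steps` generations.
    """
    return (1.0 + total_gain) ** (1.0 / steps) - 1.0


if __name__ == "__main__":
    for total in (0.50, 0.55):
        print(f"+{total:.0%} total -> about +{per_generation_gain(total):.1%} per generation")
```

Under that even-spread assumption, the result is roughly 14-16% per generation.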
Golden/RaptorCove
BTB L0 128
BTB L1 5K
BTB L2 12K
Return Address Stack 2-4
LionCove
BTB L0 256
BTB L1 6K
BTB L2 12K
Return Address Stack 24
Zen4
BTB L0 128
BTB L1 1.5K
BTB L2 7K
Return Address Stack 32
Zen5
BTB L0 1K (1024)!
BTB L1 16K!
BTB L2 8K (victim cache for BTB L1)
Return Address Stack 52x2 (104 for SMT)
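To make the BTB numbers above easier to compare, here is a small Python sketch that converts the quoted sizes to entry counts and prints Zen5's ratio against each other core. It assumes "K" means 1024 entries (in line with the "1K (1024)" note above); the dictionary layout and the helper names `to_entries` and `ratio_vs` are only illustrative.

```python
# BTB sizes as quoted in the post; "K" is assumed to mean 1024 entries.
BTB = {
    "Golden/RaptorCove": {"L0": "128", "L1": "5K", "L2": "12K"},
    "LionCove": {"L0": "256", "L1": "6K", "L2": "12K"},
    "Zen4": {"L0": "128", "L1": "1.5K", "L2": "7K"},
    "Zen5": {"L0": "1K", "L1": "16K", "L2": "8K"},
}


def to_entries(size: str) -> int:
    """Convert a size such as '1.5K' or '128' to an entry count."""
    if size.endswith("K"):
        return int(float(size[:-1]) * 1024)
    return int(size)


def ratio_vs(core: str, baseline: str, level: str) -> float:
    """Size of `core`'s BTB level relative to the same level on `baseline`."""
    return to_entries(BTB[core][level]) / to_entries(BTB[baseline][level])


if __name__ == "__main__":
    for baseline in ("Golden/RaptorCove", "LionCove", "Zen4"):
        for level in ("L0", "L1", "L2"):
            ratio = ratio_vs("Zen5", baseline, level)
            print(f"Zen5 vs {baseline}, BTB {level}: {ratio:.2f}x")
```

Keep in mind that the L2 ratio is not an apples-to-apples comparison, since the Zen5 L2 BTB works as a victim cache for L1.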
Golden/RaptorCove
L3 cache, ST: 90-100 GB/s bandwidth (60-70 cycles latency)
LionCove
L3 cache, ST: 57 GB/s bandwidth (84 cycles latency)???
Zen5
L3 cache, ST: 173 GB/s bandwidth (48 cycles latency)!!!
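The L3 gap is easier to see as ratios. The sketch below uses only the figures quoted above and takes the midpoint wherever a range is given, which is my own simplification; the names `L3_ST`, `midpoint` and `zen5_advantage` are illustrative.

```python
# Single-thread L3 figures as quoted above: (low, high) for bandwidth in GB/s
# and for latency in cycles. Single values are stored as (x, x).
L3_ST = {
    "Golden/RaptorCove": {"bw": (90, 100), "lat": (60, 70)},
    "LionCove": {"bw": (57, 57), "lat": (84, 84)},
    "Zen5": {"bw": (173, 173), "lat": (48, 48)},
}


def midpoint(rng):
    """Midpoint of a (low, high) range; an assumption for ranged figures."""
    low, high = rng
    return (low + high) / 2


def zen5_advantage(core):
    """Return (bandwidth ratio, latency ratio) of Zen5 relative to `core`.

    Both ratios are oriented so that a value above 1 favours Zen5.
    """
    bw = midpoint(L3_ST["Zen5"]["bw"]) / midpoint(L3_ST[core]["bw"])
    lat = midpoint(L3_ST[core]["lat"]) / midpoint(L3_ST["Zen5"]["lat"])
    return bw, lat


if __name__ == "__main__":
    for core in ("Golden/RaptorCove", "LionCove"):
        bw, lat = zen5_advantage(core)
        print(f"Zen5 vs {core}: {bw:.2f}x bandwidth, {lat:.2f}x lower latency (in cycles)")
```

Latency here is compared in cycles only; converting to nanoseconds would need the clock each measurement was taken at, which the figures above do not include.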
Edit:
The Zen5 BPU can predict the next two independent branch paths not only for two threads (SMT) but also within a single thread (ST). When ST code is heavily branched, the second decode cluster can take over part of that code, giving 2x 4-wide (8-wide) decode! (Zen1-Zen4 decode 4-wide.)
Zen4 SMT gain: +13% on average
Zen5 SMT gain: +18% on average
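As a quick back-of-the-envelope check on what those SMT gains mean per thread, here is a minimal Python sketch. It assumes the gain is total core throughput with two threads relative to one thread running alone, and that both threads get an equal share; neither assumption is stated in the figures above, and `per_thread_share` is a made-up helper name.

```python
def per_thread_share(smt_gain: float, threads: int = 2) -> float:
    """Per-thread throughput as a fraction of single-thread throughput.

    Assumptions: `smt_gain` is the total-throughput gain of the core with
    `threads` threads versus one thread, and the threads share it evenly.
    """
    return (1.0 + smt_gain) / threads


if __name__ == "__main__":
    for name, gain in (("Zen4", 0.13), ("Zen5", 0.18)):
        share = per_thread_share(gain)
        print(f"{name}: core total {1 + gain:.2f}x, each thread about {share:.3f}x of ST speed")
```

In other words, under these assumptions each SMT thread runs at a bit under 60% of the speed it would have with the core to itself.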
Zen5 op cache: 6144 (with instruction fusion), 16-way, 12 ops/cycle in ST and 2x 6 ops/cycle in SMT. Thanks to instruction fusion, the Zen5 op cache has a larger effective capacity than the Zen4 op cache (6912, 12-way, 9 ops/cycle), despite the smaller nominal size.
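To see how much fusion is needed for that claim to hold, here is a small Python sketch. It uses a deliberately simple model that is my assumption, not a documented detail: a fused entry holds exactly two instructions and an unfused entry holds one. The function names are illustrative.

```python
ZEN5_OP_CACHE = 6144  # as quoted above
ZEN4_OP_CACHE = 6912  # as quoted above


def effective_capacity(entries: int, fused_fraction: float) -> float:
    """Instructions held if `fused_fraction` of entries hold two instead of one.

    Assumption: a fused entry counts as exactly two instructions.
    """
    return entries * (1.0 + fused_fraction)


def break_even_fraction(entries: int, target: int) -> float:
    """Fused fraction at which `entries` hold as many instructions as `target`."""
    return target / entries - 1.0


if __name__ == "__main__":
    needed = break_even_fraction(ZEN5_OP_CACHE, ZEN4_OP_CACHE)
    print(f"Break-even fused fraction: {needed:.3f}")  # prints 0.125
    for fraction in (0.0, 0.125, 0.25):
        held = effective_capacity(ZEN5_OP_CACHE, fraction)
        print(f"fused fraction {fraction:.3f}: about {held:.0f} instructions")
```

Under this model, the Zen5 op cache overtakes the Zen4 figure once more than 12.5% of its entries hold a fused pair.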