Saylick
Makes sense. In a post I wrote about 2 months ago, I thought that the number of micro-ops that could be dispatched was limiting the overall throughput of the core:
There’s loads of potential for further increases, and that’s without needing radical redesigns.
Just some basic stuff off the top of my head
- More execution units (Doesn’t have to be ALU, can be AGU, LEA, FPU, etc...)
- Larger Caches
- Increased ROB and Memory, Scheduler Buffers
- More ports to dispatch instructions to execution units and reduce back-end bottlenecks
https://forums.anandtech.com/threads/speculation-ryzen-4000-series-zen-3.2567589/post-39896162
Not to say that the 6 ops/cycle dispatch and 8 ops/cycle retire was the main bottleneck, but given how wide the core is and how many micro-ops the micro-op cache can deliver (8 ops per cycle), AMD could afford to increase the dispatch rate to match. That alone, in theory, gives a maximum of 33% more micro-ops dispatched per cycle (8 vs. 6), assuming nothing else is a bottleneck. Everything else is just making sure there are enough load/store resources and enlarging buffers to keep up.
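
For anyone who wants the arithmetic spelled out, here is a rough sketch of where the 33% comes from. The "narrowest stage wins" model and the function names are my own simplification for illustration, not anything from AMD; a real core is limited by far more than these three numbers.

```python
def max_dispatch_uplift(old_width: int, new_width: int) -> float:
    """Upper bound on the per-cycle gain from widening dispatch (hypothetical helper)."""
    if old_width <= 0 or new_width <= 0:
        raise ValueError("widths must be positive")
    return new_width / old_width - 1.0


def sustained_ops_per_cycle(*stage_widths: int) -> int:
    """Toy model (an assumption): sustained rate is capped by the narrowest stage."""
    return min(stage_widths)


# Widths quoted in the post: op cache 8, dispatch 6, retire 8.
before = sustained_ops_per_cycle(8, 6, 8)  # dispatch is the narrowest stage
after = sustained_ops_per_cycle(8, 8, 8)   # dispatch widened to match the op cache

print(before, after)                        # 6 8
print(f"{max_dispatch_uplift(6, 8):.1%}")   # 33.3%
```

That figure is strictly a ceiling: it only holds on cycles where the op cache actually delivers 8 micro-ops and the back end can accept all of them.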