NostaSeronx
Instruction Fetch Window => 2 x 32B, aka 2 x 256b
4-Way Decoder => Fastpath Single => 4 x 16B (4 Macro-ops), Fastpath Double => 4 x 32B (8 Macro-ops)
Integer Cluster Scheduler => 2 x 40-Entry: 40 Macro-ops (Core A), 40 Macro-ops (Core B) (2 x 5KB L0i)
Floating Point Coprocessor Scheduler => 2 x 30-Entry: 30 Macro-ops (Core A), 30 Macro-ops (Core B) (1 x 7.5KB L0i, split into two 3.75KB L0is)
Fetch/Dispatch/Retire per core => 4 Macro-ops, 2 from the Integer Cluster and 2 from the Floating Point Coprocessor
Any points you want to fix, Dresdenboy?
(If we add them up using the same math, we can compare the L0i$ sizes:
L0i$ is 17.5KB for AMD Bulldozer, i.e. 2 x 5KB + 7.5KB,
and
L0i$ is 6.75KB for Intel Sandy Bridge)
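
To make "the same math" explicit, here is a rough Python sketch of the arithmetic. It assumes each scheduler entry counts as 128 bytes (4 x 32B). That figure is not stated above; it is back-derived from 5KB / 40 entries. The 54-entry Sandy Bridge scheduler is likewise an assumption, chosen because it reproduces the 6.75KB figure. The names `BYTES_PER_ENTRY` and `l0i_kb` are made up for illustration.

```python
# Assumption: every scheduler entry is counted as 128 bytes (4 x 32B),
# inferred from the post's own numbers (5KB / 40 entries), not from a spec.
BYTES_PER_ENTRY = 128


def l0i_kb(entries, bytes_per_entry=BYTES_PER_ENTRY):
    """Return the estimated 'L0i' size in KB for a scheduler with `entries` slots."""
    return entries * bytes_per_entry / 1024


# Bulldozer module: two 40-entry integer schedulers plus a shared FP
# scheduler counted as 2 x 30 entries, as listed in the post.
bulldozer_entries = {
    "integer scheduler (Core A)": 40,
    "integer scheduler (Core B)": 40,
    "FP scheduler half (Core A)": 30,
    "FP scheduler half (Core B)": 30,
}

bulldozer_total_kb = sum(l0i_kb(n) for n in bulldozer_entries.values())

# Assumption: a single 54-entry scheduler for Sandy Bridge, picked because
# it matches the 6.75KB figure under the same per-entry size.
sandy_bridge_kb = l0i_kb(54)

if __name__ == "__main__":
    for name, n in bulldozer_entries.items():
        print(f"{name}: {l0i_kb(n)} KB")
    print(f"Bulldozer total: {bulldozer_total_kb} KB")
    print(f"Sandy Bridge: {sandy_bridge_kb} KB")
```

Under those assumptions the totals come out to the 17.5KB and 6.75KB quoted above; change `BYTES_PER_ENTRY` if you think the per-entry size should be different.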