
Discussion ZEN2 2-CCD chip inter core latencies = EPYC advances in server rooms

JoeRambo

Golden Member
While reading overclock.net and Stilt's take on Zen 2, I found this little gem in the thread, coming from the Russian site https://3dnews.ru/990367/obzor-amd-ryzen-9-3900x

https://3dnews.ru/assets/external/illustrations/2019/07/08/990367/image_2019-07-08_12-51-48.png

It might be the DDR4-3600 lowering overall latency, but those inter-CCD latencies are going to be amazing for server CPUs and will lead to quite a few wins. No more hidden gotchas when moving between CCXs and beyond, just a sweet, flat latency of under ~70 ns for all inter-thread comms.

Not only will AMD definitely rule those 1-2S systems, but it seems they will do so with amazing technical grace.
 
Let's wait for database benchmarks.
Looks very good.
 
I was surprised that the two CCXs in a CPU chiplet are not directly interconnected but instead have to go through the IO chiplet. It seems to be a mix between the Pentium D Smithfield, which was two adjacent Prescott dies cut together but with no internal communication, so they had to go through the chipset to talk to each other, and the Core 2 Quad Kentsfield, which was an MCM based on two Conroe dies, where the two cores within each Conroe die saw each other and shared the L2 cache, but had to go through the chipset, too, to talk to the other Conroe die.
At first it doesn't make sense, since they need twice as many traces between a CPU chiplet and the IO chiplet as they would with a crossbar inside the CPU chiplet, and the inter-CCX latency within a chiplet is increased. But somehow it is far more elegant, as having two latency tiers is simpler than having three from a dumb CPU scheduler's standpoint. There is technically no penalty for going from one chiplet to two or more in that regard.

Where will you notice this the most? Threadripper and EPYC, obviously.
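
To make the two-tier versus three-tier point concrete, here is a minimal Python sketch that classifies every pair of cores by the path their traffic would take. It is only a toy model: the topology constants (4 cores per CCX, 2 CCXs per chiplet), the `direct_ccx_link` flag for a hypothetical on-chiplet crossbar, and all function names are my own assumptions for illustration, and no latency figures are modeled.

```python
from collections import Counter
from itertools import combinations

# Assumed topology for illustration only (not taken from the posts above):
# 4 cores per CCX, 2 CCXs per CPU chiplet (CCD).
CORES_PER_CCX = 4
CCX_PER_CCD = 2


def locate(core):
    """Return (ccd_index, ccx_index) for a flat core number."""
    ccx = core // CORES_PER_CCX
    ccd = ccx // CCX_PER_CCD
    return ccd, ccx


def tier(a, b, direct_ccx_link=False):
    """Name the latency tier for traffic between cores a and b.

    direct_ccx_link=True models a hypothetical crossbar between the two
    CCXs of one chiplet; False models every cross-CCX hop going through
    the IO chiplet, as described in the post above.
    """
    if a == b:
        return "self"
    ccd_a, ccx_a = locate(a)
    ccd_b, ccx_b = locate(b)
    if ccx_a == ccx_b:
        return "same-ccx"
    if direct_ccx_link and ccd_a == ccd_b:
        return "same-ccd"
    return "via-io-die"


def tier_counts(n_cores, direct_ccx_link=False):
    """Count core pairs per tier for an n_cores part."""
    pairs = combinations(range(n_cores), 2)
    return Counter(tier(a, b, direct_ccx_link) for a, b in pairs)


if __name__ == "__main__":
    print("via IO die only:", dict(tier_counts(16)))
    print("with crossbar:  ", dict(tier_counts(16, direct_ccx_link=True)))
```

For a 16-core, two-chiplet layout the first call reports only two tiers, while the hypothetical crossbar variant splits the cross-CCX pairs into a third tier that a scheduler would have to know about.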
 