
Question Core-to-Core latency

To measure core-to-core latency, we can pin two threads to two distinct CPU cores and have them perform a series of compare-and-exchange operations on a shared value while timing the exchanges.
To install and run the tool:
$ cargo install core-to-core-latency
$ core-to-core-latency
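
For anyone curious what such a measurement looks like, here is a minimal Rust sketch of the ping-pong idea. It is not the tool's actual source: the function name `ping_pong_latency` and the `PING`/`PONG` constants are made up for illustration, and because the standard library cannot pin threads to cores, the OS scheduler decides where the two threads run.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical flag values used only in this sketch.
const PING: u64 = 0;
const PONG: u64 = 1;

/// Bounces a shared flag between two threads with compare-and-exchange
/// and returns the average time per one-way hand-off.
/// Assumption: threads are NOT pinned; a real measurement would pin each
/// thread to a chosen core first.
fn ping_pong_latency(round_trips: u32) -> Duration {
    assert!(round_trips > 0, "need at least one round trip");
    let flag = Arc::new(AtomicU64::new(PING));

    let responder = {
        let flag = Arc::clone(&flag);
        thread::spawn(move || {
            for _ in 0..round_trips {
                // Wait until the flag reads PONG, then flip it back to PING.
                while flag
                    .compare_exchange(PONG, PING, Ordering::AcqRel, Ordering::Relaxed)
                    .is_err()
                {
                    std::hint::spin_loop();
                }
            }
        })
    };

    let start = Instant::now();
    for _ in 0..round_trips {
        // Flip PING -> PONG; this spins until the responder has reset the flag.
        while flag
            .compare_exchange(PING, PONG, Ordering::AcqRel, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
    }
    responder.join().unwrap();
    let elapsed = start.elapsed();

    // Each round trip consists of two hand-offs of the cache line.
    elapsed / (2 * round_trips)
}
```

Calling `ping_pong_latency(100_000)` from `main` and printing the result gives a rough per-hand-off figure. Without pinning, the number varies from run to run depending on which cores the scheduler picks.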
 
For those who do not know: In mainstream system architectures, cores do not really communicate directly with each other; for example, there are no instructions in the x86-64 ISA to "talk" to another core. A process runs threads on the cores, and the threads may communicate through shared memory. This puts a burden on the system to keep the contents of all the caches coherent, so that the threads can agree on a consistent view of memory. The rules that threads need to follow, and the guarantees that the system provides if they do, are called a memory model.

Hence, what is meant by "core-to-core latency" is actually memory access latency, taking into account the penalty of the cache coherency protocol, which in turn is very much affected by the amount of sharing (concurrent reading and writing, in particular).
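
To make the point about sharing concrete, here is a rough Rust sketch that compares two threads hammering one shared counter against two threads that each write only their own counter. All names (`Padded`, `shared_counter`, `private_counters`) are hypothetical, and the 64-byte alignment assumes a 64-byte cache line, which is common but not universal.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;
use std::time::{Duration, Instant};

/// Counter aligned to 64 bytes so that two of them never share a cache line.
/// Assumption: the cache line size is 64 bytes.
#[repr(align(64))]
struct Padded(AtomicU64);

/// Two threads increment the SAME counter, so the cache line holding it
/// has to move between the cores on (nearly) every increment.
fn shared_counter(iters: u64) -> (u64, Duration) {
    let counter = AtomicU64::new(0);
    let start = Instant::now();
    thread::scope(|s| {
        for _ in 0..2 {
            s.spawn(|| {
                for _ in 0..iters {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    (counter.load(Ordering::Relaxed), start.elapsed())
}

/// Each thread increments its OWN padded counter, so no cache line is
/// written by more than one core.
fn private_counters(iters: u64) -> (u64, Duration) {
    let counters = [Padded(AtomicU64::new(0)), Padded(AtomicU64::new(0))];
    let start = Instant::now();
    thread::scope(|s| {
        for c in &counters {
            s.spawn(move || {
                for _ in 0..iters {
                    c.0.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    let total: u64 = counters.iter().map(|c| c.0.load(Ordering::Relaxed)).sum();
    (total, start.elapsed())
}
```

Both functions return the same total, so the only thing that differs is the elapsed time. On a multi-core machine the shared version would typically be expected to take longer, though the actual gap depends on the hardware and on where the threads get scheduled.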

Here is a great lecture on CPU caches and their effect on software performance. Spoiler: If a program's performance is limited by "core-to-core latency", the programmer may have done something really bad by accident or ignorance. Note that the final slides have great references to further in-depth material on this topic.


Main reference: "What Every Programmer Should Know About Memory", Ulrich Drepper, 2007
 
Intel Memory Latency Checker. Not open source, but free.
Thank you!
 