For Intel systems, how many clock cycles does it take to go from a logical address to a physical address?

chrstrbrts

Hello,

So, I'm making my way through the Intel bible, Intel 64 and IA-32 Architectures Software Developer’s Manual Combined Volumes 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, and 3D, and I'm reading about segmentation and paging.

I understand, basically, how to go from a logical address to a physical one that you put on a bus and hand over to a memory controller of some kind.

What I'm wondering, though, is how many clock cycles it takes.

There seem to be quite a few steps in referencing tables and directories and such.

That is, if you want to reach into physical memory and read or write a location, you have to access descriptor tables, page directories, and page tables, all of which are themselves located in physical memory.

So, one attempt at a memory fetch requires 5 or more memory fetches just to grab all the pointers you need to find the address of the memory location that you initially wanted.


So, I'd like to know how many clock cycles it takes to go from a logical address to a physical address if no TLB or paging-structure caches are used.

I'd also like to know how many clock cycles it takes to go from a logical address to a physical address if the TLB or paging-structure caches are used.

Thanks.
 

Schmide

Well, it doesn't take 5 memory fetches to load memory. The GDT can be thought of as always loaded (the descriptor is cached in the hidden part of the segment register when the selector is loaded), and with modern page-based OSes the LDT is generally unnecessary.

So say you're working on a 32-bit system without PAE. That means a two-level paging structure for a 4 GiB space: a 1024-entry page directory pointing to 1024-entry page tables, with each page table entry mapping a 4 KiB page.

If the translation is in the TLB, you can almost bet that the data is in the cache system somewhere, so you're looking at 4-75 cycles. Otherwise you're looking at 3 memory accesses (page directory entry, page table entry, then the data itself) at 60+ cycles each. Now, if that page is not in memory, you're loading the data from disk and, well, that's in the range of 500k cycles.
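To make the 1024 x 1024 layout concrete, here is a minimal sketch of how a 32-bit linear address splits into the three fields the page walk uses. It assumes classic 32-bit paging with 4 KiB pages and no PAE; the function name `split_linear_address` and the sample address are made up for illustration.

```python
# Sketch only: classic 32-bit paging, 4 KiB pages, no PAE (assumed).
PAGE_SHIFT = 12                      # low 12 bits = byte offset within the page
INDEX_BITS = 10                      # 2**10 = 1024 entries per directory/table
INDEX_MASK = (1 << INDEX_BITS) - 1


def split_linear_address(linear: int) -> tuple[int, int, int]:
    """Return (page directory index, page table index, page offset)."""
    if not 0 <= linear < 2**32:
        raise ValueError("expected a 32-bit linear address")
    dir_index = (linear >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK  # bits 31..22
    table_index = (linear >> PAGE_SHIFT) & INDEX_MASK               # bits 21..12
    offset = linear & ((1 << PAGE_SHIFT) - 1)                       # bits 11..0
    return dir_index, table_index, offset


# Hypothetical address, chosen only as an example.
print(split_linear_address(0x08049F2C))  # (32, 73, 3884)
```

Each index selects one entry in a table that lives in memory, which is where the "3 memory accesses" on a TLB miss come from: one for the directory entry, one for the table entry, one for the data.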

Edit: on this.
So, I'd like to know how many clock cycles it takes to go from a logical address to a physical address if no TLB or paging-structure caches are used.

If you're not using paging, the linear address that segmentation produces (segment base + offset) is the physical address. In general it takes essentially zero extra time for the processor to compute this. Because of caching, paging adds, on average, very little on top of any address calculation. Some say at most 10%. Pretty much once you go to main memory, it's all the same.
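As a rough illustration of the segmentation step, the sketch below adds a segment base to an offset after a limit check. It assumes a 32-bit, expand-up segment; the function name `logical_to_linear` is hypothetical, and real hardware does this with the cached descriptor rather than anything resembling a function call.

```python
# Sketch only: 32-bit protected mode, expand-up segment (assumed).
def logical_to_linear(segment_base: int, offset: int, limit: int) -> int:
    """Return the linear address for (segment base, offset)."""
    if offset > limit:
        # Real hardware would raise a fault here instead.
        raise ValueError("offset exceeds segment limit")
    return (segment_base + offset) & 0xFFFFFFFF  # wrap to 32 bits


# Flat model, as most modern OSes set it up: base 0, limit 4 GiB - 1.
print(hex(logical_to_linear(0x0, 0x1234, 0xFFFFFFFF)))  # 0x1234
```

With a flat segment the addition is a no-op, which is why the segmentation step costs essentially nothing in practice.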

Here's a good table. https://gist.github.com/jboner/2841832
Code:
Latency Comparison Numbers
--------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD
Read 1 MB sequentially from memory     250,000   ns      250 us
Round trip within same datacenter      500,000   ns      500 us
Read 1 MB sequentially from SSD*     1,000,000   ns    1,000 us    1 ms  ~1GB/sec SSD, 4X memory
Disk seek                           10,000,000   ns   10,000 us   10 ms  20x datacenter roundtrip
Read 1 MB sequentially from disk    20,000,000   ns   20,000 us   20 ms  80x memory, 20X SSD
Send packet CA->Netherlands->CA    150,000,000   ns  150,000 us  150 ms

Notes
-----
1 ns = 10^-9 seconds
1 us = 10^-6 seconds = 1,000 ns
1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns

Credit
------
By Jeff Dean:               http://research.google.com/people/jeff/
Originally by Peter Norvig: http://norvig.com/21-days.html#answers

Contributions
-------------
Some updates from:       https://gist.github.com/2843375
'Humanized' comparison:  https://gist.github.com/2843375
Visual comparison chart: http://i.imgur.com/k0t1e.png
Animated presentation:   http://prezi.com/pdkvgys-r0y6/latency-numbers-for-programmers-web-development/latency.txt
 

Merad

So, I'd like to know how many clock cycles it takes to go from a logical address to a physical address if no TLB or paging-structure caches are used.

I'm assuming you mean "how long does it take when there's a TLB hit", since all address translation (AFAIK) goes through the TLB. Modern CPUs have a TLB hierarchy and AFAIK the L1 TLB is basically on par with the L1 d-cache at ~4 cycles to access.

I'd also like to know how many clock cycles it takes to go from a logical address to a physical address if the TLB or paging-structure caches are used.

There isn't really an easy answer to this, because resolving a TLB miss is a complex operation with many factors involved. Modern x86 has dedicated hardware to walk the page tables and resolve the miss, and I believe even has facilities to perform speculative searches before a miss actually occurs, as well as supporting multiple simultaneous searches. I don't know many details about them, though. They're probably somewhere in those manuals you're reading.

The main potential problem is that the page tables may not be resident in memory. In the absolute worst case, I suppose you could take many (several? a dozen? I'm not sure) page faults while trying to resolve the correct PTE, and each page fault will cost you millions of cycles. That's highly unlikely, however. IIRC the "typical" time to resolve a TLB miss is ~100 cycles.
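For a feel of how those two numbers combine, here is a small sketch of the usual weighted-average estimate, using the ~4 cycle hit and ~100 cycle miss figures from this post. The 99% hit rate and the 3 GHz clock are assumptions picked for the example, not measurements, and both function names are made up.

```python
# Sketch only: hit rate and clock frequency below are assumed values.
def average_translation_cycles(hit_rate: float, hit_cycles: float,
                               miss_cycles: float) -> float:
    """Expected cycles per translation for a given TLB hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


def ns_to_cycles(nanoseconds: float, clock_ghz: float) -> float:
    """Convert a latency in ns to cycles; 1 GHz is one cycle per ns."""
    return nanoseconds * clock_ghz


# ~4 cycles on a hit, ~100 on a miss, assumed 99% hit rate.
print(round(average_translation_cycles(0.99, 4, 100), 2))  # 4.96
# A 100 ns main memory reference at an assumed 3 GHz clock.
print(ns_to_cycles(100, 3.0))  # 300.0
```

Even with a miss costing 25x a hit, a high hit rate keeps the average close to the hit cost, which fits the "very little on top" claim earlier in the thread.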
 

Ken g6

In practice, how would one use this information?
Try to limit the amount of memory you need access to at any one time to something that fits in the L1 cache. If that's not possible, try for something that fits in the L2 cache. If that's not possible either, consider prefetching - though it may not be as effective as it used to be.
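A quick way to apply that rule of thumb is to compare a working-set size against the cache sizes of the target CPU. The sketch below does only that comparison; the 32 KiB L1d and 256 KiB L2 sizes are assumed example values (check your own CPU), and `smallest_fitting_level` is a hypothetical helper name.

```python
# Sketch only: cache sizes are assumed examples, not values for any specific CPU.
ASSUMED_CACHES = [("L1d", 32 * 1024), ("L2", 256 * 1024)]


def smallest_fitting_level(working_set_bytes: int, cache_sizes=ASSUMED_CACHES) -> str:
    """Return the first (smallest) cache level that can hold the working set."""
    for name, size in cache_sizes:
        if working_set_bytes <= size:
            return name
    return "main memory"


# A 100 KiB working set overflows the assumed L1d but fits the assumed L2.
print(smallest_fitting_level(100 * 1024))  # L2
```

This ignores associativity, other data sharing the cache, and the TLB's own reach, so treat it as a first-pass check only.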
 

chrstrbrts

In practice, how would one use this information?

LOL... If you follow my lines of questioning here, you'll see that I'm mostly concerned with hypotheticals and with gathering knowledge for its own sake.

Practical I am not.
 