More info regarding Phenom TLB issues


taltamir

Lifer
Mar 21, 2004
13,576
6
76
Held to a higher standard than Intel?!
AMD is the underdog and everyone is giving it the benefit of the doubt, but you can only take that so far.

BTW I looked up the Intel erratum (which was handled much better by the Intel press, execs, etc.). It basically allows people to bypass the no-execute bit, a hardware method for blocking a specific kind of virus (a hardware method that didn't EXIST in older chips anyway)...

While it is kind of bad that a crippled virus will now work again (assuming you run it to begin with), it's not nearly as bad as a 13% average drop in performance (0-59% depending on the application), as well as lying about the default speed of HyperTransport (another few percent drop), as well as hiding the whole shebang.
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
If we can get away from the Intel/AMD issues for a bit, I have some more info on the TLB bug directly from an AMD source...

x86-org discussion group: https://www.x86-64.org/piperma...7-December/010259.html
(please note that the link's site certification is messed up)
Edit: The link is no longer working...I will try to get a copy of the posting here.

What I've learned so far about the nature of the bug:

1. The problem is due to the inclusive nature of the L3 cache. This means that a page table entry can simultaneously be in the L1 or L2 cache for one or more cores, and in the L3 cache shared among cores. When one CPU core needs to update the page table entry, it has to allow for the case that it is in more than one cache.

2. For the erratum to have an effect, two cores must act. One core (call it A) needs to access the cache line (and mark it as accessed). The page table entry is not in L2 but happens to be in L3, so the CPU copies it to L2 and marks the line as accessed. Another core, B, comes along during this process, also finds the cache line not in any L1 or L2 cache, copies the cache line to its own L2 cache, and then marks it. However, it changes a different bit.

3. According to the write-up, the L2 copies are updated consistently. But what if the L2 caches are updated in order AB, and the L3 cache in order BA? That is the race condition, and it takes additional traffic to the L3 cache to cause that misordering. Now we have valid copies in L2, but an invalid copy in L3. Nothing is really broken yet, since the L2 copies, which will be seen by all CPUs, are consistent. However, cache pressure can, but usually won't, force the L2 copies to be evicted while the invalid L3 copy stays around. Now you potentially have a dirty page table entry getting lost. Note that the wrong L3 copy is not itself the problem; the problem is that the page the entry describes won't get written out when it should be.

4. Finally, how do you fix this erratum? You change the order of the updates of L3 (if present) and L2, and keep the local L2 locked against modification while doing so. This will cause a performance hit if there are page table entries in the L3 cache. That can be ameliorated by ensuring that page table entries are loaded directly to L2, bypassing L3. You still need to deal with a potential copy in L3, as the operating system will access page table entries (even if just to create them) through ordinary code.

(My thanks to Robert E. for these points...)
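
To make points 2 and 3 concrete, here is a rough toy model of the ordering problem in Python. It is only a sketch of the description above, not of the real hardware: the function names are made up for illustration, and it assumes each core sends its whole-line value to L3 and that the last write to arrive wins. The only hard facts in it are the x86 page table entry bit positions (Accessed is bit 5, Dirty is bit 6).

```python
# Toy model of the race in points 2 and 3. NOT a model of the real silicon.
# Assumption: each core writes its whole-line value to L3, last writer wins.

ACCESSED = 0x20  # x86 page table entry bit 5 (A)
DIRTY = 0x40     # x86 page table entry bit 6 (D)


def simulate(l3_order):
    """Replay the two updates; l3_order is the order the writes reach L3."""
    pte = 0  # the entry as it initially sits in L3, neither bit set

    # The L2 copies are updated consistently, in order A then B:
    # core A sets ACCESSED, then core B sets DIRTY on top of A's value.
    writes = {}
    l2 = pte
    l2 |= ACCESSED
    writes["A"] = l2  # value core A sends towards L3
    l2 |= DIRTY
    writes["B"] = l2  # value core B sends towards L3

    # L3 applies the writes in whatever order they happen to arrive.
    l3 = pte
    for core in l3_order:
        l3 = writes[core]
    return l2, l3


def dirty_bit_lost(l2, l3):
    """True if evicting the L2 copies would leave an L3 copy without DIRTY."""
    return bool(l2 & DIRTY) and not (l3 & DIRTY)


for order in ("AB", "BA"):
    l2, l3 = simulate(order)
    print(order, hex(l2), hex(l3), "dirty lost:", dirty_bit_lost(l2, l3))
```

With the L3 writes arriving in order AB, both levels agree. With order BA, the L2 copies still carry both bits but L3 ends up with only the Accessed bit, so if the L2 copies get evicted, the fact that the page was dirtied is gone.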
 

AmberClad

Diamond Member
Jul 23, 2005
4,914
0
0
The fix that you just described, is that the current microcode update? Or is that what the permanent solution implemented in the future revisions will entail?
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: AmberClad
The fix that you just described, is that the current microcode update? Or is that what the permanent solution implemented in the future revisions will entail?

I believe that's what the patch does...
 

CTho9305

Elite Member
Jul 26, 2000
9,214
1
81
Originally posted by: Viditor
If we can get away from the Intel/AMD issues for a bit, I have some more info on the TLB bug directly from an AMD source...

x86-org discussion group: https://www.x86-64.org/piperma...7-December/010259.html
(please note that the link's site certification is messed up)
Edit: The link is no longer working...I will try to get a copy of the posting here.
It works fine - you botched the link. Here is the first post, and here is the thread index.
 

Viditor

Diamond Member
Oct 25, 1999
3,290
0
0
Originally posted by: CTho9305
It works fine - you botched the link. Here is the first post, and here is the thread index.

Many thanks CTho...do you have any comment on the post?
 

CTho9305

Elite Member
Jul 26, 2000
9,214
1
81
Originally posted by: Viditor
Many thanks CTho...do you have any comment on the post?

Not really, sorry. I found this interesting, though - it fits with some other observations about crappy things that Gecko (Firefox, SeaMonkey, Thunderbird, etc.) does. There are claims that the Linux patch results in a ~1% performance hit, which is interesting because it means that the A and D bits aren't actually all that significant from a performance standpoint. Given that this Linux patch is available now, it'd be interesting if someone could run benchmarks using it.

edit: clarifying that I'm looking for Linux-patch benchmarks, not the BIOS-patch benchmarks myocardia linked to in the post below this one.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Originally posted by: myocardia
Originally posted by: CTho9305
Given that the patch is available now, it'd be interesting if someone could run benchmarks using it.

Someone has already run benchmarks using it.

Absolutely dismal results.

By the time they fix this, a correctly functioning (if there is such a thing) K10 CPU will be a year late.

At least AMD is consistent.

 

CTho9305

Elite Member
Jul 26, 2000
9,214
1
81
Originally posted by: myocardia
Originally posted by: CTho9305
Given that the patch is available now, it'd be interesting if someone could run benchmarks using it.

Someone has already run benchmarks using it.

I was referring to the Linux patch, NOT the BIOS patch. The Linux patch is the one with supposedly minimal performance impact (it's not fundamentally Linux-specific - someone with the source code to Windows could do the same thing).

Originally posted by: ViRGE
Originally posted by: myocardia
Originally posted by: CTho9305
Given that the patch is available now, it'd be interesting if someone could run benchmarks using it.

Someone has already run benchmarks using it.
Holy cow, look at those Firefox results!:Q :(

Firefox does Very Bad Things that happen to be exacerbated by the BIOS patch. I'd like to see the results with the Linux patch.