- Nov 19, 2007
Discuss...
TLB bug is gone! Fix has no real performance impact.
I guess that's one less thing to worry about.
Originally posted by: BlueAcolyte
TLB bug is gone! Fix has no real performance impact.
Originally posted by: lopri
What's the consensus when it comes to the best motherboard for Phenom? Stability first and overclockability second. Features, layout, etc. not considered.
Originally posted by: taltamir
Actually, notice that the B3 Phenom is slightly faster than the B2 without the BIOS fix. They used WinRAR because it was the biggest sufferer from the BIOS fix, losing 72.8% of its speed.
Nobody publishes standard deviations, so we don't know the confidence intervals.
Originally posted by: Idontcare
Not sure how much benching you do with WinRAR, but it is no bastion of repeatability. Results that land in the same "ballpark" should simply be treated as equivalent.
You can't take a Phenom B2, or any chip that uses multi-level cache, and hard-wire it to force cache-evictions and suddenly get higher performance. It simply doesn't work that way.
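To make the repeatability point concrete, here is a minimal sketch of how repeated benchmark runs could be summarized with a standard deviation and a rough 95% confidence interval. The function names and the timings in the usage example are hypothetical, not numbers from any review, and the overlap check is only a quick sanity test, not a formal significance test.

```python
import math
import statistics


def summarize_runs(times_s, t_critical=2.776):
    """Return (mean, sample standard deviation, 95% CI half-width).

    t_critical is the two-sided 95% Student's t value. The default of
    2.776 is only correct for 5 runs (4 degrees of freedom); look up a
    different value if you use a different number of runs.
    """
    n = len(times_s)
    if n < 2:
        raise ValueError("need at least two runs to estimate spread")
    mean = statistics.mean(times_s)
    stdev = statistics.stdev(times_s)  # sample standard deviation
    half_width = t_critical * stdev / math.sqrt(n)
    return mean, stdev, half_width


def intervals_overlap(runs_a, runs_b):
    """Rough check: do the two 95% confidence intervals overlap?"""
    mean_a, _, hw_a = summarize_runs(runs_a)
    mean_b, _, hw_b = summarize_runs(runs_b)
    return abs(mean_a - mean_b) <= hw_a + hw_b


if __name__ == "__main__":
    # Hypothetical WinRAR timings in seconds, five runs per chip.
    b2_runs = [41.2, 40.1, 42.3, 40.8, 41.6]
    b3_runs = [40.6, 41.9, 40.2, 41.1, 40.9]
    for name, runs in (("B2", b2_runs), ("B3", b3_runs)):
        mean, stdev, hw = summarize_runs(runs)
        print(f"{name}: mean={mean:.2f}s stdev={stdev:.2f}s 95% CI=+/-{hw:.2f}s")
    print("intervals overlap:", intervals_overlap(b2_runs, b3_runs))
```

If the intervals overlap, a "slightly faster" result is within run-to-run noise and the two chips are best called equivalent on that test.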
Originally posted by: CTho9305
Actually, if hitting in the L3 is faster than hitting in another processor's L2, it could improve performance, couldn't it? I hadn't thought about that before. It shouldn't be too hard for a good programmer to write a test program that thrashes the TLB from multiple threads.
Originally posted by: Idontcare
This is a good point. Hyperthreading spent a couple of years in the doghouse because its corner cases of thread thrashing actually degraded performance. So I have no doubt that a thread which gets bounced around (when Windows moves threads from core to core to core) will actually benefit from having its prior L2 data evicted to the L3 as Windows moves it to another core. Then, when the thread accesses its own data (otherwise stuck in the other core's L2), at least it only has to hit the L3.
Thread thrashing on my Intel quads is awful under WinXP. When I run four single-threaded applications that each saturate a CPU core, I see upwards of a 15% performance penalty from thread thrashing unless I lock the affinity for each thread.
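For anyone who wants to try locking affinity from a script, here is a minimal sketch. It assumes a platform where Python exposes `os.sched_setaffinity` (Linux, for example). That call is not available on Windows, so this is an analogue of the idea rather than the XP procedure described above, and the helper name `pin_current_process` is made up for illustration.

```python
import os


def pin_current_process(cpu):
    """Pin the calling process to one logical CPU; return the new affinity set.

    Relies on os.sched_setaffinity, which Python only provides on some
    Unix platforms (notably Linux). It does not exist on Windows.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("os.sched_setaffinity is not available here")
    allowed = os.sched_getaffinity(0)  # pid 0 means the calling process
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        first_cpu = min(os.sched_getaffinity(0))
        print("now pinned to:", pin_current_process(first_cpu))
    else:
        print("affinity control is not exposed by Python on this platform")
```

Once pinned, the scheduler can no longer migrate the process, so its working set stays in that core's cache. To compare against the unpinned case, time the same workload both ways over several runs.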
Originally posted by: soccerballtux
Wow, that is very interesting!
Originally posted by: VirtualLarry
The same factor is probably why using Affinity Changer for F@H gives better PPD: it pins down the threads and prevents cache thrashing.
Originally posted by: CTho9305
It's not just caches - you throw away all of your branch prediction history when you move a thread across processors.
Originally posted by: VirtualLarry
IDC, have you tested XP with the MS multicore patch V4 (or newer)? Supposedly that helps with the thread thrashing, so I've read.
Originally posted by: Idontcare
If it isn't something that gets taken care of by Microsoft's auto-update, then no, I haven't tested it.
If Microsoft isn't confident enough in it to make it part of their auto-update KBs, then I wouldn't put it on my computers anyway. I have enough troubles as it is without inviting new ones that even Microsoft's QC isn't willing to stand behind.
But if this is something they are "patching" on XP, then I take it this is no longer an issue on Vista? Anyone care to pop open Task Manager on a dual-core or a quad-core and watch the CPU loads while running a single-threaded app? Does it peg just one core at 100% utilization, or do you see all cores getting hit with the average utilization number (50% on dual-core, 25% on quad)?