I just wanted to chime in about VRM cooling affecting overclocks. I don't have an FX (still undecided), but when I lowered the fans on my Cogage Arrow so that some air would flow over the large VRM heatsink, I could reduce both the CPU-NB voltage (which reduced temps dramatically) and the core voltage. I've left my PC folding for the day, and if I don't find that it has crashed then I think we can confirm the theory that certain Asus AMD motherboards need better cooling for the VRMs (I have a Sabertooth 990FX).
😱 Wow, very cool to hear this :thumbsup: Thanks for taking the time to weigh in and confirm the observation (or rather, mine confirms yours 😉)
That at least suggests it isn't entirely a fluke or a one-off situation.
I have one of those IR guns that measures the surface temperature of objects. I think I'll explore this a little more and see just what kind of temperature delta I am getting with the extra cooling :hmm:
It blows my mind too, but for the opposite reason: the FX 8350 actually has equal IPC to a Core2Quad? Even the first batch of Denebs were supposedly slightly slower clock for clock than C2Q, and Bulldozer was far slower than Deneb clock for clock. Piledriver improved on that a wee bit, but to my understanding it still isn't at Deneb levels. And yet here it is, well within Core2Quad levels or slightly better. What is it about Gaussian that makes Piledriver suddenly look Deneb/C2Q level in single-thread performance? How would you characterize the computations or other processing Gaussian does?
This particular app has traditionally strongly favored AMD's microarchitectures, starting with the K7 Athlon. I suspect what we are seeing here is that AMD continued to hold this commanding IPC lead over Intel all the way up through Thuban, and that the lead was so large that whatever step back in IPC Piledriver might be relative to Thuban, it still isn't enough to erase its IPC lead over the Kentsfield microarchitecture.
No,
You have to compare 8x single Bulldozer Cores (Half Module) against 8 CMT Bulldozer Cores (4x Modules)
An example is to use 2x Modules CMT (4 Threads) against 4x Modules (4 Threads) No CMT.
This is the only way to see how the CMT design works.
AMD’s Bulldozer CMT Scaling
Comparing 8 K10 cores (8 threads) against 4 Bulldozer modules (8 threads) is not an apples-to-apples comparison.
I have some data on that for this app.
At 4GHz, loading two modules with one thread each results in a compute time per thread of 330 seconds.
Loading both of those threads onto one module results in a compute time per thread of 376 seconds.
In this application the "CMT tax" is quite small: 330/376 = 0.88x, or roughly a 14% loss in scaling efficiency (1/0.88 ≈ 1.14, i.e. each thread takes about 14% longer) for the FX-8350.
Compare this to the "HT tax" on the 3770k: at 4GHz, loading two physical cores with one thread each results in a compute time per thread of 218 seconds (ridiculously faster than the FX-8350).
But load both threads onto a physical + virtual core pairing and the compute time per thread balloons to 397 seconds.
In this application the "HT tax" is quite large: 218/397 = 0.55x, or roughly an 82% loss in scaling efficiency (1/0.55 ≈ 1.82, i.e. each thread takes about 82% longer) for the 3770k 😱
(100% loss in scaling efficiency would be tantamount to the performance you would expect in putting two threads onto just one single-threaded core)
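For anyone who wants to redo the arithmetic with their own timings, here is a minimal Python sketch of the calculation above. The function name `sharing_tax` and the returned field names are made up for illustration; the only real inputs are the four per-thread compute times quoted in this post. It treats "loss in scaling efficiency" as the percentage increase in per-thread compute time when two threads share one module or core, which is how the 14% and 82% figures were derived.

```python
def sharing_tax(separate_s: float, shared_s: float) -> dict:
    """Compare per-thread compute time for two threads on separate
    modules/cores versus both threads sharing one module/core.

    separate_s: seconds per thread with one thread per module/core
    shared_s:   seconds per thread with both threads on one module/core
    """
    ratio = separate_s / shared_s                    # e.g. 330/376
    slowdown_pct = (shared_s / separate_s - 1) * 100  # extra time per thread, in %
    return {"ratio": ratio, "slowdown_pct": slowdown_pct}


# Timings quoted above, all at 4GHz.
fx8350 = sharing_tax(separate_s=330, shared_s=376)    # CMT: two modules vs one module
i7_3770k = sharing_tax(separate_s=218, shared_s=397)  # HT: two cores vs core + virtual core

for name, result in (("FX-8350 (CMT)", fx8350), ("3770k (HT)", i7_3770k)):
    print(f"{name}: ratio {result['ratio']:.2f}x, "
          f"per-thread time +{result['slowdown_pct']:.0f}%")
```

With these inputs the ratios come out to 0.88x and 0.55x, and the per-thread slowdowns to roughly 14% and 82%. A result of 100% corresponds to the reference case of two threads time-slicing a single single-threaded core, where each thread takes twice as long.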
edit: I see the powers that be have managed to torpedo your URLs again 🙁 Given enough time, I have every confidence we'll eventually find ourselves censoring anandtech.com too and righteously proclaiming "mission accomplished" over the spammers in the process 😀