Why is the response to Bulldozer so overwhelmingly negative?


StrangerGuy

Diamond Member
May 9, 2004
8,443
124
106
In before how x86 doesn't matter anymore because ARM is going to rule over Intel but somehow AMD remains unaffected so AMD wins.
 
Dec 30, 2004
12,553
2
76
A Korean site had done a pretty thorough investigation of this behavior on release day.

http://udteam.tistory.com/442 (comparison of 2M/4C versus 4M/4C)

http://udteam.tistory.com/440 (scores from other CPUs of the benches used above)
http://udteam.tistory.com/441

In short:

Pro: Measurable performance increase

Con: At <= 4 threads, still loses to the lower clocked Phenom II 980 in everything except the WinRar benchmark.
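To make the 2M/4C versus 4M/4C distinction concrete, here is a minimal sketch of the two placements. It assumes a hypothetical core numbering in which module m owns cores 2m and 2m+1; the function name `pick_cores` and its parameters are made up for illustration and are not any OS scheduler's actual API.

```python
def pick_cores(n_threads, n_modules=4, cores_per_module=2, spread=True):
    """Return the core indices that n_threads threads would be placed on.

    Assumption (illustration only): module m owns the cores
    m*cores_per_module .. m*cores_per_module + cores_per_module - 1.
    """
    total = n_modules * cores_per_module
    if n_threads < 0 or n_threads > total:
        raise ValueError("thread count must be between 0 and the core count")
    if spread:
        # Use one core in every module before doubling up (4M/4C for 4 threads).
        order = [m * cores_per_module + c
                 for c in range(cores_per_module)
                 for m in range(n_modules)]
    else:
        # Fill each module completely before moving on (2M/4C for 4 threads).
        order = list(range(total))
    return order[:n_threads]


if __name__ == "__main__":
    print("4M/4C:", pick_cores(4, spread=True))
    print("2M/4C:", pick_cores(4, spread=False))
```

With four threads, the spread placement gives each thread a module to itself, while the packed placement makes pairs of threads share a module's front end and FPU. That sharing is presumably where the measured difference comes from.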

Prefer the Techreport bench for some reason...easier to read.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,112
136
Well, that's actually pretty interesting, at least for heavily threaded apps. It would be nice if M$ would add a revised scheduler to Win7 SP2.

It's easier to see why Cray didn't ditch Interlagos, with their custom OS and custom compiler, they can probably really tweak the typically heavily threaded apps to run very fast on Bulldozer.
 

Ferzerp

Diamond Member
Oct 12, 1999
6,438
107
106
Well, that's actually pretty interesting, at least for heavily threaded apps. It would be nice if M$ would add a revised scheduler to Win7 SP2.

It's easier to see why Cray didn't ditch Interlagos, with their custom OS and custom compiler, they can probably really tweak the typically heavily threaded apps to run very fast on Bulldozer.

By only using half the "cores?"

You realize what you're saying here, right?
 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
Well, that's actually pretty interesting, at least for heavily threaded apps. It would be nice if M$ would add a revised scheduler to Win7 SP2.

It's easier to see why Cray didn't ditch Interlagos, with their custom OS and custom compiler, they can probably really tweak the typically heavily threaded apps to run very fast on Bulldozer.

Unless they have some magic power-saving bullet, however, they will also be spending a lot more money to run the machine.
 

Chiropteran

Diamond Member
Nov 14, 2003
9,811
110
106
By only using half the "cores?"

You realize what you're saying here, right?

Bulldozer actually does really well on the few applications that stress all 8 cores. Its main problem is less threaded applications that use between 1 and 4 cores.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,112
136
Unless they have some magic power-saving bullet, however, they will also be spending a lot more money to run the machine.

The server CPUs will be clocked down quite a bit, so it should be pretty efficient - maybe not as efficient as SB Xeons, but probably close enough.
 

ocre

Golden Member
Dec 26, 2008
1,594
7
81
Since when did performance/watt become king? Why are we not looking at absolute performance?

Since AMD used it as a selling point in the GPU division. All over the forums there has been a huge focus on performance per watt because AMD had the advantage over NVIDIA. The advantage isn't really that big now with Cayman vs the GTX 500 series, but it's still thrown up all the time. AMD and the fanboys made it a big deal, the biggest deal possible, and now it's come back to bite them in the....... buttocks
 

Arg Clin

Senior member
Oct 24, 2010
416
0
76
Since AMD used it as a selling point in the GPU division. All over the forums there has been a huge focus on performance per watt because AMD had the advantage over NVIDIA. The advantage isn't really that big now with Cayman vs the GTX 500 series, but it's still thrown up all the time. AMD and the fanboys made it a big deal, the biggest deal possible, and now it's come back to bite them in the....... buttocks
Surely I must have missed some of the finer points of previous fanboy vs fanboy debates. Thank you for the explanation. It actually makes sense by its sheer absurdity. :)
 

LOL_Wut_Axel

Diamond Member
Mar 26, 2011
4,310
8
81
Since AMD used it as a selling point in the GPU division. All over the forums there has been a huge focus on performance per watt because AMD had the advantage over NVIDIA. The advantage isn't really that big now with Cayman vs the GTX 500 series, but it's still thrown up all the time. AMD and the fanboys made it a big deal, the biggest deal possible, and now it's come back to bite them in the....... buttocks

What are you talking about? Performance/watt has ALWAYS been a big deal. It's the whole reason why the Athlon 64 was much better overall than the Pentium 4.

And yes, AMD currently does have a significant advantage in that metric over NVIDIA, not to mention most AMD GPUs are priced more competitively, just further sweetening the deal.

perfwatt_1920.gif


The first 13 cards in that chart are all by AMD and include some very popular ones. To say that AMD's GPUs currently don't have a good advantage in performance/watt against NVIDIA's is denial.

Talking about CPUs, Bulldozer looks horrible in this metric, and looking at its lackluster performance it's no wonder why it's up there with the Pentium 4 when it comes to bad CPU designs.

efficiency_total_wh.png


At least Fermi could defend itself a bit by the way of having very high performance.
 

piesquared

Golden Member
Oct 16, 2006
1,651
473
136
Because programming is lagging while going through a pretty big transition, and not all software has been properly coded to be multicore-aware yet (but couldn't reviewers at least use up-to-date software?????), Bulldozer is a bit ahead of its time. Fortunately, the landscape is changing and more and more software is available either for GPGPU hardware acceleration (OpenCL, MS C++ AMP, DirectCompute) or for the CPU through the newest compilers with FMA4, XOP and AVX. Or all of them with Fusion products.

Appro seems to like what it sees in Interlagos where they take advantage of what they're given.

HPCwire: Why Appro and why AMD Opteron™-powered servers?

McLaughlin: Appro offers innovative supercomputing solutions by combining the performance advantages of AMD Opteron processors such as higher core count, significant memory enhancements, and improved floating point processing with optimized power management features. Being on the forefront of system cooling and optimization, Appro offers many configuration options and having features such as performance-on-demand by dynamically adjusting performance based on CPU utilization – helps systems to run at optimum performance and power levels, reducing electricity costs while maximizing IT budget dollars. Appro platforms based on AMD “Interlagos” processors will deliver greater performance per watt with improved memory bandwidth while reducing memory latency. AMD has been a great technology partner and has provided us with the tools and expertise to accelerate HPC business results. Working together, Appro and AMD provide customer-focused innovations that our High-Performance computing customers require.


HPCwire: Can we talk about the Appro Xtreme-X Supercomputer and new servers based on “Interlagos”? What can we expect?

McLaughlin: The Appro next generation, server platforms and the Xtreme-X Supercomputer will increase memory performance for high-performance computing while adding reliability, high availability, and flexibility for the end-users with the best performance/value. In addition, the new AMD architecture drastically improves the memory and I/O latency. It provides a much easier way to address larger chunks of memory faster, yielding huge performance boosts.

HPCwire: Who will benefit from Appro Xtreme-X Supercomputer and server platforms based on AMD Opteron™ processors?

McLaughlin: Many industries will benefit, in particular industries that are using memory-intensive applications, such as those used in scientific research and engineering. Compute-intensive multi-threaded applications will also see an immediate benefit.


HPCwire: Many people in the industry are eager to see what the AMD “Interlagos” processors can offer. Where do things stand with the Appro Xtreme-X Supercomputer based on these processors?

McLaughlin: We're on target for a Q4 launch with “Interlagos”-based systems. We're actually pretty excited about the product, particularly with respect to some of the features that we're going to enable. We started out Appro’s next generation Xtreme-X, with a certain feature set. After gathering customer feedback on the design, we changed the features to address faster hybrid multi-core technologies with improved power and cooling, flexible Interconnect options with fast I/O bandwidth, superior HPC software stack integration with complementary Appro Cluster Engine™ (ACE) management suite, including capabilities such as job scheduling, revision control, and fault tolerance. Taken together, these features are very important to the HPC industry. As a result, we think that our next generation product will have a bigger impact than we originally anticipated.

HPCwire: The AMD “Interlagos” processor offers some interesting innovations. How are you able to take full advantage of the processor’s capabilities?

McLaughlin: The Appro Xtreme-X supercomputer will offer enhancements to match the new processor and future generations. To accomplish this, we made significant changes to the Xtreme-X architecture to complement the new AMD “Interlagos” processor. We looked at the overall platform including thermal, mechanical, and chassis cooling design, as well as various interconnects so we can continue to deliver on quality, value, and performance.

HPCwire: As we know, whenever a new processor is announced customers need more than just a spec sheet to understand how the processor will help their specific HPC applications. What’s been done to prepare the market for the AMD “Interlagos” processors?

McLaughlin: The AMD Seed program is the key. It is definitely important and critical for our industry. People want to kick the tires, especially when they are getting a lot more cores to make sure their applications can benefit and can reach a quicker ROI. Through the seed program, we’ve provided samples to customers and we also performed customer benchmark testing in Appro’s Houston and Milpitas facilities. In response to that, we already have customers lined up and more announcements will be made in the near future.

This answer shows that Appro isn't just fulfilling a prior commitment. They knew exactly what Interlagos was capable of before making any design decisions.

http://www.hpcwire.com/hpcwire/2011...7;E2%80%9Cinterlagos%E2%80%9D_processors.html
 

bononos

Diamond Member
Aug 21, 2011
3,928
186
106
Software not properly coded? If AMD is ahead of its time, why do its multithreaded benchmark results (all the way to 8 threads) only lie between the 4-core 2500K and 2600K? Can faulty Win7 scheduling account for most of the shortfall?

One reviewer (hardocp) I think also said that his Intel setup was more responsive in stress testing (full cpu/memory load) compared to the BD rig.
 

LOL_Wut_Axel

Diamond Member
Mar 26, 2011
4,310
8
81
Software not properly coded? If AMD is ahead of its time, why do its multithreaded benchmark results (all the way to 8 threads) only lie between the 4-core 2500K and 2600K? Can faulty Win7 scheduling account for most of the shortfall?

One reviewer (hardocp) I think also said that his Intel setup was more responsive in stress testing (full cpu/memory load) compared to the BD rig.

That's the advantage of having higher single-threaded performance when comparing two multi-core CPUs. In almost all modern Operating Systems you gain an advantage in overall responsiveness and multitasking when you go from a single-core to a dual-core, but after a dual-core it's all single-threaded performance that makes the big difference. It's the reason why talking about older CPUs you'd prefer (only taking into account performance here, not power consumption) a Pentium D 805 over an Athlon 64 3000+ for a modern OS like Vista/Win 7. At the same time, it's a better idea to have a Core i3-2100 versus a Phenom II X4 955 for normal tasks because of the higher single-threaded performance.

TL;DR: an i5 would feel more responsive under load than an FX 8-core because of the higher single-threaded performance. IIRC, the fact it has a much faster L1 cache should help, too.
 

bononos

Diamond Member
Aug 21, 2011
3,928
186
106
That's the advantage of having higher single-threaded performance when comparing two multi-core CPUs. In almost all modern Operating Systems you gain an advantage in overall responsiveness and multitasking when you go from a single-core to a dual-core, but after a dual-core it's all single-threaded performance that makes the big difference......
Hmm I was thinking that maybe windows dedicates a thread to explorer among other things so on a heavily loaded system, the multicore PC with the faster single threaded perf wins out in terms of responsiveness. But if explorer itself was multithreaded....

Well that doesn't matter, can't keep shifting the goalposts.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
Hmm I was thinking that maybe windows dedicates a thread to explorer among other things so on a heavily loaded system, the multicore PC with the faster single threaded perf wins out in terms of responsiveness. But if explorer itself was multithreaded....

Well that doesn't matter, can't keep shifting the goalposts.
Explorer doesn't do enough actual work to need to be multithreaded. It will be mostly limited by disk and network performance, even with a fast SSD. For tasks which are not embarrassingly parallel, diminishing returns typically begin after two threads (making the assumption that if it was made to use more than one thread, going from one to two threads was worthwhile). Things like games can keep wringing out more performance because at a low level, they are themselves multitasking.

We will be able to use more and more cores at once, over time, but the performance of a single thread will not become unimportant. So, what you're thinking of does happen, but it has been happening for some time, and it hasn't become more useful than it was when we got the Athlon64 X2, nor will it. What you want to have, generally, is one more core than your most multithreaded application can make good use of.
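One way to put rough numbers on the diminishing-returns point is Amdahl's law. That model is not named anywhere in this thread, so treat it as an outside assumption, and the 50% parallel fraction below is a made-up figure for illustration only. A minimal sketch:

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Ideal speedup under Amdahl's law.

    parallel_fraction: share of the work that can run in parallel (0..1).
    n_cores: number of cores the parallel share is divided across.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


if __name__ == "__main__":
    # Hypothetical workload: half of the work parallelizes.
    for cores in (1, 2, 4, 8):
        print(cores, "cores:", round(amdahl_speedup(0.5, cores), 2))
```

Under that assumption, going from one to two cores buys about a third more throughput, while going from four to eight buys very little. The serial half still runs at single-thread speed no matter how many cores are added, which is why single-thread performance keeps mattering.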