Anyone have BullDozer *Distributed Computing* benchmarks?

VirtualLarry

No Lifer
Aug 25, 2001
56,587
10,227
126
As most of you probably know, BullDozer was officially released to the world last night at midnight, and not to a lot of acclaim. Many benchmarks show it SLOWER than AMD's older 6-core Thuban chip.

But all is apparently not lost. The new CPU supports some new opcodes that might cause it to show improvement:
http://forums.anandtech.com/showthread.php?t=2198168

Just wondering if anyone has DC-related benchmark info for BD, and whether or not their favorite project is going to be re-compiled to support BD's new opcodes?

It should be noted also that these chips burn A LOT of power when overclocked. Something like 200W MORE than an equivalent Intel SB CPU.
 

somethingsketchy

Golden Member
Nov 25, 2008
1,019
0
71
I've been doing some reading and from what I can tell, unless the DC projects support some of the newer instructions outlined (AVX, for example), I don't believe it would do much better than most. Granted, you do have eight discrete cores with their own L1/L2 cache, but I feel like the instruction set may not be enough to give the eight-core beasts a good run against the Sandy Bridge chips.

Some of the real-world benchmarks (e.g. the software build of Chromium) gave very disappointing numbers. Maybe DC projects are a key demographic for this kind of processor, but again I don't believe there are any projects that have AVX code written for Bulldozer chips.
 

Rudy Toody

Diamond Member
Sep 30, 2006
4,267
421
126
Perhaps Ken_g6 could tell us whether any of the new instructions could be used to good advantage on parallel DC projects.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,839
4,820
75
First of all, the AVX instruction set probably would help some projects on some chips. The problem with Bulldozer is that an 8-core chip has only 4 full AVX units. Each AVX unit will have to trade off between two cores. This can work if there's extra integer work to be done in the meantime; but an AVX instruction will also block any other floating-point work (for a cycle.) Efficient projects probably won't be helped much by AVX on Bulldozer, as they could be doing 2 SSEx operations in parallel on the two cores anyway.

If you're thinking of PrimeGrid sieves, I don't use floating-point in those. (Well, except for some 80-bit FPU math hijinks, but that's not helped by AVX either.) I'm waiting for Haswell and its integer equivalent to AVX. But the sieves could be obsolete by then anyway.
 

Rudy Toody

Diamond Member
Sep 30, 2006
4,267
421
126
Thanks, Ken!

I misunderstood the setup. I thought that only the use of double-precision FP blocked one core, but that each could use single-precision FP without getting in each other's way.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,839
4,820
75
Oh, yeah, we're way beyond double precision! (And yet not.)

Way back on the original Pentium, Intel noticed that there were these eight 80-bit registers (usually used for 64-bit double-precision math at most) that couldn't be used for anything integer. So they came up with MMX, which allowed a single instruction to operate on sets of either 8 bytes, 4 pairs-of-bytes ("words"), or occasionally two double-words (32-bit integers).
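To make the "one instruction, several small integers" idea concrete, here is a minimal Python sketch of a packed add on a 64-bit value. It only imitates the wrap-around behaviour of MMX-style packed adds; the function name `packed_add` and its parameters are made up for illustration and are not any real API.

```python
def packed_add(a, b, lane_bits, total_bits=64):
    """Add two packed values lane by lane, wrapping around inside each lane.

    Hypothetical helper: lane_bits=8 treats the 64-bit value as 8 bytes,
    lane_bits=16 as 4 words, lane_bits=32 as 2 double-words.
    """
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        # Extract the matching lane from each operand and add them.
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        # Drop any carry so it cannot spill into the neighbouring lane.
        result |= (lane_sum & mask) << shift
    return result
```

With byte lanes, `packed_add(0x01FF, 0x0101, 8)` wraps the low byte and leaves the next one independent; with word lanes the same inputs behave like an ordinary 16-bit add.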

Well, this was a hit. But floating-point math remained as it had always been, awkward and slow. AMD actually had the bright idea of working with two 32-bit floating point numbers at once in the same way, in the same registers. This was called "3Dnow!". But it never really took off.

So in the Pentium III, Intel created eight SSE registers. Each held 128 bits (4 single-precision numbers or 2 double-precision numbers), and could operate on them as sets. You can also work with as little as a single FP number at a time here. This is the new standard way to do floating point math. (Except for some 80-bit FPU math hijinks.)

Well, SSE2 added integers to the mix, and SSE3, 4.1, and 4.2 have added new instructions. With 64-bit processors they even added 8 more SSE registers. But they've stayed 128 bits, until Sandy Bridge.

Sandy Bridge and Bulldozer have 256-bit AVX registers, 128 bits of which can still be used as SSE registers. Applications made specifically to use AVX (either through compiler optimizations that generally don't exist yet or through manual assembly coding) can work with 4 double-precision or 8 single-precision FP numbers at a time. But "at a time" can be misleading - I believe Sandy Bridge simply works with half of the AVX registers in each clock cycle. So there are fewer instructions to read if that's a bottleneck, but otherwise speed stays the same for now.

Bulldozer has an interesting trick, sharing 1 256-bit FPU between two integer processors. In applications that use regular SSE registers, half the FPU is allocated to each processor, and they can both do work at the same time. If one (and only one) of a pair of processors wants to do an AVX instruction, that instruction takes only one clock cycle, apparently working twice as fast. But if, as in DC, all the processors want to do AVX work at the same time, they have to trade off, and the speed again becomes 2 clock cycles per instruction, just like Intel.
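The trade-off described above can be put into a toy model. This is only a sketch of the post's description (1 cycle per AVX instruction when one core of a pair uses the FPU, 2 cycles when both do), not a measurement or a cycle-accurate simulation; the function names are invented for illustration.

```python
from typing import List


def per_core_avx_rate(avx_cores_in_module: int) -> float:
    """AVX instructions per cycle for each core in a module (toy model).

    Assumes the shared 256-bit FPU completes one AVX instruction per cycle
    in total, split evenly between the cores that want it.
    """
    if avx_cores_in_module not in (1, 2):
        raise ValueError("a module has one or two cores issuing AVX")
    return 1.0 / avx_cores_in_module


def chip_avx_rates(avx_cores_per_module: List[int]) -> List[float]:
    """Per-WU AVX rates for a chip, given AVX-issuing cores in each module."""
    rates = []
    for n in avx_cores_per_module:
        if n == 0:
            continue  # module runs integer-only work, no AVX WUs here
        rates.extend([per_core_avx_rate(n)] * n)
    return rates
```

In this model `chip_avx_rates([1, 1, 1, 1])` gives four WUs at full rate and `chip_avx_rates([2, 2, 2, 2])` gives eight WUs at half rate, so the chip-wide AVX throughput is the same either way; only the per-WU speed changes.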

In case you're wondering, yes, you could run 4 AVX-heavy WUs and 4 integer-heavy WUs on an 8-core Bulldozer at the same time. If they were on the proper cores, the AVX work would be up to twice as fast. But current OSes don't know how to allocate work to the proper cores. Reportedly Windows 8 will know how, but it would seem to be a very tough thing to do.
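If the OS won't do it, placement could in principle be done by hand with CPU affinity. The sketch below is hypothetical: it ASSUMES logical CPUs 2m and 2m+1 share module m (check your own system's topology before relying on that), and the helper names are made up. `os.sched_setaffinity` is a real Python call, but it is only available on Linux.

```python
import os


def plan_placement(n_modules=4):
    """Return (avx_cpus, integer_cpus) with one CPU of each kind per module.

    ASSUMPTION: logical CPUs 2*m and 2*m + 1 belong to module m.
    """
    avx_cpus = [2 * m for m in range(n_modules)]
    integer_cpus = [2 * m + 1 for m in range(n_modules)]
    return avx_cpus, integer_cpus


def pin_process(pid, cpu):
    """Pin a process to one logical CPU; returns False if unsupported."""
    if hasattr(os, "sched_setaffinity"):  # Linux only
        os.sched_setaffinity(pid, {cpu})
        return True
    return False
```

Each AVX-heavy WU would be pinned to a CPU from the first list and each integer-heavy WU to a CPU from the second, so no module ever has two WUs fighting over the FPU.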
 

somethingsketchy

Golden Member
Nov 25, 2008
1,019
0
71
But current OSes don't know how to allocate work to the proper cores. Reportedly Windows 8 will know how, but it would seem to be a very tough thing to do.

This is quite a huge deal breaker for me with Bulldozer. When I read that Windows 7 was optimized to deal with this scheduling for Bulldozer, I was wondering why AMD even bothered. They should have waited until Piledriver and then worked on the poor single-thread performance, or even the extra memory latency they've introduced. I figured with Phenom II, AMD would have been on track to match Intel, but that's not the case.
 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
I brought this thread-scheduling issue up with John Fruehe 6 months ago. At the time, I thought that he gave a good answer, but apparently 6 months later they're still "working on it". Maybe they have top men working on it like in Raiders of the Lost Ark, but it seems like an issue with no easy answers.
 

biodoc

Diamond Member
Dec 29, 2005
6,350
2,243
136
I'll wait until I see performance data with Linux as the OS on various DC projects, including F@H, before I pass judgement. However, I'm not optimistic.
 

ZipSpeed

Golden Member
Aug 13, 2007
1,302
170
106
I'm looking to add another cruncher into my arsenal and was dreaming big that Dulldozer would be a great fit. My eyes are now set on a 2600k. Hopefully AMD can fix the IPC and power consumption with Piledriver but I'm not holding my breath.
 

brownstone

Golden Member
Oct 18, 2008
1,340
33
91
Regardless of whether or not it was what everyone expected, I'd still like to see some F@H benchmarks. There seem to be quite a few similar threads floating around in different forums, but I can't seem to make it past all the "I'm butthurt" comments to get to any actual data. So if anyone knows where I can get F@H info without the useless/frivolous/needless/annoying/fanboy,anti-fanboy/obnoxious/disappointed/hurt/angry/offensive/ridiculous/hypothetical/I-told-you-so/argumentative/unoptimistic/flaming commentary, please, do tell.
 

Smartazz

Diamond Member
Dec 29, 2005
6,128
0
76
Another problem with Bulldozer is that the 2600K can usually outperform it using a lot less power. I'm not sure how the 4 extra HT cores help in DC vs the 8 integer cores on Bulldozer.
 

nanaki333

Diamond Member
Sep 14, 2002
3,772
13
81
i should have some tomorrow. i have an 8150 i setup last friday to be a folder. going to fire it up tomorrow to see how it fares.
 

Mr. Pedantic

Diamond Member
Feb 14, 2010
5,027
0
76

brownstone

Golden Member
Oct 18, 2008
1,340
33
91
According to the first comment for this review (from the reviewer) it gets about 14k ppd at stock on SMP (and also for -bigadv which is strange if true).

http://www.overclockers.com/amd-fx-8150-bulldozer-processor-review

Thanks theAnimal for the link. What is the state of bigadv these days? Have the points been nerfed on them, are they as prevalent as they used to be?

i should have some tomorrow. i have an 8150 i setup last friday to be a folder. going to fire it up tomorrow to see how it fares.

I look forward to the results nanaki333.
 

theAnimal

Diamond Member
Mar 18, 2003
3,828
23
76
Thanks theAnimal for the link. What is the state of bigadv these days? Have the points been nerfed on them, are they as prevalent as they used to be?

The bigadv WU base points were dropped a while ago; they used to give 50% bonus over SMP but that is now 20%. If you have at least 12 threads you can run the new bigger bigadv which give very nice PPD.
 

brownstone

Golden Member
Oct 18, 2008
1,340
33
91
I'm getting 12913.58 in fah currently without -bigadv

Very interesting. I guess I may have to stick with the ol' 1090T for now then. On a side note, you are about to blow past me on the F@H stats...2 days, my drop in rank is imminent. Congrats!

The bigadv WU base points were dropped a while ago; they used to give 50% bonus over SMP but that is now 20%. If you have at least 12 threads you can run the new bigger bigadv which give very nice PPD.

lol, now with more biggerness! Now I just need to figure out how to get 12 threads on the cheap.
 

Smartazz

Diamond Member
Dec 29, 2005
6,128
0
76
Very interesting. I guess I may have to stick with the ol' 1090T for now then. On a side note, you are about to blow past me on the F@H stats...2 days, my drop in rank is imminent. Congrats!



lol, now with more biggerness! Now I just need to figure out how to get 12 threads on the cheap.

I wonder how much the Gulftown 6-core processors will go for used once SB-E is released.
 

nanaki333

Diamond Member
Sep 14, 2002
3,772
13
81
well my bulldozer was hanging on the same WU at 0% for over 9 hours when i checked a few minutes ago. now? it locked up.

i power cycled it and started it back up on a completely new WU, now it's pulling 16365.33 PPD.
 

nanaki333

Diamond Member
Sep 14, 2002
3,772
13
81
wow.. ok.. this thing has serious problems. it just errored and restarted on a new project.

[03:59:16] CoreStatus = C0000029 (-1073741783)
[03:59:16] Client-core communications error: ERROR 0xc0000029

CPU-Z shows max TDP of 223W. that's almost as high as an i5 2400 + gtx 465 (unlocked to 470) running 24/7.
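For anyone wondering why the log shows both C0000029 and -1073741783: they are the same 32-bit value, printed once as unsigned hex and once as a signed integer. A minimal sketch of the conversion (the helper name `to_signed32` is made up for illustration):

```python
def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the signed reading is the value minus 2**32.
    return value - (1 << 32) if value & 0x80000000 else value


status = int("C0000029", 16)   # the CoreStatus hex value from the log
signed_status = to_signed32(status)
```

Here `signed_status` matches the number in parentheses in the log line.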
 

nanaki333

Diamond Member
Sep 14, 2002
3,772
13
81
it is 100% stock. i had to put a phenom x4 840 in to flash the bios (brand new mobo) to get the FX to POST.

just errored again.

[04:16:22] Completed 5000 out of 500000 steps (1%)
[04:20:06] CoreStatus = C0000029 (-1073741783)
[04:20:06] Client-core communications error: ERROR 0xc0000029
[04:20:06] Deleting current work unit & continuing...