Steamroller on AM3+

lifeblood

Senior member
Oct 17, 2001
999
88
91
I’ve read in a few places that AMD will still use the AM3+ socket for one more CPU after Vishera (probably Steamroller).
http://hexus.net/tech/news/mainboard/45889-amd-looks-standardise-sockets-am3-fm2/
http://www.theinquirer.net/inquirer/news/2208525/amd-sticks-with-socket-am3-for-steamroller
To me that is bad news for two reasons:

1. Soon after BD was released (and shown to suck), AMD stated PD would also use AM3+. I felt they did it to try to 'make up' to those of us who bought AM3+ boards in prep for BD. Now that PD is around the corner they do the same again. That makes me suspicious that PD will also be a huge disappointment so they are trying to keep us hanging on for SR.

2. All this talk about Fusion and its benefits hinges on the CPU also having an iGPU. One of the reasons the BD architecture has 8 integer clusters but only 4 FPUs is the idea that the GPU can perform many functions that FPUs do. If SR doesn’t have an integrated GPU then it can’t offload FPU calculations to the GPU, which puts it at a disadvantage. The AM3+ socket does not have pins for a GPU, so I don’t see how SR can have the integrated GPU AMD keeps telling us is the future.

Maybe I’m wrong and AMD is just doing this to avoid having to design a new socket that will only last until they release all their processors on a single socket, whatever that socket is. That means item number 1 isn’t relevant, but item 2 still applies. Either way, this doesn’t fill me with confidence. If Fusion is the future then get to it and make it happen. Stop with the long, drawn-out transition.
 

Arkaign

Lifer
Oct 27, 2006
20,736
1,379
126
I hope they keep AM3+ for a while longer, that would be awesome, particularly if they make some decent gains.

I can fix the BD architecture for you:

Die-shrink + lower latency caches + double the FPUs. Done.

As long as they try this bullcrap 'module' idea with gimped/incomplete 'cores', then performance will remain very hit or miss depending on app. Fix it so Integer + FPU = 1 core, multiply by 4, 6, or 8, and bam, it will work well for all scenarios.

Intel doesn't try to sell hyperthreaded i3s as 'quad cores'. AMD shouldn't try to sell gimped cores as being more than they are.

Look at an 8350 as a quad core with an AMD form of HT, and it actually looks half decent.
 

blastingcap

Diamond Member
Sep 16, 2010
6,654
5
76
I would LOVE it if they came out with a very-low-idle-power single or dual core Steamroller CPU with ECC support so I could stick it into my spare rig and convert it into my new NAS.

If they do NOT do this, I will simply go over to Intel and AMD gets no revenue from me at all.
 

Insert_Nickname

Diamond Member
May 6, 2012
4,971
1,695
136
I hope they keep AM3+ for a while longer, that would be awesome, particularly if they make some decent gains.

I can fix the BD architecture for you:

Die-shrink + lower latency caches + double the FPUs. Done.

As long as they try this bullcrap 'module' idea with gimped/incomplete 'cores', then performance will remain very hit or miss depending on app. Fix it so Integer + FPU = 1 core, multiply by 4, 6, or 8, and bam, it will work well for all scenarios.

Intel doesn't try to sell hyperthreaded i3s as 'quad cores'. AMD shouldn't try to sell gimped cores as being more than they are.

Look at an 8350 as a quad core with an AMD form of HT, and it actually looks half decent.

What I do not understand is why they chose not to expose the second "core" of each module as a form of HT on steroids. It would even be easy, as Windows has support for HT built in...:confused:

As I remember it (please correct me if I'm wrong), the Windows scheduler prioritizes filling the physical cores first and then loads the HT ones...

I would LOVE it if they came out with a very-low-idle-power single or dual core Steamroller CPU with ECC support so I could stick it into my spare rig and convert it into my new NAS.

Me too...:D

I could come up with a few uses for such a box...
 

nehalem256

Lifer
Apr 13, 2012
15,669
8
0
What I do not understand is why they chose not to expose the second "core" of each module as a form of HT on steroids. It would even be easy, as Windows has support for HT built in...:confused:

Because that would be admitting that their 8-"core" processor is really just a 4-core processor with super-HT.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
What I do not understand is why they chose not to expose the second "core" of each module as a form of HT on steroids. It would even be easy, as Windows has support for HT built in...:confused:

As I remember it (please correct me if I'm wrong), the Windows scheduler prioritizes filling the physical cores first and then loads the HT ones...



Me too...:D

I could come up with a few uses for such a box...

First of all, the Bulldozer architecture has nothing to do with HT. BD actually has 8 integer cores; that's why they (AMD) call it an 8-core CPU.

Secondly, Windows doesn't understand the difference between logical and physical cores.
 

eternalone

Golden Member
Sep 10, 2008
1,500
2
81
I say BS. You really believe AMD? They also said BD would work 100% on AM3. All lies, again and again. They are just trying to maintain confidence in their crappy CPUs. Just wait till it actually comes out before you invest; patience pays off.
 

Arkaign

Lifer
Oct 27, 2006
20,736
1,379
126
First of all, the Bulldozer architecture has nothing to do with HT. BD actually has 8 integer cores; that's why they (AMD) call it an 8-core CPU.

Secondly, Windows doesn't understand the difference between logical and physical cores.

The problem with AMD's definition of a 'core', when they call the 8150 etc. "8-core" CPUs, is that those chips are in fact slower than many Intel quads due to the gimping of the FPUs. Slow cache is a separate issue, but it's a problem too.

Imho if they doubled the FPU so that each core had Integer + FPU, and combined that with faster cache they'd truly have a winner.
 

Yuriman

Diamond Member
Jun 25, 2004
5,530
141
106
The problem with AMD's definition of a 'core', when they call the 8150 etc. "8-core" CPUs, is that those chips are in fact slower than many Intel quads due to the gimping of the FPUs. Slow cache is a separate issue, but it's a problem too.

Imho if they doubled the FPU so that each core had Integer + FPU, and combined that with faster cache they'd truly have a winner.

Unfortunately that was one of their major space-saving measures. AMD went with their module design because they could have 80% of the performance of 2 cores when both are loaded, with only 30%(?) more transistors than a single core would have. When only 1 core is loaded, performance doesn't suffer at all because no execution hardware is being shared.

In a way, it's like saying "Intel should duplicate all of their execution hardware for their virtual cores because hyperthreading only adds at best 15% more performance" - that wasn't really the point. AMD wanted to get the most performance out of the fewest transistors because they need a smaller die in order to turn any kind of profit. With all of its resources in use, an FX-4170 performs remarkably well considering it's a 32nm chip built with a fraction of the R&D money and is only 30%(?) larger than a true dual core would have been. It doesn't scale well to the 8150 because very few people want or need the ability to run 8 threads.
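
To put rough numbers on the trade-off above, here is a minimal Python sketch. It takes the 80% and 30%(?) figures from this post at face value and assumes throughput and transistor count can simply be compared as ratios, which is a simplification; the names `MODULE_SCALING`, `MODULE_AREA` and `perf_per_transistor` are made up for illustration.

```python
# Ballpark figures quoted in the post; treat them as assumptions, not measurements.
MODULE_SCALING = 0.80  # each core in a fully loaded module runs at ~80% of a standalone core
MODULE_AREA = 1.30     # a two-core module costs ~30% more transistors than one core


def perf_per_transistor(throughput: float, transistors: float) -> float:
    """Throughput per unit of transistor budget, both relative to one standalone core."""
    return throughput / transistors


# One module: 2 threads at 80% each, for 1.3 cores' worth of transistors.
module = perf_per_transistor(2 * MODULE_SCALING, MODULE_AREA)

# Two fully independent cores: 2 threads at 100% each, for 2 cores' worth of transistors.
two_full_cores = perf_per_transistor(2 * 1.0, 2 * 1.0)

print(f"module:         {module:.2f}")
print(f"two full cores: {two_full_cores:.2f}")
print(f"advantage:      {module / two_full_cores - 1:.0%}")
```

Under those assumptions the module delivers about 1.23 units of throughput per unit of transistor budget against 1.00 for two full cores, which is the space-saving argument in a nutshell.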
 

Arkaign

Lifer
Oct 27, 2006
20,736
1,379
126
True, I understand that, but was hoping that a die shrink and new process would let them re-add the missing FPUs without busting the bank in physical space.
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,691
136
It is a CPU that has 8 integer cores and 4 FP units. From a pure integer code POV it is an 8-core chip; from an FP-heavy workload's POV it is a QC chip with SMT. That's why in SSE workloads it falls short vs Intel's i7: those 4 shared FP units are not that fast. Maybe they will fix that shortcoming with SR or Excavator. They just need to stick around that long.

As for performance, it doesn't perform THAT badly; it just falls short in some areas. That's why it is priced accordingly: in some workloads you get i7 performance (MTed) and in some i3 performance or less (games). The price of around $190 reflects that.
 

Hubb1e

Senior member
Aug 25, 2011
396
0
71
To be fair to AMD, they did upgrade the FP sections of the CPU compared to PhII, so there are half as many but each one is faster.
 

Insert_Nickname

Diamond Member
May 6, 2012
4,971
1,695
136
First of all, the Bulldozer architecture has nothing to do with HT. BD actually has 8 integer cores; that's why they (AMD) call it an 8-core CPU.

Secondly, Windows doesn't understand the difference between logical and physical cores.

You misunderstand. I was pointing out that it would be easier to treat a "module" as a Hyper-Threaded Intel core from an OS perspective. That way the OS does not have to keep track of which "half" cores are in use and does not have to do complex scheduling. The internal module scheduler could then do the heavy lifting of assigning execution resources (INT/FPU) as needed without direct OS intervention...

Of course from within Windows it would look like an i7 with HT...

I hope that makes sense...
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
I wouldn't get my hopes up for Steamroller on AM3+. FM2 and later FM3 will rule them all in the desktop segment. And with the continual layoffs, you're going to see AMD trying to simplify everything as much as possible.
 

Insert_Nickname

Diamond Member
May 6, 2012
4,971
1,695
136
I wouldn't get my hopes up for Steamroller on AM3+. FM2 and later FM3 will rule them all in the desktop segment. And with the continual layoffs, you're going to see AMD trying to simplify everything as much as possible.

Considering they are already offering GPU-less Athlon II X4s for FM2, I expect we are going to see some platform integration with FM2/3 serving everything from high-end to low-end...

my 2c...:whiste:
 

nehalem256

Lifer
Apr 13, 2012
15,669
8
0
Considering they are already offering GPU-less Athlon II X4s for FM2, I expect we are going to see some platform integration with FM2/3 serving everything from high-end to low-end...

my 2c...:whiste:

There is really no reason to split AM3+ and FM2.

People who want the highest end processor are also probably willing to shell out for a discrete video card (preferably from AMD :D)

The only market you lose is people who want the Octocore SR, but also integrated graphics.
 

pantsaregood

Senior member
Feb 13, 2011
993
37
91
Adding an extra FPU to each module wouldn't help much. Bulldozer is slow because the cores are designed poorly, not because the module-based architecture is bad.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,809
1,289
136
Bulldozer is slow because the cores are designed poorly, not because the module-based architecture is bad.
It isn't the cores or the module, bud. :rolleyes:

amd-32nm-processor-03.jpg

Note the year.... ( http://images.dailytech.com/nimage/5493_large_falcon.jpg <-- so you know what that really is...)
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
It isn't the cores or the module, bud. :rolleyes:

amd-32nm-processor-03.jpg

Note the year.... ( http://images.dailytech.com/nimage/5493_large_falcon.jpg <-- so you know what that really is...)

65nm 2007
45nm 2008
32nm 2009

:hmm:

1 year node cadence? I don't think so.

If AMD was counting on that happening then that was the source of their roadmap falling apart. You kinda have to base your roadmaps on stuff that is actually mildly plausible and not just full of leprechauns with pots of gold at the end of a rainbow.
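
The cadence implied by that slide can be checked with a few lines of Python. This is only a sketch using the three node/year pairs listed above; the `roadmap` list and the `cadence` helper are illustrative names, not anything from AMD.

```python
# Node/year pairs exactly as listed from the roadmap slide above.
roadmap = [(65, 2007), (45, 2008), (32, 2009)]


def cadence(nodes):
    """Return (from_nm, to_nm, years_between, linear_shrink) for each node step."""
    steps = []
    for (nm_a, year_a), (nm_b, year_b) in zip(nodes, nodes[1:]):
        steps.append((nm_a, nm_b, year_b - year_a, nm_b / nm_a))
    return steps


for nm_a, nm_b, years, shrink in cadence(roadmap):
    print(f"{nm_a}nm -> {nm_b}nm: {years} year(s), linear shrink x{shrink:.2f}")
```

Both steps come out as a roughly 0.7x linear shrink with only one year between them, which is the cadence being questioned here.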
 

smangular

Senior member
Nov 11, 2010
347
0
0
Unfortunately that was one of their major space-saving measures. AMD went with their module design because they could have 80% of the performance of 2 cores when both are loaded, with only 30%(?) more transistors than a single core would have. When only 1 core is loaded, performance doesn't suffer at all because no execution hardware is being shared.

Very cool point, and it's impressive that the module design with the shared FP unit saved that much...
 

Ferzerp

Diamond Member
Oct 12, 1999
6,438
107
106
What I do not understand is why they chose not to expose the second "core" of each module as a form of HT on steroids. It would even be easy, as Windows has support for HT built in...

That's how a patched Win7 system (and Win8 natively) schedules for it now. It's much, much closer to what the architecture really is, but it didn't help much. Originally, MS listened to AMD marketing and treated all the cores the same, but it works slightly better when scheduled just like HT.
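
A toy Python sketch of the difference between the two placement policies, assuming a 4-module, 2-cores-per-module chip like the 8150/8350. This is not how the Windows scheduler is actually implemented; `place_threads` and `shared_modules` are hypothetical helpers that only count how many modules end up with both cores busy.

```python
def place_threads(n_threads, n_modules=4, cores_per_module=2, spread=True):
    """Assign threads to (module, core) slots in a toy Bulldozer-like layout.

    spread=True  : HT-style, use one core in every module before doubling up.
    spread=False : treat all cores as equal and fill them in index order.
    """
    slots = [(m, c) for m in range(n_modules) for c in range(cores_per_module)]
    if spread:
        # Sort by core index first so core 0 of every module comes before any core 1.
        slots.sort(key=lambda slot: (slot[1], slot[0]))
    return slots[:n_threads]


def shared_modules(placement):
    """Count modules that have more than one thread placed on them."""
    counts = {}
    for module, _core in placement:
        counts[module] = counts.get(module, 0) + 1
    return sum(1 for n in counts.values() if n > 1)


for spread in (False, True):
    placement = place_threads(4, spread=spread)
    print(f"spread={spread}: {placement} -> {shared_modules(placement)} shared module(s)")
```

With four threads, filling cores in index order leaves two modules with both cores busy, while the HT-style policy gives every thread a module to itself. Once all eight cores are loaded the two policies are identical, which fits the observation that the patch only helps a little.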
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Man, do you have some slides about the early days of Bulldozer? I remember that it was going to be a more traditional core, but then changed later.

Actually early bulldozer was going to be the opposite of the traditional - aka "fat" - core design.

The original incarnation of what was to come after the Stars core microarchitecture was going to be slim cores and lots of them - it really was planned to be Niagara on x86 (i.e. Moar cores with piss poor IPC).

When the quad-core Kentsfields came out in late 2006, that roadmap was torn up because AMD realized they couldn't possibly go to market and compete with the equivalent of a 12-core Bobcat versus a quad-core Nehalem; single-threaded IPC had to be competitive.

That is when the fattened-up version of Bulldozer was envisioned, and it is what we have now. A fatter-core version was never in the plans as far as I am aware.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,809
1,289
136
If AMD was counting on that happening then that was the source of their roadmap falling apart. You kinda have to base your roadmaps on stuff that is actually mildly plausible and not just full of leprechauns with pots of gold at the end of a rainbow.
AMD's foundries had taped out Falcon (32-nm) and Sandtiger (45-nm) before they were sold, so basically the fall of AMD started with Hector Ruiz.

AMD's engineering side finished making Sandtiger and Falcon and started to show them in closed circles. AMD's executive/financial side then sold the foundries, making the two chips pointless, as ATIC/GlobalFoundries delayed 32-nm for 2 years to make GlobalFoundries profitable. AMD then gave GlobalFoundries $4B to $5B to have 22-nm FinFETs or FD-SOI by 2012, just to get back to where they started. Combined with concurrent bad planning, this leads to confusion about what is actually coming out:

Trinity Rev. B(Steamroller) by Q2 2013 "Trinity 2.0" <-- 32-nm
Orochi Rev. E(Steamroller) by Q2 2013 "Vishera 2.0" <-- 32-nm
Kaveri(Steamroller?) by Q2 2014 <-- 28-nm
Viperfish(Steamroller?) by Q2 2014 <-- 28-nm
Man, do you have some slides about the early days of Bulldozer? I remember that it was going to be a more traditional core, but then changed later.
10.5h and 15h are easily confused. Bulldozer has largely always been Cluster Multithreading or pseudo-SIMT.
 