[Deutsche Bank Conference] AMD's New x86 Core is Zen, Will Launch With K12


AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
It's a great slide, really, but unfortunately it is one to which AMD's shareholders respond "Awesome! So when will you actually make some money on this whole 'the future is fusion' phenomenon? Cause all we see is that Intel is making all the profits along the way..."

Yeah, that's the main problem so far: AMD hasn't managed to leverage its iGPU advantage into making more money until now.
But that is starting to change. Intel cannot devote less die area to the iGPU anymore; we can see from Broadwell that the iGPU portion of the die already occupies more area than the CPU cores, and this trend will only continue.
So now Intel has to play on AMD's home court. Intel will only keep the power-consumption lead until AMD starts manufacturing its APUs on a FinFET process, or unless Intel stays a full node ahead of AMD. If AMD manages to use a comparable node less than six months later, then any advantage Intel has had so far will be negated.

[Image: Intel Core M (Broadwell) die slide]
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
Just that an SB 1C/2T core is as big as a whole module, so the comparison is 1C/2T against 1M/2T. In ST, SB has the advantage, but in MT that's not the case. The 2600K as well as the 2500K used to perform very well against an 8350, but that was almost two years ago; current benches show the 2500K almost outdated in MT and the 2600K globally outmatched by the 8350 in MT. Kaveri can only be compared to i3s, and in that bracket it does quite well overall.

You are mixing timeframes with architectures.
An SB HT core is similar in size (not going into specifics) to a Bulldozer module. Roughly speaking, Intel's HT core has been equal in MT performance to the equivalent AMD module (SB vs. BD, IVB vs. PD, HW vs. SR). However, for the 4M/8T chips AMD's caching system requires a ton of die space, making the area devoted to CPU performance much larger (16 MB vs. 9 MB of cache).

Kaveri compares well to Haswell i3s, generally performing very similarly clock for clock.

Yes please,

Compare 2M 4T Kaveri to 2C 4T GT3 Haswell.

Isolating CPU area.

Kaveri is 245 mm^2 × 53% non-GPU = 130 mm^2.
Haswell die sizes are difficult to find. GT3 ULT is 181 mm^2. Haswell Y is 131 mm^2. It looks like Haswell 2C4T (desktop i3 and mobile 37W i3/i5/i7) is 130 mm^2.

Going again by the previous slide, Haswell 4C8T is 31% GPU, making the GT2 GPU approximately 55 mm^2. Thus it is likely that around 75 mm^2 of die is devoted to non-GPU processing on the 2C die.

I'm not sure how the density of 28nm density-optimized and 22nm FinFET compare, but it does appear that Intel is getting more CPU perf/mm^2.
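
As a sanity check, the arithmetic above works out like this (a minimal Python sketch; the die sizes and percentages are the rough figures quoted in this thread, not official numbers, and the ~177 mm^2 Haswell 4C GT2 die size is an assumption):

```python
# Rough die-area arithmetic using the approximate figures quoted above.

kaveri_die_mm2 = 245          # Kaveri total die area (approx.)
kaveri_non_gpu_frac = 0.53    # ~53% of the die is non-GPU (CPU, uncore, I/O)
kaveri_cpu_mm2 = kaveri_die_mm2 * kaveri_non_gpu_frac
print(f"Kaveri non-GPU area:     ~{kaveri_cpu_mm2:.0f} mm^2")       # ~130 mm^2

haswell_4c_die_mm2 = 177      # Haswell 4C GT2 desktop die (assumed, approx.)
haswell_gpu_frac = 0.31       # ~31% of that die is the GT2 iGPU
haswell_gt2_mm2 = haswell_4c_die_mm2 * haswell_gpu_frac
print(f"Haswell GT2 iGPU area:   ~{haswell_gt2_mm2:.0f} mm^2")      # ~55 mm^2

haswell_2c_die_mm2 = 130      # Haswell 2C4T GT2 die (approx.)
haswell_2c_cpu_mm2 = haswell_2c_die_mm2 - haswell_gt2_mm2
print(f"Haswell 2C non-GPU area: ~{haswell_2c_cpu_mm2:.0f} mm^2")   # ~75 mm^2
```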
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
Isolating CPU area.

Kaveri is 245 mm^2 × 53% non-GPU = 130 mm^2.
Haswell die sizes are difficult to find. GT3 ULT is 181 mm^2. Haswell Y is 131 mm^2. It looks like Haswell 2C4T (desktop i3 and mobile 37W i3/i5/i7) is 130 mm^2.

Going again by the previous slide, Haswell 4C8T is 31% GPU, making the GT2 GPU approximately 55 mm^2. Thus it is likely that around 75 mm^2 of die is devoted to non-GPU processing on the 2C die.

I'm not sure how the density of 28nm density-optimized and 22nm FinFET compare, but it does appear that Intel is getting more CPU perf/mm^2.

Obviously, if you compare 22nm CPU cores to 28nm CPU cores (or a module) you will come to the conclusion that Intel's 22nm cores have higher perf/mm^2 than a 28nm AMD module, simply due to the higher density of the 22nm process.
But I believe that's not the context of the conversation. On the same process (32nm), a Bulldozer module and a Sandy Bridge core are very close in size. A Steamroller module is not larger than BD/PD, and thus the percentage of CPU die area remains the same as before.

The reason I said to compare GT3 Haswell against Kaveri is that both devote a large share of the die to the iGPU and both are 2C/4T CPUs. You will find that even Kaveri, on a less advanced 28nm process, can directly compete in CPU performance and iGPU performance, and the two die sizes are very close.
The only real advantage Intel has is lower power consumption due to its 22nm FinFET process. That is what AMD needs to address in the near future if they want to compete directly against Intel in laptops and mobile in general. They really need a good low-power, high-performance FinFET process yesterday.
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
Obviously, if you compare 22nm CPU cores to 28nm CPU cores (or a module) you will come to the conclusion that Intel's 22nm cores have higher perf/mm^2 than a 28nm AMD module, simply due to the higher density of the 22nm process.
But I believe that's not the context of the conversation. On the same process (32nm), a Bulldozer module and a Sandy Bridge core are very close in size. A Steamroller module is not larger than BD/PD, and thus the percentage of CPU die area remains the same as before.

The reason I said to compare GT3 Haswell against Kaveri is that both devote a large share of the die to the iGPU and both are 2C/4T CPUs. You will find that even Kaveri, on a less advanced 28nm process, can directly compete in CPU performance and iGPU performance, and the two die sizes are very close.
The only real advantage Intel has is lower power consumption due to its 22nm FinFET process. That is what AMD needs to address in the near future if they want to compete directly against Intel in laptops and mobile in general. They really need a good low-power, high-performance FinFET process yesterday.

I thought Intel's 22nm was pretty much identical to density-optimized 28nm in terms of transistors/mm^2.

http://www.anandtech.com/show/7677/amd-kaveri-review-a8-7600-a10-7850k/4

Kaveri is 2.41B transistors, or 9.8M/mm^2. Haswell 4C is 7.9M/mm^2. When you take into account GPU density and cache they seem pretty similar. GPU-less Haswell-EP ranges from 7.3 to 8.4M per mm^2.

Again, someone needs to chime in. Even on Intel's foundry graphs they show their 22nm equal to TSMC's 28nm in density.
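
For reference, those density numbers come straight from dividing the rough transistor counts by the die sizes (a back-of-the-envelope Python sketch; the counts and die sizes are approximate public figures, not exact):

```python
# Back-of-the-envelope transistor density from the figures cited above.

kaveri_transistors = 2.41e9       # ~2.41 billion (Kaveri)
kaveri_die_mm2 = 245
print(f"Kaveri:     {kaveri_transistors / kaveri_die_mm2 / 1e6:.1f} M transistors/mm^2")   # ~9.8

haswell_transistors = 1.4e9       # ~1.4 billion (Haswell 4C GT2, approx.)
haswell_die_mm2 = 177
print(f"Haswell 4C: {haswell_transistors / haswell_die_mm2 / 1e6:.1f} M transistors/mm^2")  # ~7.9
```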
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
I thought Intel's 22nm was pretty much identical to density-optimized 28nm in terms of transistors/mm^2.

http://www.anandtech.com/show/7677/amd-kaveri-review-a8-7600-a10-7850k/4

Kaveri is 2.41B transistors, or 9.8M/mm^2. Haswell 4C is 7.9M/mm^2. When you take into account GPU density and cache they seem pretty similar. GPU-less Haswell-EP ranges from 7.3 to 8.4M per mm^2.

Again, someone needs to chime in. Even on Intel's foundry graphs they show their 22nm equal to TSMC's 28nm in density.

Ehm, no way; GF's 28nm doesn't have the same density as Intel's 22nm FinFET. It is just that 53% of Kaveri's die is a very dense iGPU, which pushes the average transistor density higher.

This is a nice slide showing all the Haswell dies together (courtesy of Hiroshige Goto). You can see that the 2+3 desktop die is close to 190 mm^2 due to the larger L3 cache over the ULV die.

[Image: Haswell die comparison slide]
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
Ehm, no way; GF's 28nm doesn't have the same density as Intel's 22nm FinFET. It is just that 53% of Kaveri's die is a very dense iGPU, which pushes the average transistor density higher.

This is a nice slide showing all the Haswell dies together (courtesy of Hiroshige Goto). You can see that the 2+3 desktop die is close to 190 mm^2 due to the larger L3 cache over the ULV die.

[Image: Haswell die comparison slide]

You are right. I looked it up, and it looks like the SRAM cell size on Intel's 22nm is 0.092 um^2 while 28nm GloFo SHP is 0.120 um^2, so about 30% denser.
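
Working that out, and as a purely illustrative normalization (scaling logic area by the SRAM-cell ratio is only a crude proxy, and the ~130 mm^2 Kaveri figure is the rough non-GPU area from earlier in the thread):

```python
# SRAM cell sizes cited above; a smaller cell area means a denser process.
intel_22nm_sram_um2 = 0.092    # Intel 22nm FinFET
glofo_28nm_sram_um2 = 0.120    # GloFo 28nm SHP

ratio = glofo_28nm_sram_um2 / intel_22nm_sram_um2
print(f"Intel 22nm cell is ~{(ratio - 1) * 100:.0f}% denser")             # ~30%

# Hypothetical: what Kaveri's ~130 mm^2 of non-GPU area would shrink to
# at 22nm-like density (crude proxy, illustration only).
kaveri_cpu_mm2 = 130
print(f"~{kaveri_cpu_mm2 / ratio:.0f} mm^2 at 22nm-equivalent density")   # ~100 mm^2
```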

I don't know why you keep dragging the iGPU into this. I'm just talking about CPU perf/mm^2.

That AMD can get similar CPU perf/mm^2 isn't terribly surprising. Perf/mm^2 almost always favors smaller, weaker cores (compare the A15 to the A7 and perf/mm^2 is solidly in the A7's court). Look at Jaguar versus the construction cores.
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
Kaveri = 245 mm^2 × 53% = 130 mm^2 non-GPU
Sandy Bridge = 216 mm^2 × 83% = 179 mm^2 non-GPU

Kaveri is 28nm (density-optimized) vs. 32nm, so both have roughly the same amount of normalized die area for the CPU.

4C8T SB destroys 2M4T Kaveri in single-thread and multithread.

Wow, seriously? Talk about Apples To Oranges.....
Who compares an 8 Thread Sandy Bridge with a 4 Thread Kaveri?
They are completely different chips for completely different audiences.

I mean -- Why don't you compare an 800 Horsepower Lamborghini Sports Car to a 400 Horsepower Chris Craft boat while you're at it.
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
Wow, seriously? Talk about Apples To Oranges.....
Who compares an 8 Thread Sandy Bridge with a 4 Thread Kaveri?
They are completely different chips for completely different audiences.

I mean -- Why don't you compare an 800 Horsepower Lamborghini Sports Car to a 400 Horsepower Chris Craft boat while you're at it.

All you are looking at is perf/mm^2. The FX is difficult because of its poor cache implementation; it's going to be an apples-to-oranges comparison no matter what. As for who compares the chips, AMD does, as they did in their Kaveri marketing slides against the 4670K (4 threads, but the same physical silicon).
Kaveri is the most recent architecture.

You can use SB GT1 (130 mm^2), likely ~100 mm^2 of non-GPU logic. It will be less dense (SRAM cell size of 0.17-0.18 um^2).
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
Again, compare SB Xeons with Piledriver Opterons and you can easily see how bad AMD's uarch is.

What is the worth, again, of your 7 selected benchmarks in Linux? How big a crowd do they apply to? See the issue?

How is Kaveri holding up with the SR uarch? Pretty terribly, that's how. Unless you use 300 mm^2 of multiple modules to run some highly scaling INT MT load, it's just terribly slow and inefficient compared to the competition.

And again, even AMD stated the uarch is a total failure. It can't be much clearer than that.

You are completely biased and there is nothing I could possibly say to change that fact.

The reality is far different. I run chips from both manufacturers -- and for video editing, scientific computing, video encoding, games optimized for Mantle, OpenCL apps... an AMD chip can match or even surpass its closest Intel rival. My FX-8320 averages the same daily scores as my i7-3770K on World Community Grid, yet the AMD retailed for half the price. That same FX encodes video faster than my i7 under Sony Vegas as well.

The only "total failure" is your closed mindset.
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
All you are looking at is perf/mm^2. The FX is difficult because of its poor cache implementation. Its going to be an apples to oranges comparison no matter what. As for who compares the chips, AMD does as they did in their kaveri marketing slides against the 4670k (4 threads but same physical silicon).
Kaveri is the most recent architecture.

You can use SB GT1 (130 mm^2), likely ~100 mm^2 non GPU logic. It will be less dense (cell SRAM density of 0.17-0.18 um^2).

It's still apples to oranges -- they are completely different architectures. Kaveri's design goal was maximum GPU performance, while Intel was focused on maximum CPU performance for Sandy Bridge, just as Centaur/VIA dials their designs in for the lowest possible power consumption. You can't fault any of them for having vastly different goals.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
The reality is far different. I run chips from both manufacturers -- and for video editing, scientific computing, video encoding, games optimized for Mantle, OpenCL apps... an AMD chip can match or even surpass its closest Intel rival. My FX-8320 averages the same daily scores as my i7-3770K on World Community Grid, yet the AMD retailed for half the price. That same FX encodes video faster than my i7 under Sony Vegas as well.

Those are the best-case scenarios for the FX. AMD has to spend more on wafer allocation and sell something that consumes more power just to achieve parity with Intel's mainstream chip. In other workloads it is badly beaten, but I think you and Shintai are analyzing the question from very different POVs.

Is the FX a bad deal for consumers? It depends. For you, since you run the kind of applications at which the FX excels, it is not, especially at half the price. A gamer might not find it such a sweet deal, because they would be leaning on the weak spots of the FX processor, and even at half the price it might not make sense. But give the FX away for $100 and it might make sense again. I'm not really an FX fan because I don't overclock anymore, so I tend to focus more on efficiency than on raw performance, but give me an FX for 30 bucks and I might consider it. There are no bad products, only bad prices; that's the consumer perspective.

But was the FX a bad deal for AMD? Let's leave this question to Andrew Feldman, who was a Senior VP at the time of the comment:

Andrew Feldman said:
http://www.pcworld.com/article/2040...o-arm-with-new-lowpower-x86-server-chips.html

“Bulldozer was without doubt an unmitigated failure. We know it"

“It cost the CEO his job, it cost most of the management team its job, it cost the vice president of engineering his job. You have a new team. We are crystal clear that that sort of failure is unacceptable,”

The company basically imploded because Bulldozer badly missed its projected expectations, so they had to basically sell a server die for less than 200 dollars MSRP. The reason you are praising the FX is the very reason why Bulldozer is such a failure for AMD.
 
Last edited:

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
-snip-
The company basically imploded because Bulldozer badly missed its projected expectations, so they had to basically sell a server die for less than 200 dollars MSRP. The reason you are praising the FX is the very reason why I think Bulldozer is such a failure for AMD.

FTFY.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
The reality is far different. I run chips from both manufacturers -- and for video editing, scientific computing, video encoding, games optimized for Mantle, OpenCL apps... an AMD chip can match or even surpass its closest Intel rival. My FX-8320 averages the same daily scores as my i7-3770K on World Community Grid, yet the AMD retailed for half the price. That same FX encodes video faster than my i7 under Sony Vegas as well.

For all that can be said about what AMD had going against it with Piledriver (fewer resources, less development time, fewer people, a less advanced process node, etc.), the end result does surprisingly well despite all the reasons it shouldn't.

This was my experience with it, and it did well in the real-world apps that were relevant to me (mind you, that was still with the stock HSF).

[Chart: Gaussian 98 (A7) multi-tasking and single-threaded performance scaling]


And then there were situations where I was surprised that AMD's architecture didn't shine brighter (some people say it is a compiler issue, others say it is an L3$ issue; either way, the problem for me was that the performance was lacking regardless of the excuse).

[Chart: MetaTrader 4 multi-tasking and single-threaded performance scaling]


[Chart: TMPGEnc 5 Merbabies benchmark, i7-3770K vs. FX-8350 vs. Q6600]


Despite the variable performance results, truthfully the only place where my Piledriver was a letdown was power consumption.

Whether it was the CPU or the platform (which can't be unmarried, so it doesn't really matter which is the bigger culprit at this point), the power usage was just unacceptably high versus the performance for the apps that mattered to me, so in the end I bought more IB setups and no more AMD setups.

I'm not sure what happened within AMD, but I can't help but think that had AMD put a little more effort into leveraging their internal SoC capabilities at the time and pulled even more of the platform chipset into the chip, so that the Piledriver platform also got a nice 32nm refresh, then the entire power issue would have been nicely dealt with (higher power in the socket but much less power overall for the platform). Perhaps going with a less derped L3$ and using the die savings to significantly update the chipset?
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91

I don't think anyone in their right mind would attempt to make the counter-argument that Bulldozer was a success for AMD.

And if you can't successfully make the case that it was an economic success for AMD, then in business terms it would most definitely be categorized as a failure.

Even the P4, as bad as it was in light of the K7 and K8 at the time, was still not a failure as it enabled Intel to continue making profits. AMD could have only dreamed of being in that position with Bulldozer as Bulldozer led the company into a pit of red ink.
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
It's still apples to oranges -- they are completely different architectures. Kaveri's design goal was maximum GPU performance, while Intel was focused on maximum CPU performance for Sandy Bridge, just as Centaur/VIA dials their designs in for the lowest possible power consumption. You can't fault any of them for having vastly different goals.

It will always be apples to oranges but you do the best you can.

Kaveri is a mixed bag. AMD needed to fix their CPU, which Kaveri did. Kaveri was more constrained by budget; if they had truly gone after better GPU performance they would have added some cache or a better memory controller (Kaveri's is better than Trinity's but not as good as the 8350's).

Both of your statements are somewhat incorrect. With SB, Intel was going after mobile, which it achieved with massive gains in performance while adding a mediocre but usable iGPU, and after servers with a focus on efficiency. AMD's Kaveri GPU IP was already fleshed out, and likely the lion's share of R&D went toward the CPU, with a somewhat "pasted on" iGPU. Kaveri is barely faster than the 6800K and horribly constrained by bandwidth.
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
I don't think anyone in their right mind would attempt to make the counter-argument that Bulldozer was a success for AMD.

And if you can't successfully make the case that it was an economic success for AMD, then in business terms it would most definitely be categorized as a failure.

Even the P4, as bad as it was in light of the K7 and K8 at the time, was still not a failure as it enabled Intel to continue making profits. AMD could have only dreamed of being in that position with Bulldozer as Bulldozer led the company into a pit of red ink.

I don't know the nuances of the company's financials, but to say that Bulldozer (or its successors) was a failure is, to me, to say that it doesn't work properly and/or never made a profit. I mean, it wasn't as good as its competitors, but neither were ARM CPUs and the like. Just because they aren't as fast as Intel's best doesn't, I think, make them failures.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
I don't know the nuances of the company's financials, but to say that Bulldozer (or its successors) was a failure is, to me, to say that it doesn't work properly and/or never made a profit. I mean, it wasn't as good as its competitors, but neither were ARM CPUs and the like. Just because they aren't as fast as Intel's best doesn't, I think, make them failures.

It was a failure in the sense that the company basically imploded because of it, not in the sense that it does not work. It was a financial disaster, plain and simple.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
First of all, we are talking about applications, not gaming. Games behave differently than all other applications. Haswell Core i3s are only faster than Kaveri APUs in Cinebench; they lose or are equal in the vast majority of MT workloads out there.
I'm talking about stuff that's worth choosing a faster CPU for. A few highly multithreaded programs that get <1% of wall time on a PC aren't worth that. If you spend most of your time in those applications, then fine, but that's not typical.

All applications behave differently. Some are just easy to scale out horizontally, like video encoding, and some scale only at the cost of quality, like file compression (you will not get as good results with LZMA or LZMA2, for instance, from a many-thread compression job as from fewer threads; my VM backups can be hundreds of MB smaller, for instance). Again, if that's what you do most of the time, then get whatever will work best for your budget. If you don't, though, a CPU that can do it faster isn't a better one for you. And, again, most people's "typical" is going to be limited by RAM or HDD more than anything else, with any decent CPU today, and that's better for AMD than it is for Intel. That is, Intel's CPUs being better overall, and BD being a failure for AMD, does not mean that having an AMD CPU is bad or insufficient; it's merely not the common case, because Intel serves more of the x86 markets better.
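
A toy illustration of that compression point, assuming a naively parallel compressor that works on independent chunks (the sample data and chunk size here are arbitrary; real multi-threaded LZMA2 implementations are smarter than this, but the underlying trade-off is the same):

```python
import lzma

# Redundancy that spans chunk boundaries cannot be exploited when each
# chunk is compressed as an independent stream.
data = b"backup block with lots of repeated structure " * 2000

# One LZMA stream over the whole input (single-threaded style).
whole = len(lzma.compress(data))

# Independent per-chunk streams (roughly what a many-thread job produces).
chunk_size = 4096
chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
chunked = sum(len(lzma.compress(c)) for c in chunks)

print(f"single stream : {whole} bytes")
print(f"per-chunk sum : {chunked} bytes")   # typically noticeably larger
```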

Secondly, there aren't any Steamroller products like the Kaveri APUs in those links you provided above.
Blame reviewers for that. I get annoyed at that segmentation in reviews myself. For some reason, a Core i3 with an improved 4th-gen iGPU is worth testing with a dGPU, but hey, nobody's going to buy an FM2+ CPU and add a dGPU, so why bother (just ignore the newer chipsets, cheaper mobos, and nearly the same pricing per core as the FX line)? Likewise, nobody looking at an i3 and dGPU is going to give an AMD a second thought, so why bother comparing them? And that says nothing of i5s. If nothing else, it would make it easier to compare the relative bang/buck for desktop machines. I'd like more sites to test a more representative range, rather than most of them testing a handful of models near one end of the price spectrum and skipping some models because they see them as belonging to different markets. Also, I wish video card reviews of non-high-end models would compare to a common IGP, for a good reference point.
 
Last edited:

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
I dont know the nuances of the companies financials, but to say that bulldozer[or its successor] was a failure, to me, say that it doesnt work properly and/or never made a profit. I mean it wasnt as good as its competitors but so were arm cpus and the like. just because they arent as fast as intels best, I think doesnt make them failures.
It needed to be sufficiently superior, at launch, to the Phenom II line. Instead, it was so-so, edging ahead in highly multithreaded stuff but falling flat elsewhere. It was also a huge die. It was also made to be a speed demon, which is why the single-threaded stuff faltered, just like the P4 a decade before it. So they had trouble selling it, because it was hot and slow. As they improved it, so did their competition, so the performance goalposts shifted toward being faster and lower power over time.

If AMD had made it more sane, not made to fly at 5GHz and up, and made it smaller, so it was cheaper for them to manufacture, it could have been quite good, even with its top end performing a bit lower than Intel's, because they could have demanded higher average prices and gotten more OEMs using them. The really quirky bits of BD actually worked out only slightly worse than the theoretical research that inspired them. The cache latencies, branch penalties, weird scheduling issues of the "AGLU," etc., which were almost certainly part of its ability to run at very high clock speeds, hampered it.
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
Even AMD has admitted that the Bulldozer line was a disaster, so why some posters continue to try to defend it is unclear to me. If you want someone to improve, you need to be honest about their flaws, not make excuses for everything they do.

As to why Bulldozer failed: It's not because AMD didn't have enough money (their financial situation was quite a bit better at the time). It's not because AMD's engineers and designers are incompetent. It's because the high concept behind the chip was flawed from the outset. It failed for the same reason as NetBurst: the architects incorrectly believed it was possible to achieve far higher clock speeds than they actually could by trading off IPC. It is clear now that this is a fool's game, doomed to fail from the start. High IPC results in better power efficiency and better overall performance (especially in single-threaded and lightly threaded applications). AMD also believed that using CMT would help them win in the server market, by providing the highest possible performance on multi-threaded integer workloads. That was a more defensible belief but it proved to be empirically wrong; Intel's far higher IPC combined with HyperThreading and better power efficiency caused AMD to fall behind very quickly.

The good news is that, as noted above, AMD does understand that Bulldozer didn't work. Moreover, even though the lessons they've learned have been very expensive, they have not been in vain. AMD can now apply the lessons learned from BD (just as Intel applied its experience with HyperThreading and better branch prediction on the P4 to later architectures), as well as the improvements they've incorporated into their cat cores. The memory controller is one area where AMD's x86 chips really need work, but this burden need not be borne by the x86 big-core team alone; a good memory controller design will be equally useful to the GPU division, on ARM products, and on the cat cores. No doubt the GPU team has already contributed some useful insights in this direction. And I would be surprised if AMD hasn't extensively studied Sandy Bridge and Haswell to find out what advances from those designs can be incorporated into its own.
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
The company basically imploded because Bulldozer badly missed its projected expectations, so they had to basically sell a server die for less than 200 dollars MSRP. The reason you are praising the FX is the very reason why Bulldozer is such a failure for AMD.

That really does sum it up quite well -- the fact that you can buy a server-grade CPU that can feed 8 threads for a street price around $125 (FX-8320) is an insanely great value to a consumer.

This definitely cuts the other way, as you point out. That price point probably wrecks AMD's balance sheet -- they probably lost at least $40-50 on every entry-level 8-core they sold. But regarding Bulldozer -- I think the bottom line is that Cliff Maier is no Jim Keller. Maier tried hard, but Bulldozer was his first big-core design (and probably his last). Keller is just a lot more experienced, and was the reason AMD leapfrogged Intel in performance (K7/K8).

I doubt anyone will argue that Bulldozer was a financial success... everyone knows it wasn't. But its failure was a result of the previous management -- the current management has at least addressed some of the FX shortcomings with the E and Vishera updates. I have a lot more confidence in AMD's future with Rory Read and Jim Keller now in the mix.
 
Last edited:

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
This definitely cuts the other way, as you point out. That price point probably wrecks AMD's balance sheet -- they probably lost at least $40-50 on every entry-level 8-core they sold. But regarding Bulldozer -- I think the bottom line is that Cliff Maier is no Jim Keller. Maier tried hard, but Bulldozer was his first big-core design (and probably his last). Keller is just a lot more experienced, and was the reason AMD leapfrogged Intel in performance (K7/K8).

Not to undersell Keller's abilities, but at the time he was basically tasked with taking Dirk Meyer's (who later became CEO) K7 design, which heavily leveraged Dirk's DEC Alpha 21264 development experience, and he owned the project of iterating an already kickass (albeit imported from DEC via Meyer and co.) architecture by moving the IMC onto the CPU and extending the instruction set to 64-bit.

Compared to the K6 (imported from NexGen) that preceded the K7 (imported from DEC), the K8 was a mere evolutionary iteration that was not repeated when K10 (Bulldozer) was developed.

To put it differently, the guys who were tasked with developing the K5 (a failure), the K6 (bought from NexGen), the K7 (brought over from DEC), and the K10 (only AMD's second truly internally designed, ground-up CPU) had a far different job than Keller, who is vaunted for having developed what is pretty much a run-of-the-mill evolutionary product based on the already successful K7 (bolt on an IMC, bolt on a second core, extend the existing 32-bit instructions to 64-bit).

Had AMD not embraced SOI at the time, the K8 would have never gained the fame it did.

Bulldozer may have been the product of an inexperienced product manager, Maier, but Keller was no magician when it came to the K8 either. He may have grown to become one in the meantime while at Apple, but that remains to be seen.
 
Last edited:

NTMBK

Lifer
Nov 14, 2011
10,525
6,050
136
Not to undersell Keller's abilities, but at the time he was basically tasked with taking Dirk Meyer's (who later became CEO) K7 design, which heavily leveraged Dirk's DEC Alpha 21264 development experience, and he owned the project of iterating an already kickass (albeit imported from DEC via Meyer and co.) architecture by moving the IMC onto the CPU and extending the instruction set to 64-bit.

Compared to the K6 (imported from NexGen) that preceded the K7 (imported from DEC), the K8 was a mere evolutionary iteration that was not repeated when K10 (Bulldozer) was developed.

To put it differently, the guys who were tasked with developing the K5 (a failure), the K6 (bought from NexGen), the K7 (brought over from DEC), and the K10 (only AMD's second truly internally designed, ground-up CPU) had a far different job than Keller, who is vaunted for having developed what is pretty much a run-of-the-mill evolutionary product based on the already successful K7 (bolt on an IMC, bolt on a second core, extend the existing 32-bit instructions to 64-bit).

Had AMD not embraced SOI at the time, the K8 would have never gained the fame it did.

Bulldozer may have been the product of an inexperienced product manager, Maier, but Keller was no magician when it came to the K8 either. He may have grown to become one in the meantime while at Apple, but that remains to be seen.

Who knows, maybe Zen will be a return to evolutionary improvements? Take K10, bolt on stacked memory, bolt on SMT...
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
Not to undersell Keller's abilities, but at the time he was basically tasked with taking Dirk Meyer's (who later became CEO) K7 design, which heavily leveraged Dirk's DEC Alpha 21264 development experience, and he owned the project of iterating an already kickass (albeit imported from DEC via Meyer and co.) architecture by moving the IMC onto the CPU and extending the instruction set to 64-bit.

Compared to the K6 (imported from NexGen) that preceded the K7 (imported from DEC), the K8 was a mere evolutionary iteration that was not repeated when K10 (Bulldozer) was developed.

To put it differently, the guys who were tasked with developing the K5 (a failure), the K6 (bought from NexGen), the K7 (brought over from DEC), and the K10 (only AMD's second truly internally designed, ground-up CPU) had a far different job than Keller, who is vaunted for having developed what is pretty much a run-of-the-mill evolutionary product based on the already successful K7 (bolt on an IMC, bolt on a second core, extend the existing 32-bit instructions to 64-bit).

Had AMD not embraced SOI at the time, the K8 would have never gained the fame it did.

Bulldozer may have been the product of an inexperienced product manager, Maier, but Keller was no magician when it came to the K8 either. He may have grown to become one in the meantime while at Apple, but that remains to be seen.


I hope some university is pursuing a case study of Bulldozer's development. There must be some great lessons there: why did they stick with that cache structure? Why did they choose MHz over IPC even after the P4 had burned up? Why didn't they go whole hog on their POWER/SPARC vision and add SMT to their CMT design for maximum threads? How much time, if any, did they spend determining exactly what sort of instruction patterns are used in deployed software that Intel's Core 2+ CPUs handle much better, and were there patent barriers involved in the notable stagnation of CPU performance on said software? Why did they integrate the memory controller into their CPUs ahead of Intel but seem to drag their feet pulling GPU and chipset features into the CPU?

Knowing more about what actually occurred to make Bulldozer what it was would probably help us guess where K12 is headed. If they do ditch CMT, I'd think that would mean we see AMD join the SMT club.
 
Last edited: