What’s the fate of K12/ARM at AMD?


Thunder 57

Platinum Member
Aug 19, 2007
2,012
2,515
136
A few questions, if somebody has an idea of what K12 was:
  1. Since Keller mentioned he could get more performance from ARM than from x86, does that mean the K12 architecture was planned as the successor to the Zen 1/Zen 2 lineup (higher performance than Zen 2)?
  2. If K12 was the next generation after Zen 1/2, how likely is it that Keller designed K12 as an ultra-wide 6×ALU core like the ones he knew from Apple?
  3. How likely is it that Zen 3 is based on ideas/pieces developed for K12? It's hard to imagine a resource-constrained company like AMD wasting all the work that was done on K12.
  4. Was K12 killed completely, or did they just kill the ARM branch and put all resources into an x86 version? Is it possible that Zen 3 is the x86 branch of K12, just named to fit the Zen chronology? Forrest Norrod said Zen 3 is a completely new uarch, which leaves this option open.
I know you might not want to hear from me but let me try. First, thanks for posting in an appropriate thread. I'll give you my thoughts and hopefully we can have a constructive conversation. Maybe we can be nicer to each other, :) and I do mean it.

1. When did Keller say he could get more performance out of ARM than x86? I know Keller is a bit of an ARM fan and wanted to keep K12 alive and well, but unfortunately AMD didn't have the resources at the time to develop both.

2. This is kind of a mix of your 1/2. Why do you say K12 was planned as a Zen 1/Zen 2 successor? IIRC Zen and K12 were announced around the same time. You probably remember that AMD had plans to make a single socket that could take both Zen and K12. That would've been interesting and I am not surprised it was canceled (complexity).

As recently as 2014, AMD had a public roadmap for a common socket platform between x86 and ARM cores that would bridge the two, with an HSA-enabled version of the Jaguar architecture that might have helped plug the holes in AMD’s roadmap between now and Zen’s launch in 2017. By 2015, those plans had been canceled.
Source: https://www.extremetech.com/extreme/214568-jim-keller-amds-chief-cpu-architect-leaves-the-company

So my question is, why would AMD be developing a K12 with a 6-wide ALU while they were originally developing Zen with 4? From what you are saying, you make it sound like K12 was developed later.

3. I don't know; this is also kind of answered in my #2. AMD at the time did not have many resources, so I cannot say what they may have taken from K12 and put into Zen. Their GPUs still suffer because they were lacking so much in the resource department ($$$) just a few years ago.

4. I doubt many people could answer that. I'm sure they were able to use some of the stuff they learned, but who knows what? I know there has been some talk about Zen 3 being a new uarch, but I'm pretty sure Mark Papermaster walked that back somewhat. My best guess is "new uarch" will be similar to K8 --> K10: very much based on its predecessor but expanded for more performance. We have many months to speculate though.
 

scannall

Golden Member
Jan 1, 2012
1,902
1,529
136
I know you might not want to hear from me but let me try. First, thanks for posting in an appropriate thread. I'll give you my thoughts and hopefully we can have a constructive conversation. Maybe we can be nicer to each other, :) and I do mean it.

1. When did Keller say he could get more performance out of ARM than x86? I know Keller is a bit of an ARM fan and wanted to keep K12 alive and well, but unfortunately AMD didn't have the resources at the time to develop both.
In this interview: Jim Keller and Mark Papermaster.
 

moinmoin

Diamond Member
Jun 1, 2017
4,038
6,071
136
Both Zen and K12 were worked on by the same team at the same time until work on K12 was stopped. Since it was the same team, they very likely reused as many parts as possible between both designs. Considering this, Keller calling it higher performance makes sense, with K12 not being shackled by the x86 decoding necessary in Zen. I personally don't think there is any more to K12 than this. AMD still uses an ARM core in every Zen chip as the controller for the SCF (scalable control fabric) as well as the AMD Secure Processor.
 

naukkis

Senior member
Jun 5, 2002
538
401
136
Both Zen and K12 were worked on by the same team at the same time until work on K12 was stopped. Since it was the same team, they very likely reused as many parts as possible between both designs. Considering this, Keller calling it higher performance makes sense, with K12 not being shackled by the x86 decoding necessary in Zen. I personally don't think there is any more to K12 than this. AMD still uses an ARM core in every Zen chip as the controller for the SCF (scalable control fabric) as well as the AMD Secure Processor.
Keller did state that K12 has a bigger engine because the ISA makes that possible. Whatever that means; one could take the "engine" of a CPU to be its execution units and ports.
 

Thunder 57

Platinum Member
Aug 19, 2007
2,012
2,515
136
So I watched most of it and think I got the pertinent information. I'm not a fan of videos, but at least if it's interesting rather than a 20 minute "review" of a part with graphs I'm OK with that.

So Jim is certainly pro ARM, not a big surprise. And he praises it. I'm sure he would've loved to have that K12 baby of his, but I still think that would have put AMD out of business. He is/was passionate about K12 as Lisa Su is/was passionate about AMD's survival. In the end, prioritizing Zen was still the right call IMO.

However, going back to what @Richie Rich was saying, how could K12 be a follow-up to Zen 2 if it was being developed concurrently with Zen 1? That is what I still don't get. If they planned for K12 to have 6 ALUs, then wouldn't the original Zen have had them as well? You would burn some more power perhaps, but if the design was there for one, why not use it in the other?
 
  • Like
Reactions: Tlh97 and moinmoin

Richie Rich

Senior member
Jul 28, 2019
470
228
76
I know you might not want to hear from me but let me try. First, thanks for posting in an appropriate thread. I'll give you my thoughts and hopefully we can have a constructive conversation. Maybe we can be nicer to each other, :) and I do mean it.
To be honest I'm open to any good discussion with anybody who can keep to US politeness standards. That's why I'm here at AnandTech and not at some eastern PutinTechPCForum. I don't like any sociopathic behavior, no matter who writes it. I'm glad we can discuss now like humans.

1.
Jim Keller himself mentioned in an interview that he could get more performance out of K12 compared to Zen at the same size due to the more modern ISA. Today, however, K12 would have to compete with the Cortex-A77, which is quite a bit smaller than Zen 2; I assume less than half the size of Zen 2.
In this interview (at 3:40) Keller mentioned they could get more performance out of an ARM core for these reasons:
  • ARMv8 is quite good: more registers and a proper, modern three-operand instruction set
  • fewer transistors spent on decoding than for complex x86, so more transistors can be spent on performance (the back end)
  • more performance can be traded for better efficiency
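To make the decode point above concrete, here is a toy sketch (invented encodings, not real ARM or x86 ones) of why fixed-width decode is cheap in hardware: AArch64 instructions are always 4 bytes, so instruction boundaries are simple index math, while x86 instructions are 1-15 bytes, so a decoder must walk the byte stream just to find where each instruction starts.

```python
# Toy illustration: fixed-width vs variable-length instruction decode.

def aarch64_boundaries(code: bytes) -> list[int]:
    """Every instruction starts at a multiple of 4 -- trivially parallel."""
    return list(range(0, len(code), 4))

def x86_boundaries(code: bytes, length_of) -> list[int]:
    """Each boundary depends on decoding the prior instruction -- serial."""
    offs, pos = [], 0
    while pos < len(code):
        offs.append(pos)
        pos += length_of(code, pos)   # must decode to learn the length
    return offs

# Hypothetical length function standing in for a real x86 length decoder
# (which must inspect prefixes, opcode bytes, ModRM, etc.).
def fake_x86_len(code, pos):
    return 1 + (code[pos] % 5)        # pretend lengths of 1..5 bytes

print(aarch64_boundaries(bytes(16)))                         # [0, 4, 8, 12]
print(x86_boundaries(bytes([3, 0, 0, 0, 7, 0]), fake_x86_len))  # [0, 4]
```

The serial dependency in the x86 loop is what wide decoders have to break with extra hardware (length pre-decode, marker bits), which is the transistor cost being discussed.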

2.
I thought K12 was described as a common back-end core with different front-ends (an ARM version with a simple decoder and an x86 version with a complex one). Keller mentioned this idea in the interview. So K12 was either a sister core to Zen 1 or the next generation after Zen 1. To be honest, Zen 1 was just a reworked Dozer uarch with a 4×ALU back-end (to save AMD from bankruptcy ASAP; there was no time to develop anything wider or more advanced than that). I'm also not sure how good an idea it is to build a new ARM uarch on a Dozer one. That's why I suggest K12 might be the next generation after the Zen 1/2 lineup.

Another thing: K12 was stopped in 2015, so if it were Zen 1's sister core it was stopped right before tape-out. AMD in 2015 was not a rich enough company to afford wasting three years of development. I can imagine canceling a 2nd-gen K12 after the 1st gen didn't sell well. But right before tape-out? That would be a weird waste of money.

All this leads me to see K12 as a brand-new next-gen uarch with a common back-end shared with Zen 3. In that case, canceling the ARM version of K12 would save a lot of money, because it was only in the first half of development, when they were starting work on the details and ISA differences.

Edit: Keller says in that interview at 12:39 that:
the way we built the ARM processor is little bit different from x86, we built bigger engine, it has some interesting features and thinking about that is giving some ideas for the next gen
So "bigger engine" means what, wider? 6×ALU?
Do the "interesting features" and "ideas for the next gen" mean K12 was planned as the base for a new uarch? We know Zen 3 is a completely new uarch, as Norrod said. Does that mean Zen 3 is based on K12? This kind of makes sense, as the K12 development time wouldn't be wasted completely but rather reused in a new x86 core.
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,511
3,554
136
There's also a Fudzilla exclusive about Keller not being a big fan of K12.

The A77 is complex enough that it has its own uop cache.

Jim Keller was also at Samsung, and that didn't turn their Mongoose architecture into a golden goose.
 

scannall

Golden Member
Jan 1, 2012
1,902
1,529
136
I really don't want to sit through a 15-minute discussion. Is there a transcript anywhere, or could you at least point out a time in the video I should be looking at? Anyway, this is the type of discussion I would rather have. If I'm wrong, I'm wrong.
At about the 3:48 time frame they start talking about that.
 

scannall

Golden Member
Jan 1, 2012
1,902
1,529
136
So I watched most of it and think I got the pertinent information. I'm not a fan of videos, but at least if it's interesting rather than a 20 minute "review" of a part with graphs I'm OK with that.

So Jim is certainly pro ARM, not a big surprise. And he praises it. I'm sure he would've loved to have that K12 baby of his, but I still think that would have put AMD out of business. He is/was passionate about K12 as Lisa Su is/was passionate about AMD's survival. In the end, prioritizing Zen was still the right call IMO.

However, going back to what @Richie Rich was saying, how could K12 be a follow-up to Zen 2 if it was being developed concurrently with Zen 1? That is what I still don't get. If they planned for K12 to have 6 ALUs, then wouldn't the original Zen have had them as well? You would burn some more power perhaps, but if the design was there for one, why not use it in the other?
You could make the fastest ARM CPU ever, that ate every x86 design for breakfast. And you'd still go out of business. Too much of the world runs on x86, even if it isn't the best. So Lisa Su did make the right call. And it shows. A fire breathing ARM CPU would be great to see in the wild, but there is too much software out there that would never be rewritten or recompiled to use it.
 

moinmoin

Diamond Member
Jun 1, 2017
4,038
6,071
136
K12 was originally (in 2014) scheduled to release along Zen in 2016. Priority and dates changed.




In the end only "Seattle" released in 2016 as Opteron A1100.
 

Richie Rich

Senior member
Jul 28, 2019
470
228
76
K12 was originally (in 2014) scheduled to release along Zen in 2016. Priority and dates changed.
In the end only "Seattle" released in 2016 as Opteron A1100.
Two years of delay is quite a lot. Maybe K12 got a two-year delay too (to somewhere around 2019). Who knows. Zen 1 was delayed a year, to 2017, as well.

Interestingly, the A1100 was released in 2016 in spite of K12 being canceled in 2015. AMD had probably already taped out A1100 samples, so they decided to deliver it. Doesn't that mean that if they had taped out K12 in 2015 (or been anywhere close to it), they would have delivered it too?
 

Thunder 57

Platinum Member
Aug 19, 2007
2,012
2,515
136
To be honest I'm open to any good discussion with anybody who can keep to US politeness standards. That's why I'm here at AnandTech and not at some eastern PutinTechPCForum. I don't like any sociopathic behavior, no matter who writes it. I'm glad we can discuss now like humans.

1.

In this interview (at 3:40) Keller mentioned they could get more performance out of an ARM core for these reasons:
  • ARMv8 is quite good: more registers and a proper, modern three-operand instruction set
  • fewer transistors spent on decoding than for complex x86, so more transistors can be spent on performance (the back end)
  • more performance can be traded for better efficiency

2.
I thought K12 was described as a common back-end core with different front-ends (an ARM version with a simple decoder and an x86 version with a complex one). Keller mentioned this idea in the interview. So K12 was either a sister core to Zen 1 or the next generation after Zen 1. To be honest, Zen 1 was just a reworked Dozer uarch with a 4×ALU back-end (to save AMD from bankruptcy ASAP; there was no time to develop anything wider or more advanced than that). I'm also not sure how good an idea it is to build a new ARM uarch on a Dozer one. That's why I suggest K12 might be the next generation after the Zen 1/2 lineup.

Another thing: K12 was stopped in 2015, so if it were Zen 1's sister core it was stopped right before tape-out. AMD in 2015 was not a rich enough company to afford wasting three years of development. I can imagine canceling a 2nd-gen K12 after the 1st gen didn't sell well. But right before tape-out? That would be a weird waste of money.

All this leads me to see K12 as a brand-new next-gen uarch with a common back-end shared with Zen 3. In that case, canceling the ARM version of K12 would save a lot of money, because it was only in the first half of development, when they were starting work on the details and ISA differences.

Edit: Keller says in that interview at 12:39 that:

So "bigger engine" means what, wider? 6×ALU?
Do the "interesting features" and "ideas for the next gen" mean K12 was planned as the base for a new uarch? We know Zen 3 is a completely new uarch, as Norrod said. Does that mean Zen 3 is based on K12? This kind of makes sense, as the K12 development time wouldn't be wasted completely but rather reused in a new x86 core.
I don't agree with what you are saying about Zen borrowing anything significant from BD. There's a video out there where David Kanter is asked about that and he seemed to struggle to come up with much other than the branch predictor (?), not sure. I'll try to find the video. If anything I'd think Zen borrowed some from the Cat Cores and maybe a bit from K10, but I think it was just mostly fresh stuff.

I'm still having trouble trying to follow your K12 timeline. As @moinmoin pointed out it was on their roadmap alongside Zen (Though not in that image as it's ARM only).

Lastly, what Keller said is too vague to speculate about, IMHO. Was it more ALUs? Maybe. Was it wider buffers? Perhaps. Was it anything else, or just plain voodoo magic? We don't know.

You could make the fastest ARM CPU ever, that ate every x86 design for breakfast. And you'd still go out of business. Too much of the world runs on x86, even if it isn't the best. So Lisa Su did make the right call. And it shows. A fire breathing ARM CPU would be great to see in the wild, but there is too much software out there that would never be rewritten or recompiled to use it.
In the real world, absolutely. If you could create an ARM core that was 100x faster and emulated x86 at performance comparable to other designs, then I'd bet that'd sell. Like I said though, real world, so you're right :p .

Two years of delay is quite a lot. Maybe K12 got a two-year delay too (to somewhere around 2019). Who knows. Zen 1 was delayed a year, to 2017, as well.

Interestingly, the A1100 was released in 2016 in spite of K12 being canceled in 2015. AMD had probably already taped out A1100 samples, so they decided to deliver it. Doesn't that mean that if they had taped out K12 in 2015 (or been anywhere close to it), they would have delivered it too?
There's just not much out there on K12. It was more of a side project (as was the GPU business at the time) compared to Zen. Even if AMD had more resources I'd bet they launch Zen before K12. And as far as I recall, AMD was always saying end of 2016 for Zen, but it was pushed until March 2017. So it was delayed a quarter, not a year, unless I am mistaken.

EDIT

Here is the video I was talking about.

Around 18:50 he says that Zen borrowed the branch predictor from Jaguar.

Around 39:00 he is asked about what came from BD and can't come up with anything. He says ALUs, then walks that back and says microcode. Then he goes on about planar to FinFET and says they basically had to redo it all anyway. I take that as saying nothing much, if anything, was taken from BD/XV.
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,511
3,554
136
In the real word absolutely. If you could create an ARM core that was 100x faster and emulated x86 at comparable performance to other designs, then I'd bet that'd sell. Like I said though, real world, so you're right :p .
If it were actually 100x faster (I know you are exaggerating, but hear me out), never mind x86, everyone would switch to that CPU. Sure, it would take some time, but it would happen.

But you don't do it for 15-20%. Not even for 50%. Because if you want to run your business, you need to recompile everything and verify it works with your existing infrastructure, and the moment you encounter serious issues, you'd say no.

And emulation has problems beyond just performance. It has compatibility issues. It'll never be 100%; it can't be. It's trying to be something it's not.

Ars Technica had an article about the difficulty of emulation. They said you can do it with 50% loss if you are willing to sacrifice compatibility. But the more you want it working like native, the greater the loss is. In the extreme cases, if you want to do transistor-level emulation, your 4GHz 16 core CPU may barely run Pong. But at least it'll run everything, right?

Again, emulation will never achieve 100% performance nor compatibility. It's a transitory thing, not a replacement. History teaches us that fact. The hardware emulator in Itanium, the translation layer with IA32EL, the FX!32, WoA efforts.

The single greatest thing against emulation is the incumbent. Sure, native to native you might be better, but enter their turf and suddenly your product is inferior. And you are trying to shoot a moving target, as they aren't standing still.

*There probably is a cost associated with using x86, but not something that can't be overcome with proper execution and engineering.

Like I said before, the likely culprit behind ARM catching up to x86 has more to do with failures at both AMD/Intel than the ISA itself.

Want proof? Look at how compact and high performing the mobile GPUs are. Is that due to ISA? Nope.*

Keller was never at any point at Samsung.
I see. It was widely reported. If that's wrong, then damn the press. That's why I actually appreciate AnandTech not talking about products too far into the future. Rumors are more often than not correct, but if you report on them too much, then you'd get a lot of flak for the 10% of the time you are wrong.
 
  • Like
Reactions: Tlh97

naukkis

Senior member
Jun 5, 2002
538
401
136
And emulation has problems beyond just performance. It has compatibility issues. It'll never be 100%; it can't be. It's trying to be something it's not.

Ars Technica had an article about the difficulty of emulation. They said you can do it with 50% loss if you are willing to sacrifice compatibility. But the more you want it working like native, the greater the loss is. In the extreme cases, if you want to do transistor-level emulation, your 4GHz 16 core CPU may barely run Pong. But at least it'll run everything, right?

Again, emulation will never achieve 100% performance nor compatibility. It's a transitory thing, not a replacement. History teaches us that fact. The hardware emulator in Itanium, the translation layer with IA32EL, the FX!32, WoA efforts.
Emulation of directly programmed hardware is costly, but for emulating the x86 instruction set it isn't needed. x86 CPUs themselves only emulate the original x86 hardware; they aren't 100% compatible either. So emulating only the instruction set can be done with that 50% loss or even less, which is why Intel tries to prevent software emulation of x86 through legal means.
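The distinction being drawn above can be sketched in a few lines: user-level ISA emulation only has to reproduce architectural state (registers, memory, flags), not chipsets, timers, or devices. Here is a toy interpreter for a made-up 3-operand instruction set; nothing in it is a real emulator, just an illustration of how small the "instruction set only" problem is compared to whole-machine emulation.

```python
# Toy user-level ISA interpreter: architectural state only, no hardware.

def run(program, regs=None):
    """Execute a list of (opcode, dst, src_a, src_b) tuples."""
    regs = dict(regs or {})
    for op, dst, a, b in program:
        # Operands may be register names or immediate integers.
        va = regs.get(a, a) if isinstance(a, str) else a
        vb = regs.get(b, b) if isinstance(b, str) else b
        if op == "add":
            regs[dst] = va + vb
        elif op == "sub":
            regs[dst] = va - vb
        elif op == "mov":
            regs[dst] = va
        else:
            raise ValueError(f"unknown opcode {op}")
    return regs

state = run([
    ("mov", "x0", 40, 0),
    ("mov", "x1", 2, 0),
    ("add", "x2", "x0", "x1"),   # 3-operand form: x2 = x0 + x1
])
print(state["x2"])   # 42
```

Real binary translators (Rosetta 2, QEMU's TCG) go further by caching translated blocks instead of re-interpreting, which is where the "50% loss or less" figures come from, but the scope is the same: instructions, not transistors.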
 
  • Like
Reactions: Tlh97 and maddie

soresu

Golden Member
Dec 19, 2014
1,853
1,022
136
I see. It was widely reported. If that's wrong, then damn the press. That's why I actually appreciate Anandtech not talking about products too far into the future. Rumors are more often than not correct, but if you report on it too much, then you'd get lot of flack for the 10% of the time you are wrong.
There were former AMD engineers at Samsung for the Mongoose project I think, either from the Bobcat team or the Bulldozer team (or both?).
 

Nothingness

Platinum Member
Jul 3, 2013
2,194
455
136
You could make the fastest ARM CPU ever, that ate every x86 design for breakfast. And you'd still go out of business. Too much of the world runs on x86, even if it isn't the best. So Lisa Su did make the right call. And it shows. A fire breathing ARM CPU would be great to see in the wild, but there is too much software out there that would never be rewritten or recompiled to use it.
Yes, and the Reich will last a thousand years. You know things don't stay the same forever, don't you? Of course displacing x86 will take time, and it might not be ARM that does it, but x86 will die, that's for sure.
 
  • Like
Reactions: Adonisds

scannall

Golden Member
Jan 1, 2012
1,902
1,529
136
Yes and the Reich will last a thousand years. You know things don't stay the same forever, don't you? Of course displacing x86 will take time, and it might not be ARM, but x86 will die that's for sure.
At some point in the future x86 will likely be displaced. But in the time frame AMD had to either get a good product out the door that actually sold or go bankrupt, ARM wasn't the answer.
 

moinmoin

Diamond Member
Jun 1, 2017
4,038
6,071
136
Technically x86 was already displaced by x64. Intel kinda got lucky they got easy access to it.
 

soresu

Golden Member
Dec 19, 2014
1,853
1,022
136
At some point in the future x86 will likely be displaced. But in the time frame AMD had to either get a good product out the door that actually sold or go bankrupt, ARM wasn't the answer.
It was likely only the fervor surrounding ARM at the time that had K12 being developed in the first place given AMD's constrained budget.

They likely won't even consider going that way again until there is a big move towards ARM in the software arena, especially AAA games, which are very anaemic on ARM platforms, excepting the Switch, which is constrained to an A57-based CPU and sub-Wii U graphics power (albeit with a greater graphics feature set and API support).

If they were going to do an ARM core now, they would likely have to start from scratch, considering both the competition in that market, and the changes to the ARM ecosystem too since K12 went into the deep freeze (ARM server standard, TME, SVE1/2, v8.6-A).

There is a very small possibility that AMD might do an ARM SoC to court Nintendo, but it seems likelier that Nintendo would just opt for a cut-down version of the new Tegra Orin SoC for a future Switch successor.
 

soresu

Golden Member
Dec 19, 2014
1,853
1,022
136
Technically x86 was already displaced by x64. Intel kinda got lucky they got easy access to it.
Likely a cheaper x86 license for AMD was part of the deal - I can imagine x64 was a pretty heavy bargaining chip there.
 

moinmoin

Diamond Member
Jun 1, 2017
4,038
6,071
136
It was likely only the fervor surrounding ARM at the time that had K12 being developed in the first place given AMD's constrained budget.
My guess is that their semi-custom business (especially the Sony and Microsoft consoles) made ARM seem like a feasible fallback should a higher-performance, more efficient successor to Puma not materialize. Zen ended up covering that need very well, so ARM was no longer needed there.

See also how "Project SkyBridge" was detailed:
 
  • Like
Reactions: Tlh97 and soresu

maddie

Diamond Member
Jul 18, 2010
4,331
3,849
136
Emulation of directly programmed hardware is costly, but for emulating the x86 instruction set it isn't needed. x86 CPUs themselves only emulate the original x86 hardware; they aren't 100% compatible either. So emulating only the instruction set can be done with that 50% loss or even less, which is why Intel tries to prevent software emulation of x86 through legal means.
We often forget that at its core, the modern x86 CPU is a son of RISC.

The instruction decoder and associated hardware are probably the main deficit of x86 vs ARM. The patent shown in this post by (DisEnchantment) should compensate a bit.

METHOD AND APPARATUS FOR VIRTUALIZING THE MICRO-OP CACHE
 
  • Like
Reactions: Tlh97 and moinmoin

IntelUser2000

Elite Member
Oct 14, 2003
8,511
3,554
136
The instruction decoder and associated hardware are probably the main deficit of x86 vs ARM. The patent shown in this post by (DisEnchantment) should compensate a bit.
There are some trade-offs they have to consider. The decoded operations are several times larger than the non-decoded ones.

AMD said they reduced the size of the L1 cache to make room for the increased uop cache.
 
