Speculation: AMD's 7nm processors will all be APUs

Poll: Will 7nm Ryzen have integrated graphics? (57 voters)

Vattila

Senior member
Oct 22, 2004
799
1,351
136
I have this crazy hunch, based on that 48-core Starship rumour, as well as the topology of the Infinity Fabric configurations (see own thread), that 7nm Zen 2 will come on a die with 3 quad-core CCXs and 1 "GCX" (GPU complex), all directly connected (6 links) using Infinity Fabric.

This hypothetical APU die will allow the Starship configuration for EPYC, i.e. 4 dies on a package, each die having 12 cores (3 quad-core CCXs per die, so 4 dies × 12 cores = 48), with 4 GCXs (1 per die) for parallel compute acceleration (as a more efficient alternative to AVX512). Threadripper, implemented with two dies on a package, like before, will have 24 cores and 2 GCXs. And Ryzen 7, with one die, will have 12 cores and 1 GCX, and thus built-in graphics, for better competitiveness with the feature set of Intel's range. With the GCX, the same die will also be applicable to high-end notebook SKUs. In short, AMD's processor range will all be APUs from that point onwards.
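A quick tally of that hypothetical line-up, as a sketch (the 3-CCX/1-GCX die is my speculation, not a confirmed spec):

```python
# Core/GCX counts for the speculative 7nm die:
# 3 quad-core CCXs + 1 GCX per die (my assumption, not confirmed).
CORES_PER_CCX = 4
CCXS_PER_DIE = 3
GCXS_PER_DIE = 1

def sku(name, dies):
    cores = dies * CCXS_PER_DIE * CORES_PER_CCX
    print(f"{name}: {dies} die(s) -> {cores}C/{2 * cores}T, {dies * GCXS_PER_DIE} GCX(s)")

sku("EPYC 'Starship'", 4)   # 48C/96T, 4 GCXs
sku("Threadripper", 2)      # 24C/48T, 2 GCXs
sku("Ryzen 7", 1)           # 12C/24T, 1 GCX
```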

We know AMD wants to implement a scalable design strategy for its GPUs, and reportedly Nvidia is working on this as well. So this hypothetical GCX will also be the building block for larger graphics configurations, just as with the CCX today on the CPU side. For discrete graphics, there will be various configurations based on the number of GCXs per die, the number of dies on package, and the number of chips per card.

Is this too crazy an idea?
 

Ajay

Lifer
Jan 8, 2001
15,431
7,849
136
Yes, this is too crazy an idea. AMD needs to get on solid economic ground, with a large addressable market and profit potential, before branching out. Adding an iGPU on all chips would hurt margins on server chips.
 

Vattila

Senior member
Oct 22, 2004
799
1,351
136
Adding an iGPU on all chips would hurt margins on server chips.

Yes, that is the only argument I see against it. On the other hand you have to consider the excellent reusability of such a die for multiple products — it is even more attractive than the current Zeppelin configuration, which lacks built-in graphics and hence doesn't fully play in the mainstream desktop segment. The reusability reduces manufacturing cost, improving margins.

Also worth considering are all the FLOPS such a die could muster, which would look good at least on the spec sheet.
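For a sense of scale, here is a back-of-envelope peak-FP32 figure for a single GCX. The CU count and clock are purely my own guesses; only the GCN arithmetic (64 shaders per CU, 2 FLOPs per clock via FMA) is standard:

```python
# Peak FP32 throughput of a hypothetical GCX (GCN-style math).
# CU count and clock are illustrative assumptions, not leaks.
cus = 16          # assumed CUs per GCX
clock_ghz = 1.2   # assumed GPU clock
tflops = cus * 64 * 2 * clock_ghz / 1000
print(f"~{tflops:.2f} TFLOPS FP32")   # ~2.46 TFLOPS at these guesses
```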
 

maddie

Diamond Member
Jul 18, 2010
4,738
4,667
136
I would hazard a guess that 6C/12T and above CPUs mainly run discrete graphics cards. The upcoming RR will satisfy the integrated CPU/GPU [APU] crowd with up to 4C/8T models. My guess is that Zen 2 will have more cores per CCX, not more CCXs per die, leading to a 6C/12T or above RR successor.

Why would they include an iGPU on a server/HEDT/performance die when in nearly all cases it would never be used?

Two or maybe three production lines: a Ryzen-class product, an RR-class product and maybe an ultra-low-power [4W-10W] product.
 

maddie

Diamond Member
Jul 18, 2010
4,738
4,667
136
Yes, that is the only argument I see against it. On the other hand you have to consider the excellent reusability of such a die for multiple products — it is even more attractive than the current Zeppelin configuration, which lacks built-in graphics and hence doesn't fully play in the mainstream desktop segment. The reusability reduces manufacturing cost, improving margins.

Also worth considering are all the FLOPS such a die could muster, which would look good at least on the spec sheet.
If it were done now you would get a ~300 mm² Ryzen die. Does this make any sense, seeing that the ratio [200/300] would remain roughly the same irrespective of the node?

Once you get above a certain product volume in a given manufacturing run, the cost savings become small. It's not linear, basically an inverse polynomial, and reducing output by 1/3 will have a much bigger detrimental effect on overall costs.
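To make that concrete, a minimal die-cost sketch with a Poisson yield model; the wafer cost and defect density are illustrative guesses, not known 7nm figures:

```python
import math

# Illustrative assumptions, not real foundry numbers.
WAFER_COST = 9000                  # USD per 300 mm wafer (assumed)
WAFER_AREA = math.pi * 150 ** 2    # usable area in mm^2, ignoring edge loss
D0 = 0.002                         # defects per mm^2 (0.2/cm^2, assumed)

def cost_per_good_die(area_mm2):
    dies_per_wafer = WAFER_AREA / area_mm2
    die_yield = math.exp(-D0 * area_mm2)   # Poisson yield model
    return WAFER_COST / (dies_per_wafer * die_yield)

for area in (200, 300):
    print(f"{area} mm^2 die: ~${cost_per_good_die(area):.0f} per good die")
# ~$38 vs ~$70: a 1.5x larger die costs considerably more than 1.5x as much.
```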
 

Vattila

Senior member
Oct 22, 2004
799
1,351
136
Why would they include an iGPU on a server/HEDT/performance die when in nearly all cases it would never be used?

For efficient parallel compute, as an alternative to AVX512.

For a very long time, AMD has been working on the architecture (HSA), programming languages/models and libraries needed to efficiently use the GPU part of an APU for parallel compute. A few years back, one of their roadmap slides had "HSA realised" next to a server APU. So we know that is the destination — unless it has changed since then.
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
We got AVX128 in Jaguar probably to make it fit for consoles.
I can't find a similar reason here. But thumbs up for the creativity!
A bigger die also means performance problems elsewhere, besides the pure cost of such a big GPU part.
A similar situation is AVX512. It really is a very costly feature that could potentially take away exactly the margin that provides the fat profitability the day a real competitor enters.
Imo it then goes from smart reuse to unfocused business.
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
For efficient parallel compute, as an alternative to AVX512.

For a very long time, AMD has been working on the architecture (HSA), programming languages/models and libraries needed to efficiently use the GPU part of an APU for parallel compute. A few years back, one of their roadmap slides had "HSA realised" next to a server APU. So we know that is the destination — unless it has changed since then.
I think it's a similar argument as against AVX512; the customers that need to solve that kind of load either get a GPU for it or get a custom SoC.
There is lots of competition here.
I need to see the business case, or simply a customer that, as maddie outlines, can also pay for all the other customers' wasted die area.
 

Vattila

Senior member
Oct 22, 2004
799
1,351
136
I think it's a similar argument as against AVX512; the customers that need to solve that kind of load either get a GPU for it or get a custom SoC.

I presume that an HSA-compliant SoC/SiP will have much better compute efficiency and density, and that this is the reason AMD has spent all those resources on developing HSA and HSA-compliant APUs.
 

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
Is this too crazy an idea?

Yes.

Remember, cost per transistor is not shrinking that fast anymore, and 7nm probably won't mean a full halving of die size either.

A 12-core CPU + GPU on one die will be a relatively big, expensive die, with lower yield.

So, just about ZERO chance of this being what AMD builds.
 

Vattila

Senior member
Oct 22, 2004
799
1,351
136
7nm probably won't mean a full halving of die size either.

This EE Times article quotes "2.8x better routed logic density" for 7LP, so I gather AMD could fit 4 CCXs in less than the die space taken up by Zeppelin today. Replace one of the CCXs with my hypothetical GCX and you may be able to keep the die at a similar size, perhaps? (Rough numbers sketched below.)
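As a rough check on that density claim, using approximate published 14nm areas (Zeppelin is around 213 mm², a quad-core Zen CCX around 44 mm²; treat both as ballpark):

```python
# Scale the 14nm CCX area by the quoted 2.8x routed-logic density of 7LP.
# The 14nm figures are approximate public numbers; this ignores uncore/IO,
# which historically scales worse than logic.
ZEPPELIN_14NM_MM2 = 213   # whole Zeppelin die, approx.
CCX_14NM_MM2 = 44         # one quad-core Zen CCX, approx.
DENSITY_GAIN = 2.8        # "2.8x better routed logic density" (EE Times)

four_ccx_7nm = 4 * CCX_14NM_MM2 / DENSITY_GAIN
print(f"4 CCXs at 7nm: ~{four_ccx_7nm:.0f} mm^2 vs {ZEPPELIN_14NM_MM2} mm^2 Zeppelin")
# ~63 mm^2 of CCX logic, leaving room for a GCX plus uncore in a similar die.
```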

That said, I don't know what the size of a suitable GCX would need to be, considering a single one needs to be sufficient for mainstream desktop and notebook. At a minimum, it needs to beat Intel's iGPU in performance. Any inkling?
 

Vattila

Senior member
Oct 22, 2004
799
1,351
136
6C/12T + 24 CU (1536 GCN cores) design with HBM2 on top of it? Sign me up for one.

For my hypothetical, fully enabled, 7nm APU die (for 7nm Ryzen 7), it would be 12C/24T + any number of CUs AMD can fit in a GCX.

PS. HBM will come on package, eventually. Perhaps first in server.
 

maddie

Diamond Member
Jul 18, 2010
4,738
4,667
136
Yes.

Remember, cost per transistor is not shrinking that fast anymore, and 7nm probably won't mean a full halving of die size either.

A 12-core CPU + GPU on one die will be a relatively big, expensive die, with lower yield.

So, just about ZERO chance of this being what AMD builds.
Wrong.
 

Yotsugi

Golden Member
Oct 16, 2017
1,029
487
106
You're suggesting wasting die area on something that is going to be useless 90% of the time?
 

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136

You have seen the future?

Maximum-density SRAM cells don't tell the whole story. We have to wait and see the size of performance-oriented full processor designs to find out how much density really improves in that application.
 

maddie

Diamond Member
Jul 18, 2010
4,738
4,667
136
You have seen the future?

Maximum-density SRAM cells don't tell the whole story. We have to wait and see the size of performance-oriented full processor designs to find out how much density really improves in that application.
Yes.
 

Vattila

Senior member
Oct 22, 2004
799
1,351
136
You're suggesting wasting die area on something that is going to be useless 90% of the time?

Yes, except it won't be useless in 90% of the target market.

To fully compete with Intel, AMD needs a die for the bulk of the market, which requires integrated graphics. The hypothetical die provides that. Whether AMD finds it cost-effective to develop a smaller separate die for the high-volume low-end market is a good question; if so, it will of course include graphics. Whether AMD develops a dedicated die for the low-volume HEDT market is hardly worth asking, and whether it is cost-effective to develop a separate die for the server market is also doubtful. Compared to server ASP, the extra manufacturing cost of including a GCX is minuscule, and server volume is minuscule compared to the mainstream.

Also, the GCX is a compute feature in itself that, with the right software, can be put to good use in both the HEDT and the server markets. See my earlier replies on HSA in this thread.

But, most important, consider the cost and time-to-market benefit of having only one die to develop, qualify and manufacture across all processor SKUs. Note the increasing cost of mask sets at 7nm. Note the current long delay between Ryzen and Ryzen Mobile, which use separate dies on 14nm. (A toy amortisation example below.)
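To illustrate the mask-set point with invented numbers (a 7nm mask set is widely reported to cost on the order of tens of millions of dollars; the exact figures here are assumptions):

```python
# Amortising mask-set cost over unit volume: one shared die vs two dies.
# Both numbers are assumptions for illustration only.
MASK_SET_COST = 12_000_000   # USD per 7nm mask set (assumed)
TOTAL_VOLUME = 10_000_000    # units shipped across all SKUs (assumed)

one_shared_die = MASK_SET_COST / TOTAL_VOLUME
two_custom_dies = 2 * MASK_SET_COST / TOTAL_VOLUME
print(f"one shared die:  ${one_shared_die:.2f} mask cost per unit")
print(f"two custom dies: ${two_custom_dies:.2f} mask cost per unit")
```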
 

ao_ika_red

Golden Member
Aug 11, 2016
1,679
715
136
I believe having 2-5 CUs is good enough. It would only be used for emergency cases (GPU died, monitor issue, etc.). Basically, doing what Intel has done for years.
 

Yotsugi

Golden Member
Oct 16, 2017
1,029
487
106
But, most important, consider the cost and time-to-market benefit of having only one die to develop, qualify and manufacture across all processor SKUs.
You'll still have to develop an APU and a low-power die. It's very, very, very pointless. A waste of die area.
Besides, RR was obviously delayed because of Vega.
 

Mopetar

Diamond Member
Jan 31, 2011
7,834
5,981
136
I doubt it, simply because there are some people who have no use for the GPU part, though I suppose they could build those parts from salvaged dies where the GPU is defective.

Besides, RR was obviously delayed because of Vega.

While that's certainly possible, I think that AMD may also be facing limits on wafer availability. They're making their CPUs and GPUs all at GF for the first time, and they actually have a CPU worth selling for the first time in five years, so I would imagine that they're using more wafers than at any time in the recent past and may already have all that GF can produce. In that case it makes far more sense to sell more expensive server, workstation and gaming CPUs than it does to make a part targeting lower-margin spaces. That's doubly true given AMD's financial situation.
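A toy version of that opportunity-cost argument, with entirely invented numbers:

```python
# If wafer supply is fixed, every wafer spent on low-ASP parts displaces
# high-ASP parts. All figures below are invented for illustration.
WAFERS = 1_000
GOOD_DIES_PER_WAFER = 200
server_asp = 400.0    # assumed revenue per die sold as server/HEDT
desktop_asp = 150.0   # assumed revenue per die sold as mainstream desktop

dies = WAFERS * GOOD_DIES_PER_WAFER
print(f"sold as server/HEDT: ${dies * server_asp:>13,.0f}")
print(f"sold as desktop:     ${dies * desktop_asp:>13,.0f}")
```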

Also, the rumors seem to suggest that HBM2 limitations are what's limiting desktop Vega supply, in which case they probably could have gone with the APU implementation a lot sooner if they had wanted to.
 

Vattila

Senior member
Oct 22, 2004
799
1,351
136
You'll still have to develop an APU

The hypothetical die is an APU. See the original post.

and a low-power die.

If Zen 2 has good performance/watt they can clock it down to meet power targets. GlobalFoundries has announced good power-saving numbers for 7LP as well — the EE Times article referenced earlier quotes "55 percent lower power" at the same performance compared to 14LPP.
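Taken at face value, that quote implies a simple scaling (the 65 W baseline is just an example):

```python
# "55 percent lower power" at the same performance, per the EE Times quote.
p_14lpp_watts = 65.0                    # example 14LPP power budget
p_7lp_watts = p_14lpp_watts * (1 - 0.55)
print(f"~{p_7lp_watts:.1f} W on 7LP")   # ~29.3 W at iso-performance
```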

And, with regard to die size, if Zen 2 really is Intel-beating, AMD can use its available manufacturing capacity to attack the market from above. Only if competition pushes them to the lower-ASP, higher-volume part of the market do they really need a smaller die. That said, it would perhaps be interesting to make a small die on the cheaper low-power FD-SOI process (22FDX/14FDX) for a play in mobile, tablets and/or IoT.
 

Excessi0n

Member
Jul 25, 2014
140
36
101
I could see them adding a very modest iGPU. Enough to drive a couple of screens, but not enough for any substantial 3D workloads.

You know, that actually makes me wonder if you could integrate something like that into the CCX itself.