AMD "Next Horizon Event" Thread


beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
I also think that it's unlikely to see Ryzen use more than 8C (1 chiplet) as there's little reason for AMD to do so at this time. If they're getting ~10-15% bumps for both clock speed and IPC, there's sufficient reason to upgrade right there. For a big part of the market, 8C is already more than they need, so AMD is better off selling a 4C part as R3 that's much cheaper to produce and allows them to offer better prices to consumers.

Also, having to use 2 chiplets for Ryzen means that they can only effectively sell half the chips. Even though it might only cost an extra $20 in parts, that's potentially hundreds of dollars of revenue that they don't get by having more product. And they're already going to have limited supply due to being on a new process and wanting to use most of their chiplets for Epyc to get the most revenue. Even if they are just under parity with Intel, they'll still sell a lot of chips. If they're within 5% of Intel's 9900k, but sell that for what the top 8C/16T Ryzen goes for (~$350), it's pretty much a slam dunk in terms of which to buy.

Exactly. More than 8 cores makes little sense on AM4. A 16-core would eat into Threadripper and would make the IO die more complex and bigger.

They will save that step for the next platform (AM5?) with Zen 3 and probably DDR5 and PCIe 5.0.

More succinctly: An 8C minimum for R3 would probably have 99% of buyers choose the minimum. That drives down ASP/profits.

Exactly. The only way to make this work would be to lock the R3 and give it low clocks, and that would just be stupid compared to simply making it an unlocked quad with high clocks. They can just dump all the poor dies with bad power characteristics into R3.


What is much more interesting for the Ryzen 3000 series is its IO die. Will it support PCIe 4.0, and is the chipset connected to it via PCIe 4.0?
 

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
Also, having to use 2 chiplets for Ryzen means that they can only effectively sell half the chips.

Literally no one is suggesting that they would have to use 2 chiplets to produce a <=8 core part!

In that instance, one of the IF ports on the 2-port IOC would be fused off.

So on your package you'd have:
- 1x8C chiplet
- 1x 2-IF-port, dual channel DDR4 IOC [with an IF port fused off]
 

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
and would make the IO die more complex and bigger.

They will need an IOC that has two Infinity Fabric ports to make their APU anyway. The incremental cost over a single-port version is so marginal that a separate single-port design isn't worth the design/verification cost.

[On the assumption that they reuse the IF port they have announced on 7Vega for communication.]
 
  • Like
Reactions: Gideon

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
They will need an IOC that has two Infinity Fabric ports to make their APU anyway. The incremental cost over a single-port version is so marginal that a separate single-port design isn't worth the design/verification cost.

[On the assumption that they reuse the IF port they have announced on 7Vega for communication.]

Assuming the APU isn't monolithic. We will get a 12nm APU first, and a 7nm one is at least a year out, probably more.

The real question is why AMD would increase the core count again. The clock + IPC increase is already more than enough to sell it at the 2700X's price, or even a bit higher, while still being a lot cheaper than a 9900K. It just doesn't make much financial sense, because most users don't need 16 cores; even 8 isn't needed for gaming. So, as PeterScott says, a very large part of the market will just buy the low-cost 8-core part, which would now be what, $150 instead of $300? It doesn't make much sense.
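Quick back-of-envelope on the ASP effect (every price and the buyer split below are pure guesses on my part, just to illustrate the point):

```python
# Blended ASP under two hypothetical lineups; all numbers are assumptions.
def blended_asp(mix):
    """mix: list of (price_usd, share_of_buyers); shares should sum to 1."""
    return sum(price * share for price, share in mix)

# Roughly today's stack: 8C flagship, 6C mainstream, 4C entry (made-up shares)
current = [(330, 0.2), (200, 0.5), (120, 0.3)]
# If an 8C becomes the entry part at ~$150 and most buyers just take that
eight_core_entry = [(330, 0.1), (150, 0.9)]

print(f"current ASP:     ${blended_asp(current):.0f}")           # ~$202
print(f"8C-as-entry ASP: ${blended_asp(eight_core_entry):.0f}")  # ~$168
```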
 
  • Like
Reactions: amd6502

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
Assuming the APU isn't monolithic. We will get a 12nm APU first, and a 7nm one is at least a year out, probably more.

If it is 1 year out, it will probably be made at 7nm EUV: cheaper to produce, easier to design... And GloFo does have EUV machines.
 

Gideon

Golden Member
Nov 27, 2007
1,641
3,678
136
More succinctly: An 8C minimum for R3 would probably have 99% of buyers choose the minimum. That drives down ASP/profits.
Well, that doesn't really work that way. Otherwise the i3-8350K would be the best-selling retail Intel CPU, yet it was the 8700K by a large margin. People don't need 8 cores but still buy them. They will also buy 12 or 16 if the price is in range.
 

jpiniero

Lifer
Oct 1, 2010
14,599
5,218
136
Well, that doesn't really work that way. Otherwise the i3-8350K would be the best-selling retail Intel CPU, yet it was the 8700K by a large margin. People don't need 8 cores but still buy them. They will also buy 12 or 16 if the price is in range.

Ah, but it doesn't have HT, which does limit gaming performance. There's a reason Intel has resisted going to 4C/8T on the i3.
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
Thanks guy, had a feeling... I can't stand these reddit fan fiction no source bait posts.

The guy who posted it on Reddit clearly pointed out that it was a mockup and speculation (link):
disclaimer: All this is just a discussion of what could possibly be done, we don't know what will happen. We know that things can be done but AMD might opt to not do it for various reasons including: too soon, already planned for Zen3 or Zen4, costly, not practical atm, not possible technically, etc..

...

I took the AM4 package and scaled two of the new 8C ZCC to make a 16C chiplet complex. Added an estimated IO die.

Not sure how you managed to miss that.
 

Arzachel

Senior member
Apr 7, 2011
903
76
91
I feel like y'all are talking right past Atari2600 while arguing about the evils of communism 8core R3s.

AMD can literally cover the entire consumer desktop range while only using the basic building blocks of a CPU chiplet, IO hub and GPU chiplet. CPU chiplet + IO, 2 CPU chiplets + IO, CPU + GPU + IO should all be possible without having to create new masks. And then you get regular binning *on top of that*!
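To spell out how few die designs that actually takes (the SKU groupings below are placeholders, not anything announced):

```python
# One CPU chiplet mask, one IO die mask and (later) one GPU chiplet mask could
# cover the whole stack purely through packaging and binning.
# SKU groupings are placeholders, not real products.
lineup = {
    "entry (R3-ish)":       ["CPU chiplet (cores fused off)", "IO die"],
    "mainstream (R5-ish)":  ["CPU chiplet", "IO die"],
    "high end (R7/R9-ish)": ["CPU chiplet", "CPU chiplet", "IO die"],
    "APU":                  ["CPU chiplet", "GPU chiplet", "IO die"],
}

# Count distinct die designs, ignoring binning/fusing variants
unique_dies = {die.split(" (")[0] for dies in lineup.values() for die in dies}

for sku, dies in lineup.items():
    print(f"{sku:<22} -> {' + '.join(dies)}")
print(f"unique die designs needed: {len(unique_dies)}")  # 3
```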
 

coercitiv

Diamond Member
Jan 24, 2014
6,201
11,903
136
I feel like y'all are talking right past Atari2600 while arguing about the evils of communism 8core R3s.
That's because core count extravaganza is a lot more attractive than the elegance of a modular system design.

Personally I'm a bit more worried about VRM efficiency on AM4 boards trying to feed these (so far) imaginary communist beasts as they scale to 16 cores. Power may be down on 7nm by 50% at the same clocks, but when you double the cores and maintain or increase clocks while using lower voltage... you end up back where you started power-wise, but with even higher currents to drive. Things would likely get toasty on cheaper motherboards.
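Rough math on why the current goes up even when the power doesn't (both voltages below are guesses, purely illustrative):

```python
# P = V * I at the package level, so I = P / V.
# Holding roughly the same package power at a lower core voltage means the VRM
# has to deliver more current. Both voltages are assumptions.
def vrm_current_amps(package_power_w, vcore_v):
    return package_power_w / vcore_v

print(f"8C  @ ~1.35 V, 105 W: ~{vrm_current_amps(105, 1.35):.0f} A")  # ~78 A
print(f"16C @ ~1.10 V, 105 W: ~{vrm_current_amps(105, 1.10):.0f} A")  # ~95 A
```

More amps through the same cheap power stages means more heat in the VRM, even at identical package power.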
 

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
I feel like y'all are talking right past Atari2600 while arguing about the evils of communism 8core R3s.

AMD can literally cover the entire consumer desktop range while only using the basic building blocks of a CPU chiplet, IO hub and GPU chiplet. CPU chiplet + IO, 2 CPU chiplets + IO, CPU + GPU + IO should all be possible without having to create new masks. And then you get regular binning *on top of that*!

Where do you think the smaller IO hub, GPU, and CPU chiplets come from? The chiplet fairy? They need masks/designs too.

The problem is some people see a neat solution for a specific problem space, and quickly leap to thinking that it's the best solution for ALL problem spaces. It isn't.

There is a reason the industry maximizes integration, and if you look at mobile they put everything in an SoC.

1) Power usage is minimized on a monolithic chip.
2) Latency is minimized on a monolithic chip.
3) Board design and manufacturing is simplified with a monolithic chip.

1) is critical for laptops. This is the biggest portion of the Windows/Mac PC market, and the one where AMD lags the most.
2) is critical for gaming enthusiasts. Other than outright clock speed, it appears the biggest difference in gaming performance is Intel's lower latency.

Chiplets on desktop/laptop may offer some up-front design/pre-production savings, but will almost certainly have higher production costs, worse power usage, and worse latency. It's worse in every way except up-front costs.

The Total Addressable Market for Windows/Mac PCs is well over 200 million units annually. It's worth the up-front costs to build a superior dedicated part.

If AMD can't afford to build a dedicated part to chase this market, how in the heck are they going to build dedicated parts for the MUCH smaller dGPU market?
 

jpiniero

Lifer
Oct 1, 2010
14,599
5,218
136
Wafer costs are so insane at 7 nm that it's worth it to keep as much of the product on a lower node if you can get away with it.

You will see Intel do the exact same thing, although I imagine you won't see it until they implement EMIB or Foveros.
 

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
Wafer costs are so insane at 7 nm that it's worth it to keep as much of the product on a lower node if you can get away with it.

An assumption, or do you have numbers? If so, I'd love to see them.

You will see Intel do the exact same thing.

For larger Server chips sure. For Desktop/Laptop parts? I don't think so.
 

Glo.

Diamond Member
Apr 25, 2015
5,711
4,555
136
More succinctly: An 8C minimum for R3 would probably have 99% of buyers choose the minimum. That drives down ASP/profits.
And what drives market share and sales share?

Best price/performance CPUs.

There is a reason why the most popular CPUs from the past two releases from Intel and AMD are the Core i5-8400 and the Ryzen 5 1600/2600.

Have you ever wondered what will happen when AMD releases a 65W 8C/8T 3.8/4.2 GHz Ryzen 3 3200 for $109?
Have you ever wondered what will happen when AMD releases a 65W 8C/16T 4.0/4.5 GHz Ryzen 5 3400 for $169?

Have you ever wondered what will happen when AMD releases a 65W 16C/32T nobody-knows-what-base-clock/4.5 GHz Ryzen 7 3700 for $299?

Why would AMD do this?

To completely MURDER everything Intel can come up with. MURDER. This is essentially what will happen if Zen 2 has 5% higher IPC than Skylake/Kaby Lake/Coffee Lake, and very high clocks, and those prices.

AMD needs both: margin and market share. This lineup, with the Ryzen 3 3200 being 8C/8T, will insanely increase market share for AMD. The 16C/32T will insanely drive the margin, because that chip could even carry a $500 price tag.
 

exquisitechar

Senior member
Apr 18, 2017
657
871
136
https://www.servethehome.com/amd-epyc-2-rome-what-we-know-will-change-the-game/

From the comments:
Misha Engel said:
“Intel may retain the per-core performance lead which is important for many licensed software packages.”

I don't think all EPYC 2 CPUs will have 8 (relatively) slow CPU dies; some will have 4 or even 2 very high-speed dies, hence 4 GHz+ speeds while still being in the thermal envelope.

Misha Engel said:
Just had confirmation (Scott Aylor of AMD) on high frequency with low core counts while maintaining the rest of EPYC Rome's strengths (octa-channel memory, 128 PCIe 4.0 lanes, etc…)

Patrick Kennedy said:
I had a drink with Scott last evening along with Ian Cutress, Paul Alcorn, and Charlie Demerjian.

There are going to be things coming out between now and Rome. Sit tight.

Interesting.
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
The problem is some people see a neat solution for a specific problem space, and quickly leap to thinking that it's the best solution for ALL problem spaces. It isn't.

Nobody said it's the best technical solution, but it is the most cost-effective one with minor trade-offs. And Ryzen 3000 is a desktop product, so the space or power savings from integration matter much, much less. I mean, a 64-core monolithic server die would technically also be better, but certainly not cost-wise. And the 7nm APU is probably early 2020, so I expect it to be a 7nm monolith, because for laptops it does make sense.
 
  • Like
Reactions: Gideon

Gideon

Golden Member
Nov 27, 2007
1,641
3,678
136
The Total Addressable Market for Windows/Mac PCs is well over 200 million units annually. It's worth the up-front costs to build a superior dedicated part.

IMO you are somewhat preaching to the choir here. Quite a few people here agree that the APU will probably be monolithic. But as it probably uses Navi, it'll appear later (roadmaps point to 2020 :( ). This still leaves the door open for an 8-16 core chiplet design for AM4.
 
Last edited:

Arzachel

Senior member
Apr 7, 2011
903
76
91
Chiplets on desktop/laptop may offer some up-front design/pre-production savings, but will almost certainly have higher production costs, worse power usage, and worse latency. It's worse in every way except up-front costs.

Y'know, and manufacturing costs from being able to minimize the die size on a new process. And being able to bin and reuse those dies across the whole product stack. And being better able to optimize for frequency with most IO spun off to a separate die. And being able to provide semi-custom designs at a dramatically lower price point.
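To put a rough number on the die-size point, here's a plain Poisson yield model with guessed defect density and die areas (none of these inputs are real figures):

```python
import math

def poisson_yield(die_area_mm2, defect_density_per_cm2):
    """Simple Poisson yield model: Y = exp(-D * A)."""
    return math.exp(-defect_density_per_cm2 * die_area_mm2 / 100.0)

D = 0.3             # defects/cm^2 on a young 7nm process (assumption)
chiplet_mm2 = 75    # ~8C CPU chiplet (assumption)
monolith_mm2 = 180  # hypothetical 8C + full IO, all on 7nm (assumption)

print(f"small chiplet yield:  {poisson_yield(chiplet_mm2, D):.0%}")   # ~80%
print(f"monolithic die yield: {poisson_yield(monolith_mm2, D):.0%}")  # ~58%
```

And that's before counting the fact that a defective chiplet can still be harvested as a lower-core-count bin.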

I'm not sold on memory latency being a big deal, in gaming or in general, else we would see much better scaling with memory speed on Intel systems.
 

dnavas

Senior member
Feb 25, 2017
355
190
116
Personally I'm a bit more worried about VRM efficiency on AM4 boards trying to feed these (so far) imaginary communist beasts as they scale to 16 cores

Well, let's see what some imaginary numbers look like. Take the 1700 as a starting point -- 65W, 3 GHz base, 3.7 boost. Double the die -- 130W. Now, take the I/O out and halve it -- you'll pay a little price for cross-die movement, but probably save a little more by halving it. Call it 125W. At iso-power there's an advertised 25% speed increase. Assuming 3 GHz is the spot at which this is measured, you could expect a 3.7ish base frequency. How far it will single-core boost will depend on process, but let's call it 4.5 despite the rumors of high-power, low-core-count modes and even better Precision Boost. Is 125W for a 16-core AM4 processor that runs a 3.7 base and boosts to the mid-4s a bad deal? The 9900K runs a 3.6 base, and 125W doesn't sound exceptionally scary to me. Boosting all cores to 4.5 will be a problem, but if you're running that hard, you're likely to want the increased memory bandwidth of TR as well, so....
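Same guesswork in one place, in case anyone wants to poke at the numbers (every input is an assumption or a marketing claim):

```python
# Back-of-envelope only; every input below is an assumption or a slide number.
zen1_8c_power_w   = 65                      # R7 1700: 65 W, 3.0 GHz base
doubled_16c_w     = 2 * zen1_8c_power_w     # naive doubling of the die -> 130 W
io_moved_off_w    = doubled_16c_w - 5       # small credit for IO living on the 14nm die -> 125 W
iso_power_speedup = 1.25                    # advertised "+25% speed at the same power" on 7nm

base_clock_ghz = 3.0 * iso_power_speedup    # ~3.75 GHz base
print(f"estimated 16C package power: ~{io_moved_off_w} W")
print(f"estimated 16C base clock:    ~{base_clock_ghz:.2f} GHz")
```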

The other way to go is to peak at 12 cores and aim for a 4 GHz+ base and/or the 9900K's TDP. What almost certainly won't happen is 16 cores at 5 GHz.

It seems more like a marketing choice than an engineering one at this point. Does it make more sense to ship a high-frequency 8-core, or a 9900K-ish 12-core, or a slightly more moderate 16-core? Note: I'm still not bought into the separate I/O die, but the fused pairs do make it look more likely than I would have thought before. That said, marketing may well have made an early call for the high-frequency 8-core part, leaving the higher core counts on TR for now.
 
  • Like
Reactions: lightmanek

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
IMO you are somewhat preaching to the choir here. Quite a few people here agree that the APU will probably be monolithic. But as it probably uses Navi, it'll appear later (roadmaps point to 2020 :( ). This still leaves the door open for an 8-16 core chiplet design for AM4.

Yeah, the APU is the one I am most convinced will be monolithic. There have been some leaks suggesting we are going to see a process refresh of Raven Ridge on GloFo "12nm" first. Which does indicate a much longer wait for 7nm APU :(.

They are already building Vega on 7nm. I think they could justify a Vega/Zen 2 APU on 7nm rather than wait for Navi. I really think Navi is just Vega with slightly tweaked GCN (pretty much the same), perhaps with Ray Tracing/Deep Learning HW, which wouldn't be very applicable to an APU anyway.
 

maddie

Diamond Member
Jul 18, 2010
4,740
4,674
136
The chiplet mask is the only 7nm mask - it is already built.

The IOC masks are all 14nm and are much cheaper.
Probably channeling Nosta here, but just think of migrating the IOC die to an FDX process in the future. For all we know, GloFo going big with SOI might be related to this possibility.

All the talk of fully integrated being the best option might appear sensible, but it is BS. Zen has ~1/2 of its power used outside the actual cores. Ideally what you want is the highest frequency in the cores and the lowest power outside them. A segmented design is a godsend for optimization options.

Monolithic mobile options: big, little (same process).
Chiplet mobile options: big (7nm), little (12FDX). Result = much lower power (rough numbers in the sketch below).
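Crude illustration of why the split matters (the ~50/50 core/uncore figure is the only part based on Zen 1; both process scaling factors are guesses):

```python
# Split a mobile power budget into cores vs everything else, then apply
# guessed per-process savings. Only the ~50/50 split comes from Zen 1
# observations; the 0.6 and 0.7 factors are assumptions.
total_w  = 15.0
core_w   = total_w * 0.5
uncore_w = total_w * 0.5

core_on_7nm_w   = core_w * 0.6     # assume ~40% logic power saving on 7nm
uncore_on_fdx_w = uncore_w * 0.7   # assume ~30% saving for IO/uncore on 12FDX

print(f"monolithic 14nm:     {total_w:.1f} W")
print(f"7nm cores + FDX IO:  {core_on_7nm_w + uncore_on_fdx_w:.1f} W")  # ~9.8 W
```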

Segmented can be better than monolithic. It depends on many factors.

Even the latency canard is a red herring. Once latency is below a critical value, further reductions are fairly useless, and interposers and similar tech, for example, have negligible latency penalties.
 

coercitiv

Diamond Member
Jan 24, 2014
6,201
11,903
136
Well, let's see what some imaginary numbers look like. Take the 1700 as a starting point -- 65W, 3 GHz base, 3.7 boost. Double the die -- 130W. Now, take the I/O out and halve it -- you'll pay a little price for cross-die movement, but probably save a little more by halving it. Call it 125W. At iso-power there's an advertised 25% speed increase. Assuming 3 GHz is the spot at which this is measured, you could expect a 3.7ish base frequency. How far it will single-core boost will depend on process, but let's call it 4.5 despite the rumors of high-power, low-core-count modes and even better Precision Boost. Is 125W for a 16-core AM4 processor that runs a 3.7 base and boosts to the mid-4s a bad deal?
Using the Zen 1 arch may be a bit misleading for estimating power numbers in high-throughput scenarios: remember Zen 2 uses a trifecta to increase performance: more cores, more MHz, and wider cores for increased AVX throughput. The last part will take its toll on power consumption, and while voltage will go down, current will go up.
 

Atari2600

Golden Member
Nov 22, 2016
1,409
1,655
136
I really think Navi is just Vega with slightly tweaked GCN (pretty much the same), perhaps with Ray Tracing/Deep Learning HW, which wouldn't be very applicable to an APU anyway.

Had a thought on this - posted it in the other thread.

This Infinity Fabric on the "socket" and a single point of connection to memory could allow them to do some quite radical things with Navi.