AMD Bristol/Stoney Ridge Thread

NostaSeronx · Jun 20, 2018

dark zero said:
Bristol Ridge Refresh is not happening.

Heck, I see only AMD K12 using 22 nm.

My sources implied;
- an approximately 200 mm squared 22FDX die replacing Vishera, targeted at Denverton/Jacobsville.
- an approximately 150 mm squared 22FDX die replacing both Bristol Ridge and Stoney Ridge, targeted at GLM+/JPL. (One die, multiple salvage slots: ranging from high FBB with low Vdd, to low FBB with high Vdd, to high FBB with high Vdd.) // High Vdd being 0.9V and low Vdd being 0.6V.

Both CPU and APU are AM4 compatible and the APU will be on FP5.

Vishera's MSRP before EOL in 2017 peaked at $130. So, the unsalvaged replacement should still be at $129. The Bristol Ridge A8-9600 doesn't exceed $70. So, the successor at peak will not surpass that.

It is not expected that these processors will compete with any of the Zen/Zen+/Zen 2 SKUs. The platform is mainly aimed at providing features for a low budget. So, performance is not a concern, but power and cost are.

Features (examples);
AVX512VL
AV1 L5.1? 12-bit? Encode/Decode
Follows Bristol Ridge as disguised Fire Pro, full half/double/quad precision speeds.

Speculation:
-> Zen2 Power/Area + FDSOI Intrinsic optimizations on top of Bristol/Stoney CPU.
-> Navi Power/Area + FDSOI """ on top of Bristol/Stoney GPU.
//Only two dimensions over each architecture. Power and area, with performance for the next node.
-> 22FDX Performance of Excavator/Puma, Enhanced Power/Area from Zen2.

2017 is an interesting year for AMD;
- AMD got a VISC person
- They invested in VISC
- Share patents with Intel.

I don't expect AMD to use Virtual Core or Soft ISA. Rather multi-core/module-orientated XVP and x86 ISA.
www.cs.cmu.edu/~seth/wild-and-crazy09/koreysewell.pdf
www.cs.cmu.edu/~seth/wild-and-crazy09/002-koreysewell.pptx

I'm slotting that for 12FDX models however. (If it is in 12FDX, then it will be in 3-nm. -> 12FDX Performance of Zen/Zen+, Enhanced Power/Area from Zen5.)
---
Extension edit:
Now, the gritty detail is that FBB/ABB has been significantly investigated.

So, Multi-FBB design does not increase area where Multi-Vdd does.

So, recompiling Excavator with a reduced Vdd design is going to drop its size. With Multi-FBB we are looking at max FBB up to 20x leakage and 5x increase in frequency.

A9-9420e is 2.6 GHz @ 1V~1.05V within a 6W TDP.
22FDX same design Excavator shrinks with a truncated Vdd. -> lower area
22FDX new design post-Excavator shrinks without that truncated Vdd. -> lower power

Math wise the 2.6 GHz @ ~1V on 28HPA becomes >2.6 GHz @ 0.9V on 22FDX. Now architecturally, the ideal is to get the same EPI as IPC from Zen. So, ~52% lower EPI for the new core. So, a 6W A9-9420e becomes a 3W part for 22FDX. The core with 20x the leakage and 5x the frequency is 60 watts and 13 GHz/13,000 MHz. This is just an example of max FBB, no one is ever going to do this ever. Except, maybe IBM or Oracle, the Z and M series from each are insane.
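
The arithmetic in that paragraph can be sketched as follows. This is only a back-of-the-envelope check using the figures quoted above. Applying the leakage multiplier to the whole power budget mirrors the post and is an assumption, not a real power model.

```python
# Figures quoted in the post. The scaling model is a deliberate simplification.
BASE_TDP_W = 6.0         # A9-9420e TDP on 28HPA
BASE_FREQ_GHZ = 2.6      # A9-9420e clock
EPI_REDUCTION = 0.52     # "~52% lower EPI" target for the new core
MAX_FBB_LEAKAGE_X = 20   # "20x the leakage" at max FBB
MAX_FBB_FREQ_X = 5       # "5x the frequency" at max FBB

# Same clock, lower energy per instruction: power scales with EPI.
new_power_w = BASE_TDP_W * (1.0 - EPI_REDUCTION)
rounded_power_w = round(new_power_w)  # the post rounds this to a 3 W part

# Assumption: like the post, multiply the whole power budget by the leakage
# factor. A real chip would only scale its leakage component this way.
fbb_power_w = rounded_power_w * MAX_FBB_LEAKAGE_X
fbb_freq_ghz = BASE_FREQ_GHZ * MAX_FBB_FREQ_X

print(f"22FDX part at {BASE_FREQ_GHZ} GHz: {new_power_w:.2f} W (~{rounded_power_w} W)")
print(f"max FBB: {fbb_power_w} W at {fbb_freq_ghz:.0f} GHz")
```

The 6 W part lands just under 3 W, and the max-FBB case reproduces the 60 W / 13 GHz example.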

===
Edit for @Thunder 57

Thunder 57 said:
GloFo is clearly using FD-SOI, I was curious as to anything AMD related to it.

Nothing is final as of yet.

My expectation is that it is an inversion;
-> Jaguar evolves into Zen(17h/18h) a high performance core.
-> Excavator evolves into a new low-power and low-cost core(19h?).

AMD has been filter hiring from 2014 onwards, low power SoC/CPU-GPU RTL/Verification individuals. One major hire group is from STMicroelectronics.

For example, NovaThor L8580 eQuad(Dual-core A9 @ 3 GHz) which was followed in AMD with Stoney Ridge replacing Carrizo-L. A dual-core at faster clock rate replacing a quad-core or two bigger cores while running at lower energy. (ST-E NovaThor L8580 28FD Equad(Dual A9) vs Samsung Exynos 4412 Quad A9 and Exynos 5250 Dual A15)

There are plenty of architects that could have worked on a low-power and low-cost design at AMD. There is a design and it should be substantially fast if the above is considered.

Zen started around 2012, and sampled late 2016.
The low-power and low-cost core should have started around 2014. So, if following the above it should sample late 2018.

Zen sampled to the public in 2016 -> https://techreport.com/review/30540/amd-gives-us-our-first-real-moment-of-zen

Earlier, I commented that the Atari VCS might be the one. Following the pattern above, then...
https://c1.iggcdn.com/indiegogo-med...it,w_695/v1527637991/pzuaczs9wvmx4n7pfx70.jpg
...the Atari VCS could possibly be the first to use the LP/LC core.

https://www.theregister.co.uk/2018/06/21/atari_interview_in_full/
https://regmedia.co.uk/2018/06/20/atari-chip.mp3 <-- For example.

Utilizing a new chip on FDX is more cost-effective than using FinFET chips.

https://www.anandtech.com/show/1243...ew-with-dr-gary-patton-cto-of-globalfoundries <== quotes from this guy.
- 22FDX will be a long-lived node so I expect will retrofit many technology modules
- We have been doing work on 12FDX here in NY for over a year .... We expect to be taking risk production on the parts early next year (2019), so we are pretty far along with the technology.
- We expect tape outs on 12FDX in 2020 with deliveries in 2021.
/ End quotes from GP.

22FDX thus tapes out in 2018, delivers in 2019. With the above information.

ET · Jun 22, 2018

Thanks for all the interesting info. Always a nice read, even if I'm still taking it with a small mound of salt.

NostaSeronx said:
Both CPU and APU are AM4 compatible and the APU will be on FP5.

I don't see how a Vishera replacement would fit into the AM4 ecosystem. Now, if AMD made that AM3+ (and got motherboard makers to release BIOSes for it), I'd consider it super exciting.

NostaSeronx · Jun 22, 2018

ET said:
I don't see how a Vishera replacement would fit into the AM4 ecosystem. Now, if AMD made that AM3+ (and got motherboard makers to release BIOSes for it), I'd consider it super exciting.

The closest thing I can find; low-power servers and cloud/IoT edge computing.
- https://en.wikichip.org/wiki/socionext/sc2a11
- http://www.socionext.com/en/products/assp/SynQuacer/Cloud/SC2A11/
- http://www.socionext.com/en/products/assp/SynQuacer/Edge/
- 24-core @ 1 GHz, 5-watt

The FX series and Ax series are AMD's mainstream/essential branding for 2017. I assume that will continue on to the next SKUs. If the models/SKUs aren't FX or Ax, they will either follow Mullins with the Micro- moniker or get the Carrizo-L style model name/SKU treatment. <-- mild speculation. (It will either be FX/Ax or something new.)

Examples; <-- speculation.
FX Next 8xxx -> 12-watt TDP ; 8-core @ >3 GHz
FX Next 9xxx -> 25-watt TDP; 8-core @ >4.2 GHz

This is compared to; https://en.wikichip.org/wiki/amd/epyc_embedded
Epyc Emb 3101 w/ 4 cores @ 2.9 GHz w/ 35W TDP
Epyc Emb 3151 w/ 4 cores 8 threads @ 2.9 GHz w/ 45W TDP
Epyc Emb 3201 w/ 8 cores @ 3.1 GHz w/ 30W TDP
Epyc Emb 3251 w/ 8 cores 16 threads @ 3.1 GHz w/ 45W TDP
All on the SP4r2 BGA.

None of the above fall within the average Denverton TDP (Jacobsville as well); https://ark.intel.com/products/codename/63508/Denverton

Xeon D goes from $213.00(4-core) to $2,406.00(18-cores w/ 32 PCIe)
Epyc Emb ~$100(guess 1P/4-cores) to $880(2P/16-cores w/ 32 PCIe)
Denverton goes $27(2-core) to $449.00(16-core)
FX Next would be sub-$50 for 4-core (6W+), sub-$75 for 6-core(9W+), and sub-$100 for 8-core(12W+).
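
To put the speculated TDPs next to the Epyc Embedded list above, here is a rough watts-per-core comparison. The FX Next rows are the speculative minimum TDPs from this post (6W+/9W+/12W+), not real products, and TDP per core is only a crude yardstick.

```python
# (name, cores, tdp_w). FX Next rows are speculation from the post,
# using the minimum TDP of each "W+" range. Epyc rows are the listed SKUs.
parts = [
    ("FX Next 4c (speculative)", 4, 6),
    ("FX Next 6c (speculative)", 6, 9),
    ("FX Next 8c (speculative)", 8, 12),
    ("Epyc Emb 3101", 4, 35),
    ("Epyc Emb 3151", 4, 45),
    ("Epyc Emb 3201", 8, 30),
    ("Epyc Emb 3251", 8, 45),
]


def watts_per_core(cores, tdp_w):
    """TDP divided evenly across cores; ignores uncore and I/O power."""
    return tdp_w / cores


per_core = {name: watts_per_core(cores, tdp_w) for name, cores, tdp_w in parts}

for name, value in per_core.items():
    print(f"{name:26s} {value:5.2f} W/core")
```

On these numbers the speculated parts sit at 1.5 W per core, against 3.75 W per core for the lowest listed Epyc Embedded SKU.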

A420?(430?)/AB400 being exclusive to FX/A next series? Since, based on the slides X470/B450 are 2018, and the rest Z490(24 GFX_PC4+8 GPP_PC3)/A420?(430?)/X400/AB400 will be introduced in 2019. Above 449 is the Ryzen set, and below 450 is the FX/A series set, with 400 valued series being a free-for-all.

Following the A9 series; <-- speculation
The desktop TDPs for 8th Gen A-series APUs would be 10W to 25W. Technically, the new FX chips would also be 10W-25W. With the laptop chips and FXe chips being 4.5W to 10W.

I have found two new cluster-based multi-threading designs. The first one is Scalable CMT and the second is Competitive CMT. Neither of these touches CSMT, which allows threads to go on different cores.

Scalable CMT is also Disintegrated CMT. What this design does is remove the front-end and back-end from the module, so the module is only the cluster of cores. The front-end (branch predictor, fetch, decode, dispatch) and back-end (TLB, cache unit, etc.) are moved into the L2. So, there are two NoCs, or Network in Module units: one for instructions and another for data. This design can have a four-core cluster. By definition, it is a two-level module. This was built to be an intermediate step toward VISC (also CSMT) designs later.

Competitive CMT is also Evolved Integrated CMT. This design is aimed mostly at the most efficient area utilization. Here the FPU is fused into the cores; Scalable CMT does the same or has a heterogeneous FPU (discrete FPU decoders). The front-end is mostly unchanged from Bulldozer/Piledriver. The changes are that the cores compete in the IBB/PQs (SMT tag and priority) and that the two-decoder design from Steamroller/Excavator is dropped. The biggest change, however, is the back-end. The two LSU regions get pushed into a single LSU unit, with emphasis on a two-level LSU design. Each core gets a register (Int<->FP) queue and both cores share the load/store queue (RQ1/RQ2<->LSQ). So, there is only one L1d unit and one copy of the rest of the LSU. All of the above is to reduce area and power. This design, however, will most likely stick with the two-core module. By definition, this is a single-level module, with the cores having buffers between the front-end and back-end, aka increased pipeline length. Also, if it works for Zen, it will work for cCMT.

sCMT is orientated for performance which instantly kills it off for gen 1. cCMT is power-focused which makes it the sensible one to implement for gen 1.

Speculation of product pipeline.
Gen 1(2019) -> cCMT // 22nm FDSOI (20nm/14nm FDSOI (Leti))
Gen 2(2021) -> sCMT or 2nd gen cCMT // 12nm FDSOI (20nm/14nm FDSOI (Leti))
Gen 3(2023?) -> VISC? or 1st/2nd gen sCMT // 7nm FDSOI (10nm FDSOI (Leti))
Gen 4(2025?) -> VISC? or 2nd gen sCMT // 3nm FDSOI (7nm Planar Nanosheet?) or 5nm FDSOI (10nm FDSOI).

22FDX+ is the node for gen 1. 22FDX+ most likely uses DITO/Dual BB on NMOS/PMOS. It also has performance enhancers which can be used as power enhancers. 12FDX+ is the gen 2 node. 12FDX+ most likely will use a revised source-drain BEOL/MOL, which reduces leakage and increases body biasing potential. 7FDX onwards, no idea.

Hypothetically only;
-> L1 instruction -> 64KB
-> L1 data -> 32 KB
-> 2 ALUQ - 2 ALUs // Both ALUs are complex.(One has DIV, one has MUL)
-> 2 AGUQ - 2 AGUs // Both AGUs are complex(Both load/store)
-> 2 FPUQ - 2 FMACs // 1 FP FMAC(Division, Dot, Square-root, etc) and 1 Int FMAC(Integer FMAC, Rotates, Shuffles, Extract, Store, CEU?(AES, SHA, RSA, Etc))
-> Register Queue -> 20 Memory(Jaguar-esque), 12 Store(64-byte?)
-> Load/Store Queue -> Take Zen diagram for AGU0 = RQ0 and AGU1 = RQ1. <64 Load, <40 Store
The FMAC unit is the unbridged version used from Bulldozer to Cannonlake. The main reason to utilize this format is that it consumes less power and less area. The FMACs, to be more descriptive, are FP-orientated (simple Int) P0 and Int-orientated (simple FP) P1. 32/28nm PD/SR to 22FDX is at most twice as many FMACs w/o BB and at most quadruple as many w/ embedded-BB design.

amd6502 · Jun 24, 2018

Some of this stuff is far out Nosta Seronx, but I'm thinking about just your Gen1 2019 prediction.

Dozers can improve in area efficiency, but they were never area efficient and would have much catching up to do. XV attained pretty nice power efficiency with gating? So your Gen1 prediction is Dozers reverting back to shared decoders (like PD) and trading in multi thread for some area and power efficiency. Hard to believe wattages would be so low.

Also, a new gen core means lots of testing.

Why not just do something simple, like port Bristol-CPU with Stoney-sized GPU to 22FDX and call it a day? It would be cheap and small ~150mm2. Could one reason be that much of the XV energy efficiency nitty gritty might be tuning really specific to 28nm and that a port would be expensive and not worth it over 2c/4t zen. The other reason, prbly to focus on zen products and gpus.

Is the socionext you mention made using FDX?

AMD has Stoney to compete with some of the Atom line, and almost everyone expects that they will just use an upcoming dual core zen APU to compete with the rest of the consumer Atom line. And then ultrabudget to budget and <10W has not been a profitable market to chase anyway.

I love the FX octa cores but would rather run something like a R5 2600 if I were to buy new hardware. I'd get 50% more threads.

NostaSeronx · Jun 24, 2018

amd6502 said:
Is the socionext you mention made using FDX?

Socionext chip is on-or-around TSMC 28HPC. Socionext uses custom TSMC nodes.
CS405/CS407 series -> 28nm TSMC.
CS602/CS661 - 16nm-12nm TSMC
It is clearly 28nm as their IP store does not have PCI Express 2.0 in the 16-nm/12-nm field.
https://www.socionext.com/en/products/customsoc/ip_macro/lineup.html => PCIe Gen2 RT/EP only goes to 28nm. // Product Brief -> PCI Express Gen2, Root/Endpoint

amd6502 said:
Dozers can improve in area efficiency, but they were never area efficient and would have much catching up to do. XV attained pretty nice power efficiency with gating? So your Gen1 prediction is Dozers reverting back to shared decoders (like PD) and trading in multi thread for some area and power efficiency. Hard to believe wattages would be so low.

Also, a new gen core means lots of testing.

Why not just do something simple, like port Bristol-CPU with Stoney-sized GPU to 22FDX and call it a day? It would be cheap and small ~150mm2. Could one reason be that much of the XV energy efficiency nitty gritty might be tuning really specific to 28nm and that a port would be expensive and not worth it over 2c/4t zen. The other reason, prbly to focus on zen products and gpus.

The design methodology would have been brought across from Zen, with the focus more on the power side of things rather than performance.

- Zen will rise in performance at same and less power and area. Switching to latest nodes is very important to keeping up increase in performance pace.
- The design hypothesized will dive to lower power and area at same or more performance. Switching to the latest and lowest nanometer node is not vital in comparison.

GlobalFoundries partners with STMicro June 11, 2012; https://www.st.com/content/st_com/e...m-fd-soi-technology-with-globalfoundries.html
GlobalFoundries in late 2014 show this => https://i.imgur.com/NzuMkMY.jpg
GlobalFoundries in 2015 announce this => https://www.globalfoundries.com/new...dustrys-first-22nm-fd-soi-technology-platform
So, late 2014 Advanced FDSOI versus February 2017 => https://i.imgur.com/67tdclK.png
Late 2014 doesn't have body biasing while early 2017 does.

Porting doesn't make sense as AMD has had plenty of warning for everything. 22FDX didn't come out miraculously out of nothing. So, if something is coming it is going to be new. My guess is a Bobcat(Excavator) to Jaguar(New Core) like evolution at minimum.

amd6502 said:
AMD has Stoney to compete with some of the Atom line, and almost everyone expects that they will just use an upcoming dual core zen APU to compete with the rest of the consumer Atom line. And then ultrabudget to budget and <10W has not been a profitable market to chase anyway.

I love the FX octa cores but would rather run something like a R5 2600 if I were to buy new hardware. I'd get 50% more threads.

Any 22FDX product would be low cost-orientated. The budget position is that more for less is always better. It can only be profitable if the product is cheap and volume is high.

With the R5 2600 there are 50% more threads. A NextFX 8-core, however, might cost from less than half to more than half the cost of the R5 2600.

Historical context:
April 9, 2014 - Athlon 5350(top budget SKU)/25 Watt = $55 -- July 3, 2014 - A10-7800(top locked SKU)/65 Watt = $153

If following ULP road and if it follows exactly what happened before...
Bobcat 1.0 has two-cores and one VLIW5 unit.
Jaguar has four-cores and two GCN units.
Stoney Ridge has two-cores and three GCN CUs units.
Next APU would have four-cores and six GCN NCUs units.

https://i.imgur.com/HkMtYii.png
Just food for thought from 2016 => 90 million chips of ~182 mm squared by 2020(estimate) from SOITEC. End half is annual wafer consumption and that FDSOI price decline.

250 mm squared however from 28nm 9T(114CP/90MP) to 22nm 8T(104CP/80MP) is 158 mm squared. So, 182 mm squared can't be AMD.

amd6502 · Jun 24, 2018

NostaSeronx said:
So, late 2014 Advanced FDSOI versus February 2017 => https://i.imgur.com/67tdclK.png
Late 2014 doesn't have body biasing while early 2017 does.
[...]
Porting doesn't make sense as AMD has had plenty of warning for everything. 22FDX didn't come out miraculously out of nothing. So, if something is coming it is going to be new. My guess is a Bobcat(Excavator) to Jaguar(New Core) like evolution at minimum. Any 22FDX product would be low cost-orientated.
If following ULP road and if it follows exactly what happened before...
Bobcat 1.0 has two-cores and one VLIW5 unit.
Jaguar has four-cores and two GCN units.
Stoney Ridge has two-cores and three GCN CUs units.
Next APU would have four-cores and six GCN NCUs units.

I agree native die is likeliest 4t + 5 to 6 CU. And RR cut in half would likely be able to do this at a ~140mm2 at 14nm.

Your chart shows 22FDX is very performance competitive with 14 finfet while being considerably cheaper. So there could be a niche in it for ~100mm2 to 140mm2 budget project on 22FDX.

Stoney (and future dual core die APUs) at under 125mm2 will have to do if they choose not to chase this margin-thin bottom end market. Perhaps with trends in Asia and other populous emerging markets it is a very large market, so maybe it's worth it. If they do dedicate resources and come up with a new architecture like you predict it would be super nice to see a low wattage ITX platform like an AM1 successor.

I also like ET's suggestion of extended support for old platforms like AM3. And my idea of multipurposing these dies with low end discrete video card GPUs.

NostaSeronx · Jun 25, 2018

AMD would want to funnel everything into the TR4/AM4/FP5 chipsets. With the 22FDX parts utilizing revised low-cost versions of AM4 and FP5.

The markets are usually divided this way:
25% is the premium market and make 66% of the revenue. <-- FinFETs where higher density is needed now.
75% is the budget market and makes 33% of the revenue. <-- FDSOI where lower cost is needed now.
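
As a quick sanity check on what that split implies, here is a minimal sketch. It assumes the 25%/75% figures are shares of unit volume and takes the 66%/33% revenue shares as rough thirds (they do not sum to exactly 100%).

```python
# Shares as quoted in the post. Treating the 25%/75% split as unit volume
# is an assumption; the post does not say which measure it uses.
segments = {
    "premium": {"unit_share": 0.25, "revenue_share": 0.66},
    "budget": {"unit_share": 0.75, "revenue_share": 0.33},
}


def revenue_intensity(segment):
    """Revenue share earned per point of unit share."""
    return segment["revenue_share"] / segment["unit_share"]


intensity = {name: revenue_intensity(seg) for name, seg in segments.items()}
ratio = intensity["premium"] / intensity["budget"]

print(f"premium: {intensity['premium']:.2f}, budget: {intensity['budget']:.2f}")
print(f"a premium unit brings in about {ratio:.0f}x the revenue of a budget unit")
```

Under those assumptions a premium unit is worth roughly six budget units in revenue, which is why the budget side only pays off at low cost and high volume.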

-> 22FDX brings 7LP IP; VCN, GFX, IFX, etc
22FDX/7LP sample with each other. So, IP would generally be slotted together. With 22FDX getting the ULP versions.

2019:
7LP CPU/GPU (Successor to Pinnacle and Polaris(7LP+ being Vega successor))
12LP APU (Successor to Raven)
22FDX APU (Successor to Bristol/Stoney)
^-- all utilizing the same IP gen.

2021:
3LP CPU/GPU
7LP+ APU
12FDX APU

Putting on my Nosta hat for 2023: (For 2.5-nm, or 25 Angstrom node; 3LP -> 2.5LP is equivalent to 32-nm to 28-nm/22-nm.)
2.5LP CPU/GPU
3LP+ APU
7FDX APU

Back to the 22FDX...
2013/2014 -> architectural specification.
2016/2017 -> Physical design on 22FDX
https://en.wikipedia.org/wiki/Physical_design_(electronics)
https://en.wikipedia.org/wiki/Physical_design_(electronics)#/media/File:PhysicalDesign.png

Utilizing the above 2018/2019 is a good enough point. The data does point to the possibility of a 22FDX GPU. It might even use LPDDR4X.

Low Power Group:
22FDX CPU still is a "High-Performance" CPU
22FDX GPU still is a "High-Performance" GPU
22FDX APU still is a "High-Performance" APU

//1. GlobalFoundries 28FDSOI (Old FDSOI, before Steamroller/Excavator delay)
2. STMicroelectronics 28FDSOI (More recent in hires: 28nm bulk/28nm FDSOI exp.)
3. Samsung 28FDSOI (This one follows above with STM customers emps that went AMD)
4. GlobalFoundries 22FDSOI(contractors to employees w/ 22FDX and 28/40nm Bulk exp.)
5. GlobalFoundries 12FDSOI(Technical Staff optimizations for 12FDX and 7LP.)

dark zero · Jun 25, 2018

Sadly both of them are impossible at 22nm

Also 14 nm is for APUs...

amd6502 · Jun 26, 2018

NostaSeronx said:
AMD would want to funnel everything into the TR4/AM4/FP5 chipsets. With the 22FDX parts utilizing revised low-cost versions of AM4 and FP5.

The markets are usually divided this way:
25% is the premium market and make 66% of the revenue. <-- FinFETs where higher density is needed now.
75% is the budget market and makes 33% of the revenue. <-- FDSOI where lower cost is needed now.

-> 22FDX brings 7LP IP; VCN, GFX, IFX, etc
22FDX/7LP sample with each other. So, IP would generally be slotted together. With 22FDX getting the ULP versions.

2019:
7LP CPU/GPU (Successor to Pinnacle and Polaris(7LP+ being Vega successor))
12LP APU (Successor to Raven)
22FDX APU (Successor to Bristol/Stoney)
^-- all utilizing the same IP gen.

Do you really think it's that extreme?! (25% premium market gets 66% revenue?)

(In the dozer days it seems AMD was almost strictly in the budget market (am1, fm2) and toward the end when am3 fx chips became "budget" and almost budget, they were almost entirely in that market. )

So you're saying if AMD can fill part of that sheer volume, say half of 75% with lower to produce chips it makes an important impact on the bottom line.

Not counting packaging, how much do you think the cost difference is between:
a 12/14nm 140mm2 (2c/4t+5CU) die
a budget die of equal area based on 22FDX (4c/4t and 3+CU)?
a 125mm2 Stoney die
a 250mm2 Bristol die
a 210mm2 RR die

I think AMD has an approach to divide the market into three categories: mainstream, ultra budget, and ultra high end. When they have a choice, they give emphasis to mainstream, followed by upper end, then budget. So it is fiscally smart of Dr Su to prioritize these markets. Should Intel gouge prices on Atom-based Pentiums, maybe she'd reconsider. But it looks unlikely in the near future, and even if it does happen, Stoney and a dual thread Zen APU would still provide decent coverage of this ultra budget end.

NostaSeronx · Jun 27, 2018

amd6502 said:
Do you really think it's that extreme?! (25% premium market gets 66% revenue?)

(In the dozer days it seems AMD was almost strictly in the budget market (am1, fm2) and toward the end when am3 fx chips became "budget" and almost budget, they were almost entirely in that market. )

So you're saying if AMD can fill part of that sheer volume, say half of 75% with lower to produce chips it makes an important impact on the bottom line.

In AMD's Total Market, FDX products would be 45% of that market. With Leading Performance products targeting 55% of their market.

FDX markets for x86:
- SBCs => single board computers
- Low cost OEMs => HP/Dell
- Low cost Semi-custom => Bi-directional semi-custom
-> Non-AMD products with AMD IP, royalties paid to AMD. (Higon, etc)
-> AMD products with foreign/third party IP, royalties paid to foreign/third party. (Cyclos, etc.)

amd6502 said:
Not counting packaging, how much do you think the cost difference is between:
a 12/14nm 140mm2 (2c/4t+5CU) die
a budget die of equal area based on 22FDX (4c/4t and 3+CU)?
a 125mm2 Stoney die
a 250mm2 Bristol die
a 210mm2 RR die

Arbitrary numbers, initial numbers:
Bandy Plus => 87,500 arb pts
22FDX SKU => 52,500 arb pts
Stoney => 34,375 arb pts
Bristol => 68,750 arb pts
Raven => 131,250 arb pts

Reorder of the above from most to least expensive to produce:
Raven with 131.25 pts
Bandy Plus with 87.5 pts
Bristol with 68.75 pts
22FDX with 52.5 pts
Stoney with 34.375 pts
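
For reference, dividing each arbitrary figure by the die area from the question gives a per-mm-squared rate for each node. A minimal sketch follows. Matching "Bandy Plus" to the 140 mm squared 12/14nm die is assumed from the order of the answer, and the implied rates are back-calculated here, not stated in the post.

```python
# name: (arbitrary points from the post, die area in mm^2 from the question).
# Assumption: "Bandy Plus" is the 140 mm^2 12/14nm die, by answer order.
dies = {
    "Bandy Plus": (87_500, 140),
    "22FDX SKU": (52_500, 140),
    "Stoney": (34_375, 125),
    "Bristol": (68_750, 250),
    "Raven": (131_250, 210),
}

# Implied arbitrary points per mm^2 for each die.
points_per_mm2 = {name: pts / area for name, (pts, area) in dies.items()}

# Most to least expensive to produce, by total points.
ranking = sorted(dies, key=lambda name: dies[name][0], reverse=True)

for name in ranking:
    pts, area = dies[name]
    print(f"{name:10s} {pts / 1000:8.3f}k pts  {points_per_mm2[name]:6.1f} pts/mm^2")
```

The two 14nm dies come out at the same rate, as do the two 28nm dies, with the 22FDX die in between. The reordered list above simply expresses the same points in thousands.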

22FDX die would feature Bristol Ridge-esque performance. While being cheaper, lower power, and increased market capability. Increased specification from Stoney to 22FDX would allow the salvaged dies to be Stoney-esque in cost.

amd6502 said:
I think AMD has an approach to divide the market into three categories: mainstream, ultra budget, and ultra high end. When they have a choice, they give emphasis to mainstream, followed by upper end, then budget. So it is fiscally smart of Dr Su to prioritize these markets. Should Intel gouge prices on Atom-based Pentiums, maybe she'd reconsider. But it looks unlikely in the near future, and even if it does happen, Stoney and a dual thread Zen APU would still provide decent coverage of this ultra budget end.

If Intel contra-revenues Atom, AMD is in a spot to also contra-revenue 22FDX. Intel would lose the battle, as 22FDX costs less than 14nm/10nm/7nm FinFETs. So, subsidizing 22FDX with EPYC margins is completely feasible.

EPYC 32c = $3000 ASP (20x Raven 4c/11 ASP)
Ryzen 8c = $300 ASP (2x Raven 4c/11 ASP)
---
FinFET lifespan is considered average at or around five years for customers.
FDSOI lifespan by Intel Research is average at or around twenty five years for customers.

Semi-custom lifespan for AMD then will last for twenty five years. Which is a big deal for, let's say, Higon's consumer division (STB/DTV/etc).

amd6502 · Jun 27, 2018

NostaSeronx said:
In AMD's Total Market, FDX products would be 45% of that market. With Leading Performance products targeting 55% of their market.
[...]
Reorder of the above from most to least expensive to produce:
Raven with 131.25 pts
Bandy Plus with 87.5 pts
Bristol with 68.75 pts
22FDX with 52.5 pts
Stoney with 34.375 pts
[...]
EPYC 32c = $3000 ASP (20x Raven 4c/11 ASP)
Ryzen 8c = $300 ASP (2x Raven 4c/11 ASP)
---
FinFET lifespan is considered average at or around five years for customers.
FDSOI lifespan by Intel Research is average at or around twenty five years for customers.

Semi-custom lifespan for AMD then will last for twenty five years. Which is a big deal for, let's say, Higon's consumer division (STB/DTV/etc).

Risc-V and acorn are much bigger competition for semi-custom headed to consumer products. I think x86 in that market will be almost niche. I think one die project on FDSOI to cover diverse markets would make a lot of sense, but I'd think that it could cover no more than 10% of AMD sales (lucky if it were 5%).

Given the pricing I've seen on BR AM4 products, and on Stoney mobile products, it seems that they may have either slowed 28nm production to a trickle or even stopped production. This hints that they are phasing out BR mobile/bga for cut down RR dual cores. (It suggests a native zen dual core APU is close; also phasing out BR might be followed by increased Stoney production).

Looking at yesterday and today's prices for Stoney laptops is kind of shocking. It's probably the OEMs' fault; they are fascinated with products that smack of planned obsolescence and have no qualms about price gouging naive suckers who go for the A-series numbering inflation (A9 really should have been A4 in the original scheme, or A6 in the circa 2014 numbering scheme). Unless it is a top binning they belong in netbooks or bottom end all-in-ones (if driven with over 20W TDP).

Are you using this transistor pricing? https://electroiq.com/petes-posts/wp-content/uploads/sites/6/2014/01/Jones5.jpg It seems like it's years old; 14nm must have come down in cost significantly. At ~$0.50 per pts I would guestimate and change those prices to:

Stoney $18
22FDX $26
Bristol $35
RR/2 $34
Raven $68

Then you add ballpark $10 to $18 of packaging depending on whether it's BGA or AM4 with no-frills TIM heat spreader. And ~$1 for OEM distribution. (Note I used the word guestimate, so these numbers are my wild guess, which is mostly based on your guess or numbers).
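
Putting those pieces together, a small sketch of the resulting cost range per packaged chip. Every input is one of the guesstimates above ($10 to $18 packaging, ~$1 distribution), so the output is only as good as those guesses.

```python
# Guesstimated die costs in USD, copied from the list above.
die_cost_usd = {"Stoney": 18, "22FDX": 26, "Bristol": 35, "RR/2": 34, "Raven": 68}

PACKAGING_USD = (10, 18)  # low/high packaging guess: BGA vs AM4 with heat spreader
DISTRIBUTION_USD = 1      # OEM distribution guess


def packaged_cost_range(die_usd):
    """Return (low, high) total cost: die + packaging + distribution."""
    low = die_usd + PACKAGING_USD[0] + DISTRIBUTION_USD
    high = die_usd + PACKAGING_USD[1] + DISTRIBUTION_USD
    return low, high


totals = {name: packaged_cost_range(cost) for name, cost in die_cost_usd.items()}

for name, (low, high) in totals.items():
    print(f"{name:8s} ${low} to ${high}")
```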

NostaSeronx · Jun 28, 2018

@amd6502
I am not using gate/transistor pricing. I am using MPW-based calcs: $$ * mm squared divided by Sample#. Hence arbitrary, because different MPW providers have different costs. Also, the MPW costs do not reflect deals between top-tier customers and their foundries.

2017 IBS:
http://soiconsortium.eu/wp-content/uploads/2017/08/MS-FDSOIPRS9.2617.pdf

28-nm 2014 => $1.4 @ 100M or 0.014 @ 1M
28-nm 2017 => $0.92 @ 100M or 0.0092 @ 1M
14-nm 2014 => $1.62 @ 100M or 0.0162 @ 1M
14-nm 2017 => $1.43 @ 100M or 0.0143 @ 1M
22-nm 2017 => $1.07 @ 100M or 0.0107 @ 1M
12-nm 2017 => $1.11 @ 100M or 0.0111 @ 1M

Costs get driven down over time.

Explains why AMD went on 14-nm HVM in 2017. It was about the same cost as 28-nm HVM was in 2014.
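
The unit conversions and the 14-nm vs 28-nm comparison can be checked with a few lines. The figures are the IBS costs per 100M gates listed above; the helper names are just for illustration.

```python
# Cost in USD per 100M gates, keyed by (node, year), as listed above.
cost_per_100m_gates = {
    ("28-nm", 2014): 1.40,
    ("28-nm", 2017): 0.92,
    ("14-nm", 2014): 1.62,
    ("14-nm", 2017): 1.43,
    ("22-nm", 2017): 1.07,
    ("12-nm", 2017): 1.11,
}


def per_million_gates(cost_100m):
    """Convert a per-100M-gate cost to a per-1M-gate cost."""
    return cost_100m / 100


def per_billion_gates(cost_100m):
    """Convert a per-100M-gate cost to a per-1B-gate cost."""
    return cost_100m * 10


# 14-nm in 2017 relative to 28-nm in 2014.
ratio_14nm_2017_vs_28nm_2014 = (
    cost_per_100m_gates[("14-nm", 2017)] / cost_per_100m_gates[("28-nm", 2014)]
)

for (node, year), cost in cost_per_100m_gates.items():
    print(f"{node} {year}: ${per_million_gates(cost):.4f}/1M  ${per_billion_gates(cost):.2f}/1B")
print(f"14-nm 2017 vs 28-nm 2014: {ratio_14nm_2017_vs_28nm_2014:.3f}x")
```

The ratio comes out within a few percent of 1.0, which is the "about the same cost" point.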

22FDX/12FDX are both set up to be replacing 28-nm 2017 before 2020. I assume the delay for 12FDX to 2021 deliverables is meant to get some semblance of Moore's Law.

22FDX deliverables 2019 => ~0.9x @ 100M gates
12FDX deliverables 2021 => ~0.8x @ 100M gates

amd6502 · Jun 29, 2018

NostaSeronx said:
@amd6502
28-nm 2017 => $0.92 @ 100M or 0.0092 @ 1M
[...]
22-nm 2017 => $1.07 @ 100M or 0.0107 @ 1M
12-nm 2017 => $1.11 @ 100M or 0.0111 @ 1M

$11 per billion transistors is too good to pass up (vs $14 or $9 for current 14nm or 28nm), especially for 12nm FDSOI, which seems like it would have the great advantages of 12/14nm FinFET and be in a different league from their 28nm; I hope they're putting at least a few resources into this worthwhile branch. (If AMD misses 22nm they better not miss 12FDX.) I think GCN 1.2 is current enough for the budget segment for the next five-plus years, so no need to port FinFET GPUs.

amd6502 · Jun 29, 2018

It's interesting to see how AMD grew the APU. These numbers are from cpu-world.com:

A8-3xxx : 1.2 billion ( LLano was smaller than 1st gen dozer thanks to the compact k10 core, see http://www.cpu-world.com/CPUs/K10/AMD-A-Series A8-3800.html )
A10-5800 : 1.3 billion transistors (This was the first gen dozer APU which followed llano which was the first APU).
A10-7xxx : 2.4 billion transistors. (almost twice as many, 1.84x in same die area as the 32nm predecessor, a testament to density of 28nm bulk silly cone; prbly mostly GPU growth from 8CUs of GCN).
Bristol A12: 3.1 billion transistors (using same die area , thanks to high density libs; upgraded GCN and DP GPU champ).
Stoney A9's: 1.2 billion transistors (just about as many transistors as first gen dozer APU, but with half the cores and half the CUs).
Raven Ridge: 5 billion transistors (massive transistor growth due 11 CU's of graphics and four big cores, Moore's law in action. )
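
A quick way to see the generation-to-generation growth in that list. The counts are the rounded figures quoted above, in billions; Stoney is left out of the chain since it is a smaller die class rather than a successor.

```python
# (APU, transistors in billions), rounded figures as quoted in this post.
apus = [
    ("A8-3xxx (Llano)", 1.2),
    ("A10-5800", 1.3),
    ("A10-7xxx", 2.4),
    ("Bristol A12", 3.1),
    ("Raven Ridge", 5.0),
]

# Growth factor of each APU over the one listed before it.
growth = [
    (later[0], later[1] / earlier[1])
    for earlier, later in zip(apus, apus[1:])
]

for name, factor in growth:
    print(f"{name:12s} {factor:.2f}x over predecessor")
```

With these rounded inputs the A10-5800 to A10-7xxx step comes out at about 1.85x, in line with the "almost twice as many" remark.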

EDIT: I left out the kitten based APUs. They were always small and stayed under a billion transistors. Beema: 0.93 Billion transistors and under 110mm2 with minimal iGPU. Very much worth to look at this old AnandTech article: https://www.anandtech.com/show/7974...hitecture-a10-micro-6700t-performance-preview


NostaSeronx said:
@amd6502
Costs get driven down over time.

Explains why AMD went on 14-nm HVM in 2017. It was about the same cost as 28-nm HVM was in 2014.

22FDX/12FDX are both set up to be replacing 28-nm 2017 before 2020.

So what happens to 28nm as it is phased out? Will they use most 28nm fabs for DDR4 and GDDR memory production?

I heard they were already using 16/14nm for premium memory in recent years.

NostaSeronx · Jun 29, 2018

amd6502 said:
So what happens to 28nm as it is phased out?

It doesn't get phased out completely. It gets shifted to lower capacities or to other foundries. There are always more customers with niche markets.

28nm Bulk might shift into 28nm FDSOI as 28-nm/22-nm/18-nm/12-nm from Samsung/GlobalFoundries ramp up. There is also possible progress for FDSOI in 130-nm, 90-nm, and 45-nm nodes. As it gets more mainstream and as the initial cost of ownership drops below bulk.

GlobalFoundries has an unofficial successor to 28SLP called 22ULP. It follows TSMC's 22ULP, but it is mainly used as a jump pad to 22FDX.

dark zero · Jun 29, 2018

As far as I know only Mediatek has plans to use the 22nm process on their entry processors if the Helio A22 is not the most basic one.

Also they are planning to use it for A75 chips for tablets or gaming consoles along with (at last) a decent GPU

NTMBK · Jul 3, 2018

amd6502 said:
It's interesting to see how AMD grew the APU. These numbers are from cpu-world.com:

A8-3xxx : 1.2 billion ( LLano was smaller than 1st gen dozer thanks to the compact k10 core, see http://www.cpu-world.com/CPUs/K10/AMD-A-Series A8-3800.html )
A10-5800 : 1.3 billion transistors (This was the first gen dozer APU which followed llano which was the first APU).
A10-7xxx : 2.4 billion transistors. (almost twice as many, 1.84x in same die area as the 32nm predecessor, a testament to density of 28nm bulk silly cone; prbly mostly GPU growth from 8CUs of GCN).
Bristol A12: 3.1 billion transistors (using same die area , thanks to high density libs; upgraded GCN and DP GPU champ).
Stoney A9's: 1.2 billion transistors (just about as many transistors as first gen dozer APU, but with half the cores and half the CUs).
Raven Ridge: 5 billion transistors (massive transistor growth due to 11 CUs of graphics and four big cores; Moore's law in action).

EDIT: I left out the kitten based APUs. They were always small and stayed under a billion transistors. Beema: 0.93 billion transistors and under 110mm2 with a minimal iGPU. This old AnandTech article is well worth a look: https://www.anandtech.com/show/7974...hitecture-a10-micro-6700t-performance-preview

You also left out the console APUs!

The Stoney Ridge/Llano comparison is interesting. For the same transistor count you gained a whole bunch of features (video decoding, integrated on-die southbridge) and some single-thread performance, but lost a whole bunch of GPU shaders and an entire memory channel.
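
To make the comparison concrete, here is a minimal sketch that turns the transistor counts quoted above into ratios. The figures are the ones from the quoted post (attributed there to cpu-world.com); the dictionary labels and the `growth` helper are just illustrative names, not anything from a real tool.

```python
# Transistor counts in billions, copied from the quoted post above.
# Labels and helper names are illustrative assumptions.
TRANSISTORS_BN = {
    "A8-3xxx": 1.2,      # Llano
    "A10-5800": 1.3,     # first gen dozer APU
    "A10-7xxx": 2.4,
    "Bristol A12": 3.1,
    "Stoney A9": 1.2,
    "Raven Ridge": 5.0,
}


def growth(old, new):
    """Return how many times more transistors `new` has than `old`."""
    return TRANSISTORS_BN[new] / TRANSISTORS_BN[old]


if __name__ == "__main__":
    pairs = [
        ("A8-3xxx", "Stoney A9"),
        ("A10-5800", "A10-7xxx"),
        ("Stoney A9", "Raven Ridge"),
    ]
    for old, new in pairs:
        print(f"{old} -> {new}: {growth(old, new):.2f}x")
```

With these inputs, Llano to Stoney comes out at exactly 1x, and the A10-5800 to A10-7xxx step at roughly 1.85x, in line with the "almost twice as many" in the quoted post.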

So what happens to 28nm as it is phased out? Will they use most 28nm fabs for DDR4 and GDDR memory production?

I heard they were already using 16/14nm for premium memory in recent years.

There's a whole bunch of non-leading edge applications for silicon manufacturing. Storage controllers, microcontrollers, low end phone SoCs, motherboard chipsets, USB controllers, networking chips, etc etc. They'll find plenty of uses for it.

It will probably also see a lot of use for niche parts that can't justify the cost of sub-28nm mask sets. 28nm was the last node to not need multi-patterning, as far as I know: https://www.extremetech.com/computi...hography-technique-to-push-moores-law-to-20nm For small to medium sized production runs, it can be hard to justify the capital investment needed to make anything smaller.

EDIT: For reference, here's how TSMC's revenue share for one quarter in 2017 looked:

[Image: TSMC revenue breakdown by technology]

Almost half of their revenue came from nodes older than 28nm. And since chips on those older nodes will be cheaper, I suspect that the number of wafers would be over half.

amd6502 · Aug 7, 2018

Hey guys, does anyone here use a Stoney mobile (like 10w a9-9400) ? I'm wondering about the lowest power p state. How low can these cores clock? My 19w kaveri A10 (rip) could only go down to 1100mhz which is ~ twice the frequency that my 7.5w celeron atom can go down to (533mhz), and it was not terribly good on battery life (esp'ly in linux which did not kick down the gpu freq). My piledriver desktop (fx-8300) can go down to 1400mhz. I don't have access to my 15w bristol mobile right now, so I have no info on what BR p-states are either. They might be similar. If anyone has jaguar/puma I'd also be curious about lowest p state frequency. TIA
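
For anyone checking this on Linux, a minimal sketch of reading the lowest frequency the cpufreq driver reports is below. It assumes the standard cpufreq sysfs file `cpuinfo_min_freq` (value in kHz) is present for `cpu0`; the function names are my own, and the value is what the driver exposes, which may not match every hardware p-state.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs directory for the first core (assumed present).
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def khz_to_mhz(khz):
    """Convert a sysfs frequency value (kHz) to MHz."""
    return khz / 1000.0


def lowest_frequency_mhz(cpufreq_dir=CPUFREQ_DIR):
    """Return the driver-reported minimum frequency in MHz, or None if unavailable."""
    try:
        text = (Path(cpufreq_dir) / "cpuinfo_min_freq").read_text()
        return khz_to_mhz(int(text.strip()))
    except (OSError, ValueError):
        # File missing (non-Linux, no cpufreq driver) or unexpected contents.
        return None


if __name__ == "__main__":
    mhz = lowest_frequency_mhz()
    if mhz is None:
        print("lowest frequency: unavailable")
    else:
        print(f"lowest frequency: {mhz:.0f} MHz")
```

Running it on each machine would give directly comparable numbers for the minimum clocks discussed here.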

LTC8K6 · Aug 8, 2018

700mhz for the 9400 I believe.
1200 for BR, I think.

amd6502 · Aug 8, 2018

LTC8K6 said:
700mhz for the 9400 I believe.
1200 for BR, I think.

Very nice, so that's one improvement stoney has over bristol (and better than k10, probably, 35w llano could do 800.) It's best to have that wide range. I'd rather have a 10w or 15w a9 that could clock high than a 6w a9, as long as it had a similar minimum p state. The 10w 9400 seems like a decent product for netbooks and emmc laptops.

ET · Sep 3, 2018

With GF dropping out of the 7nm race, 22FDX may be a good way for AMD to continue taking advantage of GF manufacturing.

I'm wondering, given the availability of eMRAM for 22FDX, does it make sense to use it for caches? GF advertises it as having 12.5ns/40ns R/W access time.

NostaSeronx · Sep 3, 2018

ET said:
I'm wondering, given the availability of eMRAM for 22FDX, does it make sense to use it for caches? GF advertises it as having 12.5ns/40ns R/W access time.

eMRAM-F with the SRAM interface, at 10^8 endurance, is not meant for performance-oriented applications.
eMRAM-S is the one with 10^14 endurance.
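
A rough back-of-the-envelope sketch of why the endurance figure matters for a cache: it computes how long one cell would last under back-to-back writes. This is a deliberately pessimistic assumption (a single cell hammered continuously, no spreading of writes), and it assumes the 40 ns write time quoted above applies to both variants, which the source does not confirm. The function name is illustrative.

```python
WRITE_TIME_NS = 40  # write access time quoted above for GF's eMRAM (assumed for both variants)


def seconds_to_wear_out(endurance_cycles, write_time_ns):
    """Worst-case time for continuous writes to a single cell to use up its endurance."""
    return endurance_cycles * write_time_ns / 1e9


if __name__ == "__main__":
    variants = [
        ("eMRAM-F (SRAM interface), 10^8", 10**8),
        ("eMRAM-S, 10^14", 10**14),
    ]
    for label, endurance in variants:
        secs = seconds_to_wear_out(endurance, WRITE_TIME_NS)
        print(f"{label}: {secs:.3g} s (~{secs / 86400:.3g} days)")
```

Under these assumptions the 10^8 part lasts about 4 seconds of worst-case writes, against roughly 46 days for 10^14. Real cache traffic is spread across many cells, so actual lifetimes would be far longer than either figure.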

There is also the VCMA-MTJ with MeRAM, which is part of the MRAM series and loves the body-bias of UTBB FDSOI.
"With the same acceptable WER, MeRAM shows advantages of 83% faster write speed, 67.4% less write energy, 138% faster read speed, and 28.2% less read energy compared with STT-RAM. Benefiting from the VCMA effect, MeRAM also achieves twice the density of STT-RAM with a 32 nm technology node, and this density difference is expected to increase with technology scaling down."
- Comparative Evaluation of Spin-Transfer-Torque and Magnetoelectric Random Access Memory

"Our analysis show that reliability issues (including process variation, HCI, NBTI and SBD) induced performance degradation and failure can be well-mitigated in the MeRAM bit-cell design."
- Addressing Failure and Aging Degradation in MRAM/MeRAM-on-FDSOI Integration

Then, you have VCMA-MEJ!
"MeRAM can achieve ultrafast switching (<1 ns), low switching energy (~1 fJ), and compact cell size of 6 F^2 with a shared source region, as well as nonvolatility. For another application, we propose the VCMA-MEJ-based TCAM, which will be referred to as MeTCAM, consisting of 4T-2MEJs. Since MeTCAM fully exploits the low power and high density features of the VCMA effect both in write and search operation modes, it obtains a fast searching speed (0.2 ns) with the smallest cell area (44 F^2) compared to previous works."
- Magnetoelectric Random Access Memory (MeRAM) based circuit design by using Voltage-Controlled Magnetic Anisotropy in Magnetic Tunnel Junctions

But, back to STT-MTJ MRAM;
"Furthermore, through system-level workload characterizations and write traffic calculations for a variety of memory applications, we have found that write endurance in the range of 10^12 cycles would be sufficient for practically unlimited operations."
- A Study on Practically Unlimited Endurance of STT-MRAM
// The paper goes on to use it as L2 and L3 cache, with MRAM-S endurance being enough to count as effectively unlimited.

eMRAM-S => 2 MB L2 per module(Excavator) or 4 MB L2 per cluster(Jaguar) or 4 MB L2/16 MB L3 per CCX(Zen). (MRAM-S so far is >10^14 endurance)
eMRAM w/ VCMA => 4 MB L2 per module(Excavator) or 8 MB L2 per cluster(Jaguar) or 8 MB L2/32 MB L3 per CCX(Zen). (VCMA so far is >10^12 endurance)

TSMC plans to slowly introduce MRAM:
"TSMC plans to enter so-called "risk production" of its eMRAM in chips in 2018 ... using a 22nm manufacturing process ..." -> 22ULP/22ULL
http://www.eenewsanalog.com/news/report-tsmc-offer-embedded-reram-2019-0

Compared to GlobalFoundries:
- FinFET 14lpp/12lp SRAM type 1T1R MRAM-S/ReRAM-S
&
1T-1R (ReRAM and MRAM) in 14nm FinFET and 22nm FDSOI
//Linkedin.

ET · Sep 4, 2018

Interestingly, it looks like GF originally described eMRAM-F as having 10^8 endurance for the Flash interface and 10^10 for the SRAM interface, but then dropped it to 10^6 and 10^8.

Given that I saw eMRAM-S associated with the FinFET processes, and that's a dead end now that 7nm is going away, and GF apparently settling on FDSOI for future development, will we even see it come to market?

Obvcop · Sep 4, 2018

ET said:
Interestingly, it looks like GF originally described eMRAM-F as having 10^8 endurance for the Flash interface and 10^10 for the SRAM interface, but then dropped it to 10^6 and 10^8.

Given that I saw eMRAM-S associated with the FinFET processes, and that's a dead end now that 7nm is going away, and GF apparently settling on FDSOI for future development, will we even see it come to market?

Considering glofo cancelled pretty much everything, then surely it's a no. 22fdx is the only thing we can say for sure. Otherwise you enter the realm of pure speculation and fantasy

ET · Sep 4, 2018

Obvcop said:
Considering glofo cancelled pretty much everything, then surely it's a no.

GF didn't cancel the existing 14nm, and eMRAM-S was promised for that, so it could still appear there. I'm speculating that GF would prefer to concentrate on 22FDX for now, but that doesn't mean that 14nm is dead.
