AMD Bristol/Stoney Ridge Thread

amd6502 · Jan 14, 2019

ET said:
I don't think they're available yet, so delidding isn't yet possible, but AMD had confirmed to Anandtech that the new mobile line is two new chips. Like bsp2020, I can't find it with a trivial search, but I'm sure a little more work would dig that up.

I imagine that the 200GE is purely binned first gen Raven Ridge. It was said that Raven 2 has 4x PCIe, and, assuming that's true, it would conflict with 200GE's specs. I figure we'll have to wait for the 300GE to see Raven 2.

There are only three ways I'll believe it: 1. literal statement by AMD saying it is new silicon, not rebinned or just rebinned and retuned by firmware update, 2. die shots, 3. wccftech confirms RR-L silicon. (There was some vague statement that the 14nm have all the features of the rest of the line but that isn't literal enough for me and can be interpreted other ways.)

If it were new silicon, these two binnings +/- 100 MHz away from a 2200U don't really make that much sense. The 2200U dies are RR silicon rejects. Likewise, if these were RR-L, the die salvage would launch after the prime and fully functional silicon. There would at least be some SKUs out there with much higher boost and base frequencies.

Sure, they would have lesser binnings too at some point, but bottom binnings just 200 MHz apart again don't make much sense.

NTMBK · Jan 14, 2019

Some interesting comparisons between Bristol Ridge and modern Pentium and Athlon parts: https://www.anandtech.com/show/13660/amd-athlon-200ge-vs-intel-pentium-gold-g5400-review/ Even in integrated graphics tests, the 3CU Athlon is beating a fully enabled A12-9800 fairly often- looks like that CPU bottleneck kicking in:

NostaSeronx · Jan 14, 2019

While the CPU might be a big factor, it is also likely the:
Onion3(up to 40 GB/s) vs IF(up to 51.2 GB/s)
1x L2 cache in GPU vs 4x L2 cache in GPU
--> https://fuse.wikichip.org/news/1596/hot-chips-30-amd-raven-ridge/3/

There is no info on if the above bench would be effective to derive 22FDX/12FDX performance.

Theoretically, let's go like this:
Sempron 3xG => ~15W PGA
Sempron 3xU => ~6W BGA
Sempron 3xY => ~3W BGA

CPU core => Lobotomized Excavator at minimum; faster memory and I/O means instructions and data get to where they need to go faster.
What is impacted: Branch Predictor, Pick Buffers, Retire Queue, FPU Rename Queue, etc.
What can be improved: Algorithms in each, so more is done with less.
More speed means less need for inflight operations.

Internal I/O => IF, but it will most likely be 16B on 22FDX. However, we might see 1:1 memclk, etc.
16B * 4 GHz(DDR5 @ 4 GHz) => 64 GB/s
16B * 1.6 GHz(DDR5 @ 3.2 GHz) => 25.6 GB/s
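
As a quick check of the two bandwidth figures above, here is a minimal sketch assuming peak bandwidth is simply link width times clock. The function name `peak_bandwidth_gb_s` is made up for illustration, and a real link would not sustain this peak.

```python
def peak_bandwidth_gb_s(width_bytes: int, clock_ghz: float) -> float:
    """Peak bandwidth in GB/s, assuming width_bytes move on every clock."""
    return width_bytes * clock_ghz


# The two 16B cases listed in the post.
fast = peak_bandwidth_gb_s(16, 4.0)
slow = peak_bandwidth_gb_s(16, 1.6)

print(f"16B @ 4.0 GHz: {fast:.1f} GB/s")
print(f"16B @ 1.6 GHz: {slow:.1f} GB/s")
```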

GPU core => ABB, FDSOI means that the GPU does not need the added transistors of Polaris/Vega for higher clocks.
A9-9425 is 900 MHz => Sempron 3xG at 1350 MHz
A6-9225 is 686 MHz => "" """ at 1029 MHz
A6-9220C is 720 MHz => Sempron 3xU at 1080 MHz
A4-9120C is 600 MHz => "" """ at 900 MHz
Then, the Y SKUs would be 720 MHz/600 MHz respectively.
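
The projected clocks above all work out to a flat 1.5x over the existing Stoney Ridge GPU clocks. A minimal sketch of that scaling follows; the 1.5x factor is inferred from the listed numbers, and the names are hypothetical.

```python
# Existing Stoney Ridge GPU clocks in MHz, as listed in the post.
STONEY_GPU_MHZ = {
    "A9-9425": 900,
    "A6-9225": 686,
    "A6-9220C": 720,
    "A4-9120C": 600,
}

# Assumed flat frequency uplift (inferred from the numbers above).
SCALE = 1.5


def projected_mhz(base_mhz: int, scale: float = SCALE) -> int:
    """Scale a base GPU clock by a flat factor and round to whole MHz."""
    return round(base_mhz * scale)


projected = {sku: projected_mhz(mhz) for sku, mhz in STONEY_GPU_MHZ.items()}

for sku, mhz in projected.items():
    print(f"{sku}: {STONEY_GPU_MHZ[sku]} MHz -> {mhz} MHz")
```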

Most expensive SKU: $30 plus/minus $1.
With the 6W SKU peaking at the same performance as the 65W Bristol, mainly because Bristol was optimized for the 15W TDP, which in turn means 35W-65W has increasingly diminishing returns.

The numbers are without body bias by the way.

Shivansps · Jan 14, 2019

NTMBK said:
Some interesting comparisons between Bristol Ridge and modern Pentium and Athlon parts: https://www.anandtech.com/show/13660/amd-athlon-200ge-vs-intel-pentium-gold-g5400-review/ Even in integrated graphics tests, the 3CU Athlon is beating a fully enabled A12-9800 fairly often- looks like that CPU bottleneck kicking in:

Something is up with those A12-9800 results, even the A8-9600 gets around 40 fps on GTA V at 720p.
A6-9500 also tends to perform the same or better than the 200GE in non cpu bottleneck games, like Fortnite and World of Tanks.

amd6502 · Jan 14, 2019

NostaSeronx said:
The numbers are without body bias by the way.

Would it be possible to make one of the (int) cores in an XV module run at reverse bias (for optimizing freqs at maximal perf/watt and somewhat above) while making the other int core run at fwd bb to maximize freqs (performance)?

If so, what top freqs could a module reach given ~5W?

And secondly, could the perf/watt of the low freq thread surpass two threads on a single Zen+ core at ~1.5-2 GHz?

Abwx · Jan 14, 2019

amd6502 said:
And secondly, could the perf/watt of the low freq thread surpass two threads on a single Zen+ core at ~1.5-2 GHz?

A single Zen core loaded with two threads uses 1W @ 1.6 GHz; with 4 such cores loaded the uncore uses 3.5W. That's 7.5W SoC power at 1.6 GHz and a Cinebench score of about 350, and that's with an R5 2500U.

The uncore is quite power hungry, at least when it comes to a 2C/4T SKU harvested from a native 4C. With a dedicated design using a native 2C and fewer CUs, they can substantially reduce the uncore power, which would render any shrunk XV or Jaguar totally pointless.
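
The power figures above add up as follows. This is only a sketch of the arithmetic, assuming SoC power is per-core power times active cores plus a fixed uncore term; the function names are hypothetical.

```python
def soc_power_w(cores: int, core_w: float, uncore_w: float) -> float:
    """SoC power as active-core power plus a fixed uncore term."""
    return cores * core_w + uncore_w


def points_per_watt(score: float, power_w: float) -> float:
    """Benchmark points delivered per watt of SoC power."""
    return score / power_w


# R5 2500U figures from the post: 4 cores at 1W each, 3.5W uncore, ~350 points.
total = soc_power_w(cores=4, core_w=1.0, uncore_w=3.5)
uncore_share = 3.5 / total
efficiency = points_per_watt(350, total)

print(f"SoC power: {total:.1f} W")
print(f"Uncore share: {uncore_share:.0%}")
print(f"Efficiency: {efficiency:.1f} points/W")
```

With these numbers the uncore is just under half of the total, which is why trimming it matters so much for a native 2C design.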

NostaSeronx · Jan 14, 2019

amd6502 said:
Would it be possible to make one of the (int) cores in an XV module run at reverse bias (for optimizing freqs at maximal perf/watt and somewhat above) while making the other int core run at fwd bb to maximize freqs (performance)?

FBB is aimed at running frequencies and RBB is aimed at idle frequencies. In multi-vt designs FBB and RBB can be used at the same time. However in single-vt designs only FBB or RBB can be used. This is fixed in 12FDX with DITO, which allows the transistor to have two body bias diodes; one for FBB and one for RBB. They do not run at the same time; FBB is for active/awake and RBB is for idle/sleep.

It is not possible in a migration port of Excavator to have independent core body biasing.

https://imgur.com/WCkYsuy

Any and all of those AVFS modules can be upgraded to AVFBS modules. The B in the letters is the body bias.

Let's say they can body bias independently...
Core A and Core B have independent voltage and body bias

Workload A goes to Core A
Workload B goes to Core B

Workload A is going to take a while.
Workload B can be finished quickly.

Core A will then use FBB to get the same frequency at a trimmed voltage, to maximize perf/watt. <== crawl-to-sleep
Core B will then use FBB to get higher frequencies at a higher voltage, to maximize performance. <== race-to-sleep

amd6502 said:
If so, what top freqs could a module reach given ~5W?

And secondly, could the perf/watt of the low freq thread surpass two threads on a single Zen+ core at ~1.5-2 GHz?

Given a 5W TDP.
1.2(Throttle)~2.1 GHz(Base)~2.7 GHz(boost) @ ~1.25W
3.5W * 0.35 = ~1.23W
500~600 MHz @ <2.5W

^-- this expects similar devices, no tweaks for the newer client of the core(cpu/gpu): IoT/Consumer/Mobile, etc.

-> Faster AVFS implementations, increased frequency multiplier granularity, etc.
These make all of the calculations based on 28BLK/XV pretty much moot. AMD does not do simple ports; it will be a full redesign (top-to-bottom) port at minimum. At maximum, it could be a full new design (bottom-to-top) for a new processor lineup for 22FDX/12FDX/7FDX.

The components added in Piledriver, Steamroller, and Excavator have implementations in FDSOI which further increase performance or power efficiency. Basically, a flat 1.5x or 50% increase in frequency is not the max capability of 22FDX, which limits the numbers that I can actually provide without knowing what is implemented.

I'm not going to look up Resonant Clock Mesh vs Adaptive Quasi-resonant Clock Meshes.
I'm not going to look up AVFS to AVFBS. I'm not going to look at what happens if a clock multiplier can be switched in 1us vs 100us, or by 0.0025x vs 0.25x, etc.
I can sort of look up potentials of an Integer PRF redesign and FPU datapath/PRF redesign.
In regards to track height: 22FDX 7T/8T vs 28nm 9T. If 28FDS is any indication, 28FDS 7T is close to 28G/HP at 12T. Due to the decline in scaling in 22BLK, I would say a 7T/8T 22FDX design would perform higher than a 12T/13T 22BLK design.

The transition to FinFETs means sacrifice, while the switch to FDSOI means "there are so many optimizations!"

amd6502 · Jan 14, 2019

Very interesting and informative, Nosta! So it seems XV is a very fine tuned machine and nothing like Jaguar (which is a simple port).

Maybe Zen4 or Zen_n will go FDX someday, have SMT4 and we return to single cores; 1c/4t on fdsoi. Until then a simple XV successor or the somewhat more expensive Zen on 12nm will have to do. FDX should have big advantage for low core low cost APUs.

I'd imagine the same engineers and designers that fine tuned and built XV from that power hungry Steamroller are very busy now tuning Zen+ and Zen2.

Abwx said:
A single Zen core loaded with two threads uses 1W @ 1.6 GHz; with 4 such cores loaded the uncore uses 3.5W. That's 7.5W SoC power at 1.6 GHz and a Cinebench score of about 350, and that's with an R5 2500U.

The uncore is quite power hungry, at least when it comes to a 2C/4T SKU harvested from a native 4C. With a dedicated design using a native 2C and fewer CUs, they can substantially reduce the uncore power, which would render any shrunk XV or Jaguar totally pointless.

Well, that sure is going to be very hard to beat if they manage to halve the SoC power. I guess the only clear win then is the cost, assuming the next gen Stoney stays about as cheap.

NTMBK · Jan 15, 2019

NostaSeronx said:
The transition to FinFETs means sacrifice, while the switch to FDSOI means "there are so many optimizations!"

AMD has announced absolutely zero plans to produce a single device on FDSOI. Neither Bristol Ridge nor Stoney Ridge is FDSOI. Why do you keep filling this thread with Soitec and GlobalFoundries marketing?

amd6502 · Jan 15, 2019

NTMBK said:
AMD has announced absolutely zero plans to produce a single device on FDSOI. Neither Bristol Ridge nor Stoney Ridge is FDSOI. Why do you keep filling this thread with Soitec and GlobalFoundries marketing?

It might be because Zen is giving some of them tunnel vision. Especially so for the marketing and maybe PR people.

NTMBK · Jan 15, 2019

amd6502 said:
It might be because Zen is giving some of them tunnel vision. Especially so for the marketing and maybe PR people.

You mean that they're actually focused on the product that exists, as opposed to focusing on the fantasy product invented by Seronx? Perish the thought!

NostaSeronx · Jan 15, 2019

NTMBK said:
AMD has announced absolutely zero plans to produce a single device on FDSOI.

AMD has probably announced the FDSOI product. Any product without a node assigned to it is probably the 22FDX or 12FDX product.

NTMBK said:
Neither Bristol Ridge nor Stoney Ridge is FDSOI. Why do you keep filling this thread with Soitec and GlobalFoundries marketing?

I never said either of them were, at least not currently. Stoney Ridge could have originally been 28nm FDSOI, since the team that did Stoney was from the STMicroelectronics part of ST-Ericsson. However, GlobalFoundries dropped 28nm FDSOI to go "Advanced FDSOI", which became 22FDX.

AMD is tied to GlobalFoundries;
AMD forecasted that they will not be going on 7LP, so the 7nm agreement is void. This means that until the 7th WSA is signed, AMD<->GlobalFoundries are operating under the original WSA. Since the original WSA is SOI for AMD, SOITEC is naturally part of the group.

SOITEC sells FDSOI wafers to GlobalFoundries at "future" price. GlobalFoundries can then sell processed FDSOI wafers at reduced cost per transistor, making it a cost-competitive option compared to 28nm. FinFETs, meanwhile, are lagging, and GloFo only has one FinFET fab, while TSMC has multiple fabs in which FinFETs are produced. TSMC allows higher profit margins on FinFETs.

So, if premium products go to TSMC, then budget products will be at GlobalFoundries when they launch. Which node will the budget parts be on? 22FDX/12FDX.

Abwx · Jan 15, 2019

NTMBK said:
AMD has announced absolutely zero plans to produce a single device on FDSOI. Neither Bristol Ridge nor Stoney Ridge is FDSOI. Why do you keep filling this thread with Soitec and GlobalFoundries marketing?

At some point, and because of the senselessness, I'm wondering if it's not some kind of basic AI that is at work here...

Hitman928 · Jan 15, 2019

NostaSeronx said:
AMD has probably announced the FDSOI product. Any product without a node assigned to it is probably the 22FDX or 12FDX product.

Which product has AMD announced that doesn't have a node attached to it?

NostaSeronx · Jan 15, 2019

Hitman928 said:
Which product has AMD announced that doesn't have a node attached to it?

AMD's Dali processor.

AMD displayed it after they signed an agreement to go 12FDX instead of 7LP. <-- This was announced at a GTC(GlobalFoundries' Technology Conference) and is one of the reasons Chengdu Phase 1 is 22FDX, and Chengdu Phase 2 is 22FDX and 12FDX.

There are also processors without a market codename, known only by not being in PR slides, that are also 22FDX and 12FDX.

Hitman928 · Jan 15, 2019

NostaSeronx said:
AMD's Dali processor.

AMD displayed it after they signed an agreement to go 12FDX instead of 7LP. <-- This was announced at a GTC(GlobalFoundries' Technology Conference) and is one of the reasons Chengdu Phase 1 is 22FDX, and Chengdu Phase 2 is 22FDX and 12FDX.

There are also processors without a market codename, known only by not being in PR slides, that are also 22FDX and 12FDX.

So nothing they've announced, just more supposed leaks and rumors?

NostaSeronx · Jan 15, 2019

Hitman928 said:
So nothing they've announced, just more supposed leaks and rumors?

Nothing has been "officially" announced in regards to the 7th WSA and the plans of GlobalFoundries' 22FDX/12FDX.

Of those, the only guarantee is that the budget line is the successor to Bristol/Stoney. Not Raven/Raven2/Picasso/Renoir.
Premium product line increases in average selling price at increasing profit margin. (Hard to do at GlobalFoundries)
Budget product line decreases in average selling price at static or increasing profit margin. (Easy with growing FDSOI demand.)

Premium core = Zen
Budget core = GlobalFoundries IP

amd6502 · Jan 15, 2019

NostaSeronx said:
AMD's Dali processor.

AMD displayed it after they signed an agreement to go 12FDX instead of 7LP. <-- This was announced at a GTC(GlobalFoundries' Technology Conference) and is one of the reasons Chengdu Phase 1 is 22FDX, and Chengdu Phase 2 is 22FDX and 12FDX.

There are also processors without a market codename, known only by not being in PR slides, that are also 22FDX and 12FDX.

Excellent info on the budget value APU.

However, I have a hard time believing 12FDX will replace 7LP. Can you point us to refs for the agreement?

There is almost zero chance that Zen teams are migrating to FDX in large droves.

It would truly blow my mind. Imho FF 12nm and 7nm is here to stay; no way they will move anything but small iGPU, APU, and low core count to FDSOI.

I can believe small transistor count products move to FDSOI and the rest of the products pretty much stay with finfet.

NTMBK · Jan 15, 2019

NostaSeronx said:
AMD's Dali processor.

AMD displayed it after they signed an agreement to go 12FDX instead of 7LP. <-- This was announced at a GTC(GlobalFoundries' Technology Conference) and is one of the reasons Chengdu Phase 1 is 22FDX, and Chengdu Phase 2 is 22FDX and 12FDX.

There are also processors without a market codename, known only by not being in PR slides, that are also 22FDX and 12FDX.

Nonsense. Provide a single link backing up your claim that AMD signed this agreement.

NostaSeronx · Jan 15, 2019

amd6502 said:
However, I have a hard time believing 12FDX will replace 7LP.

12FDX doesn't replace 7LP, other than fiscally.
ex: Zen2 will not be ported to 12FDX, as 12FDX isn't replacing 7LP. It is a shift in roadmaps that led 12FDX to be prioritized over 7LP.

https://imgur.com/f4twB4E

GlobalFoundries had? the smallest leading edge fab.
TSMC is larger, and Samsung's S3 is dedicated with lots of EUV.
So, 7LP + EUV => better to go Samsung or TSMC.
So, 7LP w/ 193i => better to go TSMC.
Which is when Avera Semi comes in. GlobalFoundries is required to assist movement to other fabs.

The plan is:
22FDX has more customers than 14LPP. So, the post-14LPP roadmap isn't worth supporting if AMD isn't going to support it, while 22FDX has 12FDX as successor, which apparently AMD will be supporting. However, AMD need not rush to 12FDX, as there is Moore's Law to take into account. The budget lineup will follow the nodes that actually follow More Moore.

28BLK (Stoney) -> 22FDX (New budget APU) -> 12FDX (New budget APU+1)
1x transistor budget -> ~0.5 transistor budget -> ~0.25 transistor budget.

Why support 7LP, if the majority of the product will be fabbed at TSMC anyway? GlobalFoundries might like the difference being paid to them. Thus, it isn't ideal for AMD to go through that agreement. So, now the new agreement follows the above roadmap, and GlobalFoundries can now utilize the 7LP/7LP+/5LP/3LP R&D/Capital Expenditure for the FDX roadmap.

amd6502 · Jan 15, 2019

The bottom end is low margin and not going to net a huge portion to the bottom line, even with huge volumes. Most of the operating income (needed to repay debt and accumulate a net cash position to secure the future) is going to come from products with large numbers of transistors. This means fin fet over the medium term (next ~5 years), 12nm and 7nm specifically.

I haven't been paying close attention, but the last renegotiation I remember gives them much flexibility to go to other providers if GF cannot deliver the desired node or capacity.

NTMBK said:
Nonsense. Provide a single link backing up your claim that AMD signed this agreement.

If I missed something like this please fill me in.

I don't doubt FDX is going to be popular and have a large number of applications. It's just not going to fill the bulk of AMD's GPUs, APUs, and CPUs. It would be great if GF could fab memory on 22FDX.

If FDX were to replace the chiplet driver for Ryzen that would be good, but even there they went with 14LPP for Epyc (~400+mm2), and almost surely are reusing the work and repeating this for 3rd gen Ryzen (~120mm2) too.

amd6502 · Jan 18, 2019

So it seems like this gen APU, Picasso, is what BR was to Carrizo (a significant optimization to an already quite decent mobile APU), plus the 12nm process improvement.

The supposedly power hungry uncore and IF was tuned and greatly improves battery life (~40%). It's possible the 14nm dual cores also got this enhancement by firmware update, or by whole new native dual core die.

Dali isn't for about another year (sometime 2020 in the roadmap), although at earliest it could launch at the very end of this year (like Nov Dec) and have full revenue 2020.

I give it 60% chance it's XV+ on 22FDX and 40% chance it's a dual core Zen-lite, probably with a miniaturized cache.

krumme · Jan 18, 2019

NostaSeronx said:
12FDX doesn't replace 7LP, other than fiscally.
ex: Zen2 will not be ported to 12FDX, as 12FDX isn't replacing 7LP. It is a shift in roadmaps that led 12FDX to be prioritized over 7LP.

https://imgur.com/f4twB4E

GlobalFoundries had? the smallest leading edge fab.
TSMC is larger, and Samsung's S3 is dedicated with lots of EUV.
So, 7LP + EUV => better to go Samsung or TSMC.
So, 7LP w/ 193i => better to go TSMC.
Which is when Avera Semi comes in. GlobalFoundries is required to assist movement to other fabs.

The plan is:
22FDX has more customers than 14LPP. So, the post-14LPP roadmap isn't worth supporting if AMD isn't going to support it, while 22FDX has 12FDX as successor, which apparently AMD will be supporting. However, AMD need not rush to 12FDX, as there is Moore's Law to take into account. The budget lineup will follow the nodes that actually follow More Moore.

28BLK (Stoney) -> 22FDX (New budget APU) -> 12FDX (New budget APU+1)
1x transistor budget -> ~0.5 transistor budget -> ~0.25 transistor budget.

Why support 7LP, if the majority of the product will be fabbed at TSMC anyway? GlobalFoundries might like the difference being paid to them. Thus, it isn't ideal for AMD to go through that agreement. So, now the new agreement follows the above roadmap, and GlobalFoundries can now utilize the 7LP/7LP+/5LP/3LP R&D/Capital Expenditure for the FDX roadmap.

Nosta.
While you talked about those nodes that never showed up in meaningful numbers the world moved from 28nm to 20nm.
Then from 20nm to 14nm.
And then from 14nm to 7nm.
And we went from being young men to old men.

Have you considered the price of a fully depreciated 28nm or some well proven and well depreciated 14nm vs some new and supposedly cheap nodes?
Or that the extremely high price of new high-end nodes like 7nm plus or 5nm, and the steep depreciation that follows as a result, killed the entire concept of those "cheap" nodes?

The market result is darn brutal, and imo it's because the basic business logic was wrong.

NostaSeronx · Jan 19, 2019

The majority of fabless semiconductors are on nodes older than 28-nm. The world halted at 28-nm, the world never went 20-nm or 14-nm, nor 7-nm.

28-nm(bulk) to 22-nm(FDSOI) is significantly cheaper in total cost compared to 28-nm(bulk) to 22-nm(FinFET, DDC, Bulk). No steep depreciation has followed for FinFET.

28-nm bulk in 2014:
$0.014 per mil gates

14-nm bulk in 2014:
$0.0162 per mil gates

To
28-nm bulk in 2018:
$0.0092 per mil gates => $0.005 drop

14-nm bulk in 2018:
$0.0143 per mil gates => $0.002 drop
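
The drops quoted above can be recomputed from the four price points. A minimal sketch follows; the dictionary and function names are made up for illustration, and the prices are the ones listed in this post.

```python
# (2014 price, 2018 price) in USD per million gates, as listed in the post.
COST_PER_MGATE = {
    "28-nm bulk": (0.014, 0.0092),
    "14-nm bulk": (0.0162, 0.0143),
}


def price_drop(p2014: float, p2018: float) -> tuple[float, float]:
    """Return the absolute drop in USD and the relative drop in percent."""
    absolute = p2014 - p2018
    return absolute, 100.0 * absolute / p2014


drops = {node: price_drop(*prices) for node, prices in COST_PER_MGATE.items()}

for node, (absolute, percent) in drops.items():
    print(f"{node}: ${absolute:.4f} drop ({percent:.1f}%)")
```

In relative terms, 28-nm fell by about a third over those four years, while 14-nm fell by roughly 12%.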

FDSOI, however, has largely remained static throughout, as it hasn't been in production:
- Wafer supply from SOITEC. /FIXED
- Demand of node from customers. /INCREASING

So, while FDSOI has largely been static:
14-nm FDSOI projected cost per mil gate => $0.0117 per mil gate
to 2018 projected cost per mil gate;
22FDX => $0.0107 per mil gate
12FDX => $0.0111 per mil gate

The increase in wafer supply (plus the aggressive 7% off substrate per year plan) and the increased demand for FDSOI products mean that these nodes will most likely get their steep depreciation. Meanwhile, demand for older FinFETs gets replaced by FDSOI, which means there might be a rise in cost on older FinFET nodes. While 22FDX/12FDX will depreciate in costs for customers, 22/16/14/12/10 FF will get an appreciation of costs for customers. This will in turn exacerbate the slump in total cost in favor of FDSOI.

https://imgur.com/8S5TDYY

amd6502 · Jan 19, 2019

Well, $11 vs $14 per billion transistors is not too much of a cost difference. Mainly, an XV based APU will win on cost effectiveness by going for the lower range mobile segment with fewer transistors in the APU.

$17 production cost for 1.6B tr's Stoney++ (4 threads)
$33 production cost for 2.3B tr's RR-L (2c/4t)
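
The two production costs above appear to follow from the per-million-gate prices quoted earlier in the thread: $0.0107 for 22FDX and $0.0143 for 14-nm bulk in 2018. A rough sketch, assuming (as this post does) that gates and transistors can be treated interchangeably, and ignoring yield, packaging, and test; the function name is hypothetical.

```python
def die_cost_usd(transistors_billion: float, usd_per_mgate: float) -> float:
    """Silicon cost in USD, treating one gate as one transistor (assumption)."""
    million_gates = transistors_billion * 1000
    return million_gates * usd_per_mgate


# Transistor counts from this post, prices from the earlier cost figures.
stoney_pp = die_cost_usd(1.6, 0.0107)  # Stoney++ on 22FDX
rr_l = die_cost_usd(2.3, 0.0143)       # RR-L on 14-nm

print(f"Stoney++: ${stoney_pp:.2f}")
print(f"RR-L:     ${rr_l:.2f}")
```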

The Stoney++ would have higher design work, offset by cheaper cost to design on 22FDX.

RR-L design already exists, all the modules are already ready to be put together, but the lithography work is still there and will cost something.
