AMD Bristol/Stoney Ridge Thread

NostaSeronx

Diamond Member
Sep 18, 2011
3,683
1,218
136
Well since this thread has been revived I guess. Also, the patent is pretty old. => https://patents.google.com/patent/US10698472B2/ [2017-10-27]
It also doesn't imply x86 being low-feature -> high feature. It could easily be RDNAx -> CDNAx, etc. In this case, RDNA is for easy simple FP32/Wave32 stuff and CDNA for harder packed FP64/Wave64 stuff, or whatever. They tend to mix CPU<->GPU patents a lot, a bunch of CMT is now hidden in GPU patent text.

On another note: since May 2020, 22FDX has been on par with competitors' 7nm FinFETs. Whether that means Samsung's or TSMC's, who knows...
It is apparently with the 18-nm Gate Length/C18/Tsoi(options from 6 to 7nm) libs. (Tsoi=6nm => ~10-nm FDSOI and Tsoi=7nm => ~22-nm FDSOI; = reference targets)

Also, the successor to 15h appears to be blocked out with 128-bit Hi-Perf SIMD/VLIW2:
64-bit ALU/FPU0 - PRN/PRF0 - 64-bit ALU/FPU1 ||
64-bit ALU/FPU2 - PRN/PRF1 - 64-bit ALU/FPU3 || Beside:
64-bit AGU/Store - PRN/PRF2 - 64-bit AGU/Store || Scheduler(Convert mapped scalar ALU/AGU/FPU to VLIW2 or direct receive mapped SIMD), Map/NSQ(Mapped Micro-ops), Retire (Dense Micro-op)
\\ Load/Store Unit, somewhere near above units
^-- Duplicated by each core near each other like RDNA SIMD32 units.

Might be a rename planned from compute unit of 15h to workgroup unit to match with GPUs like they previously did.

AMD will probably wait for 12FDX with Tsoi=5nm and Tbox=15nm(~7nm FDSOI ref. target) and denser Cx/Lg option. (~2021-2024 timeframe)
 

NostaSeronx

Or they will probably continue to use TSMC FinFET and GAAFET...
I doubt anything related to Zen4 will use GlobalFoundries. It will be all N7-related IOD and N5-related CCD.

However, as far as GlobalFoundries goes, that is still the home of the 15h successor and AMD's newer product lines unrelated to CPU/GPU.
 

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
The patent is interesting insofar as the hardware doesn't require OS support but transparently moves threads from the low feature to the high feature core when the thread requires features of the latter:
[Image: patent figure US10698472-20200630-D00004.png]


Since the patent is from 2017 (and granted last month) I'd expect it to be Zen based (so this would be the wrong thread for it), just with all the advanced features removed. The patent also talks about shared L2$ and shadowed L1$, so unlike common big.LITTLE implementations the different cores are very tightly coupled. I can imagine an implementation of this patent appearing in a future Ryzen mobile APU.
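
To make the migration idea concrete, here is a minimal Python sketch of how I read it. Everything in it is an assumption for illustration: the feature names (`int`, `load_store`, `fp`, `simd`) are made up, the big core is assumed to be a strict superset of the little core, and a thread is assumed to stay on the big core once it has moved. None of that comes from the patent text itself.

```python
from dataclasses import dataclass, field

# Hypothetical feature names; not taken from the patent.
LITTLE_FEATURES = frozenset({"int", "load_store"})
BIG_FEATURES = LITTLE_FEATURES | {"fp", "simd"}  # assumed strict superset


@dataclass
class Core:
    name: str
    features: frozenset


@dataclass
class Thread:
    instructions: list  # feature required by each instruction, in program order
    core: Core
    migrations: int = 0
    trace: list = field(default_factory=list)


def run(thread: Thread, big: Core) -> Thread:
    """Run a thread, moving it to the big core on the first unsupported op."""
    for needed in thread.instructions:
        if needed not in thread.core.features:
            # Modeled as a hardware-side move: no OS scheduler is involved.
            thread.core = big
            thread.migrations += 1
        thread.trace.append((needed, thread.core.name))
    return thread


little = Core("little", LITTLE_FEATURES)
big = Core("big", BIG_FEATURES)
t = run(Thread(["int", "load_store", "simd", "int"], little), big)
for needed, core_name in t.trace:
    print(f"{needed:10s} ran on {core_name}")
```

The point of the toy model is only that the software running the thread never asks for the move; the core itself notices the unsupported instruction and hands the thread over.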
 

amd6502

Senior member
Apr 21, 2017
971
360
136
As Nosta points out it could also be very applicable to GPU.

However, if it's Zen, what mini core would they tightly integrate within a Zen cluster? I guess something Jaguar-like is most plausible.

Because the mini core only needs to be a subset, it wouldn't even need a math coprocessor. Like you point out, it could halt, and pass it on as a background thread on the big core.
 

NostaSeronx

However, if it's Zen, what mini core would they tightly integrate within a Zen cluster? I guess something Jaguar-like is most plausible.
Zen2 onwards are all mini-cores w/ high frequency optimization. The only solution AMD has is different physical implementations, with the hi-perf core using the second most dense track lib and the low-power core using the most dense track lib. They would probably also cut the YMM/ZMM registers down to only support 128-bit length.
Different track height
Different native SIMD length.

They don't need a from-scratch architecture to do pure low-power on leading edge FinFET/GAAFET.
At 6W and that pricing, this will throw Atoms and Stoney Ridge APU where they belong. In the bins. Forever.
The only one that competes/succeeds Stoney is https://www.amd.com/en/products/apu/amd-3015e

FT5 from FT4
Up to DDR4-1600 from up to DDR4-1866
1.2/2.3/0.6 GHz from 1.8/2.7/0.72 GHz
150 mm2 from 125 mm2
$219 for 1366x768 and $299 for 42 Wh battery from $299 for 1080p and 57 Wh battery

22FDX w/ optimization => ~100 mm2 (Stoney-shrink) [Capable of 7nm FinFET perf]
12FDX w/ optimization => ~53 mm2 (Stoney-shrink) and ~116 mm2 (Raven2-shrink) [Capable of 5nm/3nm FinFET perf]
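
As a quick sanity check on those Stoney-shrink numbers, here is a small Python sketch of the implied area ratios. It assumes the 125 mm2 figure in the list above is the 28nm Stoney die, and the 100 mm2 and 53 mm2 values are the estimates from this post, not measured die sizes.

```python
# Die areas in mm^2 as quoted in the post; the shrink figures are estimates.
STONEY_28NM_MM2 = 125.0  # assumed to be the 28nm Stoney (FT4) die
SHRINK_ESTIMATES_MM2 = {"22FDX": 100.0, "12FDX": 53.0}


def area_ratio(new_mm2: float, old_mm2: float = STONEY_28NM_MM2) -> float:
    """Return the shrunk area as a fraction of the original area."""
    return new_mm2 / old_mm2


for node, area in SHRINK_ESTIMATES_MM2.items():
    print(f"{node}: {area:.0f} mm2 is {area_ratio(area):.1%} of the 28nm die")
```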

Also, Lenovo isn't selling in the people's interest anyway:
^== $299 in web
^== $319 in web
^== $214 in web (the only one remotely sane for the price, but heck, I'm going out on a limb that it should be $149.99 or lower)

In all cases it is the full part for 28nm. However, it is a salvaged part for 14nm: you either get no SMT or no dual channel, and even then I wouldn't trust Lenovo to actually do dual channel. What SMT is actually meant to do probably means the FT5 model will be better. :mad:

Whacky issues at Malta as it goes for more mature nodes. Or, go for stable Dresden as it gets 12FDX up.

Apparently, there is LPDDR5 there.

And there is this with Synopsys:
LPDDR4 multiPHY V2 - GF 22FDX
LPDDR4 multiPHY V2 - GF 22FDX18 for Automotive Grade 1
125 deg C ambient operating right there.
 

Asterox

Golden Member
May 15, 2012
1,026
1,775
136

I will take a salvaged 2/2 or 2/4 14nm Zen APU any day + a slap in the face.

Are we now complaining about Zen-based 6W Athlons vs. a full, great Excavator 28nm 6W TDP APU? :p

 

NostaSeronx

I will take salvaged 2/2 or 2/4 14nm Zen APU any day+a slap in the face.

Are we now complaining about Zen based 6W Athlons vs full great Excavator 28nm 6W TDP APU.
It is still cheaper to buy the Excavator part. It would be even cheaper for a 22FDX part or a 12FDX part w/ the same spec as the Excavator. However, there is significant room to remove the server-orientated optimization in XV, producing a cheap core on a controllable (SOI/channel width, SOI length, SOI depth, etc.) low-risk/low-step/extremely-high-volume process.

Stoney Ridge's highest-end part is still sub-$30 and the 6W parts don't even touch $20, while Pollock and Dali 6W parts are more evenly spread in the $40-$60 range.

Since GlobalFoundries owns a bit of 15h and successive CMT technology, it would make sense for the core to be licensable at GlobalFoundries, especially with the 7th WSA defaulting to the OG WSA on node work by AMD. (22FDX so far has been done via contract/ODA, while 12FDX has been internal at AMD.)

Also, in my opinion, if going for absolute performance, Lucienne at 4.5W will probably provide 8 cores/7 CUs at a greater bang-for-buck than Dali/Pollock.
 

amd6502

Nosta, I think Stoney as it is is on its way out. A process redesign alone isn't going to cut it, as dual-thread demand is already a bit niche and it will only get more so over the next few years. A Stoney refresh isn't going to compete with cheap 28nm MediaTek quad-core tablets/2-in-1s, either in price or in perf/watt, and it is in about the same performance range. So that really limits things and explains why Dr. Su minimized investments for the extreme budget end.

Unless it makes sense to tinker and put out a semi-experimental big-little hybrid core and work things out (design and software pipecleaner) on a cheap project with a node like 28nm or 22fdx.
 

NostaSeronx

Nosta, I think Stoney as it is is on its way out. A process redesign alone isn't going to cut it, as dual-thread demand is already a bit niche and it will only get more so over the next few years. A Stoney refresh isn't going to compete with cheap 28nm MediaTek quad-core tablets/2-in-1s, either in price or in perf/watt, and it is in about the same performance range. So that really limits things and explains why Dr. Su minimized investments for the extreme budget end.

Unless it makes sense to tinker and put out a semi-experimental big-little hybrid core and work things out (design and software pipecleaner) on a cheap project with a node like 28nm or 22fdx.
Fam15h is pretty much done; I've been implying that another family w/ CMT improvements is planned. Atom Bonnell is the earliest close example and Centaur's CNS is the latest close example of a unified execution core. Then the front-end and back-end are expanded across all planned execution cores, etc. However, unlike Fam15h, which was uplifted to do server work w/ long-slow interconnects/memory, the new core is client-orientated w/ short-fast interconnects/memory. That means fewer in-flight instructions and smaller TLB, BTB, L1i, L1d, and L2 sizes, which can lead to a better Perf/Watt envelope than with fatty XV.

No newer designs will have less than the above, and FT5's single memory channel.

Stoney Shrink is mainly for example purposes. Given Raven(in Dali) 3150G and Pollock FT5(AMD 3015e); new target is quad-core/vega 3 in/around Bobcat's ~75 mm sq target. This only points to 12FDX(higher performance) unless the execution core+module is aggressively density focused for 22FDX(lower performance).

12FDX imho is the winner as the slides for it imply it as the successor to AMD's 28nm/14nm @ GloFo. The arrow from 14nm to 12FDX also helps.

Cheaper is usually better...
BCM2711 w/ 8 GB LPDDR4-3200 @ $75 > S922X Rev. C w/ 4 GB DDR4-2640 @ $79 > R1606G/R1505G 4 GB DDR4-2400 @ $378/333

The 14nm AMD part is so expensive that the planned Renoir SBCs I have seen are actually cheaper than it.
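
For a rough feel of the gap, here is a small Python sketch that turns the board prices above into dollars per GB of bundled RAM. It is a crude metric and only a sketch: it assumes the "$378/333" pair maps to R1606G and R1505G in that order, and it ignores everything else about the boards (CPU, I/O, memory speed).

```python
# (name, RAM in GB, price in USD) as listed in the post above.
# Assumption: "$378/333" is R1606G at $378 and R1505G at $333.
BOARDS = [
    ("BCM2711", 8, 75.0),
    ("S922X Rev. C", 4, 79.0),
    ("R1606G", 4, 378.0),
    ("R1505G", 4, 333.0),
]


def usd_per_gb(ram_gb: int, price_usd: float) -> float:
    """Board price divided by the amount of bundled RAM."""
    return price_usd / ram_gb


# Cheapest per GB first.
ranked = sorted(BOARDS, key=lambda b: usd_per_gb(b[1], b[2]))
for name, ram_gb, price in ranked:
    print(f"{name}: ${usd_per_gb(ram_gb, price):.2f} per GB")
```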
 

tempytempytemp

Junior Member
Mar 31, 2021
2
1
36
Does anyone have a dieshot of Stoney Ridge?

I can't tell if Stoney Ridge is its own silicon, or if it's just Bristol Ridge with a deactivated module.

Edit: Found this article claiming the Stoney Ridge die is 124 sq. mm, while the Bristol Ridge die is 250, but it's the only source I have.
 

NTMBK

Lifer
Nov 14, 2011
10,208
4,940
136
Does anyone have a dieshot of Stoney Ridge?

I can't tell if Stoney Ridge is its own silicon, or if it's just Bristol Ridge with a deactivated module.

Edit: Found this article claiming the Stoney Ridge die is 124 sq. mm, while the Bristol Ridge die is 250, but it's the only source I have.

This article: https://www.anandtech.com/show/10362/amd-7th-generation-apu-bristol-ridge-stoney-ridge-for-notebooks has slides with marketing images of the BR and SR chips. They're renderings, not photos, but you can see that SR is meant to be significantly smaller.

Edit: Also see this Reddit post.
 

amd6502

My friend lost a motherboard due to cranberry juice and I spent fun times fixing her laptop. Well, the only fun thing about it was taking the heat sink off and taking this die shot:
[Image: Stoney die shot (Lenovo ideaPad)]

It's pretty small, about the size of a dime. About half of what Kaveri was, or something like roughly 120mm2. You can find the exact number on cpu-world.com
 

burninatortech4

Senior member
Jan 29, 2014
661
368
136
Does anyone have a dieshot of Stoney Ridge?

I can't tell if Stoney Ridge is its own silicon, or if it's just Bristol Ridge with a deactivated module.

Edit: Found this article claiming the Stoney Ridge die is 124 sq. mm, while the Bristol Ridge die is 250, but it's the only source I have.

Stoney is its own die. See the package diagram here: https://en.wikichip.org/wiki/amd/packages/ft4

This article: https://www.anandtech.com/show/10362/amd-7th-generation-apu-bristol-ridge-stoney-ridge-for-notebooks has slides with marketing images of the BR and SR chips. They're renderings, not photos, but you can see that SR is meant to be significantly smaller.

Edit: Also see this Reddit post.

There's a first! Someone shared an old reddit post of mine :)
 

tempytempytemp

So... are A6-9220C and A4-9120C the last of the excavators?

I was kinda hoping for the rumored dieshrink. I like my 9220C-powered 14w.
 

NostaSeronx

So... are A6-9220C and A4-9120C the last of the excavators?

I was kinda hoping for the rumored dieshrink. I like my 9220C-powered 14w.
The last of the Family 15h architectures. The rumored dieshrink was speculation on the capabilities of 22FDX.

Which can support direct retapeouts from 28nm/32nm/40nm/45nm with continuous transistor length from its planar transistor structure.
It can also do a shrink new tapeout with lower complexity than that of shrinking it down to 14LPP/12LP/12LP+.

Both of which help keep the price below the ~$36 price of consumer Stoney Ridge and previous-gen Sempron Kabini products.


With that, my current folder of the hunt of the x2FDX/Embedded successor of the Stoney Ridge series has led me away from x86-64.

CPU Design Engineer 2 for AMD @ Orlando, FL:
The Person

A self-motivated CPU enthusiast. An effective team player who focuses on collaboration, team building, mentoring, and furthering team success.

Key Responsibilities
  • Work with a team of architects for developing new innovative embedded RISC-V CPUs.
  • Identify complex technical problems, break them down, summarize multiple possible solutions, and help the team make advances in Performance, Power, and silicon Area (PPA).
  • Understand and improve existing and emerging graphics/compute paradigms and new APIs employing RISC-V Processors.
  • Work with subsystem architects to understand bottlenecks and other problems where an embedded processor will improve the performance.
Preferred Experience
  • RISC-V
  • Verilog
  • VHDL
  • C++/Python
 

amd6502

The last of the Family 15h architectures

Sadly. Phased out for native 2c Athlon (14nm) that now can do 6W tdp, which is comparable to Bristol Ridge except with tiny iGPU, and much better power efficiency.

I guess the ultra bottom end will be 2c atoms, quadcore acorn soc's, and whatever bit of stock is left of the Stoney dies.
 

amd6502

CPU Design Engineer 2 for AMD @ Orlando, FL:
...
  • Work with a team of architects for developing new innovative embedded RISC-V CPUs.
  • ....

Is this broadening the portfolio in case the Softbank ARM deal goes through with Nvidia?
 

NostaSeronx

Is this broadening the portfolio in case the Softbank ARM deal goes through with Nvidia?
I don't think so. AMD would find a way to get ARM just like they did for x86, if customers wanted AMD to compete with ARM server processors within the ARM space.

I think RISC-V is more in line with the approach of the AMD Am29k series (RISC-I derived?), with a more modern focus and potentially landing in 2022-2024 in regards to the GlobalFoundries WSA.

The instant a harvested Renoir-derived 6W Embedded part, which is only 6mm2 larger than Dali/Pollock, hits, I suspect AMD will EOL Embedded Dali/Pollock just as they did with these embedded versions:
 

moinmoin

in regards for GlobalFoundries WSA
On that note, I wonder when we'll get info about the now-current agreement. The last one from 2019 should have run out last month, and the IR places all still show no new agreement for the remaining 3 years of the WSA.
 

NTMBK

The last of the Family 15h architectures. The rumored dieshrink was speculation on the capabilities of 22FDX.

Which can support direct retapeouts from 28nm/32nm/40nm/45nm with continuous transistor length from its planar transistor structure.
It can also do a shrink new tapeout with lower complexity than that of shrinking it down to 14LPP/12LP/12LP+.

Both of which help keep the price below the ~$36 price of consumer Stoney Ridge and previous-gen Sempron Kabini products.


With that, my current folder of the hunt of the x2FDX/Embedded successor of the Stoney Ridge series has led me away from x86-64.

CPU Design Engineer 2 for AMD @ Orlando, FL:
The Person

A self-motivated CPU enthusiast. An effective team player who focuses on collaboration, team building, mentoring, and furthering team success.

Key Responsibilities
  • Work with a team of architects for developing new innovative embedded RISC-V CPUs.
  • Identify complex technical problems, break them down, summarize multiple possible solutions, and help the team make advances in Performance, Power, and silicon Area (PPA).
  • Understand and improve existing and emerging graphics/compute paradigms and new APIs employing RISC-V Processors.
  • Work with subsystem architects to understand bottlenecks and other problems where an embedded processor will improve the performance.
Preferred Experience
  • RISC-V
  • Verilog
  • VHDL
  • C++/Python

Note the mention of graphics in there. This is designing cores that are embedded into the GPU, not a standalone CPU product.
 