Less wide yet faster - ARM A73


DrMrLordX

Lifer
Apr 27, 2000
21,634
10,847
136
what about 22nm fdsoi, that is cheaper than finfet and better than 28nm bulk. I think now is the time for desktop arm to start competing.

I've been wondering for a while why nobody but IBM (that I know of) uses this process node. POWER8 seemed to do just fine on it. Lots of clockspeed potential. AMD never bothered with it when they bloody well could have, and now that GF controls the process, they're moving on to 14lpp instead of following in IBM's footsteps to use the successor to 22fdsoi (which is . . . 14nm SOI finfet?). I've always been told that 22fdsoi was too expensive. Too expensive compared to what?

The Linux/Android kernel has tonnes of drivers, and this new uarch and new ecosystem probably wouldn't want to support legacy devices.

Linux support for ARM is kind of sketchy. The problem is that Linux has to support different implementations of the ARM IP which is more difficult than supporting x86 CPUs. If there were more standardization of ARM CPUs, it would be less of an issue.
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
I've been wondering for a while why nobody but IBM (that I know of) uses this process node. POWER8 seemed to do just fine on it. Lots of clockspeed potential. AMD never bothered with it when they bloody well could have, and now that GF controls the process, they're moving on to 14lpp instead of following in IBM's footsteps to use the successor to 22fdsoi (which is . . . 14nm SOI finfet?). I've always been told that 22fdsoi was too expensive. Too expensive compared to what?



Linux support for ARM is kind of sketchy. The problem is that Linux has to support different implementations of the ARM IP which is more difficult than supporting x86 CPUs. If there were more standardization of ARM CPUs, it would be less of an issue.

allegedly 22nm fdsoi is supposed to be cheaper than finfet, and yeah 14nm fdsoi would have been an interesting process for amd to use.

as for the arm support in the linux kernel, I know you're a smarter guy than me but the uarch support is there afaik, they just need to standardize the booting process a la x86.

Aside, not sure if support is on a per core -read mongoose, krait, etc- or uarch -read v7, v8, etc- basis.

My point was that the support was there for a viable commercial product.
 

Hans de Vries

Senior member
May 2, 2008
321
1,018
136
www.chip-architect.com
nexus2cee_ARM10nm.jpg


So Mercury has become the A35(?), and Artemis the A73
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
allegedly 22nm fdsoi is supposed to be cheaper than finfet, and yeah 14nm fdsoi would have been an interesting process for amd to use.

as for the arm support in the linux kernel, I know you're a smarter guy than me but the uarch support is there afaik, they just need to standardize the booting process a la x86.

Aside, not sure if support is on a per core -read mongoose, krait, etc- or uarch -read v7, v8, etc- basis.

My point was that the support was there for a viable commercial product.

It is really device dependent. Raspberry Pi has really good support and can boot mainline. A lot of dev boards can boot mainline. They boot via U-Boot, etc. So that's not a problem. The problem is mainly phones and tablets. You'll find the same issue with x86 Android phones and tablets.
 

Lepton87

Platinum Member
Jul 28, 2009
2,544
9
81
Is anything known about that Ares core? Is it the successor to the A72 like the A73 is to the A17? Still 3-wide?
 

StrangerGuy

Diamond Member
May 9, 2004
8,443
124
106
Makes perfect sense. But I think that the A72 was already clean and cheap enough.

Anyway, it is pretty elegant to extract such performance from a narrower CPU.



And the 10nm process is not looking very good: it will come in time, but will not bring considerable performance improvements over 14/16nm.

AFAIK none of the x86 OoO uarchs, with the possible exception of NetBurst, were decode limited; the bottlenecks were always found somewhere else.

And I do think Qualcomm's days of custom ARM are numbered, when it's already kinda hard for Kryo to justify itself over A72 now.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Didn't work so well on desktop. Why are they doing this for mobile?

There hasn't really been a desktop-oriented two-wide CPU since K6, unless you want to count the cat and Atom cores. Bulldozer doesn't qualify.

On desktop you can realistically use as much as 50+W on a single core, and you generally don't become thermally limited until you have at least two cores active. On phones, especially smaller ones, you can easily push the limits of sustainable power with one core, while still having clock headroom in the microarchitecture. So it may make more sense to have a more narrow core if it improves perf/W more than enough to allow the difference to be made up in clock speed. I think that's what we're seeing here. That, and there would be a significant design time overhead and potential for new risk in evolving A73 to become three wide.
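To make the trade-off above concrete, here is a rough sketch, not anything from ARM. It assumes performance is simply IPC times clock, and that core power follows a toy model of a constant times frequency raised to some exponent. The function names and every number (IPC, watts at 1 GHz, power budgets, Fmax) are made up for illustration.

```python
def sustained_clock_ghz(power_budget_w, watts_at_1ghz, fmax_ghz, exponent=1.0):
    """Highest clock (GHz) that fits the power budget, capped at the design's Fmax.

    Assumed toy model: power = watts_at_1ghz * f_ghz ** exponent.
    """
    power_limited = (power_budget_w / watts_at_1ghz) ** (1.0 / exponent)
    return min(power_limited, fmax_ghz)


def sustained_perf(ipc, power_budget_w, watts_at_1ghz, fmax_ghz, exponent=1.0):
    """Relative performance = IPC * sustained clock (arbitrary units)."""
    return ipc * sustained_clock_ghz(power_budget_w, watts_at_1ghz, fmax_ghz, exponent)


# Hypothetical cores: the narrow one gives up 10% IPC but needs 20% less power per GHz.
WIDE = dict(ipc=1.00, watts_at_1ghz=0.75, fmax_ghz=2.8)
NARROW = dict(ipc=0.90, watts_at_1ghz=0.60, fmax_ghz=2.8)

if __name__ == "__main__":
    # Budgets are illustrative: one where power limits the clock, one where Fmax does.
    for label, budget in (("phone-like budget", 1.5), ("desktop-like budget", 50.0)):
        wide = sustained_perf(power_budget_w=budget, **WIDE)
        narrow = sustained_perf(power_budget_w=budget, **NARROW)
        print(f"{label}: wide={wide:.2f} narrow={narrow:.2f}")
```

Under these assumed numbers the narrow core comes out ahead when the power budget is what limits the clock, and the wide core wins once both can run at Fmax. A steeper exponent (voltage having to rise with frequency) shrinks the narrow core's advantage and can reverse it.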

It seems like A73 benefits from some symmetry in the design; apparently that simplifies some of the pipeline stages (perhaps why the branch scheduler accepts 2 ops/cycle like the other schedulers, which would normally seem like overkill). Going three wide could distort that.

I'd like some more insight as to exactly what ARM is talking about when they refer to instruction fusing. Going from two-wide decode to four-wide (uop?) input to the schedulers seems conspicuous. ARM instructions should not normally crack all that far, especially not in a way that can be sustained by the backend. Does it mean a single decoder can ever handle multiple instructions in one cycle, like branch + op?
 

DrMrLordX

Lifer
Apr 27, 2000
21,634
10,847
136
allegedly 22nm fdsoi is supposed to be cheaper than finfet, and yeah 14nm fdsoi would have been an interesting process for amd to use.

It'll be fun to watch and see what IBM can do with POWER9. POWER8 can hit clockspeeds about as high as Piledriver, though the power consumption on those things . . . wheeeew.

as for the arm support in the linux kernel, I know you're a smarter guy than me

Jury's still out on that one.

but the uarch support is there afaik, they just need to standardize the booting process a la x86.

Oh yeah it's there. I really haven't looked into the issues with getting various ARM platforms to boot mainline Linux, but if it's just a boot issue and nothing to do with variations in instruction set support etc. then it should not be a big issue. I know that a few months ago Cavium ThunderX support got to the point where it can boot mainline Linux. No idea on how well Seattle is supported.

Aside, not sure if support is on a per core -read mongoose, krait, etc- or uarch -read v7, v8, etc- basis.

The ARM ecosystem also has a lot of fixed function hardware, and I'm wondering how much support there is for that? Of course, newer x86 products aren't much different nowadays . . .

My point was that the support was there for a viable commercial product.

Yeah it's there. It drives Torvalds nuts though. He's actually threatened ARM SoC designers before. I think the situation is improving somewhat.

It is really device dependent. Raspberry Pi has really good support and can boot mainline. A lot of dev boards can boot mainline. They boot via U-Boot, etc. So that's not a problem. The problem is mainly phones and tablets. You'll find the same issue with x86 Android phones and tablets.

It was a problem with ThunderX too. I think they have that ironed out now.
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
It'll be fun to watch and see what IBM can do with POWER9. POWER8 can hit clockspeeds about as high as Piledriver, though the power consumption on those things . . . wheeeew.



Jury's still out on that one.



Oh yeah it's there. I really haven't looked into the issues with getting various ARM platforms to boot mainline Linux, but if it's just a boot issue and nothing to do with variations in instruction set support etc. then it should not be a big issue. I know that a few months ago Cavium ThunderX support got to the point where it can boot mainline Linux. No idea on how well Seattle is supported.



The ARM ecosystem also has a lot of fixed function hardware, and I'm wondering how much support there is for that? Of course, newer x86 products aren't much different nowadays . . .



Yeah it's there. It drives Torvalds nuts though. He's actually threatened ARM SoC designers before. I think the situation is improving somewhat.



It was a problem with ThunderX too. I think they have that ironed out now.
I think Phoronix benched a POWER8 dev platform http://www.phoronix.com/scan.php

If they can bump production, drop it to $999 and partner with a Linux distro... well, that's a stretch.
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
Yes please. Time for a decent pair of CPU cores to start appearing at the low-mid range instead of a quad A53 or those crazy eight A53 implementations.

I find it interesting they managed to outperform the A72 while reducing complexity and being more efficient with the resources at hand while improving perf/w. Nice engineering.

Let's hope someone can do a cheap yet worthwhile SoC around 2xA73+4xA53 in 28nm that can show up in something along the lines of the Moto G and similar phones.

Yeah, that quad or octa-core A53 crap is really annoying. I would actually take a dual A72 over all of them. If the display already uses 50% of the battery, you soon hit diminishing returns from saving 20% on the CPU. It may be 20% of the CPU's share but only 5% for the total end product. Just look at Apple: they don't suffer on battery life even though they do not have big.LITTLE.
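The 20% versus 5% point can be worked through as simple arithmetic. A minimal sketch, assuming the CPU accounts for 25% of total device power (an assumed share, picked only so that the two figures above line up; the function names are hypothetical as well):

```python
def total_power_saved(cpu_share, cpu_saving):
    """Fraction of total device power saved when only the CPU becomes more efficient."""
    return cpu_share * cpu_saving


def battery_life_gain(cpu_share, cpu_saving):
    """Relative battery life increase for the same battery capacity."""
    remaining_power = 1.0 - total_power_saved(cpu_share, cpu_saving)
    return 1.0 / remaining_power - 1.0


if __name__ == "__main__":
    cpu_share = 0.25   # assumed: CPU draws a quarter of total device power
    cpu_saving = 0.20  # the 20% CPU saving mentioned above
    print(f"total power saved: {total_power_saved(cpu_share, cpu_saving):.1%}")
    print(f"battery life gain: {battery_life_gain(cpu_share, cpu_saving):.1%}")
```

The smaller the CPU's share of the total, the less a CPU-only efficiency gain shows up in battery life, which is the diminishing-returns argument in numbers.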

About the engineering: well, the A72 was made with servers in mind. Strip that out and you are more efficient for where the chip is actually used. But yeah, that's how it should be done. Intel should take Core M and remove some shit; do we need AVX on smartphones? No, not really. And they would have a winner.
 

Nothingness

Platinum Member
Jul 3, 2013
2,420
749
136
Intel should take core-m, remove some shit like do we need AVX on smartphones? No, not really. And they would have a winner.
I don't think it would work: they'd have to reduce the frequency so much to fit smartphone thermals, that it likely wouldn't be faster than the high-end competition.
 

MarkizSchnitzel

Senior member
Nov 10, 2013
403
31
91
Just look at Apple: they don't suffer on battery life even though they do not have big.LITTLE.

iPhone 6s battery life in real-world use is atrocious. Barely a day, or only a few hours of SOT.

That does not mean it's the SoC's fault though (the battery is small, so there's that).
 

StrangerGuy

Diamond Member
May 9, 2004
8,443
124
106
iPhone 6s battery life in real-world use is atrocious. Barely a day, or only a few hours of SOT.

That does not mean it's the SoC's fault though (the battery is small, so there's that).

The main culprit is the LCD on the 6S, because the SE has better battery life than the 6S+ with the same SoC running on an even smaller battery.

BTW the iPhone facebook app is known to be a battery-sucking POS.
 

NTMBK

Lifer
Nov 14, 2011
10,237
5,020
136
Narrower but higher frequency is an interesting design choice, especially for low-power mobile. Total and complete opposite of Apple's approach (very wide CPU, relatively low Fmax).

I wonder how much of the frequency gain is architectural, and how much is process? The comparison slide talking about frequency compares a 16nm A72 with a 10nm A73. Or given that A72 is power/thermal limited on 28nm, a more efficient A73 could reach higher clocks in the same limits (while neither design is actually close to their architectural frequency limit).
 

DrMrLordX

Lifer
Apr 27, 2000
21,634
10,847
136
I think Phoronix benched a POWER8 dev platform http://www.phoronix.com/scan.php

If they can bump production, drop it to $999 and partner with a Linux distro... well, that's a stretch.

I wouldn't mind seeing a 1c/8t or 1c/10t (I think POWER8 can go to 10 "chiplets" per core) system in that price range. Not sure if any of the OpenPOWER partners can make it happen at that price. If they could, then at what clockspeed? And what else would you get with the system? There's that whole CAPI interface, and NVLink is featured as a part of the platform.

https://en.wikipedia.org/wiki/OpenPOWER_Foundation

All that aside, I'm not sure it would be operating in the same space as workstation/server ARM or anything like that.
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
The saddest part of it all is that chip design isn't so expensive. IIRC Adapteva did the design for a few hundred thousand, then went to Kickstarter for the production run.

A cool 5 mil should get the ball rolling.
 

Nothingness

Platinum Member
Jul 3, 2013
2,420
749
136
The saddest part of it all is that chip design isn't so expensive. IIRC Adapteva did the design for a few hundred thousand, then went to Kickstarter for the production run.
Indeed (simple) chip design is not that expensive, but validation costs are large. You need a lot of very expensive machines (emulators + cluster for simulation).

A cool 5 mil should get the ball rolling.
That'd only cover masks :)

http://www.adapteva.com/andreas-blog/semiconductor-economics-101/
 
Apr 30, 2015
131
10
81
A73 is going into mid-range phones as hexa-core (2xA73 + 4xA35 or 4xA53), premium phones as octa-core (4xA73 + 4xA53), tablets, clamshells, DTVs and other consumer devices.
Production processes are 28nm, 14/16nm and 10nm.
10 companies are developing A73 SoCs.
First premium-smartphones in early 2017.

https://community.arm.com/groups/pr...ves-efficiency-performance-for-mobile-designs
refers.

It seems that algorithm-design is the key to the processor design, instantiated in silicon, of course. The small size of the A73 seems amazing, given its processing power. There are successors in the pipeline, with a yearly cadence. By 2020 they should have won a slice of the notebook market, perhaps the majority of it.

Mali-G71 GPU, CCI-550, and other IP enable 4K120 video with 10-bit depth, in 2017 I think.
AR and VR devices are coming.
 
Apr 30, 2015
131
10
81
nexus2cee_ARM10nm.jpg


So Mercury has become the A35(?), and Artemis the A73

It seems that Artemis, A73, has been promoted since that chart was leaked, as the A72's successor.
But what is Ares? - and what is Prometheus?

Last year and this, ARM have been recruiting for their server processor design teams. The gestation period is about 7 years to silicon in production, according to ARM.
 
Apr 30, 2015
131
10
81
The saddest part of it all is that chip design isn't so expensive. IIRC Adapteva did the design for a few hundred thousand, then went to Kickstarter for the production run.

A cool 5 mil should get the ball rolling.

I think I saw a figure of $30+ million for 10nm, to put a SoC into production; I think that this is the cost after tape-out, which I understand to mean production of the first RTL design.
 

krumme

Diamond Member
Oct 9, 2009
5,952
1,585
136
We haven't touched on this from Andrei, but IMO it's one of the most interesting perspectives:

"If the A73 is able to hit all of its promised targets then it leaves me with some doubt on what this means for vendors who are currently using their own microarchitectures. Apple has proven that they’re able to execute and deliver outstanding performance at high efficiency, but vendors such as Qualcomm and Samsung aren’t in an as good position. We'll have a more in depth discussion about Snapdragon 820’s Kryo and Exynos 8890’s Mongoose cores in an upcoming deep dive article, but both microarchitectures have trouble in terms of differentiating themselves in terms of performance and power compared to ARM’s own current designs, casting some doubt on how they'll be able to evolve and compete against SoCs using Artemis cores."

What do you think about it and the implications?