
News: Big movements afoot in ARM server land


soresu

Golden Member
Dec 19, 2014
I was never really aware of just how bad Jaguar was, or rather is.
Only in the context of a high-end console, that is.

For low-power PC APUs it was fine - had Bulldozer not been so terrible, I could definitely see it being developed further, but alas.
 

piokos

Senior member
Nov 2, 2018
It doesn't have to be "best at everything", it has to be "Best in the most common use case"
And can you point to such a "most common use case"? :)
Again, there's nothing that makes ARM inherently bad as an ISA.
There's nothing inherently bad about x86 either (clearly). This isn't really a trigger for a revolution. ;)

Because even if ARM is 10% better than x86 (in performance vs. TCO), that may not be enough to cover the cost of transition, because hardware isn't that expensive. Software is.
And you have to tweak, rebuild and test everything just to make it work - and ideally rewrite from scratch to make the most of that 10% potential.
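The transition-cost argument can be made concrete with a back-of-envelope model. To be clear, every figure below is invented purely for illustration; real TCO numbers vary enormously by workload and fleet:

```python
# Back-of-envelope model of the "10% better isn't enough" argument.
# All figures are hypothetical illustrations, not real TCO data.

def breakeven_years(annual_hw_cost, hw_savings_pct, migration_cost):
    """Years of hardware savings needed to pay back a one-off porting cost."""
    annual_savings = annual_hw_cost * hw_savings_pct
    return migration_cost / annual_savings

# Say a fleet costs $10M/year in hardware+power, the new ISA saves 10%,
# but porting, tweaking and re-testing the software stack costs $5M once:
years = breakeven_years(10_000_000, 0.10, 5_000_000)
print(f"break-even after {years:.1f} years")  # break-even after 5.0 years
```

The point the model makes is simply that a modest hardware advantage is amortized against a fixed software-porting bill, so the payback period can easily exceed a server refresh cycle.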

Also, ARM is not a coherent architecture.
First of all, you have versions which aren't fully compatible (but which very much coexist in parallel).
Second, besides the core ARM ISA you have a bunch of optional extensions, which are often omitted. And some of them are flexible, like SVE: one company can implement 512-bit vectors and another 1024-bit. Because why not?
ARM was designed for embedded devices, so compatibility wasn't a top priority. You take the blocks that ARM provides and choose what you need. This is one of the reasons ARM can be made light and efficient: you can make a processor highly optimized for your needs, whereas with x86 you pretty much work with what you get.
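For context on the SVE point: SVE code is written in a vector-length-agnostic style, where the loop queries the implementation's width at runtime rather than hard-coding it. Here is a rough plain-Python sketch of that loop structure - purely illustrative; real SVE uses predicated hardware instructions and CNT-style length queries, not Python:

```python
# Sketch of SVE's vector-length-agnostic (VLA) loop structure.
# A real SVE binary queries the hardware vector length at runtime;
# here we pass it in explicitly, to show the same loop works
# unchanged on a 512-bit or a 1024-bit implementation.

def vla_add(a, b, vector_bits):
    lanes = vector_bits // 32          # 32-bit float lanes per vector
    out = []
    i = 0
    while i < len(a):
        # predication: the final partial vector simply has inactive lanes
        chunk = min(lanes, len(a) - i)
        out.extend(x + y for x, y in zip(a[i:i+chunk], b[i:i+chunk]))
        i += lanes
    return out

data = list(range(100))
# identical results regardless of the implementation's vector width
assert vla_add(data, data, 512) == vla_add(data, data, 1024) == [2 * x for x in data]
```

So differing widths need not break binary compatibility, though (as discussed later in the thread) the chosen width can still affect performance and memory behaviour.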

ARM processors in Fugaku (leader of Top500), on AWS A1 instance (Graviton), in your smartphone and in the incoming Macs are different. Many programs written specifically for one of them will probably not work on another without refactoring.
x86 processors in Selene (7th on Top500), on AWS F1 instance, in your laptop and in current Macs are almost the same.

It's vastly different in the homely, friendly and unified world of x86, where Intel and AMD support pretty much the same instructions, so you can basically move your Windows installation from Intel to AMD (or replace a 2015 CPU with a 2020 CPU) and it will work perfectly.
And when a differentiator appears, like Intel's AVX-512, you get a shitstorm lasting years. Fujitsu implemented 512-bit instructions in their ARM chip. No one gives a rat's ass. You wrote a program for Fujitsu hardware? Run it on Fujitsu hardware. F... you very much. :)
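For what it's worth, the usual way x86 software copes with a differentiator like AVX-512 is runtime dispatch: detect the feature once at startup and pick an implementation. A minimal sketch of the pattern - the detection is stubbed out here; real code would read CPUID leaves or use compiler builtins:

```python
# Minimal sketch of runtime ISA-feature dispatch, the usual way x86
# software copes with differentiators like AVX-512. The capability
# check is a stub; real code would query CPUID.

def detect_avx512():
    return False  # stub: pretend this CPU lacks AVX-512

def dot_scalar(a, b):
    return sum(x * y for x, y in zip(a, b))

def dot_avx512(a, b):
    # placeholder standing in for a wide-vector implementation
    return sum(x * y for x, y in zip(a, b))

# choose once at startup, call through the chosen function ever after
dot = dot_avx512 if detect_avx512() else dot_scalar
print(dot([1, 2, 3], [4, 5, 6]))  # 32
```

The same pattern applies equally to optional ARM extensions; the difference the thread is arguing about is how fragmented the feature matrix is, not the mechanism.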
 

soresu

Golden Member
Dec 19, 2014
So I can't find this (might be the long work hours... lol), when can we actually buy chips?
These are server/datacenter/enterprise cores so no consumer will ever be buying them - that's the point of the separate Neoverse branding rather than Cortex.

OTOH they will definitely end up in Amazon AWS servers and many others.
 

name99

Senior member
Sep 11, 2010
No it doesn't, you just invented that.
One could imagine ways this could happen. For example Apple might conclude (I have no idea, but they presumably know the stats from Arcade and what developers tell them) that it would be worth creating an Apple console. (Apple TV semiPro?)
Basic aTV functionality, of course, but
- given as much power budget as feasible WITHOUT A FAN
- substantially larger SSD
- guts consisting of Apple's high-end Mac cores and GPU (+NPU, which might have gaming value...)

Point is suppose Apple launches something like that and it provides the visual quality of current consoles, but without the noise level. Would that tempt a substantial number of light console users, people who only care about one or two games (which are present on Arcade) and who appreciate a smaller, quieter, perhaps cheaper device?

I know, I know, there's a certain crowd of "hardcore" gamers who hate Apple, insist on having access to fifty different games, and would never buy such a device. That's not important. What's important is whether it's attractive enough to move 20, 30, 40% of people off the platform, to the extent that it's worth it to Sony and MS to switch chips.
(Especially since, well duh!, nV will -- assuming purchase goes through -- be happy to supply them with the appropriate chips...)
 

soresu

Golden Member
Dec 19, 2014
Nintendo kept the same lineage of backwards-compatible PowerPC hardware all the way from GameCube to the Wii U, but then ditched it entirely and moved over to ARM for the Switch.
Not quite the same - in that case it was practically the same core, with clock bumps for the Wii and extra cores too for the WiiU.

The PS5/XSX gen, on the other hand, retains compatibility with GCN code but has undergone an overhaul for greater efficiency and much higher clock speeds.

Not to mention adding RT acceleration and all the features modern GPUs have gained since the PS4/XB1 were designed.

Likewise, Zen 2 is ISA-compatible with code written for Jaguar but is an entirely different beast microarchitecturally.

Nintendo's path of backwards compatibility means minimum investment and internal HW change for maximum financial gain - this worked for the Wii but not the WiiU, hence the abandonment.

Doesn't help that the PowerPC core was extremely inefficient IPC-wise by that time - even the two-to-three-year-old A57 was a significant step up from it when the Switch came out.

That said, the Switch GPU doesn't really have much on the WiiU in raw power, just a much more modern feature set - in other words, yes, Nintendo did abandon the old paradigm, but for a pre-existing design already out in the world.

It would not greatly surprise me if Switch 2 were basically just a cut-down variant of Tegra Orin, reusing Nvidia's R&D on it.
 

name99

Senior member
Sep 11, 2010
N1 was originally announced in Feb. 2019, with actual chips becoming available, as you said, in late Q1 - Q2 of 2020. If ARM is only now sharing the V1 architecture with partners, then I wouldn't expect actual chips to be available until Q4 2021, and N2 to be available some time in 2022.
"Availability" means more than just "can random person on the street buy it".
Amazon announced Graviton2 (based on N1) in December last year, and people were able to buy time on it by about March this year.
I expect the same schedule for Graviton 3 (presumably based on V1).

The same schedule compression holds for "ARM in server". It's true that there are some products that haven't been ported to ARM, and never will be; and that there are lagging users, especially in the "one company buying a server for a department" use case.
But Amazon has much of AWS infrastructure available on ARM already, and presumably Graviton 3 and 4 will make switching to these even more compelling in price/performance. At some point Azure will have to offer something similar.

Likewise we know Cloudflare wanted to go this route (until QC screwed them over). I suspect Google and FB are already running internal tests.

The main thing is that Amazon is really our only window into this world. We have no idea how many sales are being made by Ampere, what Google's plans are for upgrading their fleet, what Apple's data center plans are, etc etc.
BUT
the fact that ARM sees enough business in this space (compare with QC, always ready to screw up in the least imaginative way possible) not just to continue with Neoverse, but even to extend the program beyond the prior roadmap into further models, suggests to me that there are lots of customers who have made serious inquiries and recommendations as to what they want going forward. We just see none of this activity because, how would we?
 

name99

Senior member
Sep 11, 2010
And can you point to such a "most common use case"? :)

There's nothing inherently bad about x86 either (clearly). This isn't really a trigger for a revolution. ;)

Because even if ARM is 10% better than x86 (in performance vs. TCO), that may not be enough to cover the cost of transition, because hardware isn't that expensive. Software is.
And you have to tweak, rebuild and test everything just to make it work - and ideally rewrite from scratch to make the most of that 10% potential.

Also, ARM is not a coherent architecture.
First of all, you have versions which aren't fully compatible (but which very much coexist in parallel).
Second, besides the core ARM ISA you have a bunch of optional extensions, which are often omitted. And some of them are flexible, like SVE: one company can implement 512-bit vectors and another 1024-bit. Because why not?
ARM was designed for embedded devices, so compatibility wasn't a top priority. You take the blocks that ARM provides and choose what you need. This is one of the reasons ARM can be made light and efficient: you can make a processor highly optimized for your needs, whereas with x86 you pretty much work with what you get.

ARM processors in Fugaku (leader of Top500), on AWS A1 instance (Graviton), in your smartphone and in the incoming Macs are different. Many programs written specifically for one of them will probably not work on another without refactoring.
x86 processors in Selene (7th on Top500), on AWS F1 instance, in your laptop and in current Macs are almost the same.

It's vastly different in the homely, friendly and unified world of x86, where Intel and AMD support pretty much the same instructions, so you can basically move your Windows installation from Intel to AMD (or replace a 2015 CPU with a 2020 CPU) and it will work perfectly.
And when a differentiator appears, like Intel's AVX-512, you get a shitstorm lasting years. Fujitsu implemented 512-bit instructions in their ARM chip. No one gives a rat's ass. You wrote a program for Fujitsu hardware? Run it on Fujitsu hardware. F... you very much. :)
You seem to imagine that we are still living in 1995, and that software is defined by custom assembly distributed on floppies.
Even IF the differences you described between ARM models mattered (they don't; SVE targets any vector length BY DEFINITION; the other differences are matched by differences within x86 and dealt with in the same way; you seem to be confusing ARMv8 with RISC-V),
the difference between now and 1995 is that
- very little code is assembly.
- distribution happens at much more abstract levels.

Data warehouse companies compile to whatever target they like.
Apple dynamically compiles to the target machine. Yes, it's not perfect, but the Apple Watch transition to 64-bit was pretty spectacular, and their current level of abstraction can easily handle things like optimizing for new targets, even targets that didn't exist when the bitcode was submitted to the App Store.
Android, of course, reaches the same end result (independence from small ISA details) via a different mechanism.
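The common thread in these schemes is distributing programs at a level more abstract than machine code, then lowering them for whatever ISA the device actually has. A toy illustration of that idea - the IR format and opcodes here are entirely made up, standing in for real forms like LLVM bitcode, DEX or .NET IL:

```python
# Toy illustration of ISA-independent distribution: ship a tiny IR,
# then "lower" it on whatever ISA the device has. The IR and its
# opcodes are invented for this sketch (real analogues: LLVM bitcode,
# Android DEX, .NET IL).

IR = [("load", 0), ("load", 1), ("add",), ("ret",)]  # adds its two arguments

def lower_and_run(ir, args, isa="any"):
    """Interpret the IR; the isa string is deliberately unused,
    which is the point: the distributed artifact doesn't care."""
    stack = []
    for op in ir:
        if op[0] == "load":
            stack.append(args[op[1]])
        elif op[0] == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op[0] == "ret":
            return stack.pop()

# same artifact, same result, regardless of the target named
assert lower_and_run(IR, [2, 3], isa="arm64") == lower_and_run(IR, [2, 3], isa="x86_64") == 5
```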

That sort of fluidity is the future, and one day MS will get their act together enough to provide the same thing. (In theory they had it with .NET; maybe one day they'll actually figure out how to use it properly.)
If your primary value is no longer "we're the fastest cheapest option" but "we can run all that obsolete software you have from 30 years ago", that's not a great business proposition going forward.

Business segments never COMPLETELY go away. There are still people ploughing with horses. But they do shrink. x86 is right at its peak (in dollars; its peak as a fraction of the compute market already ended years ago as phones, then IoT, then cars all grew). The next ten years will see x86 shrinking in dollars as one segment after another discovers that switching to ARM was not that hard.
It's Macs today, signage and kiosks tomorrow, then printers, then NAS, one segment at a time.
 

Hitman928

Diamond Member
Apr 15, 2012
"Availability" means more than just "can random person on the street buy it".
Amazon announced Graviton2 (based on N1) in December last year, and people were able to buy time on it by about March this year.
I expect the same schedule for Graviton 3 (presumably based on V1).

The same schedule compression holds for "ARM in server". It's true that there are some products that haven't been ported to ARM, and never will be; and that there are lagging users, especially in the "one company buying a server for a department" use case.
But Amazon has much of AWS infrastructure available on ARM already, and presumably Graviton 3 and 4 will make switching to these even more compelling in price/performance. At some point Azure will have to offer something similar.

Likewise we know Cloudflare wanted to go this route (until QC screwed them over). I suspect Google and FB are already running internal tests.

The main thing is that Amazon is really our only window into this world. We have no idea how many sales are being made by Ampere, what Google's plans are for upgrading their fleet, what Apple's data center plans are, etc etc.
BUT
the fact that ARM sees enough business in this space (compare with QC, always ready to screw up in the least imaginative way possible) not just to continue with Neoverse, but even to extend the program beyond the prior roadmap into further models, suggests to me that there are lots of customers who have made serious inquiries and recommendations as to what they want going forward. We just see none of this activity because, how would we?
Availability in my post essentially means availability for OEMs or deployments to receive chips and start building solutions for the market to use. Amazon is a special case because they are vertically integrated here; they started offering 'beta' systems that you could sign up to use early on and slowly built out instances from there for 'regular' use. General availability wasn't announced until June.

The only way we get any of this info about product adoption is from design win/deployment announcements and financials. Obviously what gets announced is far from a complete list, but even so, up to now the lack of known wins on the ARM side is pretty telling. From the whispers I've heard (not claiming major sources or anything, just some contacts who work around this area), there's been a lot of interest and 'kicking the tires' on ARM server solutions but, in the end, not a lot of actual deployments. Even in Amazon's case, we have no idea what their actual Graviton server utilization is, as far as I've seen.

ARM servers haven't exactly been very attractive until very recently, and even now I don't think the current offerings are in a great place competitively against Rome (or against Intel CPUs in certain corner cases). This next gen, though, looks like it could be very interesting and may finally allow ARM servers to get some actual momentum on their side.
 

NTMBK

Diamond Member
Nov 14, 2011
Not quite the same - in that case it was practically the same core, with clock bumps for the Wii and extra cores too for the WiiU.

The PS5/XSX gen, on the other hand, retains compatibility with GCN code but has undergone an overhaul for greater efficiency and much higher clock speeds.

Not to mention adding RT acceleration and all the features modern GPUs have gained since the PS4/XB1 were designed.

Likewise, Zen 2 is ISA-compatible with code written for Jaguar but is an entirely different beast microarchitecturally.

Nintendo's path of backwards compatibility means minimum investment and internal HW change for maximum financial gain - this worked for the Wii but not the WiiU, hence the abandonment.

Doesn't help that the PowerPC core was extremely inefficient IPC-wise by that time - even the two-to-three-year-old A57 was a significant step up from it when the Switch came out.

That said, the Switch GPU doesn't really have much on the WiiU in raw power, just a much more modern feature set - in other words, yes, Nintendo did abandon the old paradigm, but for a pre-existing design already out in the world.

It would not greatly surprise me if Switch 2 were basically just a cut-down variant of Tegra Orin, reusing Nvidia's R&D on it.
True enough on the CPU side- though they no doubt had to pay for a lot of customisation to enable multi-core support for that old PowerPC core. But the GPU was heavily customised- I can't find the old forum post, but I remember someone analysing the die shot. AMD had to do a ton of work to build in backwards compatibility with the Flipper GPU from the GameCube, combined into a totally different VLIW4 GPU. It certainly wasn't a path of least resistance!
 

itsmydamnation

Platinum Member
Feb 6, 2011
These are server/datacenter/enterprise cores so no consumer will ever be buying them - that's the point of the separate Neoverse branding rather than Cortex.

OTOH they will definitely end up in Amazon AWS servers and many others.
I design large converged/infrastructure networks; I wasn't talking about consumers, I was talking about general availability.

When can I build one in oca or ccw etc?
 

Hitman928

Diamond Member
Apr 15, 2012
I design large converged/infrastructure networks; I wasn't talking about consumers, I was talking about general availability.

When can I build one in oca or ccw etc?
We'll have to hear from the actual CPU designers themselves, but I wouldn't expect V1 CPUs to be available before Q4 2021 unless ARM gave licensees more of a head start with the architecture this time.
 

soresu

Golden Member
Dec 19, 2014
But the GPU was heavily customised- I can't find the old forum post, but I remember someone analysing the die shot. AMD had to do a ton of work to build in backwards compatibility with the Flipper GPU from the GameCube, combined into a totally different VLIW4 GPU.
It probably would have been cleaner to bolt the entire Flipper GPU onto it than to customise a newer GPU.

And it wasn't VLIW4 - it was based on the HD 4xxx series, not even the DX11-capable VLIW5 HD 5xxx series.
 

soresu

Golden Member
Dec 19, 2014
True enough on the CPU side- though they no doubt had to pay for a lot of customisation to enable multi-core support for that old PowerPC core. But the GPU was heavily customised- I can't find the old forum post, but I remember someone analysing the die shot. AMD had to do a ton of work to build in backwards compatibility with the Flipper GPU from the GameCube, combined into a totally different VLIW4 GPU. It certainly wasn't a path of least resistance!
From the Wikipedia article on WiiU:

The Latte graphics chip contains both a "GX2" GPGPU, which runs Wii U applications, and a "GX" GPU, which enables backward compatibility with Wii games. The GX2, designed by AMD, is based on the Radeon R600/R700 architecture and is clocked at approximately 550 MHz. It contains 32 MB of eDRAM cache memory, which can also act as L3 cache for the CPU. The GX, originally designed by ArtX, contains a 1 MB and a 2 MB bank of eSRAM cache memory.
So basically two different GPUs in the single 'Latte' chip package, as I suspected, and definitely VLIW5 for the GX2.

VLIW4 came with Cayman/Northern Islands/HD 6000 after HD 5000, and was somewhat short-lived, with only 2-3 designs actually fabbed (Cayman and the Trinity/Richland IGP).
 

soresu

Golden Member
Dec 19, 2014
I design large converged/infrastructure networks; I wasn't talking about consumers, I was talking about general availability.

When can I build one in oca or ccw etc?
I'd imagine, based on the lead time from the Neoverse N1 announcement, that it will be at least 6-9 months before we see chips based on V1.

The first is likely to be the 'Rhea' SoC from SiPearl, which recently pre-announced that it is using Zeus cores.

As I mentioned earlier, we will probably see Amazon follow suit, as they have their own ARM SoC design team that grew out of the Annapurna Alpine chips supplied to NAS box makers.
 

name99

Senior member
Sep 11, 2010
Availability from my post essentially means available for OEMs or deployments to receive chips to start to build solutions for the market to use. Amazon is a special case because they are vertically integrated here and started offering 'beta' systems that you could sign up to use early on and slowly built out instances from there for 'regular' use. General availability wasn't announced until June.

The only way we get any of this info about product adoption is from design win/deployment announcements and financials. Obviously what gets announced is far from a complete list, but even still, up to now the lack of known wins from the ARM side is pretty telling. From the whispers I've heard (not claiming major sources or anything, just some contacts who work around this area) is that there's been a lot of interest and 'kicking the tires' on ARM server solutions but in the end, not a lot of actual deployments. Even in Amazon's case, we have no idea what their actual Graviton server utilization is as far as I've seen.

ARM servers haven't exactly been very attractive up until very recently and even then I don't think the current offerings really are at a great place competitively against Rome (or Intel CPUs in certain corner cases). This next gen though looks like it could be very interesting and may allow ARM servers to finally get some actual momentum on their side.
My take is that when Anandtech can perform benchmarks it's "generally available"!

But the serious point is what I said: "general availability" is not an especially meaningful number for devices that aren't sold to consumers, but the two Amazon dates are, I think, useful. One tells us something about partner schedules; the other tells us when average Joes can run code.
I would expect this year the gap between the re:Invent announcement and user access to be shorter given that more of the AWS infrastructure is now in place and this will be their third turn of the merry-go-round.

To make a statement like "ARM servers haven't exactly been very attractive up until very recently" is meaningless unless you tell us what a server is.
EVERY time this argument comes up, it turns out that half of us (like me) think the only "servers" that matter are things in data warehouses, whereas the other half seem to think a server is some combination of workstation, SAP host, and NT host.
If you think server means Google, AWS, Facebook, then ARM has been attractive for a year. If you think server means something that acts as an Outlook host, then you have a rather different opinion.

The people who disagree with you are not idiots. They just don't care about the details of the x86 enterprise ecosystem, because they don't consider it any more interesting than the z/OS ecosystem -- economically important, yes, but not a locus of innovation, and not much relevant to wider computing.
 

Hitman928

Diamond Member
Apr 15, 2012
My take is that when Anandtech can perform benchmarks it's "generally available"!

But the serious point is what I said: "general availability" is not an especially meaningful number for devices that aren't sold to consumers, but the two Amazon dates are, I think, useful. One tells us something about partner schedules; the other tells us when average Joes can run code.
I would expect this year the gap between the re:Invent announcement and user access to be shorter given that more of the AWS infrastructure is now in place and this will be their third turn of the merry-go-round.

To make a statement like "ARM servers haven't exactly been very attractive up until very recently" is meaningless unless you tell us what a server is.
EVERY time this argument comes up, it turns out that half of us (like me) think the only "servers" that matter are things in data warehouses, whereas the other half seem to think a server is some combination of workstation, SAP host, and NT host.
If you think server means Google, AWS, Facebook, then ARM has been attractive for a year. If you think server means something that acts as an Outlook host, then you have a rather different opinion.

The people who disagree with you are not idiots. They just don't care about the details of the x86 enterprise ecosystem, because they don't consider it any more interesting than the z/OS ecosystem -- economically important, yes, but not a locus of innovation, and not much relevant to wider computing.
There are obviously lots of different segments within the server market. ARM hasn't really established a large foothold in any of them outside of Amazon rolling their own, with unknown utilization. Saying ARM hasn't been very attractive in the server space was my opinion, based on performance data I've seen, discussions I've had with IT folk in the space, and the clear difficulty ARM server CPU vendors have had trying to succeed in the market. Each gen has been a lot better than the previous one, though, and for me this upcoming gen seems to be the first where they might really be able to make some waves. We'll see.
 

soresu

Golden Member
Dec 19, 2014
In case nobody noticed from the AnandTech article, there is still an ARM DevSummit event on October 6th-8th which will further detail the V1 architecture and possibly outline future things such as ARMv9-A.
 

NTMBK

Diamond Member
Nov 14, 2011
From the Wikipedia article on WiiU:



So basically two different GPUs in the single 'Latte' chip package, as I suspected, and definitely VLIW5 for the GX2.

VLIW4 came with Cayman/Northern Islands/HD 6000 after HD 5000, and was somewhat short-lived, with only 2-3 designs actually fabbed (Cayman and the Trinity/Richland IGP).
Yes, definitely VLIW5! I mistyped when I put that.

Here's the forum post I was thinking of, though annoyingly the images are broken: https://www.neogaf.com/threads/wiiu-latte-gpu-die-photo-gpu-feature-set-and-power-analysis.511628/ Lots of interesting speculation in there.
 

soresu

Golden Member
Dec 19, 2014
Huh, just noticed something I missed on my first read of the roadmap image I posted:



HBM3 is listed under platform features from 2021 onwards, so we can assume the announcement is finally coming by the end of next year at the latest.
 

piokos

Senior member
Nov 2, 2018
You seem to imagine that we are still living in 1995, and that software is defined by custom assembly distributed by floppies.
I don't understand why you mention 1995 so often. Is it some special year or what?
SVE targets any vector length BY DEFINITION
No. SVE guarantees compatibility between different vector register sizes, but the implementation impacts efficiency (in both memory use and performance).
Data warehouse companies compile to whatever target they like.
Data warehouse is a system. You probably meant:
- data centers - no, most of the time they don't compile the code that is run; you do that as a client.
- cloud service providers - yes, they maintain their services (but clients still take care of the VMs).
That sort of fluidity is the future, and one day MS will get their act together enough to provide the same thing.
But this "fluidity" leads to exactly what I mentioned earlier: specialized, incompatible (or compatible but cumbersome) implementations.
Sure, it's all under the sexy umbrella of ARM, but so what?

And if we start to make ARM more universal, it'll start losing a lot of its advantages - certainly lightness, and likely efficiency as well.
If your primary value is no longer "we're the fastest cheapest option" but "we can run all that obsolete software you have from 30 years ago", that's not a great business proposition going forward.
ARM is not faster. It's just more efficient in some solutions.
And yes, it's a very solid business proposition that will give x86 many years to adapt - or give Intel and AMD time to drop it.

There's no doubt that x86 has been a bit stagnant lately. But I don't think that's something the "enthusiast PC" community as a whole should criticize - witness the shitstorm over AVX-512, or over Lakefield not supporting AVX. This is everyday normality in the ARM world.
Seriously, to everyone who treats ARM as the mystical holy grail of future computing - just get a cheap RPi and try setting it up as a second PC. ;)

And yes, I'm writing this on my RPi 3A+. ;)
 

Antey

Member
Jul 4, 2019
You are going to kill me: is there a possibility for AMD to port Zen 4 or any of their upcoming x86-64 designs to ARM? Wasn't K12 a twin core of Zen?
 

piokos

Senior member
Nov 2, 2018
Or wait 1-2 months and get an ARM Apple MacBook...
That probably won't be able to run Windows on ARM, Android or many existing ARM Linux distros (in contrast to the x86 reality of today).
We'll see how well x86 emulation works and whether Macs remain capable as professional tools. I'm not holding my breath, but I'd like to be surprised - honestly.

I'm not sure MacBooks are a sensible alternative to RPis as a cheap platform for experiencing ARM. :D
 
