News ARM Server CPUs

Gideon

Platinum Member
With all the upcoming ARM servers (let's not forget Nuvia, etc.) it probably makes sense to have one thread for them all, instead of creating a new one for each announcement (if not, I will rename it).

However:
AnandTech: Marvell Announces 3rd Gen Arm Server ThunderX3: 96 Cores/384 Threads
ServeTheHome: Marvell ThunderX3 Arm Server CPU with 768 Threads in 2020

MarvellTX3_13_575px.jpg


Could be pretty impressive, though a 25% single-threaded performance gain seems a bit meh compared to the X2, which at 2.5 GHz was ~50% slower than a Xeon at 3.8 GHz.
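
For a rough sense of where that would leave the X3, here is a back-of-the-envelope sketch. It uses only the two figures quoted above and assumes that "~50% slower" means half the Xeon's single-thread performance; the function and variable names are made up for illustration.

```python
# Rough single-thread estimate from the figures quoted in the post.
# Assumption: "~50% slower" is read as "half the Xeon's performance".
x2_vs_xeon = 0.5   # ThunderX2 @ 2.5 GHz relative to Xeon @ 3.8 GHz
tx3_gain = 0.25    # claimed ThunderX3 single-thread gain over the X2


def relative_single_thread(base_ratio: float, gain: float) -> float:
    """Scale a baseline performance ratio by a fractional generational gain."""
    return base_ratio * (1.0 + gain)


tx3_vs_xeon = relative_single_thread(x2_vs_xeon, tx3_gain)
print(f"ThunderX3 vs. Xeon (single thread): {tx3_vs_xeon:.3f}")
```

Under those assumptions the X3 would still sit well below the Xeon in single-thread terms, which is why the gain reads as modest.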

But their marketing slides sure show potential:

MarvellTX3_16_575px.jpg



MarvellTX3_15_575px.jpg
 
Seems like they're trying to market on having a large number of cores rather than actual performance. Halving the SLC when you add 60% more cores is not a good strategy for gaining performance, that's why it is only a 19% bump.

I have to wonder if going from 80 to 96 cores and taking SLC to 24 or 32 MB would have resulted in the same or better performance for less power?
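
To put numbers on that, here is a minimal sketch of the per-core arithmetic. It uses only the percentages quoted above (60% more cores, SLC halved, 19% total gain) and assumes the 19% refers to aggregate throughput; the helper name is hypothetical.

```python
def per_core_scaling(core_growth: float, perf_gain: float, slc_factor: float):
    """Return (performance per core, SLC per core) relative to the old chip.

    core_growth: fractional increase in core count (0.60 = 60% more cores)
    perf_gain:   fractional increase in total performance (assumed aggregate)
    slc_factor:  new total SLC size relative to the old one (0.5 = halved)
    """
    core_ratio = 1.0 + core_growth
    perf_per_core = (1.0 + perf_gain) / core_ratio
    slc_per_core = slc_factor / core_ratio
    return perf_per_core, slc_per_core


perf_per_core, slc_per_core = per_core_scaling(0.60, 0.19, 0.5)
print(f"performance per core: {perf_per_core:.2f}x of the old chip")
print(f"SLC per core:         {slc_per_core:.2f}x of the old chip")
```

With those inputs each core delivers roughly three quarters of its previous throughput while having under a third of the SLC capacity, which is the trade-off being questioned here.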
 
Does ARM have any provision or designs for chiplet-based processors? Maybe that is the core thing. Pure core performance and efficiency is one thing; being able to produce more chips and sell them cheaper is another. I'd say it's actually these "secondary" aspects, including packaging, that will become more and more important.

They are working with TSMC to work in this direction but I don't know where/when they plan to introduce it into their roadmap.

 
I am sure they should have consulted you instead of running their in-house simulations when optimizing perf/area - or perhaps they just rolled the dice?
 
If you think engineers always win out over marketing in product decisions then you must have been working in one of the rare companies where that's true.
 
If you really believe drastic measures such as reducing SLC in favor of cores are marketing driven, you must have been working for companies that don't survive for long. This is not even a consumer-facing product, where marketing has a bigger impact - and this holds even more for a start-up.
In any case, it looks like they succeeded in squeezing 19% more performance out of similar power and area budgets, even if average performance per core goes down. And now you come along, not having touched any performance/area system simulation with a ten-foot pole, and claim the opposite?
 
I'm still at least moderately interested in acquiring one of the low-end $800 Altra CPU/board combos. Has anyone seen a vendor selling those parts yet?

edit: the Q32-17 I think. Unless they've already updated the product stack?
 
I'm guessing the $800 is what the server vendor pays and unless you find one that somehow leaked out of the normal distribution channel or from a decommissioned server, you would have to buy a full blade solution to get access to an Altra CPU / board.
 
You may be right. But I haven't even seen anyone selling blades.
 
I don't think they have the commercial volume for 'retail' sales. If you're really interested you'd have to contact one of their sales partners. I'd expect to pay quite a bit more than $800 though to get your hands on one.
 
Yeah that's what I thought. They probably wouldn't be interested in small volume sales either.
 
I tried to find a proper thread about Arm-based server CPUs and couldn't find anything newer 😑

So here I go with the Phoronix review of the AmpereOne 192-core: https://www.phoronix.com/review/ampereone-a192-32x

I'm a bit surprised it does that well despite some odd choices in its micro-architecture. I will take that as a lesson in humility.
The high idle power usage mentioned in that review is a bad surprise; it's much worse than on both AMD and Intel chips. That again shows that the ISA used is not the be-all and end-all but may well play second fiddle to the often neglected uncore.
 
Yes, the idle power consumption is definitely an issue for the target market where you don't want unused servers to waste watts doing nothing. I hope for them it's some SW/BIOS issue.
 
Interesting (as usual) Chips & Cheese article about AmpereOne: https://chipsandcheese.com/2024/08/29/ampereone-at-hot-chips-2024-maximizing-density/

They explain and measure some of the design decisions (small Icache, write-through Dcache, etc.).

Too bad they didn't measure power consumption, in particular when the cores are idle (though I guess they'd have found similar issues as Phoronix did as it likely is the very same platform/BIOS combination).

EDIT: they ran their benchmarks on an Oracle cluster, which might explain the lack of power figures.
 
N3 is A720 based?
If you look at the A720, A725 and N3 software optimization guides and look for the UMOV and SMOV instructions, you'll notice the N3 latency/throughput numbers match those from the A725, and not the A720. That's no proof; it might be a bug in the documentation. Or perhaps it's something in between A720 and A725 with other undocumented changes.
 