That's not a list of ways Apple Silicon, or ARM, is "undesirable for server". That's a list of ways their current implementation may be undesirable for server.
I don't think anyone in this thread is arguing that the ARM architecture is inherently worse for server. All I've seen is discussion about current implementations.
Obviously they aren't designing cores with features that they feel are only useful on servers when they don't make servers.
And yet you also insisted that there was no difference between "designed for server" vs "designed for mobile"? I agree with you on most of the points you presented in that post, and yet...
It isn't as if adding SVE2 (or beefing up their SSVE implementation) would be that hard if they decided to target that.
Perhaps. Intel faced issues when they first added AVX-512 IIRC (heavy downclocking whenever the 512-bit units lit up), but that's also Intel lol.
The problem I see is that doing so would also increase core area, and I'd imagine by a good bit too. ARM's classic cores might be able to tank that and retain good PPA, but Apple's cores would get even chonkier than they already are. Power probably won't benefit either, even with great power gating, in the cases where the wider vector units go unused, though I don't think it would be hurt too much.
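For what it's worth, the software side of "just add wider units" is basically free with SVE/SVE2, since the ISA is vector-length agnostic; the real cost is the hardware area discussed above. Here's a minimal sketch in C using base SVE intrinsics (which SVE2 hardware also runs). Assumptions: an SVE-capable compiler and target; as far as I know Apple currently only exposes streaming SVE through SME, so treat this as illustrative rather than something that runs on today's M-series cores.

```c
/* Vector-length-agnostic SAXPY: y[i] += a * x[i].
 * The same binary uses whatever vector width the core implements
 * (128-bit on current mobile cores, up to 512-bit on some server parts),
 * so widening the hardware units needs no source changes.
 * Build with something like: clang -O2 -march=armv8-a+sve2 */
#include <arm_sve.h>
#include <stddef.h>

void saxpy_sve(float a, const float *x, float *y, size_t n) {
    for (size_t i = 0; i < n; i += svcntw()) {   /* svcntw() = floats per vector */
        /* The predicate masks off the tail, so no scalar cleanup loop is needed. */
        svbool_t pg = svwhilelt_b32_u64(i, n);
        svfloat32_t vx = svld1_f32(pg, x + i);
        svfloat32_t vy = svld1_f32(pg, y + i);
        svst1_f32(pg, y + i, svmla_n_f32_x(pg, vy, vx, a));
    }
}
```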
There are a lot of server loads that don't use much SIMD, and most server loads are almost exclusively INT.
Despite that, Intel is beefing up their E-cores with wider vector units, and AMD is retaining full-width AVX-512 even on the dense cores it custom-designs to compete in non-HPC markets.
Clearly SIMD is still pretty important.
If Apple wanted to offer a much bigger cache to support chips that have many more cores, they could do what AMD has done and simply stack a cache chip.
This seems like it would increase cost dramatically.
Also, a good bit of Apple's area-efficiency advantage rests on the unique cache hierarchy they've employed.
I'm skeptical that SMT is any sort of a "must have". The average performance gain with SMT is less than half the performance of one of Apple's E cores. I think a P/E mix may work better - and I think Intel's designers probably agreed but due to their precarious state they're focusing on a "unified core" so they have little choice but to go back to SMT to have it do the work that small cores could be doing.
For client I agree, but for DC I doubt it. No one is doing heterogeneous designs in DC, and I think it's because no one wants to deal with it.
There's also the question of how much benefit adding SMT to their cores would bring Apple vs what it brings AMD. AFAIK longer pipelines benefit more from SMT, so AMD may see larger gains in nT perf from one core than Apple would.
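To put some (made-up) numbers on the quoted claim, here's a back-of-the-envelope sketch in C. Every figure in it is an assumption for illustration only, not a measurement: I'm assuming SMT adds ~25% nT throughput per P core, and that an E core delivers ~55% of a P core's throughput in ~35% of its area.

```c
/* Back-of-the-envelope sketch of the SMT-vs-E-core trade-off discussed above.
 * Every constant below is an illustrative assumption, not a measurement. */
#include <stdio.h>

int main(void) {
    const double smt_uplift  = 0.25; /* assumed nT throughput gain from SMT per P core */
    const double e_core_perf = 0.55; /* assumed E-core throughput, P core = 1.0 */
    const double e_core_area = 0.35; /* assumed E-core area, P core = 1.0 */

    /* Option A: 4 P cores with SMT enabled (SMT area cost treated as negligible). */
    double perf_smt = 4.0 * (1.0 + smt_uplift);
    double area_smt = 4.0;

    /* Option B: 4 P cores plus 4 E cores, no SMT. */
    double perf_mix = 4.0 + 4.0 * e_core_perf;
    double area_mix = 4.0 + 4.0 * e_core_area;

    printf("4P + SMT : %.2f throughput, %.2f area, %.2f perf/area\n",
           perf_smt, area_smt, perf_smt / area_smt);
    printf("4P + 4E  : %.2f throughput, %.2f area, %.2f perf/area\n",
           perf_mix, area_mix, perf_mix / area_mix);
    return 0;
}
```

Under those made-up numbers the P+E config wins comfortably on raw nT throughput while the SMT config comes out a touch better on perf per area, which is roughly why I'd expect the client and DC answers to differ: the two markets weight area, power, and peak throughput differently.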