What I meant was that for their on-premises data centers they are using a third-party vendor for the hardware, say HP. It may not even be running OSX; more likely Linux.
That's probably true today. But everything takes time.
First you have to build the services (and remember Apple was what some would consider late to that party --- you can argue that this is because they had different standards wrt crypto, privacy, and the type of functionality offered, but they were late).
Then building data centers and migrating to running them takes time.
Then building up your own codebase (and exploring whether and when it makes sense to run on top of OSX rather than Linux) takes time.
Then moving to ARM takes time.
Then building your own HW takes time.
They're likely only halfway through this process.
But look at this more generically. EVERY substantial cloud vendor -- MS, Google, Amazon, Baidu -- has been exploring hardware options. This ranges from GPUs to Amazon's ARM cores to FPGAs to TPUs and other AI accelerators. Why would Apple be the one large cloud company that doesn't investigate alternative, superior hardware? And why wouldn't their exploration include not just these areas (accelerators, GPUs) but also better CPUs --- where better could mean anything from faster to cheaper to lower power to better integration with accelerators to better memory support to better security?
Apple has shown repeatedly that they're engaged in 5-to-10-year projects that remain essentially secret (nothing but rumor and informed speculation) till the day of the announcement. The fact that we know nothing about Apple's data center operations today, or their plans for 2025, is par for the course. But there is a certain level of common sense.
I mean, come on, if you were ANY cloud provider and you looked at the endless stream of Spectre, ZombieLoad, and JCC-erratum bugs, plus the promises (next year, next year) about 10nm and EMIB, wouldn't you seriously investigate alternatives?
The other thing about Apple is that they are very disciplined about not doing too much too fast, and reusing what works. Meaning, I would guess, that
- TODAY they are experimenting with A11s, A12s, A13s on cloud workloads, to see what works and what doesn't. Are small cores useful or not? How about the onboard (i.e. small) GPU and inference NPU? Are there concerns with the caching system or the cost of inter-processor communication?
- TOMORROW they will ship desktop ARMs, which will let the cloud folks start testing how well inter-chiplet communication works, and whether there are weaknesses in their IO system when pushed hard. All of which means that
- A GENERATION LATER they can fold those lessons into better desktop cores, which can be reconfigured for cloud work, and serious mass-scale deployment can begin.