Discussion: We were wrong about GPUs (article for the AI developer crowd)

soresu

Diamond Member
Dec 19, 2014

Wow. Nvidia isn't the answer for everyone, it seems.

@Nothingness, you may like this.
I mean, this is kinda known already.

The IBM NorthPole chip has already shown that there are much more efficient ways to go about doing AI.

That's not even getting into exotic processing-in-memory electronics like multi-terminal memtransistors, which more closely resemble real neurons at the hardware level rather than simulating them in software.
 

MS_AT

Senior member
Jul 15, 2024
But the article in the first post is not saying that Nvidia GPUs are bad, rather that people want ready-to-use solutions instead of rolling their own. Fly.io hoped to make a business case of renting GPU instances that customers could configure to run LLMs, but they now believe that market is too niche compared to the number of people who just prefer to use APIs to existing services. And most of those existing services run on Nvidia hardware. Or did I misunderstand?
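To illustrate the path most people take: nearly all of those hosted services expose an OpenAI-compatible HTTP endpoint, so "just using the API" is a handful of lines. A sketch only; the endpoint, key, and model id below are all placeholders:

```python
# Hedged sketch: calling a hosted LLM service instead of running your own GPUs.
import requests

resp = requests.post(
    "https://api.example.com/v1/chat/completions",     # hypothetical endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder key
    json={
        "model": "some-hosted-model",                  # hypothetical model id
        "messages": [{"role": "user", "content": "Summarize this thread."}],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```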
 
MS_AT said:
But the article in the first post is not saying that Nvidia GPUs are bad
From the article:
We could have shipped GPUs very quickly by doing what Nvidia recommended: standing up a standard K8s cluster to schedule GPU jobs on. Had we taken that path, and let our GPU users share a single Linux kernel, we’d have been on Nvidia’s driver happy-path.

Alternatively, we could have used a conventional hypervisor. Nvidia suggested VMware (heh). But they could have gotten things working had we used QEMU. We like QEMU fine, and could have talked ourselves into a security story for it, but the whole point of Fly Machines is that they take milliseconds to start. We could not have offered our desired Developer Experience on the Nvidia happy-path.

Instead, we burned months trying (and ultimately failing) to get Nvidia’s host drivers working to map virtualized GPUs into Intel Cloud Hypervisor. At one point, we hex-edited the closed-source drivers to trick them into thinking our hypervisor was QEMU.

I’m not sure any of this really mattered in the end. There’s a segment of the market we weren’t ever really able to explore because Nvidia’s driver support kept us from thin-slicing GPUs. We’d have been able to put together a really cheap offering for developers if we hadn’t run up against that, and developers love “cheap”, but I can’t prove that those customers are real.
Not the GPUs themselves, but Nvidia didn't step in to help them do things the secure way, and they ended up wasting a lot of time and effort.
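The hex-editing trick they mention boils down to a same-length byte patch in a binary. A purely illustrative sketch of that general technique; the file name and marker strings below are made up, since the actual bytes Fly.io changed aren't public:

```python
# Purely illustrative: patch a vendor-identification string inside a
# closed-source binary without changing its size or layout.
from pathlib import Path

driver = Path("nvidia-vgpu-mgr.bin")    # hypothetical file name
old = b"cloud-hypervisor"               # hypothetical string the driver rejects
new = b"qemu".ljust(len(old), b"\x00")  # same length, so file offsets stay intact

blob = driver.read_bytes()
assert old in blob, "marker string not found"
driver.with_suffix(".patched").write_bytes(blob.replace(old, new))
```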
 

marees

Senior member
Apr 28, 2024
MS_AT said:
But the article in the first post is not saying that Nvidia GPUs are bad, rather that people want ready-to-use solutions instead of rolling their own. Fly.io hoped to make a business case of renting GPU instances that customers could configure to run LLMs, but they now believe that market is too niche compared to the number of people who just prefer to use APIs to existing services. And most of those existing services run on Nvidia hardware. Or did I misunderstand?
What you say is mostly right.

But "run" refers to inference & "develop" refers to training.

Nvidia's specialty is training, using CUDA/PTX.

Inference can run on almost anything, including a dual-socket Epyc CPU with 768 GB of RAM & without any $$$$$$ GPUs.
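To make that concrete: something like llama.cpp will happily run a quantized model on CPU only. A minimal sketch with the llama-cpp-python bindings; the model file and thread count are placeholders for whatever your box actually has:

```python
# Hedged sketch: CPU-only LLM inference via llama-cpp-python; no GPU required.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-3-70b.Q4_K_M.gguf",  # hypothetical quantized model file
    n_ctx=4096,       # context window
    n_threads=64,     # a dual-socket Epyc has cores to spare
    n_gpu_layers=0,   # keep every layer on the CPU
)

out = llm("Explain the difference between training and inference.", max_tokens=64)
print(out["choices"][0]["text"])
```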
 

coercitiv

Diamond Member
Jan 24, 2014
MS_AT said:
But the article in the first post is not saying that Nvidia GPUs are bad, rather that people want ready-to-use solutions instead of rolling their own. Fly.io hoped to make a business case of renting GPU instances that customers could configure to run LLMs, but they now believe that market is too niche compared to the number of people who just prefer to use APIs to existing services. And most of those existing services run on Nvidia hardware. Or did I misunderstand?
I read the same in the article: they hoped developers would seek optimized environments (better cost/performance). Instead, developers are seeking easy and rapid deployment; they're more than willing to trade cost for convenience. In a way this makes sense: we're still in the early days of AI tech, so real costs are hidden for the sake of market adoption while the offerings change fast. A developer may not even know what model/provider they want to use 6-9 months from now.
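Which is also why the API-first approach wins on switching costs: if you code against an OpenAI-compatible endpoint, changing provider or model later is a config edit, not a re-platforming. A hedged sketch; every URL and model name here is hypothetical:

```python
# Hypothetical config: swap providers/models without touching call sites.
import requests

PROVIDERS = {
    "provider_a": ("https://api.provider-a.example/v1", "model-x"),
    "provider_b": ("https://api.provider-b.example/v1", "model-y"),
}

def chat(provider: str, prompt: str) -> str:
    base_url, model = PROVIDERS[provider]
    resp = requests.post(
        f"{base_url}/chat/completions",
        headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder key
        json={"model": model,
              "messages": [{"role": "user", "content": prompt}]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# Switching providers six months from now is a one-string change:
print(chat("provider_a", "Hello!"))
```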