I mean, this is kinda known already.
But the article in the first post is not saying that nVidia GPU bad
Not the GPUs themselves, but Nvidia didn't step in to help them do things the secure way, and they ended up wasting a lot of time and effort.

We could have shipped GPUs very quickly by doing what Nvidia recommended: standing up a standard K8s cluster to schedule GPU jobs on. Had we taken that path, and let our GPU users share a single Linux kernel, we’d have been on Nvidia’s driver happy-path.
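For anyone unfamiliar with that happy-path: it's a stock Kubernetes cluster running NVIDIA's device plugin, with pods requesting GPUs as an extended resource. A minimal sketch of what such a job looks like (the pod name and container image here are just illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job  # example name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04  # example image
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1  # resource exposed by NVIDIA's k8s device plugin
```

Note the trade-off being described: every pod scheduled this way shares the node's kernel and driver, which is exactly the "share a single Linux kernel" model Fly didn't want.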
Alternatively, we could have used a conventional hypervisor. Nvidia suggested VMware (heh). But they could have gotten things working had we used QEMU. We like QEMU fine, and could have talked ourselves into a security story for it, but the whole point of Fly Machines is that they take milliseconds to start. We could not have offered our desired Developer Experience on the Nvidia happy-path.
Instead, we burned months trying (and ultimately failing) to get Nvidia’s host drivers working to map virtualized GPUs into Intel Cloud Hypervisor. At one point, we hex-edited the closed-source drivers to trick them into thinking our hypervisor was QEMU.
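To be clear about what "hex-edited the closed-source drivers" means in general terms: patching a byte string inside a binary with a replacement of the same length, so nothing else in the file shifts. This is a purely hypothetical sketch of that technique; the blob contents and strings below are made up, not the actual bytes Fly patched:

```python
def patch_blob(blob: bytes, old: bytes, new: bytes) -> bytes:
    """Replace one byte pattern with another of identical length."""
    if len(old) != len(new):
        # A different length would shift every subsequent offset in the file.
        raise ValueError("replacement must be the same length as the original")
    if old not in blob:
        raise ValueError("pattern not found in blob")
    return blob.replace(old, new)

# Toy stand-in for a driver binary that embeds a hypervisor identifier.
blob = b"\x7fELF...hypervisor=CLH\x00..."
patched = patch_blob(blob, b"CLH\x00", b"QEMU")  # same length: 4 bytes
```

The same-length constraint is the key detail: it lets you edit a compiled binary in place without re-linking or fixing up offsets.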
I’m not sure any of this really mattered in the end. There’s a segment of the market we weren’t ever really able to explore because Nvidia’s driver support kept us from thin-slicing GPUs. We’d have been able to put together a really cheap offering for developers if we hadn’t run up against that, and developers love “cheap”, but I can’t prove that those customers are real.
What you say is mostly right. But the article in the first post is not saying "nVidia GPU bad", but rather that people want ready-to-use solutions instead of rolling their own. Fly.io hoped to make a business case of renting GPU instances that customers could configure to run LLMs, but they now believe that's too niche a market compared to people who just prefer to use APIs to existing services. And most of those existing services are running nVidia HW. Or did I misunderstand?
I read the same in the article: they hoped developers would seek optimized environments (better cost/performance), but developers are seeking easy and rapid deployment instead; they're more than willing to trade cost for convenience. In a way this makes sense: we're still in the early days of AI tech, so real costs are hidden for the sake of market adoption while the offerings change fast. A developer may not even know what model/provider they want to use 6-9 months from now.