Discussion Nvidia Blackwell in Q4-2024?


ToTTenTranz

Member
Feb 4, 2021
Stealing lunch money from NPUs?
No. As always, their aim is to establish a perception of a premium experience if you buy an Nvidia graphics card, this time based on being able to run LLMs locally.

Though I'm not sure how they're going to convince people that their anemic VRAM amounts are adequate for running LLMs.

Laptop SoCs are often paired with 32GB of LPDDR nowadays, but putting lots of VRAM on consumer GPUs isn't compatible with Nvidia's usual planned obsolescence.

The mental gymnastics needed to convince people that 8-12GB is good enough for LLMs will be interesting to watch.
 

Mahboi

Senior member
Apr 4, 2024
ToTTenTranz said:
No. As always, their aim is to establish a perception of a premium experience if you buy an Nvidia graphics card, this time based on being able to run LLMs locally.

Though I'm not sure how they're going to convince people that their anemic VRAM amounts are adequate for running LLMs.

Laptop SoCs are often paired with 32GB of LPDDR nowadays, but putting lots of VRAM on consumer GPUs isn't compatible with Nvidia's usual planned obsolescence.

The mental gymnastics needed to convince people that 8-12GB is good enough for LLMs will be interesting to watch.
I'm sure NV will come up with some "LLM hardware compression" that'll somehow compress things by 5%, and they'll present it as revolutionary.

Also, this came out https://wccftech.com/amd-instinct-m...memory-up-to-4x-speedup-versus-discrete-gpus/ as a "hi" to Jensen. It also explains why NV entirely dropped out of HPC, judging by Blackwell's announced FP64 numbers. NV doesn't like being number 2.
 

Heartbreaker

Diamond Member
Apr 3, 2006
ToTTenTranz said:
No. As always, their aim is to establish a perception of a premium experience if you buy an Nvidia graphics card, this time based on being able to run LLMs locally.

Though I'm not sure how they're going to convince people that their anemic VRAM amounts are adequate for running LLMs.

Laptop SoCs are often paired with 32GB of LPDDR nowadays, but putting lots of VRAM on consumer GPUs isn't compatible with Nvidia's usual planned obsolescence.

The mental gymnastics needed to convince people that 8-12GB is good enough for LLMs will be interesting to watch.

But laptop SoCs are likely too slow to run very large LLMs. I don't see any push to run LLMs on SoCs...

Consumer applications of AI will likely involve models optimized for 8GB.

Hobbyist stuff can expand as much as you want, but if you want to run those massive models, you likely aren't going to choose an SoC.
 
Jul 27, 2020
Heartbreaker said:
But laptop SoCs are likely too slow to run very large LLMs. I don't see any push to run LLMs on SoCs...
Really? https://www.tomshardware.com/tech-i...wers-more-than-500-ai-models-the-company-says

Heartbreaker said:
Consumer applications of AI will likely involve models optimized for 8GB.
Just come out and say that even a 3050 8GB is better for AI than an SoC's NPU.

Heartbreaker said:
Hobbyist stuff can expand as much as you want, but if you want to run those massive models, you likely aren't going to choose an SoC.
Intel/AMD are working hard to prove your sire Jensen wrong.
 

Mopetar

Diamond Member
Jan 31, 2011
ToTTenTranz said:
The mental gymnastics needed to convince people that 8-12GB is good enough for LLMs will be interesting to watch.

Pfft. 8 GB is more than fine for (L)ittle (L)anguage (M)odels, which shouldn't be confused with those other pesky LLMs people have been talking about.
 

ToTTenTranz

Member
Feb 4, 2021
Heartbreaker said:
But laptop SoCs are likely too slow to run very large LLMs. I don't see any push to run LLMs on SoCs...
They're not too slow. Microsoft / OpenAI know what they're doing when they ask for 45 TOPS for compliance.
8GB won't be enough even for the tiny 3B models, but an SoC with access to 32GB of RAM can handle 7B models quite well.

Tokens/s performance is easier to optimize than memory footprint.
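For a rough sense of those sizes, here's a back-of-envelope footprint sketch. It assumes decimal gigabytes and ignores KV cache and runtime overhead; the parameter counts and bit-widths are illustrative, not tied to any specific model or runtime:

```python
# Rough LLM weight footprint: params x bytes-per-weight, in decimal GB.
# Illustrative assumptions only; KV cache and runtime overhead come on top.

def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Memory needed just to hold the weights, in GB."""
    return params_billion * 1e9 * (bits_per_weight / 8) / 1e9

for params in (3, 7, 13):
    for bits in (16, 8, 4):
        gb = weight_footprint_gb(params, bits)
        print(f"{params}B @ {bits}-bit weights: ~{gb:.1f} GB (plus KV cache and overhead)")
```

On that math a 7B model is around 14 GB at FP16 but only about 3.5 GB at 4-bit, so the quantization level matters about as much as the parameter count when deciding whether 8GB or 32GB is enough.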
 

dr1337

Senior member
May 25, 2020
Does an LLM even get that many tokens per second with 40 TOPS? From what I understand, a card like the 3060 already has well over 200 AI TOPS and is still quite slow for LLMs. I thought everything in mobile AI was about small, highly tuned models.
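One napkin-math way to look at it: single-stream decode re-reads roughly the whole weight set for every generated token, so memory bandwidth, not TOPS, is usually the ceiling. The bandwidth figures, efficiency factor, and model size below are assumptions for illustration, not measurements:

```python
# Back-of-envelope decode speed: tokens/s ~= usable bandwidth / bytes streamed
# per token (~ the weight footprint). All numbers are illustrative assumptions.

def decode_tokens_per_s(model_gb: float, bandwidth_gb_s: float, efficiency: float = 0.6) -> float:
    """Estimated generation speed when decode is memory-bandwidth bound."""
    return bandwidth_gb_s * efficiency / model_gb

model_gb = 4.0  # e.g. a ~7B model quantized to roughly 4-bit
for name, bw in (("laptop SoC LPDDR5X, ~100 GB/s", 100.0),
                 ("RTX 3060 GDDR6, ~360 GB/s", 360.0)):
    print(f"{name}: ~{decode_tokens_per_s(model_gb, bw):.0f} tokens/s for a ~{model_gb:.0f} GB model")
```

By that sketch the 3060's edge comes from its memory bandwidth rather than its headline TOPS, and an NPU's 40-odd TOPS matters less than the LPDDR bandwidth feeding it.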
 

MoogleW

Member
May 1, 2022

New rumor alleging that the GB203-based RTX 5080 is expected to launch before the RTX 5090, contradicting earlier rumors that RTX 50 is being hurried and only the RTX 5090 will come in 2024.

Since only the RTX 5080 is mentioned, the assumption is that the GB205-based RTX 5070 (Ti) will launch later, possibly in the same timeframe the AD104-based RTX 4070 Ti had, so maybe a January 2025 window.

If GB205 comes in January 2025, it (whether named 5070 or 5070 Ti) may get a mention at the GTC architecture unveiling, with first-party benchmarks and other marketing.
 

Mopetar

Diamond Member
Jan 31, 2011
Makes sense. AMD doesn't have a big GPU to compete against the 5080, so Nvidia has no reason not to hold back and put all of its big dies into the more profitable datacenter or professional products.
 

xpea

Senior member
Feb 14, 2014

MoogleW said:
New rumor alleging that the GB203-based RTX 5080 is expected to launch before the RTX 5090, contradicting earlier rumors that RTX 50 is being hurried and only the RTX 5090 will come in 2024.

Since only the RTX 5080 is mentioned, the assumption is that the GB205-based RTX 5070 (Ti) will launch later, possibly in the same timeframe the AD104-based RTX 4070 Ti had, so maybe a January 2025 window.

If GB205 comes in January 2025, it (whether named 5070 or 5070 Ti) may get a mention at the GTC architecture unveiling, with first-party benchmarks and other marketing.
The 5080 and 5090 will be announced at the same time. Availability will be separated by a few weeks...
 

jpiniero

Lifer
Oct 1, 2010
There is a good argument against doing that... given that the 5080 is unlikely to be much better in games than the 4090, because it should be too bandwidth-limited. It may, however, have way better TOPS (at least on paper), so I suppose... but it seems like a better strategy to just release the 5090 and let 4090 supply thin out before releasing the 5080.