Discussion Nvidia Blackwell in Q4-2024 ?

igor_kavinski · May 2, 2024

jpiniero said:
Nvidia criticizes AI PCs, says Microsoft's 45 TOPS requirement is only good enough for 'basic' AI tasks

Nvidia says its GPUs provide substantially better AI-performance than today's bleeding edge NPUs.

www.tomshardware.com

I bet this is just Jensen being furious that Microsoft got MI300X up and running for their AI needs.

coercitiv · May 4, 2024

jpiniero said:
If you had any doubt what the focus of Blackwell is going to be...

Stealing lunch money from NPUs?

ToTTenTranz · May 5, 2024

coercitiv said:
Stealing lunch money from NPUs?

No, like always their aim is to establish a perception of premium experience if you buy a nvidia graphics card. This time based on being able to run LLMs locally.

Though I'm not sure how they're going to convince people their anemic VRAM amounts are adequate to run LLMs.

Laptop SoCs can often be paired with 32GB LPDDR nowadays, but getting lots of VRAM on consumer GPUs isn't compatible with nvidia's usual planned obsolescence.

The mental gymnastics for convincing people that 8-12GB is good enough for LLMs is going to be interesting to watch.

Mahboi · May 5, 2024

ToTTenTranz said:
No, like always their aim is to establish a perception of premium experience if you buy a nvidia graphics card. This time based on being able to run LLMs locally.

Though I'm not sure how they're going to convince people their anemic VRAM amounts are adequate to run LLMs.

Laptop SoCs can often be paired with 32GB LPDDR nowadays, but getting lots of VRAM on consumer GPUs isn't compatible with nvidia's usual planned obsolescence.

The mental gymnastics for convincing people that 8-12GB is good enough for LLMs is going to be interesting to watch.

I'm sure NV will come up with a "LLM hardware compression" that'll somehow compress 5% and they'll present that as revolutionary.

Also this came out https://wccftech.com/amd-instinct-m...memory-up-to-4x-speedup-versus-discrete-gpus/ as a "hi" to Jensen. Also explains why NV entirely dropped out of HPC with Blackwell's announced FP64 numbers. NV doesn't like being number 2.

igor_kavinski · May 5, 2024

Mahboi said:
Also this came out https://wccftech.com/amd-instinct-m...memory-up-to-4x-speedup-versus-discrete-gpus/ as a "hi" to Jensen.

Now if AMD would only scale that down to consumer level with 32GB GDDR7 and it would become the mother of all APUs.

Heartbreaker · May 5, 2024

ToTTenTranz said:
No, like always their aim is to establish a perception of premium experience if you buy a nvidia graphics card. This time based on being able to run LLMs locally.

Though I'm not sure how they're going to convince people their anemic VRAM amounts are adequate to run LLMs.

Laptop SoCs can often be paired with 32GB LPDDR nowadays, but getting lots of VRAM on consumer GPUs isn't compatible with nvidia's usual planned obsolescence.

The mental gymnastics for convincing people that 8-12GB is good enough for LLMs is going to be interesting to watch.

But Laptop SoC are likely too slow to very large LLMs. I don't see any push to run LLMs on SoCs...

Consumer applications of AI will likely involve models optimized for 8GB.

Hobbyist stuff can expand as much as you want, but if you want to run those massive models, you likely aren't going to choose an SoC.

igor_kavinski · May 5, 2024

Heartbreaker said:
But Laptop SoC are likely too slow to very large LLMs. I don't see any push to run LLMs on SoCs...

Really? https://www.tomshardware.com/tech-i...wers-more-than-500-ai-models-the-company-says

Heartbreaker said:
Consumer applications of AI will likely involve models optimized for 8GB.

Just come out and say that even a 3050 8GB is better for AI than an SoC's NPU.

Heartbreaker said:
Hobbyist stuff can expand as much as you want, but if you want to run those massive models, you likely aren't going to choose an SoC.

Intel/AMD are working hard to prove your sire Jensen wrong.

igor_kavinski · May 5, 2024

NVIDIA rumored to ONLY launch GeForce RTX 5090 this year, possible unveiling at Computex 2024

NVIDIA's new ultra-enthusiast GeForce RTX 5090 rumored to be the only RTX 50 series GPU released this year, rest of the Blackwell gaming GPUs come in 2025.

www.tweaktown.com

MLID so disclaimer "believe/dream at your own risk" applies.

Mopetar · May 5, 2024

ToTTenTranz said:
The mental gymnastics for convincing people that 8-12GB is good enough for LLMs is going to be interesting to watch.

Pfft. 8 GB is more than fine for (L)ittle (L)anguage (M)odels, which shouldn't be confused with those other pesky LLMs people have been talking about.

ToTTenTranz · May 5, 2024

Heartbreaker said:
But Laptop SoC are likely too slow to very large LLMs. I don't see any push to run LLMs on SoCs...

They're not. Microsoft / OpenAI know what they're doing when they're asking for 45TOPs for compliance.
8GB won't be enough even for the tiny 3B models, but a SoC with access to 32GB RAM can do 7B models quite well.

Token/s performance is easier to optimize than memory footprint.

dr1337 · May 6, 2024

Does a LLM even have that many tokens per second with 40TOPs? From what I understand A card like the 3060 already has well over 200 AI TOPs and is still quite slow for LLMs. I thought everything in mobile AI was about small and highly tuned models.

MoogleW · May 7, 2024

https://twitter.com/x/status/1787645742510993853

New rumor alleging that AD203 based rtx 5080 expected to launch before rtx 5090, contradicting the rtx 50 is hurried and only rtx 5090 will come in 2024 rumors.

Since only rtx5080 is mentioned, then assumption is that rtx 5070(ti) based on GB205 will launch later. Possibly same timframe as AD104 based rtx4070ti had, so maybe a January 2025 window.

If January 2025 for GB205, then it (GB205 chip named 5070 or 5070ti) may get a mention at the GTC architecture unveiling with first party benchmarks and other marketing

Mopetar · May 7, 2024

Makes sense since AMD doesn't have a big GPU to compete against the 5080, so NVidia has no reason not to hold back and put all of their big dies into the more profitable datacenter or professional products.

xpea · May 7, 2024

MoogleW said:
https://twitter.com/x/status/1787645742510993853

New rumor alleging that AD203 based rtx 5080 expected to launch before rtx 5090, contradicting the rtx 50 is hurried and only rtx 5090 will come in 2024 rumors.

Since only rtx5080 is mentioned, then assumption is that rtx 5070(ti) based on GB205 will launch later. Possibly same timframe as AD104 based rtx4070ti had, so maybe a January 2025 window.

If January 2025 for GB205, then it (GB205 chip named 5070 or 5070ti) may get a mention at the GTC architecture unveiling with first party benchmarks and other marketing

5080 and 5090 will be announced at same time. Availability will be separated by few weeks...

gdansk · May 7, 2024

xpea said:
5080 and 5090 will be announced at same time. Availability will be separated by few weeks...

As usual or you know something?

jpiniero · May 7, 2024

There is a good argument against doing that... given that the 5080 is unlikely to be that much better in games than the 4090 because it should be too bandwidth limited. It may however have way better TOPs (at least on paper) so I suppose... but it seems like a better strategy to just release the 5090 and let the 4090 supply thin out before releasing the 5080.

SmokSmog · May 23, 2024

512bit with new PCB design

NVIDIA RTX 5090 Founder's Edition rumored to feature 16 GDDR7 memory modules in denser design - VideoCardz.com

NVIDIA RTX 5090 PCB rumors: dense memory layout, three PCBs for improved airflow Chiphell leaker claims NVIDIA’s next-gen GPU may use non-standard PCB design. The memory layout for the upcoming RTX 5090 graphics card will differ from that of the RTX 4090. The new flagship GPU, featuring the...

videocardz.com

Aapje · May 23, 2024

jpiniero said:
There is a good argument against doing that... given that the 5080 is unlikely to be that much better in games than the 4090 because it should be too bandwidth limited. It may however have way better TOPs (at least on paper) so I suppose... but it seems like a better strategy to just release the 5090 and let the 4090 supply thin out before releasing the 5080.

I think that it makes perfect sense to announce them at the same time. The 5090 will allow them to boast about the performance improvement, but the price is surely getting another increase with the 512 bit bus and 32 GB. So the 5080 is needed for Nvidia to defend against angry comments about the price. I expect the 5080 to stay at $999 exactly so they can argue that they only increased the 5090 price because they had to.

And I expect the 5080 to be at 4090D level so they can sell it in China. Announcing the 5080 at the same time as the 5090 prevents Nvidia from angering the Chinese, who will not be able to get the 5090 (legally). Then the message from Nvidia to China is: you can now get 4090D level performance for $999 instead of the $1800 that they have to pay for the actual 4090D.

igor_kavinski · May 23, 2024

512 bit bus and 32 GB <<< RDNA5 would need to be magical to counter that.

jpiniero · May 23, 2024

If anything, I'd expect the 5080's tensor performance to be way better than the 4090. So yes, under the current sanctions it won't be available in China.

Edit: I am expecting the die size to be very big, part from adding the additional SMs but also because of the inevitable additional tensor cores. It'll be closer to AD102 than AD103 and GB202 being at the recticle limit.

SteinFG · May 23, 2024

What the hell is jensen cooking

from wccftech

SteinFG · May 23, 2024

Kopite says the three PCBs are IO, PCIe connector, and the main GPU. PCIe connector is just that, a connector

What are they using to connect those PCBs lol, there's 0.5 Terabits/s of data on that PCIe bus. I guess it's a flex cable of some kind.

jpiniero · May 23, 2024

SteinFG said:
What the hell is jensen cooking

AI... what else?

MrTeal · May 23, 2024

SteinFG said:
Kopite says the three PCBs are IO, PCIe connector, and the main GPU. PCIe connector is just that, a connector View attachment 99529

What are they using to connect those PCBs lol, there's 0.5 Terabits/s of data on that PCIe bus. I guess it's a flex cable of some kind.

It wouldn't have to be anything different than a really short PCIe riser, really.

SteinFG · May 24, 2024

Throwing my guess for the 5090 cooler configuration. It's my only guess for how they can maximize the airflow even further

Discussion Nvidia Blackwell in Q4-2024 ?

Lifer

Diamond Member

Member

Senior member

Lifer

Diamond Member

Lifer

Lifer

Diamond Member

Member

Senior member

Member

Diamond Member

Senior member

Platinum Member

Lifer

Member

Golden Member

Lifer

Lifer

Senior member

Senior member

Lifer

Diamond Member

Senior member