It ain't gonna win anything with shader core count parity.
I mean AMD could release a 48GB AT0 (or 64GB if they feel like doing 4GB modules, or leave those for pro cards) full-fat SKU against a 6090/6090 Ti if they feel like it.
lol, why
It’s been a while since we saw those leaked specs. I doubt AT0 is still alive.
Do we know that "CPX" is the next-gen RTX GeForce flagship die? I'm 50/50 on that personally.
AT0 has better prospects than (desktop) AT3/4, imo.
We got zero leaks for Rubin other than that "CPX" die shot that suggests GR102 (or whatever the codename is) stays at 192 SM. You think that means Rubin is dead?
I think it's wrong to assume the current status of any chips based on lack of leaks.
If anything, that's usually a good sign; cancellations often get reported earlier and more reliably than exact specs.
Technically we don't, no.
Because it’s AMD, who has a reputation for not shipping halo cards.
oh it's not a halo part at all.
Well, maybe it is simply not required to have more memory bandwidth:
- Revamped CUs and their low-level caches (bigger capacity)
- Out-of-order execution (better hardware utilization of ALUs and caches)
- Maybe L0 cache sharing across multiple CUs (less wasted SRAM capacity, lower LLC & DRAM bandwidth requirements)
- Universal compression (smaller memory footprint, lower bandwidth requirements)
- DGF & DMM (smaller memory footprint, lower bandwidth requirements)
- Neural techniques like NTC, which reduce data fetched from DRAM by instead spending more compute on the matrix engines (whose performance mostly relies on the CUs' low-level caches) to generate or reconstruct the data
- Work graphs and procedural algorithms with dynamic execution at the CU level (smaller code footprint, less bandwidth pressure on higher-level caches and DRAM)
All those things aim to maximize usage of low-level CU resources, increase data locality and reduce load on higher-level structures like the LLC and DRAM.
It seems that there is much going on regarding rethinking GPU architecture as a whole.
This is a great summary but perhaps AMD wants to go even further than this. It really depends on just how clean slate GFX13 is.
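To make the "maybe more bandwidth isn't required" reasoning concrete, here is a toy back-of-the-envelope in C++. Every number in it (bytes touched per frame, compression ratio, LLC hit rate) is invented purely for illustration and is not a claim about any real or rumored chip; the point is only that footprint reductions and locality improvements multiply.

```cpp
// Illustrative arithmetic only: stacked reductions in data footprint plus a
// better cache hit rate can offset the need for a wider/faster memory bus.
// All numbers below are made up for the example.
#include <cstdio>

int main() {
    const double baseline_gb_per_frame = 4.0; // hypothetical raw data touched per frame
    const double compression_ratio     = 0.6; // compression / DGF / NTC shrink the footprint to 60%
    const double llc_hit_rate          = 0.5; // fraction of the remaining traffic served by the LLC, not DRAM

    const double dram_gb_per_frame =
        baseline_gb_per_frame * compression_ratio * (1.0 - llc_hit_rate);

    const double fps = 120.0;
    std::printf("DRAM traffic: %.2f GB/frame -> %.0f GB/s at %.0f fps\n",
                dram_gb_per_frame, dram_gb_per_frame * fps, fps);
    // Baseline would need 4.0 * 120 = 480 GB/s; with the reductions above it is
    // 1.2 * 120 = 144 GB/s, i.e. the same frame fits in roughly a third of the bandwidth.
    return 0;
}
```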
You’re right. I still don’t understand why they refuse to ship that. People already paying $3k+ for 5090s shows that they would pay even more.
A real halo part would be tiled with like 2x the SM count.
Apparently the idea was to bypass CPU latency/overhead by using CUDA-like pointers. (No Rust here. An OS needs reliability, but a graphics API needs performance.)
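A minimal C++ sketch of that idea, assuming nothing about any shipping API: instead of the driver binding every resource per draw on the CPU, the application writes raw 64-bit device addresses into a plain struct that the shader reads, the same way arguments are passed to a CUDA kernel. DrawArgs and encode_draw are hypothetical names made up for this example.

```cpp
#include <cstdint>

// Hypothetical illustration, not a real API: per-draw CPU cost shrinks from one
// driver bind call per resource to writing a few plain 64-bit addresses.

// Pointer style: shader arguments are just a struct of GPU virtual addresses,
// exactly like the parameter block passed to a CUDA kernel.
struct DrawArgs {
    uint64_t vertices;      // device address of the vertex data
    uint64_t albedo;        // device address of the texture header/descriptor
    uint32_t vertex_count;
    uint32_t pad;
};

// The CPU "binds" a draw by filling 24 bytes in an upload ring: no driver round
// trip, no per-draw validation, trivially recordable from any thread.
inline void encode_draw(DrawArgs* ring, uint32_t slot,
                        uint64_t verts, uint64_t albedo, uint32_t count) {
    ring[slot] = DrawArgs{verts, albedo, count, 0u};
}
```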
What is the expectation on the memory situation?
When will it cool down?
Will LPDDR5X & LPDDR6 be affected?
When will RDNA 5 launch now?
What will the SKUs be?
Will it be something like this?
- 50xt 12gb (AT4 24 CU) — $300
- 60 16gb (AT3 40? CU) — $400
- 60xt 16gb/24gb/32gb (AT3 48 CU) — $500/$550/$600
- 70 18gb (AT2 56/60 CU) — $650
- 70xt 18gb/24gb (AT2 72 CU) — $700/$800
- 80xt 24gb (AT0 128? CU) — $1200+
- 90xt 36gb (AT0 144? CU) — $1500+
MLID... C'mon man.
I'm still reading through Sebbi's blog post (it's a big boi), and it definitely sounds interesting. I'm not a graphics programmer, but I've done some CUDA in a past career and had to poke around in Unreal's render code to debug issues, and the mess of shader types, resource types, etc. in DX12 is pretty daunting. Getting it simplified and cleaned up definitely sounds like a big win for programmer productivity and debugging.
Just read the excellent blogpost by Sebastian Aaltonen shared by @Gideon last week. Shocking how flawed the "modern APIs" are and new ones can't come soon enough. DX12's legacy bloat with Work Graphs bolted on top would hold back post-crossgen releases.
Compare that with a feature-complete No Graphics API (DX13 and Vulkan 2.0) with accommodations (native design + extensions) for a dataflow execution architecture, as described in my prev comment, that could greatly benefit the 10th era of gaming. Basically sounds like DX13 + WGs on steroids.
Especially true for developers that can't afford whiz SWEs, as highlighted by @marees' post. The API's design philosophy means it's "...simpler to use than DirectX 11 and Metal 1.0, yet it offers better performance and flexibility than DirectX 12 and Vulkan." Oh, and someone is working on an actual API implementation.
Some hypothetical changes and implications summarized below
- Grain of salt advised, no professional background
Sebbi's No graphics API:
- Unified memory
- 64-bit GPU pointers everywhere
- Shaders = C++ like kernels
- Bindless everything
- Raster/RT as libraries and intrinsics
- No descriptors
- No PSOs, permutations and pipeline caching
- No resource types
- No barriers
- No stateful driver
- No heap enumeration
- No memory type guessing
- No legacy shader languages
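A hedged sketch of what the list above could look like from the shader author's side, assuming unified memory and raw 64-bit pointers; the structs and the skinning kernel are invented for this example and are not taken from the blog post or from any existing API.

```cpp
#include <cstdint>

// Purely illustrative: a "shader" is an ordinary C++-like function over raw
// pointers in unified memory; no descriptor sets, no resource types, no PSOs.

struct Vertex { float px, py, pz; float nx, ny, nz; };

// Per-dispatch arguments are just a struct of 64-bit pointers, valid on both
// the CPU and the GPU side.
struct SkinArgs {
    const Vertex*   in_verts;
    Vertex*         out_verts;
    const float*    bone_matrices; // column-major 4x4 matrices, 16 floats each
    const uint32_t* bone_index;    // one bone per vertex (simplified)
    uint32_t        vertex_count;
};

// The runtime would launch this over a grid the way a CUDA kernel is launched;
// 'i' is the thread/lane index. Normals are left untouched to keep it short.
void skin_vertex(const SkinArgs& args, uint32_t i) {
    if (i >= args.vertex_count) return;
    const Vertex& v = args.in_verts[i];
    const float*  m = &args.bone_matrices[args.bone_index[i] * 16];
    Vertex out = v;
    out.px = m[0] * v.px + m[4] * v.py + m[8]  * v.pz + m[12];
    out.py = m[1] * v.px + m[5] * v.py + m[9]  * v.pz + m[13];
    out.pz = m[2] * v.px + m[6] * v.py + m[10] * v.pz + m[14];
    args.out_verts[i] = out;
}
```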
DX12 → DX13 + Dataflow extensions:
- Command buffers → dataflow graphs
- Resource objects → pointers
- Descriptors → bindless
- PSOs → dynamic pipelines
- CPU orchestration → GPU autonomy
- Fixed pipelines → unified compute
- Legacy bloated APIs → sleek modern API
- Bloated driver → thin driver
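And a toy host-side model of the first row (command buffers → dataflow graphs), again entirely hypothetical: work is declared once as nodes with explicit producer edges, and a scheduler (a plain loop here, standing in for a GPU front end) runs whatever is ready, so synchronization falls out of the edges instead of being recorded by the CPU every frame.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Toy model of "command buffers -> dataflow graphs"; everything here is
// invented for illustration and does not correspond to any real API.
struct Node {
    std::vector<int>      deps; // indices of producer nodes
    std::function<void()> run;  // the GPU work this node stands for
};

// Run every node whose producers have finished; barriers are implied by the
// edges rather than recorded explicitly.
void execute_graph(std::vector<Node>& graph) {
    std::vector<bool> done(graph.size(), false);
    bool progress = true;
    while (progress) {
        progress = false;
        for (std::size_t i = 0; i < graph.size(); ++i) {
            if (done[i]) continue;
            bool ready = true;
            for (int d : graph[i].deps) ready = ready && done[d];
            if (!ready) continue;
            graph[i].run();
            done[i] = true;
            progress = true;
        }
    }
}

// Usage: build the graph once, execute it every frame; dynamic expansion on
// the GPU is where work graphs / dataflow execution go beyond this toy.
// std::vector<Node> frame = {
//     {{},     [] { /* culling       */ }},
//     {{0},    [] { /* g-buffer pass */ }},
//     {{0},    [] { /* shadow pass   */ }},
//     {{1, 2}, [] { /* lighting pass */ }},
// };
// execute_graph(frame);
```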
Fingers crossed RDNA5 and PS6 go all the way architecturally, and even if they don't, a hypothetical DX13 still sounds much better than DX12 + WGs. Sounds like we're in for an inevitable programming paradigm shift of a similar magnitude to the pure fixed-function → programmable shaders transition in the early 2000s. Add ML and PT on top and the 2030s will be truly nextgen.
But games built from the ground up for RDNA5/PS6 are a very long way off. Probably 2031 at the earliest, assuming PS6 is in 2027, and even then most developers will want to run on pre-DX13 hardware like your RTX 4090s, etc. I can see the dumb discourse in 2033 or so about developers "discriminating" against 4090 owners.
Is MLID doing a Swordfish-style online job interview for a janitor position at Nvidia?
