
Discussion CDNA 6 / Instinct MI500 series thread

1000x vs MI300X

Is it 10x?

I really don't know how to interpret this:

AMD shared additional details at CES on the next-generation AMD Instinct MI500 GPUs, planned to launch in 2027. The MI500 Series is on track to deliver up to a 1,000x increase in AI performance compared to the AMD Instinct MI300X GPUs introduced in 2023.[1]

[1] "Based on engineering projections by AMD Performance Labs in December 2025, to estimate the peak theoretical precision performance of AMD Instinct™ MI500 Series GPU powered AI Rack vs. an AMD Instinct MI300X platform. Results subject to change when products are released in market."

Is the 1,000x a Helios rack vs. a single "platform" of 8x MI300X??? And is the 10x vs. a single MI400 GPU, or the "Helios" rack vs. the MI500 "Titan" rack (with 3.5x the number of GPUs)?
 
for reference

AMD CES 2026 live blog

AMD CES 2026 Keynote Live Coverage
By Ryan Smith - January 5, 2026

 
news coverage


AMD unwraps Instinct MI500 boasting 1,000X more performance versus MI300X — setting the stage for the era of YottaFLOPS data centers

By Anton Shilov
Next-generation CDNA 6 architecture on-track for 2027.

AMD's Instinct MI500X-series accelerators are set to be based on the CDNA 6 architecture (no UDNA yet?), with their compute chiplets made on one of TSMC's N2-series fabrication processes (2nm-class). AMD says that its Instinct MI500X GPUs will offer up to 1,000 times higher AI performance compared to the Instinct MI300X accelerator from late 2023, but does not exactly define the comparison metrics.


Demand for AI data center compute capability is set to increase dramatically, from around 100 ZettaFLOPS today to 10+ YottaFLOPS* in the next five years (roughly a 100x increase), according to AMD.

*One YottaFLOPS equals 1,000 ZettaFLOPS, or one million ExaFLOPS.


 
AFAICT David Wang either didn't properly explain it to the interviewer, or he misinterpreted something said during an RTG meeting with the higher-ups and AMD simply didn't refute it, because confusion about their roadmap works in their favor against competitors as much as it keeps us REEEEEEing at them 😅

UDNA was never really anything more than a long-term strategy to keep:

#1. Latest CDNA in line with fairly recent CU µArch improvements - no more than a generation behind RDNA if Kepler is correct.

So RDNA development is basically dogfooding µArch features for the more fiscally important CDNA, and bugs that would otherwise be swept under the rug for an RDNA release (like Next Gen Geometry) can be fixed before they are implemented in a future CDNA.

#2. Latest RDNA supported in ROCm, because a lot of the work has to be done anyway to enable it for the next CDNA.

In short it keeps CDNA relatively modern on the general compute side, and RDNA usable on the ROCm side.

i.e. the best of both worlds, while still allowing for domain-specific optimisation.
 
for reference

AMD CES 2026 live blog

AMD CES 2026 Keynote Live Coverage
By Ryan Smith - January 5, 2026

"Ryan Smith"? That name sounds familiar!
 
bugs that would otherwise be swept under the rug for an RDNA release (like Next Gen Geometry) can be fixed before they are implemented in a future CDNA.
Why would CDNA need Next Gen Geometry? A lot of the complex (and thus bug-prone) cool new stuff from patents seems to be useful only for RDNA, with zero value for CDNA.
 
Why would CDNA need Next Gen Geometry? A lot of the complex (and thus bug-prone) cool new stuff from patents seems to be useful only for RDNA, with zero value for CDNA.
I was just using that as an example of a major bug making it into a production µArch.
 
I was just using that as an example of a major bug making it into a production µArch.
Yeah, I get that, but bugs are more likely in complex new things exactly like you mentioned, and that's a non-issue for CDNA.

If I had to guess, what they want to use RDNA for is testing new and better caching and scheduling (the kind of subtle bugs that make it into final silicon), plus avoiding the RDNA3 issue with lower clocks. On the other hand, if RDNA sticks to an N-1 process, then it's not really a like-for-like comparison anyway.
 
So RDNA development is basically dogfooding µArch features for the more fiscally important CDNA, and bugs that would otherwise be swept under the rug for an RDNA release (like Next Gen Geometry) can be fixed before they are implemented in a future CDNA.
Not really; the gfx13 CU is derived from gfx1250, not gfx1200/1201.
The roadmap kind of eats itself now, with client iterating into DC into client into DC into yaddayaddayadda.
 
I already explained it here: 1,000x for an MI500 system vs. an MI300X system is very reasonable:
I think it is rather simple: an 8-GPU MI300X cluster vs. a full MI500 rack.
An MI300X delivers 1.3 PFLOPS of FP16 (matrix) compute. An MI300X platform is an 8-GPU cluster, so that works out to 10.4 PFLOPS.

MI455X will bring FP4 support, and at Helios rack level that results in 3 ExaFLOPS. Now double the number of GPUs per rack for MI500, make the GPUs themselves 1.75x faster (plausible if, e.g., the GPU size increases by using 3x or 4x base die tiles instead of 2x), and we land at 10.5 ExaFLOPS per rack.

10.5 ExaFLOPS / 10.4 PFLOPS ≈ 1,000x 😉
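
A minimal sketch of that arithmetic (the 1.3 PFLOPS FP16 figure is MI300X's published matrix throughput; the 3 EF Helios rack number, the 2x GPU count, and the 1.75x per-GPU uplift are the assumptions above, not AMD-confirmed figures, and note the ratio mixes FP16 on one side and FP4 on the other):

```cpp
#include <cstdio>

int main() {
    // Inputs from the post above; projected values are the poster's assumptions.
    const double mi300x_pflops   = 1.3;                    // FP16 matrix, per GPU
    const double platform_pflops = 8.0 * mi300x_pflops;    // 8-GPU platform: 10.4 PF
    const double helios_ef       = 3.0;                    // MI455X rack, FP4 (projection)
    const double mi500_rack_ef   = helios_ef * 2.0 * 1.75; // 2x GPUs, 1.75x speed: 10.5 EF
    // 1 ExaFLOPS = 1,000 PFLOPS
    const double ratio = (mi500_rack_ef * 1000.0) / platform_pflops;
    std::printf("MI500 rack vs. MI300X platform: ~%.0fx\n", ratio); // ~1010x
    return 0;
}
```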
 
Gfx1250 (CDNA5) and presumably later parts (gfx13/RDNA5/CDNA6) are seemingly deprecating the CDNA1-4 MFMA intrinsics in favor of the more modern WMMA matrix intrinsics seen on gfx11/RDNA3 and onwards (per the ROCDL dialect in the LLVM GitHub repo).
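
For anyone who hasn't touched these, here's a rough sketch of what the two intrinsic families look like at the HIP builtin level. The builtin names and operand shapes below are the existing gfx90a/gfx942 MFMA and gfx11 WMMA forms from clang; how gfx1250 actually exposes its matrix ops is exactly what the ROCDL change hints at, so treat this as illustrative only:

```cpp
#include <hip/hip_runtime.h>

// Per-thread fragment types; layouts across the wave differ per architecture.
typedef __fp16 half4   __attribute__((ext_vector_type(4)));
typedef __fp16 half16  __attribute__((ext_vector_type(16)));
typedef float  floatx4 __attribute__((ext_vector_type(4)));
typedef float  floatx8 __attribute__((ext_vector_type(8)));

#if defined(__gfx90a__) || defined(__gfx942__)
// CDNA1-4 style: MFMA, wave64 fragments, extra cbsz/abid/blgp control operands
__device__ floatx4 mma_16x16x16(half4 a, half4 b, floatx4 acc) {
    return __builtin_amdgcn_mfma_f32_16x16x16f16(a, b, acc, 0, 0, 0);
}
#elif defined(__gfx1100__)
// RDNA3 style: WMMA, wave32 fragments, no control operands
__device__ floatx8 mma_16x16x16(half16 a, half16 b, floatx8 acc) {
    return __builtin_amdgcn_wmma_f32_16x16x16_f16_w32(a, b, acc);
}
#endif
```

Both compute a 16x16x16 FP16 multiply-accumulate into FP32; the open question is which surface (and which fragment layout) gfx1250-era hardware standardizes on.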
 
Do we have any idea how Rubin Ultra vs. MI500 compares in terms of launch schedule and GPU performance?
I know NVIDIA has rebuilt the entire DC stack with the SW moat on top, but will they continue to win, or does NVIDIA have to pull Feynman forward to H2 2027?
 
Do we have any idea how Rubin Ultra vs. MI500 compares in terms of launch schedule and GPU performance?
They're both H2'27 and dunno.
I know NVIDIA has rebuilt the entire DC stack with the SW moat on top, but will they continue to win, or does NVIDIA have to pull Feynman forward to H2 2027?
They can't "pull in" anything since their hweng mines are already overworked to death.
 

HLRS director reveals existence of previously unannounced AMD MI600 AI chip

Comments were made as part of a discussion regarding the center’s forthcoming supercomputers
July 08, 2025 By Charlotte Trueman

The director of the High-Performance Computing Center (HLRS) in Stuttgart, Germany, has revealed the existence of the AMD MI600 AI chip, something that has not been previously disclosed by the chipmaker.

During a discussion with journalists about the procurement processes for the successor to the center’s forthcoming system, Herder, Professor Dr. Michael Resch said: “We are not so much interested in the MI400… We are already interested in MI500, 600.”

 