Question Intel Celestial XE3 discussion - not dead yet

FlameTail · Dec 15, 2024

Timestamp 20:40

Tom Peterson is asked why Arc Battlemage doesn't have SIMT, and he hints that a future Xe architecture (Xe3? Xe4?) might add it.

adroc_thurston · Dec 15, 2024

FlameTail said:
Tom Peterson is asked why Arc Battlemage doesn't have SIMT, and he hints that a future Xe architecture (Xe3? Xe4?) might add it.

well it does have SIMT.
Better, wave size on Intel uarches is very variable. Should be {16, 32} now?

João Bortolace · Dec 15, 2024

FlameTail said:
What are some GPU architectural upgrades that might come to Xe3/Celestial?

1. Shader Execution Re-ordering
2. Mesh Nodes and Work Graphs
3. SIMT architecture

Maybe do what NVIDIA did with Ampere and add FP to the INT pipe?

gaav87 · Dec 28, 2024

First mention of Xe3 in the kernel, 24.12.2024.
D = dGPU / DGFX?
XE3D = Celestial dGPU, I assume.

Philste · Dec 31, 2024

So the whole confusion is because Celestial dGPUs will be using Xe⁴, right? That's why some say Celestial dGPU is canceled, they only got the information that Xe³ dGPUs are canceled, so the conclusion was Celestial=canceled.

In reality they will probably keep the two-year rhythm: Celestial dGPUs are probably targeting late 2026/early 2027 with Xe⁴, while Xe³ is APU-only starting H2 2025.

mikk · Dec 31, 2024

Too early for Xe4. Nova Lake and Wildcat Lake are on Xe3 and, according to Exist50, Razor Lake as well. Alchemist dGPU to Battlemage dGPU was a 2.5-year cadence (A380 launched on June 14th, 2022), mobile versions even earlier. Celestial to Druid sounds like a bigger redesign.
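
To put a number on that cadence, here is a minimal sketch of the date arithmetic. The A380 date is the one given above; the Battlemage date (B580 retail launch on December 13, 2024) is an assumption that is not stated in this thread, and `cadence_years` is just an illustrative helper.

```python
from datetime import date

# A380 launch date as stated in the post above.
ALCHEMIST_DGPU_LAUNCH = date(2022, 6, 14)
# Assumption: first Battlemage dGPU (B580) retail launch; not stated in the thread.
BATTLEMAGE_DGPU_LAUNCH = date(2024, 12, 13)


def cadence_years(first: date, second: date) -> float:
    """Return the gap between two launch dates in years (365.25-day years)."""
    return (second - first).days / 365.25


gap = cadence_years(ALCHEMIST_DGPU_LAUNCH, BATTLEMAGE_DGPU_LAUNCH)
print(f"Alchemist -> Battlemage dGPU gap: {gap:.2f} years")
```

Under that assumed Battlemage date the gap lands right around the 2.5 years quoted above.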

Philste · Jan 1, 2025

mikk said:
Too early for Xe4. Nova Lake and Wildcat Lake are on Xe3 and according to Exist50 Razor Lake as well.

But these are all iGPUs. Strix Halo with RDNA3.5 will launch on the same day as Desktop RDNA4, so what stops Intel from launching Xe⁴ Desktop Cards in early 2027 while iGPUs are using Xe³?

mikk · Jan 1, 2025

Lack of development time and budget stops them. Two generations within 2.5 years is not realistic. If it's Xe4, it's called Druid and not Celestial. Intel favours iGPUs, whereas AMD usually favours dGPUs.

mikk · Jan 12, 2025

Xe3 is going from 4 to 6 cores per slice. PTL-H(P) has two 6-core slices for 12 cores overall. This is what Raichu said, and we can see it on GitHub as well.
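
The core count follows directly from the slice layout. A tiny sketch of the arithmetic (the function name is made up for illustration, and the 4-core line is only a hypothetical comparison at the same slice count):

```python
def total_xe_cores(slices: int, cores_per_slice: int) -> int:
    """Total Xe cores for a GPU built from identical slices."""
    return slices * cores_per_slice


# PTL-H(P) as described above: two slices of six cores each.
ptl_h_cores = total_xe_cores(slices=2, cores_per_slice=6)

# Hypothetical comparison: the same two slices at the older 4 cores per slice.
same_slices_old_layout = total_xe_cores(slices=2, cores_per_slice=4)

print(ptl_h_cores, same_slices_old_layout)
```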

gaav87 · Jan 14, 2025

Xe3 with a larger cache? If each bit controls two banks instead of one, and the number of bits did not decrease, this could translate to a potentially doubled L3 cache size when fully enabled. Or it is just a different organization.
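
A rough sketch of that reasoning, assuming the mask is a plain bitfield where each set bit enables a fixed number of banks. The mask value, the bank size and the function names are all hypothetical and not taken from the driver code:

```python
def enabled_l3_banks(bank_mask: int, banks_per_bit: int) -> int:
    """Count enabled banks when every set bit in the mask enables `banks_per_bit` banks."""
    return bin(bank_mask).count("1") * banks_per_bit


def l3_size_kib(bank_mask: int, banks_per_bit: int, bank_size_kib: int) -> int:
    """Total L3 size in KiB, assuming all banks have the same size."""
    return enabled_l3_banks(bank_mask, banks_per_bit) * bank_size_kib


MASK = 0b1111        # hypothetical: four mask bits, all set
BANK_SIZE_KIB = 512  # hypothetical bank size

one_bank_per_bit = l3_size_kib(MASK, banks_per_bit=1, bank_size_kib=BANK_SIZE_KIB)
two_banks_per_bit = l3_size_kib(MASK, banks_per_bit=2, bank_size_kib=BANK_SIZE_KIB)

print(one_bank_per_bit, two_banks_per_bit)
```

With the same number of mask bits and the same bank size, two banks per bit doubles the total. If the banks were halved in size at the same time, the total would stay flat, which is the "different organization" case.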

marees · Jan 29, 2026

post the patent news in this thread @MrMPFR

DavidC1 · Jan 29, 2026

FlameTail said:
Tom Peterson is asked why Arc Battlemage doesn't have SIMT, and he hints that a future Xe architecture (Xe3? Xe4?) might add it.

Since when did Intel not have SIMT? Up until Broadwell graphics, they were using a SIMD/SIMT hybrid architecture and switched between the two on the fly. On Skylake they moved to SIMT and saved transistors on it. If they are not on SIMT anymore, it means they would have changed back. SIMT is what GPUs use and SIMD is what CPUs use.

MrMPFR · Jan 30, 2026

marees said:
post the patent news in this thread @MrMPFR

Think I might have found Xe3P's RT core design. This guy seems to be the HW RT architect.

Apparatus and method for manageable fragmented acceleration structures
- From <https://patents.google.com/patent/US20250190277A1>
Apparatus and method for implementing a bounding volume hierarchy with oriented bounds using quantized shared orientations
- From <https://patents.google.com/patent/US20250299411A1>
Apparatus and method for using multiple bounds for child nodes in a bounding volume hierarchy
- From <https://patents.google.com/patent/US20250299420A1>
Apparatus and method for block-friendly ray traversal
- From <https://patents.google.com/patent/US20250308128A1>
Apparatus and Method for Extended Cache Control for Workloads using Temporary or Scratch Memory Space
- From <https://patents.google.com/patent/US20250307155A1>
Apparatus and method for throttling ray tracing operations based on cache hit rate
- From <https://patents.google.com/patent/US20250308135A1>

Haven't had the chance to read the patents properly. @basix what do you think?

marees · Jan 30, 2026

MrMPFR said:
Think I might have found Xe3P's RT core design. This guy seems to be the HW RT architect.

Will go through them. Are you sure these are for Xe3P and not Xe4P (or Xe5P etc.)?

There is a rumour from the dark satanic interwebs that Xe3P will be purely iGPU-based but Xe4P could come as discrete GPUs.

MrMPFR · Jan 30, 2026

marees said:
Will go through them. Are you sure these are for Xe3P and not Xe4P (or Xe5P etc.)?

There is a rumour from the dark satanic interwebs that Xe3P will be purely iGPU-based but Xe4P could come as discrete GPUs.

Based on filing dates (early 2024) Xe3P is most likely but it's possible it's Druid.

MrMPFR · Jan 30, 2026

MrMPFR said:
Think I might have found Xe3P's RT core design. This guy seems to be the HW RT architect.

Patent contributions:

BVH consisting of fragmented, independently managed chunks that are very flexible. Mechanism for tracking, updating and traversing across fragments.
OBBs with a shared, quantized orientation set. Reining in bounding-box orientations makes them compressible and reduces overhead.
Union-bounded child nodes (Children 0...N/box). Tight coverage of complex/diagonal/curved geometry without exploding node count.
HW-accelerated BVH node block allocation (co-located parent and child nodes). Strict policy to reduce traversal misses and pointer chasing.
Per-workload dirty-state cache control for workloads flooding into the scratchpad. Avoids redundant writebacks and reloads.
Throttling ray dispatch operations based on cache hit rate for RT memory operations such as BVH node loads, ray state loads, stack accesses, etc. Avoids cache thrashing.
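
To illustrate the shared, quantized orientation idea from the list above, here is a toy 2D sketch. It is not the method from the patent, just one way to picture it: a small table of allowed box orientations is shared by all nodes, and each node stores only an index into that table instead of a full rotation. The table size, the sample points and all function names are made up.

```python
import math


def make_orientation_table(count: int) -> list[float]:
    """Shared table of `count` evenly spaced box angles in [0, pi/2)."""
    return [i * (math.pi / 2) / count for i in range(count)]


def box_area_at(points: list[tuple[float, float]], angle: float) -> float:
    """Area of the tightest box around `points` whose axes are rotated by `angle`."""
    c, s = math.cos(angle), math.sin(angle)
    us = [x * c + y * s for x, y in points]   # coordinates along the box's first axis
    vs = [-x * s + y * c for x, y in points]  # coordinates along the box's second axis
    return (max(us) - min(us)) * (max(vs) - min(vs))


def best_orientation_index(points: list[tuple[float, float]], table: list[float]) -> int:
    """Index of the table orientation that gives the smallest box."""
    return min(range(len(table)), key=lambda i: box_area_at(points, table[i]))


# A thin diagonal sliver, the worst case for an axis-aligned box.
points = [(float(t), t + 0.05 * (-1) ** t) for t in range(10)]

table = make_orientation_table(8)  # 8 shared orientations -> a 3-bit index per node
index = best_orientation_index(points, table)
aabb_area = box_area_at(points, 0.0)
obb_area = box_area_at(points, table[index])
print(index, aabb_area, obb_area)
```

For diagonal geometry like this, the quantized oriented box ends up far smaller than the axis-aligned one, while the per-node cost is only a few index bits.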

While this is not RDNA5-level RT progress, it's still significant. Should help either Xe3P (most likely) or later Intel GPUs run ray tracing a lot faster.
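
For the last item in the list (throttling ray dispatch based on cache hit rate), a simple software sketch of the general idea follows. This is an assumption about how such a feedback loop could look, not what the patent describes: the thresholds, the halve/double policy and the class name are all hypothetical.

```python
class RayDispatchThrottle:
    """Toy feedback loop: shrink the ray dispatch budget when the cache hit rate drops."""

    def __init__(self, low_hit_rate: float = 0.5, high_hit_rate: float = 0.8,
                 min_rays: int = 8, max_rays: int = 64) -> None:
        self.low_hit_rate = low_hit_rate    # below this, assume the cache is thrashing
        self.high_hit_rate = high_hit_rate  # above this, there is headroom for more rays
        self.min_rays = min_rays
        self.max_rays = max_rays
        self.rays_per_cycle = max_rays      # start unthrottled

    def update(self, hits: int, accesses: int) -> int:
        """Adjust the budget from the last interval's RT cache counters and return it."""
        if accesses == 0:
            return self.rays_per_cycle      # no data, keep the current budget
        hit_rate = hits / accesses
        if hit_rate < self.low_hit_rate:
            self.rays_per_cycle = max(self.min_rays, self.rays_per_cycle // 2)
        elif hit_rate > self.high_hit_rate:
            self.rays_per_cycle = min(self.max_rays, self.rays_per_cycle * 2)
        return self.rays_per_cycle


throttle = RayDispatchThrottle()
# Made-up (hits, accesses) samples: two thrashing intervals, then recovery.
for hits, accesses in [(30, 100), (30, 100), (90, 100)]:
    print(throttle.update(hits, accesses))
```

Fewer rays in flight means fewer BVH nodes, ray states and stacks competing for the same cache lines, which is how throttling can avoid thrashing.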
