Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment · Sep 29, 2022

Speculate at will

biostud · Oct 5, 2024

Josh128 said:
This thread is so dead, not even some X3D rumors posted hours ago have made it here yet.

TLDR, Cinebench R23 (ST / MT):
9800X3D: 2145 / 23315
9950X3D: 2245 / 42375

Indicates ST boost clock speed of ~5.3GHz for 9800X3D, ~5.6GHz for 9950X3D. Very impressive for the 9950 if that score was achieved with an X3D die.
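
A quick consistency check on that estimate, as a rough sketch only: it assumes the single-thread score scales linearly with boost clock on the same core design (a simplification that ignores cache and memory effects) and uses nothing but the rumored figures quoted here. The function name is made up for illustration.

```python
# Rumored Cinebench R23 figures from the leak quoted above (unconfirmed).
scores = {
    "9800X3D": {"st": 2145, "mt": 23315},
    "9950X3D": {"st": 2245, "mt": 42375},
}

def implied_clock_ghz(st_score, ref_score, ref_clock_ghz):
    """Scale a reference clock by the ST score ratio (assumes linear scaling)."""
    return ref_clock_ghz * st_score / ref_score

# Take the ~5.3 GHz estimate for the 9800X3D as the reference point.
clock_9950 = implied_clock_ghz(
    scores["9950X3D"]["st"], scores["9800X3D"]["st"], ref_clock_ghz=5.3
)
mt_ratio = scores["9950X3D"]["mt"] / scores["9800X3D"]["mt"]

print(f"Implied 9950X3D ST clock: {clock_9950:.2f} GHz")
print(f"9950X3D / 9800X3D MT ratio: {mt_ratio:.2f}x")
```

Under that assumption the two ST scores are roughly consistent with the ~5.3 GHz and ~5.6 GHz guesses, and the MT ratio lands a bit below 2x for twice the cores.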

AMD Ryzen 9000X3D "Rumored" Performance Figures Reveal Faster Multi-Threaded & Slightly Slower Single-Threaded Numbers Versus Non-X3D CPUs

AMD's Ryzen 9000X3D CPUs are coming and it looks like they will boast faster multi-core performance than the non-X3D chips.

wccftech.com

Or more likely they are still heterogeneous....

techjunkie123 · Oct 5, 2024

biostud said:
Or more likely they are still heterogeneous....

Not on the 9800X3D...

poke01 · Oct 6, 2024

https://twitter.com/x/status/1842898843634892825

Good video, it’s the first die shot but unconfirmed

Det0x · Oct 6, 2024

*edit*
Already posted above

igor_kavinski · Oct 6, 2024

Something about ASUS Nitropath:

Lemme engage my AI assistant to understand what he's saying (sorry, short attention span prevents me from digesting info heavy videos!)...

[Music]

Hi and welcome back to a new video. Last week I was visiting ASUS HQ in Taiwan and I was discussing the Nitro Path memory feature with them. This was already released and announced during the previous Gamescom in August. I was hosting a live stream, at least for my German YouTube channel, and I was discussing it briefly there. At that point I wasn't sure if that's just the typical marketing blah blah or if it's actually doing something. ASUS was advertising that it allows you to increase your memory speed by about 400 megatransfers. Discussing this with the engineers, it was much more technically interesting than I initially thought, and we will look at this in today's video.

Are you looking for a strong and reliable hosting partner? Then Hetzner is the right place for you. As a leading hosting provider with their own high-tech data center, Hetzner offers GDPR-compliant hosting at incredibly low prices. Hetzner operates several hundred thousand servers at multiple locations in Europe and the USA, and most recently also in Singapore. Hetzner products impress above all with their outstanding price-performance ratio. The secret behind this: simple and functional solutions, a focus on core features and constant optimization. Click on the link in the description and discover more about Hetzner.

This slide was used by ASUS to advertise this during Gamescom this year. On the left side you can see a conventional memory slot. Everything you currently have at home is using exactly this technology. On the right side you can see the Nitro Path memory slot, with those bent contact pins that then make contact with the memory module in the middle.

Last week we were using an, let me phrase it, upcoming motherboard that only has two DIMM slots and is focused on memory overclocking (so I'm not getting into trouble). This motherboard, the upcoming one for Intel, didn't have the Nitro Path feature, which didn't make sense to me, because I thought this Nitro Path feature, the new memory DIMM slots, would enhance memory overclocking. So why would an upcoming memory-focused overclocking board not have it? This one, the X870E Hero, a four-DIMM board for AMD, has it, and AMD is typically not running such a high memory speed, but the Intel board didn't have it. That was quite odd.

Then I asked the R&D team what the reason for this is, and they set up two comparison platforms. They brought two X870-A Gaming WiFi motherboards, which are mid-range four-DIMM motherboards: one with conventional memory slots and one with the Nitro Path memory slots. The first setup was with an AMD 8700G, because those APUs can currently run higher memory speeds than, for example, Ryzen 9000. On the conventional motherboard it was maxing out at 8,200 megatransfers: at that speed you could still boot, but in Windows, in MemTest, you would get errors. Then they swapped the same CPU and the same memory DIMMs onto the Nitro Path motherboard, and 8,600 megatransfers would run MemTest stable.

So that was again the moment when I asked myself: okay, that's cool, but why is it not on the upcoming two-DIMM motherboard that is focused on memory overclocking? That's because I had a wrong perception of what this feature actually does.

When it comes to the memory layout, most motherboards these days work exactly the same. Desktop CPUs usually have two memory channels, channel A and channel B. That's also why you can find the labels on here, like DIMM A1, A2, B1 and B2. Two of the slots are always daisy-chained: because it only has two channels, the first two slots go to one channel and the B slots go to the second channel. That's why, when you occupy A2 and B2, you always see this running in dual-channel mode. Now if you occupy this with two memory DIMMs, like right here, you still have the two empty slots, and these are still electrically connected to the CPU, and there are still signals arriving. There is no electrical switch in between that would disconnect them if you don't plug anything in; that's not the case.

When this Nitro Path feature was announced, I just thought: okay, it's a different mechanical mechanism of how the contact pins make contact with the DIMM itself, and because this also has a little bit higher mounting pressure than a conventional DIMM slot, that this is the real benefit. But that's actually not the case. It was just about the empty slots, not about the occupied slots, which is what I had completely wrong.

If you want to measure the signal quality of such a memory slot, you can do that with an eye chart. If you look at the binary signal, in its original form it's like a rectangle shape, but if it's very high speed, like here between, let's say, 3,000 and 5,000 MHz, it becomes much more sine-shaped, and that's because of a lot of interference and things in between. With an eye chart you can identify if the signal is still clean enough or if it's already overlapping and you will get some errors.

This is one of the charts where you can see it. With the blue and green lines we can also see state zero and state one: zero at 0 volts and one at 0.4 volts. Due to this very high clock you can see that this is no longer the typical rectangular theoretical form but like a sine shape because of the high speed. In between there is this hole, and that's what they call the eye, marked in red. Depending on the size of this eye, that is, the width in picoseconds (the time) and the height in millivolts (the voltage), you can determine how good the signal quality is. The bigger the eye the better, because you won't have overlapping signals.

This example we're looking at right now was also provided by ASUS, and this is running with 6,400 megatransfers, which equals 3,200 MHz. We can see on the left side config one, which is testing the A2 memory slot, and config two on the bottom is testing the B2 memory slot. The old connector is the conventional memory slot and the new connector on the right is the Nitro Path memory slot. But here, in this scenario with 6,400 megatransfers, which is 3,200 MHz, there is little to no difference, so at this speed it's not really relevant.

That's why we're now switching over to 5,000 MHz, equaling 10,000 megatransfers, and here it looks quite a bit different. The traditional memory slot doesn't have this eye in the center anymore, so you will have some interference, some overlapping signals, which will lead to errors or even mean that you're not able to boot at this kind of speed. Then again, on the right side with the Nitro Path slot, you can see a significant difference. The eye became a lot smaller compared to the previous example with 6,400 megatransfers, but it's still visible. I'm not sure if this state is usable with the 10,000 megatransfers, but you can at least see a clear physical difference between the two slots.

Now going back to the beginning: I originally thought that those different memory slots would help because they have a different shape and make better electrical contact with the modules, for example. But that is not the case. It's about the empty memory slots. Each of the memory slots contains 288 pins. It was the same for DDR4 and it's still the same for DDR5, just with the fact that now we're approaching very high speeds, let's say 10,000 megatransfers with DDR5. That is just very, very high frequency and will need very good signaling quality. Each of these pins that are standing out from the memory slots also acts like an antenna, especially in the non-occupied slots. So every non-occupied slot is interfering with the occupied slot right next to it. The higher you go in frequency, the more relevant this problem becomes: those empty slots and the empty pins act as an antenna and have a bad signal impact on the memory DIMM right next to it.

Now ASUS did a lot of testing on this, and they had some very cool images which they also gave to me. What they did was make PCBs with memory slots sitting on top, also with memory DIMMs inserted. Then they put everything into epoxy, hardened it, cut through it, and sanded and polished everything, so you get a very nice cross-section to see how this looks in reality.

The left image shows the cross-section of a conventional memory slot where there is no memory module installed yet. The dark area is the plastic of the socket, and in the center you can see the contact pin, which has this rounded shape that eventually makes contact with the memory DIMM itself. That's what you can see on the right image: again a conventional slot, but with the memory module installed. The module you can see in the center has a quite thick PCB, and you can also see that it's a multi-layer PCB. It has this copper contact on the left and right, which forms contact with the pins of the slot, which are forced out a little bit to the left and right because the module was inserted.

Now switching over to the Nitro Path memory slot: left side again without memory module, and on the right side with the module inserted. That's quite interesting, because this is a little bit different from what they showed in the original marketing slides. It could be that this was some kind of technically more relevant and probably important feature that they wanted to hide a little bit, so thanks for providing those cross-sections anyway. You can see that the pin was bent down; it is a lot shorter than in the conventional slot, by about 40%, and only this bent section is then making contact with the memory DIMM itself.

So to recap: it's mainly a feature that helps on the empty DIMM slot and not on the occupied one. That's also why you probably won't find this on something like an MSI Unify-X or an ASUS Apex motherboard, because it doesn't really help if it's just a two-DIMM motherboard. That's why I was just completely wrong.

We recently covered something similar in a video where I was talking about how the guaranteed memory speed advertised, for example, by Intel doesn't depend on your actual memory configuration but rather on the motherboard configuration. They were listing, for example, I think 14th gen at 5,600 megatransfers, and that is only for a two-DIMM motherboard like an Apex. Whenever you use something like a Hero or Extreme, which has four slots, then officially it was something like 3,600 or 4,200, like bizarrely low. I was wondering if that's correct, but it is indeed correct, because it's mainly the motherboard that limits how high you can clock your memory, rather than the actual memory configuration. That's why, if you're looking for very high memory speed, a two-DIMM motherboard will always be the way to go.

The cool thing is also that this won't stay an ASUS-exclusive feature. If I understood it correctly, they developed it together with the slot manufacturer, I think it was Lotes, and now it will be exclusive to ASUS for about a year. Afterwards it's going to be an open standard and everybody can make use of it. So I'm pretty sure we will also find it on MSI, Gigabyte or ASRock motherboards, and that should definitely help memory clocking in the future. I just wanted to cover this because I underestimated how problematic just the empty memory slot could be. I hope you enjoyed this video. See you next time, bye-bye.

[Music]

In this video, the host discusses their visit to ASUS HQ in Taiwan, specifically focusing on the company's "Nitro Path" memory feature, which was previously announced and released during Gamescom in August. Initially, the host expressed skepticism about whether this feature did anything beyond marketing hype. ASUS claimed that Nitro Path could increase memory speed by approximately 400 megatransfers.
The host found the discussions with ASUS engineers to be technically intriguing and aims to explain the technology in more detail in the video. They promote Hetzner as a reliable hosting partner providing GDPR-compliant hosting and good price performance.
The video discusses two types of memory slots: conventional and Nitro Path. Conventional memory slots, commonly used, are shown alongside the Nitro Path design, which features bent contact pins. The host mentions using an upcoming desktop motherboard with two DIMM slots focused on memory overclocking, which did not include the Nitro Path feature—this was confusing because the host assumed it would enhance memory overclocking capabilities.
To understand the differences, the ASUS R&D team set up comparative platforms using two mid-range motherboards: one with conventional memory slots and the other with Nitro Path slots, both using an AMD 8700G APU. They recorded that the conventional board maxed out at 8,200 megatransfers, leading to errors, while the Nitro Path setup was stable at 8,600 megatransfers.
The host realizes that their initial assumption about Nitro Path was incorrect, believing it only affected occupied slots rather than empty slots. Most motherboards utilize two memory channels, which leads to dual-channel configurations when two slots are filled. The host initially thought Nitro Path improved the contact mechanism and signal quality but later learned it primarily helps manage the signal quality of empty slots that can interfere with occupied slots.
Signal quality is assessed using an "eye chart," which visually represents the signal integrity. A clean rectangular shape represents a good signal, while a sinusoidal shape indicates interference. The host emphasizes the significance of the "eye" size for signal quality, noting that larger eyes indicate better conditions without overlapping signals.
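
To make the "bigger eye is better" point concrete, here is a toy sketch of how the vertical eye opening shrinks as the data rate rises. It is not ASUS's measurement: the channel is modelled as a simple first-order low-pass filter, the 3 GHz bandwidth is a hypothetical value picked for illustration, and real DDR5 links also involve reflections, crosstalk and equalization. Only the 0 V / 0.4 V signal levels come from the video.

```python
import math
import random

def eye_height_volts(rate_mtps, bandwidth_ghz, swing_v=0.4, n_bits=2000, seed=1):
    """Vertical eye opening at the sampling instant for random data sent
    through a first-order low-pass channel (an assumed toy model)."""
    rng = random.Random(seed)
    bit_time_ns = 1e3 / rate_mtps                 # one bit per transfer
    tau_ns = 1.0 / (2 * math.pi * bandwidth_ghz)  # RC time constant
    decay = math.exp(-bit_time_ns / tau_ns)       # fraction of the old level left after one bit
    level = 0.0
    lowest_one, highest_zero = swing_v, 0.0
    for _ in range(n_bits):
        bit = rng.randint(0, 1)
        target = swing_v * bit
        # Exact RC response, sampled at the end of the bit period.
        level = target + (level - target) * decay
        if bit:
            lowest_one = min(lowest_one, level)
        else:
            highest_zero = max(highest_zero, level)
    return lowest_one - highest_zero  # <= 0 would mean the eye is closed

for rate in (6400, 10000):
    height_mv = eye_height_volts(rate, bandwidth_ghz=3.0) * 1000
    print(f"{rate} MT/s: eye height about {height_mv:.0f} mV")
```

With the same assumed channel, the faster rate leaves less time for the signal to settle before the next bit, so the worst-case "1" and worst-case "0" move toward each other and the eye gets smaller.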
Examples were provided to illustrate the differences in signal quality between conventional and Nitro Path slots at various speeds. The host highlights that with higher speeds (like 10,000 megatransfers), the conventional slots demonstrate poor signal integrity, causing issues that may prevent proper operation, while Nitro Path slots maintain a better signal, even if it diminishes at these higher speeds.
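
For reference, the speeds mentioned in the video convert as follows. This is a small sketch using only the standard DDR relationship (two transfers per clock cycle); the function name is made up for illustration.

```python
def ddr_timing(rate_mtps):
    """Return (clock in MHz, time per bit in ps) for a DDR data rate in MT/s."""
    clock_mhz = rate_mtps / 2            # DDR moves two transfers per clock cycle
    unit_interval_ps = 1e6 / rate_mtps   # time window available for each bit
    return clock_mhz, unit_interval_ps

for rate in (6400, 8200, 8600, 10000):
    clock, ui = ddr_timing(rate)
    print(f"{rate:>6} MT/s -> {clock:.0f} MHz clock, {ui:.1f} ps per bit")
```

The time window per bit is the upper bound on the eye width, which is why the eye is measured in picoseconds and why it tightens so quickly toward 10,000 MT/s.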
Further, the host discusses the construction of the memory slots, using cross-sections that ASUS prepared by casting slots in epoxy and cutting through them. In a conventional slot the contact pin is long and rounded, whereas the Nitro Path pin is bent down and about 40% shorter, so only the bent section makes contact with the memory module.
Ultimately, the host concludes that the Nitro Path feature might not be essential in two-DIMM motherboards designed for higher memory speeds; hence this feature might not be seen on certain models. However, they learned that the technology used in Nitro Path was developed in collaboration with a slot manufacturer and will eventually become an open standard, leading to broader adoption across different motherboard brands in the future.
In summary, the main takeaway is that the Nitro Path memory feature addresses signal quality issues stemming from unoccupied memory slots—a factor often overlooked but crucial for high-performance memory configurations. The host expresses an appreciation for the complexity and implications of this technology, inviting viewers to consider it more deeply.

Ranulf · Oct 6, 2024

So another gimmick that one should wait 1-2 years minimum before buying into?

igor_kavinski · Oct 6, 2024

Ranulf said:
So another gimmick that one should wait 1-2 years minimum before buying into?

I would call it progress rather than a gimmick since it improves signal integrity. Yes, initially it will only be available on ASUS 4 dimm slot mobos but hopefully in 2025 we can have it on other vendors' mobos too.

Seba · Oct 6, 2024

Non-AI summary (by me):

The pins from unoccupied memory slots (slots which are still receiving signals from the CPU) act like antennas, emitting an electromagnetic field. That electromagnetic field causes interference with the signals to the occupied memory slots.

ASUS changed the shape of the pins, claiming that the modified shape reduces the EM field generated from the pins from unoccupied memory slots.

This change allows for higher memory frequency (while the PC still remains stable).

Joe NYC · Oct 6, 2024

poke01 said:
https://twitter.com/x/status/1842898843634892825

Good video, it’s the first die shot but unconfirmed

Interesting video, which adds some facts from the die shots to what was just pure speculation in this thread.

My speculation, not seeing the TSV area in the first die shots, was that perhaps AMD spread the TSVs over some other areas of the CCD die, and that perhaps AMD was going to cover the whole bottom die with V-Cache.

High Yield video is speculating that perhaps the opposite could be taking place, that AMD will cut down the size of V-Cache die and make it 2 layers.

If his hypothesis is correct, this could be the reason why the clock speed regression of V-Cache chips is expected to be smaller or eliminated - because the cooling area over the cores will be minimally obstructed.

It would seem like a complicated way to achieve just that, it seems that there could have been other ways to achieve the same.

Going as far as implementing 2 layers for no gain in overall size of the cache seems like a lot of work for a small benefit. The only way I could see this as worthwhile would be if it was a solution that can have more layers beyond 2 (such as 4 or 8)

Doug S · Oct 6, 2024

Seba said:
Non-AI summary (by me):

The pins from unoccupied memory slots (slots which are still receiving signals from the CPU) act like antennas, emitting an electromagnetic field. That electromagnetic field causes interference with the signals to the occupied memory slots.

ASUS changed the shape of the pins, claiming that the modified shape reduces the EM field generated from the pins from unoccupied memory slots.

This change allows for higher memory frequency (while the PC still remains stable).

Or they could treat it like everyone else treats unterminated transmission lines, and create some sort of dummy terminator DIMM to put in those slots. Bet that would do an even better job than reshaping the pins, and they could sell those to every overclocker out there with not just Asus boards but every make of board!

If I had the connections in the industry and wanted to go back to having to work hard for a while, I'd develop, patent, and productize this myself lol
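
One way to picture the unterminated-line view is the textbook quarter-wave resonance of an open-ended stub. The sketch below is only an illustration under loud assumptions: it treats a slot pin on its own as the stub (in reality the trace and the whole empty slot contribute), and the 5 mm pin length and effective permittivity of 3.5 are hypothetical values, not figures from ASUS or the video. The only number taken from the video is that the reshaped pin is about 40% shorter.

```python
import math

C_MM_PER_NS = 299.792458  # speed of light in mm/ns, so results come out in GHz

def quarter_wave_resonance_ghz(stub_length_mm, eps_eff):
    """First resonance of an open-ended stub: f = c / (4 * L * sqrt(eps_eff))."""
    return C_MM_PER_NS / (4 * stub_length_mm * math.sqrt(eps_eff))

PIN_LENGTH_MM = 5.0   # hypothetical conventional pin length
EPS_EFF = 3.5         # hypothetical effective permittivity

conventional = quarter_wave_resonance_ghz(PIN_LENGTH_MM, EPS_EFF)
shortened = quarter_wave_resonance_ghz(PIN_LENGTH_MM * 0.6, EPS_EFF)  # ~40% shorter

print(f"conventional pin: ~{conventional:.1f} GHz")
print(f"shortened pin:    ~{shortened:.1f} GHz ({shortened / conventional:.2f}x higher)")
```

Whatever the real dimensions are, shortening a stub by 40% pushes its first resonance up by a factor of 1/0.6, further away from the signalling frequencies. A terminator DIMM would attack the same problem by absorbing the energy instead of moving the resonance.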

fastandfurious6 · Oct 6, 2024

wait when L3 height is higher than the rest of ccd how is it properly cooled 😱 how many stacks can they do??

lightmanek · Oct 6, 2024

fastandfurious6 said:
wait when L3 height is higher than the rest of ccd how is it properly cooled 😱 how many stacks can they do??

AMD's solution (TSMC's really) is up to 12Hi stack (old AMD EPYC server BIOS was showing options of up to 4 stacks on Zen 3 die).

My alternative theory is that instead of going with multiple stacks to make L3 die small enough, they productise it using TSMC 3nm process (which claims 1.2x scaling for SRAM vs N5).
Just a shot in the dark, but broken clock can be right from time to time.
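
A back-of-the-envelope comparison of the two ideas, as a sketch only: it assumes the cache capacity stays the same, that area scales inversely with layer count and SRAM density, and it ignores TSV and periphery overhead. The 1.2x figure is the scaling claim quoted above; the function name is made up.

```python
def relative_footprint(layers=1, density_gain=1.0):
    """Footprint of a cache stack relative to a single-layer die of the same
    capacity on the baseline node (ignores TSV and periphery overhead)."""
    return 1.0 / (layers * density_gain)

two_layers = relative_footprint(layers=2)            # same SRAM split over two dies
denser_node = relative_footprint(density_gain=1.2)   # 1.2x SRAM density claim

print(f"two layers, same node:  {two_layers:.2f}x footprint")
print(f"one layer, 1.2x denser: {denser_node:.2f}x footprint")
```

On these assumptions a 1.2x density gain alone shrinks the footprint to about 0.83x, well short of the 0.5x that a two-layer split would give.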

RnR_au · Oct 6, 2024

Joe NYC said:
Going as far as implementing 2 layers for no gain in overall size of the cache seems like a lot of work for a small benefit. The only way I could see this as worthwhile would be if it was a solution that can have more layers beyond 2 (such as 4 or 8)

From my understanding adding these layers is a slow process. It's capacity limited. So I don't think going to 2 or even more layers is being considered. And that's before thermal considerations. More layers means thicker silicon pads over the compute area, which means harder to cool.

Kryohi · Oct 6, 2024

lightmanek said:
My alternative theory is that instead of going with multiple stacks to make L3 die small enough, they productise it using TSMC 3nm process (which claims 1.2x scaling for SRAM vs N5).

Isn't zen 4 vcache made on N7/N6? Even going N4 would increase density if that's the case. Though it kinda kills one of the purposes of vcache.

Joe NYC · Oct 6, 2024

RnR_au said:
From my understanding adding these layers is a slow process. It's capacity limited. So I don't think going to 2 or even more layers is being considered. And that's before thermal considerations. More layers means thicker silicon pads over the compute area, which means harder to cool.

There is another packaging process called Wafer on Wafer, where the full wafer of the V-Cache die would get joined together in one step, which would be ~3,500 of the half size V-Cache chips.

The advantage of one by one assembly is that it can be limited to Known Good Die. In case of 2 layers of SRAM chips, the yields are so high and cost is so low that you would not bother to test each individual die (for Known Good Die). The test would be just for assembled 2 layers.

To attach these 2 layers to the CCD could still be done on a Known Good Die basis, one by one...
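
The trade-off between skipping per-die testing and stacking more layers can be sketched with simple compounding. This assumes defects are independent between dies, and the 98% per-die yield is a hypothetical number for illustration, not a TSMC or AMD figure.

```python
def stack_yield(die_yield, layers):
    """Probability that every die in an untested stack is good,
    assuming independent defects per die."""
    return die_yield ** layers

DIE_YIELD = 0.98  # hypothetical per-die SRAM yield

for layers in (1, 2, 4, 8):
    print(f"{layers} layer(s): {stack_yield(DIE_YIELD, layers):.3f}")
```

With a high per-die yield the loss from bonding two untested layers stays small, which is the case made above for Wafer on Wafer; the penalty grows with every extra layer, so testing the assembled stack before attaching it to a known-good CCD still matters.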

Joe NYC · Oct 6, 2024

lightmanek said:
AMD's solution (TSMC's really) is up to 12Hi stack (old AMD EPYC server BIOS was showing options of up to 4 stacks on Zen 3 die).

My alternative theory is that instead of going with multiple stacks to make L3 die small enough, they productise it using TSMC 3nm process (which claims 1.2x scaling for SRAM vs N5).
Just a shot in the dark, but broken clock can be right from time to time.

Going to more advanced nodes (from N7) would increase the cost per bit, while not really adding anything in terms of performance for SRAM.

Also, for practical purposes, TSMC N7 capacity is unlimited, while N5 and N3 are becoming limited.

lightmanek · Oct 6, 2024

Joe NYC said:
Going to more advanced nodes (from N7) would increase the cost per bit, while not really adding anything in terms of performance for SRAM.

Also, for practical purposes, TSMC N7 capacity is unlimited, while N5 and N3 are becoming limited.

I agree, but on the other hand, the main driving force behind V-Cache is the server Epyc line of products, so I don't think cost of manufacturing is critical.

Wonder what is more expensive in the long run - stacking 2 or 4 layers, where each stack brings potential defect or manufacturing smaller die on more expensive process.

Kepler_L2 · Oct 6, 2024

Joe NYC said:
Going to more advanced nodes (from N7) would increase the cost per bit, while not really adding anything in terms of performance for SRAM.

Also, for practical purposes, TSMC N7 capacity is unlimited, while N5 and N3 are becoming limited.

TSMC is not at capacity for any node.

Joe NYC · Oct 6, 2024

Kepler_L2 said:
TSMC is not at capacity for any node.

That has been the case since 2022, but I think now, in H2 2024, N5 and N3 are getting tight. This is just from various tidbits. We will get a more official update in a couple of weeks from TSMC in their investor call.

N7 capacity utilization dropped most significantly since late 2022, and that node has a lot of room left to recover.

Gideon · Oct 6, 2024

Zen 5 annotated die-shot, I/O die included:

https://twitter.com/x/status/1843054429773459889

Full res available here:

Annotations

Repository of CPU, GPU and other chip annotations and high resolution images.

nemez.net

511 · Oct 7, 2024

lightmanek said:
Just bought 9950X, soon will feel the urge to buy 9950X3D

Rich boi

Gideon · Oct 7, 2024

Gideon said:
Zen 5 annotated die-shot, I/O die included:

https://twitter.com/x/status/1843054429773459889

Full res available here:

Annotations

Repository of CPU, GPU and other chip annotations and high resolution images.

nemez.net

There was also this informative video:

Looks like the TSVs for 3D cache are changed a lot

Det0x · Oct 7, 2024

Gideon said:
There was also this informative video:

Looks like the TSVs for 3D cache are changed a lot

Wonder when they will figure out that the world has turned upside down 🧐

Gideon · Oct 7, 2024

Det0x said:
Wonder when they will figure out that the world has turned upside down 🧐

I thought that that might instead indicate that they put the cache chip below the main CCD, but seemed too far-fetched

SteinFG · Oct 7, 2024

I think it just means they over-engineered the first iterations. It was a first-of-its-kind product.
