Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


MS_AT

Senior member
Jul 15, 2024
246
567
96
Good information, y'all!

@MS_AT , thank you for filling me with hope for the 9950X with DDR5-8000 :) Can't wait to see the Phoronix benchmarks for that!

@Asterox , so DDR5-8000 kits will be more beneficial for Zen 5 desktop APUs!

From http://www.numberworld.org/blogs/2024_8_7_zen5_avx512_teardown/:



DDR5-20000 would have about 310 GB/s of bandwidth.

Guess which upcoming part has that much bandwidth on tap?

STRIX HALO!!!! :D

Now if AMD doesn't cut the AVX-512 width in that part, we are in for a REAL performance revolution!
Just keep in mind that you need both CCDs active to see the improvement in memory bandwidth, because each CCD has only one active GMI link to the IOD. The CCD itself has two GMI interfaces, though [at least this was the case with Zen 4, and it was the reason some lower core count SKUs had 2 links enabled, so they could better tap into the socket's bandwidth].

Likewise for Strix Halo, what will be important is how many GMI links are active, as this will decide how much of the overall bandwidth the CPU can use. I think Adroc claimed that 2 links will be active, but until they release the product we won't know for sure.
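
A rough way to picture the link limit described above: the CPU cores can never read faster than the sum of their active links to the IOD, whatever the DRAM itself can deliver. The sketch below is only an illustration. The function name is made up, and the FCLK and bytes-per-clock values are placeholder assumptions, not confirmed figures for Zen 5 or Strix Halo.

```python
def cpu_visible_bandwidth_gbs(active_links, fclk_mhz, bytes_per_clk, dram_peak_gbs):
    """Upper bound (GB/s) on DRAM read bandwidth the CPU cores can reach.

    The cores are capped by whichever is smaller: the combined bandwidth of
    the active CCD-to-IOD links, or the peak bandwidth of the DRAM itself.
    """
    link_gbs = fclk_mhz * 1e6 * bytes_per_clk / 1e9
    return min(active_links * link_gbs, dram_peak_gbs)


# Placeholder assumptions: FCLK 2000 MHz, 32 bytes read per clock per link,
# and 128 GB/s of DRAM bandwidth (dual-channel DDR5-8000).
one_link = cpu_visible_bandwidth_gbs(1, 2000, 32, 128.0)   # link-limited
two_links = cpu_visible_bandwidth_gbs(2, 2000, 32, 128.0)  # DRAM-limited
print(one_link, two_links)
```

With these assumed numbers a single link leaves half of the DRAM bandwidth out of reach, which is the point about needing both CCDs (or both links) active.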
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,340
5,464
136
Not going to happen. Strix Halo might be able to do something like that with its 3nm IOD and new chiplet packaging, but it's designed for LPDDR, so I don't know if it's even technically possible to release it on a desktop package and have it work with DDR5.

Generally most designs can work with LPDDR and DDR.

But not on AM5. Strix Halo needs 256 bit memory, so it needs 4 independent memory slots. AM5 is 128 bit.
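
For reference, the bus width argument can be put into numbers with the usual peak bandwidth formula (transfer rate times bus width in bytes). This is a theoretical peak only, and the 8000 MT/s figure is just an example speed taken from the DDR5-8000 discussion in this thread, not a confirmed Strix Halo spec.

```python
def peak_bandwidth_gbs(transfer_rate_mts, bus_width_bits):
    """Theoretical peak memory bandwidth in GB/s.

    transfer_rate_mts: transfers per second in millions (e.g. 8000 for DDR5-8000)
    bus_width_bits:    total memory bus width in bits
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


am5_128bit = peak_bandwidth_gbs(8000, 128)   # 128-bit bus, as on AM5
wide_256bit = peak_bandwidth_gbs(8000, 256)  # 256-bit bus at the same speed
print(am5_128bit, wide_256bit)
```

At the same transfer rate, the 256-bit bus has exactly twice the peak bandwidth of the 128-bit one, which is why a 128-bit socket cannot match it.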

Maybe some variation of a next-generation Threadripper could support a socketed version of Strix Halo. But even then it would have less bandwidth than the laptop version.
 

Det0x

Golden Member
Sep 11, 2014
1,299
4,234
136

Not very encouraging. Hopefully the EXPO 8000 MT/s kits will have lower timings than the ones for the kit used by Larabel.

I really want AMD to pull some miracle with an updated IOD that allows the X3D chips to use DDR5-8000 in 1:1 mode.
The easiest and largest performance gain for AM5 is maxing tREFI out at 65k for all memory speeds.
Sadly, it seems like very few reviewers understand this 🤷‍♀️
 

Det0x

Golden Member
Sep 11, 2014
1,299
4,234
136
Seems like either Windows 11 or the Nvidia drivers need to update the thread scheduling for Zen 5 (?)
That's at least my take on this data.
During the course of our testing, we observed that Windows 11 was scheduling workloads on the 9700X in a manner that would try to saturate a single core first, by placing workloads on each of its logical threads. Additionally, the placement would put load on the CPPC2 "best" or "second-best" core (gold and silver in Ryzen Master)—which makes sense. However, if a highly demanding single threaded workload runs on one core, scheduling another demanding workload on the second thread of that core will result in lower overall performance. It would be better to place them on two separate cores, where they each have access to the full resources of that core. We hence set out to see if this is an SMT-specific problem.
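
To make the two placement policies in that quote concrete, here is a minimal sketch. It assumes SMT siblings are numbered adjacently (logical CPUs 0 and 1 share core 0, 2 and 3 share core 1, and so on), it ignores the CPPC2 preferred-core ranking, and both function names are made up for illustration. It is not how the Windows scheduler is actually implemented.

```python
def place_packed(n_threads, n_cores, smt_ways=2):
    """Fill both logical threads of a core before moving to the next core."""
    if n_threads > n_cores * smt_ways:
        raise ValueError("more threads than logical CPUs")
    return list(range(n_threads))


def place_spread(n_threads, n_cores, smt_ways=2):
    """Give each thread its own physical core first, then use SMT siblings."""
    if n_threads > n_cores * smt_ways:
        raise ValueError("more threads than logical CPUs")
    placement = []
    for i in range(n_threads):
        core = i % n_cores      # walk across the physical cores
        sibling = i // n_cores  # only move to the next sibling once every core is used
        placement.append(core * smt_ways + sibling)
    return placement


# Two demanding threads on an 8-core, 16-thread CPU:
print(place_packed(2, 8))  # both land on core 0 and share its resources
print(place_spread(2, 8))  # each gets a full core to itself
```

The behavior the review describes corresponds to the packed policy, while the spread policy is what the article argues would be better for two heavy threads.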

[Attached screenshots: 1723305161222.png, 1723305192004.png, 1723305218494.png, 1723305245069.png, 1723305690412.png]
Don't think I've observed this behavior on my 16-core Zen 5
 
Jul 27, 2020
20,040
13,738
146
The easiest and largest performance gain for AM5 is maxing tREFI out at 65k for all memory speeds.
Sadly, it seems like very few reviewers understand this 🤷‍♀️
I think I read in some online posts that higher tREFI can result in hotter DIMMs. Is it possible to do 65536 tREFI without special cooling on DIMMs?
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,513
2,464
136
I think I read in some online posts that higher tREFI can result in hotter DIMMs. Is it possible to do 65536 tREFI without special cooling on DIMMs?
If anything, I'd think it would run the DIMMs cooler. All it does is increase the time between refreshes. Fewer refreshes = less power = cooler, maybe?
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,340
5,464
136
Seems like either Windows 11 or the Nvidia drivers need to update the thread scheduling for Zen 5 (?)
That's at least my take on this data.

That graph shows an improvement of only 1.5% with SMT disabled... Which isn't exactly massive.

Depending on the set of games, sometimes SMT improves the result, sometimes it makes it worse. I doubt this will lead to a rewrite of the Windows scheduler.
 

Det0x

Golden Member
Sep 11, 2014
1,299
4,234
136
That graph shows an improvement of only 1.5% with SMT disabled... Which isn't exactly massive.

Depending on the set of games, sometimes SMT improves the result, sometimes it makes it worse. I doubt this will lead to a rewrite of the Windows scheduler.
Updated screenshots
It's 1.5% at 4K res, higher at lower res

Some more screens
[Attached screenshots: 1723306007475.png, 1723306039783.png, 1723306134327.png, 1723306182768.png, 1723306362685.png]
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,486
2,023
136
I think I read in some online posts that higher tREFI can result in hotter DIMMs. Is it possible to do 65536 tREFI without special cooling on DIMMs?

No, it reduces heat. But it increases sensitivity to heat, which means you have to worry more about cooling.

tREFI is the refresh interval: the average time between refresh commands, which in turn sets how long any single row goes between refreshes. The caps that hold charge in DRAM are leaky, and tREFI needs to be short enough that if a cap has charge, there is still sufficient charge in the cap to sense as 1 just before the refresh triggers. For fairly basic physical reasons, this time is extremely sensitive to temperature. The hotter your modules get, the shorter it needs to be.

Because a bank that is refreshing cannot be accessed, tREFI has a fairly direct effect on average bandwidth and worst-case latency. But because the memory manufacturers are scared of lots of RAM being returned as faulty, the default is set quite low.

The correct way to adjust tREFI is to set it as high as you dare, very exhaustively test memory while also running some torture test on GPU to get your machine hot, and then if the machine passes, pull it down ~25% just to be sure it still passes when your case has more dust and/or the ambient temperature is higher. The common advice of just setting it to 64k scares me.
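
A small sketch of the arithmetic behind this advice. It assumes tREFI and tRFC are both entered in memory-clock cycles, with the memory clock at half the MT/s data rate, and it treats tRFC / tREFI as a first-order estimate of the time lost to refresh. The function names and the example values are illustrative, not recommended settings.

```python
def trefi_microseconds(trefi_cycles, data_rate_mts):
    """Convert a tREFI setting in memory-clock cycles to microseconds."""
    mclk_mhz = data_rate_mts / 2  # DDR: two transfers per memory clock
    return trefi_cycles / mclk_mhz


def refresh_overhead(trfc_cycles, trefi_cycles):
    """Rough fraction of time spent refreshing (tRFC out of every tREFI)."""
    return trfc_cycles / trefi_cycles


def derated_trefi(highest_passing_cycles, margin=0.25):
    """Pull the highest value that passed hot testing down by a safety margin."""
    return int(highest_passing_cycles * (1 - margin))


# Example: tREFI of 65535 cycles on DDR5-6000 (3000 MHz memory clock)
print(trefi_microseconds(65535, 6000))  # about 21.8 microseconds
print(derated_trefi(65535))             # 49151 after the ~25% pull-down
```

Raising tREFI shrinks the overhead ratio, which is where the bandwidth and latency gain comes from, while the derating step keeps some headroom for hotter days and dustier cases.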
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,340
5,464
136
Updated screenshots
It's 1.5% at 4K res, higher at lower res

And at lower res, you can see the same kind of thing for Zen 4 as well. This has been the reality of HT/SMT since it was first introduced decades ago. Sometimes it helps, sometimes it harms, but the overall consensus is to just turn it on and forget it.

A completely reworked core is going to shift SMT behavior somewhat, but there isn't anything that significant going on here, except a whole lot of cope and grasping at straws.
 

StefanR5R

Elite Member
Dec 10, 2016
5,926
8,863
136
Two areas in which Zen 5 notably improves over Zen 4 have been pointed out here:
  • vector arithmetic,
  • web browsers/ JITs and the likes.
Glancing over the TPU review, it appears there is another area:
  • databases.
MySQL TPC-C test (higher is better):
9700X ....... 15,200 TPS (126 %)
7700X ....... 12,900 TPS (107 %)
7700 ........ 12,100 TPS (100 %)
7800X3D ..... 11,700 TPS (97 %)

MongoDB 6, time for 10M requests (lower is better):
9700X ....... 67.5 s (141 %)
7700X ....... 90.0 s (106 %)
7700 ........ 95.4 s (100 %)
7800X3D ..... 98.3 s (97 %)
edit: added relative performance, Ryzen 7700 as baseline
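
For anyone who wants to reproduce the relative numbers, here is a minimal sketch of the calculation, using the TPU figures listed above and the Ryzen 7700 as baseline. The helper name is made up. For the MongoDB test the ratio is inverted, because a shorter time means higher performance.

```python
def relative_percent(results, baseline, higher_is_better=True):
    """Return performance relative to the baseline entry, rounded to whole percent."""
    base = results[baseline]
    relative = {}
    for name, value in results.items():
        # For timings, less is more: divide the baseline time by the measured time.
        ratio = value / base if higher_is_better else base / value
        relative[name] = round(ratio * 100)
    return relative


mysql_tps = {"9700X": 15200, "7700X": 12900, "7700": 12100, "7800X3D": 11700}
mongodb_seconds = {"9700X": 67.5, "7700X": 90.0, "7700": 95.4, "7800X3D": 98.3}

print(relative_percent(mysql_tps, "7700"))
print(relative_percent(mongodb_seconds, "7700", higher_is_better=False))
```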
 

yuri69

Senior member
Jul 16, 2013
541
975
136
Seems like either Windows 11 or the Nvidia drivers need to update the thread scheduling for Zen 5 (?)

Don't think I've observed this behavior on my 16-core Zen 5
Issues like this *might* be caused by a power-saver or bugged power plan/scheduling scheme. "Filling" all physical cores first should be the energy efficient strategy.
Two areas in which Zen 5 notably improves over Zen 4 have been pointed out here:
  • vector arithmetic,
  • web browsers/ JITs and the likes.
Glancing over the TPU review, it appears there is another area:
  • databases.
MySQL TPC-C test:
9700X ....... 15,200 TPS
7700X ....... 12,900 TPS
7700 ........ 12,100 TPS
7800X3D ..... 11,700 TPS
MongoDB 6, time for 10M requests:
9700X ....... 67.5 s
7700X ....... 90.0 s
7700 ........ 95.4 s
7800X3D ..... 98.3 s
Check Phoronix's DB section, the DB gains are nice.
 

Det0x

Golden Member
Sep 11, 2014
1,299
4,234
136
And at lower res, you can see the same kind of thing for Zen 4 as well. This has been the reality of HT/SMT since it was first introduced decades ago. Sometimes it helps, sometimes it harms, but the overall consensus is to just turn it on and forget it.

A completely reworked core is going to shift SMT behavior somewhat, but there isn't anything that significant going on here, except a whole lot of cope and grasping at straws.
Did you miss the following quote ?
During the course of our testing, we observed that Windows 11 was scheduling workloads on the 9700X in a manner that would try to saturate a single core first, by placing workloads on each of its logical threads. Additionally, the placement would put load on the CPPC2 "best" or "second-best" core (gold and silver in Ryzen Master)—which makes sense. However, if a highly demanding single threaded workload runs on one core, scheduling another demanding workload on the second thread of that core will result in lower overall performance. It would be better to place them on two separate cores, where they each have access to the full resources of that core.
This is not how SMT scheduling has worked in the past, nor how it should work
(for reference, check the numbers for the 7700X and how it behaves with SMT on/off as a comparison)

Kinda seems more like it's someone else that's "coping and grasping at straws", as you put it. Why is that?
 

StefanR5R

Elite Member
Dec 10, 2016
5,926
8,863
136
Seems like either Windows 11 or the Nvidia drivers need to update the thread scheduling for Zen 5 (?)
That's at least my take on this data.
During the course of our testing, we observed that Windows 11 was scheduling workloads on the 9700X in a manner that would try to saturate a single core first, by placing workloads on each of its logical threads.
Whoa, I didn't realize that Windows is still that bad. I am forced to use it (Win 10) as application launcher at work, and ignore it as best as I can…
 
Jul 27, 2020
20,040
13,738
146
Some more screens
I love how the situation totally changes with SMT off in Excel and Outlook, arguably two of the most used applications in offices all around the world:

1723304552324.png

7700X with SMT disabled is closer to 9700X SMT disabled in Outlook but loses any chance whatsoever in Excel, which really fills me with delight!

Now just gotta wait for Excel 2024 with AVX-512! :D
 

DaaQ

Golden Member
Dec 8, 2018
1,448
1,044
136
Did you miss the following quote ?

This is not how SMT scheduling has worked in the past, nor how it should work
(for reference, check the numbers for the 7700X and how it behaves with SMT on/off as a comparison)

Kinda seems more like it's someone else that's "coping and grasping at straws", as you put it. Why is that?
He didn't read the article.. hopefully he will. Probably won't.
 

CouncilorIrissa

Senior member
Jul 28, 2023
541
2,120
96
Last edited:
Jul 27, 2020
20,040
13,738
146
Clearly something was very lacking in Zen 4.

9700X SMT disabled performance uplift @1080p:

Baldur's Gate 3 +6.78%

Remnant II +6.67%

Spiderman Remastered +17.17% (!!!!) and RT +15.82% (!!!!)

9700X SMT disabled performance uplift @1440p:

Baldur's Gate 3 +6.69%

Spiderman Remastered +17.22% (!!!!) and RT +18.73% (!!!!)

9700X SMT disabled performance uplift @ 2160p:

Baldur's Gate 3 +7.23%

Spiderman Remastered +5.16% and RT +8.10% (!!!!)

AND

+11.61% CS 1080p minimum fps

+12.63% Remnant II 1080p minimum fps

+60.27% (!!!!) Spiderman Remastered 1080p minimum fps

+22.50% (!!!!) Last of Us 1080p minimum fps


+8.30% Baldur's Gate 3 2160p minimum fps

+31.91% (!!!!) Spiderman Remastered 2160p minimum fps


And people are disappointed.

SMH
 

CouncilorIrissa

Senior member
Jul 28, 2023
541
2,120
96
If these figures are from the TYC review, I wouldn't put too much stock in them yet. I'd wait for someone else to verify them.