Just keep in mind that you need both CCDs active to see the improvement in memory bandwidth, because each CCD has only one active GMI link to the IOD. But the CCD itself has two GMI interfaces [at least this was the case with Zen4, and this was the reason why some lower-core-count SKUs had 2 links enabled, to ensure they could better tap into the socket's bandwidth].
Good information, y'all!
@MS_AT, thank you for filling me with hope for the 9950X with DDR5-8000. Can't wait to see the Phoronix benchmarks for that!
@Asterox, so DDR5-8000 kits will be more beneficial for Zen 5 desktop APUs!
From http://www.numberworld.org/blogs/2024_8_7_zen5_avx512_teardown/:
DDR5-20000 would have about 310 GB/s of bandwidth.
Guess which upcoming part has that much bandwidth on tap?
STRIX HALO!!!!
Now if AMD doesn't cut the AVX-512 width in that part, we are in for a REAL performance revolution!
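For anyone who wants to sanity-check bandwidth figures like these, here is a minimal Python sketch of the usual peak-bandwidth arithmetic (transfer rate times bus width). The function name and the 128-bit dual-channel bus width are my own assumptions, not something taken from the blog post, and the result is a theoretical peak, so it lands in the same ballpark as a quoted figure rather than matching it exactly.

```python
def peak_bandwidth_gbs(mt_per_s: int, bus_width_bits: int = 128) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal units).

    mt_per_s       -- transfer rate in MT/s, e.g. 8000 for DDR5-8000
    bus_width_bits -- total bus width; 128 bits is an assumed
                      dual-channel desktop configuration
    """
    bytes_per_transfer = bus_width_bits / 8
    # MT/s * bytes per transfer = MB/s; divide by 1000 for GB/s
    return mt_per_s * bytes_per_transfer / 1000


if __name__ == "__main__":
    for speed in (6000, 8000, 20000):
        print(f"DDR5-{speed}: {peak_bandwidth_gbs(speed):.0f} GB/s theoretical peak")
```

Pass a different `bus_width_bits` to model a wider memory bus; real-world throughput will be lower than the peak in any case.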
Not going to happen. Strix Halo might be able to do something like that with its 3nm IOD and new chiplet packaging, but it's designed for LPDDR, so I don't know if it's even technically possible to release it on a desktop package and have it work with DDR5.
Easiest and largest performance gain for AM5 is maxing tREFI out at 65k for all memory speeds.
Not very encouraging. Hopefully the EXPO 8000 MT/s kits will have lower timings than the ones for the kit used by Larabel.
I really want AMD to pull some miracle with an updated IOD that allows the X3D chips to use DDR5-8000 in 1:1 mode.
During the course of our testing, we observed that Windows 11 was scheduling workloads on the 9700X in a manner that would try to saturate a single core first, by placing workloads on each of its logical threads. Additionally, the placement would put load on the CPPC2 "best" or "second-best" core (gold and silver in Ryzen Master)—which makes sense. However, if a highly demanding single threaded workload runs on one core, scheduling another demanding workload on the second thread of that core will result in lower overall performance. It would be better to place them on two separate cores, where they each have access to the full resources of that core. We hence set out to see if this is an SMT-specific problem.
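To make the placement policy in that quote concrete, here is a small Python sketch of "fill separate physical cores first, use SMT siblings last". It is only an illustration of the idea, not how the Windows scheduler is implemented; the function name and the 4-core/8-thread topology are hypothetical.

```python
from typing import Dict, List


def place_threads(cores: Dict[int, List[int]], n_threads: int) -> List[int]:
    """Pick logical CPUs for n_threads, one per physical core first.

    `cores` maps a physical core id to its logical CPUs (SMT siblings),
    listed in preference order (e.g. the preferred core first).
    """
    total = sum(len(siblings) for siblings in cores.values())
    if n_threads > total:
        raise ValueError("more threads than logical CPUs")

    placement: List[int] = []
    depth = 0  # 0 = first thread of every core, 1 = second SMT sibling, ...
    while len(placement) < n_threads:
        for siblings in cores.values():
            if depth < len(siblings) and len(placement) < n_threads:
                placement.append(siblings[depth])
        depth += 1
    return placement


# Hypothetical 4-core / 8-thread part: core id -> [thread 0, thread 1]
topology = {0: [0, 1], 1: [2, 3], 2: [4, 5], 3: [6, 7]}

if __name__ == "__main__":
    print(place_threads(topology, 2))  # two heavy threads get separate cores
    print(place_threads(topology, 5))  # only the fifth thread shares a core
```

The behavior described in the quote is the opposite order: both logical threads of the preferred core get loaded before a second core is touched.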





Easiest and largest performance gain for AM5 is maxing tREFI out at 65k for all memory speeds.
Sadly it seems like very few reviewers understand this 🤷♀️
I think I read in some online posts that higher tREFI can result in hotter DIMMs. Is it possible to do 65536 tREFI without special cooling on DIMMs?
Runs fine even on naked DIMMs.
If anything, I'd think it would run the DIMMs cooler. All it does is increase the time between refreshes. Fewer refreshes = less power = cooler, maybe?
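To put a number on "time between refreshes", here is a rough Python sketch that converts a tREFI setting into microseconds. It assumes the BIOS value is expressed in memory clock cycles and that the memory clock is half the transfer rate; the function names and the lower comparison value are hypothetical, so treat the output as an estimate.

```python
def trefi_microseconds(trefi_cycles: int, mt_per_s: int) -> float:
    """Refresh interval in microseconds for a tREFI given in clock cycles.

    Assumes the memory clock in MHz is half the transfer rate (DDR),
    e.g. DDR5-6000 -> 3000 MHz -> 3000 cycles per microsecond.
    """
    memclk_mhz = mt_per_s / 2
    return trefi_cycles / memclk_mhz


def refreshes_per_second(trefi_cycles: int, mt_per_s: int) -> float:
    """How many refresh commands are issued per second at that interval."""
    return 1_000_000 / trefi_microseconds(trefi_cycles, mt_per_s)


if __name__ == "__main__":
    # 8192 is an arbitrary lower setting for comparison, not a known default
    for cycles in (8192, 65535):
        us = trefi_microseconds(cycles, 6000)
        per_s = refreshes_per_second(cycles, 6000)
        print(f"tREFI={cycles} at DDR5-6000: {us:.2f} us, {per_s:,.0f} refreshes/s")
```

A longer interval means fewer refresh commands per second, which is the "fewer refreshes = less power" argument. The catch is that cells have to hold their charge longer between refreshes, and retention gets worse as the DIMMs get hotter.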
Seems like either Windows 11 or the Nvidia drivers need to update the thread scheduling for Zen5 (?)
That's at least my take on this data..
Updated screenshots
That graph shows an improvement of only 1.5% with SMT disabled... which isn't exactly massive.
Depending on the set of games, sometimes SMT improves the result, sometimes it makes it worse. I doubt this will lead to a rewrite of the Windows scheduler.





It's 1.5% at 4K res, higher at lower res.
Works on my machine!
The common advice of just setting it to 64k scares me.
Issues like this *might* be caused by a power-saver or bugged power plan/scheduling scheme. "Filling" all physical cores first should be the energy-efficient strategy.
Don't think I've observed this behavior on my 16-core Zen5.
Check Phoronix's DB section, the DB gains are nice.
Two areas in which Zen 5 notably improves over Zen 4 have been pointed out here:
- vector arithmetic,
- web browsers/JITs and the like.
Glancing over the TPU review, it appears there is another area:
- databases.
MySQL TPC-C test:
9700X ........ 15,200 TPS
7700X ........ 12,900 TPS
7700 .......... 12,100 TPS
7800X3D .... 11,700 TPS
MongoDB 6, time for 10M requests:
9700X ........ 67.5 s
7700X ........ 90.0 s
7700 .......... 95.4 s
7800X3D .... 98.3 s
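As a quick way to turn the MySQL and MongoDB figures above into relative gains, here is a short Python sketch. The numbers are the ones quoted in the post; the helper names are my own. Note that a throughput metric (TPS, higher is better) and a time metric (seconds, lower is better) need the ratio taken in opposite directions.

```python
# Figures as quoted in the post above
mysql_tps = {"9700X": 15200, "7700X": 12900, "7700": 12100, "7800X3D": 11700}
mongodb_seconds = {"9700X": 67.5, "7700X": 90.0, "7700": 95.4, "7800X3D": 98.3}


def gain_higher_is_better(new: float, old: float) -> float:
    """Percent gain for throughput-style metrics (e.g. TPS)."""
    return (new / old - 1) * 100


def gain_lower_is_better(new_time: float, old_time: float) -> float:
    """Percent speedup for time-style metrics (e.g. seconds per run)."""
    return (old_time / new_time - 1) * 100


if __name__ == "__main__":
    for cpu in ("7700X", "7700", "7800X3D"):
        tps_gain = gain_higher_is_better(mysql_tps["9700X"], mysql_tps[cpu])
        time_gain = gain_lower_is_better(mongodb_seconds["9700X"], mongodb_seconds[cpu])
        print(f"9700X vs {cpu}: MySQL +{tps_gain:.1f}%, MongoDB +{time_gain:.1f}%")
```

Against the 7700X that works out to roughly +18% in the MySQL test and roughly +33% in the MongoDB test.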
Video transcoding, source code compilation, ...
How many? And which of these are you running once a month or more often? Thanks in advance.
And at lower res, you can see the same kind of thing for Zen 4 as well. This has been the reality of HT/SMT since it was first introduced decades ago. Sometimes it helps, sometimes it harms, but the overall consensus is just turn it on and forget it.
A completely reworked core is going to shift SMT behavior somewhat, but there isn't anything that significant going on here, except a whole lot of cope and grasping at straws.
Did you miss the following quote?
During the course of our testing, we observed that Windows 11 was scheduling workloads on the 9700X in a manner that would try to saturate a single core first, by placing workloads on each of its logical threads. Additionally, the placement would put load on the CPPC2 "best" or "second-best" core (gold and silver in Ryzen Master), which makes sense. However, if a highly demanding single threaded workload runs on one core, scheduling another demanding workload on the second thread of that core will result in lower overall performance. It would be better to place them on two separate cores, where they each have access to the full resources of that core.
This is not how SMT scheduling has worked in the past, nor how it should work.
Whoa, I didn't realize that Windows is still that bad. I am forced to use it (Win 10) as application launcher at work, and ignore it as best as I can…
Some more screens
I love how the situation totally changes with SMT off in Excel and Outlook, arguably two of the most used applications in offices all around the world.

(for reference, check the numbers for the 7700X and how it behaves with SMT on/off as a comparison)
Kinda seems more like it's someone else that's "coping and grasping at straws", as you put it. Why is that?
He didn't read the article... hopefully he will. Probably won't.
That's a... non-insignificant gain in front-end bound workloads like browsing. I expected less.
If these figures are from TYC review, I wouldn't put too much stock into them yet. I'd wait for someone else to verify that.
Clearly something was very lacking in Zen 4.
9700X SMT disabled performance uplift @1080p:
Baldur's Gate 3 +6.78%
Remnant II +6.67%
Spiderman Remastered +17.17% (!!!!) and RT +15.82% (!!!!)
9700X SMT disabled performance uplift @1440p:
Baldur's Gate 3 +6.69%
Spiderman Remastered +17.22% (!!!!) and RT +18.73% (!!!!)
9700X SMT disabled performance uplift @ 2160p:
Baldur's Gate 3 +7.23%
Spiderman Remastered +5.16% and RT +8.10% (!!!!)
AND
+11.61% CS 1080p minimum fps
+12.63% Remnant II 1080p minimum fps
+60.27% (!!!!) Spiderman Remastered 1080p minimum fps
+22.50% (!!!!) Last of Us 1080p minimum fps
+8.30% Baldur's Gate 3 2160p minimum fps
+31.91% (!!!!) Spiderman Remastered 2160p minimum fps
And people are disappointed.
SMH
So you're buying today, right?
