Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


Saylick

Diamond Member
Sep 10, 2012
3,711
8,391
136
You suppose that his code uses FMA? SSE2 in x64 mode also has 16 registers; the 8-register limit applies only in 32-bit mode.

Intel actually offers great performance monitoring programs, so a developer doesn't need to guess where the code is bottlenecked.
He mentioned GEMM, and GEMM will use FMA if available, so it could be using it. Since SSE2 predates x64, it's hard to say whether his software was compiled for x86 or x64.

But yes, without profiling we can mostly just throw around random guesses ;)
The software was re-compiled to run in x64 mode, but I just double checked my email correspondence with the software vendor from 2017 so I must issue a correction: it utilizes SSE3, not SSE2. I have not reached out to them in recent history to see if they updated the underlying analysis engine to take advantage of newer vector instruction sets (i.e. AVX2), but I have not seen anything in the update logs to suggest they have.

I'm a complete noob when it comes to profiling, but I suppose it can't hurt. I'm guessing this page is a good start? https://www.intel.com/content/www/u...pc-performance-characterization-analysis.html
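
On the SSE3 vs. AVX2 question, one cheap sanity check before any real profiling is to see which of those instruction sets the CPU itself reports. Below is a minimal sketch, assuming a Linux box where `/proc/cpuinfo` is readable; the helper name `simd_support` and the `FLAG_NAMES` table are made up for illustration. Note that it only tells you what the hardware offers, not what the vendor's binary actually uses; that part still needs a profiler or a disassembler.

```python
# Map the instruction sets discussed in this thread to Linux /proc/cpuinfo flag names.
# SSE3 is reported as "pni" (Prescott New Instructions), not "sse3".
FLAG_NAMES = {"SSE2": "sse2", "SSE3": "pni", "AVX2": "avx2", "FMA": "fma"}


def simd_support(cpuinfo_text):
    """Return {feature: bool} based on the first 'flags' line in cpuinfo_text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the first ':' is a space-separated flag list.
            flags = set(line.partition(":")[2].split())
            break
    return {name: flag in flags for name, flag in FLAG_NAMES.items()}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(simd_support(f.read()))
    except OSError:
        print("no /proc/cpuinfo here (not Linux?)")
```

If AVX2 and FMA show up as available but the software still only uses SSE3, that would at least suggest there is headroom the vendor hasn't tapped.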
 

Racan

Golden Member
Sep 22, 2012
1,176
2,164
136

Jan Olšan

Senior member
Jan 12, 2017
430
786
136
What the heck is “X3D Turbo Mode”?



If you go to Gigabyte’s website the page is broken…

My guess: a 105 W TDP mode for non-X3D CPUs. What it does on X3D parts, no idea, but the likeliest explanation is always an OC or relaxed power limits (PBO).
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,983
3,669
136
That was not the number I was expecting... need more games to confirm... but I did theorycraft a reasonable position as to how Zen 5 X3D could see more IPC uplift from the cache than Zen 4.


was my guess
1. Clocks: can they get X3D clocks closer to the base models, relative to Zen 4?
2. Can V-Cache allow Zen 5 to stretch its legs further relative to Zen 4?

chipsandcheese.com

AMD’s Ryzen 9950X: Zen 5 on Desktop

AMD’s desktop Zen 5 products, codenamed Granite Ridge, are the latest in the company’s line of high performance consumer offerings. Here, we’ll be looking at AMD’s Ryzen 9 9…
You can see Zen 5 is both more data-memory bound and more front-end-latency bound than Zen 4. Also, Zen 5 is less retirement bound than Zen 4, and Zen 4 X3D puts noticeably more pressure on retirement.

So Zen 5 X3D could be a banger if both of the above occur.
 

Gideon

Golden Member
Nov 27, 2007
1,875
4,520
136
Am I really seeing 5.6 GHz all-core clocks there? :eek:

That would make the 9950X3D a really exciting CPU to own for a software developer that also games

I know it's PBO and all (thus probably very good cooling coupled with heavy tuning), so the stock clocks might be closer to 5.4 GHz, but it's still very impressive.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,983
3,669
136
BTW, am I blind? I don't see what GPU he is using for the Dawntrail benchmark.

Considering the score and FSR I presume it's RX 7900XTX, but I find it strange he has blurred it out.
Unless it's like a 5090 and it has stupid-high raster performance. I might put on my GPU OCs and see how much it improves the score while leaving the CPU stock.

Nup, this is super heavily CPU bound: my 7900 XTX with my max stable Witcher 3 OC (near 400 W) is only pulling like 300, and max GPU utilisation is 70%.
 

Gideon

Golden Member
Nov 27, 2007
1,875
4,520
136
Unless it's like a 5090 and it has stupid-high raster performance. I might put on my GPU OCs and see how much it improves the score while leaving the CPU stock.
Yeah, that would be interesting to know. There are some runs in the mid-50,000s at 1080p with a 4090:

And our very own @Det0x has a score of 53072 with his absurdly tuned 9950X and RTX 3090 at 1080p:

I wonder what he would score with FSR + 720p?

I mean, it's a super promising result, but I wouldn't mind seeing a comparison to a highly tuned RTX 4090 rig with the same settings.