Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

DisEnchantment · Sep 29, 2022

Speculate at will

Saylick · Aug 10, 2023

igor_kavinski said:
Long term, your company should really be looking at rewriting the software. That's insanely inefficient to depend on bruteforce computing power.

For the record, we didn't write the software. We own licenses from the software company that owns it, so we're at their beck and whim regarding the improvements to the software itself.

We've spent time in the past evaluating other software that accomplishes the same task but can be run more efficiently, e.g. more parallelizable and thus can take advantage of cloud computing, but nothing stood out as a clear winner. FWIW, the speed at which we can run the analysis is typically not the bottleneck for the project because the project schedule give us years to path find and refine the design, which is plenty of time for dozens of iterations. Being able to squeeze in another 20 to 30% more iterations generally doesn't get us a significantly more optimized design. Additionally, being a rather conservative engineering profession (civil), we tend to lean away from overly optimizing the design for the performance objective because the objective when you design for earthquakes is inherently "fuzzy" to begin with. No two earthquakes are the same, and we only design the building using only a suite of eleven to fourteen ground motions that are based off of past earthquakes. If the fault that is near your building ruptures in a way that is completely unexpected from one of the ones you used to analyze the building, your design better be pretty robust and insensitive to the spectral characteristics of the shaking. If it were too sensitive to *how* you shake the building, i.e. if it's shaken one way it's safe but shaken another way it collapses, then it's a bad design. For these reasons, intentionally not overly optimizing the design to just those 11 or 14 ground motions is the right approach in my opinion.

Hitman928 said:
What OS does it support? Buying bleeding edge hardware means you need us to date OS for support but many times these engineering CAD programs are slow to support OS updates. I was in a similar position as you a few years ago and we couldn’t go with the latest hardware because it required an OS version to support the hardware that the software wouldn’t support. Granted, the software probably could have been made to work on the updated OS, but when you are talking multi million dollar projects, people get real antsy at the words, “not officially supported “ showing up in the design resources plan.

Windows. I don't think Turin is so groundbreaking that it requires a new version of Windows.

Hitman928 · Aug 10, 2023

Saylick said:
We've spent time in the past evaluating other software that accomplishes the same task but can be run more efficiently, e.g. more parallelizable and thus can take advantage of cloud computing, but nothing stood out as a clear winner. FWIW, the speed at which we can run the analysis is typically not the bottleneck for the project because the project schedule give us years to path find and refine the design, which is plenty of time for dozens of iterations. Being able to squeeze in another 20 to 30% more iterations generally doesn't get us a significantly more optimized design. Additionally, being a rather conservative engineering profession (civil), we tend to lean away from overly optimizing the design for the performance objective because the objective when you design for earthquakes is inherently "fuzzy" to begin with. No two earthquakes are the same, and we only design the building using only a suite of eleven to fourteen ground motions that are based off of past earthquakes. If the fault that is near your building ruptures in a way that is completely unexpected from one of the ones you used to analyze the building, your design better be pretty robust and insensitive to the spectral characteristics of the shaking. If it were too sensitive to *how* you shake the building, i.e. if it's shaken one way it's safe but shaken another way it collapses, then it's a bad design. For these reasons, intentionally not overly optimizing the design to just those 11 or 14 ground motions is the right approach in my opinion.

Windows. I don't think Turin is so groundbreaking that it requires a new version of Windows.

Do you run inside of VMs? Or do you use Windows server and a bunch of individual sessions?

Saylick · Aug 10, 2023

Hitman928 said:
Do you run inside of VMs? Or do you use Windows server and a bunch of individual sessions?

We use VMs and Windows Server.

naukkis · Aug 10, 2023

Abwx said:
You post schematics without even knowing what it is about to make people think that you are in the know, and you are saying things that are false, i never said that there 3 kind of regulators in Zen, but a single kind, namely capacitors that are charged by high speed switching mosfets and without inductances, learn to read before keeping trolling.

That kind of regulation won't work well when input and output voltages are close to each other - needed capacitance would be way more that can practically implemented in silicon. Zen LDO is just simple linear regulator, there is different resistances connected between Vin and Vout which are controlled by switch transistors - logic just switches on correct amount of resistance between input and output to achieve wanted voltage for any current demand. And efficiency isn't terrible for small voltage drops even with such a simple linear regulation - 20% voltage drop can have about 80% efficiency.

Hitman928 · Aug 10, 2023

Saylick said:
We use VMs and Windows Server.

Should be fine then. It’s a bit more complicated on the Linux side with these types of situations. Hopefully you can hold out for Turin, but Genoa would be a beast itself and a gigantic upgrade over what you have currently.

Markfw · Aug 10, 2023

Saylick said:
We use VMs and Windows Server.

Any chance they could support buying one or 2P 9554's ? I mean, $8000-$10,000 to get 2 9554's on ebay + 384 gig of DDR5 4800 registered(new) + a 2P motherboard(new). For ANY company, they can afford that. Then you can see what it will really do, and if it works out, save the rest for Turin.

Hitman928 · Aug 10, 2023

Markfw said:
Any chance they could support buying one or 2P 9554's ? I mean, $8000-$10,000 to get 2 9554's on ebay + 384 gig of DDR5 4800 registered(new) + a 2P motherboard(new). For ANY company, they can afford that. Then you can see what it will really do, and if it works out, save the rest for Turin.

No decent sized business is going to touch Ebay for major hardware purchases.

Markfw · Aug 10, 2023

Hitman928 said:
No decent sized business is going to touch Ebay for major hardware purchases.

He said it was small. If they won't touch ebay, maybe retail ? It would be a few thousand more that way.

Saylick · Aug 10, 2023

Markfw said:
He said it was small. If they won't touch ebay, maybe retail ? It would be a few thousand more that way.

No, @Hitman928 is right. We would never touch eBay. When I said small, I meant a few hundred people.

Joe NYC · Aug 10, 2023

jpiniero said:
They did have a billion in Operating Income in Client (they don't break that down by Desktop/Notebook).

It seems like the Notebook market is sustaining the entire Intel organization.

Phoenix and Strix have a potential to upend that. After Rembrandt not being a commercial success.

Markfw · Aug 10, 2023

Saylick said:
No, @Hitman928 is right. We would never touch eBay. When I said small, I meant a few hundred people.

OK, well, even if you bought the whole server from Dell or something, one would not be that much. my 2 9554s are world record breaking (its out there, a 2P 9554 config). They are awesome and will convince any sane IT manager.

PJVol · Aug 10, 2023

naukkis said:
And efficiency isn't terrible for small voltage drops even with such a simple linear regulation - 20% voltage drop can have about 80% efficiency.

In practice it looks more like 25-50mv, so the Vdrop or IR losses doesn't seem to be of big concern.
But from my personal little experience, crashes in a certain scenarios looks like di/dt response issue.

Saylick · Aug 10, 2023

Markfw said:
OK, well, even if you bought the whole server from Dell or something, one would not be that much. my 2 9554s are world record breaking (its out there, a 2P 9554 config). They are awesome and will convince any sane IT manager.

I'll see what I can do. I am not sure of the urgency of the IT Department in upgrading, only that we are due. If they tell me it needs to happen, then I'll see if I can get them to only upgrade a portion of the racks rather than the whole lot. I just hope that AMD gives some kind of presentation at CES or at least within Q1 2024 so that I have something concrete to point to when I make my case to wait 6 months for Turin. I can't be pointing to an "obscure internet forum" as my source lol

Markfw · Aug 10, 2023

Saylick said:
I'll see what I can do. I am not sure of the urgency of the IT Department in upgrading, only that we are due. If they tell me it needs to happen, then I'll see if I can get them to only upgrade a portion of the racks rather than the whole lot. I just hope that AMD gives some kind of presentation at CES or at least within Q1 2024 so that I have something concrete to point to when I make my case to wait 6 months for Turin. I can't be pointing to an "obscure internet forum" as my source lol

Have them this https://www.phoronix.com/review/amd-epyc-9654-9554-benchmarks/15 or other reviews like it. Well known and respected.

This one is good also

yuri69 · Aug 10, 2023

Joe NYC said:
It looks encouraging that samples of these are out already.

Compared to Zen 4, desktop Zen 4 launch was ~Sep 2022, Phoenix announcement in Jan 2023, while shipping products only started appearing in volume in ~ June 2023.

It looks like AMD may compress this time with Zen 5 generation.

CES 2024 is in 4 months. They are going to somehow present Strix there since they can't afford breaking the sacred AMD ExecutionTM.

adroc_thurston · Aug 10, 2023

yuri69 said:
They are going to somehow present Strix there

No.
Granite Ridge tho.

igor_kavinski · Aug 10, 2023

Saylick said:
We use VMs and Windows Server.

Not trying to get you to divulge your trade secrets but some form of VM orchestration? Coz running dozens of VMs manually on a single server and switching between them sounds like hell for whoever is assigned that job.

igor_kavinski · Aug 10, 2023

Markfw said:
They are awesome and will convince any sane IT manager.

Don't think there are many of those

Saylick · Aug 10, 2023

igor_kavinski said:
Not trying to get you to divulge your trade secrets but some form of VM orchestration? Coz running dozens of VMs manually on a single server and switching between them sounds like hell for whoever is assigned that job.

I have no idea what VM orchestration even means, but I just know that we have an aggregate amount of CPU cores that are divided up into a handful of VMs, each running Windows. When people want to run these fancy analyses, we can remote desktop into the VM and run the analysis there. The engineers internally coordinate with each other to ensure that each VM isn't getting over-utilized because the analysis slows down a lot when you exceed 1 analysis run per core.

"Hey, are you using VM #1?"
"Yeah, I'm running 8 parallel analysis runs on it."
"Okay, no problem. I'll use VM #2 then. Thanks."

Like that. It's really low-tech, I know.

igor_kavinski · Aug 10, 2023

Markfw said:
Have them this https://www.phoronix.com/review/amd-epyc-9654-9554-benchmarks/15 or other reviews like it.

Won't work on these types of people. They will ask specifically to show how fast THEIR software will run. My company wasted almost two months worth of development costs on a UAT environment in Azure Cloud, running our financial applications, to see if there were any tangible benefits. I was the one doing the benchmarking and the gains weren't that great, considering the money being spent and they could have gotten some really cool Epyc hardware for the same cost of doing that testing but no. In the end, they accepted those gains as good enough and moved production to cloud. I'm not happy about it but at least, it's miles ahead of the Ivy Bridge-E Xeons we had.

igor_kavinski · Aug 10, 2023

Saylick said:
Like that. It's really low-tech, I know.

If the analysis doesn't need to run 24/7, you could just move to Azure/AWS Cloud, get results from the VMs and then shut them down to save on cost. Depending on how much time you save with the faster Epyc VMs, the running monthly cost could be pretty reasonable and your IT team wouldn't need to maintain physical infrastructure.

Saylick · Aug 10, 2023

igor_kavinski said:
If the analysis doesn't need to run 24/7, you could just move to Azure/AWS Cloud, get results from the VMs and then shut them down to save on cost. Depending on how much time you save with the faster Epyc VMs, the running monthly cost could be pretty reasonable and your IT team wouldn't need to maintain physical infrastructure.

Yeah, good points all around. We've considered AWS or Azure in the past, but the cost-benefit analysis didn't pan out. Unfortunately, turning on and off the cloud computing spigot isn't quite as fine grained as you think. If we went with cloud computing, the IT department turns the spigot on from Day X to Day Y, where X and Y are the start and end dates of the design phase. It would not be the engineer turning on a cloud instance and then turning it off when the analysis is complete. As a result, there's a lot of idle time where we'd be paying for the cloud server but not getting anything out of it. It wouldn't be like a kubernetes-esque setup that fires up instances with demand. When we evaluated cloud computing a few years ago, prices weren't that great if you normalized on a per-design iteration basis either. We concluded that it was better TCO to own and replace every 5 to 6 years.

igor_kavinski · Aug 10, 2023

Saylick said:
IT department

I guess they want to keep their jobs. Personally, I would hand a 7945HX laptop to every engineer and let them do their analysis wherever they want and then collaborate in some shared workspace.

Saylick · Aug 10, 2023

igor_kavinski said:
I guess they want to keep their jobs. Personally, I would hand a 7945HX laptop to every engineer and let them do their analysis wherever they want and then collaborate in some shared workspace.

Haha, that's not exactly viable because it can take a few days for the analysis to complete. Our laptops can handle most analyses, just not the nonlinear time history earthquake sims that we use the servers for.

Abwx · Aug 10, 2023

naukkis said:
That kind of regulation won't work well when input and output voltages are close to each other - needed capacitance would be way more that can practically implemented in silicon. Zen LDO is just simple linear regulator, there is different resistances connected between Vin and Vout which are controlled by switch transistors - logic just switches on correct amount of resistance between input and output to achieve wanted voltage for any current demand. And efficiency isn't terrible for small voltage drops even with such a simple linear regulation - 20% voltage drop can have about 80% efficiency.

You have in mind old linear regulators that use a 50-60Hz transformers as voltage source, such devices have the capacitor charged every 10-12ms or so, hence the need for a big capacitor, but think of a 50MHz switching speed, the capacitor can be reduced by a 1 000 000 factor at same output current, and at 500MHz switching speed, wich is a cakewalk for transistors used in CPUs, 10pF is enough where 100uF where necessary at a 50Hz frequency.

When it come to power saving say a core that run at 4GHz/1.2V.

At 2GHz and same voltage it would use 0.5x the power, if you insert a resistance in serial that reduce voltage by 1.5x the CPU willl consume 0.22x less while the resistance will consume 0.11x the power (0.5x the core power), the total is 0.33/0.5 = 0.66x the core power that would be used if voltage wasnt reduced by this resistance.

So even with a mediocre efficency there a significant power saving, that may be counter-intuitive at first but that works, and i must admit that at the start i thought the same as you when i heard of the concept.

Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)

Golden Member

Diamond Member

Diamond Member

Diamond Member

Golden Member

Diamond Member

Moderator Emeritus, Elite Member

Diamond Member

Moderator Emeritus, Elite Member

Diamond Member

Diamond Member

Moderator Emeritus, Elite Member

Senior member

Diamond Member

Moderator Emeritus, Elite Member

Senior member

Diamond Member

Lifer

Lifer

Diamond Member

Lifer

Lifer

Diamond Member

Lifer

Diamond Member

Lifer