Question Asus mobos burning x3D CpuS?

JoeRambo · Apr 30, 2023

IEC said:
Meanwhile I will watch with popcorn using my comfy manual 1.1vSOC.

Famous last words, huh? If I understood that GN video correctly, ASUS motherboards were applying a noticeably higher vSoC even when it was manually set lower.

Hilarious stuff really, this whole "we have two places to set parameters in vendor BIOSes" is the stuff of legends only morons from AMD could come up with.

TheELF · Apr 30, 2023

coercitiv said:
You poor little unsuspecting souls.

So according to what you linked, Intel set up a fund for the RMAs and released a new stepping to fix the issue, the issue being that the signal could degrade after multiple years.
They didn't have a halo product blowing up within days and "fix" it by limiting the BIOS, with no info given on whether they will be able to cover all the RMAs and no info on whether they will really fix the issue in a new stepping.
Yeah, looks the same.

DrMrLordX · Apr 30, 2023

eek2121 said:
Yep, ASUS does this. Watch the video.

It is mind boggling TBH.

I did. Asus seems to be giving +.05v vSoC over that which is reported by software.

CakeMonster said:
Asus released the 1410 BIOS after this video was shot, that was the first BIOS for me that got 1.30v SOC right on Auto, before that DOCP/EXPO would still be at 1.35v.

Just keep in mind that even if the board says it's limiting vSoC to 1.30v, it could be going as high as 1.35v in actuality. Without external hardware testing, there's no way to know for sure.

JoeRambo said:
Hilarious stuff really, this whole "we have two places to set parameters in vendor BIOSes" is the stuff of legends only morons from AMD could come up with.

Keep in mind that appears to be a reference to two different ways of activating EXPO settings, e.g. the ASUS control set 1.35v+ with EXPO on while the AMD CBS menu set a lower vSoC. AMD has had these submenus on their platforms since AM4, and it's up to the motherboard vendors whether or not to expose them. Generally it's a good idea to expose them since some settings like VDDP and VDDG are hidden otherwise.

In any case none of that references users who manually set their vSoC (in lieu of letting the board do it in conjunction with using EXPO).

coercitiv · Apr 30, 2023

DrMrLordX said:
At least one of the reported users lost his CPU and he wasn't even using EXPO.

That's one of the reasons I said that it still feels like we haven't really gotten to the bottom of this.

aigomorla · Apr 30, 2023

To be fair, I do not think anyone on the current team was on the Thunderbird / Duron team.

:T

So they probably assumed materials science had advanced enough that the CPU could tolerate the stress, even though they're using a thinner PCB on the CPU.

aigomorla · Apr 30, 2023

DrMrLordX said:
Just keep in mind that even if the board says it's limiting vSoC to 1.30v, it could be going as high as 1.35v in actuality. Without external hardware testing, there's no way to know for sure.

yup this is called Vdroop.

Voltage droop - Wikipedia (en.wikipedia.org)

this is why i recommended 1.2875V after droop.

IEC · Apr 30, 2023

JoeRambo said:
Famous last words, huh? If I understood that GN video correctly, ASUS motherboards were applying a noticeably higher vSoC even when it was manually set lower.

View attachment 80145

Hilarious stuff really, this whole "we have two places to set parameters in vendor BIOSes" is the stuff of legends only morons from AMD could come up with.

🙄

Going back to the topic.

My comment early in the thread about short-circuits seems oddly prescient...

Two failure modes GN has identified:
1) Zero Ohm complete shorts in cases where CPU died but motherboard is fine (due to OCP/OTP kicking in)
2) Low resistance shorts... this is where motherboard vendors who somehow put inadequate OCP protection in their $700 motherboards (*cough*, ASUS) continue delivering overwhelming power to an already-dead CPU, leading to a catastrophic failure with desoldering of the IHS, audible cracking of the CPU die, and incineration of the motherboard socket.

My comments:
ASUS - OCP protection did not kick in even at 400W+. Fail. Catastrophic fail.
AMD - Make it right for the end users. And rein in your partners (especially ASUS).

Relevant Wendell-ism:

SK10H · Apr 30, 2023

eek2121 said:
Lower your SOC voltage. There is no reason for it to be above 1.3V.

A crappy IOD needs more than just 1.3v to scale, in case you haven't played with one.
Whether it's worth pushing on such die is up to one's preference, but your statement is false.

AdamK47 · Apr 30, 2023

Those who have been running their memory at a magical 6400 or a mythical 6600 are going to have to relent and settle for 6000/6200.

Then again, they may just go YOLO and keep VSOC at crazy levels for their 1% total system performance increase.

Markfw · Apr 30, 2023

Where is vSoC here? I see a VDDCR SOC at 1.25. Is that it?

Markfw · Apr 30, 2023

And where is it here?

DrMrLordX · Apr 30, 2023

aigomorla said:
yup this is called Vdroop.

Voltage droop - Wikipedia (en.wikipedia.org)

View attachment 80149

this is why i recommended 1.2875V after droop.

Seems more like vdroop + LLC. It overshoots during a heavy load rather than immediately after one. Unless I'm getting the wrong impression of what was showing on the multimeter they used for testing in the video.

@Markfw

It should be VDDCR SoC.

IEC · Apr 30, 2023

Markfw said:
Where is vSoC here? I see a VDDCR SOC at 1.25. Is that it?

It's VDDCR_SOC.

Which on at least the handful of ASRock boards tested by GN and Level1Techs sets it to 1.2-1.25V with EXPO, and closer to 1.0V with Auto settings. Unlike ASUS which defaults to overly high SOC voltage.

TheELF · Apr 30, 2023

IEC said:
🙄
View attachment 80146

Going back to the topic.

My comment early in the thread about short-circuits seems oddly prescient...

Two failure modes GN has identified:
1) Zero Ohm complete shorts in cases where CPU died but motherboard is fine (due to OCP/OTP kicking in)
2) Low resistance shorts... this is where motherboard vendors who somehow put inadequate OCP protection in their $700 motherboards (*cough*, ASUS) continue delivering overwhelming power to an already-dead CPU, leading to a catastrophic failure with desoldering of the IHS, audible cracking of the CPU die, and incineration of the motherboard socket.

My comments:
ASUS - OCP protection did not kick in even at 400W+. Fail. Catastrophic fail.
AMD - Make it right for the end users. And rein in your partners (especially ASUS).

Relevant Wendell-ism:
View attachment 80147

So two thirds of the issue is on AMD: a weak IO die and a weak CCD are what is killing the CPUs.
If you also have a bad BIOS then you also get a burned mobo (the "rapid disassembly" refers to the CPU blowing up and damaging the mobo), but the bad BIOS is not the cause of the CPU failing; it just makes it even worse.

H433x0n · Apr 30, 2023

TheELF said:
So two thirds of the issue is on AMD: a weak IO die and a weak CCD are what is killing the CPUs.
If you also have a bad BIOS then you also get a burned mobo (the "rapid disassembly" refers to the CPU blowing up and damaging the mobo), but the bad BIOS is not the cause of the CPU failing; it just makes it even worse.

That's how I interpreted it. It seems like Steve kind of overpromised with this video. He spent like 2/3 of the video talking about how horrible ASUS is but that doesn't really root cause the issue. He identified a ton of motherboard and BIOS bugs (one big one for Gigabyte too) however, that doesn't fully explain what's going on. It seems like we've got an explanation for how the socket got damaged on ASUS boards but that's all that's explained in this video.

What happens when the CPU blows up but the motherboard is intact? It seems like for that one we've got to wait for the failure analysis lab and we're still weeks away on that one. We've got a theory from Wendell about what's happening on the CPU side but it's just a theory (for now).

Markfw · Apr 30, 2023

IEC said:
It's VDDCR_SOC.

Which on at least the handful of ASRock boards tested by GN and Level1Techs sets it to 1.2-1.25V with EXPO, and closer to 1.0V with Auto settings. Unlike ASUS which defaults to overly high SOC voltage.

The one you quoted is an ASUS board, and I see 1.25. The second pic is ASRock and I see nothing that says SOC anywhere in the voltage description. It's a Taichi.

IEC · Apr 30, 2023

TheELF said:
So two thirds of the issue is on AMD: a weak IO die and a weak CCD are what is killing the CPUs.
If you also have a bad BIOS then you also get a burned mobo (the "rapid disassembly" refers to the CPU blowing up and damaging the mobo), but the bad BIOS is not the cause of the CPU failing; it just makes it even worse.

1) If your BIOS sets the voltage for a rail to something wildly out of spec, whether intentional or not -
Is that a "weak" chip, a fault of the BIOS maker who set the wrong voltage, a fault of the chip maker who didn't properly delineate tolerance differences between X3D and non-X3D chips, or some combination of the above?

2) There's a certain common thread with damaged motherboards, and it involves OCP protections that don't actually work. I'm looking at you, ASUS.

H433x0n said:
That's how I interpreted it. It seems like Steve kind of overpromised with this video. He spent like 2/3 of the video talking about how horrible ASUS is but that doesn't really root cause the issue. He identified a ton of motherboard and BIOS bugs (one big one for Gigabyte too) however, that doesn't fully explain what's going on. It seems like we've got an explanation for how the socket got damaged on ASUS boards but that's all that's explained in this video.

What happens when the CPU blows up but the motherboard is intact? It seems like for that one we've got to wait for the failure analysis lab and we're still weeks away on that one. We've got a theory from Wendell about what's happening on the CPU side but it's just a theory (for now).

Motherboard being intact suggests one of two things:
1) Fried the CPU so good it's at 0 Ohm resistance in the short circuited area so even faulty ASUS setups won't dump power into it.
2) Motherboard tries to power the CPU on but due to actually working OCP/OTP it aborts properly.

IEC · Apr 30, 2023

Markfw said:
The one you quoted is an ASUS board, and I see 1.25. The second pic is ASRock and I see nothing that says SOC anywhere in the voltage description. It's a Taichi.

You're seeing what is set in the BIOS, not the actual readout on the voltage.

Try hwinfo64 and look for the CPU SOC voltage and see if it matches. I suspect it overshoots the actual setting like it does on all my ASUS boards.

SK10H · Apr 30, 2023

AdamK47 said:
Those who have been running their memory at a magical 6400 or a mythical 6600 are going to have to relent and settle for 6000/6200.

Then again, they may just go YOLO and keep VSOC at crazy levels for their 1% total system performance increase.

Gotta lower the fclk back to 2000 as well.
I used to pump 1.375v vsoc to run 6200 at just 2066 fclk on the bad one during testing, 1.35v isn't even stable. 🙄

Markfw · Apr 30, 2023

IEC said:
You're seeing what is set in the BIOS, not the actual readout on the voltage.

Try hwinfo64 and look for the CPU SOC voltage and see if it matches. I suspect it overshoots the actual setting like it does on all my ASUS boards.

Do I see 1.235? That's ASUS BIOS 1410.

Markfw · Apr 30, 2023

HWiNFO on my Taichi also says 1.235. And I got that motherboard when Zen 4 first came out, and the BIOS is original. ALL of my 7950X's are working just fine. I have 5 of them.

Aside from older ASUS BIOSes, I don't see this as an issue, aside from some who want to make it seem like a spectacular failure.

IEC · Apr 30, 2023

Markfw said:
Do I see 1.235? That's ASUS BIOS 1410.

So, looks like for your board/memory combo on that BIOS it sets VDDCR_SOC to 1.25V and on load the vdroop causes it to go to 1.235V. Both of which should be safe. Assuming the software reading is accurate and ASUS isn't using some VRM trickery.
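As a rough sanity check on those numbers, here is a minimal Python sketch that turns a BIOS setpoint and a software readout into droop in millivolts and percent. The function name `vdroop` is made up for illustration, and it assumes the software reading is accurate, which isn't guaranteed without an external measurement.

```python
def vdroop(set_v: float, load_v: float) -> tuple[float, float]:
    """Return (droop in mV, droop as % of the setpoint).

    set_v  -- voltage configured in the BIOS (hypothetical input)
    load_v -- voltage reported by monitoring software under load
    """
    if set_v <= 0:
        raise ValueError("setpoint must be positive")
    delta = set_v - load_v          # positive means the rail sagged under load
    return round(delta * 1000, 1), round(delta / set_v * 100, 2)


drop_mv, pct = vdroop(1.25, 1.235)  # the two values from the screenshots above
print(f"{drop_mv} mV droop ({pct}%)")  # prints "15.0 mV droop (1.2%)"
```

A negative result would mean the readout is above the setpoint, i.e. overshoot rather than droop.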

Markfw said:
HWiNFO on my Taichi also says 1.235. And I got that motherboard when Zen 4 first came out, and the BIOS is original. ALL of my 7950X's are working just fine. I have 5 of them.

Aside from older ASUS BIOSes, I don't see this as an issue, aside from some who want to make it seem like a spectacular failure.

As for the issue... it definitely exists, and even if (as is likely) it's an edge case like 12VHPWR, this should never happen, period. Proper safeguards should be in place to prevent it.

Markfw · Apr 30, 2023

IEC said:
So, looks like for your board/memory combo on that BIOS it sets VDDCR_SOC to 1.25V and on load the vdroop causes it to go to 1.235V. Both of which should be safe. Assuming the software reading is accurate and ASUS isn't using some VRM trickery.

As for the issue... it definitely exists, and even if (as is likely) it's an edge case like 12VHPWR, this should never happen, period. Proper safeguards should be in place to prevent it.

Well, 5 boxes, 4 different motherboards, 2 vendors, and the one ASUS board is fine with a BIOS that is MONTHS old. It still must be a small issue, way smaller than most are making it out to be.

aigomorla · Apr 30, 2023

H433x0n said:
What happens when the CPU blows up but the motherboard is intact? It seems like for that one we've got to wait for the failure analysis lab and we're still weeks away on that one. We've got a theory from Wendell about what's happening on the CPU side but it's just a theory (for now).

he answers this here:

Watch from where I linked.
The die eventually explodes, causing a crack.
On ASUS boards it kept ramping voltage until 400W+ was fed through the socket, killing the ASUS board.
On the ASRock, the CPU die shattered before any damage could be done to the board, and the board still works.

biostud · Apr 30, 2023

Markfw said:
Well, 5 boxes, 4 different motherboards, 2 vendors, and the one ASUS board is fine with a BIOS that is MONTHS old. It still must be a small issue, way smaller than most are making it out to be.

Yes, if CPUs were burning up all over the place we would probably have known by now. But 5 boxes is nowhere near a statistically significant sample, so it can't tell you much about the probability of it happening. And if only ASUS boards have been running vSoC out of spec, then only one of your boxes would be at higher risk.
But I'm very glad to hear that all your ASRock boards also are doing well 🙂
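To put a rough number on the sample-size point, here is a small Python sketch of the standard "zero failures observed" bound: if n units all survive, the largest per-unit failure rate still consistent with that at a given confidence is 1 - (1 - confidence)^(1/n). The function name is made up, and it assumes every box is an independent sample with the same failure probability, which is a simplification since the boards and BIOS versions differ.

```python
def zero_failure_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Upper bound on the per-unit failure rate after n units with zero failures.

    Solves (1 - p)^n = 1 - confidence for p. Assumes independent,
    identically distributed units (an assumption, not a given here).
    """
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


for n in (5, 100, 1000):
    bound = zero_failure_upper_bound(n)
    print(f"{n} healthy units: failure rate could still be up to {bound:.1%}")
```

With 5 healthy boxes the bound comes out at roughly 45%, so five survivors can't distinguish a rare defect from a fairly common one. It takes hundreds of units before the bound drops to around a percent.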
