Question 7xxx series AMD cards overheating as explained by Derbauer

Puffnstuff

Lifer
Mar 9, 2005
16,015
4,785
136
I've seen multiple YouTubers talking about this issue, and of course AMD states they are researching it, but well-known YouTuber Derbauer has a new video explaining what he's found using multiple cards. I wonder how AMD will treat their customers over this one after promptly denying the problem's existence when first asked about it.

 

Mopetar

Diamond Member
Jan 31, 2011
7,797
5,899
136
Is this the same as with Zen 4, where they just run at higher (like 95C) temperatures that freak people out, or is this something else?
 

insertcarehere

Senior member
Jan 17, 2013
639
607
136
Is this the same as with Zen 4, where they just run at higher (like 95C) temperatures that freak people out, or is this something else?

Not really. If you watch the video, the cards noticeably throttle down and max out the fans due to these high hotspot temps. This should not be normal for a GPU out of the box, especially on an open-air test bench.
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,222
5,224
136
Is this the same as with Zen 4, where they just run at higher (like 95C) temperatures that freak people out, or is this something else?

Watch the video. It's a flawed cooler. It won't be an issue with AIB designs, but a lot of cards with this reference cooler are going to have issues.
 

Hans Gruber

Platinum Member
Dec 23, 2006
2,092
1,065
136
How is changing the orientation from A - B - A and having A temperatures change a driver issue?
Just like with CPUs, firmware and drivers manage temperature on a GPU. It's possible the cooling solution could be flawed or not mounted properly. Considering the number of people complaining about temps, I am guessing that there is a driver issue causing anomalies in temperatures and spikes in GPU clock speeds.
 

pj-

Senior member
May 5, 2015
481
249
116
Just like with CPUs, firmware and drivers manage temperature on a GPU. It's possible the cooling solution could be flawed or not mounted properly. Considering the number of people complaining about temps, I am guessing that there is a driver issue causing anomalies in temperatures and spikes in GPU clock speeds.

Based on the video, I think there's almost zero chance it's a driver issue or a mounting issue. One of the symptoms is throttling, so the power consumption actually drops by up to 80 W while the temps rise and fan speeds are significantly higher. He also experimented with the mounting to rule that out as a cause.

Physically rotating the GPU while it's running can trigger the problem within a minute. It very much seems like a design or construction problem with the vapor chamber.
 

amenx

Diamond Member
Dec 17, 2004
3,851
2,019
136
AMD kind of acknowledged a problem a few days ago:

"We are aware that a limited number of users are experiencing unexpected thermal throttling on AMD Radeon RX 7900 XTX graphics cards (reference models made by AMD). Users experiencing unexpected thermal throttling of an AMD Radeon RX 7900 XTX should contact AMD Support," stated AMD in a statement to Tom's Hardware.

 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,686
136
I was asking since I'm on my phone and can't watch the video at the moment.
Here's my TL;DR based on the snippets I managed to read in the past few days. ( or should I say "last year"? ;) )

It's a real issue. The 110C junction temp in itself is not the problem; the problem is that it is reached while edge temperatures are still low and the power used by the card is well below stock (this indicates a cooling problem of sorts). AMD are still trying to understand whether it can be isolated to a few batches or is more uniformly spread. They'll probably need a bit of time to figure this out.
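
For anyone logging their own card, here is a minimal sketch of that symptom pattern in Python. Everything in it is an assumption for illustration: the `Sample` type, the function name, and the edge and power thresholds are made up, and only the 110C junction figure comes from the discussion above. It is a rough filter, not a diagnostic.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One telemetry reading (hypothetical structure, not from any real tool)."""
    hotspot_c: float  # junction (hotspot) temperature in C
    edge_c: float     # edge temperature in C
    power_w: float    # board power draw in W


def looks_like_cooling_fault(
    sample: Sample,
    stock_power_w: float,
    hotspot_limit_c: float = 110.0,  # junction limit mentioned in the thread
    max_edge_c: float = 80.0,        # assumed cutoff for "edge still low"
    power_margin_w: float = 30.0,    # assumed margin for "well below stock"
) -> bool:
    """Return True when the hotspot is at its limit while the edge
    temperature is still low and power draw is well below stock."""
    at_limit = sample.hotspot_c >= hotspot_limit_c
    edge_still_low = sample.edge_c <= max_edge_c
    below_stock = sample.power_w <= stock_power_w - power_margin_w
    return at_limit and edge_still_low and below_stock
```

A card that hits the junction limit at full stock power with a hot edge reading would not trip this check, which matches the point that the junction temperature alone is not the problem.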
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,323
4,904
136
As usual der8auer does a great job analyzing potential causes of the problem. It seems like the vapor chamber is the problem. Either a manufacturing defect (how many %?) or a design flaw. Remains to be seen how widespread the problem is.

I laughed at his deadpan "high precision tool" joke when he cut the standoffs.
 

kondziowy

Senior member
Feb 19, 2016
212
188
116
This is insane. Remember no pre-orders :D Thank god AMD priced 7000 cards so high - they can just recall and fix all of them and still make a profit :D

btw: watch now people test Nvidia gpus and find the same issue :laughing: I called it first
 
Last edited:

lixlax

Member
Nov 6, 2014
183
150
116
So far it seems to be only an MBA 7900 XTX problem, as it has a different cooler from the XT?
I don't think I've seen a hotspot temp over 80C yet on my MBA XT. And considering its tiny size (by modern standards), the cooler performance is just excellent, as it goes up to 1750 rpm and is hardly audible over the rest of the PC.

It would be funny (of course not for the people who are having issues, or for AMD) if they accidentally put the amount of liquid meant for the XT cooler into the XTX cooler, and that amount is borderline enough that some cards don't show the problem. Or, as he pointed out in the video, it may condense in the weirdly shaped end of the chamber.
 

Ranulf

Platinum Member
Jul 18, 2001
2,331
1,139
136
btw: watch now people test Nvidia gpus and find the same issue :laughing: I called it first

It would just be another data point for sloppy quality control in the covid era with increased prices for everything. Green marketing team would just spin it to win it as a new space heater.
 

HutchinsonJC

Senior member
Apr 15, 2007
465
201
126
The video basically shows that it's the orientation of the GPU causing the problem for the vapor chamber.

In a horizontal position the liquid within the vapor chamber is likely getting hot, and then not cooling properly, and even if you flip it to a vertical position mid-run, it's already too hot to ever get back to a normal operation and doesn't cool off. He demonstrated this in the video.

But if you start and stay in a vertical position, the GPU seems to run as expected.

He tested this with 4 separate GPUs and showed the effect of this on all 4 GPUs, including aftermarket. So this does not really seem like a limited problem, but rather a pretty widespread problem, which is basically why he named the video the way he did.
 

amenx

Diamond Member
Dec 17, 2004
3,851
2,019
136
AMD partner suspects a faulty batch of reference Radeon 7900 GPUs is affected by thermal issues

Today Igor weighs in by sharing a note from a board partner. It is reported that the undisclosed OEM suspects at least one faulty batch of Radeon 7900 series might have left the factory. The possible issue being described is insufficient coolant added to the vapor chamber.


If so, it would be a relief that it's narrowed down to a batch QC issue rather than a faulty design.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Der8auer's video on the matter.



Really great video showing the internal structure of the cooler. You can see the main copper mesh, and the copper wick around where the GPU would be.

And as he notes, there is not enough water in it. So it's sounding like a manufacturing issue, which the OEM that was spoken to hypothesized as well. So then it comes down to how many cards ended up being shipped with this issue. And it's a bummer for AMD, as this looks bad on them even though they are not directly responsible.

All of their test samples could have been perfect, and then once manufacturing starts, all it takes is somebody configuring the machine to input less coolant than there should be, and then there are issues. Especially if there is actually more than one company making these. AMD could have spot checked cards, but they would only see it if they happened to check a card from that batch.
 

KompuKare

Golden Member
Jul 28, 2009
1,012
923
136
Really great video showing the internal structure of the cooler. You can see the main copper mesh, and the copper wick around where the GPU would be.

And as he notes, there is not enough water in it. So it's sounding like a manufacturing issue, which the OEM that was spoken to hypothesized as well. So then it comes down to how many cards ended up being shipped with this issue. And it's a bummer for AMD, as this looks bad on them even though they are not directly responsible.

All of their test samples could have been perfect, and then once manufacturing starts, all it takes is somebody configuring the machine to input less coolant than there should be, and then there are issues. Especially if there is actually more than one company making these. AMD could have spot checked cards, but they would only see it if they happened to check a card from that batch.
On the other hand, that is why QA sample rates, AQL and similar things are defined: if you can't test everything then with a proper sample rate you should catch any issues with pretty good tolerance as long as your sample rate is correct.
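
To put a rough number on the sampling argument, here is a small Python sketch of the chance that a random spot check picks up at least one bad card from a lot. It assumes simple random sampling without replacement (the hypergeometric case) and that every bad card is caught when tested; the function name and the example lot figures are hypothetical, not taken from any real AQL plan or from AMD's process.

```python
from math import comb


def chance_of_catching(lot_size: int, defective: int, sample_size: int) -> float:
    """Probability that a random sample of `sample_size` cards, drawn
    without replacement from `lot_size` cards of which `defective` are
    bad, contains at least one bad card."""
    # Number of ways to draw a sample made up only of good cards.
    clean_samples = comb(lot_size - defective, sample_size)
    # Number of ways to draw any sample of that size.
    all_samples = comb(lot_size, sample_size)
    return 1.0 - clean_samples / all_samples


# Hypothetical lot: 10,000 cards, 500 of them underfilled, 32 cards tested.
print(chance_of_catching(10_000, 500, 32))
```

The same function shows the other side of the argument: if the bad cards are a tiny fraction of the lot, or the test itself never exercises the failing orientation, even a sensible sample size can miss them.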
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
On the other hand, that is why QA sample rates, AQL and similar things are defined: if you can't test everything then with a proper sample rate you should catch any issues with pretty good tolerance as long as your sample rate is correct.

For sure. But how many GPU QA processes include rotating the GPU? My guess is not many. This issue seems to impact reference GPUs from all board partners, so I wonder if they assemble these themselves, or if it's literally just their label on a card manufactured by just one of them.
 

RnR_au

Golden Member
Jun 6, 2021
1,675
4,079
106
Sounds like they have identified the batches affected. According to the latest update (01/03/2023, 9:49 p.m.):

From what I've heard 5 batches are affected and AMD is starting a partial recall of the MBA designs.
Source