Question DEGRADING Raptor Lake CPUs


Kocicak

Golden Member
Jan 17, 2019
1,065
1,120
136
I noticed some reports about degrading i9 13900K and KF processors.

I experienced this problem myself when I ran mine at 6 GHz under light load (3 threads of Cinebench), at an acceptable temperature and non-extreme voltage. After only a few minutes it crashed, and after that it could not run even at stock settings without bumping the voltage a bit.

I was thinking about the cause of this, and I believe the problem is that people do not appreciate how high these frequencies are, and that the real comfortable frequency limit of these CPUs is probably around 5500 or 5600 MHz. These CPUs are made on the same process (possibly improved somehow) as Alder Lake CPUs. See the frequencies the 12900KS runs at. The frequency improvement from the new process tweak may not be as high as some people presume.

Those 13900K CPUs are probably highly binned to find those which contain some cores that can reliably run at 5800 MHz. Some of the 13900Ks probably have little or no OC reserve left, and pushing them will cause them to degrade or break.

The conclusion for me is that the best you can do for your 13900K or 13900KF is to disable the 5800 MHz peak, which will allow you to offset the voltage lower, and then set the all-core maximum frequency to some comfortable level; I guess the maximum could be 5600 MHz. With lowered voltage, this frequency should be gentler on the processor than running it at the original 5500 MHz at higher voltage. You can also run it at lower frequencies, allowing for an even larger voltage drop, but then the CPU slowly loses its point (unless you want a high-efficiency CPU intended for heavy multithreaded loads).

Running it with a power consumption limit suited to your cooling solution, to keep the CPU at a sensible temperature, will certainly help too.
 

controlflow

Member
Feb 17, 2015
148
241
116
Interesting perspective from Puget Systems on the RPL instability issues


Seems 13th and 14th gen are actually quite different based on this data. Looks like 11th gen was also problematic.
 

alcoholbob

Diamond Member
May 24, 2005
6,301
344
126
Interesting perspective from Puget Systems on the RPL instability issues


Seems 13th and 14th gen are actually quite different based on this data. Looks like 11th gen was also problematic.

It makes sense that 11th gen problems never got any traction online: it was universally reviled by techtubers, online reviewers and the DIY community, so the individuals who had problems were mostly using prebuilts and would never have been able to pinpoint the problem to the CPU. That said, the nuclear power consumption of 11th gen was pretty similar to 14th gen, so it seems electromigration is likely the problem with both gens.
 

lakedude

Platinum Member
Mar 14, 2009
2,679
479
126
Just watched the Gamers Nexus video. Very disappointing, not much new info there. It is just a review of what we already know from old posts taken from several forums, including this one.

GN skips right past what I consider to be the most important chart (see picture) because it is the only one adjusted for the number of units sold.

Puget says: "Even though failure rates (as a percentage) are the most consequential, I think showing the absolute number of failures illustrates our experience best."

I disagree that the absolute number of failures is a good metric.

Upon reflection EDITed to remove any conclusions about a possible motive for the omission.

Screenshot_20240803-000251_Firefox.jpg
 

Kocicak

Golden Member
Jan 17, 2019
1,065
1,120
136
Let us hope there will be somebody around after 4 years to fulfill the extended warranty.
I meant this as a joke, but after seeing the video and getting a better picture of what has been going on and how Intel has been responding to this, I am not sure anymore...

Anyway, looking forward to the next video/s dealing with the real physical processes ruining the chips, I wonder if we will finally learn the safe frequencies for these CPUs.

(running mine still with 4,5/3,6 GHz limits)
 

coercitiv

Diamond Member
Jan 24, 2014
6,597
13,923
136
Also I now claim GN has a bias against Intel. The Puget site has one slide that shows failure rates by percent and that chart includes some information that is not favorable to AMD. I know cause I checked it out myself. GN skips right past what I consider to be the most important chart (see picture) because it is the only one adjusted for the number of units sold.
On the contrary, I think GN is heavily shilling for Intel. They managed to distract you from observing the ridiculously high percentage bar for the 11th gen. It's the highest bar in the chart, harder to spot when looking at the smaller ones. /s

Read the Puget article, they explain how their custom systems are configured with BIOS settings aimed at maximizing reliability. Combine this with a more aggressive screening process aimed to catch instability in the shop, and two things should happen:
  • early failure rates will likely be higher for Puget than the rest of the industry (pushed up by more thorough shop testing)
  • failure rates in the field should be considerably lower than the rest of the industry (pushed down by the combo of early shop testing and safer BIOS settings)
Last but not least, both the article and the video are discussing the topic of Intel 13th and 14th gen failures, which is a proven fact today. Puget included the AMD data for context, to show why they're not seriously considering dropping Intel as their approach has minimized the impact of the Intel issues so far. The Intel issue exists whether or not Puget data reflects it. Anyone further pushing AMD into the discussion after Intel themselves have admitted their CPUs are defective should familiarize themselves with whataboutism and how toxic it is in debates.

We had AMD failure events in the past. One of them was very recent: the SoC voltage issue burning up 7000 series chips. When discussing that issue we did not bring up Intel failure rates to distract folks from acknowledging the extent and seriousness of the failure on the AMD side.
 

Timorous

Golden Member
Oct 27, 2008
1,748
3,239
136
Just watched the Gamers Nexus video. Very disappointing, not much new info there. It is just a review of what we already know from old posts taken from several forums, including this one.

Also I now claim GN has a bias against Intel. The Puget site has one slide that shows failure rates by percent and that chart includes some information that is not favorable to AMD. I know cause I checked it out myself. GN skips right past what I consider to be the most important chart (see picture) because it is the only one adjusted for the number of units sold.

Puget says: "Even though failure rates (as a percentage) are the most consequential, I think showing the absolute number of failures illustrates our experience best."

I disagree that the absolute number of failures is a good metric. If you only sold one and it is bad your failure rate is 100% but it only shows up as one failure on the absolute charts.


Puget do not define what they mean by failure, something Wendell did. They give no absolute numbers of units sold for each generation of CPU so we don't know the sample size which means we have no real clue if their data (and as such their safe settings) can be extrapolated with reasonable accuracy.

Looking at their charts they say they have 34 failed 14th gen parts with a failure rate of about 2.5% which would mean a total population of around 1,360 14th gen machines.

What we don't know is if that is just machines that failed and happen to contain a certain CPU or if that is machines that failed and the cause was tracked back to the CPU.

Compared to the OEM who spoke to GN, with a population of 8 million chips, 1,360 is a tiny sample.
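
For anyone who wants to check the back-of-the-envelope math, here is a rough Python sketch. The inputs are just the figures quoted in this post (34 failures and roughly 2.5% read off the Puget chart, 8 million chips for the OEM); the function name `estimate_population` is made up for illustration, and the result is only as good as the chart reading.

```python
def estimate_population(failures: int, failure_rate: float) -> float:
    """Back out the number of units from a failure count and a failure rate (0-1).

    Assumes the rate was computed as failures / units, which Puget does not confirm.
    """
    if not 0 < failure_rate <= 1:
        raise ValueError("failure_rate must be in (0, 1]")
    return failures / failure_rate


# Figures read off the Puget chart (approximate).
puget_14th_gen = estimate_population(failures=34, failure_rate=0.025)

# Population quoted for the OEM that spoke to GN.
oem_population = 8_000_000

# How many times larger the OEM sample is than the estimated Puget one.
ratio = oem_population / puget_14th_gen

print(f"Estimated Puget 14th gen units: {puget_14th_gen:.0f}")
print(f"OEM sample is roughly {ratio:.0f}x larger")
```

If the real failure rate on the chart is a bit above or below 2.5%, the estimated population shifts accordingly, so treat the 1,360 as a ballpark.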
 

Turbonium

Platinum Member
Mar 15, 2003
2,143
80
91
Wtf.

While I'm not as technically versed as a lot of users posting in this thread, the idea that 11th gen may have issues as well is really frustrating, considering I just got an 11th gen chip to replace the RTL chip I have. I'd fully go AMD at this point if it proves factual.
 

GTracing

Member
Aug 6, 2021
78
192
76
Wtf.

While I'm not as technically versed as a lot of users posting in this thread, the idea that 11th gen may have issues as well is really frustrating, considering I just got an 11th gen chip to replace the RTL chip I have. I'd fully go AMD at this point if it proves factual.
According to Puget, their AMD chips also have a higher failure rate than 13th gen and 14th gen Intel.

1722687780422.png
Maybe not all workloads cause degradation as quickly? The Minecraft servers (I think Wendell mentioned them) were almost all failing within 6 months.
 

poke01

Golden Member
Mar 8, 2022
1,991
2,527
106
Wtf.

While I'm not as technically versed as a lot of users posting in this thread, the idea that 11th gen may have issues as well is really frustrating, considering I just got an 11th gen chip to replace the RTL chip I have. I'd fully go AMD at this point if it proves factual.
You know, stay away from Intel; even Skylake had bugs. 11th gen Rocket Lake was bad too.

Get an AMD Zen 4, not even Zen 5, as it's not tested in the real world yet. Zen 4 is stable.
 

coercitiv

Diamond Member
Jan 24, 2014
6,597
13,923
136
The last paragraph in the Puget blog article:
We are extending our warranty to 3 years for all customers affected by this issue, regardless of warranty purchased. With a Puget Systems PC, you should be able to count on it working for you. If we no longer have supply of 13th or 14th Gen processors, we’ll upgrade you to a more current generation.
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
29,464
24,163
146
Counter WTF from me: Why did you go with 11th gen (slower HOTTER 14nm Rocket Lake crap) instead of Alder Lake (12th gen) ?????
It was back ported alright; in the sense someone pulled it out of their back port.

Golf clap for Puget. They inserted themselves in such a way that it immediately went viral. It was 100% a humble brag, and the extended warranty is a serious flex that dares others to match their commitment to clients.

Intel continues to admit more and more as the days go by.

willy-wonka-suspense.gif


But that some are attempting to retcon any of this with that bigger bar badder chart was not on my bingo card.
 

DrMrLordX

Lifer
Apr 27, 2000
21,998
11,555
136
Read the Puget article, they explain how their custom systems are configured with BIOS settings aimed at maximizing reliability. Combine this with a more aggressive screening process aimed to catch instability in the shop, and two things should happen:
  • early failure rates will likely be higher for Puget than the rest of the industry (pushed up by more thorough shop testing)
  • failure rates in the field should be considerably lower than the rest of the industry (pushed down by the combo of early shop testing and safer BIOS settings)

Puget shouldn't need to go to any considerable length to prescreen Raptor Lake CPUs, nor should there need to be special UEFI settings aimed at stability before they sell to the customer. The product should work as it ships fresh out-of-the-box. But hey maybe I'm just being naive?

Also do any of these prescreened, specially-configured Raptor Lakes perform differently than the CPUs you see in post-launch benchmarks?
 
Mar 8, 2024
65
196
66
Here is new news on this matter, Tom's HW picked it up but link is from the Reddit user.

Multiple reports of this on r/hardware as of 3PM Eastern, seems like Intel is in full collapse mode. RMAs being constantly denied, outright accusations of fraud from Intel towards legitimate end users, and the sort of shenanigans you'd expect from a fly-by-night eBay retailer rather than the actual-factual inventor of the microprocessor.

Like, imagine if IBM pulled this stuff beyond the failure of MicroChannel. Really makes the whole DeathStar HDD debacle look like peanuts in comparison...
 

DaaQ

Golden Member
Dec 8, 2018
1,438
1,039
136
Modern customer service: get no response unless you escalate to social media.

AKA scummy companies won't do right unless they are embarrassed?
In the Tom's article, one chip was said to be a tray chip, not retail, and the other was said to be re-marked. So Amazon is buying tray chips and boxing the chips themselves, and Micro Center is re-marking chips, which would also imply they have empty retail boxes to place the chips in as well.

Simply ridiculous. EDIT: To clarify Intel's actions are ridiculous at this point.

Here is the quote from that article.

Before all this drama, jerubedo contacted Intel’s customer service regarding the processors, and the company agreed that they were indeed faulty. When Intel asked for photographs and documentation, the company responded to jerubedo saying that one of the chips (Serial Number 02096 from Micro Center) was re-marked. At the same time, the other was a tray processor (Serial number 03252 from Amazon) and was not covered by a retail warranty.

When the Reddit poster sent clearer photos of the Amazon chip, Intel changed its stance, saying it was indeed a boxed processor. However, the company also noted that it wasn’t confident that the chip would pass muster with its fraud validation, suggesting that jerubedo return the chip to the merchant (Amazon) that sold them the chip instead.
 

KompuKare

Golden Member
Jul 28, 2009
1,163
1,426
136
In the Tom's article, one chip was said to be a tray chip, not retail, and the other was said to be re-marked. So Amazon is buying tray chips and boxing the chips themselves, and Micro Center is re-marking chips, which would also imply they have empty retail boxes to place the chips in as well.

Simply ridiculous. EDIT: To clarify Intel's actions are ridiculous at this point.

Here is the quote from that article.
With the tray one, they seem to have misread the serial number, as the number they quoted back was 3262, not 3252... And then they doubled down on the number they misread.

If their agents or the agents' managers are not on commission to reject RMAs... Well I don't know what they are doing.