Intel processors crashing Unreal engine games (and others)

Page 34

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
31,494
30,980
146
I think Intel has 2 options:
- keep making, selling and RMAing Raptor Lake and sending out more defective Raptor Lakes
- tell customers to get AMD CPU instead

The capacity Intel has at its disposal to make MTL and ARL is extremely limited. Somewhere in the range of 10-25% of the client chips Intel can make are MTL, ARL or LNL; the rest of the capacity is RPL.
So effectively zero good options.

As Wendell has continued to hammer home, where is Intel's "We are going to make you whole" statement? That's what Microsoft did back in the day for the RROD issues. That's what AMD did recently, as Wendell cited. Aussie Steve informed us that the board makers felt thrown under the bus, volunteered info, and expressed frustration, something he said is most unusual. Wendell more than hinted that the sentiment is shared by sources in both DC and OEM. Even game makers are losing money and getting spicy over this.

On the RPL capacity: Do you think it possible the refresh was partly an effort to fix what was wrong with 13th gen? Everyone considered it a pointless refresh. And they stop making 13th gen altogether. Makes me wonder if our intrepid reporters are going to find some meat on those bones or not. This post linking to the production ending from 3 months ago, including her S.O. having a brand new 13900K be defective, got me musing about it - https://www.theverge.com/2024/4/11/24127596/intels-discontinuing-some-of-its-13th-gen-desktop-cpus
 

SteinFG

Senior member
Dec 29, 2021
717
853
106
Do you think it possible the refresh was partly an effort to fix what was wrong with 13th gen?
Considering that the temporary fixes included reducing clock speed and disabling e-cores, and that 14th gen only increased clock speed and increased the number of e-cores on the 14700K, I doubt that theory. Even the silicon revision didn't change through the year, right?
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
31,494
30,980
146
Considering that the temporary fixes included reducing clock speed and disabling e-cores, and that 14th gen only increased clock speed and increased the number of e-cores on the 14700K, I doubt that theory. Even the silicon revision didn't change through the year, right?
You should not dignify what I wrote as a theory. Even hypothesis is a stretch. It is more of a "does it strike anyone else as odd?" Your response is level-headed, stating that things are exactly as they appear. Occam's razor and all that. But here's the rub: who knew what the temporary fix would be when Intel was designing the refresh? Other than some extra e-cores for the i7, it brought nothing worth mentioning, leaving reviewers scratching their heads searching for a reason why Intel even bothered with the refresh. You'll have to pardon me, I do have a tendency toward the Scooby Doo mystery shtick.
 

Jan Olšan

Senior member
Jan 12, 2017
530
1,050
136
On the RPL capacity: Do you think it possible the refresh was partly an effort to fix what was wrong with 13th gen? Everyone considered it a pointless refresh. And they stop making 13th gen altogether. Makes me wonder if our intrepid reporters are going to find some meat on those bones or not. This post linking to the production ending from 3 months ago, including her S.O. having a brand new 13900K be defective, got me musing about it - https://www.theverge.com/2024/4/11/24127596/intels-discontinuing-some-of-its-13th-gen-desktop-cpus

I also wondered about the discontinuation of 13th gen in that light, but the refresh itself, unlikely IMO. Making a new generation every year even if it has to be a refresh like that has been their standard mode of operation (slight exception with Rocket Lake, but that simply was delayed for whatever reason, else it would have been out a year after Comet).
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
31,494
30,980
146
I also wondered about the discontinuation of 13th gen in that light, but the refresh itself, unlikely IMO. Making a new generation every year even if it has to be a refresh like that has been their standard mode of operation (slight exception with Rocket Lake, but that simply was delayed for whatever reason, else it would have been out a year after Comet).
[GIF: 200w.gif]

Just a little humor; couldn't resist such a perfect setup. 🫶 I really do appreciate the response :beercheers: and the Tick Tock explanation is the most Occam's razor. I do however have one last Scooby Doo clue for the gang to examine. Meteor Lake was supposed to be enthusiast desktop. Kind of monkey wrenched things by not working out don't you think?
 

Kocicak

Golden Member
Jan 17, 2019
1,177
1,232
136
Interesting! Wendell's video at least confirms the thought that Alder Lake didn't have this problem plaguing Raptor Lake.
Alder Lake ran at lower clock speeds. Other Raptor Lake CPUs that are stable also run at lower clock speeds.

I have been saying this all along: the silicon is simply not able to handle the high frequencies. The process is flaky, and the silicon degrades too quickly under the combination of heat and current density needed to run at these breakneck speeds.

Intel CPUs are very nice and stable, once you run them at sane frequencies.

My 14900K is currently limited to 5.2 GHz and it runs just fine.

Intel must be held responsible for the stupid settings they ship their product with.
 

GunsMadeAmericaFree

Golden Member
Jan 23, 2007
1,384
379
136
I retract my prediction about what this will cost Intel. Clearly I underestimated the extent of the problem. This is going to get expensive, what with needing to now factor in DCS providers steering customers to other vendors. Lost sales and RMAs in retail = meh, but it happening in DC? That hurts bad.

EDIT: I hope Wendell is right in his speculation about frustrated clients leaking data. Would love to know how many prebuilts are failing too.
Not sure if the bad news and resulting temp downward stock price might make Intel stock a buy, or not, short term. Either way, though, Intel's reputation takes a hit, and AMD likely gains another few percentage points in the server market.
 

coercitiv

Diamond Member
Jan 24, 2014
7,193
16,847
136
I have been saying this all the time, the silicon is simply not able to handle the high frequencies. [...]

My 14900K is currently limited to 5.2 GHz and it runs just fine.
Wendell's video talks about 14900K game servers that fail stability tests at 5.3 GHz with JEDEC-spec DDR5. For servers with two DIMMs per channel they went as low as DDR5-4200 and even disabled E-cores (with a single DIMM per channel they used DDR5-5200). The combination of lower clocks and slower memory helped the most, but did not completely solve the issue. Updating the BIOS also seemed to help, but not enough to make the machines pass all tests.
 

gdansk

Diamond Member
Feb 8, 2011
4,157
6,928
136
How many DPC for DDR5-4200? I missed that. Most W680 boards are only rated for DDR5-3600 with 2DPC...
 

Hitman928

Diamond Member
Apr 15, 2012
6,633
12,202
136
Wendell's video talks about 14900K game servers that fail stability tests at 5.3 GHz with JEDEC-spec DDR5. For servers with two DIMMs per channel they went as low as DDR5-4200 and even disabled E-cores (with a single DIMM per channel they used DDR5-5200). The combination of lower clocks and slower memory helped the most, but did not completely solve the issue. Updating the BIOS also seemed to help, but not enough to make the machines pass all tests.

If it is a degradation issue from trying to run at too high a frequency, it could be that those CPUs were already so badly damaged that even downclocking no longer works.
 

coercitiv

Diamond Member
Jan 24, 2014
7,193
16,847
136
How many DPC for DDR5-4200? I missed that. Most W680 boards are only rated for DDR5-3600 with 2DPC...
They said 4200 for 2 DPC and 5200 for 1 DPC. No mention of SS vs. DS, but the 2 DPC systems had 32GB modules, the 1 DPC systems had 48GB modules.
 

Kocicak

Golden Member
Jan 17, 2019
1,177
1,232
136
I am convinced that most of the unstable chips are not broken, just ran out of their voltage safety margin. If they increased the voltage and ran them slow (like 4200-4600 MHz), most of them could be still usable for some time.

BTW if they used 14900K out of the box for professional intensive loads in servers, no wonder that half of them are failing.

Consumer safe frequency is something around 5 GHz and heavy load frequency, where stability is required, 1 GHz lower. Or even more than that. I can guarantee you that if they ran them between 3000 - 4000 MHz, almost all of them would be still running OK.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,042
15,988
136
I am convinced that most of the unstable chips are not broken, just ran out of their voltage safety margin. If they increased the voltage and ran them slow (like 4200-4600 MHz), most of them could be still usable for some time.

BTW if they used 14900K out of the box for professional intensive loads in servers, no wonder that half of them are failing.

Consumer safe frequency is something around 5 GHz and heavy load frequency, where stability is required, 1 GHz lower. Or even more than that. I can guarantee you that if they ran them between 3000 - 4000 MHz, almost all of them would be still running OK.
My fleet of 7950Xs runs at 100% load 24/7/365 at 4.7-4.9 GHz. Why on earth would anybody buy a 14900K to run it at 4 GHz? They are all broken, IMO.

And I run almost all of my 7950Xs on $50 air coolers.
 

GTracing

Senior member
Aug 6, 2021
478
1,112
106
I am convinced that most of the unstable chips are not broken, just ran out of their voltage safety margin. If they increased the voltage and ran them slow (like 4200-4600 MHz), most of them could be still usable for some time.

BTW if they used 14900K out of the box for professional intensive loads in servers, no wonder that half of them are failing.

Consumer safe frequency is something around 5 GHz and heavy load frequency, where stability is required, 1 GHz lower. Or even more than that. I can guarantee you that if they ran them between 3000 - 4000 MHz, almost all of them would be still running OK.
Wendell says his sources set the power limit to 125W.

Even if they ran it stock, it's pretty wild to claim that we shouldn't expect a CPU to be able to operate at base settings for six months of continual use.
 

Thunder 57

Diamond Member
Aug 19, 2007
3,690
6,227
136
Alder Lake ran at lower clock speeds. Other Raptor Lake CPUs that are stable also run at lower clock speeds.

I have been saying this all along: the silicon is simply not able to handle the high frequencies. The process is flaky, and the silicon degrades too quickly under the combination of heat and current density needed to run at these breakneck speeds.

Intel CPUs are very nice and stable, once you run them at sane frequencies.

My 14900K is currently limited to 5.2 GHz and it runs just fine.

Intel must be held responsible for the stupid settings they ship their product with.

Welcome back! You would know about taking CPUs to the limit too, wouldn't you? ;)

Also, another day, more bad news for Intel.
 

Futuremotion

Junior Member
Jun 23, 2024
23
23
41
github.com
Since it's for pro use, I would return it now and just move to an AMD 7950X system. Better to have a working work PC.

I've never gone AMD in my life, I've for better or worse been a shameless Intel fanboy since I first touched a PC over 20 years ago. But I am very seriously now thinking about returning the 14900KF and GB motherboard for a 7950X3D. It looks really solid and I've seen no reports of instability. Another option I suppose is to hold out for the AMD 9000 series.

I'm just so incredibly frustrated because I literally just bought all these parts yesterday and had no idea that the instability issues and possible chip defects reported in this thread were as horrifying as they seem to be.

They were resorting to downclocking to a max of 5.3GHz and DDR5-4200 (not a typo) to try to obtain stability and sometimes even that doesn't work.

The above quote from this thread is absolutely ridiculous.

The whole reason behind my choice to go Intel was to take advantage of the crazy P-core and turbo boost frequencies. Audio production workloads are not generally heavily multithreaded, and single core performance is the biggest contributor to DAW / Production performance. High CPU frequency rates enable you to run more VST/VSTi plugins simultaneously - specifically some of the more demanding plugins like u-He Diva as an example.

I guess I have some detailed research and contemplation to do.

One positive comment from Reddit:
I just built a workstation for 3D and compositing. Went with the 14900K and 4090. DDR5 192 GB ram stable at 5200 mhz (4 slots). I don’t regret a thing, and if I need to upgrade by 2026 then I will part it out or find the most efficient way (even if I need to swap the whole mobo and cpu) - just as I have done every 2-3 years.

On a positive note, I did run across what seems to be a (partial?) solution (I can't tell exactly because I've yet to actually assemble my new PC). Take a look at this thread on Reddit by u/Acadia1337:


Here's the meat of it:

My quest for stability finally led me to a revelation. The Holy Grail: "13th Generation Intel® Core™ and Intel® Core™ 14th Generation Processors Datasheet, Volume 1 of 2". 219 pages of technical glory.

https://www.intel.com/content/www/u...ation-processors-datasheet-volume-1-of-2.html

Page 98, Table 17, Row 3: Reveals the stock turbo power limits for the 13900K and 14900K CPUs are 253W, not the 4,000+ my motherboard defaulted to. Page 184, Table 77, Row 6: Lists the maximum current limit at 307A, far below my motherboard's default of 500+A.

I decided to implement this right away. I reset my BIOS to default settings, turned off multicore enhancement, enabled xmp, and input the settings from the datasheet. Ta-Da! All of my issues were solved by a simple 2 minute process. All my games worked, there are no random lags, and nothing ever crashes. I can run any stability test as long as I want and it all works fine. Problem solved.

Turns out, all I needed to do was spend 2 minutes setting up the stock settings in my BIOS.

I'd really like to hear some feedback from anyone here that owns a i9-14900K/KF - specifically if you tried lowering the turbo power limits and maximum current limit in your BIOS? According to the OP, a lot of motherboards are shipping with wildly over-aggressive power profiles.

Apparently making the following BIOS modifications has completely resolved instability issues, game crashes, and random BSODs for a ton of users:

Set Package Power Limit 1 = 253 W
Set Package Power Limit 2 = 253 W
Set Core Current Limit = 307 A

Also, some users are reporting full P95 stability with the Core Current Limit set to 400 A as well (likely because they have good silicon).
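
As a rough illustration of the check described above, here is a minimal Python sketch that compares a set of BIOS values against the datasheet figures quoted in this post (253 W for both package power limits, 307 A for the core current limit). The `BiosPowerSettings` class and `find_over_limit` function are hypothetical names made up for this example, nothing here reads from real firmware, and the 4096 W / 512 A inputs are only stand-ins for the "4,000+" and "500+A" motherboard defaults mentioned in the Reddit post.

```python
from dataclasses import dataclass
from typing import List

# Stock limits for the 13900K/14900K as quoted in the post from Intel's datasheet.
DATASHEET_PL1_W = 253
DATASHEET_PL2_W = 253
DATASHEET_ICCMAX_A = 307


@dataclass
class BiosPowerSettings:
    """Hypothetical container for values copied by hand from a BIOS setup screen."""
    pl1_watts: float
    pl2_watts: float
    core_current_limit_amps: float


def find_over_limit(settings: BiosPowerSettings) -> List[str]:
    """Return one message per setting that exceeds the quoted stock limit."""
    checks = [
        ("Package Power Limit 1", settings.pl1_watts, DATASHEET_PL1_W, "W"),
        ("Package Power Limit 2", settings.pl2_watts, DATASHEET_PL2_W, "W"),
        ("Core Current Limit", settings.core_current_limit_amps, DATASHEET_ICCMAX_A, "A"),
    ]
    return [
        f"{name}: {value:g} {unit} exceeds stock {limit} {unit}"
        for name, value, limit, unit in checks
        if value > limit
    ]


if __name__ == "__main__":
    # Illustrative "motherboard default" values, not measurements from any real board.
    board_defaults = BiosPowerSettings(pl1_watts=4096, pl2_watts=4096,
                                       core_current_limit_amps=512)
    for message in find_over_limit(board_defaults):
        print(message)
```

A 400 A current limit with 253 W power limits would be flagged on the current limit only. Passing this check says nothing about whether a chip has already degraded.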

I really want to believe this is a misconfiguration of stock BIOS power settings rather than a more insidious hardware issue, but I'm not as well informed as many of you here on this forum, so I don't know for sure.
 

Kocicak

Golden Member
Jan 17, 2019
1,177
1,232
136
The whole reason behind my choice to go Intel was to take advantage of the crazy P-core and turbo boost frequencies.
These frequencies are not a result of true CPU capability, but a result of a desperate Intel wanting to beat/match AMD in review charts. They are really CRAZY. Do not use them.

The CPUs are fine. They break if you let them run at crazy frequencies.
 

KompuKare

Golden Member
Jul 28, 2009
1,223
1,578
136
I thought Wendell pretty much ruled out BIOS settings being to blame, with CPUs on those W680 server motherboards at really tame settings degrading over time at an average 50% rate.
There is no way Supermicro would set any aggressive BIOS defaults, and the server division of Asus behaves normally, unlike their bling gamerz board division.

I would not hold out hope that a simple "fix" will solve this long term. It might delay the degradation by a few months, though.
 

Ranulf

Platinum Member
Jul 18, 2001
2,801
2,397
136
Warframe devs post some info on their forums:


"Luckily we found a staff member who would encounter these crashes on his home computer. Curiously, his computer at the office was fine: he was playing with the same loadout, the same customizations, with the same people, but he would only crash at home.


He wasn’t over-clocking anything and it was a new machine so there was no reason to expect problems. We tried all of the usual fixes: he got the latest Windows Updates, he updated all his drivers, he disabled all third-party overlays being injected, he tested his RAM, and by all accounts everything was fine.


We ran aggressive stress-tests on similar machines: we used scripts to repeatedly open and close various user-interface components that were mentioned in crash reports, we ran endless simulated battles between squads of NPCs, and we even made a test that would load up random levels, teleport around quickly to a whole bunch of vantage points to exercise the graphics driver, and then move on.


Everything was fine for us and yet he kept crashing doing the most basic things like launching the game and flying to a mission.


Because the crash wasn’t in our code it was hard to guess what we could be doing wrong but as we looked over the reports we noticed that these crashes tended to occur when the graphics driver was working very hard on all CPU-cores. The penny dropped when we realized that this was a particularly power-hungry state for the processor to be in and we were reminded of a recent report from Intel that suggested that a BIOS update might help."
 

Futuremotion

Junior Member
Jun 23, 2024
23
23
41
github.com
These frequencies are not a result of true CPU capability, but a result of a desperate Intel wanting to beat/match AMD in review charts. They are really CRAZY. Do not use them.

The CPUs are fine. They break if you let them run at crazy frequencies.

So are you saying my new i9-14900KF is perfectly fine provided I cap my frequencies and turbo multipliers to a more reasonable configuration? The post above by @KompuKare pretty flatly refutes this, and raises a much more worrisome problem about chip degradation over a period of "a few months".

I love Intel. Always have. But something is seriously wrong here with this generation of chips. I am at the point where I'm about 95% certain I'm going to return my motherboard and i9-14900KF for a Ryzen 9 7950X3D. The 8 additional Performance Cores should actually pay bigger dividends when it comes to audio production in my DAW, now that I think about it.

It's just a shame. I hope Intel gets to the bottom of these issues - if only to elicit further healthy competition, so we all win.