But are there any benchmarks out there measuring thread and context-switching overhead?
I/O workloads inside VMs would expose the higher context switching latency.
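There are micro-benchmarks aimed at exactly that, e.g. lmbench's lat_ctx, which bounces a token between processes blocked on pipes so every hop forces a context switch. Below is a minimal sketch of the same idea (Linux-only; the CPU number and iteration count are arbitrary assumptions on my part):

    /* Minimal sketch of a pipe ping-pong context-switch benchmark,
     * roughly the idea behind lmbench's lat_ctx. Linux-only.
     * Build: gcc -O2 ctxswitch.c -o ctxswitch
     */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    #define ITERATIONS 100000   /* arbitrary; enough to average out noise */

    /* Pin the calling process to one CPU so each ping-pong forces a
     * context switch instead of letting the two processes run in parallel. */
    static void pin_to_cpu(int cpu)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        sched_setaffinity(0, sizeof(set), &set);
    }

    int main(void)
    {
        int ping[2], pong[2];
        char byte = 'x';

        pipe(ping);
        pipe(pong);
        pin_to_cpu(0);              /* assumption: CPU 0 exists and is usable */

        if (fork() == 0) {          /* child: echo every byte straight back */
            pin_to_cpu(0);
            for (int i = 0; i < ITERATIONS; i++) {
                read(ping[0], &byte, 1);
                write(pong[1], &byte, 1);
            }
            exit(0);
        }

        struct timespec start, end;
        clock_gettime(CLOCK_MONOTONIC, &start);
        for (int i = 0; i < ITERATIONS; i++) {
            write(ping[1], &byte, 1);
            read(pong[0], &byte, 1);
        }
        clock_gettime(CLOCK_MONOTONIC, &end);

        double ns = (end.tv_sec - start.tv_sec) * 1e9
                  + (end.tv_nsec - start.tv_nsec);
        /* Each iteration is two switches (parent -> child and back). */
        printf("~%.0f ns per context switch\n", ns / (ITERATIONS * 2.0));
        return 0;
    }

Run it on the bare host and then inside a guest and the per-switch number should already tell part of the story.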
I've always wanted to run the following test (because I know it would have very interesting results) but have never found the time, since VMs can be a bit slow and need a lot of free storage space:
Install HammerDB on your PC. Enable two P-cores and two E-cores only. Run the benchmark with whatever settings you like and note the score (in transactions per second, orders per minute, etc.).
Now enable one more P-core so the OS has a core available for its own tasks and the hypervisor will let you assign the remaining two P-cores and two E-cores to the VM. Install HammerDB in the guest OS (which should ideally be the same as the host OS) and run the same benchmark with the same settings. Ideally, the VM should reach about 90% of native speed, but you may find that the performance hit is far bigger. That gap is the context-switching overhead: every time the benchmark does I/O, the DB writes turn into system calls, each of which forces a switch from user mode to kernel mode (and, under a hypervisor, often a VM exit on top of that), and those transitions are where the time goes.
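As a side note, you can measure that user-to-kernel transition cost directly by timing a do-nothing syscall on the bare host and again inside the guest; the gap gives a feel for what each kernel entry costs under virtualization. A rough Linux-only sketch (the iteration count is an arbitrary assumption):

    /* Rough sketch: time a trivial syscall to estimate the cost of the
     * user -> kernel transition. Run on the host, then inside the guest,
     * and compare. Build: gcc -O2 syscall_cost.c -o syscall_cost
     */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <time.h>
    #include <unistd.h>

    #define ITERATIONS 1000000  /* arbitrary; enough to average out noise */

    int main(void)
    {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (long i = 0; i < ITERATIONS; i++)
            syscall(SYS_getpid);   /* force a real syscall, no libc caching */
        clock_gettime(CLOCK_MONOTONIC, &end);

        double ns = (end.tv_sec - start.tv_sec) * 1e9
                  + (end.tv_nsec - start.tv_nsec);
        printf("~%.1f ns per getpid() syscall\n", ns / ITERATIONS);
        return 0;
    }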
Repeat the same experiment with 4 P-cores and no E-cores on the host OS and 5 P-cores/0 E-cores for the VM run, and that comparison will show you how much of the slowdown comes from involving the E-cores.
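And if you want a quick sanity check of how much slower an E-core actually is before setting up the whole VM experiment, something like the sketch below (Linux-only; the program name, the 2-second window, and the core numbering are my own assumptions) pins a busy loop to whichever core you pass it, so you can run it once on a P-core and once on an E-core and compare the throughput.

    /* Sketch: pin a busy loop to a chosen core and report its throughput,
     * so a P-core and an E-core can be compared directly.
     * Usage: ./corebench <cpu-number>   (core numbering is machine-specific)
     * Build: gcc -O2 corebench.c -o corebench
     */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(int argc, char **argv)
    {
        int cpu = (argc > 1) ? atoi(argv[1]) : 0;

        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return 1;
        }

        /* Spin for roughly 2 seconds, counting loop iterations. */
        struct timespec start, now;
        unsigned long long count = 0;
        clock_gettime(CLOCK_MONOTONIC, &start);
        for (;;) {
            /* Do a chunk of work between clock checks so the loop stays
             * compute-bound rather than dominated by clock_gettime(). */
            for (volatile int i = 0; i < (1 << 20); i++)
                ;
            count += 1 << 20;
            clock_gettime(CLOCK_MONOTONIC, &now);
            if (now.tv_sec - start.tv_sec >= 2)
                break;
        }

        double secs = (now.tv_sec - start.tv_sec)
                    + (now.tv_nsec - start.tv_nsec) / 1e9;
        printf("cpu %d: %.0f loop iterations/sec\n", cpu, count / secs);
        return 0;
    }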