
Trying to profile a notebook failure

Etherized

Junior Member
I'm troubleshooting a family member's MSI EX625 notebook (product page) that freezes, which is one of two problems. The second problem is profiling the conditions for failure so that I can get it fixed. The peculiar conditions of failure don't point to simple overheating; instead, the crashes correlate with GPU frequencies (which never exceed the defaults). Maybe you can offer some insight into what is happening here.

The Problem:
The system will crash to a black screen if a videogame is running and the Windows Power Option for ATI Graphics is set to "Maximize Performance". This sets core/memory GPU clocks to the default 675/800 MHz. Upon failure, the screen will go black, with audio continuing/looping for about five seconds, and then the audio will sustain a stutter, as if the last audio sample is repeatedly played. The system is unresponsive to any key presses/combinations except for the power button. Pressing it sometimes shuts power immediately. Sometimes on restart, Windows shows that it has recovered from an unexpected shutdown from event BlueScreen. As a note, I never see a BSOD. For the Mobility Radeon 4670, Catalyst 12.6 is installed with drivers 8.97.100.3000 direct from AMD. Drivers supplied by the manufacturer also fail in a similar manner. Latest MSI-supplied VBIOS has been applied [011.021.000.006.033017 (BR33017.001)].

Setting the ATI Graphics setting to "Maximize Battery Life", or using MSI Afterburner to downclock the GPU to 335/800 MHz (the lowest clocks allowed by Afterburner), avoids the issue, but this is a non-solution given the limited performance it leaves from an otherwise capable video card. Memtest86 shows no errors despite multiple passes over hours of runtime. The OS has been reinstalled multiple times.

Trigger a crash:
1. Configure the entry in Power Options > Edit Plan Settings > Advanced settings > ATI Graphics to "Maximize Performance", which sets the maximum GPU core/memory frequencies to 675/800MHz at 1.0-1.2V, as reported by GPU-Z
2. Run GPU-intensive programs

To give a better idea of the situation, here are some max temperature readings from GPU-Z, HWMonitor
Crash:
Title                   GPU/DISPIO/MEMIO °C   Core0/Core1 °C   Condition/Runtime
Crysis Benchmark Tool   82/80/83              79/91            Within 5min
^CPU downclocked        73                    59/67            Within 5min
GTA IV                  71/67/71.5            73/88            Within 5min
^CPU downclocked        68/65/69.5            59/66            ^10min+Changing scenery
^GPU 675/400MHz         79/75.9/79            79/87            ^10min+Changing scenery

No crash:
Title                   GPU/DISPIO/MEMIO °C   Core0/Core1 °C   Runtime
Crysis Benchmark Tool
^GPU 335/800MHz         72/70.5/71.5          77/88            30min
^GPU 675/400MHz         82/78.5/81.5          79/90            60min
GTA IV
^GPU 335/800MHz         71/70/72              79/91            60min
Unigine Heaven          84/81.5/84            79/90            3hr
FurMark                 92/91/97.5            75/81            20min
3DMark Vantage          79/77/79.5            76/87            20min
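
To make the "not simply overheating" point easier to show to support, here is a minimal Python sketch that compares the max GPU temperatures of the crashing and non-crashing runs. The numbers are the first (GPU) value of each row in the readings above; the run labels and the helper name `temp_range` are just illustrative, and this only summarizes the readings, it does not diagnose anything.

```python
# Max GPU core temperature (°C) per run, copied from the readings above.
# The Crysis "CPU downclocked" row lists only a single GPU value (73).
runs = [
    # (label, max GPU temp in °C, crashed?)
    ("Crysis Benchmark Tool", 82.0, True),
    ("Crysis, CPU downclocked", 73.0, True),
    ("GTA IV", 71.0, True),
    ("GTA IV, CPU downclocked", 68.0, True),
    ("GTA IV, GPU 675/400MHz", 79.0, True),
    ("Crysis, GPU 335/800MHz", 72.0, False),
    ("Crysis, GPU 675/400MHz", 82.0, False),
    ("GTA IV, GPU 335/800MHz", 71.0, False),
    ("Unigine Heaven", 84.0, False),
    ("FurMark", 92.0, False),
    ("3DMark Vantage", 79.0, False),
]


def temp_range(rows, crashed):
    """Return (min, max) GPU temperature over runs with the given outcome."""
    temps = [temp for _, temp, did_crash in rows if did_crash == crashed]
    return min(temps), max(temps)


crash_range = temp_range(runs, crashed=True)
stable_range = temp_range(runs, crashed=False)

# Stable runs that ran hotter than the hottest crashing run.
hotter_stable = [label for label, temp, did_crash in runs
                 if not did_crash and temp > crash_range[1]]

# Crashing runs that ran cooler than the coolest stable run.
cooler_crash = [label for label, temp, did_crash in runs
                if did_crash and temp < stable_range[0]]

print("Crash runs, GPU temp range: ", crash_range)
print("Stable runs, GPU temp range:", stable_range)
print("Stable but hotter than any crash:", hotter_stable)
print("Crashed but cooler than any stable run:", cooler_crash)
```

The two temperature ranges overlap almost entirely, and the hottest runs (FurMark, Unigine Heaven) are among the stable ones, which is consistent with temperature alone not explaining the crashes.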

Essentially any game that this system has run for an extended period of time crashes at some point, although there are some games, such as Assassin's Creed, that I have not run extensively (more than an hour) and that have not crashed within that limited time frame. Synthetic benchmarks don't seem to trigger the problem.

Interestingly enough, moving the camera/scenery in a game seems to correlate with crashes. For example, in GTA IV or Mass Effect, a fixed scene doesn't seem to trigger crashes within 30 minutes, but moving the camera perspective is highly likely to cause a failure. Switching the scenery entirely is even more likely to do so, such as in GTA IV upon player death or when opening the game menu. In the former case, the scene fades into view as the camera pans across a hospital building exterior, and the crash occurs after a fraction of a second. In the latter, the crash happens when the menu is invoked, or a fraction of a second after the game is resumed. This is similar to what the user in the following thread describes: http://forum-en.msi.com/index.php?topic=128031.msg964188#msg964188

The system crashes at varying time intervals depending on the game. Crysis/Benchmark Tool, GTA IV, and Skyrim crash around 5 minutes. With the CPU/GPU downclocked, GTA IV takes longer to crash. With CPU downclocked the GPU has a low load in typical gameplay (<42% max, <10% typical). Entering a mission cutscene increases the GPU load (consistent ~50%) and a crash is more likely to happen. Mass Effect crashes within 15 minutes while other games can run from 45 minutes to several hours before crashing. These games include Left 4 Dead 2, Civilization V, Borderlands, Rainbow Six Vegas.

This notebook has undergone a few RMAs, OS reinstalls, and essentially a full part replacement. I asked at an MSI user forum, and the advice was simply to keep sending it back until it's solved. For a new RMA, tech support is requesting a failure on the benchmarks, and I don't know of an industry benchmark that exposes the problem. I've sent an RMA before with most of the detail posted here, including Crysis Benchmark Tool, but it seems to have been ignored.

So, what am I seeing here? If you have any suggestions or ideas, I would appreciate the post. Thanks, at the very least, for your time in reading this.
 
Maybe I should try a different approach. Given that the crashes seem correlated with the GPU core clock speed, I've been adjusting this setting and checking stability. GTA IV seems unstable at core clocks exceeding 618MHz, with the memory clock held constant at the default 800MHz. However, Crysis seems more tolerant. So two questions:
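
For anyone who wants to repeat the clock search with fewer test sessions, the adjust-and-check procedure can be organized as a bisection. This is only a sketch under assumptions: `highest_stable_clock` and `is_stable` are hypothetical names, `is_stable` stands in for a manual test (set the core clock in Afterburner, play for a while, report whether it crashed), and the search assumes stability is monotonic in the core clock, which the observations suggest but do not prove. The demo uses a simulated threshold at the 618MHz figure observed for GTA IV.

```python
from typing import Callable


def highest_stable_clock(lo: int, hi: int,
                         is_stable: Callable[[int], bool],
                         step: int = 1) -> int:
    """Bisect for the highest stable core clock (MHz) in [lo, hi].

    Assumes `lo` is known to be stable and that every clock above the
    first unstable one is also unstable (monotonic behaviour).
    """
    if is_stable(hi):
        return hi  # the whole range is stable, nothing to search
    # Invariant: `lo` is stable, `hi` is unstable.
    while hi - lo > step:
        mid = (lo + hi) // 2
        if is_stable(mid):
            lo = mid
        else:
            hi = mid
    return lo


# Simulated stand-in for a manual test session. In practice each call
# would be "set this clock, run the game, did it crash?".
probed = []


def simulated_gta_iv(clock_mhz: int) -> bool:
    probed.append(clock_mhz)
    return clock_mhz <= 618  # threshold taken from the observation above


# 335MHz is the lowest core clock Afterburner allows; 675MHz is the default.
result = highest_stable_clock(335, 675, simulated_gta_iv)
print("Highest stable core clock (simulated):", result, "MHz")
print("Clocks probed:", probed)
```

Keep in mind that a real test run can only show "no crash within the time tested", so a clock reported as stable by a short session may still fail on a longer one.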

What is it about the GPU core clock that would make GTA IV more sensitive than Crysis, and Crysis in turn more sensitive than other games, i.e. what sort of GPU utilization do they invoke more than others? From the conditions for failure (scenery change) I would initially have thought it was related to the memory if anything, but the game seems stable when only the core clock is lowered.

Is it understood that some applications are more sensitive to higher clock rates than others, such as in cases of overclocking?
 