Funky problem with X2

Miles Teg

Junior Member
Dec 16, 2005
5
0
0
So, my machine has developed a strange problem. The specs are:

Abit AN8 Ultra (BIOS 19) nForce4
X2 4200+
2GB OCZ DDR400
eVGA GT7800
Antec 430W TPII
250GB Hitachi T7K250
< 4 months old
No OC

Anyway, the symptoms are thus:

Even on a clean install of XP (XP + 6.66 nForce drivers + 78.01 video drivers) on a bare-bones machine (err, as bare as a modern machine can get), the machine will freeze (no BSOD, no error message, just a freeze). Sometimes the video will flash on and off before the machine crashes.

Now, the odd part is this: If I keep one core pegged with Prime95, Seti@Home or something similar, the system is stable for a couple of days before any problems present themselves. Eventually though, the machine will freeze and require a hard boot.

I've tried the following:
* Memtest86+ 1.55 for a day and no memory errors present themselves
* Prime95 runs on both cores - no errors
* I've tried the XP driver verifier, and it causes further problems by throwing a BSOD during boot (STOP 0x000000D6). Disabling the verifier in safe mode solves that BSOD.

Is this perchance related to the XP SP2 dual-core performance hotfix? What I mean is, are these the type of symptoms one would expect to be solved by that hotfix?
 

montag451

Diamond Member
Dec 17, 2004
4,587
0
0
Up the RAM voltage in BIOS to 2.7v - 2.8v, then see.
Take a stick of RAM out and try it.
Lower the RAM timings and try it.
I hate to say it, but don't suppose you have another psu lying around, do you?
 

kitkat22

Golden Member
Feb 10, 2005
1,464
1,333
136
It seems PSU's are the answer to a lot of things at the moment. Surprisingly it works in many cases. Goes to show you need a good solid PSU up front or you will have problems.
 

montag451

Diamond Member
Dec 17, 2004
4,587
0
0
Although there is an Antec psu in there, I would try another one after the RAM testing, if that turns up nothing bad.
With a solid one, we are still prone to problems, but just less so.
 

Miles Teg

Junior Member
Dec 16, 2005
5
0
0
montag451:

"Up the RAM voltage in BIOS to 2.7v - 2.8v, then see."

I'm already running the ram at 2.8V. When I first got the machine I had to set it to 2.8V to get the machine stable (and when I did that it was very stable until my recent problems).

"Lower the RAM timings and try it."

I'm running the ram at fairly low settings (Cas2.5 instead of the 2 that they are rated at, but still 1T). You think it might be ram, even though memtest doesn't show any errors?

"I hate to say it, but don't suppose you have another psu lying around, do you? "

Actually, I do in my old machine. It's an Antec 350W SP (<5 months old). I will give that a try tonight and post here with my results, good or bad.

Thanks!
Miles Teg
 

Slikkster

Diamond Member
Apr 29, 2000
3,141
0
0
Originally posted by: cscpianoman
It seems PSU's are the answer to a lot of things at the moment. Surprisingly it works in many cases. Goes to show you need a good solid PSU up front or you will have problems.

It's funny that you mention that, because I'm seeing the PSU swap as kind of a standard response nowadays on all kinds of problems. It's akin to the old "reboot" solution to any Microsoft problem.

I'm a firm believer in "whatever works". But I also don't believe in buying things unnecessarily only to find out they don't solve the problem. Sure, you'll always have a spare PSU around, but why?

Anyway, the PSU thing seems to be becoming a mantra, and I don't see that as a positive.

On the other hand, if someone already has a spare PSU to try, far be it from me to talk them out of it. That's a no-brainer.

I dunno...I just cringe now every time I see a problem crop up, and the stock answer is "sounds like a PSU problem". It's probably just me, lol.
 

montag451

Diamond Member
Dec 17, 2004
4,587
0
0
Slikkster,
you are right, and I cringe myself when I type it.

Problem is that psu or ram seems to cover half the problems (????).
Recently, I haven't seen many blown cpu's, or cmos refreshes that fix a problem.

Ah well.
see what happens.
 

kitkat22

Golden Member
Feb 10, 2005
1,464
1,333
136
Ditto on the ram and psu. When it all boils down to it the mobo and cpu are rarely the culprit. I'll have to add the HD takes its share as well. After those it's usually something not installed right, a setting, or drivers. Maybe it's just me, but computers are getting rather complicated now.
 

Slikkster

Diamond Member
Apr 29, 2000
3,141
0
0
Originally posted by: montag451
Slikkster,
you are right, and I cringe myself when I type it.

Problem is that psu or ram seems to cover half the problems (????).
Recently, I haven't seen many blown cpu's, or cmos refreshes that fix a problem.

Ah well.
see what happens.

Just so you know, I'm not talking about you, montag. You obviously have a good grasp on troubleshooting. I just see a lot of other people chime in with the best of intentions, I'm sure, spouting the PSU cheer. So please don't give your advice a second thought...go for what you think. We're all just guessing here, sight unseen, quite often. And again, whatever works is the best advice.

I'm beginning to think there should be some industry certification a la gold standard for PSU's that might help solve some of these problems, as it does seem like there are tons of no name brands out there nowadays.



 

montag451

Diamond Member
Dec 17, 2004
4,587
0
0
Yeah - again, I gotta agree about a standard grade system for psu's.
I still cringe every time someone says they got a 500w unnamed psu.
And goosebumps of shame come on me when I say 'got another psu lying around?', but unfortunately you only realise the error of a cheap psu by direct experience.
 

Miles Teg

Junior Member
Dec 16, 2005
5
0
0
Ok, so I tried a new Power Supply, and I still have the same problem. I have, however, made a possibly significant discovery:

I finally got a few memory dumps, and I ran WinDbg on them. In all of them the "probably caused by" is: ntkrpamp.exe

What is this file (something to do with power management perhaps?), and is there anyone who might be able to help me out here?

Here's the full dump, if anyone has any insight that would be great!

Microsoft (R) Windows Debugger Version 6.5.0003.7
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [D:\Minidump\Mini122805-03.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows XP Kernel Version 2600 (Service Pack 2) MP (2 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 2600.xpsp_sp2_gdr.050301-1519
Kernel base = 0x804d7000 PsLoadedModuleList = 0x8055c700
Debug session time: Wed Dec 28 16:07:22.687 2005 (GMT-7)
System Uptime: 0 days 5:25:50.390
Loading Kernel Symbols
..............................................................................................................
Loading unloaded module list
........
Loading User Symbols
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 9C, {4, 8054d5f0, b2000000, 70f0f}

Probably caused by : ntkrpamp.exe ( nt!KdPitchDebugger+ef4 )

Followup: MachineOwner
---------

0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

MACHINE_CHECK_EXCEPTION (9c)
A fatal Machine Check Exception has occurred.
KeBugCheckEx parameters;
x86 Processors
If the processor has ONLY MCE feature available (For example Intel
Pentium), the parameters are:
1 - Low 32 bits of P5_MC_TYPE MSR
2 - Address of MCA_EXCEPTION structure
3 - High 32 bits of P5_MC_ADDR MSR
4 - Low 32 bits of P5_MC_ADDR MSR
If the processor also has MCA feature available (For example Intel
Pentium Pro), the parameters are:
1 - Bank number
2 - Address of MCA_EXCEPTION structure
3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
4 - Low 32 bits of MCi_STATUS MSR for the MCA bank that had the error
IA64 Processors
1 - Bugcheck Type
1 - MCA_ASSERT
2 - MCA_GET_STATEINFO
SAL returned an error for SAL_GET_STATEINFO while processing MCA.
3 - MCA_CLEAR_STATEINFO
SAL returned an error for SAL_CLEAR_STATEINFO while processing MCA.
4 - MCA_FATAL
FW reported a fatal MCA.
5 - MCA_NONFATAL
SAL reported a recoverable MCA and we don't support currently
support recovery or SAL generated an MCA and then couldn't
produce an error record.
0xB - INIT_ASSERT
0xC - INIT_GET_STATEINFO
SAL returned an error for SAL_GET_STATEINFO while processing INIT event.
0xD - INIT_CLEAR_STATEINFO
SAL returned an error for SAL_CLEAR_STATEINFO while processing INIT event.
0xE - INIT_FATAL
Not used.
2 - Address of log
3 - Size of log
4 - Error code in the case of x_GET_STATEINFO or x_CLEAR_STATEINFO
AMD64 Processors
1 - Bank number
2 - Address of MCA_EXCEPTION structure
3 - High 32 bits of MCi_STATUS MSR for the MCA bank that had the error
4 - Low 32 bits of MCi_STATUS MSR for the MCA bank that had the error
Arguments:
Arg1: 00000004
Arg2: 8054d5f0
Arg3: b2000000
Arg4: 00070f0f

Debugging Details:
------------------


BUGCHECK_STR: 0x9C_IA32_AuthenticAMD

CUSTOMER_CRASH_COUNT: 3

DEFAULT_BUCKET_ID: DRIVER_FAULT

LAST_CONTROL_TRANSFER: from 806e7bff to 804f9c37

SYMBOL_ON_RAW_STACK: 1

STACK_TEXT:
8054d5c8 806e7bff 0000009c 00000004 8054d5f0 nt!KeBugCheckEx+0x1b
8054d6f4 806e2c52 80042000 00000000 00000000 hal!HalpMcaExceptionHandler+0xdd
8054d6f4 00000000 80042000 00000000 00000000 hal!HalpMcaExceptionHandlerWrapper+0x4a


STACK_COMMAND: dds @$csp ; kb

FOLLOWUP_IP:
nt!KdPitchDebugger+ef4
8054d5f0 0100 add [eax],eax

FOLLOWUP_NAME: MachineOwner

SYMBOL_NAME: nt!KdPitchDebugger+ef4

MODULE_NAME: nt

IMAGE_NAME: ntkrpamp.exe

DEBUG_FLR_IMAGE_TIMESTAMP: 42250a1e

FAILURE_BUCKET_ID: 0x9C_IA32_AuthenticAMD_nt!KdPitchDebugger+ef4

BUCKET_ID: 0x9C_IA32_AuthenticAMD_nt!KdPitchDebugger+ef4

Followup: MachineOwner
---------

Thanks,
Miles Teg
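
A note on reading the dump above: per the debugger's own help text, for a 0x9C bugcheck on a processor with MCA the four arguments are the bank number, the address of the MCA_EXCEPTION structure, and the high and low 32 bits of that bank's MCi_STATUS register. The sketch below pulls those values out of the "BugCheck 9C, {...}" line and splits the status word into flags and error codes. The bit positions and the error-code patterns are assumptions taken from the commonly documented x86 machine-check layout, not something confirmed for this particular CPU, and the function names (parse_bugcheck, decode_mci_status) are made up for illustration. It does not say which component is at fault; it only makes the raw numbers easier to look up in the vendor's documentation.

```python
import re

# Flag bits within the HIGH 32 bits of MCi_STATUS (MSR bit 63 is bit 31 here).
# ASSUMPTION: generic x86 machine-check layout; check the CPU vendor's manual.
HIGH_FLAGS = [
    (31, "VAL"),    # status register holds a valid error
    (30, "OVER"),   # an earlier error was overwritten
    (29, "UC"),     # error was not corrected by hardware
    (28, "EN"),     # reporting for this error was enabled
    (27, "MISCV"),  # MCi_MISC register is valid
    (26, "ADDRV"),  # MCi_ADDR register is valid
    (25, "PCC"),    # processor context may be corrupt
]


def parse_bugcheck(line):
    """Return (code, [args]) from a WinDbg 'BugCheck XX, {a, b, c, d}' line."""
    m = re.match(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line.strip())
    if m is None:
        raise ValueError("not a BugCheck summary line: %r" % line)
    code = int(m.group(1), 16)
    args = [int(a.strip(), 16) for a in m.group(2).split(",")]
    return code, args


def decode_mci_status(high, low):
    """Split the two 32-bit halves of MCi_STATUS into named fields."""
    flags = [name for bit, name in HIGH_FLAGS if (high >> bit) & 1]
    mca_code = low & 0xFFFF            # architectural error code, bits 15:0
    model_code = (low >> 16) & 0xFFFF  # model-specific code, bits 31:16

    # ASSUMPTION: compound error-code patterns from the generic MCA layout.
    if mca_code & 0xF800 == 0x0800:
        kind = "bus/interconnect"
    elif mca_code & 0xFF00 == 0x0100:
        kind = "memory hierarchy (cache)"
    elif mca_code & 0xFFF0 == 0x0010:
        kind = "TLB"
    else:
        kind = "other"

    return {
        "flags": flags,
        "mca_error_code": mca_code,
        "model_specific_code": model_code,
        "kind": kind,
    }


if __name__ == "__main__":
    code, args = parse_bugcheck("BugCheck 9C, {4, 8054d5f0, b2000000, 70f0f}")
    bank, _exception_addr, status_high, status_low = args
    print("bugcheck 0x%X, bank %d" % (code, bank))
    print(decode_mci_status(status_high, status_low))
```

Fed the line from the dump, this reports bank 4 with the VAL, UC, EN and PCC flags set and an error code that falls in the bus/interconnect pattern. What bank 4 maps to, and what the model-specific code means, depends on the processor family and needs the vendor's documentation to interpret.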