Random BSOD's

Page 2 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

Sianath

Senior member
Sep 1, 2001
437
0
0
If you are rebooting without even the bugcheck screen (regardless of getting a dump or not) then it's hardware. To be sure, turn off the option to automatically reboot. Then if it bugchecks you'll see the bluescreen (even if you don't write a dump).

If you STILL don't see it, then go through the hardware motions. Check RAM, make sure BIOS/CPU is updated, etc.

:(
 

scoughlin1

Member
Jun 15, 2004
58
0
0
The option for automatically rebooting has been off for quite some time. I was going to update my BIOS, but I heard bad stories about the newer ones for my board. It appears that I got a dump from a crash while I was gone a bit ago; it's about 69 MB.

Edit: I also got a minidump.
 

Sianath

Senior member
Sep 1, 2001
437
0
0
Smooth. Send me the minidump please. I didn't get a chance (meetings all day) to set up an FTP site for the larger dumps, but if you have an FTP site you want to put it on, I'll grab it and look at it.
 

scoughlin1

Member
Jun 15, 2004
58
0
0
Sent. I took a look at the larger dump file in Debugging Tools, and it listed a different file as the probable cause of the error: ntkrnlmp.exe.
 

mechBgon

Super Moderator, Elite Member
Oct 31, 1999
30,699
1
0
Set your memory voltage to 2.8 volts and your memory timings to 3-3-3-8. The chipset used on your motherboard is known for certain traits, one of them being that it often needs some extra memory voltage and relaxed timings.
 

scoughlin1

Member
Jun 15, 2004
58
0
0
Originally posted by: mechBgon
Set your memory voltage to 2.8 volts and your memory timings to 3-3-3-8. The chipset used on your motherboard is known for certain traits, one of them being that it often needs some extra memory voltage and relaxed timings.

Done. The default timings were 3-4-4-7, so I left everything the same except for the last value, which I switched to 8.
 

oldman420

Platinum Member
May 22, 2004
2,179
0
0
It's the Raptor. Get one of those cable clamps from WD; the data cables on that drive do come loose from system vibration, and that crashes XP.
SATA drives kind of need an enclosure with a hard-wired connector to be 100% reliable. All I had to do was barely touch the cables on my Raptors and they would momentarily lose connection. SATA connectors are flimsy and prone to problems. I would get a WD drive connector or superglue the connector to the drive.
Here's a link to the connector:
http://westerndigital.com/en/products/wdsc50rcw.asp
 

scoughlin1

Member
Jun 15, 2004
58
0
0
I ordered one of those cables a few hours ago. I just had a crash about 10 minutes ago and it put out another minidump and a bigger one. The memory.dmp file seems to have caught a driver that's been messing up my system. Here's a bit of what debugging tools put out:

DRIVER_VERIFIER_DETECTED_VIOLATION (c4)
A device driver attempting to corrupt the system has been caught. This is
because the driver was specified in the registry as being suspect (by the
administrator) and the kernel has enabled substantial checking of this driver.
If the driver attempts to corrupt the system, bugchecks 0xC4, 0xC1 and 0xA will
be among the most commonly seen crashes.

IMAGE_NAME: wpsdrvnt.sys

I did a search and found this driver in my Sygate firewall program files folder.
 

Sianath

Senior member
Sep 1, 2001
437
0
0
Check it out, yo!

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

UNEXPECTED_KERNEL_MODE_TRAP_M (1000007f)
This means a trap occurred in kernel mode, and it's a trap of a kind
that the kernel isn't allowed to have/catch (bound trap) or that
is always instant death (double fault). The first number in the
bugcheck parens is the number of the trap (8 = double fault, etc)
Consult an Intel x86 family manual to learn more about what these
traps are. Here is a *portion* of those codes:
If kv shows a taskGate
use .tss on the part before the colon, then kv.
Else if kv shows a trapframe
use .trap on that value
Else
.trap on the appropriate frame will show where the trap was taken
(on x86, this will be the ebp that goes with the procedure KiTrap)
Endif
kb will then show the corrected stack.
Arguments:
Arg1: 00000008, EXCEPTION_DOUBLE_FAULT
Arg2: 80042000
Arg3: 00000000
Arg4: 00000000

Double fault is almost always hardware... but we'll look further.
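For anyone reading along, the "first number in the bugcheck parens" can be decoded with a small lookup. This is only a rough sketch: the table holds a handful of the standard x86 exception vectors rather than the full list, and the names `X86_TRAPS` and `trap_from_arg1` are made up for illustration, not part of any debugger.

```python
import re

# A few standard x86 exception vectors (see the Intel manuals for the
# complete table). This subset is an illustrative assumption, not exhaustive.
X86_TRAPS = {
    0x00: "Divide Error",
    0x05: "BOUND Range Exceeded",
    0x06: "Invalid Opcode",
    0x08: "Double Fault",
    0x0C: "Stack-Segment Fault",
    0x0D: "General Protection",
    0x0E: "Page Fault",
}


def trap_from_arg1(line: str) -> str:
    """Return a readable trap name for the 'Arg1: ...' line of a 0x7F bugcheck."""
    match = re.match(r"\s*Arg1:\s*([0-9a-fA-F]+)", line)
    if match is None:
        raise ValueError("not an Arg1 line")
    vector = int(match.group(1), 16)  # debugger prints the argument in hex
    return X86_TRAPS.get(vector, f"unknown trap 0x{vector:02x}")


print(trap_from_arg1("Arg1: 00000008, EXCEPTION_DOUBLE_FAULT"))  # Double Fault
```

Feeding it the Arg1 line from the dump above gives the same answer the debugger already printed, so it is mostly useful when all you have is the raw number off a bluescreen.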

0: kd> r
eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=85d865f0 edi=00000000
eip=804f577d esp=f76f6a9c ebp=f7676ae4 iopl=0 nv up ei pl zr na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
nt!KeSetEvent+0x77:
804f577d 5f pop edi

I wouldn't expect you to know this, but if you look at ESP (stack pointer) and EBP (base pointer), you'll notice that ESP is a larger number than EBP. This should NEVER happen, because the stack starts high in memory and works its way down toward lower numbers. ESP should always be smaller than EBP.

0: kd> .formats esp^ebp
Evaluate expression:
Hex: 00080078
Decimal: 524408
Octal: 00002000170
Binary: 00000000 00001000 00000000 01111000
Chars: ...x
Time: Tue Jan 06 17:40:08 1970
Float: low 7.34852e-040 high 0
Double: 2.59092e-318

If you look at the XOR of ESP and EBP in binary (the 1s mark the bits where the two registers differ), you'll notice a single 1 on the high end, at bit 19 (0x00080000). If that bit were flipped to a 0 in ESP, you would have a valid value for ESP. The hex value with that bit flipped to a 0 is F7676A9C, which is smaller than EBP and in the range for the stack.
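If you want to check the arithmetic outside the debugger, here is a minimal sketch that redoes it with the ESP and EBP values from the register dump above. The helper names are hypothetical and chosen just for this example.

```python
# Register values copied from the "r" output above.
ESP = 0xF76F6A9C
EBP = 0xF7676AE4


def highest_differing_bit(a: int, b: int) -> int:
    """Index of the most significant bit where a and b differ."""
    return (a ^ b).bit_length() - 1


def clear_bit(value: int, bit: int) -> int:
    """Return value with the given bit forced to 0."""
    return value & ~(1 << bit)


bit = highest_differing_bit(ESP, EBP)
fixed_esp = clear_bit(ESP, bit)

print(f"xor       = {ESP ^ EBP:08X}")    # 00080078, matches .formats
print(f"bit       = {bit}")              # 19
print(f"fixed esp = {fixed_esp:08X}")    # F7676A9C
print(f"below ebp = {fixed_esp < EBP}")  # True
```

The low bits of the XOR (0x78) are just the normal distance between ESP and EBP within the same stack frame; only the lone high bit looks out of place.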

0: kd> !cpuid
CP F/M/S Manufacturer MHz
0 15,2,9 GenuineIntel 2405
Unable to get information for processor 1

Check it out. Hope this helps!

842465 Stop 7F, 0x00000008 (double-fault) error occurs because of a single-bit
http://support.microsoft.com/?id=842465
 

Sianath

Senior member
Sep 1, 2001
437
0
0
If Verifier caught a driver, update the driver. Even if it's not the cause of the majority of your crashes (the ones you sent me the dumps for appear to be hardware, not software), it'll bite you in the future for sure. Verifier is a GREAT tool.
 

scoughlin1

Member
Jun 15, 2004
58
0
0
The computer hasn't crashed since I replaced Sygate with Kerio. If it happens again I'll look into getting a new CPU.
 

foamfoot33

Junior Member
Jun 21, 2004
8
0
0
scough, just curious if you're still having problems. I have a VERY similar setup to yours and have been having identical problems for the past 2 weeks. Can't seem to figure it out. My post is "reboot probs". I was wondering what else you tried.
 

scoughlin1

Member
Jun 15, 2004
58
0
0
No problems in about the last 48 hours. Read Sianath's post on 6/17 about enabling the driver verifier. That helped me identify a driver that was causing conflicts.
 

Sianath

Senior member
Sep 1, 2001
437
0
0
The link I posted has you update the CPU microcode. I wouldn't replace the CPU unless that fails to resolve the issue.

:)
 

scoughlin1

Member
Jun 15, 2004
58
0
0
I went for over 6 days without a crash but in the last 24 hours or so it's like the gates of hell have been opened. About a dozen crashes just since I've started using the computer today. I woke up to the DRIVER_IRQL_NOT_LESS_OR_EQUAL. I rebooted and quickly got another BSOD. Soon after I went back to the driver verifier and set it to check ALL drivers. When the system booted after the required restart, I got a blue screen before I could even get to the desktop. It said this:
IO SYSTEM VERIFICATION ERROR in fwdrv.sys (WDM DRIVER ERROR 224)
fwdrv.sys+2480 at EEE79480
After I rebooted again it only took about 3 minutes to get this error:
IO SYSTEM VERIFICATION ERROR (WDM DRIVER ERROR 203)
[+0 at 00000000]
and this is the error I got over and over until I turned the driver verifier off for the time being.
Any ideas?
 

scoughlin1

Member
Jun 15, 2004
58
0
0
I looked at the description for the latest BIOS and it said it updates the microcode. I tried to boot from the floppy and it gave me an I/O error; I think my drive is just broken or something. I have a 3200+ and motherboard coming on Thursday, so I kind of doubt I'm gonna bother trying to fix this current setup.