Random BSODs at regular intervals....

sgrinavi

Diamond Member
Jul 31, 2007
4,537
0
76
With regard to the system in my sig...

I have started to get some strange, random BSODs. Here are the last 4:

Stop: 0x00000024 (0x000000000029033d,0xfffffadf9117a4f0,0xfffffadf9039f854)
NTFS.SYS address fffffadf9039f854 base at fffffadf9039000,date stamp 45d699ef


Stop: 0x00000050 (0xfffffa7f863dfd79,0x0000000000000000, 0xfffff97fff0d5d4e,0x0000000000000007)
Win32k.sys ? address fffff97fff0d5d4e base at fffff97fff000000, date stamp 45e6f310

Stop: 0x00000024 (0x000000000019033d, 0xfffffadf910b72a0, 0xfffffadf910b6cb0,0xfffffadf9031ef7f)
Snapman.sys address FFFFFadF9031EF7F, base at FFFFFADF903030000, date stamp 442bc8e1

Stop 0x0000000a (0xffffdadf9cbcef60, 0x0000000000000002,0x0000000000000000,0xffffff80001039d9c)
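For reference, the first number after "Stop" is the bug-check code, and the three codes quoted above have standard names. A minimal Python sketch of that lookup follows; the table covers only the codes in this post, and the function name is made up for illustration.

```python
import re

# Standard Windows bug-check names for the three codes quoted above.
BUGCHECK_NAMES = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0x24: "NTFS_FILE_SYSTEM",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
}

# Matches "Stop: 0x00000024" as well as "Stop 0x0000000a" (colon optional).
STOP_RE = re.compile(r"stop:?\s*0x([0-9a-f]+)", re.IGNORECASE)


def name_stop_code(line):
    """Return the bug-check name for a hand-copied Stop line, or None."""
    match = STOP_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return BUGCHECK_NAMES.get(code, "UNKNOWN_0x%X" % code)


if __name__ == "__main__":
    for stop_line in (
        "Stop: 0x00000024 (0x000000000029033d,...)",
        "Stop: 0x00000050 (0xfffffa7f863dfd79,...)",
        "Stop 0x0000000a (0xffffdadf9cbcef60,...)",
    ):
        print(name_stop_code(stop_line))
```

Three different bug-check codes in four crashes is already a hint that the failures are not tied to one driver.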



Nothing consistent that I can see, and none of my searches have yielded anything that helps. I cannot trace the problem to a precise event, but I made the following changes on or about the time it started:

1. Swapped out the 2 GB of GeIL RAM for 2 GB of Mushkin matching what I have. I tried pulling them out, but there was no change in symptoms.

2. Swapped out one of my 320 GB drives with a newer one. I don't think this has anything to do with it, but I seem to be getting a lot of rebuilds on my RAID 5.

3. Pulled the system down to rearrange cables. I don't think this had anything to do with it.. but... who knows, maybe I fat-fingered something.

As I said, I can not trace it to a specific event.

Possible causes that I have found in my research are:

1. Bad PSU
2. Bad Video Drivers, does not seem likely as I have not changed them.
3. Faulty RAM.

Personally, I think I may have a bad PCIE slot, but no one that I polled (including GIGABYTE tech support) has ever seen this...

Any ideas?
 

dclive

Elite Member
Oct 23, 2003
5,626
2
81
Check out the BSOD website and debug your dumps (see my .sig for full info) ... from there post the output you get from an !analyze -v from the debugger.

Do the same for each of your dumps.
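Running the analysis over every minidump can be scripted. The sketch below only builds the command lines; it assumes the console debugger kd.exe from Debugging Tools for Windows is on the PATH, and it reuses the symbol path that appears in the debugger output later in this thread. The function name and the default dump folder are illustrative.

```python
import glob
import os

# Symbol path as it appears in the debugger output later in this thread.
SYMBOL_PATH = r"SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols"


def kd_command(dump_path, kd_exe="kd.exe"):
    """Build a kd command line that opens one dump, runs !analyze -v, and quits.

    -y sets the symbol path, -z names the dump file, -c gives the commands
    to run at startup. kd_exe being on the PATH is an assumption.
    """
    return [kd_exe, "-y", SYMBOL_PATH, "-z", str(dump_path), "-c", "!analyze -v; q"]


def commands_for_folder(folder=r"C:\WINDOWS\Minidump"):
    """Return one command line per .dmp file in the folder, sorted by name."""
    pattern = os.path.join(folder, "*.dmp")
    return [kd_command(path) for path in sorted(glob.glob(pattern))]


if __name__ == "__main__":
    for cmd in commands_for_folder():
        # To actually run it: subprocess.run(cmd, capture_output=True, text=True)
        print(" ".join(cmd))
```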
 

sgrinavi

Hi and Thanks,

What's the deal with the symbols? I don't understand.... I would like to get that working.

Anyhow. On some of the reboots the video did not come back up... on a whim I removed the video driver and the system has been running all day without a single BSOD. From what I have seen it is not uncommon.

BTW, I am running xp64 pro
 

dclive

You can look at the website (see my .sig) for details on how to correctly install the debugger and configure it so your symbols will work correctly. Did you find that part?
 

dclive

Originally posted by: sgrinavi
On some of the reboots the video did not come back up... on a whim I removed the video driver and the system has been running all day without a single BSOD.

You may have removed it, but it would immediately reinstall on the next bootup, using drivers found in c:\windows\inf's INF files. Did it do a hardware PnP after you rebooted? Did the date and version of the drivers change after you did this?
 

sgrinavi

Hi and thanks again,

I appreciate the instructions -- I got it to work, but I still do not really understand the whole symbols deal... Yes, it asked to reinstall the video drivers, but I did not let it. The system stayed up all day and rebuilt one of the RAID drives. I installed the oldest WHQL x64 driver with 8800 support that I could find, 97.44. I found a couple of user reviews that said it was solid. In any case, the drivers installed and seem to work fine... but.. I tried to run World In Conflict and BAM, the BSODs started again.

Here's a debug from last night; I think this is the one that happened at that point:



Microsoft (R) Windows Debugger Version 6.7.0005.1
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\WINDOWS\Minidump\Mini100707-04.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
*** WARNING: Unable to verify checksum for ntkrnlmp.exe
Windows Server 2003 Kernel Version 3790 (Service Pack 2) MP (4 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 3790.srv03_sp2_gdr.070321-2337
Kernel base = 0xfffff800`01000000 PsLoadedModuleList = 0xfffff800`011d4140
Debug session time: Sun Oct 7 17:51:20.781 2007 (GMT-4)
System Uptime: 0 days 3:38:33.875
*** WARNING: Unable to verify checksum for ntkrnlmp.exe
Loading Kernel Symbols
..........................................................................................................
Loading User Symbols
Loading unloaded module list
..
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck A, {fffff800014141d1, 2, 0, fffff8000104f236}

Probably caused by : tcpip.sys ( tcpip!ProcessSynTcbs+f0 )

Followup: MachineOwner
---------

3: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: fffff800014141d1, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: fffff8000104f236, address which referenced memory

Debugging Details:
------------------


READ_ADDRESS: fffff800014141d1

CURRENT_IRQL: 2

FAULTING_IP:
nt!RtlVirtualUnwind+132
fffff800`0104f236 440fb603 movzx r8d,byte ptr [rbx]

CUSTOMER_CRASH_COUNT: 4

DEFAULT_BUCKET_ID: COMMON_SYSTEM_FAULT

BUGCHECK_STR: 0xA

PROCESS_NAME: Idle

TRAP_FRAME: fffffadf910393b0 -- (.trap 0xfffffadf910393b0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=fffffadf9aa1a040 rbx=0000000000000000 rcx=ff00fadf9aa14950
rdx=0000000000010000 rsi=0000000000000000 rdi=0000000000000000
rip=fffff80001024f90 rsp=fffffadf91039548 rbp=ff00fadf9aa14940
r8=0000000000004001 r9=0000000000000005 r10=0000000000000003
r11=fffffadf87c9f1b0 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei ng nz na po cy
nt!KeAcquireSpinLockAtDpcLevel:
fffff800`01024f90 f0480fba2900 lock bts qword ptr [rcx],0 ds:ff00fadf`9aa14950=????????????????
Resetting default scope

EXCEPTION_RECORD: fffffadf91039320 -- (.exr 0xfffffadf91039320)
ExceptionAddress: fffff80001024f90 (nt!KeAcquireSpinLockAtDpcLevel)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 0000000000000000
Parameter[1]: ffffffffffffffff
Attempt to read from address ffffffffffffffff

LAST_CONTROL_TRANSFER: from fffff8000102e5b4 to fffff8000102e890

STACK_TEXT:
fffffadf`910381d8 fffff800`0102e5b4 : 00000000`0000000a fffff800`014141d1 00000000`00000002 00000000`00000000 : nt!KeBugCheckEx
fffffadf`910381e0 fffff800`0102d547 : 00000000`00000202 fffff800`01025817 fffffadf`9c244000 fffffadf`90334346 : nt!KiBugCheckDispatch+0x74
fffffadf`91038360 fffff800`0104f236 : 00000000`00000000 fffffadf`9c0d4000 fffffadf`00000056 fffffadf`91038638 : nt!KiPageFault+0x207
fffffadf`910384f0 fffff800`01054a97 : fffffadf`9c1a0660 fffffadf`91039db0 00000000`00000000 00000000`91038d30 : nt!RtlVirtualUnwind+0x132
fffffadf`91038570 fffff800`0100b901 : fffffadf`91039320 fffffadf`91038d30 fffffadf`91039320 fffffadf`91039430 : nt!RtlDispatchException+0x10b
fffffadf`91038c30 fffff800`0102e6af : fffffadf`91039320 fffffadf`9c0016b0 fffffadf`910393b0 00000000`00000000 : nt!KiDispatchException+0xd9
fffffadf`91039230 fffff800`0102d30d : 00000000`0001896a fffff800`0102f557 00000000`00000001 fffffadf`9c10d010 : nt!KiExceptionExit
fffffadf`910393b0 fffff800`01024f90 : fffffadf`87c05ec9 00000000`00000003 00000000`00000000 00000000`00000000 : nt!KiGeneralProtectionFault+0xcd
fffffadf`91039548 fffffadf`87c05ec9 : 00000000`00000003 00000000`00000000 00000000`00000000 00000000`00000003 : nt!KeAcquireSpinLockAtDpcLevel
fffffadf`91039550 fffffadf`87c17916 : 00000000`000000a8 00000000`00000000 00000000`0001ffa7 00000000`00000003 : tcpip!ProcessSynTcbs+0xf0
fffffadf`910395e0 fffff800`010285a1 : 00000000`00000010 00000000`00000217 00000000`00000000 00000000`00000000 : tcpip!TCBTimeout+0x1ee2
fffffadf`91039d20 fffff800`01067c10 : fffffadf`90c8b180 fffffadf`90c8b180 00000000`00000000 fffffadf`90c93680 : nt!KiRetireDpcList+0x150
fffffadf`91039db0 fffff800`014141d1 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x50
fffffadf`91039de0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemStartup+0x1bf


STACK_COMMAND: kb

FOLLOWUP_IP:
tcpip!ProcessSynTcbs+f0
fffffadf`87c05ec9 0fba63381f bt dword ptr [rbx+38h],1Fh

SYMBOL_STACK_INDEX: 9

SYMBOL_NAME: tcpip!ProcessSynTcbs+f0

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: tcpip

IMAGE_NAME: tcpip.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 45d699a3

FAILURE_BUCKET_ID: X64_0xA_tcpip!ProcessSynTcbs+f0

BUCKET_ID: X64_0xA_tcpip!ProcessSynTcbs+f0

Followup: MachineOwner
---------






Here's one AFTER that... it implicates iastor.sys...



3: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M (1000007e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Some common problems are exception code 0x80000003. This means a hard
coded breakpoint or assertion was hit, but this system was booted
/NODEBUG. This is not supposed to happen as developers should never have
hardcoded breakpoints in retail code, but ...
If this happens, make sure a debugger gets connected, and the
system is booted /DEBUG. This will let us see why this breakpoint is
happening.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffffadf9058e138, The address that the exception occurred at
Arg3: fffffadf91143a80, Exception Record Address
Arg4: fffffadf91143490, Context Record Address

Debugging Details:
------------------


EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

FAULTING_IP:
iaStor+29138
fffffadf`9058e138 498b4008 mov rax,qword ptr [r8+8]

EXCEPTION_RECORD: fffffadf91143a80 -- (.exr 0xfffffadf91143a80)
ExceptionAddress: fffffadf9058e138 (iaStor+0x0000000000029138)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 0000000000000000
Parameter[1]: ffffffffffffffff
Attempt to read from address ffffffffffffffff

CONTEXT: fffffadf91143490 -- (.cxr 0xfffffadf91143490)
rax=fffffadf9b873340 rbx=fffffadf9b5d0000 rcx=fffffadf9ba93070
rdx=fffffadf9ba930f0 rsi=fffffadf9b60cb50 rdi=fffffadf9b86fa60
rip=fffffadf9058e138 rsp=fffffadf91143ca0 rbp=0000000000000080
r8=ffff00000000c800 r9=fffffadf90c8b180 r10=0000000000000000
r11=0000000000000000 r12=fffffadf9bd23c30 r13=0000000000000000
r14=fffffadf9ca8f040 r15=fffffadf90c8d600
iopl=0 nv up ei ng nz na po nc
cs=0010 ss=0018 ds=002b es=002b fs=0053 gs=002b efl=00010286
iaStor+0x29138:
fffffadf`9058e138 498b4008 mov rax,qword ptr [r8+8] ds:002b:ffff0000`0000c808=????????????????
Resetting default scope

CUSTOMER_CRASH_COUNT: 5

DEFAULT_BUCKET_ID: COMMON_SYSTEM_FAULT

PROCESS_NAME: System

CURRENT_IRQL: 0

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

READ_ADDRESS: ffffffffffffffff

BUGCHECK_STR: 0x7E

LAST_CONTROL_TRANSFER: from 0000000000000000 to fffffadf9058e138

STACK_TEXT:
fffffadf`91143ca0 00000000`00000000 : 00000000`00000000 fffffadf`9bd23c20 fffffadf`905a5278 fffffadf`9b86fa60 : iaStor+0x29138
fffffadf`91143ca8 00000000`00000000 : fffffadf`9bd23c20 fffffadf`905a5278 fffffadf`9b86fa60 fffffadf`9058dc1b : 0x0
fffffadf`91143cb0 fffffadf`9bd23c20 : fffffadf`905a5278 fffffadf`9b86fa60 fffffadf`9058dc1b fffffadf`9ba93070 : 0x0
fffffadf`91143cb8 fffffadf`905a5278 : fffffadf`9b86fa60 fffffadf`9058dc1b fffffadf`9ba93070 00000000`00000000 : 0xfffffadf`9bd23c20
fffffadf`91143cc0 fffffadf`9b86fa60 : fffffadf`9058dc1b fffffadf`9ba93070 00000000`00000000 fffffadf`9bd23c30 : iaStor+0x40278
fffffadf`91143cc8 fffffadf`9058dc1b : fffffadf`9ba93070 00000000`00000000 fffffadf`9bd23c30 fffffadf`9b0a58f0 : 0xfffffadf`9b86fa60
fffffadf`91143cd0 fffffadf`9ba93070 : 00000000`00000000 fffffadf`9bd23c30 fffffadf`9b0a58f0 fffffadf`9c4e0bf0 : iaStor+0x28c1b
fffffadf`91143cd8 00000000`00000000 : fffffadf`9bd23c30 fffffadf`9b0a58f0 fffffadf`9c4e0bf0 00000000`00000080 : 0xfffffadf`9ba93070
fffffadf`91143ce0 fffffadf`9bd23c30 : fffffadf`9b0a58f0 fffffadf`9c4e0bf0 00000000`00000080 fffffadf`9b0a58f0 : 0x0
fffffadf`91143ce8 fffffadf`9b0a58f0 : fffffadf`9c4e0bf0 00000000`00000080 fffffadf`9b0a58f0 fffffadf`9059346a : 0xfffffadf`9bd23c30
fffffadf`91143cf0 fffffadf`9c4e0bf0 : 00000000`00000080 fffffadf`9b0a58f0 fffffadf`9059346a fffffadf`9bd23c20 : 0xfffffadf`9b0a58f0
fffffadf`91143cf8 00000000`00000080 : fffffadf`9b0a58f0 fffffadf`9059346a fffffadf`9bd23c20 fffffadf`9c4e0bf0 : 0xfffffadf`9c4e0bf0
fffffadf`91143d00 fffffadf`9b0a58f0 : fffffadf`9059346a fffffadf`9bd23c20 fffffadf`9c4e0bf0 fffffadf`00000000 : 0x80
fffffadf`91143d08 fffffadf`9059346a : fffffadf`9bd23c20 fffffadf`9c4e0bf0 fffffadf`00000000 00000000`00000000 : 0xfffffadf`9b0a58f0
fffffadf`91143d10 fffffadf`9bd23c20 : fffffadf`9c4e0bf0 fffffadf`00000000 00000000`00000000 00000000`00000000 : iaStor+0x2e46a
fffffadf`91143d18 fffffadf`9c4e0bf0 : fffffadf`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffffadf`9bd23c20
fffffadf`91143d20 fffffadf`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffffadf`9c4e0bf0
fffffadf`91143d28 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffffadf`00000000
fffffadf`91143d30 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 fffffadf`9bd23c30 : 0x0
fffffadf`91143d38 00000000`00000000 : 00000000`00000000 00000000`00000000 fffffadf`9bd23c30 fffffadf`9bd23c48 : 0x0
fffffadf`91143d40 00000000`00000000 : 00000000`00000000 fffffadf`9bd23c30 fffffadf`9bd23c48 00000000`00000000 : 0x0
fffffadf`91143d48 00000000`00000000 : fffffadf`9bd23c30 fffffadf`9bd23c48 00000000`00000000 fffff800`0124a972 : 0x0
fffffadf`91143d50 fffffadf`9bd23c30 : fffffadf`9bd23c48 00000000`00000000 fffff800`0124a972 fffffadf`9c4e0bf0 : 0x0
fffffadf`91143d58 fffffadf`9bd23c48 : 00000000`00000000 fffff800`0124a972 fffffadf`9c4e0bf0 00000000`00000080 : 0xfffffadf`9bd23c30
fffffadf`91143d60 00000000`00000000 : fffff800`0124a972 fffffadf`9c4e0bf0 00000000`00000080 fffffadf`9c4e0bf0 : 0xfffffadf`9bd23c48
fffffadf`91143d68 fffff800`0124a972 : fffffadf`9c4e0bf0 00000000`00000080 fffffadf`9c4e0bf0 fffffadf`90c93680 : 0x0
fffffadf`91143d70 fffff800`01020226 : fffffadf`90c8b180 fffffadf`9c4e0bf0 fffffadf`90c93680 00000000`00000000 : nt!PspSystemThreadStartup+0x3e
fffffadf`91143dd0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxStartSystemThread+0x16


FOLLOWUP_IP:
iaStor+29138
fffffadf`9058e138 498b4008 mov rax,qword ptr [r8+8]

SYMBOL_STACK_INDEX: 0

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: iaStor

IMAGE_NAME: iaStor.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 4696b5c1

SYMBOL_NAME: iaStor+29138

STACK_COMMAND: .cxr 0xfffffadf91143490 ; kb

FAILURE_BUCKET_ID: X64_0x7E_iaStor+29138

BUCKET_ID: X64_0x7E_iaStor+29138

Followup: MachineOwner
---------
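One detail the two dumps above share: the pointer being dereferenced at the faulting instruction (rcx=ff00fadf9aa14950 in the first, r8=ffff00000000c800 in the second) is not a canonical x64 address. That is the likely reason both exception records report a read from ffffffffffffffff instead of the real address. A minimal Python sketch of the check, assuming 48-bit virtual addressing; the function name is made up for illustration.

```python
def is_canonical(address):
    """Return True if a 64-bit value is a canonical x64 virtual address.

    Assumes 48-bit virtual addressing: bits 63..47 must all be copies of
    bit 47, so the top 17 bits are either all zeros or all ones.
    """
    top17 = (address & 0xFFFFFFFFFFFFFFFF) >> 47
    return top17 == 0 or top17 == 0x1FFFF


if __name__ == "__main__":
    samples = {
        "rcx, first dump": 0xFF00FADF9AA14950,
        "r8, second dump": 0xFFFF00000000C800,
        "rax, first dump": 0xFFFFFADF9AA1A040,
    }
    for label, value in samples.items():
        print("%-16s %016x canonical=%s" % (label, value, is_canonical(value)))
```

The first bad pointer differs from a plausible kernel address (fffffadf9aa14950) in a single byte. Damage like that is consistent with bad memory, although a misbehaving driver could produce it too.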

 

oynaz

Platinum Member
May 14, 2003
2,449
2
81
Seems like your problem is your video card - either a driver problem or defective hardware. You might want to contact NVIDIA's tech support.
 

sgrinavi

Originally posted by: oynaz
Seems like your problem is your video card - either a driver problem or defective hardware. You might want to contact NVIDIA's tech support.

Hi and Thanks for your response...

I'm not sure anymore. The frequency of the BSODs has decreased, but they are still present. I am starting to lean towards the RAID drivers. I am going to disconnect the RAID and install a single drive with Win XP x64 and see what happens.

 

sgrinavi

First the good news: I abandoned the RAID setup and installed XP x64 on a clean drive. The system worked flawlessly and would not crash.

The bad news: when I installed my chipset drivers I lost my USB ports; they all come up as unavailable in Device Manager.

There is no apparent IRQ conflict, but then I don't know what to look for..

Any ideas?
 

sgrinavi

pfffftttt.....

Can not even install SP2 on it now.

1. Installed Windows
2. Added net card drivers
3. Added wireless drivers
4. Added chipset drivers (killed USB ports)
5. Tried to install onboard sound drivers; failed looking for updated HD drivers
6. Tried to install SP2, BSODs started AGAIN!
7. Rolled back system to step 2.
8. Tried to install SP2, BSODs almost instantly.

What next?
 

dclive

Easy - debug the dump files to see if it's software (in which case the dumps usually all point to the same issue) or hardware (in which case, usually but not always, there are lots of random dumps with random errors).

Check my .sig for details on how to.

Post the !analyze -v output when you get it set up (of a few dumpfiles not just one).
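The rule of thumb above (same culprit every time suggests software, a different culprit each time suggests hardware) can be sketched in Python by comparing the FAILURE_BUCKET_ID line across several saved !analyze -v outputs. The function names and the wording of the verdicts are illustrative, and the heuristic is only as reliable as the post says: usually, not always.

```python
import re
from collections import Counter

# !analyze -v prints one FAILURE_BUCKET_ID line per dump.
BUCKET_RE = re.compile(r"^FAILURE_BUCKET_ID:\s*(\S+)", re.MULTILINE)


def failure_buckets(analyze_outputs):
    """Extract the failure bucket from each saved !analyze -v output."""
    buckets = []
    for text in analyze_outputs:
        match = BUCKET_RE.search(text)
        if match:
            buckets.append(match.group(1))
    return buckets


def guess_cause(buckets):
    """Apply the same-bucket vs. different-bucket rule of thumb."""
    if len(buckets) < 2:
        return "not enough dumps to compare"
    counts = Counter(buckets)
    if len(counts) == 1:
        return "likely software: every dump blames " + buckets[0]
    if len(counts) == len(buckets):
        return "possibly hardware: every dump blames something different"
    return "mixed: some dumps repeat, some do not"


if __name__ == "__main__":
    # The two buckets reported earlier in this thread.
    outputs = [
        "FAILURE_BUCKET_ID: X64_0xA_tcpip!ProcessSynTcbs+f0\n",
        "FAILURE_BUCKET_ID: X64_0x7E_iaStor+29138\n",
    ]
    print(guess_cause(failure_buckets(outputs)))
```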
 

sgrinavi

Hi and thanks,

It's the random dumps with random errors option.... All over the map. I updated the BIOS to the newest beta and the system has been acting better (only a few BSODs with SP1 and zero with SP2). Running 100% stock with a single hard drive and 2 GB of RAM. 11,400 in 3DMark06 is not too shabby...


An interesting side note; I have had to rename my edb.log file a couple times during sp2 and 3dmark06 install. I don't know if there is any connection.
 

sgrinavi

SOLVED... well, I think...

Lowered RAM voltages and things are working well.

As it turns out, Gigabyte's 1.8 volt stock setting is not accurate, in my case and many others. My RAM wants to be run at 2.2 to 2.35 V. I had been running it at stock + .5 (1.8 stock + .5 = 2.3). Sounds good, right?

It was probably more like 2.5 as the stock settings vary, but 2.0 seems to be about right.
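The arithmetic behind that can be sketched in Python. The 1.8 V nominal stock value, the +0.5 offset, and the 2.2 to 2.35 V window come from the post; the 2.0 V "actual" stock value is only what the "more like 2.5" estimate implies (2.5 - 0.5) and is an assumption, as are the function names.

```python
RATED_MIN = 2.2   # volts, lower end of what the RAM "wants"
RATED_MAX = 2.35  # volts, upper end


def effective_voltage(stock, offset):
    """Voltage the RAM receives when the BIOS adds an offset to stock."""
    return round(stock + offset, 2)


def in_rated_window(volts):
    """True if the voltage falls inside the RAM's rated window."""
    return RATED_MIN <= volts <= RATED_MAX


if __name__ == "__main__":
    # What the BIOS label suggested: 1.8 V stock + 0.5 V.
    expected = effective_voltage(1.8, 0.5)
    # What was probably delivered if stock was really about 2.0 V (assumed).
    probable = effective_voltage(2.0, 0.5)
    for label, volts in (("expected", expected), ("probable", probable)):
        print("%s: %.2f V, within rated window: %s" % (label, volts, in_rated_window(volts)))
```

On paper the setting lands inside the rated window; with a stock value that is really 0.2 V higher, the same offset lands above it.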

SO, I just wasted a perfectly good RAID setup and about 30 hours of my time. Hopefully I didn't shorten the RAM's life. The whole system is more responsive, with better 3DMarks, and a few degrees cooler to boot.. HMMM..

THANKS GIGABYTE