This is a long and boring problem; just confirm what I suspect, please.

Maximilian

Lifer
Feb 8, 2004
12,604
15
81
OK, so I've been having problems with bluescreens. I've switched out components etc., and here are my results. This is the rig labelled either "main rig" or "gaming rig" in my sig.

Stuff I've done already:
1. Scanned the hard drive for bad sectors
2. Fully reinstalled Windows
3. Removed the 500 MHz overclock on the CPU

Problem - random bluescreens, 1-4 every day, sometimes none. Also, the screen was stuck in 256-color mode and the bluescreens sometimes had purple blocks on them
Solution - replaced the X1900 XT with a 7900 GTO
Outcome - full color is back

Problem - random bluescreens, 1-4 every day (STILL)
Solution - ran memtest and found bad memory; removed the 2x 1 GB sticks and left the 2x 512 MB sticks
Outcome - memtest passes *without* the 50-odd errors I had before

Problem - still got a bluescreen now and then, although much less frequently
Solution - switched out the Seasonic S12 500 W for an Antec TruePower 430 W
Outcome - bluescreens stopped

Problem - I got a bluescreen again yesterday, after 1-2 days of no bluescreens. I also got the "windows has recovered from a serious error" message on startup, which I've seen a few times before. This is on a pretty clean new install of Windows, fully updated.
Solution - RMA the mobo and get a replacement
Outcome - Hopefully all will be fine....

It's the motherboard, isn't it? It's gotta be that, as why the hell would my RAM (newish RAM!) break for no reason, or my X1900 go all weird? I'm hunting for my Epox RMA thing to get a new mobo.

Just confirm what I think, please.
 

imported_nocturne

Senior member
Jun 21, 2005
567
0
0
Well... there's only one way to find out.

It'd be surprising if you just happened to get a bunch of bad hardware, and a motherboard can cause all those problems too.

If you don't have other systems to try out the hardware in, you're going to have to wait for the new motherboard to test them out. Also, pay attention to the pattern of the BSOD errors. The stop codes can really help point you in the right direction sometimes. There's even a guy around here (dclive) that will help you process the dumps if you need more help.

Also, have you tried any new BIOS versions? The fact that it's an SLI board might mean the release BIOS was tested more for compatibility with NVIDIA cards than with ATI ones.

In any case, good luck.
 

Maximilian

Thanks for the reply. I spoke to another guy on these forums who I think was from MS. He also said dclive could help, but he said that since there are more than 3 different types of bluescreens, it's almost definitely a hardware error and dumps wouldn't be too useful. I've had STOP AT xxxxx, BAD_POOL_CALLER, DRIVER_IRQL_NOT_LESS_OR_EQUAL, etc.

I updated the BIOS the day I got the board and have kept it that way since. It was fine for the most part up until the past month or so.
 

dclive

Elite Member
Oct 23, 2003
5,626
2
81
You might read the web page (see my .sig) and you can then take a look at your own dumps.

The advice that Smilin (I think) gave is right: if you have multiple errors and BSODs with multiple types of stop codes, you typically have a hardware issue.

In that case, you can take it to your hardware vendor, and they can address the issue.
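
The rule of thumb above (many different stop codes suggest hardware rather than one misbehaving driver) could be sketched roughly like this in Python. It is only an illustration: the function name `looks_like_hardware`, the sample history, and the threshold of 3 distinct codes are assumptions loosely based on the "more than 3 different types" remark in this thread, not an official diagnostic rule.

```python
from collections import Counter

def looks_like_hardware(stop_codes, min_distinct=3):
    """Return (verdict, counts) for a list of stop-code names.

    verdict is True when at least `min_distinct` different stop codes
    were seen. The threshold is an assumed rule of thumb.
    """
    counts = Counter(code.strip().upper() for code in stop_codes)
    return len(counts) >= min_distinct, counts

# Hypothetical crash history, using stop codes mentioned in this thread.
history = [
    "BAD_POOL_CALLER",
    "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
    "NTFS_FILE_SYSTEM",
    "BAD_POOL_CALLER",
]
verdict, counts = looks_like_hardware(history)
print(verdict, dict(counts))
```

A single stop code repeated many times would return False here, which fits the usual advice to look at one driver first in that case.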
 

Maximilian

Awesome site, dude. I'm currently downloading the debugger tool. I've only got one dump file, since it's only had one bluescreen since the format and the removal of the supposedly bad PSU/RAM/gcard.

Wish I had the rest of the dumps; there were about 10-15 of them before the format, as it used to bluescreen 2-3x a day. Now it's only been once in the past 3-4 days. Still, that's once too much for an XP system. I'll post back with the results.

Hopefully it'll bluescreen one more time or something, so I get more to look at.
 

Maximilian

Hey, this is the basic debug I got:

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 24, {1902fe, b1c2e498, b1c2e194, 8053998a}

Unable to load image SiWinAcc.sys, Win32 error 2
*** WARNING: Unable to verify timestamp for SiWinAcc.sys
*** ERROR: Module load completed but symbols could not be loaded for SiWinAcc.sys
Probably caused by : hardware ( Ntfs!NtfsLookupEntry+85 )

Followup: MachineOwner
---------

I don't think I configured the symbols thing correctly. Anyway, this is the detailed !analyze -v one:


1: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

NTFS_FILE_SYSTEM (24)
If you see NtfsExceptionFilter on the stack then the 2nd and 3rd
parameters are the exception record and context record. Do a .cxr
on the 3rd parameter and then kb to obtain a more informative stack
trace.
Arguments:
Arg1: 001902fe
Arg2: b1c2e498
Arg3: b1c2e194
Arg4: 8053998a

Debugging Details:
------------------


EXCEPTION_RECORD: b1c2e498 -- (.exr ffffffffb1c2e498)
ExceptionAddress: 8053998a (nt!memmove+0x0000010a)
ExceptionCode: c000001d (Illegal instruction)
ExceptionFlags: 00000000
NumberParameters: 0

CONTEXT: b1c2e194 -- (.cxr ffffffffb1c2e194)
eax=00000000 ebx=e1757518 ecx=00000000 edx=00000002 esi=e18cd0e6 edi=e12010f4
eip=8053998a esp=b1c2e560 ebp=b1c2e568 iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010206
nt!memmove+0x10a:
8053998a 8f ???
Resetting default scope

CUSTOMER_CRASH_COUNT: 1

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0x24

PROCESS_NAME: csrss.exe

ERROR_CODE: (NTSTATUS) 0xc000001d - {EXCEPTION} Illegal Instruction An attempt was made to execute an illegal instruction.

LAST_CONTROL_TRANSFER: from ba5f9dd0 to 8053998a

MISALIGNED_IP:
nt!memmove+10a
8053998a 8f ???

STACK_TEXT:
b1c2e568 ba5f9dd0 e120104a e18cd03c 000000ac nt!memmove+0x10a
b1c2e588 ba5f9b28 b1c2e7d8 e1757518 00000101 Ntfs!NtfsLookupEntry+0x85
b1c2e7b4 ba5f5f72 b1c2e7d8 87808008 b1c2e908 Ntfs!NtfsCommonCreate+0x10c3
b1c2e960 ba67cf70 87808008 b1c2ebfc 89db3020 Ntfs!NtfsNetworkOpenCreate+0x8a
b1c2e980 bacbd7ef 87808008 b1c2ebfc 89d459c0 sr!SrFastIoQueryOpen+0x40
WARNING: Stack unwind information not available. Following frames may be wrong.
b1c2e9bc ba696927 000000f2 00000000 b1c2e9f4 SiWinAcc+0x17ef
b1c2ea14 80581dbe 87808008 b1c2ebfc 897b2a38 fltMgr!FltpFastIoQueryOpen+0xa1
b1c2eb00 805bdd08 89dc6c98 00000000 882a1648 nt!IopParseDevice+0x95c
b1c2eb78 805ba390 00000000 b1c2ebb8 00000040 nt!ObpLookupObjectName+0x53c
b1c2ebcc 80575f33 00000000 00000000 00000001 nt!ObOpenObjectByName+0xea
b1c2ed54 8054060c 0119eb24 0119eaec 0119eb50 nt!NtQueryFullAttributesFile+0x121
b1c2ed54 7c90eb94 0119eb24 0119eaec 0119eb50 nt!KiFastCallEntry+0xfc
0119eb50 00000000 00000000 00000000 00000000 0x7c90eb94


FOLLOWUP_IP:
Ntfs!NtfsLookupEntry+85
ba5f9dd0 83c40c add esp,0Ch

SYMBOL_STACK_INDEX: 1

SYMBOL_NAME: Ntfs!NtfsLookupEntry+85

FOLLOWUP_NAME: MachineOwner

IMAGE_NAME: hardware

DEBUG_FLR_IMAGE_TIMESTAMP: 0

STACK_COMMAND: .cxr 0xffffffffb1c2e194 ; kb

MODULE_NAME: hardware

FAILURE_BUCKET_ID: IP_MISALIGNED

BUCKET_ID: IP_MISALIGNED

Followup: MachineOwner
---------
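
For anyone collecting several dumps, the summary line of the debugger output can be pulled apart with a few lines of Python. This is a minimal sketch that only assumes the `BugCheck <code>, {a, b, c, d}` layout shown above; the names `parse_bugcheck` and `KNOWN_NAMES` are made up for illustration, and the name table covers just the stop codes mentioned in this thread.

```python
import re

# Matches lines like: BugCheck 24, {1902fe, b1c2e498, b1c2e194, 8053998a}
BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")

# Small illustrative table, not a complete list of bug check codes.
KNOWN_NAMES = {
    0x24: "NTFS_FILE_SYSTEM",
    0xC2: "BAD_POOL_CALLER",
    0xD1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}

def parse_bugcheck(line):
    """Return (code, [arg1..arg4]) as integers, or None if no match."""
    match = BUGCHECK_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args

code, args = parse_bugcheck("BugCheck 24, {1902fe, b1c2e498, b1c2e194, 8053998a}")
print(hex(code), KNOWN_NAMES.get(code, "unknown"), [hex(a) for a in args])
```

The four arguments are the same values listed as Arg1 to Arg4 in the detailed output, so the second and third can be fed to .exr and .cxr as the debugger text suggests.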
 

dclive

Looks hardware-ish to me; what's siwinacc.sys? Run MPSReports and then open the drivers.txt file inside the .cab file, and find that siwinacc.sys entry, and then paste the entire section.

Anyway, yes, I'd agree with Smilin's point -- hardware. Take it back to your hardware vendor and tell them to give you a working system.
 

dclive

BTW, unless you get lots of "symbols aren't configured properly" errors, your symbols are configured fine.
 

Maximilian

Well, it was all built by me, so I can't really take it anywhere, unfortunately :(

This is what drivers.txt said about siwinacc.sys:

Module[166] [C:\WINDOWS\SYSTEM32\DRIVERS\SIWINACC.SYS]
Company Name: Silicon Image, Inc.
File Description: Windows Accelerator Driver
Product Version: (1.0:0.11)
File Version: (1.0:0.11)
File Size (bytes): 17328
File Date: Thu Jul 13 19:42:42 2006
Module TimeDateStamp = 0x41868cbb - Mon Nov 01 19:21:31 2004
Module Checksum = 0x0000b02e
Module SizeOfImage = 0x00002880
Module Pointer to PDB = [C:\release\sifilter\i386\SIWinAcc.pdb]
Module PDB Guid = {048D254D-67F8-449A-A2F6-769A688FF8E2}
Module PDB Age = 0x1
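
A side note on the two dates in that listing: the Module TimeDateStamp is the build time stored inside the driver file as seconds since 1970-01-01 UTC, while File Date is just the date of the file on disk. A quick Python check (the helper name `decode_timedatestamp` is made up for this sketch) reproduces the date the report printed, assuming the report shows it in UTC:

```python
from datetime import datetime, timezone

def decode_timedatestamp(value):
    """Interpret a module TimeDateStamp as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(value, tz=timezone.utc)

stamp = decode_timedatestamp(0x41868CBB)
print(stamp.isoformat())  # 2004-11-01T19:21:31+00:00
```

So the driver itself was built in November 2004, even though the file on disk is dated July 2006.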

I think it's the driver for two of the six SATA ports; it says here that two are from a Silicon Image thing and the other four are from the nForce4 chipset.

So it's a good bet the motherboard is to blame? Since it's unlikely that the RAM/gcard/PSU all went bad at the same time, huh? Something caused that bluescreen 3 days ago, and the only things that haven't been replaced/switched out are the mobo/HDDs/CPU.

Oh, also, for the record, this system's been running fine for 6+ months. There've been no major changes to cause such errors; it's just happened.
 

dclive

Sounds like the best place to start. Unfortunately, with hardware there's rarely a smoking gun; as you've seen, you just get random errors with random causes.
 

Maximilian

OK, well, I'll repost the results with my new mobo, and most likely create another troubleshooting thread when it doesn't boot/work for whatever reason :p Thanks for helping!

So, for the record:

Suspect mobo: Epox 9NPA+ SLI

Solution - new Abit mobo arriving soon; the replacement Epox board will be sold. Screw Epox.

Suspect PSU: Seasonic S12 500 W

Solution - going to install it in another rig and leave it for some intensive testing. I'll even get the parents to use it! I think it's OK though.

Result: Coming soon!
 

dclive

If you stop using the drives attached to that controller (the Silicon Image controller) and disable it in the BIOS, does anything change?
 

Maximilian

There's nothing attached to it; I still use IDE drives. I'll disable it now and find out, although it would take a while to tell whether the stability problem is solved, as the bluescreens aren't even daily anymore.
 

Maximilian

Well, with my Abit a8n 32x it's a lot quieter for some reason. Also, the memory's definitely kinda messed. So far I've diagnosed 1x 512 MB stick as bad, as memtest gives it 200+ errors in a few minutes. One of the 1 GB sticks is messed up too, though not as badly; it takes a while to get 40 errors, around the 50-60% mark of the pass, with the pattern of ff0fffff or something like that.

Memory used to be 2x 1 GB Crucial Value (a few months old), 1x 512 MB Crucial Value (2 years old), and 1x 512 MB Samsung (2 years old, came with a store-bought comp).

Samsung - gets memtest errors, a LOT of them
Crucial 512 - seems fine
Crucial 2x 1 GB - it can pass memtest once. I didn't have enough time to test thoroughly, but one of these sticks is broken, I'm pretty sure. I had three bluescreens on a fresh Windows install (fresh from the new mobo) with one 1 GB stick and that 512 Crucial, which I'm pretty sure is fine, so I've switched to the other 1 GB stick and I'm typing now, no bluescreens yet. I'll have to run memtest for quite a while later on to be sure.

In fact, one of the 1 GB sticks passed memtest 3 times; I was away watching Forrest Gump and let it run. The other passed once before I assumed it was OK. I'm gonna RMA both sticks anyway and run with just 1x 512 until they get me my new 2x 1 GB.

So... how's memtest? It's the DOS thing with the blue screen I'm using (there's a Windows-based thing with the same name; I'm using the DOS one). Is memtest pretty accurate? How many runs should I give it before I declare RAM OK? Also, there's a column saying either "bank" or "channel" in it. Does that refer to the memory slot, i.e. slots 1-4?
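
On the question of how a pattern test finds bad RAM, here is a toy Python sketch of the general idea: write a known pattern everywhere, read it back, and XOR the two values to see which bits flipped. It is a simplified simulation, not how memtest is actually implemented. The class `StuckBitMemory`, the fault location, and the assumption that ff0fffff was read back where ffffffff had been written are all hypothetical.

```python
class StuckBitMemory:
    """Toy 32-bit memory where one cell has bits stuck at 0 (hypothetical fault)."""

    def __init__(self, size, bad_address, stuck_low_mask):
        self.cells = [0] * size
        self.bad_address = bad_address
        self.stuck_low_mask = stuck_low_mask

    def write(self, address, value):
        if address == self.bad_address:
            value &= ~self.stuck_low_mask  # faulty bits can never hold a 1
        self.cells[address] = value & 0xFFFFFFFF

    def read(self, address):
        return self.cells[address]


def pattern_test(memory, size, pattern):
    """Write `pattern` to every cell, read back, and report mismatches."""
    for address in range(size):
        memory.write(address, pattern)
    errors = []
    for address in range(size):
        actual = memory.read(address)
        if actual != pattern:
            diff = actual ^ pattern  # set bits mark the positions that flipped
            bad_bits = [bit for bit in range(32) if (diff >> bit) & 1]
            errors.append((address, actual, bad_bits))
    return errors


mem = StuckBitMemory(size=16, bad_address=5, stuck_low_mask=0x00F00000)
errors = pattern_test(mem, 16, 0xFFFFFFFF)
for address, actual, bad_bits in errors:
    print(address, format(actual, "08x"), bad_bits)

# An all-zero pattern cannot see bits stuck at 0, so it reports nothing.
clean = pattern_test(mem, 16, 0x00000000)
```

The last line shows why a tester cycles through several patterns: one pattern can pass cleanly over a fault that another pattern catches. Intermittent faults are a separate issue that this toy model does not cover, and they are the usual argument for letting the test run for several passes.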
 

Maximilian

I've analyzed both dumps (only two there) and they were both caused by "ntkrpamp.exe", so I googled that and found this. Which makes sense: since memtest gave errors for one of those sticks, it must've been the bad one I had installed. Either way, I ain't takin' a chance; I'll have both of them replaced. :D