any ideas on the cause of this?

imported_coho

Junior Member
Dec 8, 2007
12
0
61
the system:

win xp pro sp2 32bit
Athlon 64 3200+ clawhammer
DFI Lanparty UT Nforce 3 with Realtek ac97
1gb OCZ Gold PC3200
Gainward Geforce 6800 GT 256mb
benq dw1640
Antec slk1650b (with 350w smartpower 2.0)

I put this system together ~3 years ago for my lil brother and it has run flawlessly until recently. I was out of town for work, and when I came back he told me the system had started randomly resetting during a variety of tasks. His inclination was to reformat, and when he attempted this, the system would hang and/or produce memory read error messages during the installation of XP.

So I took a quick look at it and yup, before I even got to begin the installation I saw signs that things weren't right:

http://img451.imageshack.us/my.php?image=img1864io6.jpg

The unit wasn't terribly dusty inside, but for thoroughness' sake I cleaned out the entire case with compressed air, paying special attention to the CPU HSF, the video card HSF, the PSU fan, and the case fans. While inside I verified that everything was seated correctly (the unit also wasn't moved recently) and that none of the electrolytic caps have visibly failed. Although their accuracy can be questionable, a quick glance at the BIOS voltages and the battery voltage shows nothing abnormal. The entire system is plugged into a surge arrester (although I don't know if it fails open or not)...but power spikes aren't common around here anyway.

I'm no expert and I haven't tested the memory yet, but it'd seem weird that OCZ memory would just "up and go" like that.

Any suggestions on tips and/or an algorithm for testing? i could swap out the memory, and vid card but that's about it for extra parts to play with. I could spend a lot of time playing around, but i'm tired and lazy and want this to be as "quick n' easy" as possible.

Cheers,

Coho
 

robisbell

Banned
Oct 27, 2007
3,621
0
0
can you manually test the PSU with a multimeter while under a load and not, and post the results?
 

imported_coho

alright, here's an update:

-checked the 12, 5, & 3.3v rails....all checked out fine

-rechecked all of the caps, looking for leakage at the bottom-no dice

-the system went through 2 consecutive rounds of memtest 1.70 with 0 errors

-I noticed that the weird looking characters (and the accompanying machine lockup) in the picture in my first post only appeared if you did NOT push a button to begin booting from the cd....whereas if you pushed a button within the appropriate time interval, the machine would proceed to load & install windows

-As a result of this observed pattern I wondered if the dvd drive had anything to do with it (despite the fact that my brother said the system resets occurred even when the dvd drive wasn't being accessed), so I swapped drives, and received the same result as in my first post.

-If I choose to install Windows, the install process proceeds all the way to the end of the file copy process with the yellow progress bar. Right after the file copy completes and the menu fades to black to continue on to the next step, I get this BSOD:

http://allyoucanupload.webshot.../v/2003748557521236280

-Next I wondered if the Win XP disc was to blame. So I swapped discs and rather than receiving the prompt with the weird characters after not pushing a button to begin booting from the disc like in my first post, the system proceeds normally to look for a boot image from another source...the way it should. Then I tried both discs in another working machine...both discs worked as they should, and the weird characters didn't occur?

-Being the curious lad I am, I wondered what would happen if I continued to install Win XP using the 2nd disc. Again the install continued normally until right after the file copy process with the yellow bar, but this time I received a totally different error message:

http://allyoucanupload.webshot.../v/2003792690607092268

Hmmm....i'm confused....maybe a southbridge problem? I don't know.
 

Quiksilver

Diamond Member
Jul 3, 2005
4,725
0
71
Have you flashed to the latest BIOS?
Have you tried resetting the CMOS?
If you have a spare CMOS battery lying around I would try replacing it to be sure it's not dying.

For Rob:
YAY! You got it right and didn't tell him to use inaccurate software.
 

robisbell

"-checked the 12, 5, & 3.3v rails....all checked out fine" What were the readings with and without a load, to be exact?
The errors are indicating a memory error. "This Stop message usually occurs after the installation of faulty hardware or in the event of failure of installed hardware (usually related to defective RAM, either main memory, L2 RAM cache, or video RAM)."
I'd suggest you go to http://ultimatebootcd.com, burn the ISO to a CD, and run memtest86+ for at least 6 hours, then run the HDD diagnostic test for at least 4 hours.
 

imported_coho

Originally posted by: QuiksilverX1
Have you flashed to the latest BIOS?
Have you tried resetting the CMOS?
If you have a spare CMOS battery lying around I would try replacing it to be sure it's not dying.

For Rob:
YAY! You got it right and didn't tell him to use inaccurate software.

-yup, the mobo has had the latest BIOS for years, since they've stopped updating it. Same goes for the DVD writer firmware and drivers. I guess computers are weird creatures, but it seems odd to suddenly have problems after the system was rock solid for so long...and nothing was changed (no single event to point to as the cause) or moved around (to cause cables/cards to come loose)???

-I'll try the battery bit....would be a nice easy fix. Looking at the BIOS settings, everything seems fine and the way I left it. As for settings being changed, my brother would only modify things at an operating system level, not a BIOS level.
 

imported_coho

Originally posted by: robisbell
"-checked the 12, 5, & 3.3v rails....all checked out fine" What were the readings with and without a load, to be exact?
The errors are indicating a memory error. "This Stop message usually occurs after the installation of faulty hardware or in the event of failure of installed hardware (usually related to defective RAM, either main memory, L2 RAM cache, or video RAM)."
I'd suggest you go to http://ultimatebootcd.com, burn the ISO to a CD, and run memtest86+ for at least 6 hours, then run the HDD diagnostic test for at least 4 hours.

-well, I guess one would need a clear definition of "load", but observing the voltages during the setup process (as that's all I can observe at this point), I got the following: 12v = constant 11.95/11.96, 5v = 5.0x, 3.3v = 3.3xx

-i'll have to look into running the memtest & hd test
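
For reference, the readings above can be compared against the +/-5% tolerance that the ATX specification allows on the +12 V, +5 V, and +3.3 V rails. This is only a rough sketch: the function names are made up for illustration, and because the 5 V and 3.3 V readings were given only as "5.0x" and "3.3xx", the code checks the lowest and highest values those could stand for rather than guessing the missing digits.

```python
# Nominal ATX rail voltages and the +/-5% tolerance allowed on these positive rails.
NOMINAL_RAILS = {"12V": 12.0, "5V": 5.0, "3.3V": 3.3}
TOLERANCE = 0.05


def rail_limits(nominal, tolerance=TOLERANCE):
    """Return the (low, high) acceptable range for a rail."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def in_spec(rail, reading):
    """True if a measured voltage is within tolerance for the named rail."""
    low, high = rail_limits(NOMINAL_RAILS[rail])
    return low <= reading <= high


# Readings from the post. The 5 V and 3.3 V entries are the bounds of what
# "5.0x" and "3.3xx" could mean, not the actual measured digits.
readings = {
    "12V": [11.95, 11.96],
    "5V": [5.00, 5.09],
    "3.3V": [3.300, 3.399],
}

for rail, values in readings.items():
    low, high = rail_limits(NOMINAL_RAILS[rail])
    ok = all(in_spec(rail, v) for v in values)
    print(f"{rail}: allowed {low:.3f}-{high:.3f} V, readings {values}, in spec: {ok}")
```

Readings that are in spec during setup do not rule out a rail sagging under a heavier load, which is presumably why the with-load and without-load numbers were both requested.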
 

robisbell

thank you for posting the results. I'd watch the 12v main carefully. I'll be awaiting the results of the tests.
 

imported_coho

well, i left memtest running overnight...

I got:

27 passes....26 errors...with both sticks in...which would imply that it had at least 1 error within the first 2 or 3 passes?!

wtf???!!! I ran it yesterday with both sticks in and then each of the two sticks individually...all for 3 passes and all 3 permutations gave 0 errors!? why the errors so soon now?

I replaced the 2032 3v battery....and am running the tests again. I was a little busy today so I haven't tried the HD tests yet.
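
One possible explanation for short runs passing while an overnight run fails is that the fault is intermittent. A rough sketch of the arithmetic, assuming each pass independently has some fixed chance of hitting an error (a simplification; the probabilities below are made-up illustrative values, not measured from this system, and the function name is hypothetical):

```python
def chance_of_clean_run(p_error_per_pass, passes):
    """Probability that `passes` consecutive passes all show zero errors,
    assuming each pass independently hits an error with the given probability."""
    return (1 - p_error_per_pass) ** passes


# Hypothetical per-pass error probabilities, for illustration only.
for p in (0.05, 0.10, 0.25):
    short = chance_of_clean_run(p, 3)
    overnight = chance_of_clean_run(p, 27)
    print(f"p={p:.2f}: clean 3-pass run {short:.1%}, clean 27-pass run {overnight:.1%}")
```

Under these assumptions a marginal stick can quite plausibly get through 3 passes clean and still fail somewhere in 27, which is one reason long test runs are usually recommended.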
 

robisbell

memtest86+ is better than the old memtest you used.

if it throws the errors again, we need to test each stick individually.
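
The one-stick-at-a-time test boils down to a simple lookup. A minimal sketch, assuming error counts are typed in by hand after each run; the labels `stick_a` and `stick_b`, the function name, and the wording of the verdicts are all hypothetical, and the fallback suggestions are only general guesses about where to look next:

```python
def diagnose(errors_by_stick):
    """Given memtest error counts from testing each stick alone,
    return a short verdict on which stick(s) look suspect."""
    bad = sorted(name for name, errors in errors_by_stick.items() if errors > 0)
    if not bad:
        return "no stick failed alone: rerun longer, or suspect slot, board, or PSU"
    if len(bad) == len(errors_by_stick):
        return "every stick failed alone: suspect slot, board, or PSU before the RAM"
    return "suspect stick(s): " + ", ".join(bad)


# Example with made-up counts.
print(diagnose({"stick_a": 26, "stick_b": 0}))
```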
 

imported_coho

Originally posted by: robisbell
memtest86+ is better than the old memtest you used.

if it throws the errors again, we need to test each stick individually.



well, I tried each of the two sticks individually again...and this time one stick gave errors, and judging by the fact that I'm typing this message to you on the machine in question, the other stick didn't.

I'm still a little bothered by the fact that I did the exact same thing yesterday and didn't get errors?!?!

To add even more fuel to the puzzling fire...I mentioned that my brother had formatted, so I reinstalled & configured Windows, drivers, and pertinent software. During the install of the 6800's drivers I kept getting a corrupt display & a hang upon driver installation. I thought "oh brother, not again". Luckily (or unfortunately) I had experienced the same problem before with my 7800gs...the 4 pin molex on the video card was plugged in but for some reason it wasn't making a solid connection, and every time the driver would proceed to initialize and draw more current...it'd crap out.

Oh well...frustrating it was, but now it's in my brother's court...he has to find the receipt so we can send the stick off to be replaced.

Robisbell...I can't say "thank you" enough for taking the time to help out....it was much appreciated. You certainly helped out a lot more than anyone in the numerous online Linux communities has with my Linux questions. Merry Christmas to you & your family.
 

robisbell

hey, no problem, I just try to help and get the system back to 100% and where it should run stable. I am always glad to help and I hope you and yours have a great holiday season.