system instability

Red Squirrel

No Lifer
May 24, 2003
70,540
13,791
126
www.anyf.ca
I have VERY bad luck with BSODs. My old PC BSODed so badly that it corrupted the C partition about 10 times, then it went after the D partition... and this happened to TWO machines. Anyway, when it hit my machine at home I had to build a new PC, since I had no freaking clue what was causing it. It was not hardware (all diagnostic tests passed), and it was not software (I kept restoring a fresh image every time, an image that had proved stable for the longest time).

Now I'm getting more bad luck: my new machine randomly BSODs OR freezes completely, cursor and all. The BSODs go by too fast to read, but they say something about DRIVER_KRNDL_10101001 followed by a bunch of hex values. It's not your typical NT BSOD with tons of writing; it's more like 2 lines. Sometimes I can tell when it's about to happen: the system beeps, freezes, the screen turns off, then it comes back on to the BSOD and reboots.

What's going on? I'm a comp sci student so I obviously know what I'm doing, but this has me stumped. Win2k was never this bad to me before....


System specs:

AMD Athlon 64 3200+
Asus A8V
2x 512MB of DDR RAM (forget the speed, I think it's 2700), dual channel
120GB Seagate SATA HDD
Nvidia GeForce4 Ti 4200

Software running while it happens:
- Sygate Personal Firewall
- MSN
- AVG
- Asus Probe

OS is Windows 2000 SP4

I use other programs, but it does not always crash with those programs; the ones above are the only always-on programs.

Also, I have VMware, and as far as I know there are some VMware-related services that run all the time, such as the network stuff. But VMware itself is not always open when it happens.

I checked the event log, but the only errors I find are something about DCOM, and they are nowhere near the time of the crashes:

Code:
Event Type:	Error
Event Source:	DCOM
Event Category:	None
Event ID:	10010
Date:		10/7/2005
Time:		10:14:45 PM
User:		DESTROYER\ryan
Computer:	DESTROYER
Description:
The server {73AA8F59-DBC4-11D0-AF5C-00A02448799A} did not register with DCOM within the required timeout.

Anyone have a clue what's going on? Is there a way to "capture" BSODs so I can post more info next time it happens? It's getting more and more frequent. It used to be once or twice a month, then once or twice a week, and now, today, I got 2... so the next step is total corruption of my HDD like all the other times.

I've done plenty of memtests, HDD tests etc... so I know it's not that. My CPU/mobo temps are also super low: around 30 for the CPU and 20 for the mobo.

Been doing some research with the little info I have, and found another BSOD I got a couple of times: IRQL_NOT_LESS_OR_EQUAL. I have not seen that one in a while; now it's the DRIVER_NTKRNL one plus a system file name (no time to write it down).


Edit: Just got another... this time it said something about page_fault but again, it went by too fast to read properly. This is getting F-ing ridiculous.
 

Red Squirrel

No Lifer
May 24, 2003
70,540
13,791
126
www.anyf.ca
Ok this is getting really bad. I can't play any games, because games either don't start or crash midway. Twice today now, in Halo, the "gathering exception data" thing appeared and I lost my spot in the game. WTF is wrong with this POS computer?


Another thing I thought of: could it be that the probe for the CPU temp is faulty, and that my CPU is actually overheating? It tells me my temp is rather low even when all my fans are turned off, so could it be misreporting? Is it normal for an AMD 64 3200+ to only be at 32 degrees with the stock HSF and all case fans turned off? My AMD Athlon 2700+ would go near 60-70 in those conditions.
 

Red Squirrel

No Lifer
May 24, 2003
70,540
13,791
126
www.anyf.ca
Did a clean install of win2k... no luck. I have not had any BSODs in a couple of weeks, but I've been trying games, hoping the machine could at least get me somewhere at a LAN party... it won't play ANY games. Halo crashes randomly with the "gathering exception data" dialog, LOTR Battle won't even start, let alone crash, and Lego Star Wars won't start either. Heck, I've even tried RollerCoaster Tycoon to no avail... Can't be my video card, it's an Nvidia GeForce4 Ti 4200...
 

daniel49

Diamond Member
Jan 8, 2005
4,814
0
71
Originally posted by: RedSquirrel
Ok this is getting really bad. I can't play any games, because games either don't start or crash midway. Twice today now, in Halo, the "gathering exception data" thing appeared and I lost my spot in the game. WTF is wrong with this POS computer?


Another thing I thought of: could it be that the probe for the CPU temp is faulty, and that my CPU is actually overheating? It tells me my temp is rather low even when all my fans are turned off, so could it be misreporting? Is it normal for an AMD 64 3200+ to only be at 32 degrees with the stock HSF and all case fans turned off? My AMD Athlon 2700+ would go near 60-70 in those conditions.
*******************************************************************
I have an AMD 3200 CPU, and this time of year it often idles between 30-34 C.
My XP 2400+, on the other hand, tends to idle about 10 degrees higher under the same conditions.
I tried more fans and a bigger case; it didn't help, it just runs hotter.

You might try slowing your RAM down a little as an experiment, just to see if it helps.
 

Brazen

Diamond Member
Jul 14, 2000
4,259
0
0
Sounds like overheating... or a bad PSU. You're not overclocking anything, are you?
 

drag

Elite Member
Jul 4, 2002
8,708
0
0
You're not running long IDE cables, are you?
They aren't the 'rounded' type, are they?

Those can cause data corruption and BSODs easily enough. The whole business of selling super long parallel IDE cables and fancy rounded IDE cables is a 100% scam and pisses me off.

To rule out memory problems, download and burn a memtest86 boot CD (or floppy), boot it up and let it run for 4-8 hours (I usually let it run while I am at work or asleep). If you get a single error then you have bad memory module issues that need to be dealt with.

Maybe you're goofing up by putting way too much heatsink compound on your CPU. Only a tiny drop is needed. If it smears everywhere then it can cause shorts and overheating issues.

Something definitely weird is going on, and that it is consistent across several different computers is confounding.
 

Red Squirrel

No Lifer
May 24, 2003
70,540
13,791
126
www.anyf.ca
I do have round IDE cables, not homemade ones but pre-made ones... but I've run memtest overnight and through school the next day, about 12 hours or so, and done a clean install, and I doubt I'm having overheating issues unless Asus Probe is wrong. Even without all my fans turned on and F@H running full tilt it only runs at about 40C (lower when all fans are on).

I've already changed my PSU, thinking it was what was causing my issues, but it's hard to find the good high-end ones around here. This one is a "Best Power" 500W, which sounds like a no-name brand. The voltages look fine in Asus Probe: 11.96, 5.06, 3.24 (3.3), 1.34 (vcore). The 3.3 could maybe be an issue though... If the RAM slots themselves are bad, would memtest pick that up? If I tried running in non-dual-channel mode, could it make a difference?
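
As a rough sanity check on the rail readings quoted above, here is a minimal Python sketch that compares them with the +/-5% tolerance the ATX spec allows on the +12V, +5V and +3.3V rails. The function name `rail_report` is made up for illustration, vcore is left out because its nominal value depends on the CPU, and software readings like Asus Probe's are only approximate, so treat the result as a hint rather than a measurement.

```python
# Nominal ATX rail voltages and the +/-5% tolerance that applies to them.
NOMINAL = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}
TOLERANCE = 0.05


def rail_report(readings):
    """Map each rail to (deviation in percent, True if within tolerance).

    Hypothetical helper for illustration; `readings` maps rail name to volts.
    """
    report = {}
    for rail, measured in readings.items():
        deviation = (measured - NOMINAL[rail]) / NOMINAL[rail]
        report[rail] = (round(deviation * 100, 2), abs(deviation) <= TOLERANCE)
    return report


# Readings from the post above.
report = rail_report({"+12V": 11.96, "+5V": 5.06, "+3.3V": 3.24})
for rail, (percent, ok) in report.items():
    print(f"{rail}: {percent:+.2f}% {'within' if ok else 'OUTSIDE'} tolerance")
```

By this check the 3.24V reading is under 2% low, so on paper it is fine. What a static reading cannot show is how the rails behave under sudden load.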
 

RBBRMADE

Senior member
Oct 28, 2003
491
0
0
Just a thought -- I had an AMD system a few years back that gave me similar issues until I got a good UPS that smoothed the incoming power.

I like the idea of swapping IDE cables. Even if you just swapped your HDD and optical around. See what happens.

If your RAM slots are bad I agree Memtest86 would pick that up.

Have you looked at the more detailed errors in event viewer?

You can turn off the automatic system restart on BSOD:
To disable it, go to your Control Panel, double-click System, select the Advanced tab, select Startup and Recovery, and under System Failure, uncheck "Automatically reboot".

This should allow you to read the details on the BSODs.
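
The same checkbox is stored in the registry as the `AutoReboot` value under the `CrashControl` key, so the change can also be sketched as a .reg fragment. The Python below only builds and prints that text; the helper name `build_reg_text` is made up, the `CrashDumpEnabled` value of 3 (small memory dump) is an optional extra for "capturing" the crash, and none of this has been tested on the machine in question, so the Control Panel route above is the safer one.

```python
# Registry key that holds the Startup and Recovery crash settings.
CRASH_CONTROL_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl"
)


def build_reg_text(auto_reboot=False, small_dump=True):
    """Return .reg file text for the crash settings (hypothetical helper)."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{CRASH_CONTROL_KEY}]",
        # 0 = stay on the blue screen, 1 = reboot automatically
        f'"AutoReboot"=dword:{int(auto_reboot):08x}',
    ]
    if small_dump:
        # 3 = write a small memory dump (minidump) on each crash
        lines.append('"CrashDumpEnabled"=dword:00000003')
    return "\r\n".join(lines) + "\r\n"


reg_text = build_reg_text()
print(reg_text)
```

With the small dump enabled, each crash should leave a file behind that can be examined later instead of relying on what flashed by on screen.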

Ron
 

Red Squirrel

No Lifer
May 24, 2003
70,540
13,791
126
www.anyf.ca
Yep I have an APC UPS. It's a lower end one (only 350VA) but it should at least filter stuff well enough, at least I would think.

I'll try swapping cables and disabling the system restart. Hopefully the BSODs are over with now, as I had not gotten one for a while even before my clean install, but at least if I do get one I can read it.
 

drag

Elite Member
Jul 4, 2002
8,708
0
0
Originally posted by: RedSquirrel
I do have round IDE cables, not homemade ones but pre-made ones... but I've run memtest overnight and through school the next day, about 12 hours or so, and done a clean install, and I doubt I'm having overheating issues unless Asus Probe is wrong. Even without all my fans turned on and F@H running full tilt it only runs at about 40C (lower when all fans are on).

I've already changed my PSU, thinking it was what was causing my issues, but it's hard to find the good high-end ones around here. This one is a "Best Power" 500W, which sounds like a no-name brand. The voltages look fine in Asus Probe: 11.96, 5.06, 3.24 (3.3), 1.34 (vcore). The 3.3 could maybe be an issue though... If the RAM slots themselves are bad, would memtest pick that up? If I tried running in non-dual-channel mode, could it make a difference?


Memtest should pick it up. All it does is write different patterns of data into memory while keeping track of the addresses. If the data comes back incorrect then it knows that little part of the memory is bad. Something like that.

If there is a problem with the memory slots or the memory controller, what generally happens is that you get a whole bunch of seemingly random errors. Lots of errors. That's usually an issue with the motherboard, dirty/broken slots, the memory controller and stuff like that.

When the physical RAM itself is screwed up, you tend to get very specific errors on some tests over a specific range of memory addresses. Also, intermittent errors almost always mean bad RAM. Like if you get a few rounds of testing that come up clean, but one or two that fail on one test, then that's almost always bad RAM. People sometimes make the mistake of thinking that if the test passes sometimes and fails other times, the RAM is just so-so, which it isn't. Decent RAM should run for weeks with no errors. I read once that a RAM manufacturer calculated that on average, due to cosmic rays or some silliness, a computer will get one or two memory errors a year (which is why you'd want to buy ECC RAM for servers).
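
To make the write-a-pattern, read-it-back idea concrete, here is a toy Python sketch. It is not how memtest86 itself is implemented: the `StuckBitMemory` class is a made-up fault model (one bit of one byte stuck at 1), and a real tester works on physical memory with many more patterns and tests.

```python
class StuckBitMemory:
    """Toy byte-addressable memory; one cell may have bits stuck at 1.

    Hypothetical fault model for illustration only.
    """

    def __init__(self, size, bad_addr=None, stuck_mask=0):
        self.cells = bytearray(size)
        self.bad_addr = bad_addr
        self.stuck_mask = stuck_mask

    def __len__(self):
        return len(self.cells)

    def write(self, addr, value):
        if addr == self.bad_addr:
            value |= self.stuck_mask  # the faulty bits always read as 1
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        return self.cells[addr]


# All zeros, all ones, and two alternating-bit patterns.
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)


def pattern_test(mem):
    """Write each pattern everywhere, read it back, collect bad addresses."""
    bad = set()
    for pattern in PATTERNS:
        for addr in range(len(mem)):
            mem.write(addr, pattern)
        for addr in range(len(mem)):
            if mem.read(addr) != pattern:
                bad.add(addr)
    return sorted(bad)


good = pattern_test(StuckBitMemory(64))
faulty = pattern_test(StuckBitMemory(64, bad_addr=10, stuck_mask=0x04))
print("good module:", good)
print("faulty module:", faulty)
```

Several patterns are needed because a bit stuck at 1 goes unnoticed whenever the pattern already has a 1 in that position; here only the 0x00 and 0xAA passes expose the fault at address 10.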
 

Red Squirrel

No Lifer
May 24, 2003
70,540
13,791
126
www.anyf.ca
Hmm, I said I had round cables but forgot that I have a SATA drive now, so the round cables are only for the optical drives. The SATA drive has an ordinary SATA connector.
 

xtknight

Elite Member
Oct 15, 2004
12,974
0
71
You can disable automatic rebooting so you can read it. It's in system properties, advanced, startup and recovery.

My dad just got a plethora of totally random BSODs too. Sometimes (or lots of the time) Windows just goes to crap. SP2 was supposed to fix it, but of course didn't. Even the most simple software can cause issues with Windows. Make sure you just restore that drive image and install NOTHING else. A BSOD my dad was getting (BAD_POOL_HEADER and a couple others) was linked to Nero in several instances.
 

mechBgon

Super Moderator, Elite Member
Oct 31, 1999
30,699
1
0
The next things I'd suggest are:

1) get a name-brand PSU with ample wattage. Quality first. Seasonic, Antec, Fortron, PC Power & Cooling and Enermax are on my GRAS* list.

2) get the brand & model of your RAM and see what voltage it's rated for. Especially if it's fancy stuff (heatspreaders being a general sign of "fancy"), raise the memory voltage to 2.8 volts and see what happens. Memtest is not a bad way to start testing the RAM, but it doesn't run while your video card, HDD and optical drive are also yanking the PSU around during a map change in BF2 or whatever, so let real-world usage be the final acid test.

3) if you are using an older BIOS on the A8V, consider upgrading the BIOS to the latest version (see the A8V downloads page). Is yours an A8V proper, or is it one of the other variants? Because you don't want to goof that one up :eek:





*generally recognized as safe

 

Red Squirrel

No Lifer
May 24, 2003
70,540
13,791
126
www.anyf.ca
Hmm, I'll see if I can get my hands on a PSU from one of those brands then. The RAM is just low-end stuff, well, Kingston, not sure how low-end that is, but I sure did not pay 500 bucks per stick like some people do. I've thought of upgrading the BIOS, but it is a risky process if I get the wrong one, etc.
 

Red Squirrel

No Lifer
May 24, 2003
70,540
13,791
126
www.anyf.ca
Ordered new RAM. I'll start with that; maybe memtest is just not detecting the problem, and if the RAM turns out not to be bad it's no loss, since more RAM never hurts. :D (the extra will probably go in my server)