Not sure if anyone can help with this, as it's tough to investigate after the fact... but the short version is that a PC that I built 4 months ago had a major hardware failure in at least 2 components that should be independent. I'd really like to know why, and whether I can prevent a recurrence.
I diagnosed and rebuilt the PC by component swapping after this failure, and narrowed down the failing components to:
- CPU or motherboard (Core i3 530 and Gigabyte GA-H55M-UD2H)
- Hard drive (Samsung F3 1TB, HD103SJ)
The PC was on an APC UPS (Back-UPS) - I don't have any record of power events, as the UPS was switched off for quite a while. I could check the other UPSs in the house to see whether they recorded anything.
After the failure (which seemed to follow a 0x3B BSOD, though it's unclear how that's involved), the system would not boot - the PSU powered up the fans and the 4 LEDs lit up on the motherboard; about half of the time I would get a single beep on the PC speaker, and half of the time no beep. It would then reboot after spinning the fans for a few seconds. It looked a bit like a RAM problem, but removing the CMOS battery for a few minutes didn't help.
The POST code shown on a Startech PCI card (added after the failure) was 00. I tested with a different CPU fan (stock), removed all SATA/PATA connections, removed 1 RAM stick, etc. I also tried a PCI graphics card, which the BIOS defaults to.
I ordered new instances of the same model PSU, RAM, CPU and motherboard. I built with the new CPU + motherboard, testing with both the new PSU and the old PSU, and with the old RAM, and it booted OK (though with only 1 RAM stick). I didn't want to blow the new CPU or motherboard, so I haven't cross-tested them against the old CPU or motherboard.
The hard drive seemed to work for the C and D partitions, and I booted Win7 from this drive in the rebuilt PC, then realised that some parts of the disk were generating large numbers of unrecoverable read errors. The errors only showed up under Ubuntu; under Win7 there were no errors in the event log about the hard disk, only the BSOD. However, while in Win7 I did have another BSOD (0x7A).
I've now replaced that drive with a new one of the same model.
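In case anyone wants to map which regions of a disk are unreadable, here is a minimal sketch of the idea. This is not what I actually ran (the errors I saw were simply the ones Ubuntu reported); the function names and the 1 MiB chunk size are my own illustrative choices, and `/dev/sdX` in the usage note is a placeholder. It reads the device sequentially and records every chunk that raises an I/O error:

```python
import os


def scan_for_read_errors(path, chunk_size=1024 * 1024):
    """Read `path` chunk by chunk; return (offset, length) for each chunk that failed."""
    bad_chunks = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size for regular files and (on Linux) block devices.
        total = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < total:
            length = min(chunk_size, total - offset)
            try:
                os.pread(fd, length, offset)  # positional read, Unix only
            except OSError:
                # The kernel reported an I/O error for this chunk; note it and move on.
                bad_chunks.append((offset, length))
            offset += length
    finally:
        os.close(fd)
    return bad_chunks


def merge_adjacent(chunks):
    """Merge back-to-back bad chunks into contiguous (start, length) ranges."""
    merged = []
    for start, length in chunks:
        if merged and merged[-1][0] + merged[-1][1] == start:
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((start, length))
    return merged
```

It would be used as something like `merge_adjacent(scan_for_read_errors("/dev/sdX"))`, run as root on Linux. Bear in mind that a full read pass puts extra load on a drive that is already failing, so it is probably only worth doing after anything valuable has been imaged.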
I have a complete backup of the C and D drives of the PC as of 1 hour before the BSOD, so I can see most of the event log up to that point. (Image backups using the excellent ShadowProtect.) I've also got the original BSOD (0x3B) details from the event log.
A power event seems the most likely cause, but the APC UPS includes AVR, which should prevent that (even though the battery is now 3 years old and the PC is powered up 24/7).
If these were 2 independent failures, I would have expected the hard drive to show gradually increasing read errors beforehand, rather than the large number it showed on Linux immediately afterwards. If the CPU/motherboard failed first, how could that create bad blocks on the hard disk?
Any ideas at all on what might have happened? Your help would really be appreciated on this one, as I'm completely at sea here...
Full components (original and rebuilt PC):
- Intel Core i3 530 2.93GHz
- Gigabyte GA-H55M-UD2H
- RAM: Corsair CMX4GX3M2A1600C9 - 4GB (2x2GB) XMS3 DDR3 PC3-12800 (1600), Non-ECC, Unbuffered, CAS 9-9-9-24, XMP, 1.65V
- Hard drive: Samsung HD103SJ - Spinpoint F3 1TB
- Optical drive: Samsung SH-S223C DVD writer
- Antec P183 Case
- Corsair CX400W PSU - 400W