Had a problem with a server at work last week.
All the drives in the main array failed within about 100 ms of each other. Got a ton of SNMP alerts "physical drive status change 3046" - one for each drive in the array.
Anyway, someone calls HP, and they send out a dude, who diagnoses that all the hard drives have failed, replaces the drives and takes the old ones as RMA.
I'm just a bit annoyed, as that server was used as staging storage for all our raw data. Due to the volume of data, we only archive and back up selected compressed/processed data, so this was our only copy of the raw data.
It's not a catastrophic loss, but for 8 drives to drop out of an array simultaneously, it really does sound like some sort of software glitch, and a force-mount might have sorted it out.
Of course, no one actually thought to mention that the "repair" might mean the loss of all data on the server. I got into work this morning, found the disk blank, and had to raise a ticket with our IT dept to get the bad news.
