New machine built by Monarch:
Opteron 240 w/ Tyan Thunder mobo
1 GB ECC Corsair RAM
3ware 9xxx with a ton of drives
I have Linux kernel 2.6.12.5 running on it. It ran fine for a month in testing and then 2 months in production, but a week ago it decided to crash. I got ping replies, but could not ssh in or log in at the console, and there was no response from any services. I checked the backups, and they ran fine the night before, so I reset the machine. It came up just fine. This morning the same thing happened. Nothing in the logs tells me anything.
I'm thinking that the most common causes of a complete system outage like this are a kernel or hardware problem. But I would have thought that after 3 months, the logs or dmesg would show some sort of anomaly that would clue me in that it's the kernel. Otherwise, maybe hardware? I got an alert from our environmental monitor about low humidity. The alert goes off at 13% RH, so it could have dropped way beneath that. Looking on Google, I see some references saying that amd64s don't like anything below 10% RH.
Has anyone experienced problems due to low humidity in the server room? From what I'm told, low RH can cause excess static. Could that take a server down, or perhaps the ground isn't dissipating the electricity properly?
Any ideas are appreciated!
Thanks