I think I just figured out what randomly crashes my file server. I've learned that whenever disk I/O gets bogged down too much in Linux, it sets off a chain reaction: you get errors like "task delayed for 120 secs", and then stuff starts getting killed off and crashing.
I think I found the source of what occasionally bogs down my entire file server, which in turn bogs down everything else, such as my VMs. It's the stupid raid-check cron job! I don't know whose bright idea that was, but it runs once a month and it checks ALL arrays at the same time! The load average on my server right now is over 10! It's barely responsive.
Is this raid check really necessary? Is there a way I can split it up so it only does one array at a time at least? The cron job is just this:
Code:
0 1 * * Sun root /usr/sbin/raid-check
Is there a way to make it only check one array at once, and perhaps do it at a lower priority?
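To make the "one array at a time" idea concrete, here is a rough sketch of what I have in mind. It assumes the kernel's md sysfs files (/sys/block/mdX/md/sync_action and sync_speed_max) and is not how the stock raid-check script works. SYSFS_ROOT, POLL_SECS, and MAX_KBPS are hypothetical names made up for this example, and I have not run it on the real server.

```shell
#!/bin/sh
# Sketch only: scrub md arrays one after another instead of all at once.
# SYSFS_ROOT, POLL_SECS and MAX_KBPS are hypothetical knobs for this example.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"   # override to dry-run against a fake tree
POLL_SECS="${POLL_SECS:-60}"       # how often to poll a running check
MAX_KBPS="${MAX_KBPS:-}"           # optional per-array speed cap in KB/s

# Print the md directory of every array whose sync_action is currently "idle".
list_idle_arrays() {
    for md_dir in "$SYSFS_ROOT"/block/md*/md; do
        [ -r "$md_dir/sync_action" ] || continue
        if [ "$(cat "$md_dir/sync_action")" = "idle" ]; then
            echo "$md_dir"
        fi
    done
}

# Start a check on each idle array and wait for it to finish before the next.
check_arrays_serially() {
    for md_dir in $(list_idle_arrays); do
        if [ -n "$MAX_KBPS" ]; then
            # Throttle this array so the check leaves room for normal I/O.
            echo "$MAX_KBPS" > "$md_dir/sync_speed_max"
        fi
        echo check > "$md_dir/sync_action"
        while [ "$(cat "$md_dir/sync_action")" != "idle" ]; do
            sleep "$POLL_SECS"
        done
    done
}

# Call check_arrays_serially as root (for example from the cron entry) to start a run.
```

Arrays that are already resyncing or recovering are skipped rather than interrupted. The speed cap is left in place after the run, so it would need resetting afterwards if it should only apply to the check.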