Any tricks for optimizing a system to handle bad shutdowns?

Red Squirrel

No Lifer
May 24, 2003
67,380
12,129
126
www.anyf.ca
I have this Linux VM at work that acts as a server for the department. The problem is that they often force reboots on our machines at random, with no warning. This shuts the VM down hard, and sometimes stuff gets corrupted. Is there any "hardening" I can do to make the system more resilient to that? Like maybe a different file system or a setting or something? I would love to run it native so I don't need to use a VM, but the host has to be Windows.
 

mxnerd

Diamond Member
Jul 6, 2007
6,799
1,101
126
Don't think there is any way to prevent hard shutdown file corruptions.

Ask the IT dept to suspend all VMs before shutting down the host.

Assuming VMware Workstation is the software running on the Windows host.


Create a Windows batch file vmsuspends.bat under

C:\Program Files (x86)\VMware\VMware Workstation directory

contents:

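rem "vmrun list" prints a header line first, then one .vmx path per running VM.
rem skip=1 drops the header; each remaining line is passed to "vmrun suspend".
for /f "tokens=* skip=1" %%a in ('vmrun list') do vmrun suspend "%%a"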

and ask the IT dept to run it to suspend the VMs before shutting down the Windows host.

It's easy.

==

Tested.
 

Red Squirrel

It's running VirtualBox. There is no way of knowing when these reboots hit, so no way of really scripting anything. Even our own IT has no control over the reboots; that's all done by some other desktop group that's not even in the same building.

One thing I just thought of, though, is trying to monitor the system processes to see if a new process gets spawned right before a forced reboot, so I could write a script that just keeps monitoring for that. But I don't think the VM would shut down in time, and it's probably worse to have it go down hard in the middle of a shutdown than when it's just idle. Though I could maybe pause it...
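
A rough sketch of that monitoring idea, in Python (an assumption: the thread does not say Python is available on the Windows host). It polls `tasklist` and reports image names that were not seen before, so a timestamped log could show what appears just before a forced reboot. The function names are made up for illustration, and whether the reboot tool spawns a recognizable process at all is unknown.

```python
import csv
import io
import subprocess
import time


def parse_tasklist_csv(text):
    """Return the set of lowercased image names from `tasklist /fo csv /nh` output."""
    names = set()
    for row in csv.reader(io.StringIO(text)):
        if row:  # skip blank lines
            names.add(row[0].lower())
    return names


def snapshot():
    """Take one snapshot of the process names currently running on Windows."""
    out = subprocess.run(
        ["tasklist", "/fo", "csv", "/nh"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tasklist_csv(out)


def watch(interval_s=2.0, on_new=print, take_snapshot=snapshot,
          sleep=time.sleep, max_polls=None):
    """Call on_new(name) for every image name that was not seen in earlier polls.

    max_polls=None means poll forever; the other hooks exist so the loop
    can be exercised without a real Windows host.
    """
    seen = take_snapshot()
    polls = 0
    while max_polls is None or polls < max_polls:
        sleep(interval_s)
        current = take_snapshot()
        for name in sorted(current - seen):
            on_new(name)
        seen |= current
        polls += 1
```

Calling `watch()` with an `on_new` that appends the name and the current time to a file would give a log to compare against the reboot times. Short-lived processes that start and exit between two polls will be missed.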
 

mxnerd

Wow. How can a big company's IT dept or outsourced tech group be so incompetent?

A corrupted file system can take hours to fix or restore from the archive.

Proper system shutdown is critical and they handle it like nothing? No words.

I don't think you will have enough time to do a proper suspend / shutdown of the VM, even if you can detect the process that gets spawned beforehand.
 

Red Squirrel

Yeah, it's messed up but not surprising; incompetence runs amok in this company. Seems the bigger a company gets, the more the incompetence and siloing grow.
 

mv2devnull

Golden Member
Apr 13, 2010
1,498
144
106
Red Squirrel said: "There is no way of knowing when these reboots hit so no way of really scripting anything. Even our own IT has no control over the reboots, that's all done by some other desktop group that's not even in the same building."

The suspend scripts should be run by the group that initiates the reboot. Why are their toys in your building?

Why have you put your services on a platform whose admins you cannot communicate with?
 

extide

Senior member
Nov 18, 2009
261
64
101
www.teraknor.net
Have them host the VM. Your team should not be relying on a VM running on your own workstation to do anything important.

Also these days all filesystems are journaled so it's not like the old days of ext2 where a non-graceful shutdown would cause many issues in the filesystem -- so a random shutdown really shouldn't corrupt much. Perhaps your user-mode apps? Even a DBMS should handle a random shutdown these days fairly gracefully.
 

Red Squirrel

mv2devnull said: "The suspend scripts should be run by the group that initiates the reboot. Why are their toys in your building? Why have you put your services on a platform whose admins you cannot communicate with?"

Don't have a choice. Not allowed to just put an arbitrary machine on the network. It has to be an approved machine which has the company image.

They also won't provide us any official space for the VM, this is not an official thing, but something I built to help the department. Essentially we rely on tons of spreadsheets and other stuff where the information is all over the place and hard to get to, so I built a web based system and put all the info in one place in a way we can also edit/add etc. It's all management approved and all, it's just not an official infrastructure as far as IT is concerned.

The biggest issue I run into with the hard shutdowns is that the first boot afterwards takes around 20 minutes. There is a kdump file that gets corrupted and has to be rebuilt. So far it's only happened once that I had to rebuild the VM from backup.

It's a dirty setup I would never even think of doing at home and a native Linux box would be 100x better, but you can only F with the C you got. :p
 

Fallen Kell

Diamond Member
Oct 9, 1999
6,037
431
126
I can only think of one thing that can help solve the issue, and that is to have a dedicated hard drive that you pass through to the VM instead of using an image file. For the dedicated drive, get an SSD with power loss imminent (PLI) tech, like an Intel DC S3700 or similar, and use ext4 or a similar filesystem on it. The drive is fast enough, and holds internal power long enough, to finish committing the writes that were in flight at the time of the shutdown/power loss. I would also turn off the kernel dump (or limit its size), since you already know the reason for the fault (i.e. someone keeps rebooting the physical host). This will remove or reduce the time it takes to boot once it does come back online.
 

Red Squirrel

Hmm would the external drive idea actually work? Wouldn't the computer still command the disk to stop hard even if I keep it powered on? The hard drive of the machine never goes down hard, the Windows OS itself is still shutting down properly, it's the VM that does not shut down properly as the host just kills it off, like the same way you would initiate a shutdown without turning off your VMs first.

One thing I had also thought of is to use ICS and have an actual Linux box running, but that didn't work. I think I didn't have access to ICS or something, I forget the reason why... I might revisit it in case I missed something. I could add another NIC to the machine, then have the server run on a Raspberry Pi. That way, when they force reboot the machine, it does not kill the Linux part.
 

mxnerd

If the Linux VM can be installed on an SSD and the Windows host is actually shutting down properly, I believe it's possible to suspend the Linux VM properly before Windows shuts down.

Windows shutdown is a long process. VM Suspension on an SSD should be super fast.

You just need to detect the shutdown event and find the commands/scripts to do the dirty work.

==
ex:




You get the idea.

Then add or create the VirtualBox VM suspend commands/scripts.
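
For the VirtualBox side, a minimal sketch of what the suspend step could look like, written in Python as an assumption (a batch or PowerShell script would do the same job). It uses two real VBoxManage subcommands, `list runningvms` and `controlvm <vm> savestate`; the function names and the assumption that `VBoxManage` is on the PATH are mine, not from the thread.

```python
import re
import subprocess

# Assumption: VBoxManage is on PATH. On Windows it normally sits in the
# VirtualBox install directory, so a full path may be needed here instead.
VBOXMANAGE = "VBoxManage"

# `VBoxManage list runningvms` prints one line per VM: "name" {uuid}
_VM_LINE = re.compile(r'^"(?P<name>.*)"\s+\{(?P<uuid>[0-9a-fA-F-]+)\}\s*$')


def parse_running_vms(listing):
    """Return (name, uuid) pairs parsed from `list runningvms` output."""
    vms = []
    for line in listing.splitlines():
        match = _VM_LINE.match(line.strip())
        if match:
            vms.append((match.group("name"), match.group("uuid")))
    return vms


def save_all_running_vms(run=subprocess.run):
    """Save the state of every running VM and return the names handled.

    `run` is injectable so the logic can be tried without VirtualBox installed.
    """
    listing = run(
        [VBOXMANAGE, "list", "runningvms"],
        capture_output=True, text=True, check=True,
    ).stdout
    saved = []
    for name, uuid in parse_running_vms(listing):
        # savestate writes the VM's memory to disk and stops it, which is
        # VirtualBox's equivalent of a suspend.
        run([VBOXMANAGE, "controlvm", uuid, "savestate"], check=True)
        saved.append(name)
    return saved
```

Whatever detects the shutdown event would then call `save_all_running_vms()`. It has to run as the same Windows user that owns the VMs, since VBoxManage only sees that user's machines.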
 

Red Squirrel

Hmmm, good to know there is a way to detect the shutdown with PowerShell, so maybe that is what I need. Then I can just command VirtualBox to shut down the VM. Even if it's on a local HDD it won't really matter at that point, provided it can do it fast enough. I can put it on an SSD to speed it up if need be. I do have local admin, so I SHOULD be able to install PowerShell on it, if it's not already installed as part of the normal image.

It does not use the conventional shutdown.exe command though, it uses something else (something lower level/faster) but the command might still work.
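
If the goal is a clean guest shutdown rather than a suspend, one possible shape for it is sketched below in Python (an assumption, as is `VBoxManage` being on the PATH). It sends the ACPI power button with `controlvm <vm> acpipowerbutton`, polls `list runningvms` until the VM is gone, and falls back to `savestate` if the guest is still up at the deadline. The function names and the 30 second default are made up for illustration; the guest also has to be set up to power off on the ACPI event.

```python
import subprocess
import time

VBOXMANAGE = "VBoxManage"  # assumption: on PATH


def is_running(vm, run=subprocess.run):
    """True if a VM with this exact name appears in `list runningvms`."""
    out = run(
        [VBOXMANAGE, "list", "runningvms"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Each line looks like: "name" {uuid}
    return any(line.startswith('"%s" ' % vm) for line in out.splitlines())


def stop_vm(vm, timeout_s=30.0, poll_s=1.0, run=subprocess.run,
            sleep=time.sleep, clock=time.monotonic):
    """Try an ACPI shutdown, then save state if the guest is still up.

    Returns "not-running", "shutdown" or "saved". `vm` must be the VM name,
    because is_running() matches on the quoted name.
    """
    if not is_running(vm, run):
        return "not-running"
    run([VBOXMANAGE, "controlvm", vm, "acpipowerbutton"], check=True)
    deadline = clock() + timeout_s
    while clock() < deadline:
        if not is_running(vm, run):
            return "shutdown"
        sleep(poll_s)
    # Out of time: saving state is faster than waiting for the guest.
    # This raises CalledProcessError if the VM powered off in the meantime.
    run([VBOXMANAGE, "controlvm", vm, "savestate"], check=True)
    return "saved"
```

How long the timeout can be depends on how much time the forced reboot leaves before it kills processes, which would have to be measured on the real machine.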
 